Latent topic models are an unsupervised machine learning technique for mining large corpora. These models have been successfully applied to various types of data, including text documents, images, biological data, and more. Various methods for approximate inference have been suggested, and many variations and extensions of the basic models have been published over the past decade in the machine learning and data mining communities. This workshop aims to initiate further discussion on improving topic models and exploring new directions of research.

Topics of interest include, but are not limited to:

  • Applications that build upon topic models: We encourage work that uses topic models for applications such as document summarization, classification, annotation, machine translation, search, and more.
  • Thorough evaluation of topic models: Recent papers have shown that existing evaluation techniques are insufficient. We are interested in methodologies for evaluating topic models both against other topic models and against non-topic-model methods.
  • Applying topic models to structured corpora and incorporating additional features into topic models: Considering the inner structure of a corpus may lead to enhanced analysis capabilities. Examples include the use of authorship information, the study of temporal changes and network models.
  • Exploration of topic models in new domains: Topic models have recently been applied to problems in computational biology, collaborative filtering, and more. We welcome work that explores the applicability of these models even further.
  • Approximate inference for topic models: Exact inference in hierarchical topic models is intractable, so researchers resort to approximate methods. We welcome new approximate inference methods, as well as further investigation and evaluation of existing methods, their extensions, and their combinations.