*** New workshop at NIPS 2005 ***
A long-standing issue in learning graphical models is determining the appropriate model size and structure. In many real-world applications, traditional models with a small number of latent variables seem inadequate. In the quest for more flexible modelling tools, recent research has turned to the limit of such models with infinitely many latent variables and parameters: for example, neural networks with infinitely many hidden units, mixture models with infinitely many clusters, and hidden Markov models with infinitely many states. This limit corresponds to the field traditionally covered by Nonparametric Bayesian Statistics, which assumes a priori that the data were generated from a nonparametric model with a possibly infinite number of parameters, experts, hidden states, and so on. In particular, infinite models based on Dirichlet processes have recently been introduced as a very attractive alternative to finite models, for which cumbersome model selection is required.
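As a concrete illustration of this idea (not part of the workshop materials), the Chinese restaurant process gives a simple generative view of cluster assignments under a Dirichlet process mixture: the number of occupied clusters is not fixed in advance but grows with the data. The function name and the concentration value `alpha=2.0` below are illustrative choices, not prescriptions.

```python
import random

def crp_assignments(n, alpha, seed=0):
    """Sample cluster assignments for n points from a Chinese
    restaurant process with concentration parameter alpha."""
    rng = random.Random(seed)
    counts = []       # counts[k] = number of points already in cluster k
    assignments = []
    for i in range(n):
        # Point i joins existing cluster k with probability
        # counts[k] / (i + alpha), or starts a new cluster with
        # probability alpha / (i + alpha).
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                counts[k] += 1
                assignments.append(k)
                break
        else:
            counts.append(1)
            assignments.append(len(counts) - 1)
    return assignments

# The expected number of occupied clusters grows only logarithmically
# with n, so the effective model size adapts to the data rather than
# being chosen by model selection.
labels = crp_assignments(1000, alpha=2.0)
print(len(set(labels)))
```

Running this with larger `n` shows the number of distinct clusters rising slowly, which is the behaviour that makes these priors attractive when the "right" number of components is unknown.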
The workshop will bring together researchers and practitioners of nonparametric Bayesian methods to share their experiences and expertise with the general NIPS community, in an effort to transfer and build upon key methodologies developed in the statistics community. In particular, we wish to discuss the following themes and questions:
- Modelling issues: Beyond identifying the number of states or components in a mixture, can we apply ideas from infinite models to structure learning, where we learn distributions over the number of hidden variables, their dimensionalities, and the graphical structure? What about structured and semi-structured data?
- Statistical issues: Under what conditions can we expect these models to give consistent estimates of densities? What about identifiability of parameters and structures? What can we say about the convergence of these models as the amount of data increases?
- Computational/Inferential issues: Inference in infinite models currently requires expensive MCMC sampling. Wider adoption of these models will require more efficient inference schemes. Are there more efficient MCMC samplers, or can approximate methods such as variational techniques be applied to infinite models? Can these methods be scaled to high-dimensional data?
- Beyond infinite models: How do infinite models compare with other existing methodologies? Are there problems for which Dirichlet processes are unsuitable, and if so how can we tackle these cases?
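On the computational point above, one ingredient behind faster approximate schemes is a truncated stick-breaking representation of the Dirichlet process, which replaces the infinite sum by a finite one. The sketch below, with an illustrative truncation level `T`, shows only the prior draw of mixture weights; it is not a full variational algorithm.

```python
import random

def truncated_stick_breaking(alpha, T, seed=0):
    """Draw mixture weights from a stick-breaking construction
    truncated at T components; such finite truncations underlie
    variational approximations to Dirichlet process mixtures."""
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of stick not yet broken off
    for _ in range(T - 1):
        # v ~ Beta(1, alpha): the fraction of the remaining stick
        # assigned to this component.
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= (1.0 - v)
    weights.append(remaining)  # last component absorbs the rest
    return weights

w = truncated_stick_breaking(alpha=1.0, T=20)
# The T weights form a proper distribution by construction.
print(sum(w))
```

Because the weights decay rapidly for moderate `alpha`, a modest truncation level often captures almost all of the mass, which is what makes finite approximations to these infinite models computationally attractive.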