This page contains the abstracts and accompanying material for the talks in the final programme, as well as Larry Wasserman's talk, which was unfortunately cancelled at the last minute (at bottom).
SATURDAY MORNING SESSION
| 7:30 - 7:40 | Opening Remarks - Matt Beal and Yee Whye Teh |
| 7:40 - 8:10 |
Introductory Tutorial on Nonparametric Methods and Infinite Models
Talk slides: [pdf] |
| 8:15 - 8:35 |
Hierarchical models with multiple mixtures of Dirichlet processes
This talk will discuss the use of hierarchical models with multiple
layers of mixtures of Dirichlet processes. For example, with modern
diagnostic equipment, one can measure the individual cell sizes from a
sample of one's blood. One can then use a mixture of Dirichlet processes to
model the distribution of cell sizes for each sample. This model is
somewhat equivalent to putting a distribution on the family of kernel
density estimates. Escobar and West (1995) showed how the kernel density
estimator approximates a Bayesian method of estimating densities based on
a mixture of Dirichlet processes (MDP). However, since the MDP method is a
proper Bayesian model, one can use hierarchical priors and calculate
posterior distributions of functionals of interest.
Using these techniques, we develop a highly flexible hierarchical model
in the space of distributions. This model allows us to model samples of
densities and to find outliers in the space of distributions. This talk
will discuss the methods used to compute these models and to assess
outliers. These techniques will be used to identify diseased subjects
based on the distribution of the size of the subject's red blood
cells. |
| 8:40 - 9:00 |
The Hierarchical Dirichlet Process
Certain interesting data sets can naturally be grouped into subsets, each
of which has its own distinct features. Some of these data sets can be
well characterized as having arisen from a model consisting of several
mixture models, and the aim is to extract features (mixture components)
that explain the data well. |
| 9:05 - 9:15 | Break |
| 9:15 - 9:35 |
Application of nonparametric Bayesian methods in genetic inference
The problem of inferring haplotypes from genotypes of single nucleotide
polymorphisms (SNPs) is essential for the understanding of genetic
variation within and among populations, with important applications to the
genetic analysis of disease propensities and other complex traits. In
this paper we present a novel statistical model for haplotype inference.
Our model is a Bayesian model based on a prior known as the Dirichlet
process, a nonparametric prior which provides control over the size of the
unknown pool of population haplotypes. The model also incorporates a
likelihood that allows statistical errors in the haplotype/genotype
relationship, trading off these errors against the size of the pool of
haplotypes. We describe an algorithm based on Markov chain Monte Carlo
for posterior inference. The overall result is a flexible Bayesian model
that is reminiscent of parsimony methods in its preference for small
haplotype pools. We apply this new approach to the analysis of both
simulated and real genotype data, and compare to extant methods. |
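As an illustration of how a Dirichlet process prior controls the size of an unknown pool, the sketch below samples cluster sizes from the Chinese restaurant process, the standard partition distribution induced by a Dirichlet process. It is not the haplotype model of the talk: the function name `crp_partition` and the concentration parameter `alpha` are assumptions made for this example, and the clusters only stand in loosely for distinct haplotypes.

```python
import random


def crp_partition(n, alpha, rng=None):
    """Sample cluster sizes for n items from a Chinese restaurant process.

    alpha is the concentration parameter (hypothetical name): larger values
    favour opening new clusters, so the pool of distinct clusters grows.
    """
    rng = rng or random.Random()
    counts = []  # counts[k] = number of items currently in cluster k
    for i in range(n):
        # Item i joins existing cluster k with probability counts[k] / (i + alpha)
        # and opens a new cluster with probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if u < acc:
                counts[k] += 1
                break
        else:
            # u fell in the last alpha-sized slice: open a new cluster.
            counts.append(1)
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (0.1, 1.0, 10.0):
        sizes = crp_partition(200, alpha, rng)
        print(alpha, len(sizes))
```

Running the loop at the bottom with a few values of `alpha` gives a feel for the trade-off: small values tend to concentrate the items in a handful of clusters, while large values tend to spread them over many.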
| 9:40 - 10:00 |
Approximate Solutions to Nonparametric Bayesian Hierarchical Modelling with Applications to Information Filtering
In some cases, for example in applications involving hierarchical
Bayes, one learns a "prior" parameter distribution from
repeated experiments. It often occurs that the "learned prior"
does not correspond to a family of distributions which can easily
be specified. In this paper, we present a solution to this problem
by formulating our prior in terms of an infinite-dimensional Dirichlet
process.
This usage of a Dirichlet process in a statistical model is sometimes
referred to as
Dirichlet enhancement and is a main approach in nonparametric
hierarchical Bayesian modeling. Typically,
nonparametric hierarchical modeling relies on an efficient
implementation of Gibbs sampling. We demonstrate how Gibbs
sampling can be used in our context. In addition we present two
novel non-sample-based approximations. The first approximation is based on a
MAP approximation using an EM algorithm. The second one is based on a
variational approximation. We apply a Dirichlet enhanced
hierarchical model to information filtering where nonparametric
hierarchical modeling allows the principled combination of both
content filtering and collaborative filtering. We demonstrate the
effectiveness of our approximation using two applications: the
retrieval of art images and the information filtering of Reuters
news data. |
| 10:05 - 10:25 |
Extended Bayesian Statistical Inference and Renormalization Group
This work discusses how to improve and generalize the
predictive distribution in Bayesian statistical inference
in a very general setting. |
SATURDAY AFTERNOON SESSION
| 4:00 - 4:10 | Welcome back | ||
| 4:10 - 4:30 |
Nonparametric Bayesian models for semi-supervised learning
We describe nonparametric Bayesian approaches to generalizing class labels from a few labeled examples, guided by a much larger set of unlabeled examples. We posit some (potentially infinite) latent structure underlying both the observed features and the unobserved class labels, which allows the unlabeled examples to influence how class labels are generalized from labeled examples. In one approach, we assume a latent tree structure underlying the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal Bayesian classification function from the labeled examples. This approach performs well on real-world datasets and extends naturally to handle two difficult problems: learning from very sparse data, and learning from positive examples only. Time permitting, we will also discuss an approach to Bayesian semi-supervised learning with text data, based on an infinite version of the Latent Dirichlet Allocation (LDA) topic model. | ||
| 4:35 - 4:55 |
Some Remarks about Bayesian Infinite Regression and Gaussian Processes
Talk slides: [pdf] | ||
| 5:00 - 5:10 | Break | ||
| 5:10 - 5:30 |
Expectation Propagation for Infinite Mixtures
I will describe a method for approximate inference in infinite models that
uses deterministic Expectation Propagation instead of Monte Carlo. For
infinite Gaussian mixtures, it provides cluster parameter estimates,
cluster memberships, and model evidence. Model parameters, such as the
expected size of the mixture, can be efficiently tuned via EM with EP as
the E-step. The same approach can apply to other infinite models such as
infinite HMMs. | ||
| 5:35 - 5:55 |
Variational Approximations for the Truncated Dirichlet Process
The truncated Dirichlet process is a finite approximation to the full
Dirichlet process based on its stick-breaking construction
(Sethuraman, 1994). Ishwaran and James (2001) use this distribution
to develop a blocked Gibbs sampling algorithm for nonparametric
Bayesian inference. We develop a deterministic mean-field variational
approximation algorithm for the same model. This approach is easily
applied to nonparametric mixture models where the prior distribution
on the mixture components is conjugate to the data distribution. We
demonstrate that, in high dimensions, this technique is faster than
both the blocked and full Dirichlet process Gibbs samplers. | ||
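The stick-breaking construction behind the truncated Dirichlet process can be sketched in a few lines. The code below draws mixture weights only, under the usual convention that the final break takes the whole remaining stick so the weights sum to one. It is a minimal illustration, not the variational algorithm of the talk, and the names `truncated_stick_breaking`, `alpha` and `truncation` are assumptions introduced here.

```python
import random


def truncated_stick_breaking(alpha, truncation, rng=None):
    """Draw mixture weights from a stick-breaking prior truncated at
    `truncation` components (hypothetical parameter names).

    Each break v_k is Beta(1, alpha), except the last, which is fixed to 1
    so that the weights form a proper probability vector.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(truncation):
        is_last = k == truncation - 1
        v = 1.0 if is_last else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)  # pi_k = v_k * prod_{j<k} (1 - v_j)
        remaining *= 1.0 - v
    return weights


if __name__ == "__main__":
    pi = truncated_stick_breaking(alpha=2.0, truncation=10, rng=random.Random(0))
    print([round(p, 3) for p in pi], sum(pi))
```

A smaller `alpha` puts most of the mass on the first few components, which is one informal way to judge whether a given truncation level is adequate.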
| 6:00 - 7:00 | Panel Discussion
| ||
| 7:00 | Closing Remarks - Matt Beal and Yee Whye Teh |
CANCELLED TALKS
| 4:35 - 4:55 |
Frequentist Properties of Infinite Dimensional Bayes
Nonparametric Bayesian methods involve placing a prior on an infinite
dimensional space and computing the posterior. I will argue that it is
essential to examine the frequentist properties of these Bayesian
methods since the prior cannot be specified with complete
confidence. We look at three properties: consistency, rate optimality
and correct coverage. Many priors are consistent, fewer are rate
optimal and whether any yield correct coverage remains an open
question. I will then discuss some recent frequentist methods for
nonparametric inference that do yield correct coverage. The latter is
joint work with Chris Genovese. |
