Defining Priors for Distributions Using Dirichlet Diffusion Trees

Radford M. Neal, Dept. of Statistics and Dept. of Computer Science, University of Toronto

I introduce a family of prior distributions over univariate or multivariate distributions, based on the use of a "Dirichlet diffusion tree" to generate exchangeable data sets. These priors can be viewed as generalizations of Dirichlet processes and of Dirichlet process mixtures. They are potentially of general use for modeling unknown distributions, either of observed data or of latent values. Unlike simple mixture models, Dirichlet diffusion tree priors can capture the hierarchical structure that is present in many distributions. Depending on the "divergence function" employed, a Dirichlet diffusion tree prior can produce discrete or continuous distributions. Empirical evidence is presented that some divergence functions produce distributions that are absolutely continuous, while others produce distributions that are continuous but not absolutely continuous. Although Dirichlet diffusion trees are defined in terms of a continuous-time stochastic process, inference for finite data sets can be expressed in terms of finite-dimensional quantities, which should allow computations to be performed by reasonably efficient Markov chain Monte Carlo methods.
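To make the generative idea concrete, here is a minimal sketch of drawing a one-dimensional data set from a Dirichlet diffusion tree. It is not the software used for the report. It assumes the divergence function a(t) = c/(1-t), Brownian diffusion with variance sigma^2 per unit time on the interval [0, 1], and the usual rule that a new point diverges from a path previously traversed by m points at rate a(t)/m. The function name `sample_ddt`, its parameters, and the dictionary-based tree layout are hypothetical choices made only for illustration.

```python
import math
import random


def sample_ddt(n, c=1.0, sigma=1.0, seed=0):
    """Draw n >= 1 points from a 1-D Dirichlet diffusion tree (illustrative sketch).

    Assumes divergence function a(t) = c / (1 - t) and Brownian motion with
    variance sigma**2 per unit time. Returns (points, root).
    """
    rng = random.Random(seed)

    def make_node(t, x, count, children):
        # count = number of points whose paths have passed through this node
        return {"t": t, "x": x, "count": count, "children": children}

    def make_leaf(t, x):
        # Diffuse independently from (t, x) to time 1: variance sigma^2 * (1 - t).
        return make_node(1.0, rng.gauss(x, sigma * math.sqrt(1.0 - t)), 1, [])

    # The first point simply diffuses from the origin over [0, 1].
    root = make_node(0.0, 0.0, 1, [make_leaf(0.0, 0.0)])
    points = [root["children"][0]["x"]]

    for _ in range(n - 1):
        root["count"] += 1
        parent, i = root, 0
        while True:
            child = parent["children"][i]
            m = child["count"]  # earlier points that traversed this segment
            ta, tb = parent["t"], child["t"]
            # With rate a(t)/m, P(no divergence on [ta, t]) is
            # ((1 - t) / (1 - ta)) ** (c / m); invert it with an Exp(1) draw.
            td = 1.0 - (1.0 - ta) * math.exp(-m * rng.expovariate(1.0) / c)
            if td < tb or not child["children"]:
                # Diverge at td; the location there is a Brownian bridge draw
                # between the two ends of the existing segment.
                span = tb - ta
                if span > 0.0:
                    mean = parent["x"] + (td - ta) / span * (child["x"] - parent["x"])
                    var = max(0.0, (td - ta) * (tb - td) / span)
                    xd = rng.gauss(mean, sigma * math.sqrt(var))
                else:
                    xd = parent["x"]
                leaf = make_leaf(td, xd)
                parent["children"][i] = make_node(td, xd, m + 1, [child, leaf])
                points.append(leaf["x"])
                break
            # No divergence on this segment: pass through the branch point and
            # pick a branch with probability proportional to earlier traffic.
            child["count"] = m + 1
            weights = [k["count"] for k in child["children"]]
            parent = child
            i = rng.choices(range(len(weights)), weights=weights)[0]
    return points, root
```

Calling `sample_ddt(100, c=2.0)` returns 100 values together with the tree that generated them. Under the assumed divergence function, larger values of `c` push divergence times earlier, so the points behave more like independent draws, while smaller values keep paths together longer and give tighter clusters.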

Technical Report No. 0104, Dept. of Statistics, University of Toronto (March 2001), 25 pages.

The results in this paper were produced using software available on-line.

Associated references: Parts (but not all) of this technical report became part of the following paper:
Neal, R. M. (2003) "Density modeling and clustering using Dirichlet diffusion trees", in J. M. Bernardo, et al. (editors) Bayesian Statistics 7, pp. 619-629.