In July 2011 I moved to Harvard University to join the School of Engineering and Applied Sciences. My new web page is here.

Tree-Structured Stick Breaking for Hierarchical Data

In Advances in Neural Information Processing Systems (NIPS) 23

Ryan Prescott Adams, Zoubin Ghahramani, and Michael I. Jordan

Many data are naturally modeled by an unobserved hierarchical structure. In this paper we propose a flexible nonparametric prior over unknown data hierarchies. The approach uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable. One can view our model as providing infinite mixtures where the components have a dependency structure corresponding to an evolutionary diffusion down a tree. By using a stick-breaking approach, we can apply Markov chain Monte Carlo methods based on slice sampling to perform Bayesian inference and simulate from the posterior distribution on trees. We apply our method to hierarchical clustering of images and topic modeling of text data.
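
To make the construction concrete, the following is a minimal NumPy sketch that forward-samples node assignments from a tree-structured stick-breaking prior. The class and variable names are my own, and the depth-decaying stopping rate alpha(d) = alpha0 * lam**d is one illustrative choice; this is a sketch of the generative process, not the released code linked below.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters (names are mine, not from the paper's code):
alpha0, lam = 1.0, 0.5   # stopping rate decays with depth: alpha(d) = alpha0 * lam**d
gamma = 1.0              # concentration of the branch-selection sticks

class Node:
    def __init__(self, depth):
        self.depth = depth
        # nu: probability that a datum stops here, given that it reached here.
        self.nu = rng.beta(1.0, alpha0 * lam**depth)
        self.psis = []       # Beta(1, gamma) sticks over children, drawn lazily
        self.children = []

    def child(self, j):
        # Lazily instantiate children (and their sticks) out to index j.
        while len(self.children) <= j:
            self.psis.append(rng.beta(1.0, gamma))
            self.children.append(Node(self.depth + 1))
        return self.children[j]

def sample_path(root):
    # Walk down the tree: stop at each node with probability nu; otherwise
    # pick a child by stick-breaking over an unbounded set of siblings.
    node, path = root, ()
    while rng.random() >= node.nu:
        j = 0
        node.child(j)
        while rng.random() >= node.psis[j]:
            j += 1
            node.child(j)
        path += (j,)
        node = node.children[j]
    return path

root = Node(depth=0)
print([sample_path(root) for _ in range(10)])
# e.g. [(), (0,), (1, 0), ...] -- note that data can live at internal nodes.

Because sticks and children are instantiated lazily, only the finitely many nodes a sampler actually visits are ever represented; this is the same property that makes slice-sampling MCMC tractable on a tree of unbounded width and depth.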

This paper was presented as a plenary oral at NIPS 2010.

pdf | ps | bibtex | arXiv:1006.1062v1 [stat.ME] | code


CIFAR Image Data Sets

These data were gathered by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. Alex maintains the authoritative page about them, and the correct citation is:

Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, Department of Computer Science, University of Toronto, 2009.

For the visualization in the NIPS paper, I used the raw images as PNG files, which you can grab as tarballs:

I used 256-dimensional binary features, which were also Alex's work and should be cited as such. I provide them below for convenience; they are simply lists of integers whose binary representations are the feature vectors.
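
If it is helpful, here is a minimal sketch of decoding such integers back into 256-dimensional bit vectors. The file name and the bit ordering (most-significant bit first) are assumptions for illustration; check them against the actual files.

import numpy as np

def int_to_bits(x, n_bits=256):
    # Expand one arbitrary-precision Python integer into its n_bits-wide
    # binary vector, most-significant bit first (assumed ordering).
    return np.array([(x >> i) & 1 for i in reversed(range(n_bits))],
                    dtype=np.uint8)

# Hypothetical usage, assuming one integer per line in the feature file:
# with open("cifar_binary_features.txt") as f:
#     feats = np.stack([int_to_bits(int(line)) for line in f])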


Frequently Asked Questions