Summary so far
Restricted Boltzmann Machines provide a simple way to
learn a layer of features without any supervision.
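
As a rough illustration (not part of the original slides), the sketch below assumes NumPy and hypothetical parameters W and b_h; it only shows how an RBM's hidden units give one layer of stochastic binary features for a data vector, with no labels involved.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def hidden_features(v, W, b_h):
        # p(h_j = 1 | v): probability that each hidden feature unit turns on
        return sigmoid(v @ W + b_h)

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(784, 500))     # e.g. 784 pixels -> 500 features
    b_h = np.zeros(500)
    v = rng.integers(0, 2, size=784).astype(float)  # one binary "image"
    h = hidden_features(v, W, b_h)                  # a layer of features, no labels used
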
Maximum likelihood learning is computationally
expensive because of the normalization term (the partition function), but
contrastive divergence learning is fast and usually
works well.
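
A minimal sketch of one CD-1 update, reusing the NumPy sigmoid helper above (all names are mine): a single Gibbs step supplies the "negative" statistics that maximum likelihood would otherwise have to get from the full model distribution.

    def cd1_update(v0, W, b_v, b_h, lr=0.1, rng=None):
        rng = rng or np.random.default_rng()
        # Positive phase: hidden probabilities driven by the data.
        h0 = sigmoid(v0 @ W + b_h)
        h0_s = (rng.random(h0.shape) < h0).astype(float)
        # One Gibbs step: reconstruct the visibles, then re-infer the hiddens.
        v1 = sigmoid(h0_s @ W.T + b_v)
        h1 = sigmoid(v1 @ W + b_h)
        # CD-1 gradient estimate: data statistics minus reconstruction statistics.
        W += lr * (np.outer(v0, h0) - np.outer(v1, h1))
        b_v += lr * (v0 - v1)
        b_h += lr * (h0 - h1)
        return W, b_v, b_h
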
Many layers of representation can be learned by treating
the hidden states of one RBM as the visible data for
training the next RBM (a composition of experts).
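
A hedged sketch of the greedy layer-by-layer stacking, reusing cd1_update above; train_rbm and train_stack are hypothetical helpers, not code from the lecture.

    def train_rbm(data, n_hidden, epochs=10, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        n_visible = data.shape[1]
        W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
        b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
        for _ in range(epochs):
            for v0 in data:                    # plain SGD, one case at a time
                cd1_update(v0, W, b_v, b_h, lr, rng)
        return W, b_v, b_h

    def train_stack(data, layer_sizes):
        rbms, layer_input = [], data
        for n_hidden in layer_sizes:
            W, b_v, b_h = train_rbm(layer_input, n_hidden)
            rbms.append((W, b_v, b_h))
            # Hidden activities of this RBM become the "visible" data for the next one.
            layer_input = sigmoid(layer_input @ W + b_h)
        return rbms

    # e.g. rbms = train_stack(images, layer_sizes=[500, 500, 2000])
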
This creates good generative models that can then be
fine-tuned.
Contrastive wake-sleep can fine-tune generation.
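
A compressed, heavily simplified sketch of one contrastive wake-sleep (up-down) step for a small deep belief net with one directed layer under a top-level RBM. It reuses sigmoid and cd1_update above; updown_step and the names R1 (recognition weights), G1 (generative weights), and W2 (top RBM) are mine, and details such as initializing from the stacked RBMs and the exact top-level negative phase are omitted.

    def updown_step(v, R1, rb1, G1, gb_v, W2, b1, b2, lr=0.01, rng=None):
        rng = rng or np.random.default_rng()
        samp = lambda p: (rng.random(p.shape) < p).astype(float)
        # Up (wake) pass: recognition weights infer features from the data ...
        h1 = samp(sigmoid(v @ R1 + rb1))
        # ... and the generative weights are nudged to reconstruct the data.
        v_pred = sigmoid(h1 @ G1 + gb_v)
        G1 += lr * np.outer(h1, v - v_pred); gb_v += lr * (v - v_pred)
        # Top-level RBM is trained with ordinary CD-1 on the inferred features.
        cd1_update(h1, W2, b1, b2, lr, rng)
        # Down (sleep) pass: generate a fantasy from the top level downwards ...
        h2 = samp(sigmoid(h1 @ W2 + b2))
        h1_gen = samp(sigmoid(h2 @ W2.T + b1))
        v_gen = samp(sigmoid(h1_gen @ G1 + gb_v))
        # ... and the recognition weights are nudged to infer its hidden causes.
        h1_rec = sigmoid(v_gen @ R1 + rb1)
        R1 += lr * np.outer(v_gen, h1_gen - h1_rec); rb1 += lr * (h1_gen - h1_rec)
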