NIPS 2007 Tutorial on Deep Belief Nets


	Training a deep network

(the main reason RBM’s are interesting)

•

First train a layer of features that receive input directly

from the pixels.

•

Then treat the activations of the trained features as if

they were pixels and learn features of features in a

second hidden layer.

•

It can be proved that each time we add another layer of

features we improve a variational lower bound on the log

probability of the training data.

–

The proof is slightly complicated.

–

But it is based on a neat equivalence between an

RBM and a deep directed model (described later)