lecnew21

Training a deep network

•

First train a layer of features that receive input directly

from the pixels.

•

Then treat the activations of the trained features as if

they were pixels and learn features of features in a

second hidden layer.

•

It can be proved that each time we add another layer of

features we get a better model of the set of training

images.

–

i.e. we assign lower energy to the real data and

higher energy to all other possible images.

–

The proof is complicated. It uses variational free

energy, a method that physicists use for analyzing

complicated non-equilibrium systems.

–

But it is based on a neat equivalence.