lec2a

Why greedy learning works

• Each time we learn a new layer, the inference at the layer below becomes incorrect, but the variational bound on the log prob of the data improves, provided we start the learning from the tied weights that implement the complementary prior.

• Now that we have a guarantee we can loosen the restrictions and still feel confident.

    – Allow layers to vary in size.
    – Do not start the learning at each layer from the weights in the layer below.
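
The loosened procedure can be sketched as a stack of restricted Boltzmann machines trained one layer at a time. This is a minimal illustration, not the lecture's own code. It assumes binary units, one-step contrastive divergence (CD-1) as the learning rule, and hidden probabilities (rather than samples) passed up as data for the next layer. The names `RBM`, `cd1_step`, and `train_greedy` are hypothetical. Both relaxations appear: the hidden layers have different sizes, and each layer starts from small random weights rather than from the weights of the layer below. The sketch does not compute the variational bound, and with these relaxations the improvement is no longer strictly guaranteed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probs, rng):
    # Draw binary states from independent Bernoulli probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]


class RBM:
    def __init__(self, n_vis, n_hid, rng):
        self.rng = rng
        # Small random initial weights: NOT tied to the layer below
        # (this is the second relaxation on the slide).
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hid)]
                  for _ in range(n_vis)]
        self.b_vis = [0.0] * n_vis
        self.b_hid = [0.0] * n_hid

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j]
                        + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i]
                        + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_step(self, v0, lr):
        # One CD-1 update (an assumed learning rule, used here for brevity).
        h0 = self.hidden_probs(v0)
        v1 = self.visible_probs(sample(h0, self.rng))
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def train_greedy(data, hidden_sizes, epochs=5, lr=0.1, seed=0):
    rng = random.Random(seed)
    layers = []
    inputs = data
    for n_hid in hidden_sizes:  # sizes may differ from layer to layer
        rbm = RBM(len(inputs[0]), n_hid, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_step(v, lr)
        layers.append(rbm)
        # Freeze this layer; its hidden activities become the "data"
        # for the next layer up.
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers


# Toy data: four binary vectors with two repeated patterns.
data = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
stack = train_greedy(data, hidden_sizes=[3, 5])
```

Here `stack[0]` maps 4 visible units to 3 hidden units, and `stack[1]` maps those 3 units to 5, so the layer sizes vary freely while each layer is still learned greedily on top of the frozen layer below.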