CIAR Second Summer School Tutorial
Lecture 2a

Learning a Deep Belief Net

A neural network model of digit recognition

The generative model

Why it's hard to learn belief nets one layer at a time

Using complementary priors to eliminate explaining away

An example of a complementary prior

Inference in a DAG with replicated weights

A picture of the Boltzmann machine learning algorithm for an RBM

"The learning rule for a..."

Multilayer contrastive divergence

A simplified version with all hidden layers the same size

Learning a deep causal network

"Then freeze the bottom layer..."

"Then freeze the bottom two..."

Why the hidden configurations should be treated as data when learning the next layer of weights

Why greedy learning works

Back-fitting

Show the movie

Examples of correctly recognized MNIST test digits (the 49 closest calls)

How well does it discriminate on the MNIST test set with no extra information about geometric distortions?

Samples generated by running the top-level RBM with one label clamped. There are 1000 iterations of alternating Gibbs sampling between samples.

Samples generated by running the top-level RBM with one label clamped. Initialized by an up-pass from a random binary image. There are 20 iterations of alternating Gibbs sampling between samples.
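
The sampling procedure named in the two captions above can be sketched roughly as follows. This is a minimal illustration rather than the code behind the slides: the layout of the visible layer (label units first, then feature units), the function names, the tiny layer sizes and the random weights are all assumptions made for the example, and the feature units are started from random bits directly instead of from an up-pass through lower layers.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_bernoulli(probs, rng):
    # Turn each probability into a binary state.
    return [1 if rng.random() < p else 0 for p in probs]

def hidden_probs(visible, W, b_hid):
    # W[i][j] is the weight between visible unit i and hidden unit j.
    return [sigmoid(b_hid[j] + sum(visible[i] * W[i][j] for i in range(len(visible))))
            for j in range(len(b_hid))]

def visible_probs(hidden, W, b_vis):
    return [sigmoid(b_vis[i] + sum(W[i][j] * hidden[j] for j in range(len(hidden))))
            for i in range(len(b_vis))]

def gibbs_with_clamped_label(W, b_vis, b_hid, label, n_labels, n_steps, rng):
    """Alternating Gibbs sampling in one RBM while holding the label units fixed.

    Assumed (hypothetical) layout: visible = [label units | feature units].
    """
    n_vis = len(b_vis)
    one_hot = [1 if k == label else 0 for k in range(n_labels)]
    # Simplification: random initial feature states instead of an up-pass.
    visible = one_hot + [rng.randint(0, 1) for _ in range(n_vis - n_labels)]
    for _ in range(n_steps):
        hidden = sample_bernoulli(hidden_probs(visible, W, b_hid), rng)
        visible = sample_bernoulli(visible_probs(hidden, W, b_vis), rng)
        visible[:n_labels] = one_hot  # re-clamp the label after every down-pass
    return visible

# Toy usage with untrained random weights (illustrative sizes only).
rng = random.Random(0)
n_labels, n_feat, n_hid = 10, 6, 4
W = [[rng.gauss(0, 0.1) for _ in range(n_hid)] for _ in range(n_labels + n_feat)]
b_vis = [0.0] * (n_labels + n_feat)
b_hid = [0.0] * n_hid
sample = gibbs_with_clamped_label(W, b_vis, b_hid, label=3,
                                  n_labels=n_labels, n_steps=20, rng=rng)
```

With trained weights, the feature part of `sample` would then be passed down through the lower layers to produce an image; that step is left out here.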

The wake-sleep algorithm

The flaws in the wake-sleep algorithm

A contrastive divergence version of wake-sleep

Mode averaging

A different way to capture low-dimensional manifolds

Learning with realistic labels

Learning with auditory labels

Some problems with backpropagation (again!)

The obvious solution to all of these problems: Use greedy unsupervised learning to find a sensible set of weights one layer at a time. Then fine-tune with backpropagation.
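
A rough sketch of the greedy, layer-at-a-time stage is given below, assuming each layer is an RBM trained with one-step contrastive divergence (CD-1). The function names, learning rate, layer sizes and toy data are hypothetical choices for illustration, and the backpropagation fine-tuning stage is not shown.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def up_probs(lower, W, b_up):
    # W[i][j] is the weight between lower unit i and upper unit j.
    return [sigmoid(b_up[j] + sum(lower[i] * W[i][j] for i in range(len(lower))))
            for j in range(len(b_up))]

def down_probs(upper, W, b_down):
    return [sigmoid(b_down[i] + sum(W[i][j] * upper[j] for j in range(len(upper))))
            for i in range(len(b_down))]

def sample_binary(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]

def train_rbm_cd1(data, n_hid, epochs, lr, rng):
    """Train one RBM with CD-1, updating after every training case."""
    n_vis = len(data[0])
    W = [[rng.gauss(0, 0.01) for _ in range(n_hid)] for _ in range(n_vis)]
    b_vis, b_hid = [0.0] * n_vis, [0.0] * n_hid
    for _ in range(epochs):
        for v0 in data:
            p_h0 = up_probs(v0, W, b_hid)            # positive phase
            h0 = sample_binary(p_h0, rng)
            v1 = down_probs(h0, W, b_vis)            # reconstruction (probabilities)
            p_h1 = up_probs(v1, W, b_hid)            # negative phase
            for i in range(n_vis):
                for j in range(n_hid):
                    W[i][j] += lr * (v0[i] * p_h0[j] - v1[i] * p_h1[j])
            for i in range(n_vis):
                b_vis[i] += lr * (v0[i] - v1[i])
            for j in range(n_hid):
                b_hid[j] += lr * (p_h0[j] - p_h1[j])
    return W, b_vis, b_hid

def greedy_pretrain(data, layer_sizes, epochs, lr, rng):
    """Stack RBMs: train one, freeze it, use its hidden states as the next data."""
    layers, inputs = [], data
    for n_hid in layer_sizes:
        W, b_vis, b_hid = train_rbm_cd1(inputs, n_hid, epochs, lr, rng)
        layers.append((W, b_vis, b_hid))
        # The frozen layer's sampled hidden configurations become training data.
        inputs = [sample_binary(up_probs(v, W, b_hid), rng) for v in inputs]
    return layers

# Toy usage: four 6-bit patterns, two hidden layers of sizes 3 and 2.
rng = random.Random(0)
data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1],
        [1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1]]
layers = greedy_pretrain(data, layer_sizes=[3, 2], epochs=5, lr=0.1, rng=rng)
```

The weights in `layers` would then serve as the starting point for a feed-forward network that is fine-tuned with backpropagation.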

Modelling the distribution of digit images

Results on permutation-invariant MNIST task