the next generation of neural networks


First, model the distribution of digit images


The top two layers form a restricted
Boltzmann machine whose free-energy
landscape should model the
low-dimensional manifolds of the digits.
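
To make "free-energy landscape" concrete, here is a minimal sketch of the standard free energy of a binary RBM, F(v) = -v.b_vis - sum_j log(1 + exp(b_hid_j + (vW)_j)). It assumes binary units and uses the 500 and 2000 layer sizes from the slide. The function name, the variable names, and the random weights are illustrative assumptions, not the trained model.

```python
import numpy as np

def free_energy(v, W, b_vis, b_hid):
    """Free energy of a binary RBM, one value per row of v.

    Low free energy corresponds to high probability under the model.
    """
    pre = v @ W + b_hid
    # np.logaddexp(0, x) is a numerically stable log(1 + exp(x)).
    return -(v @ b_vis) - np.logaddexp(0.0, pre).sum(axis=1)

rng = np.random.default_rng(0)
n_vis, n_hid = 500, 2000          # layer sizes taken from the slide
# Assumption: small random weights stand in for learned ones.
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

# Four random binary states of the 500-unit layer.
v = (rng.random((4, n_vis)) < 0.5).astype(float)
F = free_energy(v, W, b_vis, b_hid)

# Sanity check: with all-zero weights and biases every state has
# the same free energy, -n_hid * log(2).
F_flat = free_energy(v, np.zeros_like(W), b_vis, b_hid)
```

In a trained model, states of the 500-unit layer that correspond to plausible digits would sit in the valleys of this function.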

[Figure: top of the network, a layer of 2000 units above a layer of 500 units]


The network learns a density model for
unlabeled digit images. When we generate
from the model we often get things that look
like real digits of all classes.
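
One common way to generate from a stack like this is to run alternating Gibbs sampling in the top-level RBM and then make a single top-down pass to the pixels. The sketch below assumes that procedure, logistic units, and untrained random weights. All names (`generate`, `W_top`, `down_weights`, and so on) are hypothetical, so the output is noise, not a digit.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(p, rng):
    # Draw a binary state for each unit with probability p of being on.
    return (rng.random(p.shape) < p).astype(float)

def generate(W_top, b_pen, b_top, down_weights, down_biases, n_gibbs, rng):
    # Start the top-level RBM from a random state of its lower layer.
    v = sample_bernoulli(np.full(W_top.shape[0], 0.5), rng)
    for _ in range(n_gibbs):
        # Alternate between the 2000-unit and 500-unit layers.
        h = sample_bernoulli(sigmoid(v @ W_top + b_top), rng)
        v = sample_bernoulli(sigmoid(h @ W_top.T + b_pen), rng)
    # Single top-down pass, keeping probabilities instead of samples.
    x = v
    for W, b in zip(down_weights, down_biases):
        x = sigmoid(x @ W + b)
    return x.reshape(28, 28)

rng = np.random.default_rng(0)
# Assumption: random weights with the layer sizes from the slide.
W_top = 0.01 * rng.standard_normal((500, 2000))
b_pen, b_top = np.zeros(500), np.zeros(2000)
down_weights = [0.01 * rng.standard_normal((500, 500)),
                0.01 * rng.standard_normal((500, 784))]
down_biases = [np.zeros(500), np.zeros(784)]

img = generate(W_top, b_pen, b_top, down_weights, down_biases,
               n_gibbs=10, rng=rng)
```

With learned weights, the Gibbs chain would settle into low free-energy regions of the top RBM before the top-down pass renders them as pixels.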

But do the hidden features really help with
digit discrimination?

Add a layer of 10 softmax units on top and run
backpropagation. This gives an error rate of 1.15%.
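
The discriminative step can be sketched as follows: put a 10-way softmax on top of the 784-500-500-2000 stack and run gradient descent on the cross-entropy loss. This is a minimal illustration under stated assumptions: logistic hidden units, plain gradient descent, random data, and random initial weights standing in for the weights learned on unlabeled images. The function names and the learning rate are hypothetical. It does not reproduce the 1.15% result.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, weights, biases):
    acts = [x]
    for W, b in zip(weights[:-1], biases[:-1]):
        acts.append(sigmoid(acts[-1] @ W + b))
    probs = softmax(acts[-1] @ weights[-1] + biases[-1])
    return acts, probs

def backprop_step(x, labels, weights, biases, lr):
    """One gradient-descent step; returns the loss before the update."""
    acts, probs = forward(x, weights, biases)
    n = x.shape[0]
    loss = -np.log(probs[np.arange(n), labels]).mean()
    # Gradient of the mean cross-entropy w.r.t. the softmax inputs.
    delta = probs.copy()
    delta[np.arange(n), labels] -= 1.0
    delta /= n
    for i in reversed(range(len(weights))):
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Propagate through the logistic units before updating W.
            delta = (delta @ weights[i].T) * acts[i] * (1.0 - acts[i])
        weights[i] -= lr * grad_W
        biases[i] -= lr * grad_b
    return loss

rng = np.random.default_rng(0)
sizes = [784, 500, 500, 2000, 10]   # slide's stack plus 10 softmax units
weights = [0.01 * rng.standard_normal((a, b))
           for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

x = rng.random((8, 784))                 # stand-in for 8 digit images
labels = rng.integers(0, 10, size=8)     # stand-in class labels
losses = [backprop_step(x, labels, weights, biases, lr=0.002)
          for _ in range(5)]
_, probs = forward(x, weights, biases)
```

Starting backpropagation from features learned on unlabeled images, instead of from random weights as here, is what lets this step test whether those features help discrimination.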

[Figure: bottom of the network, a layer of 500 units above the 28 x 28 pixel image]