Learning Deep Belief Nets
Deep Belief Nets (DBNs) will be explained in the lecture on Oct
29. Instead of learning layers of features by backpropagating errors,
they learn one layer at a time by trying to build a generative model
of the data or of the activities of the feature detectors in the layer
below. After they have learned features in this way, they can be
fine-tuned with backpropagation. Their main advantage is that they can
learn the layers of features from large sets of unlabelled data.
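The layer-by-layer procedure described above can be sketched as follows. This is only a minimal NumPy illustration, assuming each layer is a binary restricted Boltzmann machine trained with one-step contrastive divergence (CD-1) on the full batch. That choice, the function names `train_rbm` and `pretrain_dbn`, and all default values are assumptions made for illustration and are not taken from the course code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.1, seed=0):
    """Train one binary RBM with CD-1 (illustrative, full-batch updates)."""
    rng = np.random.default_rng(seed)
    n_cases, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden probabilities driven by the data.
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: reconstruct the visibles, then the hiddens.
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n_cases
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_hid

def pretrain_dbn(data, layer_sizes, **rbm_kwargs):
    """Greedy pre-training: each RBM models the activities of the layer below."""
    layers = []
    activities = data
    for n_hidden in layer_sizes:
        W, b_hid = train_rbm(activities, n_hidden, **rbm_kwargs)
        layers.append((W, b_hid))
        # The hidden probabilities become the "data" for the next layer.
        activities = sigmoid(activities @ W + b_hid)
    return layers, activities
```

Note that no labels appear anywhere in this sketch, which is why the pre-training stage can use unlabelled data. Fine-tuning with backpropagation would start from the weights in `layers`.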
The data and the code
For this project you should use the MNIST dataset. The data and the
code for training DBNs are available here.
You want the code for classifying images, not the code for deep
The main point of the project
You should investigate how the relative performance of three learning
methods changes as you change the relative amounts of labelled and
unlabelled data. The three methods are DBNs,
SVMlite, and SVMlite applied to the features learned by DBNs. SVMlite
is explained in several places on the web. We will add more
information soon on the easiest way to use SVMlite with Matlab for
multiclass classification (as opposed to two-class classification,
which is easier).
Support vector machines will be explained in the lecture on Nov 12,
but you don't need to understand much about them to run SVMlite.
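One common way to get multiclass classification out of a two-class SVM is one-versus-rest: train one classifier per digit and pick the digit whose classifier gives the largest decision value. The sketch below assumes that approach (it is not necessarily the one the course will recommend) and shows how examples could be written in the sparse text format that SVMlite reads, where each line is a target of +1 or -1 followed by `index:value` pairs with indices starting at 1. The helper names are hypothetical.

```python
import numpy as np

def to_svmlight_lines(features, labels, positive_class):
    """Format examples as two-class lines: positive_class versus the rest."""
    lines = []
    for x, y in zip(features, labels):
        target = "+1" if y == positive_class else "-1"
        # Feature indices are 1-based and increasing; zero values are omitted.
        pairs = [f"{j + 1}:{v:.6g}" for j, v in enumerate(x) if v != 0]
        lines.append(" ".join([target] + pairs))
    return lines

def one_vs_rest_predict(decision_values):
    """decision_values[i, c] is the output of classifier c on example i."""
    return np.argmax(decision_values, axis=1)
```

The same function can be applied either to raw pixel intensities or to the top-layer features of a pre-trained DBN, which covers the second and third methods being compared.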
Choosing the data
You have to decide how much labelled and unlabelled data to use.
Using small amounts of labelled data is a good place to start because
it makes the supervised learning fast and it gives a high error rate,
which makes differences between the methods easier to see.
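A labelled/unlabelled split of the training set could be drawn as in the sketch below. It assumes you want the same number of labelled examples per class and a disjoint unlabelled pool; both are design choices made for illustration, and the function name and parameters are hypothetical.

```python
import numpy as np

def split_labelled_unlabelled(labels, n_labelled_per_class, n_unlabelled, seed=0):
    """Return index arrays for a class-balanced labelled set and an unlabelled pool."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    labelled_idx = []
    for c in np.unique(labels):
        candidates = np.flatnonzero(labels == c)
        chosen = rng.choice(candidates, size=n_labelled_per_class, replace=False)
        labelled_idx.extend(chosen.tolist())
    labelled_idx = np.array(sorted(labelled_idx))
    # Everything not chosen as labelled is eligible for the unlabelled pool.
    remaining = np.setdiff1d(np.arange(len(labels)), labelled_idx)
    unlabelled_idx = np.sort(rng.choice(remaining, size=n_unlabelled, replace=False))
    return labelled_idx, unlabelled_idx
```

Fixing `seed` keeps the split identical across the three methods, so that differences in error rate come from the methods and not from the sample.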
Designing the Deep Belief Network
You have to decide how many hidden layers the DBN should have, how
many units each layer should have, and how long the pre-training
should be. Because this project does not involve writing your own code
from scratch, we will expect you to perform sensible experiments to
choose these numbers (and the numbers of labelled and unlabelled
examples), and the project will be evaluated on how well you do this.
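One way to organise such experiments is a small grid search over candidate settings, scored on held-out data. The sketch below assumes you supply your own `validation_error` function that trains a DBN for a given configuration and returns its error on a validation set kept separate from the test set. That function, the configuration keys, and the use of equal-width layers are all hypothetical.

```python
from itertools import product

def select_configuration(layer_counts, layer_widths, epoch_options, validation_error):
    """Try every combination and return (best_config, best_error)."""
    best = None
    for n_layers, width, epochs in product(layer_counts, layer_widths, epoch_options):
        config = {"layer_sizes": [width] * n_layers, "epochs": epochs}
        err = validation_error(config)
        if best is None or err < best[1]:
            best = (config, err)
    return best
```

Recording the error for every configuration, not just the best one, makes it easier to report how sensitive the results are to each choice.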
Extra work for teams
If you are working as a pair, you should also compare the three
methods above with a single feedforward network trained with
backpropagation on the labelled data.