a) Implement in Matlab the wake-sleep algorithm described in [paper on web]. Use two hidden layers. Hand in the code of your algorithm. Include comments in the code.
b) Find a real dataset consisting of binary vectors. Since you will be doing unsupervised learning, you can either ignore the "outputs" or treat them as extra inputs. You may need to convert non-binary attributes into binary ones. A component of an input vector that can take on many different discrete values can be represented by a group of binary units, one per value, only one of which is on. One place to look for a dataset is the UCI repository. Learn on a subset of your dataset and use the remainder to assess how well the data is modelled by the wake-sleep algorithm compared with a naive model that estimates the probability of a datavector by multiplying together the separate probabilities of each of its binary components. The wake-sleep algorithm allows you to get a noisy but unbiased estimate of the net coding cost of a datavector (after you get the bits back). By getting several noisy estimates and averaging, you can get good estimates of the probabilities of test datavectors. Hand in a description of the dataset you used (including any preprocessing you did), and a detailed description of how you compared the performance of the model trained by wake-sleep with the performance of the naive model.
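The preprocessing and the naive baseline described in b) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the Matlab code to hand in. The function names and the smoothing pseudo-count `eps` are assumptions introduced here; the assignment does not prescribe them. The cost is reported in bits so that it can be set beside a coding-cost estimate from wake-sleep, provided that estimate is also expressed in bits.

```python
import math


def one_hot(value, categories):
    """Encode one discrete value as binary units, exactly one of which is on."""
    return [1 if value == c else 0 for c in categories]


def fit_naive(train, eps=1.0):
    """Estimate P(component = 1) separately for each binary component.

    eps is a hypothetical smoothing pseudo-count (an assumption, not part of
    the assignment). It keeps every estimate strictly between 0 and 1 so that
    a test vector never receives probability zero.
    """
    n = len(train)
    dim = len(train[0])
    return [(sum(v[i] for v in train) + eps) / (n + 2 * eps) for i in range(dim)]


def naive_log2_prob(v, p):
    """Log2 probability of v: the sum of the per-component log probabilities,
    which is the log of the product of the separate probabilities."""
    return sum(math.log2(p[i] if v[i] else 1.0 - p[i]) for i in range(len(v)))


def mean_naive_cost_bits(test, p):
    """Average coding cost, in bits per datavector, of the held-out set."""
    return -sum(naive_log2_prob(v, p) for v in test) / len(test)
```

For example, an attribute with values "a", "b", "c" becomes three binary units, and `one_hot("b", ["a", "b", "c"])` gives `[0, 1, 0]`. Fit `p` on the training subset only, then call `mean_naive_cost_bits` on the held-out remainder.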
a) Implement in Matlab the learning algorithm for Restricted Boltzmann Machines described in the reading [ps.gz] [pdf]. Hand in the code of your algorithm. Include comments in the code.
b) Find a real dataset that has binary input vectors from two different classes (see project 1 above for how to find a dataset). Learn a separate model of each class. Then make a plot of all the test cases using the free energy scores under each of the two models as the axes. Compare the discrimination performance on the test data with that of two naive models that estimate the probability of a datavector by multiplying together the separate probabilities of each of its binary components. You do not need to give a numerical estimate of the performance -- just hand in the plots of the test cases for each of the two approaches with some comments. Also hand in a description of the dataset you used (including any preprocessing you did).
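The free energy score used as each plot axis in b) can be sketched as follows, assuming the standard Restricted Boltzmann Machine with binary visible and hidden units. This is a Python illustration using only the standard library, not the Matlab code to hand in. The parameter layout (`W[i][j]` connecting visible unit i to hidden unit j, with bias lists `b_vis` and `b_hid`) and the function names are assumptions introduced here, and the parameters are taken to come from models that have already been trained.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def free_energy(v, W, b_vis, b_hid):
    """Free energy of a binary visible vector v under a binary RBM:

        F(v) = -sum_i b_vis[i] * v[i]
               - sum_j log(1 + exp(b_hid[j] + sum_i v[i] * W[i][j]))

    Lower free energy means higher unnormalized probability under the model.
    """
    visible_term = sum(b * x for b, x in zip(b_vis, v))
    hidden_term = 0.0
    for j in range(len(b_hid)):
        total_input = b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        hidden_term += softplus(total_input)
    return -visible_term - hidden_term


def free_energy_points(test, model_a, model_b):
    """One (F_A(v), F_B(v)) pair per test case, ready to be scatter-plotted.

    Each model is a hypothetical (W, b_vis, b_hid) tuple.
    """
    return [(free_energy(v, *model_a), free_energy(v, *model_b)) for v in test]
```

Because the two models have different partition functions, their free energies are not directly comparable as log probabilities. A sensible separating line in the plot has the form F_A(v) - F_B(v) = constant, and that constant need not be zero.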
1. It is related to the material in the course.
2. It involves programming a learning algorithm in Matlab.
3. It involves investigating the algorithm's performance on data.
4. You come and see me during office hours to discuss your plan.