PRELIMINARIES

From the archive for assignment 2:
http://www.cs.toronto.edu/~hinton/csc321/matlab/assignment2.tar.gz
copy the files assign2data2012.mat and classbp2.m. Recall that the former file contains 150 training cases and 3000 test cases of digit images. The latter implements a feedforward neural network and the backpropagation algorithm for learning the network's weights from the training data.

Also obtain:
http://www.cs.toronto.edu/~hinton/csc321/matlab/train_nn.m

Run train_nn.m in MATLAB to train the neural network on the 150 training cases and test it on the 3000 "test" cases. How many errors do you see at the end of the run?

ASSIGNMENT 4

Obtain the files listed below. You will use them to train an RBM that will be used to initialize (pretrain) the neural network for predicting digits.

http://www.cs.toronto.edu/~hinton/csc321/matlab/unlabeled.mat
http://www.cs.toronto.edu/~hinton/csc321/matlab/rbmfun.m
http://www.cs.toronto.edu/~hinton/csc321/matlab/showrbmweights.m

The first file contains the unlabeled training data; load it with the command "load unlabeled" in MATLAB. The second contains code to train an RBM, and the third lets you visualize the weights of the RBM.

The point of the assignment is to figure out how to use the 2000 unlabeled cases and the function rbmfun to do better on the "test" set (which should really be called a validation set, since you use it many times for deciding things like the number of hidden units). A sketch of the intended pretraining workflow is given at the end of this handout.

1. (5 points) Using results from running rbmfun on the unlabeled data, modify classbp2 in a way that allows you to get a best test error of less than 505 in at least 5 runs out of 10. What exact changes did you make? List the variables and the values you assigned to them that allowed you to get these results. If you cannot achieve the desired error rate, give the exact details of the best settings you could find. Also report the error you get with the same settings for classbp2 but without using rbmfun and the unlabeled data.

2. (3 points) Say how you think the use of the unlabeled data influences the number of hidden units that you should use. Report some evidence for your opinion.

3. (2 points) Using the same parameters as in train_nn.m (other than maxepoch), run your experiment by training the RBM with rbmfun for 50, 75, and 100 epochs, and report the best test set error for each case (averaged over three runs). On the basis of these numbers, what can you say about the effect of the number of epochs of RBM training on the final test error of the neural network?

Your report should not be more than three pages long (including figures) and should not contain more than two pages of text. One page of text is quite sufficient.
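
For reference, here is a minimal MATLAB sketch of the pretraining idea. It is NOT the course code: it assumes the unlabeled file provides a cases-by-pixels matrix with values in [0,1] (called "data" here; the actual variable name inside unlabeled.mat may differ), and the inline CD-1 loop merely stands in for whatever rbmfun does, with biases omitted for brevity. The numbers of hidden units, epochs, and the learning rate are illustrative choices, not prescribed values.

load unlabeled                              % unlabeled digit images (variable name "data" is an assumption)
numhid  = 100;                              % number of hidden units (a tunable choice)
[numcases, numdims] = size(data);
epsilon = 0.05;                             % learning rate
vishid  = 0.01 * randn(numdims, numhid);    % visible-to-hidden weights
for epoch = 1:50
  for c = 1:numcases
    v0 = data(c, :);                                    % one visible data vector
    h0 = 1 ./ (1 + exp(-v0 * vishid));                  % hidden probabilities given the data
    hs = double(h0 > rand(1, numhid));                  % sampled binary hidden states
    v1 = 1 ./ (1 + exp(-hs * vishid'));                 % reconstruction of the visibles
    h1 = 1 ./ (1 + exp(-v1 * vishid));                  % hidden probabilities given the reconstruction
    vishid = vishid + epsilon * (v0' * h0 - v1' * h1);  % CD-1 weight update
  end
end
% The learned vishid would then replace the random input-to-hidden weight
% initialization inside classbp2 before running backpropagation on the 150
% labelled training cases; if rbmfun also returns hidden biases, those can
% initialize the hidden-unit biases in the same way.

The only design point that matters here is that the RBM's learned weights, rather than small random numbers, become the starting point for backpropagation; everything else in classbp2 stays the same.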