You will need to get

    http://www.cs.toronto.edu/~hinton/csc321/matlab/classbp2.m
    http://www.cs.toronto.edu/~hinton/csc321/matlab/experiment.m
    http://www.cs.toronto.edu/~hinton/csc321/matlab/assign2data08.mat

You will also need the files that you should already have from Assignment 1.

First type

    load assign2data08.mat

and then type

    restart = 1;
    maxepoch = 2500;
    numhid = 60;
    epsilon = .01;
    finalmomentum = 0.8;
    weightcost = 0;
    classbp2;

You can set the variable errorprintfreq in classbp2 to make it measure the error as frequently as you want.

classbp2 differs from classbp1 in several ways. It uses momentum to speed up the learning and weightcost to keep the weights small. It also expects you to set more global variables by hand (see the code file). This makes it easier to set up experiments in which you try many different settings.

PART 1 (2 points)

Using numhid=60, maxepoch=2500 and weightcost=0, play around with epsilon and finalmomentum to find settings that make tE low after 2500 epochs. Briefly report what you discover. Include the values of epsilon and finalmomentum that work best, and say what values they produce for the test errors and the cross-entropy error.

PART 2 (2 points)

Using numhid=60, maxepoch=2500 and finalmomentum=0.8, set epsilon to a sensible value based on your experiments in Part 1, and then try various values of weightcost to see how it affects the final value of tE. You may find the file experiment.m useful, but you will have to edit it. Briefly report what you discover and include a plot of the final value of tE against weightcost.

Your report on Parts 1 and 2 combined should be NOT MORE THAN ONE PAGE long, but graphs and printouts of runs can be attached.
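To build some intuition for what epsilon, momentum and weightcost do before running the real experiments, the following toy sketch may help. It is NOT taken from classbp2.m: the quadratic error function, the target vector, the list of weightcost values and the exact form of the update rule are all assumptions chosen for illustration, and classbp2 may combine these quantities differently.

```matlab
% Toy illustration only (not classbp2.m): gradient descent with momentum
% and a weight-cost penalty on the quadratic error
%   E(w) = 0.5 * sum((w - target).^2).
target      = [3; -2];     % hypothetical minimum of the unpenalized error
epsilon     = 0.01;        % learning rate
momentum    = 0.8;         % plays the role of finalmomentum
weightcosts = [0 0.1 1];   % hypothetical values to sweep over
maxepoch    = 2500;

finalnorm = zeros(size(weightcosts));
for k = 1:numel(weightcosts)
    weightcost = weightcosts(k);
    w  = zeros(2, 1);      % weights
    dw = zeros(2, 1);      % previous weight change
    for epoch = 1:maxepoch
        % gradient of the error plus the weight-cost term
        grad = (w - target) + weightcost * w;
        % assumed update rule: momentum carries over part of the last step
        dw = momentum * dw - epsilon * grad;
        w  = w + dw;
    end
    finalnorm(k) = norm(w);
    fprintf('weightcost = %g, final |w| = %.4f\n', weightcost, finalnorm(k));
end
```

In this toy problem a larger weightcost pulls the final weights closer to zero. The same loop-over-settings pattern, with the inner loop replaced by a run of the real training script, is one way to collect the points for the plot requested in Part 2.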
PART 3 (4 points)

Copy the files:

    http://www.cs.toronto.edu/~hinton/csc321/matlab/bayeswithbest.m
    http://www.cs.toronto.edu/~hinton/csc321/matlab/makeallvecs.m
    http://www.cs.toronto.edu/~hinton/csc321/matlab/maketeacher.m
    http://www.cs.toronto.edu/~hinton/csc321/matlab/applyweights.m
    http://www.cs.toronto.edu/~hinton/csc321/matlab/hinton.m
    http://www.cs.toronto.edu/~hinton/csc321/matlab/blob.m

First type

    makeallvecs;

This makes a matrix in which each row is a possible weight vector. Then type

    maketeacher;

This makes a teacher network. Then type

    numcases = 10;
    bayeswithbest;

Figure 1 shows how well the outputs of the teacher on the training data are predicted by the Bayes-average and by the best single net.

Figure 2 shows how well the outputs of the teacher on the TEST data can be predicted by Bayes-averaging the outputs of all possible nets. "Bayes-averaging" means weighting the prediction of each net by the posterior probability of that net given the training data and the prior (which is flat in this example). The third column shows the predictions of the best single net.

Figure 3 shows a histogram of the posterior probability distribution across all 9^4 weight vectors. Notice that the posterior can be very spread out, so that even the best net gets a very small posterior probability.

Your report should be at most half a page and should describe the effects of changing the number of training cases. You should also try rerunning maketeacher to see how the results depend on the particular teacher net.
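The Bayes-averaging computation described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the code in bayeswithbest.m: the model (a single logistic unit with 4 weights), the grid of 9 values per weight, the random inputs, the binary targets and the column order of the printed table are all assumptions made for the example.

```matlab
% Sketch of Bayes-averaging over a grid of weight vectors.
% Assumed model (not from the course files): one logistic unit with
% 4 weights, each restricted to 9 hypothetical values.
vals = -2:0.5:2;                                  % 9 values per weight
[a, b, c, d] = ndgrid(vals, vals, vals, vals);
allvecs = [a(:) b(:) c(:) d(:)];                  % 9^4 rows, one net per row

numcases  = 10;
logistic  = @(z) 1 ./ (1 + exp(-z));
teacher   = allvecs(randi(size(allvecs, 1)), :)'; % random teacher from the grid
traindata = randn(numcases, 4);                   % hypothetical inputs
testdata  = randn(numcases, 4);
% binary training targets sampled from the teacher's output probabilities
traintarg = double(rand(numcases, 1) < logistic(traindata * teacher));

% Log likelihood of the training targets under every net.
p = logistic(allvecs * traindata');               % nets x cases
p = min(max(p, 1e-12), 1 - 1e-12);                % guard against log(0)
T = repmat(traintarg', size(allvecs, 1), 1);
loglik = sum(T .* log(p) + (1 - T) .* log(1 - p), 2);

% Flat prior: the posterior is the normalized likelihood.
post = exp(loglik - max(loglik));                 % subtract max for stability
post = post / sum(post);

% Predictions on the test data.
testpred  = logistic(allvecs * testdata');        % nets x cases
bayespred = post' * testpred;                     % posterior-weighted average
[bestpost, best] = max(post);
bestpred  = testpred(best, :);                    % single most probable net
teachpred = logistic(testdata * teacher)';

% Columns: teacher, Bayes-average, best single net.
disp([teachpred' bayespred' bestpred']);
fprintf('posterior probability of the best net: %.4f\n', bestpost);
```

Because the inputs and teacher are drawn at random, the printed numbers differ from run to run. Changing numcases in this sketch gives a rough feel for how the posterior concentrates as more training cases are seen.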