SML310 Mini-Project 4: PyTorch and Transfer Learning

Overview

For this project, you will work with image data, use more advanced features of PyTorch, and explore overfitting.

Data: Actor Faces

You will be working with a subset of the FaceScrub dataset. You will also get more practice working with Python and using functions from Python's libraries (Google for help when necessary). I downloaded the images and cropped out the actors' faces for you. The images are in data.zip.

For this project, you will be building a face classifier and exploring overfitting.

Part 1: Read in the images (4 pts)

Read in all the images in data/cropped64 into a matrix, and display a few photos of each actor in your report.

To list the files in a folder, you can use os.listdir. A tutorial on working with images is available here. To display several images in a grid, use subplot.

Are the images registered? (I.e., can you overlay them on top of each other and have features such as the eyes, the nose, and the mouth be aligned?)
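As a starting point, here is a minimal sketch of the read-and-display step. It assumes the images are 64x64 color files in data/cropped64 and that each filename begins with the actor's name followed by a number; adjust the label parsing to the actual naming scheme.

    # Read every image in data/cropped64 into one array and show the first
    # few in a grid. Assumes all images are 64x64 color files named like
    # "name123.jpg" (adjust the parsing if the naming differs).
    import os
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.image import imread

    img_dir = "data/cropped64"
    filenames = sorted(f for f in os.listdir(img_dir)
                       if f.endswith((".jpg", ".png")))

    images = np.array([imread(os.path.join(img_dir, f)) for f in filenames])
    labels = [os.path.splitext(f)[0].rstrip("0123456789") for f in filenames]

    fig, axes = plt.subplots(2, 5, figsize=(10, 4))
    for ax, img, name in zip(axes.flat, images, labels):
        ax.imshow(img)
        ax.set_title(name)
        ax.axis("off")
    plt.show()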

Part 2: Convert the images to grayscale (1 pt)

Convert the color images you read in to grayscale. In your report, explain how you did that, and include an example of an image that was converted to grayscale using your code.
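One common way to do the conversion is a weighted sum of the R, G, B channels (the ITU-R 601 luma coefficients). A sketch, assuming images is the (N, 64, 64, 3) array from Part 1:

    # Weighted sum of the R, G, B channels (ITU-R 601 luma coefficients).
    # Assumes `images` has shape (N, 64, 64, 3).
    import numpy as np

    def rgb_to_gray(images):
        return np.dot(images[..., :3], [0.299, 0.587, 0.114])

    gray = rgb_to_gray(images)            # shape (N, 64, 64)
    # plt.imshow(gray[0], cmap="gray") shows one converted example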

Part 3: Logistic regression (10 pts)

Build a logistic regression classifier for the images. Plot the learning curves for a training and validation set, and report the performance on the test set.
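In PyTorch, multinomial logistic regression is just a single linear layer trained with the cross-entropy loss. A minimal sketch, assuming X_train/X_val are flattened (N, 4096) arrays scaled to [0, 1] and y_train/y_val are integer labels (the variable names and hyperparameters here are placeholders):

    # Multinomial logistic regression = one nn.Linear + cross-entropy loss.
    import torch
    import torch.nn as nn

    n_classes = 6                                  # set to your number of actors
    model = nn.Linear(64 * 64, n_classes)
    loss_fn = nn.CrossEntropyLoss()                # applies softmax internally
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    X_tr = torch.tensor(X_train, dtype=torch.float32)
    y_tr = torch.tensor(y_train, dtype=torch.long)
    X_va = torch.tensor(X_val, dtype=torch.float32)
    y_va = torch.tensor(y_val, dtype=torch.long)

    train_acc, val_acc = [], []
    for epoch in range(500):
        optimizer.zero_grad()
        loss = loss_fn(model(X_tr), y_tr)
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            train_acc.append((model(X_tr).argmax(1) == y_tr).float().mean().item())
            val_acc.append((model(X_va).argmax(1) == y_va).float().mean().item())
    # plotting train_acc and val_acc against epoch gives the learning curves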

Part 4: Visualization (10 pts)

Display the weights associated with each actor that you obtained when training your logistic regression. Report any observations that you made.
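Since each class's weight vector has one entry per pixel, it can be reshaped back into a 64x64 image. A sketch, assuming model is the linear classifier from Part 3 and actor_names (a placeholder) lists the classes in label order:

    # Reshape each class's 4096-dimensional weight vector into a 64x64 image.
    import matplotlib.pyplot as plt

    W = model.weight.detach().numpy()              # shape (n_classes, 4096)
    fig, axes = plt.subplots(1, len(W), figsize=(3 * len(W), 3))
    for ax, w, name in zip(axes, W, actor_names):
        ax.imshow(w.reshape(64, 64), cmap="RdBu")
        ax.set_title(name)
        ax.axis("off")
    plt.show()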

Part 5: One-hidden-layer Neural Network (10 pts)

Train a one-hidden-layer neural network to classify the faces. Plot the learning curves for the training and validation set, and report the performance on the test set. Use 30 hidden units. Compare the performance you got here to the performance you got in Part 3.
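A sketch of such a network, which can reuse the training loop from Part 3 with this model in place of the single linear layer (the choice of ReLU here is an assumption; tanh or sigmoid would also be reasonable):

    # One hidden layer with 30 units.
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(64 * 64, 30),    # input pixels -> 30 hidden units
        nn.ReLU(),                 # activation
        nn.Linear(30, n_classes),  # hidden units -> class scores
    )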

Part 6: Visualization and interpretation (10 pts)

Visualize the weights connecting the inputs to the hidden units. (Visualize all 30 sets of weights, and display them using subplot.)
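A sketch, assuming model is the nn.Sequential network from Part 5:

    # Show all 30 input-to-hidden weight vectors as 64x64 images.
    import matplotlib.pyplot as plt

    W = model[0].weight.detach().numpy()           # shape (30, 4096)
    fig, axes = plt.subplots(5, 6, figsize=(12, 10))
    for ax, w in zip(axes.flat, W):
        ax.imshow(w.reshape(64, 64), cmap="RdBu")
        ax.axis("off")
    plt.show()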

Part 7: Visualization and interpretation II (5 pts)

Identify the hidden units that are most important for classifying the input as one particular actor (your choice of which one). Explain how you identified them, and display the sets of weights associated with those units.
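One possible approach (an assumption, not the only valid one) is to rank hidden units by the magnitude of the output-layer weight connecting them to the chosen actor's class score:

    # Rank hidden units by |output-layer weight| for the chosen actor, then
    # display the input weights of the top-ranked units as in Part 6.
    import numpy as np

    actor_idx = 0                                        # whichever actor you pick
    out_w = model[2].weight.detach().numpy()[actor_idx]  # shape (30,)
    top_units = np.argsort(-np.abs(out_w))[:4]           # e.g. the 4 most influential
    W_in = model[0].weight.detach().numpy()[top_units]   # reshape/plot as in Part 6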

Part 8: Overfitting and regularization (15 pts OPTIONAL if you attend Precept 12)

For this part, you should demonstrate overfitting when running logistic regression. Decrease the size of the training set enough that the overfitting is very obvious. Display the learning curves and the results, as well as the weights that you obtain. Point out what in your results indicates overfitting.

Demonstrate the effect of regularization.
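One way to add L2 regularization in PyTorch is through the optimizer's weight_decay argument (adding an explicit penalty term to the loss is an equivalent alternative). A sketch, reusing the names from Part 3 (the values here are placeholders):

    # Provoke overfitting with a tiny training set, then counteract it with
    # L2 regularization via weight_decay. Make sure the reduced set still
    # contains images of every actor.
    X_small, y_small = X_tr[:20], y_tr[:20]
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-3)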

Part 9: Transfer Learning (25 pts OPTIONAL if you attend Precept 12)

Learn a face classifier using transfer learning: for each input, first compute the activations produced by features() in the AlexNet implementation. Store those activations in a new matrix, and then train a classifier using those new inputs. Plot the learning curves, and report the results on the test set. Note that AlexNet requires color images.

Advice: the easiest (if not the most elegant) way to compute the AlexNet features is to copy and paste class MyAlexNet, rename it to something else, and edit its forward() function. Nothing else needs to be modified.
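For illustration only, here is roughly what the feature extraction looks like if you use torchvision's pretrained AlexNet as a stand-in for the provided MyAlexNet class; it assumes the color images are already resized to 224x224 and normalized with the ImageNet mean/std that AlexNet expects:

    # Extract AlexNet's convolutional activations as fixed features.
    # `images_color`: float tensor of shape (N, 3, 224, 224), ImageNet-normalized.
    import torch
    from torchvision import models

    alexnet = models.alexnet(pretrained=True)      # weights=... on newer torchvision
    alexnet.eval()

    with torch.no_grad():
        feats = alexnet.features(images_color)     # shape (N, 256, 6, 6)
        feats = feats.reshape(feats.shape[0], -1)  # flatten to (N, 9216)

    # `feats` now replaces the raw pixels as the input to a classifier as in Part 3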

What to submit

Please submit all of your Python code, as well as a report in PDF format. Your report should address every one of the tasks above. Your report should be reproducible: include the function calls that the TA should make to obtain the outputs you are showing. (You do not need to include helper functions in your report.)

Report quality (10 pts)

Your report should be readable. 10 pts will be awarded for very readable and professional reports. 5 pts will be awarded for reports that are readable with some effort.