CSC321 Project 1: Face Recognition and Gender Classification with K-Nearest Neighbours (Worth: 7%)

For this project, you will build a system for face recognition and gender classification, and test it on a large(-ish) dataset of faces, getting practice with data-science-flavour projects along the way.

The Input

You will work with a subset of the FaceScrub dataset. The subset of male actors is here and the subset of female actors is here. The dataset consists of URLs of images with faces, as well as the bounding boxes of the faces. The format of the bounding box is as follows (from the FaceScrub readme.txt file):
The format is x1,y1,x2,y2, where (x1,y1) is the coordinate of the top-left corner of the bounding box and (x2,y2) is that of the bottom-right corner, with (0,0) as the top-left corner of the image. Assuming the image is represented as a Python NumPy array I, a face in I can be obtained as I[y1:y2, x1:x2].
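The indexing convention above (rows are y, columns are x) is a common source of bugs. A minimal sketch, using a made-up image and made-up bounding-box coordinates purely for illustration:

```python
import numpy as np

# Hypothetical example: a 100x100 RGB image and a FaceScrub-style
# bounding box x1,y1,x2,y2 (the values here are made up).
I = np.zeros((100, 100, 3), dtype=np.uint8)
I[20:60, 30:70] = 255          # pretend the face occupies this region
x1, y1, x2, y2 = 30, 20, 70, 60

# Note the index order: rows are indexed by y, columns by x.
face = I[y1:y2, x1:x2]
print(face.shape)              # (40, 40, 3)
```

Slicing the axes in the wrong order (I[x1:x2, y1:y2]) will often run without error but crop the wrong region, so it is worth checking a few cropped faces visually.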

You may find it helpful to use and/or modify my script for downloading the image data.

At first, you should work with the faces of the following actors: act = ['Gerard Butler', 'Daniel Radcliffe', 'Michael Vartan', 'Lorraine Bracco', 'Peri Gilpin', 'Angie Harmon']

For this project, you should crop out the images of the faces, convert them to grayscale, and resize them to 32x32 before proceeding further. You should use scipy.misc.imresize to scale images, and you can use rgb2gray to convert RGB images to grayscale images.
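The preprocessing pipeline (grayscale conversion, then resizing to 32x32) can be sketched as follows. This is an illustrative stand-in: the rgb2gray here uses the standard luminance weights, and the nearest-neighbour resize is a minimal substitute for scipy.misc.imresize, which the handout asks you to use in your actual submission:

```python
import numpy as np

def rgb2gray(rgb):
    """Convert an RGB image (H, W, 3) to grayscale using the usual
    luminance weights (same idea as the course-provided rgb2gray)."""
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])

def resize_nearest(img, size=(32, 32)):
    """Minimal nearest-neighbour resize, standing in for
    scipy.misc.imresize for illustration only."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]

face = np.random.rand(64, 48, 3)       # stand-in for a cropped face
gray = rgb2gray(face)                  # (64, 48)
small = resize_nearest(gray, (32, 32)) # (32, 32)
print(small.shape)
```

Each 32x32 grayscale face can then be flattened to a 1024-dimensional vector for the nearest-neighbour computations in the later parts.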

Part 1 (10%)

Describe the dataset of faces. In particular, provide at least three examples of the images in the dataset, as well as at least three examples of cropped out faces. Comment on the quality of the annotation of the dataset: are the bounding boxes accurate? Can the cropped-out faces be aligned with each other?

Part 2 (5%)

Separate the dataset into three non-overlapping parts: the training set (100 face images per actor), the validation set (10 face images per actor), and the test set (10 face images per actor). For the report, describe how you did that. (Any method is fine). The training set will contain faces whose labels you assume you know. The test set and the validation set will contain faces whose labels you pretend to not know and will attempt to determine using the data in the training set. You will use the performance on the validation set to tune the K, and you will then report the final performance on the test set.
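One simple way to produce the three non-overlapping sets (any method is fine, per the handout) is to shuffle each actor's images with a fixed seed and slice off the required counts. The filenames below are hypothetical placeholders:

```python
import numpy as np

np.random.seed(0)  # fixed seed for reproducibility, as the handout suggests

def split_actor(filenames, n_train=100, n_val=10, n_test=10):
    """Shuffle one actor's face images and split them into
    non-overlapping train/validation/test lists."""
    idx = np.random.permutation(len(filenames))
    train = [filenames[i] for i in idx[:n_train]]
    val   = [filenames[i] for i in idx[n_train:n_train + n_val]]
    test  = [filenames[i] for i in idx[n_train + n_val:n_train + n_val + n_test]]
    return train, val, test

files = ['face%03d.jpg' % i for i in range(130)]   # stand-in filenames
train, val, test = split_actor(files)
print(len(train), len(val), len(test))             # 100 10 10
```

Because the three slices come from one permutation, the sets are guaranteed to be disjoint; with a fixed seed, the same split is produced on every run.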

Part 3 (45%)

Write code to recognize faces from the set of faces act using K-Nearest Neighbours (note that face recognition is a classification task!). Determine the K using the validation set, and then report the performance on the test set. Here and elsewhere in the project, use the L2 distance between the flattened images of cropped-out faces in order to find the nearest neighbours.
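The core classifier can be sketched as follows: compute the L2 distance from the query face to every flattened training face, take the k smallest, and return the majority label. The synthetic 1024-dimensional data below (matching flattened 32x32 faces) is made up purely to check the logic:

```python
import numpy as np

def knn_classify(x, train_X, train_y, k):
    """Classify a flattened face x by majority vote among the k
    training faces with the smallest L2 distance to x."""
    dists = np.sqrt(np.sum((train_X - x) ** 2, axis=1))
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Tiny synthetic check: two well-separated "actors" in 1024-D space.
np.random.seed(0)
A = np.random.rand(5, 1024)            # faces of actor 0, near the origin
B = np.random.rand(5, 1024) + 5.0      # faces of actor 1, shifted far away
train_X = np.vstack([A, B])
train_y = np.array([0] * 5 + [1] * 5)
query = np.random.rand(1024)           # a new face from actor 0's cluster
print(knn_classify(query, train_X, train_y, k=3))   # 0
```

Since the L2 distance is monotone in the squared distance, the sqrt can be dropped for speed without changing which neighbours are nearest.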

Note: the reason you need both a validation set and a test set is that picking the best k using the validation set and then reporting the performance on that same set inflates the reported performance. A better measure is obtained by picking the parameters of the algorithm using a validation set, and then measuring the performance on a separate test set.

The performance you obtain will likely be around 50% or a little higher.

In addition to the performance on the test set, display 5 failure cases in your report (cases where the majority of the k nearest neighbours of a test face do not belong to the same person as the test face). Display both the face to be recognized and its 5 nearest neighbours.

Part 4 (10%)

Plot the performance on the test, train, and validation sets of the K-Nearest-Neighbours algorithm for various K. Make sure the axes of your graphs are correctly labelled. Explain why the plots look the way they do.
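A sketch of how such a plot might be produced, using made-up two-class data in place of the real face sets (all names and values here are illustrative). Note that with k=1 the training accuracy is always 1.0, because each training face is its own nearest neighbour at distance 0:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')          # render without a display
import matplotlib.pyplot as plt

def accuracy(X, y, train_X, train_y, k):
    """Fraction of the faces in X that k-NN (L2 distance) classifies correctly."""
    correct = 0
    for x, label in zip(X, y):
        d = np.sum((train_X - x) ** 2, axis=1)
        nearest = train_y[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        correct += (vals[np.argmax(counts)] == label)
    return correct / float(len(y))

# Synthetic stand-in data (two classes) purely to illustrate the plot.
np.random.seed(0)
train_X = np.vstack([np.random.rand(50, 1024), np.random.rand(50, 1024) + 1])
train_y = np.array([0] * 50 + [1] * 50)
val_X = np.vstack([np.random.rand(10, 1024), np.random.rand(10, 1024) + 1])
val_y = np.array([0] * 10 + [1] * 10)

ks = [1, 3, 5, 7, 9, 11]
train_acc = [accuracy(train_X, train_y, train_X, train_y, k) for k in ks]
val_acc = [accuracy(val_X, val_y, train_X, train_y, k) for k in ks]

plt.plot(ks, train_acc, label='train')
plt.plot(ks, val_acc, label='validation')
plt.xlabel('k')                          # label the axes, as required
plt.ylabel('classification accuracy')
plt.legend()
plt.savefig('knn_performance.png')
```

The same accuracy function applied to the test set (at the single best validation k) gives the number to report.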

Part 5 (20%)

Write code to determine, for every face in a set of faces, the gender of the person using K-Nearest Neighbours. This should work in the same way as face recognition, except that instead of each face being assigned a name, each face is considered to be simply either male or female. Again, use your validation set to select the k with the best performance (i.e., the highest proportion of faces whose gender is classified correctly). Report the performance for the different k's on the validation set, and then report the results on the test set for the k that works best on the validation set. For this part, you should still use the set of actors act for both the test and the validation set.
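Only the labels change relative to Part 3: each training face gets a gender label instead of a name, and the same k-NN code is reused unchanged. A sketch, where the name-to-gender lookup is built by hand for the actors in act:

```python
import numpy as np

# Gender lookup for the six actors in `act`.
gender = {'Gerard Butler': 'male', 'Daniel Radcliffe': 'male',
          'Michael Vartan': 'male', 'Lorraine Bracco': 'female',
          'Peri Gilpin': 'female', 'Angie Harmon': 'female'}

# If name_labels[i] is the actor name of training face i (the array
# below is a hypothetical stand-in), the k-NN classifier from Part 3
# can be reused unchanged with these coarser labels:
name_labels = np.array(['Peri Gilpin', 'Gerard Butler', 'Angie Harmon'])
gender_labels = np.array([gender[name] for name in name_labels])
print(gender_labels)    # ['female' 'male' 'female']
```

With only two classes, choosing an odd k avoids ties in the majority vote.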

The performance you obtain will likely be around 80% or a little higher.

Part 6 (10%)

Now, evaluate the implementation in Part 5 on faces of actors who are not in the training set (i.e., actors other than the people in the set act). Discuss the difference between the performance in Part 5 and the performance in Part 6.

What to submit

The project should be implemented using Python 2 and should run on CDF. Your report should be in PDF format. You should use LaTeX to generate the report, and submit the .tex file as well. A sample template is on the course website. You will submit three files: faces.py, faces.tex, and faces.pdf.

Reproducibility counts! We should be able to obtain all the graphs and figures in your report by running your code. The only exception is that you may pre-download the images (what you pre-downloaded and how, including the code you used to download the images, should be described in your submission). Submissions that are not reproducible will not receive full marks. If your graphs/reported numbers cannot be reproduced by running the code, you may be docked up to 20%. (Of course, if the code is simply incomplete, you may lose even more.) Suggestion: if you are using randomness anywhere, use numpy.random.seed().

You must use LaTeX to generate the report. LaTeX is the tool used to generate virtually all technical reports and research papers in machine learning, and students report that after they get used to writing reports in LaTeX, they start using LaTeX for all their course reports. In addition, using LaTeX facilitates the production of reproducible results.

Using my code

You are free to use any of the code available from the course website.

