In this assignment, students work with a Recurrent Neural Network (RNN) and explore the mechanism that allows RNNs to model and generate English text character-by-character. Students learn to think of an RNN as a state machine, and acquire facility with working with an implementation of an RNN and adding functionality to it. The assignment was originally used in a third-year Introduction to Neural Networks and Machine Learning course, and could be adapted for use in an Intro AI course where neural networks are a special topic: the assignment can be modified to require students to understand how neural networks produce outputs, but not how they are trained. The assignment only requires an intuitive understanding of state machines, and does not require calculus.
Recurrent Neural Networks have recently been shown to be remarkably effective at modelling English text character-by-character. Realistic-seeming "fake" English text, and even C code, can be generated using RNN models. Students should be pointed to Andrej Karpathy's excellent essay on "[t]he unreasonable effectiveness of Recurrent Neural Networks," which inspired this assignment.
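The core mechanism can be sketched in a few lines of NumPy. The following is a minimal, illustrative version of a character-level RNN step and sampling loop in the style of min-char-rnn.py; the weight names (Wxh, Whh, Why, bh, by), the random initialization, and the vocabulary size are assumptions for illustration, not the trained model from the assignment.

```python
import numpy as np

# Illustrative sizes; in the assignment these come from the trained model.
vocab_size, hidden_size = 5, 8
rng = np.random.default_rng(0)
Wxh = rng.standard_normal((hidden_size, vocab_size)) * 0.01   # input -> hidden
Whh = rng.standard_normal((hidden_size, hidden_size)) * 0.01  # hidden -> hidden
Why = rng.standard_normal((vocab_size, hidden_size)) * 0.01   # hidden -> output
bh = np.zeros((hidden_size, 1))
by = np.zeros((vocab_size, 1))

def step(h, ix):
    """Advance the RNN 'state machine' by one character (index ix)."""
    x = np.zeros((vocab_size, 1))
    x[ix] = 1                              # one-hot encoding of the character
    h = np.tanh(Wxh @ x + Whh @ h + bh)    # new hidden state
    y = Why @ h + by                       # unnormalized log-probabilities
    p = np.exp(y) / np.sum(np.exp(y))      # softmax over next characters
    return h, p

def sample(h, seed_ix, n):
    """Generate n character indices by sampling and feeding each one back in."""
    ixes, ix = [], seed_ix
    for _ in range(n):
        h, p = step(h, ix)
        ix = int(rng.choice(vocab_size, p=p.ravel()))
        ixes.append(ix)
    return ixes

out = sample(np.zeros((hidden_size, 1)), 0, 10)
```

The hidden state h is the only memory carried between characters, which is why the state-machine view is a natural one for students to adopt.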
Many students and practitioners tend to view RNNs as black boxes. This assignment forces students to understand how RNNs model text by having students explain some properties of the outputs generated by the model (e.g., newlines always following colons, as in the training text) in reference to specific weights of the model.
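A toy sketch of the kind of weight-level explanation intended here, assuming (purely for illustration) that one hidden unit acts as a "just saw a colon" detector: a single large entry in the output weight matrix connecting that unit to the newline character is then enough to make '\n' near-certain after ':'. All sizes, indices, and values below are made up for the example.

```python
import numpy as np

vocab_size, hidden_size = 4, 3
NEWLINE = 3                      # assumed index of '\n' in the toy vocabulary
Why = np.zeros((vocab_size, hidden_size))
Why[NEWLINE, 0] = 10.0           # strong weight from the colon-detector unit
h = np.zeros((hidden_size, 1))
h[0] = 1.0                       # the detector fires: the last input was ':'
y = Why @ h                      # output logits
p = np.exp(y) / np.sum(np.exp(y))
# p[NEWLINE] is close to 1, so the model emits '\n' almost surely
```

Students' explanations refer to the actual trained weights rather than a constructed example like this one, but the reasoning pattern is the same.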
Students are provided with a simple RNN model that was trained on a corpus of Shakespeare plays. To make sure students gain an understanding of an implementation of an RNN and to give students practice with working with ML code, students are asked to modify and extend min-char-rnn.py to generate text from an RNN model at different temperatures and with different starting strings. Students are then asked to explain several properties of the text by viewing the RNN as a state machine. For bonus marks, students are encouraged to find new interesting properties of the "fake" text generated by the model and to explain how the model generates text with those properties.
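Temperature sampling is the main modification students make. A minimal sketch of the idea, assuming the RNN supplies a vector of unnormalized log-probabilities (logits) for the next character; the function name and interface are illustrative, not the assignment's required API:

```python
import numpy as np

def sample_with_temperature(y, temperature=1.0, rng=None):
    """Sample a character index from logits y after rescaling by temperature.

    Low temperatures make the distribution peakier (more conservative,
    repetitive text); high temperatures flatten it (more surprising text).
    """
    rng = rng or np.random.default_rng()
    y = np.asarray(y, dtype=float) / temperature
    p = np.exp(y - np.max(y))   # subtract the max for numerical stability
    p /= p.sum()
    return int(rng.choice(len(p), p=p))
```

Generating from a fixed starting string amounts to feeding its characters through the network one at a time to set the hidden state before sampling begins.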
In courses that spend a fair amount of time on RNNs, a tutorial can be run to explain the RNN implementation used, min-char-rnn.py. The tutorial handout (LaTeX source) is provided.
Summary  Students extend and modify existing code to generate "fake English" text from an RNN. Students explore how the RNN model is able to generate text that resembles the training text by analyzing the weights and architecture of the RNN. Optionally, students train the RNN themselves using a corpus of Shakespeare plays as the training set. 
Topics  Recurrent Neural Networks, text models, generation from probabilistic models 
Audience  Third- and fourth-year students in Intro ML classes. Students in Intro AI classes in which neural networks are covered. Accessible to any student with a good background in Discrete Math from CS and to excellent high school students. 
Difficulty  Students find this assignment easier than the average third-year assignment. Third-year students take about five hours to complete it. 
Strengths 

Weaknesses 

Dependencies  Students must have a good understanding of Recurrent Neural Networks (though not necessarily of the details of learning RNNs). Students should have enough CS background to be comfortable with reasoning about Finite State Machine-like structures. Students should have had some practice programming with NumPy. 
Variants 

Handout: HTML (markdown source).
This assignment should be supported by 2-3 hours of lecture on Recurrent Neural Networks. Slides relevant to the assignment are available on request. When using the assignment, we also ran a tutorial explaining the RNN code used, Andrej Karpathy's min-char-rnn.py, line-by-line. The tutorial handout (LaTeX source) is provided.
Prodding students to be creative when looking for interesting properties of the generated text proved challenging: many were stuck in the "character A follows character B" paradigm. We would consider quickly introducing various properties of English text when teaching about language models.