Contact information
- Instructor: Gerald Penn
- Office: PT 283
- Office hours: M 4-6pm
- Email: gpenn@teach.cs.toronto.edu (please put CSC 401/2511 in the subject line)
- Forum: Piazza (signup)
- Quercus: https://q.utoronto.ca/courses/352606
- Email policy: For non-confidential inquiries, consult the Piazza forum first. For confidential assignment-related inquiries, contact the TA associated with the particular assignment. Emails sent from University of Toronto addresses with an appropriate subject line are the least likely to be redirected to junk email folders.
Course overview
This course presents an introduction to natural language computing in applications such as information retrieval and extraction, intelligent web searching, speech recognition, and machine translation. These applications will involve various statistical and machine learning techniques. Assignments will be completed in Python. All code must run on the 'teaching servers'.
Prerequisites: CSC207/ CSC209/ APS105/ APS106/ ESC180/ CSC180 and STA237/ STA247/ STA255/ STA257/ STAB52/ ECE302/ STA286/ CHE223/ CME263/ MIE231/ MIE236/ MSE238/ ECE286 and a CGPA of 3.0 or higher or a CSC subject POSt. MAT 223 or 240, CSC 311 (or equivalent) are strongly recommended.
See also the course information sheet.
Meeting times
- Location: BA (Bahen Centre for Information Technology)
- Lectures: MW 10-11h at BA 1180; 11-12h at BA 1190
- Tutorials: F 10-11h at BA 1180; 11-12h at BA 1190
Syllabus
The following is an estimate of the topics to be covered in the course and is subject to change.
- Introduction to corpus-based linguistics
- N-grams, linguistic features, word embeddings
- Entropy and information theory
- Intro to deep neural networks and neural language models
- Machine translation (statistical and neural) (MT)
- Transformers, attention-based models and variants
- Large language models (LLMs)
- Acoustics and phonetics
- Speech features and speaker identification
- Dynamic programming for speech recognition.
- Speech synthesis (TTS)
- Information Retrieval (IR)
- Text Summarization
- Ethics in NLP
Calendar
- 4 September: First lecture
- 18 September: Last day to enrol
- 24 September: Part 1 of Assignment 1 due
- 8 October: Assignment 1 due
- 28 October: Last day to drop CSC 2511
- 28 October - 1 November: Reading week -- no lectures or tutorial
- 4 November: Last day to drop CSC 401
- 5 November: Assignment 2 due
- 3 December: Last lecture
- 3 December: Assignment 3 due
- 6-21 December: Final exam period
Readings for this course
- Optional: Foundations of Statistical Natural Language Processing, C. Manning and H. Schütze
- Optional: Speech and Language Processing, D. Jurafsky and J.H. Martin (2nd ed.)
- Optional: Deep Learning, I. Goodfellow, Y. Bengio, and A. Courville
Supplementary reading
Please see the additional lecture-specific supplementary resources under the Lecture Materials section.
Evaluation policies
- General
- You will be graded on three homework assignments, two ethics surveys, and a final exam. The relative proportions of these grades are as follows:
- Assignment 1: 20%
- Assignment 2: 20%
- Assignment 3: 20%
- Ethics Surveys (2x): 1%
- Final exam: 39%
- Lateness
- A 10% (absolute) deduction is applied to homework submitted even one minute after the due time. Thereafter, an additional 10% deduction is applied for every further 24 hours, up to 72 hours late, at which point the homework receives a mark of zero. No exceptions will be made except in cases of documented emergencies. (An illustrative sketch of this arithmetic appears at the end of this section.)
- Final
- The final exam will be a timed 3-hour test. A mark of at least 50 on the final exam is required to pass the course. In other words, if you receive a 49 or less on the final exam then you automatically fail the course, regardless of your performance in the rest of the course.
- Collaboration and plagiarism
- No collaboration on the homeworks is permitted. The work you submit must be your own. `Collaboration' in this context includes but is not limited to sharing of source code, correction of another's source code, copying of written answers, and sharing of answers prior to or after submission of the work (including the final exam). Failure to observe this policy is an academic offense, carrying a penalty ranging from a zero on the homework to suspension from the university. The use of AI writing assistance (ChatGPT, Copilot, etc) is allowed only for refining the English grammar and/or spelling of text that you have already written. Submitting any Python code generated or modified by any AI assistants is strictly prohibited. See Academic integrity at the University of Toronto.
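To make the grading arithmetic above concrete, here is a minimal sketch of how the weights, the late penalty, and the final-exam requirement could be combined. The function names, the cap-at-49 convention for a failed exam, and the assumption that every component is marked out of 100 are illustrative assumptions only, not the official marking code.

```python
# Illustrative sketch of the evaluation rules above; all names and conventions
# here are assumptions for illustration, not the official marking procedure.
import math

# Relative weights of the graded components (from the table above).
WEIGHTS = {"A1": 0.20, "A2": 0.20, "A3": 0.20, "ethics_surveys": 0.01, "final": 0.39}

def late_mark(raw_mark: float, hours_late: float) -> float:
    """Apply the lateness policy to a homework mark out of 100.

    10 points are deducted as soon as the work is late, a further 10 points for
    each full 24 hours after that, and anything 72 or more hours late gets zero.
    """
    if hours_late <= 0:
        return raw_mark
    if hours_late >= 72:
        return 0.0
    penalties = 1 + math.floor(hours_late / 24)
    return max(0.0, raw_mark - 10 * penalties)

def course_grade(marks: dict) -> float:
    """Combine component marks (each out of 100) using the stated weights.

    A final-exam mark below 50 fails the course regardless of the weighted
    total; here that is signalled by capping the result at 49.
    """
    weighted = sum(w * marks[name] for name, w in WEIGHTS.items())
    if marks["final"] < 50:
        return min(weighted, 49.0)
    return weighted

# Example: an 85/100 assignment handed in 30 hours late drops to 65, and
# strong assignments cannot compensate for a 45 on the final exam.
print(late_mark(85, 30))
print(course_grade({"A1": 90, "A2": 88, "A3": 92, "ethics_surveys": 100, "final": 45}))
```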
Lecture materials
- Introduction
- Date: 4 Sep.
- Reading: Manning & Schütze: Sections 1.3-1.4.2, Sections 6.0-6.2.1
- Corpora and Smoothing
- Dates: 9-16 Sep.
- Reading: Manning & Schütze: Section 1.4.3, Sections 6.1-6.2.2, Section 6.2.5, Section 6.3
- Reading: Jurafsky & Martin: 3.4-3.5
- See also the supplementary reading for Good-Turing smoothing
- Features and Classification
- Dates: 18-23 Sep.
- Reading: Manning & Schütze: Section 1.4.3, Sections 6.1-6.2.2, Section 6.2.5, Section 6.3
- Reading: Jurafsky & Martin: 3.4-3.5
- Entropy and information theory
- Dates: 25-30 Sep.
- Reading: Manning & Schütze: Sections 2.2, 5.3-5.5
- Intro. to NNs and Neural Language Models
- Dates: 7, 9 Oct.
- Reading: DL (Goodfellow et al.). Sections: 6.3, 6.6, 10.2, 10.5, 10.10
- (Optional) Supplementary resources and readings:
- Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space. (2013)" link
- Xin Rong. "word2vec Parameter Learning Explained". link
- Bolukbasi, Tolga, et al. "Man is to computer programmer as woman is to homemaker? debiasing word embeddings." NeurIPS (2016). link
- Greff, Klaus, et al. "LSTM: A search space odyssey." IEEE (2016). link
- Jozefowicz, Sutskever et al. "An empirical exploration of recurrent network architectures." ICML (2015). link
- GRU: Cho, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." (2014). link
- ELMo: Peters, Matthew E., et al. "Deep contextualized word representations." (2018). link
- Blogs:
- The Unreasonable Effectiveness of Recurrent Neural Networks. link
- Colah's Blog. "Understanding LSTM Networks". link.
- Machine Translation (MT)
- Dates: 16, 21, 23 Oct.
- Readings:
- Manning & Schütze Sections 13.0, 13.1.2, 13.1.3, 13.2, 13.3, 14.2.2
- DL (Goodfellow et al.). Sections: 10.3, 10.4, 10.7
- (Optional) Supplementary resources and readings:
- Papineni, et al. "BLEU: a method for automatic evaluation of machine translation." ACL (2002). link
- Sutskever, Ilya, Oriol Vinyals et al. "Sequence to sequence learning with neural networks."(2014). link
- Bahdanau, Dzmitry, et al. "Neural machine translation by jointly learning to align and translate."(2014). link
- Luong, Manning, et al. "Effective approaches to attention-based neural machine translation." arXiv (2015). link
- Britz, Denny, et al. "Massive exploration of neural machine translation architectures."(2017). link
- BPE: Sennrich, et al. "Neural machine translation of rare words with subword units." arXiv (2015). link
- Wordpiece: Wu, Yonghui, et al. "Google's neural machine translation system: Bridging the gap between human and machine translation." arXiv (2016). link
- Blogs:
- Distill: Olah & Carter "Attention and Augmented RNNs"(2016). link
- Transformers
- Dates: 4, 6 Nov.
- Readings:
- Vaswani et al. "Attention is all you need." (2017). link
- (Optional) Supplementary resources and readings:
- RoPE: Su, Jianlin, et al. "Roformer: Enhanced transformer with rotary position embedding." (2021). [arxiv]
- Ba, Kiros, and Hinton. "Layer normalization." (2016). [link]
- Xiong, Ruibin, et al. "On layer normalization in the transformer architecture." ICML PMLR (2020). [link]
- Xie et al. "ResiDual: Transformer with Dual Residual Connections." (2023). [arxiv] [github]
- BERTology:
- Devlin et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." (2019). link
- Clark et al. "What does bert look at? an analysis of bert's attention." (2019). link
- Rogers, Anna et al. "A primer in BERTology: What we know about how bert works." TACL(2020). link
- Tenney et al. "BERT rediscovers the classical NLP pipeline." (2019). link
- Niu et al. "Does BERT rediscover a classical NLP pipeline." (2022). link
- Lewis et al. "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension." (2019). link
- T5: Raffel et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." J. Mach. Learn. Res. 21.140 (2020). link
- GPT-3: Brown et al. "Language models are few-shot learners." (2020). link
- Attention-free models:
- Fu, Daniel, et al. "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture." (2023). [arxiv]. [blog].
- Token-free models:
- Clark et al. "CANINE: Pre-training an efficient tokenization-free encoder for language representation." (2021). link
- Xue et al. "ByT5: Towards a token-free future with pre-trained byte-to-byte models." (2022). link
- Blogs:
- Harvard NLP. "The Annotated Transformer". link.
- Jay Alammar. "The Illustrated Transformer". link.
- Acoustics and Phonetics
- Dates: 8, 11 Nov.
- Reading: Phonetics: J&M SLP2 (2nd ed.) Chapter 7; J&M SLP3 (3rd ed.) Chapter H
- Speech Features and Speaker Identification
- Dates: 13, 18 Nov.
- Readings:
- Jurafsky & Martin SLP3 (3rd ed.): Chapter 16. link
- Dynamic Programming for Speech Recognition
- Dates: 18, 20 Nov.
- Readings: N/A
- Information Retrieval (IR)
- Date(s): 20 Nov.
- Readings:
- Jurafsky & Martin SLP3 (3rd ed.): Chapter 14, only the first part (14.1). link
- Text Summarization
- Date(s): 25 Nov.
- Guest Lectures on Ethics: [Module 1], [Module 2]
- Date(s): 27 Nov., 29 Nov.
- Supplementary materials/links:
- Guest lecturer: Steven Coyne
- The Embedded Ethics Education Initiative at UofT, SRI Institute link
- SRI Institute events
- Summary and Review (last lecture).
- Date: 3 Dec.
Tutorial materials
- Assignment 1 tutorials:
- Sept. 6, 2024: Tutorial 0
- Sept. 13, 2024: Tutorial 1 with slides
- Sept. 27, 2024: Tutorial 2 with slides
- Assignment 2 tutorials:
- Oct. 11, 2024: Tutorial 1
- Oct. 18, 2024: Tutorial 2
- Assignment 3 tutorials:
- Nov. 15, 2024: Tutorial 1
- Nov. 22, 2024: Tutorial 2
Assignments
Here is the ID template that you must submit with your assignments.
Head TA: Ken Shi
Extension requests: All extension requests must be made to the head TA. Undergraduates should follow the FAS student absences policy: file an ACORN absence declaration when one is permitted, and submit a VOI form for extensions due to illness when an ACORN declaration is not permitted (because one has already been filed this term). Graduate students should always use a VOI form for extensions due to illness.
Remark requests: Please follow the remarking policy.
General Tips & F.A.Q.:
- Working on teach.cs (wolf) server: CSC401_F24_Assignments.pdf
- Creating a local environment mimicking the teach.cs environment:
- Note that a 24-hour `silence policy' will be in effect -- we do not guarantee that the instructors or TAs will respond to requests made within the 24 hours before an assignment's due time.
Assignment 1: Financial Sentiment Analysis
- Due: 24th September / 8th October, 2024
- For all A1-related emails, please use: csc401-2024-09-a1@cs.toronto.edu
- Download the starter code from MarkUs
Assignment 2: Neural Machine Translation with Transformers
- Due: 5th November, 2024
- For all A2-related emails, please use: csc401-2024-09-a2@cs.toronto.edu
Assignment 3: ASR, Speakers, and Lies
- Due: 3rd December, 2024
- For all A3 related emails, please use: csc401-2024-09-a3@cs.toronto.edu
Past course materials
Fall 2024: S24 course page
Fall 2023: S23 course page
Fall 2022: S22 course page
Old Exams
- The old exam repository from UofT libraries (may not contain this particular course's final exams).
- Final exam from 2017 (it bears structural similarity to our final exam this term, but the material we cover now has changed)
News and announcements
- FIRST WEEK: Our first lecture will take place on 4th September at 10:00 or 11:00, depending on your section. There will be a tutorial on the 6th.
- ANNOUNCEMENT FROM ACCESSIBILITY SERVICES: Accessibility Services is seeking volunteer note-takers for students in this class who are registered with Accessibility Services. By volunteering to take notes for students with disabilities, you are making a positive contribution to their academic success, and you will benefit as well: it is an excellent way to improve your own note-taking skills and to maintain consistent class attendance. At the end of term, we would be happy to provide a Certificate of Appreciation for your hard work; to request one, please email us at as.notetaking@utoronto.ca. You may also qualify for a Co-Curricular Record by registering your volunteer work on Folio before the end of June, and we hold a draw for qualifying volunteers throughout the academic year. Register online as a Volunteer Note-Taker at: https://clockwork.studentlife.utoronto.ca/custom/misc/home.aspx. Email us at as.notetaking@utoronto.ca if you have questions or require any assistance with uploading notes, and please let us know immediately if you are no longer able to upload notes for a course. Thank you for your support and for making notes more accessible for our students.