Contact information

Instructors: Gerald Penn, Raeid Saqur, and Sean Robertson.
Office: PT 283; BA 2270
Office hours: Gerald Penn: F 12h-14h at PT 283; RS/SR: M 12h-13h at BA 2270
Email: csc401-2024-01@cs. (add the toronto.edu suffix)
Forum (Piazza): Piazza (signup)
Quercus: https://q.utoronto.ca/courses/337533
Email policy: For non-confidential inquiries, consult the Piazza forum first. For confidential assignment-related inquiries, contact the TA associated with the particular assignment. Emails sent from University of Toronto email addresses with informative subject headings are the least likely to be redirected to junk email folders.

Lecture materials

Assigned readings give you more in-depth information on ideas covered in lectures. The assignments will not ask questions about the readings, but the readings will be useful when studying for the final exam.

Provided PDFs are ~ 10% of their original size for portability, at the expense of fidelity.

For pre-lecture readings and in-class note-taking, please see under Quercus Modules. The final versions (with ex-post errata and/or other modifications) will be posted here on the course website.

  1. Introduction.
    • Date: 8 Jan.
    • Reading: Manning & Schütze: Sections 1.3-1.4.2, Sections 6.0-6.2.1
  2. Corpora and Smoothing.
    • Dates: 10 Jan.
    • Reading: Manning & Schütze: Section 1.4.3, Sections 6.1-6.2.2, Section 6.2.5, Section 6.3
    • Reading: Jurafsky & Martin: 3.4-3.5
  3. Entropy and information theory.
    • Dates: 15, 17 Jan.
    • Reading: Manning & Schütze: Sections 2.2, 5.3-5.5
  4. Features and Classification.
    • Date: 22 Jan.
    • Reading: Manning & Schütze: Section 1.4.3, Sections 6.1-6.2.2, Section 6.2.5, Section 6.3
    • Reading: Jurafsky & Martin: 3.4-3.5
  5. Intro. to NNs and Neural Language Models.
    • Dates: 24, 29 Jan.
    • Reading: DL (Goodfellow et al.). Sections: 6.3, 6.6, 10.2, 10.5, 10.10
    • (Optional) Supplementary resources and readings:
      • Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space. (2013)" link
      • Xin Rong. "word2vec Parameter Learning Explained". link
      • Bolukbasi, Tolga, et al. "Man is to computer programmer as woman is to homemaker? debiasing word embeddings." NeurIPS (2016). link
      • Greff, Klaus, et al. "LSTM: A search space odyssey." IEEE (2016). link
      • Jozefowicz, Sutskever et al. "An empirical exploration of recurrent network architectures." ICML (2015). link
      • GRU: Cho, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." (2014). link
      • ELMo: Peters, Matthew E., et al. "Deep contextualized word representations. (2018)." link
      • Blogs:
      • The Unreasonable Effectiveness of Recurrent Neural Networks. link
      • Colah's Blog. "Understanding LSTM Networks". link.
  6. Machine translation (MT).
    • Dates: 31 Jan.; 5, 7 Feb.
    • Readings:
      • Manning & Schütze: Sections 13.0, 13.1.2, 13.1.3, 13.2, 13.3, 14.2.2
      • DL (Goodfellow et al.). Sections: 10.3, 10.4, 10.7
    • (Optional) Supplementary resources and readings:
      • Papineni, et al. "BLEU: a method for automatic evaluation of machine translation." ACL (2002). link
      • Sutskever, Ilya, Oriol Vinyals et al. "Sequence to sequence learning with neural networks." (2014). link
      • Bahdanau, Dzmitry, et al. "Neural machine translation by jointly learning to align and translate." (2014). link
      • Luong, Manning, et al. "Effective approaches to attention-based neural machine translation." arXiv (2015). link
      • Britz, Denny, et al. "Massive exploration of neural machine translation architectures." (2017). link
      • BPE: Sennrich, et al. "Neural machine translation of rare words with subword units." arXiv (2015). link
      • Wordpiece: Wu, Yonghui, et al. "Google's neural machine translation system: Bridging the gap between human and machine translation." arXiv (2016). link
      • Blogs:
      • Distill: Olah & Carter "Attention and Augmented RNNs"(2016). link
  7. Transformers.
    • Dates: 12, 14 Feb.
    • Readings:
      • Vaswani et al. "Attention is all you need." (2017). link
    • (Optional) Supplementary resources and readings:
      • RoPE: Su, Jianlin, et al. "Roformer: Enhanced transformer with rotary position embedding." (2021). [arxiv]
      • Ba, Kiros, and Hinton. "Layer normalization." (2016). [link]
      • Xiong, Ruibin, et al. "On layer normalization in the transformer architecture." ICML PMLR (2020). [link]
      • Xie et al. "ResiDual: Transformer with Dual Residual Connections." (2023). [arxiv] [github]
      • BERTology:
      • Devlin et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." (2019). link
      • Clark et al. "What does BERT look at? An analysis of BERT's attention." (2019). link
      • Rogers, Anna et al. "A primer in BERTology: What we know about how BERT works." TACL (2020). link
      • Tenney et al. "BERT rediscovers the classical NLP pipeline." (2019). link
      • Niu et al. "Does BERT rediscover a classical NLP pipeline." (2022). link
      • Lewis et al. "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension." (2019). link
      • T5: Raffel et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." J. Mach. Learn. Res. 21.140 (2020). link
      • GPT3: Brown, Tom, et al. "Language models are few-shot learners." (2020). link
      • Attention-free models:
      • Fu, Daniel, et al. "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture." (2023). [arxiv]. [blog].
      • Token-free models:
      • Clark et al. "CANINE: Pre-training an efficient tokenization-free encoder for language representation." (2021). link
      • Xue et al. "ByT5: Towards a token-free future with pre-trained byte-to-byte models." (2022). link
      • Blogs:
      • Harvard NLP. "The Annotated Transformer". link.
      • Jay Alammar. "The Illustrated Transformer". link.
  8. Large language models.
    • Date: 26 Feb.
    • Readings: No required readings for this lecture.
    • (Optional) Supplementary resources and readings:
      • Zhao, Wayne Xin et al. "A Survey of Large Language Models." (2023). [arxiv] [list of LLMs]
      • Bommasani et al. "On the opportunities and risks of foundation models." (2022). link
      • Kaddour, Jean, et al. "Challenges and Applications of Large Language Models." (2023). [arxiv]
      • Wei, Jason, et al. "Emergent abilities of large language models." (2022). [arxiv]
      • Schaeffer, Rylan et al. "Are emergent abilities of Large Language Models a mirage?." (2023). [arxiv]
      • Kaplan et al. "Scaling laws for neural language models." (2020). link
      • Li, Xiang Lisa, and Percy Liang. "Prefix-tuning: Optimizing continuous prompts for generation." (2021). [arxiv] [github]
      • Kudo and Richardson. "Sentencepiece: A simple and language independent subword tokenizer and detokenizer for neural text processing." (2018). link
      • Instruction Finetuning:
      • REINFORCE: Williams, Ronald J. "Simple statistical gradient-following algorithms for connectionist reinforcement learning. (1992)." link
      • InstructGPT: Ouyang, Long, et al. "Training language models to follow instructions with human feedback." arXiv preprint (2022). link
      • RLHF: Christiano et al. "Deep reinforcement learning from human preferences." (2017). link
      • RLHF: Stiennon et al. "Learning to summarize with human feedback." (2020). link
      • RLHF: Casper, Stephen, et al. "Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback." (2024). link
      • Korbak, Tomasz, et al. "Pretraining language models with human preferences." ICML. 2023. [link]
      • DPO: Rafailov, Rafael, et al. "Direct preference optimization: Your language model is secretly a reward model." (2023). [arxiv]
      • Saqur, Raeid, et al. "Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect." NeurIPS (2024). link
      • Meta AI. "LLaMA: Open and Efficient Foundation Language Models." (2023). [arxiv] [blog] [blog]
      • PEFTs & Quantizations:
      • LoRA: Edward Hu et al. "Low-rank Adaptation of Large Language Models." (2021). link
      • PEFT: Liu, Haokun, et al. "Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning." Advances in Neural Information Processing Systems (2022). [paper] [github]
      • LLM.int8(): Dettmers, Tim, et al. "8-bit matrix multiplication for transformers at scale." (2022). link
      • QLoRA: Dettmers, Tim, et al. "QLoRA: Efficient finetuning of quantized LLMs." NeurIPS (2024). link
      • Benchmarks:
      • HELM: Liang, Percy, et al. "Holistic evaluation of language models." (2022). [arxiv] [github]
      • BIG-bench: Srivastava, Aarohi, et al. "Beyond the imitation game: Quantifying and extrapolating the capabilities of language models." (2022). [arxiv] [github]
      • MMLU: Hendrycks, Dan, et al. "Measuring massive multitask language understanding." (2020). [arxiv] [github]
  9. Acoustics and phonetics.
  10. Speech features and speaker identification.
    • Dates: 6 Mar.
    • Readings:
      • Jurafsky & Martin SLP3 (3rd ed.): Chapter 16. link
  11. Dynamic programming for speech recognition.
    • Dates: 11, 13, 18 Mar.
    • Readings: N/A
  12. Information Retrieval (IR).
    • Date(s): 20 Mar.
    • Readings:
      • Jurafsky & Martin SLP3 (3rd ed.): Chapter 14, only the first part (14.1). link
  13. Text Summarization.
    • Date(s): 25 Mar.
  14. Guest Lecture on Ethics: [Module 1], [Module 2].
    • Date(s): 27 Mar., 1 Apr.
    • Supplementary materials/links:
      • Guest lecture on Ethics ft. Steven Coyne.
      • The Embedded Ethics Education Initiative at UofT, SRI Institute link
      • SRI Institute events
  15. Summary and Review (last lecture).
    • Date: 3 Apr.

Tutorial materials

Enrolled students: Please see under Quercus Modules. The final versions (with ex-post errata and/or other modifications) will be posted here on the course website for anyone auditing.

Assignments

Enrolled students: Please use the Quercus Assignments page for all materials. The final versions (with ex-post errata and updates) will be posted here for anyone auditing the course. Here is the ID template that you must submit with your assignments. Here is the MarkUs link you use to submit them.

Extension requests: Please follow the extension request procedure detailed here. A copy of the Special Consideration Form is available here.
Remark requests: Please follow the remarking policy detailed here.

General Tips & F.A.Q.:



News and announcements

  • [15-Jan-2024] ASSIGNMENTS: A1 has been released.
  • FIRST LECTURE: 8 January at 10h or 11h (check your section on ACORN enrolment).
  • FIRST TUTORIAL: There will be NO tutorial in the first week of lectures (i.e., Friday, 12 January).
  • READING WEEK BREAK: There will be no lectures or tutorials during the week of Feb. 19-23.
  • LAST LECTURE: 5 April (check sessional calendars).
  • FINAL EXAM: April 25, 2024 [ArtSci Final Exam Schedule ]
