CSC 401/2511 -- Natural Language Computing

Winter 2008



Contact information

Instructor: Gerald Penn
Office: PT 396B (St. George campus)
Office hours: Wednesdays and Fridays 1-2, or by appointment
Tel: 978-7390
Email: gpenn@cdf.utoronto.ca

Meeting times

Lectures: WF 12-1, RW 110
Tutorials: M 12-1, RW 110
(Exceptions: the tutorials on 28th January and 4th February will take place in BA 2200;
        there will be a lecture on Monday, 11th February and a tutorial on Friday, 15th February;
        there will be a lecture on Monday, 10th March and a tutorial on Friday, 14th March;
        there will be lectures on Monday, Wednesday and Friday in the week of 24th March, with no tutorial.)

 

Assignment/Tutor

 

Assignment  Due          Tutor
1           11 February  Chris Parisien
2           10 March     Xiaodan Zhu
3           7 April      Siavash Kazemian

A bulletin board has also been created for the class, which will be monitored by the TAs.


Texts for the Course


 

Required: C. Manning & H. Schuetze, Foundations of Statistical Natural Language Processing, MIT Press, 1999 (an on-line edition is available from MIT CogNet). Errata
Optional: D. Jurafsky & J. Martin, Speech and Language Processing, Prentice Hall, 2000. Errata
Recommended: A. Martelli, Python in a Nutshell, O'Reilly, 2003. Errata
Optional: M. Lutz & D. Ascher, Learning Python, 2nd ed., O'Reilly, 2003. Errata
Free: various tutorials on the Python website

Supplementary Reading for the Lectures


 
Topic: Title; Author; Publication Details
parsing, phrase structure models: Statistical Language Learning; E. Charniak; MIT Press, 1993.
machine learning: The Elements of Statistical Learning; T. Hastie, R. Tibshirani and J. Friedman; Springer, 2001.
information theory (including entropy): Elements of Information Theory; T. M. Cover and J. A. Thomas; Wiley & Sons, 1991.
maximum entropy modelling: A Maximum Entropy Approach to Natural Language Processing; A. L. Berger, S. A. Della Pietra and V. J. Della Pietra; Computational Linguistics, 22(1): 39-71.
hidden Markov models (state emission): Fundamentals of Speech Recognition, Chapter 6; L. Rabiner and B.-H. Juang; Prentice Hall, 1993.
Good-Turing estimation: A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams; K. Church and W. Gale; Computer Speech and Language, 5: 19-54.
information retrieval: Modern Information Retrieval; R. Baeza-Yates and B. Ribeiro-Neto; ACM Press, 1999.
text summarization: Automatic Summarization; I. Mani; Benjamins, 2001.
phonetics (articulatory and acoustic): Acoustic Phonetics; K. N. Stevens; MIT Press, 1998.



Tentative Course outline


Calendar of important course-related events


 

Date              Event
Mon, 7 January    First lecture
Fri, 18 January   Last day to add course (CSC 2511)
Sun, 20 January   Last day to add course (CSC 401)
Mon, 11 February  Assignment 1 due
18-22 February    Reading Week - no classes
Fri, 29 February  Last day to drop course (CSC 2511)
Sun, 9 March      Last day to drop course (CSC 401)
Mon, 10 March     Assignment 2 due
Mon, 7 April      Assignment 3 due
Fri, 11 April     Last lecture
21 April - 9 May  Final exam period



Evaluation and related policies

There will be three homework assignments and a final exam. The relative weights of these components towards the final mark are shown in the table below:

 

Assignment 1  20%
Assignment 2  20%
Assignment 3  20%
Final         40%

Important note on final: A mark of at least a D- on the final exam is required to pass the course.  In other words, if you receive an F on the final exam you automatically fail the course, regardless of your performance on homeworks.

Important note on homeworks: No late homeworks will be accepted except in cases of documented medical or other emergencies.

Policy on collaboration: No collaboration on homeworks is permitted.  The work you submit must be your own.  Failure to observe this policy is an academic offense, carrying a  penalty ranging from a zero on the homework to suspension from the university.



Announcements

In this space, you will find announcements related to the course. Please check this space at least weekly.

Handouts

In this space you will find on-line PostScript versions of course handouts, including homeworks and solutions (posted after the due date).

To view these handouts you will need access to a PostScript previewer. If your machine does not have the required software, you can allegedly download it for free.


Old Exams

Some old midterm and final exams for this course (with no solutions).

Gerald Penn, 2 April, 2008
This web-page was adapted from the web-page for another course, created by Vassos Hadzilacos.