CSC 2518 -- Spoken Language Processing

Spring 2024

Index of this document

Contact information
Meeting times
Readings
Topic of this year's offering
Calendar of important course-related events
Evaluation
Announcements

Contact information

Instructor: Gerald Penn

Office: PT 283 (St. George campus)

Tel: 978-7390

Email: gpenn@cs.utoronto.ca


Meeting times

Lectures: Thursdays (R), 1-3


Presented Readings

Who	When	What	Where
Sinclair Hudson, Gerald Penn	1 February	1) Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders; 2) Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies; 3) TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech	1) ICASSP 2020; 2) Interspeech 2021; 3) TASLP 2021
Megan Cao, Zixin Zhao	8 February	1) Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training; 2) LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech; 3) Unsupervised pretraining transfers well across languages	1) Interspeech 2021; 2) Interspeech 2021; 3) ICASSP 2020
	8 February	1) Layer-Wise Analysis of a Self-Supervised Speech Representation Model; 2) Comparative layer-wise analysis of self-supervised speech models	1) ASRU 2021; 2) ICASSP 2023
	15 February	Toward a realistic model of speech processing in the brain with self-supervised learning	NeurIPS 2022
Carolina Villamizar Agudelo	29 February	1) ContentVec: An improved self-supervised speech representation by disentangling speakers; 2) Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering	1) PMLR 2022; 2) Interspeech 2023
Addison Weatherhead	7 March	Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads	Interspeech 2023
	21 March	1) DistilHuBERT: Speech Representation Learning by Layer-Wise Distillation of Hidden-Unit BERT; 2) LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT	1) ICASSP 2022; 2) Interspeech 2022
	28 March	1) Learning dependencies of discrete speech representations with neural hidden Markov models; 2) A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning	1) ICASSP 2023; 2) Interspeech 2020
Merrick Liu, Dunyang Ni, Michael Ong	4 April	1) Do self-supervised speech models develop human-like perception biases?; 2) Evaluating computational models of infant phonetic learning across languages	1) ACL 2022; 2) CogSci 2020

Additional Readings for the Lectures


Title	Author	Publication Details
Spoken Language Processing	X. Huang, A. Acero and H.-W. Hon	Prentice Hall, 2001.
Discrete-Time Processing of Speech Signals	J.R. Deller, Jr., J.H.L. Hansen, and J.G. Proakis	IEEE Press, 2000.
Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.	T. Hastie, R. Tibshirani, and J. Friedman	Springer, 2009.
Open Finite-State Transducer Tutorial	C. Allauzen, M. Jansche and M. Riley


Topic of this year's offering

Foundations of Foundational Models


Calendar of important course-related events


Date	Event
Thu, 11 January	First meeting
Mon, 22 January	Last day to add course
Tue, 20 February	Last day to drop course
Thu, 22 February	Reading Week
Thu, 4 April	Last meeting
Fri, 19 April	Final papers/projects due


Evaluation

Your final mark will be determined by a term paper/project and an in-class presentation of a paper. The relative weights of these components are shown in the table below:


(Best) class presentation	30%
Term paper/project	70%


Announcements

In this space, you will find announcements related to the course. Please check this space at least weekly.

11 January: We will observe the FAS reading week.
11 January: CSC 2511 is recommended as a prerequisite for this class.


Lecture Slides

These are posted on Quercus.

Gerald Penn, 31 March, 2024
This web-page was adapted from the web-page for another course, created by Vassos Hadzilacos.