CSC 2518 -- Spoken Language Processing

  Spring 2024



Contact information

Instructor: Gerald Penn
Office: PT 283 (St. George campus)
Tel: 978-7390
Email: gpenn@cs.utoronto.ca

Meeting times

Lectures: Thursdays (R), 1-3

Presented Readings

 

Who: Sinclair Hudson, Gerald Penn
When: 1 February
What (Where):
1) Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders (ICASSP 2020)
2) Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies (Interspeech 2021)
3) TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech (TASLP 2021)

Who: Megan Cao, Zixin Zhao
When: 8 February
What (Where):
1) Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training (Interspeech 2021)
2) LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech (Interspeech 2021)
3) Unsupervised pretraining transfers well across languages (ICASSP 2020)

Who: Xin Peng
When: 8 February
What (Where):
1) Layer-Wise Analysis of a Self-Supervised Speech Representation Model (ASRU 2021)
2) Comparative layer-wise analysis of self-supervised speech models (ICASSP 2023)

Who: Borong Xu
When: 15 February
What (Where): Toward a realistic model of speech processing in the brain with self-supervised learning (NeurIPS 2022)

Who: Carolina Villamizar Agudelo
When: 29 February
What (Where):
1) ContentVec: An improved self-supervised speech representation by disentangling speakers (PMLR 2022)
2) Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering (Interspeech 2023)

Who: Addison Weatherhead
When: 7 March
What (Where): Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads (Interspeech 2023)

Who: Chang Yuan
When: 21 March
What (Where):
1) DistilHuBERT: Speech Representation Learning by Layer-Wise Distillation of Hidden-Unit BERT (ICASSP 2022)
2) LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT (Interspeech 2022)

Who: Ken Shi
When: 28 March
What (Where):
1) Learning dependencies of discrete speech representations with neural hidden Markov models (ICASSP 2023)
2) A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning (Interspeech 2020)

Who: Merrick Liu, Dunyang Ni, Michael Ong
When: 4 April
What (Where):
1) Do self-supervised speech models develop human-like perception biases? (ACL 2022)
2) Evaluating computational models of infant phonetic learning across languages (CogSci 2020)

Additional Readings for the Lectures


Title | Author | Publication Details
Spoken Language Processing | X. Huang, A. Acero and H.-W. Hon | Prentice Hall, 2001
Discrete-Time Processing of Speech Signals | J.R. Deller, Jr., J.H.L. Hansen, and J.G. Proakis | IEEE Press, 2000
Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. | T. Hastie, R. Tibshirani, J. Friedman | Springer, 2009
Open Finite-State Transducer Tutorial | C. Allauzen, M. Jansche and M. Riley |



Topic of this year's offering


Calendar of important course-related events


Date Event
Thu, 11 January First meeting
Mon, 22 January Last day to add course
Tue, 20 February Last day to drop course
Thu, 22 February Reading Week
Thu, 4 April Last meeting
Fri, 19 April Final papers/projects due



Evaluation

Your final mark will be determined by a term paper or project and an in-class presentation of a paper. The weight of each component towards the final mark is shown in the table below:
 

(Best) class presentation 30%
Term paper/project 70%



Announcements

In this space, you will find announcements related to the course. Please check this space at least weekly.

Lecture Slides



Gerald Penn, 31 March, 2024
This web-page was adapted from the web-page for another course, created by Vassos Hadzilacos.