CSC2518 :: Spoken Language Processing :: University of Toronto

Contact information

Instructor		Frank Rudzicz
Office		550 University Ave., rm 12-175, Toronto ON, M5G 2A2
Office hours		ad hoc
Office phone		416 597 3422 x7971
Email		frank@cs.toronto.[EDUCATION] (fix the suffix)

Meeting times

Lectures

11h00-13h00 in BA 1200

Course outline

This is a graduate course on speech processing by machine, broadly covering digital signal processing, automatic speech recognition, and speech synthesis. The theme this year is Speech in healthcare and assistive technologies, which will include automatic dictation of speech for medical records, analysis of speech in language pathologies (e.g., in cerebral palsy, Parkinson's disease, and Alzheimer's disease), and assistive technologies such as text-to-speech (with and without brain-computer interfaces) for people with limited speech ability.

News and announcements

LECTURE CANCELLED: 22 September. If you wish to discuss your project proposal, please contact me directly.

Lecture materials

Week	Title	Speaker	Supplemental material
8 Sep.	Introduction to speech signal processing	Frank Rudzicz
15 Sep.	Introduction to clinical and biomedical aspects of speech	Frank Rudzicz
29 Sep.	(1 hour) B.N. Pasley, S.V. David, N. Mesgarani, A. Flinker, S.A. Shamma, N.E. Crone, R.T. Knight, E.F. Chang (2012) Reconstructing Speech from Human Auditory Cortex. PLoS Biology, 10(1):1-13. (1/2 hour) H-Y Wu, M. Rubinstein, E. Shih, J. Guttag, F. Durand, W.T. Freeman (2012) Eulerian video magnification for revealing subtle changes in the world. ACM Transactions on Graphics, 31(4).	Alex Francois-Nienaber Orion Buske	Alex's slides. Sound reconstructed from the brain. Orion's slides. Website.
6 Oct.	(1 hour) L. Feenaughty, K. Tjaden, J. Sussman (2014) Relationship between acoustic measures and judgments of intelligibility in Parkinson's disease: A within-speaker approach. Clinical Linguistics & Phonetics, pages 1-22.	Teresa Valenzano	Teresa's slides.
20 Oct.	(1/2 hour) K.L. Lansford, J.M. Liss (2014) Vowel acoustics in dysarthria: Speech disorder diagnosis and classification. Journal of Speech, Language, and Hearing Research, 57, pages 57-67 (1/2 hour) A. Temko, C. Nadeu, W. Marnane, G. Boylan, G. Lightbody (2011) EEG Signal Description with Spectral-Envelope-Based Speech Recognition Features for Detection of Neonatal Seizures. IEEE Transactions on Information Technology in Biomedicine, 15(6): 839-847. TBD	Gillian DeBoer Ladislav Rampasek Narges Norouzi	Gillian's slides. Ladislav's slides.
27 Oct.	(1/2 hour) K. Brigham, B.V.K.V. Kumar (2010) Imagined Speech Classification with EEG Signals for Silent Communication: A Preliminary Investigation into Synthetic Telepathy. Proceedings of IEEE International Conference on Bioinformatics and Biomedical Engineering (iCBBE), pages 1-4. (1/2 hour) C.S. DaSalla, H. Kambara, M. Sato, Y. Koike (2009) Single-trial classification of vowel speech imagery using common spatial patterns. Neural Networks, 22(9):1334-1339. TBD	Peter Hamilton Peter Hamilton Kuan-Chieh Wang	Peter's slides.
3 Nov.	(1/2 hour) J. Lee, K.C. Hustad, G. Weismer (2014) Predicting Speech Intelligibility With a Multiple Speech Subsystems Approach in Children With Cerebral Palsy. Journal of Speech, Language, and Hearing Research, preprint. (1/2 hour) R. Patel (2002) Prosodic Control in Severe Dysarthria. Journal of Speech, Language, and Hearing Research, 45(5):858-870. (1/2 hour) A.J. Sporka, T. Felzer, S.H. Kurniawan, O. Poláček, P. Haiduk, and I.S. MacKenzie (2011) CHANTI: predictive text entry using non-verbal vocal input. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 2463-2472.	Gillian DeBoer Aryan Arbabi Aryan Arbabi	Gillian's slides. Aryan's slides.
10 Nov.	(1 hour) S. Petrik, C. Drexel, L. Fessler, J. Jancsary, A. Klein, G. Kubin, J. Matiasek, F. Pernkopf, H. Trost (2011) Semantic and phonetic automatic reconstruction of medical dictations. Computer Speech & Language, 25(2):363-385.	Arjun Subramanian	Arjun's slides.
24 Nov.	(1 hour) Y. Yunusova, J.S. Rosenthal, K. Rudy, M. Baljko, J. Daskalogiannakis (2012) Positional targets for lingual consonants defined using electromagnetic articulography. Journal of the Acoustical Society of America, 132(2):1027–1038. (1/2 hour) D. Bone, T. Chaspari, K. Audhkhasi, J. Gibson, A. Tsiartas, M. Van Segbroeck, M. Li, S. Lee, S. Narayanan (2013) Classifying language-related developmental disorders from speech cues: the promise and the potential confounds. In Proceedings of INTERSPEECH 2013, pages 182-186. (1/2 hour) P.O. Kristensson, K. Vertanen (2012) The Potential of Dwell-Free Eye-Typing for Fast Assistive Gaze Communication. Proceedings of ETRA 2012, pages 241-244, Santa Barbara CA.	Rojin Majd Ladislav Rampasek Orion Buske	Rojin's slides. Ladislav's slides. Orion's slides.
1 Dec.	(1 hour) A.B. Kain, J.-P. Hosom, X. Niu, J.P.H. van Santen, M. Fried-Oken, J. Staehely (2007) Improving the intelligibility of dysarthric speech. Speech Communication, 49(9):743-759. (1 hour) E.W. Healy, S.E. Yoho, Y. Wang, D. Wang (2013) An algorithm to improve speech recognition in noise for hearing-impaired listeners. Journal of the Acoustical Society of America, 134(4):3029-38.	Stacey June Oue Sara Sabour Rouh Aghdam	Stacey's slides. Sara's slides.
8 Dec.	(1/2 hour) A. Tsanas, M.A. Little, P.E. McSharry, J. Spielman, L.O. Ramig (2012) Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease. IEEE Transactions on Biomedical Engineering, 59(5):1264-1271. (1/2 hour) D. Hakkani-Tur, D. Vergyri, G. Tur (2010) Speech-based automated cognitive status assessment. Proceedings of Interspeech 2010, pages 1-4. (1 hour) T. Nose and T. Kobayashi (2011) Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency. Speech Communication, 53(7):973-985.	Maria Yancheva Maria Yancheva Moritz Stiefel

Optional readings - general introduction

Optional	Foundations of Statistical Natural Language Processing	C. Manning and H. Schutze	Errata Free online edition (free if you're on a UofT computer or VPN)
Optional	Speech and Language Processing	D. Jurafsky and J.H. Martin	Errata
Optional	Spoken Language Processing: A Guide to Theory, Algorithm, and System Development	X. Huang, A. Acero, and H.-W. Hon

Evaluation policies

General

You will be graded on a 1-hour in-class presentation (or two half-hour presentations), overall participation, and a final project report. The relative proportions of these grades are as follows:

Class presentation/participation		20%
Final project		80%

Collaboration and plagiarism

No collaboration or plagiarism in either the class presentation or the project is permitted. The work you submit must be your own. 'Collaboration' in this context includes but is not limited to sharing of source code, correction of another's source code, or uncited copying of a previous work. See Academic integrity at the University of Toronto.

Course project

Although you will be expected to submit all source code, and possibly be called upon to give a demonstration, you will be marked on typical factors in academic publications, namely 1) originality, 2) sufficient survey of existing work, 3) technical correctness, 4) empirical methods, 5) overall presentation. You will submit a report in the style of an academic publication according to one of:

Calendar

8 September 2014		First lecture
22 September 2014		Last day to add CSC 2518
27 October 2014		Last day to drop CSC 2518
TBD		Last lecture
15 December 2014		Final project due

See Dates for graduate students.

Old website

Here is the website for the iteration of this course offered in 2011, with additional handouts: CSC2518 2011 webpage

CSC2518 - Spoken Language Processing

Speech in healthcare and assistive technologies - Fall 2014
