Physiological Models of Production in Speech Recognition

Abstract


My work in natural language and speech processing has overlapped with machine learning, human-computer interaction, computer vision, speech-language pathology, rehabilitation engineering, digital signal processing, linguistics, and speech science.

I am interested in intelligent augmentative technologies for people with physical and cognitive disabilities. This is an emerging field of research, driven in part by aging populations in Western nations, which are expected to bring an increased prevalence of dementia (e.g., Alzheimer's disease) and of communication difficulties following stroke.

My doctoral research applied artificial intelligence to understanding dysarthria, a set of speech deficits caused by neurological trauma to the speech-motor interface. Dysarthric speakers can understand and produce abstract language, but lack the articulatory coordination to produce fully intelligible speech.

My work in automatic speech recognition significantly improved the rates of correct recognition for individuals with dysarthria (caused by cerebral palsy) by building statistical knowledge of speech production into the recognition process. That knowledge came from aligned acoustic and articulatory speech data, recorded using electromagnetic articulography, which we collected from dysarthric individuals in collaboration with the Oral Dynamics Lab in the Faculty of Medicine, the Holland-Bloorview Kids Rehab Hospital in Toronto, and the Ontario Federation for Cerebral Palsy.

My work also involved the creation of a novel augmentative communication system that transforms hard-to-understand speech signals so that they are more intelligible to typical listeners. That work was nominated for an NSERC Innovation Challenge award and has been provisionally patented (US 78053/00002).

More information can be found in my scientific publications, especially my doctoral dissertation and one of my seminar talks.