Datasets for CL and NLP - A collection of datasets for research in computational linguistics and natural language processing.

CatGO - A lightweight Python implementation of popular categorization models with automatic parameter optimization.

Kara One - Multimodal database of imagined and articulated speech recorded with electroencephalography (EEG), video face tracking, and speech acoustics during phonologically relevant language tasks. 14 participants, 24 GB.

TORGO - Acoustic-articulatory database of people with and without dysarthria caused by cerebral palsy. Articulatory data from AG500 electromagnetic articulography. 8 participants with dysarthria, 7 without, 18 GB.