Research

// Master's

Predicting clinical scores for Alzheimer's disease and related dementias using language features

Used a set of 477 lexicosyntactic, acoustic, and semantic features extracted from 393 speech samples in DementiaBank to predict clinical MMSE scores, an indicator of the severity of cognitive decline associated with dementia. Used a bivariate dynamic Bayes net (Kalman filter) to represent the longitudinal progression of observed linguistic features and MMSE scores over time, and obtained a mean absolute error (MAE) of 3.83 in predicting MMSE, comparable to within-subject interrater standard variation of 3.9 to 4.8 (Molloy, 1991). Ongoing work focuses on using longitudinal data samples to improve predictions, and on developing an online data acquisition platform.
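As a rough illustration of the filtering idea, the sketch below is a simplified scalar Kalman filter with random-walk dynamics, followed by an MAE computation. It is not the bivariate model used in this work: the function names, the noise variances `q` and `r`, and the prior `x0`, `p0` are all hypothetical values chosen for illustration.

```python
def kalman_1d(observations, q=1.0, r=4.0, x0=25.0, p0=10.0):
    """Scalar random-walk Kalman filter (illustrative assumption, not the thesis model).

    observations: noisy per-visit score estimates, in time order
    q: process noise variance, r: observation noise variance
    x0, p0: prior mean and variance of the latent score
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: under a random walk the mean carries over and the variance grows.
        p = p + q
        # Update: blend the prediction and the observation using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


def mean_absolute_error(predicted, actual):
    """Average absolute difference between paired predictions and true scores."""
    return sum(abs(a - b) for a, b in zip(predicted, actual)) / len(actual)
```

With equal prior and observation variances and no process noise, the gain is 0.5, so a single observation of 30 against a prior mean of 20 yields a filtered estimate of 25.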

// Undergraduate Thesis

Automatic detection of deception in child speech

Worked with Prof. Frank Rudzicz on feature analysis and the use of syntactic features for automatic detection of deception in natural language. Applied the resulting syntactic feature set to the classification of varying degrees of truth-telling and deception in real transcriptions of interviews with 4- to 7-year-old children awaiting court appearances for suspected child abuse cases in the Los Angeles County Dependency Court. Compared the performance of a neural network, a support vector machine, and a random forest classifier against a baseline.

// Undergraduate Summer Research

Software for Collaborative Science: MyeLink

Worked with Prof. Steve Easterbrook on developing software tools for climate scientists. Developed MyeLink: a MediaWiki plugin written in PHP on a LAMP stack, which automatically generates a list of related wiki articles based on lexico-semantic and structural features of the current page, without requiring explicit user-provided search terms. Visualized similarity between pages by creating a custom word-cloud generator using PHP's GD graphics library.
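The general idea of suggesting related pages without a search query can be sketched as follows. This is a minimal illustration in Python (the plugin itself was written in PHP), and it assumes plain term-frequency cosine similarity, which stands in for the lexico-semantic and structural features MyeLink actually used. The function names and the `pages` dictionary are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercased bag-of-words counts for a page's text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def related_pages(current_title, pages, top_n=3):
    """Rank other pages by similarity to the current one; drop pages with no overlap."""
    current = term_counts(pages[current_title])
    scores = [
        (cosine(current, term_counts(text)), title)
        for title, text in pages.items()
        if title != current_title
    ]
    # Highest similarity first; ties broken alphabetically by title.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [title for score, title in scores[:top_n] if score > 0]
```

The same per-term counts could also drive a word cloud, with each term's font size scaled by its count.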