Machine Learning for Medicine
I work with clinicians and data scientists to develop NLP methods for extracting patient information from clinical notes and social media data, and for better understanding the nosology of complex diseases.
Exploring Complex Mental Health Symptoms via Classifying Social Media Data with Explainable LLMs
ML4H at NeurIPS, 2024
CLPsych at EACL, 2024
Thrombosis Research, 2021
Interpretability and Representation Learning
How do neural networks actually process information? I investigate what features neural networks learn and how they use them, from understanding positional encodings in transformers to how ConvNets process color and intensity.
NeurReps Workshop at NeurIPS 2025
ECLR Workshop at ICCV, 2025
OpenSUN3D Workshop at ICCV, 2025
Breaking Symmetry When Training Transformers
NAACL Student Research Workshop, 2024
ICLR Tiny Papers, 2024
Synthetic Datasets for Exploring How ConvNets and ViTs Classify Images When Colour is An Important Cue
CRV Workshops, 2024
How Do ConvNets Understand Image Intensity?
ICLR Tiny Papers, 2023
ML for Understanding Creativity, Psychology, and Cognitive Science
What makes a chess move "brilliant"? What makes a Wordle game "amusing"? I use machine learning to understand human creativity, cognitive biases, and social psychology.
Semantic, Orthographic, and Phonological Biases in Humans' Wordle Gameplay
Findings of the ACL: IJCNLP-AACL 2025
Exploring the "Honour Culture" Theory Using Social Media Data
Workshop on Social Influence in Conversations at EMNLP, 2024
Quantifying the Complexity of Literary Fiction
NLP for Digital Humanities Workshop at EMNLP, 2024
Predicting User Perception of Move Brilliance in Chess
ICCC 2024. Featured in New Scientist and CBC Radio
Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency
ML for Cognitive and Mental Health Workshop at AAAI, 2024
AI and Data Science Applications
From materials design to travel modeling to scientific literature analysis, I work on applying machine learning and data science to diverse real-world problems.
Automatically Extracting Scientific Metrics with LLMs: A Case Study of ImageNet Papers
Workshop at NeurIPS 2025
ECLR Workshop at ICCV, 2025
ICLR Tiny Papers, 2024
Learning Latent Factor Models of Travel Data for Travel Prediction and Analysis
Canadian Conference on AI (AI 2014), Best Paper Award
Pedagogy of Introductory Programming
How do we teach programming effectively? I develop assignments and study teaching methods for introductory computer science courses.
Synergies between Intro to Data Science and Intro to Programming via Purely Functional Programming
Statistical Society of Canada (SSC) Annual Meeting, 2025
Getting Used to Pointers with Pointer Drills
ITiCSE 2025
A Tournament for Pong AI Engines
Nifty Assignments at SIGCSE 2018
Automatically Solving SAT/TOEFL Synonym Questions with Computational Linguistics
Nifty Assignments at SIGCSE 2017
Pedagogy of Data Science
How does data science education relate to computer science education? I explore pathways for students to engage with data science and computational thinking.
SIGCSE 2021
Introduction to Data Science as a Pathway to Further Study in Computing
ICER 2019
Pedagogy of Machine Learning
I design model AI assignments and explore effective ways to teach machine learning concepts, emphasizing interpretation of models and thinking about data.
GPT is Coming to Class: A Song
Education Program at NeurIPS 2025 (on YouTube!)
Understanding How Neural Networks See (And Read): A Slide Deck
Education Program at NeurIPS 2025
Occam's Razor and Bender and Koller's Octopus
Workshop on Teaching NLP at ACL, 2024
Predicting and Preventing Deaths in the ICU: Designing and Analyzing an AI System
Model AI Assignments at EAAI 2020
AI Education Matters: Building a Fake News Detector
AI Matters 5(3), 2019
Model AI Assignments at EAAI 2019
Teaching with Deep Learning Frameworks in Introductory Machine Learning Courses
AI Matters 4(3), 2018
Understanding How Recurrent Neural Networks Model Text
Model AI Assignments at EAAI 2018
Neural Networks for Face Recognition with TensorFlow
Model AI Assignments at EAAI 2018
Statistics
Automatic Model Selection using Wasserstein Generative Adversarial Networks
Statistical Society of Canada (SSC) Annual Meeting, 2024
"Medium-n studies" in computing education conferences
Koli Calling 2023
For an episode of CBC Marketplace, we analyzed data from food safety inspections of restaurant chain locations nationwide and produced rankings of restaurants, both for each city and nationwide. We show how to combine data from different cities, where inspector standards and levels of compliance vary, by modelling the number of violations detected using quasi-Poisson regression with the city and the chain as covariates. We subsequently worked on fitting hierarchical Bayesian models to the data to identify more differences between chains and to model the data better.
Episode video: Canada's Restaurant Secrets, broadcast on CBC on Apr 11, 2014. Watch for the Poisson regression formula at 5 min 48 sec! (See screenshot, or watch on YouTube.)
Technical report: Michael Guerzhoy and Nathan Taback, Ranking Restaurant Chains by the Number of Health Violations Found during Inspections.
Contributed conference talk: Hierarchical Bayesian Models for Uncertainty-Quantified Ranking of Restaurant Chains by Food Safety Compliance (French version), at the 43rd Annual Meeting of the Statistical Society of Canada, June 2015, Halifax, NS.
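As a rough illustration of the quasi-Poisson regression described above, one plausible way to write the model is sketched below. The notation is an assumption made for this sketch and is not taken from the technical report: $Y_{ijk}$ is the number of violations found in inspection $k$ of a location of chain $j$ in city $i$, $\alpha_i$ is a city effect, $\gamma_j$ is a chain effect, and $\phi$ is a dispersion parameter. The actual specification (for example, any offsets or additional covariates) may differ.

```latex
% Hedged sketch of a quasi-Poisson model with city and chain as covariates.
% All symbols are assumed notation for illustration only.
\begin{align*}
  \log \mathbb{E}[Y_{ijk}] &= \beta_0 + \alpha_i + \gamma_j, \\
  \operatorname{Var}(Y_{ijk}) &= \phi \, \mathbb{E}[Y_{ijk}].
\end{align*}
% \phi = 1 recovers ordinary Poisson regression; \phi > 1 allows overdispersion.
```

Under this formulation, the city effect $\alpha_i$ absorbs differences in inspector standards and overall compliance between cities, so chains can be compared across cities through their estimated $\gamma_j$: a smaller $\gamma_j$ corresponds to fewer expected violations per inspection, other things being equal.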