Machine Learning for Medicine
I work with clinicians and data scientists to develop NLP methods for extracting patient information from clinical notes and social media data, and to better understand the nosology of complex diseases.
Exploring Complex Mental Health Symptoms via Classifying Social Media Data with Explainable LLMs
ML4H at NeurIPS, 2024
CLPsych at EACL, 2024
Thrombosis Research, 2021
Interpretability and Representation Learning
How do neural networks actually process information? I investigate what features neural networks learn and how they use them, from understanding positional encodings in transformers to how ConvNets process color and intensity.
NeurReps Workshop at NeurIPS 2025
ECLR Workshop at ICCV, 2025
OpenSUN3D Workshop at ICCV, 2025
Breaking Symmetry When Training Transformers
NAACL Student Research Workshop, 2024
ICLR Tiny Papers, 2024
Synthetic Datasets for Exploring How ConvNets and ViTs Classify Images When Colour is An Important Cue
CRV Workshops, 2024
How Do ConvNets Understand Image Intensity?
ICLR Tiny Papers, 2023
ML for Understanding Creativity, Psychology, and Cognitive Science
What makes a chess move "brilliant"? What makes a Wordle game "amusing"? I use machine learning to understand human creativity, cognitive biases, and social psychology.
Semantic, Orthographic, and Phonological Biases in Humans' Wordle Gameplay
Findings of the ACL: IJCNLP-AACL 2025
Exploring the "Honour Culture" Theory Using Social Media Data
Workshop on Social Influence in Conversations at EMNLP, 2024
Quantifying the Complexity of Literary Fiction
NLP for Digital Humanities Workshop at EMNLP, 2024
Predicting User Perception of Move Brilliance in Chess
ICCC 2024. Featured in New Scientist and CBC Radio.
Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency
ML for Cognitive and Mental Health Workshop at AAAI, 2024
AI and Data Science Applications
From materials design to travel modeling to scientific literature analysis, I work on applying machine learning and data science to diverse real-world problems.
Automatically Extracting Scientific Metrics with LLMs: A Case Study of ImageNet Papers
Workshop at NeurIPS 2025
ECLR Workshop at ICCV, 2025
ICLR Tiny Papers, 2024
Learning Latent Factor Models of Travel Data for Travel Prediction and Analysis
Canadian Conference on AI (AI 2014). Best Paper Award.
Pedagogy of Introductory Programming
How do we teach programming effectively? I develop assignments and study teaching methods for introductory computer science courses.
Synergies between Intro to Data Science and Intro to Programming via Purely Functional Programming
Statistical Society of Canada (SSC) Annual Meeting, 2025
Getting Used to Pointers with Pointer Drills
ITiCSE 2025
A Tournament for Pong AI Engines
Nifty Assignments at SIGCSE 2018
Automatically Solving SAT/TOEFL Synonym Questions with Computational Linguistics
Nifty Assignments at SIGCSE 2017
Pedagogy of Data Science
How does data science education relate to computer science education? I explore pathways for students to engage with data science and computational thinking.
SIGCSE 2021
Introduction to Data Science as a Pathway to Further Study in Computing
ICER 2019
Pedagogy of Machine Learning
I design model AI assignments and explore effective ways to teach machine learning concepts, emphasizing interpretation of models and thinking about data.
GPT is Coming to Class: A Song
Education Program at NeurIPS 2025 (on YouTube!)
Understanding How Neural Networks See (And Read): A Slide Deck
Education Program at NeurIPS 2025
Occam's Razor and Bender and Koller's Octopus
Workshop on Teaching NLP at ACL, 2024
Predicting and Preventing Deaths in the ICU: Designing and Analyzing an AI System
Model AI Assignments at EAAI 2020
AI Education Matters: Building a Fake News Detector
AI Matters 5(3), 2019
Model AI Assignments at EAAI 2019
Teaching with Deep Learning Frameworks in Introductory Machine Learning Courses
AI Matters 4(3), 2018
Understanding How Recurrent Neural Networks Model Text
Model AI Assignments at EAAI 2018
Neural Networks for Face Recognition with TensorFlow
Model AI Assignments at EAAI 2018
Statistics
Automatic Model Selection using Wasserstein Generative Adversarial Networks
Statistical Society of Canada (SSC) Annual Meeting, 2024
"Medium-n studies" in computing education conferences
Koli Calling 2023
For an episode of CBC Marketplace, we analyzed food safety inspection data for restaurant chain locations nationwide and produced rankings of restaurants for each city and nationwide. We show how to combine data from different cities, where inspector standards and levels of compliance vary, by modelling the number of violations detected with quasi-Poisson regression, using the city and the chain as covariates. We subsequently fit hierarchical Bayesian models to the data to identify more differences between chains and to model the data better.
Episode video: Canada's Restaurant Secrets, broadcast on Apr 11, 2014 on CBC. Watch for the Poisson regression formula at 5 min 48 sec! (See screenshot, or watch on YouTube.)
Technical report: Michael Guerzhoy and Nathan Taback, Ranking Restaurant Chains by the Number of Health Violations Found during Inspections.
Contributed conference talk: Hierarchical Bayesian Models for Uncertainty-Quantified Ranking of Restaurant Chains by Food Safety Compliance (French version), at the 43rd Annual Meeting of the Statistical Society of Canada, June 2015, Halifax, NS.
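The city-and-chain regression described above can be illustrated with a minimal sketch. This is not the code from the technical report: the data, the city and chain names, and the function names are all hypothetical. It relies on two standard facts. For a Poisson model with a log link and only main effects, the maximum-likelihood fit reproduces the observed total for every city and every chain, so the fit can be found by alternately rescaling city and chain effects. Quasi-Poisson regression keeps the same point estimates and only inflates the variance by a dispersion factor, estimated here as the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch assumes every city and every chain has at least one recorded violation.

```python
from collections import defaultdict

# Hypothetical toy data: (city, chain, violations found in one inspection).
records = [
    ("city_1", "chain_a", 1), ("city_1", "chain_a", 2),
    ("city_1", "chain_b", 4), ("city_1", "chain_c", 6),
    ("city_2", "chain_a", 3), ("city_2", "chain_b", 7),
    ("city_2", "chain_b", 9), ("city_2", "chain_c", 12),
]


def fit_effects(records, n_iter=200):
    """Fit expected violations = city_effect * chain_effect by matching totals."""
    city_total, chain_total = defaultdict(float), defaultdict(float)
    for city, chain, y in records:
        city_total[city] += y
        chain_total[chain] += y
    city_eff = {c: 1.0 for c in city_total}
    chain_eff = {k: 1.0 for k in chain_total}
    for _ in range(n_iter):
        # Rescale city effects so fitted city totals equal observed totals.
        exposure = defaultdict(float)
        for city, chain, _y in records:
            exposure[city] += chain_eff[chain]
        city_eff = {c: city_total[c] / exposure[c] for c in city_total}
        # Rescale chain effects so fitted chain totals equal observed totals.
        exposure = defaultdict(float)
        for city, chain, _y in records:
            exposure[chain] += city_eff[city]
        chain_eff = {k: chain_total[k] / exposure[k] for k in chain_total}
    return city_eff, chain_eff


def pearson_dispersion(records, city_eff, chain_eff):
    """Quasi-Poisson dispersion: Pearson chi-square over residual df."""
    # One degree of freedom is lost to the arbitrary overall scale split.
    n_params = len(city_eff) + len(chain_eff) - 1
    chi2 = 0.0
    for city, chain, y in records:
        mu = city_eff[city] * chain_eff[chain]
        chi2 += (y - mu) ** 2 / mu
    return chi2 / (len(records) - n_params)


city_eff, chain_eff = fit_effects(records)
dispersion = pearson_dispersion(records, city_eff, chain_eff)
# Chains ordered from fewest to most expected violations, adjusted for city.
ranking = sorted(chain_eff, key=chain_eff.get)
print(ranking, round(dispersion, 3))
```

The city and chain effects are only determined up to a shared scale factor, but the ordering of the chain effects does not depend on that factor, which is why it can be used for ranking. In practice a GLM routine from a statistics package would also report standard errors scaled by the dispersion.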
Computer Vision (pre-2020)
ConvNets for Photo Orientation Detection
We apply a ConvNet to the task of photo orientation detection, and produce visualizations to help demonstrate how the ConvNet accomplishes the task.
Paper: Ujash Joshi and Michael Guerzhoy, Automatic Photo Orientation Detection with Convolutional Neural Networks, in Proc. of the Conference on Computer and Robot Vision (CRV 2017), May 2017, Edmonton, Alberta.
Computer Vision for Speech Analysis
If you compute the spectrogram of a sound signal, you can treat it like an image (kind of) and apply object detection algorithms to analyze it. Specifically, I worked on phone classification.
Project report (MSc paper): Michael Guerzhoy, Boosting Local Spectro-Temporal Features for Speech Analysis, 2010. (Online abstract.)
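To make the "spectrogram as an image" idea concrete, here is a minimal sketch that turns a signal into a 2D array of magnitudes indexed by time frame and frequency bin. It is not the method from the project report: the function name, frame length, hop size, sample rate, and test tone are all assumptions chosen for illustration, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return magnitudes[t][k]: frame t, frequency bin k (0..frame_len//2)."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame at bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(256)]

spec = spectrogram(signal)
# The strongest bin in each frame; bin k corresponds to k * sample_rate / 64 Hz.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
print(len(spec), len(spec[0]), peak_bins)
```

Each row of `spec` is one column of the "image". Local patches of this array are what image-style feature detectors would operate on, with the caveat that the two axes (time and frequency) are not interchangeable the way the axes of a photo are.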
Background Colour Detection/Rectangular Object Detection
For background colour detection, we exploit two facts: the background colour appears in patches, and the edge statistics of the boundary between background and non-background can be predicted.
We also describe a rectangle detection algorithm based on perceptual organization, and use a large synthetically generated set to tune its parameters.
The intended application is streamlining the scanning of documents such as photos and business cards on a flatbed scanner.
Paper: Michael Guerzhoy and Hui Zhou. Segmentation of Rectangular Objects Lying on an Unknown Background in a Small Preview Scan Image. In Proc. of the Canadian Conference on Computer and Robot Vision (CRV 2008), May 2008, Windsor, Ontario.
Photo Orientation Detection
We developed a system that determines the orientation of the input photo (one of 0, 90, 180, or 270 degrees). You can try it if you have an Epson scanner.
Patent: Michael Guerzhoy and Hui Zhou. Method and system for automatically determining the orientation of a digital image. U.S. Patent 8,094,971, issued Apr 30, 2013.
YouTube review: "Auto-photo orientation: I've tested this feature and it works good... Without this feature checked, you need to place the upper-left-hand corner of the photo face down in the lower-left corner of the scanner. With this feature turned on (or checked), you can place any corner of the photo, face down in the lower-left corner of the scanner, and the Epson Perfection does a good job of making sure the photo is right-side-up after scanning. This is a good feature when you can't remember which corner of the photo you need to place down."