Statistical models for ranking

screenshot from the tv show marketplace

For an episode of CBC Marketplace, we analyzed data from food safety inspections of restaurant chain locations nationwide and produced rankings of restaurants for each city and nationwide. We show how to combine data from different cities, in which inspector standards and levels of compliance vary, by modelling the number of violations detected using quasi-Poisson regression, with the city and the chain as covariates. We subsequently fit hierarchical Bayesian models to the data to identify more differences between chains and to model the data better.
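
As a rough illustration of the idea (not the code or data from the report), the sketch below fits a log-linear Poisson model with city and chain as categorical covariates and then computes a quasi-Poisson dispersion estimate. The function name, the alternating-update fitting scheme, and the toy inspection records are all assumptions made for this example.

```python
from collections import defaultdict

def fit_city_chain_model(records, n_iter=200):
    """records: list of (city, chain, violations) tuples, one per inspection.

    Fits E[violations] = city_effect[city] * chain_effect[chain], i.e. a
    log-linear Poisson model with city and chain as categorical covariates.
    Assumes every city and every chain has at least one violation recorded.
    """
    cities = sorted({c for c, _, _ in records})
    chains = sorted({k for _, k, _ in records})
    city_eff = {c: 1.0 for c in cities}
    chain_eff = {k: 1.0 for k in chains}

    for _ in range(n_iter):
        # Each update solves the Poisson score equations for one factor:
        # observed total = fitted total within every level of that factor.
        observed, fitted = defaultdict(float), defaultdict(float)
        for c, k, y in records:
            observed[c] += y
            fitted[c] += chain_eff[k]
        city_eff = {c: observed[c] / fitted[c] for c in cities}

        observed, fitted = defaultdict(float), defaultdict(float)
        for c, k, y in records:
            observed[k] += y
            fitted[k] += city_eff[c]
        chain_eff = {k: observed[k] / fitted[k] for k in chains}

    # Quasi-Poisson dispersion: Pearson chi-square over residual degrees of freedom.
    n_params = len(cities) + len(chains) - 1
    pearson = 0.0
    for c, k, y in records:
        mu = city_eff[c] * chain_eff[k]
        pearson += (y - mu) ** 2 / mu
    dispersion = pearson / (len(records) - n_params)
    return city_eff, chain_eff, dispersion

# Made-up inspections: (city, chain, violations found).
records = [
    ("StrictCity", "ChainA", 6), ("StrictCity", "ChainA", 6),
    ("StrictCity", "ChainB", 9),
    ("LenientCity", "ChainA", 2),
    ("LenientCity", "ChainB", 3), ("LenientCity", "ChainB", 3),
    ("LenientCity", "ChainB", 3),
]
city_eff, chain_eff, dispersion = fit_city_chain_model(records)
ranking = sorted(chain_eff, key=chain_eff.get)  # fewest adjusted violations first
print(ranking, round(chain_eff["ChainA"] / chain_eff["ChainB"], 3))
```

In this toy data, the raw per-inspection averages (about 4.67 for ChainA and 4.5 for ChainB) would rank ChainA as worse, only because most of its inspections happened in the stricter city. The fitted chain effects, which adjust for the city, rank ChainA ahead.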

Episode video: Canada's Restaurant Secrets, broadcast on Apr 11, 2014 on CBC. Watch for the Poisson regression formula at 5 min 48 sec! (See screenshot, or watch on YouTube.)

Technical report: Michael Guerzhoy and Nathan Taback, Ranking Restaurant Chains by the Number of Health Violations Found during Inspections.

Contributed conference talk: Hierarchical Bayesian Models for Uncertainty-Quantified Ranking of Restaurant Chains by Food Safety Compliance (French version), at the 43rd Annual Meeting of the Statistical Society of Canada, June 2015, Halifax, NS.

Latent factor models of human travel

visualization of the latent affinities of Paris

We decompose the likelihood of traveling from A to B into three factors: the desirability of B as a destination, the affinity between source A and destination B, and the individual-varying propensity to travel the distance between A and B. By analyzing a large dataset of geotagged Flickr photos, we estimate the desirabilities of destinations on the map and affinities between locations, as well as discover clusters of individuals with varying propensities to travel large distances. We analyze the learned affinity factors to discover travel patterns within and across linguistic boundaries. We also analyze a dataset of Shanghai taxi trips.
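
A minimal sketch of the three-factor decomposition, under assumed functional forms: the affinity is taken to be the exponential of a dot product of latent vectors, and the distance propensity is an exponential decay with a per-user scale. These forms, the function name, and the toy locations are illustrative assumptions, not the parameterization used in the papers.

```python
import math

def transition_probs(src, user_scale, desirability, src_vecs, dst_vecs, coords):
    """P(destination | source, user) as a normalized product of three factors."""
    weights = {}
    for dst in desirability:
        if dst == src:
            continue
        # Factor 1: overall desirability of the destination.
        d = desirability[dst]
        # Factor 2: source-destination affinity from latent vectors (assumed form).
        a = math.exp(sum(p * q for p, q in zip(src_vecs[src], dst_vecs[dst])))
        # Factor 3: this user's propensity to travel the distance
        # (assumed exponential decay with a per-user length scale).
        t = math.exp(-math.dist(coords[src], coords[dst]) / user_scale)
        weights[dst] = d * a * t
    total = sum(weights.values())
    return {dst: w / total for dst, w in weights.items()}

# Hypothetical locations with made-up planar coordinates and 1-D latent vectors.
coords = {"A": (0.0, 0.0), "B": (0.0, -4.0), "C": (-2.0, 6.0)}
desirability = {"A": 3.0, "B": 1.0, "C": 3.0}
src_vecs = {"A": [1.0], "B": [1.0], "C": [-1.0]}
dst_vecs = {"A": [1.0], "B": [1.0], "C": [-1.0]}

homebody = transition_probs("A", 1.0, desirability, src_vecs, dst_vecs, coords)
wanderer = transition_probs("A", 10.0, desirability, src_vecs, dst_vecs, coords)
print(homebody, wanderer)
```

With these made-up numbers, the user with the larger distance scale assigns more probability to the far-away but desirable location C than the user with the smaller scale does.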

Paper: Michael Guerzhoy and Aaron Hertzmann, Learning Latent Factor Models of Human Travel. At NIPS Workshop on Social Network and Social Media Analysis: Methods, Models and Applications (Social 2012), Dec. 2012, Lake Tahoe, Nevada.

Paper: Michael Guerzhoy and Aaron Hertzmann, Learning Latent Factor Models of Travel Data for Travel Prediction and Analysis. In Proc. of the Canadian Conference on Artificial Intelligence (AI 2014), May 2014, Montreal, Quebec. Best Paper Award.

Project website

NLP for large-scale data analysis in healthcare

We work with clinicians to extract patient information from dictated patient charts.

Paper: D. Landsman et al., Cohort profile: St. Michael's Hospital Tuberculosis Database (SMH-TB), a retrospective cohort of electronic health record data and variables extracted using natural language processing, PLOS One, 2021.

Paper: A. A. Verma, H. Masoom, C. Pou-Prom, et al., Developing and validating natural language processing algorithms for radiology reports compared to ICD-10 codes for identifying venous thromboembolism in hospitalized medical patients, Thrombosis Research, 2021.

ConvNets for Photo Orientation Detection: improving performance and visualizing the system

Visualization of the reason a photo with birds was classified as upright

We apply a ConvNet to the task of photo orientation detection, and produce visualizations to help demonstrate how the ConvNet accomplishes the task.
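
Orientation detection is commonly framed as four-way classification over rotations of 0, 90, 180, and 270 degrees. The sketch below shows only the data side of that framing: generating labelled rotated copies of an upright image and undoing a predicted rotation. The ConvNet itself is omitted, and the helper names and the label convention (label k means k quarter-turns counter-clockwise) are assumptions made for this example.

```python
def rotate90_ccw(image):
    """Rotate a 2-D grid (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]

def orientation_examples(upright):
    """Return (rotated_image, label) pairs; label k means k quarter-turns CCW."""
    examples = []
    img = [list(row) for row in upright]
    for label in range(4):
        examples.append((img, label))
        img = rotate90_ccw(img)
    return examples

def undo_rotation(image, label):
    """Bring an image with the given orientation label back to upright."""
    for _ in range((4 - label) % 4):
        image = rotate90_ccw(image)
    return image

# A tiny 2x3 "photo" standing in for real pixel data.
upright = [[1, 2, 3],
           [4, 5, 6]]
examples = orientation_examples(upright)
restored = [undo_rotation(img, label) for img, label in examples]
print(all(r == upright for r in restored))
```

A classifier trained on such pairs would predict the label, and `undo_rotation` would then be applied to the input photo.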

Paper: Ujash Joshi and Michael Guerzhoy, Automatic Photo Orientation Detection with Convolutional Neural Networks, in Proc. of the Conference on Computer and Robot Vision (CRV 2017), May 2017, Edmonton, Alberta.

Teaching machine learning and neural networks

An array of face-like images

I am interested in the pedagogy of machine learning and neural networks. I advocate for getting rid of "starter code" where at all possible, emphasizing the interpretation of models, and emphasizing thinking about the data.

Article: Michael Guerzhoy, Teaching with Deep Learning Frameworks in Introductory Machine Learning Courses in AI Matters 4(3), 2018.

Model AI Assignment: Michael Guerzhoy, Neural Networks for Face Recognition with TensorFlow, presented at Model AI Assignments at EAAI 2018.

Model AI Assignment: Michael Guerzhoy and Renjie Liao, Understanding How Recurrent Neural Networks Model Text, presented at Model AI Assignments at EAAI 2018.

Model AI Assignment: Michael Guerzhoy and Lisa Zhang, Building a Fake News Detector, to be presented at Model AI Assignments at EAAI 2019.

Article: Michael Guerzhoy, Lisa Zhang, and Georgy Noarov, AI Education Matters: Building a Fake News Detector in AI Matters 5(3), 2019.

CS1, Intro to Data Science, and computational thinking

Code for generating fake data in R for inference

How are Introduction to Data Science courses related to CS1 courses? What kind of programming do students do after they take Introduction to Data Science? Conceptually, how do we compare different CS1 courses and Data Science courses? Do they all teach computational thinking? (And if not, is there even such a thing as computational thinking?)

Poster abstract: Michael Guerzhoy, Introduction to Data Science as a Pathway to Further Study in Computing (and the poster) at ICER 2019.

Model AI Assignment: Stephen Keeley and Michael Guerzhoy, Predicting and Preventing Deaths in the ICU: Designing and Analyzing an AI System, presented at Model AI Assignments at EAAI 2020.

Assignment/mini-curriculum: Claire S. Lee, Jeremy Du, and Michael Guerzhoy, Auditing the COMPAS Recidivism Risk Assessment Tool: Predictive Modeling and Fairness in Machine Learning in CS1, presented at the Tips, Techniques, and Courseware session at ITiCSE 2020. Paper.

Computer vision for speech analysis

rotary phone

If you compute the spectrogram of a sound signal, you can treat it like an image (kind of) and apply object detection algorithms to analyze it. Specifically, I was working on phone classification.
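
To make the "spectrogram as an image" idea concrete, here is a minimal sketch that turns a signal into a 2-D grid of magnitudes indexed by frequency bin and time frame. The frame length, hop size, Hann window, naive DFT, and the synthetic 1000 Hz test tone are all assumptions chosen for illustration; they are not the settings used in the project report.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram as a 2-D list: image[freq_bin][time_frame]."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]  # Hann window
    n_bins = frame_len // 2 + 1
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT of one windowed frame (an FFT would be used in practice).
        column = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ]
        columns.append(column)
    # Transpose so rows are frequency bins and columns are time frames.
    return [list(row) for row in zip(*columns)]

# Synthetic test signal: a 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
image = spectrogram(signal)

# With 64-sample frames, bins are 125 Hz apart, so the tone falls in bin 8.
peak_bins = [max(range(len(image)), key=lambda k: image[k][t])
             for t in range(len(image[0]))]
print(len(image), len(image[0]), peak_bins[0])
```

The resulting grid can be handed to image-style feature extractors or detectors, with the caveat that its two axes (frequency and time) are not interchangeable the way the axes of a photo are.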

Project report (MSc paper): Michael Guerzhoy, Boosting Local Spectro-Temporal Features for Speech Analysis, 2010. (Online abstract.)

Object detection

a cat with its face marked out

I've worked on several object detection projects. I've been particularly interested in image features. On the right is the output of a cat detector I made at Epson.

Book chapter: Joshua Seltzer, Michael Guerzhoy, and Monika Havelka (2019). Computer vision methodologies for automated processing of camera trap data: a technological review. In Yuhong He and Qihao Weng (eds), High Spatial Resolution Remote Sensing: Data, Techniques, and Applications, CRC Press, Taylor & Francis Group, Boca Raton, Florida (2019).

Background colour detection/rectangular object detection

several photos lying on a brown background

For the background colour detection part, we exploit two facts: the background colour appears in patches, and the edge statistics of the boundary between background and non-background can be predicted.

We also describe a rectangle detection algorithm based on perceptual organization, and use a large synthetically generated set to tune its parameters.

The intended application is streamlining the scanning of documents such as photos and business cards on a flatbed scanner.

Paper: Michael Guerzhoy and Hui Zhou. Segmentation of Rectangular Objects Lying on an Unknown Background in a Small Preview Scan Image. In Proc. of the Canadian Conference on Computer and Robot Vision (CRV 2008), May 2008, Windsor, Ontario.

Patent: Michael Guerzhoy and Hui Zhou. Method and apparatus for detecting objects in an image. U.S. Patent 8,098,936, issued Jan 17, 2012.

Patent: Michael Guerzhoy and Hui Zhou. Method and apparatus for detecting objects in an image. U.S. Patent 8,433,133, issued Apr 30, 2013.

Photo orientation detection

a photo of a city, rotated by 180 degrees. Caption on the bottom reads 'this side up'

We developed a system that determines the orientation of the input photo (from 0, 90, 180, and 270 degrees). You can try it if you have an Epson scanner.

Patent: Michael Guerzhoy and Hui Zhou. Method and system for automatically determining the orientation of a digital image. U.S. Patent 8,094,971, issued Apr 30, 2013.

a screenshot of scanner software

YouTube review: "Auto-photo orientation: I've tested this feature and it works good... Without this feature checked, you need to place the upper-left-hand corner of the photo face down in the lower-left corner of the scanner. With this feature turned on (or checked), you can place any corner of the photo, face down in the lower-left corner of the scanner, and the Epson Perfection does a good job of making sure the photo is right-side-up after scanning. This is a good feature when you can't remember which corner of the photo you need to place down."