Prisms by Jon Snoek
Refereed Publications
AISTATS 2012
On Nonparametric Guidance for Learning Autoencoder Representations
Jasper Snoek, Ryan Prescott Adams, Hugo Larochelle.
To appear at Artificial Intelligence and Statistics, 2012
Unsupervised discovery of latent representations, in addition to being useful for density modeling, visualisation and exploratory data analysis, is also increasingly important for learning features relevant to discriminative tasks. Autoencoders, in particular, have proven to be an effective way to learn latent codes that reflect meaningful variations in data. A continuing challenge, however, is guiding an autoencoder toward representations that are useful for particular tasks. A complementary challenge is to find codes that are invariant to irrelevant transformations of the data. The most common way of introducing such problem-specific guidance in autoencoders has been through the incorporation of a parametric component that ties the latent representation to the label information. In this work, we argue that a preferable approach relies instead on a nonparametric guidance mechanism. Conceptually, it ensures that there exists a function that can predict the label information, without explicitly instantiating that function. The superiority of this guidance mechanism is confirmed on two datasets. In particular, this approach is able to incorporate invariance information (lighting, elevation, etc.) from the small NORB object recognition dataset and yields state-of-the-art performance for a single layer, non-convolutional network.
        @inproceedings{snoek-etal-2011b,
          title = {On Nonparametric Guidance for Learning Autoencoder Representations.},
          author = {Jasper Snoek and Ryan Prescott Adams and Hugo Larochelle},
          booktitle = {AISTATS},
          year = {2012} }
        
ICDM 2011
From Videos To Places: Geolocating the World's Videos
Jasper Snoek, Luciano Sbaiz, Hrishikesh Aradhye.
ICDM Workshop on Large Scale Visual Analytics, Dec 2011
This paper explores the problem of large-scale automatic video geolocation. A methodology is developed to infer the location at which videos from Youtube.com were recorded using video content and various additional signals. Specifically, multiple binary Adaboost classifiers are trained to identify particular places based on learning decision stumps on sets of hundreds of thousands of sparse features. A one-vs-all classification strategy is then used to classify the location at which videos were recorded. Empirical validation is performed on an immense data set of 20 million labeled videos. Results demonstrate that high accuracy video geolocation is indeed possible for many videos and locations and interesting relationships exist between videos and the places where they are recorded.
        @inproceedings{snoek-etal-2011a,
          title = {From Videos To Places: Geolocating the World's Videos.},
          author = {Jasper Snoek and Luciano Sbaiz and Hrishikesh Aradhye},
          booktitle = {ICDM Workshop on Large Scale Visual Analytics},
          year = {2011}
        }
        
HISB 2011
Towards Aging-In-Place: Automatic Assessment of Product Usability for Older Adults with Dementia.
Babak Taati, Jasper Snoek, Alex Mihailidis.
Healthcare Informatics, Imaging and Systems Biology. July 2011.
Considerations of how to facilitate aging-in-place are becoming increasingly pertinent as caregivers are overwhelmed by an aging population. A primary challenge to independent living is the inability to use products associated with tasks of daily living. As improving the usability of these products for the elderly will extend their independence, this work attempts to automate and expedite the assessment of usability using artificial intelligence. Video analysis is performed to temporally segment video of human-product interaction and automatically identify segments in which the human has difficulty operating the product. The approach is applied to the study of water faucet designs for older adults with dementia. Empirical analysis is performed on videos of dementia patients operating various faucet types, demonstrating the accuracy of the temporal segmentation (88.1%) and the ability to estimate the relative advantage of either design in terms of operational ease.
        @inproceedings{taati-etal-2011a,
          title = {Towards Aging-In-Place: Automatic Assessment of Product Usability for Older Adults with Dementia.},
          author = {Babak Taati and Jasper Snoek and Alex Mihailidis},
          booktitle = {IEEE Conference on Healthcare Informatics, Imaging and Systems Biology},
          year = {2011}
        }
        
Snowbird
Semiparametric Latent Variable Models for Guided Representation
Jasper Snoek, Ryan Adams, Hugo Larochelle.
The Learning Workshop (Snowbird). April 2011

        
CVPR4HB
Automatic Segmentation of Video to Aid the Study of Faucet Usability for Older Adults.
Jasper Snoek, Babak Taati, Yulia Eskin and Alex Mihailidis.
CVPR Workshop for Human Communicative Behavior Analysis. June 2010
Assessing the usability of objects for a specific population can be laborious and time consuming. Furthermore, for the older adult population, the usability of objects involved in the completion of tasks of daily living is critical to `aging-in-place' and the preservation of independence. This paper explores the automation of the process of observing older adults with Alzheimer's as they use various types of faucets. Features extracted from video and audio signals encode the subjects' progression through a hand-washing task and temporal segmentation is used to determine the state of the process at each video frame. Histograms of optical flow, a hand-tracking particle filter, and a water detection algorithm are used to extract features encoding the state of the handwashing process. A Hidden Markov Support Vector Machine is used to label each video frame of the handwashing process as belonging to one of five states with an overall accuracy of 93.58%.
        @inproceedings{snoek-etal-2010a,
          title = {Automatic Segmentation of Video to Aid the Study of Faucet Usability for Older Adults.},
          author = {Jasper Snoek and Babak Taati and Yulia Eskin and Alex Mihailidis},
          booktitle = {CVPR Workshop for Human Communicative Behavior Analysis},
          year = {2010}
        }

        
ISG 2010
Automated Detection of Falls in the Home – Current Challenges and Future Directions. Vancouver, May 2010.
Jasper Snoek, Babak Taati and Alex Mihailidis.
International Society of Gerontechnology World Conference. June 2010
Improved medical care and the aging of the ‘baby boomer’ generation have resulted in the increasing size and proportion of the over sixty-five adult population. As the older generation grows, falls in the home are becoming increasingly epidemic. A recent study has identified falls as the most expensive category of injury for the Canadian healthcare system, costing in total over $6.2 billion in 2004 alone [1]. Adults over the age of 65 accounted for the majority of the costs, 84% of fall-related deaths and 59% of hospitalizations, and the prevailing type of injury-causing fall was falls on the same level, followed by falls on stairs. Under the hypothesis that immediate access to emergency healthcare after a fall will reduce the severity of injuries, promote greater independence and allow older adults to age in place for longer, there have been a number of efforts to create automatic fall-detection systems. In past work, we have developed and carefully optimized computer vision based systems to detect falls in the home [2, 3]. However, a number of challenges remain before the systems can be deployed in real situations. In particular, gathering enough data to train the artificial intelligence algorithms has proven to be prohibitively expensive. The variance in different people’s gait and motion confuses the predictions of simple machine learning algorithms as they struggle to distinguish idiosyncratic behavior from dangerous events. More complex algorithms that model uncertainty in the data perform much better at the task of detecting dangerous events, but are computationally too expensive to use in a practical real-time system. In this work we show that recent advances in Gaussian Processes (GP) allow us to model the uncertainty and variance in the training data. This variance indicates which training data would be most beneficial to obtain (i.e. that which most decreases the uncertainty). 
We demonstrate preliminary results showing that the model can be used to automatically generate new training data to improve the performance of simpler algorithms. Method: Twenty training videos of different subjects simulating fall events, difficult borderline events (e.g. crouching and sitting) and normal gait were collected. A leave-one-out round-robin classification experiment is performed to compare the detection accuracy of two standard approaches (Logistic Regression and Support Vector Machines) with GP. Results: The GP outperforms the other two approaches by over 10% (from 75% to 86%) and produces a more favorable precision/recall tradeoff. We demonstrate how using a GP to generate new data allows the other algorithms to approach the accuracy of the GP. In addition, we show that by modeling the uncertainty in the training data one can identify the missing data that would be most valuable to improve the performance of the system.
CRV 2010
Water Flow Detection in a Handwashing Task.
Babak Taati, Jasper Snoek, David Giesbrecht and Alex Mihailidis.
Canadian Conference on Computer and Robot Vision, June 2010
Older adults suffering from Alzheimer's disease often require assistance with performing simple activities of daily living, such as washing their hands in the bathroom. This severely limits their independence and places a heavy care giving burden on their family and the healthcare system. The motivation for developing a water detection algorithm is for it to be used within a system that provides reminding prompts for Alzheimer's sufferers and to study product usability for older adults with cognitive impairments. Water detection in a video sequence poses a challenging computer vision problem since it is difficult to model the flow of water in a structured manner. A real-time detection system is presented here that estimates the presence of flowing water in a bathroom sink during a hand washing task based on classifying video and audio features with an overall accuracy of 88.76%. Visual features are extracted using temporal image derivatives and hand tracking is used to enhance the robustness in the visual features.
          @inproceedings{taati-et-al-2010a,
            author = {Babak Taati and Jasper Snoek and David Giesbrecht and Alex Mihailidis},
            title = {Water Flow Detection in a Handwashing Task},
            booktitle = {Canadian Conference on Computer and Robot Vision},
            year = {2010},
            pages = {175--182},
            publisher = {IEEE Computer Society}
          }
        
IMAVIS
An Automated Tool for Detecting Anomalous Events on Stairs.
Jasper Snoek, Jesse Hoey, Liam Stewart, Richard S. Zemel and Alex Mihailidis.
Journal of Image and Vision Computing. Vol 27, pp. 153-166 (Jan, 2009)
This paper presents a method for automatically detecting unusual human events on stairs from video data. The motivation is to provide a tool for biomedical researchers to rapidly find the events of interest within large quantities of video data. Our system identifies potential sequences containing anomalies, and reduces the amount of data that needs to be searched by a human. We compute two sets of features from a video of a person descending a stairwell. The first set of features are the foot positions and velocities. We track both feet using a mixed state particle filter with an appearance model based on histograms of oriented gradients. We compute expected (most likely) foot positions given the state of the filter at each frame. The second set of features are the parameters of the mean optical flow over a foreground region. Our final classification system inputs these two sets of features into a hidden Markov model (HMM) to analyse the spatio-temporal progression of the stair descent. A single HMM is trained on sequences of normal stair use, and a threshold on sequence likelihoods is used to detect unusual events in new data. We demonstrate our system on a data set with five people descending a set of stairs in a laboratory environment. We show how our system can successfully detect nearly all anomalous events, with a low false positive rate. We discuss limitations and suggest improvements to the system.
Technology and Aging
An Automated Tool for Detecting and Preventing Unsafe Stair Use.
Jasper Snoek, Jesse Hoey and Alex Mihailidis
Technology and Aging. Vol 21. Assistive Technology Research Series. Jan 2008. IOS Press
ICTA07a
An Automated Tool for Detecting and Preventing Unsafe Stair Use
Jasper Snoek, Jesse Hoey and Alex Mihailidis.
International Conference on Technology and Aging. June 2007.
ICTA07b
Identifying the Risk of Falling: An Artificially Intelligent Tool to Support a Deployable Falls Prevention Program for Older Adults.
Jen Boger, Cheryl Beach, Jasper Snoek, Andrew Mitz, and Alex Mihailidis.
International Conference on Technology and Aging. June 2007.
CRV06
Automated Detection and Recognition of Unusual Events on Stairs.
Jasper Snoek, Jesse Hoey, Liam Stewart and Richard S. Zemel.
Computer and Robot Vision, Quebec City, Canada. June 2006.
ICADI06
An Automated Tool for Detecting and Preventing Unsafe Stair Use by Older Adults.
Jasper Snoek, Alex Mihailidis and Jesse Hoey.
International Conference on Aging, Disability and Independence. St. Petersburg, FL. Feb, 2006.