Refereed Publications
AISTATS 2012
On Nonparametric Guidance for Learning Autoencoder Representations
Jasper Snoek, Ryan Prescott Adams, Hugo Larochelle.
To appear at Artificial Intelligence and Statistics, 2012
Unsupervised discovery of latent representations, in addition to
being useful for density modeling, visualisation and exploratory
data analysis, is also increasingly important for learning features
relevant to discriminative tasks. Autoencoders, in particular, have
proven to be an effective way to learn latent codes that reflect
meaningful variations in data. A continuing challenge, however, is
guiding an autoencoder toward representations that are useful for
particular tasks. A complementary challenge is to find codes that
are invariant to irrelevant transformations of the data. The most
common way of introducing such problem-specific guidance in
autoencoders has been through the incorporation of a parametric
component that ties the latent representation to the label
information. In this work, we argue that a preferable approach
relies instead on a nonparametric guidance mechanism.
Conceptually, it ensures that there exists a function that can
predict the label information, without explicitly instantiating that
function. The superiority of this guidance mechanism is confirmed on
two datasets. In particular, this approach is able to incorporate
invariance information (lighting, elevation, etc.) from the small
NORB object recognition dataset and yields state-of-the-art
performance for a single layer, non-convolutional network.
@inproceedings{snoek-etal-2011b,
title = {On Nonparametric Guidance for Learning Autoencoder Representations.},
author = {Jasper Snoek and Ryan Prescott Adams and Hugo Larochelle},
booktitle = {AISTATS},
year = {2012}
}
ICDM 2011
From Videos To Places: Geolocating the World's Videos
Jasper Snoek, Luciano Sbaiz, Hrishikesh Aradhye.
ICDM Workshop on Large Scale Visual Analytics, Dec 2011
This paper explores the problem of large-scale automatic video
geolocation. A methodology is developed to infer the location at
which videos from Youtube.com were recorded using video content and
various additional signals. Specifically, multiple binary Adaboost
classifiers are trained to identify particular places based on
learning decision stumps on sets of hundreds of thousands of sparse
features. A one-vs-all classification strategy is then used to
classify the location at which videos were recorded. Empirical
validation is performed on an immense data set of 20 million labeled
videos. Results demonstrate that high-accuracy video geolocation is
indeed possible for many videos and locations and that interesting
relationships exist between videos and the places where they
are recorded.
@inproceedings{snoek-etal-2011a,
title = {From Videos To Places: Geolocating the World's Videos.},
author = {Jasper Snoek and Luciano Sbaiz and Hrishikesh Aradhye},
booktitle = {ICDM Workshop on Large Scale Visual Analytics},
year = {2011}
}
HISB 2011
Towards Aging-In-Place: Automatic Assessment of Product Usability for Older Adults with Dementia.
Babak Taati, Jasper Snoek, Alex Mihailidis.
Healthcare Informatics, Imaging and Systems Biology. July 2011.
Considerations of how to facilitate aging-in-place
are becoming increasingly pertinent as caregivers are
overwhelmed by an aging population. A primary challenge to
independent living is the inability to use products associated
with tasks of daily living. As improving the usability of these
products for the elderly will extend their independence, this
work attempts to automate and expedite the assessment of
usability using artificial intelligence. Video analysis is performed
to temporally segment video of human-product interaction
and automatically identify segments in which the human has
difficulty operating the product. The approach is applied to the
study of water faucet designs for older adults with dementia.
Empirical analysis is performed on videos of dementia patients
operating various faucet types, demonstrating the accuracy of
the temporal segmentation (88.1%) and the ability to estimate
the relative advantage of either design in terms of operational
ease.
@inproceedings{taati-etal-2011a,
title = {Towards Aging-In-Place: Automatic Assessment of Product Usability for Older Adults with Dementia.},
author = {Babak Taati and Jasper Snoek and Alex Mihailidis},
booktitle = {IEEE Conference on Healthcare Informatics, Imaging and Systems Biology},
year = {2011}
}
Snowbird
Semiparametric Latent Variable Models for Guided Representation
Jasper Snoek, Ryan Adams, Hugo Larochelle.
The Learning Workshop (Snowbird). April 2011
CVPR4HB
Automatic Segmentation of Video to Aid the Study of Faucet Usability for Older Adults.
Jasper Snoek, Babak Taati, Yulia Eskin and Alex Mihailidis.
CVPR Workshop for Human Communicative Behavior Analysis. June 2010
Assessing the usability of objects for a specific population
can be laborious and time-consuming. Furthermore, for the
older adult population, the usability of objects involved in
the completion of tasks of daily living is critical to
`aging-in-place' and the preservation of independence. This
paper explores the automation of the process of observing
older adults with Alzheimer's as they use various types of
faucets. Features extracted from video and audio signals
encode the subjects' progression through a hand-washing task
and temporal segmentation is used to determine the state of
the process at each video frame. Histograms of optical flow,
a hand-tracking particle filter, and a water detection
algorithm are used to extract features encoding the state of
the handwashing process. A Hidden Markov Support Vector
Machine is used to label each video frame of the handwashing
process as belonging to one of five states with an overall
accuracy of 93.58%.
@inproceedings{snoek-etal-2010a,
title = {Automatic Segmentation of Video to Aid the Study of Faucet Usability for Older Adults.},
author = {Jasper Snoek and Babak Taati and Yulia Eskin and Alex Mihailidis},
booktitle = {CVPR Workshop for Human Communicative Behavior Analysis},
year = {2010}
}
ISG 2010
Automated Detection of Falls in the Home – Current Challenges and Future Directions. Vancouver, May 2010.
Jasper Snoek, Babak Taati and Alex Mihailidis.
International Society of Gerontechnology World Conference. June 2010
Improved medical care and the aging of the ‘baby boomer’
generation have resulted in the increasing size and proportion
of the over sixty-five adult population. As the older
generation grows, falls in the home are becoming increasingly
epidemic. A recent study has identified falls as the most
expensive category of injury for the Canadian healthcare
system, costing in total over $6.2 billion in 2004 alone [1].
Adults over the age of 65 accounted for the majority of the
costs, 84% of fall-related deaths and 59% of hospitalizations,
and the prevailing type of injury-causing fall was falls on
the same level, followed by falls on stairs. Under the
hypothesis that immediate access to emergency healthcare after
a fall will reduce the severity of injuries, promote greater
independence and allow older adults to age in place for
longer, there have been a number of efforts to create
automatic fall-detection systems. In past work, we have
developed and carefully optimized computer-vision-based
systems to detect falls in the home [2, 3]. However, a number
of challenges remain before the systems can be deployed in
real situations. In particular, gathering enough data to
train the artificial intelligence algorithms has proven to be
prohibitively expensive. The variance in different people’s
gait and motion confuses the predictions of simple machine
learning algorithms as they struggle to distinguish
idiosyncratic behavior from dangerous events. More complex
algorithms that model uncertainty in the data perform much
better at the task of detecting dangerous events, but are
computationally too expensive to use in a practical real-time
system. In this work we show that recent advances in Gaussian
Processes (GP) allow us to model the uncertainty and variance
in the training data. This variance indicates which training
data would be most beneficial to obtain (i.e. that which most
decreases the uncertainty). We demonstrate preliminary
results showing that the model can be used to automatically
generate new training data to improve the performance of
simpler algorithms. Method: Twenty training videos of
different subjects simulating fall events, difficult
borderline events (e.g. crouching and sitting) and normal gait
were collected. A leave-one-out round-robin classification
experiment is performed to compare the detection accuracy of
two standard approaches (Logistic Regression and Support
Vector Machines) with GP. Results: The GP outperforms the
other two approaches by over 10% (from 75% to 86%) and
produces a more favorable precision/recall tradeoff. We
demonstrate how using a GP to generate new data allows the
other algorithms to approach the accuracy of the GP. In
addition, we show that by modeling the uncertainty in the
training data one can identify the missing data that would be
most valuable to improve the performance of the system.
CRV 2010
Water Flow Detection in a Handwashing Task.
Babak Taati, Jasper Snoek, David Giesbrecht and Alex Mihailidis.
Canadian Conference on Computer and Robot Vision, June 2010
Older adults suffering from Alzheimer's disease often require
assistance with performing simple activities of daily living,
such as washing their hands in the bathroom. This severely
limits their independence and places a heavy caregiving
burden on their family and the healthcare system. The
motivation for developing a water detection algorithm is for
it to be used within a system that provides reminding prompts
for Alzheimer's sufferers and to study product usability for
older adults with cognitive impairments. Water detection in a
video sequence poses a challenging computer vision problem
since it is difficult to model the flow of water in a
structured manner. A real-time detection system is presented
here that estimates the presence of flowing water in a
bathroom sink during a hand washing task based on classifying
video and audio features with an overall accuracy of
88.76%. Visual features are extracted using temporal image
derivatives and hand tracking is used to enhance the
robustness of the visual features.
@article{taati-et-al-2010a,
author = {Babak Taati and Jasper Snoek and David Giesbrecht and Alex Mihailidis},
title = {Water Flow Detection in a Handwashing Task},
journal = {Computer and Robot Vision, Canadian Conference},
year = {2010},
pages = {175-182},
publisher = {IEEE Computer Society},
}
IMAVIS
An Automated Tool for Detecting Anomalous Events on Stairs.
Jasper Snoek, Jesse Hoey, Liam Stewart, Richard S. Zemel and Alex Mihailidis.
Journal of Image and Vision Computing. Vol 27, pp. 153-166 (Jan, 2009)
This paper presents a method for automatically detecting
unusual human events on stairs from video data. The
motivation is to provide a tool for biomedical researchers
to rapidly find the events of interest within large
quantities of video data. Our system identifies potential
sequences containing anomalies, and reduces the amount of
data that needs to be searched by a human. We compute two
sets of features from a video of a person descending a
stairwell. The first set of features are the foot positions
and velocities. We track both feet using a mixed state
particle filter with an appearance model based on histograms
of oriented gradients. We compute expected (most likely)
foot positions given the state of the filter at each
frame. The second set of features are the parameters of the
mean optical flow over a foreground region. Our final
classification system inputs these two sets of features into
a hidden Markov model (HMM) to analyse the spatio-temporal
progression of the stair descent. A single HMM is trained on
sequences of normal stair use, and a threshold on sequence
likelihoods is used to detect unusual events in new data. We
demonstrate our system on a data set with five people
descending a set of stairs in a laboratory environment. We
show how our system can successfully detect nearly all
anomalous events, with a low false positive rate. We discuss
limitations and suggest improvements to the system.
@article{snoek-etal-2009a,
address= {Newton, MA, USA},
author = {Snoek, Jasper and Hoey, Jesse
and Stewart, Liam and Zemel, Richard S. and Mihailidis, Alex},
doi = {10.1016/j.imavis.2008.04.021},
issn = {0262-8856},
journal = {Image and Vision Computing},
number = {1-2},
pages = {153--166},
publisher = {Butterworth-Heinemann},
title = {Automated detection of unusual events on stairs},
url = {http://dx.doi.org/10.1016/j.imavis.2008.04.021},
volume = {27},
year = {2009}
}
Technology and Aging
An Automated Tool for Detecting and Preventing Unsafe Stair Use.
Jasper Snoek, Jesse Hoey and Alex Mihailidis.
Technology and Aging. Vol 21. Assistive Technology Research Series. Jan 2008. IOS Press
ICTA07a
An Automated Tool for Detecting and Preventing Unsafe Stair Use
Jasper Snoek, Jesse Hoey and Alex Mihailidis.
International Conference on Technology and Aging. June 2007.
ICTA07b
Identifying the Risk of Falling: An Artificially
Intelligent Tool to Support a Deployable Falls
Prevention Program for Older Adults.
Jen Boger, Cheryl Beach, Jasper Snoek, Andrew Mitz, and
Alex Mihailidis.
International Conference on Technology and Aging. June 2007.
CRV06
Automated Detection and Recognition of Unusual Events on Stairs.
Jasper Snoek, Jesse Hoey, Liam Stewart and Richard S. Zemel.
Computer and Robot Vision, Quebec City, Canada. June 2006.
ICADI06
An Automated Tool for Detecting and
Preventing Unsafe Stair Use by Older Adults.
Jasper Snoek, Alex Mihailidis and Jesse Hoey.
International Conference on Aging, Disability and Independence. St. Petersburg, FL. Feb, 2006.