Jason Weston
Deep Learning via Semi-Supervised Embedding.
We show how nonlinear embedding algorithms popular for use with shallow
semi-supervised learning techniques such as kernel methods can be applied to
deep multi-layer architectures, either as a regularizer at the output layer, or
on each layer of the architecture. This provides a simple alternative to
existing approaches to deep learning whilst yielding competitive error rates
compared with both those methods and existing shallow semi-supervised techniques.
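
As a rough illustration of the idea above, the sketch below adds a pairwise embedding term to a supervised loss. The abstract does not state the exact objective, so everything here is an assumption: the margin-based pairwise loss, the hinge loss, the weight `lam`, and all function names are hypothetical choices made for illustration. The vectors `f_i` and `f_j` stand for the network's outputs, or the activations of a hidden layer when the regularizer is applied to each layer.

```python
import math


def sq_dist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def embedding_loss(f_i, f_j, is_neighbor, margin=1.0):
    """Pairwise loss (assumed form): pull neighbors together and push
    non-neighbors at least `margin` apart."""
    d2 = sq_dist(f_i, f_j)
    if is_neighbor:
        return d2
    return max(0.0, margin - math.sqrt(d2)) ** 2


def hinge_loss(score, label):
    """Supervised hinge loss for a label in {-1, +1}."""
    return max(0.0, 1.0 - label * score)


def total_loss(labeled, unlabeled_pairs, lam=0.1):
    """Supervised loss plus `lam` times the embedding regularizer.

    labeled: list of (score, label) tuples.
    unlabeled_pairs: list of (f_i, f_j, is_neighbor) tuples.
    """
    supervised = sum(hinge_loss(s, y) for s, y in labeled)
    regularizer = sum(embedding_loss(fi, fj, n) for fi, fj, n in unlabeled_pairs)
    return supervised + lam * regularizer
```

In this sketch the unlabeled pairs carry no class labels; only the neighbor relation is needed, which is what lets unlabeled data shape the learned representation.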
We then generalize this approach to take advantage of sequential data, for both
images and text.
For images, we take advantage of the temporal coherence that naturally exists in
unlabeled video recordings. That is, two successive frames are likely to contain
the same object or objects. We demonstrate the effectiveness of this method in a
semi-supervised setting on some pose-invariant object and face recognition
tasks.
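
The temporal coherence assumption can be turned into training pairs along the following lines. This is a minimal sketch, not the procedure used in the work: successive frames are marked as neighbors, as the text describes, while the choice to sample non-neighbors from frames at least `min_gap` steps away is an assumption, as are the function name and parameters.

```python
import random


def temporal_pairs(num_frames, negatives_per_frame=1, min_gap=2, seed=0):
    """Return (i, j, is_neighbor) index triples for one video.

    Successive frames (t, t + 1) are treated as neighbors. Non-neighbors are
    sampled from frames at least `min_gap` steps away (an assumed heuristic).
    """
    rng = random.Random(seed)
    pairs = []
    for t in range(num_frames - 1):
        pairs.append((t, t + 1, True))
        candidates = [u for u in range(num_frames) if abs(u - t) >= min_gap]
        for _ in range(negatives_per_frame):
            if candidates:
                pairs.append((t, rng.choice(candidates), False))
    return pairs
```

Distant frames of the same recording may still show the same object, so a variant of this sketch could instead draw non-neighbors from a different video.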
For text, we describe a unified approach to tagging: a single convolutional
neural network architecture that, given a sentence, outputs a host of language
processing predictions: part-of-speech tags, chunks, named entity tags, and
semantic roles. State-of-the-art performance is attained by learning word
embeddings using a text-specific semi-supervised task called a language model.
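
The abstract does not say how the language model task is posed. One plausible formulation, assumed here purely for illustration, is a ranking task over unlabeled text: a genuine window of words should score higher than the same window with its center word replaced. The function names, the margin, and the scoring setup below are all hypothetical; in practice the scores would come from the network that holds the word embeddings.

```python
import random


def corrupt_window(window, vocab, rng):
    """Replace the center word of a window with a different vocabulary word."""
    center = len(window) // 2
    choices = [w for w in vocab if w != window[center]]
    corrupted = list(window)
    corrupted[center] = rng.choice(choices)
    return corrupted


def ranking_loss(score_real, score_corrupt, margin=1.0):
    """Hinge ranking loss: the genuine window should outscore the corrupted
    one by at least `margin`."""
    return max(0.0, margin - score_real + score_corrupt)
```

Because the corrupted windows are generated automatically, a task of this kind needs no labels, which is what makes it usable on large unlabeled corpora.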
Joint work with: Ronan Collobert, Frederic Ratle, Hossein Mobahi, Pavel Kuksa
and Koray Kavukcuoglu.
Brief Bio.
Jason Weston received a PhD degree in Machine Learning at Royal Holloway, University of London (advisor: Vladimir Vapnik) in 2000, where he also worked as a research assistant. From 2000 to 2002, he was a researcher at Biowulf Technologies, New York, applying machine learning to bioinformatics. From 2002 to 2003, he was a research scientist at the Max Planck Institute for Biological Cybernetics, Tuebingen, Germany. Since 2004, he has been a research staff member at NEC Labs America, Princeton. Jason Weston's current research focuses on various aspects of statistical machine learning and its applications, particularly in bioinformatics and text.