George E. Dahl
- PhD Student, Machine Learning Group
- Department of Computer Science
- University of Toronto
- Ontario, Canada
- email: Can be easily derived from the URL for this page.
About me
I am a PhD student in the Machine Learning Group, supervised by Geoffrey Hinton. I am a recipient of the Microsoft Research PhD Fellowship (2012).
Research interests
- deep learning architectures
- speech recognition and language processing
- undirected graphical models
- most of statistical machine learning
Selected Publications
-
Training Restricted Boltzmann Machines on Word Observations
Accepted for publication in ICML 2012 [pdf] [arXiv preprint] [bibtex coming soon] [alias method pseudocode]
-
Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition
Accepted for publication in IEEE Transactions on Audio, Speech, and Language Processing [pdf] [bibtex]
-
Large Vocabulary Continuous Speech Recognition with Context-Dependent DBN-HMMs
Accepted for publication in ICASSP 2011 [pdf] [bibtex]
This paper is a conference-length version of the journal paper listed immediately above.
-
Deep Belief Networks Using Discriminative Features for Phone Recognition
Accepted for publication in ICASSP 2011 [pdf] [bibtex]
-
Acoustic Modeling using Deep Belief Networks
Accepted for publication in IEEE Trans. on Audio, Speech and Language Processing. [pdf] [bibtex]
-
Deep Belief Networks for Phone Recognition
In NIPS Workshop on Deep Learning for Speech Recognition and Related Applications, 2009. [pdf] [bibtex]
The journal version of this work (listed immediately above) should be viewed as the definitive version.
-
Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine
In Advances in Neural Information Processing Systems 23, 2010. [pdf] [bibtex]
-
Incorporating Side Information into Probabilistic Matrix Factorization Using Gaussian Processes
In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, 2010. [pdf] [bibtex] [code]
Code
I have implemented a version of the Hessian-free (truncated Newton) optimization approach, based on James Martens's exposition of it in his paper exploring HF for deep learning (please see James Martens's research page). My implementation was made possible by Ilya Sutskever's guidance, and some implementation choices were made so that my code is easier to compare with various optimizers he has written. Despite Ilya's generous assistance, any bugs or defects in the code I post here are my own. Please see Ilya's publication page for code he has released for HF and recurrent neural nets. It isn't too difficult to wrap his recurrent neural net model code in a way that lets my optimizer code optimize it.
Without further ado, here is the code. The file is large because it also contains a copy of the curves dataset. The code requires gnumpy to run, and I recommend using cudamat, written by Volodymyr Mnih, and running the code on a GPU rather than in gnumpy's slower simulation mode.
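For readers unfamiliar with the general idea, here is a minimal sketch of a truncated Newton loop. It is not the code posted here and does not use gnumpy or cudamat. It is plain Python on a toy two-parameter objective, and every name in it (`hessian_vector_product`, `conjugate_gradient`, `truncated_newton`, `toy_grad`) is hypothetical. As simplifying assumptions, curvature-vector products are approximated by finite differences of the gradient and the damping constant is fixed. A real HF implementation would make different choices on both points.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def axpy(alpha, x, y):
    """Return alpha * x + y for lists of floats."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def hessian_vector_product(grad, w, v, eps=1e-4):
    """Approximate H v by a central difference of the gradient along v.

    Assumption for this sketch only: finite differences stand in for an
    exact curvature-vector product.
    """
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        return [0.0] * len(v)
    scale = eps / norm  # perturb w by a step of length eps along v
    g_plus = grad(axpy(scale, v, w))
    g_minus = grad(axpy(-scale, v, w))
    return [(p - m) / (2.0 * scale) for p, m in zip(g_plus, g_minus)]


def conjugate_gradient(apply_A, b, max_iters, rel_tol=1e-16):
    """Approximately solve A x = b using only products A v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with x starting at zero
    p = list(b)  # first search direction
    rs_old = dot(r, r)
    threshold = rel_tol * rs_old
    for _ in range(max_iters):
        if rs_old <= threshold:
            break
        Ap = apply_A(p)
        curvature = dot(p, Ap)
        if curvature <= 0.0:  # stop on zero or negative curvature
            break
        alpha = rs_old / curvature
        x = axpy(alpha, p, x)
        r = axpy(-alpha, Ap, r)
        rs_new = dot(r, r)
        p = axpy(rs_new / rs_old, p, r)
        rs_old = rs_new
    return x


def truncated_newton(grad, w, outer_iters=10, cg_iters=20, damping=1e-3):
    """Repeatedly solve (H + damping * I) step = -g with a few CG iterations."""
    for _ in range(outer_iters):
        g = grad(w)

        def apply_damped_hessian(v, w=w):
            return axpy(damping, v, hessian_vector_product(grad, w, v))

        step = conjugate_gradient(apply_damped_hessian, [-gi for gi in g], cg_iters)
        w = axpy(1.0, step, w)
    return w


def toy_grad(w):
    """Gradient of (a - 1)^2 + 2 (b + 2)^2 + 0.5 (a - 1)(b + 2); zero at (1, -2)."""
    a, b = w[0] - 1.0, w[1] + 2.0
    return [2.0 * a + 0.5 * b, 4.0 * b + 0.5 * a]


w_final = truncated_newton(toy_grad, [0.0, 0.0])
```

The property that makes the approach practical for large models is that the Hessian is never formed explicitly: the inner conjugate gradient solver needs only products of the curvature matrix with a vector, and "truncated" refers to stopping that inner solve after a limited number of iterations.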