Google Scholar Profile


  • Gated-Attention Readers for Text Comprehension
    Bhuwan Dhingra, Hanxiao Liu, William W. Cohen, Ruslan Salakhutdinov
    arXiv [arXiv].

  • Deep Neural Networks with Massive Learned Knowledge
    Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, and Eric Xing
    Conference on Empirical Methods in Natural Language Processing (EMNLP'16).

  • Iterative Refinement of Approximate Posterior for Training Directed Belief Networks
    Devon Hjelm, Kyunghyun Cho, Junyoung Chung, Russ Salakhutdinov, Vince Calhoun, Nebojsa Jojic
    NIPS 2016, arXiv [arXiv].

  • Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations
    Behnam Neyshabur, Yuhuai Wu, Ruslan Salakhutdinov, Nathan Srebro
    NIPS 2016, arXiv [arXiv].

  • Stochastic Variational Deep Kernel Learning
    Andrew Gordon Wilson, Zhiting Hu, Eric Xing, Ruslan Salakhutdinov
    NIPS 2016.

  • On Multiplicative Integration with Recurrent Neural Networks
    Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, Ruslan Salakhutdinov
    NIPS 2016, arXiv [arXiv].

  • Encode, Review, and Decode: Reviewer Module for Caption Generation
    Zhilin Yang, Ye Yuan, Yuexin Wu, Ruslan Salakhutdinov, William W. Cohen
    NIPS 2016, arXiv [arXiv].

  • Architectural Complexity Measures of Recurrent Neural Networks
    Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio
    NIPS 2016, arXiv [arXiv].

  • Multi-Task Cross-Lingual Sequence Tagging from Scratch
    Zhilin Yang, Ruslan Salakhutdinov, William Cohen
    arXiv [arXiv].

  • Revisiting Semi-Supervised Learning with Graph Embeddings
    Zhilin Yang, William Cohen, Ruslan Salakhutdinov
    ICML 2016, arXiv [arXiv].

  • Importance Weighted Autoencoders
    Yuri Burda, Roger Grosse, Ruslan Salakhutdinov
    ICLR, 2016, [arXiv]. Code is available [here].

  • Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
    Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov
    ICLR, 2016, [arXiv].

  • Generating Images from Captions with Attention
    Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov
    ICLR, 2016, [arXiv], oral. [Generated Samples].

  • Data-Dependent Path Normalization in Neural Networks
    Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro
    ICLR, 2016, [arXiv].

  • Action Recognition using Visual Attention
    Shikhar Sharma, Ryan Kiros, Ruslan Salakhutdinov
    ICLR workshop, 2016 [arXiv]. [Code]. [Project Website].

  • Deep Kernel Learning
    Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric Xing
    AI and Statistics, 2016, [arXiv].



  • Human-level concept learning through probabilistic program induction
    Brenden Lake, Ruslan Salakhutdinov, and Joshua Tenenbaum (2015),
    Science, 350(6266), 1332-1338, [paper],[Supporting Info.], [visual Turing tests], [Omniglot data set], [Code].


  • Learning Wake-Sleep Recurrent Attention Models
    Lei Jimmy Ba, Roger Grosse, Ruslan Salakhutdinov, Brendan Frey
    NIPS 2015. [arXiv].

  • Skip-Thought Vectors
    Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler
    NIPS 2015, [arXiv]. Code is available [here].

  • Path-SGD: Path-Normalized Optimization in Deep Neural Networks
    Behnam Neyshabur, Ruslan Salakhutdinov, Nathan Srebro
    NIPS 2015, [arXiv].

  • Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
    Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler
    ICCV 2015, [arXiv], [ project page ], oral.

  • Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions
    Jimmy Ba, Kevin Swersky, Sanja Fidler, Ruslan Salakhutdinov
    ICCV 2015, [arXiv].

  • Learning Deep Generative Models
    Ruslan Salakhutdinov
    Annual Review of Statistics and Its Application, Vol. 2, pp. 361–385, 2015
    [pdf]

  • Scaling Up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix
    Roger Grosse, Ruslan Salakhutdinov
    ICML 2015, [pdf].

  • Unsupervised Learning of Video Representations using LSTMs
    Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov
    ICML 2015, [arXiv], [pdf].

  • Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
    Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio
    ICML 2015, [arXiv], [pdf], [project page].


  • Siamese neural networks for one-shot image recognition.
    Gregory Koch, Richard Zemel, Ruslan Salakhutdinov
    ICML 2015 Deep Learning Workshop (2015). [pdf].

  • Exploiting Image-trained CNN Architectures for Unconstrained Video Classification
    Shengxin Zha, Florian Luisier, Walter Andrews, Nitish Srivastava, Ruslan Salakhutdinov
    In BMVC 2015 [arXiv], 2015

  • segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection
    Y. Zhu, R. Urtasun, R. Salakhutdinov and S. Fidler
    In Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, June 2015,
    [ arXiv ], [pdf], [ Project Page].

  • Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
    Ryan Kiros, Ruslan Salakhutdinov, Richard Zemel.
    To appear in Transactions of the Association for Computational Linguistics (TACL), 2015.
    [ arXiv], [ results], [ demo ].
    An encoder-decoder architecture for ranking and generating image descriptions.
    Previous version appeared in NIPS Deep Learning Workshop, 2014.

  • Accurate and Conservative Estimates of MRF Log-likelihood using Reverse Annealing
    Yuri Burda, Roger B. Grosse, and Ruslan Salakhutdinov
    In AI and Statistics (AISTATS), 2015 [arXiv], [pdf].


  • Learning Generative Models with Visual Attention
    Yichuan Tang, Nitish Srivastava, and Ruslan Salakhutdinov
    Neural Information Processing Systems (NIPS 28), 2014, oral.
    [ pdf ], Supplementary material [ pdf].

  • A Multiplicative Model for Learning Distributed Text-Based Attribute Representations
    Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov.
    Neural Information Processing Systems (NIPS 28), 2014
    [ pdf ], Supplementary material [ zip].
    Previous version appeared in ICML Workshop on Knowledge-Powered Deep Learning for Text Mining, 2014. [ arXiv].

  • Multimodal Learning with Deep Boltzmann Machines
    Nitish Srivastava and Ruslan Salakhutdinov
    Journal of Machine Learning Research, 2014. [ pdf ]. Code is available [ here].

  • Dropout: A simple way to prevent neural networks from overfitting
    Nitish Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov
    Journal of Machine Learning Research, 2014. [ pdf].

  • Deep Learning for Neuroimaging: a Validation Study
    S. Plis, D. Hjelm, R. Salakhutdinov, E. Allen, H. Bockholt, J. Long, H. Johnson, J. Paulsen, J. Turner, and V. Calhoun
    Frontiers in Neuroscience, 2014. [ pdf].

  • Multi-task Neural Networks for QSAR Prediction
    George E. Dahl, Navdeep Jaitly, Ruslan Salakhutdinov, 2014.
    [ arXiv].

  • Restricted Boltzmann Machines for Neuroimaging: An Application in Identifying Intrinsic Networks
    Devon Hjelm, Vince Calhoun, Ruslan Salakhutdinov, Elena Allen, Tulay Adali, and Sergey Plis
    In NeuroImage, Volume 96, Aug 1 2014, pages 245 - 260. [ pdf].

  • Multimodal Neural Language Models
    Ryan Kiros, Ruslan Salakhutdinov, Richard Zemel.
    In 31st International Conference on Machine Learning (ICML 2014)
    [pdf], [ Project Page].


  • Annealing between Distributions by Averaging Moments
    Roger Grosse, Chris Maddison, and Ruslan Salakhutdinov
    In Neural Information Processing Systems (NIPS 27), 2013, oral.
    [pdf], Supplementary material [ pdf].

  • Discriminative Transfer Learning with Tree-based Priors
    Nitish Srivastava and Ruslan Salakhutdinov
    In Neural Information Processing Systems (NIPS 27), 2013, [pdf], Supplementary material [ zip].

  • Learning Stochastic Feedforward Neural Networks
    Yichuan Tang and Ruslan Salakhutdinov
    In Neural Information Processing Systems (NIPS 27), 2013 [pdf], Supplementary material [ pdf].

  • One-shot Learning by Inverting a Compositional Causal Process
    Brenden Lake, Ruslan Salakhutdinov, and Josh Tenenbaum
    In Neural Information Processing Systems (NIPS 27), 2013, [pdf], Supplementary material [ pdf].

  • The Power of Asymmetry in Binary Hashing
    B. Neyshabur, N. Srebro, R. Salakhutdinov, Y. Makarychev, and P. Yadollahpour
    In Neural Information Processing Systems (NIPS 27), 2013, [pdf].

  • Learning with Hierarchical-Deep Models
    Ruslan Salakhutdinov, Josh Tenenbaum, and Antonio Torralba
    IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1958-1971, Aug. 2013, [pdf].

  • Modeling Documents with Deep Boltzmann Machines
    Nitish Srivastava, Ruslan Salakhutdinov, Geoffrey Hinton
    In Uncertainty in Artificial Intelligence (UAI), Seattle, USA, 2013, oral.
    [pdf]

  • Tensor Analyzers
    Yichuan Tang, Ruslan Salakhutdinov and Geoffrey Hinton
    In 30th International Conference on Machine Learning (ICML), Atlanta, USA, 2013 [pdf], [ supp ], [ code].



  • Multimodal Learning with Deep Boltzmann Machines
    Nitish Srivastava and Ruslan Salakhutdinov
    Neural Information Processing Systems (NIPS 26), 2012, oral.
    [ pdf], Supplementary material [ zip].
    Code is available [ here].

  • Hamming Distance Metric Learning
    Mohammad Norouzi, David Fleet, and Ruslan Salakhutdinov
    Neural Information Processing Systems (NIPS 26), 2012 [ pdf], Supplementary material [ pdf].

  • A Better Way to Pretrain Deep Boltzmann Machines
    Ruslan Salakhutdinov and Geoffrey Hinton
    Neural Information Processing Systems (NIPS 26), 2012, [ pdf].

  • Matrix Reconstruction with the Local Max Norm.
    Rina Foygel, Nathan Srebro, Ruslan Salakhutdinov
    Neural Information Processing Systems (NIPS 26), 2012, [ pdf], Supplementary material [ pdf].

  • Cardinality Restricted Boltzmann Machines
    Kevin Swersky, Daniel Tarlow, Ilya Sutskever, Ruslan Salakhutdinov, Richard Zemel, and Ryan Adams.
    Neural Information Processing Systems (NIPS 26), 2012, [ pdf].

  • An Efficient Learning Procedure for Deep Boltzmann Machines
    Ruslan Salakhutdinov and Geoffrey Hinton
    Neural Computation August 2012, Vol. 24, No. 8: 1967 -- 2006. [ pdf]

  • Improving neural networks by preventing co-adaptation of feature detectors
    Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov
    arXiv [ pdf]

  • Exploiting Compositionality to Explore a Large Space of Model Structures
    Roger Grosse, Ruslan Salakhutdinov, William Freeman, and Joshua Tenenbaum
    UAI 2012 [ pdf].
    Best student paper award (Congratulations Roger).

  • One-Shot Learning with a Hierarchical Nonparametric Bayesian Model
    Ruslan Salakhutdinov, Josh Tenenbaum, and Antonio Torralba
    JMLR W&CP Unsupervised and Transfer Learning, 2012, [ pdf]

  • Deep Lambertian Networks
    Yichuan Tang, Ruslan Salakhutdinov, and Geoffrey Hinton
    The 29th International Conference on Machine Learning (ICML 2012) [ pdf]

  • Deep Mixtures of Factor Analysers
    Yichuan Tang, Ruslan Salakhutdinov, and Geoffrey Hinton
    The 29th International Conference on Machine Learning (ICML 2012) [ pdf]

  • Concept learning as motor program induction: A large-scale empirical study.
    Brenden Lake, Ruslan Salakhutdinov, and Josh Tenenbaum.
    Proceedings of the 34th Annual Conference of the Cognitive Science Society, 2012 [ pdf], Supporting Info

  • Robust Boltzmann Machines for Recognition and Denoising
    Yichuan Tang, Ruslan Salakhutdinov, and Geoffrey Hinton
    IEEE Computer Vision and Pattern Recognition (CVPR) 2012. [ pdf]

  • Resource Configurable Spoken Query Detection using Deep Boltzmann Machines
    Yaodong Zhang, Ruslan Salakhutdinov, Hung-An Chang, and James Glass.
    37th International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012 [ pdf]

  • Domain Adaptation: A Small Sample Statistical Approach
    Dean Foster, Sham Kakade, and Ruslan Salakhutdinov
    JMLR W&CP 15 (AISTATS), 2012 [ pdf]



  • Learning to Learn with Compound Hierarchical-Deep Models
    Ruslan Salakhutdinov, Josh Tenenbaum, Antonio Torralba
    Neural Information Processing Systems (NIPS 25), 2011, [ pdf]

  • Transfer Learning by Borrowing Examples
    Joseph Lim, Ruslan Salakhutdinov, Antonio Torralba
    Neural Information Processing Systems (NIPS 25), 2011, [ pdf]

  • Learning with the Weighted Trace-norm under Arbitrary Sampling Distributions
    Rina Foygel, Ruslan Salakhutdinov, Ohad Shamir, Nathan Srebro
    Neural Information Processing Systems (NIPS 25), 2011, [ pdf]
    Supplementary material [ pdf]

  • One-shot learning of simple visual concepts
    Brenden Lake, Ruslan Salakhutdinov, Jason Gross, and Josh Tenenbaum.
    Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 2011 [ pdf], videos

  • Learning to Share Visual Appearance for Multiclass Object Detection
    Ruslan Salakhutdinov, Antonio Torralba, and Josh Tenenbaum.
    IEEE Computer Vision and Pattern Recognition (CVPR) 2011 [ pdf]

  • Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm.
    Ruslan Salakhutdinov and Nathan Srebro.
    Neural Information Processing Systems 24, 2011
    [bibtex] [ pdf]
    Earlier version: [arXiv:1002.2780v1], [ps.gz][ pdf]

  • Practical Large-Scale Optimization for Max-Norm Regularization.
    Jason Lee, Benjamin Recht, Ruslan Salakhutdinov, Nathan Srebro, and Joel A. Tropp
    Neural Information Processing Systems 24, 2011
    [bibtex] [ pdf]


  • Discovering Binary Codes for Documents by Learning Deep Generative Models.
    Geoffrey Hinton and Ruslan Salakhutdinov.
    Topics in Cognitive Science, 2010
    [bibtex] [ pdf]

  • One-Shot Learning with a Hierarchical Nonparametric Bayesian Model.
    Ruslan Salakhutdinov, Josh Tenenbaum, and Antonio Torralba.
    MIT Technical Report MIT-CSAIL-TR-2010-052, 2010, [ pdf]

  • Learning in Deep Boltzmann Machines using Adaptive MCMC.
    Ruslan Salakhutdinov.
    In 27th International Conference on Machine Learning (ICML-2010)
    [bibtex] [ps.gz], [ pdf]

  • Efficient Learning of Deep Boltzmann Machines.
    Ruslan Salakhutdinov and Hugo Larochelle.
    AI and Statistics, 2010
    [bibtex] [ps.gz][ pdf]

  • Learning in Markov Random Fields using Tempered Transitions.
    Ruslan Salakhutdinov.
    Neural Information Processing Systems 23, 2010
    [bibtex] [ps.gz][ pdf]

  • Replicated Softmax: an Undirected Topic Model.
    Ruslan Salakhutdinov and Geoffrey Hinton.
    Neural Information Processing Systems 23, 2010
    [bibtex] [ps.gz][pdf]

  • Modelling Relational Data using Bayesian Clustered Tensor Factorization.
    Ilya Sutskever, Ruslan Salakhutdinov, and Josh Tenenbaum.
    Neural Information Processing Systems 23, 2010
    [bibtex] [pdf]


  • Learning Deep Generative Models.
    Ruslan Salakhutdinov
    PhD Thesis, Sep 2009
    Dept. of Computer Science, University of Toronto
    [bibtex] [ps.gz][pdf]

  • Semantic Hashing.
    Ruslan Salakhutdinov and Geoffrey Hinton
    International Journal of Approximate Reasoning, 2009
    [bibtex] [pdf]
    Earlier version appeared in: SIGIR workshop on Information Retrieval and applications of Graphical Models (2007)
    [bibtex] [ps.gz, pdf]

  • Learning Nonlinear Dynamic Models.
    John Langford, Ruslan Salakhutdinov and Tong Zhang.
    Proceedings of the 26th International Conference on Machine Learning (ICML), 2009.
    [bibtex] [ps.gz][ pdf]

  • Evaluation Methods for Topic Models.
    Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov and David Mimno.
    Proceedings of the 26th International Conference on Machine Learning (ICML), 2009.
    [bibtex] [ pdf]

  • Deep Boltzmann Machines
    Ruslan Salakhutdinov and Geoffrey Hinton
    12th International Conference on Artificial Intelligence and Statistics (2009).
    [bibtex] [ps.gz][ pdf]

  • Evaluating probabilities under high-dimensional latent variable models.
    Iain Murray and Ruslan Salakhutdinov
    Neural Information Processing Systems 22 (NIPS 2009)
    [bibtex] [ pdf], Jan 2009


  • Learning and Evaluating Boltzmann Machines
    Ruslan Salakhutdinov
    Technical Report UTML TR 2008-002, Dept. of Computer Science, University of Toronto
    [bibtex] [ps.gz][ pdf]
    This paper introduces a new Boltzmann machine learning algorithm that combines variational techniques and MCMC.

  • On the Quantitative Analysis of Deep Belief Networks.
    Ruslan Salakhutdinov and Iain Murray
    In 25th International Conference on Machine Learning (ICML-2008)
    [bibtex] [ps.gz],[ pdf], [code]

  • Bayesian Probabilistic Matrix Factorization using MCMC.
    Ruslan Salakhutdinov and Andriy Mnih
    In 25th International Conference on Machine Learning (ICML-2008)
    [bibtex] [ps.gz],[ pdf]

  • Probabilistic Matrix Factorization.
    Ruslan Salakhutdinov and Andriy Mnih
    Neural Information Processing Systems 21 (NIPS 2008)
    [bibtex] [ps.gz][pdf], Jan 2008, oral.

  • Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes.
    Ruslan Salakhutdinov and Geoffrey Hinton
    Neural Information Processing Systems 21 (NIPS 2008)
    [bibtex] [ps.gz][pdf], Jan 2008

  • Restricted Boltzmann Machines for Collaborative Filtering.
    Ruslan Salakhutdinov, Andriy Mnih, and Geoffrey Hinton
    ICML 2007
    [bibtex] [ps.gz][pdf]

  • Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure.
    Ruslan Salakhutdinov and Geoffrey Hinton
    AI and Statistics 2007
    [bibtex] [ps.gz][ pdf]

  • Reducing the Dimensionality of Data with Neural Networks.
    Geoffrey E. Hinton and Ruslan R. Salakhutdinov
    Science, 28 July 2006:
    Vol. 313. no. 5786, pp. 504 - 507
    [bibtex] [pdf][ Science Online]
    Supporting Online Material [pdf, Science Online]
    Matlab Code is available here
    Figures are available in eps format: [fig1, fig2, fig3, fig4]
    and in jpeg format: [fig1, fig2, fig3, fig4]

  • Simultaneous Localization and Surveying with Multiple Agents.
    Sam Roweis & Ruslan Salakhutdinov (2005)
    In R. Murray-Smith, R. Shorten (eds), Switching and Learning in Feedback Systems (Springer LNCS vol 3355, 2005). pp. 313--332
    [bibtex] [pdf]

  • Neighbourhood Component Analysis
    Jacob Goldberger, Sam Roweis, Geoff Hinton, Ruslan Salakhutdinov
    Neural Information Processing Systems 17 (NIPS'04).
    [bibtex] [pdf]

  • Semi-Supervised Mixture-of-Experts Classification
    Grigoris Karakoulas & Ruslan Salakhutdinov
    The Fourth IEEE International Conference on Data Mining (ICDM 04)
    [bibtex]

  • On the Convergence of Bound Optimization Algorithms
    Ruslan Salakhutdinov & Sam T. Roweis & Zoubin Ghahramani (2003).
    Uncertainty in Artificial Intelligence (UAI-2003). pp 509-516
    [bibtex] [ps.gz] [pdf]

  • Optimization with EM and Expectation-Conjugate-Gradient
    Ruslan Salakhutdinov & Sam T. Roweis & Zoubin Ghahramani (2003).
    International Conference on Machine Learning (ICML-2003). pp 672-679
    [bibtex] [ps.gz] [pdf]

  • Adaptive Overrelaxed Bound Optimization Methods.
    Ruslan Salakhutdinov & Sam T. Roweis (2003).
    International Conference on Machine Learning (ICML-2003). pp 664-671
    [bibtex] [ps.gz] [pdf]

    Also check out demos on Adaptive vs Standard EM for Mixture of Factor Analyzers here and Mixture of Gaussians here


      Technical Reports/Unpublished Manuscripts

      1. Notes on the KL-divergence between a Markov chain and its equilibrium distribution
        Iain Murray and Ruslan Salakhutdinov (2008)
        [pdf]

      2. Relationship between gradient and EM steps in latent variable models.
        Ruslan Salakhutdinov & Sam T. Roweis & Zoubin Ghahramani (2002).
        Unpublished Report. [draft version (sep.02)-->ps.gz(32K) pdf(70K)]

      3. Expectation Conjugate-Gradient: An Alternative to EM
        Ruslan Salakhutdinov & Sam T. Roweis & Zoubin Ghahramani (2003).
        [draft version (june.02)-->ps.gz(186K) pdf(640K)]