Radford Neal's Research: Neural Networks

The neural network field (a.k.a. ``connectionism'') deals with models that are relevant to, or at least inspired by, the way learning and computation may occur in the brain.

I am particularly interested in neural networks that implement latent variable models, and in Bayesian inference for neural network models.

``Multilayer perceptron'' (or ``backprop'') networks are the most common type of neural network in ``supervised learning'' applications. The following publications deal with Bayesian inference for multilayer perceptron networks, implemented using Markov chain Monte Carlo methods (a small illustrative sketch follows below):

Neal, R. M. and Zhang, J. (2006) ``High dimensional classification with Bayesian neural networks and Dirichlet diffusion trees'', in I. Guyon, S. Gunn, M. Nikravesh, and L. A. Zadeh (editors) Feature Extraction: Foundations and Applications, Studies in Fuzziness and Soft Computing, Volume 207, Springer, pp. 265-295.

Neal, R. M. (2006) ``Classification with Bayesian neural networks'', in J. Quiñonero-Candela, B. Magnini, I. Dagan, and F. d'Alché-Buc (editors) Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Textual Entailment, Lecture Notes in Computer Science No. 3944, Springer-Verlag, pp. 28-32.

Neal, R. M. (1996) Bayesian Learning for Neural Networks, Lecture Notes in Statistics No. 118, New York: Springer-Verlag: blurb, associated references, associated software.

Neal, R. M. (1998) ``Assessing relevance determination methods using DELVE'', in C. M. Bishop (editor), Neural Networks and Machine Learning, pp. 97-129, Springer-Verlag: abstract, associated references, postscript, pdf.

Neal, R. M. (1992) ``Bayesian training of backpropagation networks by the hybrid Monte Carlo method'', Technical Report CRG-TR-92-1, Dept. of Computer Science, University of Toronto, 21 pages: abstract, postscript, pdf, associated references.

You can also obtain the software that I have developed for Bayesian learning of neural networks.
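
To give a concrete (if highly simplified) picture of what these papers and that software do, the sketch below samples from the posterior distribution over the weights of a tiny one-hidden-layer network using hybrid (Hamiltonian) Monte Carlo. This is not my actual software: the network size, prior and noise standard deviations, step size, and trajectory length are all illustrative assumptions, and a finite-difference gradient is used only to keep the example self-contained.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression data: y = sin(x) + noise.
    x = np.linspace(-3, 3, 40)[:, None]
    y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(40)

    H = 10           # number of hidden units (an arbitrary choice)
    sigma_w = 1.0    # assumed prior standard deviation for all weights
    sigma_n = 0.1    # assumed (known) noise standard deviation

    def predict(w, x):
        # One-hidden-layer tanh network; w packs all weights and biases.
        W1 = w[:H].reshape(1, H); b1 = w[H:2*H]
        W2 = w[2*H:3*H].reshape(H, 1); b2 = w[3*H]
        return np.tanh(x @ W1 + b1) @ W2 + b2

    def energy(w):
        # Minus the log posterior, up to a constant: prior plus likelihood.
        resid = y[:, None] - predict(w, x)
        return 0.5 * np.sum(w**2) / sigma_w**2 + 0.5 * np.sum(resid**2) / sigma_n**2

    def grad_energy(w, eps=1e-5):
        # Finite differences, just to keep the sketch self-contained.
        g = np.empty_like(w)
        for i in range(w.size):
            d = np.zeros_like(w); d[i] = eps
            g[i] = (energy(w + d) - energy(w - d)) / (2 * eps)
        return g

    def hmc_step(w, step=0.001, n_leap=20):
        # One hybrid Monte Carlo update: a leapfrog trajectory, then
        # acceptance or rejection based on the change in total "energy".
        p = rng.standard_normal(w.size)
        H0 = energy(w) + 0.5 * p @ p
        wn = w.copy()
        pn = p - 0.5 * step * grad_energy(wn)
        for _ in range(n_leap):
            wn += step * pn
            g = grad_energy(wn)
            pn -= step * g
        pn += 0.5 * step * g          # undo the extra half step of momentum
        H1 = energy(wn) + 0.5 * pn @ pn
        return wn if np.log(rng.random()) < H0 - H1 else w

    w = 0.1 * rng.standard_normal(3*H + 1)
    preds = []
    for i in range(500):
        w = hmc_step(w)
        if i >= 250:                  # discard a crude burn-in period
            preds.append(predict(w, x))
    post_mean = np.mean(preds, axis=0)    # posterior predictive mean

In practice one would compute exact gradients by backpropagation, treat the prior and noise variances as unknowns to be sampled as well (as the publications above do), and monitor convergence carefully; the hybrid Monte Carlo update itself, however, has this basic structure.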

Other sorts of neural networks are aimed at ``unsupervised learning''; some of these are based on latent variables. The following paper takes a neural network view of what are known as ``belief networks'' in the expert systems literature:

Neal, R. M. (1990) ``Learning stochastic feedforward networks'', Technical Report CRG-TR-90-7, Dept. of Computer Science, University of Toronto, 34 pages: abstract, postscript, pdf, associated reference.

When they contain latent variables, these networks are implemented using Markov chain Monte Carlo methods.
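
As a rough sketch of what such an implementation involves, the fragment below runs Gibbs sampling over the binary latent units of a small two-layer stochastic feedforward network with sigmoid units, then takes one delta-rule-like gradient step using the sampled completion. The network sizes, random parameters, number of sweeps, and learning rate are illustrative assumptions only.

    import numpy as np

    rng = np.random.default_rng(1)

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    n_hid, n_vis = 6, 10                            # illustrative sizes
    b = 0.1 * rng.standard_normal(n_hid)            # latent-unit biases
    W = 0.5 * rng.standard_normal((n_hid, n_vis))   # hidden-to-visible weights
    c = 0.1 * rng.standard_normal(n_vis)            # visible-unit biases

    def log_joint(h, v):
        # log P(h, v) for the two-layer stochastic feedforward network.
        lp_h = np.sum(h * np.log(sigmoid(b)) + (1 - h) * np.log(sigmoid(-b)))
        p_v = sigmoid(h @ W + c)
        lp_v = np.sum(v * np.log(p_v) + (1 - v) * np.log(1 - p_v))
        return lp_h + lp_v

    def gibbs_sweep(h, v):
        # Resample each latent unit from its conditional distribution
        # given the visible units and all the other latent units.
        for j in range(n_hid):
            h[j] = 1; lp1 = log_joint(h, v)
            h[j] = 0; lp0 = log_joint(h, v)
            h[j] = 1 if np.log(rng.random()) < lp1 - np.logaddexp(lp0, lp1) else 0
        return h

    v = rng.integers(0, 2, n_vis).astype(float)     # a stand-in "data" vector
    h = rng.integers(0, 2, n_hid).astype(float)
    for _ in range(20):                             # burn-in sweeps
        h = gibbs_sweep(h, v)

    # One gradient step on log P(v, h) using the sampled completion.
    W += 0.05 * np.outer(h, v - sigmoid(h @ W + c))

Repeating the sweep-then-update cycle over many data vectors gives a stochastic approximation to gradient ascent on the log likelihood.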

The ``Helmholtz Machine'' is another approach to implementing network models with latent variables. It is discussed in the following papers (a brief sketch of wake-sleep learning follows the list):

Dayan, P., Hinton, G. E., Neal, R. M., and Zemel, R. S. (1995) ``The Helmholtz machine'', Neural Computation, vol. 7, pp. 889-904: abstract, associated reference.

Hinton, G. E., Dayan, P., Frey, B. J., and Neal, R. M. (1995) ``The `wake-sleep' algorithm for unsupervised neural networks'', Science, vol. 268, pp. 1158-1161: abstract, associated references.

Neal, R. M. and Dayan, P. (1996) ``Factor analysis using delta-rule wake-sleep learning'', Technical Report No. 9607, Dept. of Statistics, University of Toronto, 23 pages: abstract, postscript, pdf, associated references, associated software.
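
To convey the flavour of wake-sleep learning, here is a bare-bones version of the delta-rule scheme for a single-factor model, in the spirit of the factor analysis paper above. The noise levels and learning rate are fixed, illustrative assumptions (the paper also learns the noise variances), so this is a sketch of the idea rather than the algorithm as analysed there.

    import numpy as np

    rng = np.random.default_rng(2)

    # Synthetic data from a true single-factor model, just for illustration.
    n, d = 2000, 5
    true_load = np.array([2.0, -1.0, 0.5, 1.5, -0.5])
    data = (rng.standard_normal(n)[:, None] * true_load
            + 0.3 * rng.standard_normal((n, d)))

    g = 0.1 * rng.standard_normal(d)   # generative ("top-down") loadings
    r = 0.1 * rng.standard_normal(d)   # recognition ("bottom-up") weights
    lr = 0.01                          # assumed learning rate

    for epoch in range(50):
        for v in data:
            # Wake phase: the recognition model guesses the latent factor
            # for a real data vector, and the generative weights take a
            # delta-rule step toward reconstructing the data from it.
            h = r @ v + 0.1 * rng.standard_normal()
            g += lr * (v - g * h) * h

            # Sleep phase: the generative model "dreams" a factor and a
            # data vector, and the recognition weights take a delta-rule
            # step toward recovering the dreamed factor from the dream.
            h_dream = rng.standard_normal()
            v_dream = g * h_dream + 0.3 * rng.standard_normal(d)
            r += lr * (h_dream - r @ v_dream) * v_dream

If learning converges, the generative loadings g end up roughly aligned with the true loadings (up to sign and scale), with the recognition weights r acting as an approximate inverse.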

