I am particularly interested in neural networks that implement latent variable models, and in Bayesian inference for neural network models.
``Multilayer perceptron'' (or ``backprop'') networks are the most common type of neural network in ``supervised learning'' applications. The following publications deal with Bayesian inference for multilayer perceptron networks implemented using Markov chain Monte Carlo methods:
Neal, R. M. (1996) Bayesian Learning for Neural Networks, Lecture Notes in Statistics No. 118, New York: Springer-Verlag: blurb, associated references, associated software.

You can also obtain the software for Bayesian learning of neural networks that I have developed.
Neal, R. M. (1998) ``Assessing relevance determination methods using DELVE'', in C. M. Bishop (editor), Neural Networks and Machine Learning, pp. 97-129, Springer-Verlag: abstract, associated references, postscript, pdf.
Neal, R. M. (1992) ``Bayesian training of backpropagation networks by the hybrid Monte Carlo method'', Technical Report CRG-TR-92-1, Dept. of Computer Science, University of Toronto, 21 pages: abstract, postscript, pdf, associated references.
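To give a flavour of the hybrid (Hamiltonian) Monte Carlo method used in the publications above, here is a minimal sketch: leapfrog simulation of Hamiltonian dynamics followed by a Metropolis accept/reject step, sampling the posterior over the weights of a toy one-hidden-unit network. The data set, priors, noise level, step sizes, and finite-difference gradients are illustrative assumptions, not the settings used in these papers (real implementations compute gradients by backpropagation).

```python
# Sketch of hybrid (Hamiltonian) Monte Carlo for a Bayesian network
# with one tanh hidden unit: weights w = [input->hidden, hidden->output].
# All settings here are illustrative, not those from the publications.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data from a tanh curve plus noise.
X = np.linspace(-2, 2, 20)
y = np.tanh(1.5 * X) + 0.1 * rng.standard_normal(20)

def log_posterior(w):
    # Gaussian likelihood (noise variance 0.01) and Gaussian priors --
    # standard choices, assumed here for illustration.
    pred = w[1] * np.tanh(w[0] * X)
    log_lik = -0.5 * np.sum((y - pred) ** 2) / 0.01
    log_prior = -0.5 * np.sum(w ** 2)
    return log_lik + log_prior

def grad_log_posterior(w, eps=1e-5):
    # Finite differences keep the sketch short; real code uses backprop.
    g = np.zeros_like(w)
    for i in range(len(w)):
        d = np.zeros_like(w); d[i] = eps
        g[i] = (log_posterior(w + d) - log_posterior(w - d)) / (2 * eps)
    return g

def hmc_step(w, step=0.01, n_leapfrog=20):
    p = rng.standard_normal(w.shape)                   # fresh momentum
    w_new, p_new = w.copy(), p.copy()
    p_new += 0.5 * step * grad_log_posterior(w_new)    # half step for p
    for _ in range(n_leapfrog - 1):
        w_new += step * p_new
        p_new += step * grad_log_posterior(w_new)
    w_new += step * p_new
    p_new += 0.5 * step * grad_log_posterior(w_new)    # final half step
    # Metropolis accept/reject based on the change in total energy.
    h_old = -log_posterior(w) + 0.5 * (p @ p)
    h_new = -log_posterior(w_new) + 0.5 * (p_new @ p_new)
    if np.log(rng.uniform()) < h_old - h_new:
        return w_new, True
    return w, False

w = np.zeros(2)
samples, accepted = [], 0
for _ in range(200):
    w, ok = hmc_step(w)
    accepted += ok
    samples.append(w.copy())
```

Because trajectories follow the gradient of the log posterior, the chain moves to the high-probability region much faster than a random-walk Metropolis sampler would, which is the motivation for using hybrid Monte Carlo with neural network posteriors.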
Other sorts of neural networks are aimed at ``unsupervised learning''; some of these are based on latent variables. The following paper takes a neural network view of what are known as ``belief networks'' in the expert systems literature:
Neal, R. M. (1990) ``Learning stochastic feedforward networks'', Technical Report CRG-TR-90-7, Dept. of Computer Science, University of Toronto, 34 pages: abstract, postscript, pdf, associated reference.

When they contain latent variables, the above networks are implemented using Markov chain Monte Carlo.
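As a rough illustration of Markov chain Monte Carlo inference for latent variables in such networks, the sketch below Gibbs-samples the two binary hidden units of a tiny sigmoid belief network, given observed binary visible units. The network size, weights, and biases are made up for illustration and are not from the paper.

```python
# Gibbs sampling of hidden units in a tiny sigmoid belief network:
# two binary hidden units feed three binary visible units.
# Weights and biases below are illustrative assumptions.
import math, random

random.seed(0)

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

W = [[2.0, -1.0], [1.5, 0.5], [-1.0, 2.0]]   # visible i <- hidden j
b_h = [0.0, 0.0]                              # hidden-unit biases
b_v = [-0.5, 0.0, -0.5]                       # visible-unit biases

def log_joint(h, v):
    # log p(h, v) under the sigmoid belief network.
    lp = 0.0
    for j in range(2):
        p = sigmoid(b_h[j])
        lp += math.log(p if h[j] else 1 - p)
    for i in range(3):
        p = sigmoid(b_v[i] + sum(W[i][j] * h[j] for j in range(2)))
        lp += math.log(p if v[i] else 1 - p)
    return lp

def gibbs_sweep(h, v):
    # Resample each hidden unit from its conditional given all the rest.
    for j in range(2):
        h[j] = 1; lp1 = log_joint(h, v)
        h[j] = 0; lp0 = log_joint(h, v)
        p1 = 1.0 / (1.0 + math.exp(lp0 - lp1))
        h[j] = 1 if random.random() < p1 else 0
    return h

v = [1, 1, 0]                  # observed visible units
h = [0, 0]
counts = [0, 0]
for t in range(2000):
    h = gibbs_sweep(h, v)
    if t >= 500:               # discard burn-in sweeps
        counts[0] += h[0]
        counts[1] += h[1]
posterior_h = [c / 1500 for c in counts]
```

Averaging the sampled hidden states after burn-in estimates the posterior probability that each hidden unit is on given the observed visibles; with more units, exact posterior computation is intractable and such sampling is the practical alternative.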
The ``Helmholtz Machine'' is another approach to implementing network models with latent variables. It is discussed in the following papers:
Dayan, P., Hinton, G. E., Neal, R. M., and Zemel, R. S. (1995) ``The Helmholtz machine'', Neural Computation, vol. 7, pp. 1022-1037: abstract, associated reference.
Hinton, G. E., Dayan, P., Frey, B. J., and Neal, R. M. (1995) ``The `wake-sleep' algorithm for unsupervised neural networks'', Science, vol. 268, pp. 1158-1161: abstract, associated references.
Neal, R. M. and Dayan, P. (1996) ``Factor analysis using delta-rule wake-sleep learning'', Technical Report No. 9607, Dept. of Statistics, University of Toronto, 23 pages: abstract, postscript, pdf, associated references, associated software.
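The wake-sleep idea behind these papers can be sketched for a one-factor factor analysis model: in the wake phase, the recognition weights infer a latent factor for a data vector and the generative weights receive a delta-rule update toward reconstructing the data; in the sleep phase, the generative model ``dreams'' a factor and a data vector, and the recognition weights receive a delta-rule update toward recovering the dreamed factor. The model size, learning rate, and noise levels below are illustrative assumptions, not the settings analyzed in the paper.

```python
# Delta-rule wake-sleep learning for one-factor factor analysis.
# Generative model: x = g*z + noise, z ~ N(0,1).
# Recognition model: z ~ N(r.x, noise).
# All sizes and constants are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data from a known one-factor model.
true_g = np.array([2.0, -1.0, 0.5])
Z = rng.standard_normal(500)
X = Z[:, None] * true_g + 0.1 * rng.standard_normal((500, 3))

g = 0.1 * rng.standard_normal(3)   # generative weights
r = 0.1 * rng.standard_normal(3)   # recognition weights
lr = 0.01

for epoch in range(50):
    for x in X:
        # Wake phase: infer z with the recognition model, then a
        # delta-rule update pulls g toward reconstructing x from z.
        z = r @ x + 0.1 * rng.standard_normal()
        g += lr * (x - z * g) * z
        # Sleep phase: dream (z, x) from the generative model, then a
        # delta-rule update pulls r toward recovering the dreamed z.
        z_d = rng.standard_normal()
        x_d = z_d * g + 0.1 * rng.standard_normal(3)
        r += lr * (z_d - r @ x_d) * x_d
```

After training, the generative weights should align (up to sign, which is unidentifiable in factor analysis) with the direction of the true factor loadings, with the recognition weights settling into an approximate inverse of the generative mapping.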