Connectionist learning procedures are presented for ``sigmoid'' and ``noisy-OR'' varieties of probabilistic belief networks. These networks have previously been seen primarily as a means of representing knowledge derived from experts. Here it is shown that the ``Gibbs sampling'' simulation procedure for such networks can support maximum-likelihood learning from empirical data through local gradient ascent. This learning procedure resembles that used for ``Boltzmann machines'', and like it, allows the use of ``hidden'' variables to model correlations between visible variables. Due to the directed nature of the connections in a belief network, however, the ``negative phase'' of Boltzmann machine learning is unnecessary. Experimental results show that, as a result, learning in a sigmoid belief network can be faster than in a Boltzmann machine. These networks have other advantages over Boltzmann machines in pattern classification and decision making applications, are naturally applicable to unsupervised learning problems, and provide a link between work on connectionist learning and work on the representation of expert knowledge.
Artificial Intelligence, vol. 56, pp. 71-113 (1992).
Neal, R. M. (1990) ``Learning stochastic feedforward networks'', Technical Report CRG-TR-90-7, Dept. of Computer Science, University of Toronto, 34 pages.
The paper contains some material not present in the technical report. The technical report contains a few details not present in the paper.
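
The learning procedure summarized in the abstract can be illustrated with a minimal sketch; it is not the code used in the paper. It assumes units that take values 0 or 1, a strictly lower-triangular weight matrix `W` (unit `i` has parents `j < i`), and a bias vector `b`. All function names and the `sweeps` and `burn_in` parameters are hypothetical choices made for this illustration. Gibbs sampling resamples the hidden units with the visible units clamped, and the gradient of the log-likelihood is estimated by averaging the local quantity `(s_i - p_i) * s_j` over the samples. Because the network is directed, no separate ``negative phase'' is simulated.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_joint(s, W, b):
    """Log probability of a full 0/1 state s under the network.

    Assumes W is strictly lower triangular, so unit i depends only on
    units j < i, and P(s_i = 1 | parents) = sigmoid(W[i] @ s + b[i]).
    """
    p = sigmoid(W @ s + b)
    return np.sum(s * np.log(p) + (1 - s) * np.log(1 - p))

def gibbs_sweep(s, hidden, W, b, rng):
    """Resample each hidden unit in place, given all other units."""
    for k in hidden:
        s[k] = 1.0
        lp1 = log_joint(s, W, b)
        s[k] = 0.0
        lp0 = log_joint(s, W, b)
        # P(s_k = 1 | rest) = sigmoid(log P(s_k=1, rest) - log P(s_k=0, rest))
        s[k] = float(rng.random() < sigmoid(lp1 - lp0))
    return s

def gradient_estimate(v, visible, hidden, W, b, rng, sweeps=200, burn_in=50):
    """Monte Carlo estimate of d log P(v) / dW and d log P(v) / db."""
    visible = np.asarray(visible, dtype=int)
    hidden = np.asarray(hidden, dtype=int)
    s = np.zeros(len(b))
    s[visible] = v                                   # clamp visible units
    s[hidden] = rng.integers(0, 2, size=len(hidden)) # random initial hidden state
    gW = np.zeros_like(W)
    gb = np.zeros_like(b)
    for t in range(sweeps):
        gibbs_sweep(s, hidden, W, b, rng)
        if t >= burn_in:
            err = s - sigmoid(W @ s + b)             # s_i - p_i for every unit
            gW += np.outer(err, s)                   # local term (s_i - p_i) * s_j
            gb += err
    m = sweeps - burn_in
    mask = np.tril(np.ones_like(W), k=-1)            # keep only parent -> child weights
    return mask * gW / m, gb / m
```

A gradient-ascent step would then add a small multiple of the returned estimates to `W` and `b`, averaged over the training cases. The sketch recomputes the full joint probability for every Gibbs update, which is simple but wasteful; it also does not guard against saturated probabilities in the logarithms.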