The ``Wake-Sleep'' Algorithm for Unsupervised Neural Networks

Geoffrey E. Hinton, Peter Dayan, Brendan J. Frey, and Radford M. Neal
Dept. of Computer Science, University of Toronto

An unsupervised learning algorithm for a multilayer network of stochastic neurons is described. Bottom-up ``recognition'' connections convert the input into representations in successive hidden layers, and top-down ``generative'' connections reconstruct the representation in one layer from the representation in the layer above. In the ``wake'' phase, neurons are driven by recognition connections, and generative connections are adapted to increase the probability that they would reconstruct the correct activity vector in the layer below. In the ``sleep'' phase, neurons are driven by generative connections, and recognition connections are adapted to increase the probability that they would produce the correct activity vector in the layer above.
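
The two phases described above can be illustrated with a minimal Python sketch for a network with a single hidden layer of binary stochastic units. This is not the authors' code. The class name WakeSleepNet, the helper functions, the zero initialisation, the learning rate, and the toy data are all assumptions made for this example. Each phase is written as a simple delta rule: the weights of the connections being adapted move in proportion to the difference between the sampled activity and the probability those connections assigned to it.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def layer_probs(inputs, weights, biases):
    """Firing probabilities of a layer; weights[i][j] links input i to unit j."""
    return [
        sigmoid(biases[j] + sum(inputs[i] * weights[i][j] for i in range(len(inputs))))
        for j in range(len(biases))
    ]


def sample(probs, rng):
    """Draw one binary state per unit from its firing probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def delta_update(inputs, targets, weights, biases, lr):
    """Raise the probability that `inputs` would produce `targets` (in place)."""
    probs = layer_probs(inputs, weights, biases)
    for j, (t, p) in enumerate(zip(targets, probs)):
        step = lr * (t - p)
        biases[j] += step
        for i, x in enumerate(inputs):
            weights[i][j] += step * x


class WakeSleepNet:
    """Hypothetical one-hidden-layer network, for illustration only."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        # Bottom-up recognition connections (visible -> hidden).
        self.rec_w = [[0.0] * n_hidden for _ in range(n_visible)]
        self.rec_b = [0.0] * n_hidden
        # Top-down generative connections (hidden -> visible).
        self.gen_w = [[0.0] * n_visible for _ in range(n_hidden)]
        self.gen_b = [0.0] * n_visible
        # Generative biases of the top (hidden) layer.
        self.top_b = [0.0] * n_hidden

    def wake(self, visible, lr):
        # Units are driven by recognition connections ...
        hidden = sample(layer_probs(visible, self.rec_w, self.rec_b), self.rng)
        # ... and generative connections learn to reconstruct the layer below.
        delta_update(hidden, visible, self.gen_w, self.gen_b, lr)
        delta_update([], hidden, [], self.top_b, lr)
        return hidden

    def sleep(self, lr):
        # Units are driven by generative connections (a "fantasy") ...
        hidden = sample(layer_probs([], [], self.top_b), self.rng)
        visible = sample(layer_probs(hidden, self.gen_w, self.gen_b), self.rng)
        # ... and recognition connections learn to recover the layer above.
        delta_update(visible, hidden, self.rec_w, self.rec_b, lr)
        return visible


# Toy usage: alternate wake and sleep steps on two made-up binary patterns.
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
demo = WakeSleepNet(n_visible=4, n_hidden=2, seed=0)
for _ in range(2000):
    demo.wake(demo.rng.choice(data), lr=0.05)
    demo.sleep(lr=0.05)

# With lr=0.0 a sleep step only generates a fantasy, without learning.
fantasies = [demo.sleep(lr=0.0) for _ in range(5)]
print(fantasies)
```

A network with several hidden layers would apply the same two updates to each adjacent pair of layers, sampling upward in the wake phase and downward in the sleep phase.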

Science, vol. 268, pp. 1158-1161 (1995).


Associated references: The wake-sleep algorithm is a variation on the ``Helmholtz Machine'', discussed in the following paper:
Dayan, P., Hinton, G. E., Neal, R. M., and Zemel, R. S. (1995) ``The Helmholtz machine'', Neural Computation, vol. 7, pp. 1022-1037.

The wake-sleep algorithm is used to learn factor analysis models in the following technical report:
Neal, R. M. and Dayan, P. (1996) ``Factor analysis using delta-rule wake-sleep learning'', Technical Report No. 9607, Dept. of Statistics, University of Toronto, 23 pages.

The wake-sleep algorithm is used to model temporal sequences in the following conference paper:
Hinton, G. E., Dayan, P., To, A., and Neal, R. (1995) ``The Helmholtz machine through time'', in F. Fogelman-Soulie and P. Gallinari (editors) ICANN-95, pp. 483-490.