•Hybrid Monte Carlo can only take small steps because the energy surface is curved.
•With a single layer of hidden units, it is possible to use alternating parallel Gibbs sampling.
–Much less computation than Hybrid Monte Carlo
–Much faster mixing
–Can be extended to use a pooled second layer (Max Welling)
–Can only be used in deep networks by learning one hidden layer at a time.
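
The alternating parallel Gibbs sampling mentioned above can be sketched roughly as follows. The slide does not say which model is being sampled, so this is a minimal illustration that assumes a binary restricted Boltzmann machine with no within-layer connections. The names `sample_hidden`, `sample_visible`, `gibbs_chain`, `W`, `b_vis` and `b_hid` are hypothetical and are not taken from the lecture. Because units within a layer are conditionally independent given the other layer, each layer can be updated in one parallel step.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_hidden(v, W, b_hid, rng):
    """Sample all hidden units given the visible vector v.

    Assumption: binary units and bipartite connectivity, so the hidden
    units are conditionally independent given v.
    """
    h = []
    for j in range(len(b_hid)):
        activation = b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        h.append(1 if rng.random() < sigmoid(activation) else 0)
    return h


def sample_visible(h, W, b_vis, rng):
    """Sample all visible units given the hidden vector h."""
    v = []
    for i in range(len(b_vis)):
        activation = b_vis[i] + sum(W[i][j] * h[j] for j in range(len(h)))
        v.append(1 if rng.random() < sigmoid(activation) else 0)
    return v


def gibbs_chain(v0, W, b_vis, b_hid, n_steps, seed=0):
    """Alternate hidden and visible updates, starting from visible state v0."""
    rng = random.Random(seed)
    v = list(v0)
    h = sample_hidden(v, W, b_hid, rng)
    for _ in range(n_steps):
        v = sample_visible(h, W, b_vis, rng)
        h = sample_hidden(v, W, b_hid, rng)
    return v, h


if __name__ == "__main__":
    # Toy model: 3 visible units, 2 hidden units, arbitrary small weights.
    W = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
    v, h = gibbs_chain([1, 0, 1], W, [0.0, 0.0, 0.0], [0.0, 0.0], n_steps=10)
    print("visible:", v, "hidden:", h)
```

Each full step costs two matrix-vector products and needs no gradient of the energy, which is one way to read the "much less computation" point. A vectorised version would replace the inner loops with matrix operations.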