Learning Energy-Based Models of High-Dimensional Data

Faster mixing chains

• Hybrid Monte Carlo can only take small steps because the energy surface is curved.

• With a single layer of hidden units, it is possible to use alternating parallel Gibbs sampling.

  – Much less computation

  – Much faster mixing

  – Can be extended to use pooled second layer (Max Welling)

  – Can only be used in deep networks by learning one hidden layer at a time.
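
To make the alternating parallel Gibbs sampling mentioned above concrete, here is a minimal sketch. It assumes binary visible and hidden units with a bilinear energy (a restricted Boltzmann machine), which the slide does not specify; the names `W`, `b_vis`, `b_hid`, `gibbs_step`, and `run_chain` are hypothetical and chosen only for illustration. With a single hidden layer and no within-layer connections, the hidden units are conditionally independent given the visible units and vice versa, so each layer can be sampled in one parallel step.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gibbs_step(v, W, b_vis, b_hid, rng):
    """One alternating step: sample all hidden units given v, then all visible units given h.

    Assumes binary units (an illustrative choice, not stated in the slide).
    v: (batch, n_vis), W: (n_vis, n_hid), b_vis: (n_vis,), b_hid: (n_hid,)
    """
    # All hidden units are updated in parallel because they are
    # conditionally independent given the visible layer.
    p_h = sigmoid(v @ W + b_hid)
    h = (rng.random(p_h.shape) < p_h).astype(v.dtype)

    # Likewise, all visible units are updated in parallel given h.
    p_v = sigmoid(h @ W.T + b_vis)
    v_new = (rng.random(p_v.shape) < p_v).astype(v.dtype)
    return v_new, h


def run_chain(v0, W, b_vis, b_hid, n_steps, seed=0):
    """Run n_steps of alternating Gibbs sampling starting from v0."""
    rng = np.random.default_rng(seed)
    v, h = v0, None
    for _ in range(n_steps):
        v, h = gibbs_step(v, W, b_vis, b_hid, rng)
    return v, h
```

Each step costs two matrix products and needs no gradient of the energy, which is one way to read the "much less computation" point relative to Hybrid Monte Carlo. With more than one hidden layer the conditionals no longer factorize this simply, which is consistent with the slide's remark about learning one hidden layer at a time.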