Learning Energy-Based Models of High-Dimensional Data


Ways to combine Gibbs sampling with learning

• The obvious method is to start with a random hidden configuration for each datavector and to do Gibbs sampling until we have reached equilibrium.

• Then use the equilibrium samples from the posterior distribution over hidden configurations to update the weights (online, batch, or mini-batch).

• But how do we decide how much Gibbs sampling is required to reach equilibrium?

    – There is no simple test, and if we don’t do enough there is no guarantee that the learning will work, even if we use an infinitesimal learning rate.
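
The procedure in the bullets above can be sketched as follows. This is a minimal illustration, not the lecture's own code. The slide does not fix a model, so the sketch assumes binary units with a Boltzmann-machine-style energy: visible-to-hidden weights `W`, symmetric hidden-to-hidden weights `L`, and hidden biases `b` (all hypothetical names). It covers only the posterior sampling step and the data-dependent statistics gathered from it; how those statistics enter the weight update depends on the model. The argument `n_sweeps` is a guess at "enough" Gibbs sampling, which is exactly the quantity the slide says we have no simple test for.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gibbs_posterior_sample(v, W, L, b, n_sweeps, rng):
    """Sample a hidden configuration given a clamped datavector v.

    Assumed (hypothetical) parameterisation:
      W[i][j]  weight between visible unit i and hidden unit j
      L[j][k]  symmetric weight between hidden units j and k
      b[j]     bias of hidden unit j
    """
    n_hidden = len(b)
    # Start from a random hidden configuration, as in the first bullet.
    h = [rng.randint(0, 1) for _ in range(n_hidden)]
    for _ in range(n_sweeps):
        for j in range(n_hidden):
            # Total input to unit j from the data and the other hidden units.
            total = b[j]
            total += sum(v[i] * W[i][j] for i in range(len(v)))
            total += sum(L[j][k] * h[k] for k in range(n_hidden) if k != j)
            # Resample unit j from its conditional distribution.
            h[j] = 1 if rng.random() < sigmoid(total) else 0
    return h


def posterior_statistics(data, W, L, b, n_sweeps, rng):
    """Average v_i * h_j over one posterior sample per datavector (batch mode)."""
    n_visible, n_hidden = len(W), len(b)
    stats = [[0.0] * n_hidden for _ in range(n_visible)]
    for v in data:
        h = gibbs_posterior_sample(v, W, L, b, n_sweeps, rng)
        for i in range(n_visible):
            for j in range(n_hidden):
                stats[i][j] += v[i] * h[j] / len(data)
    return stats


rng = random.Random(0)
data = [[1, 0], [0, 1]]
W = [[0.5, -0.5], [-0.5, 0.5]]
L = [[0.0, 0.2], [0.2, 0.0]]
b = [0.0, 0.0]
stats = posterior_statistics(data, W, L, b, n_sweeps=20, rng=rng)
```

Nothing in this code can tell whether 20 sweeps were enough. If the chain has not reached equilibrium, `stats` is a biased estimate, and shrinking the learning rate does not remove that bias.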