Ways to combine Gibbs sampling with learning
• The obvious method is to start with a random hidden
configuration for each datavector and to do Gibbs
sampling until we have reached equilibrium.
• Then use the equilibrium samples from the posterior
distribution over hidden configurations to update the
weights (online, batch, or mini-batch).
• But how do we decide how much Gibbs sampling is
required to reach equilibrium?
– There is no simple test, and if we do too little sampling
there is no guarantee that the learning will work, even
if we use an infinitesimal learning rate.
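
The procedure above can be sketched as follows. This is a minimal illustration, not a definitive implementation. It assumes a model these slides do not specify: a two-layer sigmoid belief net with binary hidden units that are independent a priori (biases `b`) and binary visible units driven by the hidden units (weights `W`, biases `c`). All names are hypothetical. The number of Gibbs sweeps, `n_sweeps`, is a fixed guess, which is exactly the weakness raised in the last bullet: nothing in the code can tell whether equilibrium has been reached.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_joint(v, h, W, b, c):
    """log p(h) + log p(v | h) for the assumed two-layer sigmoid belief net."""
    eps = 1e-12                      # guards against log(0)
    p_h = sigmoid(b)                 # prior probability of each hidden unit
    p_v = sigmoid(c + h @ W)         # probability of each visible unit given h
    return (np.sum(h * np.log(p_h + eps) + (1 - h) * np.log(1 - p_h + eps))
            + np.sum(v * np.log(p_v + eps) + (1 - v) * np.log(1 - p_v + eps)))

def gibbs_sample_hidden(v, W, b, c, n_sweeps, rng):
    """Start from a random hidden configuration and run n_sweeps Gibbs sweeps."""
    n_hidden = W.shape[0]
    h = rng.integers(0, 2, size=n_hidden).astype(float)
    for _ in range(n_sweeps):
        for j in range(n_hidden):
            # Compare the joint with unit j on and off, all other units fixed.
            h[j] = 1.0
            lp_on = log_joint(v, h, W, b, c)
            h[j] = 0.0
            lp_off = log_joint(v, h, W, b, c)
            h[j] = float(rng.random() < sigmoid(lp_on - lp_off))
    return h

def learning_step(data, W, b, c, lr, n_sweeps, rng):
    """One batch update using a single posterior sample per data vector."""
    dW, db, dc = np.zeros_like(W), np.zeros_like(b), np.zeros_like(c)
    for v in data:
        h = gibbs_sample_hidden(v, W, b, c, n_sweeps, rng)
        p_v = sigmoid(c + h @ W)
        dW += np.outer(h, v - p_v)   # d log p(v, h) / dW
        db += h - sigmoid(b)         # d log p(v, h) / db
        dc += v - p_v                # d log p(v, h) / dc
    n = len(data)
    return W + lr * dW / n, b + lr * db / n, c + lr * dc / n
```

Passing one data vector at a time gives the online version; passing a small subset gives the mini-batch version.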