Lecture 1b

Faster mixing chains

• Hybrid Monte Carlo can only take small steps because the energy surface is curved.

• With a single layer of hidden units, it is possible to use alternating parallel Gibbs sampling.

  – Step 1: each student-t hidden unit picks a variance from the posterior distribution over variances given the violation produced by the current datavector. If the violation is big, it picks a big variance.

    • This is equivalent to picking a Gaussian from an infinite mixture of Gaussians (because that’s what a student-t is).

  – With the variances fixed, each hidden unit defines a one-dimensional Gaussian in the dataspace.

  – Step 2: pick a visible vector from the product of all the one-dimensional Gaussians.
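
The two alternating steps can be sketched in NumPy. This is a minimal illustration, not the lecture's own code. It assumes one common parameterization of a product of student-t experts, with energy E(x) = sum_j alpha_j * log(1 + 0.5 * (w_j . x)^2), where the "violation" of hidden unit j is w_j . x. Under that assumption each hidden unit's precision (inverse variance) has a Gamma posterior, and the product of the one-dimensional Gaussians is a zero-mean Gaussian with precision matrix W^T diag(u) W. The names W, alpha, u, and all function names are hypothetical.

```python
import numpy as np


def sample_hidden_precisions(x, W, alpha, rng):
    """Step 1: every hidden unit picks a precision u_j = 1 / variance_j.

    Assumed posterior: u_j ~ Gamma(shape=alpha_j, rate=1 + 0.5 * y_j**2),
    where y_j = w_j . x is the violation. A big violation gives a big rate,
    hence a small precision, hence a big variance.
    """
    y = W @ x
    rate = 1.0 + 0.5 * y**2
    # NumPy parameterizes the Gamma by shape and scale, with scale = 1 / rate.
    return rng.gamma(shape=alpha, scale=1.0 / rate)


def sample_visible(u, W, rng):
    """Step 2: sample x from the product of the one-dimensional Gaussians.

    With precisions u fixed, the product is N(0, (W^T diag(u) W)^-1).
    W needs full column rank so that the precision matrix is invertible.
    """
    precision = W.T @ (u[:, None] * W)
    L = np.linalg.cholesky(precision)  # precision = L @ L.T
    z = rng.standard_normal(W.shape[1])
    # x = L^-T z has covariance (L @ L.T)^-1, the inverse of the precision.
    return np.linalg.solve(L.T, z)


def gibbs_chain(x0, W, alpha, n_steps, seed=0):
    """Alternate the two steps; all hidden units update in parallel."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        u = sample_hidden_precisions(x, W, alpha, rng)
        x = sample_visible(u, W, rng)
    return x


if __name__ == "__main__":
    # Toy setup: 6 hidden units, 3 visible dimensions, random filters.
    W = np.random.default_rng(1).standard_normal((6, 3))
    print(gibbs_chain(np.ones(3), W, alpha=1.5, n_steps=50))
```

Each step is an exact draw from a conditional distribution, so the chain can move a long way across the dataspace in one update, unlike the small steps that Hybrid Monte Carlo is limited to.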