














• It is silly to run the Markov chain all the way to equilibrium if we can get the information required for learning in just a few steps.




– The way in which the model systematically distorts the data distribution in the first few steps tells us a lot about how the model is wrong.




– But the model could have strong modes far from any data. These modes will not be sampled by confabulations. Is this a problem in practice?
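
The points above can be illustrated with a minimal sketch of a k-step update, the scheme usually called contrastive divergence. The slide does not specify a model, so this sketch assumes a binary restricted Boltzmann machine. The function name `cd_k_gradient` and the parameters `W`, `b_vis`, `b_hid`, and `k` are hypothetical names chosen for illustration. The chain starts at the data and runs for only `k` steps. The gradient estimate compares statistics at the data with statistics at the resulting confabulations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k_gradient(v_data, W, b_vis, b_hid, k=1, rng=None):
    """Approximate gradient from a k-step chain started at the data.

    Assumes a binary RBM: v_data is (n, n_vis), W is (n_vis, n_hid).
    Returns (dW, db_vis, db_hid, v_model), where v_model holds the
    confabulations reached after k steps.
    """
    rng = np.random.default_rng(0) if rng is None else rng

    # Statistics with the visible units clamped to the data.
    h_prob_data = sigmoid(v_data @ W + b_hid)

    # Run the chain for k steps instead of waiting for equilibrium.
    h_prob = h_prob_data
    v_model = v_data
    for _ in range(k):
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_prob = sigmoid(h_sample @ W.T + b_vis)
        v_model = (rng.random(v_prob.shape) < v_prob).astype(float)
        h_prob = sigmoid(v_model @ W + b_hid)

    n = v_data.shape[0]
    # Difference between data statistics and k-step statistics: this
    # measures how the model distorts the data in the first few steps.
    dW = (v_data.T @ h_prob_data - v_model.T @ h_prob) / n
    db_vis = (v_data - v_model).mean(axis=0)
    db_hid = (h_prob_data - h_prob).mean(axis=0)
    return dW, db_vis, db_hid, v_model
```

Because the chain always starts at a data vector, a short run only explores the neighbourhood of the data. That is why modes far from any data are never visited in this sketch, which is the concern raised in the last point.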

