Intuitive motivation
• It is silly to run the Markov chain all the way to equilibrium if we can get the information required for learning in just a few steps.
– The way in which the model systematically distorts the data distribution in the first few steps tells us a lot about how the model is wrong.
– But the model could have strong modes far from any data. Confabulations (samples produced by running the chain for only a few steps starting at the data) will not sample these modes. Is this a problem in practice?
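
The idea of stopping the chain after a few steps can be sketched as follows. This is a minimal illustration, assuming a binary restricted Boltzmann machine with alternating Gibbs sampling; the slide does not name a specific model, and the names `cd_k_gradient`, `W`, `b_vis`, `b_hid` and `k` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k_gradient(v_data, W, b_vis, b_hid, k=1, rng=None):
    """Approximate weight gradient from a chain run for only k steps.

    Assumes a binary RBM (an assumption, not stated in the slide).
    v_data: (n, n_vis) binary data; W: (n_vis, n_hid) weights.
    Returns the weight update direction and the confabulations.
    """
    rng = np.random.default_rng(0) if rng is None else rng

    # Statistics with the visible units clamped to the data.
    p_h_data = sigmoid(v_data @ W + b_hid)

    # Start the chain at the data and run k Gibbs steps, not to equilibrium.
    v, p_h = v_data, p_h_data
    for _ in range(k):
        h = (rng.random(p_h.shape) < p_h).astype(float)  # sample hidden units
        p_v = sigmoid(h @ W.T + b_vis)
        v = (rng.random(p_v.shape) < p_v).astype(float)  # confabulation
        p_h = sigmoid(v @ W + b_hid)

    # Difference between data statistics and confabulation statistics:
    # how the model distorts the data in the first k steps.
    n = v_data.shape[0]
    dW = (v_data.T @ p_h_data - v.T @ p_h) / n
    return dW, v
```

Because the chain starts at the data and takes only `k` steps, the confabulations stay near the data, so a mode far from every training case would contribute nothing to `dW` in this sketch.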