The shortcut
Instead of taking the negative samples from the
equilibrium distribution, use slight corruptions of
the datavectors.
Much less variance, because a datavector and
its confabulation (its slightly corrupted version)
form a matched pair.
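
As a rough illustration of the shortcut, here is a minimal sketch that assumes the model is a bias-free binary restricted Boltzmann machine and that the "slight corruption" is one step of alternating Gibbs sampling started at the data. Neither assumption is stated on this slide, and the names cd1_update, W, v_data and lr are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, v_data, rng, lr=0.1):
    """One shortcut update for a bias-free binary RBM (illustrative assumption)."""
    # Positive phase: hidden probabilities driven by the datavectors.
    h_prob = sigmoid(v_data @ W)
    h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
    # Confabulation: reconstruct the visibles once from the sampled hiddens,
    # giving a slight corruption of each datavector.
    v_prob = sigmoid(h_sample @ W.T)
    v_confab = (rng.random(v_prob.shape) < v_prob).astype(float)
    h_confab = sigmoid(v_confab @ W)
    # Matched pairs: row k of v_confab was generated from row k of v_data,
    # so the two statistics share most of their sampling noise.
    pos = v_data.T @ h_prob
    neg = v_confab.T @ h_confab
    W_new = W + lr * (pos - neg) / v_data.shape[0]
    return W_new, v_confab

rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((6, 3))        # 6 visible units, 3 hidden units
v = (rng.random((8, 6)) < 0.5).astype(float)  # 8 toy binary datavectors
W, confabulations = cd1_update(W, v, rng)
```

The negative statistics come from the confabulations rather than from a chain run to equilibrium, which is the whole point of the shortcut.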
The learning signal seems to be very biased, but
maybe it is optimizing a different objective function.
What about regions far from the data that have
high density under the model?
If the model is perfect and there is an infinite
amount of data, the confabulations will be
equilibrium samples. So the shortcut will not cause
learning to mess up a perfect model.
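
The last point can be checked exactly on a toy case. The sketch below again assumes a tiny bias-free binary RBM (an assumption, not something this slide specifies) and enumerates every state, so nothing is sampled. The helper names all_binary, model_distribution and expected_shortcut_update are hypothetical.

```python
import itertools
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def all_binary(n):
    # Every binary vector of length n, one per row.
    return np.array(list(itertools.product([0.0, 1.0], repeat=n)))

def model_distribution(W):
    # Exact p(v) for the energy E(v, h) = -v^T W h, by summing over all h.
    V, H = all_binary(W.shape[0]), all_binary(W.shape[1])
    unnorm = np.exp(V @ W @ H.T).sum(axis=1)
    return unnorm / unnorm.sum()

def expected_shortcut_update(W, p_data):
    # Exact expectation of (data statistics - confabulation statistics).
    V, H = all_binary(W.shape[0]), all_binary(W.shape[1])
    ph_v = sigmoid(V @ W)    # p(h_j = 1 | v) for every v
    pv_h = sigmoid(H @ W.T)  # p(v_i = 1 | h) for every h
    # Full conditional tables p(h | v) and p(v | h) as products over units.
    P_h_v = np.prod(np.where(H[None] == 1, ph_v[:, None], 1 - ph_v[:, None]), axis=2)
    P_v_h = np.prod(np.where(V[None] == 1, pv_h[:, None], 1 - pv_h[:, None]), axis=2)
    # Distribution of confabulations after one up-down pass from the data.
    p_confab = p_data @ P_h_v @ P_v_h
    pos = (p_data[:, None] * V).T @ ph_v
    neg = (p_confab[:, None] * V).T @ ph_v
    return pos - neg

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))
p_perfect = model_distribution(W)  # data distribution equals the model's
update = expected_shortcut_update(W, p_perfect)
print(np.abs(update).max())        # zero up to floating-point rounding
```

When the data distribution equals the model's distribution, one up-down pass leaves that distribution unchanged, so the confabulations are equilibrium samples and the expected update vanishes. With a mismatched p_data the difference is in general nonzero.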