Back-fitting
After we have learned all the layers greedily, the weights in the
lower layers will no longer be optimal: each layer was learned under
the assumption of a complementary prior, but the higher layers
learned afterwards implement a different prior. We can improve the
lower-layer weights in two ways:
- Untie the recognition weights from the generative weights and learn recognition weights that take into account the non-complementary prior implemented by the weights in higher layers.
- Improve the generative weights to take into account the non-complementary prior implemented by the weights in higher layers.
What algorithm should we use for fine-tuning the weights that are
learned greedily?
We use a contrastive version of the “wake-sleep” algorithm. It is
explained in the written paper and will not be described in the talk.
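As a rough illustration only, here is a minimal NumPy sketch of one contrastive wake-sleep update for a toy net with a single untied layer below a top-level RBM. The layer sizes, learning rate, initialization, and single-example updates are assumptions made for the sketch, and biases are omitted; the full procedure is in the written paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Draw a binary vector from element-wise Bernoulli probabilities.
    return (rng.random(p.shape) < p).astype(p.dtype)

# Assumed layer sizes: visible layer v, hidden layer h1, and a top-level
# RBM between h1 and h2 that acts as the associative memory.
n_vis, n_h1, n_h2 = 784, 500, 500

# After greedy learning the recognition and generative weights are tied
# (transposes of each other); back-fitting starts by untying them.
rec_W = 0.01 * rng.standard_normal((n_vis, n_h1))  # recognition: v -> h1
gen_W = rec_W.T.copy()                             # generative:  h1 -> v
top_W = 0.01 * rng.standard_normal((n_h1, n_h2))   # top-level RBM, stays tied

def up_down_step(v, lr=0.01):
    """One contrastive wake-sleep update for a single binary vector v."""
    global rec_W, gen_W, top_W

    # Wake (up) pass: the recognition weights pick the hidden states, and
    # the generative weights learn to reconstruct the layer below them.
    h1 = sample(sigmoid(v @ rec_W))
    gen_W += lr * np.outer(h1, v - sigmoid(h1 @ gen_W))

    # Top level: one step of contrastive divergence in the h1-h2 RBM,
    # which replaces the sleep phase's sample from the equilibrium prior.
    h2_p = sigmoid(h1 @ top_W)
    h1_neg = sample(sigmoid(sample(h2_p) @ top_W.T))
    top_W += lr * (np.outer(h1, h2_p)
                   - np.outer(h1_neg, sigmoid(h1_neg @ top_W)))

    # Sleep (down) pass: generate a fantasy from the top-level state and
    # train the recognition weights to recover the state that produced it.
    v_fantasy = sample(sigmoid(h1_neg @ gen_W))
    rec_W += lr * np.outer(v_fantasy, h1_neg - sigmoid(v_fantasy @ rec_W))

# Usage with a stand-in data vector (a real run would loop over training data):
v = (rng.random(n_vis) < 0.5).astype(float)
up_down_step(v)
```

The sketch shows the structure named in the bullets above: the wake pass trains the untied generative weights, the sleep pass trains the untied recognition weights, and a contrastive-divergence step at the top level supplies the fantasy state.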