Lec1b

The flaws in the wake-sleep algorithm

• The recognition weights are trained to invert the generative model in parts of the space where there is no data.
  – This is wasteful.

• The recognition weights follow the gradient of the wrong divergence. They minimize KL(P||Q) but the variational bound requires minimization of KL(Q||P).
  – This leads to incorrect mode-averaging.
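The mode-averaging effect can be seen in a small numerical sketch. Everything in it is an illustrative assumption rather than part of the lecture: P is taken to be a made-up posterior over two binary hidden units with two well-separated modes, Q is a factorial distribution of the kind a recognition network produces, and the best Q is found by brute-force grid search instead of gradient learning. The names `factorial_q`, `kl`, `best_pq` and `best_qp` are hypothetical.

```python
import math
from itertools import product

# Assumed toy posterior P over two binary hidden units (h1, h2).
# It has two modes, (1, 0) and (0, 1), and very little mass elsewhere.
P = {(0, 0): 0.01, (0, 1): 0.49, (1, 0): 0.49, (1, 1): 0.01}

def factorial_q(q1, q2):
    """Factorial distribution Q(h1, h2) = Q(h1) * Q(h2), with Q(hi = 1) = qi."""
    return {(h1, h2): (q1 if h1 else 1 - q1) * (q2 if h2 else 1 - q2)
            for h1, h2 in product((0, 1), repeat=2)}

def kl(a, b):
    """KL(a || b) for discrete distributions on the same states (0 log 0 = 0)."""
    return sum(a[s] * math.log(a[s] / b[s]) for s in a if a[s] > 0)

# Brute-force search over q1, q2 in {0.01, ..., 0.99}.
grid = [i / 100 for i in range(1, 100)]
candidates = [(q1, q2) for q1 in grid for q2 in grid]

# Divergence followed by the recognition weights, as stated on the slide.
best_pq = min(candidates, key=lambda q: kl(P, factorial_q(*q)))
# Divergence that the variational bound requires.
best_qp = min(candidates, key=lambda q: kl(factorial_q(*q), P))

def off_mode_mass(q):
    """Probability Q assigns to (0, 0) and (1, 1), where P has almost no mass."""
    dist = factorial_q(*q)
    return dist[(0, 0)] + dist[(1, 1)]

print("argmin KL(P||Q):", best_pq, "off-mode mass:", off_mode_mass(best_pq))
print("argmin KL(Q||P):", best_qp, "off-mode mass:", off_mode_mass(best_qp))
```

In this toy setting, minimizing KL(P||Q) makes Q match the marginals of P, so q1 = q2 = 0.5 and half of Q's mass falls on the two states that P considers very unlikely: the modes are averaged. Minimizing KL(Q||P) instead concentrates Q on one of the two modes. Which mode it picks is arbitrary here, since the assumed P is symmetric.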