• By using the variational bound, we can learn sigmoid belief nets quickly.
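As a reminder, the variational bound in question is the standard lower bound on the log probability of a visible vector \(\mathbf{v}\), obtained from any recognition distribution \(Q(\mathbf{h}\mid\mathbf{v})\) over hidden states:

```latex
\log p(\mathbf{v}) \;\ge\; \mathcal{L}(Q)
\;=\; \sum_{\mathbf{h}} Q(\mathbf{h}\mid\mathbf{v})\,
\log \frac{p(\mathbf{v},\mathbf{h})}{Q(\mathbf{h}\mid\mathbf{v})}
```

Maximizing \(\mathcal{L}(Q)\) with respect to the generative parameters is tractable even when the true posterior \(p(\mathbf{h}\mid\mathbf{v})\) is not.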



• If we add bottom-up recognition connections to a generative sigmoid belief net, we get a nice neural network model that requires a wake phase and a sleep phase.




– The activation rules and the learning rules are very simple in both phases. This makes neuroscientists happy.



• But there are problems:




– The learning of the recognition weights in the sleep phase is not quite following the gradient of the variational bound.




– Even if we could follow the right gradient, the variational approximation might be so crude that it severely limits what we can learn.
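One way to make the first problem precise (this framing is an addition, not part of the slide): tightening the variational bound requires decreasing \(\mathrm{KL}\bigl(Q(\mathbf{h}\mid\mathbf{v})\,\|\,p(\mathbf{h}\mid\mathbf{v})\bigr)\), but because the sleep phase trains the recognition weights on fantasies drawn from the generative model, it effectively minimizes the KL divergence in the opposite direction:

```latex
\underbrace{\mathrm{KL}\!\bigl(p(\mathbf{h}\mid\mathbf{v})\,\|\,Q(\mathbf{h}\mid\mathbf{v})\bigr)}_{\text{what the sleep phase minimizes}}
\;\neq\;
\underbrace{\mathrm{KL}\!\bigl(Q(\mathbf{h}\mid\mathbf{v})\,\|\,p(\mathbf{h}\mid\mathbf{v})\bigr)}_{\text{what the bound requires}}
```

The two divergences agree only when \(Q\) matches the true posterior exactly, so sleep-phase learning is an approximation to the correct gradient.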



• Variational learning works because the learning tries to find regions of the parameter space in which the variational bound is fairly tight, even if this means getting a model that gives lower log probability to the data.
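This trade-off follows directly from the exact decomposition of the log probability:

```latex
\log p(\mathbf{v}) \;=\; \mathcal{L}(Q) \;+\;
\mathrm{KL}\!\bigl(Q(\mathbf{h}\mid\mathbf{v})\,\|\,p(\mathbf{h}\mid\mathbf{v})\bigr)
```

Since learning maximizes \(\mathcal{L}(Q)\), it can make progress either by raising \(\log p(\mathbf{v})\) or by shrinking the KL gap. It is therefore biased toward parameter regions where the true posterior is well approximated by \(Q\), even at some cost in the log probability assigned to the data.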

