lec2b


The up-down algorithm: a contrastive divergence version of wake-sleep

• Replace the top layer of the DAG by an RBM.

    – This eliminates bad variational approximations caused by top-level units that are independent in the prior.

    – The RBM also acts as an associative memory at the top level.

• Replace the ancestral pass in the sleep phase by a top-down pass starting with the state of the RBM produced by the wake phase.

    – This makes sure the recognition weights are trained in the vicinity of the data.

    – It also reduces mode averaging. If the recognition weights prefer one mode, they will stick with that mode even if the generative weights like some other mode just as much.
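
The two changes above can be sketched in code. This is a minimal illustration under several assumptions that the slide does not state: a single hidden layer below the top-level RBM, binary stochastic units, no biases, online updates on one data vector at a time, and a simple delta rule for the directed weights. The names `up_down_step`, `rec`, `gen`, `rbm`, `lr`, and `cd_steps` are hypothetical and chosen only for this sketch.

```python
import math
import random


def probs(states, weights):
    """Sigmoid activation probabilities of the units driven by `states`."""
    return [
        1.0 / (1.0 + math.exp(-sum(s * row[j] for s, row in zip(states, weights))))
        for j in range(len(weights[0]))
    ]


def sample(p, rng):
    """Draw binary states from Bernoulli probabilities."""
    return [1.0 if rng.random() < pj else 0.0 for pj in p]


def transpose(weights):
    return [list(col) for col in zip(*weights)]


def delta_rule(weights, inputs, targets, lr):
    """Move the predictions made from `inputs` toward `targets` (in place)."""
    pred = probs(inputs, weights)
    for i, x in enumerate(inputs):
        for j, (t, p) in enumerate(zip(targets, pred)):
            weights[i][j] += lr * x * (t - p)


def up_down_step(v, rec, gen, rbm, rng, lr=0.05, cd_steps=1):
    """One up-down update for a single binary data vector v.

    rec: recognition weights (visible -> hidden)
    gen: generative weights (hidden -> visible)
    rbm: undirected weights of the top-level RBM (hidden <-> top)
    """
    # Up pass (wake): infer hidden states, then train the generative weights
    # to reconstruct the data from them.
    h = sample(probs(v, rec), rng)
    delta_rule(gen, h, v, lr)

    # Top-level RBM: a few steps of Gibbs sampling starting from the wake state.
    top = sample(probs(h, rbm), rng)
    h_neg, top_neg = h, top
    for _ in range(cd_steps):
        h_neg = sample(probs(top_neg, transpose(rbm)), rng)
        top_neg = sample(probs(h_neg, rbm), rng)
    # Contrastive divergence update: positive minus negative statistics.
    for i in range(len(rbm)):
        for j in range(len(rbm[0])):
            rbm[i][j] += lr * (h[i] * top[j] - h_neg[i] * top_neg[j])

    # Down pass (sleep): start from the RBM state left by contrastive
    # divergence instead of an ancestral sample from the prior, then train
    # the recognition weights to recover the hidden states that generated it.
    v_sleep = sample(probs(h_neg, gen), rng)
    delta_rule(rec, v_sleep, h_neg, lr)
    return v_sleep


# Tiny usage example: 6 visible, 4 hidden, and 3 top-level units.
rng = random.Random(0)
rec = [[rng.gauss(0.0, 0.01) for _ in range(4)] for _ in range(6)]
gen = [[rng.gauss(0.0, 0.01) for _ in range(6)] for _ in range(4)]
rbm = [[rng.gauss(0.0, 0.01) for _ in range(3)] for _ in range(4)]
v_sleep = up_down_step([1.0, 0.0, 1.0, 1.0, 0.0, 0.0], rec, gen, rbm, rng)
```

Because the down pass begins at `h_neg`, which is only a few Gibbs steps away from the hidden state inferred from real data, the "fantasy" `v_sleep` used to train the recognition weights stays close to the data, which is the point of the second bullet.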