Lec1b

What the wake phase achieves

• The bottom-up recognition weights are used to compute a sample from the distribution Q over hidden configurations. Q approximates the true posterior, P.

  – In each layer, Q assumes the states are independent given the states in the layer below, so it ignores explaining away.

• The changes to the generative weights are designed to reduce the average cost (i.e. energy) of generating the data when the hidden configurations are sampled from the approximate posterior.

  – The updates to the generative weights follow the gradient of the variational bound with respect to the parameters of the model.
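
The two bullets above can be sketched as a single wake-phase step. This is a minimal illustration, not the lecture's own code: it assumes a sigmoid belief net with one hidden layer of binary units, and every name in it (`wake_step`, `generation_cost`, `R`, `c`, `W`, `b`, `g`, `lr`) is hypothetical. The recognition pass draws a sample from a factorial Q, and the generative parameters then take a small step that lowers the cost, -log p(v, h), of generating that sample and the data top-down.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def generation_cost(v, h, W, b, g):
    """Cost (energy) of generating h and then v top-down: -log p(v, h)."""
    p_v = sigmoid(h @ W + b)   # generative prediction for each visible unit
    p_h = sigmoid(g)           # generative prior for each hidden unit
    log_pv = np.sum(v * np.log(p_v) + (1 - v) * np.log(1 - p_v))
    log_ph = np.sum(h * np.log(p_h) + (1 - h) * np.log(1 - p_h))
    return -(log_pv + log_ph)

def wake_step(v, R, c, W, b, g, lr, rng):
    """One wake-phase update (assumed setup: a single hidden layer).

    v    : binary data vector, shape (n_vis,)
    R, c : recognition weights (n_vis, n_hid) and hidden biases
    W, b : generative weights (n_hid, n_vis) and visible biases
    g    : generative biases of the hidden units
    """
    # Bottom-up pass: Q is factorial, so each hidden unit is sampled
    # independently given the layer below (no explaining away).
    q = sigmoid(v @ R + c)
    h = (rng.random(q.shape) < q).astype(float)

    # Top-down predictions of the generative model for this sample.
    p_v = sigmoid(h @ W + b)
    p_h = sigmoid(g)

    # Delta rule: a one-sample estimate of the gradient of the
    # variational bound with respect to the generative parameters.
    W_new = W + lr * np.outer(h, v - p_v)
    b_new = b + lr * (v - p_v)
    g_new = g + lr * (h - p_h)
    return h, W_new, b_new, g_new
```

Note that the recognition weights `R` and `c` are only used here, not changed; in the wake phase only the generative parameters are updated.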