6. The learning rule for sigmoid belief nets
Suppose we could observe the states of all the hidden units when the net was generating an observed data vector.
This is equivalent to getting samples from the posterior distribution over hidden configurations given the observed data vector.
For each node, it is easy to maximize the log probability of its observed state given the observed states of its parents.
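As a sketch of what that maximization looks like (assuming the standard sigmoid belief net parameterization, with binary states s_j, weight w_{ji} from parent j to unit i, bias b_i, and learning rate \epsilon, none of which are spelled out on the slide itself), the probability of unit i turning on is a logistic function of its parents' states, and the maximum-likelihood gradient gives a delta-rule-like update:

p_i \equiv p(s_i = 1) = \frac{1}{1 + \exp\!\left(-b_i - \sum_j s_j w_{ji}\right)}

\Delta w_{ji} = \epsilon \, s_j \, (s_i - p_i)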
[Figure: a parent unit j with a directed connection to unit i, labelled "probability of i turning on given the states of its parents".]
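A minimal NumPy sketch of this per-unit update, assuming fully observed binary parent states sampled from the posterior and the notation above; the names update_unit, s_parents, and lr are hypothetical, not from the slide:

import numpy as np

def sigmoid(x):
    # Logistic function: probability of the unit turning on.
    return 1.0 / (1.0 + np.exp(-x))

def update_unit(w, b, s_parents, s_i, lr=0.1):
    # One maximum-likelihood step for a single unit of a sigmoid belief net,
    # given the observed binary states of its parents and of the unit itself.
    #   w         : incoming weights from the parents (shape: n_parents,)
    #   b         : bias of the unit
    #   s_parents : observed binary states of the parents (shape: n_parents,)
    #   s_i       : observed binary state of the unit (0 or 1)
    #   lr        : learning rate
    p_i = sigmoid(b + s_parents @ w)          # probability of i turning on given its parents
    w_new = w + lr * s_parents * (s_i - p_i)  # delta-rule-like gradient step on the weights
    b_new = b + lr * (s_i - p_i)              # bias update has the same form
    return w_new, b_new

# Example: a unit with three parents.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)
b = 0.0
s_parents = np.array([1.0, 0.0, 1.0])  # sampled parent states
s_i = 1.0                              # sampled state of the unit itself
w, b = update_unit(w, b, s_parents, s_i)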