NIPS 2007 Tutorial on Deep Belief Nets

• A directed module also converts its data distribution into an aggregated posterior.
  – Task 1 is now harder because the posterior for each training case is non-factorial.
  – Task 2 is performed using an independent prior. This is a bad approximation unless the aggregated posterior is close to factorial.

• A directed module attempts to make the aggregated posterior factorial in one step.
  – This is too difficult and leads to a bad compromise. There is no guarantee that the aggregated posterior is easier to model than the data distribution.

[Figure: diagram with labels, top to bottom: Task 2; aggregated posterior distribution on hidden units; Task 1; data distribution on visible units]
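The claim that an independent prior is only a good fit when the aggregated posterior is close to factorial can be checked numerically on a toy model. The sketch below is an illustration and is not taken from the tutorial. It assumes a tiny sigmoid belief net with 2 hidden and 3 visible binary units, hypothetical random weights `W` and biases `b`, and a hypothetical factorial prior `prior_on`. Departure from factorial is measured by total correlation, the KL divergence between a distribution and the product of its marginals, which is one reasonable choice among several.

```python
import itertools
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Enumerate every binary state of 2 hidden and 3 visible units.
H = np.array(list(itertools.product([0, 1], repeat=2)), dtype=float)  # (4, 2)
V = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)  # (8, 3)

# Hypothetical parameters of a toy directed module (assumed, not from the slides).
rng = np.random.default_rng(0)
W = rng.normal(scale=2.0, size=(2, 3))   # top-down generative weights
b = rng.normal(size=3)                   # visible biases
prior_on = np.array([0.3, 0.6])          # independent prior p(h_j = 1)

# Factorial prior p(h), shape (4,).
p_h = np.prod(np.where(H == 1, prior_on, 1 - prior_on), axis=1)

# Likelihood p(v | h), shape (4, 8): independent Bernoullis given h.
p_on = sigmoid(H @ W + b)                                  # (4, 3)
p_v_given_h = np.prod(
    np.where(V[None] == 1, p_on[:, None, :], 1 - p_on[:, None, :]), axis=2
)

# Exact posterior p(h | v) by Bayes' rule; column i is the posterior for V[i].
joint = p_h[:, None] * p_v_given_h                         # (4, 8)
p_v = joint.sum(axis=0)                                    # model's p(v), (8,)
posterior = joint / p_v                                    # (4, 8)

def aggregated_posterior(p_data):
    """Average the per-case posteriors under a data distribution over V."""
    return posterior @ p_data

def total_correlation(q):
    """KL(q || product of q's marginals) for a distribution over 2 binary units."""
    q2 = q.reshape(2, 2)                                   # indexed [h0, h1]
    product = np.outer(q2.sum(axis=1), q2.sum(axis=0))
    return float(np.sum(q2 * np.log(q2 / product)))

# Case A: data come from the model itself, so the aggregated posterior is the prior.
q_model = aggregated_posterior(p_v)

# Case B: some other data distribution (here, two equally likely training cases).
p_data = np.zeros(len(V))
p_data[[1, 6]] = 0.5
q_other = aggregated_posterior(p_data)

print("total correlation, data = model distribution:", total_correlation(q_model))
print("total correlation, other data distribution:  ", total_correlation(q_other))
```

When the data distribution matches the model, the aggregated posterior equals the independent prior exactly, so the total correlation is zero up to rounding. For any other data distribution there is no such guarantee, and the gap is the part of the aggregated posterior that an independent prior cannot capture.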