Why learning is hard in a sigmoid belief net.
To learn the weights W, we need the posterior distribution over the units in the first hidden layer, given the data.
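For reference, the standard sigmoid belief net generative model (notation mine; the slide's figure only labels W, the prior, and the likelihood): each binary stochastic unit turns on with a logistic probability determined by its parents, and maximum-likelihood learning of W requires the posterior over the hidden units h given the visible data v:

\[
p(s_i = 1 \mid \mathrm{pa}(i)) = \sigma\Big(b_i + \sum_{j \in \mathrm{pa}(i)} s_j\, w_{ji}\Big),
\qquad \sigma(x) = \frac{1}{1 + e^{-x}},
\]
\[
p(h \mid v) = \frac{p(v \mid h)\, p(h)}{\sum_{h'} p(v \mid h')\, p(h')}.
\]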
Problem 1: The posterior is typically intractable because of “explaining away”: hidden causes that are independent in the prior become dependent once the data is observed, as the sketch below demonstrates.
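A minimal numerical demonstration of explaining away, with made-up parameters (the two-cause net and the values of b and w are mine, chosen only for illustration): the two hidden causes are independent in the prior, but once v = 1 is observed, learning that one cause is on makes the other much less likely, so the posterior does not factorize.

```python
import itertools
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

# Tiny sigmoid belief net: two binary hidden causes, one visible unit.
b_h = [-2.0, -2.0]   # both causes are rare a priori
w   = [ 5.0,  5.0]   # either cause alone easily turns v on
b_v = -4.0           # v stays off unless some cause is active

def joint(h1, h2, v):
    """p(h1) p(h2) p(v | h1, h2): the hiddens are independent in the prior."""
    p  = sigma(b_h[0]) if h1 else 1 - sigma(b_h[0])
    p *= sigma(b_h[1]) if h2 else 1 - sigma(b_h[1])
    pv = sigma(b_v + w[0] * h1 + w[1] * h2)
    return p * (pv if v else 1 - pv)

# Posterior over (h1, h2) given v = 1, by brute-force enumeration.
post = {hh: joint(*hh, v=1) for hh in itertools.product([0, 1], repeat=2)}
z = sum(post.values())
for hh in sorted(post):
    print(hh, round(post[hh] / z, 3))
# (1,0) and (0,1) each get ~0.42, but (1,1) only ~0.08: seeing that h1 = 1
# "explains away" h2, even though h1 and h2 are independent a priori.
```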
Problem 2: The posterior depends not only on the likelihood but also on the prior created by the higher layers. So to learn W we need to know the weights in all the higher layers, even if we are only approximating the posterior: all the weights interact (see the decomposition below).
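One way to write the coupling down (the layer superscripts are my notation): the prior that the first hidden layer h^{(1)} sees is obtained by summing out the layer above, so it depends on the higher-layer weights W^{(2)}, and through them on every layer further up:

\[
p\big(h^{(1)}\big) = \sum_{h^{(2)}} p\big(h^{(1)} \mid h^{(2)}; W^{(2)}\big)\, p\big(h^{(2)}\big),
\qquad
p\big(h^{(1)} \mid v\big) \propto p\big(v \mid h^{(1)}; W\big)\, p\big(h^{(1)}\big).
\]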
Problem 3: To get the prior for the first hidden layer, we need to integrate over all possible configurations of the variables in the higher layers. Yuk!
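A brute-force sketch of that sum for a net whose top layer is a set of independent Bernoulli units (the architecture, names, and indexing here are assumptions for illustration): the sum has 2**n2 terms, so it is hopeless for more than a few dozen top-level units.

```python
import itertools
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def exact_prior_h1(h1, W2, b1, b2):
    """p(h1): sum over ALL 2**n2 configurations h2 of the layer above.

    W2[j][i] connects top-level unit j to first-hidden-layer unit i;
    b2 are top-level biases (the top layer is assumed factorial).
    """
    n2, total = len(b2), 0.0
    for h2 in itertools.product([0, 1], repeat=n2):
        p = 1.0
        for j in range(n2):                 # top-level prior p(h2)
            q = sigma(b2[j])
            p *= q if h2[j] else 1 - q
        for i, hi in enumerate(h1):         # p(h1 | h2)
            a = b1[i] + sum(W2[j][i] * h2[j] for j in range(n2))
            p *= sigma(a) if hi else 1 - sigma(a)
        total += p
    return total

# Already with 20 top-level units this loop runs about a million times;
# with 100 it would need 2**100 terms. Yuk indeed.
```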
[Figure: a sigmoid belief net with several layers of hidden variables stacked above the data. The weights W connect the first hidden layer to the data and define the likelihood; the hidden layers above create the prior for the first hidden layer.]