Learning a deep directed network
• First learn with all the weights tied
– This is exactly equivalent to learning an RBM.
– Contrastive divergence learning is equivalent to ignoring the small derivatives contributed by the tied weights between deeper layers.
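The tied-weight learning step above is ordinary RBM training with contrastive divergence. A minimal sketch of one CD-1 update for a binary RBM, assuming NumPy (function and variable names here are illustrative, not from the slide):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One CD-1 update on a batch of binary visible vectors v0.
    W: (n_visible, n_hidden) weights; b: visible bias; c: hidden bias."""
    # Positive phase: sample hidden units given the data.
    p_h0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one step of alternating Gibbs sampling.
    p_v1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + c)
    # Approximate gradient: data statistics minus reconstruction statistics.
    W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / v0.shape[0]
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b, c

# Toy usage: 6 visible units, 3 hidden units, random binary data.
n_v, n_h = 6, 3
W = 0.01 * rng.standard_normal((n_v, n_h))
b = np.zeros(n_v)
c = np.zeros(n_h)
data = (rng.random((20, n_v)) < 0.5).astype(float)
for _ in range(100):
    W, b, c = cd1_step(data, W, b, c)
```

Because only one Gibbs step is run, the gradients that deeper tied copies of the weights would contribute are simply never computed, which is the approximation the slide describes.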
[Diagram: a deep directed net of alternating layers — etc., h2, v2, h1, v1, h0, v0 — with tied weights, shown alongside the equivalent RBM over h0 and v0.]