 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
• |
Suppose we could
“observe” the
|
|
|
states of all
the hidden units
|
|
|
when the net was
generating the
|
|
|
observed data.
|
|
|
– |
E.g.
Generate randomly from
|
|
|
the
net and ignore all the
|
|
|
times
when it does not
|
|
|
generate
data in the training
|
|
|
set.
|
|
|
– |
Keep
n examples of the
|
|
|
hidden
states for each
|
|
|
datavector
in the training set.
|
|
• |
For each node,
maximize the log
|
|
probability of
its “observed” state
|
|
|
given the
observed states of its
|
|
|
parents.
|
|
|
– |
This
minimizes the energy of
|
|
|
the
complete configuration.
|
|