 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
| • |
Each time we
learn a new layer, the inference at
|
|
|
the layer below
becomes incorrect, but the
|
|
|
variational
bound on the log prob of the data
|
|
|
improves
provided we start the learning from the
|
|
tied weights
that implement the complementary
|
|
|
prior.
|
|
|
| • |
Now that we have
a guarantee we can loosen
|
|
|
the restrictions
and still feel confident.
|
|
|
|
– |
Allow
layers to vary in size.
|
|
|
|
– |
Do
not start the learning at each layer from
|
|
|
the
weights in the layer below.
|
|