 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
• |
Each time we
learn a new layer, the inference at the
|
|
|
layer below
becomes incorrect, but the variational bound
|
|
on the log prob
of the data improves.
|
|
|
• |
Since the bound
starts as an equality, learning a new
|
|
|
layer never
decreases the log prob of the data, provided
|
|
|
we start the
learning from the tied weights that
|
|
|
implement the
complementary prior.
|
|
|
• |
Now that we have
a guarantee we can loosen the
|
|
|
restrictions and
still feel confident.
|
|
|
|
– |
Allow
layers to vary in size.
|
|
|
|
– |
Do
not start the learning at each layer from the
|
|
|
weights
in the layer below.
|
|