 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
• |
Re-representing
the data: Each time the base
|
|
|
learner is
called, it passes a transformed version
|
|
of the data to
the next learner.
|
|
|
|
– |
Can
we learn a deep, dense DAG one layer at
|
|
a
time, starting at the bottom, and still
|
|
|
guarantee
that learning each layer improves
|
|
|
the
overall model of the training data?
|
|
|
|
• |
This
seems very unlikely. Surely we need to know
|
|
|
the
weights in higher layers to learn lower layers?
|
|