• We want to learn models with multiple layers of non-linear features.
• Perceptrons: Use a layer of hand-coded, non-adaptive features followed by a layer of adaptive decision units.
  – Needs supervision signal for each training case.
  – Only one layer of adaptive weights.
• Back-propagation: Use multiple layers of adaptive features and train by backpropagating error derivatives.
  – Needs supervision signal for each training case.
  – Learning time scales poorly for deep networks.
• Support Vector Machines: Use a very large set of fixed features.
  – Needs supervision signal for each training case.
  – Does not learn multiple layers of features.
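
To make the perceptron entry above concrete, here is a minimal sketch of a fixed feature layer followed by a single layer of adaptive weights. The feature set (the two inputs, their product, and a bias), the XOR task, and all function names are illustrative assumptions, not taken from the slide. Only the weight vector is learned; the features never change, and every training case needs a target label.

```python
def hand_coded_features(x):
    """Fixed, non-adaptive feature layer (an assumed, illustrative choice)."""
    x1, x2 = x
    return [x1, x2, x1 * x2, 1.0]  # inputs, their product, and a bias term


def predict(w, x):
    """Adaptive decision unit: threshold a weighted sum of the fixed features."""
    phi = hand_coded_features(x)
    return 1 if sum(wi * fi for wi, fi in zip(w, phi)) > 0 else 0


def train_perceptron(data, epochs=100):
    """Perceptron rule: adjust the single layer of weights on each mistake."""
    w = [0.0] * 4
    for _ in range(epochs):
        mistakes = 0
        for x, target in data:  # supervised: a target for every case
            y = predict(w, x)
            if y != target:
                mistakes += 1
                phi = hand_coded_features(x)
                w = [wi + (target - y) * fi for wi, fi in zip(w, phi)]
        if mistakes == 0:  # every training case is classified correctly
            break
    return w


# XOR is not linearly separable in the raw inputs, but it is once the
# hand-coded product feature is added.
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
weights = train_perceptron(xor_data)
predictions = [predict(weights, x) for x, _ in xor_data]
```

The sketch solves XOR only because a suitable feature was supplied by hand. Back-propagation instead adapts the features themselves, while an SVM in effect relies on a very large fixed feature set.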