• When the amount of training data is limited, we need to avoid overfitting.
  – Averaging the predictions of many different networks is a good way to do this.
  – It works best if the networks are as different as possible.
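
The averaging idea can be sketched numerically. The example below is only an illustration under stated assumptions: it uses small polynomial fits in place of neural networks, a synthetic sine target, and bootstrap resampling plus varying polynomial degree as one possible way to make the members differ. None of these choices come from the slide. Because squared error is convex, the error of the averaged prediction can never exceed the average error of the individual members.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical target function, chosen only for illustration.
    return np.sin(2 * np.pi * x)

# A deliberately small, noisy training set and a clean test grid.
x_train = rng.uniform(0, 1, size=15)
y_train = true_fn(x_train) + rng.normal(0, 0.3, size=15)
x_test = np.linspace(0, 1, 200)
y_test = true_fn(x_test)

def fit_member(degree):
    # Each member sees a different bootstrap resample and has a different
    # capacity, so the members disagree with one another.
    idx = rng.integers(0, len(x_train), size=len(x_train))
    coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
    return np.polyval(coeffs, x_test)

degrees = [1, 2, 3, 4] * 5                       # 20 members in total
preds = np.stack([fit_member(d) for d in degrees])

# Test error of each member, and of the averaged prediction.
individual_mse = ((preds - y_test) ** 2).mean(axis=1)
ensemble_mse = ((preds.mean(axis=0) - y_test) ** 2).mean()

print("mean individual MSE:", individual_mse.mean())
print("ensemble MSE:       ", ensemble_mse)
```

The gap between the two printed numbers equals the average squared disagreement between the members and their mean, which is one way to read the remark that averaging works best when the models are as different as possible.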
• If the data is really a mixture of several different “regimes”, it is helpful to identify these regimes and use a separate, simple model for each regime.
  – We want to use the desired outputs to help cluster cases into regimes. Just clustering the inputs is not as efficient.
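
One way to picture clustering with the help of the desired outputs is the sketch below. It is a simplified stand-in, not necessarily the method the slides have in mind: it assumes two synthetic linear regimes whose inputs overlap completely, and it alternates between fitting one line per regime and reassigning each case to the line that predicts its desired output best. All names (`fit_lines`, `squared_residuals`) and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): two linear regimes whose
# inputs overlap completely, so the inputs alone cannot separate them.
n = 200
x = rng.uniform(-1, 1, size=n)
regime = rng.integers(0, 2, size=n)
y = np.where(regime == 0, 3 * x + 2, -3 * x - 2) + rng.normal(0, 0.1, size=n)

X = np.column_stack([x, np.ones(n)])   # columns: input, constant term

def fit_lines(assign, params):
    # Least-squares line for each regime; a regime with fewer than two
    # cases keeps its previous parameters.
    new_params = params.copy()
    for k in range(2):
        mask = assign == k
        if mask.sum() >= 2:
            new_params[k] = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    return new_params

def squared_residuals(params):
    # Shape (n, 2): squared error of each case under each regime's line.
    return (X @ params.T - y[:, None]) ** 2

assign = rng.integers(0, 2, size=n)            # random initial regimes
params = fit_lines(assign, np.zeros((2, 2)))   # rows: [slope, intercept]
losses = []
for _ in range(20):
    res = squared_residuals(params)
    assign = res.argmin(axis=1)        # regime chosen using the desired output
    losses.append(res.min(axis=1).sum())
    params = fit_lines(assign, params)

print("fitted [slope, intercept] per regime:\n", params)
print("total squared error per iteration:", np.round(losses, 2))
```

Each step of the alternation can only lower the total squared error or leave it unchanged, but the procedure may still settle in a poor local optimum, so the recovered regimes depend on the initial assignment.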