 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
| • |
Divide the total
dataset into three subsets:
|
|
|
|
– |
Training
data is used for learning
the
|
|
|
parameters
of the model.
|
|
|
|
– |
Validation
data is not used of learning
but is
|
|
|
used
for deciding what type of model and
|
|
|
what
amount of regularization works best.
|
|
|
|
– |
Test
data is used to get a final,
unbiased
|
|
|
estimate
of how well the network works. We
|
|
|
expect
this estimate to be worse than on the
|
|
|
validation
data.
|
|
|
| • |
We could then
re-divide the total dataset to get
|
|
|
another unbiased
estimate of the true error rate.
|