• It prevents the network from using weights that it does not need.
  – This can often improve generalization a lot.
  – It helps to stop it from fitting the sampling error.
  – It makes a smoother model in which the output changes more slowly as the input changes.
• If the network has two very similar inputs, it prefers to put half the weight on each rather than all the weight on one.
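The preference for splitting weight across similar inputs can be checked with a little arithmetic. The sketch below assumes the cost being discussed is a squared (L2) weight penalty, which this excerpt does not name explicitly; the function `l2_penalty`, the coefficient `lam`, and the numbers are illustrative only.

```python
def l2_penalty(weights, lam=1.0):
    """Assumed squared-weight cost: (lam / 2) * sum of w_i^2."""
    return 0.5 * lam * sum(w * w for w in weights)


w = 2.0   # total weight needed on the (duplicated) input
x = 3.0   # value shared by the two identical inputs

# Option A: all the weight on one input, none on the other.
out_one = w * x + 0.0 * x
cost_one = l2_penalty([w, 0.0])

# Option B: half the weight on each input.
out_split = (w / 2) * x + (w / 2) * x
cost_split = l2_penalty([w / 2, w / 2])

print("outputs:", out_one, out_split)
print("penalties:", cost_one, cost_split)
```

Both options give the same output when the inputs are identical, but under this assumed penalty the split costs half as much, since 2 · (w/2)² = w²/2 < w².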