• Weight-decay reduces the effect of noise in the inputs.
  – The noise variance is amplified by the squared weight.
• The amplified noise makes an additive contribution to the squared error.
  – So minimizing the squared error tends to minimize the squared weights when the inputs are noisy.
• It gets more complicated for non-linear networks.
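The points above can be checked numerically for the simplest case. The sketch below assumes a single linear output unit y = sum_i w_i x_i with independent zero-mean Gaussian noise of variance sigma_i^2 added to each input. Under that assumption the expected squared error is (y - t)^2 + sum_i w_i^2 sigma_i^2, so the noise acts like a penalty on the squared weights. The function names and the example values of w, x, t and sigma are illustrative and do not come from the slide.

```python
import random

def noisy_squared_error(w, x, t, sigma, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[(y_noisy - t)^2] for a linear unit.

    Assumption: each input x_i receives independent Gaussian noise
    with mean 0 and standard deviation sigma_i.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y_noisy = sum(
            wi * (xi + rng.gauss(0.0, si))
            for wi, xi, si in zip(w, x, sigma)
        )
        total += (y_noisy - t) ** 2
    return total / n_samples

def predicted_squared_error(w, x, t, sigma):
    """Noise-free squared error plus the penalty sum_i w_i^2 * sigma_i^2."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    penalty = sum((wi * si) ** 2 for wi, si in zip(w, sigma))
    return (y - t) ** 2 + penalty

# Illustrative values (hypothetical, chosen only for the demonstration).
w = [0.5, -1.5, 2.0]
x = [1.0, 2.0, -0.5]
t = 0.3
sigma = [0.1, 0.2, 0.3]

estimated = noisy_squared_error(w, x, t, sigma)
predicted = predicted_squared_error(w, x, t, sigma)
print(f"Monte Carlo estimate: {estimated:.4f}")
print(f"Closed-form value:    {predicted:.4f}")
```

The two printed values should agree up to sampling error. Because each input's noise variance is multiplied by w_i^2 before it reaches the output, shrinking the weights shrinks the extra error term, which is the same effect an L2 weight-decay penalty has. This equivalence is exact only for the linear case sketched here.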