lec7

The effect of weight-decay

• It prevents the network from using weights that it does not need.
  – This can often improve generalization a lot.
  – It helps stop the network from fitting the sampling error.
  – It makes a smoother model in which the output changes more slowly as the input changes.

• If the network has two very similar inputs, it prefers to put half the weight on each rather than all the weight on one.

[Figure: connection weights labelled w (all on one input) and w/2 (split across both inputs)]
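
The last point can be checked numerically. The sketch below assumes weight decay takes the form of an L2 penalty (the sum of squared weights) added to a squared-error cost on a single linear unit. The names `l2_penalty`, `ridge_fit`, and `lam` are illustrative and do not come from the lecture.

```python
import numpy as np


def l2_penalty(weights):
    """Sum of squared weights: the quantity an L2 weight-decay term penalizes."""
    return float(np.sum(np.asarray(weights) ** 2))


def ridge_fit(X, y, lam):
    """Closed-form minimizer of ||X w - y||^2 + lam * ||w||^2."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Two ways to give the same total weight w = 2 to a pair of identical inputs.
concentrated = np.array([2.0, 0.0])  # all the weight on one input
split = np.array([1.0, 1.0])         # half the weight on each input

penalty_concentrated = l2_penalty(concentrated)  # w^2       = 4.0
penalty_split = l2_penalty(split)                # 2*(w/2)^2 = 2.0

# Fit a linear unit whose two inputs are exact copies of each other.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, x])                    # two identical input columns
y = 3.0 * x + 0.1 * rng.normal(size=200)       # assumed toy target

w = ridge_fit(X, y, lam=1.0)
print("penalty, all weight on one input:", penalty_concentrated)
print("penalty, weight split evenly:    ", penalty_split)
print("fitted weights:", w)
```

Both weight settings produce the same output on identical inputs, so the error term cannot tell them apart; only the penalty differs, and it is half as large when the weight is split. That is why the fitted weights come out equal. Without the penalty (`lam=0`) the problem has no unique solution, because any pair of weights with the same sum fits equally well.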