Limiting the size of the weights
• Weight-decay involves adding an extra term to the cost function that penalizes the squared weights.
  – Keeps weights small unless they have big error derivatives.
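
The penalty described above can be written out as a short derivation. This is a sketch assuming the standard L2 form: the symbols E (the unpenalized error) and λ (the penalty strength) are notation introduced here for illustration, while C is the total cost and w_i the individual weights.

```latex
\begin{align}
C &= E + \frac{\lambda}{2} \sum_i w_i^2 \\
\frac{\partial C}{\partial w_i} &= \frac{\partial E}{\partial w_i} + \lambda w_i \\
\frac{\partial C}{\partial w_i} = 0 \;&\Longrightarrow\; w_i = -\frac{1}{\lambda}\,\frac{\partial E}{\partial w_i}
\end{align}
```

Under these assumptions, a weight at a minimum of C is proportional to its error derivative, so it can only stay large if that derivative is large.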
[Figure: plot with axes C (cost) and w (weight)]
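
To make the effect of the penalty concrete, here is a minimal sketch in plain Python. The function names, the toy values, and the use of fixed error derivatives are all hypothetical choices made for illustration; in a real network the error derivatives would change as the weights change.

```python
def penalized_cost(error, weights, lam):
    """Total cost: error plus (lam / 2) times the sum of squared weights."""
    return error + 0.5 * lam * sum(w * w for w in weights)


def weight_decay_step(weights, error_grads, lam, lr):
    """One gradient-descent step on the penalized cost.

    The gradient of the penalized cost with respect to each weight is
    the error derivative plus lam times the weight itself.
    """
    return [w - lr * (g + lam * w) for w, g in zip(weights, error_grads)]


# Hypothetical toy setup: two weights that start at the same value.
# The error derivatives are held fixed (an assumption made only to
# keep the example small).
weights = [2.0, 2.0]
error_grads = [0.0, -0.5]  # first weight gets no error signal, second does
lam, lr = 0.1, 0.5

for _ in range(500):
    weights = weight_decay_step(weights, error_grads, lam, lr)

print(weights)
```

In this toy run the weight with no error derivative decays toward zero, while the weight with a persistent error derivative settles near the value where the penalty gradient and the error gradient cancel (here, 0.5 / 0.1 = 5).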