The bias-variance trade-off
(a figment of the frequentists' lack of imagination?)
Imagine that the training set was drawn at random from a
whole set of training sets.
The squared loss can be decomposed into a “bias” term
and a “variance” term.
Bias = systematic error in the model’s estimates
Variance = noise in the estimates caused by sampling
noise in the training set.
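As a sketch of the decomposition (the notation is added here, not from the slide): let $\hat{y}(x;D)$ be the prediction of a model trained on data set $D$, and $\bar{t}(x)$ the noise-free target. Averaging over the ensemble of training sets $D$,
\[
E_D\big[(\hat{y}(x;D) - \bar{t}(x))^2\big]
  = \underbrace{\big(E_D[\hat{y}(x;D)] - \bar{t}(x)\big)^2}_{\text{bias}^2}
  + \underbrace{E_D\big[(\hat{y}(x;D) - E_D[\hat{y}(x;D)])^2\big]}_{\text{variance}}.
\]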
There is also an additional loss because the target values
themselves are noisy.
We eliminate this extra, irreducible loss from the math
by using the average target values (i.e. the unknown,
noise-free values).
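A minimal numerical sketch of the same idea (not from the slide; function names, degrees, and constants here are illustrative): repeatedly draw training sets from a known noisy function, fit a polynomial to each, and estimate bias-squared and variance of the predictions against the noise-free targets.

    import numpy as np

    rng = np.random.default_rng(0)

    def true_f(x):
        # Noise-free target: the "average target values" from the slide.
        return np.sin(x)

    def draw_training_set(n=20, noise=0.3):
        # One training set drawn at random from the ensemble of training sets.
        x = rng.uniform(0.0, 2.0 * np.pi, size=n)
        t = true_f(x) + rng.normal(0.0, noise, size=n)
        return x, t

    def fit_and_predict(degree, x_train, t_train, x_test):
        # Fit a polynomial of the given degree and predict at the test points.
        coeffs = np.polyfit(x_train, t_train, degree)
        return np.polyval(coeffs, x_test)

    x_test = np.linspace(0.5, 5.5, 50)   # evaluation points
    n_sets = 500                          # size of the ensemble of training sets

    for degree in (1, 5):
        preds = np.array([fit_and_predict(degree, *draw_training_set(), x_test)
                          for _ in range(n_sets)])
        mean_pred = preds.mean(axis=0)
        bias_sq = np.mean((mean_pred - true_f(x_test)) ** 2)  # systematic error
        variance = np.mean(preds.var(axis=0))                  # sampling noise in estimates
        print(f"degree {degree}: bias^2 = {bias_sq:.4f}, variance = {variance:.4f}")

Because the predictions are compared against true_f (the noise-free targets) rather than noisy draws, the irreducible loss does not appear in the printed numbers; the low-degree fit shows high bias and low variance, the high-degree fit the reverse.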