Suppose that we pick n datapoints and assign labels of + or - to them at random. If our model class (e.g. a neural net with a certain number of hidden units) is powerful enough to learn any association of labels with the data, it's too powerful!
Maybe we can characterize the power of a model class by asking how many datapoints it can shatter, i.e. learn perfectly for all possible assignments of labels.
This number of datapoints is called the Vapnik-Chervonenkis (VC) dimension, h.
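As a toy illustration (not part of the original notes), the sketch below brute-forces this definition for a very simple model class: 1-D threshold classifiers of the form sign(s * (x - t)). Some pair of points can be shattered but no triple can, so the VC dimension of this class is 2. The function names and the NumPy dependency are assumptions made for the sketch.

    import itertools
    import numpy as np

    def threshold_fits(points, labels):
        # Candidate thresholds: one below all points, one in each gap between
        # consecutive points, and one above all points.
        sorted_x = np.sort(points)
        candidates = np.concatenate([[sorted_x[0] - 1.0],
                                     (sorted_x[:-1] + sorted_x[1:]) / 2.0,
                                     [sorted_x[-1] + 1.0]])
        # Return True if some classifier sign(s * (x - t)) reproduces the labelling exactly.
        for s in (-1.0, 1.0):
            for t in candidates:
                if np.all(np.sign(s * (points - t)) == labels):
                    return True
        return False

    def shatters(points):
        # Shattering: every one of the 2^n +/- assignments can be learned perfectly.
        return all(threshold_fits(points, np.array(signs))
                   for signs in itertools.product([-1.0, 1.0], repeat=len(points)))

    print(shatters(np.array([0.0, 1.0])))       # True: this pair is shattered
    print(shatters(np.array([0.0, 1.0, 2.0])))  # False: the +,-,+ labelling cannot be learned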
The model does not need to shatter all sets of datapoints of size h. One set is sufficient.
For planes in 3-D, h = 4 even though 4 co-planar points cannot be shattered.
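To make the 3-D claim concrete, here is a small sketch (again an illustration, not part of the original notes) that tests whether planes can shatter a set of points by checking linear separability of each of the 2^n labellings with a feasibility LP. Four points in general position (the vertices of a tetrahedron) are shattered, so h >= 4, while four co-planar points fail on the XOR-style labelling of the square's diagonals. The helper names are made up, and SciPy's linprog is an assumed dependency.

    import itertools
    import numpy as np
    from scipy.optimize import linprog

    def plane_separates(points, labels):
        # Feasibility LP over (w, b): labels[i] * (w . points[i] + b) >= 1 for all i,
        # which is feasible iff some plane w . x + b = 0 separates the two label groups.
        n, d = points.shape
        A_ub = -labels[:, None] * np.hstack([points, np.ones((n, 1))])
        res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=-np.ones(n),
                      bounds=[(None, None)] * (d + 1), method="highs")
        return res.success

    def shattered_by_planes(points):
        # Planes shatter the points if every +/- labelling is linearly separable.
        return all(plane_separates(points, np.array(signs))
                   for signs in itertools.product([-1.0, 1.0], repeat=len(points)))

    tetrahedron = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    square      = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)

    print(shattered_by_planes(tetrahedron))  # True: one shattered set of size 4 gives h >= 4
    print(shattered_by_planes(square))       # False: 4 co-planar points cannot be shattered

The code only establishes h >= 4; the other direction, that no set of 5 points can be shattered by planes in 3-D (half-spaces in R^d have VC dimension d + 1), is a standard result that follows from Radon's theorem.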