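The bound itself did not survive extraction. What follows is a reconstruction under the assumption that the standard VC generalization bound (Vapnik) is meant; the symbols E_train and E_test (training and test error) are notation added here, not from the source:

\[
E_{\text{test}} \;\le\; E_{\text{train}} \;+\; \sqrt{\frac{h\left(\ln\frac{2N}{h} + 1\right) \;-\; \ln\frac{p}{4}}{N}}
\]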
where N = size of training set
      h = VC dimension of the model class
      p = upper bound on probability that this bound fails
So if we train models with different complexity, we should pick the one that minimizes this bound.
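As a concrete sketch of that selection rule (a hypothetical illustration, not code from the lecture; the candidate training errors and VC dimensions below are made up, and the bound formula is the reconstructed one above):

import math

def vc_bound(train_error, n, h, p=0.05):
    # Standard VC generalization bound: training error plus a
    # complexity penalty that grows with h and shrinks with n.
    penalty = math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(p / 4)) / n)
    return train_error + penalty

# Hypothetical candidates: (training error, VC dimension) for models
# of increasing complexity, all trained on n = 10,000 examples.
candidates = [(0.20, 5), (0.10, 50), (0.02, 500)]
n = 10_000
best = min(candidates, key=lambda c: vc_bound(c[0], n, c[1]))
print("pick the model with h =", best[1])

In this made-up example the penalty term dominates for the larger models, which previews the caveat below.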
Actually, this is only sensible if we think the bound is fairly tight, which it usually isn't. The theory provides insight, but in practice we still need some witchcraft.
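To get a rough sense of how loose it is, a back-of-the-envelope check (numbers chosen purely for illustration, not from the lecture): with N = 10,000 and p = 0.05, the complexity term alone is

\[
\sqrt{\frac{h\left(\ln\frac{2N}{h}+1\right) - \ln\frac{p}{4}}{N}}
\;\approx\;
\begin{cases}
0.48 & \text{for } h = 500,\\
1.09 & \text{for } h = 5000,
\end{cases}
\]

so for any reasonably flexible model class the bound quickly exceeds 1 and guarantees nothing.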