The probabilistic guarantee
where N = size of training set
†††††††††† h = VC dimension of the model class
†††††††††† p = upper bound on probability that this bound fails
††† So if we train models with different complexity, we
should pick the one that minimizes this bound
††† Actually, this is only sensible if we think the bound is
fairly tight, which it usually isnít. The theory provides
insight, but in practice we still need some witchcraft.