We usually need to decide between many different models:

- Different numbers of basis functions
- Different types of basis functions
- Different strengths of regularizers
The frequentist way to decide between models is to hold back a validation set and pick the model that does best on the validation data.

- This leaves less data for training. We can use a small validation set and evaluate each model by training many times with different small validation sets (i.e., cross-validation), but this is tedious. A concrete sketch follows below.
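As a minimal sketch of this validation-set approach, assuming scikit-learn and an illustrative polynomial-basis ridge model (the candidate degrees and regularizer strengths are not specified above and are purely for illustration):

```python
# Frequentist model selection by cross-validation: try several numbers of
# basis functions and regularizer strengths, scoring each on held-out data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(50)

best = None
for degree in (1, 3, 5, 9):          # different numbers of basis functions
    for alpha in (1e-3, 1e-1, 1.0):  # different regularizer strengths
        model = make_pipeline(PolynomialFeatures(degree), Ridge(alpha=alpha))
        # 5-fold CV: train 5 times, each time holding back a different
        # small validation set -- the "tedious" repetition described above.
        score = cross_val_score(model, X, y, cv=5,
                                scoring="neg_mean_squared_error").mean()
        if best is None or score > best[0]:
            best = (score, degree, alpha)

print("best (degree, alpha):", best[1:], "mean CV score:", best[0])
```

Each candidate model only ever sees part of the data during any one fit, which is exactly the cost the Bayesian alternative below avoids.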
The Bayesian alternative is to use all of the data for training each model and to use the evidence to pick the best model (or to average over models).

- The evidence is the marginal likelihood, with the parameters integrated out.
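In symbols, the evidence for model $\mathcal{M}_i$ with parameters $\mathbf{w}$ and data $\mathcal{D}$ is the standard marginal likelihood:

$$
p(\mathcal{D} \mid \mathcal{M}_i) = \int p(\mathcal{D} \mid \mathbf{w}, \mathcal{M}_i)\, p(\mathbf{w} \mid \mathcal{M}_i)\, d\mathbf{w}
$$

The sketch below evaluates this evidence in closed form for a linear-Gaussian model, a standard special case where the integral is tractable (the polynomial basis and the precisions alpha and beta are illustrative assumptions, not specified above):

```python
# Bayesian model comparison via the evidence: with prior w ~ N(0, alpha^-1 I)
# and Gaussian noise of precision beta, y is marginally Gaussian, so the
# marginal likelihood (parameters integrated out) has a closed form.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=50)
y = np.sin(3 * X) + 0.1 * rng.standard_normal(50)

def log_evidence(X, y, degree, alpha=1.0, beta=100.0):
    # Polynomial basis functions phi_j(x) = x**j, j = 0..degree.
    Phi = np.vander(X, degree + 1, increasing=True)
    # Marginally, y ~ N(0, beta^-1 I + alpha^-1 Phi Phi^T): the parameters
    # w have been integrated out, and all of the data is used.
    cov = np.eye(len(y)) / beta + Phi @ Phi.T / alpha
    return multivariate_normal(mean=np.zeros(len(y)), cov=cov).logpdf(y)

for degree in (1, 3, 5, 9):  # candidate models
    print("degree", degree, "log evidence", log_evidence(X, y, degree))
```

No data is held back: every candidate model is scored on the full dataset, and we would pick the model with the highest evidence (or average over the models, weighted by their evidence).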