lec9

Full Bayesian Learning

•

Instead of trying to find the best single setting of the

parameters (as in ML or MAP) compute the full posterior

distribution over parameter settings

–

This is extremely computationally intensive for all but

the simplest models (its feasible for a biased coin).

•

To make predictions, let each different setting of the

parameters make its own prediction and then combine

all these predictions by weighting each of them by the

posterior probability of that setting of the parameters.

–

This is also computationally intensive.

•

The full Bayesian approach allows us to use complicated

models even when we do not have much data