Lecture 4

An amazing fact

• If we use just the right amount of Gaussian noise, and if we let the weight vector wander around for long enough before we take a sample, we will get a sample from the true posterior over weight vectors.

  – This is called a “Markov Chain Monte Carlo” method and it makes it feasible to use full Bayesian learning with hundreds or thousands of parameters.

  – There are related MCMC methods that are more complicated but more efficient (we don’t need to let the weights wander around for so long before we get samples from the posterior).

• Radford Neal (1995) showed that this works extremely well when data is limited but the model needs to be complicated.
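
The noisy weight updates described above can be sketched in a few lines. The slide does not name a specific algorithm, so this is a minimal sketch that assumes plain (unadjusted) Langevin dynamics: each update takes half a step of size `step` down the gradient of the negative log posterior and adds Gaussian noise with variance `step`, which is one way to read "just the right amount" of noise. The toy problem (a single weight that is the mean of Gaussian data), the function names, and all parameter values are hypothetical and were chosen only so that the sampled posterior can be checked against a closed-form answer.

```python
import numpy as np

# Hypothetical toy problem (not from the lecture): infer a single weight w,
# the mean of Gaussian data, under a Gaussian prior, so the true posterior
# is known in closed form and the sampler can be checked against it.
rng = np.random.default_rng(0)
sigma, tau = 1.0, 10.0            # assumed noise std and prior std
x = rng.normal(1.5, sigma, size=20)

def grad_neg_log_post(w):
    # Gradient of -log p(w | x), where
    # -log p(w | x) = w^2 / (2 tau^2) + sum_i (x_i - w)^2 / (2 sigma^2) + const
    return w / tau**2 + (x.size * w - x.sum()) / sigma**2

def langevin_samples(grad, w0, step, n_burn, n_keep, thin, rng):
    """Noisy gradient descent: half a step downhill plus noise of variance `step`."""
    w = np.array(w0, dtype=float)
    kept = []
    for t in range(n_burn + n_keep * thin):
        noise = rng.standard_normal(w.shape)
        w = w - 0.5 * step * grad(w) + np.sqrt(step) * noise
        # Let the weights wander first (burn-in), then keep every `thin`-th state.
        if t >= n_burn and (t - n_burn) % thin == 0:
            kept.append(w.copy())
    return np.array(kept)

samples = langevin_samples(grad_neg_log_post, w0=[0.0], step=0.005,
                           n_burn=2000, n_keep=5000, thin=20, rng=rng)

# Closed-form posterior for comparison.
post_prec = 1.0 / tau**2 + x.size / sigma**2
post_mean = (x.sum() / sigma**2) / post_prec
post_std = post_prec ** -0.5

print("sampled mean, std:", samples.mean(), samples.std())
print("exact   mean, std:", post_mean, post_std)
```

With a finite step size the chain samples from a slightly biased version of the posterior. Shrinking `step` reduces that bias, but the weights then need to wander for longer between samples.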