What can we do if there are too many
parameters for a grid to be feasible?
The number of grid points is exponential in the number
of parameters.
So we cannot deal with more than a few parameters
using a grid.
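As a rough illustration of the exponential growth, the sketch below counts grid points for a few parameter counts. The choice of 9 candidate values per parameter is an assumption made only for this example.

```python
def grid_size(num_params: int, values_per_param: int) -> int:
    """Number of points in a full grid over all parameters."""
    return values_per_param ** num_params

# Assumed for illustration: 9 candidate values per parameter.
for num_params in (2, 5, 10, 20):
    print(num_params, "parameters ->", grid_size(num_params, 9), "grid points")
```

Each added parameter multiplies the number of grid points by the number of values per parameter.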
If there is enough data to make most parameter vectors
very unlikely, only a tiny fraction of the grid points
make a significant contribution to the predictions.
Maybe we can just evaluate this tiny fraction.
It might be good enough to just sample weight vectors
according to their posterior probabilities.
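A minimal sketch of this idea on a toy problem. Everything here is assumed for illustration: a logistic model with one weight and one bias, 50 synthetic data points, a uniform prior, and a 41 x 41 grid. The grid is kept small only so that the sample-based prediction can be checked against the exact posterior-weighted sum; with many parameters the full grid could not be enumerated, and a different way of drawing the samples would be needed.

```python
import math
import random

random.seed(0)

def predict(w, b, x):
    """Logistic output for input x under the weight vector (w, b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# Hypothetical toy data generated from w = 2.0, b = -0.5.
xs = [random.gauss(0.0, 1.0) for _ in range(50)]
ys = [1.0 if random.random() < predict(2.0, -0.5, x) else 0.0 for x in xs]

# Grid of candidate weight vectors (uniform prior assumed).
vals = [-4.0 + 0.2 * i for i in range(41)]
grid = [(w, b) for w in vals for b in vals]

def log_likelihood(w, b):
    """Log probability of the observed labels under (w, b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(predict(w, b, x), 1e-12), 1.0 - 1e-12)
        total += math.log(p) if y == 1.0 else math.log(1.0 - p)
    return total

# Posterior over grid points: normalized likelihood (prior is uniform).
log_liks = [log_likelihood(w, b) for w, b in grid]
max_ll = max(log_liks)
unnorm = [math.exp(ll - max_ll) for ll in log_liks]
z = sum(unnorm)
posterior = [u / z for u in unnorm]

# Exact prediction: sum over every grid point, weighted by its posterior.
x_test = 0.3
exact = sum(p * predict(w, b, x_test) for p, (w, b) in zip(posterior, grid))

# Sampled prediction: draw weight vectors with their posterior
# probabilities and average their predictions with equal weight.
samples = random.choices(grid, weights=posterior, k=200)
estimate = sum(predict(w, b, x_test) for w, b in samples) / len(samples)

# How many grid points are needed to cover 99% of the posterior mass?
mass, needed = 0.0, 0
for p in sorted(posterior, reverse=True):
    mass += p
    needed += 1
    if mass >= 0.99:
        break

print("exact prediction:  ", exact)
print("sampled prediction:", estimate)
print("grid points holding 99% of the mass:", needed, "of", len(grid))
```

The sampled weight vectors are averaged with equal weight because the posterior probabilities are already accounted for by how often each vector is drawn.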
[Figure label: sample weight vectors with this probability]