Empirical Bayes
[Equation: predictive distribution for a test case. Labeled terms:
target and input on the test case; precision of the output noise;
training data; precision of the prior.]
The equation above is the exact predictive distribution (assuming we
do not have hyperpriors for alpha and beta).
The equation below is a more tractable approximation. It works
well if the posterior distributions for alpha and beta are sharply
peaked, so that each distribution is well approximated by its most
likely value.
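For reference, the two predictive distributions contrasted here are commonly written as follows. This is a reconstruction in one standard notation, not a copy of the slide's own equations. The symbols are assumptions: \mathbf{w} for the model weights, D for the training data, \alpha for the precision of the prior, \beta for the precision of the output noise, and hats for point estimates.

```latex
% Exact: integrate over the weights and over both precisions.
p(t_{\text{test}} \mid x_{\text{test}}, D)
  = \iiint p(t_{\text{test}} \mid x_{\text{test}}, \mathbf{w}, \beta)\,
           p(\mathbf{w} \mid D, \alpha, \beta)\,
           p(\alpha, \beta \mid D)\;
           d\mathbf{w}\, d\alpha\, d\beta

% Approximate: replace the integrals over alpha and beta by point estimates.
p(t_{\text{test}} \mid x_{\text{test}}, D)
  \approx \int p(t_{\text{test}} \mid x_{\text{test}}, \mathbf{w}, \hat{\beta})\,
               p(\mathbf{w} \mid D, \hat{\alpha}, \hat{\beta})\; d\mathbf{w},
\qquad
(\hat{\alpha}, \hat{\beta}) = \arg\max_{\alpha, \beta}\; p(D \mid \alpha, \beta)
```

The approximation follows from the exact form when p(\alpha, \beta \mid D) is concentrated near (\hat{\alpha}, \hat{\beta}), so the outer two integrals collapse to evaluating the integrand at that point.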
[Equation: approximate predictive distribution. Labeled terms: point
estimates of alpha and beta that maximize the evidence.]
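A minimal sketch of how such point estimates might be computed, assuming the simplest possible setting: a one-weight linear model t = w * x + noise with a zero-mean Gaussian prior on w. The function names `fit_evidence` and `predict`, the toy data, and the choice of a fixed-point re-estimation scheme are all illustrative assumptions, not something the slide specifies.

```python
def fit_evidence(xs, ts, alpha=1.0, beta=1.0, max_iter=200, tol=1e-10):
    """Estimate alpha and beta by fixed-point evidence maximization.

    Assumed model: t = w * x + noise, with prior w ~ N(0, 1/alpha)
    and noise ~ N(0, 1/beta).
    Returns (alpha_hat, beta_hat, posterior mean of w, posterior precision of w).
    Assumes the data are not fit exactly and the posterior mean is nonzero,
    otherwise the updates below would divide by zero.
    """
    n = len(xs)
    s_xx = sum(x * x for x in xs)
    s_xt = sum(x * t for x, t in zip(xs, ts))
    for _ in range(max_iter):
        # Gaussian posterior over w for the current alpha and beta.
        post_prec = alpha + beta * s_xx
        m = beta * s_xt / post_prec
        # Effective number of parameters determined by the data (0 to 1 here).
        gamma = beta * s_xx / post_prec
        sq_err = sum((t - m * x) ** 2 for x, t in zip(xs, ts))
        # Re-estimate both precisions from the current posterior.
        new_alpha = gamma / (m * m)
        new_beta = (n - gamma) / sq_err
        converged = abs(new_alpha - alpha) < tol and abs(new_beta - beta) < tol
        alpha, beta = new_alpha, new_beta
        if converged:
            break
    post_prec = alpha + beta * s_xx
    m = beta * s_xt / post_prec
    return alpha, beta, m, post_prec


def predict(x_test, alpha, beta, m, post_prec):
    """Predictive mean and variance with alpha and beta held at point estimates."""
    mean = m * x_test
    # Output-noise variance plus the variance due to uncertainty about w.
    variance = 1.0 / beta + x_test * x_test / post_prec
    return mean, variance


# Toy data (made up for illustration): slope 2 with small alternating noise.
xs = [-2.0 + 0.5 * i for i in range(9)]
ts = [2.0 * x + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(xs)]

alpha_hat, beta_hat, m_hat, prec_hat = fit_evidence(xs, ts)
mean, variance = predict(3.0, alpha_hat, beta_hat, m_hat, prec_hat)
print(alpha_hat, beta_hat, mean, variance)
```

With alpha and beta fixed at their estimates, the remaining integral over w is Gaussian and has a closed form, which is what makes the approximation tractable.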