lec15


The probability distribution that is implicitly assumed when using squared error

• Minimizing the squared residuals is equivalent to maximizing the log probability of the correct answers under a Gaussian centered at the model’s guess.

   – If we assume that the variance of the Gaussian is the same for all cases, its value does not matter.
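
The equivalence can be checked with a short derivation. This is a sketch, writing d for the correct answer and y for the model's prediction as on this slide; the variance symbol \sigma^2, the case index c, and the number of cases N are notation assumed here, not taken from the slide.

```latex
% Gaussian centered at the model's guess y, with variance \sigma^2
p(d \mid y) = \frac{1}{\sqrt{2\pi}\,\sigma}
              \exp\!\left(-\frac{(d - y)^2}{2\sigma^2}\right)

% Negative log probability of one correct answer
-\log p(d \mid y) = \frac{(d - y)^2}{2\sigma^2} + \log\!\left(\sqrt{2\pi}\,\sigma\right)

% Summed over N cases that share the same \sigma
-\sum_{c=1}^{N} \log p(d_c \mid y_c)
  = \frac{1}{2\sigma^2} \sum_{c=1}^{N} (d_c - y_c)^2
    + N \log\!\left(\sqrt{2\pi}\,\sigma\right)
```

With a shared \sigma, the last term is a constant and the first term is the sum of squared residuals scaled by a positive factor, so the predictions that minimize the squared residuals also maximize the log probability, whatever value \sigma takes.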


d = correct answer
y = model’s prediction
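
The claim can also be checked numerically. The following is a minimal sketch under stated assumptions: the one-parameter model y = w * x, the toy data, the grid of candidate weights, and all function names are hypothetical and chosen only for illustration. It searches the same grid for the weight that minimizes the squared residuals and for the weight that maximizes the Gaussian log probability of the correct answers, using several values of sigma.

```python
import math

def sum_squared_error(w, xs, ds):
    # Sum of squared residuals between correct answers d and predictions y = w * x.
    return sum((d - w * x) ** 2 for x, d in zip(xs, ds))

def gaussian_log_likelihood(w, xs, ds, sigma):
    # Log probability of the correct answers under Gaussians centered at
    # the predictions y = w * x, all sharing the same standard deviation.
    const = -math.log(math.sqrt(2 * math.pi) * sigma)
    return sum(const - (d - w * x) ** 2 / (2 * sigma ** 2)
               for x, d in zip(xs, ds))

# Toy data (assumed for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ds = [0.1, 0.9, 2.2, 2.8]

# Candidate weights on a coarse grid from 0.00 to 2.00.
candidates = [i / 100 for i in range(201)]

# Weight that minimizes the squared residuals.
best_sse = min(candidates, key=lambda w: sum_squared_error(w, xs, ds))

# Weight that maximizes the log likelihood, for several shared variances.
best_ll = {
    sigma: max(candidates,
               key=lambda w: gaussian_log_likelihood(w, xs, ds, sigma))
    for sigma in (0.5, 1.0, 3.0)
}

print("argmin squared error:", best_sse)
print("argmax log likelihood by sigma:", best_ll)
```

Changing sigma changes the value of the log likelihood but not which weight wins, which is the sense in which the variance does not matter.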