• We want to maximize the product of the probabilities of the outputs on the training cases.
  – Assume the output errors on different training cases, c, are independent.
• Because the log function is monotonic, it does not change where the maxima are. So we can maximize sums of log probabilities.
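The equivalence between maximizing the product and maximizing the sum of logs can be checked numerically. The sketch below is only an illustration under stated assumptions: the slide does not specify a model, so a single-parameter Bernoulli model, the made-up `outcomes` list, and the grid of candidate values are all hypothetical choices.

```python
import math

def likelihood(p, outcomes):
    """Product of per-case probabilities, assuming independent cases."""
    prod = 1.0
    for d in outcomes:
        prod *= p if d == 1 else (1.0 - p)
    return prod

def log_likelihood(p, outcomes):
    """Sum of per-case log probabilities (the log of the product above)."""
    return sum(math.log(p if d == 1 else 1.0 - p) for d in outcomes)

# Hypothetical training cases: 1 = target observed, 0 = not observed.
outcomes = [1, 1, 0, 1, 0, 1, 1, 1]

# Hypothetical grid of candidate parameter values in (0, 1).
candidates = [i / 100 for i in range(1, 100)]

# Pick the candidate that maximizes each objective.
best_by_product = max(candidates, key=lambda p: likelihood(p, outcomes))
best_by_log_sum = max(candidates, key=lambda p: log_likelihood(p, outcomes))

print(best_by_product, best_by_log_sum)
```

Both searches select the same candidate, because taking the log rescales the objective without moving its maximum.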