• The obvious approach is to write down an error function and try to show that each step of the learning procedure reduces the error.
  – For stochastic online learning we would like to show that each step reduces the expected error, where the expectation is across the choice of training cases.
  – It cannot be a squared error because the size of the update does not depend on the size of the mistake.
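The last point can be seen in a short sketch (Python; the weight and input vectors are invented for illustration, but the update rule is the standard perceptron procedure): two weight vectors that are wrong by very different amounts on the same case receive exactly the same weight change.

```python
import numpy as np

def perceptron_update(w, x, target):
    """One step of the perceptron learning procedure for a binary
    threshold unit: add the input vector on a false negative,
    subtract it on a false positive."""
    predicted = 1 if w @ x >= 0 else 0
    if predicted == target:
        return w                               # correct cases leave w unchanged
    return w + x if target == 1 else w - x

x = np.array([1.0, 2.0])
w_slightly_wrong = np.array([-0.1, 0.0])       # w @ x = -0.1: barely wrong
w_badly_wrong = np.array([-100.0, 0.0])        # w @ x = -100: very wrong
d1 = perceptron_update(w_slightly_wrong, x, 1) - w_slightly_wrong
d2 = perceptron_update(w_badly_wrong, x, 1) - w_badly_wrong
print(d1, d2)  # both updates are exactly +x, whatever the size of the mistake
```

Since the change is always ±x, the squared error on a case can grow or shrink by an amount unrelated to how wrong the unit was, which is why a squared-error argument cannot work.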
• The textbook tries to use the sum of the distances on the wrong side of the decision surface as an error measure.
  – Its conclusion is that the perceptron convergence procedure is not guaranteed to reduce the total error at each step.
• This is true for that error function even if there is a set of weights that gets the right answer for every training case.
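A tiny numerical sketch (Python; the two training cases are invented for illustration) shows the textbook's point: the data below are linearly separable, yet a single perceptron update on a misclassified case drives the total wrong-side distance up, not down.

```python
import numpy as np

def wrong_side_error(w, cases):
    """Sum of the distances of misclassified cases from the
    decision surface w @ x = 0 (the textbook's error measure)."""
    err = 0.0
    for x, t in cases:
        predicted = 1 if w @ x >= 0 else 0
        if predicted != t:
            err += abs(w @ x) / np.linalg.norm(w)
    return err

cases = [(np.array([-0.1, 2.0]), 1),   # case A: misclassified by a hair
         (np.array([1.0, -1.0]), 1)]   # case B: currently correct
w = np.array([1.0, 0.0])

before = wrong_side_error(w, cases)    # only A is wrong, by distance 0.1
w = w + cases[0][0]                    # perceptron update on case A
after = wrong_side_error(w, cases)     # A is fixed, but B is now wrong by more
print(before, after)                   # the total error has increased
```

The step fixes case A but pushes case B onto the wrong side by a larger distance, so the total error rises. With w = (1, 1) both cases are classified correctly, so a perfect weight vector exists even though this step raised the error.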