• Consider the squared distance between any satisfactory weight vector and the current weight vector.
  – Every time the perceptron makes a mistake, the learning algorithm reduces the squared distance between the current weight vector and any satisfactory weight vector (unless that satisfactory vector lies just on the other side of the constraint plane; the algebra below makes this caveat precise).
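
To see both why that caveat is needed and why the margin in the next bullet removes it, write out one update (the notation here is ours, not the slide's): let w be the current weight vector, w* any satisfactory one, x the input on which the perceptron errs, and t = ±1 the correct answer, so a mistake means t(w · x) ≤ 0 and the update is w ← w + t x. Then

\[
\|w^* - (w + t\,x)\|^2
  = \|w^* - w\|^2 - 2t\,(w^* \cdot x) + 2t\,(w \cdot x) + \|x\|^2
  \le \|w^* - w\|^2 - 2t\,(w^* \cdot x) + \|x\|^2 .
\]

Being satisfactory only guarantees t(w* · x) > 0, yet the squared distance is guaranteed to shrink only when t(w* · x) > ||x||² / 2, so a satisfactory vector sitting just beyond the constraint plane breaks the claim. Demanding t(w* · x) ≥ ||x||² instead bounds the change by −||x||², which is exactly what the next bullet arranges.
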
• So consider “generously satisfactory” weight vectors that lie within the feasible region by a margin at least as great as the length of the largest update vector.
  – Every time the perceptron makes a mistake, the squared distance to every one of these weight vectors decreases by at least the squared length of the smallest update vector (see the sketch below).
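
The guarantee in the last sub-bullet is easy to check numerically. Here is a minimal numpy sketch (the toy data, the unit separator u, and the scaling that manufactures a generously satisfactory w_star are our own illustration, not from the slide): it runs perceptron learning and asserts that every mistake shrinks the squared distance to w_star by at least the squared length of that update.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy separable data: label each input by the sign of a unit vector u.
    d, n = 5, 200
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    X = rng.normal(size=(n, d))
    X = X[np.abs(X @ u) > 0.5]       # keep points clearly off the plane
    y = np.sign(X @ u)

    # Manufacture a "generously satisfactory" vector: scale u so that for
    # every example, y * (w_star . x) >= ||x||^2, i.e. w_star clears each
    # constraint plane by at least the length of the update it can cause.
    c = np.max(np.sum(X**2, axis=1) / (y * (X @ u)))
    w_star = c * u

    w = np.zeros(d)                  # current weight vector
    dist2 = np.sum((w_star - w) ** 2)
    for epoch in range(10_000):
        mistakes = 0
        for x, t in zip(X, y):
            if t * (w @ x) <= 0:     # mistake: add the input, correctly signed
                w = w + t * x
                new_dist2 = np.sum((w_star - w) ** 2)
                # The claim: each mistake shrinks the squared distance to
                # w_star by at least the squared length of the update vector.
                assert new_dist2 <= dist2 - np.sum(x**2) + 1e-9
                dist2 = new_dist2
                mistakes += 1
        if mistakes == 0:
            break

    print(f"converged after {epoch} epochs; final squared distance {dist2:.2f}")

Since the squared distance starts at ||w_star||² and can never drop below zero, the per-mistake decrease also yields the convergence argument: the perceptron can make only a finite number of mistakes.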