Why the learning procedure works
Consider the squared distance between any satisfactory weight vector and the current weight vector.
Every time the perceptron makes a mistake, the learning algorithm moves the current weight vector towards all satisfactory weight vectors (unless it crosses the constraint plane).
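The effect of one update on the squared distance can be written out explicitly. This is a sketch under assumed notation that the slide does not give: $\mathbf{x}$ is the input vector of the misclassified case, $t \in \{+1, -1\}$ is its target, $\mathbf{w}$ is the current weight vector, $\mathbf{w}^*$ is any satisfactory weight vector, and the update is assumed to add $t\,\mathbf{x}$ to the weights.

```latex
\begin{aligned}
\mathbf{w}' &= \mathbf{w} + t\,\mathbf{x} \\
\|\mathbf{w}^* - \mathbf{w}'\|^2
  &= \|\mathbf{w}^* - \mathbf{w}\|^2
     - 2t\,(\mathbf{w}^* - \mathbf{w})\cdot\mathbf{x}
     + \|\mathbf{x}\|^2
\end{aligned}
```

On a mistake $t\,\mathbf{w}\cdot\mathbf{x} \le 0$, while $t\,\mathbf{w}^*\cdot\mathbf{x} > 0$ because $\mathbf{w}^*$ is satisfactory, so the middle term always pulls the weights towards $\mathbf{w}^*$. The last term pushes the other way, so the distance shrinks only when $2t\,(\mathbf{w}^* - \mathbf{w})\cdot\mathbf{x} > \|\mathbf{x}\|^2$. That condition can fail for a satisfactory weight vector lying very close to the constraint plane.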
So consider “generously satisfactory” weight vectors that lie within the feasible region by a margin at least as great as the largest update.
Every time the perceptron makes a mistake, the squared distance to all of these weight vectors decreases by at least the squared length of the smallest update vector.
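The claim can be checked numerically. The following is a minimal sketch, not the lecture's own code. It assumes targets of +1 or -1, an update that adds the target times the input vector, and a made-up data set. The names `run_perceptron`, `make_demo`, and the separating direction are hypothetical and chosen only for illustration.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def run_perceptron(data, w_star, epochs=2000):
    """Train from zero weights. For every mistake, record the pair
    (decrease in squared distance to w_star, squared update length)."""
    w = [0.0] * len(w_star)
    history = []
    for _ in range(epochs):
        mistakes = 0
        for x, t in data:
            if t * dot(w, x) <= 0:  # wrong side of (or on) the constraint plane
                before = sq_dist(w_star, w)
                w = [wi + t * xi for wi, xi in zip(w, x)]  # assumed update rule
                history.append((before - sq_dist(w_star, w), dot(x, x)))
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: done
            break
    return w, history


def make_demo(seed=0, n=40):
    """Build separable data and a generously satisfactory weight vector."""
    rng = random.Random(seed)
    direction = [2.0, -1.0, 0.5]  # hypothetical separating direction
    data = []
    while len(data) < n:
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1), 1.0]  # last input is the bias
        s = dot(direction, x)
        if abs(s) > 0.2:  # discard cases too close to the separating plane
            data.append((x, 1 if s > 0 else -1))
    # Scale the direction so its distance to every constraint plane is
    # at least the length of the largest update (with 1% slack for rounding).
    max_len = max(dot(x, x) ** 0.5 for x, _ in data)
    min_margin = min(t * dot(direction, x) / dot(x, x) ** 0.5 for x, t in data)
    scale = 1.01 * max_len / min_margin
    return data, [scale * d for d in direction]


if __name__ == "__main__":
    data, w_star = make_demo()
    w, history = run_perceptron(data, w_star)
    print("mistakes:", len(history))
    print("every decrease >= squared update length:",
          all(dec >= upd for dec, upd in history))
```

Because every mistake removes at least the squared length of its own update from a non-negative quantity, the number of mistakes is bounded, which is why the loop is expected to terminate on this separable data.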
[Figure: the two sides of the constraint plane, labeled "right" and "wrong".]