Why the learning procedure works
Consider the squared distance between any satisfactory weight vector (one that gets every training case right) and the current weight vector.
Every time the perceptron makes a mistake, the learning algorithm reduces the squared distance between the current weight vector and any satisfactory weight vector, unless that satisfactory vector lies so close to the constraint plane that the fixed-size update overshoots it, carrying the current vector past it to the other side.
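To make the claim concrete, here is a minimal Python sketch (not from the lecture; the data, the satisfactory vector w_sat, and the function name are all illustrative) that applies the standard perceptron update on each mistake and reports how the squared distance to w_sat changes:

```python
import numpy as np

def perceptron_step(w, x, label):
    """Standard perceptron rule: on a mistake, add the input vector
    for a missed positive case, subtract it for a missed negative."""
    prediction = 1 if w @ x > 0 else 0
    if prediction == label:
        return w                       # correct, so no update
    return w + x if label == 1 else w - x

rng = np.random.default_rng(0)
w_sat = np.array([2.0, -1.0, 0.5])     # a weight vector assumed satisfactory
w = np.zeros(3)                        # current weight vector
for _ in range(20):
    x = rng.normal(size=3)
    label = 1 if w_sat @ x > 0 else 0  # labels defined by w_sat itself
    before = np.sum((w_sat - w) ** 2)
    w = perceptron_step(w, x, label)
    after = np.sum((w_sat - w) ** 2)
    if after != before:                # an update happened
        print(f"mistake: squared distance {before:.3f} -> {after:.3f}")
```

Because the labels are generated by w_sat, it classifies every sampled case correctly; mistakes typically shrink the printed distance, and any increase is exactly the overshoot case noted above.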
So consider “generously satisfactory” weight vectors: those that lie within the feasible region by a margin at least as great as the largest update. Every time the perceptron makes a mistake, the squared distance to all of these weight vectors decreases by at least the squared length of the smallest update vector.
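The guarantee follows from a short calculation. As a sketch, take a misclassified positive case with input vector x, so w·x ≤ 0 and the update is w′ = w + x, and read “generously satisfactory” as w*·x ≥ ‖x‖² (w* clears this constraint plane by a margin of at least ‖x‖):

```latex
\begin{align*}
\|w^{*} - w'\|^{2} &= \|w^{*} - w - x\|^{2} \\
  &= \|w^{*} - w\|^{2} - 2\,(w^{*} - w)\cdot x + \|x\|^{2} \\
  &\le \|w^{*} - w\|^{2} - 2\|x\|^{2} + \|x\|^{2}
      \qquad (w^{*}\cdot x \ge \|x\|^{2},\; w\cdot x \le 0) \\
  &= \|w^{*} - w\|^{2} - \|x\|^{2}.
\end{align*}
```

The same algebra with the signs flipped handles a misclassified negative case, so every mistake costs at least ‖x‖² of squared distance, which is why the procedure can only make a bounded number of mistakes.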
[Figure: weight space, showing the “right” and “wrong” sides of a constraint plane.]