The perceptron convergence procedure
• Add an extra component with value 1 to each feature
vector. The “bias” weight on this component is minus the
threshold. Now we can forget the threshold.
• Pick training cases using any policy that ensures that
every training case keeps getting picked:
– If the output is correct, leave its weights alone.
– If the output is 0 but should be 1, add the feature
vector to the weight vector.
– If the output is 1 but should be 0, subtract the feature
vector from the weight vector.
• This is guaranteed to find a set of weights that gets the
right answer on the whole training set, if any such set exists.
• There is no need to choose a learning rate.
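The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name, the fixed update budget, and the random case-picking policy are all assumptions (any policy that keeps revisiting every case satisfies the convergence theorem).

```python
import random

def train_perceptron(features, labels, max_updates=10_000):
    """Perceptron convergence procedure for binary (0/1) targets."""
    # Add an extra component with value 1 to each feature vector,
    # so the bias (minus the threshold) is learned as an ordinary weight.
    xs = [list(x) + [1.0] for x in features]
    w = [0.0] * len(xs[0])

    for _ in range(max_updates):
        # Picking policy: uniform random, so every case keeps getting picked.
        i = random.randrange(len(xs))
        x, target = xs[i], labels[i]
        output = 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0
        if output == target:
            continue                                    # correct: leave weights alone
        elif target == 1:                               # output 0 but should be 1:
            w = [wj + xj for wj, xj in zip(w, x)]       #   add the feature vector
        else:                                           # output 1 but should be 0:
            w = [wj - xj for wj, xj in zip(w, x)]       #   subtract it
    return w
```

Note that the updates add or subtract the raw feature vector with no step-size factor, which is why no learning rate needs to be chosen. On a linearly separable problem such as logical AND, the weights stop changing once every case is classified correctly.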