Behaviour of the iterative learning procedure
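Do the updates to the weights always make them get
closer to their correct values?  No! An update reduces the
error on the current training case (for a small enough
learning rate), but an individual weight can still move
away from its correct value.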
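A minimal sketch of one such update may make this concrete. It assumes the delta rule for a linear neuron (each weight changes by learning rate × input × residual) and uses made-up numbers: the names `true_w`, `w`, `x` and `lr` are illustrative assumptions, not values taken from the text above.

```python
# Illustrative numbers only (assumed for this sketch).
true_w = [150.0, 50.0, 100.0]   # weights the learner should recover
w = [50.0, 50.0, 50.0]          # initial guess
x = [2.0, 5.0, 3.0]             # inputs for a single training case
lr = 1.0 / 35.0                 # learning rate


def predict(weights, inputs):
    """Output of a linear neuron: weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(weights, inputs))


target = predict(true_w, x)          # 850
residual = target - predict(w, x)    # 850 - 500 = 350

# Delta rule: move each weight in proportion to its input and the residual.
new_w = [wi + lr * xi * residual for wi, xi in zip(w, x)]
# new_w is approximately [70, 100, 80]

old_dist = [abs(wi - ti) for wi, ti in zip(w, true_w)]
new_dist = [abs(wi - ti) for wi, ti in zip(new_w, true_w)]
print("distance to true weights before:", old_dist)
print("distance to true weights after: ", new_dist)
print("error before:", abs(residual))
print("error after: ", abs(target - predict(new_w, x)))
```

In this sketch the second weight started at its correct value and is pushed away from it, even though the error on the training case shrinks.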
Does the online version of the learning procedure
eventually get the right answer? Yes, if the learning rate
gradually decreases in the appropriate way.
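As a hedged illustration of the online case, the sketch below updates the weights after every training case and shrinks the learning rate over time. The schedule `lr0 / (1 + t / tau)`, the noise level, and the function name `online_delta_rule` are assumptions chosen for this example; the text above does not specify which schedule counts as appropriate.

```python
import random


def online_delta_rule(true_w, steps=20000, lr0=0.5, tau=100.0,
                      noise=0.1, seed=0):
    """Online delta rule for a linear neuron with a decaying learning rate.

    All defaults are illustrative assumptions, not values from the text.
    """
    rng = random.Random(seed)
    w = [0.0] * len(true_w)
    for t in range(steps):
        # Draw one training case: random inputs and a noisy target.
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        target = sum(wi * xi for wi, xi in zip(true_w, x))
        target += rng.gauss(0.0, noise)

        # Learning rate decays roughly like 1/t (assumed schedule).
        lr = lr0 / (1.0 + t / tau)

        # Update immediately after this single case.
        residual = target - sum(wi * xi for wi, xi in zip(w, x))
        w = [wi + lr * xi * residual for wi, xi in zip(w, x)]
    return w


learned = online_delta_rule([2.0, -1.0])
print("learned weights:", learned)
```

With a fixed learning rate the weights would keep jittering around the answer because of the noise in individual cases; letting the rate decay is what allows the jitter to die out in this sketch.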
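How quickly do the weights converge to their correct
values? It can be very slow if two input dimensions are
highly correlated (e.g. ketchup and chips: if the two
inputs almost always take the same value, it is hard to
decide how to divide the credit between their weights).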
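The slowdown can be seen in a small experiment. The sketch below runs batch gradient descent on the squared error of a two-input linear neuron and counts the steps needed to reach the true weights, once with independent inputs and once with inputs whose correlation is 0.99. The data generator, the learning rate, the tolerance, and the names `make_data` and `steps_to_converge` are all assumptions made for illustration.

```python
import math
import random


def make_data(rho, n=100, seed=0):
    """Two Gaussian inputs per case with correlation roughly rho."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        data.append((x1, x2))
    return data


def steps_to_converge(rho, true_w=(2.0, -1.0), lr=0.1, tol=1e-3,
                      max_steps=200000):
    """Count batch gradient steps until both weights are within tol."""
    data = make_data(rho)
    targets = [true_w[0] * x1 + true_w[1] * x2 for x1, x2 in data]
    w1, w2 = 0.0, 0.0
    for step in range(1, max_steps + 1):
        g1 = g2 = 0.0
        for (x1, x2), t in zip(data, targets):
            residual = t - (w1 * x1 + w2 * x2)
            g1 += x1 * residual
            g2 += x2 * residual
        # Step along the averaged delta-rule direction.
        w1 += lr * g1 / len(data)
        w2 += lr * g2 / len(data)
        if max(abs(w1 - true_w[0]), abs(w2 - true_w[1])) < tol:
            return step
    return max_steps


fast = steps_to_converge(rho=0.0)
slow = steps_to_converge(rho=0.99)
print("steps with independent inputs:", fast)
print("steps with correlated inputs: ", slow)
```

When the inputs are nearly identical, the data pin down the sum of the two weights quickly but say very little about their difference, so the procedure crawls along that direction.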
Can the iterative procedure be generalized to much
more complicated, multi-layer, non-linear nets? YES!