Curvature Matrices (optional material!)
Each element in the curvature matrix specifies how the gradient in one direction changes as we move in some other direction: element (i, j) is the second derivative of the error with respect to weight i and weight j.
For a linear network with a squared error, the curvature matrix of the error, E, is the covariance matrix of the inputs (up to a constant scale factor, and taking the inputs to have zero mean).
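The claim above can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are made up, the error is taken to be E = 1/(2N) * sum (t - Xw)^2 so that the scale factor is 1/N, and the inputs are centered first. The names X, t, error and numerical_hessian are hypothetical and not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 training cases with 3 correlated inputs, centered to zero mean.
mix = np.array([[1.0, 0.8, 0.0],
                [0.0, 1.0, 0.5],
                [0.0, 0.0, 1.0]])
X = rng.normal(size=(200, 3)) @ mix
X -= X.mean(axis=0)
t = rng.normal(size=200)  # arbitrary targets; they do not affect the curvature


def error(w):
    """Squared error of a linear neuron: E = 1/(2N) * sum (t - Xw)^2."""
    r = t - X @ w
    return 0.5 * np.mean(r ** 2)


def numerical_hessian(f, w, eps=1e-3):
    """Central finite-difference estimate of the curvature matrix of f at w."""
    n = w.size
    I = np.eye(n)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(w + eps * I[i] + eps * I[j])
                       - f(w + eps * I[i] - eps * I[j])
                       - f(w - eps * I[i] + eps * I[j])
                       + f(w - eps * I[i] - eps * I[j])) / (4 * eps ** 2)
    return H


w0 = rng.normal(size=3)             # any point works: E is quadratic in w
H = numerical_hessian(error, w0)    # curvature matrix of the error
C = X.T @ X / len(X)                # covariance of the zero-mean inputs

print(np.round(H, 4))
print(np.round(C, 4))
```

The two printed matrices should agree to within finite-difference rounding. Without the centering step, C would be the uncentered second-moment matrix of the inputs rather than their covariance.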
The reason steepest descent goes wrong is that the ratio of the magnitudes of the gradients for different weights changes as we move down the gradient.
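A small sketch of this effect, assuming a quadratic error surface with two weights. The curvature matrix H, the starting point and the learning rate lr are illustrative choices, not values from the lecture.

```python
import numpy as np

# Hypothetical curvature matrix: curvature 10 along weight 1, curvature 1 along weight 2.
H = np.array([[10.0, 0.0],
              [0.0, 1.0]])


def grad(w):
    """Gradient of E(w) = 0.5 * w^T H w, whose minimum is at the origin."""
    return H @ w


w = np.array([1.0, 1.0])
lr = 0.05  # assumed learning rate, below the stability limit 2/10
ratios = []
for step in range(20):
    g = grad(w)
    ratios.append(abs(g[0]) / abs(g[1]))  # |dE/dw1| / |dE/dw2| at this step
    w = w - lr * g                        # steepest descent update

print(ratios[0], ratios[-1])
```

The ratio starts at 10 and shrinks at every step, so the direction of steepest descent keeps turning instead of pointing at the minimum. With H replaced by the identity matrix the ratio stays at 1 and the gradient points straight at the minimum throughout.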
[Figure: the curvature matrix, with rows and columns indexed by the weights i, j, k.]