lecture 7

IRLS


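•	For a linear model with a squared-error loss, the
	optimal weights are given by the least-squares
	(normal-equation) solution:

	$$\mathbf{w}^{*} = (X^{\top} X)^{-1} X^{\top} \mathbf{t}$$

	where $X$ denotes the design matrix (one row per
	example) and $\mathbf{t}$ the vector of targets.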

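•	This solution can also be reached by a single
	Newton-Raphson update from any initial weight
	vector: the gradient vector is pre-multiplied by
	the inverse of the Hessian (the curvature of the
	error surface), which sets both the direction and
	the magnitude of the weight update:

	$$\mathbf{w}^{\text{new}} = \mathbf{w}^{\text{old}} - H^{-1} \nabla E(\mathbf{w}^{\text{old}})$$

	where $\nabla E$ is the gradient and $H$ the matrix
	of second derivatives of the error $E$. One step is
	enough because a squared error is exactly quadratic
	in the weights, so $H$ does not depend on them.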
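
A small numerical check may make the single-update claim concrete. The sketch below is not taken from the lecture: the names X (design matrix), t (targets), w0 (initial weights) and newton_step are illustrative assumptions, and the data are synthetic. It takes the error to be E(w) = 0.5 * ||X w - t||^2, whose gradient is X^T (X w - t) and whose Hessian is X^T X, and compares one Newton step from a random start with NumPy's least-squares solver.

```python
import numpy as np

def newton_step(X, t, w):
    """One Newton update for E(w) = 0.5 * ||X w - t||^2 (hypothetical helper)."""
    grad = X.T @ (X @ w - t)   # gradient of E evaluated at w
    H = X.T @ X                # Hessian (curvature); constant for squared error
    # Solve H * delta = grad rather than forming the inverse explicitly.
    return w - np.linalg.solve(H, grad)

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
t = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

w0 = rng.normal(size=3)            # arbitrary initial weights
w1 = newton_step(X, t, w0)         # a single Newton update
w2 = newton_step(X, t, w1)         # a second update should change nothing
w_ls = np.linalg.lstsq(X, t, rcond=None)[0]  # reference least-squares solution
```

Under these assumptions w1 should agree with w_ls to numerical precision whatever w0 is, and w2 should equal w1, since the gradient already vanishes after the first step. For errors that are not quadratic in the weights the Hessian changes with w, so the step has to be repeated, which is what motivates the iterative scheme named in the section title.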