lec13

Principal Components Analysis

• This takes N-dimensional data and finds the M orthogonal directions in which the data has the most variance.

– These M principal directions form a subspace.

– We can represent an N-dimensional datapoint by its projections onto the M principal directions.

• This loses all information about where the datapoint is located in the remaining orthogonal directions.

– We reconstruct by using the mean value (over all the data) on the N-M directions that are not represented.

• The reconstruction error is the sum, over all these unrepresented directions, of the squared differences from the mean.
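
The points above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not code from the lecture: the function name `principal_directions`, the synthetic data, and the choice of N = 5 and M = 2 are all assumptions made for the example. It finds the directions by an eigendecomposition of the covariance matrix, represents each datapoint by its M projections, reconstructs using the mean on the other N-M directions, and checks that the reconstruction error matches the sum of squared differences from the mean along the unrepresented directions.

```python
import numpy as np

def principal_directions(X):
    """Return the data mean, all N orthogonal directions (as columns,
    sorted by decreasing variance), and the variance along each one."""
    mean = X.mean(axis=0)
    Xc = X - mean                              # centre the data
    cov = Xc.T @ Xc / X.shape[0]               # N x N covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort: largest variance first
    return mean, eigvecs[:, order], eigvals[order]

# Hypothetical data: 200 points in N = 5 dimensions with unequal spread.
rng = np.random.default_rng(0)
N, M = 5, 2
X = rng.normal(size=(200, N)) * np.array([3.0, 2.0, 0.5, 0.3, 0.1])

mean, V, variances = principal_directions(X)
W = V[:, :M]                                   # the M principal directions
V_rest = V[:, M:]                              # the N-M unrepresented directions

Z = (X - mean) @ W                             # M projections per datapoint
X_hat = mean + Z @ W.T                         # mean value on the other N-M directions

# Squared reconstruction error per datapoint.
errors = np.sum((X - X_hat) ** 2, axis=1)

# Same quantity computed directly: squared differences from the mean
# along each unrepresented direction, summed over those directions.
errors_unrepresented = np.sum(((X - mean) @ V_rest) ** 2, axis=1)

print(np.allclose(errors, errors_unrepresented))
```

With the covariance normalised by the number of datapoints, as here, the average of `errors` over the data equals the sum of the variances along the N-M discarded directions, which is why keeping the highest-variance directions gives the smallest squared error.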