lec13

Self-supervised backprop in a linear network

• If the hidden and output layers are linear, the network will learn hidden units that are a linear function of the data and minimize the squared reconstruction error.

  – This is exactly what Principal Components Analysis (PCA) does.

• The M hidden units will span the same space as the first M principal components found by PCA.

  – Their weight vectors may not be orthogonal.

  – They will tend to have equal variances.
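
The subspace claim above can be checked numerically. The following is a minimal NumPy sketch, not part of the lecture: the synthetic data, the variable names (W_enc, W_dec), the learning rate, and the number of steps are all illustrative assumptions. It trains a linear autoencoder with M = 2 hidden units by plain gradient descent on the squared reconstruction error, then compares the span of the hidden units' incoming weight vectors with the span of the first two principal components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 500 points in 5-D,
# with a different variance along each axis of a random orthogonal basis.
n, d, m = 500, 5, 2
scales = np.array([3.0, 2.0, 1.0, 0.8, 0.6])
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
X = (rng.normal(size=(n, d)) * scales) @ Q.T
X -= X.mean(axis=0)  # center the data, as PCA does

# PCA via SVD: the first m right singular vectors are the principal directions.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:m].T  # shape (d, m), orthonormal columns

# Linear autoencoder: hidden = X @ W_enc, reconstruction = hidden @ W_dec.
W_enc = rng.normal(scale=0.1, size=(d, m))
W_dec = rng.normal(scale=0.1, size=(m, d))
lr = 0.01
for _ in range(10000):
    H = X @ W_enc                        # linear hidden activities
    E = H @ W_dec - X                    # reconstruction residual
    grad_dec = H.T @ E / n               # gradient of (1/2n)||E||^2 w.r.t. W_dec
    grad_enc = X.T @ (E @ W_dec.T) / n   # gradient w.r.t. W_enc
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# Cosines of the principal angles between the two subspaces:
# values near 1 mean the hidden units span the PCA subspace.
Q_enc, _ = np.linalg.qr(W_enc)
cosines = np.linalg.svd(V.T @ Q_enc, compute_uv=False)

# Mean squared reconstruction error of PCA and of the autoencoder.
pca_err = np.mean((X - X @ V @ V.T) ** 2)
ae_err = np.mean((X - X @ W_enc @ W_dec) ** 2)

# The Gram matrix of the hidden weight vectors need not be diagonal,
# i.e. the learned weight vectors need not be orthogonal.
gram = W_enc.T @ W_enc

print("principal-angle cosines:", cosines)
print("PCA error:", pca_err, "autoencoder error:", ae_err)
print("Gram matrix of hidden weight vectors:\n", gram)
```

Comparing subspaces through principal angles, rather than comparing weight vectors directly, is deliberate: the network is free to pick any basis of the principal subspace, so the individual hidden weight vectors generally differ from the principal components even when the spans agree.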