Self-supervised backprop and PCA

• If the hidden and output layers are linear, the network will learn hidden units that are a linear function of the data and that minimize the squared reconstruction error.

• The m hidden units will span the same space as the first m principal components.
  – Their weight vectors may not be orthogonal.
  – They will tend to have equal variances.
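
The claims above can be checked numerically with a small sketch. Everything here is an assumption made for illustration and is not part of the lecture: the synthetic 5-D data, the choice m = 2, the learning rate, the step count, and the names `W_enc`, `W_dec` and `gap`. The code trains a linear autoencoder by plain full-batch gradient descent on the squared reconstruction error, then compares the subspace spanned by the encoder weight vectors with the one spanned by the first m principal components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 500 points in 5-D with
# unequal variances along randomly rotated axes, then centered.
n, d, m = 500, 5, 2
stds = np.array([3.0, 2.0, 1.0, 0.8, 0.6])
rotation, _ = np.linalg.qr(rng.standard_normal((d, d)))
X = (rng.standard_normal((n, d)) * stds) @ rotation
X -= X.mean(axis=0)

# Linear autoencoder: hidden = X @ W_enc, reconstruction = hidden @ W_dec.
W_enc = 0.1 * rng.standard_normal((d, m))
W_dec = 0.1 * rng.standard_normal((m, d))

# Full-batch gradient descent on the mean squared reconstruction error.
lr, steps = 0.01, 5000
for _ in range(steps):
    H = X @ W_enc                       # hidden activities (linear)
    R = H @ W_dec - X                   # reconstruction residual
    grad_dec = (2.0 / n) * H.T @ R
    grad_enc = (2.0 / n) * X.T @ R @ W_dec.T
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# First m principal directions from the SVD of the centered data.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:m].T
P_pca = V @ V.T                         # projector onto the PCA subspace

# Projector onto the span of the encoder weight vectors.
Q, _ = np.linalg.qr(W_enc)
P_ae = Q @ Q.T

# Frobenius distance between the projectors; near zero means same subspace.
gap = np.linalg.norm(P_ae - P_pca)
print("subspace gap:", gap)

# The weight vectors themselves need not be orthogonal (off-diagonal entries).
print("encoder weight Gram matrix:\n", W_enc.T @ W_enc)
print("hidden unit variances:", (X @ W_enc).var(axis=0))
```

Comparing projectors rather than the weight vectors themselves is deliberate: the autoencoder is only determined up to an invertible linear mixing of its hidden units, so the individual weight vectors generally differ from the principal directions even when the spanned subspace is the same.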