lec3

Learning with hidden units

• Networks without hidden units are very limited in the input-output mappings they can model.
  – More layers of linear units do not help: a composition of linear maps is still linear.
  – Fixed output non-linearities are not enough.

• We need multiple layers of adaptive non-linear hidden units. This gives us a universal approximator. But how can we train such nets?
  – We need an efficient way of adapting all the weights, not just the last layer. This is hard. Learning the weights going into hidden units is equivalent to learning features.
  – Nobody is telling us directly what hidden units should do.
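
The two claims above can be checked numerically. This is a minimal sketch and not part of the original slides: the layer sizes, the random weights, and names such as W1, W2, W_hidden and step are illustrative assumptions. The first part shows that two stacked linear layers compute the same function as one linear layer with weights W2 W1. The second part uses XOR, a mapping that no single threshold unit without hidden units can compute, and shows that two non-linear hidden units are enough. The hidden weights are set by hand here, which is exactly what a learning procedure would have to discover on its own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Part 1: stacking linear layers adds no modelling power.
# Sizes (3 -> 4 -> 2) are arbitrary choices for illustration.
W1 = rng.normal(size=(4, 3))   # input -> "hidden" linear layer
W2 = rng.normal(size=(2, 4))   # "hidden" -> output linear layer
x = rng.normal(size=3)

two_layer = W2 @ (W1 @ x)      # output of the two-layer linear net
W_single = W2 @ W1             # weights of one equivalent linear layer
one_layer = W_single @ x

linear_layers_collapse = np.allclose(two_layer, one_layer)

# Part 2: non-linear hidden units make XOR representable.
def step(z):
    """Binary threshold non-linearity: 1 where z > 0, else 0."""
    return (z > 0).astype(int)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
xor_targets = np.array([0, 1, 1, 0])

# Hand-set (not learned) hidden features:
#   hidden unit 0 fires for x1 OR x2, hidden unit 1 fires for x1 AND x2.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])

# Output unit fires when OR is on and AND is off.
w_out = np.array([1.0, -1.0])
b_out = -0.5

H = step(X @ W_hidden.T + b_hidden)    # hidden activities, shape (4, 2)
xor_outputs = step(H @ w_out + b_out)  # network outputs, shape (4,)

print("linear layers collapse:", linear_layers_collapse)
print("XOR outputs:", xor_outputs.tolist())
```

In this sketch the OR and AND detectors play the role of the "features" mentioned above. Nothing in the training data names them; only the XOR targets for the output unit are given.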