lec3

The idea behind backpropagation

• We don’t know what the hidden units ought to do, but we can compute how fast the error changes as we change a hidden activity.

– Instead of using desired activities to train the hidden units, use error derivatives w.r.t. hidden activities.

– Each hidden activity can affect many output units and can therefore have many separate effects on the error. These effects must be combined.

– We can compute error derivatives for all the hidden units efficiently.

– Once we have the error derivatives for the hidden activities, it’s easy to get the error derivatives for the weights going into a hidden unit.
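
The points above can be sketched in a few lines of plain Python. This is only an illustration under assumptions the slide does not state: one hidden layer, logistic hidden and output units, and a squared error E = 0.5 * sum_j (t_j - y_j)^2. All names (`forward`, `error`, `backprop`, `W1`, `W2`) are hypothetical and chosen for this example.

```python
import math


def logistic(z):
    """Logistic nonlinearity (an assumption; the slide does not fix the unit type)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, W2):
    """Forward pass: inputs x -> hidden activities h -> output activities y.

    W1[i][k] is the weight from input k into hidden unit i.
    W2[j][i] is the weight from hidden unit i into output unit j.
    """
    h = [logistic(sum(W1[i][k] * x[k] for k in range(len(x))))
         for i in range(len(W1))]
    y = [logistic(sum(W2[j][i] * h[i] for i in range(len(h))))
         for j in range(len(W2))]
    return h, y


def error(x, t, W1, W2):
    """Squared error between targets t and outputs y (assumed error measure)."""
    _, y = forward(x, W1, W2)
    return 0.5 * sum((t[j] - y[j]) ** 2 for j in range(len(t)))


def backprop(x, t, W1, W2):
    """Return (dE/dh, dE/dW1): error derivatives w.r.t. the hidden activities
    and w.r.t. the weights going into the hidden units."""
    h, y = forward(x, W1, W2)

    # dE/dz_j for each output unit j: dE/dy_j times the logistic slope.
    dE_dz_out = [(y[j] - t[j]) * y[j] * (1.0 - y[j]) for j in range(len(y))]

    # Each hidden activity affects every output unit, so its separate
    # effects on the error are combined by summing over the outputs.
    dE_dh = [sum(W2[j][i] * dE_dz_out[j] for j in range(len(y)))
             for i in range(len(h))]

    # Convert derivatives w.r.t. hidden activities into derivatives
    # w.r.t. the total input of each hidden unit.
    dE_dz_hid = [dE_dh[i] * h[i] * (1.0 - h[i]) for i in range(len(h))]

    # Derivative for a weight going into hidden unit i from input k:
    # dE/dz_i times the activity on the sending end of the connection.
    dE_dW1 = [[dE_dz_hid[i] * x[k] for k in range(len(x))]
              for i in range(len(h))]
    return dE_dh, dE_dW1
```

One way to check a sketch like this is to nudge a single entry of `W1` by a small amount, recompute `error`, and compare the resulting slope with the matching entry of `dE_dW1`. The two should agree closely.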