• Randomly perturb one weight and see if it improves performance. If so, save the change.
  – Very inefficient. We need to do multiple forward passes on a representative set of training data just to change one weight.
  – Towards the end of learning, large weight perturbations will nearly always make things worse.
• We could randomly perturb all the weights in parallel and correlate the performance gain with the weight changes.
  – Not any better because we need lots of trials to “see” the effect of changing one weight through the noise created by all the others.
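The two perturbation schemes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the slides: the linear model, the squared-error loss, the toy data, and every name and default (`loss`, `make_toy_data`, `perturb_one_weight`, `perturb_all_weights`, `scale`, `trials`, `step`) are hypothetical choices made for the example.

```python
import random


def loss(weights, data):
    """Mean squared error of a linear model over (inputs, target) pairs."""
    total = 0.0
    for inputs, target in data:
        prediction = sum(w * x for w, x in zip(weights, inputs))
        total += (prediction - target) ** 2
    return total / len(data)


def make_toy_data(rng, true_weights, n):
    """Hypothetical noise-free data generated by a known linear model."""
    data = []
    for _ in range(n):
        inputs = [rng.uniform(-1.0, 1.0) for _ in true_weights]
        target = sum(w * x for w, x in zip(true_weights, inputs))
        data.append((inputs, target))
    return data


def perturb_one_weight(weights, data, rng, scale=0.1):
    """Randomly change one weight; keep the change only if the loss drops."""
    candidate = list(weights)
    i = rng.randrange(len(weights))
    candidate[i] += rng.gauss(0.0, scale)
    # Each comparison costs full passes over the data to test one weight.
    if loss(candidate, data) < loss(weights, data):
        return candidate
    return list(weights)


def perturb_all_weights(weights, data, rng, scale=0.01, trials=200, step=0.1):
    """Perturb all weights together and correlate loss change with each change."""
    base = loss(weights, data)
    correlation = [0.0] * len(weights)
    for _ in range(trials):
        deltas = [rng.gauss(0.0, scale) for _ in weights]
        trial = [w + d for w, d in zip(weights, deltas)]
        change = loss(trial, data) - base
        for i, d in enumerate(deltas):
            # Every other weight's perturbation acts as noise on this product.
            correlation[i] += change * d
    # Averaged over many trials, this is a noisy estimate of the loss gradient.
    gradient = [c / (trials * scale ** 2) for c in correlation]
    return [w - step * g for w, g in zip(weights, gradient)]


rng = random.Random(0)
data = make_toy_data(rng, [2.0, -1.0, 0.5], 20)

weights = [0.0, 0.0, 0.0]
for _ in range(500):
    weights = perturb_one_weight(weights, data, rng)
print("one weight at a time:", loss(weights, data))

weights = [0.0, 0.0, 0.0]
for _ in range(20):
    weights = perturb_all_weights(weights, data, rng)
print("all weights in parallel:", loss(weights, data))
```

Note the cost in both cases: the single-weight version spends two passes over the data per attempted change, and the parallel version here spends 200 trial passes per update so that each weight's effect can be averaged out of the noise from the others. Counting calls to `loss` in each loop is a simple way to compare them.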