• Use a global learning rate multiplied by a local gain on each connection.
• Increase the local gains if the gradient does not change sign.
• Use additive increases and multiplicative decreases.
  – This ensures that big learning rates decay rapidly when oscillations start.
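The rule in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation. The function names `update_gains` and `sgd_step`, the additive step `increase=0.05`, the multiplicative factor `decrease=0.95`, and the default `global_lr` are assumptions chosen for the example and are not given in the text. Treating a zero gradient product as a sign change is also an assumption.

```python
def update_gains(gains, grads, prev_grads, increase=0.05, decrease=0.95):
    """Return updated per-connection gains.

    A gain grows additively when the current and previous gradients for
    that connection have the same sign, and shrinks multiplicatively
    otherwise. The constants are illustrative assumptions.
    """
    new_gains = []
    for gain, grad, prev_grad in zip(gains, grads, prev_grads):
        if grad * prev_grad > 0:
            # Same sign: the gradient is consistent, so raise the gain a little.
            new_gains.append(gain + increase)
        else:
            # Sign flipped (or a gradient is zero): cut the gain by a fixed fraction.
            new_gains.append(gain * decrease)
    return new_gains


def sgd_step(weights, grads, gains, global_lr=0.01):
    """Apply one update: each weight moves by global_lr * local gain * gradient."""
    return [w - global_lr * g * grad for w, g, grad in zip(weights, gains, grads)]


if __name__ == "__main__":
    weights = [1.0, -2.0]
    gains = [1.0, 1.0]
    prev_grads = [0.2, 0.3]
    grads = [0.5, -0.5]  # first connection keeps its sign, second flips

    gains = update_gains(gains, grads, prev_grads)
    weights = sgd_step(weights, grads, gains, global_lr=0.1)
    print(gains)
    print(weights)
```

Because the decrease is multiplicative, a large gain loses more in absolute terms on each sign flip than a small one does, which is why big effective learning rates shrink quickly once the gradient starts oscillating.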