• How often to update
  – after each training case?
  – after a full sweep through the training data?
  – after a “mini-batch” of training cases?
• How much to update
  – Use a fixed learning rate?
  – Adapt the learning rate?
  – Add momentum?
  – Don’t use steepest descent?
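
To make some of these choices concrete, here is a minimal sketch, assuming a single linear neuron trained on squared error. The function `train_linear_neuron`, its parameters, and the toy data are hypothetical names introduced for illustration and do not come from the slide. The `batch_size` argument selects how often to update (1 for every training case, the full data set size for one update per sweep, anything in between for mini-batches). The `learning_rate` and `momentum` arguments cover the fixed-learning-rate and momentum options. Adaptive learning rates and alternatives to steepest descent are not shown.

```python
import random


def train_linear_neuron(data, batch_size, learning_rate=0.1,
                        momentum=0.0, epochs=500, seed=0):
    """Fit y ~ w*x + b by gradient descent on the squared error.

    batch_size=1          -> update after each training case (online)
    batch_size=len(data)  -> update after a full sweep (full batch)
    anything in between   -> update after a mini-batch
    """
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0        # weight and bias of the linear neuron
    vel_w, vel_b = 0.0, 0.0  # velocities (stay zero-driven when momentum=0)

    for _ in range(epochs):
        rng.shuffle(data)  # visit the cases in a new order each sweep
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Mean gradient of 0.5 * (prediction - target)^2 over the batch.
            grad_w = sum((w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum((w * x + b - y) for x, y in batch) / len(batch)
            # Fixed learning rate, with an optional momentum term.
            vel_w = momentum * vel_w - learning_rate * grad_w
            vel_b = momentum * vel_b - learning_rate * grad_b
            w += vel_w
            b += vel_b
    return w, b


# Noise-free toy data drawn from y = 2x + 1 (an assumption for this sketch).
toy_data = [(i / 10.0, 2.0 * i / 10.0 + 1.0) for i in range(-10, 11)]

online = train_linear_neuron(toy_data, batch_size=1)
full_batch = train_linear_neuron(toy_data, batch_size=len(toy_data))
mini_batch = train_linear_neuron(toy_data, batch_size=5, momentum=0.5)
```

On this small, noise-free problem all three settings should recover roughly the same weight and bias. The trade-offs between them only become visible on larger or noisier data, which this sketch does not attempt to demonstrate.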