 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
 |
| • |
Start by training
a single expert on all of the
|
|
|
data.
|
|
|
| • |
Then make two
copies of that expert, perturb the
|
|
|
weights slightly,
and train the system of two
|
|
|
experts and a
manager.
|
|
|
|
– |
This
allows each expert to be regularized by
|
|
|
|
the
data that it does not eventually model.
|
|
|
| • |
Then split each
expert again.
|
|
|
|
– |
You
have to decide how many times to split.
|
|