Mixture of Experts project

Mixtures of experts

• Instead of training one big neural net to deal with every training case, use several small neural nets.

  – Each small net is good at a particular subset of the cases.

  – There is a “manager” or “gating network” that decides which small net should be used for each case.

  – The gating net and the expert nets are all trained at the same time to minimize one big objective function.

• This will be covered in detail in the lecture on Oct 23.

  – Meanwhile, you could read the 1991 paper called “Adaptive mixtures of local experts” on my webpage.
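
To make the gating idea concrete, here is a minimal sketch in plain Python of how a gating net might weight the outputs of several experts and how one objective could score them together. Everything specific is an assumption for illustration: each "net" is reduced to a single linear unit, the gate uses a softmax, and the objective is the negative log-likelihood of a mixture of unit-variance Gaussians, which is one common choice and not necessarily the one used in this project. The names `mixture_forward` and `mixture_loss` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw gating scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """A one-output linear unit standing in for a small neural net (assumption)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def mixture_forward(experts, gate, x):
    """Return (gating probabilities, per-expert outputs, blended output)."""
    probs = softmax([linear(w, b, x) for w, b in gate])
    outputs = [linear(w, b, x) for w, b in experts]
    blended = sum(p * o for p, o in zip(probs, outputs))
    return probs, outputs, blended

def mixture_loss(probs, outputs, target):
    """Assumed objective: negative log-likelihood of the target under a
    mixture of unit-variance Gaussians centered on the expert outputs."""
    likelihood = sum(
        p * math.exp(-0.5 * (target - o) ** 2) for p, o in zip(probs, outputs)
    )
    return -math.log(likelihood)

# Hypothetical toy setup: two experts on a 1-D input.
experts = [([1.0], 0.0), ([-1.0], 0.0)]  # expert 0 computes x, expert 1 computes -x
gate = [([5.0], 0.0), ([-5.0], 0.0)]     # favors expert 0 when x > 0, expert 1 when x < 0

probs, outputs, blended = mixture_forward(experts, gate, [2.0])
loss = mixture_loss(probs, outputs, target=2.0)
```

Because the loss depends on both the gating probabilities and the expert outputs, differentiating this single quantity yields updates for the gate and the experts together, which is the sense in which they are trained at the same time.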