Feudal Reinforcement Learning
Peter Dayan
CNL
The Salk Institute
San Diego, CA
Geoffrey Hinton
Department of Computer Science
University of Toronto
Abstract
One way to speed up reinforcement learning is to enable learning to
happen simultaneously at multiple resolutions in space and time. This paper shows how to
create a Q-learning managerial hierarchy in which high-level managers learn how to set
tasks for their sub-managers who, in turn, learn how to satisfy them. Sub-managers need not
initially understand their managers' commands. They simply learn to maximise their
reinforcement in the context of the current command.
We illustrate the system using a simple maze task. As the system
learns how to get around, satisfying commands at multiple levels, it explores more
efficiently than standard flat Q-learning and builds a more comprehensive map.
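
The idea that a sub-manager maximises its reinforcement "in the context of the current command" can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a tabular Q-learner whose table is keyed by (command, state, action), and the class name `SubManager`, its parameters, and the reward signal are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


class SubManager:
    """Hypothetical tabular Q-learner conditioned on a manager's command."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        # One Q-value per (command, state, action); unseen entries start at 0.
        self.q = defaultdict(float)
        self.rng = random.Random(seed)

    def act(self, command, state):
        """Epsilon-greedy action choice under the current command."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(command, state, a)])

    def update(self, command, state, action, reward, next_state, done):
        """Standard Q-learning update, applied only within this command's table."""
        if done:
            best_next = 0.0
        else:
            best_next = max(self.q[(command, next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        key = (command, state, action)
        self.q[key] += self.alpha * (target - self.q[key])


# Usage: reward the sub-manager for satisfying the (hypothetical) command "north".
sub = SubManager(actions=["left", "right"], epsilon=0.0)
sub.update("north", 0, "right", reward=1.0, next_state=1, done=True)
```

Because the command is part of the key, the sub-manager does not need to know what a command means in advance. It only learns which actions earn reinforcement while that command is active, and values learned under one command do not leak into another.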
Advances in Neural Information Processing Systems 5.
S. J. Hanson, J. D. Cowan and C. L. Giles (Eds.), Morgan Kaufmann: San Mateo, CA.