Accelerating Reinforcement Learning with Imitation
A Thesis by Bob Price
Abstract
Imitation can be viewed as a means of enhancing learning in multiagent
environments. It augments an agent's ability to learn useful
behaviors by making intelligent use of the knowledge implicit in
behaviors demonstrated by cooperative teachers or other more
experienced agents. Using reinforcement learning theory, we construct
a new, formal framework for imitation that permits agents to combine
prior knowledge, learned knowledge and knowledge extracted from
observations of other agents. This framework, which we call
implicit imitation, uses observations of other agents to provide an
observer agent with information about its action capabilities in
unexperienced situations. Efficient algorithms are derived from this
framework for agents with both similar and dissimilar action
capabilities, and a series of experiments demonstrates that
implicit imitation can dramatically accelerate reinforcement learning
in certain cases. Further experiments demonstrate that the framework
can handle transfer between agents with different reward structures,
learning from multiple mentors, and selecting relevant portions of
examples. A Bayesian version of implicit imitation is then derived
and several experiments are used to illustrate how it smoothly
integrates model extraction with prior knowledge and optimal
exploration. Initial work is also presented to illustrate how
extensions to implicit imitation can be derived for continuous
domains, partially observable environments and multiagent Markov
games. The derivation of algorithms and experimental results are
supplemented with an analysis of relevant domain features for
imitating agents, and some performance metrics for predicting imitation
gains are suggested.
Bibtex Entry
@phdthesis{price2003imitation,
    author = "Bob Price",
    title = "Accelerating Reinforcement Learning with Imitation",
    school = "University of British Columbia",
    year = "2003"
}