Bob Price and Craig Boutilier
Department of Computer Science
University of Toronto
Toronto, ON M5S 3H5
email: {price,cebly}@cs.ubc.ca
Abstract
Implicit imitation can accelerate
reinforcement learning (RL) by augmenting the Bellman equations with
information from the observation of expert agents (mentors). We
propose two extensions that permit imitation of agents with
heterogeneous actions: feasibility testing, which detects infeasible
mentor actions, and k-step repair, which searches for plans that
approximate infeasible actions. We show empirically that both
extensions allow imitation agents to converge more quickly in
the presence of heterogeneous actions.
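
As a rough illustration of what "augmenting the Bellman equations" with mentor information could look like, the sketch below performs a single value backup that takes the better of the observer's own best action and the transition behaviour estimated from watching the mentor. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the dictionary-based models, and the boolean `mentor_feasible` flag (standing in for the outcome of a feasibility test) are all hypothetical, and k-step repair is not shown.

```python
def augmented_backup(reward, gamma, own_model, mentor_model, values,
                     mentor_feasible=True):
    """One augmented Bellman backup for a single state (illustrative only).

    own_model:       dict action -> dict next_state -> probability
    mentor_model:    dict next_state -> probability, estimated from
                     observations of the mentor in this state
    values:          dict state -> current value estimate
    mentor_feasible: hypothetical flag; False means the mentor's action
                     was judged infeasible for the observer
    """
    def expected(dist):
        # Expected value of the successor state under a distribution.
        return sum(p * values[t] for t, p in dist.items())

    # Standard Bellman term: best expected value over the observer's actions.
    best = max(expected(dist) for dist in own_model.values())

    # Augmentation: also consider the mentor's observed transitions,
    # but only when they are believed to be achievable by the observer.
    if mentor_feasible and mentor_model:
        best = max(best, expected(mentor_model))

    return reward + gamma * best


# Toy example with made-up numbers: the observer only knows how to stay
# in "a", while the mentor has been seen moving from "a" to "b".
values = {"a": 0.0, "b": 10.0}
own_model = {"stay": {"a": 1.0}}
mentor_model = {"b": 1.0}

with_mentor = augmented_backup(0.0, 0.9, own_model, mentor_model, values)
without_mentor = augmented_backup(0.0, 0.9, own_model, mentor_model, values,
                                  mentor_feasible=False)
```

In this toy setting the mentor term raises the backed-up value only while the mentor's action is treated as feasible; once it is flagged infeasible, the backup falls back to the observer's own model.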
To appear, AI'2001 (14th Canadian Conference on AI)