CSC2626: Imitation Learning for Robotics, Winter 2021
Overview
In the next few decades we are going to witness millions of people, from various backgrounds and levels of technical expertise, needing to effectively interact with robotic technologies on a daily basis. As such, people will need to modify the behavior of their robots without explicitly writing code, but by providing only a small number of kinesthetic or visual demonstrations. At the same time, robots should try to infer and predict the human's intentions and internal objectives from past interactions, in order to provide assistance before it is explicitly asked. This graduate-level course will examine some of the most important papers in imitation learning for robot control, placing more emphasis on developments in the last 10 years. Its purpose is to familiarize students with the frontiers of this research area, to help them identify open problems, and to enable them to make a novel contribution.
Prerequisites
You need to be comfortable with introductory machine learning concepts (such as from CSC411/CSC413/ECE521 or equivalent), linear algebra, basic multivariable calculus, and introductory probability. You also need strong programming skills in Python. Note: if you don't meet all the prerequisites above, please contact the instructor by email. Optional, but recommended: experience with neural networks (such as from CSC321) and introductory-level familiarity with reinforcement learning and control.
Teaching Staff
Teaching Assistant
y@cs.toronto.edu, y=homanga
Office Hours: Fri 2-3pm ET, on Zoom
Course Details
- Lectures: Mondays, 3-5pm ET (online synchronous delivery + recorded lectures)
- Zoom link is posted on the course's Quercus homepage
- All announcements will be posted on Quercus
- Discussions will take place on Piazza
- Anonymous feedback form for suggested improvements
Grading and Important Dates
- Assignment 1 (25%): Due Jan 28 at 6pm ET
- Assignment 2 (25%): Due Apr 5 at 6pm ET
- Project Proposal (10%): Due Feb 17 at 6pm ET. Students can take on projects in groups of 2-3 people. Tips for a good project proposal can be found here. Proposals should not be based only on papers covered in class by Feb 17. Students are encouraged to look further ahead in the schedule and to start planning their project definition well ahead of this deadline. Students who need help choosing or crystallizing a project idea should email the instructor or the TA.
- Midterm Progress Report (5%): Due Mar 10 at 6pm ET. Tips and expectations for a good midterm progress report are here.
- Project Presentation (5%): On Apr 5, during class. This will be a short presentation, approximately 5-10 minutes, depending on the number of groups.
- Final Project Report and Code (30%): Due Apr 12 at 6pm ET. Tips and expectations for a good final project report can be found here.
Course Description
This course will broadly cover the following areas:
- Imitating the policies of demonstrators (people, expensive algorithms, optimal controllers)
- Connections between imitation learning, optimal control, and reinforcement learning
- Learning the cost functions that best explain a set of demonstrations
- Shared autonomy between humans and robots for real-time control
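As a rough illustration of the first area above (imitating a demonstrator's policy), the sketch below fits a one-parameter linear policy to noisy state-action pairs by least squares, which is about the simplest form of behavioral cloning. Everything in it is an assumption made for illustration and is not taken from the course materials: the demonstrator `expert_policy` (a proportional controller with gain -2), the noise level, and all function names are hypothetical.

```python
import random


def expert_policy(state):
    """Hypothetical demonstrator: a proportional controller a = -2 * s."""
    return -2.0 * state


def collect_demonstrations(n, noise_std=0.05, seed=0):
    """Sample n states in [-1, 1] and record noisy expert actions."""
    rng = random.Random(seed)
    demos = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        a = expert_policy(s) + rng.gauss(0.0, noise_std)
        demos.append((s, a))
    return demos


def fit_linear_policy(demos):
    """Closed-form least-squares fit of the gain k in a = k * s."""
    num = sum(s * a for s, a in demos)
    den = sum(s * s for s, _ in demos)
    return num / den


demos = collect_demonstrations(200)
gain = fit_linear_policy(demos)


def learned_policy(state):
    """Imitation policy using the fitted gain."""
    return gain * state


print(f"fitted gain: {gain:.3f}")
```

With enough demonstrations the fitted gain should land close to the demonstrator's gain. The sketch only evaluates the learned policy on states drawn from the same distribution as the demonstrations, so it says nothing about how the policy behaves on states the demonstrator never visited.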
Schedule
Recommended, but optional, books
- Robot programming by demonstration, by Aude Billard, Sylvain Calinon, Rudiger Dillmann, Stefan Schaal
- Robot learning from human teachers, by Sonia Chernova, Andrea Thomaz
- An algorithmic perspective on imitation learning, by Takayuki Osa, Joni Pajarinen, Gerhard Neumann, Andrew Bagnell, Pieter Abbeel, Jan Peters
Recommended simulators and datasets
You are encouraged to use the simplest possible simulator to accomplish the task you are interested in. In most cases this means MuJoCo, but feel free to build your own.
For all the starred environments below, please be aware of the one-machine-per-student licensing restriction for the MuJoCo physics engine:
- OpenAI Gym (Robotics*, MuJoCo*, Box2D, Classic Control)
- DeepMind control suite*
- Surreal Robosuite (manipulation*)
- Klampt (manipulation and locomotion tasks, contact modeling)
- DART (manipulation and locomotion tasks, contact modeling)
- Udacity self-driving car simulator (based on Unity, needs a GPU)
- CARLA self-driving car simulator (based on Unreal Engine 4, needs a GPU)
- Holodeck (based on Unreal Engine 4, needs a GPU)
- AirSim (flying vehicles and cars, based on Unreal Engine 4, needs a GPU)
- TORCS self-driving car simulator
- V-REP (robot arms, humanoids, hexapods)
- DeepMind Lab (navigation in mazes)
- Gibson environment (navigation, locomotion in indoor environments, needs a GPU)
- RLBench (vision-based manipulation, has demonstrations)
- IKEA furniture assembly environment (vision-based dual-arm manipulation for furniture assembly)
- ALFRED (vision and language based navigation and manipulation)
- D4RL (manipulation and navigation datasets for offline RL)
- RoboTurk (demonstration data for manipulation)
- AI Habitat (visual navigation)
- Isaac Gym (gym environments and more, but blazing fast, end-to-end GPU accelerated)
- RaiSim (supports biomechanics of human motion, as well as quadrupeds)
- Flightmare (fast multi-quadrotor simulation)
- PyBullet Drones (fast multi-quadrotor simulation, more aerodynamic effects)
- Deformable Ravens (deformable object simulation in PyBullet with demonstrations)
Resources for planning, control, and RL
- Open Motion Planning Library
- Control Toolbox from ETHZ (C++ only at the moment, but includes automatic differentiation)
- Trajectory optimization
- Black-DROPS Policy Search (C++ only at the moment)
- Guided Policy Search
- OpenAI Baselines
Resources for ML
- PyTorch
- TensorFlow
- GPyTorch (for Gaussian processes)
Recommended courses
- Robot Learning Seminar by Abdeslam Boularias
- Deep RL course by Sergey Levine, John Schulman, Chelsea Finn
- Deep RL course by Jimmy Ba
- Robot Learning and Sensorimotor Control course by Sethu Vijayakumar
- Algorithmic HRI course by Anca Dragan
- Related sections from Russ Tedrake's underactuated robotics course