
CSC2626 – Imitation Learning for Robotics

Winter 2021

Lectures: Mon 1-3pm ET, Zoom

Instructor: Florian Shkurti
Email: [email protected]
Office Hours: Wed 2-3pm ET, Zoom

TA: Homanga Bharadhwaj
Email: [email protected]
Office Hours: Fri 2-3pm ET, Zoom

Course Page: http://www.cs.toronto.edu/~florian/courses/csc2626w21

Zoom Link: See the course’s Quercus home page or email the instructor.

Online Delivery: Lectures will be delivered synchronously via Zoom and recorded for asynchronous
viewing. Office hours will also be held synchronously online, but will not be recorded. Class participation
and questions are encouraged. Enrolled students in a time zone significantly different from ET who need
special accommodation for office hours should reach out to the instructor. Students are encouraged to
attend lectures and office hours to ask questions, but we will also use Piazza for questions and
discussions. Course announcements will be sent via Quercus.

Communication for the course:

• The official discussion board for the course is Quercus (https://q.utoronto.ca). All course
announcements will be posted there.

• Email the instructor or the TA with “CSC2626” in the subject line; otherwise your email might be
mislabeled and go unseen.

• You are welcome to provide anonymous feedback / suggestions for improvement any time during the
semester: https://www.surveymonkey.com/r/LJJV5LY

Overview: This graduate-level course will examine some of the most important papers in imitation learning
for robot control, placing more emphasis on developments in the last 10 years. Its purpose is to familiarize
students with the frontiers of this research area, to help them identify open problems, and to enable them
to make a novel contribution. The majority of lectures, particularly after the first two weeks of introductory
material, will consist of in-class student presentations. This course will broadly cover the following areas:

• Imitating the policies of demonstrators (people, expensive algorithms, optimal controllers)

• Connections between imitation learning, optimal control, and reinforcement learning

• Learning the cost functions that best explain a set of demonstrations

• Shared autonomy between humans and robots for real-time control

The course involves a significant final project component, which will likely involve the use of robot simulators
(see the course webpage for suggestions on simulators).

Prerequisites: You need to be comfortable with introductory machine learning concepts (such as from
CSC411/ECE521 or equivalent), linear algebra, basic multivariate calculus, and introductory probability.
You also need to have strong programming skills in Python. Note: if you don’t meet all the prerequisites
above, please contact the instructor by email. Optional, but recommended: experience with neural networks,
such as from CSC321 or equivalent, and introductory-level familiarity with reinforcement learning and control.

CSC2626 Jan 11, 2021

Main References: There is no required textbook for this course. In-class discussions will be based on
research papers. The following are optional, but recommended textbooks:

• Aude Billard, Sylvain Calinon, Rudiger Dillmann, Stefan Schaal, Robot programming by demonstration.

• Sonia Chernova, Andrea Thomaz, Robot learning from human teachers.

• Takayuki Osa, Joni Pajarinen, Gerhard Neumann, Andrew Bagnell, Pieter Abbeel, Jan Peters, An
algorithmic perspective on imitation learning.

Grading Policy: 2x assignments (50%) and 1x course project (50%). The grade of the course project
consists of a proposal (10%), midterm progress report (5%), project presentation (5%), and a final report
with code at the end of the term (30%).
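
As a rough illustration of how the weights above combine, here is a minimal Python sketch. The function name, the dictionary keys, and the example scores are hypothetical; the syllabus states only the percentages, and it does not say how the 50% is split between the two assignments, so they are treated here as one combined component.

```python
# Weights as percentages of the final course grade, taken from the grading policy.
# The split between Assignment 1 and Assignment 2 is not stated in the syllabus,
# so both assignments are treated as a single combined component (an assumption).
WEIGHTS = {
    "assignments": 50,
    "proposal": 10,
    "midterm_report": 5,
    "presentation": 5,
    "final_report": 30,
}


def course_grade(scores):
    """Return the course grade as a percentage.

    `scores` maps each component name in WEIGHTS to a fraction in [0, 1].
    """
    assert sum(WEIGHTS.values()) == 100  # the components must cover the whole grade
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "assignments": 0.9,
    "proposal": 0.8,
    "midterm_report": 1.0,
    "presentation": 1.0,
    "final_report": 0.85,
}
print(course_grade(example_scores))
```

The project components (10 + 5 + 5 + 30) add up to the project's 50% share, which the assertion inside the function checks together with the assignments.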

Tentative Course Outline By Week:

1. Imitation vs. Robust Behavioral Cloning

2. Intro to Optimal Control and Model-Based Reinforcement Learning

3. Batch / Offline Reinforcement Learning

4. Imitation Learning Combined with Reinforcement Learning, Control, and Planning #1

5. Imitation as Program Induction and Modular Decomposition of Demonstrations

6. Inverse Reinforcement Learning #1

7. Shared Autonomy for Robot Control with Human in-the-Loop

8. Adversarial Imitation Learning

9. Imitation Learning Combined with Reinforcement Learning, Control, and Planning #2

10. Inverse Reinforcement Learning #2

11. Rewards and Value Alignment

12. TBA

13. Project Presentations

Important Due Dates (tentative):


Assignment 1:           Jan 27, 2021, by 6pm ET
Assignment 2:           Feb 12, 2021, by 6pm ET
Project Proposal:       Feb 17, 2021, by 6pm ET
Midterm Report:         Mar 10, 2021, by 6pm ET
Project Presentations:  Apr 5, 2021
Final Report and Code:  Apr 12, 2021, by 6pm ET
