The tutorial took place at RSS 2024 on Friday, July 19, 2024. The recording of the tutorial is available on YouTube and posted below.
Creating robots that can perform a wide variety of complex tasks, with generalization to unstructured real-world settings, has long been a north star for the field. Recently, advances in machine learning and data collection frameworks have allowed supervised learning-from-demonstration approaches to take significant strides in this direction (e.g., Diffusion Policy, RT-1/2/X, or Dobb-E). In this paradigm, policy learning is cast as a supervised learning problem: learning a mapping from raw observations to demonstrated actions.
So what matters for Supervised Policy Learning (SPL)? Unlike image processing or language modeling, SPL faces some extra challenges: the environment dynamics are stochastic and hard to model, and the target "action" variable in a human demonstration dataset is continuous, often noisy, high-dimensional, and multi-modal. Because these decision-making problems are often sequential and real-world dynamics are chaotic, even a small imprecision in modeling the demonstrated behavior at a single step can lead to cascading failures ten steps down the line. Thus, modern policies for robot learning try to achieve the following objectives: to model long- and short-term correlations in actions, to capture and generate from diverse modes of behavior, and to improvise in unseen situations while remaining precise in seen ones. Crucially, these properties need to be achieved with a relatively small amount of embodied data, so design choices become important for successful policies.
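To make the multi-modality challenge concrete, below is a minimal, hypothetical sketch (in PyTorch, not taken from the tutorial materials) of what happens when a plain mean-squared-error behavior cloning objective is fit to demonstrations containing two distinct action modes: the regressor converges to the average of the modes, which is not a valid action from either demonstration. All data, dimensions, and architecture choices here are made up for illustration.

    # Hypothetical illustration: MSE behavior cloning on bimodal demonstrations.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Synthetic "demonstrations": identical observations, two action modes (-1 / +1),
    # e.g. half the demonstrators steer left around an obstacle and half steer right.
    obs = torch.zeros(1000, 4)
    actions = torch.where(torch.rand(1000, 1) < 0.5,
                          -torch.ones(1000, 1),   # mode A: steer left
                          torch.ones(1000, 1))    # mode B: steer right

    policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.mse_loss(policy(obs), actions)
        loss.backward()
        opt.step()

    print(policy(obs[:1]))  # ~0.0: the average of the two modes, matching neither

Methods covered later in the tutorial, such as tokenized (Behavior Transformer) or denoising (Diffusion Policy) action heads, are designed to represent and sample from such multi-modal action distributions instead of averaging over them.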
In this tutorial, we provide a brief introduction to supervised behavior policy learning, including an overview of current state-of-the-art methods for learning real-world robot policies. In particular, we will focus on how to implement or adapt state-of-the-art behavior cloning algorithms for a new robotic task. The intended audience of this tutorial is robotics researchers, both in academia and industry, who are interested in applying supervised learning at scale to their robot learning problems.
This tutorial will teach attendees how to apply SPL algorithms to their own datasets and robots, starting from simple multi-layer perceptron or nearest neighbor baselines, all the way to transformer- and diffusion-based policies. We will also discuss how to select an appropriate algorithm given problem complexity, dataset size and diversity, and compute requirements. Finally, attendees will hear from current practitioners about problems that often arise in real-world deployment of such policies, and will learn how to start debugging their own behavior-cloning-based systems when faced with similar problems.
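As a starting point, the simplest of these baselines fits in a few lines. The sketch below is illustrative only and assumes a generic demonstration dataset of (observation, action) tensors rather than any specific tutorial dataset; it shows the core BC-MLP recipe of supervised regression from observations to demonstrated actions, queried closed-loop at deployment.

    # Minimal BC-MLP sketch. Dataset, dimensions, and hyperparameters are
    # placeholders; substitute your own demonstration data and robot interface.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    obs_dim, act_dim = 10, 4                      # assumed observation/action sizes
    demo_obs = torch.randn(5000, obs_dim)         # stand-in for recorded observations
    demo_act = torch.randn(5000, act_dim)         # stand-in for demonstrated actions

    loader = DataLoader(TensorDataset(demo_obs, demo_act),
                        batch_size=256, shuffle=True)

    policy = nn.Sequential(
        nn.Linear(obs_dim, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, act_dim),
    )
    opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

    for epoch in range(10):
        for obs, act in loader:
            opt.zero_grad()
            loss = nn.functional.mse_loss(policy(obs), act)  # supervised regression onto demonstrated actions
            loss.backward()
            opt.step()

    # Deployment: query the learned mapping in a closed loop, e.g.
    # action = policy(torch.as_tensor(current_observation, dtype=torch.float32))

A nearest-neighbor baseline replaces the network with a lookup over the demonstration set (return the action of the closest stored observation), while the transformer- and diffusion-based policies discussed in the tutorial swap in more expressive action heads while keeping the same supervised setup.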
The list of topics covered in this tutorial includes:
Friday, July 19, 2024 (Central European Time)
Supervised policy learning: background and history
Recent applications from across the field
Why (or why not) supervised policy learning?
Core ideas and challenges
High level taxonomy of architectures and algorithms
What we are yet to understand well
Setting up an environment
Dissecting a robot demonstration dataset: open-loop replays
Setting up a base policy: BC-MLP
Going non-parametric: Nearest Neighbors and VINN
Failure cases: noise, multi-modality
Multi-modality through tokenization: Behavior transformer
Q&A: Behavior transformer
Multi-modality through denoising: Diffusion policy
Q&A: Diffusion Policy
See the schedule above for the links paired with each session. For the full set of recordings and slides, follow the links below.
Full set of slides
Video recording
Nur Muhammad "Mahi" Shafiullah
New York University
Siyuan Feng
Toyota Research Institute
Lerrel Pinto
New York University
Russ Tedrake
Massachusetts Institute of Technology, Toyota Research Institute
N. Shafiullah, S. Feng, L. Pinto & R. Tedrake. (2024, July). Supervised Policy Learning for Real Robots. Tutorial presented at the Robotics: Science and Systems (RSS), Delft. https://supervised-robot-learning.github.io/.
@misc{shafiullah2024supervised,
  author = {Shafiullah, Nur Muhammad Mahi and Feng, Siyuan and Pinto, Lerrel and Tedrake, Russ},
  title  = {Supervised Policy Learning for Real Robots},
  year   = {2024},
  month  = {July},
  note   = {Tutorial presented at the Robotics: Science and Systems (RSS), Delft},
  url    = {https://supervised-robot-learning.github.io}
}