240119 AA289 Annie Chen
03 Feb 2024
Reinforcement Learning for Autonomous Robots
- Recent advances have produced autonomous robots that can perform tasks in controlled environments.
- However, these robots often struggle to adapt to unexpected circumstances and novel scenarios during real-world deployment.
- Reinforcement learning provides a framework for robots to adapt autonomously, but it is hard to apply directly during deployment because it typically relies on feedback, repeated retries, and learning from scratch.
Reset-Free Reinforcement Learning
- Reset-free reinforcement learning addresses some of these challenges by having the robot learn both to perform the task and to undo it, so it can keep practicing without human intervention.
- Single-life reinforcement learning is a paradigm in which the agent is given prior experience and must adapt to a new scenario within a single episode, without human intervention or supervision.
Robust Autonomous Modulation (ROAM)
- The proposed method, Robust Autonomous Modulation (ROAM), leverages the value function of each pre-trained behavior to guide behavior selection during adaptation.
- ROAM fine-tunes the value functions of the pre-trained behaviors to correct for overestimation in out-of-distribution states.
- The selection mechanism in ROAM quickly identifies the appropriate behavior for a given situation, eliminating the need for a separate high-level controller or adaptation module.
- ROAM is agnostic to how the policies and value functions of the prior behaviors are trained, and it can provide improvements in new situations with either a small or a large number of pre-trained behaviors.
- The adaptation process in ROAM happens within a single episode at test time, so robots can adapt to a variety of situations without extensive online training.
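The value-based selection described above can be illustrated with a minimal sketch. It is not the speaker's implementation: the function names, the Gaussian-shaped toy value functions, and the choice between greedy and softmax-sampled selection are all assumptions made for illustration. Each behavior is represented only by a value function that scores the current state, and the behavior with the highest value (or one sampled in proportion to a softmax over the values) is chosen.

```python
import numpy as np


def select_behavior(state, value_fns, temperature=1.0, rng=None):
    """Choose a behavior index from each behavior's value at `state`.

    With rng=None the choice is greedy (argmax); otherwise the index is
    sampled from a softmax over the values. Both options are assumptions.
    """
    values = np.array([v(state) for v in value_fns], dtype=float)
    logits = (values - values.max()) / temperature  # shift for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    if rng is None:
        return int(np.argmax(probs)), probs
    return int(rng.choice(len(value_fns), p=probs)), probs


def make_value_fn(center, width=1.0):
    """Hypothetical value function: high near the states a behavior was trained on."""
    def value(state):
        return float(np.exp(-((state - center) ** 2) / (2 * width ** 2)))
    return value


# Two toy behaviors, familiar with states near 0.0 and near 5.0 respectively.
value_fns = [make_value_fn(0.0), make_value_fn(5.0)]

idx_near_five, probs_near_five = select_behavior(4.2, value_fns)
idx_near_zero, probs_near_zero = select_behavior(0.3, value_fns)
print(idx_near_five, idx_near_zero)
```

In this sketch no separate high-level controller is trained: the selection rule only reads the value functions the behaviors already have, which is why their calibration in unfamiliar states matters.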
ROAM: A Simple Algorithm for Autonomous Deployment-Time Adaptation
- ROAM is a simple algorithm for autonomous deployment-time adaptation.
- ROAM outperforms prior methods in simulated and real-world experiments.
- ROAM can adapt to novel situations within a single episode.
- ROAM can handle dynamically changing payloads and unseen objects.
- ROAM can leverage parts of each relevant behavior to complete tasks.
- ROAM provides a mechanism for single-life test-time adaptation to unseen situations.
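To make the single-episode claim concrete, the following toy sketch runs one "life" with no resets and re-selects a behavior at every step. Everything in it is hypothetical: the `Behavior` class, the one-dimensional track with two terrain types, and the hand-written value and step functions stand in for learned value functions and policies. It only illustrates how different behaviors can each contribute to part of one episode.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Behavior:
    name: str
    value: Callable[[float], float]  # stand-in for a learned value function
    step: Callable[[float], float]   # stand-in for a policy: displacement at position x


def run_single_life(behaviors: List[Behavior], start: float = 0.0,
                    goal: float = 10.0, max_steps: int = 100) -> Tuple[float, List[str]]:
    """Run one episode without resets, picking the highest-value behavior each step."""
    x = start
    used: List[str] = []
    for _ in range(max_steps):
        if x >= goal:
            break
        best = max(behaviors, key=lambda b: b.value(x))
        used.append(best.name)
        x += best.step(x)
    return x, used


# Toy track: firm ground for x < 5, slippery ground for x >= 5.
walk = Behavior(
    "walk",
    value=lambda x: 1.0 if x < 5 else 0.1,
    step=lambda x: 1.0 if x < 5 else 0.0,   # walking makes no progress on slippery ground
)
shuffle = Behavior(
    "shuffle",
    value=lambda x: 0.3 if x < 5 else 0.8,
    step=lambda x: 0.25 if x < 5 else 0.5,  # slower, but works on slippery ground
)

final_x, used = run_single_life([walk, shuffle])
print(final_x, used)
```

Here the first half of the episode is completed by one behavior and the second half by the other, with the switch driven only by the per-state values.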