Stanford Seminar - Continual Safety Assurances for Learning-Enabled Robotic Systems
07 Dec 2024
Introduction and Challenges of Safety in AI and Autonomy
- The long-term goal is to develop robot algorithms that operate with guaranteed safety and performance in new and uncertain environments, applicable to various fields such as autonomous drones, cars, and space exploration (30s).
- Machine learning and AI are becoming increasingly pervasive in autonomy stacks, particularly for perception, trajectory forecasting, planning, and control, due to their ability to capture real-world complexity (56s).
- However, the inclusion of machine learning has also introduced new safety challenges, such as the need for safety assurances in systems that operate in uncertain environments (1m21s).
- Recent incidents, such as Cruise's robotaxis being pulled from San Francisco streets and robots crashing into humans on factory floors, have highlighted the severity of these safety challenges (1m28s).
- The US government has issued an executive order prioritizing discussion of AI safety, and other governments have followed suit, emphasizing the need for safety assurances in AI and autonomy (1m36s).
- There is a tension between enabling systems to leverage machine learning capabilities while maintaining safety, which is the focus of the discussion (1m48s).
- Most machine learning systems are designed without specific regard to safety, and safety issues are often addressed with post-hoc solutions, referred to as "safety bandages" (2m14s).
- These safety bandages are not scalable, can be conservative and degrade performance, and may not work in new deployment conditions (2m49s).
- To overcome these challenges, safety is viewed as a continuous process, formally ingrained in different stages of the learning process, from design and training to deployment and iterative improvement (3m26s).
- Algorithms are developed to programmatically incorporate safety requirements in the training process, learning inherently safe and robust controllers and policies for robotic systems, referred to as "design time safety methods" or "design for safety" (3m47s).
- The goal is to develop methods for detecting out-of-distribution and anomalous situations in learning-enabled robotic systems, adapting their behavior to maintain safety, and learning from past failures to improve safety over time (4m13s).
- A continual safety assurance framework is proposed, where assurances are provided during design time, monitored and adapted during operation time, and continuously improved over the system's life cycle (4m40s).
- The framework involves learning provably safe controllers from data, adapting these controllers online under new deployment conditions, and stress-testing policies or controllers to minimize safety-critical failures (5m2s).
Safety Analysis and Reachability Analysis
- Safety analysis aims to determine whether and how a robot can prevent its trajectory from entering an undesirable set of states, referred to as the failure set (5m39s).
- Two key aspects of safety analysis are quantifying which configurations of the robot are doomed to fail versus which are safe, and how to keep the robot in safe configurations, referred to as the "whether and how of safety" (5m55s).
- Control theory provides powerful frameworks for safety analysis, including Hamilton-Jacobi reachability analysis, which is used to mathematically characterize and automatically compute safety requirements (6m20s).
- Hamilton-Jacobi reachability analysis assumes a robotic system with dynamics, state, control, and disturbance, and computes the backward reachable tube, a set of initial states from which the robot will be driven to an undesirable set of states despite best control efforts (6m55s).
- The backward reachable tube represents the unsafe configurations for the robot and should be avoided, as illustrated by a simple example of a quadrotor moving longitudinally up and down in a room (7m41s).
- A system's trajectory can be analyzed to determine if it will eventually crash into a ceiling or floor, making it impossible to avoid collision, or if it can be controlled to stay within a safe set, with the contrast between these two scenarios represented by a light red region and a blue region, respectively (8m2s).
- The blue region, or safe set, is the area where the system has a controller or policy to keep it inside at all times, and reachability analysis can provide both this safe set and the safety controller (8m15s).
- Reachability analysis involves defining the failure set implicitly with a function l(x), which is negative inside the failure set and positive outside and represents the safety reward the robot gets at state x (8m41s).
- The cumulative reward of a trajectory is given by the minimum safety reward along the trajectory, which is different from typical optimal control and reinforcement learning problems where the cumulative reward is the sum of rewards (9m15s).
- The sign of this cumulative reward indicates whether the trajectory ever entered the failure set, and safety analysis is formulated as a game in which the control maximizes the safety reward while the disturbance minimizes it (9m36s).
- The value function corresponding to this game captures the closest the system will ever get to the failure set, and if the value function is negative, the system must have entered the failure set at some point (10m18s).
- The backward reachable tube, or unsafe set, is the set of states where the value function is less than or equal to zero, and this value function can be computed using the principle of dynamic programming, resulting in a partial differential equation (10m43s).
- This partial differential equation, a Hamilton-Jacobi-Bellman-type equation (a Hamilton-Jacobi-Isaacs equation when a disturbance is present), relates how taking a particular action affects the system's value, i.e., its distance to the failure set, and solving it yields the value function (11m21s).
- The value function can be visualized, with redder regions indicating more negative values (states closer to the ceiling or floor) and bluer regions indicating more positive, safer values (11m29s).
- Hamilton-Jacobi reachability analysis thus captures both requirements of safety analysis: it provides the set of safe states and a controller that keeps the system inside that set, via a safety value function V whose sign indicates whether a state is safe or unsafe and whose gradient yields a safe controller for the system (13m9s).
- The safety value function is more negative near the floor than near the ceiling, even though the failure set includes both, because gravity pushes the system downward, making states close to the floor harder to recover from (12m7s).
- The safety controller is a function that, at any state x, pushes the system towards higher (safer) values by following the gradient ascent direction of the value function; the formulation and the controller are summarized in the equations below (12m47s).
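As a compact summary of the formulation described above, the sketch below writes out the standard Hamilton-Jacobi reachability setup in equations; the notation (l, V, f, and the horizon T) is chosen here for illustration rather than taken verbatim from the talk.

```latex
% Dynamics with control u and disturbance d; failure set defined through l(x) < 0:
\[ \dot{x} = f(x, u, d), \qquad \mathcal{F} = \{ x : l(x) < 0 \} \]

% Safety value function: best-case control, worst-case disturbance, minimum safety
% reward accumulated along the trajectory \xi starting from state x at time t:
\[ V(x, t) = \max_{u(\cdot)} \min_{d(\cdot)} \; \min_{\tau \in [t, T]} l\big(\xi^{u,d}_{x,t}(\tau)\big) \]

% Backward reachable tube (unsafe set) and safe set:
\[ \mathcal{BRT}(t) = \{ x : V(x, t) \le 0 \}, \qquad \text{Safe}(t) = \{ x : V(x, t) > 0 \} \]

% Dynamic programming gives a Hamilton-Jacobi variational inequality with terminal
% condition V(x, T) = l(x):
\[ \min\Big\{ \tfrac{\partial V}{\partial t} + \max_{u} \min_{d} \langle \nabla_x V, f(x, u, d) \rangle, \;\; l(x) - V(x, t) \Big\} = 0 \]

% Safety controller: steer along the gradient-ascent direction of the value function:
\[ u^*(x, t) = \arg\max_{u} \min_{d} \langle \nabla_x V(x, t), f(x, u, d) \rangle \]
```

The sign convention matches the talk: l is negative inside the failure set, so V(x, t) ≤ 0 means the disturbance can force the trajectory into the failure set despite the best control effort.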
Neural Approximations of Safety Value Function and Deep Reach
- The Hamilton-Jacobi reachability analysis can be applied to general nonlinear autonomous systems, but it has challenges, including scalability, as it is computationally hard to scale these methods beyond even five-dimensional systems, and it is not immediately clear how to interface these methods with real-world data and machine learning models (13m46s).
- To address these challenges, a neural approximation of the safety value function can be learned: a network takes as input the state and time of the system and outputs the corresponding safety value (14m43s).
- The Deep Reach method is a self-supervised learning method that relies on the fact that the true safety value function must satisfy a partial differential equation, which provides the training signal (15m6s).
- In Deep Reach, the network is trained by randomly sampling a state and time, propagating them through the network to compute the value, computing the PDE violation (residual) error, and backpropagating this error to update the network parameters; a minimal training-loop sketch appears at the end of this section (15m27s).
- The goal is to optimize the network parameters so that the learned safety value function is consistent with the governing partial differential equation, which allows safety requirements to be explicitly included in the training process and inherently safe controllers to be learned from data (15m42s).
- This approach has two key advantages: it can explicitly bake in safety requirements and learn inherently safe controllers, and neural representations are easily scalable to higher dimensional systems, enabling the synthesis of safe controllers for a broader class of autonomous systems (16m1s).
- The three-aircraft conflict resolution problem is used as an example, where two evader aircraft must be safeguarded against a pursuer aircraft with uncertain behavior, and the failure set is defined as any configuration in which two aircraft come into close proximity (16m29s).
- Due to the high dimensionality of the problem, a direct computation of the unsafe set is not scalable, and a common approach is to compute pairwise collision sets between aircraft and take their union as an approximation (17m1s).
- However, with Deep Reach, it is possible to directly compute the high-dimensional unsafe set, capturing more configurations than the approximation, including three-way interactions between aircraft that could not be captured earlier (17m39s).
- The Deep Reach framework has also been applied to autonomous driving, specifically in the context of urban driving, where an autonomous car needs to navigate around a stranded vehicle in its lane while avoiding oncoming traffic (18m36s).
- In this scenario, a learning-based controller was initially used, but it was not able to capture the nuanced intersection point between the two cars, leading to collisions, whereas Deep Reach can bake in safety requirements to prevent such collisions (19m12s).
- A safety controller was demonstrated in this driving scenario, automatically adjusting its behavior to oncoming traffic: the white vehicle waits behind the stranded vehicle if the orange driver is aggressive, then crosses into the oncoming lane once it is safe to do so (19m31s).
- This behavior emerged automatically from the learning-based system because the safety requirements were embedded in the learning process.
- In the latest work, Deep Reach was applied to learning safe controllers for legged locomotion, a hybrid system with both continuous and discrete controls (20m29s).
- The safety controller successfully avoided collisions with obstacles, even when deliberately pushed into a collision by Shuang, and sometimes changed its walking pattern completely to maintain safety (20m53s).
- Because neural network representations forgo the hard guarantees of classical reachability methods, probabilistic safety assurances for Deep Reach were explored to recover formal guarantees (21m19s).
- The key idea behind the probabilistic assurance is that the learned safety value function induces a candidate safe policy for the robot, and if the value function were learned exactly, rolling out this policy would achieve exactly the value the function predicts (22m3s).
- The gap between the two value functions can be used to calibrate the learning error, and finding a bound on the maximum learning error is of interest (22m25s).
- To provide assurances, computing the bound Delta is necessary, and various methods have been explored, including neural network verification methods and scenario optimization (23m3s).
- Conformal prediction is a method that is particularly exciting due to its simplicity and beauty, and it will be discussed further (23m17s).
- Conformal prediction provides a probabilistic bound on Delta, which cannot be computed exactly but can be approximated with high confidence, resulting in a high-confidence safe set after correcting the value function (23m42s).
- This method is illustrated in the context of a multi-vehicle collision avoidance problem, where the learned backward reachable tube is corrected by conformal prediction to obtain a certified safe set (24m20s).
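To make the training procedure concrete, below is a minimal PyTorch-style sketch of a Deep Reach-like loop on a toy vertical-quadrotor system, using the PDE violation error as the self-supervised loss, followed by a small conformal-style calibration helper. The toy dynamics, network sizes, hyperparameters, and the `calibrate_delta` interface are illustrative assumptions, not the actual Deep Reach or verification code.

```python
import torch
import torch.nn as nn

# Toy vertical quadrotor (hypothetical stand-in for the talk's examples):
# state x = (height z, vertical velocity v), control a = thrust in [0, A_MAX], gravity G.
# Failure set: z outside [0.5, 2.5] (the floor and the ceiling).
G, A_MAX, T = 9.8, 15.0, 1.0

def l_fn(x):
    """Safety reward l(x): negative inside the failure set, positive outside."""
    z = x[:, 0:1]
    return torch.minimum(z - 0.5, 2.5 - z)

def dynamics(x, a):
    """xdot = f(x, a) for the toy system."""
    v = x[:, 1:2]
    return torch.cat([v, a - G], dim=1)

value_net = nn.Sequential(nn.Linear(3, 64), nn.Tanh(),
                          nn.Linear(64, 64), nn.Tanh(),
                          nn.Linear(64, 1))
opt = torch.optim.Adam(value_net.parameters(), lr=1e-3)

for step in range(20000):
    # randomly sample states and times
    x = torch.rand(256, 2) * torch.tensor([3.0, 4.0]) + torch.tensor([0.0, -2.0])
    t = torch.rand(256, 1) * T
    xt = torch.cat([x, t], dim=1).requires_grad_(True)

    V = value_net(xt)
    grads = torch.autograd.grad(V.sum(), xt, create_graph=True)[0]
    dV_dx, dV_dt = grads[:, :2], grads[:, 2:3]

    # Safety-maximizing control: the Hamiltonian is affine in a, so the maximizer is
    # bang-bang in the sign of dV/dv.
    a_star = A_MAX * (dV_dx[:, 1:2] >= 0).float()
    ham = (dV_dx * dynamics(xt[:, :2], a_star)).sum(dim=1, keepdim=True)

    # PDE violation error: residual of min{ dV/dt + H, l(x) - V } = 0,
    # plus the terminal condition V(x, T) = l(x).
    pde_residual = torch.minimum(dV_dt + ham, l_fn(xt[:, :2]) - V)
    V_terminal = value_net(torch.cat([x, torch.full_like(t, T)], dim=1))
    loss = pde_residual.pow(2).mean() + (V_terminal - l_fn(x)).pow(2).mean()

    opt.zero_grad()
    loss.backward()
    opt.step()

def calibrate_delta(V_pred, V_rollout, alpha=0.05):
    """Conformal-style correction (sketch): bound the learning error by the (1 - alpha)
    quantile of |V_pred - V_rollout| on held-out states, then treat only states with
    V_pred > delta as certifiably safe."""
    return torch.quantile((V_pred - V_rollout).abs(), 1 - alpha).item()
```

The full method involves additional details omitted here (for example, handling the disturbance input explicitly and training curricula over the time horizon); this sketch keeps only the PDE-residual idea and the level-set correction.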
Adapting Neural Safety Representations and Online Adaptation
- Neural safety representations combine traditional safety analysis methods with learned function approximation, offering two key advantages: safety constraints can be incorporated directly into the learning process, and the representations scale readily to higher-dimensional systems (24m47s).
- However, in real-world scenarios, safety constraints, dynamics, control authority, and environment are subject to change, requiring the system to dynamically adapt safety controllers to maintain system safety (25m27s).
- Neural safety representations can be adapted online by additionally conditioning on uncertain system or environment parameters, so that the safety value function is learned as a function of these parameters (25m58s).
- This approach is referred to as parameter-conditioned safety value functions, which can be queried quickly online to maintain safety as deployment conditions change; a minimal sketch of this idea and its use as a safety filter appears at the end of this section (26m13s).
- An example of this approach is demonstrated in a simple drone delivery problem, where the drone must navigate to its goal location while avoiding collisions with obstacles in uncertain wind conditions (26m34s).
- By learning a parameter-conditioned value function as a function of the uncertain wind intensity, the safe set corresponding to different wind conditions can be obtained, allowing the drone to adapt its route accordingly (26m55s).
- The drone can start with a more direct route to its goal location in low wind conditions but adapt its route in response to changing wind conditions to maintain safety (27m15s).
- A robotic system's safe set can change due to external factors such as high wind intensity, and with the parameter-conditioned representation the safety controller can quickly adapt, taking a longer but safer route to its goal location (27m36s).
- The safety value function can be adapted directly from high-dimensional observations such as RGB images or LiDAR scans, allowing the robot to dynamically construct its safety value function and maintain safety in unknown environments (28m4s).
- This approach is called observation-conditioned reachable sets, and it has two key adaptation components: dynamically updating the safety value function from LiDAR scans to account for unknown obstacles, and constantly estimating the uncertainty in the robot's dynamics (28m30s).
- The uncertainty estimation component allows the robot to adapt its safety value function to factors such as rugged terrain or slippery surfaces, and the resulting value function can be used as a safety filter on top of a nominal policy (28m52s).
- The framework is agnostic to the underlying nominal policy and can be used with various RL-based policies, MPC-based policies, and in different environments with dynamic obstacles and adversarial humans (29m59s).
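The following is a minimal sketch of a parameter-conditioned safety value network and a least-restrictive safety filter built on top of it, in the spirit of the drone example above. The class name, the scalar wind parameter beta, the margin, and the one-step-lookahead action selection are illustrative assumptions rather than the exact method from the talk.

```python
import torch
import torch.nn as nn

class ParamConditionedValue(nn.Module):
    """Hypothetical parameter-conditioned safety value network:
    input = (state x, parameter beta), output = safety value V(x; beta).
    Here beta could be a scalar wind intensity, as in the drone example."""
    def __init__(self, state_dim, param_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + param_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1))

    def forward(self, x, beta):
        return self.net(torch.cat([x, beta], dim=-1))

def safety_filter(value_net, x, beta, nominal_action, dynamics, action_candidates,
                  margin=0.1, dt=0.05):
    """Least-restrictive filter: keep the nominal action while the state is comfortably
    inside the safe set (V > margin); otherwise switch to the candidate action that most
    increases the predicted safety value under the current parameter estimate."""
    with torch.no_grad():
        if value_net(x, beta).item() > margin:
            return nominal_action                   # safe: defer to the nominal policy
        best_a, best_v = None, -float("inf")
        for a in action_candidates:
            x_next = x + dt * dynamics(x, a, beta)  # one-step prediction under parameter beta
            v_next = value_net(x_next, beta).item()
            if v_next > best_v:
                best_a, best_v = a, v_next
        return best_a
```

When the estimated wind intensity changes online, the same network is simply re-queried with the new beta, which is what lets the drone switch from the direct route to the longer, safer one.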
Stress Testing Learning-Based Policies and Identifying Visual Failures
- The neural safety representations can enable adaptation to dynamic obstacles and environment uncertainty, and can be used to stress test learning-based policies to identify data regimes where they might cause safety-critical failures (30m28s).
- Stress testing is particularly important for vision-based controllers, which are becoming increasingly ubiquitous in autonomy stacks but are challenging to handle using traditional safety analysis methods due to their high dimensionality and complicated nature (31m21s).
- The problem of stress testing learning-based policies is formalized as identifying the visual inputs and states under which the closed-loop system experiences safety-critical failures (31m35s).
- A robotic system with dynamics and a visual sensor is considered, where the sensor provides visual observations such as RGB images or point clouds, which are used as input by a vision-based controller to apply control to the robot (31m37s).
- The goal is to find the set of images or point clouds that lead to the failure of the overall closed-loop system, not just the vision-based controller (32m7s).
- The failure discovery problem is cast as a reachability problem, and the corresponding backward reachable tube is used to extract visual failures (32m25s).
- The robot's sensor function and vision-based controller are cascaded to obtain an equivalent state-based policy for the robot, which simplifies the closed-loop system for stress testing (32m39s).
- The backward reachable tube is computed to find the set of all states that will lead to failure despite the best control action, but in this case, a specific controller is used instead of the best control action (33m11s).
- The backward reachable tube is then used to find the images seen by the robot along the failure trajectories, which yields the visual failures of the system; a rollout-based sketch of this pipeline appears at the end of this section (33m48s).
- An example of this work is the collaboration with Boeing, which designed a vision-based controller for an autonomous aircraft taxiing on a runway using only RGB images from a camera on the right wing (34m3s).
- The goal is to keep the aircraft on the runway, and the failure set is defined as outside the runway boundary (34m27s).
- The backward reachable tube is computed for the aircraft under the vision-based controller, and the starting configurations that will lead to failure are shown in red, while the safe configurations are shown in blue (34m37s).
- Representative images that will cause failure are shown, and analysis of one such image reveals that the vision-based controller confuses runway markings with the center line of the runway, causing the aircraft to drive towards the marking (34m56s).
- The proposed framework identifies system-level failures, such as the aircraft actually leaving the runway due to a vision-based control error, rather than mere component-level failures in the controller that may not lead to the system failing (35m33s).
- The framework can detect prediction errors in vision-based controllers, with high prediction errors not always leading to system-level failures, and low prediction errors sometimes triggering system-level failures (35m53s).
- The goal is to target the component-level errors that actually lead to system-level failures, and the framework can be combined with parameter-conditioned reachable sets to obtain failures as a function of different environment latents, such as time of day or cloud conditions (36m31s).
- For example, a state that is a failure during the morning due to runway markings may be safe at night, improving aircraft safety, and the framework can find failures as a function of cloud conditions, providing a diverse set of failures (36m41s).
- The framework was applied to an autonomous indoor navigation pipeline, which was trained entirely in simulation and worked well in the real world, but had interesting failure modes, such as the vision-based controller learning a correlation between light-colored surfaces and traversability (37m13s).
- This correlation led to the controller thinking it could go through light-colored walls, resulting in a collision, and the framework can be used to identify such failures and improve the vision-based controller (38m12s).
- The ultimate purpose of identifying failures is to use them to improve the vision-based controller, such as by training an anomaly detector (39m1s).
- A detector can be used to determine the probability of failure of a system given an image, and if the anomaly detector triggers a fail, a fallback controller can be used to preserve safety in an online mode (39m10s).
- The detector flags a failure input, slowing down the robot, and once the failure is resolved, the system goes back to the learning-based controller to maintain system safety (39m32s).
- Designing the anomaly detector and fallback controller are interesting questions, and a catalog of failures can provide a starting point for addressing these questions (39m52s).
- Targeted incremental training of the controller on failure data can improve performance, as shown by a significant reduction in unsafe volume after training (40m11s).
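As a rough illustration of the stress-testing pipeline above, the sketch below cascades the sensor and the vision-based controller into a closed loop and searches for initial states whose rollouts enter the failure set, collecting the corresponding images as visual failures. The talk computes the backward reachable tube under the fixed controller; this Monte Carlo version only samples an approximation of it, and all function interfaces are placeholders.

```python
def stress_test(sample_initial_state, render, vision_controller, dynamics, in_failure_set,
                num_trials=1000, horizon=200, dt=0.1):
    """Rollout-based search for visual failures of the closed-loop system.

    render(x)            -> image observed at state x (placeholder sensor model)
    vision_controller(I) -> control computed from image I (the fixed learned controller)
    dynamics(x, u, dt)   -> next state (placeholder robot dynamics)
    in_failure_set(x)    -> True if x is in the failure set (e.g., off the runway)
    """
    failure_states, failure_images = [], []
    for _ in range(num_trials):
        x = sample_initial_state()
        x0, images = x, []
        failed = False
        for _ in range(horizon):
            image = render(x)                  # sensor: state -> observation
            u = vision_controller(image)       # cascaded policy: observation -> control
            x = dynamics(x, u, dt)
            images.append(image)
            if in_failure_set(x):
                failed = True
                break
        if failed:
            # x0 is (approximately) in the backward reachable tube under this controller,
            # and the images along the rollout are the visual failures
            failure_states.append(x0)
            failure_images.extend(images)
    return failure_states, failure_images

# The collected failure images can then seed an anomaly detector, a fallback controller,
# or targeted incremental training of the vision-based controller, as discussed above.
```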
Other Safety Research Projects and Imitation Learning
- The key goal of the research is to design autonomous systems that can leverage modern machine learning methods while maintaining safety, considering safety in different stages of the learning process (40m33s).
- Deep Reach is one paradigm for learning controllers for robotic systems; other projects in the lab explore additional learning paradigms, such as imitation learning (41m14s).
- Imitation learning suffers from the compounding error problem, and to address this, researchers have been injecting adversarial disturbances during data collection to learn safety-aware imitation learning policies (41m28s).
- The adversarial disturbance is computed using reachability analysis and pushes the system towards safety-critical states, so that the robot learns corrective actions from those states; a minimal data-collection sketch appears at the end of this section (42m1s).
- The hypothesis is that by visiting more safety-critical states, the robot can learn to correct errors and improve safety during test time (42m25s).
- To test this hypothesis, an experiment is conducted on the same aircraft taxiing problem with an imitation policy that maps images to actions, and the results show that injecting safety information significantly improves the safety of the learned policy at test time (42m59s).
- The demonstration and rollout of the policy for the vanilla method and the safety-guided imitation policy are compared, showing that the vanilla method cannot recover from errors near the boundary of the runway due to lack of data, while the safety-guided method can learn recovery behavior from such states (43m37s).
- The data used for both methods is exactly the same, and the safety-guided imitation policy is also deployed on a real Crazyflie quadrotor, resulting in better sim-to-real transfer, specifically with respect to safety (44m12s).
- The idea of using safety-critical information to guide the exploration of a learning agent is proposed as a way to design more robust policies, and an example is shown using safety information to guide the exploration of a sampling-based planner, such as MPPI (44m48s).
- The use of safety information can also help with exploration during the design phase, achieving the same performance with much fewer samples, and this direction is being actively pursued (45m27s).
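A minimal sketch of the safety-guided data collection idea: during demonstrations, a disturbance derived from the safety value function occasionally pushes the system toward the failure set so that the expert's recovery actions from safety-critical states end up in the dataset. The injection probability, disturbance magnitude, and environment interface are illustrative assumptions; the work computes the disturbance from reachability analysis, which this sketch approximates by descending the safety value function.

```python
import numpy as np

def collect_safety_guided_demos(env_reset, env_step, expert_action, safety_value_grad,
                                num_episodes=50, horizon=300, inject_prob=0.2, d_max=0.5):
    """Collect (observation, expert action) pairs while adversarial disturbances,
    derived from the safety value function, push the system toward critical states.

    expert_action(obs)   -> expert/demonstrator control for the current observation
    safety_value_grad(x) -> gradient of the safety value function at state x
    env_step(u, d)       -> (next observation, next state) after applying control u
                            and disturbance d (placeholder environment interface)
    """
    dataset = []
    for _ in range(num_episodes):
        obs, x = env_reset()
        for _ in range(horizon):
            u = expert_action(obs)
            dataset.append((obs, u))             # the expert still provides the label
            if np.random.rand() < inject_prob:
                g = safety_value_grad(x)
                # push along the *descent* direction of the safety value, i.e. toward
                # the failure set, so the expert must demonstrate a recovery
                d = -d_max * g / (np.linalg.norm(g) + 1e-8)
            else:
                d = np.zeros_like(x)
            obs, x = env_step(u, d)
    return dataset
```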
Language-Based Safety Constraints and Vision-Language Models
- The question of who specifies the safety constraints, that is, who decides what is safe or unsafe, is discussed, and language is proposed as a flexible medium for defining these constraints (45m39s).
- A vision-language model is used to convert natural-language feedback into physical constraints that are interpretable by traditional safety analysis methods, such as reachability analysis or control barrier functions, so that a safety controller can be designed that adheres to this feedback (46m12s).
- In an experiment, a vision-language model converts user-given natural-language feedback into a physical safety constraint, and a safety controller is computed to avoid a coffee spill on the floor (46m34s).
- Using tools like VLMs and LLMs to define rich, semantic safety constraints for robotic systems is discussed as a starting point for further development; a simplified sketch of this pipeline follows (47m6s).
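Below is a heavily simplified sketch of how language feedback could be grounded into a physical constraint that downstream safety analysis can consume. The VLM query is stubbed out (no real API is assumed), and every function name and data format here is a hypothetical placeholder for illustration only.

```python
import numpy as np

def vlm_to_constraint(instruction, image):
    """Placeholder standing in for a vision-language model query that grounds
    natural-language safety feedback into a physical region (here: a disc in the plane).
    Returns (center, radius) of the region to avoid."""
    # A real system would query a VLM with the instruction and image and parse its
    # output; this stub just returns a fixed region for illustration.
    return np.array([1.0, 0.5]), 0.3

def make_failure_margin(center, radius):
    """Signed distance to the avoid region: negative inside (failure), positive outside.
    This l(x) can then be handed to reachability analysis or a control barrier function."""
    def l(x):
        return np.linalg.norm(x[:2] - center) - radius
    return l

# Example: "please don't drive through the coffee spill"
center, radius = vlm_to_constraint("avoid the coffee spill on the floor", image=None)
l = make_failure_margin(center, radius)
print(l(np.array([1.0, 0.5])))   # negative: inside the avoid region
print(l(np.array([2.0, 2.0])))   # positive: safe
```

The returned margin function l(x) plays the same role as the failure-set function used in the reachability formulation earlier in the talk.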
Further Exploration of Safety Value Function Parameterization and Scalability
- The parameterization of the value function for safety is explored, with the possibility of using latent parameters instead of explicit physical parameters, allowing for more complex inputs to be processed (48m35s).
- The use of encoder-decoder frameworks or simultaneous learning can be employed to preprocess complex inputs and make them more manageable for the safety value function (49m7s).
- The scalability of Deep Reach is discussed: the complexity of the computation scales with the dimensionality of the system, but performance can be much better when the value function lies on a lower-dimensional manifold (49m31s).
- Deep Reach relies on the fact that many systems of interest admit a lower-dimensional representation of the value function, allowing it to overcome some of the challenges associated with the curse of dimensionality (50m10s).
- The possibility of using smarter gridding methods to improve the performance of vanilla grid-based Hamilton-Jacobi methods is mentioned, with the idea that a parametric representation with fewer parameters could be more efficient (50m29s).
Challenges in Incremental Training and Monotonic Improvement
- The concept of training a vision-based controller using anomalies detected during training is discussed, with the question of whether such a controller would be transferable to a different environment, such as a different runway (50m46s).
- Incremental training of networks can lead to failures in certain regions of the state space where the system was previously fine, due to the lack of monotonic improvement in machine learning methods as more data is added (51m1s).
- This issue is significant for safety, as it means that the performance of the system may not be completely monotonic over the previous dataset, even if the overall metric improves (51m41s).
- The lack of monotonic improvement in incremental training makes it challenging to ensure safety, as the system may fail in some regions even if it was previously safe (52m12s).
- Researchers are exploring ways to achieve monotonic improvement in machine learning models as more data is added, which is an open question in the field (52m36s).
- Safety and performance are often at odds, and current methods for ensuring safety, such as safety filtering, can be myopic and do not consider the long-term effects of actions on performance (53m17s).
- A potential solution is to design controllers that co-optimize safety and performance using dynamic programming, which can optimize both requirements simultaneously (53m36s).
- There is a trade-off between computation and the level of safety assurance, and researchers may need to compromise on safety to achieve more scalable computation (54m8s).
- Companies often think of safety in terms of levels or tiers, rather than absolute safety, but the methods for computing these levels can be ad hoc or heuristic-based (54m44s).
- Levels of safety have a hierarchy, with different severities associated with various incidents, such as collisions, and this hierarchy is considered when evaluating safety levels (54m53s).
- There is limited academic work on the hierarchy of safety levels, but it is an interesting area of study (55m11s).
Adversarial Situations and Game Theory in Autonomous Driving
- An example of an adversarial situation is when a car is convinced to squeeze into a space, and this situation raises questions about the symmetry of controllers in such scenarios (55m25s).
- In a hypothetical situation where a human and an autonomous vehicle (AV) swap controllers, it is unclear whether the system would still work, and this scenario enters the realm of game theory (55m37s).
- The assumption of symmetric information is made, assuming that both cars can be controlled and that there is a level of cooperativeness between drivers to avoid collisions (55m55s).
- In reality, oncoming traffic cannot be controlled, and the assumption of cooperativeness may not always hold true (56m7s).
- The avoidance of adversarial situations by human drivers may be more robust than expected, and there is a distribution of behaviors that can be classified as either adversarial or cooperative (56m33s).
- A behavior-level classification of vehicles is often performed by analyzing their past trajectories to determine whether they are behaving cooperatively or adversarially (56m43s).
- Based on this classification, different planning strategies are employed to navigate around the vehicle, and this process involves a level of behavior planning (56m54s).
- Cars in the city are not necessarily shy and have become more assertive over time (57m10s).