Runtime Safety through Adaptive Shielding: From Hidden Parameter Inference to Provable Guarantees

1The University of Virginia, 2The University of Texas at Austin
Flowchart of our Adaptive Shielding Approach

Ensuring robots operate safely despite unknown variations in their physical properties (such as mass or friction) is a major challenge. Our adaptive shielding framework addresses it by integrating real-time hidden-parameter inference with uncertainty-aware action filtering.

Abstract

Variations in hidden parameters, such as a robot's mass distribution or friction, pose safety risks during execution. We develop a runtime shielding mechanism for reinforcement learning, building on the formalism of constrained hidden-parameter Markov decision processes. Function encoders enable real-time inference of hidden parameters from observations, allowing the shield and the underlying policy to adapt online. The shield constrains the action space by forecasting future safety risks (such as obstacle proximity) and accounts for uncertainty via conformal prediction. We prove that the proposed mechanism satisfies probabilistic safety guarantees and yields optimal policies among the set of safety-compliant policies. Experiments across diverse environments with varying hidden parameters show that our method significantly reduces safety violations and achieves strong out-of-distribution generalization, while incurring minimal runtime overhead.

The Challenge: Safety with Unknown Dynamics

Robots and autonomous systems in open-world environments often encounter varying underlying dynamics due to unobserved hidden parameters (mass, friction, terrain). These variations pose significant safety risks and challenge the generalization of reinforcement learning (RL) systems. Current methods often trade off adaptability for safety, or vice versa, especially when dynamics change online.


Our Approach: Adaptive Shielding with Safety Regularization

We propose a runtime shielding framework that adapts online to hidden parameters while offering provable probabilistic safety. The core components are:

  1. Online Hidden-Parameter Adaptation: We leverage function encoders to efficiently infer hidden parameters from recent observations, enabling both the policy and the shield to adapt without retraining.
  2. Safety-Regularized RL Objective (SRO): A novel objective function that balances reward maximization with safety by integrating a cost-sensitive value estimate, encouraging low-violation behavior during training. Refer to Appendix G to see why this design choice is important.
    (Equation images: the safety-regularized objective, and the cost-sensitive safe-value term it is built from; see the paper for the exact formulas.)
  3. Adaptive Shield: A runtime shield that filters potentially unsafe actions proposed by the policy. It uses the inferred dynamics and conformal prediction to quantify uncertainty in future state forecasts, ensuring actions comply with safety margins.
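As an illustration only (the notation here is assumed, not the paper's exact formula), a safety-regularized objective of the kind described in item 2 augments the reward value with a cost-sensitive penalty:

$$
\max_{\pi} \; \mathbb{E}_{a \sim \pi}\!\left[ Q_r^{\pi}(s,a) - \lambda\, Q_c^{\pi}(s,a) \right],
\qquad
Q_c^{\pi}(s,a) = \mathbb{E}\!\left[ \sum_{t \ge 0} \gamma^{t} c(s_t, a_t) \,\middle|\, s_0 = s,\, a_0 = a \right],
$$

where $c$ is the per-step safety cost and $\lambda \ge 0$ trades reward maximization against expected violations. Refer to the paper for the exact objective used in training.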

This combination allows for robust safety and performance even when the robot's dynamics change unexpectedly.
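The pipeline above can be sketched in a few dozen lines. This is a minimal illustration, not the paper's implementation: the random-feature `basis` stands in for the learned function-encoder bases, the scalar "state" stands in for a safety signal such as obstacle proximity, and all names (`infer_coefficients`, `shield`, `threshold`) are hypothetical.

```python
import numpy as np

def basis(s, a, n_basis=8):
    """Hypothetical feature map phi(s, a) -> R^{n_basis}.
    A function encoder would use learned basis functions; fixed
    random-style features are used here purely for illustration."""
    x = np.concatenate([np.atleast_1d(s), np.atleast_1d(a)]).astype(float)
    W = np.linspace(0.5, 2.0, n_basis)[:, None] * np.ones((n_basis, x.size))
    return np.tanh(W @ x)

def infer_coefficients(transitions):
    """Least-squares fit of basis coefficients from recent (s, a, s') data.
    The coefficients act as an online estimate of the hidden parameters."""
    Phi = np.stack([basis(s, a) for s, a, _ in transitions])
    y = np.array([s_next for _, _, s_next in transitions], dtype=float)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return coef

def predict(s, a, coef):
    """One-step forecast of the safety signal under the inferred dynamics."""
    return float(basis(s, a) @ coef)

def conformal_quantile(residuals, alpha=0.1):
    """Split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest
    calibration residual, giving a (1-alpha) prediction-error bound."""
    n = len(residuals)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return float(np.sort(residuals)[min(k, n) - 1])

def shield(s, proposed_action, coef, q_hat, threshold, fallback_actions):
    """Accept the policy's action if the conformal upper bound on the
    forecast stays within the safety threshold; otherwise substitute
    the most conservative candidate from a fallback set."""
    def worst_case(a):
        return predict(s, a, coef) + q_hat  # pessimistic forecast
    if worst_case(proposed_action) <= threshold:
        return proposed_action
    return min(fallback_actions, key=worst_case)

# Toy hidden dynamics: s' = theta * (s + a) with theta unobserved.
theta = 0.9
data = [(s, a, theta * (s + a))
        for s in np.linspace(0.0, 1.0, 5)
        for a in np.linspace(-0.5, 0.5, 4)]
coef = infer_coefficients(data[:12])           # fit on recent transitions
residuals = np.array([abs(predict(s, a, coef) - sn) for s, a, sn in data[12:]])
q_hat = conformal_quantile(residuals)          # calibrate on held-out data
```

A permissive threshold lets the policy's action through unchanged, while an unreachable one forces the shield to fall back to the safest candidate; in the real system the threshold encodes a physical safety margin.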

Theoretical Results

  1. Proposition: Optimality Preservation: SRO guides policy learning toward safe behaviors without unnecessarily degrading performance when the agent already behaves safely.
  2. Theorem: Provable Safety and Performance: An optimal policy, when augmented with our adaptive shield, maximizes the expected cumulative discounted return while maintaining a tight bound on the average cost rate. In other words, the framework formally achieves high task performance while satisfying probabilistic safety constraints.
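The probabilistic part of the guarantee rests on the standard split-conformal coverage property. In illustrative notation (symbols assumed, not the paper's), if $q_{1-\alpha}$ is the $\lceil (n+1)(1-\alpha) \rceil$-th smallest of $n$ calibration residuals, then under exchangeability of calibration and test data,

$$
\Pr\!\left( \left\| s_{t+1} - \hat{s}_{t+1} \right\| \le q_{1-\alpha} \right) \ge 1 - \alpha,
$$

so a shield that enforces safety margins inflated by $q_{1-\alpha}$ keeps the true next state within the safe region with probability at least $1-\alpha$.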


Empirical Results

Our method demonstrates significant improvements in safety and generalization compared to established baselines across various Safe-Gym benchmarks.


RQ3: Execution-Time Efficiency

Figure: runtime comparison of our method and the baselines.

Our approach introduces only modest runtime overhead compared to the baselines, since the shield is triggered selectively; this makes it suitable for real-time deployment. (Refer to Table 1 in the paper for detailed metrics.)

Further ablation studies and detailed results can be found in the main paper and appendix.