Search | arXiv e-print repository

Learning Robot Safety from Sparse Human Feedback using Conformal Prediction

Authors: Aaron O. Feldman, Joseph A. Vincent, Maximilian Adang, Jun En Low, Mac Schwager

Abstract: Ensuring robot safety can be challenging; user-defined constraints can miss edge cases, policies can become unsafe even when trained from safe data, and safety can be subjective. Thus, we learn about robot safety by showing policy trajectories to a human who flags unsafe behavior. From this binary feedback, we use the statistical method of conformal prediction to identify a region of states, poten… ▽ More Ensuring robot safety can be challenging; user-defined constraints can miss edge cases, policies can become unsafe even when trained from safe data, and safety can be subjective. Thus, we learn about robot safety by showing policy trajectories to a human who flags unsafe behavior. From this binary feedback, we use the statistical method of conformal prediction to identify a region of states, potentially in learned latent space, guaranteed to contain a user-specified fraction of future policy errors. Our method is sample-efficient, as it builds on nearest neighbor classification and avoids withholding data as is common with conformal prediction. By alerting if the robot reaches the suspected unsafe region, we obtain a warning system that mimics the human's safety preferences with guaranteed miss rate. From video labeling, our system can detect when a quadcopter visuomotor policy will fail to steer through a designated gate. We present an approach for policy improvement by avoiding the suspected unsafe region. With it we improve a model predictive controller's safety, as shown in experimental testing with 30 quadcopter flights across 6 navigation tasks. Code and videos are provided. △ Less

Submitted 8 January, 2025; originally announced January 2025.

arXiv:2402.06778 [pdf, other]

Distributed Quasi-Newton Method for Multi-Agent Optimization

Authors: Ola Shorinwa, Mac Schwager

Abstract: We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gra… ▽ More We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gradient of its local objective function. Moreover, we introduce a distributed quasi-Newton method for equality-constrained optimization (EC-DQN), where each agent takes Karush-Kuhn-Tucker-like update steps to compute an optimal solution. In our algorithms, each agent communicates with its one-hop neighbors over a peer-to-peer communication network to compute a common solution. We prove convergence of our algorithms to a stationary point of the optimization problem. In addition, we demonstrate the competitive empirical convergence of our algorithm in both well-conditioned and ill-conditioned optimization problems, in terms of the computation time and communication cost incurred by each agent for convergence, compared to existing distributed first-order and second-order methods. Particularly, in ill-conditioned problems, our algorithms achieve a faster computation time for convergence, while requiring a lower communication cost, across a range of communication networks with different degrees of connectedness. △ Less

Submitted 26 September, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

arXiv:2309.12235 [pdf, other]

Distributed Conjugate Gradient Method via Conjugate Direction Tracking

Authors: Ola Shorinwa, Mac Schwager

Abstract: We present a distributed conjugate gradient method for distributed optimization problems, where each agent computes an optimal solution of the problem locally without any central computation or coordination, while communicating with its immediate, one-hop neighbors over a communication network. Each agent updates its local problem variable using an estimate of the average conjugate direction acros… ▽ More We present a distributed conjugate gradient method for distributed optimization problems, where each agent computes an optimal solution of the problem locally without any central computation or coordination, while communicating with its immediate, one-hop neighbors over a communication network. Each agent updates its local problem variable using an estimate of the average conjugate direction across the network, computed via a dynamic consensus approach. Our algorithm enables the agents to use uncoordinated step-sizes. We prove convergence of the local variable of each agent to the optimal solution of the aggregate optimization problem, without requiring decreasing step-sizes. In addition, we demonstrate the efficacy of our algorithm in distributed state estimation problems, and its robust counterparts, where we show its performance compared to existing distributed first-order optimization methods. △ Less

Submitted 26 February, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

arXiv:2110.01027 [pdf, other]

Maximum-Entropy Multi-Agent Dynamic Games: Forward and Inverse Solutions

Authors: Negar Mehr, Mingyu Wang, Mac Schwager

Abstract: In this paper, we study the problem of multiple stochastic agents interacting in a dynamic game scenario with continuous state and action spaces. We define a new notion of stochastic Nash equilibrium for boundedly rational agents, which we call the Entropic Cost Equilibrium (ECE). We show that ECE is a natural extension to multiple agents of Maximum Entropy optimality for single agents. We solve b… ▽ More In this paper, we study the problem of multiple stochastic agents interacting in a dynamic game scenario with continuous state and action spaces. We define a new notion of stochastic Nash equilibrium for boundedly rational agents, which we call the Entropic Cost Equilibrium (ECE). We show that ECE is a natural extension to multiple agents of Maximum Entropy optimality for single agents. We solve both the "forward" and "inverse" problems for the multi-agent ECE game. For the forward problem, we provide a Riccati algorithm to compute closed-form ECE feedback policies for the agents, which are exact in the Linear-Quadratic-Gaussian case. We give an iterative variant to find locally ECE feedback policies for the nonlinear case. For the inverse problem, we present an algorithm to infer the cost functions of the multiple interacting agents given noisy, boundedly rational input and state trajectory examples from agents acting in an ECE. The effectiveness of our algorithms is demonstrated in a simulated multi-agent collision avoidance scenario, and with data from the INTERACTION traffic dataset. In both cases, we show that, by taking into account the agents' game theoretic interactions using our algorithm, a more accurate model of agents' costs can be learned, compared with standard inverse optimal control methods. △ Less

Submitted 3 October, 2021; originally announced October 2021.

arXiv:2107.10956 [pdf, other]

Reciprocal Multi-Robot Collision Avoidance with Asymmetric State Uncertainty

Authors: Kunal Shah, Guillermo Angeris, Mac Schwager

Abstract: We present a general decentralized formulation for a large class of collision avoidance methods and show that all collision avoidance methods of this form are guaranteed to be collision free. This class includes several existing algorithms in the literature as special cases. We then present a particular instance of this collision avoidance method, CARP (Collision Avoidance by Reciprocal Projection… ▽ More We present a general decentralized formulation for a large class of collision avoidance methods and show that all collision avoidance methods of this form are guaranteed to be collision free. This class includes several existing algorithms in the literature as special cases. We then present a particular instance of this collision avoidance method, CARP (Collision Avoidance by Reciprocal Projections), that is effective even when the estimates of other agents' positions and velocities are noisy. The method's main computational step involves the solution of a small convex optimization problem, which can be quickly solved in practice, even on embedded platforms, making it practical to use on computationally-constrained robots such as quadrotors. This method can be extended to find smooth polynomial trajectories for higher dynamic systems such at quadrotors. We demonstrate this algorithm's performance in simulations and on a team of physical quadrotors. Our method finds optimal projections in a median time of 17.12ms for 285 instances of 100 randomly generated obstacles, and produces safe polynomial trajectories at over 60hz on-board quadrotors. Our paper is accompanied by an open source Julia implementation and ROS package. △ Less

Submitted 22 July, 2021; originally announced July 2021.

Comments: arXiv admin note: text overlap with arXiv:1905.12875

arXiv:2002.11775 [pdf, other]

SACBP: Belief Space Planning for Continuous-Time Dynamical Systems via Stochastic Sequential Action Control

Authors: Haruki Nishimura, Mac Schwager

Abstract: We propose a novel belief space planning technique for continuous dynamics by viewing the belief system as a hybrid dynamical system with time-driven switching. Our approach is based on the perturbation theory of differential equations and extends Sequential Action Control to stochastic dynamics. The resulting algorithm, which we name SACBP, does not require discretization of spaces or time and sy… ▽ More We propose a novel belief space planning technique for continuous dynamics by viewing the belief system as a hybrid dynamical system with time-driven switching. Our approach is based on the perturbation theory of differential equations and extends Sequential Action Control to stochastic dynamics. The resulting algorithm, which we name SACBP, does not require discretization of spaces or time and synthesizes control signals in near real-time. SACBP is an anytime algorithm that can handle general parametric Bayesian filters under certain assumptions. We demonstrate the effectiveness of our approach in an active sensing scenario and a model-based Bayesian reinforcement learning problem. In these challenging problems, we show that the algorithm significantly outperforms other existing solution techniques including approximate dynamic programming and local trajectory optimization. △ Less

Submitted 12 July, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

Comments: accepted in Internatinoal Journal of Robotics Research (IJRR)

arXiv:1905.12875 [pdf, other]

Fast Reciprocal Collision Avoidance Under Measurement Uncertainty

Authors: Guillermo Angeris, Kunal Shah, Mac Schwager

Abstract: We present a fully distributed collision avoidance algorithm based on convex optimization for a team of mobile robots. This method addresses the practical case in which agents sense each other via measurements from noisy on-board sensors with no inter-agent communication. Under some mild conditions, we provide guarantees on mutual collision avoidance for a broad class of policies including the one… ▽ More We present a fully distributed collision avoidance algorithm based on convex optimization for a team of mobile robots. This method addresses the practical case in which agents sense each other via measurements from noisy on-board sensors with no inter-agent communication. Under some mild conditions, we provide guarantees on mutual collision avoidance for a broad class of policies including the one presented. Additionally, we provide numerical examples of computational performance and show that, in both 2D and 3D simulations, all agents avoid each other and reach their desired goals in spite of their uncertainty about the locations of other agents. △ Less

Submitted 31 May, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

arXiv:1303.3522 [pdf, other]

Rebalancing the Rebalancers: Optimally Routing Vehicles and Drivers in Mobility-on-Demand Systems

Authors: Stephen L. Smith, Marco Pavone, Mac Schwager, Emilio Frazzoli, Daniela Rus

Abstract: In this paper we study rebalancing strategies for a mobility-on-demand urban transportation system blending customer-driven vehicles with a taxi service. In our system, a customer arrives at one of many designated stations and is transported to any other designated station, either by driving themselves, or by being driven by an employed driver. The system allows for one-way trips, so that customer… ▽ More In this paper we study rebalancing strategies for a mobility-on-demand urban transportation system blending customer-driven vehicles with a taxi service. In our system, a customer arrives at one of many designated stations and is transported to any other designated station, either by driving themselves, or by being driven by an employed driver. The system allows for one-way trips, so that customers do not have to return to their origin. When some origins and destinations are more popular than others, vehicles will become unbalanced, accumulating at some stations and becoming depleted at others. This problem is addressed by employing rebalancing drivers to drive vehicles from the popular destinations to the unpopular destinations. However, with this approach the rebalancing drivers themselves become unbalanced, and we need to "rebalance the rebalancers" by letting them travel back to the popular destinations with a customer. Accordingly, in this paper we study how to optimally route the rebalancing vehicles and drivers so that stability (in terms of boundedness of the number of waiting customers) is ensured while minimizing the number of rebalancing vehicles traveling in the network and the number of rebalancing drivers needed; surprisingly, these two objectives are aligned, and one can find the optimal rebalancing strategy by solving two decoupled linear programs. Leveraging our analysis, we determine the minimum number of drivers and minimum number of vehicles needed to ensure stability in the system. Interestingly, our simulations suggest that, in Euclidean network topologies, one would need between 1/3 and 1/4 as many drivers as vehicles. △ Less

Submitted 14 March, 2013; originally announced March 2013.

Comments: Extended version of 2013 American Control Conference paper

arXiv:1102.0603 [pdf, other]

doi 10.1109/TRO.2011.2174493

Persistent Robotic Tasks: Monitoring and Sweeping in Changing Environments

Authors: Stephen L. Smith, Mac Schwager, Daniela Rus

Abstract: We present controllers that enable mobile robots to persistently monitor or sweep a changing environment. The changing environment is modeled as a field which grows in locations that are not within range of a robot, and decreases in locations that are within range of a robot. We assume that the robots travel on given closed paths. The speed of each robot along its path is controlled to prevent the… ▽ More We present controllers that enable mobile robots to persistently monitor or sweep a changing environment. The changing environment is modeled as a field which grows in locations that are not within range of a robot, and decreases in locations that are within range of a robot. We assume that the robots travel on given closed paths. The speed of each robot along its path is controlled to prevent the field from growing unbounded at any location. We consider the space of speed controllers that can be parametrized by a finite set of basis functions. For a single robot, we develop a linear program that is guaranteed to compute a speed controller in this space to keep the field bounded, if such a controller exists. Another linear program is then derived whose solution is the speed controller that minimizes the maximum field value over the environment. We extend our linear program formulation to develop a multi-robot controller that keeps the field bounded. The multi-robot controller has the unique feature that it does not require communication among the robots. Simulation studies demonstrate the robustness of the controllers to modeling errors, and to stochasticity in the environment. △ Less

Submitted 3 February, 2011; originally announced February 2011.

Comments: This is an expanded version of a paper at ICRA 2011. This expanded version has been submitted to the IEEE Transactions on Robotics

Showing 1–9 of 9 results for author: Schwager, M