-
Certifying Stability of Reinforcement Learning Policies using Generalized Lyapunov Functions
Authors:
Kehan Long,
Jorge Cortés,
Nikolay Atanasov
Abstract:
We study the problem of certifying the stability of closed-loop systems under control policies derived from optimal control or reinforcement learning (RL). Classical Lyapunov methods require a strict step-wise decrease in the Lyapunov function but such a certificate is difficult to construct for a learned control policy. The value function associated with an RL policy is a natural Lyapunov functio…
▽ More
We study the problem of certifying the stability of closed-loop systems under control policies derived from optimal control or reinforcement learning (RL). Classical Lyapunov methods require a strict step-wise decrease in the Lyapunov function but such a certificate is difficult to construct for a learned control policy. The value function associated with an RL policy is a natural Lyapunov function candidate but it is not clear how it should be modified. To gain intuition, we first study the linear quadratic regulator (LQR) problem and make two key observations. First, a Lyapunov function can be obtained from the value function of an LQR policy by augmenting it with a residual term related to the system dynamics and stage cost. Second, the classical Lyapunov decrease requirement can be relaxed to a generalized Lyapunov condition requiring only decrease on average over multiple time steps. Using this intuition, we consider the nonlinear setting and formulate an approach to learn generalized Lyapunov functions by augmenting RL value functions with neural network residual terms. Our approach successfully certifies the stability of RL policies trained on Gymnasium and DeepMind Control benchmarks. We also extend our method to jointly train neural controllers and stability certificates using a multi-step Lyapunov loss, resulting in larger certified inner approximations of the region of attraction compared to the classical Lyapunov approach. Overall, our formulation enables stability certification for a broad class of systems with learned policies by making certificates easier to construct, thereby bridging classical control theory and modern learning-based methods.
△ Less
Submitted 19 May, 2025; v1 submitted 16 May, 2025;
originally announced May 2025.
-
Learned IMU Bias Prediction for Invariant Visual Inertial Odometry
Authors:
Abdullah Altawaitan,
Jason Stanley,
Sambaran Ghosal,
Thai Duong,
Nikolay Atanasov
Abstract:
Autonomous mobile robots operating in novel environments depend critically on accurate state estimation, often utilizing visual and inertial measurements. Recent work has shown that an invariant formulation of the extended Kalman filter improves the convergence and robustness of visual-inertial odometry by utilizing the Lie group structure of a robot's position, velocity, and orientation states. H…
▽ More
Autonomous mobile robots operating in novel environments depend critically on accurate state estimation, often utilizing visual and inertial measurements. Recent work has shown that an invariant formulation of the extended Kalman filter improves the convergence and robustness of visual-inertial odometry by utilizing the Lie group structure of a robot's position, velocity, and orientation states. However, inertial sensors also require measurement bias estimation, yet introducing the bias in the filter state breaks the Lie group symmetry. In this paper, we design a neural network to predict the bias of an inertial measurement unit (IMU) from a sequence of previous IMU measurements. This allows us to use an invariant filter for visual inertial odometry, relying on the learned bias prediction rather than introducing the bias in the filter state. We demonstrate that an invariant multi-state constraint Kalman filter (MSCKF) with learned bias predictions achieves robust visual-inertial odometry in real experiments, even when visual information is unavailable for extended periods and the system needs to rely solely on IMU measurements.
△ Less
Submitted 10 May, 2025;
originally announced May 2025.
-
Variational Formulation of the Particle Flow Particle Filter
Authors:
Yinzhuang Yi,
Jorge Cortés,
Nikolay Atanasov
Abstract:
This paper provides a formulation of the particle flow particle filter from the perspective of variational inference. We show that the transient density used to derive the particle flow particle filter follows a time-scaled trajectory of the Fisher-Rao gradient flow in the space of probability densities. The Fisher-Rao gradient flow is obtained as a continuous-time algorithm for variational infere…
▽ More
This paper provides a formulation of the particle flow particle filter from the perspective of variational inference. We show that the transient density used to derive the particle flow particle filter follows a time-scaled trajectory of the Fisher-Rao gradient flow in the space of probability densities. The Fisher-Rao gradient flow is obtained as a continuous-time algorithm for variational inference, minimizing the Kullback-Leibler divergence between a variational density and the true posterior density.
△ Less
Submitted 6 May, 2025;
originally announced May 2025.
-
MISO: Multiresolution Submap Optimization for Efficient Globally Consistent Neural Implicit Reconstruction
Authors:
Yulun Tian,
Hanwen Cao,
Sunghwan Kim,
Nikolay Atanasov
Abstract:
Neural implicit representations have had a significant impact on simultaneous localization and mapping (SLAM) by enabling robots to build continuous, differentiable, and high-fidelity 3D maps from sensor data. However, as the scale and complexity of the environment increase, neural SLAM approaches face renewed challenges in the back-end optimization process to keep up with runtime requirements and…
▽ More
Neural implicit representations have had a significant impact on simultaneous localization and mapping (SLAM) by enabling robots to build continuous, differentiable, and high-fidelity 3D maps from sensor data. However, as the scale and complexity of the environment increase, neural SLAM approaches face renewed challenges in the back-end optimization process to keep up with runtime requirements and maintain global consistency. We introduce MISO, a hierarchical optimization approach that leverages multiresolution submaps to achieve efficient and scalable neural implicit reconstruction. For local SLAM within each submap, we develop a hierarchical optimization scheme with learned initialization that substantially reduces the time needed to optimize the implicit submap features. To correct estimation drift globally, we develop a hierarchical method to align and fuse the multiresolution submaps, leading to substantial acceleration by avoiding the need to decode the full scene geometry. MISO significantly improves computational efficiency and estimation accuracy of neural signed distance function (SDF) SLAM on large-scale real-world benchmarks.
△ Less
Submitted 27 April, 2025;
originally announced April 2025.
-
Learning Scene-Level Signed Directional Distance Function with Ellipsoidal Priors and Neural Residuals
Authors:
Zhirui Dai,
Hojoon Shin,
Yulun Tian,
Ki Myung Brian Lee,
Nikolay Atanasov
Abstract:
Dense geometric environment representations are critical for autonomous mobile robot navigation and exploration. Recent work shows that implicit continuous representations of occupancy, signed distance, or radiance learned using neural networks offer advantages in reconstruction fidelity, efficiency, and differentiability over explicit discrete representations based on meshes, point clouds, and vo…
▽ More
Dense geometric environment representations are critical for autonomous mobile robot navigation and exploration. Recent work shows that implicit continuous representations of occupancy, signed distance, or radiance learned using neural networks offer advantages in reconstruction fidelity, efficiency, and differentiability over explicit discrete representations based on meshes, point clouds, and voxels. In this work, we explore a directional formulation of signed distance, called signed directional distance function (SDDF). Unlike signed distance function (SDF) and similar to neural radiance fields (NeRF), SDDF has a position and viewing direction as input. Like SDF and unlike NeRF, SDDF directly provides distance to the observed surface along the direction, rather than integrating along the view ray, allowing efficient view synthesis. To learn and predict scene-level SDDF efficiently, we develop a differentiable hybrid representation that combines explicit ellipsoid priors and implicit neural residuals. This approach allows the model to effectively handle large distance discontinuities around obstacle boundaries while preserving the ability for dense high-fidelity prediction. We show that SDDF is competitive with the state-of-the-art neural implicit scene models in terms of reconstruction accuracy and rendering efficiency, while allowing differentiable view prediction for robot trajectory optimization.
△ Less
Submitted 25 March, 2025;
originally announced March 2025.
-
DynaGSLAM: Real-Time Gaussian-Splatting SLAM for Online Rendering, Tracking, Motion Predictions of Moving Objects in Dynamic Scenes
Authors:
Runfa Blark Li,
Mahdi Shaghaghi,
Keito Suzuki,
Xinshuang Liu,
Varun Moparthi,
Bang Du,
Walker Curtis,
Martin Renschler,
Ki Myung Brian Lee,
Nikolay Atanasov,
Truong Nguyen
Abstract:
Simultaneous Localization and Mapping (SLAM) is one of the most important environment-perception and navigation algorithms for computer vision, robotics, and autonomous cars/drones. Hence, high quality and fast mapping becomes a fundamental problem. With the advent of 3D Gaussian Splatting (3DGS) as an explicit representation with excellent rendering quality and speed, state-of-the-art (SOTA) work…
▽ More
Simultaneous Localization and Mapping (SLAM) is one of the most important environment-perception and navigation algorithms for computer vision, robotics, and autonomous cars/drones. Hence, high quality and fast mapping becomes a fundamental problem. With the advent of 3D Gaussian Splatting (3DGS) as an explicit representation with excellent rendering quality and speed, state-of-the-art (SOTA) works introduce GS to SLAM. Compared to classical pointcloud-SLAM, GS-SLAM generates photometric information by learning from input camera views and synthesize unseen views with high-quality textures. However, these GS-SLAM fail when moving objects occupy the scene that violate the static assumption of bundle adjustment. The failed updates of moving GS affects the static GS and contaminates the full map over long frames. Although some efforts have been made by concurrent works to consider moving objects for GS-SLAM, they simply detect and remove the moving regions from GS rendering ("anti'' dynamic GS-SLAM), where only the static background could benefit from GS. To this end, we propose the first real-time GS-SLAM, "DynaGSLAM'', that achieves high-quality online GS rendering, tracking, motion predictions of moving objects in dynamic scenes while jointly estimating accurate ego motion. Our DynaGSLAM outperforms SOTA static & "Anti'' dynamic GS-SLAM on three dynamic real datasets, while keeping speed and memory efficiency in practice.
△ Less
Submitted 14 March, 2025;
originally announced March 2025.
-
LATMOS: Latent Automaton Task Model from Observation Sequences
Authors:
Weixiao Zhan,
Qiyue Dong,
Eduardo Sebastián,
Nikolay Atanasov
Abstract:
Robot task planning from high-level instructions is an important step towards deploying fully autonomous robot systems in the service sector. Three key aspects of robot task planning present challenges yet to be resolved simultaneously, namely, (i) factorization of complex tasks specifications into simpler executable subtasks, (ii) understanding of the current task state from raw observations, and…
▽ More
Robot task planning from high-level instructions is an important step towards deploying fully autonomous robot systems in the service sector. Three key aspects of robot task planning present challenges yet to be resolved simultaneously, namely, (i) factorization of complex tasks specifications into simpler executable subtasks, (ii) understanding of the current task state from raw observations, and (iii) planning and verification of task executions. To address these challenges, we propose LATMOS, an automata-inspired task model that, given observations from correct task executions, is able to factorize the task, while supporting verification and planning operations. LATMOS combines an observation encoder to extract the features from potentially high-dimensional observations with automata theory to learn a sequential model that encapsulates an automaton with symbols in the latent feature space. We conduct extensive evaluations in three task model learning setups: (i) abstract tasks described by logical formulas, (ii) real-world human tasks described by videos and natural language prompts and (iii) a robot task described by image and state observations. The results demonstrate the improved plan generation and verification capabilities of LATMOS across observation modalities and tasks.
△ Less
Submitted 11 March, 2025;
originally announced March 2025.
-
LTLCodeGen: Code Generation of Syntactically Correct Temporal Logic for Robot Task Planning
Authors:
Behrad Rabiei,
Mahesh Kumar A. R.,
Zhirui Dai,
Surya L. S. R. Pilla,
Qiyue Dong,
Nikolay Atanasov
Abstract:
This paper focuses on planning robot navigation tasks from natural language specifications. We develop a modular approach, where a large language model (LLM) translates the natural language instructions into a linear temporal logic (LTL) formula with propositions defined by object classes in a semantic occupancy map. The LTL formula and the semantic occupancy map are provided to a motion planning…
▽ More
This paper focuses on planning robot navigation tasks from natural language specifications. We develop a modular approach, where a large language model (LLM) translates the natural language instructions into a linear temporal logic (LTL) formula with propositions defined by object classes in a semantic occupancy map. The LTL formula and the semantic occupancy map are provided to a motion planning algorithm to generate a collision-free robot path that satisfies the natural language instructions. Our main contribution is LTLCodeGen, a method to translate natural language to syntactically correct LTL using code generation. We demonstrate the complete task planning method in real-world experiments involving human speech to provide navigation instructions to a mobile robot. We also thoroughly evaluate our approach in simulated and real-world experiments in comparison to end-to-end LLM task planning and state-of-the-art LLM-to-LTL translation methods.
△ Less
Submitted 10 March, 2025;
originally announced March 2025.
-
Neural Configuration-Space Barriers for Manipulation Planning and Control
Authors:
Kehan Long,
Ki Myung Brian Lee,
Nikola Raicevic,
Niyas Attasseri,
Melvin Leok,
Nikolay Atanasov
Abstract:
Planning and control for high-dimensional robot manipulators in cluttered, dynamic environments require both computational efficiency and robust safety guarantees. Inspired by recent advances in learning configuration-space distance functions (CDFs) as robot body representations, we propose a unified framework for motion planning and control that formulates safety constraints as CDF barriers. A CD…
▽ More
Planning and control for high-dimensional robot manipulators in cluttered, dynamic environments require both computational efficiency and robust safety guarantees. Inspired by recent advances in learning configuration-space distance functions (CDFs) as robot body representations, we propose a unified framework for motion planning and control that formulates safety constraints as CDF barriers. A CDF barrier approximates the local free configuration space, substantially reducing the number of collision-checking operations during motion planning. However, learning a CDF barrier with a neural network and relying on online sensor observations introduce uncertainties that must be considered during control synthesis. To address this, we develop a distributionally robust CDF barrier formulation for control that explicitly accounts for modeling errors and sensor noise without assuming a known underlying distribution. Simulations and hardware experiments on a 6-DoF xArm manipulator show that our neural CDF barrier formulation enables efficient planning and robust real-time safe control in cluttered and dynamic environments, relying only on onboard point-cloud observations.
△ Less
Submitted 6 May, 2025; v1 submitted 6 March, 2025;
originally announced March 2025.
-
SplatSDF: Boosting Neural Implicit SDF via Gaussian Splatting Fusion
Authors:
Runfa Blark Li,
Keito Suzuki,
Bang Du,
Ki Myung Brian Lee,
Nikolay Atanasov,
Truong Nguyen
Abstract:
A signed distance function (SDF) is a useful representation for continuous-space geometry and many related operations, including rendering, collision checking, and mesh generation. Hence, reconstructing SDF from image observations accurately and efficiently is a fundamental problem. Recently, neural implicit SDF (SDF-NeRF) techniques, trained using volumetric rendering, have gained a lot of attent…
▽ More
A signed distance function (SDF) is a useful representation for continuous-space geometry and many related operations, including rendering, collision checking, and mesh generation. Hence, reconstructing SDF from image observations accurately and efficiently is a fundamental problem. Recently, neural implicit SDF (SDF-NeRF) techniques, trained using volumetric rendering, have gained a lot of attention. Compared to earlier truncated SDF (TSDF) fusion algorithms that rely on depth maps and voxelize continuous space, SDF-NeRF enables continuous-space SDF reconstruction with better geometric and photometric accuracy. However, the accuracy and convergence speed of scene-level SDF reconstruction require further improvements for many applications. With the advent of 3D Gaussian Splatting (3DGS) as an explicit representation with excellent rendering quality and speed, several works have focused on improving SDF-NeRF by introducing consistency losses on depth and surface normals between 3DGS and SDF-NeRF. However, loss-level connections alone lead to incremental improvements. We propose a novel neural implicit SDF called "SplatSDF" to fuse 3DGSandSDF-NeRF at an architecture level with significant boosts to geometric and photometric accuracy and convergence speed. Our SplatSDF relies on 3DGS as input only during training, and keeps the same complexity and efficiency as the original SDF-NeRF during inference. Our method outperforms state-of-the-art SDF-NeRF models on geometric and photometric evaluation by the time of submission.
△ Less
Submitted 23 November, 2024;
originally announced November 2024.
-
PKF: Probabilistic Data Association Kalman Filter for Multi-Object Tracking
Authors:
Hanwen Cao,
George J. Pappas,
Nikolay Atanasov
Abstract:
In this paper, we derive a new Kalman filter with probabilistic data association between measurements and states. We formulate a variational inference problem to approximate the posterior density of the state conditioned on the measurement data. We view the unknown data association as a latent variable and apply Expectation Maximization (EM) to obtain a filter with update step in the same form as…
▽ More
In this paper, we derive a new Kalman filter with probabilistic data association between measurements and states. We formulate a variational inference problem to approximate the posterior density of the state conditioned on the measurement data. We view the unknown data association as a latent variable and apply Expectation Maximization (EM) to obtain a filter with update step in the same form as the Kalman filter but with expanded measurement vector of all potential associations. We show that the association probabilities can be computed as permanents of matrices with measurement likelihood entries. We also propose an ambiguity check that associates only a subset of ambiguous measurements and states probabilistically, thus reducing the association time and preventing low-probability measurements from harming the estimation accuracy. Experiments in simulation show that our filter achieves lower tracking errors than the well-established joint probabilistic data association filter (JPDAF), while running at comparable rate. We also demonstrate the effectiveness of our filter in multi-object tracking (MOT) on multiple real-world datasets, including MOT17, MOT20, and DanceTrack. We achieve better higher order tracking accuracy (HOTA) than previous Kalman-filter methods and remain real-time. Associating only bounding boxes without deep features or velocities, our method ranks top-10 on both MOT17 and MOT20 in terms of HOTA. Given offline detections, our algorithm tracks at 250+ fps on a single laptop CPU. Code is available at https://github.com/hwcao17/pkf.
△ Less
Submitted 10 November, 2024;
originally announced November 2024.
-
Control Strategies for Pursuit-Evasion Under Occlusion Using Visibility and Safety Barrier Functions
Authors:
Minnan Zhou,
Mustafa Shaikh,
Vatsalya Chaubey,
Patrick Haggerty,
Shumon Koga,
Dimitra Panagou,
Nikolay Atanasov
Abstract:
This paper develops a control strategy for pursuit-evasion problems in environments with occlusions. We address the challenge of a mobile pursuer keeping a mobile evader within its field of view (FoV) despite line-of-sight obstructions. The signed distance function (SDF) of the FoV is used to formulate visibility as a control barrier function (CBF) constraint on the pursuer's control inputs. Simil…
▽ More
This paper develops a control strategy for pursuit-evasion problems in environments with occlusions. We address the challenge of a mobile pursuer keeping a mobile evader within its field of view (FoV) despite line-of-sight obstructions. The signed distance function (SDF) of the FoV is used to formulate visibility as a control barrier function (CBF) constraint on the pursuer's control inputs. Similarly, obstacle avoidance is formulated as a CBF constraint based on the SDF of the obstacle set. While the visibility and safety CBFs are Lipschitz continuous, they are not differentiable everywhere, necessitating the use of generalized gradients. To achieve non-myopic pursuit, we generate reference control trajectories leading to evader visibility using a sampling-based kinodynamic planner. The pursuer then tracks this reference via convex optimization under the CBF constraints. We validate our approach in CARLA simulations and real-world robot experiments, demonstrating successful visibility maintenance using only onboard sensing, even under severe occlusions and dynamic evader movements.
△ Less
Submitted 23 March, 2025; v1 submitted 2 November, 2024;
originally announced November 2024.
-
Generalizable Motion Planning via Operator Learning
Authors:
Sharath Matada,
Luke Bhan,
Yuanyuan Shi,
Nikolay Atanasov
Abstract:
In this work, we introduce a planning neural operator (PNO) for predicting the value function of a motion planning problem. We recast value function approximation as learning a single operator from the cost function space to the value function space, which is defined by an Eikonal partial differential equation (PDE). Therefore, our PNO model, despite being trained with a finite number of samples a…
▽ More
In this work, we introduce a planning neural operator (PNO) for predicting the value function of a motion planning problem. We recast value function approximation as learning a single operator from the cost function space to the value function space, which is defined by an Eikonal partial differential equation (PDE). Therefore, our PNO model, despite being trained with a finite number of samples at coarse resolution, inherits the zero-shot super-resolution property of neural operators. We demonstrate accurate value function approximation at $16\times$ the training resolution on the MovingAI lab's 2D city dataset, compare with state-of-the-art neural value function predictors on 3D scenes from the iGibson building dataset and showcase optimal planning with 4-DOF robotic manipulators. Lastly, we investigate employing the value function output of PNO as a heuristic function to accelerate motion planning. We show theoretically that the PNO heuristic is $ε$-consistent by introducing an inductive bias layer that guarantees our value functions satisfy the triangle inequality. With our heuristic, we achieve a $30\%$ decrease in nodes visited while obtaining near optimal path lengths on the MovingAI lab 2D city dataset, compared to classical planning methods ($A^\ast$, $RRT^\ast$).
△ Less
Submitted 26 May, 2025; v1 submitted 23 October, 2024;
originally announced October 2024.
-
Neural Configuration Distance Function for Continuum Robot Control
Authors:
Kehan Long,
Hardik Parwana,
Georgios Fainekos,
Bardh Hoxha,
Hideki Okamoto,
Nikolay Atanasov
Abstract:
This paper presents a novel method for modeling the shape of a continuum robot as a Neural Configuration Euclidean Distance Function (N-CEDF). By learning separate distance fields for each link and combining them through the kinematics chain, the learned N-CEDF provides an accurate and computationally efficient representation of the robot's shape. The key advantage of a distance function represent…
▽ More
This paper presents a novel method for modeling the shape of a continuum robot as a Neural Configuration Euclidean Distance Function (N-CEDF). By learning separate distance fields for each link and combining them through the kinematics chain, the learned N-CEDF provides an accurate and computationally efficient representation of the robot's shape. The key advantage of a distance function representation of a continuum robot is that it enables efficient collision checking for motion planning in dynamic and cluttered environments, even with point-cloud observations. We integrate the N-CEDF into a Model Predictive Path Integral (MPPI) controller to generate safe trajectories for multi-segment continuum robots. The proposed approach is validated for continuum robots with various links in several simulated environments with static and dynamic obstacles.
△ Less
Submitted 27 February, 2025; v1 submitted 20 September, 2024;
originally announced September 2024.
-
Safe Bubble Cover for Motion Planning on Distance Fields
Authors:
Ki Myung Brian Lee,
Zhirui Dai,
Cedric Le Gentil,
Lan Wu,
Nikolay Atanasov,
Teresa Vidal-Calleja
Abstract:
We consider the problem of planning collision-free trajectories on distance fields. Our key observation is that querying a distance field at one configuration reveals a region of safe space whose radius is given by the distance value, obviating the need for additional collision checking within the safe region. We refer to such regions as safe bubbles, and show that safe bubbles can be obtained fro…
▽ More
We consider the problem of planning collision-free trajectories on distance fields. Our key observation is that querying a distance field at one configuration reveals a region of safe space whose radius is given by the distance value, obviating the need for additional collision checking within the safe region. We refer to such regions as safe bubbles, and show that safe bubbles can be obtained from any Lipschitz-continuous safety constraint. Inspired by sampling-based planning algorithms, we present three algorithms for constructing a safe bubble cover of free space, named bubble roadmap (BRM), rapidly exploring bubble graph (RBG), and expansive bubble graph (EBG). The bubble sampling algorithms are combined with a hierarchical planning method that first computes a discrete path of bubbles, followed by a continuous path within the bubbles computed via convex optimization. Experimental results show that the bubble-based methods yield up to 5- 10 times cost reduction relative to conventional baselines while simultaneously reducing computational efforts by orders of magnitude.
△ Less
Submitted 23 August, 2024;
originally announced August 2024.
-
Variable-Frequency Model Learning and Predictive Control for Jumping Maneuvers on Legged Robots
Authors:
Chuong Nguyen,
Abdullah Altawaitan,
Thai Duong,
Nikolay Atanasov,
Quan Nguyen
Abstract:
Achieving both target accuracy and robustness in dynamic maneuvers with long flight phases, such as high or long jumps, has been a significant challenge for legged robots. To address this challenge, we propose a novel learning-based control approach consisting of model learning and model predictive control (MPC) utilizing a variable-frequency scheme. Compared to existing MPC techniques, we learn a…
▽ More
Achieving both target accuracy and robustness in dynamic maneuvers with long flight phases, such as high or long jumps, has been a significant challenge for legged robots. To address this challenge, we propose a novel learning-based control approach consisting of model learning and model predictive control (MPC) utilizing a variable-frequency scheme. Compared to existing MPC techniques, we learn a model directly from experiments, accounting not only for leg dynamics but also for modeling errors and unknown dynamics mismatch in hardware and during contact. Additionally, learning the model with variable-frequency allows us to cover the entire flight phase and final jumping target, enhancing the prediction accuracy of the jumping trajectory. Using the learned model, we also design variable-frequency to effectively leverage different jumping phases and track the target accurately. In a total of 92 jumps on Unitree A1 robot hardware, we verify that our approach outperforms other MPCs using fixed frequency or nominal model, reducing the jumping distance error 2 to 8 times. We also achieve jumping distance errors of less than 3 percent during continuous jumping on uneven terrain with randomly placed perturbations of random heights (up to 4 cm or 27 percent the robot standing height). Our approach obtains distance errors of 1 to 2 cm on 34 single and continuous jumps with different jumping targets and model uncertainties. Code is available at https://github.com/DRCL-USC/Learning MPC Jumping.
△ Less
Submitted 6 December, 2024; v1 submitted 20 July, 2024;
originally announced July 2024.
-
SlideSLAM: Sparse, Lightweight, Decentralized Metric-Semantic SLAM for Multi-Robot Navigation
Authors:
Xu Liu,
Jiuzhou Lei,
Ankit Prabhu,
Yuezhan Tao,
Igor Spasojevic,
Pratik Chaudhari,
Nikolay Atanasov,
Vijay Kumar
Abstract:
This paper develops a real-time decentralized metric-semantic Simultaneous Localization and Mapping (SLAM) algorithm framework that enables a heterogeneous robot team to collaboratively construct object-based metric-semantic maps of real-world environments featuring indoor, urban, and forests without relying on GPS. The framework integrates a data-driven front-end for instance segmentation from ei…
▽ More
This paper develops a real-time decentralized metric-semantic Simultaneous Localization and Mapping (SLAM) algorithm framework that enables a heterogeneous robot team to collaboratively construct object-based metric-semantic maps of real-world environments featuring indoor, urban, and forests without relying on GPS. The framework integrates a data-driven front-end for instance segmentation from either RGBD cameras or LiDARs and a custom back-end for optimizing robot trajectories and object landmarks in the map. To allow multiple robots to merge their information, we design semantics-driven place recognition algorithms that leverage the informativeness and viewpoint invariance of the object-level metric-semantic map for inter-robot loop closure detection. A communication module is designed to track each robot's observations and those of other robots whenever communication links are available. Our framework enables real-time decentralized operations onboard robots, allowing them to leverage communication opportunistically. We integrate the proposed framework with the autonomous navigation and exploration systems of three types of aerial and ground robots, conducting extensive experiments in a variety of indoor and outdoor environments. These experiments demonstrate its accuracy in inter-robot localization and object mapping, along with its moderate demands on computation, storage, and communication resources. The framework is open-sourced and is suitable for both single-agent and multi-robot metric-semantic SLAM applications. The project website and code can be found at https://xurobotics.github.io/slideslam/ and https://github.com/XuRobotics/SLIDE_SLAM, respectively.
△ Less
Submitted 24 December, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
Cross-Embodiment Robot Manipulation Skill Transfer using Latent Space Alignment
Authors:
Tianyu Wang,
Dwait Bhatt,
Xiaolong Wang,
Nikolay Atanasov
Abstract:
This paper focuses on transferring control policies between robot manipulators with different morphology. While reinforcement learning (RL) methods have shown successful results in robot manipulation tasks, transferring a trained policy from simulation to a real robot or deploying it on a robot with different states, actions, or kinematics is challenging. To achieve cross-embodiment policy transfe…
▽ More
This paper focuses on transferring control policies between robot manipulators with different morphology. While reinforcement learning (RL) methods have shown successful results in robot manipulation tasks, transferring a trained policy from simulation to a real robot or deploying it on a robot with different states, actions, or kinematics is challenging. To achieve cross-embodiment policy transfer, our key insight is to project the state and action spaces of the source and target robots to a common latent space representation. We first introduce encoders and decoders to associate the states and actions of the source robot with a latent space. The encoders, decoders, and a latent space control policy are trained simultaneously using loss functions measuring task performance, latent dynamics consistency, and encoder-decoder ability to reconstruct the original states and actions. To transfer the learned control policy, we only need to train target encoders and decoders that align a new target domain to the latent space. We use generative adversarial training with cycle consistency and latent dynamics losses without access to the task reward or reward tuning in the target domain. We demonstrate sim-to-sim and sim-to-real manipulation policy transfer with source and target robots of different states, actions, and embodiments. The source code is available at \url{https://github.com/ExistentialRobotics/cross_embodiment_transfer}.
△ Less
Submitted 4 June, 2024;
originally announced June 2024.
-
Sensor-Based Distributionally Robust Control for Safe Robot Navigation in Dynamic Environments
Authors:
Kehan Long,
Yinzhuang Yi,
Zhirui Dai,
Sylvia Herbert,
Jorge Cortés,
Nikolay Atanasov
Abstract:
We introduce a novel method for mobile robot navigation in dynamic, unknown environments, leveraging onboard sensing and distributionally robust optimization to impose probabilistic safety constraints. Our method introduces a distributionally robust control barrier function (DR-CBF) that directly integrates noisy sensor measurements and state estimates to define safety constraints. This approach i…
▽ More
We introduce a novel method for mobile robot navigation in dynamic, unknown environments, leveraging onboard sensing and distributionally robust optimization to impose probabilistic safety constraints. Our method introduces a distributionally robust control barrier function (DR-CBF) that directly integrates noisy sensor measurements and state estimates to define safety constraints. This approach is applicable to a wide range of control-affine dynamics, generalizable to robots with complex geometries, and capable of operating at real-time control frequencies. Coupled with a control Lyapunov function (CLF) for path following, the proposed CLF-DR-CBF control synthesis method achieves safe, robust, and efficient navigation in challenging environments. We demonstrate the effectiveness and robustness of our approach for safe autonomous navigation under uncertainty in simulations and real-world experiments with differential-drive robots.
△ Less
Submitted 5 May, 2025; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Multi-Robot Object SLAM Using Distributed Variational Inference
Authors:
Hanwen Cao,
Sriram Shreedharan,
Nikolay Atanasov
Abstract:
Multi-robot simultaneous localization and mapping (SLAM) enables a robot team to achieve coordinated tasks by relying on a common map of the environment. Constructing a map by centralized processing of the robot observations is undesirable because it creates a single point of failure and requires pre-existing infrastructure and significant communication throughput. This paper formulates multi-robo…
▽ More
Multi-robot simultaneous localization and mapping (SLAM) enables a robot team to achieve coordinated tasks by relying on a common map of the environment. Constructing a map by centralized processing of the robot observations is undesirable because it creates a single point of failure and requires pre-existing infrastructure and significant communication throughput. This paper formulates multi-robot object SLAM as a variational inference problem over a communication graph subject to consensus constraints on the object estimates maintained by different robots. To solve the problem, we develop a distributed mirror descent algorithm with regularization enforcing consensus among the communicating robots. Using Gaussian distributions in the algorithm, we also derive a distributed multi-state constraint Kalman filter (MSCKF) for multi-robot object SLAM. Experiments on real and simulated data show that our method improves the trajectory and object estimates, compared to individual-robot SLAM, while achieving better scaling to large robot teams, compared to centralized multi-robot SLAM.
△ Less
Submitted 20 August, 2024; v1 submitted 28 April, 2024;
originally announced April 2024.
-
Riemannian Optimization for Active Mapping with Robot Teams
Authors:
Arash Asgharivaskasi,
Fritz Girke,
Nikolay Atanasov
Abstract:
Autonomous exploration of unknown environments using a team of mobile robots demands distributed perception and planning strategies to enable efficient and scalable performance. Ideally, each robot should update its map and plan its motion not only relying on its own observations, but also considering the observations of its peers. Centralized solutions to multi-robot coordination are susceptible…
▽ More
Autonomous exploration of unknown environments using a team of mobile robots demands distributed perception and planning strategies to enable efficient and scalable performance. Ideally, each robot should update its map and plan its motion not only relying on its own observations, but also considering the observations of its peers. Centralized solutions to multi-robot coordination are susceptible to central node failure and require a sophisticated communication infrastructure for reliable operation. Current decentralized active mapping methods consider simplistic robot models with linear-Gaussian observations and Euclidean robot states. In this work, we present a distributed multi-robot mapping and planning method, called Riemannian Optimization for Active Mapping (ROAM). We formulate an optimization problem over a graph with node variables belonging to a Riemannian manifold and a consensus constraint requiring feasible solutions to agree on the node variables. We develop a distributed Riemannian optimization algorithm that relies only on one-hop communication to solve the problem with consensus and optimality guarantees. We show that multi-robot active mapping can be achieved via two applications of our distributed Riemannian optimization over different manifolds: distributed estimation of a 3-D semantic map and distributed planning of SE(3) trajectories that minimize map uncertainty. We demonstrate the performance of ROAM in simulation and real-world experiments using a team of robots with RGB-D cameras.
△ Less
Submitted 4 May, 2024; v1 submitted 28 April, 2024;
originally announced April 2024.
-
Distributionally Robust Policy and Lyapunov-Certificate Learning
Authors:
Kehan Long,
Jorge Cortes,
Nikolay Atanasov
Abstract:
This article presents novel methods for synthesizing distributionally robust stabilizing neural controllers and certificates for control systems under model uncertainty. A key challenge in designing controllers with stability guarantees for uncertain systems is the accurate determination of and adaptation to shifts in model parametric uncertainty during online deployment. We tackle this with a nov…
▽ More
This article presents novel methods for synthesizing distributionally robust stabilizing neural controllers and certificates for control systems under model uncertainty. A key challenge in designing controllers with stability guarantees for uncertain systems is the accurate determination of and adaptation to shifts in model parametric uncertainty during online deployment. We tackle this with a novel distributionally robust formulation of the Lyapunov derivative chance constraint ensuring a monotonic decrease of the Lyapunov certificate. To avoid the computational complexity involved in dealing with the space of probability measures, we identify a sufficient condition in the form of deterministic convex constraints that ensures the Lyapunov derivative constraint is satisfied. We integrate this condition into a loss function for training a neural network-based controller and show that, for the resulting closed-loop system, the global asymptotic stability of its equilibrium can be certified with high confidence, even with Out-of-Distribution (OoD) model uncertainties. To demonstrate the efficacy and efficiency of the proposed methodology, we compare it with an uncertainty-agnostic baseline approach and several reinforcement learning approaches in two control problems in simulation.
△ Less
Submitted 3 August, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Learning Generalizable Feature Fields for Mobile Manipulation
Authors:
Ri-Zhao Qiu,
Yafei Hu,
Yuchen Song,
Ge Yang,
Yang Fu,
Jianglong Ye,
Jiteng Mu,
Ruihan Yang,
Nikolay Atanasov,
Sebastian Scherer,
Xiaolong Wang
Abstract:
An open problem in mobile manipulation is how to represent objects and scenes in a unified manner so that robots can use both for navigation and manipulation. The latter requires capturing intricate geometry while understanding fine-grained semantics, whereas the former involves capturing the complexity inherent at an expansive physical scale. In this work, we present GeFF (Generalizable Feature F…
▽ More
An open problem in mobile manipulation is how to represent objects and scenes in a unified manner so that robots can use both for navigation and manipulation. The latter requires capturing intricate geometry while understanding fine-grained semantics, whereas the former involves capturing the complexity inherent at an expansive physical scale. In this work, we present GeFF (Generalizable Feature Fields), a scene-level generalizable neural feature field that acts as a unified representation for both navigation and manipulation that performs in real-time. To do so, we treat generative novel view synthesis as a pre-training task, and then align the resulting rich scene priors with natural language via CLIP feature distillation. We demonstrate the effectiveness of this approach by deploying GeFF on a quadrupedal robot equipped with a manipulator. We quantitatively evaluate GeFF's ability for open-vocabulary object-/part-level manipulation and show that GeFF outperforms point-based baselines in runtime and storage-accuracy trade-offs, with qualitative examples of semantics-aware navigation and articulated object manipulation.
△ Less
Submitted 26 November, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Distributed Optimization with Consensus Constraint for Multi-Robot Semantic Octree Mapping
Authors:
Arash Asgharivaskasi,
Nikolay Atanasov
Abstract:
This work develops a distributed optimization algorithm for multi-robot 3-D semantic mapping using streaming range and visual observations and single-hop communication. Our approach relies on gradient-based optimization of the observation log-likelihood of each robot subject to a map consensus constraint to build a common multi-class map of the environment. This formulation leads to closed-form up…
▽ More
This work develops a distributed optimization algorithm for multi-robot 3-D semantic mapping using streaming range and visual observations and single-hop communication. Our approach relies on gradient-based optimization of the observation log-likelihood of each robot subject to a map consensus constraint to build a common multi-class map of the environment. This formulation leads to closed-form updates which resemble Bayes rule with one-hop prior averaging. To reduce the amount of information exchanged among the robots, we utilize an octree data structure that compresses the multi-class map distribution using adaptive-resolution.
△ Less
Submitted 13 February, 2024;
originally announced February 2024.
-
Port-Hamiltonian Neural ODE Networks on Lie Groups For Robot Dynamics Learning and Control
Authors:
Thai Duong,
Abdullah Altawaitan,
Jason Stanley,
Nikolay Atanasov
Abstract:
Accurate models of robot dynamics are critical for safe and stable control and generalization to novel operational conditions. Hand-designed models, however, may be insufficiently accurate, even after careful parameter tuning. This motivates the use of machine learning techniques to approximate the robot dynamics over a training set of state-control trajectories. The dynamics of many robots are de…
▽ More
Accurate models of robot dynamics are critical for safe and stable control and generalization to novel operational conditions. Hand-designed models, however, may be insufficiently accurate, even after careful parameter tuning. This motivates the use of machine learning techniques to approximate the robot dynamics over a training set of state-control trajectories. The dynamics of many robots are described in terms of their generalized coordinates on a matrix Lie group, e.g. on $SE(3)$ for ground, aerial, and underwater vehicles, and generalized velocity, and satisfy conservation of energy principles. This paper proposes a port-Hamiltonian formulation over a Lie group of the structure of a neural ordinary differential equation (ODE) network to approximate the robot dynamics. In contrast to a black-box ODE network, our formulation embeds energy conservation principle and Lie group's constraints in the dynamics model and explicitly accounts for energy-dissipation effect such as friction and drag forces in the dynamics model. We develop energy shaping and damping injection control for the learned, potentially under-actuated Hamiltonian dynamics to enable a unified approach for stabilization and trajectory tracking with various robot platforms.
△ Less
Submitted 11 June, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Physics-Informed Multi-Agent Reinforcement Learning for Distributed Multi-Robot Problems
Authors:
Eduardo Sebastian,
Thai Duong,
Nikolay Atanasov,
Eduardo Montijano,
Carlos Sagues
Abstract:
The networked nature of multi-robot systems presents challenges in the context of multi-agent reinforcement learning. Centralized control policies do not scale with increasing numbers of robots, whereas independent control policies do not exploit the information provided by other robots, exhibiting poor performance in cooperative-competitive tasks. In this work we propose a physics-informed reinfo…
▽ More
The networked nature of multi-robot systems presents challenges in the context of multi-agent reinforcement learning. Centralized control policies do not scale with increasing numbers of robots, whereas independent control policies do not exploit the information provided by other robots, exhibiting poor performance in cooperative-competitive tasks. In this work we propose a physics-informed reinforcement learning approach able to learn distributed multi-robot control policies that are both scalable and make use of all the available information to each robot. Our approach has three key characteristics. First, it imposes a port-Hamiltonian structure on the policy representation, respecting energy conservation properties of physical robot systems and the networked nature of robot team interactions. Second, it uses self-attention to ensure a sparse policy representation able to handle time-varying information at each robot from the interaction graph. Third, we present a soft actor-critic reinforcement learning algorithm parameterized by our self-attention port-Hamiltonian control policy, which accounts for the correlation among robots during training while overcoming the need of value function factorization. Extensive simulations in different multi-robot scenarios demonstrate the success of the proposed approach, surpassing previous multi-robot reinforcement learning solutions in scalability, while achieving similar or superior performance (with averaged cumulative reward up to x2 greater than the state-of-the-art with robot teams x6 larger than the number of robots at training time). We also validate our approach on multiple real robots in the Georgia Tech Robotarium under imperfect communication, demonstrating zero-shot sim-to-real transfer and scalability across number of robots.
△ Less
Submitted 22 June, 2025; v1 submitted 30 December, 2023;
originally announced January 2024.
-
Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities
Authors:
Parth Paritosh,
Nikolay Atanasov,
Sonia Martinez
Abstract:
In this paper, we aim to design and analyze distributed Bayesian estimation algorithms for sensor networks. The challenges we address are to (i) derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual…
▽ More
In this paper, we aim to design and analyze distributed Bayesian estimation algorithms for sensor networks. The challenges we address are to (i) derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents. This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest. We present Bayesian density estimation algorithms using data from non-linear likelihoods at agents in centralized, distributed, and marginal distributed settings. After setting up a distributed estimation objective, we prove almost-sure convergence to the optimal set of pdfs at each agent. Then, we prove the same for a storage-aware algorithm estimating densities only over relevant variables at each agent. Finally, we present a Gaussian version of these algorithms and implement it in a mapping problem using variational inference to handle non-linear likelihood models associated with LiDAR sensing.
△ Less
Submitted 23 March, 2025; v1 submitted 2 December, 2023;
originally announced December 2023.
-
EAST: Environment Aware Safe Tracking using Planning and Control Co-Design
Authors:
Zhichao Li,
Yinzhuang Yi,
Zhuolin Niu,
Nikolay Atanasov
Abstract:
This paper considers the problem of autonomous mobile robot navigation in unknown environments with moving obstacles. We propose a new method to achieve environment-aware safe tracking (EAST) of robot motion plans that integrates an obstacle clearance cost for path planning, a convex reachable set for robot motion prediction, and safety constraints for dynamic obstacle avoidance. EAST adapts the m…
▽ More
This paper considers the problem of autonomous mobile robot navigation in unknown environments with moving obstacles. We propose a new method to achieve environment-aware safe tracking (EAST) of robot motion plans that integrates an obstacle clearance cost for path planning, a convex reachable set for robot motion prediction, and safety constraints for dynamic obstacle avoidance. EAST adapts the motion of the robot according to the locally sensed environment geometry and dynamics, leading to fast motion in wide open areas and cautious behavior in narrow passages or near moving obstacles. Our control design uses a reference governor, a virtual dynamical system that guides the robot's motion and decouples the path tracking and safety objectives. While reference governor methods have been used for safe tracking control in static environments, our key contribution is an extension to dynamic environments using convex optimization with control barrier function (CBF) constraints. Thus, our work establishes a connection between reference governor techniques and CBF techniques for safe control in dynamic environments. We validate our approach in simulated and real-world environments, featuring complex obstacle configurations and natural dynamic obstacle motion.
△ Less
Submitted 12 June, 2025; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Safe Stabilizing Control for Polygonal Robots in Dynamic Elliptical Environments
Authors:
Kehan Long,
Khoa Tran,
Melvin Leok,
Nikolay Atanasov
Abstract:
This paper addresses the challenge of safe navigation for rigid-body mobile robots in dynamic environments. We introduce an analytic approach to compute the distance between a polygon and an ellipse, and employ it to construct a control barrier function (CBF) for safe control synthesis. Existing CBF design methods for mobile robot obstacle avoidance usually assume point or circular robots, prevent…
▽ More
This paper addresses the challenge of safe navigation for rigid-body mobile robots in dynamic environments. We introduce an analytic approach to compute the distance between a polygon and an ellipse, and employ it to construct a control barrier function (CBF) for safe control synthesis. Existing CBF design methods for mobile robot obstacle avoidance usually assume point or circular robots, preventing their applicability to more realistic robot body geometries. Our work enables CBF designs that capture complex robot and obstacle shapes. We demonstrate the effectiveness of our approach in simulations highlighting real-time obstacle avoidance in constrained and dynamic environments for both mobile robots and multi-joint robot arms.
△ Less
Submitted 30 April, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
Optimal Scene Graph Planning with Large Language Model Guidance
Authors:
Zhirui Dai,
Arash Asgharivaskasi,
Thai Duong,
Shusen Lin,
Maria-Elizabeth Tzes,
George Pappas,
Nikolay Atanasov
Abstract:
Recent advances in metric, semantic, and topological mapping have equipped autonomous robots with semantic concept grounding capabilities to interpret natural language tasks. This work aims to leverage these new capabilities with an efficient task planning algorithm for hierarchical metric-semantic models. We consider a scene graph representation of the environment and utilize a large language mod…
▽ More
Recent advances in metric, semantic, and topological mapping have equipped autonomous robots with semantic concept grounding capabilities to interpret natural language tasks. This work aims to leverage these new capabilities with an efficient task planning algorithm for hierarchical metric-semantic models. We consider a scene graph representation of the environment and utilize a large language model (LLM) to convert a natural language task into a linear temporal logic (LTL) automaton. Our main contribution is to enable optimal hierarchical LTL planning with LLM guidance over scene graphs. To achieve efficiency, we construct a hierarchical planning domain that captures the attributes and connectivity of the scene graph and the task automaton, and provide semantic guidance via an LLM heuristic function. To guarantee optimality, we design an LTL heuristic function that is provably consistent and supplements the potentially inadmissible LLM guidance in multi-heuristic planning. We demonstrate efficient planning of complex natural language tasks in scene graphs of virtualized real environments.
△ Less
Submitted 10 January, 2024; v1 submitted 17 September, 2023;
originally announced September 2023.
-
Hamiltonian Dynamics Learning from Point Cloud Observations for Nonholonomic Mobile Robot Control
Authors:
Abdullah Altawaitan,
Jason Stanley,
Sambaran Ghosal,
Thai Duong,
Nikolay Atanasov
Abstract:
Reliable autonomous navigation requires adapting the control policy of a mobile robot in response to dynamics changes in different operational conditions. Hand-designed dynamics models may struggle to capture model variations due to a limited set of parameters. Data-driven dynamics learning approaches offer higher model capacity and better generalization but require large amounts of state-labeled…
▽ More
Reliable autonomous navigation requires adapting the control policy of a mobile robot in response to dynamics changes in different operational conditions. Hand-designed dynamics models may struggle to capture model variations due to a limited set of parameters. Data-driven dynamics learning approaches offer higher model capacity and better generalization but require large amounts of state-labeled data. This paper develops an approach for learning robot dynamics directly from point-cloud observations, removing the need and associated errors of state estimation, while embedding Hamiltonian structure in the dynamics model to improve data efficiency. We design an observation-space loss that relates motion prediction from the dynamics model with motion prediction from point-cloud registration to train a Hamiltonian neural ordinary differential equation. The learned Hamiltonian model enables the design of an energy-shaping model-based tracking controller for rigid-body robots. We demonstrate dynamics learning and tracking control on a real nonholonomic wheeled robot.
△ Less
Submitted 12 March, 2024; v1 submitted 17 September, 2023;
originally announced September 2023.
-
Dynamic Handover: Throw and Catch with Bimanual Hands
Authors:
Binghao Huang,
Yuanpei Chen,
Tianyu Wang,
Yuzhe Qin,
Yaodong Yang,
Nikolay Atanasov,
Xiaolong Wang
Abstract:
Humans throw and catch objects all the time. However, such a seemingly common skill introduces a lot of challenges for robots to achieve: The robots need to operate such dynamic actions at high-speed, collaborate precisely, and interact with diverse objects. In this paper, we design a system with two multi-finger hands attached to robot arms to solve this problem. We train our system using Multi-A…
▽ More
Humans throw and catch objects all the time. However, such a seemingly common skill introduces a lot of challenges for robots to achieve: The robots need to operate such dynamic actions at high-speed, collaborate precisely, and interact with diverse objects. In this paper, we design a system with two multi-finger hands attached to robot arms to solve this problem. We train our system using Multi-Agent Reinforcement Learning in simulation and perform Sim2Real transfer to deploy on the real robots. To overcome the Sim2Real gap, we provide multiple novel algorithm designs including learning a trajectory prediction model for the object. Such a model can help the robot catcher has a real-time estimation of where the object will be heading, and then react accordingly. We conduct our experiments with multiple objects in the real-world system, and show significant improvements over multiple baselines. Our project page is available at \url{https://binghao-huang.github.io/dynamic_handover/}.
△ Less
Submitted 11 September, 2023;
originally announced September 2023.
-
Distributed Variational Inference for Online Supervised Learning
Authors:
Parth Paritosh,
Nikolay Atanasov,
Sonia Martinez
Abstract:
Developing efficient solutions for inference problems in intelligent sensor networks is crucial for the next generation of location, tracking, and mapping services. This paper develops a scalable distributed probabilistic inference algorithm that applies to continuous variables, intractable posteriors and large-scale real-time data in sensor networks. In a centralized setting, variational inferenc…
▽ More
Developing efficient solutions for inference problems in intelligent sensor networks is crucial for the next generation of location, tracking, and mapping services. This paper develops a scalable distributed probabilistic inference algorithm that applies to continuous variables, intractable posteriors and large-scale real-time data in sensor networks. In a centralized setting, variational inference is a fundamental technique for performing approximate Bayesian estimation, in which an intractable posterior density is approximated with a parametric density. Our key contribution lies in the derivation of a separable lower bound on the centralized estimation objective, which enables distributed variational inference with one-hop communication in a sensor network. Our distributed evidence lower bound (DELBO) consists of a weighted sum of observation likelihood and divergence to prior densities, and its gap to the measurement evidence is due to consensus and modeling errors. To solve binary classification and regression problems while handling streaming data, we design an online distributed algorithm that maximizes DELBO, and specialize it to Gaussian variational densities with non-linear likelihoods. The resulting distributed Gaussian variational inference (DGVI) efficiently inverts a $1$-rank correction to the covariance matrix. Finally, we derive a diagonalized version for online distributed inference in high-dimensional models, and apply it to multi-robot probabilistic mapping using indoor LiDAR data.
△ Less
Submitted 22 October, 2023; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Learning to Identify Graphs from Node Trajectories in Multi-Robot Networks
Authors:
Eduardo Sebastian,
Thai Duong,
Nikolay Atanasov,
Eduardo Montijano,
Carlos Sagues
Abstract:
The graph identification problem consists of discovering the interactions among nodes in a network given their state/feature trajectories. This problem is challenging because the behavior of a node is coupled to all the other nodes by the unknown interaction model. Besides, high-dimensional and nonlinear state trajectories make it difficult to identify if two nodes are connected. Current solutions…
▽ More
The graph identification problem consists of discovering the interactions among nodes in a network given their state/feature trajectories. This problem is challenging because the behavior of a node is coupled to all the other nodes by the unknown interaction model. Besides, high-dimensional and nonlinear state trajectories make it difficult to identify if two nodes are connected. Current solutions rely on prior knowledge of the graph topology and the dynamic behavior of the nodes, and hence, have poor generalization to other network configurations. To address these issues, we propose a novel learning-based approach that combines (i) a strongly convex program that efficiently uncovers graph topologies with global convergence guarantees and (ii) a self-attention encoder that learns to embed the original state trajectories into a feature space and predicts appropriate regularizers for the optimization program. In contrast to other works, our approach can identify the graph topology of unseen networks with new configurations in terms of number of nodes, connectivity or state trajectories. We demonstrate the effectiveness of our approach in identifying graphs in multi-robot formation and flocking tasks.
△ Less
Submitted 21 October, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
Distributionally Robust Lyapunov Function Search Under Uncertainty
Authors:
Kehan Long,
Yinzhuang Yi,
Jorge Cortes,
Nikolay Atanasov
Abstract:
This paper develops methods for proving Lyapunov stability of dynamical systems subject to disturbances with an unknown distribution. We assume only a finite set of disturbance samples is available and that the true online disturbance realization may be drawn from a different distribution than the given samples. We formulate an optimization problem to search for a sum-of-squares (SOS) Lyapunov fun…
▽ More
This paper develops methods for proving Lyapunov stability of dynamical systems subject to disturbances with an unknown distribution. We assume only a finite set of disturbance samples is available and that the true online disturbance realization may be drawn from a different distribution than the given samples. We formulate an optimization problem to search for a sum-of-squares (SOS) Lyapunov function and introduce a distributionally robust version of the Lyapunov function derivative constraint. We show that this constraint may be reformulated as several SOS constraints, ensuring that the search for a Lyapunov function remains in the class of SOS polynomial optimization problems. For general systems, we provide a distributionally robust chance-constrained formulation for neural network Lyapunov function search. Simulations demonstrate the validity and efficiency of either formulation on non-linear uncertain dynamical systems.
△ Less
Submitted 11 July, 2024; v1 submitted 3 December, 2022;
originally announced December 2022.
-
Policy Learning for Active Target Tracking over Continuous SE(3) Trajectories
Authors:
Pengzhi Yang,
Shumon Koga,
Arash Asgharivaskasi,
Nikolay Atanasov
Abstract:
This paper proposes a novel model-based policy gradient algorithm for tracking dynamic targets using a mobile robot, equipped with an onboard sensor with limited field of view. The task is to obtain a continuous control policy for the mobile robot to collect sensor measurements that reduce uncertainty in the target states, measured by the target distribution entropy. We design a neural network con…
▽ More
This paper proposes a novel model-based policy gradient algorithm for tracking dynamic targets using a mobile robot, equipped with an onboard sensor with limited field of view. The task is to obtain a continuous control policy for the mobile robot to collect sensor measurements that reduce uncertainty in the target states, measured by the target distribution entropy. We design a neural network control policy with the robot $SE(3)$ pose and the mean vector and information matrix of the joint target distribution as inputs and attention layers to handle variable numbers of targets. We also derive the gradient of the target entropy with respect to the network parameters explicitly, allowing efficient model-based policy gradient optimization.
△ Less
Submitted 16 May, 2023; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Lie Group Forced Variational Integrator Networks for Learning and Control of Robot Systems
Authors:
Valentin Duruisseaux,
Thai Duong,
Melvin Leok,
Nikolay Atanasov
Abstract:
Incorporating prior knowledge of physics laws and structural properties of dynamical systems into the design of deep learning architectures has proven to be a powerful technique for improving their computational efficiency and generalization capacity. Learning accurate models of robot dynamics is critical for safe and stable control. Autonomous mobile robots, including wheeled, aerial, and underwa…
▽ More
Incorporating prior knowledge of physics laws and structural properties of dynamical systems into the design of deep learning architectures has proven to be a powerful technique for improving their computational efficiency and generalization capacity. Learning accurate models of robot dynamics is critical for safe and stable control. Autonomous mobile robots, including wheeled, aerial, and underwater vehicles, can be modeled as controlled Lagrangian or Hamiltonian rigid-body systems evolving on matrix Lie groups. In this paper, we introduce a new structure-preserving deep learning architecture, the Lie group Forced Variational Integrator Network (LieFVIN), capable of learning controlled Lagrangian or Hamiltonian dynamics on Lie groups, either from position-velocity or position-only data. By design, LieFVINs preserve both the Lie group structure on which the dynamics evolve and the symplectic structure underlying the Hamiltonian or Lagrangian systems of interest. The proposed architecture learns surrogate discrete-time flow maps allowing accurate and fast prediction without numerical-integrator, neural-ODE, or adjoint techniques, which are needed for vector fields. Furthermore, the learnt discrete-time dynamics can be utilized with computationally scalable discrete-time (optimal) control strategies.
△ Less
Submitted 15 May, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Safe and Stable Control Synthesis for Uncertain System Models via Distributionally Robust Optimization
Authors:
Kehan Long,
Yinzhuang Yi,
Jorge Cortes,
Nikolay Atanasov
Abstract:
This paper considers enforcing safety and stability of dynamical systems in the presence of model uncertainty. Safety and stability constraints may be specified using a control barrier function (CBF) and a control Lyapunov function (CLF), respectively. To take model uncertainty into account, robust and chance formulations of the constraints are commonly considered. However, this requires known err…
▽ More
This paper considers enforcing safety and stability of dynamical systems in the presence of model uncertainty. Safety and stability constraints may be specified using a control barrier function (CBF) and a control Lyapunov function (CLF), respectively. To take model uncertainty into account, robust and chance formulations of the constraints are commonly considered. However, this requires known error bounds or a known distribution for the model uncertainty, and the resulting formulations may suffer from over-conservatism or over-confidence. In this paper, we assume that only a finite set of model parametric uncertainty samples is available and formulate a distributionally robust chance-constrained program (DRCCP) for control synthesis with CBF safety and CLF stability guarantees. To facilitate efficient computation of control inputs during online execution, we present a reformulation of the DRCCP as a second-order cone program (SOCP). Our formulation is evaluated in an adaptive cruise control example in comparison to 1) a baseline CLF-CBF quadratic programming approach, 2) a robust approach that assumes known error bounds of the system uncertainty, and 3) a chance-constrained approach that assumes a known Gaussian Process distribution of the uncertainty.
△ Less
Submitted 16 March, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
Learning Continuous Control Policies for Information-Theoretic Active Perception
Authors:
Pengzhi Yang,
Yuhan Liu,
Shumon Koga,
Arash Asgharivaskasi,
Nikolay Atanasov
Abstract:
This paper proposes a method for learning continuous control policies for active landmark localization and exploration using an information-theoretic cost. We consider a mobile robot detecting landmarks within a limited sensing range, and tackle the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations. We employ a Kalman…
▽ More
This paper proposes a method for learning continuous control policies for active landmark localization and exploration using an information-theoretic cost. We consider a mobile robot detecting landmarks within a limited sensing range, and tackle the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations. We employ a Kalman filter to convert the partially observable problem in the landmark state to Markov decision process (MDP), a differentiable field of view to shape the reward, and an attention-based neural network to represent the control policy. The approach is further unified with active volumetric mapping to promote exploration in addition to landmark localization. The performance is demonstrated in several simulated landmark localization tasks in comparison with benchmark methods.
△ Less
Submitted 16 May, 2023; v1 submitted 26 September, 2022;
originally announced September 2022.
-
Information-theoretic Abstraction of Semantic Octree Models for Integrated Perception and Planning
Authors:
Daniel T. Larsson,
Arash Asgharivaskasi,
Jaein Lim,
Nikolay Atanasov,
Panagiotis Tsiotras
Abstract:
In this paper, we develop an approach that enables autonomous robots to build and compress semantic environment representations from point-cloud data. Our approach builds a three-dimensional, semantic tree representation of the environment from sensor data which is then compressed by a novel information-theoretic tree-pruning approach. The proposed approach is probabilistic and incorporates the un…
▽ More
In this paper, we develop an approach that enables autonomous robots to build and compress semantic environment representations from point-cloud data. Our approach builds a three-dimensional, semantic tree representation of the environment from sensor data which is then compressed by a novel information-theoretic tree-pruning approach. The proposed approach is probabilistic and incorporates the uncertainty in semantic classification inherent in real-world environments. Moreover, our approach allows robots to prioritize individual semantic classes when generating the compressed trees, so as to design multi-resolution representations that retain the relevant semantic information while simultaneously discarding unwanted semantic categories. We demonstrate the approach by compressing semantic octree models of a large outdoor, semantically rich, real-world environment. In addition, we show how the octree abstractions can be used to create semantically-informed graphs for motion planning, and provide a comparison of our approach with uninformed graph construction methods such as Halton sequences.
△ Less
Submitted 20 September, 2022;
originally announced September 2022.
-
LEMURS: Learning Distributed Multi-Robot Interactions
Authors:
Eduardo Sebastian,
Thai Duong,
Nikolay Atanasov,
Eduardo Montijano,
Carlos Sagues
Abstract:
This paper presents LEMURS, an algorithm for learning scalable multi-robot control policies from cooperative task demonstrations. We propose a port-Hamiltonian description of the multi-robot system to exploit universal physical constraints in interconnected systems and achieve closed-loop stability. We represent a multi-robot control policy using an architecture that combines self-attention mechan…
▽ More
This paper presents LEMURS, an algorithm for learning scalable multi-robot control policies from cooperative task demonstrations. We propose a port-Hamiltonian description of the multi-robot system to exploit universal physical constraints in interconnected systems and achieve closed-loop stability. We represent a multi-robot control policy using an architecture that combines self-attention mechanisms and neural ordinary differential equations. The former handles time-varying communication in the robot team, while the latter respects the continuous-time robot dynamics. Our representation is distributed by construction, enabling the learned control policies to be deployed in robot teams of different sizes. We demonstrate that LEMURS can learn interactions and cooperative behaviors from demonstrations of multi-agent navigation and flocking tasks.
△ Less
Submitted 21 February, 2023; v1 submitted 20 September, 2022;
originally announced September 2022.
-
Energy-Aware, Collision-Free Information Gathering for Heterogeneous Robot Teams
Authors:
Xiaoyi Cai,
Brent Schlotfeldt,
Kasra Khosoussi,
Nikolay Atanasov,
George J. Pappas,
Jonathan P. How
Abstract:
This paper considers the problem of safely coordinating a team of sensor-equipped robots to reduce uncertainty about a dynamical process, where the objective trades off information gain and energy cost. Optimizing this trade-off is desirable, but leads to a non-monotone objective function in the set of robot trajectories. Therefore, common multi-robot planners based on coordinate descent lose thei…
▽ More
This paper considers the problem of safely coordinating a team of sensor-equipped robots to reduce uncertainty about a dynamical process, where the objective trades off information gain and energy cost. Optimizing this trade-off is desirable, but leads to a non-monotone objective function in the set of robot trajectories. Therefore, common multi-robot planners based on coordinate descent lose their performance guarantees. Furthermore, methods that handle non-monotonicity lose their performance guarantees when subject to inter-robot collision avoidance constraints. As it is desirable to retain both the performance guarantee and safety guarantee, this work proposes a hierarchical approach with a distributed planner that uses local search with a worst-case performance guarantees and a decentralized controller based on control barrier functions that ensures safety and encourages timely arrival at sensing locations. Via extensive simulations, hardware-in-the-loop tests and hardware experiments, we demonstrate that the proposed approach achieves a better trade-off between sensing and energy cost than coordinate-descent-based algorithms.
△ Less
Submitted 9 March, 2023; v1 submitted 30 July, 2022;
originally announced August 2022.
-
Robust and Safe Autonomous Navigation for Systems with Learned SE(3) Hamiltonian Dynamics
Authors:
Zhichao Li,
Thai Duong,
Nikolay Atanasov
Abstract:
Stability and safety are critical properties for successful deployment of automatic control systems. As a motivating example, consider autonomous mobile robot navigation in a complex environment. A control design that generalizes to different operational conditions requires a model of the system dynamics, robustness to modeling errors, and satisfaction of safety \NEWZL{constraints}, such as collis…
▽ More
Stability and safety are critical properties for successful deployment of automatic control systems. As a motivating example, consider autonomous mobile robot navigation in a complex environment. A control design that generalizes to different operational conditions requires a model of the system dynamics, robustness to modeling errors, and satisfaction of safety \NEWZL{constraints}, such as collision avoidance. This paper develops a neural ordinary differential equation network to learn the dynamics of a Hamiltonian system from trajectory data. The learned Hamiltonian model is used to synthesize an energy-shaping passivity-based controller and analyze its \emph{robustness} to uncertainty in the learned model and its \emph{safety} with respect to constraints imposed by the environment. Given a desired reference path for the system, we extend our design using a virtual reference governor to achieve tracking control. The governor state serves as a regulation point that moves along the reference path adaptively, balancing the system energy level, model uncertainty bounds, and distance to safety violation to guarantee robustness and safety. Our Hamiltonian dynamics learning and tracking control techniques are demonstrated on \Revised{simulated hexarotor and quadrotor robots} navigating in cluttered 3D environments.
△ Less
Submitted 21 July, 2022;
originally announced July 2022.
-
A Survey on Active Simultaneous Localization and Mapping: State of the Art and New Frontiers
Authors:
Julio A. Placed,
Jared Strader,
Henry Carrillo,
Nikolay Atanasov,
Vadim Indelman,
Luca Carlone,
José A. Castellanos
Abstract:
Active Simultaneous Localization and Mapping (SLAM) is the problem of planning and controlling the motion of a robot to build the most accurate and complete model of the surrounding environment. Since the first foundational work in active perception appeared, more than three decades ago, this field has received increasing attention across different scientific communities. This has brought about ma…
▽ More
Active Simultaneous Localization and Mapping (SLAM) is the problem of planning and controlling the motion of a robot to build the most accurate and complete model of the surrounding environment. Since the first foundational work in active perception appeared, more than three decades ago, this field has received increasing attention across different scientific communities. This has brought about many different approaches and formulations, and makes a review of the current trends necessary and extremely valuable for both new and experienced researchers. In this work, we survey the state-of-the-art in active SLAM and take an in-depth look at the open challenges that still require attention to meet the needs of modern applications. After providing a historical perspective, we present a unified problem formulation and review the well-established modular solution scheme, which decouples the problem into three stages that identify, select, and execute potential navigation actions. We then analyze alternative approaches, including belief-space planning and deep reinforcement learning techniques, and review related work on multi-robot coordination. The manuscript concludes with a discussion of new research directions, addressing reproducible research, active spatial perception, and practical applications, among other topics.
△ Less
Submitted 13 February, 2023; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Latent Policies for Adversarial Imitation Learning
Authors:
Tianyu Wang,
Nikhil Karnwal,
Nikolay Atanasov
Abstract:
This paper considers learning robot locomotion and manipulation tasks from expert demonstrations. Generative adversarial imitation learning (GAIL) trains a discriminator that distinguishes expert from agent transitions, and in turn use a reward defined by the discriminator output to optimize a policy generator for the agent. This generative adversarial training approach is very powerful but depend…
▽ More
This paper considers learning robot locomotion and manipulation tasks from expert demonstrations. Generative adversarial imitation learning (GAIL) trains a discriminator that distinguishes expert from agent transitions, and in turn use a reward defined by the discriminator output to optimize a policy generator for the agent. This generative adversarial training approach is very powerful but depends on a delicate balance between the discriminator and the generator training. In high-dimensional problems, the discriminator training may easily overfit or exploit associations with task-irrelevant features for transition classification. A key insight of this work is that performing imitation learning in a suitable latent task space makes the training process stable, even in challenging high-dimensional problems. We use an action encoder-decoder model to obtain a low-dimensional latent action space and train a LAtent Policy using Adversarial imitation Learning (LAPAL). The encoder-decoder model can be trained offline from state-action pairs to obtain a task-agnostic latent action representation or online, simultaneously with the discriminator and generator training, to obtain a task-aware latent action representation. We demonstrate that LAPAL training is stable, with near-monotonic performance improvement, and achieves expert performance in most locomotion and manipulation tasks, while a GAIL baseline converges slower and does not achieve expert performance in high-dimensional environments.
△ Less
Submitted 22 June, 2022;
originally announced June 2022.
-
TerrainMesh: Metric-Semantic Terrain Reconstruction from Aerial Images Using Joint 2D-3D Learning
Authors:
Qiaojun Feng,
Nikolay Atanasov
Abstract:
This paper considers outdoor terrain mapping using RGB images obtained from an aerial vehicle. While feature-based localization and mapping techniques deliver real-time vehicle odometry and sparse keypoint depth reconstruction, a dense model of the environment geometry and semantics (vegetation, buildings, etc.) is usually recovered offline with significant computation and storage. This paper deve…
▽ More
This paper considers outdoor terrain mapping using RGB images obtained from an aerial vehicle. While feature-based localization and mapping techniques deliver real-time vehicle odometry and sparse keypoint depth reconstruction, a dense model of the environment geometry and semantics (vegetation, buildings, etc.) is usually recovered offline with significant computation and storage. This paper develops a joint 2D-3D learning approach to reconstruct a local metric-semantic mesh at each camera keyframe maintained by a visual odometry algorithm. Given the estimated camera trajectory, the local meshes can be assembled into a global environment model to capture the terrain topology and semantics during online operation. A local mesh is reconstructed using an initialization and refinement stage. In the initialization stage, we estimate the mesh vertex elevation by solving a least squares problem relating the vertex barycentric coordinates to the sparse keypoint depth measurements. In the refinement stage, we associate 2D image and semantic features with the 3D mesh vertices using camera projection and apply graph convolution to refine the mesh vertex spatial coordinates and semantic features based on joint 2D and 3D supervision. Quantitative and qualitative evaluation using real aerial images show the potential of our method to support environmental monitoring and surveillance applications.
△ Less
Submitted 16 January, 2024; v1 submitted 23 April, 2022;
originally announced April 2022.
-
Active Mapping via Gradient Ascent Optimization of Shannon Mutual Information over Continuous SE(3) Trajectories
Authors:
Arash Asgharivaskasi,
Shumon Koga,
Nikolay Atanasov
Abstract:
The problem of active mapping aims to plan an informative sequence of sensing views given a limited budget such as distance traveled. This paper consider active occupancy grid mapping using a range sensor, such as LiDAR or depth camera. State-of-the-art methods optimize information-theoretic measures relating the occupancy grid probabilities with the range sensor measurements. The non-smooth natur…
▽ More
The problem of active mapping aims to plan an informative sequence of sensing views given a limited budget such as distance traveled. This paper consider active occupancy grid mapping using a range sensor, such as LiDAR or depth camera. State-of-the-art methods optimize information-theoretic measures relating the occupancy grid probabilities with the range sensor measurements. The non-smooth nature of ray-tracing within a grid representation makes the objective function non-differentiable, forcing existing methods to search over a discrete space of candidate trajectories. This work proposes a differentiable approximation of the Shannon mutual information between a grid map and ray-based observations that enables gradient ascent optimization in the continuous space of SE(3) sensor poses. Our gradient-based formulation leads to more informative sensing trajectories, while avoiding occlusions and collisions. The proposed method is demonstrated in simulated and real-world experiments in 2-D and 3-D environments.
△ Less
Submitted 15 April, 2022;
originally announced April 2022.
-
Safe Control Synthesis with Uncertain Dynamics and Constraints
Authors:
Kehan Long,
Vikas Dhiman,
Melvin Leok,
Jorge Cortés,
Nikolay Atanasov
Abstract:
This paper considers safe control synthesis for dynamical systems with either probabilistic or worst-case uncertainty in both the dynamics model and the safety constraints. We formulate novel probabilistic and robust (worst-case) control Lyapunov function (CLF) and control barrier function (CBF) constraints that take into account the effect of uncertainty in either case. We show that either the pr…
▽ More
This paper considers safe control synthesis for dynamical systems with either probabilistic or worst-case uncertainty in both the dynamics model and the safety constraints. We formulate novel probabilistic and robust (worst-case) control Lyapunov function (CLF) and control barrier function (CBF) constraints that take into account the effect of uncertainty in either case. We show that either the probabilistic or the robust (worst-case) formulation leads to a second-order cone program (SOCP), which enables efficient safe and stable control synthesis. We evaluate our approach in PyBullet simulations of an autonomous robot navigating in unknown environments and compare the performance with a baseline CLF-CBF quadratic programming approach.
△ Less
Submitted 30 September, 2022; v1 submitted 19 February, 2022;
originally announced February 2022.
-
Distributed Multi-Agent Reinforcement Learning with One-hop Neighbors and Compute Straggler Mitigation
Authors:
Baoqian Wang,
Junfei Xie,
Nikolay Atanasov
Abstract:
Most multi-agent reinforcement learning (MARL) methods are limited in the scale of problems they can handle. With increasing numbers of agents, the number of training iterations required to find the optimal behaviors increases exponentially due to the exponentially growing joint state and action spaces. This paper tackles this limitation by introducing a scalable MARL method called Distributed mul…
▽ More
Most multi-agent reinforcement learning (MARL) methods are limited in the scale of problems they can handle. With increasing numbers of agents, the number of training iterations required to find the optimal behaviors increases exponentially due to the exponentially growing joint state and action spaces. This paper tackles this limitation by introducing a scalable MARL method called Distributed multi-Agent Reinforcement Learning with One-hop Neighbors (DARL1N). DARL1N is an off-policy actor-critic method that addresses the curse of dimensionality by restricting information exchanges among the agents to one-hop neighbors when representing value and policy functions. Each agent optimizes its value and policy functions over a one-hop neighborhood, significantly reducing the learning complexity, yet maintaining expressiveness by training with varying neighbor numbers and states. This structure allows us to formulate a distributed learning framework to further speed up the training procedure. Distributed computing systems, however, contain straggler compute nodes, which are slow or unresponsive due to communication bottlenecks, software or hardware problems. To mitigate the detrimental straggler effect, we introduce a novel coded distributed learning architecture, which leverages coding theory to improve the resilience of the learning system to stragglers. Comprehensive experiments show that DARL1N significantly reduces training time without sacrificing policy quality and is scalable as the number of agents increases. Moreover, the coded distributed learning architecture improves training efficiency in the presence of stragglers.
△ Less
Submitted 30 December, 2024; v1 submitted 17 February, 2022;
originally announced February 2022.
-
Physics-guided Learning-based Adaptive Control on the SE(3) Manifold
Authors:
Thai Duong,
Nikolay Atanasov
Abstract:
In real-world robotics applications, accurate models of robot dynamics are critical for safe and stable control in rapidly changing operational conditions. This motivates the use of machine learning techniques to approximate robot dynamics and their disturbances over a training set of state-control trajectories. This paper demonstrates that inductive biases arising from physics laws can be used to…
▽ More
In real-world robotics applications, accurate models of robot dynamics are critical for safe and stable control in rapidly changing operational conditions. This motivates the use of machine learning techniques to approximate robot dynamics and their disturbances over a training set of state-control trajectories. This paper demonstrates that inductive biases arising from physics laws can be used to improve the data efficiency and accuracy of the approximated dynamics model. For example, the dynamics of many robots, including ground, aerial, and underwater vehicles, are described using their $SE(3)$ pose and satisfy conservation of energy principles. We design a physically plausible model of the robot dynamics by imposing the structure of Hamilton's equations of motion in the design of a neural ordinary differential equation (ODE) network. The Hamiltonian structure guarantees satisfaction of $SE(3)$ kinematic constraints and energy conservation by construction. It also allows us to derive an energy-based adaptive controller that achieves trajectory tracking while compensating for disturbances. Our learning-based adaptive controller is verified on an under-actuated quadrotor robot.
△ Less
Submitted 12 January, 2022;
originally announced January 2022.