-
Parallel and Proximal Constrained Linear-Quadratic Methods for Real-Time Nonlinear MPC
Authors:
Wilson Jallet,
Ewen Dantec,
Etienne Arlaud,
Justin Carpentier,
Nicolas Mansard
Abstract:
Recent strides in nonlinear model predictive control (NMPC) underscore a dependence on numerical advancements to efficiently and accurately solve large-scale problems. Given the substantial number of variables characterizing typical whole-body optimal control (OC) problems, often numbering in the thousands, exploiting the sparse structure of the numerical problem becomes crucial to meet computational demands, typically in the range of a few milliseconds. Solving the linear-quadratic regulator (LQR) problem is a fundamental building block for computing Newton or Sequential Quadratic Programming (SQP) steps in direct optimal control methods. This paper concentrates on equality-constrained problems featuring implicit system dynamics and dual regularization, a characteristic of advanced interior-point or augmented Lagrangian solvers. Here, we introduce a parallel algorithm for solving an LQR problem with dual regularization. Leveraging a rewriting of the LQR recursion through block elimination, we first enhance the efficiency of the serial algorithm and then generalize it to handle parametric problems. This extension enables us to split decision variables and solve multiple subproblems concurrently. Our algorithm is implemented in our nonlinear numerical optimal control library ALIGATOR. It showcases improved performance over previous serial formulations, and we validate its efficacy by deploying it in the model predictive control of a real quadruped robot.
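For readers unfamiliar with the LQR recursion the abstract builds on, the following is a minimal sketch of the classical serial backward Riccati pass for an unconstrained, time-invariant problem. It is not the parallel, dually regularized algorithm of the paper and does not use the ALIGATOR API; the function name `lqr_backward` and the double-integrator test system are assumptions chosen for illustration.

```python
import numpy as np

def lqr_backward(A, B, Q, R, Qf, horizon):
    """Classical serial Riccati recursion (hypothetical helper, not ALIGATOR).

    Returns the feedback gains K_0..K_{N-1} for u_t = -K_t x_t and the
    cost-to-go matrix at the initial time.
    """
    P = Qf.copy()
    gains = []
    for _ in range(horizon):
        S = R + B.T @ P @ B                  # control Hessian at this stage
        K = np.linalg.solve(S, B.T @ P @ A)  # feedback gain
        P = Q + A.T @ P @ (A - B @ K)        # cost-to-go update
        gains.append(K)
    gains.reverse()                          # gains were built from the last stage back
    return gains, P

# Assumed example system: a discretized double integrator.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = np.array([[0.1]])
Qf = 10.0 * np.eye(2)

gains, P0 = lqr_backward(A, B, Q, R, Qf, horizon=50)

# Forward rollout under the computed time-varying feedback.
x = np.array([1.0, 0.0])
for K in gains:
    x = (A - B @ K) @ x
final_norm = np.linalg.norm(x)
```

Each stage of this pass depends on the cost-to-go of the next stage, which is what makes the recursion inherently serial and motivates splitting the horizon into subproblems that can be solved concurrently.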
Submitted 3 June, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
CACTO-SL: Using Sobolev Learning to improve Continuous Actor-Critic with Trajectory Optimization
Authors:
Elisa Alboni,
Gianluigi Grandesso,
Gastone Pietro Rosati Papini,
Justin Carpentier,
Andrea Del Prete
Abstract:
Trajectory Optimization (TO) and Reinforcement Learning (RL) are powerful and complementary tools to solve optimal control problems. On the one hand, TO can efficiently compute locally optimal solutions, but it tends to get stuck in local minima if the problem is not convex. On the other hand, RL is typically less sensitive to non-convexity, but it requires a much higher computational effort. Recently, we have proposed CACTO (Continuous Actor-Critic with Trajectory Optimization), an algorithm that uses TO to guide the exploration of an actor-critic RL algorithm. In turn, the policy encoded by the actor is used to warm-start TO, closing the loop between TO and RL. In this work, we present an extension of CACTO exploiting the idea of Sobolev learning. To make the training of the critic network faster and more data efficient, we enrich it with the gradient of the value function, computed via a backward pass of the differential dynamic programming algorithm. Our results show that the new algorithm is more efficient than the original CACTO, reducing the number of TO episodes by a factor ranging from 3 to 10, and consequently the computation time. Moreover, we show that CACTO-SL helps TO to find better minima and to produce more consistent results.
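The idea of Sobolev learning, fitting a function to both its values and its gradients, can be illustrated with a small sketch. This is not the CACTO-SL implementation: the critic there is a neural network and the gradients come from a DDP backward pass, whereas here an assumed quadratic value model is fitted by least squares, and all names (`features`, `feature_grads`, `true_w`) are hypothetical.

```python
import numpy as np

def features(x):
    # Model V(x) = w0 + w1*x + w2*x^2, one row of basis values per sample.
    return np.stack([np.ones_like(x), x, x**2], axis=1)

def feature_grads(x):
    # Derivative of each basis function with respect to x.
    return np.stack([np.zeros_like(x), np.ones_like(x), 2.0 * x], axis=1)

# Assumed ground-truth value function, used only to generate targets.
true_w = np.array([1.0, -2.0, 3.0])
x = np.array([-1.0, 2.0])          # only two samples
values = features(x) @ true_w      # value targets
grads = feature_grads(x) @ true_w  # gradient targets

# Values alone: two equations for three unknowns (underdetermined).
w_values_only, _, rank_v, _ = np.linalg.lstsq(features(x), values, rcond=None)

# Sobolev fit: stack value and gradient equations into one system.
A = np.vstack([features(x), feature_grads(x)])
b = np.concatenate([values, grads])
w_sobolev, _, rank_s, _ = np.linalg.lstsq(A, b, rcond=None)
```

With two samples, the value-only system has rank 2 and cannot pin down the three coefficients, while the stacked system has rank 3 and recovers them. This is the sense in which gradient targets can make each sample more informative.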
Submitted 17 December, 2023;
originally announced December 2023.
-
Stagewise Newton Method for Dynamic Game Control with Imperfect State Observation
Authors:
Armand Jordana,
Bilal Hammoud,
Justin Carpentier,
Ludovic Righetti
Abstract:
In this letter, we study dynamic game optimal control with imperfect state observations and introduce an iterative method to find a local Nash equilibrium. The algorithm consists of an iterative procedure combining a backward recursion similar to minimax differential dynamic programming and a forward recursion resembling a risk-sensitive Kalman smoother. A coupling equation renders the resulting control dependent on the estimation. In the end, the algorithm is equivalent to a Newton step but has linear complexity in the time horizon length. Furthermore, a merit function and a line search procedure are introduced to guarantee convergence of the iterative scheme. The resulting controller reasons about uncertainty by planning for the worst case disturbances. Lastly, the low computational cost of the proposed algorithm makes it a promising method to do output-feedback model predictive control on complex systems at high frequency. Numerical simulations on realistic robotic problems illustrate the risk-sensitive behavior of the resulting controller.
Submitted 22 June, 2022;
originally announced June 2022.
-
Leveraging Randomized Smoothing for Optimal Control of Nonsmooth Dynamical Systems
Authors:
Quentin Le Lidec,
Fabian Schramm,
Louis Montaut,
Cordelia Schmid,
Ivan Laptev,
Justin Carpentier
Abstract:
Optimal control (OC) algorithms such as Differential Dynamic Programming (DDP) take advantage of the derivatives of the dynamics to efficiently control physical systems. Yet, in the presence of nonsmooth dynamical systems, this class of algorithms is likely to fail due, for instance, to discontinuities in the derivatives of the dynamics or to non-informative gradients. In contrast, reinforcement learning (RL) algorithms have shown better empirical results in scenarios exhibiting nonsmooth effects (contacts, friction, etc.). Our approach leverages recent work on randomized smoothing (RS) to tackle the nonsmoothness issues commonly encountered in optimal control, and provides key insights on the interplay between RL and OC through the prism of RS methods. This naturally leads us to introduce the randomized Differential Dynamic Programming (R-DDP) algorithm, which accounts for deterministic but nonsmooth dynamics in a very sample-efficient way. The experiments demonstrate that our method is able to solve classic robotic problems with dry friction and frictional contacts, where classical OC algorithms are likely to fail and RL algorithms require in practice a prohibitive number of samples to find an optimal solution.
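As a rough illustration of randomized smoothing, the sketch below estimates the gradient of a Gaussian-smoothed version of a nonsmooth scalar function by Monte Carlo sampling. It is a generic zeroth-order estimator, not the R-DDP algorithm of the paper; the function name `smoothed_grad`, the test function `np.abs`, and the noise level are assumptions.

```python
import numpy as np

def smoothed_grad(f, x, sigma, n_samples, rng):
    """Monte Carlo gradient of the smoothed function E[f(x + sigma * z)], z ~ N(0, 1).

    Subtracting f(x) acts as a baseline: it leaves the expectation unchanged
    and reduces the variance of the estimate.
    """
    z = rng.standard_normal(n_samples)
    return np.mean((f(x + sigma * z) - f(x)) * z) / sigma

rng = np.random.default_rng(0)

# Away from the kink of |x|, the estimate is close to the true slope (+1).
g_away = smoothed_grad(np.abs, 1.0, sigma=0.1, n_samples=20000, rng=rng)

# At the kink, |x| has no derivative, but the smoothed function does:
# by symmetry its slope at 0 is 0.
g_kink = smoothed_grad(np.abs, 0.0, sigma=0.1, n_samples=20000, rng=rng)
```

The estimator only evaluates `f`, so it stays well defined where the derivative of `f` is discontinuous. The price is sampling noise, which is why sample efficiency matters in this setting.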
Submitted 22 January, 2024; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Infinite-Dimensional Sums-of-Squares for Optimal Control
Authors:
Eloïse Berthier,
Justin Carpentier,
Alessandro Rudi,
Francis Bach
Abstract:
We introduce an approximation method to solve an optimal control problem via the Lagrange dual of its weak formulation. It is based on a sum-of-squares representation of the Hamiltonian, and extends a previous method from polynomial optimization to the generic case of smooth problems. Such a representation is infinite-dimensional and relies on a particular space of functions, a reproducing kernel Hilbert space, chosen to fit the structure of the control problem. After subsampling, it leads to a practical method that amounts to solving a semi-definite program. We illustrate our approach by a numerical application on a simple low-dimensional control problem.
Submitted 14 October, 2021;
originally announced October 2021.
-
Crocoddyl: An Efficient and Versatile Framework for Multi-Contact Optimal Control
Authors:
Carlos Mastalli,
Rohan Budhiraja,
Wolfgang Merkt,
Guilhem Saurel,
Bilal Hammoud,
Maximilien Naveau,
Justin Carpentier,
Ludovic Righetti,
Sethu Vijayakumar,
Nicolas Mansard
Abstract:
We introduce Crocoddyl (Contact RObot COntrol by Differential DYnamic Library), an open-source framework tailored for efficient multi-contact optimal control. Crocoddyl efficiently computes the state trajectory and the control policy for a given predefined sequence of contacts. Its efficiency is due to the use of sparse analytical derivatives, exploitation of the problem structure, and data sharing. It employs differential geometry to properly describe the state of any geometrical system, e.g. floating-base systems. Additionally, we propose a novel optimal control algorithm called Feasibility-driven Differential Dynamic Programming (FDDP). Our method does not add extra decision variables, which often increase the computation time per iteration due to factorization. FDDP shows a better globalization strategy than classical Differential Dynamic Programming (DDP) algorithms. Concretely, we propose two modifications to the classical DDP algorithm. First, the backward pass accepts infeasible state-control trajectories. Second, the rollout keeps the gaps open during the early "exploratory" iterations (as expected in multiple-shooting methods with only equality constraints). We showcase the performance of our framework using different tasks. With our method, we can compute highly dynamic maneuvers (e.g. jumping, front-flip) within a few milliseconds.
Submitted 11 March, 2020; v1 submitted 11 September, 2019;
originally announced September 2019.
-
Approximate Gradient Descent Convergence Dynamics for Adaptive Control on Heterogeneous Networks
Authors:
Jean Carpentier,
Sebastien Blandin
Abstract:
Adaptive control is a classical control method for complex cyber-physical systems, including transportation networks. In this work, we analyze the convergence properties of such methods on exemplar graphs, both theoretically and numerically. We first illustrate a limitation of the standard backpressure algorithm for scheduling optimization, and prove that a re-scaling of the model state can lead to an improvement in the overall system optimality by a factor of at most $\mathcal{O}(k)$ depending on the network parameters, where $k$ characterizes the network heterogeneity. We exhaustively describe the associated transient and steady-state regimes, and derive convergence properties within this generalized class of backpressure algorithms. Extensive simulations are conducted on both a synthetic network and a more realistic large-scale network modeled on the Manhattan grid, on which the theoretical results are verified.
Submitted 11 June, 2019;
originally announced June 2019.