
Together We Rise: Optimizing Real-Time Multi-Robot Task Allocation using Coordinated Heterogeneous Plays

Authors: Aritra Pal, Anandsingh Chauhan, Mayank Baranwal

Abstract: Efficient task allocation among multiple robots is crucial for optimizing productivity in modern warehouses, particularly in response to the increasing demands of online order fulfillment. This paper addresses the real-time multi-robot task allocation (MRTA) problem in dynamic warehouse environments, where tasks emerge with specified start and end locations. The objective is to minimize both the total travel distance of robots and delays in task completion, while also considering practical constraints such as battery management and collision avoidance. We introduce MRTAgent, a dual-agent Reinforcement Learning (RL) framework inspired by self-play, designed to optimize task assignments and robot selection to ensure timely task execution. For safe navigation, a modified linear quadratic controller (LQR) approach is employed. To the best of our knowledge, MRTAgent is the first framework to address all critical aspects of practical MRTA problems while supporting continuous robot movements.

Submitted 21 February, 2025; originally announced February 2025.

Comments: Accepted to AAMAS 2025 (AAAI Track)

arXiv:2409.19279

Distributed Optimization via Energy Conservation Laws in Dilated Coordinates

Authors: Mayank Baranwal, Kushal Chakrabarti

Abstract: Optimizing problems in a distributed manner is critical for systems involving multiple agents with private data. Despite substantial interest, a unified method for analyzing the convergence rates of distributed optimization algorithms is lacking. This paper introduces an energy conservation approach for analyzing continuous-time dynamical systems in dilated coordinates. Instead of directly analyzing dynamics in the original coordinate system, we establish a conserved quantity, akin to physical energy, in the dilated coordinate system. Consequently, convergence rates can be explicitly expressed in terms of the inverse time-dilation factor. Leveraging this generalized approach, we formulate a novel second-order distributed accelerated gradient flow with a convergence rate of $O\left(1/t^{2-ε}\right)$ in time $t$ for $ε>0$. We then employ a semi second-order symplectic Euler discretization to derive a rate-matching algorithm with a convergence rate of $O\left(1/k^{2-ε}\right)$ in $k$ iterations. To the best of our knowledge, this represents the most favorable convergence rate for any distributed optimization algorithm designed for smooth convex optimization. Its accelerated convergence behavior is benchmarked against various state-of-the-art distributed optimization algorithms on practical, large-scale problems.

Submitted 28 September, 2024; originally announced September 2024.

Comments: 10 pages; (Near) optimal convergence rate

arXiv:2310.00419

On Linear Convergence of PI Consensus Algorithm under the Restricted Secant Inequality

Authors: Kushal Chakrabarti, Mayank Baranwal

Abstract: This paper considers solving distributed optimization problems in peer-to-peer multi-agent networks. The network is synchronous and connected. By using the proportional-integral (PI) control strategy, various algorithms with fixed stepsize have been developed. Two notable among them are the PI algorithm and the PI consensus algorithm. Although the PI algorithm has provable linear or exponential convergence without the standard requirement of (strong) convexity, a similar guarantee for the PI consensus algorithm is unavailable. In this paper, using Lyapunov theory, we guarantee exponential convergence of the PI consensus algorithm for global cost functions that satisfy the restricted secant inequality, with rate-matching discretization, without requiring convexity. To accelerate the PI consensus algorithm, we incorporate local pre-conditioning in the form of constant positive definite matrices and numerically validate its efficiency compared to the prominent distributed convex optimization algorithms. Unlike classical pre-conditioning, where only the gradients are multiplied by a pre-conditioner, the proposed pre-conditioning modifies both the gradients and the consensus terms, thereby controlling the effect of the communication graph on the algorithm.

Submitted 28 October, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

Comments: Accepted for publication at the 2024 Tenth Indian Control Conference (ICC-10)

arXiv:2212.03765

Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points

Authors: Mayank Baranwal, Param Budhraja, Vishal Raj, Ashish R. Hota

Abstract: Gradient-based first-order convex optimization algorithms find widespread applicability in a variety of domains, including machine learning tasks. Motivated by the recent advances in fixed-time stability theory of continuous-time dynamical systems, we introduce a generalized framework for designing accelerated optimization algorithms with strongest convergence guarantees that further extend to a subclass of non-convex functions. In particular, we introduce the GenFlow algorithm and its momentum variant that provably converge to the optimal solution of objective functions satisfying the Polyak-Łojasiewicz (PL) inequality in a fixed time. Moreover, for functions that admit non-degenerate saddle-points, we show that for the proposed GenFlow algorithm, the time required to evade these saddle-points is uniformly bounded for all initial conditions. Finally, for strongly convex-strongly concave minimax problems whose optimal solution is a saddle point, a similar scheme is shown to arrive at the optimal solution again in a fixed time. The superior convergence properties of our algorithm are validated experimentally on a variety of benchmark datasets.

Submitted 22 October, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

Comments: Accepted to Transactions on Automatic Control (TAC)

arXiv:2212.02397

PowRL: A Reinforcement Learning Framework for Robust Management of Power Networks

Authors: Anandsingh Chauhan, Mayank Baranwal, Ansuma Basumatary

Abstract: Power grids, across the world, play an important societal and economical role by providing uninterrupted, reliable and transient-free power to several industries, businesses and household consumers. With the advent of renewable power resources and EVs resulting into uncertain generation and highly dynamic load demands, it has become ever so important to ensure robust operation of power networks through suitable management of transient stability issues and localize the events of blackouts. In the light of ever increasing stress on the modern grid infrastructure and the grid operators, this paper presents a reinforcement learning (RL) framework, PowRL, to mitigate the effects of unexpected network events, as well as reliably maintain electricity everywhere on the network at all times. The PowRL leverages a novel heuristic for overload management, along with the RL-guided decision making on optimal topology selection to ensure that the grid is operated safely and reliably (with no overloads). PowRL is benchmarked on a variety of competition datasets hosted by the L2RPN (Learning to Run a Power Network). Even with its reduced action space, PowRL tops the leaderboard in the L2RPN NeurIPS 2020 challenge (Robustness track) at an aggregate level, while also being the top performing agent in the L2RPN WCCI 2020 challenge. Moreover, detailed analysis depicts state-of-the-art performances by the PowRL agent in some of the test scenarios.

Submitted 20 April, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

Comments: Accepted at the 37th AAAI Conference on Artificial Intelligence

arXiv:2207.12845

Fixed-Time Convergence for a Class of Nonconvex-Nonconcave Min-Max Problems

Authors: Kunal Garg, Mayank Baranwal

Abstract: This study develops a fixed-time convergent saddle point dynamical system for solving min-max problems under a relaxation of standard convexity-concavity assumption. In particular, it is shown that by leveraging the dynamical systems viewpoint of an optimization algorithm, accelerated convergence to a saddle point can be obtained. Instead of requiring the objective function to be strongly-convex--strongly-concave (as necessitated for accelerated convergence of several saddle-point algorithms), uniform fixed-time convergence is guaranteed for functions satisfying only the two-sided Polyak-Łojasiewicz (PL) inequality. A large number of practical problems, including the robust least squares estimation, are known to satisfy the two-sided PL inequality. The proposed method achieves arbitrarily fast convergence compared to any other state-of-the-art method with linear or even super-linear convergence, as also corroborated in numerical case studies.

Submitted 26 July, 2022; originally announced July 2022.

Comments: 6 pages, 2 figures

arXiv:2112.01363

Breaking the Convergence Barrier: Optimization via Fixed-Time Convergent Flows

Authors: Param Budhraja, Mayank Baranwal, Kunal Garg, Ashish Hota

Abstract: Accelerated gradient methods are the cornerstones of large-scale, data-driven optimization problems that arise naturally in machine learning and other fields concerning data analysis. We introduce a gradient-based optimization framework for achieving acceleration, based on the recently introduced notion of fixed-time stability of dynamical systems. The method presents itself as a generalization of simple gradient-based methods suitably scaled to achieve convergence to the optimizer in a fixed-time, independent of the initialization. We achieve this by first leveraging a continuous-time framework for designing fixed-time stable dynamical systems, and later providing a consistent discretization strategy, such that the equivalent discrete-time algorithm tracks the optimizer in a practically fixed number of iterations. We also provide a theoretical analysis of the convergence behavior of the proposed gradient flows, and their robustness to additive disturbances for a range of functions obeying strong convexity, strict convexity, and possibly nonconvexity but satisfying the Polyak-Łojasiewicz inequality. We also show that the regret bound on the convergence rate is constant by virtue of the fixed-time convergence. The hyperparameters have intuitive interpretations and can be tuned to fit the requirements on the desired convergence rates. We validate the accelerated convergence properties of the proposed schemes on a range of numerical examples against the state-of-the-art optimization algorithms. Our work provides insights on developing novel optimization algorithms via discretization of continuous-time flows.

Submitted 20 March, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: Accepted at AAAI Conference on Artificial Intelligence, 2022

arXiv:2108.07555

doi 10.1145/3459637.3482386

Revisiting State Augmentation methods for Reinforcement Learning with Stochastic Delays

Authors: Somjit Nath, Mayank Baranwal, Harshad Khadilkar

Abstract: Several real-world scenarios, such as remote control and sensing, are comprised of action and observation delays. The presence of delays degrades the performance of reinforcement learning (RL) algorithms, often to such an extent that algorithms fail to learn anything substantial. This paper formally describes the notion of Markov Decision Processes (MDPs) with stochastic delays and shows that delayed MDPs can be transformed into equivalent standard MDPs (without delays) with significantly simplified cost structure. We employ this equivalence to derive a model-free Delay-Resolved RL framework and show that even a simple RL algorithm built upon this framework achieves near-optimal rewards in environments with stochastic delays in actions and observations. The delay-resolved deep Q-network (DRDQN) algorithm is bench-marked on a variety of environments comprising of multi-step and stochastic delays and results in better performance, both in terms of achieving near-optimal rewards and minimizing the computational overhead thereof, with respect to the currently established algorithms.

Submitted 17 August, 2021; originally announced August 2021.

Comments: Accepted at CIKM'21

arXiv:2006.02537

doi 10.1109/LSP.2020.3027490

CAPPA: Continuous-time Accelerated Proximal Point Algorithm for Sparse Recovery

Authors: Kunal Garg, Mayank Baranwal

Abstract: This paper develops a novel Continuous-time Accelerated Proximal Point Algorithm (CAPPA) for $\ell_1$-minimization problems with provable fixed-time convergence guarantees. The problem of $\ell_1$-minimization appears in several contexts, such as sparse recovery (SR) in Compressed Sensing (CS) theory, and sparse linear and logistic regressions in machine learning to name a few. Most existing algorithms for solving $\ell_1$-minimization problems are discrete-time, inefficient and require exhaustive computer-guided iterations. CAPPA alleviates this problem on two fronts: (a) it encompasses a continuous-time algorithm that can be implemented using analog circuits; (b) it betters LCA and finite-time LCA (recently developed continuous-time dynamical systems for solving SR problems) by exhibiting provable fixed-time convergence to optimal solution. Consequently, CAPPA is better suited for fast and efficient handling of SR problems. Simulation studies are presented that corroborate computational advantages of CAPPA.

Submitted 3 June, 2020; originally announced June 2020.

Comments: 6 pages, 5 figures

arXiv:1910.14214

doi 10.1109/LCSYS.2020.3020248

Robust Distributed Fixed-Time Economic Dispatch under Time-Varying Topology

Authors: Mayank Baranwal, Kunal Garg, Dimitra Panagou, Alfred O. Hero

Abstract: The centralized power generation infrastructure that defines the North American electric grid is slowly moving to the distributed architecture due to the explosion in use of renewable generation and distributed energy resources (DERs), such as residential solar, wind turbines and battery storage. Furthermore, variable pricing policies and profusion of flexible loads entail frequent and severe changes in power outputs required from the individual generation units, requiring fast availability of power allocation. To this end, a fixed-time convergent, fully distributed economic dispatch algorithm for scheduling optimal power generation among a set of DERs is proposed. The proposed algorithm incorporates both load balance and generation capacity constraints.

Submitted 26 August, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

Comments: 6 pages, 3 figures, to appear in L-CSS

Journal ref: IEEE Control Systems Letters, vol. 5, no. 4, pp. 1183-1188, Oct. 2021

arXiv:1907.08720

Multiway k-Cut in Static and Dynamic Graphs: A Maximum Entropy Principle Approach

Authors: Mayank Baranwal, Amber Srivastava, Srinivasa Salapaka

Abstract: This work presents a maximum entropy principle based algorithm for solving minimum multiway $k$-cut problem defined over static and dynamic {\em digraphs}. A multiway $k$-cut problem requires partitioning the set of nodes in a graph into $k$ subsets, such that each subset contains one prespecified node, and the corresponding total cut weight is minimized. These problems arise in many applications and are computationally complex (NP-hard). In the static setting this article presents an approach that uses a relaxed multiway $k$-cut cost function; we show that the resulting algorithm converges to a local minimum. This iterative algorithm is designed to avoid poor local minima with its run-time complexity as $\sim O(kIN^3)$, where $N$ is the number of vertices and $I$ is the number of iterations. In the dynamic setting, the edge-weight matrix has an associated dynamics with some of the edges in the graph capable of being influenced by an external input. The objective is to design the dynamics of the controllable edges so that multiway $k$-cut value remains small (or decreases) as the graph evolves under the dynamics. Also it is required to determine the time-varying partition that defines the minimum multiway $k$-cut value. Our approach is to choose a relaxation of multiway $k$-cut value, derived using maximum entropy principle, and treat it as a control Lyapunov function to design control laws that affect the weight dynamics. Simulations on practical examples of interactive foreground-background segmentation, minimum multiway $k$-cut optimization for non-planar graphs and dynamically evolving graphs that demonstrate the efficacy of the algorithm, are presented.

Submitted 19 July, 2019; originally announced July 2019.

Comments: 8 pages, 7 figures

arXiv:1905.10472

Accelerating Distributed Optimization via Fixed-time Convergent Flows: Extensions to Non-convex Functions and Consistent Discretization

Authors: Kunal Garg, Mayank Baranwal

Abstract: Distributed optimization has gained significant attention in recent years, primarily fueled by the availability of a large amount of data and privacy-preserving requirements. This paper presents a fixed-time convergent optimization algorithm for solving a potentially non-convex optimization problem using a first-order multi-agent system. Each agent in the network can access only its private objective function, while local information exchange is permitted between the neighbors. The proposed optimization algorithm combines a fixed-time convergent distributed parameter estimation scheme with a fixed-time distributed consensus scheme as its solution methodology. The results are presented under the assumption that the team objective function is strongly convex, as opposed to the common assumptions in the literature requiring each of the local objective functions to be strongly convex. The results extend to the class of possibly non-convex team objective functions satisfying only the Polyak-Łojasiewicz (PL) inequality. It is also shown that the proposed continuous-time scheme, when discretized using Euler's method, leads to consistent discretization, i.e., the fixed-time convergence behavior is preserved under discretization. Numerical examples comprising large-scale distributed linear regression and training of neural networks corroborate our theoretical analysis.

Submitted 27 May, 2022; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: Under review. 10 pages, 1 figure

arXiv:1701.03065

Robust Distributed Control of DC Microgrids with Time-Varying Power Sharing

Authors: Mayank Baranwal, Alireza Askarian, Srinivasa M. Salapaka

Abstract: This paper addresses the problem of output voltage regulation for multiple DC/DC converters connected to a microgrid, and prescribes a scheme for sharing power among different sources. This architecture is structured in such a way that it admits quantifiable analysis of the closed-loop performance of the network of converters; the analysis simplifies to studying closed-loop performance of an equivalent {\em single-converter} system. The proposed architecture allows for the proportion in which the sources provide power to vary with time; thus overcoming limitations of our previous designs. Additionally, the proposed control framework is suitable to both centralized and decentralized implementations, i.e., the same control architecture can be employed for voltage regulation irrespective of the availability of common load-current (or power) measurement, without the need to modify controller parameters. The performance becomes quantifiably better with better communication of the demanded load to all the controllers at all the converters (in the centralized case); however guarantees viability when such communication is absent. Case studies comprising of battery, PV and generic sources are presented and demonstrate the enhanced performance of prescribed optimal controllers for voltage regulation and power sharing.

Submitted 11 January, 2017; originally announced January 2017.

Comments: arXiv admin note: substantial text overlap with arXiv:1604.04154

arXiv:1604.04154

Robust Control Framework for Time-Varying Power-Sharing among Distributed Energy Resources

Authors: Mayank Baranwal, Srinivasa M. Salapaka

Abstract: One of the most important challenges facing an electric grid is to incorporate renewables and distributed energy resources (DERs) to the grid. Because of the associated uncertainties in power generations and peak power demands, opportunities for improving the functioning and reliability of the grid lie in the design of an efficient, yet pragmatic distributed control framework with guaranteed robustness margins. This paper addresses the problem of output voltage regulation for multiple DC-DC converters connected to a grid, and prescribes a robust scheme for sharing power among different sources. More precisely, we develop a control architecture where, unlike most standard control frameworks, the desired power ratios appear as reference signals to individual converter systems, and not as internal parameters of the system of parallel converters. This makes the proposed approach suited for scenarios when the desired power ratios vary rapidly with time. Additionally, the proposed control framework is suitable to both centralized and decentralized implementations, i.e., the same control architecture can be employed for voltage regulation irrespective of the availability of common load-current (or power) measurement, without the need to modify controller parameters. The control design is obtained using robust optimal-control framework. Case studies presented show the enhanced performance of prescribed optimal controllers for voltage regulation and power sharing.

Submitted 14 April, 2016; originally announced April 2016.

Comments: arXiv admin note: text overlap with arXiv:1604.03573

arXiv:1604.03573

Robust Decentralized Voltage Control of DC-DC Converters with Applications to Power Sharing and Ripple Sharing

Authors: Mayank Baranwal, Srinivasa M. Salapaka, Murti V. Salapaka

Abstract: This paper addresses the problem of output voltage regulation for multiple DC-DC converters connected to a grid, and prescribes a robust scheme for sharing power among different sources. Also it develops a method for sharing 120 Hz ripple among DC power sources in a prescribed proportion, which accommodates the different capabilities of DC power sources to sustain the ripple. We present a decentralized control architecture, where a nested (inner-outer) control design is used at every converter. An interesting aspect of the proposed design is that the analysis and design of the entire multi-converter system can be done using an equivalent single converter system, where the multi-converter system inherits the performance and robustness achieved by a design for the single-converter system. Another key aspect of this work is that the voltage regulation problem is addressed as a disturbance-rejection problem, where {\em unknown} load current is viewed as an external signal, and thus, no prior information is required on the nominal loading conditions. The control design is obtained using robust optimal-control framework. Case studies presented show the enhanced performance of prescribed optimal controllers.

Submitted 12 April, 2016; originally announced April 2016.

Showing 1–15 of 15 results for author: Baranwal, M