Search | arXiv e-print repository

Together We Rise: Optimizing Real-Time Multi-Robot Task Allocation using Coordinated Heterogeneous Plays

Authors: Aritra Pal, Anandsingh Chauhan, Mayank Baranwal

Abstract: Efficient task allocation among multiple robots is crucial for optimizing productivity in modern warehouses, particularly in response to the increasing demands of online order fulfillment. This paper addresses the real-time multi-robot task allocation (MRTA) problem in dynamic warehouse environments, where tasks emerge with specified start and end locations. The objective is to minimize both the total travel distance of robots and delays in task completion, while also considering practical constraints such as battery management and collision avoidance. We introduce MRTAgent, a dual-agent Reinforcement Learning (RL) framework inspired by self-play, designed to optimize task assignments and robot selection to ensure timely task execution. For safe navigation, a modified linear quadratic controller (LQR) approach is employed. To the best of our knowledge, MRTAgent is the first framework to address all critical aspects of practical MRTA problems while supporting continuous robot movements.

Submitted 21 February, 2025; originally announced February 2025.

Comments: Accepted to AAMAS 2025 (AAAI Track)

arXiv:2502.04864

$TAR^2$: Temporal-Agent Reward Redistribution for Optimal Policy Preservation in Multi-Agent Reinforcement Learning

Authors: Aditya Kapoor, Kale-ab Tessera, Mayank Baranwal, Harshad Khadilkar, Stefano Albrecht, Mingfei Sun

Abstract: In cooperative multi-agent reinforcement learning (MARL), learning effective policies is challenging when global rewards are sparse and delayed. This difficulty arises from the need to assign credit across both agents and time steps, a problem that existing methods often fail to address in episodic, long-horizon tasks. We propose Temporal-Agent Reward Redistribution $TAR^2$, a novel approach that decomposes sparse global rewards into agent-specific, time-step-specific components, thereby providing more frequent and accurate feedback for policy learning. Theoretically, we show that $TAR^2$ (i) aligns with potential-based reward shaping, preserving the same optimal policies as the original environment, and (ii) maintains policy gradient update directions identical to those under the original sparse reward, ensuring unbiased credit signals. Empirical results on two challenging benchmarks, SMACLite and Google Research Football, demonstrate that $TAR^2$ significantly stabilizes and accelerates convergence, outperforming strong baselines like AREL and STAS in both learning speed and final performance. These findings establish $TAR^2$ as a principled and practical solution for agent-temporal credit assignment in sparse-reward multi-agent systems.

Submitted 7 February, 2025; originally announced February 2025.

Comments: 23 pages, 5 figures, 4 tables
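
The abstract above refers to potential-based reward shaping. As a minimal sketch of that classical construction (not the learned redistribution used by $TAR^2$), the snippet below adds the shaping term `gamma * potential(s') - potential(s)` to a sparse episode. The episode, the discount factor, and the `potential` function are all hypothetical values chosen for illustration.

```python
def shaped_rewards(rewards, states, potential, gamma):
    """Add F(s, s') = gamma * potential(s') - potential(s) to every step.

    `states` has len(rewards) + 1 entries: the state before each step,
    followed by the final state.
    """
    return [
        r + gamma * potential(s_next) - potential(s)
        for r, s, s_next in zip(rewards, states[:-1], states[1:])
    ]


def discounted_return(rewards, gamma):
    """Discounted sum of a reward sequence."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# Hypothetical episode: the only reward arrives at the last step.
states = [0, 1, 2, 3, 4]
rewards = [0.0, 0.0, 0.0, 1.0]
gamma = 0.9


def potential(s):
    # Arbitrary illustrative potential; any function of the state works.
    return 0.5 * s


dense = shaped_rewards(rewards, states, potential, gamma)

# The shaping terms telescope, so the two returns differ only by a
# quantity fixed by the first and last states.
gap = discounted_return(dense, gamma) - discounted_return(rewards, gamma)
expected_gap = gamma ** len(rewards) * potential(states[-1]) - potential(states[0])
```

Every step now carries a nonzero signal, yet the difference between the shaped and original returns is determined by the endpoints alone, which is the intuition behind shaping of this form leaving the optimal policy unchanged.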

arXiv:2412.14779

Agent-Temporal Credit Assignment for Optimal Policy Preservation in Sparse Multi-Agent Reinforcement Learning

Authors: Aditya Kapoor, Sushant Swamy, Kale-ab Tessera, Mayank Baranwal, Mingfei Sun, Harshad Khadilkar, Stefano V. Albrecht

Abstract: In multi-agent environments, agents often struggle to learn optimal policies due to sparse or delayed global rewards, particularly in long-horizon tasks where it is challenging to evaluate actions at intermediate time steps. We introduce Temporal-Agent Reward Redistribution (TAR$^2$), a novel approach designed to address the agent-temporal credit assignment problem by redistributing sparse rewards both temporally and across agents. TAR$^2$ decomposes sparse global rewards into time-step-specific rewards and calculates agent-specific contributions to these rewards. We theoretically prove that TAR$^2$ is equivalent to potential-based reward shaping, ensuring that the optimal policy remains unchanged. Empirical results demonstrate that TAR$^2$ stabilizes and accelerates the learning process. Additionally, we show that when TAR$^2$ is integrated with single-agent reinforcement learning algorithms, it performs as well as or better than traditional multi-agent reinforcement learning methods.

Submitted 19 December, 2024; originally announced December 2024.

Comments: 12 pages, 1 figure

arXiv:2409.19279

Distributed Optimization via Energy Conservation Laws in Dilated Coordinates

Authors: Mayank Baranwal, Kushal Chakrabarti

Abstract: Optimizing problems in a distributed manner is critical for systems involving multiple agents with private data. Despite substantial interest, a unified method for analyzing the convergence rates of distributed optimization algorithms is lacking. This paper introduces an energy conservation approach for analyzing continuous-time dynamical systems in dilated coordinates. Instead of directly analyzing dynamics in the original coordinate system, we establish a conserved quantity, akin to physical energy, in the dilated coordinate system. Consequently, convergence rates can be explicitly expressed in terms of the inverse time-dilation factor. Leveraging this generalized approach, we formulate a novel second-order distributed accelerated gradient flow with a convergence rate of $O\left(1/t^{2-ε}\right)$ in time $t$ for $ε>0$. We then employ a semi second-order symplectic Euler discretization to derive a rate-matching algorithm with a convergence rate of $O\left(1/k^{2-ε}\right)$ in $k$ iterations. To the best of our knowledge, this represents the most favorable convergence rate for any distributed optimization algorithm designed for smooth convex optimization. Its accelerated convergence behavior is benchmarked against various state-of-the-art distributed optimization algorithms on practical, large-scale problems.

Submitted 28 September, 2024; originally announced September 2024.

Comments: 10 pages; (Near) optimal convergence rate

arXiv:2407.12629

A Methodology Establishing Linear Convergence of Adaptive Gradient Methods under PL Inequality

Authors: Kushal Chakrabarti, Mayank Baranwal

Abstract: Adaptive gradient-descent optimizers are the standard choice for training neural network models. Despite their faster convergence than gradient-descent and remarkable performance in practice, the adaptive optimizers are not as well understood as vanilla gradient-descent. A reason is that the dynamic update of the learning rate that helps in faster convergence of these methods also makes their analysis intricate. Particularly, the simple gradient-descent method converges at a linear rate for a class of optimization problems, whereas the practically faster adaptive gradient methods lack such a theoretical guarantee. The Polyak-Łojasiewicz (PL) inequality is the weakest known class, for which linear convergence of gradient-descent and its momentum variants has been proved. Therefore, in this paper, we prove that AdaGrad and Adam, two well-known adaptive gradient methods, converge linearly when the cost function is smooth and satisfies the PL inequality. Our theoretical framework follows a simple and unified approach, applicable to both batch and stochastic gradients, which can potentially be utilized in analyzing linear convergence of other variants of Adam.

Submitted 17 July, 2024; originally announced July 2024.

Comments: Accepted for publication at the main track of 27th European Conference on Artificial Intelligence (ECAI-2024)
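
To make the baseline claim in the abstract above concrete (plain gradient descent converging on a smooth function that satisfies the PL inequality), here is a small sketch. It uses f(x) = x^2 + 3 sin^2(x), a non-convex function commonly used to illustrate the PL condition, and a fixed step of 1/L with L = 8, an upper bound on |f''(x)| = |2 + 6 cos(2x)|. The starting point and iteration count are arbitrary assumptions, and the code illustrates vanilla gradient descent only, not the paper's AdaGrad or Adam analysis.

```python
import math


def f(x):
    # Non-convex (f'' is negative near x = pi / 2) with minimum f(0) = 0.
    return x * x + 3.0 * math.sin(x) ** 2


def grad_f(x):
    # d/dx [x^2 + 3 sin^2 x] = 2x + 3 sin(2x)
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


def gradient_descent(x0, step, iters):
    """Run fixed-step gradient descent and record f(x_k) - f*, with f* = 0."""
    x = x0
    gaps = [f(x)]
    for _ in range(iters):
        x -= step * grad_f(x)
        gaps.append(f(x))
    return x, gaps


# Step 1 / L with L = 8; x0 and the iteration budget are illustrative.
x_final, gaps = gradient_descent(x0=3.0, step=1.0 / 8.0, iters=200)
```

Inspecting `gaps` shows the optimality gap shrinking at every iteration even though the function is not convex, which is the behavior the PL inequality is meant to guarantee.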

arXiv:2407.10090

ReactAIvate: A Deep Learning Approach to Predicting Reaction Mechanisms and Unmasking Reactivity Hotspots

Authors: Ajnabiul Hoque, Manajit Das, Mayank Baranwal, Raghavan B. Sunoj

Abstract: A chemical reaction mechanism (CRM) is a sequence of molecular-level events involving bond-breaking/forming processes, generating transient intermediates along the reaction pathway as reactants transform into products. Understanding such mechanisms is crucial for designing and discovering new reactions. One of the currently available methods to probe CRMs is quantum mechanical (QM) computations. The resource-intensive nature of QM methods and the scarcity of mechanism-based datasets motivated us to develop reliable ML models for predicting mechanisms. In this study, we created a comprehensive dataset with seven distinct classes, each representing uniquely characterized elementary steps. Subsequently, we developed an interpretable attention-based GNN that achieved near-unity and 96% accuracy, respectively, for reaction step classification and the prediction of reactive atoms in each such step, capturing interactions between the broader reaction context and local active regions. The near-perfect classification enables accurate prediction of both individual events and the entire CRM, mitigating potential drawbacks of Seq2Seq approaches, where a wrongly predicted character leads to incoherent CRM identification. In addition to interpretability, our model adeptly identifies key atom(s) even from out-of-distribution classes. This generalizability allows for the inclusion of new reaction types in a modular fashion and will thus be of value to experts for understanding the reactivity of new molecules.

Submitted 14 July, 2024; originally announced July 2024.

Comments: Accepted to 27th ECAI main track

arXiv:2310.00419

On Linear Convergence of PI Consensus Algorithm under the Restricted Secant Inequality

Authors: Kushal Chakrabarti, Mayank Baranwal

Abstract: This paper considers solving distributed optimization problems in peer-to-peer multi-agent networks. The network is synchronous and connected. By using the proportional-integral (PI) control strategy, various algorithms with fixed stepsize have been developed. Two notable among them are the PI algorithm and the PI consensus algorithm. Although the PI algorithm has provable linear or exponential convergence without the standard requirement of (strong) convexity, a similar guarantee for the PI consensus algorithm is unavailable. In this paper, using Lyapunov theory, we guarantee exponential convergence of the PI consensus algorithm for global cost functions that satisfy the restricted secant inequality, with rate-matching discretization, without requiring convexity. To accelerate the PI consensus algorithm, we incorporate local pre-conditioning in the form of constant positive definite matrices and numerically validate its efficiency compared to the prominent distributed convex optimization algorithms. Unlike classical pre-conditioning, where only the gradients are multiplied by a pre-conditioner, the proposed pre-conditioning modifies both the gradients and the consensus terms, thereby controlling the effect of the communication graph on the algorithm.

Submitted 28 October, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

Comments: Accepted for publication at the 2024 Tenth Indian Control Conference (ICC-10)

arXiv:2212.03765

Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points

Authors: Mayank Baranwal, Param Budhraja, Vishal Raj, Ashish R. Hota

Abstract: Gradient-based first-order convex optimization algorithms find widespread applicability in a variety of domains, including machine learning tasks. Motivated by the recent advances in fixed-time stability theory of continuous-time dynamical systems, we introduce a generalized framework for designing accelerated optimization algorithms with strongest convergence guarantees that further extend to a subclass of non-convex functions. In particular, we introduce the GenFlow algorithm and its momentum variant that provably converge to the optimal solution of objective functions satisfying the Polyak-Łojasiewicz (PL) inequality in a fixed time. Moreover, for functions that admit non-degenerate saddle-points, we show that for the proposed GenFlow algorithm, the time required to evade these saddle-points is uniformly bounded for all initial conditions. Finally, for strongly convex-strongly concave minimax problems whose optimal solution is a saddle point, a similar scheme is shown to arrive at the optimal solution again in a fixed time. The superior convergence properties of our algorithm are validated experimentally on a variety of benchmark datasets.

Submitted 22 October, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

Comments: Accepted to Transactions on Automatic Control (TAC)

arXiv:2212.02397

PowRL: A Reinforcement Learning Framework for Robust Management of Power Networks

Authors: Anandsingh Chauhan, Mayank Baranwal, Ansuma Basumatary

Abstract: Power grids across the world play an important societal and economic role by providing uninterrupted, reliable and transient-free power to several industries, businesses and household consumers. With the advent of renewable power resources and EVs resulting in uncertain generation and highly dynamic load demands, it has become ever more important to ensure robust operation of power networks through suitable management of transient stability issues and localization of blackout events. In light of the ever-increasing stress on the modern grid infrastructure and the grid operators, this paper presents a reinforcement learning (RL) framework, PowRL, to mitigate the effects of unexpected network events, as well as reliably maintain electricity everywhere on the network at all times. PowRL leverages a novel heuristic for overload management, along with RL-guided decision making on optimal topology selection, to ensure that the grid is operated safely and reliably (with no overloads). PowRL is benchmarked on a variety of competition datasets hosted by L2RPN (Learning to Run a Power Network). Even with its reduced action space, PowRL tops the leaderboard in the L2RPN NeurIPS 2020 challenge (Robustness track) at an aggregate level, while also being the top performing agent in the L2RPN WCCI 2020 challenge. Moreover, detailed analysis depicts state-of-the-art performances by the PowRL agent in some of the test scenarios.

Submitted 20 April, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

Comments: Accepted at the 37th AAAI Conference on Artificial Intelligence

arXiv:2207.12845

Fixed-Time Convergence for a Class of Nonconvex-Nonconcave Min-Max Problems

Authors: Kunal Garg, Mayank Baranwal

Abstract: This study develops a fixed-time convergent saddle point dynamical system for solving min-max problems under a relaxation of standard convexity-concavity assumption. In particular, it is shown that by leveraging the dynamical systems viewpoint of an optimization algorithm, accelerated convergence to a saddle point can be obtained. Instead of requiring the objective function to be strongly-convex--strongly-concave (as necessitated for accelerated convergence of several saddle-point algorithms), uniform fixed-time convergence is guaranteed for functions satisfying only the two-sided Polyak-Łojasiewicz (PL) inequality. A large number of practical problems, including the robust least squares estimation, are known to satisfy the two-sided PL inequality. The proposed method achieves arbitrarily fast convergence compared to any other state-of-the-art method with linear or even super-linear convergence, as also corroborated in numerical case studies.

Submitted 26 July, 2022; originally announced July 2022.

Comments: 6 pages, 2 figures

arXiv:2203.00885

A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management

Authors: Hardik Meisheri, Somjit Nath, Mayank Baranwal, Harshad Khadilkar

Abstract: Most existing literature on supply chain and inventory management considers stochastic demand processes with zero or constant lead times. While it is true that in certain niche scenarios, uncertainty in lead times can be ignored, most real-world scenarios exhibit stochasticity in lead times. These random fluctuations can be caused by uncertainty in arrival of raw materials at the manufacturer's end, delay in transportation, an unforeseen surge in demands, and switching to a different vendor, to name a few. Stochasticity in lead times is known to severely degrade the performance of an inventory management system, and it is only fair to bridge this gap in supply chain systems through a principled approach. Motivated by the recently introduced delay-resolved deep Q-learning (DRDQN) algorithm, this paper develops a reinforcement learning based paradigm for handling uncertainty in lead times (action delay). Through empirical evaluations, it is further shown that inventory management with uncertain lead times is not only equivalent to that of delay in information sharing across multiple echelons (observation delay), but also that a model trained to handle one kind of delay is capable of handling delays of another kind without requiring retraining. Finally, we apply the delay-resolved framework to scenarios comprising multiple products subjected to stochasticity in lead times, and elucidate how the delay-resolved framework negates the effect of any delay to achieve near-optimal performance.

Submitted 8 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

arXiv:2112.01363

Breaking the Convergence Barrier: Optimization via Fixed-Time Convergent Flows

Authors: Param Budhraja, Mayank Baranwal, Kunal Garg, Ashish Hota

Abstract: Accelerated gradient methods are the cornerstones of large-scale, data-driven optimization problems that arise naturally in machine learning and other fields concerning data analysis. We introduce a gradient-based optimization framework for achieving acceleration, based on the recently introduced notion of fixed-time stability of dynamical systems. The method presents itself as a generalization of simple gradient-based methods suitably scaled to achieve convergence to the optimizer in a fixed-time, independent of the initialization. We achieve this by first leveraging a continuous-time framework for designing fixed-time stable dynamical systems, and later providing a consistent discretization strategy, such that the equivalent discrete-time algorithm tracks the optimizer in a practically fixed number of iterations. We also provide a theoretical analysis of the convergence behavior of the proposed gradient flows, and their robustness to additive disturbances for a range of functions obeying strong convexity, strict convexity, and possibly nonconvexity but satisfying the Polyak-Łojasiewicz inequality. We also show that the regret bound on the convergence rate is constant by virtue of the fixed-time convergence. The hyperparameters have intuitive interpretations and can be tuned to fit the requirements on the desired convergence rates. We validate the accelerated convergence properties of the proposed schemes on a range of numerical examples against the state-of-the-art optimization algorithms. Our work provides insights on developing novel optimization algorithms via discretization of continuous-time flows.

Submitted 20 March, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: Accepted at AAAI Conference on Artificial Intelligence, 2022

arXiv:2108.07555

doi 10.1145/3459637.3482386

Revisiting State Augmentation methods for Reinforcement Learning with Stochastic Delays

Authors: Somjit Nath, Mayank Baranwal, Harshad Khadilkar

Abstract: Several real-world scenarios, such as remote control and sensing, are comprised of action and observation delays. The presence of delays degrades the performance of reinforcement learning (RL) algorithms, often to such an extent that algorithms fail to learn anything substantial. This paper formally describes the notion of Markov Decision Processes (MDPs) with stochastic delays and shows that delayed MDPs can be transformed into equivalent standard MDPs (without delays) with significantly simplified cost structure. We employ this equivalence to derive a model-free Delay-Resolved RL framework and show that even a simple RL algorithm built upon this framework achieves near-optimal rewards in environments with stochastic delays in actions and observations. The delay-resolved deep Q-network (DRDQN) algorithm is benchmarked on a variety of environments comprising multi-step and stochastic delays and results in better performance, both in terms of achieving near-optimal rewards and minimizing the computational overhead thereof, with respect to the currently established algorithms.

Submitted 17 August, 2021; originally announced August 2021.

Comments: Accepted at CIKM'21

arXiv:2011.00053

doi 10.1016/j.fuel.2020.118204

On sparse identification of complex dynamical systems: A study on discovering influential reactions in chemical reaction networks

Authors: Farshad Harirchi, Doohyun Kim, Omar Khalil, Sijia Liu, Paolo Elvati, Mayank Baranwal, Alfred Hero, Angela Violi

Abstract: A wide variety of real life complex networks are prohibitively large for modeling, analysis and control. Understanding the structure and dynamics of such networks entails creating a smaller representative network that preserves its relevant topological and dynamical properties. While modern machine learning methods have enabled identification of governing laws for complex dynamical systems, their inability to produce white-box models with sufficient physical interpretation renders such methods undesirable to domain experts. In this paper, we introduce a hybrid black-box, white-box approach for the sparse identification of the governing laws for complex, highly coupled dynamical systems with particular emphasis on finding the influential reactions in chemical reaction networks for combustion applications, using a data-driven sparse-learning technique. The proposed approach identifies a set of influential reactions using species concentrations and reaction rates, with minimal computational cost without requiring additional data or simulations. The new approach is applied to analyze the combustion chemistry of H2 and C3H8 in a constant-volume homogeneous reactor. The influential reactions determined by the sparse-learning method are consistent with the current kinetics knowledge of chemical mechanisms. Additionally, we show that a reduced version of the parent mechanism can be generated as a combination of the significantly reduced influential reactions identified at different times and conditions and that for both H2 and C3H8 fuel, the reduced mechanisms perform closely to the parent mechanisms as a function of the ignition delay time over a wide range of conditions. Our results demonstrate the potential of the sparse-learning approach as an effective and efficient tool for dynamical system analysis and reduction. The uniqueness of this approach as applied to combustion systems lies in the ability to identify influential reactions in specified conditions and times during the evolution of the combustion process. This ability is of great interest to understand chemical reaction systems.

Submitted 8 July, 2020; originally announced November 2020.

Journal ref: Fuel, Volume 279, 2020, 118204, ISSN 0016-2361

arXiv:2006.02537

doi 10.1109/LSP.2020.3027490

CAPPA: Continuous-time Accelerated Proximal Point Algorithm for Sparse Recovery

Authors: Kunal Garg, Mayank Baranwal

Abstract: This paper develops a novel Continuous-time Accelerated Proximal Point Algorithm (CAPPA) for $\ell_1$-minimization problems with provable fixed-time convergence guarantees. The problem of $\ell_1$-minimization appears in several contexts, such as sparse recovery (SR) in Compressed Sensing (CS) theory, and sparse linear and logistic regressions in machine learning to name a few. Most existing algorithms for solving $\ell_1$-minimization problems are discrete-time, inefficient and require exhaustive computer-guided iterations. CAPPA alleviates this problem on two fronts: (a) it encompasses a continuous-time algorithm that can be implemented using analog circuits; (b) it betters LCA and finite-time LCA (recently developed continuous-time dynamical systems for solving SR problems) by exhibiting provable fixed-time convergence to optimal solution. Consequently, CAPPA is better suited for fast and efficient handling of SR problems. Simulation studies are presented that corroborate computational advantages of CAPPA.

Submitted 3 June, 2020; originally announced June 2020.

Comments: 6 pages, 5 figures

arXiv:2002.05678

The Power of Graph Convolutional Networks to Distinguish Random Graph Models: Short Version

Authors: Abram Magner, Mayank Baranwal, Alfred O. Hero III

Abstract: Graph convolutional networks (GCNs) are a widely used method for graph representation learning. We investigate the power of GCNs, as a function of their number of layers, to distinguish between different random graph models on the basis of the embeddings of their sample graphs. In particular, the graph models that we consider arise from graphons, which are the most general possible parameterizations of infinite exchangeable graph models and which are the central objects of study in the theory of dense graph limits. We exhibit an infinite class of graphons that are well-separated in terms of cut distance and are indistinguishable by a GCN with nonlinear activation functions coming from a certain broad class if its depth is at least logarithmic in the size of the sample graph. These results theoretically match empirical observations of several prior works. Finally, we show a converse result that for pairs of graphons satisfying a degree profile separation property, a very simple GCN architecture suffices for distinguishability. To prove our results, we exploit a connection to random walks on graphs.

Submitted 13 February, 2020; originally announced February 2020.

Comments: Conference version of arXiv:1910.12954

arXiv:1910.14214 [pdf, other]

doi 10.1109/LCSYS.2020.3020248

Robust Distributed Fixed-Time Economic Dispatch under Time-Varying Topology

Authors: Mayank Baranwal, Kunal Garg, Dimitra Panagou, Alfred O. Hero

Abstract: The centralized power generation infrastructure that defines the North American electric grid is slowly moving to the distributed architecture due to the explosion in use of renewable generation and distributed energy resources (DERs), such as residential solar, wind turbines and battery storage. Furthermore, variable pricing policies and profusion of flexible loads entail frequent and severe changes in power outputs required from the individual generation units, requiring fast availability of power allocation. To this end, a fixed-time convergent, fully distributed economic dispatch algorithm for scheduling optimal power generation among a set of DERs is proposed. The proposed algorithm incorporates both load balance and generation capacity constraints.

Submitted 26 August, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

Comments: 6 pages, 3 figures, to appear in L-CSS

Journal ref: IEEE Control Systems Letters, vol. 5, no. 4, pp. 1183-1188, Oct. 2021

arXiv:1910.12954 [pdf, other]

Fundamental Limits of Deep Graph Convolutional Networks

Authors: Abram Magner, Mayank Baranwal, Alfred O. Hero III

Abstract: Graph convolutional networks (GCNs) are a widely used method for graph representation learning. To elucidate the capabilities and limitations of GCNs, we investigate their power, as a function of their number of layers, to distinguish between different random graph models (corresponding to different class-conditional distributions in a classification problem) on the basis of the embeddings of their sample graphs. In particular, the graph models that we consider arise from graphons, which are the most general possible parameterizations of infinite exchangeable graph models and which are the central objects of study in the theory of dense graph limits. We give a precise characterization of the set of pairs of graphons that are indistinguishable by a GCN with nonlinear activation functions coming from a certain broad class if its depth is at least logarithmic in the size of the sample graph. This characterization is in terms of a degree profile closeness property. Outside this class, a very simple GCN architecture suffices for distinguishability. We then exhibit a concrete, infinite class of graphons arising from stochastic block models that are well-separated in terms of cut distance and are indistinguishable by a GCN. These results theoretically match empirical observations of several prior works. To prove our results, we exploit a connection to random walks on graphs. Finally, we give empirical results on synthetic and real graph classification datasets, indicating that indistinguishable graph distributions arise in practice.

Submitted 12 May, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

Comments: 19 pages

arXiv:1908.03517 [pdf]

doi 10.1109/TAC.2022.3214795

Fixed-Time Stable Proximal Dynamical System for Solving MVIPs

Authors: Kunal Garg, Mayank Baranwal, Rohit Gupta, Mouhacine Benosman

Abstract: In this paper, a novel modified proximal dynamical system is proposed to compute the solution of a mixed variational inequality problem (MVIP) within a fixed time, where the time of convergence is finite and is uniformly bounded for all initial conditions. Under the assumptions of strong monotonicity and Lipschitz continuity, it is shown that a solution of the modified proximal dynamical system exists, is uniquely determined, and converges to the unique solution of the associated MVIP within a fixed time. Furthermore, the fixed-time stability of the modified projected dynamical system continues to hold, even if the assumption of strong monotonicity is relaxed to that of strong pseudomonotonicity. Finally, it is shown that the solution obtained using the forward-Euler discretization of the proposed modified proximal dynamical system converges to an arbitrarily small neighborhood of the solution of the associated MVIP within a fixed number of time steps, independent of the initial conditions.

Submitted 19 October, 2022; v1 submitted 9 August, 2019; originally announced August 2019.

Comments: 12 pages, 2 figures

arXiv:1907.08720 [pdf, other]

Multiway k-Cut in Static and Dynamic Graphs: A Maximum Entropy Principle Approach

Authors: Mayank Baranwal, Amber Srivastava, Srinivasa Salapaka

Abstract: This work presents a maximum entropy principle based algorithm for solving minimum multiway $k$-cut problem defined over static and dynamic {\em digraphs}. A multiway $k$-cut problem requires partitioning the set of nodes in a graph into $k$ subsets, such that each subset contains one prespecified node, and the corresponding total cut weight is minimized. These problems arise in many applications and are computationally complex (NP-hard). In the static setting this article presents an approach that uses a relaxed multiway $k$-cut cost function; we show that the resulting algorithm converges to a local minimum. This iterative algorithm is designed to avoid poor local minima with its run-time complexity as $\sim O(kIN^3)$, where $N$ is the number of vertices and $I$ is the number of iterations. In the dynamic setting, the edge-weight matrix has an associated dynamics with some of the edges in the graph capable of being influenced by an external input. The objective is to design the dynamics of the controllable edges so that multiway $k$-cut value remains small (or decreases) as the graph evolves under the dynamics. Also it is required to determine the time-varying partition that defines the minimum multiway $k$-cut value. Our approach is to choose a relaxation of multiway $k$-cut value, derived using maximum entropy principle, and treat it as a control Lyapunov function to design control laws that affect the weight dynamics. Simulations on practical examples of interactive foreground-background segmentation, minimum multiway $k$-cut optimization for non-planar graphs and dynamically evolving graphs that demonstrate the efficacy of the algorithm, are presented.

Submitted 19 July, 2019; originally announced July 2019.

Comments: 8 pages, 7 figures

arXiv:1905.10472 [pdf, other]

Accelerating Distributed Optimization via Fixed-time Convergent Flows: Extensions to Non-convex Functions and Consistent Discretization

Authors: Kunal Garg, Mayank Baranwal

Abstract: Distributed optimization has gained significant attention in recent years, primarily fueled by the availability of a large amount of data and privacy-preserving requirements. This paper presents a fixed-time convergent optimization algorithm for solving a potentially non-convex optimization problem using a first-order multi-agent system. Each agent in the network can access only its private objective function, while local information exchange is permitted between the neighbors. The proposed optimization algorithm combines a fixed-time convergent distributed parameter estimation scheme with a fixed-time distributed consensus scheme as its solution methodology. The results are presented under the assumption that the team objective function is strongly convex, as opposed to the common assumptions in the literature requiring each of the local objective functions to be strongly convex. The results extend to the class of possibly non-convex team objective functions satisfying only the Polyak-Łojasiewicz (PL) inequality. It is also shown that the proposed continuous-time scheme, when discretized using Euler's method, leads to consistent discretization, i.e., the fixed-time convergence behavior is preserved under discretization. Numerical examples comprising large-scale distributed linear regression and training of neural networks corroborate our theoretical analysis.

Submitted 27 May, 2022; v1 submitted 24 May, 2019; originally announced May 2019.

Comments: Under review. 10 pages, 1 figure

arXiv:1811.00102 [pdf, other]

On the Persistence of Clustering Solutions and True Number of Clusters in a Dataset

Authors: Amber Srivastava, Mayank Baranwal, Srinivasa Salapaka

Abstract: Typically clustering algorithms provide clustering solutions with prespecified number of clusters. The lack of a priori knowledge on the true number of underlying clusters in the dataset makes it important to have a metric to compare the clustering solutions with different number of clusters. This article quantifies a notion of persistence of clustering solutions that enables comparing solutions with different number of clusters. The persistence relates to the range of data-resolution scales over which a clustering solution persists; it is quantified in terms of the maximum over two-norms of all the associated cluster-covariance matrices. Thus we associate a persistence value for each element in a set of clustering solutions with different number of clusters. We show that the datasets where natural clusters are a priori known, the clustering solutions that identify the natural clusters are most persistent - in this way, this notion can be used to identify solutions with true number of clusters. Detailed experiments on a variety of standard and synthetic datasets demonstrate that the proposed persistence-based indicator outperforms the existing approaches, such as, gap-statistic method, $X$-means, $G$-means, $PG$-means, dip-means algorithms and information-theoretic method, in accurately identifying the clustering solutions with true number of clusters. Interestingly, our method can be explained in terms of the phase-transition phenomenon in the deterministic annealing algorithm, where the number of distinct cluster centers changes (bifurcates) with respect to an annealing parameter.

Submitted 16 November, 2018; v1 submitted 31 October, 2018; originally announced November 2018.

arXiv:1701.03065 [pdf, other]

Robust Distributed Control of DC Microgrids with Time-Varying Power Sharing

Authors: Mayank Baranwal, Alireza Askarian, Srinivasa M. Salapaka

Abstract: This paper addresses the problem of output voltage regulation for multiple DC/DC converters connected to a microgrid, and prescribes a scheme for sharing power among different sources. This architecture is structured in such a way that it admits quantifiable analysis of the closed-loop performance of the network of converters; the analysis simplifies to studying closed-loop performance of an equivalent {\em single-converter} system. The proposed architecture allows for the proportion in which the sources provide power to vary with time; thus overcoming limitations of our previous designs. Additionally, the proposed control framework is suitable to both centralized and decentralized implementations, i.e., the same control architecture can be employed for voltage regulation irrespective of the availability of common load-current (or power) measurement, without the need to modify controller parameters. The performance becomes quantifiably better with better communication of the demanded load to all the controllers at all the converters (in the centralized case); however guarantees viability when such communication is absent. Case studies comprising of battery, PV and generic sources are presented and demonstrate the enhanced performance of prescribed optimal controllers for voltage regulation and power sharing.

Submitted 11 January, 2017; originally announced January 2017.

Comments: arXiv admin note: substantial text overlap with arXiv:1604.04154

arXiv:1606.06427 [pdf, other]

Clustering with Capacity and Size Constraints: A Deterministic Approach

Authors: Mayank Baranwal, Srinivasa M. Salapaka

Abstract: This paper discusses a deterministic clustering approach to capacitated resource allocation problems. In particular, the Deterministic Annealing (DA) algorithm from the data-compression literature, which bears a distinct analogy to the phase transformation under annealing process in statistical physics, is adapted to address problems pertaining to clustering with several forms of size constraints. These constraints are addressed through appropriate modifications of the basic DA formulation by judiciously adjusting the free-energy function in the DA algorithm. At a given value of the annealing parameter, the iterations of the DA algorithm are of the form of a Descent Method, which motivate scaling principles for faster convergence.

Submitted 21 June, 2016; originally announced June 2016.

Comments: 6 pages, 5 figures

arXiv:1604.04169 [pdf, other]

A Deterministic Annealing Approach to the Multiple Traveling Salesmen and Related Problems

Authors: Mayank Baranwal, Brian Roehl, Srinivasa M. Salapaka

Abstract: This paper presents a novel and efficient heuristic framework for approximating the solutions to the multiple traveling salesmen problem (m-TSP) and other variants on the TSP. The approach adopted in this paper is an extension of the Maximum-Entropy-Principle (MEP) and the Deterministic Annealing (DA) algorithm. The framework is presented as a general tool that can be suitably adapted to a number of variants on the basic TSP. Additionally, unlike most other heuristics for the TSP, the framework presented in this paper is independent of the edges defined between any two pairs of nodes. This makes the algorithm particularly suited for variants such as the close-enough traveling salesman problem (CETSP) which are challenging due to added computational complexity. The examples presented in this paper illustrate the effectiveness of this new framework for use in TSP and many variants thereof.

Submitted 14 April, 2016; originally announced April 2016.

arXiv:1604.04154 [pdf, other]

Robust Control Framework for Time-Varying Power-Sharing among Distributed Energy Resources

Authors: Mayank Baranwal, Srinivasa M. Salapaka

Abstract: One of the most important challenges facing an electric grid is to incorporate renewables and distributed energy resources (DERs) to the grid. Because of the associated uncertainties in power generations and peak power demands, opportunities for improving the functioning and reliability of the grid lie in the design of an efficient, yet pragmatic distributed control framework with guaranteed robustness margins. This paper addresses the problem of output voltage regulation for multiple DC-DC converters connected to a grid, and prescribes a robust scheme for sharing power among different sources. More precisely, we develop a control architecture where, unlike most standard control frameworks, the desired power ratios appear as reference signals to individual converter systems, and not as internal parameters of the system of parallel converters. This makes the proposed approach suited for scenarios when the desired power ratios vary rapidly with time. Additionally, the proposed control framework is suitable to both centralized and decentralized implementations, i.e., the same control architecture can be employed for voltage regulation irrespective of the availability of common load-current (or power) measurement, without the need to modify controller parameters. The control design is obtained using robust optimal-control framework. Case studies presented show the enhanced performance of prescribed optimal controllers for voltage regulation and power sharing.

Submitted 14 April, 2016; originally announced April 2016.

Comments: arXiv admin note: text overlap with arXiv:1604.03573

arXiv:1604.03590 [pdf, other]

Vehicle Routing Problem with Time Windows: A Deterministic Annealing approach

Authors: Mayank Baranwal, Pratik M. Parekh, Lavanya Marla, Srinivasa M. Salapaka, Carolyn L. Beck

Abstract: The Vehicle Routing Problem with Time-Windows (VRPTW) is an important problem in allocating resources on networks in time and space. We present in this paper a Deterministic Annealing (DA)-based approach to solving the VRPTW with its aspects of routing and scheduling, as well as to model additional constraints of heterogeneous vehicles and shipments. This is the first time, to our knowledge, that a DA approach has been used for problems in the class of the VRPTW. We describe how the DA approach can be adapted to generate an effective heuristic approach to the VRPTW. Our DA approach is also designed to not get trapped in local minima, and demonstrates less sensitivity to initial solutions. The algorithm trades off routing and scheduling in an n-dimensional space using a tunable parameter that allows us to generate qualitatively good solutions. These solutions differ in the degree of intersection of the routes, making the case for transfer points where shipments can be exchanged. Simulation results on randomly generated instances show that the constraints are respected and demonstrate near optimal results (when verifiable) in terms of schedules and tour length of individual tours in each solution.

Submitted 12 April, 2016; originally announced April 2016.

arXiv:1604.03573 [pdf, other]

Robust Decentralized Voltage Control of DC-DC Converters with Applications to Power Sharing and Ripple Sharing

Authors: Mayank Baranwal, Srinivasa M. Salapaka, Murti V. Salapaka

Abstract: This paper addresses the problem of output voltage regulation for multiple DC-DC converters connected to a grid, and prescribes a robust scheme for sharing power among different sources. Also it develops a method for sharing 120 Hz ripple among DC power sources in a prescribed proportion, which accommodates the different capabilities of DC power sources to sustain the ripple. We present a decentralized control architecture, where a nested (inner-outer) control design is used at every converter. An interesting aspect of the proposed design is that the analysis and design of the entire multi-converter system can be done using an equivalent single converter system, where the multi-converter system inherits the performance and robustness achieved by a design for the single-converter system. Another key aspect of this work is that the voltage regulation problem is addressed as a disturbance-rejection problem, where {\em unknown} load current is viewed as an external signal, and thus, no prior information is required on the nominal loading conditions. The control design is obtained using robust optimal-control framework. Case studies presented show the enhanced performance of prescribed optimal controllers.

Submitted 12 April, 2016; originally announced April 2016.

Showing 1–28 of 28 results for author: Baranwal, M