-
Generalized Smooth Stochastic Variational Inequalities: Almost Sure Convergence and Convergence Rates
Authors:
Daniil Vankov,
Angelia Nedich,
Lalitha Sankar
Abstract:
This paper focuses on solving a stochastic variational inequality (SVI) problem under a relaxed smoothness assumption for a class of structured non-monotone operators. The SVI problem has attracted significant interest in the machine learning community due to its immediate application to adversarial training and multi-agent reinforcement learning. In many such applications, the resulting operators do not satisfy the smoothness assumption. To address this issue, we focus on the generalized smoothness assumption and consider two well-known stochastic methods with clipping, namely, the projection and Korpelevich methods. For these clipped methods, we provide the first almost-sure convergence results without making any assumptions on the boundedness of either the stochastic operator or the stochastic samples. Furthermore, we provide the first in-expectation convergence rate results for these methods under a relaxed smoothness assumption.
Submitted 16 October, 2024;
originally announced October 2024.
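To make the two clipped methods named in this abstract concrete, here is a minimal deterministic sketch in Python. It is not the authors' algorithm: the toy operator `operator`, the box feasible set, the clipping rule `beta * min(1, 1 / ||F(x)||)`, and the reuse of one stepsize for both Korpelevich steps are all assumptions made for illustration. A stochastic variant would replace `operator` with a sampled oracle.

```python
import math

def project_box(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (assumed feasible set)."""
    return [min(max(v, lo), hi) for v in x]

def operator(x):
    """Toy operator F(x) = Mx with M = [[1, 1], [-1, 1]] (assumption).
    F(0) = 0, so (0, 0) solves the VI on the box."""
    return [x[0] + x[1], -x[0] + x[1]]

def clipped_stepsize(beta, g):
    """Assumed clipping rule: beta * min(1, 1 / ||g||)."""
    norm = math.sqrt(sum(v * v for v in g))
    return beta if norm <= 1.0 else beta / norm

def step(x, g, gamma):
    """One projected step: P(x - gamma * g)."""
    return project_box([xi - gamma * gi for xi, gi in zip(x, g)])

def clipped_projection(x0, beta=0.1, iters=500):
    """Projection method with a clipped stepsize."""
    x = list(x0)
    for _ in range(iters):
        g = operator(x)
        x = step(x, g, clipped_stepsize(beta, g))
    return x

def clipped_korpelevich(x0, beta=0.1, iters=500):
    """Korpelevich (extragradient) method with a clipped stepsize."""
    x = list(x0)
    for _ in range(iters):
        g = operator(x)
        gamma = clipped_stepsize(beta, g)
        y = step(x, g, gamma)            # extrapolation point
        x = step(x, operator(y), gamma)  # update using the operator at y
    return x

print(clipped_projection([1.0, 1.0]))
print(clipped_korpelevich([1.0, 1.0]))
```

On this toy problem both iterations contract toward the origin; clipping only shortens the step while the operator norm exceeds 1.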
-
Optimizing $(L_0, L_1)$-Smooth Functions by Gradient Methods
Authors:
Daniil Vankov,
Anton Rodomanov,
Angelia Nedich,
Lalitha Sankar,
Sebastian U. Stich
Abstract:
We study gradient methods for optimizing $(L_0, L_1)$-smooth functions, a class that generalizes Lipschitz-smooth functions and has gained attention for its relevance in machine learning. We provide new insights into the structure of this function class and develop a principled framework for analyzing optimization methods in this setting. While our convergence rate estimates recover existing results for minimizing the gradient norm in nonconvex problems, our approach significantly improves the best-known complexity bounds for convex objectives. Moreover, we show that the gradient method with Polyak stepsizes and the normalized gradient method achieve nearly the same complexity guarantees as methods that rely on explicit knowledge of $(L_0, L_1)$. Finally, we demonstrate that a carefully designed accelerated gradient method can be applied to $(L_0, L_1)$-smooth functions, further improving all previous results.
Submitted 7 March, 2025; v1 submitted 14 October, 2024;
originally announced October 2024.
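As an illustration of the Polyak-stepsize gradient method mentioned in this abstract, the following sketch runs it on a toy objective. The test function $f(x) = \sum_i (\cosh x_i - 1)$ is an assumption chosen here, not taken from the paper: it is convex with $f^* = 0$, it is not globally Lipschitz-smooth, and it satisfies $\|\nabla^2 f(x)\| \le 1 + \|\nabla f(x)\|$ because $\cosh t - |\sinh t| = e^{-|t|} \le 1$. The stepsize is the classical Polyak rule $\gamma_k = (f(x_k) - f^*) / \|\nabla f(x_k)\|^2$, which needs $f^*$ but no knowledge of $(L_0, L_1)$.

```python
import math

def f(x):
    """Toy objective sum_i (cosh(x_i) - 1), written as 2*sinh(x_i/2)^2 for accuracy near 0."""
    return sum(2.0 * math.sinh(0.5 * v) ** 2 for v in x)

def grad(x):
    """Gradient of the toy objective: sinh applied coordinate-wise."""
    return [math.sinh(v) for v in x]

def polyak_gradient_method(x0, f_star=0.0, iters=200):
    """Gradient method with the Polyak stepsize (f(x) - f*) / ||grad f(x)||^2."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        sq_norm = sum(v * v for v in g)
        if sq_norm == 0.0:  # already at a stationary point
            break
        gamma = (f(x) - f_star) / sq_norm
        x = [xi - gamma * gi for xi, gi in zip(x, g)]
    return x

x_final = polyak_gradient_method([3.0, -2.0])
print(x_final, f(x_final))
```

Near the minimizer the objective behaves like $\frac{1}{2}\|x\|^2$, so the Polyak step is close to $1/2$ and the iterates roughly halve at every iteration.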
-
Model Predictive Control for Joint Ramping and Regulation-Type Service from Distributed Energy Resource Aggregations
Authors:
Joel Mathias,
Rajasekhar Anguluri,
Oliver Kosut,
Lalitha Sankar
Abstract:
Distributed energy resources (DERs) such as grid-responsive loads and batteries can be harnessed to provide ramping and regulation services across the grid. This paper concerns the problem of optimal allocation of different classes of DERs, where each class is an aggregation of similar DERs, to balance net-demand forecasts. The resulting resource allocation problem is solved using model-predictive control (MPC) that utilizes a rolling sequence of finite time-horizon constrained optimizations. This is based on the concept that load forecasts are more accurate in the short term, so each optimization in the rolling sequence uses more accurate short-term load forecasts while ensuring satisfaction of capacity and dynamical constraints. Simulations demonstrate that the MPC solution can indeed reduce the ramping required from bulk generation, while mitigating near-real-time grid disturbances.
Submitted 5 May, 2024;
originally announced May 2024.
-
Adaptive Methods for Variational Inequalities under Relaxed Smoothness Assumption
Authors:
Daniil Vankov,
Angelia Nedich,
Lalitha Sankar
Abstract:
Variational Inequality (VI) problems have attracted great interest in the machine learning (ML) community due to their application in adversarial and multi-agent training. Despite the relevance of VIs to ML, the oft-used strong-monotonicity and Lipschitz continuity assumptions on VI problems are restrictive and do not hold in practice. To address this, we relax smoothness and monotonicity assumptions and study structured non-monotone operators under generalized smoothness. The key idea behind our results is the use of adaptive stepsizes. We prove the first-known convergence results for solving generalized smooth VIs for three popular methods, namely, the projection, Korpelevich, and Popov methods. Our convergence rate results for generalized smooth VIs match or improve existing results on smooth VIs. We present numerical experiments that support our theoretical guarantees and highlight the efficiency of the proposed adaptive stepsizes.
Submitted 8 February, 2024;
originally announced February 2024.
-
Last Iterate Convergence of Popov Method for Non-monotone Stochastic Variational Inequalities
Authors:
Daniil Vankov,
Angelia Nedich,
Lalitha Sankar
Abstract:
This paper focuses on non-monotone stochastic variational inequalities (SVIs) that may not have a unique solution. A commonly used efficient algorithm to solve VIs is the Popov method, which is known to have the optimal convergence rate for VIs with Lipschitz continuous and strongly monotone operators. We introduce a broader class of structured non-monotone operators, namely $p$-quasi sharp operators ($p > 0$), which allows a tractable analysis of the convergence behavior of algorithms. We show that the stochastic Popov method converges almost surely to a solution for all operators from this class under a linear growth condition. In addition, we obtain the last iterate convergence rate (in expectation) for the method under a linear growth condition for $2$-quasi sharp operators. Based on our analysis, we refine the results for smooth $2$-quasi sharp and $p$-quasi sharp operators (on a compact set), and obtain the optimal convergence rates. We further provide numerical experiments that demonstrate the advantages of the stochastic Popov method over the stochastic projection method for solving SVIs.
Submitted 25 October, 2023;
originally announced October 2023.
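For readers unfamiliar with the Popov method discussed in this abstract, a minimal deterministic sketch follows. The update is the standard one, $y_k = P_X(x_k - \gamma F(y_{k-1}))$ and $x_{k+1} = P_X(x_k - \gamma F(y_k))$, which needs only one new operator evaluation per iteration. The toy operator, the box feasible set, the fixed stepsize, and the initialization $y_{-1} = x_0$ are assumptions for illustration; the paper itself treats stochastic, non-monotone operators, which this sketch does not.

```python
import math

def project_box(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (assumed feasible set)."""
    return [min(max(v, lo), hi) for v in x]

def operator(x):
    """Toy strongly monotone operator F(x) = Mx, M = [[1, 1], [-1, 1]] (assumption)."""
    return [x[0] + x[1], -x[0] + x[1]]

def step(x, g, gamma):
    """One projected step: P(x - gamma * g)."""
    return project_box([xi - gamma * gi for xi, gi in zip(x, g)])

def popov(x0, gamma=0.1, iters=500):
    """Popov method: reuses F(y_{k-1}) for the extrapolation step."""
    x = list(x0)
    g_prev = operator(x)            # F(y_{-1}) with y_{-1} = x_0 (assumed initialization)
    for _ in range(iters):
        y = step(x, g_prev, gamma)  # extrapolate with the stored operator value
        g_prev = operator(y)        # the only new evaluation in this iteration
        x = step(x, g_prev, gamma)  # update with the fresh operator value
    return x

print(popov([1.0, 1.0]))
```

Compared with the Korpelevich method, which evaluates the operator twice per iteration, this variant trades one evaluation for the use of a stale value in the extrapolation step.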
-
Robust Model Selection of Gaussian Graphical Models
Authors:
Abrar Zahin,
Rajasekhar Anguluri,
Lalitha Sankar,
Oliver Kosut,
Gautam Dasarathy
Abstract:
In Gaussian graphical model selection, noise-corrupted samples present significant challenges. It is known that even minimal amounts of noise can obscure the underlying structure, leading to fundamental identifiability issues. A recent line of work addressing this "robust model selection" problem narrows its focus to tree-structured graphical models. Even within this specific class of models, exact structure recovery is shown to be impossible. However, several algorithms have been developed that are known to provably recover the underlying tree-structure up to an (unavoidable) equivalence class.
In this paper, we extend these results beyond tree-structured graphs. We first characterize the equivalence class up to which general graphs can be recovered in the presence of noise. Despite the inherent ambiguity (which we prove is unavoidable), the structure that can be recovered reveals local clustering information and global connectivity patterns in the underlying model. Such information is useful in a range of real-world problems, including power grids, social networks, protein-protein interactions, and neural structures. We then propose an algorithm which provably recovers the underlying graph up to the identified ambiguity. We further provide finite sample guarantees in the high-dimensional regime for our algorithm and validate our results through numerical simulations.
Submitted 7 May, 2024; v1 submitted 10 November, 2022;
originally announced November 2022.
-
The Saddle-Point Accountant for Differential Privacy
Authors:
Wael Alghamdi,
Shahab Asoodeh,
Flavio P. Calmon,
Juan Felipe Gomez,
Oliver Kosut,
Lalitha Sankar,
Fei Wei
Abstract:
We introduce a new differential privacy (DP) accountant called the saddle-point accountant (SPA). SPA approximates privacy guarantees for the composition of DP mechanisms in an accurate and fast manner. Our approach is inspired by the saddle-point method -- a ubiquitous numerical technique in statistics. We prove rigorous performance guarantees by deriving upper and lower bounds for the approximation error offered by SPA. The crux of SPA is a combination of large-deviation methods with central limit theorems, which we derive via exponentially tilting the privacy loss random variables corresponding to the DP mechanisms. One key advantage of SPA is that it runs in constant time for the $n$-fold composition of a privacy mechanism. Numerical experiments demonstrate that SPA achieves comparable accuracy to state-of-the-art accounting methods with a faster runtime.
Submitted 19 August, 2022;
originally announced August 2022.
-
Parameter Estimation in Ill-conditioned Low-inertia Power Systems
Authors:
Rajasekhar Anguluri,
Lalitha Sankar,
Oliver Kosut
Abstract:
This paper examines model parameter estimation in dynamic power systems whose governing electro-mechanical equations are ill-conditioned or singular. This ill-conditioning arises because converter-interfaced generators contribute zero or small inertia. Consequently, the overall system inertia decreases, resulting in low-inertia power systems. We show that the standard state-space model based on least squares or subspace estimators fails to exist for these models. We overcome this challenge by considering a least-squares estimator directly on the coupled swing-equation model rather than on its transformed first-order state-space form. We specifically focus on estimating inertia (mechanical and virtual) and damping constants, although our method is general enough for estimating other parameters. Our theoretical analysis highlights the role of network topology on the parameter estimates of an individual generator. For generators with greater connectivity, estimation of the associated parameters is more susceptible to variations in other generator states. Furthermore, we numerically show that estimating the parameters while ignoring their ill-conditioning aspects yields highly unreliable results.
Submitted 8 August, 2022;
originally announced August 2022.
-
Localization and Estimation of Unknown Forced Inputs: A Group LASSO Approach
Authors:
Rajasekhar Anguluri,
Lalitha Sankar,
Oliver Kosut
Abstract:
We model and study the problem of localizing a set of sparse forcing inputs for linear dynamical systems from noisy measurements when the initial state is unknown. This problem is of particular relevance to detecting forced oscillations in electric power networks. We express measurements as an additive model comprising the initial state and inputs grouped over time, both expanded in terms of the basis functions (i.e., impulse response coefficients). Using this model, with probabilistic guarantees, we recover the locations and simultaneously estimate the initial state and forcing inputs using a variant of the group LASSO (least absolute shrinkage and selection operator) method. Specifically, we provide a tight upper bound on: (i) the probability that the group LASSO estimator wrongly identifies the source locations, and (ii) the $\ell_2$-norm of the estimation error. Our bounds explicitly depend upon the length of the measurement horizon, the noise statistics, the number of inputs and sensors, and the singular values of impulse response matrices. Our theoretical analysis is one of the first to provide a complete treatment for the group LASSO estimator for linear dynamical systems under input-to-output delay assumptions. Finally, we validate our results on synthetic models and the IEEE 68-bus, 16-machine system.
Submitted 19 January, 2022;
originally announced January 2022.
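The group LASSO penalty is what makes entire groups of coefficients (here, an input's values grouped over time) vanish together. A minimal sketch of the standard building block, the proximal operator of $\lambda \|v\|_2$ for a single group (block soft-thresholding), is given below. It illustrates the generic penalty only and is not the specific estimator variant analyzed in the paper; the function name `group_soft_threshold` is hypothetical.

```python
import math

def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_2 for one group:
    shrink the whole group toward zero, and zero it out if ||v||_2 <= lam."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= lam:
        return [0.0 for _ in v]
    scale = 1.0 - lam / norm
    return [scale * c for c in v]

# A group with large norm is shrunk but kept; a weak group is removed entirely.
print(group_soft_threshold([3.0, 4.0], 1.0))
print(group_soft_threshold([0.3, 0.4], 1.0))
```

Because the threshold acts on the norm of the whole group, a location is either selected with all of its time samples or discarded altogether, which is the mechanism behind localization.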
-
Being Properly Improper
Authors:
Tyler Sypherd,
Richard Nock,
Lalitha Sankar
Abstract:
Properness for supervised losses stipulates that the loss function shapes the learning algorithm towards the true posterior of the data generating distribution. Unfortunately, data in modern machine learning can be corrupted or twisted in many ways. Hence, optimizing a proper loss function on twisted data could perilously lead the learning algorithm towards the twisted posterior, rather than to the desired clean posterior. Many papers cope with specific twists (e.g., label/feature/adversarial noise), but there is a growing need for a unified and actionable understanding atop properness. Our chief theoretical contribution is a generalization of the properness framework with a notion called twist-properness, which delineates loss functions with the ability to "untwist" the twisted posterior into the clean posterior. Notably, we show that a nontrivial extension of a loss function called $α$-loss, which was first introduced in information theory, is twist-proper. We study the twist-proper $α$-loss under a novel boosting algorithm, called PILBoost, and provide formal and experimental results for this algorithm. Our overarching practical conclusion is that the twist-proper $α$-loss outperforms the proper $\log$-loss on several variants of twisted data.
Submitted 31 January, 2022; v1 submitted 18 June, 2021;
originally announced June 2021.
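For orientation, the sketch below evaluates the base $α$-loss as it is commonly defined in the earlier information-theoretic literature: $\ell_α(p) = \frac{α}{α-1}\bigl(1 - p^{(α-1)/α}\bigr)$ for $α \in (0,1) \cup (1,\infty)$, where $p$ is the probability assigned to the true label, with the limits $-\log p$ at $α = 1$ and $1 - p$ at $α = \infty$. This is an assumption about the base loss only; the nontrivial twist-proper extension introduced in the paper is not reproduced here, and the function name `alpha_loss` is hypothetical.

```python
import math

def alpha_loss(p_true, alpha):
    """Base alpha-loss of the probability p_true assigned to the true label.
    alpha = 1 gives log-loss; alpha = inf gives 1 - p_true."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if alpha <= 0.0:
        raise ValueError("this sketch only covers alpha > 0")
    if alpha == 1.0:
        return -math.log(p_true)
    if math.isinf(alpha):
        return 1.0 - p_true
    return alpha / (alpha - 1.0) * (1.0 - p_true ** ((alpha - 1.0) / alpha))

for a in (0.5, 1.0, 2.0, math.inf):
    print(a, alpha_loss(0.5, a))
```

Larger values of $α$ flatten the penalty on low-confidence predictions, which is the usual intuition for why such losses can be more tolerant of corrupted labels than log-loss.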