Showing 1–29 of 29 results for author: Lavaei, J

  1. arXiv:2503.19091  [pdf, other]

    math.OC cs.CC cs.LG math.NA stat.ML

    High Probability Complexity Bounds of Trust-Region Stochastic Sequential Quadratic Programming with Heavy-Tailed Noise

    Authors: Yuchen Fang, Javad Lavaei, Sen Na

    Abstract: In this paper, we consider nonlinear optimization problems with a stochastic objective and deterministic equality constraints. We propose a Trust-Region Stochastic Sequential Quadratic Programming (TR-SSQP) method and establish its high-probability iteration complexity bounds for identifying first- and second-order $ε$-stationary points. In our algorithm, we assume that exact objective values, gra…

    Submitted 6 April, 2025; v1 submitted 24 March, 2025; originally announced March 2025.

    Comments: 50 pages, 5 figures

  2. arXiv:2503.16673  [pdf, other]

    math.OC cs.CC cs.LG eess.SY

    Subgradient Method for System Identification with Non-Smooth Objectives

    Authors: Baturalp Yalcin, Javad Lavaei

    Abstract: This paper investigates a subgradient-based algorithm to solve the system identification problem for linear time-invariant systems with non-smooth objectives. This is essential for robust system identification in safety-critical applications. While existing work provides theoretical exact recovery guarantees using optimization solvers, the design of fast learning algorithms with convergence guaran…

    Submitted 20 March, 2025; originally announced March 2025.

    Comments: 8 pages, 5 figures

    MSC Class: 62; 90; 93

  3. arXiv:2502.12391  [pdf, other]

    cs.LG

    Reward-Safety Balance in Offline Safe RL via Diffusion Regularization

    Authors: Junyu Guo, Zhi Zheng, Donghao Ying, Ming Jin, Shangding Gu, Costas Spanos, Javad Lavaei

    Abstract: Constrained reinforcement learning (RL) seeks high-performance policies under safety constraints. We focus on an offline setting where the agent has only a fixed dataset -- common in realistic tasks to prevent unsafe exploration. To address this, we propose Diffusion-Regularized Constrained Offline Reinforcement Learning (DRCORL), which first uses a diffusion model to capture the behavioral policy…

    Submitted 17 February, 2025; originally announced February 2025.

  4. arXiv:2409.00276  [pdf, other]

    math.OC cs.CR cs.LG eess.SY

    Exact Recovery Guarantees for Parameterized Nonlinear System Identification Problem under Sparse Disturbances or Semi-Oblivious Attacks

    Authors: Haixiang Zhang, Baturalp Yalcin, Javad Lavaei, Eduardo D. Sontag

    Abstract: In this work, we study the problem of learning a nonlinear dynamical system by parameterizing its dynamics using basis functions. We assume that disturbances occur at each time step with an arbitrary probability $p$, which models the sparsity level of the disturbance vectors over time. These disturbances are drawn from an arbitrary, unknown probability distribution, which may depend on past distur…

    Submitted 20 March, 2025; v1 submitted 30 August, 2024; originally announced September 2024.

    Comments: 43 pages

    MSC Class: 62; 90; 93

  5. arXiv:2405.16601  [pdf, other]

    cs.LG

    A CMDP-within-online framework for Meta-Safe Reinforcement Learning

    Authors: Vanshaj Khattar, Yuhao Ding, Bilgehan Sel, Javad Lavaei, Ming Jin

    Abstract: Meta-reinforcement learning has widely been used as a learning-to-learn framework to solve unseen tasks with limited experience. However, the aspect of constraint violations has not been adequately addressed in the existing works, making their application restricted in real-world settings. In this paper, we study the problem of meta-safe reinforcement learning (Meta-SRL) through the CMDP-within-on…

    Submitted 26 May, 2024; originally announced May 2024.

    Journal ref: ICLR 2023

  6. arXiv:2405.16053  [pdf, other]

    cs.LG

    Pausing Policy Learning in Non-stationary Reinforcement Learning

    Authors: Hyunin Lee, Ming Jin, Javad Lavaei, Somayeh Sojoudi

    Abstract: Real-time inference is a challenge of real-world reinforcement learning due to temporal differences in time-varying environments: the system collects data from the past, updates the decision model in the present, and deploys it in the future. We tackle a common belief that continually updating the decision is optimal to minimize the temporal gap. We propose forecasting an online reinforcement lear…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: conference

  7. arXiv:2403.06056  [pdf, other]

    math.OC cs.LG eess.SP

    Absence of spurious solutions far from ground truth: A low-rank analysis with high-order losses

    Authors: Ziye Ma, Ying Chen, Javad Lavaei, Somayeh Sojoudi

    Abstract: Matrix sensing problems exhibit pervasive non-convexity, plaguing optimization with a proliferation of suboptimal spurious solutions. Avoiding convergence to these critical points poses a major challenge. This work provides new theoretical insights that help demystify the intricacies of the non-convex landscape. In this work, we prove that under certain conditions, critical points sufficiently dis…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted by AISTATS 2024

  8. arXiv:2310.15549  [pdf, other]

    math.OC cs.LG

    Algorithmic Regularization in Tensor Optimization: Towards a Lifted Approach in Matrix Sensing

    Authors: Ziye Ma, Javad Lavaei, Somayeh Sojoudi

    Abstract: Gradient descent (GD) is crucial for generalization in machine learning models, as it induces implicit regularization, promoting compact representations. In this work, we examine the role of GD in inducing implicit regularization for tensor optimization, particularly within the context of the lifted matrix sensing framework. This framework has been recently proposed to address the non-convex matri…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: NeurIPS23 Poster

  9. arXiv:2309.14989  [pdf, other]

    cs.LG

    Tempo Adaptation in Non-stationary Reinforcement Learning

    Authors: Hyunin Lee, Yuhao Ding, Jongmin Lee, Ming Jin, Javad Lavaei, Somayeh Sojoudi

    Abstract: We first raise and tackle a ``time synchronization'' issue between the agent and the environment in non-stationary reinforcement learning (RL), a crucial factor hindering its real-world applications. In reality, environmental changes occur over wall-clock time ($t$) rather than episode progress ($k$), where wall-clock time signifies the actual elapsed time within the fixed duration $t \in [0, T]$.…

    Submitted 27 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: 53 pages. To be published in Neural Information Processing Systems (NeurIPS), 2023

  10. arXiv:2305.17568  [pdf, other]

    cs.LG math.OC

    Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities

    Authors: Donghao Ying, Yunkai Zhang, Yuhao Ding, Alec Koppel, Javad Lavaei

    Abstract: We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize an aggregate sum of local objectives while satisfying their own safety constraints. The objective and constraints are described by general utilities, i.e., nonlinear functions of the long-term state-action occupancy measure, which encompass broader decision-making goals such as risk, exploratio…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: 50 pages

  11. arXiv:2305.17567  [pdf, other]

    cs.GT math.OC

    No-Regret Learning in Dynamic Competition with Reference Effects Under Logit Demand

    Authors: Mengzi Amy Guo, Donghao Ying, Javad Lavaei, Zuo-Jun Max Shen

    Abstract: This work is dedicated to the algorithm design in a competitive framework, with the primary goal of learning a stable equilibrium. We consider the dynamic price competition between two firms operating within an opaque marketplace, where each firm lacks information about its competitor. The demand follows the multinomial logit (MNL) choice model, which depends on the consumers' observed price and t…

    Submitted 27 May, 2023; originally announced May 2023.

  12. arXiv:2305.10506  [pdf, other]

    cs.LG math.OC

    Exact Recovery for System Identification with More Corrupt Data than Clean Data

    Authors: Baturalp Yalcin, Haixiang Zhang, Javad Lavaei, Murat Arcak

    Abstract: This paper investigates the system identification problem for linear discrete-time systems under adversaries and analyzes two lasso-type estimators. We examine both asymptotic and non-asymptotic properties of these estimators in two separate scenarios, corresponding to deterministic and stochastic models for the attack times. Since the samples collected from the system are correlated, the existing…

    Submitted 24 April, 2024; v1 submitted 17 May, 2023; originally announced May 2023.

    MSC Class: 62; 90; 93

  13. arXiv:2302.07938  [pdf, ps, other]

    cs.LG cs.AI cs.MA

    Scalable Multi-Agent Reinforcement Learning with General Utilities

    Authors: Donghao Ying, Yuhao Ding, Alec Koppel, Javad Lavaei

    Abstract: We study the scalable multi-agent reinforcement learning (MARL) with general utilities, defined as nonlinear functions of the team's long-term state-action occupancy measure. The objective is to find a localized policy that maximizes the average of the team's local utility functions without the full observability of each agent in the team. By exploiting the spatial correlation decay property of th…

    Submitted 26 August, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: Supplementary material for the contribution to American Control Conference 2023 under the same title

  14. arXiv:2302.07828  [pdf, other]

    math.OC cs.LG

    Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points

    Authors: Ziye Ma, Igor Molybog, Javad Lavaei, Somayeh Sojoudi

    Abstract: This paper studies the role of over-parametrization in solving non-convex optimization problems. The focus is on the important class of low-rank matrix sensing, where we propose an infinite hierarchy of non-convex problems via the lifting technique and the Burer-Monteiro factorization. This contrasts with the existing over-parametrization technique where the search rank is limited by the dimension…

    Submitted 15 February, 2023; originally announced February 2023.

  15. arXiv:2211.10815  [pdf, other]

    cs.LG math.OC

    Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design

    Authors: Yuhao Ding, Ming Jin, Javad Lavaei

    Abstract: We study risk-sensitive reinforcement learning (RL) based on an entropic risk measure in episodic non-stationary Markov decision processes (MDPs). Both the reward functions and the state transition kernels are unknown and allowed to vary arbitrarily over time with a budget on their cumulative variations. When this variation budget is known a priori, we propose two restart-based algorithms, namely R…

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: 33 pages, 3 figures, AAAI 2023. arXiv admin note: text overlap with arXiv:2111.03947, arXiv:2102.05406 by other authors

  16. arXiv:2208.07469  [pdf, ps, other]

    math.OC cs.LG

    Semidefinite Programming versus Burer-Monteiro Factorization for Matrix Sensing

    Authors: Baturalp Yalcin, Ziye Ma, Javad Lavaei, Somayeh Sojoudi

    Abstract: Many fundamental low-rank optimization problems, such as matrix completion, phase synchronization/retrieval, power system state estimation, and robust PCA, can be formulated as the matrix sensing problem. Two main approaches for solving matrix sensing are based on semidefinite programming (SDP) and Burer-Monteiro (B-M) factorization. The SDP method suffers from high computational and space complex…

    Submitted 15 August, 2022; originally announced August 2022.

    Comments: 21 pages

    MSC Class: 90C22; 90C26

  17. arXiv:2205.10715  [pdf, other]

    cs.LG math.OC

    Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction

    Authors: Donghao Ying, Mengzi Amy Guo, Hyunin Lee, Yuhao Ding, Javad Lavaei, Zuo-Jun Max Shen

    Abstract: We study Concave Constrained Markov Decision Processes (Concave CMDPs) where both the objective and constraints are defined as concave functions of the state-action occupancy measure. We propose the Variance-Reduced Primal-Dual Policy Gradient Algorithm (VR-PDPG), which updates the primal variable via policy gradient ascent and the dual variable via projected sub-gradient descent. Despite the chal…

    Submitted 26 May, 2024; v1 submitted 21 May, 2022; originally announced May 2022.

  18. arXiv:2201.11965  [pdf, ps, other]

    cs.LG

    Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints

    Authors: Yuhao Ding, Javad Lavaei

    Abstract: We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which plays a central role in ensuring the safety of RL in time-varying environments. In this problem, the reward/utility functions and the state transition functions are both allowed to vary arbitrarily over time as long as their cumul…

    Submitted 19 November, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: 32 pages, AAAI 2023

  19. arXiv:2110.10279  [pdf, other]

    math.OC cs.LG

    Factorization Approach for Low-complexity Matrix Completion Problems: Exponential Number of Spurious Solutions and Failure of Gradient Methods

    Authors: Baturalp Yalcin, Haixiang Zhang, Javad Lavaei, Somayeh Sojoudi

    Abstract: It is well-known that the Burer-Monteiro (B-M) factorization approach can efficiently solve low-rank matrix optimization problems under the RIP condition. It is natural to ask whether B-M factorization-based methods can succeed on any low-rank matrix optimization problems with a low information-theoretic complexity, i.e., polynomial-time solvable problems that have a unique solution. In this work,…

    Submitted 19 October, 2021; originally announced October 2021.

    Comments: 21 pages, 1 figure

  20. arXiv:2110.10117  [pdf, other]

    cs.LG

    Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization

    Authors: Yuhao Ding, Junzi Zhang, Hyunin Lee, Javad Lavaei

    Abstract: Entropy regularization is an efficient technique for encouraging exploration and preventing a premature convergence of (vanilla) policy gradient methods in reinforcement learning (RL). However, the theoretical understanding of entropy-regularized RL algorithms has been limited. In this paper, we revisit the classical entropy regularized policy gradient methods with the soft-max policy parametrizat…

    Submitted 13 July, 2024; v1 submitted 19 October, 2021; originally announced October 2021.

  21. arXiv:2110.10116  [pdf, ps, other]

    cs.LG math.OC

    On the Global Optimum Convergence of Momentum-based Policy Gradient

    Authors: Yuhao Ding, Junzi Zhang, Javad Lavaei

    Abstract: Policy gradient (PG) methods are popular and efficient for large-scale reinforcement learning due to their relative stability and incremental nature. In recent years, the empirical success of PG methods has led to the development of a theoretical foundation for these methods. In this work, we generalize this line of research by studying the global convergence of stochastic PG methods with momentum…

    Submitted 22 May, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: AISTATS 2022

  22. arXiv:2110.08923  [pdf, ps, other]

    cs.LG math.OC

    A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization

    Authors: Donghao Ying, Yuhao Ding, Javad Lavaei

    Abstract: We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total utility. By leveraging the entropy regularization, our theoretical analysis shows that its Lagrangian dual function is smooth and the Lagrangian duality gap can be…

    Submitted 7 April, 2023; v1 submitted 17 October, 2021; originally announced October 2021.

    Comments: 24 pages, AISTATS22

  23. arXiv:2105.08232  [pdf, other]

    math.OC cs.LG stat.ML

    Sharp Restricted Isometry Property Bounds for Low-rank Matrix Recovery Problems with Corrupted Measurements

    Authors: Ziye Ma, Yingjie Bi, Javad Lavaei, Somayeh Sojoudi

    Abstract: In this paper, we study a general low-rank matrix recovery problem with linear measurements corrupted by some noise. The objective is to understand under what conditions on the restricted isometry property (RIP) of the problem local search methods can find the ground truth with a small error. By analyzing the landscape of the non-convex problem, we first propose a global guarantee on the maximum d…

    Submitted 25 July, 2023; v1 submitted 17 May, 2021; originally announced May 2021.

  24. arXiv:2006.00453  [pdf, ps, other]

    cs.LG math.OC stat.ML

    When Does MAML Objective Have Benign Landscape?

    Authors: Igor Molybog, Javad Lavaei

    Abstract: The paper studies the complexity of the optimization problem behind the Model-Agnostic Meta-Learning (MAML) algorithm. The goal of the study is to determine the global convergence of MAML on sequential decision-making tasks possessing a common structure. We are curious to know when, if at all, the benign landscape of the underlying tasks results in a benign landscape of the corresponding MAML obje…

    Submitted 10 December, 2020; v1 submitted 31 May, 2020; originally announced June 2020.

    Comments: 12 pages, 3 figures

  25. arXiv:1901.01631  [pdf, other]

    cs.LG math.OC stat.ML

    Sharp Restricted Isometry Bounds for the Inexistence of Spurious Local Minima in Nonconvex Matrix Recovery

    Authors: Richard Y. Zhang, Somayeh Sojoudi, Javad Lavaei

    Abstract: Nonconvex matrix recovery is known to contain no spurious local minima under a restricted isometry property (RIP) with a sufficiently small RIP constant $δ$. If $δ$ is too large, however, then counterexamples containing spurious local minima are known to exist. In this paper, we introduce a proof technique that is capable of establishing sharp thresholds on $δ$ to guarantee the inexistence of spur…

    Submitted 13 June, 2019; v1 submitted 6 January, 2019; originally announced January 2019.

    Comments: v2: fixed several typos; v3: accepted at JMLR

    Journal ref: Journal of Machine Learning Research 20 (114): 1-34, 2019

  26. arXiv:1810.11505  [pdf, other]

    eess.SY cs.LG

    Stability-certified reinforcement learning: A control-theoretic perspective

    Authors: Ming Jin, Javad Lavaei

    Abstract: We investigate the important problem of certifying stability of reinforcement learning policies when interconnected with nonlinear dynamical systems. We show that by regulating the input-output gradients of policies, strong guarantees of robust stability can be obtained based on a proposed semidefinite programming feasibility problem. The method is able to certify a large set of stabilizing contro…

    Submitted 26 October, 2018; originally announced October 2018.

  27. arXiv:1805.10251  [pdf, other]

    cs.LG math.OC stat.ML

    How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?

    Authors: Richard Y. Zhang, Cédric Josz, Somayeh Sojoudi, Javad Lavaei

    Abstract: When the linear measurements of an instance of low-rank matrix recovery satisfy a restricted isometry property (RIP)---i.e. they are approximately norm-preserving---the problem is known to contain no spurious local minima, so exact recovery is guaranteed. In this paper, we show that moderate RIP is not enough to eliminate spurious local minima, so existing results can only hold for near-perfect RI…

    Submitted 30 October, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

    Comments: 32nd Conference on Neural Information Processing Systems (NIPS 2018)

  28. arXiv:1204.4419  [pdf, ps, other]

    math.OC cs.IT eess.SY

    Geometry of Power Flows and Optimization in Distribution Networks

    Authors: Javad Lavaei, David Tse, Baosen Zhang

    Abstract: We investigate the geometry of injection regions and its relationship to optimization of power flows in tree networks. The injection region is the set of all vectors of bus power injections that satisfy the network and operation constraints. The geometrical object of interest is the set of Pareto-optimal points of the injection region. If the voltage magnitudes are fixed, the injection region of a…

    Submitted 19 August, 2013; v1 submitted 19 April, 2012; originally announced April 2012.

    Comments: To appear in IEEE Transactions on Power Systems

  29. arXiv:1204.1106  [pdf, ps, other]

    math.OC cs.DC eess.SY

    Message Passing for Dynamic Network Energy Management

    Authors: Matt Kraning, Eric Chu, Javad Lavaei, Stephen Boyd

    Abstract: We consider a network of devices, such as generators, fixed loads, deferrable loads, and storage devices, each with its own dynamic constraints and objective, connected by lossy capacitated lines. The problem is to minimize the total network objective subject to the device and line constraints, over a given time horizon. This is a large optimization problem, with variables for consumption or gener…

    Submitted 4 April, 2012; originally announced April 2012.

    Comments: Submitted to IEEE Transactions on Smart Grid