Search | arXiv e-print repository

Localized LoRA: A Structured Low-Rank Approximation for Efficient Fine-Tuning

Abstract: Parameter-efficient fine-tuning (PEFT) methods, such as LoRA, offer compact and effective alternatives to full model fine-tuning by introducing low-rank updates to pretrained weights. However, most existing approaches rely on global low-rank structures, which can overlook spatial patterns spread across the parameter space. In this work, we propose Localized LoRA, a generalized framework that models weight updates as a composition of low-rank matrices applied to structured blocks of the weight matrix. This formulation enables dense, localized updates throughout the parameter space without increasing the total number of trainable parameters. We provide a formal comparison between global, diagonal-local, and fully localized low-rank approximations, and show that our method consistently achieves lower approximation error under matched parameter budgets. Experiments on both synthetic and practical settings demonstrate that Localized LoRA offers a more expressive and adaptable alternative to existing methods, enabling efficient fine-tuning with improved performance.

Submitted 30 May, 2025; originally announced June 2025.
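
The Localized LoRA abstract above compares global and block-wise low-rank approximations under a matched parameter budget. The following is a minimal NumPy sketch of that comparison, not the paper's implementation: it assumes truncated SVD as the low-rank approximation, an equal k x k block grid, and a synthetic matrix built so that every block has low rank. All function names (`global_low_rank`, `blockwise_low_rank`, `make_block_low_rank`, and the parameter counters) are hypothetical.

```python
import numpy as np


def global_low_rank(W, r):
    """Best rank-r approximation of W (Frobenius norm) via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]


def blockwise_low_rank(W, k, r):
    """Split W into a k x k grid of equal blocks; approximate each block with rank r."""
    m, n = W.shape
    assert m % k == 0 and n % k == 0, "this sketch assumes equal-sized blocks"
    bm, bn = m // k, n // k
    out = np.empty_like(W)
    for i in range(k):
        for j in range(k):
            rows = slice(i * bm, (i + 1) * bm)
            cols = slice(j * bn, (j + 1) * bn)
            out[rows, cols] = global_low_rank(W[rows, cols], r)
    return out


def n_params_global(m, n, r):
    """Parameters in one pair of factors: (m x r) and (r x n)."""
    return r * (m + n)


def n_params_blockwise(m, n, k, r):
    """Parameters in k*k pairs of factors, one pair per block."""
    return k * k * r * (m // k + n // k)


def make_block_low_rank(m, n, k, r, rng):
    """Synthetic test matrix (an assumption of this sketch): each block has rank <= r."""
    bm, bn = m // k, n // k
    W = np.empty((m, n))
    for i in range(k):
        for j in range(k):
            W[i * bm:(i + 1) * bm, j * bn:(j + 1) * bn] = (
                rng.standard_normal((bm, r)) @ rng.standard_normal((r, bn))
            )
    return W


if __name__ == "__main__":
    m = n = 64
    k, r = 4, 2          # 4 x 4 blocks, rank 2 per block
    R = k * r            # global rank with the same parameter count
    rng = np.random.default_rng(0)
    W = make_block_low_rank(m, n, k, r, rng)

    print("global params:   ", n_params_global(m, n, R))
    print("blockwise params:", n_params_blockwise(m, n, k, r))
    print("global error:    ", np.linalg.norm(W - global_low_rank(W, R)))
    print("blockwise error: ", np.linalg.norm(W - blockwise_low_rank(W, k, r)))
```

With equal blocks, the block-wise scheme uses k * r * (m + n) parameters, so a global rank of R = k * r gives the same budget. The synthetic matrix here is deliberately favorable to the block-wise scheme; on other matrices the ordering of the two errors may differ.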

arXiv:2110.03950 [pdf, ps, other]

Nonconvex-Nonconcave Min-Max Optimization with a Small Maximization Domain

Authors: Dmitrii M. Ostrovskii, Babak Barazandeh, Meisam Razaviyayn

Abstract: We study the problem of finding approximate first-order stationary points in optimization problems of the form $\min_{x \in X} \max_{y \in Y} f(x,y)$, where the sets $X,Y$ are convex and $Y$ is compact. The objective function $f$ is smooth, but assumed neither convex in $x$ nor concave in $y$. Our approach relies upon replacing the function $f(x,\cdot)$ with its $k$th order Taylor approximation (in $y$) and finding a near-stationary point in the resulting surrogate problem. To guarantee its success, we establish the following result: let the Euclidean diameter of $Y$ be small in terms of the target accuracy $\varepsilon$, namely $O(\varepsilon^{\frac{2}{k+1}})$ for $k \in \mathbb{N}$ and $O(\varepsilon)$ for $k = 0$, with the constant factors controlled by certain regularity parameters of $f$; then any $\varepsilon$-stationary point in the surrogate problem remains $O(\varepsilon)$-stationary for the initial problem. Moreover, we show that these upper bounds are nearly optimal: the aforementioned reduction provably fails when the diameter of $Y$ is larger. For $0 \le k \le 2$ the surrogate function can be efficiently maximized in $y$; our general approximation result then leads to efficient algorithms for finding a near-stationary point in nonconvex-nonconcave min-max problems, for which we also provide convergence guarantees.

Submitted 8 October, 2021; originally announced October 2021.

Comments: 50 pages

arXiv:2106.06075 [pdf, other]

doi 10.1016/j.sigpro.2021.108245

A Decentralized Adaptive Momentum Method for Solving a Class of Min-Max Optimization Problems

Authors: Babak Barazandeh, Tianjian Huang, George Michailidis

Abstract: Min-max saddle point games have recently been intensely studied, due to their wide range of applications, including training Generative Adversarial Networks (GANs). However, most of the recent efforts for solving them are limited to special regimes such as convex-concave games. Further, it is customarily assumed that the underlying optimization problem is solved either by a single machine or, in the case of multiple machines, by machines connected in a centralized fashion, wherein each one communicates with a central node. The latter approach becomes challenging when the underlying communications network has low bandwidth. In addition, privacy considerations may dictate that certain nodes can communicate only with a subset of other nodes. Hence, it is of interest to develop methods that solve min-max games in a decentralized manner. To that end, we develop a decentralized adaptive momentum (ADAM)-type algorithm for solving min-max optimization problems under the condition that the objective function satisfies a Minty Variational Inequality condition, which is a generalization of the convex-concave case. The proposed method overcomes shortcomings of recent non-adaptive gradient-based decentralized algorithms for min-max optimization problems that do not perform well in practice and require careful tuning. In this paper, we obtain non-asymptotic rates of convergence of the proposed algorithm (coined DADAM$^3$) for finding a (stochastic) first-order Nash equilibrium point and subsequently evaluate its performance on training GANs. The extensive empirical evaluation shows that DADAM$^3$ outperforms recently developed methods, including decentralized optimistic stochastic gradient for solving such min-max problems.

Submitted 28 June, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

Journal ref: Signal Processing Volume 189, December 2021, 108245

arXiv:2105.05953 [pdf, other]

Efficient Algorithms for Estimating the Parameters of Mixed Linear Regression Models

Authors: Babak Barazandeh, Ali Ghafelebashi, Meisam Razaviyayn, Ram Sriharsha

Abstract: The mixed linear regression (MLR) model is among the most exemplary statistical tools for modeling non-linear distributions using a mixture of linear models. When the additive noise in the MLR model is Gaussian, the Expectation-Maximization (EM) algorithm is widely used for maximum likelihood estimation of MLR parameters. However, when the noise is non-Gaussian, the steps of the EM algorithm may not have closed-form update rules, which makes the EM algorithm impractical. In this work, we study the maximum likelihood estimation of the parameters of the MLR model when the additive noise has a non-Gaussian distribution. In particular, we consider the case where the noise has a Laplacian distribution, and we first show that, unlike the Gaussian case, the resulting sub-problems of the EM algorithm do not have closed-form update rules, thus preventing us from using EM in this case. To overcome this issue, we propose a new algorithm based on combining the alternating direction method of multipliers (ADMM) with the EM algorithm idea. Our numerical experiments show that our method outperforms the EM algorithm in statistical accuracy and computational time in the non-Gaussian noise case.

Submitted 12 May, 2021; originally announced May 2021.

arXiv:2104.12676 [pdf, other]

Solving a class of non-convex min-max games using adaptive momentum methods

Authors: Babak Barazandeh, Davoud Ataee Tarzanagh, George Michailidis

Abstract: Adaptive momentum methods have recently attracted a lot of attention for training of deep neural networks. They use an exponential moving average of past gradients of the objective function to update both search directions and learning rates. However, these methods are not suited for solving min-max optimization problems that arise in training generative adversarial networks. In this paper, we propose an adaptive momentum min-max algorithm that generalizes adaptive momentum methods to the non-convex min-max regime. Further, we establish non-asymptotic rates of convergence for the proposed algorithm when used in a reasonably broad class of non-convex min-max optimization problems. Experimental results illustrate its superior performance vis-a-vis benchmark methods for solving such problems.

Submitted 26 April, 2021; originally announced April 2021.

arXiv:2003.08093 [pdf, other]

Solving Non-Convex Non-Differentiable Min-Max Games using Proximal Gradient Method

Authors: Babak Barazandeh, Meisam Razaviyayn

Abstract: Min-max saddle point games appear in a wide range of applications in machine learning and signal processing. Despite their wide applicability, theoretical studies are mostly limited to the special convex-concave structure. While some recent works generalized these results to special smooth non-convex cases, our understanding of non-smooth scenarios is still limited. In this work, we study a special form of non-smooth min-max games in which the objective function is (strongly) convex with respect to one of the players' decision variables. We show that a simple multi-step proximal gradient descent-ascent algorithm converges to an $ε$-first-order Nash equilibrium of the min-max game with the number of gradient evaluations being polynomial in $1/ε$. We will also show that our notion of stationarity is stronger than existing ones in the literature. Finally, we evaluate the performance of the proposed algorithm through an adversarial attack on a LASSO estimator.

Submitted 18 March, 2020; originally announced March 2020.

arXiv:1904.09775 [pdf, other]

Training generative networks using random discriminators

Authors: Babak Barazandeh, Meisam Razaviyayn, Maziar Sanjabi

Abstract: In recent years, Generative Adversarial Networks (GANs) have drawn a lot of attention for learning the underlying distribution of data in various applications. Despite their wide applicability, training GANs is notoriously difficult. This difficulty is due to the min-max nature of the resulting optimization problem and the lack of proper tools for solving general (non-convex, non-concave) min-max optimization problems. In this paper, we try to alleviate this problem by proposing a new generative network that relies on the use of random discriminators instead of adversarial design. This design helps us to avoid the min-max formulation and leads to an optimization problem that is stable and can be solved efficiently. The performance of the proposed method is evaluated using the handwritten digits (MNIST) and fashion products (Fashion-MNIST) data sets. While the resulting images are not as sharp as those from adversarial training, the use of random discriminators leads to a much faster algorithm compared to the adversarial counterpart. This observation, at the minimum, illustrates the potential of the random discriminator approach for warm-start in training GANs.

Submitted 22 April, 2019; originally announced April 2019.

arXiv:1903.06632 [pdf]

Designing an Optimal Portfolio for Iran's Stock Market with Genetic Algorithm using Neural Network Prediction of Risk and Return Stocks

Authors: Masoud Fekri, Babak Barazandeh

Abstract: Optimal capital allocation between different assets is an important financial problem, which is generally framed as the portfolio optimization problem. General models include the single-period and multi-period cases. The traditional Mean-Variance model introduced by Harry Markowitz has been the basis of many models used to solve the portfolio optimization problem. The overall goal is to achieve the highest return and lowest risk in portfolio optimization problems. In this paper, we will present an optimal portfolio based on the Markowitz Mean-Variance-Skewness model with weight constraints for short-term investment opportunities in Iran's stock market. We will use a neural network based predictor to predict the stock returns and measure the risk of stocks based on the prediction errors in the neural network. We will perform a series of experiments on our portfolio optimization model with real data from Iran's stock market indices, including the Bank, Insurance, Investment, Petroleum Products and Chemicals indices. Finally, 8 different portfolios with low, medium and high risks for different types of investors (risk-averse or risk-taking) will be designed using a genetic algorithm and analyzed.

Submitted 17 February, 2019; originally announced March 2019.

arXiv:1810.09447 [pdf]

A Method for Robust Online Classification using Dictionary Learning: Development and Assessment for Monitoring Manual Material Handling Activities Using Wearable Sensors

Authors: Babak Barazandeh, Mohammadhussein Rafieisakhaei, Sunwook Kim, Zhenyu Kong, Maury A. Nussbaum

Abstract: Classification methods based on sparse estimation have drawn much attention recently, due to their effectiveness in processing high-dimensional data such as images. In this paper, a method to improve the performance of a sparse representation classification (SRC) approach is proposed; it is then applied to the problem of online process monitoring of human workers, specifically manual material handling (MMH) operations monitored using wearable sensors (involving 111 sensor channels). Our proposed method optimizes the design matrix (aka dictionary) in the linear model used for SRC, minimizing its ill-posedness to achieve a sparse solution. This procedure is based on the idea of dictionary learning (DL): we optimize the design matrix formed by training datasets to minimize both redundancy and coherency as well as reducing the size of these datasets. Use of such optimized training data can subsequently improve classification accuracy and help decrease the computational time needed for the SRC; it is thus more applicable for online process monitoring. Performance of the proposed methodology is demonstrated using wearable sensor data obtained from manual material handling experiments, and is found to be superior to those of benchmark methods in terms of accuracy, while also requiring computational time appropriate for MMH online monitoring.

Submitted 21 October, 2018; originally announced October 2018.

arXiv:1809.08705 [pdf, other]

On the Behavior of the Expectation-Maximization Algorithm for Mixture Models

Authors: Babak Barazandeh, Meisam Razaviyayn

Abstract: Finite mixture models are among the most popular statistical models used in different data science disciplines. Despite their broad applicability, inference under these models typically leads to computationally challenging non-convex problems. While the Expectation-Maximization (EM) algorithm is the most popular approach for solving these non-convex problems, the behavior of this algorithm is not well understood. In this work, we focus on the case of mixtures of Laplacian (or Gaussian) distributions. We start by analyzing a simple equally weighted mixture of two one-dimensional Laplacian distributions and show that every local optimum of the population maximum likelihood estimation problem is globally optimal. Then, we prove that the EM algorithm converges to the ground truth parameters almost surely with random initialization. Our result extends the existing results for the Gaussian distribution to the Laplacian distribution. Then we numerically study the behavior of mixture models with more than two components. Motivated by our extensive numerical experiments, we propose a novel stochastic method for estimating the means of the components of a mixture model. Our numerical experiments show that our algorithm outperforms the naive EM algorithm in almost all scenarios.

Submitted 23 September, 2018; originally announced September 2018.

Showing 1–10 of 10 results for author: Barazandeh, B