
Showing 1–28 of 28 results for author: Balzano, L

  1. arXiv:2505.14808

    stat.ML cs.LG math.ST

    Out-of-Distribution Generalization of In-Context Learning: A Low-Dimensional Subspace Perspective

    Authors: Soo Min Kwon, Alec S. Xu, Can Yaras, Laura Balzano, Qing Qu

    Abstract: This work aims to demystify the out-of-distribution (OOD) capabilities of in-context learning (ICL) by studying linear regression tasks parameterized with low-rank covariance matrices. With such a parameterization, we can model distribution shifts as a varying angle between the subspace of the training and testing covariance matrices. We prove that a single-layer linear attention model incurs a te…

    Submitted 20 May, 2025; originally announced May 2025.

  2. arXiv:2504.09873

    cs.LG cs.AI math.NA stat.ML

    Truncated Matrix Completion - An Empirical Study

    Authors: Rishhabh Naik, Nisarg Trivedi, Davoud Ataee Tarzanagh, Laura Balzano

    Abstract: Low-rank Matrix Completion (LRMC) describes the problem where we wish to recover missing entries of a partially observed low-rank matrix. Most existing matrix completion work deals with sampling procedures that are independent of the underlying data values. While this assumption allows the derivation of nice theoretical guarantees, it seldom holds in real-world applications. In this paper, we consid…

    Submitted 14 April, 2025; originally announced April 2025.

    Journal ref: Proceedings of the 30th European Signal Processing Conference EUSIPCO 2022 847-851

  3. arXiv:2503.19859

    cs.LG eess.SP math.OC stat.CO stat.ML

    An Overview of Low-Rank Structures in the Training and Adaptation of Large Models

    Authors: Laura Balzano, Tianjiao Ding, Benjamin D. Haeffele, Soo Min Kwon, Qing Qu, Peng Wang, Zhangyang Wang, Can Yaras

    Abstract: The rise of deep learning has revolutionized data processing and prediction in signal processing and machine learning, yet the substantial computational demands of training and deploying modern large-scale deep models present significant challenges, including high computational costs and energy consumption. Recent research has uncovered a widespread phenomenon in deep networks: the emergence of lo…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: Authors are listed alphabetically; 27 pages, 10 figures

  4. arXiv:2405.03073

    math.OC stat.ML

    Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms

    Authors: Yuchen Li, Laura Balzano, Deanna Needell, Hanbaek Lyu

    Abstract: We analyze inexact Riemannian gradient descent (RGD) where Riemannian gradients and retractions are inexactly (and cheaply) computed. Our focus is on understanding when inexact RGD converges and what is the complexity in the general nonconvex and constrained setting. We answer these questions in a general framework of tangential Block Majorization-Minimization (tBMM). We establish that tBMM conver…

    Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: 23 pages, 5 figures. ICML 2024. Appendix revised

  5. arXiv:2312.10330

    math.OC stat.ML

    Convergence and complexity of block majorization-minimization for constrained block-Riemannian optimization

    Authors: Yuchen Li, Laura Balzano, Deanna Needell, Hanbaek Lyu

    Abstract: Block majorization-minimization (BMM) is a simple iterative algorithm for nonconvex optimization that sequentially minimizes a majorizing surrogate of the objective function in each block coordinate while the other block coordinates are held fixed. We consider a family of BMM algorithms for minimizing smooth nonconvex objectives, where each parameter block is constrained within a subset of a Riema…

    Submitted 6 August, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: 54 pages, 8 figures. Related work updated

  6. arXiv:2311.02960

    cs.LG cs.CV math.OC

    Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

    Authors: Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

    Abstract: Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, it remains an open question how deep networks perform hierarchical feature learning across layers. In this work, we attempt to unveil this mystery by investigating the structures of intermediate features. Motivated by our empirical findings that linear layers mimic…

    Submitted 21 May, 2025; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: 65 pages, 17 figures

  7. arXiv:2308.02802

    math.NA

    Learning physics-based reduced-order models from data using nonlinear manifolds

    Authors: Rudy Geelen, Laura Balzano, Stephen Wright, Karen Willcox

    Abstract: We present a novel method for learning reduced-order models of dynamical systems using nonlinear manifolds. First, we learn the manifold by identifying nonlinear structure in the data through a general representation learning problem. The proposed approach is driven by embeddings of low-order polynomial form. A projection onto the nonlinear manifold reveals the algebraic structure of the reduced-s…

    Submitted 19 February, 2024; v1 submitted 5 August, 2023; originally announced August 2023.

  8. arXiv:2306.13748

    math.NA

    Learning latent representations in high-dimensional state spaces using polynomial manifold constructions

    Authors: Rudy Geelen, Laura Balzano, Karen Willcox

    Abstract: We present a novel framework for learning cost-efficient latent representations in problems with high-dimensional state spaces through nonlinear dimension reduction. By enriching linear state approximations with low-order polynomial terms we account for key nonlinear interactions existing in the data thereby reducing the problem's intrinsic dimensionality. Two methods are introduced for learning t…

    Submitted 23 June, 2023; originally announced June 2023.

  9. arXiv:2301.00423

    math.OC

    A Proximal DC Algorithm for Sample Average Approximation of Chance Constrained Programming

    Authors: Peng Wang, Rujun Jiang, Qingyuan Kong, Laura Balzano

    Abstract: Chance constrained programming (CCP) refers to a type of optimization problem with uncertain constraints that are satisfied with at least a prescribed probability level. In this work, we study the sample average approximation (SAA) of chance constraints. This is an important approach to solving CCP, especially in the data-driven setting where only a sample of multiple realizations of the random ve…

    Submitted 28 April, 2025; v1 submitted 1 January, 2023; originally announced January 2023.

    Comments: 42 pages, 4 tables

  10. arXiv:2207.02829

    math.OC cs.DS cs.LG

    Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods

    Authors: Davoud Ataee Tarzanagh, Parvin Nazari, Bojian Hou, Li Shen, Laura Balzano

    Abstract: This paper introduces online bilevel optimization, in which a sequence of time-varying bilevel problems is revealed one after the other. We extend the known regret bounds for online single-level algorithms to the bilevel setting. Specifically, we provide new notions of bilevel regret, develop an online alternating time-averaged gradient method that is capable of leveraging smoothn…

    Submitted 8 July, 2024; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: Published at AISTATS 2024. V7: minor edits to the statement of Lemma 18 and Assumption A

  11. arXiv:2206.05553

    math.OC stat.ML

    Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering

    Authors: Peng Wang, Huikang Liu, Anthony Man-Cho So, Laura Balzano

    Abstract: The K-subspaces (KSS) method is a generalization of the K-means method for subspace clustering. In this work, we present local convergence analysis and a recovery guarantee for KSS, assuming data are generated by the semi-random union of subspaces model, where $N$ points are randomly sampled from $K \ge 2$ overlapping subspaces. We show that if the initial assignment of the KSS method lies within…

    Submitted 18 June, 2022; v1 submitted 11 June, 2022; originally announced June 2022.

    Comments: This paper is accepted by ICML 2022

  12. arXiv:2205.13653

    math.OC eess.SP

    A Semidefinite Relaxation for Sums of Heterogeneous Quadratic Forms on the Stiefel Manifold

    Authors: Kyle Gilman, Sam Burer, Laura Balzano

    Abstract: We study the maximization of sums of heterogeneous quadratic forms over the Stiefel manifold, a nonconvex problem that arises in several modern signal processing and machine learning applications such as heteroscedastic probabilistic principal component analysis (HPPCA). In this work, we derive a novel semidefinite program (SDP) relaxation of the original problem and study a few of its theoretical…

    Submitted 7 April, 2025; v1 submitted 26 May, 2022; originally announced May 2022.

  13. arXiv:2111.07018

    cs.LG eess.SY math.OC stat.ML

    Identification and Adaptive Control of Markov Jump Systems: Sample Complexity and Regret Bounds

    Authors: Yahya Sattar, Zhe Du, Davoud Ataee Tarzanagh, Laura Balzano, Necmiye Ozay, Samet Oymak

    Abstract: Learning how to effectively control unknown dynamical systems is crucial for intelligent autonomous systems. This task becomes a significant challenge when the underlying dynamics are changing with time. Motivated by this challenge, this paper considers the problem of controlling an unknown Markov jump linear system (MJS) to optimize a quadratic objective. By taking a model-based perspective, we c…

    Submitted 12 November, 2021; originally announced November 2021.

  14. arXiv:2105.12358

    math.OC cs.LG eess.SY

    Certainty Equivalent Quadratic Control for Markov Jump Systems

    Authors: Zhe Du, Yahya Sattar, Davoud Ataee Tarzanagh, Laura Balzano, Samet Oymak, Necmiye Ozay

    Abstract: Real-world control applications often involve complex dynamics subject to abrupt changes or variations. Markov jump linear systems (MJS) provide a rich framework for modeling such dynamics. Despite an extensive history, theoretical understanding of parameter sensitivities of MJS control is somewhat lacking. Motivated by this, we investigate robustness aspects of certainty equivalent model-based op…

    Submitted 26 May, 2021; originally announced May 2021.

    Comments: 17 pages, 8 figures

  15. HePPCAT: Probabilistic PCA for Data with Heteroscedastic Noise

    Authors: David Hong, Kyle Gilman, Laura Balzano, Jeffrey A. Fessler

    Abstract: Principal component analysis (PCA) is a classical and ubiquitous method for reducing data dimensionality, but it is suboptimal for heterogeneous data that are increasingly common in modern applications. PCA treats all samples uniformly so degrades when the noise is heteroscedastic across samples, as occurs, e.g., when samples come from sources of heterogeneous quality. This paper develops a probab…

    Submitted 1 December, 2021; v1 submitted 9 January, 2021; originally announced January 2021.

    Comments: This article has been accepted for publication in the IEEE Transactions on Signal Processing. (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information. 26 pages, 14 figures

    Journal ref: IEEE Transactions on Signal Processing, Vol. 69, pp. 4819-4834, 2021

  16. arXiv:1911.01931

    cs.LG cs.DS math.OC math.PR stat.ML

    Online matrix factorization for Markovian data and applications to Network Dictionary Learning

    Authors: Hanbaek Lyu, Deanna Needell, Laura Balzano

    Abstract: Online Matrix Factorization (OMF) is a fundamental tool for dictionary learning problems, giving an approximate representation of complex data sets in terms of a reduced number of extracted features. Convergence guarantees for most of the OMF algorithms in the literature assume independence between data matrices, and the case of dependent data streams remains largely unexplored. In this paper, we…

    Submitted 7 November, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

    Comments: 39 pages, 13 figures

    Journal ref: Journal of Machine Learning Research 21 (2020)

  17. arXiv:1904.00423

    math.OC

    A Memory-efficient Algorithm for Large-scale Sparsity Regularized Image Reconstruction

    Authors: Greg Ongie, Naveen Murthy, Laura Balzano, Jeffrey A. Fessler

    Abstract: We derive a memory-efficient first-order variable splitting algorithm for convex image reconstruction problems with non-smooth regularization terms. The algorithm is based on a primal-dual approach, where one of the dual variables is updated using a step of the Frank-Wolfe algorithm, rather than the typical proximal point step used in other primal-dual algorithms. We show in certain cases this res…

    Submitted 31 March, 2019; originally announced April 2019.

  18. arXiv:1810.12862

    math.ST

    Optimally Weighted PCA for High-Dimensional Heteroscedastic Data

    Authors: David Hong, Fan Yang, Jeffrey A. Fessler, Laura Balzano

    Abstract: Modern data are increasingly both high-dimensional and heteroscedastic. This paper considers the challenge of estimating underlying principal components from high-dimensional data with noise that is heteroscedastic across samples, i.e., some samples are noisier than others. Such heteroscedasticity naturally arises, e.g., when combining data from diverse sources or sensors. A natural way to account…

    Submitted 13 September, 2022; v1 submitted 30 October, 2018; originally announced October 2018.

    Comments: 39 pages, 9 figures

    MSC Class: 62H25

  19. Asymptotic performance of PCA for high-dimensional heteroscedastic data

    Authors: David Hong, Laura Balzano, Jeffrey A. Fessler

    Abstract: Principal Component Analysis (PCA) is a classical method for reducing the dimensionality of data by projecting them onto a subspace that captures most of their variation. Effective use of PCA in modern applications requires understanding its performance for data that are both high-dimensional and heteroscedastic. This paper analyzes the statistical performance of PCA in this setting, i.e., for hig…

    Submitted 23 June, 2018; v1 submitted 20 March, 2017; originally announced March 2017.

    Comments: 34 pages (including supplement), 17 figures

    MSC Class: 62H25; 62H12; 62F12

    Journal ref: J. Multivariate Analysis 167:435-52 Sep 2018

  20. Real-Time Energy Disaggregation of a Distribution Feeder's Demand Using Online Learning

    Authors: Gregory S. Ledva, Laura Balzano, Johanna L. Mathieu

    Abstract: Though distribution system operators have been adding more sensors to their networks, they still often lack an accurate real-time picture of the behavior of distributed energy resources such as demand responsive electric loads and residential solar generation. Such information could improve system reliability, economic efficiency, and environmental impact. Rather than installing additional, costly…

    Submitted 4 May, 2018; v1 submitted 16 January, 2017; originally announced January 2017.

    Comments: 14 pages, article in press, 2018, IEEE Transactions on Power Systems

  21. Towards a Theoretical Analysis of PCA for Heteroscedastic Data

    Authors: David Hong, Laura Balzano, Jeffrey A. Fessler

    Abstract: Principal Component Analysis (PCA) is a method for estimating a subspace given noisy samples. It is useful in a variety of problems ranging from dimensionality reduction to anomaly detection and the visualization of high dimensional data. PCA performs well in the presence of moderate noise and even with missing data, but is also sensitive to outliers. PCA is also known to have a phase transition w…

    Submitted 12 October, 2016; originally announced October 2016.

    Comments: Presented at 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton)

  22. arXiv:1610.00199

    math.NA stat.ML

    Convergence of a Grassmannian Gradient Descent Algorithm for Subspace Estimation From Undersampled Data

    Authors: Dejiao Zhang, Laura Balzano

    Abstract: Subspace learning and matrix factorization problems have a great many applications in science and engineering, and efficient algorithms are critical as dataset sizes continue to grow. Many relevant problem formulations are non-convex, and in a variety of contexts it has been observed that solving the non-convex problem directly is not only efficient but reliably accurate. We discuss convergence theo…

    Submitted 20 February, 2022; v1 submitted 1 October, 2016; originally announced October 2016.

    Comments: 31 pages, 3 figures

  23. arXiv:1506.07405

    math.NA stat.ML

    Global Convergence of a Grassmannian Gradient Descent Algorithm for Subspace Estimation

    Authors: Dejiao Zhang, Laura Balzano

    Abstract: It has been observed in a variety of contexts that gradient descent methods have great success in solving low-rank matrix factorization problems, despite the relevant problem formulation being non-convex. We tackle a particular instance of this scenario, where we seek the $d$-dimensional subspace spanned by a streaming data matrix. We apply the natural first order incremental gradient descent meth…

    Submitted 24 June, 2016; v1 submitted 24 June, 2015; originally announced June 2015.

    Comments: 23 pages, 10 figures

    MSC Class: 90C52; 65Y20

    ACM Class: G.1.6; F.2.1

  24. arXiv:1307.5494

    math.NA cs.LG stat.ML

    On GROUSE and Incremental SVD

    Authors: Laura Balzano, Stephen J. Wright

    Abstract: GROUSE (Grassmannian Rank-One Update Subspace Estimation) is an incremental algorithm for identifying a subspace of R^n from a sequence of vectors in this subspace, where only a subset of components of each vector is revealed at each iteration. Recent analysis has shown that GROUSE converges locally at an expected linear rate, under certain assumptions. GROUSE has a similar flavor to the incrementa…

    Submitted 20 July, 2013; originally announced July 2013.

  25. arXiv:1306.3391

    math.NA

    Local Convergence of an Algorithm for Subspace Identification from Partial Data

    Authors: Laura Balzano, Stephen J. Wright

    Abstract: GROUSE (Grassmannian Rank-One Update Subspace Estimation) is an iterative algorithm for identifying a linear subspace of R^n from data consisting of partial observations of random vectors from that subspace. This paper examines local convergence properties of GROUSE, under assumptions on the randomness of the observed vectors, the randomness of the subset of elements observed at each iteration, an…

    Submitted 1 July, 2014; v1 submitted 14 June, 2013; originally announced June 2013.

    Comments: 29 pages, 6 figures

  26. arXiv:1306.0404

    cs.CV math.OC stat.ML

    Iterative Grassmannian Optimization for Robust Image Alignment

    Authors: Jun He, Dejiao Zhang, Laura Balzano, Tao Tao

    Abstract: Robust high-dimensional data processing has witnessed an exciting development in recent years, as theoretical results have shown that it is possible using convex programming to optimize data fit to a low-rank component plus a sparse outlier component. This problem is also known as Robust PCA, and it has found application in many areas of computer vision. In image and video processing and face reco…

    Submitted 20 June, 2013; v1 submitted 3 June, 2013; originally announced June 2013.

    Comments: Preprint submitted to the special issue of the Image and Vision Computing Journal on the theme "The Best of Face and Gesture 2013"

    Journal ref: Image and Vision Computing, 32(10), 800-813, 2014

  27. arXiv:1109.3827

    cs.IT cs.CV eess.SY math.OC stat.ML

    Online Robust Subspace Tracking from Partial Information

    Authors: Jun He, Laura Balzano, John C. S. Lui

    Abstract: This paper presents GRASTA (Grassmannian Robust Adaptive Subspace Tracking Algorithm), an efficient and robust online algorithm for tracking subspaces from highly incomplete information. The algorithm uses a robust $l^1$-norm cost function in order to estimate and track non-stationary subspaces when the streaming data vectors are corrupted with outliers. We apply GRASTA to the problems of robust m…

    Submitted 20 September, 2011; v1 submitted 17 September, 2011; originally announced September 2011.

    Comments: 28 pages, 12 figures

  28. arXiv:1006.4046

    cs.IT eess.SY math.OC stat.ML

    Online Identification and Tracking of Subspaces from Highly Incomplete Information

    Authors: Laura Balzano, Robert Nowak, Benjamin Recht

    Abstract: This work presents GROUSE (Grassmannian Rank-One Update Subspace Estimation), an efficient online algorithm for tracking subspaces from highly incomplete observations. GROUSE requires only basic linear algebraic manipulations at each iteration, and each subspace update can be performed in linear time in the dimension of the subspace. The algorithm is derived by analyzing incremental gradient descen…

    Submitted 12 July, 2011; v1 submitted 21 June, 2010; originally announced June 2010.