Showing 1–8 of 8 results for author: Yaras, C

  1. arXiv:2503.19859

    cs.LG eess.SP math.OC stat.CO stat.ML

    An Overview of Low-Rank Structures in the Training and Adaptation of Large Models

    Authors: Laura Balzano, Tianjiao Ding, Benjamin D. Haeffele, Soo Min Kwon, Qing Qu, Peng Wang, Zhangyang Wang, Can Yaras

    Abstract: The rise of deep learning has revolutionized data processing and prediction in signal processing and machine learning, yet the substantial computational demands of training and deploying modern large-scale deep models present significant challenges, including high computational costs and energy consumption. Recent research has uncovered a widespread phenomenon in deep networks: the emergence of lo…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: Authors are listed alphabetically; 27 pages, 10 figures

  2. arXiv:2501.02364

    cs.LG cs.CV stat.ML

    Understanding How Nonlinear Layers Create Linearly Separable Features for Low-Dimensional Data

    Authors: Alec S. Xu, Can Yaras, Peng Wang, Qing Qu

    Abstract: Deep neural networks have attained remarkable success across diverse classification tasks. Recent empirical studies have shown that deep networks learn features that are linearly separable across classes. However, these findings often lack rigorous justifications, even under relatively simple settings. In this work, we address this gap by examining the linear separation capabilities of shallow non…

    Submitted 4 January, 2025; originally announced January 2025.

    Comments: 32 pages, 9 figures

  3. arXiv:2412.07909

    cs.LG cs.AI cs.CV

    Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning

    Authors: Can Yaras, Siyi Chen, Peng Wang, Qing Qu

    Abstract: Multimodal learning has recently gained significant popularity, demonstrating impressive performance across various zero-shot classification tasks and a range of perceptive and generative applications. Models such as Contrastive Language-Image Pretraining (CLIP) are designed to bridge different modalities, such as images and text, by learning a shared representation space through contrastive learn…

    Submitted 10 December, 2024; originally announced December 2024.

    Comments: The first two authors contributed equally to this work

  4. arXiv:2406.04112

    cs.LG cs.AI eess.SP stat.ML

    Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

    Authors: Can Yaras, Peng Wang, Laura Balzano, Qing Qu

    Abstract: While overparameterization in machine learning models offers great benefits in terms of optimization and generalization, it also leads to increased computational requirements as model sizes grow. In this work, we show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of overparameterization without the…

    Submitted 9 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML'24 (Oral)

  5. arXiv:2311.02960

    cs.LG cs.CV math.OC

    Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

    Authors: Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

    Abstract: Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, it remains an open question how deep networks perform hierarchical feature learning across layers. In this work, we attempt to unveil this mystery by investigating the structures of intermediate features. Motivated by our empirical findings that linear layers mimic…

    Submitted 9 January, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: 61 pages, 14 figures

  6. arXiv:2306.01154

    cs.LG

    The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

    Authors: Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

    Abstract: Over the past few years, an extensively studied phenomenon in training deep networks is the implicit bias of gradient descent towards parsimonious solutions. In this work, we investigate this phenomenon by narrowing our focus to deep linear networks. Through our analysis, we reveal a surprising "law of parsimony" in the learning dynamics when the data possesses low-dimensional structures. Specific…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: The first two authors contributed to this work equally; 32 pages, 12 figures

  7. arXiv:2209.09211

    cs.LG cs.CV cs.IT eess.SP stat.ML

    Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

    Authors: Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu

    Abstract: When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon. More specifically, for the output features of the penultimate layer, for each class the within-class features converge to their means, and the means of different classes exhibit a certain tight frame structure, which is also…

    Submitted 7 March, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: The first two authors contributed to this work equally; 38 pages, 13 figures. Accepted at NeurIPS'22

  8. arXiv:2104.14032

    cs.CV

    Randomized Histogram Matching: A Simple Augmentation for Unsupervised Domain Adaptation in Overhead Imagery

    Authors: Can Yaras, Kaleb Kassaw, Bohao Huang, Kyle Bradbury, Jordan M. Malof

    Abstract: Modern deep neural networks (DNNs) are highly accurate on many recognition tasks for overhead (e.g., satellite) imagery. However, visual domain shifts (e.g., statistical changes due to geography, sensor, or atmospheric conditions) remain a challenge, causing the accuracy of DNNs to degrade substantially and unpredictably when testing on new sets of imagery. In this work, we model domain shifts cau…

    Submitted 11 August, 2023; v1 submitted 28 April, 2021; originally announced April 2021.

    Comments: Includes a main paper (10 pages); currently undergoing peer review