Showing 1–3 of 3 results for author: Dass, J

  1. arXiv:2506.09199  [pdf, ps, other]

    cs.LG cs.AI cs.DC

    FLoRIST: Singular Value Thresholding for Efficient and Accurate Federated Fine-Tuning of Large Language Models

    Authors: Hariharan Ramesh, Jyotikrishna Dass

    Abstract: Integrating Low-Rank Adaptation (LoRA) into federated learning offers a promising solution for parameter-efficient fine-tuning of Large Language Models (LLMs) without sharing local data. However, several methods designed for federated LoRA present significant challenges in balancing communication efficiency, model accuracy, and computational cost, particularly among heterogeneous clients. These me…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 21 pages, 12 figures

  2. arXiv:2310.19820  [pdf, other]

    cs.LG cs.CV

    NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation

    Authors: Shunyao Zhang, Yonggan Fu, Shang Wu, Jyotikrishna Dass, Haoran You, Yingyan Lin

    Abstract: Boosting the task accuracy of tiny neural networks (TNNs) has become a fundamental challenge for enabling the deployments of TNNs on edge devices which are constrained by strict limitations in terms of memory, computation, bandwidth, and power supply. To this end, we propose a framework called NetDistiller to boost the achievable accuracy of TNNs by treating them as sub-networks of a weight-sharin…

    Submitted 24 October, 2023; originally announced October 2023.

  3. arXiv:2211.05109  [pdf, other]

    cs.CV cs.AR cs.LG

    ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention

    Authors: Jyotikrishna Dass, Shang Wu, Huihong Shi, Chaojian Li, Zhifan Ye, Zhongfeng Wang, Yingyan Lin

    Abstract: Vision Transformer (ViT) has emerged as a competitive alternative to convolutional neural networks for various computer vision applications. Specifically, ViT multi-head attention layers make it possible to embed information globally across the overall image. Nevertheless, computing and storing such attention matrices incurs a quadratic cost dependency on the number of patches, limiting its achiev…

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: 14 pages, 15 figures, Accepted to IEEE HPCA 2023