Showing 1–16 of 16 results for author: Acharya, A

Searching in archive stat.
  1. arXiv:2309.01885  [pdf, other]

    stat.ML cs.CL cs.LG

    QuantEase: Optimization-based Quantization for Language Models

    Authors: Kayhan Behdin, Ayan Acharya, Aman Gupta, Qingquan Song, Siyu Zhu, Sathiya Keerthi, Rahul Mazumder

    Abstract: With the rising popularity of Large Language Models (LLMs), there has been an increasing interest in compression techniques that enable their efficient deployment. This study focuses on the Post-Training Quantization (PTQ) of LLMs. Drawing from recent advances, our work introduces QuantEase, a layer-wise quantization framework where individual layers undergo separate quantization. The problem is f…

    Submitted 1 December, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

  2. arXiv:2302.09693  [pdf, other]

    stat.ML cs.LG

    mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization

    Authors: Kayhan Behdin, Qingquan Song, Aman Gupta, Sathiya Keerthi, Ayan Acharya, Borja Ocejo, Gregory Dexter, Rajiv Khanna, David Durfee, Rahul Mazumder

    Abstract: Modern deep learning models are over-parameterized, where different optima can result in widely varying generalization performance. The Sharpness-Aware Minimization (SAM) technique modifies the fundamental loss function that steers gradient descent methods toward flatter minima, which are believed to exhibit enhanced generalization prowess. Our study delves into a specific variant of SAM known as…

    Submitted 30 September, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2212.04343

  3. arXiv:2207.05882  [pdf, other]

    stat.ML cs.LG

    Employing Feature Selection Algorithms to Determine the Immune State of a Mouse Model of Rheumatoid Arthritis

    Authors: Brendon K. Colbert, Joslyn L. Mangal, Aleksandr Talitckii, Abhinav P. Acharya, Matthew M. Peet

    Abstract: The immune response is a dynamic process by which the body determines whether an antigen is self or nonself. The state of this dynamic process is defined by the relative balance and population of inflammatory and regulatory actors which comprise this decision making process. The goal of immunotherapy as applied to, e.g. Rheumatoid Arthritis (RA), then, is to bias the immune state in favor of the r…

    Submitted 21 October, 2023; v1 submitted 12 July, 2022; originally announced July 2022.

  4. arXiv:2106.08882  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Robust Training in High Dimensions via Block Coordinate Geometric Median Descent

    Authors: Anish Acharya, Abolfazl Hashemi, Prateek Jain, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

    Abstract: Geometric median (\textsc{Gm}) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5. However, its computational complexity makes it infeasible for robustifying stochastic gradient descent (SGD) for high-dimensional optimization problems. In this paper, we show that by applying \textsc{G…

    Submitted 16 June, 2021; originally announced June 2021.

  5. arXiv:2012.04061  [pdf, other]

    stat.ML cs.DC cs.LG math.OC

    Faster Non-Convex Federated Learning via Global and Local Momentum

    Authors: Rudrajit Das, Anish Acharya, Abolfazl Hashemi, Sujay Sanghavi, Inderjit S. Dhillon, Ufuk Topcu

    Abstract: We propose \texttt{FedGLOMO}, a novel federated learning (FL) algorithm with an iteration complexity of $\mathcal{O}(ε^{-1.5})$ to converge to an $ε$-stationary point (i.e., $\mathbb{E}[\|\nabla f(\bm{x})\|^2] \leq ε$) for smooth non-convex functions -- under arbitrary client heterogeneity and compressed communication -- compared to the $\mathcal{O}(ε^{-2})$ complexity of most prior works. Our key…

    Submitted 24 October, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

  6. arXiv:2011.10643  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization

    Authors: Abolfazl Hashemi, Anish Acharya, Rudrajit Das, Haris Vikalo, Sujay Sanghavi, Inderjit Dhillon

    Abstract: In decentralized optimization, it is common algorithmic practice to have nodes interleave (local) gradient descent iterations with gossip (i.e. averaging over the network) steps. Motivated by the training of large-scale machine learning models, it is also increasingly common to require that messages be {\em lossy compressed} versions of the local parameters. In this paper, we show that, in such co…

    Submitted 20 November, 2020; originally announced November 2020.

  7. arXiv:2006.09554  [pdf, other]

    cs.LG cs.SI stat.ML

    Isometric Graph Neural Networks

    Authors: Matthew Walker, Bo Yan, Yiou Xiao, Yafei Wang, Ayan Acharya

    Abstract: Many tasks that rely on representations of nodes in graphs would benefit if those representations were faithful to distances between nodes in the graph. Geometric techniques to extract such representations have poor scaling over large graph size, and recent advances in Graph Neural Network (GNN) algorithms have limited ability to reflect graph distance information beyond the first degree neighborh…

    Submitted 16 June, 2020; originally announced June 2020.

  8. arXiv:1902.07355  [pdf, other]

    econ.GN cs.LG stat.AP

    Combining Outcome-Based and Preference-Based Matching: A Constrained Priority Mechanism

    Authors: Avidit Acharya, Kirk Bansak, Jens Hainmueller

    Abstract: We introduce a constrained priority mechanism that combines outcome-based matching from machine-learning with preference-based allocation schemes common in market design. Using real-world data, we illustrate how our mechanism could be applied to the assignment of refugee families to host country locations, and kindergarteners to schools. Our mechanism allows a planner to first specify a threshold…

    Submitted 11 August, 2020; v1 submitted 19 February, 2019; originally announced February 2019.

    Comments: This manuscript has been accepted for publication by Political Analysis and will appear in a revised form subject to peer review and/or input from the journal's editor. End-users of this manuscript may only make use of it for private research and study and may not distribute it further

  9. arXiv:1811.00641  [pdf, other]

    cs.LG cs.CL math.NA stat.ML

    Online Embedding Compression for Text Classification using Low Rank Matrix Factorization

    Authors: Anish Acharya, Rahul Goel, Angeliki Metallinou, Inderjit Dhillon

    Abstract: Deep learning models have become state of the art for natural language processing (NLP) tasks, however deploying these models in production system poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low rank matrix factorization during training, to compress the word embedding layer which…

    Submitted 1 November, 2018; originally announced November 2018.

    Comments: Accepted in Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019)

  10. arXiv:1708.04941  [pdf, other]

    quant-ph math-ph stat.AP

    Minimax estimation of qubit states with Bures risk

    Authors: Anirudh Acharya, Madalin Guta

    Abstract: The central problem of quantum statistics is to devise measurement schemes for the estimation of an unknown state, given an ensemble of $n$ independent identically prepared systems. For locally quadratic loss functions, the risk of standard procedures has the usual scaling of $1/n$. However, it has been noticed that for fidelity based metrics such as the Bures distance, the risk of conventional (n…

    Submitted 25 September, 2017; v1 submitted 16 August, 2017; originally announced August 2017.

    Comments: 29 pages, 1 figure ; V2: Added section on quantum relative entropy, updated introduction and abstract, added references

    Journal ref: J. Phys. A: Math. Theor. 51 175307 (2018)

  11. arXiv:1609.03758  [pdf, other]

    quant-ph math-ph stat.AP

    Statistical analysis of low rank tomography with compressive random measurements

    Authors: Anirudh Acharya, Madalin Guta

    Abstract: We consider the statistical problem of `compressive' estimation of low rank states with random basis measurements, where the estimation error is expressed in terms of two metrics - the Frobenius norm and quantum infidelity. It is known that unlike the case of general full state tomography, low rank states can be identified from a reduced number of observables' expectations. Here we investigate whethe…

    Submitted 13 September, 2016; originally announced September 2016.

    Comments: 19 pages, 3 figures

    Journal ref: J. Phys. A: Math. Theor. 50 195301 (2017)

  12. arXiv:1512.08996  [pdf, other]

    stat.ML stat.AP stat.ME

    Nonparametric Bayesian Factor Analysis for Dynamic Count Matrices

    Authors: Ayan Acharya, Joydeep Ghosh, Mingyuan Zhou

    Abstract: A gamma process dynamic Poisson factor analysis model is proposed to factorize a dynamic count matrix, whose columns are sequentially observed count vectors. The model builds a novel Markov chain that sends the latent gamma random variables at time $(t-1)$ as the shape parameters of those at time $t$, which are linked to observed or latent counts under the Poisson likelihood. The significant chall…

    Submitted 30 December, 2015; originally announced December 2015.

    Comments: Appeared in Artificial Intelligence and Statistics (AISTATS), May 2015. The ArXiv version fixes a typo in (8), the equation right above Section 3.2 in Page 4 of http://www.jmlr.org/proceedings/papers/v38/acharya15.pdf

  13. arXiv:1510.03229  [pdf, other]

    quant-ph math-ph stat.AP

    Statistically efficient tomography of low rank states with incomplete measurements

    Authors: Anirudh Acharya, Theodore Kypraios, Madalin Guta

    Abstract: The construction of physically relevant low dimensional state models, and the design of appropriate measurements are key issues in tackling quantum state tomography for large dimensional systems. We consider the statistical problem of estimating low rank states in the set-up of multiple ions tomography, and investigate how the estimation error behaves with a reduction in the number of measurement…

    Submitted 23 October, 2015; v1 submitted 12 October, 2015; originally announced October 2015.

    Comments: 19 pages, 6 figures ; V2: updated figure 5, added references, changed title, updated abstract

    Journal ref: New Journal of Physics, Volume 18, April 2016

  14. arXiv:1406.7117  [pdf, other]

    stat.ME

    A Complete Review of Controlling the FDR in a Multiple Comparison Problem Framework -- The Benjamini-Hochberg Algorithm

    Authors: Anish Acharya

    Abstract: This paper is a review of the popular Benjamini Hochberg Method and other related useful methods of Multiple Hypothesis testing. This is written with the purpose of serving a short but complete easy to understand review of the main article with proper background. The paper titled 'Controlling the False Discovery Rate-a practical and powerful Approach to multiple Testing' by Benjamini et al. [1] pr…

    Submitted 27 June, 2014; originally announced June 2014.

  15. arXiv:1211.2304  [pdf, other]

    cs.LG stat.ML

    Probabilistic Combination of Classifier and Cluster Ensembles for Non-transductive Learning

    Authors: Ayan Acharya, Eduardo R. Hruschka, Joydeep Ghosh, Badrul Sarwar, Jean-David Ruvini

    Abstract: Unsupervised models can provide supplementary soft constraints to help classify new target data under the assumption that similar objects in the target set are more likely to share the same class label. Such models can also help detect possible differences between training and target distributions, which is useful in applications where concept drift may take place. This paper describes a Bayesian…

    Submitted 10 November, 2012; originally announced November 2012.

  16. arXiv:1204.4521  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    A Privacy-Aware Bayesian Approach for Combining Classifier and Cluster Ensembles

    Authors: Ayan Acharya, Eduardo R. Hruschka, Joydeep Ghosh

    Abstract: This paper introduces a privacy-aware Bayesian approach that combines ensembles of classifiers and clusterers to perform semi-supervised and transductive learning. We consider scenarios where instances and their classification/clustering results are distributed across different data sites and have sharing restrictions. As a special case, the privacy aware computation of the model when instances of…

    Submitted 19 April, 2012; originally announced April 2012.

    ACM Class: I.5.4