
Showing 1–7 of 7 results for author: Sivaprasad, P T

Searching in archive cs.
  1. arXiv:2505.24715  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    CoRet: Improved Retriever for Code Editing

    Authors: Fabio Fehr, Prabhu Teja Sivaprasad, Luca Franceschi, Giovanni Zappella

    Abstract: In this paper, we introduce CoRet, a dense retrieval model designed for code-editing tasks that integrates code semantics, repository structure, and call graph dependencies. The model focuses on retrieving relevant portions of a code repository based on natural language queries such as requests to implement new features or fix bugs. These retrieved code chunks can then be presented to a user or to…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: ACL 2025

  2. arXiv:2504.08703  [pdf, other]

    cs.SE

    SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents

    Authors: Muhammad Shihab Rashid, Christian Bock, Yuan Zhuang, Alexander Buchholz, Tim Esler, Simon Valentin, Luca Franceschi, Martin Wistuba, Prabhu Teja Sivaprasad, Woo Jung Kim, Anoop Deoras, Giovanni Zappella, Laurent Callot

    Abstract: Coding agents powered by large language models have shown impressive capabilities in software engineering tasks, but evaluating their performance across diverse programming languages and real-world scenarios remains challenging. We introduce SWE-PolyBench, a new multi-language benchmark for repository-level, execution-based evaluation of coding agents. SWE-PolyBench contains 2110 instances from 21…

    Submitted 23 April, 2025; v1 submitted 11 April, 2025; originally announced April 2025.

    Comments: 20 pages, 6 figures, corrected author name spelling

  3. arXiv:2406.03216  [pdf, other]

    cs.LG cs.AI

    Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need

    Authors: Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella

    Abstract: Recent Continual Learning (CL) methods have combined pretrained Transformers with prompt tuning, a parameter-efficient fine-tuning (PEFT) technique. We argue that the choice of prompt tuning in prior works was an undefended and unablated decision, which has been uncritically adopted by subsequent research, but warrants further research to understand its implications. In this paper, we conduct this…

    Submitted 5 June, 2024; originally announced June 2024.

  4. arXiv:2311.17601  [pdf, ps, other]

    cs.LG cs.AI

    Continual Learning with Low Rank Adaptation

    Authors: Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella

    Abstract: Recent work using pretrained transformers has shown impressive performance when fine-tuned with data from the downstream problem of interest. However, they struggle to retain that performance when the data characteristics change. In this paper, we focus on continual learning, where a pre-trained transformer is updated to perform well on new data, while retaining its performance on data it was pre…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted at Workshop on Distribution Shifts (DistShift), NeurIPS 2023

  5. arXiv:2311.00586  [pdf, other]

    cs.CV

    PAUMER: Patch Pausing Transformer for Semantic Segmentation

    Authors: Evann Courdier, Prabhu Teja Sivaprasad, François Fleuret

    Abstract: We study the problem of improving the efficiency of segmentation transformers by using disparate amounts of computation for different parts of the image. Our method, PAUMER, accomplishes this by pausing computation for patches that are deemed to not need any more computation before the final decoder. We use the entropy of predictions computed from intermediate activations as the pausing criterion,…

    Submitted 1 November, 2023; originally announced November 2023.

  6. arXiv:2110.10232  [pdf, other]

    cs.LG cs.CV

    Test time Adaptation through Perturbation Robustness

    Authors: Prabhu Teja Sivaprasad, François Fleuret

    Abstract: Data samples generated by several real-world processes are dynamic in nature, i.e., their characteristics vary with time. Thus it is not possible to train and tackle all possible distributional shifts between training and inference, using the host of transfer learning methods in literature. In this paper, we tackle this problem of adapting to domain shift at inference time, i.e., w…

    Submitted 19 October, 2021; originally announced October 2021.

    Comments: Under review

  7. arXiv:1910.11758  [pdf, other]

    cs.LG stat.ML

    Optimizer Benchmarking Needs to Account for Hyperparameter Tuning

    Authors: Prabhu Teja Sivaprasad, Florian Mai, Thijs Vogels, Martin Jaggi, François Fleuret

    Abstract: The performance of optimizers, particularly in deep learning, depends considerably on their chosen hyperparameter configuration. The efficacy of optimizers is often studied under near-optimal problem-specific hyperparameters, and finding these settings may be prohibitively costly for practitioners. In this work, we argue that a fair assessment of optimizers' performance must take the computational…

    Submitted 15 August, 2020; v1 submitted 25 October, 2019; originally announced October 2019.

    Comments: Published at International Conference on Machine Learning (ICML 2020)