Showing 1–7 of 7 results for author: Bodapati, S B

  1. arXiv:2506.04708  [pdf, other]

    cs.CL

    Accelerated Test-Time Scaling with Model-Free Speculative Sampling

    Authors: Woomin Song, Saket Dingliwal, Sai Muralidhar Jayanthi, Bhavana Ganesh, Jinwoo Shin, Aram Galstyan, Sravan Babu Bodapati

    Abstract: Language models have demonstrated remarkable capabilities in reasoning tasks through test-time scaling techniques like best-of-N sampling and tree search. However, these approaches often demand substantial computational resources, creating a critical trade-off between performance and efficiency. We introduce STAND (STochastic Adaptive N-gram Drafting), a novel model-free speculative decoding appro…

    Submitted 5 June, 2025; originally announced June 2025.

  2. arXiv:2506.01215  [pdf, other]

    cs.CL cs.LG

    Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers

    Authors: Woomin Song, Sai Muralidhar Jayanthi, Srikanth Ronanki, Kanthashree Mysore Sathyendra, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, Sravan Babu Bodapati

    Abstract: As large language models increasingly gain popularity in real-world applications, processing extremely long contexts, often exceeding the model's pre-trained context limits, has emerged as a critical challenge. While existing approaches to efficient long-context processing show promise, recurrent compression-based methods struggle with information preservation, whereas random access approaches req…

    Submitted 1 June, 2025; originally announced June 2025.

  3. arXiv:2506.01206  [pdf, other]

    cs.CL cs.AI

    Mamba Drafters for Speculative Decoding

    Authors: Daewon Choi, Seunghyuk Oh, Saket Dingliwal, Jihoon Tack, Kyuyoung Kim, Woomin Song, Seojin Kim, Insu Han, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, Sravan Babu Bodapati

    Abstract: Speculative decoding has emerged as a promising approach to accelerating large language model (LLM) generation using a fast drafter while maintaining alignment with the target model's distribution. However, existing approaches face a trade-off: external drafters offer flexibility but can suffer from slower drafting, while self-speculation methods use drafters tailored to the target model but requi…

    Submitted 1 June, 2025; originally announced June 2025.

  4. arXiv:2503.04992  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Wanda++: Pruning Large Language Models via Regional Gradients

    Authors: Yifan Yang, Kai Zhen, Bhavana Ganesh, Aram Galstyan, Goeric Huybrechts, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Athanasios Mouchtaris, Sravan Babu Bodapati, Nathan Susanj, Zheng Zhang, Jack FitzGerald, Abhishek Kumar

    Abstract: Large Language Models (LLMs) pruning seeks to remove unimportant weights for inference speedup with minimal accuracy impact. However, existing methods often suffer from accuracy degradation without full-model sparsity-aware fine-tuning. This paper presents Wanda++, a novel pruning framework that outperforms the state-of-the-art methods by utilizing decoder-block-level regional gradients.…

    Submitted 1 June, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

    Comments: Paper accepted at ACL 2025 Findings

  5. arXiv:2407.06443  [pdf, other]

    cs.AI

    Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment

    Authors: Qizhang Feng, Siva Rajesh Kasa, Santhosh Kumar Kasa, Hyokun Yun, Choon Hui Teo, Sravan Babu Bodapati

    Abstract: Large Language Models (LLMs) have seen widespread adoption due to their remarkable natural language capabilities. However, when deploying them in real-world settings, it is important to align LLMs to generate texts according to acceptable human standards. Methods such as Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO) have enabled significant progress in refining LLMs u…

    Submitted 27 April, 2025; v1 submitted 8 July, 2024; originally announced July 2024.

  6. arXiv:1910.01043  [pdf, other]

    cs.CL

    Neural Word Decomposition Models for Abusive Language Detection

    Authors: Sravan Babu Bodapati, Spandana Gella, Kasturi Bhattacharjee, Yaser Al-Onaizan

    Abstract: User generated text on social media often suffers from a lot of undesired characteristics including hatespeech, abusive language, insults etc. that are targeted to attack or abuse a specific group of people. Often such text is written differently compared to traditional text such as news involving either explicit mention of abusive words, obfuscated words and typological errors or implicit abuse i…

    Submitted 2 October, 2019; originally announced October 2019.

    Comments: Accepted at ALW Workshop at ACL2019, Florence; BERT has a WordPiece model and it enhances performance of word based models in noisy settings

    Journal ref: https://www.aclweb.org/anthology/events/acl-2019/

  7. arXiv:1909.07746  [pdf, other]

    cs.LG cs.CL cs.IR

    Multi Sense Embeddings from Topic Models

    Authors: Shobhit Jain, Sravan Babu Bodapati, Ramesh Nallapati, Anima Anandkumar

    Abstract: Distributed word embeddings have yielded state-of-the-art performance in many NLP tasks, mainly due to their success in capturing useful semantic information. These representations assign only a single vector to each word whereas a large number of words are polysemous (i.e., have multiple meanings). In this work, we approach this critical problem in lexical semantics, namely that of representing v…

    Submitted 3 February, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: Accepted at ACL supported conference for Natural Language & Speech Processing. https://www.aclweb.org/anthology/W19-74, Year: 2019