
Showing 1–14 of 14 results for author: Maheshwary, R

  1. arXiv:2505.16293  [pdf, ps, other]

    cs.CL

    Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA

    Authors: Rishabh Maheshwary, Masoud Hashemi, Khyati Mahajan, Shiva Krishna Reddy Malay, Sai Rajeswar, Sathwik Tejaswi Madhusudhan, Spandana Gella, Vikas Yadav

    Abstract: Iterative RAG for multi-hop question answering faces challenges with lengthy contexts and the buildup of irrelevant information. This hinders a model's capacity to process and reason over retrieved content and limits performance. While recent methods focus on compressing retrieved information, they are either restricted to single-round RAG, require finetuning or lack scalability in iterative RAG.…

    Submitted 22 May, 2025; originally announced May 2025.

  2. arXiv:2504.07072  [pdf, other]

    cs.CL cs.CV

    Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

    Authors: Israfel Salazar, Manuel Fernández Burda, Shayekh Bin Islam, Arshia Soltani Moakhar, Shivalika Singh, Fabian Farestam, Angelika Romanou, Danylo Boiko, Dipika Khullar, Mike Zhang, Dominik Krzemiński, Jekaterina Novikova, Luísa Shimabucoro, Joseph Marvin Imperial, Rishabh Maheshwary, Sharad Duwal, Alfonso Amayuelas, Swati Rajwal, Jebish Purbey, Ahmed Ruby, Nicholas Popovič, Marek Suppa, Azmine Toushik Wasi, Ram Mohan Rao Kadiyala, Olga Tsymboi, et al. (20 additional authors not shown)

    Abstract: The evaluation of vision-language models (VLMs) has mainly relied on English-language benchmarks, leaving significant gaps in both multilingual and multicultural coverage. While multilingual benchmarks have expanded, both in size and languages, many rely on translations of English datasets, failing to capture cultural nuances. In this work, we propose Kaleidoscope, as the most comprehensive exam b…

    Submitted 29 April, 2025; v1 submitted 9 April, 2025; originally announced April 2025.

    Comments: v2: corrected the author list

  3. arXiv:2411.19799  [pdf, other]

    cs.CL

    INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge

    Authors: Angelika Romanou, Negar Foroutan, Anna Sotnikova, Zeming Chen, Sree Harsha Nelaturu, Shivalika Singh, Rishabh Maheshwary, Micol Altomare, Mohamed A. Haggag, Snegha A, Alfonso Amayuelas, Azril Hafizi Amirudin, Viraat Aryabumi, Danylo Boiko, Michael Chang, Jenny Chim, Gal Cohen, Aditya Kumar Dalmia, Abraham Diress, Sharad Duwal, Daniil Dzenhaliou, Daniel Fernando Erazo Florez, Fabian Farestam, Joseph Marvin Imperial, Shayekh Bin Islam, et al. (34 additional authors not shown)

    Abstract: The performance differential of large language models (LLM) between languages hinders their effective deployment in many regions, inhibiting the potential economic and societal value of generative AI tools in many communities. However, the development of functional LLMs in many languages (i.e., multilingual LLMs) is bottlenecked by the lack of high-quality evaluation resources in languages other th…

    Submitted 29 November, 2024; originally announced November 2024.

  4. arXiv:2411.02398  [pdf, other]

    cs.CL cs.AI cs.LG

    Prompting with Phonemes: Enhancing LLMs' Multilinguality for Non-Latin Script Languages

    Authors: Hoang H Nguyen, Khyati Mahajan, Vikas Yadav, Julian Salazar, Philip S. Yu, Masoud Hashemi, Rishabh Maheshwary

    Abstract: Although multilingual LLMs have achieved remarkable performance across benchmarks, we find they continue to underperform on non-Latin script languages across contemporary LLM families. This discrepancy arises from the fact that LLMs are pretrained with orthographic scripts, which are dominated by Latin characters that obscure their shared phonology with non-Latin scripts. We propose leveraging pho…

    Submitted 6 March, 2025; v1 submitted 4 November, 2024; originally announced November 2024.

    Comments: Accepted for NAACL 2025 (Main Conference)

  5. arXiv:2410.15522  [pdf, other]

    cs.CL cs.AI cs.LG

    M-RewardBench: Evaluating Reward Models in Multilingual Settings

    Authors: Srishti Gureja, Lester James V. Miranda, Shayekh Bin Islam, Rishabh Maheshwary, Drishti Sharma, Gusti Winata, Nathan Lambert, Sebastian Ruder, Sara Hooker, Marzieh Fadaee

    Abstract: Reward models (RMs) have driven the state-of-the-art performance of LLMs today by enabling the integration of human feedback into the language modeling process. However, RMs are primarily trained and evaluated in English, and their capabilities in multilingual settings remain largely understudied. In this work, we conduct a systematic evaluation of several reward models in multilingual settings. W…

    Submitted 20 May, 2025; v1 submitted 20 October, 2024; originally announced October 2024.

    Comments: 16 pages, 6 figures, 10 tables. Website: https://m-rewardbench.github.io/ , Updated results with latest models. Added more author information

  6. arXiv:2406.17415  [pdf, other]

    cs.CL cs.AI cs.LG

    Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels

    Authors: Razvan-Gabriel Dumitru, Vikas Yadav, Rishabh Maheshwary, Paul-Ioan Clotan, Sathwik Tejaswi Madhusudhan, Mihai Surdeanu

    Abstract: We present a simple meta quantization approach that quantizes different layers of a large language model (LLM) at different bit levels, and is independent of the underlying quantization technique. Specifically, we quantize the most important layers to higher bit precision and less important layers to lower bits. We propose two effective strategies to measure the importance of layers within LLMs: t…

    Submitted 28 October, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    ACM Class: I.2.7; I.2.0

  7. arXiv:2406.16783  [pdf, other]

    cs.CL cs.AI cs.LG

    M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models

    Authors: Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan

    Abstract: Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. While many effective IFT datasets have been introduced recently, they predominantly focus on high-resource languages like English. To better align LLMs across a broad spectrum of languages and tasks, we propose a fully synthetic, novel taxonomy (Evol) guided Multilingual, Multi-turn instructi…

    Submitted 4 March, 2025; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: 39 pages

  8. arXiv:2403.07230  [pdf, other]

    cs.CL cs.AI cs.LG

    Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences

    Authors: Pulkit Pattnaik, Rishabh Maheshwary, Kelechi Ogueji, Vikas Yadav, Sathwik Tejaswi Madhusudhan

    Abstract: Direct Preference Optimization (DPO) is an effective technique that leverages pairwise preference data (usually one chosen and rejected response pair per user prompt) to align LLMs to human preferences. In practice, multiple responses can exist for a given prompt with varying quality relative to each other. With the availability of such quality ratings for multiple responses, we propose utilizing thes…

    Submitted 8 November, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Published at EMNLP 2024 as long (findings) conference paper

  9. arXiv:2306.08751  [pdf, other]

    cs.CV

    Improving Selective Visual Question Answering by Learning from Your Peers

    Authors: Corentin Dancette, Spencer Whitehead, Rishabh Maheshwary, Ramakrishna Vedantam, Stefan Scherer, Xinlei Chen, Matthieu Cord, Marcus Rohrbach

    Abstract: Despite advances in Visual Question Answering (VQA), the ability of models to assess their own correctness remains underexplored. Recent work has shown that VQA models, out-of-the-box, can have difficulties abstaining from answering when they are wrong. The option to abstain, also called Selective Prediction, is highly relevant when deploying systems to users who must trust the system's output (e.…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: CVPR 2023. Code available here: https://github.com/facebookresearch/selective-vqa_ood

  10. arXiv:2205.00177  [pdf, other]

    cs.CL

    Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers

    Authors: Vivek Kumar, Rishabh Maheshwary, Vikram Pudi

    Abstract: Existing Math Word Problem (MWP) solvers have achieved high accuracy on benchmark datasets. However, prior works have shown that such solvers do not generalize well and rely on superficial cues to achieve high performance. In this paper, we first conduct experiments to showcase that this behaviour is mainly associated with the limited size and diversity present in existing MWP datasets. Next, we p…

    Submitted 30 April, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  11. arXiv:2109.05925  [pdf, other]

    cs.CL

    Adversarial Examples for Evaluating Math Word Problem Solvers

    Authors: Vivek Kumar, Rishabh Maheshwary, Vikram Pudi

    Abstract: Standard accuracy metrics have shown that Math Word Problem (MWP) solvers have achieved high performance on benchmark datasets. However, the extent to which existing MWP solvers truly understand language and its relation with numbers is still unclear. In this paper, we generate adversarial attacks to evaluate the robustness of state-of-the-art MWP solvers. We propose two methods Question Reorderin…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP Findings 2021

  12. arXiv:2109.04775  [pdf, other]

    cs.CL

    A Strong Baseline for Query Efficient Attacks in a Black Box Setting

    Authors: Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

    Abstract: Existing black box search methods have achieved a high success rate in generating adversarial attacks against NLP models. However, such search methods are inefficient as they do not consider the number of queries required to generate adversarial attacks. Also, prior attacks do not maintain a consistent search space while comparing different search methods. In this paper, we propose a query efficient…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 - Main Conference

  13. arXiv:2012.14956  [pdf, other]

    cs.CL

    Generating Natural Language Attacks in a Hard Label Black Box Setting

    Authors: Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

    Abstract: We study an important and challenging task of attacking natural language processing models in a hard label black box setting. We propose a decision-based attack strategy that crafts high quality adversarial examples on text classification and entailment tasks. Our proposed attack strategy leverages a population-based optimization algorithm to craft plausible and semantically similar adversarial exam…

    Submitted 29 April, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

    Comments: Accepted at AAAI 2021 (Main Conference)

  14. arXiv:2012.13339  [pdf, ps, other]

    cs.CL

    A Context Aware Approach for Generating Natural Language Attacks

    Authors: Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

    Abstract: We study an important task of attacking natural language processing models in a black box setting. We propose an attack strategy that crafts semantically similar adversarial examples on text classification and entailment tasks. Our proposed attack finds candidate words by considering the information of both the original word and its surrounding context. It jointly leverages masked language modelli…

    Submitted 24 December, 2020; originally announced December 2020.

    Comments: Accepted as Student Poster at AAAI 2021