Showing 1–9 of 9 results for author: Beniwal, H

  1. arXiv:2505.16722

    cs.CL cs.AI

    Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification

    Authors: Himanshu Beniwal, Youngwoo Kim, Maarten Sap, Soham Dan, Thomas Hartvigsen

    Abstract: As large language models (LLMs) become increasingly prevalent in global applications, ensuring that they are toxicity-free across diverse linguistic contexts remains a critical challenge. We explore "Cross-lingual Detoxification", a cross-lingual paradigm that mitigates toxicity, enabling detoxification capabilities to transfer between high and low-resource languages across different script famili…

    Submitted 30 June, 2025; v1 submitted 22 May, 2025; originally announced May 2025.

  2. arXiv:2504.04377

    cs.CL

    PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages

    Authors: Priyanshu Kumar, Devansh Jain, Akhila Yerukola, Liwei Jiang, Himanshu Beniwal, Thomas Hartvigsen, Maarten Sap

    Abstract: Truly multilingual safety moderation efforts for Large Language Models (LLMs) have been hindered by a narrow focus on a small set of languages (e.g., English, Chinese) as well as a limited scope of safety definition, resulting in significant gaps in moderation capabilities. To bridge these gaps, we release POLYGUARD, a new state-of-the-art multilingual safety model for safeguarding LLM generations…

    Submitted 6 April, 2025; originally announced April 2025.

  3. arXiv:2503.23088

    cs.CL cs.AI

    UNITYAI-GUARD: Pioneering Toxicity Detection Across Low-Resource Indian Languages

    Authors: Himanshu Beniwal, Reddybathuni Venkat, Rohit Kumar, Birudugadda Srivibhav, Daksh Jain, Pavan Doddi, Eshwar Dhande, Adithya Ananth, Kuldeep, Mayank Singh

    Abstract: This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an impressive average F1-score of 8…

    Submitted 5 July, 2025; v1 submitted 29 March, 2025; originally announced March 2025.

  4. arXiv:2503.21670

    cs.CL cs.AI

    COMI-LINGUA: Expert Annotated Large-Scale Dataset for Multitask NLP in Hindi-English Code-Mixing

    Authors: Rajvee Sheth, Himanshu Beniwal, Mayank Singh

    Abstract: We introduce COMI-LINGUA, the largest manually annotated Hindi-English code-mixed dataset, comprising 125K+ high-quality instances across five core NLP tasks: Matrix Language Identification, Token-level Language Identification, POS Tagging, Named Entity Recognition (NER), and Machine Translation. Each instance is annotated by three bilingual annotators, yielding over 376K expert annotations with s…

    Submitted 5 June, 2025; v1 submitted 27 March, 2025; originally announced March 2025.

  5. arXiv:2502.16901

    cs.CL cs.AI

    Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs

    Authors: Himanshu Beniwal, Sailesh Panda, Birudugadda Srivibhav, Mayank Singh

    Abstract: We explore Cross-lingual Backdoor ATtacks (X-BAT) in multilingual Large Language Models (mLLMs), revealing how backdoors inserted in one language can automatically transfer to others through shared embedding spaces. Using toxicity classification as a case study, we demonstrate that attackers can compromise multilingual systems by poisoning data in a single language, with…

    Submitted 20 May, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

  6. arXiv:2408.03125

    cs.CL cs.AI

    COMMENTATOR: A Code-mixed Multilingual Text Annotation Framework

    Authors: Rajvee Sheth, Shubh Nisar, Heenaben Prajapati, Himanshu Beniwal, Mayank Singh

    Abstract: As the NLP community increasingly addresses challenges associated with multilingualism, robust annotation tools are essential to handle multilingual datasets efficiently. In this paper, we introduce a code-mixed multilingual text annotation framework, COMMENTATOR, specifically designed for annotating code-mixed text. The tool demonstrates its effectiveness in token-level and sentence-level languag…

    Submitted 6 August, 2024; originally announced August 2024.

  7. arXiv:2402.11997

    cs.CL cs.AI cs.LG

    Remember This Event That Year? Assessing Temporal Information and Reasoning in Large Language Models

    Authors: Himanshu Beniwal, Dishant Patel, Kowsik Nandagopan D, Hritik Ladia, Ankit Yadav, Mayank Singh

    Abstract: Large Language Models (LLMs) are increasingly ubiquitous, yet their ability to retain and reason about temporal information remains limited, hindering their application in real-world scenarios where understanding the sequential nature of events is crucial. Our study experiments with 12 state-of-the-art models (ranging from 2B to 70B+ parameters) on a novel numerical-temporal dataset, TempU…

    Submitted 5 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  8. arXiv:2401.10521

    cs.CL cs.AI

    Cross-lingual Editing in Multilingual Language Models

    Authors: Himanshu Beniwal, Kowsik Nandagopan D, Mayank Singh

    Abstract: The training of large language models (LLMs) necessitates substantial data and computational resources, and updating outdated LLMs entails significant efforts and resources. While numerous model editing techniques (METs) have emerged to efficiently update model outputs without retraining, their effectiveness in multilingual LLMs, where knowledge is stored in diverse languages, remains an underexpl…

    Submitted 3 February, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted at EACL 2024

  9. arXiv:2401.03855

    cs.CL cs.AI

    PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs

    Authors: Ankit Yadav, Himanshu Beniwal, Mayank Singh

    Abstract: Driven by the surge in code generation using large language models (LLMs), numerous benchmarks have emerged to evaluate these LLMs' capabilities. We conducted a large-scale human evaluation of HumanEval and MBPP, two popular benchmarks for Python code generation, analyzing their diversity and difficulty. Our findings unveil a critical bias towards a limited set of programming concepts, neglecting m…

    Submitted 4 July, 2024; v1 submitted 8 January, 2024; originally announced January 2024.