
Showing 1–50 of 137 results for author: Chadha, A

  1. arXiv:2506.08885  [pdf, ps, other]

    cs.CL cs.LG

    AdversariaL attacK sAfety aLIgnment(ALKALI): Safeguarding LLMs through GRACE: Geometric Representation-Aware Contrastive Enhancement- Introducing Adversarial Vulnerability Quality Index (AVQI)

    Authors: Danush Khanna, Krishna Kumar, Basab Ghosh, Vinija Jain, Vasu Sharma, Aman Chadha, Amitava Das

    Abstract: Adversarial threats against LLMs are escalating faster than current defenses can adapt. We expose a critical geometric blind spot in alignment: adversarial prompts exploit latent camouflage, embedding perilously close to the safe representation manifold while encoding unsafe intent, thereby evading surface-level defenses like Direct Preference Optimization (DPO), which remain blind to the latent ge…

    Submitted 11 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

  2. arXiv:2505.18931  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    Can Large Language Models Infer Causal Relationships from Real-World Text?

    Authors: Ryan Saklad, Aman Chadha, Oleg Pavlov, Raha Moraffah

    Abstract: Understanding and inferring causal relationships from texts is a core aspect of human cognition and is essential for advancing large language models (LLMs) towards artificial general intelligence. Existing work primarily focuses on synthetically generated texts which involve simple causal relationships explicitly mentioned in the text. This fails to reflect the complexities of real-world tasks. In…

    Submitted 24 May, 2025; originally announced May 2025.

  3. arXiv:2505.17870  [pdf, ps, other]

    cs.CL

    Just as Humans Need Vaccines, So Do Models: Model Immunization to Combat Falsehoods

    Authors: Shaina Raza, Rizwan Qureshi, Marcelo Lotif, Aman Chadha, Deval Pandya, Christos Emmanouilidis

    Abstract: Generative AI models often learn and reproduce false information present in their training corpora. This position paper argues that, analogous to biological immunization, where controlled exposure to a weakened pathogen builds immunity, AI models should be fine tuned on small, quarantined sets of explicitly labeled falsehoods as a "vaccine" against misinformation. These curated false examples are…

    Submitted 23 May, 2025; originally announced May 2025.

  4. arXiv:2503.05315  [pdf, ps, other]

    cs.LG cs.IR cs.SE

    LoRACode: LoRA Adapters for Code Embeddings

    Authors: Saumya Chaturvedi, Aman Chadha, Laurent Bindschaedler

    Abstract: Code embeddings are essential for semantic code search; however, current approaches often struggle to capture the precise syntactic and contextual nuances inherent in code. Open-source models such as CodeBERT and UniXcoder exhibit limitations in scalability and efficiency, while high-performing proprietary systems impose substantial computational costs. We introduce a parameter-efficient fine-tuni…

    Submitted 2 June, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

    Comments: Accepted at the Deep Learning for Code (DL4C) Workshop at ICLR 2025

  5. arXiv:2502.03512  [pdf, other]

    cs.AI

    YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment

    Authors: Amitava Das, Yaswanth Narsupalli, Gurpreet Singh, Vinija Jain, Vasu Sharma, Suranjana Trivedy, Aman Chadha, Amit Sheth

    Abstract: Precise alignment in Text-to-Image (T2I) systems is crucial to ensure that generated visuals not only accurately encapsulate user intents but also conform to stringent ethical and aesthetic benchmarks. Incidents like the Google Gemini fiasco, where misaligned outputs triggered significant public backlash, underscore the critical need for robust alignment mechanisms. In contrast, Large Language Mod…

    Submitted 9 February, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

  6. arXiv:2502.02027  [pdf, other]

    cs.CV cs.AI

    From Fog to Failure: The Unintended Consequences of Dehazing on Object Detection in Clear Images

    Authors: Ashutosh Kumar, Aman Chadha

    Abstract: This study explores the challenges of integrating human visual cue-based dehazing into object detection, given the selective nature of human perception. While human vision adapts dynamically to environmental conditions, computational dehazing does not always enhance detection uniformly. We propose a multi-stage framework where a lightweight detector identifies regions of interest (RoIs), which are…

    Submitted 16 March, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

  7. arXiv:2502.01673  [pdf, other]

    cs.CL cs.AI

    Multilingual State Space Models for Structured Question Answering in Indic Languages

    Authors: Arpita Vats, Rahul Raja, Mrinal Mathur, Vinija Jain, Aman Chadha

    Abstract: The diversity and complexity of Indic languages present unique challenges for natural language processing (NLP) tasks, particularly in the domain of question answering (QA). To address these challenges, this paper explores the application of State Space Models (SSMs) to build efficient and contextually aware QA systems tailored for Indic languages. SSMs are particularly suited for this task due to…

    Submitted 24 April, 2025; v1 submitted 1 February, 2025; originally announced February 2025.

    Comments: Accepted at NAACL

  8. arXiv:2501.15747  [pdf, other]

    cs.CL cs.AI

    IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

    Authors: Sankalp KJ, Ashutosh Kumar, Laxmaan Balaji, Nikunj Kotecha, Vinija Jain, Aman Chadha, Sreyoshi Bhaduri

    Abstract: Known by more than 1.5 billion people in the Indian subcontinent, Indic languages present unique challenges and opportunities for natural language processing (NLP) research due to their rich cultural heritage, linguistic diversity, and complex structures. IndicMMLU-Pro is a comprehensive benchmark designed to evaluate Large Language Models (LLMs) across Indic languages, building upon the MMLU Pro…

    Submitted 27 January, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

  9. arXiv:2501.08167  [pdf, ps, other]

    cs.CL cs.AI cs.CY

    Potential and Perils of Large Language Models as Judges of Unstructured Textual Data

    Authors: Rewina Bedemariam, Natalie Perez, Sreyoshi Bhaduri, Satya Kapoor, Alex Gil, Elizabeth Conjar, Ikkei Itoku, David Theil, Aman Chadha, Naumaan Nayyar

    Abstract: Rapid advancements in large language models have unlocked remarkable capabilities when it comes to processing and summarizing unstructured text data. This has implications for the analysis of rich, open-ended datasets, such as survey responses, where LLMs hold the promise of efficiently distilling key themes and sentiments. However, as organizations increasingly turn to these powerful AI systems t…

    Submitted 20 January, 2025; v1 submitted 14 January, 2025; originally announced January 2025.

    Comments: 11 pages, 1 appendix

  10. arXiv:2501.03271  [pdf, other]

    cs.LG cs.AI cs.CL

    DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization

    Authors: Amitava Das, Suranjana Trivedy, Danush Khanna, Rajarshi Roy, Gurpreet Singh, Basab Ghosh, Yaswanth Narsupalli, Vinija Jain, Vasu Sharma, Aishwarya Naresh Reganti, Aman Chadha

    Abstract: The rapid rise of large language models (LLMs) has unlocked many applications but also underscores the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but constrained by fixed divergences and limited feature transformations. We propose DPO-Kernels, which integrates kernel methods to address these issues through four key c…

    Submitted 19 January, 2025; v1 submitted 4 January, 2025; originally announced January 2025.

    MSC Class: 68T45

  11. arXiv:2501.00048  [pdf]

    cs.LG cs.AI

    Stroke Prediction using Clinical and Social Features in Machine Learning

    Authors: Aidan Chadha

    Abstract: Every year in the United States, 800,000 individuals suffer a stroke - one person every 40 seconds, with a death occurring every four minutes. While individual factors vary, certain predictors are more prevalent in determining stroke risk. As strokes are the second leading cause of death and disability worldwide, predicting stroke likelihood based on lifestyle factors is crucial. Showing individua…

    Submitted 27 December, 2024; originally announced January 2025.

  12. arXiv:2412.17304  [pdf, other]

    cs.AI

    On the Feasibility of Vision-Language Models for Time-Series Classification

    Authors: Vinay Prithyani, Mohsin Mohammed, Richa Gadgil, Ricardo Buitrago, Vinija Jain, Aman Chadha

    Abstract: We build upon time-series classification by leveraging the capabilities of Vision Language Models (VLMs). We find that VLMs produce competitive results after two or fewer epochs of fine-tuning. We develop a novel approach that incorporates graphical data representations as images in conjunction with numerical data. This approach is rooted in the hypothesis that graphical representations can provide…

    Submitted 17 January, 2025; v1 submitted 23 December, 2024; originally announced December 2024.

  13. arXiv:2412.17131  [pdf, other]

    cs.CL

    LLMsAgainstHate @ NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs

    Authors: Rushendra Sidibomma, Pransh Patwa, Parth Patwa, Aman Chadha, Vinija Jain, Amitava Das

    Abstract: The detection of hate speech has become increasingly important in combating online hostility and its real-world consequences. Despite recent advancements, there is limited research addressing hate speech detection in Devanagari-scripted languages, where resources and tools are scarce. While large language models (LLMs) have shown promise in language-related tasks, traditional fine-tuning approache…

    Submitted 26 December, 2024; v1 submitted 22 December, 2024; originally announced December 2024.

  14. arXiv:2412.16359  [pdf, ps, other]

    cs.CL cs.AI

    Human-Readable Adversarial Prompts: An Investigation into LLM Vulnerabilities Using Situational Context

    Authors: Nilanjana Das, Edward Raff, Aman Chadha, Manas Gaur

    Abstract: As AI systems become deeply embedded in social media platforms, we've uncovered a concerning security vulnerability that goes beyond traditional adversarial attacks. It becomes important to assess the risks of LLMs before the general public uses them on social media platforms to avoid any adverse impacts. Unlike obvious nonsensical text strings that safety systems can easily catch, our work rev…

    Submitted 29 May, 2025; v1 submitted 20 December, 2024; originally announced December 2024.

    Comments: arXiv admin note: text overlap with arXiv:2407.14644

  15. arXiv:2412.15443  [pdf]

    cs.CL

    SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

    Authors: Aakash Mahalingam, Vinesh Kumar Gande, Aman Chadha, Vinija Jain, Divya Chaudhary

    Abstract: Retrieval-Augmented Generation (RAG) systems have become pivotal in leveraging vast corpora to generate informed and contextually relevant responses, notably reducing hallucinations in Large Language Models. Despite significant advancements, these systems struggle to efficiently process and retrieve information from large datasets while maintaining a comprehensive understanding of the context. Thi…

    Submitted 19 December, 2024; originally announced December 2024.

    Comments: 16 pages, 8 figures, Workshop on Generative AI and Knowledge Graphs (GenAIK) at The 31st International Conference on Computational Linguistics (COLING 2025)

    Journal ref: Workshop on Generative AI and Knowledge Graphs (GenAIK) at The 31st International Conference on Computational Linguistics (COLING 2025)

  16. arXiv:2412.00869  [pdf, other]

    cs.CL cs.AI

    KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting

    Authors: Thilini Wijesiriwardene, Ruwan Wickramarachchi, Sreeram Vennam, Vinija Jain, Aman Chadha, Amitava Das, Ponnurangam Kumaraguru, Amit Sheth

    Abstract: Making analogies is fundamental to cognition. Proportional analogies, which consist of four terms, are often used to assess linguistic and cognitive abilities. For instance, completing analogies like "Oxygen is to Gas as <blank> is to <blank>" requires identifying the semantic relationship (e.g., "type of") between the first pair of terms ("Oxygen" and "Gas") and finding a second pair that shares…

    Submitted 18 December, 2024; v1 submitted 1 December, 2024; originally announced December 2024.

    Comments: Accepted at COLING 2025

  17. arXiv:2412.00319  [pdf, other]

    cs.SD cs.AI eess.AS

    Improving speaker verification robustness with synthetic emotional utterances

    Authors: Nikhil Kumar Koditala, Chelsea Jui-Ting Ju, Ruirui Li, Minho Jin, Aman Chadha, Andreas Stolcke

    Abstract: A speaker verification (SV) system offers an authentication service designed to confirm whether a given speech sample originates from a specific speaker. This technology has paved the way for various personalized applications that cater to individual preferences. A noteworthy challenge faced by SV systems is their ability to perform consistently across a range of emotional spectra. Most existing m…

    Submitted 29 November, 2024; originally announced December 2024.

  18. arXiv:2411.16754  [pdf, other]

    cs.CV cs.AI

    Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

    Authors: Nasrin Imanpour, Shashwat Bajpai, Subhankar Ghosh, Sainath Reddy Sankepally, Abhilekh Borah, Hasnat Md Abdullah, Nishoak Kosaraju, Shreyas Dixit, Ashhar Aziz, Shwetangshu Biswas, Vinija Jain, Aman Chadha, Amit Sheth, Amitava Das

    Abstract: The proliferation of AI techniques for image generation, coupled with their increasing accessibility, has raised significant concerns about the potential misuse of these images to spread misinformation. Recent AI-generated image detection (AGID) methods include CNNDetection, NPR, DM Image Detection, Fake Image Detection, DIRE, LASTED, GAN Image Detection, AIDE, SSP, DRCT, RINE, OCC-CLIP, De-Fake,…

    Submitted 24 November, 2024; originally announced November 2024.

    Comments: 13 pages, 9 figures

  19. arXiv:2411.16508  [pdf, other]

    cs.CV cs.CL

    All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

    Authors: Ashmal Vayani, Dinura Dissanayake, Hasindri Watawana, Noor Ahsan, Nevasini Sasikumar, Omkar Thawakar, Henok Biadglign Ademtew, Yahya Hmaiti, Amandeep Kumar, Kartik Kuckreja, Mykola Maslych, Wafa Al Ghallabi, Mihail Mihaylov, Chao Qin, Abdelrahman M Shaker, Mike Zhang, Mahardika Krisna Ihsani, Amiel Esplana, Monil Gokani, Shachar Mirkin, Harsh Singh, Ashay Srivastava, Endre Hamerlik, Fathinah Asma Izzati, Fadillah Adamsyah Maani, et al. (44 additional authors not shown)

    Abstract: Existing Large Multimodal Models (LMMs) generally focus on only a few regions and languages. As LMMs continue to improve, it is increasingly important to ensure they understand cultural contexts, respect local sensitivities, and support low-resource languages, all while effectively integrating corresponding visual cues. In pursuit of culturally diverse global multimodal models, our proposed All La…

    Submitted 30 April, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

    Comments: A Multilingual Multimodal cultural benchmark for 100 languages

  20. arXiv:2411.10867  [pdf, other]

    cs.CV cs.AI

    ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models

    Authors: Vipula Rawte, Sarthak Jain, Aarush Sinha, Garv Kaushik, Aman Bansal, Prathiksha Rumale Vishwanath, Samyak Rajesh Jain, Aishwarya Naresh Reganti, Vinija Jain, Aman Chadha, Amit P. Sheth, Amitava Das

    Abstract: Recent advances in Large Multimodal Models (LMMs) have expanded their capabilities to video understanding, with Text-to-Video (T2V) models excelling in generating videos from textual prompts. However, they still frequently produce hallucinated content, revealing AI-generated inconsistencies. We introduce ViBe (https://vibe-t2v-bench.github.io/): a large-scale dataset of hallucinated videos from op…

    Submitted 19 March, 2025; v1 submitted 16 November, 2024; originally announced November 2024.

  21. arXiv:2410.15017  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    DM-Codec: Distilling Multimodal Representations for Speech Tokenization

    Authors: Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, A K M Mahbubur Rahman, Aman Chadha, Tariq Iqbal, M Ashraful Amin, Md Mofijul Islam, Amin Ahsan Ali

    Abstract: Recent advancements in speech-language models have yielded significant improvements in speech tokenization and synthesis. However, effectively mapping the complex, multidimensional attributes of speech into discrete tokens remains challenging. This process demands acoustic, semantic, and contextual information for precise speech representations. Existing speech representations generally fall into…

    Submitted 19 October, 2024; originally announced October 2024.

  22. arXiv:2410.04236  [pdf, other]

    cs.CL cs.AI cs.LG

    Overview of Factify5WQA: Fact Verification through 5W Question-Answering

    Authors: Suryavardan Suresh, Anku Rani, Parth Patwa, Aishwarya Reganti, Vinija Jain, Aman Chadha, Amitava Das, Amit Sheth, Asif Ekbal

    Abstract: Researchers have found that fake news spreads many times faster than real news. This is a major problem, especially in today's world where social media is the key source of news for many among the younger population. Fact verification, thus, becomes an important task and many media sites contribute to the cause. Manual fact verification is a tedious task, given the volume of fake news online. The…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: Accepted at defactify3@aaai2024

  23. arXiv:2410.02458  [pdf, other]

    eess.IV cs.CL cs.CV

    MedVisionLlama: Leveraging Pre-Trained Large Language Model Layers to Enhance Medical Image Segmentation

    Authors: Gurucharan Marthi Krishna Kumar, Aman Chadha, Janine Mendola, Amir Shmuel

    Abstract: Large Language Models (LLMs), known for their versatility in textual data, are increasingly being explored for their potential to enhance medical image segmentation, a crucial task for accurate diagnostic imaging. This study explores enhancing Vision Transformers (ViTs) for medical image segmentation by integrating pre-trained LLM transformer blocks. Our approach, which incorporates a frozen LLM t…

    Submitted 4 October, 2024; v1 submitted 3 October, 2024; originally announced October 2024.

  24. arXiv:2409.09269  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types

    Authors: Neelabh Sinha, Vinija Jain, Aman Chadha

    Abstract: Visual Question-Answering (VQA) has become key to user experience, particularly after improved generalization capabilities of Vision-Language Models (VLMs). But evaluating VLMs for an application requirement using a standardized framework in practical settings is still challenging. This paper aims to solve that using an end-to-end framework. We present VQA360 - a novel dataset derived from establi…

    Submitted 12 December, 2024; v1 submitted 13 September, 2024; originally announced September 2024.

    Comments: Accepted at The First Workshop of Evaluation of Multi-Modal Generation (EvalMG) in 31st International Conference on Computational Linguistics (COLING), 2025. 8 pages + references + 6 pages of Appendix

  25. arXiv:2409.00391  [pdf, other]

    cs.SD cs.AI eess.AS

    Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders

    Authors: Georgios Ioannides, Adrian Kieback, Aman Chadha, Aaron Elkins

    Abstract: Speech-based depression detection poses significant challenges for automated detection due to its unique manifestation across individuals and data scarcity. Addressing these challenges, we introduce DAAMAudioCNNLSTM and DAAMAudioTransformer, two parameter efficient and explainable models for audio feature extraction and depression detection. DAAMAudioCNNLSTM features a novel CNN-LSTM framework wit…

    Submitted 31 August, 2024; originally announced September 2024.

  26. arXiv:2408.12369  [pdf, other]

    cs.AI

    RoundTable: Leveraging Dynamic Schema and Contextual Autocomplete for Enhanced Query Precision in Tabular Question Answering

    Authors: Pratyush Kumar, Kuber Vijaykumar Bellad, Bharat Vadlamudi, Aman Chadha

    Abstract: With advancements in Large Language Models (LLMs), a major use case that has emerged is querying databases in plain English, translating user questions into executable database queries, which has improved significantly. However, real-world datasets often feature a vast array of attributes and complex values, complicating the LLM's task of accurately identifying relevant columns or values from natur…

    Submitted 23 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: 13 pages, 4 figures

  27. arXiv:2408.12060  [pdf, other]

    cs.CL cs.AI

    Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs

    Authors: Ronit Singhal, Pransh Patwa, Parth Patwa, Aman Chadha, Amitava Das

    Abstract: Given the widespread dissemination of misinformation on social media, implementing fact-checking mechanisms for online claims is essential. Manually verifying every claim is very challenging, underscoring the need for an automated fact-checking system. This paper presents our system designed to address this issue. We utilize the Averitec dataset (Schlichtkrull et al., 2023) to assess the performan…

    Submitted 4 October, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

    Comments: Accepted in The Seventh FEVER Workshop at EMNLP 2024

  28. arXiv:2408.11247  [pdf, other]

    cs.CL

    Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data

    Authors: Atmika Gorti, Manas Gaur, Aman Chadha

    Abstract: Large Language Models (LLMs) are prone to inheriting and amplifying societal biases embedded within their training data, potentially reinforcing harmful stereotypes related to gender, occupation, and other sensitive categories. This issue becomes particularly problematic as biased LLMs can have far-reaching consequences, leading to unfair practices and exacerbating social inequalities across vario…

    Submitted 26 August, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

    Comments: Accepted in AAAI Spring Symposium 2024

  29. arXiv:2408.11237  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification

    Authors: Christos Constantinou, Georgios Ioannides, Aman Chadha, Aaron Elkins, Edwin Simpson

    Abstract: Detecting out-of-distribution (OOD) data is crucial in machine learning applications to mitigate the risk of model overconfidence, thereby enhancing the reliability and safety of deployed systems. The majority of existing OOD detection methods predominantly address uni-modal inputs, such as images or texts. In the context of multi-modal documents, there is a notable lack of extensive research on t…

    Submitted 20 August, 2024; originally announced August 2024.

  30. arXiv:2408.10446  [pdf, other]

    cs.CV cs.AI

    The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks

    Authors: Niyar R Barman, Krish Sharma, Ashhar Aziz, Shashwat Bajpai, Shwetangshu Biswas, Vasu Sharma, Vinija Jain, Aman Chadha, Amit Sheth, Amitava Das

    Abstract: The rapid advancement of text-to-image generation systems, exemplified by models like Stable Diffusion, Midjourney, Imagen, and DALL-E, has heightened concerns about their potential misuse. In response, companies like Meta and Google have intensified their efforts to implement watermarking techniques on AI-generated images to curb the circulation of potentially misleading visuals. However, in this…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 23 pages and 10 figures

  31. arXiv:2406.14805  [pdf, other]

    cs.CL

    How Well Do LLMs Represent Values Across Cultures? Empirical Analysis of LLM Responses Based on Hofstede Cultural Dimensions

    Authors: Julia Kharchenko, Tanya Roosta, Aman Chadha, Chirag Shah

    Abstract: Large Language Models (LLMs) attempt to imitate human behavior by responding to humans in a way that pleases them, including by adhering to their values. However, humans come from diverse cultures with different values. It is critical to understand whether LLMs showcase different values to the user based on the stereotypical values of a user's known country. We prompt different LLMs with a series…

    Submitted 20 June, 2024; originally announced June 2024.

  32. arXiv:2406.12644  [pdf, other]

    cs.CL cs.AI

    Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles

    Authors: Devichand Budagam, Ashutosh Kumar, Mahsa Khoshnoodi, Sankalp KJ, Vinija Jain, Aman Chadha

    Abstract: Assessing the effectiveness of large language models (LLMs) in performing different tasks is crucial for understanding their strengths and weaknesses. This paper presents Hierarchical Prompting Taxonomy (HPT), grounded on human cognitive principles and designed to assess LLMs by examining the cognitive demands of various tasks. The HPT utilizes the Hierarchical Prompting Framework (HPF), which str…

    Submitted 11 December, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  33. arXiv:2406.11402  [pdf, other]

    cs.CL cs.AI cs.LG

    Are Small Language Models Ready to Compete with Large Language Models for Practical Applications?

    Authors: Neelabh Sinha, Vinija Jain, Aman Chadha

    Abstract: The rapid rise of Language Models (LMs) has expanded their use in several applications. Yet, due to constraints of model size, associated cost, or proprietary restrictions, utilizing state-of-the-art (SOTA) LLMs is not always feasible. With open, smaller LMs emerging, more applications can leverage their capabilities, but selecting the right LM can be challenging as smaller LMs do not perform well…

    Submitted 12 March, 2025; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted at The Fifth Workshop on Trustworthy Natural Language Processing (TrustNLP 2025) in Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025. 8 pages + references + Appendix

  34. arXiv:2406.11109  [pdf, other]

    cs.CL cs.AI cs.LG

    Investigating Annotator Bias in Large Language Models for Hate Speech Detection

    Authors: Amit Das, Zheng Zhang, Najib Hasan, Souvika Sarkar, Fatemeh Jamshidi, Tathagata Bhattacharya, Mostafa Rahgouy, Nilanjana Raychawdhary, Dongji Feng, Vinija Jain, Aman Chadha, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals

    Abstract: Data annotation, the practice of assigning descriptive labels to raw data, is pivotal in optimizing the performance of machine learning models. However, it is a resource-intensive process susceptible to biases introduced by annotators. The emergence of sophisticated Large Language Models (LLMs) presents a unique opportunity to modernize and streamline this complex procedure. While existing researc…

    Submitted 16 November, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: Accepted at NeurIPS Safe Generative AI Workshop, 2024

  35. arXiv:2406.09559  [pdf, other]

    cs.CL cs.AI cs.LG

    Decoding the Diversity: A Review of the Indic AI Research Landscape

    Authors: Sankalp KJ, Vinija Jain, Sreyoshi Bhaduri, Tamoghna Roy, Aman Chadha

    Abstract: This review paper provides a comprehensive overview of large language model (LLM) research directions within Indic languages. Indic languages are those spoken in the Indian subcontinent, including India, Pakistan, Bangladesh, Sri Lanka, Nepal, and Bhutan, among others. These languages have a rich cultural and linguistic heritage and are spoken by over 1.5 billion people worldwide. With the tremend…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 27 pages, 1 figure

  36. arXiv:2406.08862  [pdf, other]

    cs.LG

    Cognitively Inspired Energy-Based World Models

    Authors: Alexi Gladstone, Ganesh Nanduru, Md Mofijul Islam, Aman Chadha, Jundong Li, Tariq Iqbal

    Abstract: One of the predominant methods for training world models is autoregressive prediction in the output space of the next element of a sequence. In Natural Language Processing (NLP), this takes the form of Large Language Models (LLMs) predicting the next token; in Computer Vision (CV), this takes the form of autoregressive models predicting the next frame/token/pixel. However, this approach differs fr…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 23 pages, 6 figures

  37. arXiv:2406.05344  [pdf, other]

    cs.CL

    MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention

    Authors: Prince Jha, Raghav Jain, Konika Mandal, Aman Chadha, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: In the digital world, memes present a unique challenge for content moderation due to their potential to spread harmful content. Although detection methods have improved, proactive solutions such as intervention are still limited, with current research focusing mostly on text-based content, neglecting the widespread influence of multimodal content like memes. Addressing this gap, we present …

    Submitted 8 June, 2024; originally announced June 2024.

  38. arXiv:2405.17927  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG eess.AS

    The Evolution of Multimodal Model Architectures

    Authors: Shakti N. Wadekar, Abhishek Chaurasia, Aman Chadha, Eugenio Culurciello

    Abstract: This work uniquely identifies and characterizes four prevalent multimodal model architectural patterns in the contemporary multimodal landscape. Systematically categorizing models by architecture type facilitates monitoring of developments in the multimodal domain. Distinct from recent survey papers that present general information on multimodal architectures, this research conducts a comprehensiv…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 30 pages, 6 tables, 7 figures

  39. arXiv:2405.17475  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    How Culturally Aware are Vision-Language Models?

    Authors: Olena Burda-Lassen, Aman Chadha, Shashank Goswami, Vinija Jain

    Abstract: An image is often considered worth a thousand words, and certain images can tell rich and insightful stories. Can these stories be told via image captioning? Images from folklore genres, such as mythology, folk dance, cultural signs, and symbols, are vital to every culture. Our research compares the performance of four popular vision-language models (GPT-4V, Gemini Pro Vision, LLaVA, and OpenFlami…

    Submitted 8 February, 2025; v1 submitted 24 May, 2024; originally announced May 2024.

  40. Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development

    Authors: Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Aman Chadha, Samrat Mondal

    Abstract: The mining of adverse drug events (ADEs) is pivotal in pharmacovigilance, enhancing patient safety by identifying potential risks associated with medications, facilitating early detection of adverse events, and guiding regulatory decision-making. Traditional ADE detection methods are reliable but slow, not easily adaptable to large-scale operations, and offer limited information. With the exponent…

    Submitted 26 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: ACL Findings 2024

    Report number: 2024.findings-acl.667

  41. arXiv:2405.13019  [pdf, other]

    cs.CL cs.AI

    A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models

    Authors: Mahsa Khoshnoodi, Vinija Jain, Mingye Gao, Malavika Srikanth, Aman Chadha

    Abstract: Despite the crucial importance of accelerating text generation in large language models (LLMs) for efficiently producing content, the sequential nature of this process often leads to high inference latency, posing challenges for real-time applications. Various techniques have been proposed and developed to address these challenges and improve efficiency. This paper presents a comprehensive survey…

    Submitted 24 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  42. arXiv:2405.09589  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.SD eess.AS

    A Comprehensive Survey of Hallucination in Large Language, Image, Video and Audio Foundation Models

    Authors: Pranab Sahoo, Prabhash Meharia, Akash Ghosh, Sriparna Saha, Vinija Jain, Aman Chadha

    Abstract: The rapid advancement of foundation models (FMs) across language, image, audio, and video domains has shown remarkable capabilities in diverse tasks. However, the proliferation of FMs brings forth a critical challenge: the potential to generate hallucinated outputs, particularly in high-stakes applications. The tendency of foundation models to produce hallucinated content arguably represents the b…

    Submitted 3 October, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: EMNLP 2024 Findings

  43. arXiv:2404.13506  [pdf, other]

    cs.LG cs.AI cs.CL

    Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications

    Authors: Charith Chandra Sai Balne, Sreyoshi Bhaduri, Tamoghna Roy, Vinija Jain, Aman Chadha

    Abstract: The rise of deep learning has marked significant progress in fields such as computer vision, natural language processing, and medical imaging, primarily through the adaptation of pre-trained models for specific tasks. Traditional fine-tuning methods, involving adjustments to all parameters, face challenges due to high computational and memory demands. This has led to the development of Parameter E…

    Submitted 23 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  44. arXiv:2404.11036  [pdf, other]

    cs.LG cs.CL

    Cross-Platform Hate Speech Detection with Weakly Supervised Causal Disentanglement

    Authors: Paras Sheth, Tharindu Kumarage, Raha Moraffah, Aman Chadha, Huan Liu

    Abstract: Content moderation faces a challenging task as social media's ability to spread hate speech contrasts with its role in promoting global connectivity. With rapidly evolving slang and hate speech, the adaptability of conventional deep learning to the fluid landscape of online dialogue remains limited. In response, causality inspired disentanglement has shown promise by segregating platform specific…

    Submitted 16 April, 2024; originally announced April 2024.

  45. arXiv:2404.07214  [pdf, other]

    cs.CV cs.AI cs.CL

    Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions

    Authors: Akash Ghosh, Arkadeep Acharya, Sriparna Saha, Vinija Jain, Aman Chadha

    Abstract: The advent of Large Language Models (LLMs) has significantly reshaped the trajectory of the AI revolution. Nevertheless, these LLMs exhibit a notable limitation, as they are primarily adept at processing textual information. To address this constraint, researchers have endeavored to integrate visual capabilities with LLMs, resulting in the emergence of Vision-Language Models (VLMs). These advanced…

    Submitted 12 April, 2024; v1 submitted 20 February, 2024; originally announced April 2024.

    Comments: The most extensive and up to date Survey on Visual Language Models covering 76 Visual Language Models

  46. arXiv:2403.19113  [pdf, other]

    cs.CL cs.AI

    FACTOID: FACtual enTailment fOr hallucInation Detection

    Authors: Vipula Rawte, S. M Towhidul Islam Tonmoy, Krishnav Rajbangshi, Shravani Nag, Aman Chadha, Amit P. Sheth, Amitava Das

    Abstract: The widespread adoption of Large Language Models (LLMs) has facilitated numerous benefits. However, hallucination is a significant concern. In response, Retrieval Augmented Generation (RAG) has emerged as a highly promising paradigm to improve LLM outputs by grounding them in factual information. RAG relies on textual entailment (TE) or similar methods to check if the text produced by LLMs is supp…

    Submitted 27 March, 2024; originally announced March 2024.

  47. arXiv:2403.18976  [pdf, other]

    cs.CL cs.AI

    "Sorry, Come Again?" Prompting -- Enhancing Comprehension and Diminishing Hallucination with [PAUSE]-injected Optimal Paraphrasing

    Authors: Vipula Rawte, S. M Towhidul Islam Tonmoy, S M Mehedi Zaman, Prachi Priya, Aman Chadha, Amit P. Sheth, Amitava Das

    Abstract: Hallucination has emerged as the most vulnerable aspect of contemporary Large Language Models (LLMs). In this paper, we introduce the Sorry, Come Again (SCA) prompting, aimed to avoid LLM hallucinations by enhancing comprehension through: (i) optimal paraphrasing and (ii) injecting [PAUSE] tokens to delay LLM generation. First, we provide an in-depth analysis of linguistic nuances: formality, read…

    Submitted 27 March, 2024; originally announced March 2024.

  48. arXiv:2403.16422  [pdf, other]

    cs.CV cs.AI

    Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation

    Authors: Sanyam Lakhanpal, Shivang Chopra, Vinija Jain, Aman Chadha, Man Luo

    Abstract: Over the past few years, Text-to-Image (T2I) generation approaches based on diffusion models have gained significant attention. However, vanilla diffusion models often suffer from spelling inaccuracies in the text displayed within the generated images. The capability to generate visual text is crucial, offering both academic interest and a wide range of practical applications. To produce accurate…

    Submitted 28 October, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted at WACV 2025

  49. arXiv:2403.14633  [pdf, other]

    cs.CY cs.AI cs.CL

    Born With a Silver Spoon? Investigating Socioeconomic Bias in Large Language Models

    Authors: Smriti Singh, Shuvam Keshari, Vinija Jain, Aman Chadha

    Abstract: Socioeconomic bias in society exacerbates disparities, influencing access to opportunities and resources based on individuals' economic and social backgrounds. This pervasive issue perpetuates systemic inequalities, hindering the pursuit of inclusive progress as a society. In this paper, we investigate the presence of socioeconomic bias, if any, in large language models. To this end, we introduce…

    Submitted 19 December, 2024; v1 submitted 16 February, 2024; originally announced March 2024.

  50. arXiv:2403.09724  [pdf, other]

    cs.CL cs.CY cs.LG

    ClaimVer: Explainable Claim-Level Verification and Evidence Attribution of Text Through Knowledge Graphs

    Authors: Preetam Prabhu Srikar Dammu, Himanshu Naidu, Mouly Dewan, YoungMin Kim, Tanya Roosta, Aman Chadha, Chirag Shah

    Abstract: In the midst of widespread misinformation and disinformation through social media and the proliferation of AI-generated texts, it has become increasingly difficult for people to validate and trust information they encounter. Many fact-checking approaches and tools have been developed, but they often lack appropriate explainability or granularity to be useful in various contexts. A text validation…

    Submitted 20 September, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: EMNLP 2024 Findings