Showing 1–9 of 9 results for author: Ramakrishnan, A A

Searching in archive cs.

  1. arXiv:2506.06561  [pdf, ps, other]

    cs.CL cs.AI cs.CV

    LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles

    Authors: Ho Yin 'Sam' Ng, Ting-Yao Hsu, Aashish Anantha Ramakrishnan, Branislav Kveton, Nedim Lipka, Franck Dernoncourt, Dongwon Lee, Tong Yu, Sungchul Kim, Ryan A. Rossi, Ting-Hao 'Kenneth' Huang

    Abstract: Figure captions are crucial for helping readers understand and remember a figure's key message. Many models have been developed to generate these captions, helping authors compose better quality captions more easily. Yet, authors almost always need to revise generic AI-generated captions to match their writing style and the domain's style, highlighting the need for personalization. Despite languag…

    Submitted 17 June, 2025; v1 submitted 6 June, 2025; originally announced June 2025.

    Comments: The LaMP-CAP dataset is publicly available at: https://github.com/Crowd-AI-Lab/lamp-cap

  2. arXiv:2505.16513  [pdf, ps, other]

    cs.CV

    Detailed Evaluation of Modern Machine Learning Approaches for Optic Plastics Sorting

    Authors: Vaishali Maheshkar, Aadarsh Anantha Ramakrishnan, Charuvahan Adhivarahan, Karthik Dantu

    Abstract: According to the EPA, only 25% of waste is recycled, and just 60% of U.S. municipalities offer curbside recycling. Plastics fare worse, with a recycling rate of only 8%; an additional 16% is incinerated, while the remaining 76% ends up in landfills. The low plastic recycling rate stems from contamination, poor economic incentives, and technical difficulties, making efficient recycling a challenge.…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: Accepted at the 2024 REMADE Circular Economy Tech Summit and Conference, https://remadeinstitute.org/2024-conference/

    MSC Class: 68T45 ACM Class: I.4.9; I.4.6

  3. arXiv:2505.16258  [pdf, ps, other]

    cs.CL cs.AI cs.CV

    IRONIC: Coherence-Aware Reasoning Chains for Multi-Modal Sarcasm Detection

    Authors: Aashish Anantha Ramakrishnan, Aadarsh Anantha Ramakrishnan, Dongwon Lee

    Abstract: Interpreting figurative language such as sarcasm across multi-modal inputs presents unique challenges, often requiring task-specific fine-tuning and extensive reasoning steps. However, current Chain-of-Thought approaches do not efficiently leverage the same cognitive processes that enable humans to identify sarcasm. We present IRONIC, an in-context learning framework that leverages Multi-modal Coh…

    Submitted 22 May, 2025; originally announced May 2025.

    MSC Class: 68T50 ACM Class: I.2.7; I.2.10

  4. arXiv:2503.23242  [pdf]

    cs.CL cs.AI

    Beyond speculation: Measuring the growing presence of LLM-generated texts in multilingual disinformation

    Authors: Dominik Macko, Aashish Anantha Ramakrishnan, Jason Samuel Lucas, Robert Moro, Ivan Srba, Adaku Uchendu, Dongwon Lee

    Abstract: Increased sophistication of large language models (LLMs) and the consequent quality of generated multilingual text raises concerns about potential disinformation misuse. While humans struggle to distinguish LLM-generated content from human-written texts, the scholarly debate about their impact remains divided. Some argue that heightened fears are overblown due to natural ecosystem limitations, whi…

    Submitted 29 March, 2025; originally announced March 2025.

  5. arXiv:2503.10997  [pdf, ps, other]

    cs.CL cs.AI cs.CV

    RONA: Pragmatically Diverse Image Captioning with Coherence Relations

    Authors: Aashish Anantha Ramakrishnan, Aadarsh Anantha Ramakrishnan, Dongwon Lee

    Abstract: Writing Assistants (e.g., Grammarly, Microsoft Copilot) traditionally generate diverse image captions by employing syntactic and semantic variations to describe image components. However, human-written captions prioritize conveying a central message alongside visual descriptions using pragmatic cues. To enhance caption diversity, it is essential to explore alternative ways of communicating these m…

    Submitted 9 June, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

    Comments: Accepted in the NAACL Fourth Workshop on Intelligent and Interactive Writing Assistants (In2Writing), Albuquerque, New Mexico, May 2025, https://in2writing.glitch.me

    MSC Class: 68T50 ACM Class: I.2.7; I.2.10

  6. arXiv:2502.11300  [pdf, ps, other]

    cs.CL cs.AI cs.CV

    CORDIAL: Can Multimodal Large Language Models Effectively Understand Coherence Relationships?

    Authors: Aashish Anantha Ramakrishnan, Aadarsh Anantha Ramakrishnan, Dongwon Lee

    Abstract: Multimodal Large Language Models (MLLMs) are renowned for their superior instruction-following and reasoning capabilities across diverse problem domains. However, existing benchmarks primarily focus on assessing factual and logical correctness in downstream tasks, with limited emphasis on evaluating MLLMs' ability to interpret pragmatic cues and intermodal relationships. To address this gap, we as…

    Submitted 9 June, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

    Comments: To appear at the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), Vienna, Austria, July 2025, https://2025.aclweb.org/

    ACM Class: I.2.7; I.2.10

  7. arXiv:2406.11106  [pdf, other]

    cs.CL cs.AI

    From Intentions to Techniques: A Comprehensive Taxonomy and Challenges in Text Watermarking for Large Language Models

    Authors: Harsh Nishant Lalai, Aashish Anantha Ramakrishnan, Raj Sanjay Shah, Dongwon Lee

    Abstract: With the rapid growth of Large Language Models (LLMs), safeguarding textual content against unauthorized use is crucial. Text watermarking offers a vital solution, protecting both - LLM-generated and plain text sources. This paper presents a unified overview of different perspectives behind designing watermarking techniques, through a comprehensive survey of the research literature. Our work has t…

    Submitted 16 June, 2024; originally announced June 2024.

  8. arXiv:2404.10141  [pdf, other]

    cs.CV cs.CL cs.MM

    ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis

    Authors: Aashish Anantha Ramakrishnan, Sharon X. Huang, Dongwon Lee

    Abstract: Text-to-Image (T2I) Synthesis has made tremendous strides in enhancing synthesized image quality, but current datasets evaluate model performance only on descriptive, instruction-based prompts. Real-world news image captions take a more pragmatic approach, providing high-level situational and Named-Entity (NE) information and limited physical object descriptions, making them abstractive. To evalua…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 23 pages, 9 figures

    MSC Class: 65D19

  9. arXiv:2301.02160  [pdf, other]

    cs.CV cs.CL

    ANNA: Abstractive Text-to-Image Synthesis with Filtered News Captions

    Authors: Aashish Anantha Ramakrishnan, Sharon X. Huang, Dongwon Lee

    Abstract: Advancements in Text-to-Image synthesis over recent years have focused more on improving the quality of generated samples using datasets with descriptive prompts. However, real-world image-caption pairs present in domains such as news data do not use simple and directly descriptive captions. With captions containing information on both the image content and underlying contextual cues, they become…

    Submitted 1 July, 2024; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: To appear in the ACL 3rd Workshop on Advances in Language and Vision Research (ALVR), Bangkok, Thailand, August 2024, https://alvr-workshop.github.io

    MSC Class: 65D19