Showing 1–6 of 6 results for author: Narsupalli, Y

  1. arXiv:2502.03512  [pdf, other]

    cs.AI

    YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment

    Authors: Amitava Das, Yaswanth Narsupalli, Gurpreet Singh, Vinija Jain, Vasu Sharma, Suranjana Trivedy, Aman Chadha, Amit Sheth

    Abstract: Precise alignment in Text-to-Image (T2I) systems is crucial to ensure that generated visuals not only accurately encapsulate user intents but also conform to stringent ethical and aesthetic benchmarks. Incidents like the Google Gemini fiasco, where misaligned outputs triggered significant public backlash, underscore the critical need for robust alignment mechanisms. In contrast, Large Language Mod…

    Submitted 9 February, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

  2. arXiv:2501.03271  [pdf, other]

    cs.LG cs.AI cs.CL

    DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization

    Authors: Amitava Das, Suranjana Trivedy, Danush Khanna, Rajarshi Roy, Gurpreet Singh, Basab Ghosh, Yaswanth Narsupalli, Vinija Jain, Vasu Sharma, Aishwarya Naresh Reganti, Aman Chadha

    Abstract: The rapid rise of large language models (LLMs) has unlocked many applications but also underscores the challenge of aligning them with diverse values and preferences. Direct Preference Optimization (DPO) is central to alignment but constrained by fixed divergences and limited feature transformations. We propose DPO-Kernels, which integrates kernel methods to address these issues through four key c…

    Submitted 19 January, 2025; v1 submitted 4 January, 2025; originally announced January 2025.

    MSC Class: 68T45

  3. arXiv:2407.12877  [pdf, other]

    cs.CL cs.AI

    ReFeR: Improving Evaluation and Reasoning through Hierarchy of Models

    Authors: Yaswanth Narsupalli, Abhranil Chandra, Sreevatsa Muppirala, Manish Gupta, Pawan Goyal

    Abstract: Assessing the quality of outputs generated by generative models, such as large language models and vision language models, presents notable challenges. Traditional methods for evaluation typically rely on either human assessments, which are resource-intensive, or automatic metrics that often show a low correlation with human judgment. Another common approach is to use deep learning systems, which…

    Submitted 9 October, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: Paper Under Review

  4. arXiv:2406.15252  [pdf, other]

    cs.CV cs.AI

    VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

    Authors: Xuan He, Dongfu Jiang, Ge Zhang, Max Ku, Achint Soni, Sherman Siu, Haonan Chen, Abhranil Chandra, Ziyan Jiang, Aaran Arulraj, Kai Wang, Quy Duc Do, Yuansheng Ni, Bohan Lyu, Yaswanth Narsupalli, Rongqi Fan, Zhiheng Lyu, Yuchen Lin, Wenhu Chen

    Abstract: Recent years have witnessed great advances in video generation. However, the development of automatic video metrics is lagging significantly behind. None of the existing metrics is able to provide reliable scores over generated videos. The main barrier is the lack of a large-scale human-annotated dataset. In this paper, we release VideoFeedback, the first large-scale dataset containing human-prov…

    Submitted 14 October, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

  5. arXiv:2406.05862  [pdf, other]

    cs.CL cs.AI cs.CV

    II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models

    Authors: Ziqiang Liu, Feiteng Fang, Xi Feng, Xinrun Du, Chenhao Zhang, Zekun Wang, Yuelin Bai, Qixuan Zhao, Liyang Fan, Chengguang Gan, Hongquan Lin, Jiaming Li, Yuansheng Ni, Haihong Wu, Yaswanth Narsupalli, Zhigang Zheng, Chengming Li, Xiping Hu, Ruifeng Xu, Xiaojun Chen, Min Yang, Jiaheng Liu, Ruibo Liu, Wenhao Huang, Ge Zhang, et al. (1 additional author not shown)

    Abstract: The rapid advancements in the development of multimodal large language models (MLLMs) have consistently led to new breakthroughs on various benchmarks. In response, numerous challenging and comprehensive benchmarks have been proposed to more accurately assess the capabilities of MLLMs. However, there is a dearth of exploration of the higher-order perceptual capabilities of MLLMs. To fill this gap,…

    Submitted 13 January, 2025; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: 100 pages, 82 figures, add citations

  6. arXiv:2310.09501  [pdf, other]

    cs.CL

    DepNeCTI: Dependency-based Nested Compound Type Identification for Sanskrit

    Authors: Jivnesh Sandhan, Yaswanth Narsupalli, Sreevatsa Muppirala, Sriram Krishnan, Pavankumar Satuluri, Amba Kulkarni, Pawan Goyal

    Abstract: Multi-component compounding is a prevalent phenomenon in Sanskrit, and understanding the implicit structure of a compound's components is crucial for deciphering its meaning. Earlier approaches in Sanskrit have focused on binary compounds and neglected the multi-component compound setting. This work introduces the novel task of nested compound type identification (NeCTI), which aims to identify ne…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: 9 Pages, Camera-ready version accepted at EMNLP23 (Findings)