Showing 1–6 of 6 results for author: Gunjal, A

  1. arXiv:2507.17746

    cs.LG cs.AI cs.CL

    Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains

    Authors: Anisha Gunjal, Anthony Wang, Elaine Lau, Vaskar Nath, Bing Liu, Sean Hendryx

    Abstract: Extending Reinforcement Learning with Verifiable Rewards (RLVR) to real-world tasks often requires balancing objective and subjective evaluation criteria. However, many such tasks lack a single, unambiguous ground truth, making it difficult to define reliable reward signals for post-training language models. While traditional preference-based methods offer a workaround, they rely on opaque reward f…

    Submitted 23 July, 2025; originally announced July 2025.

  2. arXiv:2506.13923

    cs.LG cs.AI cs.CL

    Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models

    Authors: Vaskar Nath, Elaine Lau, Anisha Gunjal, Manasi Sharma, Nikhil Baharte, Sean Hendryx

    Abstract: We study the process through which reasoning models trained with reinforcement learning on verifiable rewards (RLVR) can learn to solve new problems. We find that RLVR drives performance in two main ways: (1) by compressing pass@$k$ into pass@1 and (2) via "capability gain" in which models learn to solve new problems that they previously could not solve even at high $k$. We find that while capabil…

    Submitted 19 June, 2025; v1 submitted 16 June, 2025; originally announced June 2025.

  3. Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification

    Authors: Anisha Gunjal, Greg Durrett

    Abstract: Automatic factuality verification of large language model (LLM) generations is becoming more and more widely used to combat hallucinations. A major point of tension in the literature is the granularity of this fact-checking: larger chunks of text are hard to fact-check, but more atomic facts like propositions may lack context to interpret correctly. In this work, we assess the role of context in t…

    Submitted 28 June, 2024; originally announced June 2024.

    Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2024 (2024) 3751-3768

  4. arXiv:2308.06394

    cs.CV cs.LG

    Detecting and Preventing Hallucinations in Large Vision Language Models

    Authors: Anisha Gunjal, Jihan Yin, Erhan Bas

    Abstract: Instruction-tuned Large Vision Language Models (LVLMs) have significantly advanced in generalizing across a diverse set of multi-modal tasks, especially for Visual Question Answering (VQA). However, generating detailed responses that are visually grounded is still a challenging task for these models. We find that even the current state-of-the-art LVLMs (InstructBLIP) still contain a staggering 30…

    Submitted 11 February, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: AAAI 2024

  5. arXiv:2305.14847

    cs.CL

    Drafting Event Schemas using Language Models

    Authors: Anisha Gunjal, Greg Durrett

    Abstract: Past work has studied event prediction and event language modeling, sometimes mediated through structured representations of knowledge in the form of event schemas. Such schemas can lead to explainable predictions and forecasting of unseen events given incomplete information. In this work, we look at the process of creating such schemas to describe complex events. We use large language models (LLM…

    Submitted 24 May, 2023; originally announced May 2023.

  6. arXiv:2204.11827

    cs.LG cs.AI cs.RO

    Task-Induced Representation Learning

    Authors: Jun Yamada, Karl Pertsch, Anisha Gunjal, Joseph J. Lim

    Abstract: In this work, we evaluate the effectiveness of representation learning approaches for decision making in visually complex environments. Representation learning is essential for effective reinforcement learning (RL) from high-dimensional inputs. Unsupervised representation learning approaches based on reconstruction, prediction or contrastive learning have shown substantial learning efficiency gain…

    Submitted 25 April, 2022; originally announced April 2022.

    Comments: International Conference on Learning Representations (ICLR), 2022