Showing 1–6 of 6 results for author: Padthe, K

Searching in archive cs.
  1. arXiv:2507.01921  [pdf, ps, other]

    cs.CL

    NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks

    Authors: Yang Li, Youssef Emad, Karthik Padthe, Jack Lanchantin, Weizhe Yuan, Thao Nguyen, Jason Weston, Shang-Wen Li, Dong Wang, Ilia Kulikov, Xian Li

    Abstract: Recent work has shown that distilling reasoning traces from a larger teacher model via supervised finetuning outperforms reinforcement learning with the smaller student model alone (Guo et al. 2025). However, there has not been a systematic study of what kind of reasoning demonstrations from the teacher are most effective in improving the student model's reasoning capabilities. In this work we cur…

    Submitted 2 July, 2025; originally announced July 2025.

  2. arXiv:2502.13124  [pdf, ps, other]

    cs.CL

    NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions

    Authors: Weizhe Yuan, Jane Yu, Song Jiang, Karthik Padthe, Yang Li, Ilia Kulikov, Kyunghyun Cho, Dong Wang, Yuandong Tian, Jason E Weston, Xian Li

    Abstract: Scaling reasoning capabilities beyond traditional domains such as math and coding is hindered by the lack of diverse and high-quality questions. To overcome this limitation, we introduce a scalable approach for generating diverse and challenging reasoning questions, accompanied by reference answers. We present NaturalReasoning, a comprehensive dataset comprising 2.8 million questions that span mul…

    Submitted 14 June, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

    Comments: Dataset at https://huggingface.co/datasets/facebook/natural_reasoning

  3. arXiv:2412.18069  [pdf, ps, other]

    cs.CL

    Improving Factuality with Explicit Working Memory

    Authors: Mingda Chen, Yang Li, Karthik Padthe, Rulin Shao, Alicia Sun, Luke Zettlemoyer, Gargi Ghosh, Wen-tau Yih

    Abstract: Large language models can generate factually inaccurate content, a problem known as hallucination. Recent works have built upon retrieval-augmented generation to improve factuality through iterative prompting, but these methods are limited by the traditional RAG design. To address these challenges, we introduce EWE (Explicit Working Memory), a novel approach that enhances factuality in long-form te…

    Submitted 2 June, 2025; v1 submitted 23 December, 2024; originally announced December 2024.

    Comments: ACL 2025 Camera Ready

  4. arXiv:2405.17247  [pdf, other]

    cs.LG

    An Introduction to Vision-Language Modeling

    Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie, et al. (16 additional authors not shown)

    Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technol…

    Submitted 27 May, 2024; originally announced May 2024.

  5. arXiv:2405.01582  [pdf, other]

    cs.CL cs.AI cs.LG

    Text Quality-Based Pruning for Efficient Training of Language Models

    Authors: Vasu Sharma, Karthik Padthe, Newsha Ardalani, Kushal Tirumala, Russell Howes, Hu Xu, Po-Yao Huang, Shang-Wen Li, Armen Aghajanyan, Gargi Ghosh, Luke Zettlemoyer

    Abstract: In recent times, training Language Models (LMs) has relied on computationally heavy training over massive datasets, which makes this training process extremely laborious. In this paper we propose a novel method for numerically evaluating text quality in large unlabelled NLP datasets in a model-agnostic manner to assign the text instances a "quality score". By proposing the text quality metric, th…

    Submitted 10 May, 2024; v1 submitted 26 April, 2024; originally announced May 2024.

  6. arXiv:2102.03672  [pdf, other]

    cs.LG

    Emergency Department Optimization and Load Prediction in Hospitals

    Authors: Karthik K. Padthe, Vikas Kumar, Carly M. Eckert, Nicholas M. Mark, Anam Zahid, Muhammad Aurangzeb Ahmad, Ankur Teredesai

    Abstract: Over the past several years, across the globe, there has been an increase in people seeking care in emergency departments (EDs). ED resources, including nurse staffing, are strained by such increases in patient volume. Accurate forecasting of incoming patient volume in emergency departments (ED) is crucial for efficient utilization and allocation of ED resources. Working with a suburban ED in the…

    Submitted 6 February, 2021; originally announced February 2021.

    Comments: 7 pages, 3 figures, 4 tables