Skip to main content

Showing 1–8 of 8 results for author: Schoenick, C

.
  1. arXiv:2504.07096  [pdf, other

    cs.CL

    OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

    Authors: Jiacheng Liu, Taylor Blanton, Yanai Elazar, Sewon Min, YenSung Chen, Arnavi Chheda-Kothary, Huy Tran, Byron Bischoff, Eric Marsh, Michael Schmitz, Cassidy Trier, Aaron Sarnat, Jenna James, Jon Borchardt, Bailey Kuehl, Evie Cheng, Karen Farley, Sruthi Sreeram, Taira Anderson, David Albright, Carissa Schoenick, Luca Soldaini, Dirk Groeneveld, Rock Yuren Pang, Pang Wei Koh , et al. (6 additional authors not shown)

    Abstract: We present OLMoTrace, the first system that traces the outputs of language models back to their full, multi-trillion-token training data in real time. OLMoTrace finds and shows verbatim matches between segments of language model output and documents in the training text corpora. Powered by an extended version of infini-gram (Liu et al., 2024), our system returns tracing results within a few second… ▽ More

    Submitted 9 April, 2025; originally announced April 2025.

    Comments: Under submission at ACL 2025 demo track

  2. arXiv:2409.17146  [pdf, other

    cs.CV cs.CL cs.LG

    Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

    Authors: Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, YenSung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou , et al. (25 additional authors not shown)

    Abstract: Today's most advanced vision-language models (VLMs) remain proprietary. The strongest open-weight models rely heavily on synthetic data from proprietary VLMs to achieve good performance, effectively distilling these closed VLMs into open ones. As a result, the community has been missing foundational knowledge about how to build performant VLMs from scratch. We present Molmo, a new family of VLMs t… ▽ More

    Submitted 5 December, 2024; v1 submitted 25 September, 2024; originally announced September 2024.

    Comments: Updated with ablations and more technical details

  3. arXiv:2112.00800  [pdf, other

    cs.CL cs.AI

    Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text

    Authors: Christopher Clark, Jordi Salvador, Dustin Schwenk, Derrick Bonafilia, Mark Yatskar, Eric Kolve, Alvaro Herrasti, Jonghyun Choi, Sachin Mehta, Sam Skjonsberg, Carissa Schoenick, Aaron Sarnat, Hannaneh Hajishirzi, Aniruddha Kembhavi, Oren Etzioni, Ali Farhadi

    Abstract: Communicating with humans is challenging for AIs because it requires a shared understanding of the world, complex semantics (e.g., metaphors or analogies), and at times multi-modal gestures (e.g., pointing with a finger, or an arrow in a diagram). We investigate these challenges in the context of Iconary, a collaborative game of drawing and guessing based on Pictionary, that poses a novel challeng… ▽ More

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: In EMNLP 2021

  4. arXiv:2102.03315  [pdf, other

    cs.CL cs.AI

    Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge

    Authors: Sumithra Bhakthavatsalam, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Peter Clark

    Abstract: We present the ARC-DA dataset, a direct-answer ("open response", "freeform") version of the ARC (AI2 Reasoning Challenge) multiple-choice dataset. While ARC has been influential in the community, its multiple-choice format is unrepresentative of real-world questions, and multiple choice formats can be particularly susceptible to artifacts. The ARC-DA dataset addresses these concerns by converting… ▽ More

    Submitted 5 February, 2021; originally announced February 2021.

  5. arXiv:1909.08135  [pdf, other

    cs.CL

    SUPP.AI: Finding Evidence for Supplement-Drug Interactions

    Authors: Lucy Lu Wang, Oyvind Tafjord, Arman Cohan, Sarthak Jain, Sam Skjonsberg, Carissa Schoenick, Nick Botner, Waleed Ammar

    Abstract: Dietary supplements are used by a large portion of the population, but information on their pharmacologic interactions is incomplete. To address this challenge, we present SUPP.AI, an application for browsing evidence of supplement-drug interactions (SDIs) extracted from the biomedical literature. We train a model to automatically extract supplement information and identify such interactions from… ▽ More

    Submitted 6 July, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: ACL Demo 2020

  6. arXiv:1909.01958  [pdf, other

    cs.CL cs.AI

    From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

    Authors: Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz

    Abstract: AI has achieved remarkable mastery over games such as Chess, Go, and Poker, and even Jeopardy, but the rich variety of standardized exams has remained a landmark challenge. Even in 2016, the best AI system achieved merely 59.3% on an 8th Grade science exam challenge. This paper reports unprecedented success on the Grade 8 New York Regents Science Exam, where for the first time a system scores more… ▽ More

    Submitted 1 February, 2021; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: AI Magazine 41 (4) Winter 2020. New analysis sections added

  7. arXiv:1803.05457  [pdf, other

    cs.AI cs.CL cs.IR

    Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

    Authors: Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord

    Abstract: We present a new question set, text corpus, and baselines assembled to encourage AI research in advanced question answering. Together, these constitute the AI2 Reasoning Challenge (ARC), which requires far more powerful knowledge and reasoning than previous challenges such as SQuAD or SNLI. The ARC question set is partitioned into a Challenge Set and an Easy Set, where the Challenge Set contains o… ▽ More

    Submitted 14 March, 2018; originally announced March 2018.

    Comments: 10 pages, 7 tables, 2 figures

  8. arXiv:1604.04315  [pdf

    cs.AI

    Moving Beyond the Turing Test with the Allen AI Science Challenge

    Authors: Carissa Schoenick, Peter Clark, Oyvind Tafjord, Peter Turney, Oren Etzioni

    Abstract: Given recent successes in AI (e.g., AlphaGo's victory against Lee Sedol in the game of GO), it's become increasingly important to assess: how close are AI systems to human-level intelligence? This paper describes the Allen AI Science Challenge---an approach towards that goal which led to a unique Kaggle Competition, its results, the lessons learned, and our next steps.

    Submitted 22 February, 2017; v1 submitted 14 April, 2016; originally announced April 2016.

    Comments: 7 pages