Skip to main content

Showing 1–7 of 7 results for author: Lipkin, B

Searching in archive cs. Search in all archives.
.
  1. arXiv:2504.13139  [pdf, other

    cs.CL cs.AI cs.LG

    Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo

    Authors: João Loula, Benjamin LeBrun, Li Du, Ben Lipkin, Clemente Pasti, Gabriel Grand, Tianyu Liu, Yahya Emara, Marjorie Freedman, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Alexander K. Lew, Tim Vieira, Timothy J. O'Donnell

    Abstract: A wide range of LM applications require generating text that conforms to syntactic or semantic constraints. Imposing such constraints can be naturally framed as probabilistic conditioning, but exact generation from the resulting distribution -- which can differ substantially from the LM's base distribution -- is generally intractable. In this work, we develop an architecture for controlled LM gene… ▽ More

    Submitted 18 April, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

    Comments: 34 pages, 4 figures

  2. arXiv:2504.05410  [pdf, other

    cs.CL cs.AI cs.LG

    Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling

    Authors: Benjamin Lipkin, Benjamin LeBrun, Jacob Hoover Vigly, João Loula, David R. MacIver, Li Du, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Timothy J. O'Donnell, Alexander K. Lew, Tim Vieira

    Abstract: The dominant approach to generating from language models subject to some constraint is locally constrained decoding (LCD), incrementally sampling tokens at each time step such that the constraint is never violated. Typically, this is achieved through token masking: looping over the vocabulary and excluding non-conforming tokens. There are two important problems with this approach. (i) Evaluating t… ▽ More

    Submitted 7 April, 2025; originally announced April 2025.

  3. arXiv:2409.17005  [pdf, other

    cs.AI cs.CL

    Models Can and Should Embrace the Communicative Nature of Human-Generated Math

    Authors: Sasha Boguraev, Ben Lipkin, Leonie Weissweiler, Kyle Mahowald

    Abstract: Math is constructed by people for people: just as natural language corpora reflect not just propositions but the communicative goals of language users, the math data that models are trained on reflects not just idealized mathematical entities but rich communicative intentions. While there are important advantages to treating math in a purely symbolic manner, we here hypothesize that there are bene… ▽ More

    Submitted 31 October, 2024; v1 submitted 25 September, 2024; originally announced September 2024.

  4. arXiv:2405.09605  [pdf, other

    cs.CL cs.AI cs.LG

    Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models

    Authors: Anna A. Ivanova, Aalok Sathe, Benjamin Lipkin, Unnathi Kumar, Setayesh Radkani, Thomas H. Clark, Carina Kauf, Jennifer Hu, R. T. Pramod, Gabriel Grand, Vivian Paulun, Maria Ryskina, Ekin Akyürek, Ethan Wilcox, Nafisa Rashid, Leshem Choshen, Roger Levy, Evelina Fedorenko, Joshua Tenenbaum, Jacob Andreas

    Abstract: The ability to build and leverage world models is essential for a general-purpose AI agent. Testing such capabilities is hard, in part because the building blocks of world models are ill-defined. We present Elements of World Knowledge (EWOK), a framework for evaluating world modeling in language models by testing their ability to use knowledge of a concept to match a target text with a plausible/i… ▽ More

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 21 pages (11 main), 7 figures. Authors Anna Ivanova, Aalok Sathe, Benjamin Lipkin contributed equally

  5. LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

    Authors: Theo X. Olausson, Alex Gu, Benjamin Lipkin, Cedegao E. Zhang, Armando Solar-Lezama, Joshua B. Tenenbaum, Roger Levy

    Abstract: Logical reasoning, i.e., deductively inferring the truth value of a conclusion from a set of premises, is an important task for artificial intelligence with wide potential impacts on science, mathematics, and society. While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing… ▽ More

    Submitted 14 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP Main 2023 (Outstanding Paper Award)

    Journal ref: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5153-5176, Singapore. Association for Computational Linguistics

  6. arXiv:2305.06161  [pdf, other

    cs.CL cs.AI cs.PL cs.SE

    StarCoder: may the source be with you!

    Authors: Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu , et al. (42 additional authors not shown)

    Abstract: The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large colle… ▽ More

    Submitted 13 December, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  7. arXiv:2305.01020  [pdf, other

    cs.CL cs.AI

    Evaluating statistical language models as pragmatic reasoners

    Authors: Benjamin Lipkin, Lionel Wong, Gabriel Grand, Joshua B Tenenbaum

    Abstract: The relationship between communicated language and intended meaning is often probabilistic and sensitive to context. Numerous strategies attempt to estimate such a mapping, often leveraging recursive Bayesian models of communication. In parallel, large language models (LLMs) have been increasingly applied to semantic parsing applications, tasked with inferring logical representations from natural… ▽ More

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: 8 pages, 4 figures, to appear in the Proceedings of the Annual Meeting of the Cognitive Science Society 2023