
Showing 1–5 of 5 results for author: Scivetti, W

  1. arXiv:2506.04408

    cs.CL cs.AI

    Unpacking Let Alone: Human-Scale Models Generalize to a Rare Construction in Form but not Meaning

    Authors: Wesley Scivetti, Tatsuya Aoyama, Ethan Wilcox, Nathan Schneider

    Abstract: Humans have a remarkable ability to acquire and understand grammatical phenomena that are seen rarely, if ever, during childhood. Recent evidence suggests that language models with human-scale pretraining data may possess a similar ability by generalizing from frequent to rare constructions. However, it remains an open question how widespread this generalization ability is, and to what extent this…

    Submitted 4 June, 2025; originally announced June 2025.

  2. arXiv:2503.18751

    cs.CL cs.AI

    Construction Identification and Disambiguation Using BERT: A Case Study of NPN

    Authors: Wesley Scivetti, Nathan Schneider

    Abstract: Construction Grammar hypothesizes that knowledge of a language consists chiefly of knowledge of form-meaning pairs ("constructions") that include vocabulary, general grammar rules, and even idiosyncratic patterns. Recent work has shown that transformer language models represent at least some constructional patterns, including ones where the construction is rare overall. In this work, we probe BE…

    Submitted 24 March, 2025; originally announced March 2025.

    Comments: 8 pages, ACL long-paper format (preprint)

  3. arXiv:2501.04661

    cs.CL cs.AI

    Assessing Language Comprehension in Large Language Models Using Construction Grammar

    Authors: Wesley Scivetti, Melissa Torgbi, Austin Blodgett, Mollie Shichman, Taylor Hudson, Claire Bonial, Harish Tayyar Madabushi

    Abstract: Large Language Models, despite their significant capabilities, are known to fail in surprising and unpredictable ways. Evaluating their true 'understanding' of language is particularly challenging due to the extensive web-scale data they are trained on. Therefore, we construct an evaluation to systematically assess natural language understanding (NLU) in LLMs by leveraging Construction Grammar (Cx…

    Submitted 8 January, 2025; originally announced January 2025.

  4. arXiv:2411.00491

    cs.CL

    GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains

    Authors: Yang Janet Liu, Tatsuya Aoyama, Wesley Scivetti, Yilun Zhu, Shabnam Behzad, Lauren Elizabeth Levine, Jessica Lin, Devika Tiwari, Amir Zeldes

    Abstract: Work on shallow discourse parsing in English has focused on the Wall Street Journal corpus, the only large-scale dataset for the language in the PDTB framework. However, the data is not openly available, is restricted to the news domain, and is by now 35 years old. In this paper, we present and evaluate a new open-access, multi-genre benchmark for PDTB-style shallow discourse parsing, based on the…

    Submitted 1 November, 2024; originally announced November 2024.

    Comments: Accepted to EMNLP 2024 (main, long); camera-ready version

  5. arXiv:2403.17748

    cs.CL

    UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies

    Authors: Leonie Weissweiler, Nina Böbel, Kirian Guiller, Santiago Herrera, Wesley Scivetti, Arthur Lorenzi, Nurit Melnik, Archna Bhatia, Hinrich Schütze, Lori Levin, Amir Zeldes, Joakim Nivre, William Croft, Nathan Schneider

    Abstract: The Universal Dependencies (UD) project has created an invaluable collection of treebanks with contributions in over 140 languages. However, the UD annotations do not tell the full story. Grammatical constructions that convey meaning through a particular combination of several morphosyntactic elements -- for example, interrogative sentences with special markers and/or word orders -- are not labele…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024