Showing 1–2 of 2 results for author: Boguraev, S

Search v0.5.6 released 2020-02-24

arXiv:2505.16002 [pdf, ps, other]

cs.CL cs.AI

Causal Interventions Reveal Shared Structure Across English Filler-Gap Constructions

Authors: Sasha Boguraev, Christopher Potts, Kyle Mahowald

Abstract: Language Models (LMs) have emerged as powerful sources of evidence for linguists seeking to develop theories of syntax. In this paper, we argue that causal interpretability methods, applied to LMs, can greatly enhance the value of such evidence by helping us characterize the abstract mechanisms that LMs learn to use. Our empirical focus is a set of English filler-gap dependency constructions (e.g., questions, relative clauses). Linguistic theories largely agree that these constructions share many properties. Using experiments based in Distributed Interchange Interventions, we show that LMs converge on similar abstract analyses of these constructions. These analyses also reveal previously overlooked factors -- relating to frequency, filler type, and surrounding context -- that could motivate changes to standard linguistic theory. Overall, these results suggest that mechanistic, internal analyses of LMs can push linguistic theory forward.

Submitted 29 September, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

Comments: 22 pages, 21 figures, 10 tables; EMNLP (Main) 2025

arXiv:2409.17005 [pdf, other]

cs.AI cs.CL

Models Can and Should Embrace the Communicative Nature of Human-Generated Math

Authors: Sasha Boguraev, Ben Lipkin, Leonie Weissweiler, Kyle Mahowald

Abstract: Math is constructed by people for people: just as natural language corpora reflect not just propositions but the communicative goals of language users, the math data that models are trained on reflects not just idealized mathematical entities but rich communicative intentions. While there are important advantages to treating math in a purely symbolic manner, we here hypothesize that there are benefits to treating math as situated linguistic communication and that language models are well suited for this goal, in ways that are not fully appreciated. We illustrate these points with two case studies. First, we ran an experiment in which we found that language models interpret the equals sign in a humanlike way -- generating systematically different word problems for the same underlying equation arranged in different ways. Second, we found that language models prefer proofs to be ordered in naturalistic ways, even though other orders would be logically equivalent. We advocate for AI systems that learn from and represent the communicative intentions latent in human-generated math.

Submitted 31 October, 2024; v1 submitted 25 September, 2024; originally announced September 2024.
