Showing 1–5 of 5 results for author: Strobl, L

  1. arXiv:2503.22076  [pdf, other]

    cs.LG

    Concise One-Layer Transformers Can Do Function Evaluation (Sometimes)

    Authors: Lena Strobl, Dana Angluin, Robert Frank

    Abstract: While transformers have proven enormously successful in a range of tasks, their fundamental properties as models of computation are not well understood. This paper contributes to the study of the expressive capacity of transformers, focusing on their ability to perform the fundamental computational task of evaluating an arbitrary function from $[n]$ to $[n]$ at a given argument. We prove that conc…

    Submitted 27 March, 2025; originally announced March 2025.

  2. arXiv:2412.09925  [pdf, other]

    cs.LG cs.CL cs.FL

    Simulating Hard Attention Using Soft Attention

    Authors: Andy Yang, Lena Strobl, David Chiang, Dana Angluin

    Abstract: We study conditions under which transformers using soft attention can simulate hard attention, that is, effectively focus all attention on a subset of positions. First, we examine several variants of linear temporal logic, whose formulas have previously been shown to be computable using hard attention transformers. We demonstrate how soft attention transformers can compute formulas of these l…

    Submitted 13 December, 2024; originally announced December 2024.

  3. arXiv:2404.02040  [pdf, other]

    cs.FL cs.LG

    Transformers as Transducers

    Authors: Lena Strobl, Dana Angluin, David Chiang, Jonathan Rawski, Ashish Sabharwal

    Abstract: We study the sequence-to-sequence mapping capacity of transformers by relating them to finite transducers, and find that they can express surprisingly large classes of transductions. We do so using variants of RASP, a programming language designed to help people "think like transformers," as an intermediate representation. We extend the existing Boolean variant B-RASP to sequence-to-sequence funct…

    Submitted 5 November, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: To appear in Transactions of the Association for Computational Linguistics

  4. arXiv:2311.00208  [pdf, other]

    cs.LG cs.CL cs.FL cs.LO

    What Formal Languages Can Transformers Express? A Survey

    Authors: Lena Strobl, William Merrill, Gail Weiss, David Chiang, Dana Angluin

    Abstract: As transformers have gained prominence in natural language processing, some researchers have investigated theoretically what problems they can and cannot solve, by treating problems as formal languages. Exploring such questions can help clarify the power of transformers relative to other models of computation, their fundamental capabilities and limits, and the impact of architectural choices. Work…

    Submitted 4 September, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

    Comments: One minor correction in §5.1

    Journal ref: Transactions of the Association for Computational Linguistics, 12:543-561, 2024

  5. arXiv:2308.03212  [pdf, other]

    cs.CL cs.CC cs.LG

    Average-Hard Attention Transformers are Constant-Depth Uniform Threshold Circuits

    Authors: Lena Strobl

    Abstract: Transformers have emerged as a widely used neural network model for various natural language processing tasks. Previous research explored their relationship with constant-depth threshold circuits, making two assumptions: average-hard attention and logarithmic precision for internal computations relative to input length. Merrill et al. (2022) prove that average-hard attention transformers recognize…

    Submitted 21 August, 2023; v1 submitted 6 August, 2023; originally announced August 2023.