Showing 1–3 of 3 results for author: McMilin, E

Searching in archive cs.
  1. arXiv:2210.00131  [pdf, other]

    cs.CL cs.AI

    Underspecification in Language Modeling Tasks: A Causality-Informed Study of Gendered Pronoun Resolution

    Authors: Emily McMilin

    Abstract: Modern language modeling tasks are often underspecified: for a given token prediction, many words may satisfy the user's intent of producing natural language at inference time, however only one word will minimize the task's loss function at training time. We introduce a simple causal mechanism to describe the role underspecification plays in the generation of spurious correlations. Despite its sim…

    Submitted 22 February, 2024; v1 submitted 30 September, 2022; originally announced October 2022.

    Comments: 24 pages, 41 figures

  2. arXiv:2208.10063  [pdf, other]

    cs.CL cs.AI

    Selection Collider Bias in Large Language Models

    Authors: Emily McMilin

    Abstract: In this paper we motivate the causal mechanisms behind sample selection induced collider bias (selection collider bias) that can cause Large Language Models (LLMs) to learn unconditional dependence between entities that are unconditionally independent in the real world. We show that selection collider bias can become amplified in underspecified learning tasks, and although difficult to overcome, w…

    Submitted 13 September, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: 12 pages, 16 figures, UAI 2022 Causal Representation Learning Workshop

  3. arXiv:2207.08982  [pdf, other]

    cs.CL cs.AI

    Selection Bias Induced Spurious Correlations in Large Language Models

    Authors: Emily McMilin

    Abstract: In this work we show how large language models (LLMs) can learn statistical dependencies between otherwise unconditionally independent variables due to dataset selection bias. To demonstrate the effect, we developed a masked gender task that can be applied to BERT-family models to reveal spurious correlations between predicted gender pronouns and a variety of seemingly gender-neutral variables lik…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 8 pages, 5 figures, Published at the ICML 2022 Workshop on Spurious Correlations, Invariance, and Stability