Showing 1–19 of 19 results for author: Ethayarajh, K

  1. arXiv:2408.02919

    cs.CL

    Data Checklist: On Unit-Testing Datasets with Usable Information

    Authors: Heidi C. Zhang, Shabnam Behzad, Kawin Ethayarajh, Dan Jurafsky

    Abstract: Model checklists (Ribeiro et al., 2020) have emerged as a useful tool for understanding the behavior of LLMs, analogous to unit-testing in software engineering. However, despite datasets being a key determinant of model behavior, evaluating datasets, e.g., for the existence of annotation artifacts, is largely done ad hoc, once a problem in model behavior has already been found downstream. In this…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: 17 pages, 4 figures. COLM 2024

  2. arXiv:2402.01306

    cs.LG cs.AI

    KTO: Model Alignment as Prospect Theoretic Optimization

    Authors: Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela

    Abstract: Kahneman & Tversky's $\textit{prospect theory}$ tells us that humans perceive random variables in a biased but well-defined manner (1992); for example, humans are famously loss-averse. We show that objectives for aligning LLMs with human feedback implicitly incorporate many of these biases -- the success of these objectives (e.g., DPO) over cross-entropy minimization can partly be ascribed to them…

    Submitted 19 November, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  3. arXiv:2309.08638

    cs.CL

    Anchor Points: Benchmarking Models with Much Fewer Examples

    Authors: Rajan Vivek, Kawin Ethayarajh, Diyi Yang, Douwe Kiela

    Abstract: Modern language models often exhibit powerful but brittle behavior, leading to the development of larger and more diverse benchmarks to reliably assess their behavior. Here, we suggest that model performance can be benchmarked and elucidated with much smaller evaluation sets. We first show that in six popular language classification benchmarks, model confidence in the correct class on many pairs o…

    Submitted 18 February, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted to EACL 2024 Main Conference. Code will be released at: https://github.com/rvivek3/AnchorPoints

  4. arXiv:2205.11930

    cs.CL cs.AI cs.LG

    The Authenticity Gap in Human Evaluation

    Authors: Kawin Ethayarajh, Dan Jurafsky

    Abstract: Human ratings are the gold standard in NLG evaluation. The standard protocol is to collect ratings of generated text, average across annotators, and rank NLG systems by their average scores. However, little consideration has been given as to whether this approach faithfully captures human preferences. Analyzing this standard protocol through the lens of utility theory in economics, we identify the…

    Submitted 2 November, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022

  5. arXiv:2205.05093

    cs.CL cs.AI

    Richer Countries and Richer Representations

    Authors: Kaitlyn Zhou, Kawin Ethayarajh, Dan Jurafsky

    Abstract: We examine whether some countries are more richly represented in embedding space than others. We find that countries whose names occur with low frequency in training corpora are more likely to be tokenized into subwords, are less semantically distinct in embedding space, and are less likely to be correctly predicted: e.g., Ghana (the correct answer and in-vocabulary) is not predicted for, "The cou…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Camera Ready for ACL 2022 (Findings)

  6. arXiv:2205.05092

    cs.CL cs.AI

    Problems with Cosine as a Measure of Embedding Similarity for High Frequency Words

    Authors: Kaitlyn Zhou, Kawin Ethayarajh, Dallas Card, Dan Jurafsky

    Abstract: Cosine similarity of contextual embeddings is used in many NLP tasks (e.g., QA, IR, MT) and metrics (e.g., BERTScore). Here, we uncover systematic ways in which word similarities estimated by cosine over BERT embeddings are understated and trace this effect to training data frequency. We find that relative to human judgements, cosine similarity underestimates the similarity of frequent words with…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Camera Ready for ACL 2022 (Main Conference)

  7. arXiv:2110.08420

    cs.CL cs.AI cs.LG

    Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information

    Authors: Kawin Ethayarajh, Yejin Choi, Swabha Swayamdipta

    Abstract: Estimating the difficulty of a dataset typically involves comparing state-of-the-art models to humans; the bigger the performance gap, the harder the dataset is said to be. However, this comparison provides little understanding of how difficult each instance in a given distribution is, or what attributes make the dataset difficult for a given model. To address these questions, we frame dataset dif…

    Submitted 26 April, 2025; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ICML 2022 (Outstanding Paper)

  8. arXiv:2109.09234

    cs.CL

    Conditional probing: measuring usable information beyond a baseline

    Authors: John Hewitt, Kawin Ethayarajh, Percy Liang, Christopher D. Manning

    Abstract: Probing experiments investigate the extent to which neural representations make properties -- like part-of-speech -- predictable. One suggests that a representation encodes a property if probing that representation produces higher accuracy than probing a baseline representation like non-contextual word embeddings. Instead of using baselines as a point of comparison, we're interested in measuring i…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 + typo fixes

  9. arXiv:2108.07258

    cs.LG cs.AI cs.CY

    On the Opportunities and Risks of Foundation Models

    Authors: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, et al. (89 additional authors not shown)

    Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their cap…

    Submitted 12 July, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html

  10. arXiv:2106.06052

    cs.CL cs.AI

    Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking

    Authors: Zhiyi Ma, Kawin Ethayarajh, Tristan Thrush, Somya Jain, Ledell Wu, Robin Jia, Christopher Potts, Adina Williams, Douwe Kiela

    Abstract: We introduce Dynaboard, an evaluation-as-a-service framework for hosting benchmarks and conducting holistic model comparison, integrated with the Dynabench platform. Our platform evaluates NLP models directly instead of relying on self-reported metrics or predictions on a single dataset. Under this paradigm, models are submitted to be evaluated in the cloud, circumventing the issues of reproducibi…

    Submitted 20 May, 2021; originally announced June 2021.

  11. arXiv:2105.14652

    cs.CL

    Attention Flows are Shapley Value Explanations

    Authors: Kawin Ethayarajh, Dan Jurafsky

    Abstract: Shapley Values, a solution to the credit assignment problem in cooperative game theory, are a popular type of explanation in machine learning, having been used to explain the importance of features, embeddings, and even neurons. In NLP, however, leave-one-out and attention-based explanations still predominate. Can we draw a connection between these different methods? We formally prove that -- save…

    Submitted 30 May, 2021; originally announced May 2021.

    Comments: ACL 2021

  12. arXiv:2104.08465

    cs.CL

    Frequency-based Distortions in Contextualized Word Embeddings

    Authors: Kaitlyn Zhou, Kawin Ethayarajh, Dan Jurafsky

    Abstract: How does word frequency in pre-training data affect the behavior of similarity metrics in contextualized BERT embeddings? Are there systematic ways in which some word relationships are exaggerated or understated? In this work, we explore the geometric characteristics of contextualized word embeddings with two novel tools: (1) an identity probe that predicts the identity of a word using its embeddi…

    Submitted 17 April, 2021; originally announced April 2021.

  13. arXiv:2009.13888

    cs.CL cs.AI cs.LG

    Utility is in the Eye of the User: A Critique of NLP Leaderboards

    Authors: Kawin Ethayarajh, Dan Jurafsky

    Abstract: Benchmarks such as GLUE have helped drive advances in NLP by incentivizing the creation of more accurate models. While this leaderboard paradigm has been remarkably successful, a historical focus on performance-based evaluation has been at the expense of other qualities that the NLP community values in models, such as compactness, fairness, and energy efficiency. In this opinion paper, we study th…

    Submitted 3 March, 2021; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: EMNLP 2020 (updated with additional references)

  14. arXiv:2004.12726

    cs.CL cs.LG

    BLEU Neighbors: A Reference-less Approach to Automatic Evaluation

    Authors: Kawin Ethayarajh, Dorsa Sadigh

    Abstract: Evaluation is a bottleneck in the development of natural language generation (NLG) models. Automatic metrics such as BLEU rely on references, but for tasks such as open-ended generation, there are no references to draw upon. Although language diversity can be estimated using statistical measures such as perplexity, measuring language quality requires human evaluation. However, because human evalua…

    Submitted 12 October, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

  15. arXiv:2004.12332

    cs.CL cs.LG

    Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds

    Authors: Kawin Ethayarajh

    Abstract: Most NLP datasets are not annotated with protected attributes such as gender, making it difficult to measure classification bias using standard measures of fairness (e.g., equal opportunity). However, manually annotating a large dataset with a protected attribute is slow and expensive. Instead of annotating all the examples, can we annotate a subset of them and use that sample to estimate the bias…

    Submitted 26 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  16. arXiv:1909.00512

    cs.CL

    How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings

    Authors: Kawin Ethayarajh

    Abstract: Replacing static word embeddings with contextualized word representations has yielded significant improvements on many NLP tasks. However, just how contextual are the contextualized representations produced by models such as ELMo and BERT? Are there infinitely many context-specific representations for each word, or are words essentially assigned one of a finite number of word-sense representations…

    Submitted 1 September, 2019; originally announced September 2019.

    Comments: Accepted to EMNLP 2019

  17. arXiv:1909.00504

    cs.CL

    Rotate King to get Queen: Word Relationships as Orthogonal Transformations in Embedding Space

    Authors: Kawin Ethayarajh

    Abstract: A notable property of word embeddings is that word relationships can exist as linear substructures in the embedding space. For example, $\textit{gender}$ corresponds to $\vec{\textit{woman}} - \vec{\textit{man}}$ and $\vec{\textit{queen}} - \vec{\textit{king}}$. This, in turn, allows word analogies to be solved arithmetically:…

    Submitted 5 September, 2019; v1 submitted 1 September, 2019; originally announced September 2019.

    Comments: Accepted to EMNLP 2019

  18. arXiv:1908.06361

    cs.CL

    Understanding Undesirable Word Embedding Associations

    Authors: Kawin Ethayarajh, David Duvenaud, Graeme Hirst

    Abstract: Word embeddings are often criticized for capturing undesirable word associations such as gender stereotypes. However, methods for measuring and removing such biases remain poorly understood. We show that for any embedding model that implicitly does matrix factorization, debiasing vectors post hoc using subspace projection (Bolukbasi et al., 2016) is, under certain conditions, equivalent to trainin…

    Submitted 17 August, 2019; originally announced August 2019.

    Comments: Accepted to ACL 2019

  19. arXiv:1810.04882

    cs.CL

    Towards Understanding Linear Word Analogies

    Authors: Kawin Ethayarajh, David Duvenaud, Graeme Hirst

    Abstract: A surprising property of word vectors is that word analogies can often be solved with vector arithmetic. However, it is unclear why arithmetic operators correspond to non-linear embedding models such as skip-gram with negative sampling (SGNS). We provide a formal explanation of this phenomenon without making the strong assumptions that past theories have made about the vector space and word distri…

    Submitted 12 August, 2019; v1 submitted 11 October, 2018; originally announced October 2018.

    Comments: Accepted to ACL 2019