Showing 1–5 of 5 results for author: Wen-Yi, A W

Searching in archive cs.
  1. arXiv:2505.14174

    cs.CL cs.LG

    Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning

    Authors: Yusuf Denizay Dönder, Derek Hommel, Andrea W Wen-Yi, David Mimno, Unso Eun Seo Jo

    Abstract: LLMs are effective at code generation tasks like text-to-SQL, but is it worth the cost? Many state-of-the-art approaches use non-task-specific LLM techniques including Chain-of-Thought (CoT), self-consistency, and fine-tuning. These methods can be costly at inference time, sometimes requiring over a hundred LLM calls with reasoning, incurring average costs of up to …

    Submitted 20 May, 2025; originally announced May 2025.

  2. arXiv:2504.00289

    cs.CL cs.AI cs.CY

    Do Chinese models speak Chinese languages?

    Authors: Andrea W Wen-Yi, Unso Eun Seo Jo, David Mimno

    Abstract: The release of top-performing open-weight LLMs has cemented China's role as a leading force in AI development. Do these models support languages spoken in China? Or do they speak the same languages as Western models? Comparing multilingual capabilities is important for two reasons. First, language ability provides insights into pre-training data curation, and thus into resource allocation and deve…

    Submitted 7 April, 2025; v1 submitted 31 March, 2025; originally announced April 2025.

    Comments: First and second author contribute equally

  3. arXiv:2407.12500

    cs.CL

    Automate or Assist? The Role of Computational Models in Identifying Gendered Discourse in US Capital Trial Transcripts

    Authors: Andrea W Wen-Yi, Kathryn Adamson, Nathalie Greenfield, Rachel Goldberg, Sandra Babcock, David Mimno, Allison Koenecke

    Abstract: The language used by US courtroom actors in criminal trials has long been studied for biases. However, systematic studies for bias in high-stakes court trials have been difficult, due to the nuanced nature of bias and the legal expertise required. Large language models offer the possibility to automate annotation. But validating the computational approach requires both an understanding of how auto…

    Submitted 26 July, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

    Journal ref: Published in AIES 2024

  4. arXiv:2407.09652

    cs.CL

    How Chinese are Chinese Language Models? The Puzzling Lack of Language Policy in China's LLMs

    Authors: Andrea W Wen-Yi, Unso Eun Seo Jo, Lu Jia Lin, David Mimno

    Abstract: Contemporary language models are increasingly multilingual, but Chinese LLM developers must navigate complex political and business considerations of language diversity. Language policy in China aims at influencing the public discourse and governing a multi-ethnic society, and has gradually transitioned from a pluralist to a more assimilationist approach since 1949. We explore the impact of these…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Wen-Yi and Jo contributed equally to this work

  5. Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings

    Authors: Andrea W Wen-Yi, David Mimno

    Abstract: Cross-lingual transfer learning is an important property of multilingual large language models (LLMs). But how do LLMs represent relationships between languages? Every language model has an input layer that maps tokens to vectors. This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of…

    Submitted 29 November, 2023; originally announced November 2023.

    Journal ref: Published in EMNLP 2023