Model Misalignment and Language Change: Traces of AI-Associated Language in Unscripted Spoken English

Anderson, Bryce; Galpin, Riley; Juzek, Tom S.

arXiv:2508.00238 (cs)

[Submitted on 1 Aug 2025]

Abstract: In recent years, written language, particularly in science and education, has undergone remarkable shifts in word usage. These changes are widely attributed to the growing influence of Large Language Models (LLMs), which frequently rely on a distinct lexical style. Divergences between model output and target audience norms can be viewed as a form of misalignment. While these shifts are often linked to using Artificial Intelligence (AI) directly as a tool to generate text, it remains unclear whether the changes reflect broader changes in the human language system itself. To explore this question, we constructed a dataset of 22.1 million words from unscripted spoken language drawn from conversational science and technology podcasts. We analyzed lexical trends before and after ChatGPT's release in 2022, focusing on commonly LLM-associated words. Our results show a moderate yet significant increase in the usage of these words post-2022, suggesting a convergence between human word choices and LLM-associated patterns. In contrast, baseline synonym words exhibit no significant directional shift. Given the short time frame and the number of words affected, this may indicate the onset of a remarkable shift in language use. Whether this represents natural language change or a novel shift driven by AI exposure remains an open question. Similarly, although the shifts may stem from broader adoption patterns, it may also be that upstream training misalignments ultimately contribute to changes in human language use. These findings parallel ethical concerns that misaligned models may shape social and moral beliefs.
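
The before/after comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the tokenizer, the function names (`tokenize`, `per_million`, `relative_change`), and the target word used in the example are all assumptions chosen for illustration, and the abstract does not state which words or statistical tests were used. The sketch only shows how a word's rate per million tokens might be compared across a pre-2022 and a post-2022 transcript collection.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def per_million(tokens, targets):
    """Return occurrences per million tokens for each target word."""
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in targets}
    counts = Counter(tokens)
    return {w: counts[w] * 1_000_000 / total for w in targets}


def relative_change(pre_tokens, post_tokens, targets):
    """Relative change in rate from the pre period to the post period.

    Returns None for a word that never occurs in the pre period,
    since the relative change is undefined in that case.
    """
    pre = per_million(pre_tokens, targets)
    post = per_million(post_tokens, targets)
    return {
        w: (post[w] - pre[w]) / pre[w] if pre[w] > 0 else None
        for w in targets
    }
```

In use, `pre_tokens` and `post_tokens` would come from transcripts dated before and after the ChatGPT release, and the same function would be run on a list of baseline synonyms so the two groups can be contrasted. A real analysis would also need a significance test, which this sketch omits.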

Comments:	Accepted at AIES 2025. To appear in the AIES Proceedings. 14 pages, 2 figures, 2 tables. Licensed under CC BY-SA 4.0
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
MSC classes:	68T50
ACM classes:	I.2; I.2.7
Cite as:	arXiv:2508.00238 [cs.CL]
	(or arXiv:2508.00238v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2508.00238

Submission history

From: Bryce Anderson
[v1] Fri, 1 Aug 2025 00:47:33 UTC (275 KB)
