The Shrinking Landscape of Linguistic Diversity in the Age of Large Language Models

Sourati, Zhivar; Karimi-Malekabadi, Farzan; Ozcan, Meltem; McDaniel, Colin; Ziabari, Alireza; Trager, Jackson; Tak, Ala; Chen, Meng; Morstatter, Fred; Dehghani, Morteza

arXiv:2502.11266 (cs)

[Submitted on 16 Feb 2025]

Abstract: Language is far more than a communication tool. A wealth of information - including but not limited to the identities, psychological states, and social contexts of its users - can be gleaned through linguistic markers, and such insights are routinely leveraged across diverse fields ranging from product development and marketing to healthcare. In four studies utilizing experimental and observational methods, we demonstrate that the widespread adoption of large language models (LLMs) as writing assistants is linked to notable declines in linguistic diversity and may interfere with the societal and psychological insights language provides. We show that while the core content of texts is retained when LLMs polish and rewrite texts, not only do they homogenize writing styles, but they also alter stylistic elements in a way that selectively amplifies certain dominant characteristics or biases while suppressing others - emphasizing conformity over individuality. By varying LLMs, prompts, classifiers, and contexts, we show that these trends are robust and consistent. Our findings highlight a wide array of risks associated with linguistic homogenization, including compromised diagnostic processes and personalization efforts, the exacerbation of existing divides and barriers to equity in settings like personnel selection where language plays a critical role in assessing candidates' qualifications, communication skills, and cultural fit, and the undermining of efforts for cultural preservation.

Comments:	arXiv admin note: text overlap with arXiv:2404.00267
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2502.11266 [cs.CL]
	(or arXiv:2502.11266v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.11266

Submission history

From: Zhivar Sourati
[v1] Sun, 16 Feb 2025 20:51:07 UTC (19,082 KB)
