A Linguistic Comparison between Human and ChatGPT-Generated Conversations

Sandler, Morgan; Choung, Hyesun; Ross, Arun; David, Prabu

arXiv:2401.16587 (cs)

[Submitted on 29 Jan 2024 (v1), last revised 26 Apr 2024 (this version, v3)]

Abstract: This study explores linguistic differences between human and LLM-generated dialogues, using 19.5K dialogues generated by ChatGPT-3.5 as a companion to the EmpatheticDialogues dataset. The research employs Linguistic Inquiry and Word Count (LIWC) analysis, comparing ChatGPT-generated conversations with human conversations across 118 linguistic categories. Results show greater variability and authenticity in human dialogues, but ChatGPT excels in categories such as social processes, analytical style, cognition, attentional focus, and positive emotional tone, reinforcing recent findings of LLMs being "more human than human." However, no significant difference was found in positive or negative affect between ChatGPT and human dialogues. Classifier analysis of dialogue embeddings indicates implicit coding of the valence of affect despite no explicit mention of affect in the conversations. The research also contributes a novel, companion ChatGPT-generated dataset of conversations between two independent chatbots, which were designed to replicate a corpus of human conversations available for open access and used widely in AI research on language modeling. Our findings enhance understanding of ChatGPT's linguistic capabilities and inform ongoing efforts to distinguish between human and LLM-generated text, which is critical in detecting AI-generated fakes, misinformation, and disinformation.

Comments:	Proceedings of the 4th International Conference on Pattern Recognition and Artificial Intelligence (ICPRAI), Jeju, Korea, 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Cite as:	arXiv:2401.16587 [cs.CL]
	(or arXiv:2401.16587v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.16587

Submission history

From: Morgan Sandler
[v1] Mon, 29 Jan 2024 21:43:27 UTC (7,173 KB)
[v2] Fri, 2 Feb 2024 16:47:16 UTC (7,173 KB)
[v3] Fri, 26 Apr 2024 01:16:35 UTC (7,176 KB)
