Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI
Authors:
Yuxia Wang,
Rui Xing,
Jonibek Mansurov,
Giovanni Puccetti,
Zhuohan Xie,
Minh Ngoc Ta,
Jiahui Geng,
Jinyan Su,
Mervat Abassy,
Saad El Dine Ahmed,
Kareem Elozeiri,
Nurkhan Laiyk,
Maiya Goloburda,
Tarek Mahmoud,
Raj Vardhan Tomar,
Alexander Aziz,
Ryuto Koike,
Masahiro Kaneko,
Artem Shelmanov,
Ekaterina Artemova,
Vladislav Mikhailov,
Akim Tsvigun,
Alham Fikri Aji,
Nizar Habash,
Iryna Gurevych
, et al. (1 additional author not shown)
Abstract:
Prior studies have shown that distinguishing text generated by large language models (LLMs) from human-written text is highly challenging, and often no better than random guessing. To verify the generalizability of this finding across languages and domains, we perform an extensive case study to identify the upper bound of human detection accuracy. Across 16 datasets covering 9 languages and 9 domains, 19 annotators achieved an average detection accuracy of 87.6%, thus challenging previous conclusions. We find that the major gaps between human and machine text lie in concreteness, cultural nuances, and diversity. Explicitly explaining these distinctions in the prompt can partially bridge the gaps in over 50% of the cases. However, we also find that humans do not always prefer human-written text, particularly when they cannot clearly identify its source.
Submitted 23 May, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
Authors:
Mervat Abassy,
Kareem Elozeiri,
Alexander Aziz,
Minh Ngoc Ta,
Raj Vardhan Tomar,
Bimarsha Adhikari,
Saad El Dine Ahmed,
Yuxia Wang,
Osama Mohammed Afzal,
Zhuohan Xie,
Jonibek Mansurov,
Ekaterina Artemova,
Vladislav Mikhailov,
Rui Xing,
Jiahui Geng,
Hasan Iqbal,
Zain Muhammad Mujahid,
Tarek Mahmoud,
Akim Tsvigun,
Alham Fikri Aji,
Artem Shelmanov,
Nizar Habash,
Iryna Gurevych,
Preslav Nakov
Abstract:
The ease of access to large language models (LLMs) has enabled the widespread proliferation of machine-generated texts, and it is now often hard to tell whether a piece of text was human-written or machine-generated. This raises concerns about potential misuse, particularly within educational and academic domains. Thus, it is important to develop practical systems that can automate the detection process. Here, we present one such system, LLM-DetectAIve, designed for fine-grained detection. Unlike most previous work on machine-generated text detection, which focused on binary classification, LLM-DetectAIve supports four categories: (i) human-written, (ii) machine-generated, (iii) machine-written, then machine-humanized, and (iv) human-written, then machine-polished. Category (iii) aims to detect attempts to obfuscate the fact that a text was machine-generated, while category (iv) looks for cases where an LLM was used to polish a human-written text, which is typically acceptable in academic writing, but not in education. Our experiments show that LLM-DetectAIve can effectively identify the above four categories, which makes it a potentially useful tool in education, academia, and other domains.
LLM-DetectAIve is publicly accessible at https://github.com/mbzuai-nlp/LLM-DetectAIve. The video describing our system is available at https://youtu.be/E8eT_bE7k8c.
Submitted 14 March, 2025; v1 submitted 8 August, 2024;
originally announced August 2024.