Machine Generation and Detection of Arabic Manipulated and Fake News

Nagoudi, El Moatez Billah; Elmadany, AbdelRahim; Abdul-Mageed, Muhammad; Alhindi, Tariq; Cavusoglu, Hasan

arXiv:2011.03092 (cs)

[Submitted on 5 Nov 2020]

Abstract: Fake news and deceptive machine-generated text are serious problems threatening modern societies, including in the Arab world. This motivates work on detecting false and manipulated stories online. However, a bottleneck for this research is the lack of sufficient data to train detection models. We present a novel method for automatically generating Arabic manipulated (and potentially fake) news stories. Our method is simple and depends only on the availability of true stories, which are abundant online, and a part-of-speech (POS) tagger. To facilitate future work, we dispense with both of these requirements altogether by providing AraNews, a novel and large POS-tagged news dataset that can be used off-the-shelf. Using stories generated from AraNews, we carry out a human annotation study that casts light on the effects of machine manipulation on text veracity. The study also measures human ability to detect Arabic machine-manipulated text generated by our method. Finally, we develop the first models for detecting manipulated Arabic news and achieve state-of-the-art results on Arabic fake news detection (macro F1 = 70.06). Our models and data are publicly available.
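For readers unfamiliar with the metric reported above, macro F1 is the unweighted mean of the per-class F1 scores, so each class counts equally regardless of its size. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `macro_f1` and the toy labels `"true"` and `"fake"` are assumptions made purely for illustration.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not data from the paper.
gold = ["true", "true", "fake", "fake"]
pred = ["true", "fake", "fake", "fake"]
score = macro_f1(gold, pred)
print(f"macro F1 = {100 * score:.2f}")
```

The abstract reports the score on a 0 to 100 scale, which is why the sketch multiplies by 100 when printing.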

Comments:	10 pages, accepted in The Fifth Arabic Natural Language Processing Workshop (WANLP 2020)
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2011.03092 [cs.CL]
	(or arXiv:2011.03092v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2011.03092

Submission history

From: El Moatez Billah Nagoudi
[v1] Thu, 5 Nov 2020 20:50:22 UTC (537 KB)
