Muted: Multilingual Targeted Offensive Speech Identification and Visualization

Tillmann, Christoph; Trivedi, Aashka; Rosenthal, Sara; Borse, Santosh; Zhang, Rong; Sil, Avirup; Bhattacharjee, Bishwaranjan

arXiv:2312.11344 (cs)

[Submitted on 18 Dec 2023]

Abstract: Offensive language such as hate, abuse, and profanity (HAP) occurs in various content on the web. While previous work has mostly dealt with sentence-level annotations, there have been a few recent attempts to identify offensive spans as well. We build upon this work and introduce Muted, a system to identify multilingual HAP content by displaying offensive arguments and their targets using heat maps to indicate their intensity. Muted can leverage any transformer-based HAP-classification model and its attention mechanism out-of-the-box to identify toxic spans, without further fine-tuning. In addition, we use the spaCy library to identify the specific targets and arguments for the words predicted by the attention heat maps. We present the model's performance on identifying offensive spans and their targets in existing datasets and present new annotations on German text. Finally, we demonstrate our proposed visualization tool on multilingual inputs.
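The abstract describes turning per-token attention intensities into highlighted offensive spans. The sketch below illustrates one simple way such a step could work; it is not the authors' implementation. The function names (`normalize`, `extract_spans`), the threshold value, and the attention scores are all hypothetical. In a real system the scores would come from the attention weights of a transformer-based HAP classifier rather than being written by hand.

```python
from typing import List, Tuple


def normalize(scores: List[float]) -> List[float]:
    """Min-max scale scores to [0, 1], the range a heat map would color."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All tokens equally attended: nothing stands out.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def extract_spans(
    tokens: List[str], scores: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Group consecutive tokens whose normalized score meets the threshold.

    Returns half-open (start, end) token index pairs.
    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    spans: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(normalize(scores)):
        if value >= threshold:
            if start is None:
                start = i  # open a new span
        elif start is not None:
            spans.append((start, i))  # close the current span
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


# Made-up example: hand-written scores standing in for model attention.
tokens = ["you", "are", "a", "complete", "idiot", "honestly"]
attention = [0.10, 0.05, 0.05, 0.40, 0.90, 0.10]
for start, end in extract_spans(tokens, attention):
    print(" ".join(tokens[start:end]))
```

Lowering the threshold widens the highlighted region: with `threshold=0.4` the span in this example grows to cover "complete idiot". Identifying the target of a span (here, "you") would be a separate step, which the abstract attributes to spaCy.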

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2312.11344 [cs.CL]
	(or arXiv:2312.11344v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2312.11344
Journal reference:	EMNLP 2023 Demo Track

Submission history

From: Christoph Tillmann
[v1] Mon, 18 Dec 2023 16:50:27 UTC (1,359 KB)
