Exploration and Evaluation of Bias in Cyberbullying Detection with Machine Learning

Root, Andrew; Jakubowski, Liam; Vanamala, Mounika

arXiv:2412.00609 (cs)

[Submitted on 30 Nov 2024]

Abstract: A machine learning model is useful only to the extent that it generalizes to unseen data. This study uses three popular cyberbullying datasets to explore how the data, the way it is collected, and the way it is labeled affect the resulting machine learning models. The bias introduced by differing definitions of cyberbullying and by data collection is discussed in detail. Particular emphasis is placed on the impact of dataset expansion methods, which use existing data points to fetch and label new ones. Furthermore, cross-dataset evaluation is performed to explicitly test the ability of a model to generalize to unseen datasets. As hypothesized, the models' Macro F1 Scores drop significantly under cross-dataset evaluation, by 0.222 on average. This study therefore highlights the importance of dataset curation and cross-dataset testing for creating models with real-world applicability. The experiments and other code can be found at this https URL.
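
The cross-dataset comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `macro_f1` and `average_cross_dataset_drop`, the nested-dictionary layout of the scores, and the choice to average the drop over every train/test pair are assumptions made here for illustration. The abstract does not state how the 0.222 average was computed.

```python
from statistics import mean


def macro_f1(y_true, y_pred):
    """Macro F1: the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return mean(per_class)


def average_cross_dataset_drop(scores):
    """Mean drop from in-dataset to cross-dataset Macro F1.

    `scores[train][test]` holds the Macro F1 of a model trained on dataset
    `train` and evaluated on dataset `test` (hypothetical layout). The drop
    for each off-diagonal pair is measured against the score the same model
    reaches on its own dataset.
    """
    drops = [
        row[train] - f1
        for train, row in scores.items()
        for test, f1 in row.items()
        if test != train
    ]
    return mean(drops)
```

Macro F1 weights every class equally, so a model that ignores the minority (bullying) class is penalized even when its overall accuracy is high, which is presumably why it suits imbalanced cyberbullying data.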

Comments:	8 pages, 6 figures
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2412.00609 [cs.LG]
	(or arXiv:2412.00609v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2412.00609

Submission history

From: Andrew Root
[v1] Sat, 30 Nov 2024 23:18:49 UTC (563 KB)
