Opioid Named Entity Recognition (ONER-2025) from Reddit

Sidorov, Grigori; Ahmad, Muhammad; Ameer, Iqra; Usman, Muhammad; Batyrshin, Ildar

arXiv:2504.00027 (cs)

[Submitted on 28 Mar 2025 (v1), last revised 30 Apr 2025 (this version, v3)]

Abstract: The opioid overdose epidemic remains a critical public health crisis, particularly in the United States, leading to significant mortality and societal costs. Social media platforms like Reddit provide vast amounts of unstructured data that offer insights into public perceptions, discussions, and experiences related to opioid use. This study leverages Natural Language Processing (NLP), specifically Opioid Named Entity Recognition (ONER-2025), to extract actionable information from these platforms. Our research makes four key contributions. First, we created a unique, manually annotated dataset sourced from Reddit, where users share self-reported experiences of opioid use via different administration routes. This dataset contains 331,285 tokens and includes eight major opioid entity categories. Second, we detail our annotation process and guidelines while discussing the challenges of labeling the ONER-2025 dataset. Third, we analyze key linguistic challenges, including slang, ambiguity, fragmented sentences, and emotionally charged language, in opioid discussions. Fourth, we propose a real-time monitoring system to process streaming data from social media, healthcare records, and emergency services to identify overdose events. Using 5-fold cross-validation in 11 experiments, our system integrates machine learning, deep learning, and transformer-based language models with advanced contextual embeddings to enhance understanding. Our transformer-based models (bert-base-NER and roberta-base) achieved 97% accuracy and F1-score, outperforming baselines by 10.23% (RF=0.88).
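
The 10.23% figure in the abstract appears to be a relative improvement over the RF score of 0.88 rather than a difference in percentage points. That reading is an assumption inferred from the two numbers quoted, not something the abstract states. A minimal Python sketch of the arithmetic, where the function name relative_improvement is hypothetical:

```python
def relative_improvement(score: float, baseline: float) -> float:
    """Return the gain of `score` over `baseline` as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (score - baseline) / baseline


# Both values are taken from the abstract; treating the 10.23% as a
# relative gain over RF=0.88 is an assumption, not a stated method.
transformer_f1 = 0.97
rf_baseline = 0.88

gain_percent = relative_improvement(transformer_f1, rf_baseline) * 100
print(f"{gain_percent:.2f}%")  # prints "10.23%"
```

Under this reading the absolute gap is 9 percentage points (0.97 minus 0.88), and dividing that gap by the baseline gives the quoted 10.23%.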

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2504.00027 [cs.CL]
	(or arXiv:2504.00027v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.00027

Submission history

From: Muhammad Ahmad
[v1] Fri, 28 Mar 2025 20:51:06 UTC (944 KB)
[v2] Sat, 5 Apr 2025 04:25:58 UTC (1,175 KB)
[v3] Wed, 30 Apr 2025 21:34:50 UTC (945 KB)
