Pub-Guard-LLM: Detecting Fraudulent Biomedical Articles with Reliable Explanations

Chen, Lihu; Fu, Shuojie; Freedman, Gabriel; Zor, Cemre; Martin, Guy; Kinross, James; Vaghela, Uddhav; Serban, Ovidiu; Toni, Francesca

arXiv:2502.15429 (cs)

[Submitted on 21 Feb 2025 (v1), last revised 8 Apr 2025 (this version, v4)]

Abstract: A significant and growing number of published scientific articles is found to involve fraudulent practices, posing a serious threat to the credibility and safety of research in fields such as medicine. We propose Pub-Guard-LLM, the first large language model-based system tailored to fraud detection of biomedical scientific articles. We provide three application modes for deploying Pub-Guard-LLM: vanilla reasoning, retrieval-augmented generation, and multi-agent debate. Each mode allows for textual explanations of predictions. To assess the performance of our system, we introduce an open-source benchmark, PubMed Retraction, comprising over 11K real-world biomedical articles, including metadata and retraction labels. We show that, across all modes, Pub-Guard-LLM consistently surpasses the performance of various baselines and provides more reliable explanations, namely explanations which are deemed more relevant and coherent than those generated by the baselines when evaluated by multiple assessment methods. By enhancing both detection performance and explainability in scientific fraud detection, Pub-Guard-LLM contributes to safeguarding research integrity with a novel, effective, open-source tool.

Comments: long paper under review
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2502.15429 [cs.CL] (or arXiv:2502.15429v4 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2502.15429

Submission history

From: Lihu Chen
[v1] Fri, 21 Feb 2025 12:54:56 UTC (332 KB)
[v2] Tue, 25 Feb 2025 22:41:04 UTC (332 KB)
[v3] Fri, 4 Apr 2025 15:21:03 UTC (332 KB)
[v4] Tue, 8 Apr 2025 10:27:59 UTC (332 KB)
