Confabulations from ACL Publications (CAP): A Dataset for Scientific Hallucination Detection

Gamba, Federica; Sinha, Aman; Mickus, Timothee; Vazquez, Raul; Bhamidipati, Patanjali; Savelli, Claudio; Chattopadhyay, Ahana; Zanella, Laura A.; Kankanampati, Yash; Remesh, Binesh Arakkal; Chandramania, Aryan Ashok; Agarwal, Rohit; Li, Chuyuan; Buhnila, Ioana; Mamidi, Radhika

arXiv:2510.22395 (cs)

[Submitted on 25 Oct 2025]

Abstract: We introduce CAP (Confabulations from ACL Publications), a multilingual dataset for studying hallucinations in large language models (LLMs) within scientific text generation. CAP focuses on the scientific domain, where hallucinations can, and frequently do, distort factual knowledge. In this domain, specialized terminology, statistical reasoning, and context-dependent interpretations further exacerbate these distortions, particularly given LLMs' lack of true comprehension, limited contextual understanding, and bias toward surface-level generalization. CAP operates in a cross-lingual setting covering five high-resource languages (English, French, Hindi, Italian, and Spanish) and four low-resource languages (Bengali, Gujarati, Malayalam, and Telugu). The dataset comprises 900 curated scientific questions and over 7,000 LLM-generated answers from 16 publicly available models, provided as question-answer pairs along with token sequences and corresponding logits. Each instance carries two annotations: a binary label indicating whether it contains a scientific hallucination (a factuality error), and a fluency label capturing issues in the linguistic quality or naturalness of the text. CAP is publicly released to facilitate research on hallucination detection, multilingual evaluation of LLMs, and the development of more reliable scientific NLP systems.
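
To make the instance layout described in the abstract concrete, the following is a minimal sketch of how one annotated answer might be represented and summarized. Every field name (`lang`, `tokens`, `logits`, `is_hallucination`, `is_fluent`), the choice of one logit per generated token, the treatment of the fluency label as binary, and the toy records are assumptions made for illustration. None of them is taken from the released files, whose schema may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CapInstance:
    # Hypothetical schema: the released CAP files may name or shape these differently.
    lang: str                # language code, e.g. "en" or "te"
    question: str            # curated scientific question
    answer: str              # LLM-generated answer
    tokens: List[str]        # generated token sequence
    logits: List[float]      # assumed: one logit per generated token
    is_hallucination: bool   # binary factuality-error label
    is_fluent: bool          # fluency label, assumed binary here


def hallucination_rate_by_language(instances: List[CapInstance]) -> Dict[str, float]:
    """Fraction of answers labeled as hallucinated, grouped by language."""
    totals: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for inst in instances:
        totals[inst.lang] += 1
        flagged[inst.lang] += int(inst.is_hallucination)
    return {lang: flagged[lang] / totals[lang] for lang in totals}


def mean_logit(inst: CapInstance) -> float:
    """Average token logit, a naive confidence feature (0.0 for empty answers)."""
    return sum(inst.logits) / len(inst.logits) if inst.logits else 0.0


# Toy records invented for this sketch; they are not drawn from CAP.
toy = [
    CapInstance("en", "q1", "a1", ["a", "b"], [2.0, 4.0], False, True),
    CapInstance("en", "q2", "a2", ["c"], [1.0], True, True),
    CapInstance("te", "q3", "a3", ["d"], [0.5], True, False),
]
rates = hallucination_rate_by_language(toy)
```

A per-language breakdown of this kind is one simple way to compare high-resource and low-resource languages, and a logit-derived feature such as `mean_logit` could serve as a baseline signal for a hallucination detector.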

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2510.22395 [cs.CL]
	(or arXiv:2510.22395v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2510.22395

Submission history

From: Aman Sinha
[v1] Sat, 25 Oct 2025 18:42:22 UTC (788 KB)
