CiteFix: Enhancing RAG Accuracy Through Post-Processing Citation Correction

Maheshwari, Harsh; Tenneti, Srikanth; Nakkiran, Alwarappan

arXiv:2504.15629 (cs)

[Submitted on 22 Apr 2025 (v1), last revised 11 Jun 2025 (this version, v2)]

Abstract: Retrieval Augmented Generation (RAG) has emerged as a powerful application of Large Language Models (LLMs), revolutionizing information search and consumption. RAG systems combine traditional search capabilities with LLMs to generate comprehensive answers to user queries, ideally with accurate citations. However, in our experience of developing a RAG product, LLMs often struggle with source attribution, which aligns with other industry studies reporting citation accuracy rates of only about 74% for popular generative search engines. To address this, we present efficient post-processing algorithms to improve citation accuracy in LLM-generated responses, with minimal impact on latency and cost. Our approaches cross-check generated citations against retrieved articles using methods including keyword + semantic matching, a fine-tuned model with BERTScore, and a lightweight LLM-based technique. Our experimental results demonstrate a relative improvement of 15.46% in the overall accuracy metrics of our RAG system. This significant enhancement potentially enables a shift from our current larger language model to a relatively smaller model that is approximately 12x more cost-effective and 3x faster in inference time, while maintaining comparable performance. This research contributes to enhancing the reliability and trustworthiness of AI-generated content in information retrieval and summarization tasks, which is critical for gaining customer trust, especially in commercial products.
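
The idea of cross-checking generated citations against retrieved articles can be illustrated with a minimal sketch. This is not the authors' algorithm: it covers only a plain keyword-overlap score (no semantic matching, BERTScore, or LLM step), and the names `tokenize`, `keyword_overlap`, `correct_citations`, the stopword list, and the sample passages are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "is", "was", "in", "of", "and", "to"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword alphanumeric tokens."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def keyword_overlap(sentence, passage):
    """Fraction of the sentence's keywords that also appear in the passage."""
    sentence_tokens = tokenize(sentence)
    if not sentence_tokens:
        return 0.0
    return len(sentence_tokens & tokenize(passage)) / len(sentence_tokens)


def correct_citations(sentences, passages):
    """Reassign each sentence's citations to the best-matching passages.

    sentences: list of (text, cited_ids) pairs as produced by the LLM.
    passages:  dict mapping passage id -> retrieved passage text.
    The number of citations per sentence is kept unchanged; only the
    cited ids are replaced by the highest-overlap passage ids.
    """
    corrected = []
    for text, cited_ids in sentences:
        ranked = sorted(
            passages,
            key=lambda pid: keyword_overlap(text, passages[pid]),
            reverse=True,
        )
        corrected.append((text, ranked[: len(cited_ids)]))
    return corrected


# Made-up example data: the LLM has swapped the two citations.
passages = {
    1: "The Eiffel Tower is located in Paris and was completed in 1889.",
    2: "Mount Everest is the highest mountain above sea level.",
}
sentences = [
    ("The Eiffel Tower was completed in 1889.", [2]),
    ("Everest is the highest mountain.", [1]),
]
fixed = correct_citations(sentences, passages)
```

Because the correction runs after generation and involves no additional model call, a step of this kind adds little latency, which is the property the abstract emphasizes for post-processing approaches.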

Subjects:	Information Retrieval (cs.IR); Computation and Language (cs.CL)
Cite as:	arXiv:2504.15629 [cs.IR]
	(or arXiv:2504.15629v2 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2504.15629

Submission history

From: Harsh Maheshwari
[v1] Tue, 22 Apr 2025 06:41:25 UTC (8,850 KB)
[v2] Wed, 11 Jun 2025 07:56:14 UTC (7,448 KB)
