A MapReduce Approach to Effectively Utilize Long Context Information in Retrieval Augmented Language Models

Zhang, Gongbo; Xu, Zihan; Jin, Qiao; Chen, Fangyi; Fang, Yilu; Liu, Yi; Rousseau, Justin F.; Xu, Ziyang; Lu, Zhiyong; Weng, Chunhua; Peng, Yifan

arXiv:2412.15271 (cs)

[Submitted on 17 Dec 2024]

Abstract: While holding great promise for improving and facilitating healthcare, large language models (LLMs) struggle to produce up-to-date responses on evolving topics due to outdated knowledge or hallucination. Retrieval-augmented generation (RAG) is a pivotal innovation that improves the accuracy and relevance of LLM responses by integrating LLMs with a search engine and external sources of knowledge. However, the quality of RAG responses can be strongly affected by the rank and density of key information in the retrieval results, as in the "lost-in-the-middle" problem. In this work, we aim to improve the robustness and reliability of the RAG workflow in the medical domain. Specifically, we propose a map-reduce strategy, BriefContext, to combat the "lost-in-the-middle" issue without modifying the model weights. We demonstrate the advantage of the workflow with various LLM backbones and on multiple QA datasets. This method promises to improve the safety and reliability of LLMs deployed in healthcare domains.
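
The abstract names a map-reduce strategy but does not spell out its steps. As a rough illustration only, a generic map-reduce pattern over retrieved passages might look like the sketch below. It is not the BriefContext implementation: the function names, the prompts, the `group_size` parameter, and the `NONE` sentinel are all assumptions made for this example, and `llm` stands for any callable that maps a prompt string to a reply string.

```python
from typing import Callable, List


def partition(passages: List[str], size: int) -> List[List[str]]:
    """Split ranked passages into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [passages[i:i + size] for i in range(0, len(passages), size)]


def map_reduce_answer(
    question: str,
    passages: List[str],
    llm: Callable[[str], str],  # hypothetical: prompt in, reply out
    group_size: int = 3,        # assumed value, not taken from the paper
) -> str:
    """Answer `question` by querying short contexts, then merging the replies."""
    # Map step: each call sees only a short context, so no passage sits
    # deep in the middle of a long prompt.
    partials = []
    for group in partition(passages, group_size):
        context = "\n".join(group)
        prompt = (
            f"Context:\n{context}\n\n"
            f"Question: {question}\n"
            "Answer from the context only, or reply NONE."
        )
        reply = llm(prompt).strip()
        if reply and reply != "NONE":
            partials.append(reply)

    if not partials:
        return "NONE"

    # Reduce step: merge the partial answers with one more call.
    candidates = "\n".join(f"- {p}" for p in partials)
    prompt = (
        f"Candidate answers:\n{candidates}\n\n"
        f"Question: {question}\n"
        "Give one final answer."
    )
    return llm(prompt).strip()
```

With this layout the number of model calls grows with the number of groups (one per group plus one for the merge), which is the usual cost of trading a single long prompt for several short ones.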

Subjects:	Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as:	arXiv:2412.15271 [cs.CL]
	(or arXiv:2412.15271v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.15271

Submission history

From: Gongbo Zhang
[v1] Tue, 17 Dec 2024 11:18:14 UTC (237 KB)
