Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness

Hillebrand, Lars; Pradhan, Prabhupad; Bauckhage, Christian; Sifa, Rafet

arXiv:2406.04156 (cs)

[Submitted on 6 Jun 2024]

Abstract: We introduce "pointer-guided segment ordering" (SO), a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations in large language models. Our methodology leverages a self-attention-driven pointer network to restore the original sequence of shuffled text segments, addressing the challenge of capturing the structural coherence and contextual dependencies within documents. This pre-training approach is complemented by a fine-tuning methodology that incorporates dynamic sampling, augmenting the diversity of training instances and improving sample efficiency for various downstream applications. We evaluate our method on a diverse set of datasets, demonstrating its efficacy in tasks requiring sequential text classification across scientific literature and financial reporting domains. Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures, leading to state-of-the-art performance in downstream classification tasks.
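
As a rough illustration of the segment-ordering idea in the abstract, the sketch below builds one training example: it shuffles a list of text segments and records, for each original position, which shuffled slot a pointer would have to select to restore the document. This is only a sketch under stated assumptions. The function names, the label convention, and the use of a fixed seed are hypothetical, and the abstract does not specify how the paper segments text, encodes targets, or trains the pointer network.

```python
import random


def make_segment_ordering_example(segments, seed=None):
    """Shuffle segments and return (shuffled, targets).

    Assumed label convention (not taken from the paper):
    targets[i] is the index in `shuffled` of the segment that
    originally stood at position i.
    """
    rng = random.Random(seed)
    order = list(range(len(segments)))
    rng.shuffle(order)  # order[slot] = original position of the segment now at `slot`
    shuffled = [segments[original_pos] for original_pos in order]

    targets = [0] * len(segments)
    for slot, original_pos in enumerate(order):
        targets[original_pos] = slot
    return shuffled, targets


def restore(shuffled, pointers):
    """Follow the pointers in order to rebuild the original sequence."""
    return [shuffled[p] for p in pointers]


if __name__ == "__main__":
    paragraphs = ["Introduction.", "Method.", "Experiments.", "Conclusion."]
    shuffled, targets = make_segment_ordering_example(paragraphs, seed=0)
    print(shuffled)
    print(targets)
    print(restore(shuffled, targets) == paragraphs)
```

In a setup like this, a model would see only `shuffled` and be trained to predict `targets`. Following perfect predictions with `restore` returns the segments in their original order.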

Comments:	17 pages, 3 figures, 5 tables, accepted at ECML-PKDD 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2406.04156 [cs.CL]
	(or arXiv:2406.04156v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.04156

Submission history

From: Lars Hillebrand
[v1] Thu, 6 Jun 2024 15:17:51 UTC (597 KB)
