Prompting or Fine-tuning? Exploring Large Language Models for Causal Graph Validation

Susanti, Yuni; Holsmoelle, Nina

arXiv:2406.16899 (cs)

[Submitted on 29 May 2024 (v1), last revised 9 Apr 2025 (this version, v2)]

Abstract: This study explores the capability of Large Language Models (LLMs) to evaluate causality in causal graphs generated by conventional statistical causal discovery methods, a task traditionally reliant on manual assessment by human subject matter experts. To bridge this gap in causality assessment, LLMs are employed to evaluate the causal relationships by determining whether a causal connection between variable pairs can be inferred from textual context. Our study compares two approaches: (1) a prompting-based method for zero-shot and few-shot causal inference, and (2) fine-tuning language models for the causal relation prediction task. While prompt-based LLMs have demonstrated versatility across various NLP tasks, our experiments on biomedical and general-domain datasets show that fine-tuned models consistently outperform them, achieving up to a 20.5-point improvement in F1 score, even when using smaller-parameter language models. These findings provide valuable insights into the strengths and limitations of both approaches for causal graph evaluation.
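
As a rough illustration of the setup the abstract describes, the sketch below builds a yes/no prompt asking whether one variable causes another given a textual context, and computes the F1 score used to compare approaches. It is not the authors' implementation: the prompt wording, the names build_prompt and f1_score, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical few-shot example: (context, variable A, variable B, answer).
Example = Tuple[str, str, str, str]


def build_prompt(context: str, var_a: str, var_b: str,
                 examples: Sequence[Example] = ()) -> str:
    """Build a yes/no causal-relation prompt (assumed wording).

    With no examples this is a zero-shot prompt; passing examples
    prepends them as solved demonstrations (few-shot).
    """
    parts: List[str] = []
    for ex_context, ex_a, ex_b, ex_answer in examples:
        parts.append(
            f"Context: {ex_context}\n"
            f"Question: Does {ex_a} cause {ex_b}? Answer yes or no.\n"
            f"Answer: {ex_answer}\n"
        )
    # The final block is left open for the model to complete.
    parts.append(
        f"Context: {context}\n"
        f"Question: Does {var_a} cause {var_b}? Answer yes or no.\n"
        f"Answer:"
    )
    return "\n".join(parts)


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Binary F1 for the positive (causal) class; labels are 0 or 1."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correctly predicted
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up context and variable pair, for demonstration only.
    print(build_prompt("Smoking damages lung tissue.", "smoking", "lung damage"))
    print(f1_score([1, 0, 1, 1], [1, 1, 0, 1]))
```

The same f1_score function could score either a prompted model's yes/no answers or a fine-tuned classifier's predictions, which is what makes the two approaches directly comparable. It returns a value between 0 and 1; a "point" difference as reported in the abstract presumably refers to the score on a 0 to 100 scale.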

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2406.16899 [cs.CL]
	(or arXiv:2406.16899v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.16899
Journal reference:	Causal-NeSy @ ESWC 2025

Submission history

From: Yuni Susanti
[v1] Wed, 29 May 2024 09:06:18 UTC (197 KB)
[v2] Wed, 9 Apr 2025 04:44:48 UTC (130 KB)
