Failure Modes of LLMs for Causal Reasoning on Narratives

Yamin, Khurram; Gupta, Shantanu; Ghosal, Gaurav R.; Lipton, Zachary C.; Wilder, Bryan

arXiv:2410.23884 (cs)

[Submitted on 31 Oct 2024 (v1), last revised 15 Jun 2025 (this version, v5)]

Abstract: The ability to robustly identify causal relationships is essential for autonomous decision-making and adaptation to novel scenarios. However, accurately inferring causal structure requires integrating both world knowledge and abstract logical reasoning. In this work, we investigate the interaction between these two capabilities through the representative task of causal reasoning over narratives. Through controlled synthetic, semi-synthetic, and real-world experiments, we find that state-of-the-art large language models (LLMs) often rely on superficial heuristics -- for example, inferring causality from event order or recalling memorized world knowledge without attending to context. Furthermore, we show that simple reformulations of the task can elicit more robust reasoning behavior. Our evaluation spans a range of causal structures, from linear chains to complex graphs involving colliders and forks. These findings uncover systematic patterns in how LLMs perform causal reasoning and lay the groundwork for developing methods that better align LLM behavior with principled causal inference.
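
As a rough illustration of the causal structures named in the abstract (chain, fork, collider) and of the event-order shortcut it describes, the following is a minimal sketch in Python. It is not the authors' code or evaluation setup: the toy events, the graphs, and the helper names `descendants`, `is_cause`, and `order_heuristic` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def descendants(edges, node):
    """Return every node reachable from `node` along directed cause -> effect edges."""
    children = defaultdict(set)
    for cause, effect in edges:
        children[cause].add(effect)
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def is_cause(edges, a, b):
    """Ground truth in the toy graph: a is a (direct or indirect) cause of b."""
    return b in descendants(edges, a)


def order_heuristic(narrative_order, a, b):
    """Superficial shortcut: guess that a causes b whenever a is mentioned first."""
    return narrative_order.index(a) < narrative_order.index(b)


# Hypothetical toy graphs, one per structure (not taken from the paper).
chain = [("rain", "wet_road"), ("wet_road", "accident")]   # A -> B -> C
fork = [("storm", "power_outage"), ("storm", "flooding")]  # B <- A -> C
collider = [("talent", "fame"), ("luck", "fame")]          # A -> C <- B

# A narrative that mentions the two effects of the fork before their common cause.
fork_narrative = ["power_outage", "flooding", "storm"]

truth = is_cause(fork, "power_outage", "flooding")
guess = order_heuristic(fork_narrative, "power_outage", "flooding")
print(f"fork: true cause? {truth}; order heuristic says {guess}")
```

In this sketch the order heuristic asserts a causal link between the two effects of the fork merely because one is mentioned before the other, while graph reachability shows there is none. The same check on `collider` shows that two independent causes of a shared effect do not cause each other.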

Comments:	ICML 2025 Workshop on Scaling up Intervention Models
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as:	arXiv:2410.23884 [cs.LG]
	(or arXiv:2410.23884v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.23884

Submission history

From: Khurram Yamin
[v1] Thu, 31 Oct 2024 12:48:58 UTC (3,343 KB)
[v2] Tue, 24 Dec 2024 23:07:13 UTC (3,448 KB)
[v3] Wed, 4 Jun 2025 18:26:30 UTC (253 KB)
[v4] Wed, 11 Jun 2025 22:07:36 UTC (247 KB)
[v5] Sun, 15 Jun 2025 00:36:18 UTC (247 KB)
