Showing 1–1 of 1 results for author: Lal, V

Search v0.5.6 released 2020-02-24

arXiv:2412.07446

cs.AI cs.CL cs.LG stat.ML

A Causal World Model Underlying Next Token Prediction: Exploring GPT in a Controlled Environment

Authors: Raanan Y. Rohekar, Yaniv Gurwicz, Sungduk Yu, Estelle Aflalo, Vasudev Lal

Abstract: Are generative pre-trained transformer (GPT) models, trained only to predict the next token, implicitly learning a world model from which sequences are generated one token at a time? We address this question by deriving a causal interpretation of the attention mechanism in GPT and presenting a causal world model that arises from this interpretation. Furthermore, we propose that GPT models, at inference time, can be utilized for zero-shot causal structure learning for input sequences, and introduce a corresponding confidence score. Empirical tests were conducted in controlled environments using the setups of the Othello and Chess strategy games. A GPT, pre-trained on real-world games played with the intention of winning, was tested on out-of-distribution synthetic data consisting of sequences of random legal moves. We find that the GPT model is likely to generate legal next moves for out-of-distribution sequences for which a causal structure is encoded in the attention mechanism with high confidence. In cases where it generates illegal moves, it also fails to capture a causal structure.

Submitted 6 July, 2025; v1 submitted 10 December, 2024; originally announced December 2024.

Comments: International Conference on Machine Learning (ICML), 2025
