Scalable Causal Discovery from Recursive Nonlinear Data via Truncated Basis Function Scores and Tests

Ramsey, Joseph; Andrews, Bryan

arXiv:2510.04276 (stat)

[Submitted on 5 Oct 2025]

Abstract: Learning graphical conditional independence structures from nonlinear, continuous or mixed data is a central challenge in machine learning and the sciences, and many existing methods struggle to scale to thousands of samples or hundreds of variables. We introduce two basis-expansion tools for scalable causal discovery. First, the Basis Function BIC (BF-BIC) score uses truncated additive expansions to approximate nonlinear dependencies. BF-BIC is theoretically consistent under additive models and extends to post-nonlinear (PNL) models via an invertible reparameterization. It remains robust under moderate interactions and supports mixed data through a degenerate-Gaussian embedding for discrete variables. In simulations with fully nonlinear neural causal models (NCMs), BF-BIC outperforms kernel- and constraint-based methods (e.g., KCI, RFCI) in both accuracy and runtime. Second, the Basis Function Likelihood Ratio Test (BF-LRT) provides an approximate conditional independence test that is substantially faster than kernel tests while retaining competitive accuracy. Extensive simulations and a real-data application to Canadian wildfire risk show that, when integrated into hybrid searches, BF-based methods enable interpretable and scalable causal discovery. Implementations are available in Python, R, and Java.
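To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Legendre polynomial basis, a truncation level `degree=3`, a Gaussian linear fit on the expanded parents, and the "higher is better" BIC convention (2 log L minus k ln n); none of these choices are stated in the abstract. The function names `basis_expand`, `bf_bic_sketch`, and `bf_lr_statistic_sketch` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def basis_expand(X, degree=3):
    """Truncated additive expansion: Legendre terms 1..degree per column.

    The Legendre basis and the default degree are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    cols = []
    for j in range(X.shape[1]):
        x = X[:, j]
        span = x.max() - x.min()
        if span > 0:
            # Rescale to [-1, 1], the natural domain of Legendre polynomials.
            u = 2.0 * (x - x.min()) / span - 1.0
        else:
            u = np.zeros_like(x)
        # legvander returns columns P_0..P_degree; drop the constant P_0.
        cols.append(legendre.legvander(u, degree)[:, 1:])
    return np.hstack(cols) if cols else np.empty((X.shape[0], 0))


def gaussian_loglik(y, B):
    """Maximized Gaussian log-likelihood of y regressed on [1, B]."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = np.column_stack([np.ones(n), B])
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ beta
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0), D.shape[1]


def bf_bic_sketch(y, parents, degree=3):
    """BIC-style local score of y given its parents (higher is better here)."""
    ll, k = gaussian_loglik(y, basis_expand(parents, degree))
    return 2.0 * ll - k * np.log(len(y))


def bf_lr_statistic_sketch(y, x, Z, degree=3):
    """Likelihood-ratio statistic for 'y independent of x given Z'.

    Returns (statistic, degrees of freedom); a p-value would come from the
    upper tail of a chi-square distribution with that many degrees of freedom.
    """
    Bz = basis_expand(Z, degree)
    Bxz = np.hstack([Bz, basis_expand(x, degree)])
    ll0, k0 = gaussian_loglik(y, Bz)
    ll1, k1 = gaussian_loglik(y, Bxz)
    return 2.0 * (ll1 - ll0), k1 - k0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    x = rng.normal(size=n)
    w = rng.normal(size=n)  # unrelated to y
    y = x**2 + 0.5 * rng.normal(size=n)
    print("score with x as parent:", bf_bic_sketch(y, x))
    print("score with w as parent:", bf_bic_sketch(y, w))
    print("LR statistic for x given w:", bf_lr_statistic_sketch(y, x, w))
```

In this toy setup a purely linear score would miss the dependence of `y` on `x`, since `x**2` is uncorrelated with `x`; the expanded basis picks it up, which is the intuition behind scoring and testing on truncated basis functions.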

Comments:	30 pages, 11 figures, 5 tables
Subjects:	Machine Learning (stat.ML); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2510.04276 [stat.ML]
	(or arXiv:2510.04276v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2510.04276

Submission history

From: Joseph Ramsey
[v1] Sun, 5 Oct 2025 16:34:54 UTC (3,352 KB)
