-
Products, Abstractions and Inclusions of Causal Spaces
Authors:
Simon Buchholz,
Junhyung Park,
Bernhard Schölkopf
Abstract:
Causal spaces have recently been introduced as a measure-theoretic framework to encode the notion of causality. While they have some advantages over established frameworks, such as structural causal models, the theory has so far been developed only for single causal spaces. In many mathematical theories, not least the theory of probability spaces of which causal spaces are a direct extension, combinations of objects and maps between objects form a central part. In this paper, taking inspiration from such objects in probability theory, we propose the definitions of products of causal spaces, as well as (stochastic) transformations between causal spaces. In the context of causality, these quantities can be given direct semantic interpretations as causally independent components, abstractions and extensions.
Submitted 6 June, 2024; v1 submitted 1 June, 2024;
originally announced June 2024.
-
Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models
Authors:
Goutham Rajendran,
Simon Buchholz,
Bryon Aragam,
Bernhard Schölkopf,
Pradeep Ravikumar
Abstract:
To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build highly-performant foundation models and then invest efforts into understanding how they work. In this work, we relate these two approaches and study how to learn human-interpretable concepts from data. Weaving together ideas from both fields, we formally define a notion of concepts and show that they can be provably recovered from diverse data. Experiments on synthetic data and large language models show the utility of our unified approach.
Submitted 9 December, 2024; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Aizenman-Wehr argument for a class of disordered gradient models
Authors:
Simon Buchholz,
Codina Cotar
Abstract:
We consider random gradient fields with disorder where the interaction potential $V_e$ on an edge $e$ can be expressed as $e^{-V_e(s)} = \int ρ(\mathrm{d}κ)\, e^{-κξ_e} e^{-\frac{κs^2}{2}}$. Here $ρ$ denotes a measure with compact support in $(0,\infty)$ and $ξ_e\in\mathbb{R}$ a nontrivial edge-dependent disorder. We show that in dimension $d=2$ there is a unique shift-covariant disordered gradient Gibbs measure such that the annealed measure is ergodic and has zero tilt. This shows that the phase transitions known to occur for this class of potentials do not persist in the disordered setting. The proof relies on the connection of the gradient Gibbs measures to a random conductance model with compact state space, to which the well-known Aizenman-Wehr argument applies.
Submitted 16 February, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
Learning Linear Causal Representations from Interventions under General Nonlinear Mixing
Authors:
Simon Buchholz,
Goutham Rajendran,
Elan Rosenfeld,
Bryon Aragam,
Bernhard Schölkopf,
Pradeep Ravikumar
Abstract:
We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker classes, such as linear maps or paired counterfactual data. This is also the first instance of causal identifiability from non-paired interventions for deep neural network embeddings. Our proof relies on carefully uncovering the high-dimensional geometric structure present in the data distribution after a non-linear density transformation, which we capture by analyzing quadratic forms of precision matrices of the latent distributions. Finally, we propose a contrastive algorithm to identify the latent variables in practice and evaluate its performance on various tasks.
Submitted 18 December, 2023; v1 submitted 3 June, 2023;
originally announced June 2023.
-
A Measure-Theoretic Axiomatisation of Causality
Authors:
Junhyung Park,
Simon Buchholz,
Bernhard Schölkopf,
Krikamol Muandet
Abstract:
Causality is a central concept in a wide range of research areas, yet there is still no universally agreed axiomatisation of causality. We view causality both as an extension of probability theory and as a study of \textit{what happens when one intervenes on a system}, and argue in favour of taking Kolmogorov's measure-theoretic axiomatisation of probability as the starting point towards an axiomatisation of causality. To that end, we propose the notion of a \textit{causal space}, consisting of a probability space along with a collection of transition probability kernels, called \textit{causal kernels}, that encode the causal information of the space. Our proposed framework is not only rigorously grounded in measure theory, but it also sheds light on long-standing limitations of existing frameworks including, for example, cycles, latent variables and stochastic processes.
Submitted 6 June, 2024; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Front Transport Reduction for Complex Moving Fronts
Authors:
Philipp Krah,
Steffen Büchholz,
Matthias Häringer,
Julius Reiss
Abstract:
This work addresses model order reduction for complex moving fronts, which are transported by advection or through a reaction-diffusion process. Such systems are especially challenging for model order reduction since the transport cannot be captured by linear reduction methods. Moreover, topological changes, such as splitting or merging of fronts, pose difficulties for many nonlinear reduction methods, and the small non-vanishing support of the underlying partial differential equation dynamics makes most nonlinear hyper-reduction methods infeasible. We propose a new decomposition method together with a hyper-reduction scheme that addresses these shortcomings. The decomposition uses a level-set function to parameterize the transport and a nonlinear activation function that captures the structure of the front. This approach is similar to autoencoder artificial neural networks, but additionally provides insights into the system, which can be used for efficient reduced order models. We make use of this property and are thus able to solve the advection equation with the same complexity as the POD-Galerkin approach while obtaining errors of less than one percent for representative examples. Furthermore, we outline a special hyper-reduction method for more complicated advection-reaction-diffusion systems. The capability of the approach is illustrated by various numerical examples in one and two spatial dimensions, including real-life applications to a two-dimensional Bunsen flame.
Submitted 16 February, 2022;
originally announced February 2022.
-
Cauchy-Born Rule from Microscopic Models with Non-convex Potentials
Authors:
Stefan Adams,
Simon Buchholz,
Roman Kotecký,
Stefan Müller
Abstract:
We study gradient field models on an integer lattice with non-convex interactions. These models emerge in distinct branches of physics and mathematics under various names, in particular as zero-mass lattice (Euclidean) quantum field theory, models of random interfaces, and mass-string models of nonlinear elasticity. Our attention is mostly devoted to the latter, with random vector-valued fields as displacements for atoms of crystal structures, where our aim is to prove the strict convexity of the free energy as a function of affine deformations for low enough temperatures and small enough deformations. This claim can be interpreted as a form of verification of the Cauchy-Born rule at small non-vanishing temperatures for a class of these models. We also show that the scaling limit of the Laplace transform of the corresponding Gibbs measure (under a proper rescaling) corresponds to the Gaussian gradient field with a particular covariance.
The proofs are based on multi-scale (renormalisation group analysis) techniques, needed in view of the strong correlations of the studied gradient fields. To cover a sufficiently wide class of models, we extend these techniques from the standard case with rotationally symmetric nearest-neighbour interaction to a more general situation with finite-range interactions without any symmetry. Our presentation is entirely self-contained, covering the details of the needed renormalisation group methods.
Submitted 5 June, 2024; v1 submitted 29 October, 2019;
originally announced October 2019.
-
Phase transitions for a class of gradient fields
Authors:
Simon Buchholz
Abstract:
We consider gradient fields on $\mathbb{Z}^d$ for potentials $V$ that can be expressed as $$e^{-V(x)}=pe^{-\frac{qx^2}{2}}+(1-p)e^{-\frac{x^2}{2}}.$$ This representation allows us to associate a random conductance type model to the gradient fields with zero tilt. We investigate this random conductance model and prove correlation inequalities, duality properties, and uniqueness of the Gibbs measure in certain regimes. Moreover, we show that there is a close relation between Gibbs measures of the random conductance model and gradient Gibbs measures with zero tilt for the potential $V$. Based on these results we can give a new proof for the non-uniqueness of gradient Gibbs measures without using reflection positivity. We also show uniqueness of ergodic zero tilt gradient Gibbs measures for almost all values of $p$ and $q$ and, in dimension $d\geq 4$, for $q$ close to one or for $p(1-p)$ sufficiently small.
Submitted 6 September, 2019;
originally announced September 2019.
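As a small illustration of the potential in the abstract above, the following sketch evaluates $V(x) = -\log\big(pe^{-\frac{qx^2}{2}}+(1-p)e^{-\frac{x^2}{2}}\big)$ and checks numerically that it can fail to be convex. The function names and the parameter choice $p=0.5$, $q=20$ are illustrative assumptions, not values taken from the paper, and the finite-difference check is only a rough numerical probe.

```python
import math

def mixture_potential(x, p, q):
    """V(x) = -log(p*exp(-q*x^2/2) + (1-p)*exp(-x^2/2)), as in the abstract."""
    return -math.log(p * math.exp(-q * x * x / 2) + (1 - p) * math.exp(-x * x / 2))

def second_derivative(x, p, q, h=1e-4):
    """Central finite-difference approximation of V''(x)."""
    return (
        mixture_potential(x + h, p, q)
        - 2 * mixture_potential(x, p, q)
        + mixture_potential(x - h, p, q)
    ) / (h * h)

# Illustrative parameters (an assumption, not from the paper).
p, q = 0.5, 20.0

# Scan V'' on a grid over [0, 3); a negative value means V is not convex there.
curvatures = [second_derivative(0.01 * k, p, q) for k in range(300)]
is_non_convex = min(curvatures) < 0
print("V is non-convex on the grid:", is_non_convex)
```

For these parameters the curvature is positive at the origin and becomes negative at moderate $|x|$, which is the kind of non-convexity the abstract refers to.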
-
Probability to be positive for the membrane model in dimensions 2 and 3
Authors:
Simon Buchholz,
Jean-Dominique Deuschel,
Noemi Kurt,
Florian Schweiger
Abstract:
We consider the membrane model on a box $V_N\subset \mathbb{Z}^n$ of size $(2N+1)^n$ with zero boundary condition in the subcritical dimensions $n=2$ and $n=3$. We show optimal estimates for the probability that the field is positive in a subset $D_N$ of $V_N$. In particular we obtain for $D_N=V_N$ that the probability to be positive on the entire domain is exponentially small and the rate is of the order of the surface area $N^{n-1}$.
Submitted 11 October, 2018;
originally announced October 2018.
-
Finite Range Decomposition for Gaussian Measures with Improved Regularity
Authors:
Simon Buchholz
Abstract:
We consider a family of gradient Gaussian vector fields on the torus $(\mathbb{Z}/L^N\mathbb{Z})^d$. Adams, Kotecký, Müller and independently Bauerschmidt established the existence of a uniform finite range decomposition of the corresponding covariance operators, i.e., the covariance can be written as a sum of covariance operators supported on increasing cubes with diameter $L^k$. We improve this result and show that the decay behaviour of the kernels in Fourier space can be controlled. Then we show the regularity of the integration map that convolves functionals with the partial measures of the finite range decomposition. In particular the new finite range decomposition avoids the loss of regularity which arises in the renormalisation group approach to anisotropic problems in statistical mechanics.
Submitted 10 August, 2017; v1 submitted 22 March, 2016;
originally announced March 2016.