-
Towards Robust Causal Effect Identification Beyond Markov Equivalence
Authors:
Kai Z. Teh,
Kayvan Sadeghi,
Terry Soo
Abstract:
Causal effect identification typically requires a fully specified causal graph, which can be difficult to obtain in practice. We provide a sufficient criterion for identifying causal effects from a candidate set of Markov equivalence classes with added background knowledge, which represents cases where determining the causal graph up to a single Markov equivalence class is challenging. Such cases can happen, for example, when the untestable assumptions (e.g. faithfulness) that underlie causal discovery algorithms do not hold.
Submitted 18 June, 2025;
originally announced June 2025.
-
Causal Models for Growing Networks
Authors:
Gecia Bravo-Hermsdorff,
Lee M. Gunderson,
Kayvan Sadeghi
Abstract:
Real-world networks grow over time; statistical models based on node exchangeability are not appropriate. Instead of constraining the structure of the \textit{distribution} of edges, we propose that the relevant symmetries refer to the \textit{causal structure} between them. We first enumerate the 96 causal directed acyclic graph (DAG) models over pairs of nodes (dyad variables) in a growing network with finite ancestral sets that are invariant to node deletion. We then partition them into 21 classes with ancestral sets that are closed under node marginalization. Several of these classes are remarkably amenable to distributed and asynchronous evaluation. As an example, we highlight a simple model that exhibits flexible power-law degree distributions and emergent phase transitions in sparsity, which we characterize analytically. With few parameters and much conditional independence, our proposed framework provides natural baseline models for causal inference in relational data.
Submitted 1 April, 2025;
originally announced April 2025.
-
A General Framework on Conditions for Constraint-based Causal Learning
Authors:
Kai Z. Teh,
Kayvan Sadeghi,
Terry Soo
Abstract:
Most constraint-based causal learning algorithms provably return the correct causal graph under certain correctness conditions, such as faithfulness. By representing any constraint-based causal learning algorithm using the notion of a property, we provide a general framework to obtain and study correctness conditions for these algorithms. From the framework, we provide exact correctness conditions for the PC algorithm, which are then related to the correctness conditions of some other existing causal discovery algorithms. The framework also suggests a paradigm for designing causal learning algorithms which allows for the correctness conditions of algorithms to be controlled for before designing the actual algorithm, and has the following implications. We show that the sparsest Markov representation condition is the weakest correctness condition for algorithms that output ancestral graphs or directed acyclic graphs satisfying any existing notions of minimality. We also reason that Pearl-minimality is necessary for meaningful causal learning but not sufficient to relax the faithfulness condition and, as such, has to be strengthened, such as by including background knowledge, for causal learning beyond faithfulness.
Submitted 30 June, 2025; v1 submitted 14 August, 2024;
originally announced August 2024.
-
Localised Natural Causal Learning Algorithms for Weak Consistency Conditions
Authors:
Kai Z Teh,
Kayvan Sadeghi,
Terry Soo
Abstract:
By relaxing conditions for natural structure learning algorithms, a family of constraint-based algorithms containing all exact structure learning algorithms under the faithfulness assumption, we define localised natural structure learning algorithms (LoNS). We also provide a set of necessary and sufficient assumptions for consistency of LoNS, which can be thought of as a strict relaxation of the restricted faithfulness assumption. We provide a practical LoNS algorithm that runs in exponential time, which is then compared with related existing structure learning algorithms, namely PC/SGS and the relatively recent sparsest permutation algorithm. Simulation studies are also provided.
Submitted 27 May, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Axiomatization of Interventional Probability Distributions
Authors:
Kayvan Sadeghi,
Terry Soo
Abstract:
Causal intervention is an essential tool in causal inference. It is axiomatized under the rules of do-calculus in the case of structural causal models. We provide simple axiomatizations for families of probability distributions to be different types of interventional distributions. Our axiomatizations neatly lead to a simple and clear theory of causality that has several advantages: it does not need to make use of any modeling assumptions such as those imposed by structural causal models; it only relies on interventions on single variables; it includes most cases with latent variables and causal cycles; and more importantly, it does not assume the existence of an underlying true causal graph as we do not take it as the primitive object--in fact, a causal graph is derived as a by-product of our theory. We show that, under our axiomatizations, the intervened distributions are Markovian to the defined intervened causal graphs, and an observed joint probability distribution is Markovian to the obtained causal graph; these results are consistent with the case of structural causal models, and as a result, the existing theory of causal inference applies. We also show that a large class of natural structural causal models satisfy the theory presented here. We note that the aim of this paper is axiomatization of interventional families, which is subtly different from "causal modeling."
Submitted 13 November, 2023; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Conditions and Assumptions for Constraint-based Causal Structure Learning
Authors:
Kayvan Sadeghi,
Terry Soo
Abstract:
We formalize constraint-based structure learning of the "true" causal graph from observed data when unobserved variables are also present. We provide conditions for a "natural" family of constraint-based structure-learning algorithms that output graphs that are Markov equivalent to the causal graph. Under the faithfulness assumption, this natural family contains all exact structure-learning algorithms. We also provide a set of assumptions under which any natural structure-learning algorithm outputs graphs that are Markov equivalent to the causal graph. These assumptions can be thought of as a relaxation of faithfulness, and most of them can be directly tested from (the underlying distribution of) the data, particularly when one focuses on structural causal models. We specialize the definitions and results for structural causal models.
Submitted 8 May, 2022; v1 submitted 24 March, 2021;
originally announced March 2021.
-
Semisupervised Adversarial Neural Networks for Cyber Security Transfer Learning
Authors:
Casey Kneale,
Kolia Sadeghi
Abstract:
On the path to establishing a global cybersecurity framework where each enterprise shares information about malicious behavior, an important question arises. How can a machine learning representation characterizing a cyber attack on one network be used to detect similar attacks on other enterprise networks if each network has wildly different distributions of benign and malicious traffic? We address this issue by comparing the results of naively transferring a model across network domains and using CORrelation ALignment, to our novel adversarial Siamese neural network. Our proposed model learns attack representations that are more invariant to each network's particularities via an adversarial approach. It uses a simple ranking loss that prioritizes the labeling of the most egregious malicious events correctly over average accuracy. This is appropriate for driving an alert triage workflow wherein an analyst only has time to inspect the top few events ranked highest by the model. In terms of accuracy, the other approaches fail completely to detect any malicious events when models trained on one dataset are evaluated on another for the first 100 events. In contrast, the method presented here retrieves sizable proportions of malicious events, at the expense of some training instabilities due to adversarial modeling. We evaluate these approaches using two publicly available networking datasets, and suggest areas for future research.
Submitted 25 July, 2019;
originally announced July 2019.
-
On Finite Exchangeability and Conditional Independence
Authors:
Kayvan Sadeghi
Abstract:
We study the independence structure of finitely exchangeable distributions over random vectors and random networks. In particular, we provide necessary and sufficient conditions for an exchangeable vector so that its elements are completely independent or completely dependent. We also provide a sufficient condition for an exchangeable vector so that its elements are marginally independent. We then generalize these results and conditions for exchangeable random networks. In this case, it is demonstrated that the situation is more complex. We show that the independence structure of exchangeable random networks lies in one of six regimes that are two-fold dual to one another, represented by undirected and bidirected independence graphs in the graphical model sense, with graphs that are complements of each other. In addition, under certain additional assumptions, we provide necessary and sufficient conditions for the exchangeable network distributions to be faithful to each of these graphs.
Submitted 12 June, 2020; v1 submitted 5 July, 2019;
originally announced July 2019.
-
Markov Properties of Discrete Determinantal Point Processes
Authors:
Kayvan Sadeghi,
Alessandro Rinaldo
Abstract:
Determinantal point processes (DPPs) are probabilistic models for repulsion. When used to represent the occurrence of random subsets of a finite base set, DPPs allow one to model global negative associations in a mathematically elegant and direct way. Discrete DPPs have become popular and computationally tractable models for solving several machine learning tasks that require the selection of diverse objects, and have been successfully applied in numerous real-life problems. Despite their popularity, the statistical properties of such models have not been adequately explored. In this note, we derive the Markov properties of discrete DPPs and show how they can be expressed using graphical models.
Submitted 27 January, 2019; v1 submitted 4 October, 2018;
originally announced October 2018.
-
Faithfulness of Probability Distributions and Graphs
Authors:
Kayvan Sadeghi
Abstract:
A main question in graphical models and causal inference is whether, given a probability distribution $P$ (which is usually an underlying distribution of data), there is a graph (or graphs) to which $P$ is faithful. The main goal of this paper is to provide a theoretical answer to this problem. We work with general independence models, which contain probabilistic independence models as a special case. We exploit a generalization of ordering, called preordering, of the nodes of (mixed) graphs. This allows us to provide sufficient conditions for a given independence model to be Markov to a graph with the minimum possible number of edges, and more importantly, necessary and sufficient conditions for a given probability distribution to be faithful to a graph. We present our results for the general case of mixed graphs, but specialize the definitions and results to the better-known subclasses of undirected (concentration) and bidirected (covariance) graphs as well as directed acyclic graphs.
Submitted 2 November, 2017; v1 submitted 29 January, 2017;
originally announced January 2017.
-
Unifying Markov Properties for Graphical Models
Authors:
Steffen Lauritzen,
Kayvan Sadeghi
Abstract:
Several types of graphs with different conditional independence interpretations --- also known as Markov properties --- have been proposed and used in graphical models. In this paper we unify these Markov properties by introducing a class of graphs with four types of edges --- lines, arrows, arcs, and dotted lines --- and a single separation criterion. We show that independence structures defined by this class specialize to each of the previously defined cases, when suitable subclasses of graphs are considered. In addition, we define a pairwise Markov property for the subclass of chain mixed graphs which includes chain graphs with the LWF interpretation, as well as summary graphs (and consequently ancestral graphs). We prove the equivalence of this pairwise Markov property to the global Markov property for compositional graphoid independence models.
Submitted 11 July, 2017; v1 submitted 20 August, 2016;
originally announced August 2016.
-
Hierarchical Models for Independence Structures of Networks
Authors:
Kayvan Sadeghi,
Alessandro Rinaldo
Abstract:
We introduce a new family of network models, called hierarchical network models, that allow us to represent in an explicit manner the stochastic dependence among the dyads (random ties) of the network. In particular, each member of this family can be associated with a graphical model defining conditional independence clauses among the dyads of the network, called the dependency graph. Every network model with dyadic independence assumption can be generalized to construct members of this new family. Using this new framework, we generalize the Erdös-Rényi and beta-models to create hierarchical Erdös-Rényi and beta-models. We describe various methods for parameter estimation as well as simulation studies for models with sparse dependency graphs.
Submitted 25 November, 2019; v1 submitted 15 May, 2016;
originally announced May 2016.
-
Pairwise Markov properties for regression graphs
Authors:
Kayvan Sadeghi,
Nanny Wermuth
Abstract:
With a sequence of regressions, one may generate joint probability distributions. One starts with a joint, marginal distribution of context variables having possibly a concentration graph structure and continues with an ordered sequence of conditional distributions, named regressions in joint responses. The involved random variables may be discrete, continuous or of both types. Such a generating process specifies for each response a conditioning set which contains just its regressor variables and it leads to at least one valid ordering of all nodes in the corresponding regression graph which has three types of edge; one for undirected dependences among context variables, another for undirected dependences among joint responses and one for any directed dependence of a response on a regressor variable. For this regression graph, there are several definitions of pairwise Markov properties, where each interprets the conditional independence associated with a missing edge in the graph in a different way. We explain how these properties arise, prove their equivalence for compositional graphoids and point at the equivalence of each one of them to the global Markov property.
Submitted 2 February, 2017; v1 submitted 30 December, 2015;
originally announced December 2015.
-
Statistical Models for Degree Distributions of Networks
Authors:
Kayvan Sadeghi,
Alessandro Rinaldo
Abstract:
We define and study the statistical models in exponential family form whose sufficient statistics are the degree distributions and the bi-degree distributions of undirected labelled simple graphs. Graphs that are constrained by the joint degree distributions are called $dK$-graphs in the computer science literature, and this paper attempts to provide the first statistically grounded analysis of this type of model. In addition to formalizing these models, we provide some preliminary results for the parameter estimation and the asymptotic behaviour of the model for degree distribution, and discuss the parameter estimation for the model for bi-degree distribution.
Submitted 14 November, 2014;
originally announced November 2014.
-
Marginalization and Conditioning for LWF Chain Graphs
Authors:
Kayvan Sadeghi
Abstract:
In this paper, we deal with the problem of marginalization over and conditioning on two disjoint subsets of the node set of chain graphs (CGs) with the LWF Markov property. For this purpose, we define the class of chain mixed graphs (CMGs) with three types of edges and, for this class, provide a separation criterion under which the class of CMGs is stable under marginalization and conditioning and contains the class of LWF CGs as its subclass. We provide a method for generating such graphs after marginalization and conditioning for a given CMG or a given LWF CG. We then define and study the class of anterial graphs, which is also stable under marginalization and conditioning and contains LWF CGs, but has a simpler structure than CMGs.
Submitted 28 August, 2016; v1 submitted 28 May, 2014;
originally announced May 2014.
-
Markov Equivalences for Subclasses of Loopless Mixed Graphs
Authors:
Kayvan Sadeghi
Abstract:
In this paper we discuss four problems regarding Markov equivalences for subclasses of loopless mixed graphs. We classify these four problems as finding conditions for internal Markov equivalence, which is Markov equivalence within a subclass, for external Markov equivalence, which is Markov equivalence between subclasses, for representational Markov equivalence, which is the possibility of a graph from a subclass being Markov equivalent to a graph from another subclass, and finding algorithms to generate a graph from a certain subclass that is Markov equivalent to a given graph. We particularly focus on the class of maximal ancestral graphs and its subclasses, namely regression graphs, bidirected graphs, undirected graphs, and directed acyclic graphs, and present novel results for representational Markov equivalence and algorithms.
Submitted 20 October, 2011;
originally announced October 2011.
-
Stable mixed graphs
Authors:
Kayvan Sadeghi
Abstract:
In this paper, we study classes of graphs with three types of edges that capture the modified independence structure of a directed acyclic graph (DAG) after marginalisation over unobserved variables and conditioning on selection variables using the $m$-separation criterion. These include MC, summary, and ancestral graphs. As a modification of MC graphs, we define the class of ribbonless graphs (RGs) that permits the use of the $m$-separation criterion. RGs contain summary and ancestral graphs as subclasses, and each RG can be generated by a DAG after marginalisation and conditioning. We derive simple algorithms to generate RGs, from given DAGs or RGs, and also to generate summary and ancestral graphs in a simple way by further extension of the RG-generating algorithm. This enables us to develop a parallel theory on these three classes and to study the relationships between them as well as the use of each class.
Submitted 17 December, 2013; v1 submitted 18 October, 2011;
originally announced October 2011.
-
Markov properties for mixed graphs
Authors:
Kayvan Sadeghi,
Steffen Lauritzen
Abstract:
In this paper, we unify the Markov theory of a variety of different types of graphs used in graphical Markov models by introducing the class of loopless mixed graphs, and show that all independence models induced by $m$-separation on such graphs are compositional graphoids. We focus in particular on the subclass of ribbonless graphs which as special cases include undirected graphs, bidirected graphs, and directed acyclic graphs, as well as ancestral graphs and summary graphs. We define maximality of such graphs as well as a pairwise and a global Markov property. We prove that the global and pairwise Markov properties of a maximal ribbonless graph are equivalent for any independence model that is a compositional graphoid.
Submitted 12 March, 2014; v1 submitted 27 September, 2011;
originally announced September 2011.
-
Sequences of regressions and their independences
Authors:
Nanny Wermuth,
Kayvan Sadeghi
Abstract:
Ordered sequences of univariate or multivariate regressions provide statistical models for analysing data from randomized, possibly sequential interventions, from cohort or multi-wave panel studies, but also from cross-sectional or retrospective studies. Conditional independences are captured by what we name regression graphs, provided the generated distribution shares some properties with a joint Gaussian distribution. Regression graphs extend purely directed, acyclic graphs by two types of undirected graph, one type for components of joint responses and the other for components of the context vector variable. We review the special features and the history of regression graphs, derive criteria to read all implied independences of a regression graph and prove criteria for Markov equivalence, that is, for judging whether two different graphs imply the same set of independence statements. Knowledge of Markov equivalence provides alternative interpretations of a given sequence of regressions, is essential for machine learning strategies and permits the use of the simple graphical criteria of regression graphs on graphs for which the corresponding criteria are in general more complex. Under the known conditions that a Markov equivalent directed acyclic graph exists for any given regression graph, we give a polynomial time algorithm to find one such graph.
Submitted 8 March, 2012; v1 submitted 13 March, 2011;
originally announced March 2011.