-
TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery
Authors:
Muhammad Hasan Ferdous,
Emam Hossain,
Md Osman Gani
Abstract:
Robust causal discovery in time series datasets depends on reliable benchmark datasets with known ground-truth causal relationships. However, such datasets remain scarce, and existing synthetic alternatives often overlook critical temporal properties inherent in real-world data, including nonstationarity driven by trends and seasonality, irregular sampling intervals, and the presence of unobserved…
▽ More
Robust causal discovery in time series datasets depends on reliable benchmark datasets with known ground-truth causal relationships. However, such datasets remain scarce, and existing synthetic alternatives often overlook critical temporal properties inherent in real-world data, including nonstationarity driven by trends and seasonality, irregular sampling intervals, and the presence of unobserved confounders. To address these challenges, we introduce TimeGraph, a comprehensive suite of synthetic time-series benchmark datasets that systematically incorporates both linear and nonlinear dependencies while modeling key temporal characteristics such as trends, seasonal effects, and heterogeneous noise patterns. Each dataset is accompanied by a fully specified causal graph featuring varying densities and diverse noise distributions and is provided in two versions: one including unobserved confounders and one without, thereby offering extensive coverage of real-world complexity while preserving methodological neutrality. We further demonstrate the utility of TimeGraph through systematic evaluations of state-of-the-art causal discovery algorithms including PCMCI+, LPCMCI, and FGES across a diverse array of configurations and metrics. Our experiments reveal significant variations in algorithmic performance under realistic temporal conditions, underscoring the need for robust synthetic benchmarks in the fair and transparent assessment of causal discovery methods. The complete TimeGraph suite, including dataset generation scripts, evaluation metrics, and recommended experimental protocols, is freely available to facilitate reproducible research and foster community-driven advancements in time-series causal discovery.
△ Less
Submitted 2 June, 2025;
originally announced June 2025.
-
Correlation to Causation: A Causal Deep Learning Framework for Arctic Sea Ice Prediction
Authors:
Emam Hossain,
Muhammad Hasan Ferdous,
Jianwu Wang,
Aneesh Subramanian,
Md Osman Gani
Abstract:
Traditional machine learning and deep learning techniques rely on correlation-based learning, often failing to distinguish spurious associations from true causal relationships, which limits robustness, interpretability, and generalizability. To address these challenges, we propose a causality-driven deep learning framework that integrates Multivariate Granger Causality (MVGC) and PCMCI+ causal dis…
▽ More
Traditional machine learning and deep learning techniques rely on correlation-based learning, often failing to distinguish spurious associations from true causal relationships, which limits robustness, interpretability, and generalizability. To address these challenges, we propose a causality-driven deep learning framework that integrates Multivariate Granger Causality (MVGC) and PCMCI+ causal discovery algorithms with a hybrid deep learning architecture. Using 43 years (1979-2021) of daily and monthly Arctic Sea Ice Extent (SIE) and ocean-atmospheric datasets, our approach identifies causally significant factors, prioritizes features with direct influence, reduces feature overhead, and improves computational efficiency. Experiments demonstrate that integrating causal features enhances the deep learning model's predictive accuracy and interpretability across multiple lead times. Beyond SIE prediction, the proposed framework offers a scalable solution for dynamic, high-dimensional systems, advancing both theoretical understanding and practical applications in predictive modeling.
△ Less
Submitted 3 March, 2025;
originally announced March 2025.
-
eCDANs: Efficient Temporal Causal Discovery from Autocorrelated and Non-stationary Data (Student Abstract)
Authors:
Muhammad Hasan Ferdous,
Uzma Hasan,
Md Osman Gani
Abstract:
Conventional temporal causal discovery (CD) methods suffer from high dimensionality, fail to identify lagged causal relationships, and often ignore dynamics in relations. In this study, we present a novel constraint-based CD approach for autocorrelated and non-stationary time series data (eCDANs) capable of detecting lagged and contemporaneous causal relationships along with temporal changes. eCDA…
▽ More
Conventional temporal causal discovery (CD) methods suffer from high dimensionality, fail to identify lagged causal relationships, and often ignore dynamics in relations. In this study, we present a novel constraint-based CD approach for autocorrelated and non-stationary time series data (eCDANs) capable of detecting lagged and contemporaneous causal relationships along with temporal changes. eCDANs addresses high dimensionality by optimizing the conditioning sets while conducting conditional independence (CI) tests and identifies the changes in causal relations by introducing a surrogate variable to represent time dependency. Experiments on synthetic and real-world data show that eCDANs can identify time influence and outperform the baselines.
△ Less
Submitted 5 March, 2023;
originally announced March 2023.
-
CDANs: Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data
Authors:
Muhammad Hasan Ferdous,
Uzma Hasan,
Md Osman Gani
Abstract:
Time series data are found in many areas of healthcare such as medical time series, electronic health records (EHR), measurements of vitals, and wearable devices. Causal discovery, which involves estimating causal relationships from observational data, holds the potential to play a significant role in extracting actionable insights about human health. In this study, we present a novel constraint-b…
▽ More
Time series data are found in many areas of healthcare such as medical time series, electronic health records (EHR), measurements of vitals, and wearable devices. Causal discovery, which involves estimating causal relationships from observational data, holds the potential to play a significant role in extracting actionable insights about human health. In this study, we present a novel constraint-based causal discovery approach for autocorrelated and non-stationary time series data (CDANs). Our proposed method addresses several limitations of existing causal discovery methods for autocorrelated and non-stationary time series data, such as high dimensionality, the inability to identify lagged causal relationships, and overlooking changing modules. Our approach identifies lagged and instantaneous/contemporaneous causal relationships along with changing modules that vary over time. The method optimizes the conditioning sets in a constraint-based search by considering lagged parents instead of conditioning on the entire past that addresses high dimensionality. The changing modules are detected by considering both contemporaneous and lagged parents. The approach first detects the lagged adjacencies, then identifies the changing modules and contemporaneous adjacencies, and finally determines the causal direction. We extensively evaluated our proposed method on synthetic and real-world clinical datasets, and compared its performance with several baseline approaches. The experimental results demonstrate the effectiveness of the proposed method in detecting causal relationships and changing modules for autocorrelated and non-stationary time series data.
△ Less
Submitted 16 October, 2023; v1 submitted 6 February, 2023;
originally announced February 2023.