-
CQUESST: A dynamical stochastic framework for predicting soil-carbon sequestration
Authors:
Dan Pagendam,
Jeff Baldock,
David Clifford,
Ryan Farquharson,
Lawrence Murray,
Mike Beare,
Denis Curtin,
Noel Cressie
Abstract:
A statistical framework we call CQUESST (Carbon Quantification and Uncertainty from Evolutionary Soil STochastics), which models carbon sequestration and cycling in soils, is applied to a long-running agricultural experiment that controls for crop type, tillage, and season. The experiment, known as the Millennium Tillage Trial (MTT), ran on 42 field-plots for ten years, from 2000 to 2010; here CQUESST is used to model soil carbon dynamically in six pools, in each of the 42 agricultural plots, and on a monthly time step for a decade. We show how CQUESST can be used to estimate soil-carbon cycling rates under different treatments. Our methods provide much-needed statistical tools for quantitatively inferring the effectiveness of different experimental treatments on soil-carbon sequestration. The decade-long data are of multiple observation types, and these interacting time series are ingested into a fully Bayesian model that has a dynamic stochastic model of multiple pools of soil carbon at its core. CQUESST's stochastic model is motivated by the deterministic RothC soil-carbon model, which is based on nonlinear difference equations. We demonstrate how CQUESST can estimate soil-carbon fluxes for different experimental treatments while acknowledging uncertainties in soil-carbon dynamics, in physical parameters, and in observations. CQUESST is implemented efficiently in the probabilistic programming language Stan using its MapReduce parallelization, and it scales well for large numbers of field-plots, using software libraries that allow computation to be shared over multiple nodes of high-performance computing clusters.
Submitted 9 November, 2024;
originally announced November 2024.
-
Benchmarking changepoint detection algorithms on cardiac time series
Authors:
Ayse Cakmak,
Erik Reinertsen,
Shamim Nemati,
Gari D. Clifford
Abstract:
The pattern of state changes in a biomedical time series can be related to health or disease. This work presents a principled approach for selecting a changepoint detection algorithm for a specific task, such as disease classification. Eight key algorithms were compared, and the performance of each algorithm was evaluated as a function of temporal tolerance, noise, and abnormal conduction (ectopy) on realistic artificial cardiovascular time series data. All algorithms were applied to real data (cardiac time series of 22 patients with REM-behavior disorder (RBD) and 15 healthy controls) using the parameters selected on artificial data. Finally, features were derived from the detected changepoints to distinguish RBD patients from healthy controls using a K-Nearest Neighbors approach. On artificial data, the Modified Bayesian Changepoint Detection algorithm provided superior positive predictive value for state change identification, while Recursive Mean Difference Maximization (RMDM) achieved the highest true positive rate. For the classification task, features derived from the RMDM algorithm provided the highest leave-one-out cross-validated accuracy of 0.89 and true positive rate of 0.87. Automatically detected changepoints provide useful information about a subject's physiological state, which cannot be directly observed. However, the choice of changepoint detection algorithm depends on the nature of the underlying data and the downstream application, such as a classification task. This work represents the first time changepoint detection algorithms have been compared in a meaningful way and utilized in a classification task, which demonstrates the effect of changepoint algorithm choice on application performance.
Submitted 16 April, 2024;
originally announced April 2024.
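The abstract above evaluates detectors as a function of temporal tolerance, reporting true positive rate and positive predictive value. A minimal sketch of such a score follows. The function name `score_changepoints` and the one-to-one nearest-match rule are assumptions made for illustration; the abstract does not specify how detections were matched to true changepoints.

```python
def score_changepoints(true_cps, detected_cps, tolerance):
    """Return (TPR, PPV) for detections matched within +/- tolerance samples.

    Assumed matching rule (not taken from the paper): each true changepoint
    claims the closest still-unused detection inside its tolerance window.
    """
    detected = sorted(detected_cps)
    used = [False] * len(detected)
    hits = 0
    for t in sorted(true_cps):
        best = None
        for i, d in enumerate(detected):
            if used[i] or abs(d - t) > tolerance:
                continue
            if best is None or abs(d - t) < abs(detected[best] - t):
                best = i
        if best is not None:
            used[best] = True
            hits += 1
    # Guard against empty inputs rather than dividing by zero
    tpr = hits / len(true_cps) if len(true_cps) else 0.0
    ppv = hits / len(detected) if len(detected) else 0.0
    return tpr, ppv
```

Sweeping `tolerance` over a range of values and plotting the two scores gives one curve per algorithm, which is one way to compare detectors at the temporal precision a downstream task actually needs.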
-
Addressing Class Imbalance in Classification Problems of Noisy Signals by using Fourier Transform Surrogates
Authors:
Justus T. C. Schwabedal,
John C. Snyder,
Ayse Cakmak,
Shamim Nemati,
Gari D. Clifford
Abstract:
Randomizing the Fourier-transform (FT) phases of temporal-spatial data generates surrogates that approximate examples from the data-generating distribution. We propose such FT surrogates as a novel tool to augment and analyze the training of neural networks, and explore the approach using sleep-stage classification as an example. By computing FT surrogates of raw EEG, EOG, and EMG signals of under-represented sleep stages, we balanced the CAPSLPDB sleep database. We then trained and tested a convolutional neural network for sleep-stage classification, and found that our surrogate-based augmentation improved the mean F1-score by 7%. As another application of FT surrogates, we formulated an approach to compute saliency maps for individual sleep epochs. The visualization is based on the response of inferred class probabilities under replacement of short data segments by partial surrogates. To quantify how well the distributions of the surrogates and the original data match, we evaluated a trained classifier on surrogates of correctly classified examples, and summarized these conditional predictions in a confusion matrix. We show how such conditional confusion matrices can qualitatively explain the performance of surrogates in class balancing. The FT-surrogate augmentation approach may improve classification on noisy signals if carefully adapted to the data distribution under analysis.
Submitted 28 January, 2019; v1 submitted 20 June, 2018;
originally announced June 2018.
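The phase randomization described in the abstract above can be sketched for a single real-valued channel as follows. This is an illustrative sketch of a standard FT surrogate, not the authors' code: the function name `ft_surrogate` is hypothetical, and how phases are shared across EEG, EOG, and EMG channels is not stated in the abstract, so that choice is left open here.

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """Return a phase-randomized surrogate of a real 1-D signal.

    The amplitude spectrum of x is kept; the phases are drawn uniformly
    from [0, 2*pi).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    surrogate_spectrum = np.abs(spectrum) * np.exp(1j * phases)
    # Leave the DC bin untouched (preserves the mean), and for even n also
    # the Nyquist bin, so the spectrum still corresponds to a real signal.
    surrogate_spectrum[0] = spectrum[0]
    if n % 2 == 0:
        surrogate_spectrum[-1] = spectrum[-1]
    return np.fft.irfft(surrogate_spectrum, n=n)
```

Calling `ft_surrogate` repeatedly on epochs of an under-represented class yields new examples with the same power spectrum as the originals, which is the property class balancing by surrogates relies on.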
-
Subject Selection on a Riemannian Manifold for Unsupervised Cross-subject Seizure Detection
Authors:
Samaneh Nasiri Ghosheh Bolagh,
Gari D. Clifford
Abstract:
Variability between individuals poses a challenge in inter-subject brain signal analysis problems. A new algorithm for subject selection based on clustering covariance matrices on a Riemannian manifold is proposed. After unsupervised selection of the subsets of relevant subjects, data in a cluster are mapped to a tangent space at the mean point of the covariance matrices in that cluster, and an SVM classifier is trained on labeled data from relevant subjects. An experiment on an EEG seizure database shows that the proposed method increases the accuracy over the state of the art from 86.83% to 89.84% and the specificity from 87.38% to 89.64%, while reducing the false positive rate from 0.8/hour to 0.77/hour.
Submitted 1 December, 2017;
originally announced December 2017.
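The tangent-space mapping mentioned in the abstract above can be sketched as follows, assuming the affine-invariant Riemannian metric on symmetric positive-definite (SPD) covariance matrices and a fixed-point iteration for the mean. The abstract does not state which metric or mean estimator was used, so both are assumptions, and the helper names (`_sym_fn`, `riemannian_mean`, `tangent_vectors`) are hypothetical.

```python
import numpy as np

def _sym_fn(mat, fn):
    """Apply a scalar function to a symmetric matrix via its eigenvalues."""
    w, v = np.linalg.eigh(mat)
    return (v * fn(w)) @ v.T

def riemannian_mean(covs, n_iter=50, tol=1e-10):
    """Iterative mean of SPD matrices under the affine-invariant metric."""
    mean = np.mean(covs, axis=0)  # arithmetic mean as the starting point
    for _ in range(n_iter):
        sqrt_m = _sym_fn(mean, np.sqrt)
        isqrt_m = _sym_fn(mean, lambda w: 1.0 / np.sqrt(w))
        # Average of the log-maps of all matrices at the current estimate
        step = np.mean(
            [_sym_fn(isqrt_m @ c @ isqrt_m, np.log) for c in covs], axis=0
        )
        mean = sqrt_m @ _sym_fn(step, np.exp) @ sqrt_m
        if np.linalg.norm(step) < tol:
            break
    return mean

def tangent_vectors(covs, ref):
    """Map SPD matrices to vectors in the tangent space at ref."""
    isqrt = _sym_fn(ref, lambda w: 1.0 / np.sqrt(w))
    iu = np.triu_indices(ref.shape[0])
    # sqrt(2) on off-diagonal terms keeps the Euclidean norm of the vector
    # equal to the Frobenius norm of the symmetric log-matrix
    weights = np.where(iu[0] == iu[1], 1.0, np.sqrt(2.0))
    return np.array(
        [weights * _sym_fn(isqrt @ c @ isqrt, np.log)[iu] for c in covs]
    )
```

The rows returned by `tangent_vectors` are ordinary Euclidean feature vectors, so they can be passed to a standard classifier such as an SVM.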