
A Hierarchical Random Effects State-space Model for Modeling Brain Activities from Electroencephalogram Data

Authors: Xingche Guo, Bin Yang, Ji Meng Loh, Qinxia Wang, Yuanjia Wang

Abstract: Mental disorders present challenges in diagnosis and treatment due to their complex and heterogeneous nature. Electroencephalogram (EEG) has shown promise as a potential biomarker for these disorders. However, existing methods for analyzing EEG signals have limitations in addressing heterogeneity and capturing complex brain activity patterns between regions. This paper proposes a novel random effects state-space model (RESSM) for analyzing large-scale multi-channel resting-state EEG signals, accounting for the heterogeneity of brain connectivities between groups and individual subjects. We incorporate multi-level random effects for temporal dynamical and spatial mapping matrices and address nonstationarity so that the brain connectivity patterns can vary over time. The model is fitted under a Bayesian hierarchical model framework coupled with a Gibbs sampler. Compared to previous mixed-effects state-space models, we directly model high-dimensional random effects matrices without structural constraints and tackle the challenge of identifiability. Through extensive simulation studies, we demonstrate that our approach yields valid estimation and inference. We apply RESSM to a multi-site clinical trial of Major Depressive Disorder (MDD). Our analysis uncovers significant differences in resting-state brain temporal dynamics among MDD patients compared to healthy individuals. In addition, we show the subject-level EEG features derived from RESSM exhibit a superior predictive value for the heterogeneous treatment effect compared to the EEG frequency band power, suggesting the potential of EEG as a valuable biomarker for MDD.

Submitted 27 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

arXiv:2305.13852 [pdf, ps, other]

Learning Optimal Biomarker-Guided Treatment Policy for Chronic Disorders

Authors: Bin Yang, Xingche Guo, Ji Meng Loh, Qinxia Wang, Yuanjia Wang

Abstract: Electroencephalogram (EEG) provides noninvasive measures of brain activity and is found to be valuable for diagnosis of some chronic disorders. Specifically, pre-treatment EEG signals in alpha and theta frequency bands have demonstrated some association with anti-depressant response, which is well-known to have low response rate. We aim to design an integrated pipeline that improves the response rate of major depressive disorder patients by developing an individualized treatment policy guided by the resting state pre-treatment EEG recordings and other treatment effects modifiers. We first design an innovative automatic site-specific EEG preprocessing pipeline to extract features that possess stronger signals compared with raw data. We then estimate the conditional average treatment effect using causal forests, and use a doubly robust technique to improve the efficiency in the estimation of the average treatment effect. We present evidence of heterogeneity in the treatment effect and the modifying power of EEG features as well as a significant average treatment effect, a result that cannot be obtained by conventional methods. Finally, we employ an efficient policy learning algorithm to learn an optimal depth-2 treatment assignment decision tree and compare its performance with Q-Learning and outcome-weighted learning via simulation studies and an application to a large multi-site, double-blind randomized controlled clinical trial, EMBARC.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:1907.13276 [pdf, other]

Are Outlier Detection Methods Resilient to Sampling?

Authors: Laure Berti-Equille, Ji Meng Loh, Saravanan Thirumuruganathan

Abstract: Outlier detection is a fundamental task in data mining and has many applications including detecting errors in databases. While there has been extensive prior work on methods for outlier detection, modern datasets often have sizes that are beyond the ability of commonly used methods to process the data within a reasonable time. To overcome this issue, outlier detection methods can be trained over samples of the full-sized dataset. However, it is not clear how a model trained on a sample compares with one trained on the entire dataset. In this paper, we introduce the notion of resilience to sampling for outlier detection methods. Orthogonal to traditional performance metrics such as precision/recall, resilience represents the extent to which the outliers detected by a method applied to samples from a sampling scheme matches those when applied to the whole dataset. We propose a novel approach for estimating the resilience to sampling of both individual outlier methods and their ensembles. We performed an extensive experimental study on synthetic and real-world datasets where we study seven diverse and representative outlier detection methods, compare results obtained from samples versus those obtained from the whole datasets and evaluate the accuracy of our resilience estimates. We observed that the methods are not equally resilient to a given sampling scheme and it is often the case that careful joint selection of both the sampling scheme and the outlier detection method is necessary. It is our hope that the paper initiates research on designing outlier detection algorithms that are resilient to sampling.

Submitted 30 July, 2019; originally announced July 2019.

Comments: 18 pages

arXiv:1302.0256 [pdf, ps, other]

Regression shrinkage and grouping of highly correlated predictors with HORSES

Authors: Woncheol Jang, Johan Lim, Nicole A. Lazar, Ji Meng Loh, Donghyeon Yu

Abstract: Identifying homogeneous subgroups of variables can be challenging in high dimensional data analysis with highly correlated predictors. We propose a new method called Hexagonal Operator for Regression with Shrinkage and Equality Selection, HORSES for short, that simultaneously selects positively correlated variables and identifies them as predictive clusters. This is achieved via a constrained least-squares problem with regularization that consists of a linear combination of an L_1 penalty for the coefficients and another L_1 penalty for pairwise differences of the coefficients. This specification of the penalty function encourages grouping of positively correlated predictors combined with a sparsity solution. We construct an efficient algorithm to implement the HORSES procedure. We show via simulation that the proposed method outperforms other variable selection methods in terms of prediction error and parsimony. The technique is demonstrated on two data sets, a small data set from analysis of soil in Appalachia, and a high dimensional data set from a near infrared (NIR) spectroscopy study, showing the flexibility of the methodology.

Submitted 1 February, 2013; originally announced February 2013.

MSC Class: 62J07; 62P10

arXiv:1206.6674 [pdf, ps, other]

doi 10.1214/11-AOAS523

Meta-analysis of functional neuroimaging data using Bayesian nonparametric binary regression

Authors: Yu Ryan Yue, Martin A. Lindquist, Ji Meng Loh

Abstract: In this work we perform a meta-analysis of neuroimaging data, consisting of locations of peak activations identified in 162 separate studies on emotion. Neuroimaging meta-analyses are typically performed using kernel-based methods. However, these methods require the width of the kernel to be set a priori and to be constant across the brain. To address these issues, we propose a fully Bayesian nonparametric binary regression method to perform neuroimaging meta-analyses. In our method, each location (or voxel) has a probability of being a peak activation, and the corresponding probability function is based on a spatially adaptive Gaussian Markov random field (GMRF). We also include parameters in the model to robustify the procedure against miscoding of the voxel response. Posterior inference is implemented using efficient MCMC algorithms extended from those introduced in Holmes and Held [Bayesian Anal. 1 (2006) 145--168]. Our method allows the probability function to be locally adaptive with respect to the covariates, that is, to be smooth in one region of the covariate space and wiggly or even discontinuous in another. Posterior miscoding probabilities for each of the identified voxels can also be obtained, identifying voxels that may have been falsely classified as being activated. Simulation studies and application to the emotion neuroimaging data indicate that our method is superior to standard kernel-based methods.

Submitted 28 June, 2012; originally announced June 2012.

Comments: Published at http://dx.doi.org/10.1214/11-AOAS523 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS523

Journal ref: Annals of Applied Statistics 2012, Vol. 6, No. 2, 697-718

arXiv:1011.2037 [pdf, ps, other]

doi 10.1214/09-AOAS307

Density estimation for grouped data with application to line transect sampling

Authors: Woncheol Jang, Ji Meng Loh

Abstract: Line transect sampling is a method used to estimate wildlife populations, with the resulting data often grouped in intervals. Estimating the density from grouped data can be challenging. In this paper we propose a kernel density estimator of wildlife population density for such grouped data. Our method uses a combined cross-validation and smoothed bootstrap approach to select the optimal bandwidth for grouped data. Our simulation study shows that with the smoothing parameter selected with this method, the estimated density from grouped data matches the true density more closely than with other approaches. Using smoothed bootstrap, we also construct bias-adjusted confidence intervals for the value of the density at the boundary. We apply the proposed method to two grouped data sets, one from a wooden stake study where the true density is known, and the other from a survey of kangaroos in Australia.

Submitted 9 November, 2010; originally announced November 2010.

Comments: Published at http://dx.doi.org/10.1214/09-AOAS307 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS307

Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 2, 893-915

arXiv:0712.1458 [pdf, ps, other]

doi 10.1214/07-AOAS129

Accounting for spatial correlation in the scan statistic

Authors: Ji Meng Loh, Zhengyuan Zhu

Abstract: The spatial scan statistic is widely used in epidemiology and medical studies as a tool to identify hotspots of diseases. The classical spatial scan statistic assumes the number of disease cases in different locations have independent Poisson distributions, while in practice the data may exhibit overdispersion and spatial correlation. In this work, we examine the behavior of the spatial scan statistic when overdispersion and spatial correlation are present, and propose a modified spatial scan statistic to account for that. Some theoretical results are provided to demonstrate that ignoring the overdispersion and spatial correlation leads to an increased rate of false positives, which is verified through a simulation study. Simulation studies also show that our modified procedure can substantially reduce the rate of false alarms. Two data examples involving brain cancer cases in New Mexico and chickenpox incidence data in France are used to illustrate the practical relevance of the modified procedure.

Submitted 10 December, 2007; originally announced December 2007.

Comments: Published at http://dx.doi.org/10.1214/07-AOAS129 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS129

Journal ref: Annals of Applied Statistics 2007, Vol. 1, No. 2, 560-584

Showing 1–7 of 7 results for author: Loh, J M