-
Frequency Band Analysis of Nonstationary Multivariate Time Series
Authors:
Raanju R. Sundararajan,
Scott A. Bruce
Abstract:
Information from frequency bands in biomedical time series provides useful summaries of the observed signal. Many existing methods consider summaries of the time series obtained over a few well-known, pre-defined frequency bands of interest. However, these methods do not provide data-driven methods for identifying frequency bands that optimally summarize frequency-domain information in the time series. A new method to identify partition points in the frequency space of a multivariate locally stationary time series is proposed. These partition points signify changes across frequencies in the time-varying behavior of the signal and provide frequency band summary measures that best preserve the nonstationary dynamics of the observed series. An $L_2$ norm-based discrepancy measure that finds differences in the time-varying spectral density matrix is constructed, and its asymptotic properties are derived. New nonparametric bootstrap tests are also provided to identify significant frequency partition points and to identify components and cross-components of the spectral matrix exhibiting changes over frequencies. Finite-sample performance of the proposed method is illustrated via simulations. The proposed method is used to develop optimal frequency band summary measures for characterizing time-varying behavior in resting-state electroencephalography (EEG) time series, as well as identifying components and cross-components associated with each frequency partition point.
Submitted 9 January, 2023;
originally announced January 2023.
-
Adaptive Bayesian Sum of Trees Model for Covariate Dependent Spectral Analysis
Authors:
Yakun Wang,
Zeda Li,
Scott A. Bruce
Abstract:
This article introduces a flexible and adaptive nonparametric method for estimating the association between multiple covariates and power spectra of multiple time series. The proposed approach uses a Bayesian sum of trees model to capture complex dependencies and interactions between covariates and the power spectrum, which are often observed in studies of biomedical time series. Local power spectra corresponding to terminal nodes within trees are estimated nonparametrically using Bayesian penalized linear splines. The trees are considered to be random and fit using a Bayesian backfitting Markov chain Monte Carlo (MCMC) algorithm that sequentially considers tree modifications via reversible-jump MCMC techniques. For high-dimensional covariates, a sparsity-inducing Dirichlet hyperprior on tree splitting proportions is considered, which provides sparse estimation of covariate effects and efficient variable selection. By averaging over the posterior distribution of trees, the proposed method can recover both smooth and abrupt changes in the power spectrum across multiple covariates. Empirical performance is evaluated via simulations to demonstrate the proposed method's ability to accurately recover complex relationships and interactions. The proposed methodology is used to study gait maturation in young children by evaluating age-related changes in power spectra of stride interval time series in the presence of other covariates.
Submitted 29 September, 2021;
originally announced September 2021.
-
Classification of Categorical Time Series Using the Spectral Envelope and Optimal Scalings
Authors:
Zeda Li,
Scott A. Bruce,
Tian Cai
Abstract:
This article introduces a novel approach to the classification of categorical time series under the supervised learning paradigm. To construct meaningful features for categorical time series classification, we consider two relevant quantities: the spectral envelope and its corresponding set of optimal scalings. These quantities characterize oscillatory patterns in a categorical time series as the largest possible power at each frequency, or spectral envelope, obtained by assigning numerical values, or scalings, to categories that optimally emphasize oscillations at each frequency. Our procedure combines these two quantities to produce an interpretable and parsimonious feature-based classifier that can be used to accurately determine group membership for categorical time series. Classification consistency of the proposed method is investigated, and simulation studies are used to demonstrate accuracy in classifying categorical time series with various underlying group structures. Finally, we use the proposed method to explore key differences in oscillatory patterns of sleep stage time series for patients with different sleep disorders and accurately classify patients accordingly.
Submitted 4 February, 2021;
originally announced February 2021.
-
Adaptive Frequency Band Analysis for Functional Time Series
Authors:
Pramita Bagchi,
Scott A. Bruce
Abstract:
The frequency-domain properties of nonstationary functional time series often contain valuable information. These properties are characterized through the series' time-varying power spectrum. Practitioners seeking low-dimensional summary measures of the power spectrum often partition frequencies into bands and create collapsed measures of power within bands. However, standard frequency bands have largely been developed through manual inspection of time series data and may not adequately summarize power spectra. In this article, we propose a framework for adaptive frequency band estimation of nonstationary functional time series that optimally summarizes the time-varying dynamics of the series. We develop a scan statistic and search algorithm to detect changes in the frequency domain. We establish theoretical properties of this framework and develop a computationally efficient implementation. The validity of our method is also justified through numerous simulation studies and an application to analyzing electroencephalogram data in participants alternating between eyes open and eyes closed conditions.
Submitted 10 March, 2021; v1 submitted 2 February, 2021;
originally announced February 2021.
-
Adaptive Bayesian Spectral Analysis of Nonstationary Biomedical Time Series
Authors:
Scott A. Bruce,
Martica H. Hall,
Daniel J. Buysse,
Robert T. Krafty
Abstract:
Many studies of biomedical time series signals aim to measure the association between frequency-domain properties of time series and clinical and behavioral covariates. However, the time-varying dynamics of these associations are largely ignored due to a lack of methods that can assess the changing nature of the relationship through time. This article introduces a method for the simultaneous and automatic analysis of the association between the time-varying power spectrum and covariates. The procedure adaptively partitions the grid of time and covariate values into an unknown number of approximately stationary blocks and nonparametrically estimates local spectra within blocks through penalized splines. The approach is formulated in a fully Bayesian framework, in which the number and locations of partition points are random, and fit using reversible jump Markov chain Monte Carlo techniques. Estimation and inference averaged over the distribution of partitions allow for the accurate analysis of spectra with both smooth and abrupt changes. The proposed methodology is used to analyze the association between the time-varying spectrum of heart rate variability and self-reported sleep quality in a study of older adults serving as the primary caregiver for their ill spouse.
Submitted 4 October, 2016; v1 submitted 2 September, 2016;
originally announced September 2016.
-
A Scalable Framework for NBA Player and Team Comparisons Using Player Tracking Data
Authors:
Scott Bruce
Abstract:
The release of NBA player tracking data greatly enhances the granularity and dimensionality of basketball statistics used to evaluate and compare player performance. However, the high dimensionality of this new data source can be troublesome as it demands more computational resources and reduces the ability to easily interpret findings. Therefore, we must find a way to reduce the dimensionality of the data while retaining the ability to differentiate and compare player performance.
In this paper, Principal Component Analysis (PCA) is used to identify four principal components that account for 68% of the variation in player tracking data from the 2013-2014 regular season and intuitive interpretations of these new dimensions are developed by examining the statistics that influence them the most. In this new high variance, low dimensional space, you can easily compare statistical profiles across any or all of the principal component dimensions to evaluate characteristics that make certain players and teams similar or unique. A simple measure of similarity between two player or team statistical profiles based on the four principal component scores is also constructed. The Statistical Diversity Index (SDI) allows for quick and intuitive comparisons using the entirety of the player tracking data. As new statistics emerge, this framework is scalable as it can incorporate existing and new data sources by reconstructing the principal component dimensions and SDI for improved comparisons. Using principal component scores and SDI, several use cases are presented for improved personnel management.
Submitted 10 January, 2016; v1 submitted 13 November, 2015;
originally announced November 2015.
-
Nonparametric Distributed Learning Architecture for Big Data: Algorithm and Applications
Authors:
Scott Bruce,
Zeda Li,
Hsiang-Chieh Yang,
Subhadeep Mukhopadhyay
Abstract:
Dramatic increases in the size and complexity of modern datasets have made traditional "centralized" statistical inference prohibitive. In addition to computational challenges associated with big data learning, the presence of numerous data types (e.g. discrete, continuous, categorical, etc.) makes automation and scalability difficult. A question of immediate concern is how to design a data-intensive statistical inference architecture without changing the basic statistical modeling principles developed for "small" data over the last century. To address this problem, we present MetaLP, a flexible, distributed statistical modeling framework.
Submitted 26 February, 2018; v1 submitted 15 August, 2015;
originally announced August 2015.