-
Predicting Binary Neutron Star Postmerger Spectra Using Artificial Neural Networks
Authors:
Dimitrios Pesios,
Ioannis Koutalios,
Dimitris Kugiumtzis,
Nikolaos Stergioulas
Abstract:
Gravitational waves in the postmerger phase of binary neutron star mergers may become detectable with planned upgrades of existing gravitational-wave detectors or with more sensitive next-generation detectors. The construction of template banks for the postmerger phase can facilitate signal detection and parameter estimation. Here, we investigate the performance of an artificial neural network in predicting simulation-based waveforms in the frequency domain (restricted to the magnitude of the frequency spectrum and to equal-mass models) that depend on three parameters that can be inferred through observations, neutron star mass, tidal deformability, and the gradient of radius versus mass. Compared to a baseline study using multiple linear regression, we find that the artificial neural network can predict waveforms with higher accuracy and more consistent performance in a cross-validation study. We also demonstrate, through a recalibration procedure, that future reduction of uncertainties in empirical relations that are used in our hierarchical scheme will result in more accurate predicted postmerger spectra.
Submitted 15 May, 2024;
originally announced May 2024.
-
Detecting direct causality in multivariate time series: A comparative study
Authors:
Angeliki Papana,
Elsa Siggiridou,
Dimitris Kugiumtzis
Abstract:
The concept of Granger causality is increasingly being applied for the characterization of directional interactions in different applications. A multivariate framework for estimating Granger causality is essential in order to account for all the available information from multivariate time series. However, the inclusion of non-informative or non-significant variables creates estimation problems related to the 'curse of dimensionality'. To deal with this issue, direct causality measures using variable selection and dimension reduction techniques have been introduced. In this comparative work, the performance of an ensemble of bivariate and multivariate causality measures in the time domain is assessed, focusing on dimension reduction causality measures. In particular, different types of high-dimensional coupled discrete systems are used (involving up to 100 variables) and the robustness of the causality measures to time series length and different noise types is examined. The results of the simulation study highlight the superiority of the dimension reduction measures, especially for high-dimensional systems.
Submitted 2 November, 2020;
originally announced November 2020.
-
Evaluation of Granger causality measures for constructing networks from multivariate time series
Authors:
Elsa Siggiridou,
Christos Koutlis,
Alkiviadis Tsimpiris,
Dimitris Kugiumtzis
Abstract:
Granger causality and variants of this concept allow the study of complex dynamical systems as networks constructed from multivariate time series. In this work, a large number of Granger causality measures used to form causality networks from multivariate time series are assessed. These measures are in the time domain, such as model-based and information measures, the frequency domain and the phase domain. The study also aims to compare bivariate and multivariate measures, linear and nonlinear measures, as well as the use of dimension reduction in linear model-based measures and information measures. The latter is particularly relevant in the study of high-dimensional time series. For the performance of the multivariate causality measures, low and high dimensional coupled dynamical systems are considered in discrete and continuous time, as well as deterministic and stochastic. The measures are evaluated and ranked according to their ability to provide causality networks that match the original coupling structure. The simulation study concludes that the Granger causality measures using dimension reduction are superior and should be preferred particularly in studies involving many observed variables, such as multi-channel electroencephalograms and financial markets.
Submitted 31 October, 2019;
originally announced October 2019.
-
Evaluation of algorithms for correction of transcranial magnetic stimulation induced artifacts in electroencephalograms
Authors:
Panteleimon Vafeidis,
Vasilios K. Kimiskidis,
Dimitris Kugiumtzis
Abstract:
Transcranial magnetic stimulation combined with electroencephalography (TMS-EEG) is widely used to study the reactivity and connectivity of brain regions for clinical or research purposes. The electromagnetic pulse of the TMS device generates at the instant of administration an artifact of large amplitude and a duration up to tens of milliseconds that overlaps with brain activity. Methods for TMS artifact correction have been developed to remove the artifact and recover the underlying, immediate response of the cerebral cortex to the magnetic stimulus. In this study, three such algorithms are evaluated. Since there is no ground truth for the masked brain activity, pilot data formed from the superposition of the isolated TMS artifact on the EEG brain activity are used to evaluate the performance of the algorithms. Different scenarios of TMS-EEG experiments are considered for the evaluation: TMS at resting state, TMS inducing epileptiform discharges and TMS administered during epileptiform discharges. We show that a proposed gap filling method is able to reproduce qualitative characteristics and in many cases closely resemble the hidden EEG signal. Finally, shortcomings of the TMS correction algorithms as well as the pilot data approach are discussed.
Submitted 25 October, 2018;
originally announced October 2018.
-
Granger Causality in Multi-variate Time Series using a Time Ordered Restricted Vector Autoregressive Model
Authors:
Elsa Siggiridou,
Dimitris Kugiumtzis
Abstract:
Granger causality has been used for the investigation of the inter-dependence structure of the underlying systems of multi-variate time series. In particular, the direct causal effects are commonly estimated by the conditional Granger causality index (CGCI). In the presence of many observed variables and relatively short time series, CGCI may fail because it is based on vector autoregressive models (VAR) involving a large number of coefficients to be estimated. In this work, the VAR is restricted by a scheme that modifies the recently developed method of backward-in-time selection (BTS) of the lagged variables and the CGCI is combined with BTS. Further, the proposed approach is compared favorably to other restricted VAR representations, such as the top-down strategy, the bottom-up strategy, and the least absolute shrinkage and selection operator (LASSO), in terms of sensitivity and specificity of CGCI. This is shown by using simulations of linear and nonlinear, low and high-dimensional systems and different time series lengths. For nonlinear systems, CGCI from the restricted VAR representations are compared with analogous nonlinear causality indices. Further, CGCI in conjunction with BTS and other restricted VAR representations is applied to multi-channel scalp electroencephalogram (EEG) recordings of epileptic patients containing epileptiform discharges. CGCI on the restricted VAR, and BTS in particular, could track the changes in brain connectivity before, during and after epileptiform discharges, which was not possible using the full VAR representation.
Submitted 11 November, 2015;
originally announced November 2015.
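The conditional Granger causality index (CGCI) named in the abstract above compares the residual variance of a restricted autoregressive model (without the driver's lags) to that of an unrestricted one (with them). The following is a minimal sketch on the full VAR only; it does not implement the BTS restriction, and the function names, the pure-Python least-squares solver and the toy system with a common driver `z` are all assumptions made for illustration.

```python
import math
import random


def solve_linear(a, b):
    """Solve the square system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - tail) / aug[i][i]
    return w


def residual_variance(rows, targets):
    """Ordinary least squares with intercept; returns the mean squared residual."""
    design = [[1.0] + row for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    w = solve_linear(xtx, xty)
    return sum((y - sum(wi * di for wi, di in zip(w, d))) ** 2
               for d, y in zip(design, targets)) / len(targets)


def lags(series, t, order):
    """Lagged values series[t-1], ..., series[t-order]."""
    return [series[t - lag] for lag in range(1, order + 1)]


def conditional_granger_index(driver, response, conditioning, order):
    """ln(restricted residual variance / unrestricted residual variance)."""
    times = range(order, len(response))
    targets = [response[t] for t in times]
    # Restricted model: lags of the response and of all conditioning variables.
    restricted = [lags(response, t, order)
                  + [v for z in conditioning for v in lags(z, t, order)]
                  for t in times]
    # Unrestricted model: the same regressors plus the lags of the driver.
    unrestricted = [row + lags(driver, t, order) for row, t in zip(restricted, times)]
    return math.log(residual_variance(restricted, targets)
                    / residual_variance(unrestricted, targets))


# Toy system (hypothetical): z drives both x and y, x does not drive y directly.
rng = random.Random(1)
n = 400
z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.0] + [z[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, n)]
y = [0.0, 0.0] + [z[t - 2] + 0.1 * rng.gauss(0, 1) for t in range(2, n)]
bivariate_index = conditional_granger_index(x, y, [], 2)
conditional_index = conditional_granger_index(x, y, [z], 2)
```

In this toy system the bivariate index is large because `x` carries information about `z`, while conditioning on `z` brings the index close to zero. Each added variable contributes `order` coefficients to the model, which is the estimation burden that restricted VAR schemes aim to reduce.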
-
Markov chain order estimation with parametric significance tests of conditional mutual information
Authors:
Maria Papapetrou,
Dimitris Kugiumtzis
Abstract:
Despite the different approaches suggested in the literature, accurate estimation of the order of a Markov chain from a given symbol sequence is an open issue, especially when the order is moderately large. Here, parametric significance tests of conditional mutual information (CMI) of increasing order $m$, $I_c(m)$, on a symbol sequence are conducted for increasing orders $m$ in order to estimate the true order $L$ of the underlying Markov chain. CMI of order $m$ is the mutual information of two variables in the Markov chain being $m$ time steps apart, conditioning on the intermediate variables of the chain. The null distribution of CMI is approximated with a normal and a gamma distribution, deriving analytic expressions of their parameters, and with a gamma distribution whose parameters are derived from the mean and variance of the normal distribution. The accuracy of order estimation is assessed with the three parametric tests, and the parametric tests are compared to the randomization significance test and other known order estimation criteria using Monte Carlo simulations of Markov chains with different order $L$, length of symbol sequence $N$ and number of symbols $K$. The parametric test using the gamma distribution (with directly defined parameters) is consistently better than the other two parametric tests and matches well the performance of the randomization test. The tests are applied to genes and intergenic regions of DNA sequences, and the estimated orders are interpreted in view of the results from the simulation study. The application shows the usefulness of the parametric gamma test for long symbol sequences where the randomization test becomes prohibitively slow to compute.
Submitted 7 November, 2015;
originally announced November 2015.
-
Estimation of connectivity measures in gappy time series
Authors:
G. Papadopoulos,
D. Kugiumtzis
Abstract:
A new method is proposed to compute connectivity measures on multivariate time series with gaps. Rather than removing or filling the gaps, the rows of the joint data matrix containing empty entries are removed and the calculations are done on the remainder matrix. The method, called measure adapted gap removal (MAGR), can be applied to any connectivity measure that uses a joint data matrix, such as cross correlation, cross mutual information and transfer entropy. MAGR is favorably compared using these three measures to a number of known gap-filling techniques, as well as the gap closure. The superiority of MAGR is illustrated on time series from synthetic systems and financial time series.
Submitted 29 April, 2015;
originally announced May 2015.
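The row-removal idea described in the abstract above can be illustrated for cross correlation: build the joint data matrix of paired samples, drop every row that contains a gap, and compute the measure on what remains. This is a minimal sketch, assuming gaps are marked with `None` and both series have equal length; the function names are hypothetical and not taken from the paper.

```python
import math


def joint_rows(x, y, lag):
    """Rows (x_t, y_{t+lag}) of the joint data matrix, for lag >= 0."""
    return [(x[t], y[t + lag]) for t in range(len(x) - lag)]


def drop_gappy_rows(rows):
    """Keep only the rows in which every entry is observed."""
    return [row for row in rows if all(v is not None for v in row)]


def pearson(rows):
    """Pearson correlation of the two columns of a gap-free matrix."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mean_x) ** 2 for r in rows)
    syy = sum((r[1] - mean_y) ** 2 for r in rows)
    sxy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows)
    return sxy / math.sqrt(sxx * syy)


def gap_adapted_cross_correlation(x, y, lag):
    """Cross correlation at the given lag, computed on the remainder matrix."""
    return pearson(drop_gappy_rows(joint_rows(x, y, lag)))


# Toy data (hypothetical): y = 2x, each series with one gap at a different time.
x = [1.0, 2.0, None, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, None, 10.0, 12.0]
remaining = drop_gappy_rows(joint_rows(x, y, 0))
correlation = gap_adapted_cross_correlation(x, y, 0)
```

Because rows are dropped after the lagged pairs are formed, samples on either side of a gap are never paired with each other, which is what would happen if the gap were simply closed first.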
-
Simulation of Multivariate Non-Gaussian Autoregressive Time Series with Given Autocovariance and Marginals
Authors:
Dimitris Kugiumtzis,
Efthimia Bora-Senta
Abstract:
A semi-analytic method is proposed for the generation of realizations of a multivariate process of a given linear correlation structure and marginal distribution. This is an extension of a similar method for univariate processes, transforming the autocorrelation of the non-Gaussian process to that of a Gaussian process based on a piece-wise linear marginal transform from non-Gaussian to Gaussian marginal. The extension to multivariate processes involves the derivation of the autocorrelation matrix from the marginal transforms, which determines the generating vector autoregressive process. The effectiveness of the approach is demonstrated on systems designed under different scenarios of autocovariance and marginals.
Submitted 13 March, 2014;
originally announced March 2014.
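The building block behind the abstract above is a static transform that maps a Gaussian autoregressive process to a chosen marginal. The sketch below shows only that step for a univariate AR(1) process and an exponential target marginal, using the exact inverse CDF. It is a simplification under stated assumptions: it omits the paper's piece-wise linear transform, the adjustment of the Gaussian autocorrelation needed to hit a prescribed target autocorrelation, and the multivariate extension. All names are hypothetical.

```python
import math
import random


def gaussian_ar1(n, phi, rng):
    """Stationary Gaussian AR(1) with zero mean and unit variance."""
    scale = math.sqrt(1.0 - phi * phi)
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x


def normal_cdf(v):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def exponential_quantile(u):
    """Inverse CDF of the unit-rate exponential distribution."""
    return -math.log1p(-u)


def to_marginal(gaussian_series, quantile):
    """Map each Gaussian sample to the target marginal through its CDF value."""
    return [quantile(normal_cdf(v)) for v in gaussian_series]


rng = random.Random(0)
gaussian = gaussian_ar1(2000, 0.5, rng)
series = to_marginal(gaussian, exponential_quantile)
sample_mean = sum(series) / len(series)
```

The transform is monotonic, so the ordering of samples is preserved, but the autocorrelation of `series` generally differs from that of `gaussian`. Choosing the Gaussian correlation structure so that the transformed process has the desired one is the part the paper addresses and this sketch does not.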
-
Direct coupling information measure from non-uniform embedding
Authors:
Dimitris Kugiumtzis
Abstract:
A measure to estimate the direct and directional coupling in multivariate time series is proposed. The measure is an extension of a recently published measure of conditional Mutual Information from Mixed Embedding (MIME) for bivariate time series. In the proposed measure of Partial MIME (PMIME), the embedding is on all observed variables, and it is optimized in explaining the response variable. It is shown that PMIME correctly detects direct coupling, and outperforms the (linear) conditional Granger causality and the partial transfer entropy. We demonstrate that PMIME does not rely on a significance test or on embedding parameters, and that the number of observed variables has no effect on its statistical accuracy; it may only slow the computations. The importance of these points is shown in simulations and in an application to epileptic multi-channel scalp EEG.
Submitted 27 May, 2013;
originally announced May 2013.
-
Partial Transfer Entropy on Rank Vectors
Authors:
Dimitris Kugiumtzis
Abstract:
For the evaluation of information flow in bivariate time series, information measures have been employed, such as the transfer entropy (TE), the symbolic transfer entropy (STE), defined similarly to TE but on the ranks of the components of the reconstructed vectors, and the transfer entropy on rank vectors (TERV), similar to STE but forming the ranks for the future samples of the response system with regard to the current reconstructed vector. Here we extend TERV for multivariate time series, and account for the presence of confounding variables, called partial transfer entropy on ranks (PTERV). We investigate the asymptotic properties of PTERV, and also partial STE (PSTE), construct parametric significance tests under approximations with Gaussian and gamma null distributions, and show that the parametric tests cannot achieve the power of the randomization test using time-shifted surrogates. Using simulations on known coupled dynamical systems and applying parametric and randomization significance tests, we show that PTERV performs better than PSTE but worse than the partial transfer entropy (PTE). However, PTERV, unlike PTE, is robust to the presence of drifts in the time series and it is also not affected by the level of detrending.
Submitted 26 March, 2013;
originally announced March 2013.
-
Backward-in-Time Selection of the Order of Dynamic Regression Prediction Model
Authors:
Ioannis Vlachos,
Dimitris Kugiumtzis
Abstract:
We investigate the optimal structure of dynamic regression models used in multivariate time series prediction and propose a scheme to form the lagged variable structure called Backward-in-Time Selection (BTS) that takes into account feedback and multi-collinearity, often present in multivariate time series. We compare BTS to other known methods, also in conjunction with regularization techniques used for the estimation of model parameters, namely principal components, partial least squares and ridge regression estimation. The predictive efficiency of the different models is assessed by means of Monte Carlo simulations for different settings of feedback and multi-collinearity. The results show that BTS has consistently good prediction performance while other popular methods have varying and often inferior performance. The prediction performance of BTS was also found to be the best when tested on human electroencephalograms of an epileptic seizure and on the prediction of returns of indices of world financial markets.
Submitted 11 January, 2013;
originally announced January 2013.
-
Markov Chain Order estimation with Conditional Mutual Information
Authors:
Maria Papapetrou,
Dimitris Kugiumtzis
Abstract:
We introduce the Conditional Mutual Information (CMI) for the estimation of the Markov chain order. For a Markov chain of $K$ symbols, we define CMI of order $m$, $I_c(m)$, as the mutual information of two variables in the chain being $m$ time steps apart, conditioning on the intermediate variables of the chain. We find approximate analytic significance limits based on the estimation bias of CMI and develop a randomization significance test of $I_c(m)$, where the randomized symbol sequences are formed by random permutation of the components of the original symbol sequence. The significance test is applied for increasing $m$ and the Markov chain order is estimated by the last order for which the null hypothesis is rejected. We present the appropriateness of CMI-testing on Monte Carlo simulations and compare it to the Akaike and Bayesian information criteria, the maximal fluctuation method (Peres-Shields estimator) and a likelihood ratio test for increasing orders using $φ$-divergence. The order criterion of CMI-testing turns out to be superior for orders larger than one, but its effectiveness for large orders depends on data availability. In view of the results from the simulations, we interpret the estimated orders by the CMI-testing and the other criteria on genes and intergenic regions of DNA chains.
Submitted 1 January, 2013;
originally announced January 2013.
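The order-estimation procedure in the abstract above can be sketched with a plug-in estimate of $I_c(m)$ from word counts and a permutation test. This is a minimal sketch under assumptions: the entropy estimates are uncorrected frequencies, the search stops at the first order whose test is not rejected, and the names, the number of surrogates and the significance level are illustrative choices, not the paper's settings.

```python
import math
import random
from collections import Counter


def entropy(counts):
    """Plug-in Shannon entropy (in nats) from a Counter of occurrences."""
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


def cmi(seq, m):
    """I_c(m): mutual information of x_t and x_{t-m} given the symbols in between."""
    words = [tuple(seq[t - m:t + 1]) for t in range(m, len(seq))]
    h_full = entropy(Counter(words))                  # (x_{t-m}, ..., x_t)
    h_late = entropy(Counter(w[1:] for w in words))   # x_t with the intermediates
    h_early = entropy(Counter(w[:-1] for w in words)) # x_{t-m} with the intermediates
    h_mid = entropy(Counter(w[1:-1] for w in words))  # intermediates only
    return h_late + h_early - h_mid - h_full


def permutation_p_value(seq, m, n_surrogates, rng):
    """Share of randomly permuted sequences whose CMI reaches the observed one."""
    observed = cmi(seq, m)
    exceed = 0
    for _ in range(n_surrogates):
        shuffled = list(seq)
        rng.shuffle(shuffled)
        if cmi(shuffled, m) >= observed:
            exceed += 1
    return (exceed + 1) / (n_surrogates + 1)


def estimate_order(seq, max_order, n_surrogates=99, alpha=0.05, rng=None):
    """Last order m, scanning upwards, for which independence is rejected."""
    rng = rng or random.Random(0)
    order = 0
    for m in range(1, max_order + 1):
        if permutation_p_value(seq, m, n_surrogates, rng) > alpha:
            break
        order = m
    return order


# Toy sequence (hypothetical): strict alternation is a first-order chain.
sequence = [0, 1] * 50
estimated_order = estimate_order(sequence, max_order=3)
```

For the alternating sequence, `cmi(sequence, 1)` is close to $\ln 2$ while `cmi(sequence, 2)` vanishes, so the estimate is 1. The cost of the test grows with the number of surrogates and the sequence length, which is the motivation for parametric alternatives on long sequences.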
-
Reducing the Bias of Causality Measures
Authors:
A. Papana,
D. Kugiumtzis,
P. G. Larsson
Abstract:
Measures of the direction and strength of the interdependence between two time series are evaluated and modified in order to reduce the bias in the estimation of the measures, so that they give zero values when there is no causal effect. For this, point shuffling is employed as used in the frame of surrogate data. This correction is not specific to a particular measure and it is implemented here on measures based on state space reconstruction and information measures. The performance of the causality measures and their modifications is evaluated on simulated uncoupled and coupled dynamical systems and for different settings of embedding dimension, time series length and noise level. The corrected measures, and particularly the suggested corrected transfer entropy, turn out to stabilize at the zero level in the absence of causal effect and detect correctly the direction of information flow when it is present. The measures are also evaluated on electroencephalograms (EEG) for the detection of the information flow in the brain of an epileptic patient. The performance of the measures on EEG is interpreted, in view of the results from the simulation study.
Submitted 18 January, 2011;
originally announced January 2011.
-
Non-uniform state space reconstruction and coupling detection
Authors:
Ioannis Vlachos,
Dimitris Kugiumtzis
Abstract:
We investigate the state space reconstruction from multiple time series derived from continuous and discrete systems and propose a method for building embedding vectors progressively using information measure criteria regarding past, current and future states. The embedding scheme can be adapted for different purposes, such as mixed modelling, cross-prediction and Granger causality. In particular we apply this method in order to detect and evaluate information transfer in coupled systems. As a practical application, we investigate in records of scalp epileptic EEG the information flow across brain areas.
Submitted 2 July, 2010;
originally announced July 2010.
-
Transfer Entropy on Rank Vectors
Authors:
Dimitris Kugiumtzis
Abstract:
Transfer entropy (TE) is a popular measure of information flow found to perform consistently well in different settings. Symbolic transfer entropy (STE) is defined similarly to TE but on the ranks of the components of the reconstructed vectors rather than the reconstructed vectors themselves. First, we correct STE by forming the ranks for the future samples of the response system with regard to the current reconstructed vector. We give the grounds for this modified version of STE, which we call Transfer Entropy on Rank Vectors (TERV). Then we propose to use more than one step ahead in the formation of the future of the response in order to capture the information flow from the driving system over a longer time horizon. To assess the performance of STE, TE and TERV in detecting correctly the information flow we use receiver operating characteristic (ROC) curves formed by the measure values in the two coupling directions computed on a number of realizations of known weakly coupled systems. We also consider different settings of state space reconstruction, time series length and observational noise. The results show that TERV indeed improves STE and in some cases performs better than TE, particularly in the presence of noise, but overall TE gives more consistent results. The use of multiple steps ahead improves the accuracy of TE and TERV.
Submitted 2 July, 2010;
originally announced July 2010.
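The abstract above distinguishes ranks formed within a reconstructed vector from the rank of a future sample taken with regard to the current reconstructed vector. The sketch below shows one plausible reading of that distinction, assuming the future sample is ranked inside the vector obtained by prepending it to the current delay vector. It covers only the symbolization step, not the entropy estimation, and the function names are hypothetical.

```python
def rank_vector(v):
    """Rank of each component within v (0 = smallest); ties broken by position."""
    order = sorted(range(len(v)), key=lambda i: (v[i], i))
    ranks = [0] * len(v)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def delay_vector(series, t, m, tau):
    """Reconstructed vector [x_t, x_{t-tau}, ..., x_{t-(m-1)tau}]."""
    return [series[t - j * tau] for j in range(m)]


def future_rank_in_current(series, t, m, tau, step=1):
    """Rank of x_{t+step} within the vector augmented by the current delay vector."""
    augmented = [series[t + step]] + delay_vector(series, t, m, tau)
    return rank_vector(augmented)[0]


# Toy response series (hypothetical values).
y = [0.3, 0.1, 0.2, 0.25]
current_ranks = rank_vector(delay_vector(y, 2, 3, 1))  # ranks of [0.2, 0.1, 0.3]
future_rank = future_rank_in_current(y, 2, 3, 1)       # rank of 0.25 among them
```

Here the future sample 0.25 lies above two of the three current components, so its rank is 2. Using a larger `step` corresponds to looking more than one step ahead, as the abstract proposes.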
-
Measures of Analysis of Time Series (MATS): A MATLAB Toolkit for Computation of Multiple Measures on Time Series Data Bases
Authors:
Dimitris Kugiumtzis,
Alkiviadis Tsimpiris
Abstract:
In many applications, such as physiology and finance, large time series data bases are to be analyzed requiring the computation of linear, nonlinear and other measures. Such measures have been developed and implemented in commercial and freeware software rather selectively and independently. The Measures of Analysis of Time Series (MATS) MATLAB toolkit is designed to handle an arbitrarily large set of scalar time series and compute a large variety of measures on them, allowing for the specification of varying measure parameters as well. The variety of options with added facilities for visualization of the results support different settings of time series analysis, such as the detection of dynamics changes in long data records, resampling (surrogate or bootstrap) tests for independence and linearity with various test statistics, and discrimination power of different measures and for different combinations of their parameters. The basic features of MATS are presented and the implemented measures are briefly described. The usefulness of MATS is illustrated on some empirical examples along with screenshots.
Submitted 9 February, 2010;
originally announced February 2010.
-
Evaluation of Mutual Information Estimators for Time Series
Authors:
Angeliki Papana,
Dimitris Kugiumtzis
Abstract:
We study some of the most commonly used mutual information estimators, based on histograms of fixed or adaptive bin size, $k$-nearest neighbors and kernels, and focus on optimal selection of their free parameters. We examine the consistency of the estimators (convergence to a stable value with the increase of time series length) and the degree of deviation among the estimators. The optimization of parameters is assessed by quantifying the deviation of the estimated mutual information from its true or asymptotic value as a function of the free parameter. Moreover, some commonly used criteria for parameter selection are evaluated for each estimator. The comparative study is based on Monte Carlo simulations on time series from several linear and nonlinear systems of different lengths and noise levels. The results show that the $k$-nearest neighbor estimator is the most stable and the least affected by the method-specific parameter. A data adaptive criterion for optimal binning is suggested for linear systems but it is found to be rather conservative for nonlinear systems. It turns out that the binning and kernel estimators give the least deviation in identifying the lag of the first minimum of mutual information from nonlinear systems, and are stable in the presence of noise.
Submitted 30 April, 2009;
originally announced April 2009.
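Of the estimators compared in the abstract above, the fixed-bin histogram estimator is the simplest to write down. The following is a minimal sketch of an equal-width binning estimate of mutual information between a series and its lagged copy. The number of bins is the free parameter the abstract refers to; the function names and toy data are assumptions for illustration, and no bias correction is applied.

```python
import math
from collections import Counter


def bin_indices(values, bins):
    """Equal-width bin index of each value over the observed range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) or 1.0  # avoid division by zero for constant series
    return [min(int((v - lo) / width * bins), bins - 1) for v in values]


def binned_mutual_information(x, y, bins):
    """Plug-in mutual information (in nats) from a two-dimensional histogram."""
    bx, by = bin_indices(x, bins), bin_indices(y, bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def lagged_mutual_information(series, lag, bins):
    """Mutual information between x_t and x_{t+lag}."""
    return binned_mutual_information(series[:len(series) - lag], series[lag:], bins)


# Toy checks (hypothetical data): a ramp against itself and two independent patterns.
ramp = [float(v) for v in range(100)]
mi_self = binned_mutual_information(ramp, ramp, 4)
a = [0.0, 0.0, 1.0, 1.0] * 25
b = [0.0, 1.0, 0.0, 1.0] * 25
mi_independent = binned_mutual_information(a, b, 2)
```

With four equally populated bins the self-information of the ramp equals $\ln 4$, and the two independent patterns give zero. On real data the estimate depends noticeably on the bin count, which is why the selection of this parameter is a subject of the study.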
-
Turning Point Prediction of Oscillating Time Series using Local Dynamic Regression Models
Authors:
D. Kugiumtzis,
I. Vlachos
Abstract:
In the prediction of oscillating time series, the interest is in the turning points of successive oscillations rather than the samples themselves. For this purpose a scheme has been proposed; the state space reconstruction is limited to the turning points and the local (nearest neighbor) model is modified in order to predict the turning point magnitudes and times. This approach is extended here using a local dynamic regression model on both turning point magnitudes and times. Simulations on oscillating nonlinear systems show that the proposed approach gives better predictions of turning points than the standard local model applied to all the samples of the oscillating time series.
Submitted 12 September, 2008;
originally announced September 2008.
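The approach in the abstract above starts by reducing an oscillating series to its turning points, each described by a time and a magnitude. A minimal sketch of that extraction step follows, assuming a clean series in which a turning point is any sample strictly above or strictly below both neighbors. Noisy data would typically need smoothing or a wider comparison window first, and the function name is hypothetical.

```python
def turning_points(series):
    """List of (time, magnitude, kind) for strict local maxima and minima."""
    points = []
    for t in range(1, len(series) - 1):
        previous, current, following = series[t - 1], series[t], series[t + 1]
        if current > previous and current > following:
            points.append((t, current, "max"))
        elif current < previous and current < following:
            points.append((t, current, "min"))
    return points


# Toy oscillation (hypothetical values).
signal = [0, 1, 2, 1, 0, -1, 0, 1, 3, 2]
points = turning_points(signal)
# Differences between successive turning point times give the half-period durations.
durations = [later[0] - earlier[0] for earlier, later in zip(points, points[1:])]
```

The resulting sequences of magnitudes and durations are much shorter than the original series, and a state space reconstruction can then be formed from these sequences rather than from all samples.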
-
State Space Reconstruction for Multivariate Time Series Prediction
Authors:
I. Vlachos,
D. Kugiumtzis
Abstract:
In the nonlinear prediction of scalar time series, the common practice is to reconstruct the state space using time-delay embedding and apply a local model on neighborhoods of the reconstructed space. The method of false nearest neighbors is often used to estimate the embedding dimension. For prediction purposes, the optimal embedding dimension can also be estimated by some prediction error minimization criterion. We investigate the proper state space reconstruction for multivariate time series and modify the two abovementioned criteria to search for optimal embedding in the set of the variables and their delays. We pinpoint the problems that can arise in each case and compare the state space reconstructions (suggested by each of the two methods) on the predictive ability of the local model that uses each of them. Results obtained from Monte Carlo simulations on known chaotic maps revealed the non-uniqueness of optimum reconstruction in the multivariate case and showed that prediction criteria perform better when the task is prediction.
Submitted 12 September, 2008;
originally announced September 2008.
-
Evaluation of mutual information estimators on nonlinear dynamic systems
Authors:
A. Papana,
D. Kugiumtzis
Abstract:
Mutual information is a nonlinear measure used in time series analysis in order to measure the linear and non-linear correlations at any lag $τ$. The aim of this study is to evaluate some of the most commonly used mutual information estimators, i.e. estimators based on histograms (with fixed or adaptive bin size), $k$-nearest neighbors and kernels. We assess the accuracy of the estimators by Monte-Carlo simulations on time series from nonlinear dynamical systems of varying complexity. As the true mutual information is generally unknown, we investigate the existence and rate of consistency of the estimators (convergence to a stable value with the increase of time series length), and the degree of deviation among the estimators. The results show that the $k$-nearest neighbor estimator is the most stable and the least affected by the method-specific parameter.
Submitted 12 September, 2008;
originally announced September 2008.
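As a rough illustration of the simplest estimator family named in the abstract above (histograms with fixed bin size), the following sketch estimates the mutual information between $x_t$ and $x_{t+τ}$ from bin counts. The function name, the equal-width partition and the use of natural logarithms are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def histogram_mutual_information(x, tau, bins):
    """Estimate I(x_t; x_{t+tau}) in nats using a fixed equal-width partition."""
    if tau < 1 or tau >= len(x):
        raise ValueError("tau must satisfy 1 <= tau < len(x)")
    lo, hi = min(x), max(x)
    # Fall back to a unit width for a constant series (assumed convention).
    width = (hi - lo) / bins or 1.0

    def bin_index(value):
        # Clamp so that the maximum value falls into the last bin.
        return min(int((value - lo) / width), bins - 1)

    # Pairs of bin indices for the series and its copy delayed by tau.
    pairs = [(bin_index(a), bin_index(b)) for a, b in zip(x[:-tau], x[tau:])]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(i for i, _ in pairs)
    right = Counter(j for _, j in pairs)

    mi = 0.0
    for (i, j), count in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), expressed through raw counts.
        mi += (count / n) * math.log(count * n / (left[i] * right[j]))
    return mi


# A strictly alternating series is fully determined by its past value,
# so the estimate at lag 1 should be close to log(2).
mi_lag1 = histogram_mutual_information([0.0, 1.0] * 50, tau=1, bins=2)
# A constant series carries no information.
mi_const = histogram_mutual_information([3.0] * 20, tau=1, bins=4)
```

The number of bins plays the role of the method-specific parameter mentioned in the abstract; the estimate generally changes with it, which is why the study compares estimators across parameter values.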
-
Local prediction of turning points of oscillating time series
Authors:
D. Kugiumtzis
Abstract:
For oscillating time series, the prediction is often focused on the turning points. In order to predict the turning point magnitudes and times it is proposed to form the state space reconstruction only from the turning points and modify the local (nearest neighbor) model accordingly. The model on turning points gives optimal prediction at a lower dimensional state space than the optimal local model applied directly on the oscillating time series and is thus computationally more efficient. Monte Carlo simulations on different oscillating nonlinear systems showed that it gives better predictions of turning points and this is confirmed also for the time series of annual sunspots and total stress in a plastic deformation experiment.
Submitted 6 August, 2008;
originally announced August 2008.
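The first step of the scheme described in the abstract above, reducing an oscillating series to its turning points, might be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, extrema are detected by strict comparison with the two adjacent samples, and plateaus and noise are ignored, which a practical implementation would likely need to handle.

```python
def turning_points(x):
    """Return (time index, magnitude, kind) for each strict local extremum."""
    points = []
    for t in range(1, len(x) - 1):
        if x[t] > x[t - 1] and x[t] > x[t + 1]:
            points.append((t, x[t], "max"))
        elif x[t] < x[t - 1] and x[t] < x[t + 1]:
            points.append((t, x[t], "min"))
    return points


series = [0, 2, 1, 3, 0, 1]
extrema = turning_points(series)
# Magnitudes and inter-extremum times are the quantities to be predicted.
magnitudes = [value for _, value, _ in extrema]
durations = [b[0] - a[0] for a, b in zip(extrema, extrema[1:])]
```

The resulting sequences of magnitudes and durations are much shorter than the original series, which is consistent with the lower-dimensional state space reported in the abstract.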
-
Statistical Analysis for Long Term Correlations in the Stress Time Series of Jerky Flow
Authors:
Dimitris Kugiumtzis,
Elias C. Aifantis
Abstract:
Stress time series from the PLC effect typically exhibit stick-slips of upload and download type. These data contain strong short-term correlations of a nonlinear type. We investigate whether there are also long term correlations, i.e. whether the successive up-down patterns are generated by a deterministic mechanism. A statistical test is conducted for the null hypothesis that the sequence of the up-down patterns is totally random. The test is constructed by means of surrogate data, suitably generated to represent the null hypothesis. Linear and nonlinear estimates are used as test statistics, namely autocorrelation, mutual information and Lyapunov exponents, which are found to have proper performance for the test. The test is then applied to three stress time series under different experimental conditions. Rejections are obtained for only one of them, and not with all statistics. From the overall results we cannot conclude that the mechanism underlying the PLC effect has long memory.
Submitted 19 May, 2004;
originally announced May 2004.
-
Statistical analysis of Gene and Intergenic DNA Sequences
Authors:
D. Kugiumtzis,
A. Provata
Abstract:
Much of the on-going statistical analysis of DNA sequences is focused on the estimation of characteristics of coding and non-coding regions that would possibly allow discrimination of these regions. In the current approach, we concentrate specifically on genes and intergenic regions. To estimate the level and type of correlation in these regions we apply various statistical methods inspired by nonlinear time series analysis, namely the probability distribution of tuplets, the Mutual Information and the Identical Neighbour Fit. The methods are suitably modified to work on symbolic sequences and they are first tested for validity on sequences obtained from well-known simple deterministic and stochastic models. Then they are applied to the DNA sequence of chromosome 1 of Arabidopsis thaliana. The results suggest that correlations do exist in the DNA sequence but they are weak and that intergenic sequences tend to be more correlated than gene sequences. The use of statistical tests with surrogate data establishes these findings in a rigorous statistical manner.
Submitted 21 April, 2004;
originally announced April 2004.
-
Statically Transformed Autoregressive Process and Surrogate Data Test for Nonlinearity
Authors:
D. Kugiumtzis
Abstract:
The key feature for the successful implementation of the surrogate data test for nonlinearity on a scalar time series is the generation of surrogate data that represent exactly the null hypothesis (statically transformed normal stochastic process), i.e. they possess the sample autocorrelation and amplitude distribution of the given data. A new conceptual approach and algorithm for the generation of surrogate data is proposed, called statically transformed autoregressive process (STAP). It identifies a normal autoregressive process and a monotonic static transform, so that the transformed realisations of this process fulfill exactly both conditions and, unlike the surrogate data generated by other algorithms, do not suffer from bias in autocorrelation. The appropriateness of STAP is demonstrated with simulated and real world data.
Submitted 30 January, 2002; v1 submitted 15 October, 2001;
originally announced October 2001.
-
Test your surrogate data before you test for nonlinearity
Authors:
D. Kugiumtzis
Abstract:
The schemes for the generation of surrogate data in order to test the null hypothesis of a linear stochastic process undergoing a nonlinear static transform are investigated as to their consistency in representing the null hypothesis. In particular, we pinpoint some important caveats of the prominent algorithm of amplitude adjusted Fourier transform surrogates (AAFT) and compare it to the iterated AAFT (IAAFT), which is more consistent in representing the null hypothesis. It turns out that in many applications with real data the inferences of nonlinearity after marginal rejection of the null hypothesis were premature and have to be re-investigated, taking into account the inaccuracies in the AAFT algorithm, mainly concerning the mismatching of the linear correlations. In order to deal with such inaccuracies we propose the use of linear together with nonlinear polynomials as discriminating statistics. The application of this setup to some well-known real data sets cautions against the use of the AAFT algorithm.
Submitted 7 May, 1999;
originally announced May 1999.
-
State Space Reconstruction Parameters in the Analysis of Chaotic Time Series - the Role of the Time Window Length
Authors:
Dimitris Kugiumtzis
Abstract:
The most common state space reconstruction method in the analysis of chaotic time series is the Method of Delays (MOD). Many techniques have been suggested to estimate the parameters of MOD, i.e. the time delay $τ$ and the embedding dimension $m$. We discuss the applicability of these techniques with a critical view as to their validity, and point out the necessity of determining the overall time window length, $τ_w$, for successful embedding. Emphasis is put on the relation between $τ_w$ and the dynamics of the underlying chaotic system, and we suggest setting $τ_w \geq τ_p$, where $τ_p$ is the mean orbital period, approximated from the oscillations of the time series. The procedure is assessed using the correlation dimension for both synthetic and real data. For clean synthetic data, values of $τ_w$ larger than $τ_p$ always give good results given enough data and thus $τ_p$ can be considered as a lower limit ($τ_w \geq τ_p$). For noisy synthetic data and real data, an upper limit is reached for $τ_w$ which approaches $τ_p$ for increasing noise amplitude.
Submitted 29 February, 1996;
originally announced February 1996.
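The Method of Delays discussed in the abstract above can be illustrated with a short sketch that builds delay vectors from a scalar series. It assumes the usual convention that the time window length, in samples, is $τ_w = (m-1)τ$; the function name and the list-of-lists layout are hypothetical choices made for this example.

```python
def delay_embed(x, m, tau):
    """Return delay vectors [x[i], x[i+tau], ..., x[i+(m-1)*tau]] for all valid i."""
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    window = (m - 1) * tau  # time window length tau_w, in samples
    n_vectors = len(x) - window
    if n_vectors <= 0:
        raise ValueError("time series too short for this embedding")
    return [[x[i + j * tau] for j in range(m)] for i in range(n_vectors)]


series = list(range(10))
vectors = delay_embed(series, m=3, tau=2)
# Here tau_w = (3 - 1) * 2 = 4 samples, so 10 - 4 = 6 vectors are produced.
```

Different pairs $(m, τ)$ with the same product $(m-1)τ$ span the same stretch of the series, which is one way to read the abstract's emphasis on $τ_w$ rather than on $τ$ and $m$ separately.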
-
Chaotic time series Part I: Estimation of invariant properties in state space
Authors:
Dimitris Kugiumtzis,
Bjoern Lillekjendlie,
Nils Christophersen
Abstract:
Certain deterministic non-linear systems may show chaotic behaviour. Time series derived from such systems seem stochastic when analyzed with linear techniques. However, uncovering the deterministic structure is important because it allows for construction of more realistic and better models and thus improved predictive capabilities. This paper describes key features of chaotic systems including strange attractors and Lyapunov exponents. The emphasis is on state space reconstruction techniques that are used to estimate these properties, given scalar observations. Data generated from equations known to display chaotic behaviour are used for illustration. A compilation of applications to real data from widely different fields is given. If chaos is found to be present, one may proceed to build non-linear models, which is the topic of the second paper in this series.
Submitted 23 January, 1994; v1 submitted 14 January, 1994;
originally announced January 1994.
-
Chaotic time series Part II: System identification and prediction
Authors:
Bjoern Lillekjendlie,
Dimitris Kugiumtzis,
Nils Christophersen
Abstract:
This paper is the second in a series of two, and describes the current state of the art in modelling and prediction of chaotic time series. Sampled data from deterministic non-linear systems may look stochastic when analysed with linear methods. However, the deterministic structure may be uncovered and non-linear models constructed that allow improved prediction. We give the background for such methods from a geometrical point of view, and briefly describe the following types of methods: global polynomials, local polynomials, multi layer perceptrons and semi-local methods including radial basis functions. Some illustrative examples from known chaotic systems are presented, emphasising the increase in prediction error with time. We compare some of the algorithms with respect to prediction accuracy and storage requirements, and list applications of these methods to real data from widely different areas.
Submitted 23 January, 1994; v1 submitted 14 January, 1994;
originally announced January 1994.