
Monitoring for a Phase Transition in a Time Series of Wigner Matrices

Authors: Nina Dörnemann, Piotr Kokoszka, Tim Kutta, Sunmin Lee

Abstract: We develop methodology and theory for the detection of a phase transition in a time series of high-dimensional random matrices. In the model we study, at each time point $ t = 1,2,\ldots $, we observe a deformed Wigner matrix $ \mathbf{M}_t $, where the unobservable deformation represents a latent signal. This signal is detectable only in the supercritical regime, and our objective is to detect the transition to this regime in real time, as new matrix-valued observations arrive. Our approach is based on a partial sum process of extremal eigenvalues of $\mathbf{M}_t$, and its theoretical analysis combines state-of-the-art tools from random matrix theory and Gaussian approximations. The resulting detector is self-normalized, which ensures appropriate scaling for convergence and a pivotal limit, without any additional parameter estimation. Simulations show excellent performance for varying dimensions. Applications to pollution monitoring and social interactions in primates illustrate the usefulness of our approach.

Submitted 7 July, 2025; originally announced July 2025.

arXiv:2506.21172 [pdf, ps, other]

Prokhorov Metric Convergence of the Partial Sum Process for Reconstructed Functional Data

Authors: Tim Kutta, Piotr Kokoszka

Abstract: Motivated by applications in functional data analysis, we study the partial sum process of sparsely observed random functions. A key novelty of our analysis is a set of bounds for the distributional distance between the limit Brownian motion and the entire partial sum process in the function space. To measure the distance between distributions, we employ the Prokhorov and Wasserstein metrics. We show that these bounds have important probabilistic implications, including strong invariance principles and new couplings between the partial sums and their Gaussian limits. Our results are formulated for weakly dependent, nonstationary time series in the Banach space of d-dimensional, continuous functions. Mathematically, our approach rests on a new, two-step proof strategy: First, using entropy bounds from empirical process theory, we replace the function-valued partial sum process by a high-dimensional discretization. Second, using Gaussian approximations for weakly dependent, high-dimensional vectors, we obtain bounds on the distance. As a statistical application of our coupling results, we validate an open-ended monitoring scheme for sparse functional data. Existing probabilistic tools were not appropriate for this task.

Submitted 26 June, 2025; originally announced June 2025.

MSC Class: 62M10; 62R10

arXiv:2405.17318 [pdf, ps, other]

Extremal correlation coefficient for functional data

Authors: Mihyun Kim, Piotr Kokoszka

Abstract: We propose a coefficient that measures dependence in paired samples of functions. It has properties similar to the Pearson correlation, but differs in significant ways: (i) it is designed to measure dependence between curves, (ii) it focuses only on extreme curves. The new coefficient is derived within the framework of regular variation in Banach spaces. A consistent estimator is proposed and justified by an asymptotic analysis and a simulation study. The usefulness of the new coefficient is illustrated on financial and climate functional data.

Submitted 30 September, 2025; v1 submitted 27 May, 2024; originally announced May 2024.

MSC Class: 62R10; 60G70

arXiv:2404.02643 [pdf, other]

Estimation of the long-run variance of nonlinear time series with an application to change point analysis

Authors: Vaidotas Characiejus, Piotr Kokoszka, Xiangdong Meng

Abstract: For a broad class of nonlinear time series known as Bernoulli shifts, we establish the asymptotic normality of the smoothed periodogram estimator of the long-run variance. This estimator uses only a narrow band of Fourier frequencies around the origin and so has been extensively used in local Whittle estimation. Existing asymptotic normality results apply only to linear time series, so our work substantially extends the scope of the applicability of the smoothed periodogram estimator. As an illustration, we apply it to a test of changes in mean against long-range dependence. A simulation study is also conducted to illustrate the performance of the test for nonlinear time series.

Submitted 8 May, 2025; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: 32 pages, 2 figures

MSC Class: 62M10; 62M15

arXiv:1812.03108 [pdf, ps, other]

Principal components analysis of regularly varying functions

Authors: Piotr Kokoszka, Stilian Stoev, Qian Xiong

Abstract: The paper is concerned with asymptotic properties of the principal components analysis of functional data. The currently available results assume the existence of the fourth moment. We develop analogous results in a setting which does not require this assumption. Instead, we assume that the observed functions are regularly varying. We derive the asymptotic distribution of the sample covariance operator and of the sample functional principal components. We obtain a number of results on the convergence of moments and almost sure convergence. We apply the new theory to establish the consistency of the regression operator in a functional linear model.

Submitted 7 December, 2018; originally announced December 2018.

arXiv:1801.05466 [pdf, other]

Testing Separability of Functional Time Series

Authors: Panayiotis Constantinou, Piotr Kokoszka, Matthew Reimherr

Abstract: We derive and study a significance test for determining if a panel of functional time series is separable. In the context of this paper, separability means that the covariance structure factors into the product of two functions, one depending only on time and the other depending only on the coordinates of the panel. Separability is a property which can dramatically improve computational efficiency by substantially reducing model complexity. It is especially useful for functional data as it implies that the functional principal components are the same for each member of the panel. However, such an assumption must be verified before proceeding with further inference. Our approach is based on functional norm differences and provides a test with well-controlled size and high power. We establish our procedure quite generally, allowing one to test separability of autocovariances as well. In addition to an asymptotic justification, our methodology is validated by a simulation study. It is applied to functional panels of particulate pollution and stock market data.

Submitted 16 January, 2018; originally announced January 2018.

arXiv:1612.02794 [pdf, ps, other]

Change point detection in heteroscedastic time series

Authors: Tomasz Gorecki, Lajos Horvath, Piotr Kokoszka

Abstract: Many time series exhibit changes both in level and in variability. Generally, it is more important to detect a change in the level, and changing or smoothly evolving variability can confound existing tests. This paper develops a framework for testing for shifts in the level of a series which accommodates the possibility of changing variability. The resulting tests are robust both to heteroskedasticity and serial dependence. They rely on a new functional central limit theorem for dependent random variables whose variance can change or trend in a substantial way. This new result is of independent interest as it can be applied in many inferential contexts applicable to time series. Its application to change point tests relies on a new approach which utilizes Karhunen-Loève expansions of the limit Gaussian processes. After presenting the theory in the most commonly encountered setting of the detection of a change point in the mean, we show how it can be extended to linear and nonlinear regression. Finite sample performance is examined by means of a simulation study and an application to yields on US treasury bonds.

Submitted 8 December, 2016; originally announced December 2016.

arXiv:1302.6102 [pdf, ps, other]

Functional Data Analysis with Increasing Number of Projections

Authors: Stefan Fremdt, Lajos Horváth, Piotr Kokoszka, Josef G. Steinebach

Abstract: Functional principal components (FPC's) provide the most important and most extensively used tool for dimension reduction and inference for functional data. The selection of the number, d, of the FPC's to be used in a specific procedure has attracted a fair amount of attention, and a number of reasonably effective approaches exist. Intuitively, they assume that the functional data can be sufficiently well approximated by a projection onto a finite-dimensional subspace, and the error resulting from such an approximation does not impact the conclusions. This has been shown to be a very effective approach, but it is desirable to understand the behavior of many inferential procedures by considering the projections on subspaces spanned by an increasing number of the FPC's. Such an approach reflects more fully the infinite-dimensional nature of functional data, and allows one to derive procedures which are fairly insensitive to the selection of d. This is accomplished by considering limits as d tends to infinity with the sample size. We propose a specific framework in which we let d tend to infinity by deriving a normal approximation for the two-parameter partial sum process of the scores ξ_{i,j} of the i-th function with respect to the j-th FPC. Our approximation can be used to derive statistics that use segments of observations and segments of the FPC's. We apply our general results to derive two inferential procedures for the mean function: a change-point test and a two-sample test. In addition to the asymptotic theory, the tests are assessed through a small simulation study and a data example.

Submitted 25 February, 2013; originally announced February 2013.

arXiv:1105.0019 [pdf, other]

Estimation of the mean of functional time series and a two sample problem

Authors: Lajos Horvath, Piotr Kokoszka, Ron Reeder

Abstract: This paper is concerned with inference based on the mean function of a functional time series, which is defined as a collection of curves obtained by splitting a continuous time record, e.g. into daily or annual curves. We develop a normal approximation for the functional sample mean, and then focus on the estimation of the asymptotic variance kernel. Using these results, we develop and asymptotically justify a testing procedure for the equality of means in two functional samples exhibiting temporal dependence. Evaluated by means of a simulation study and application to real data sets, this two-sample procedure enjoys good size and power in finite samples. We provide the details of its numerical implementation.

Submitted 29 April, 2011; originally announced May 2011.

arXiv:1104.4049 [pdf, other]

Testing the Equality of Covariance Operators in Functional Samples

Authors: Stefan Fremdt, Lajos Horváth, Piotr Kokoszka, Josef G. Steinebach

Abstract: We propose a robust test for the equality of the covariance structures in two functional samples. The test statistic has a chi-square asymptotic distribution with a known number of degrees of freedom, which depends on the level of dimension reduction needed to represent the data. Detailed analysis of the asymptotic properties is developed. Finite sample performance is examined by a simulation study and an application to egg-laying curves of fruit flies.

Submitted 21 April, 2011; v1 submitted 20 April, 2011; originally announced April 2011.

MSC Class: 62G10 (Primary) 62G20; 62H15 (Secondary)

arXiv:1104.3074 [pdf, ps, other]

doi 10.3150/12-BEJ418

Consistency of the mean and the principal components of spatially distributed functional data

Authors: Siegfried Hörmann, Piotr Kokoszka

Abstract: This paper develops a framework for the estimation of the functional mean and the functional principal components when the functions form a random field. More specifically, the data we study consist of curves $X(\mathbf{s}_k;t),t\in[0,T]$, observed at spatial points $\mathbf{s}_1,\mathbf{s}_2,\ldots,\mathbf{s}_N$. We establish conditions for the sample average (in space) of the $X(\mathbf{s}_k)$ to be a consistent estimator of the population mean function, and for the usual empirical covariance operator to be a consistent estimator of the population covariance operator. These conditions involve an interplay of the assumptions on an appropriately defined dependence between the functions $X(\mathbf{s}_k)$ and the assumptions on the spatial distribution of the points $\mathbf{s}_k$. The rates of convergence may be the same as for i.i.d. functional samples, but generally depend on the strength of dependence and appropriately quantified distances between the points $\mathbf{s}_k$. We also formulate conditions for the lack of consistency.

Submitted 11 December, 2013; v1 submitted 15 April, 2011; originally announced April 2011.

Comments: Published at http://dx.doi.org/10.3150/12-BEJ418 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ418

Journal ref: Bernoulli 2013, Vol. 19, No. 5A, 1535-1558

arXiv:1010.0792 [pdf, ps, other]

doi 10.1214/09-AOS768

Weakly dependent functional data

Authors: Siegfried Hörmann, Piotr Kokoszka

Abstract: Functional data often arise from measurements on fine time grids and are obtained by separating an almost continuous time record into natural consecutive intervals, for example, days. The functions thus obtained form a functional time series, and the central issue in the analysis of such data consists in taking into account the temporal dependence of these functional observations. Examples include daily curves of financial transaction data and daily patterns of geophysical and environmental data. For scalar and vector-valued stochastic processes, a large number of dependence notions have been proposed, mostly involving mixing-type distances between $\sigma$-algebras. In time series analysis, measures of dependence based on moments have proven most useful (autocovariances and cumulants). We introduce a moment-based notion of dependence for functional time series which involves $m$-dependence. We show that it is applicable to linear as well as nonlinear functional time series. Then we investigate the impact of dependence thus quantified on several important statistical procedures for functional data. We study the estimation of the functional principal components, the long-run covariance matrix, change point detection and the functional linear model. We explain when temporal dependence affects the results obtained for i.i.d. functional observations and when these results are robust to weak dependence.

Submitted 5 October, 2010; originally announced October 2010.

Comments: Published at http://dx.doi.org/10.1214/09-AOS768 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS768

Journal ref: Annals of Statistics 2010, Vol. 38, No. 3, 1845-1884

arXiv:0810.4012 [pdf, ps, other]

doi 10.3150/08-BEJ122

Testing for changes in polynomial regression

Authors: Alexander Aue, Lajos Horváth, Marie Hušková, Piotr Kokoszka

Abstract: We consider a nonlinear polynomial regression model in which we wish to test the null hypothesis of structural stability in the regression parameters against the alternative of a break at an unknown time. We derive the extreme value distribution of a maximum-type test statistic which is asymptotically equivalent to the maximally selected likelihood ratio. The resulting test is easy to apply and has good size and power, even in small samples.

Submitted 22 October, 2008; originally announced October 2008.

Comments: Published at http://dx.doi.org/10.3150/08-BEJ122 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ122

Journal ref: Bernoulli 2008, Vol. 14, No. 3, 637-660

arXiv:0805.2029 [pdf, ps, other]

doi 10.3150/07-BEJ113

Sample autocovariances of long-memory time series

Authors: Lajos Horváth, Piotr Kokoszka

Abstract: We find the asymptotic distribution of the sample autocovariances of long-memory processes in cases of finite and infinite fourth moment. Depending on the interplay of assumptions on moments and the intensity of dependence, there are three types of convergence rates and limit distributions. In particular, a normal approximation with the standard rate does not always hold in practically relevant cases.

Submitted 14 May, 2008; originally announced May 2008.

Comments: Published at http://dx.doi.org/10.3150/07-BEJ113 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ113

Journal ref: Bernoulli 2008, Vol. 14, No. 2, 405-418

arXiv:math/0607803 [pdf, ps, other]

doi 10.1214/009053606000000254

On discriminating between long-range dependence and changes in mean

Authors: István Berkes, Lajos Horváth, Piotr Kokoszka, Qi-Man Shao

Abstract: We develop a testing procedure for distinguishing between a long-range dependent time series and a weakly dependent time series with change-points in the mean. In the simplest case, under the null hypothesis the time series is weakly dependent with one change in mean at an unknown point, and under the alternative it is long-range dependent. We compute the CUSUM statistic $T_n$, which allows us to construct an estimator $\hat{k}$ of a change-point. We then compute the statistic $T_{n,1}$ based on the observations up to time $\hat{k}$ and the statistic $T_{n,2}$ based on the observations after time $\hat{k}$. The statistic $M_n=\max[T_{n,1},T_{n,2}]$ converges to a well-known distribution under the null, but diverges to infinity if the observations exhibit long-range dependence. The theory is illustrated by examples and an application to the returns of the Dow Jones index.

Submitted 31 July, 2006; originally announced July 2006.

Comments: Published at http://dx.doi.org/10.1214/009053606000000254 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOS-AOS0159

MSC Class: 62M10; 62G10 (Primary)

Journal ref: Annals of Statistics 2006, Vol. 34, No. 3, 1140-1165

arXiv:math/0503520 [pdf, ps, other]

doi 10.1214/105051604000000783

Near-integrated GARCH sequences

Authors: Istvan Berkes, Lajos Horvath, Piotr Kokoszka

Abstract: Motivated by regularities observed in time series of returns on speculative assets, we develop an asymptotic theory of GARCH(1,1) processes {y_k} defined by the equations y_k=σ_kε_k, σ_k^2=ω+αy_{k-1}^2+βσ_{k-1}^2 for which the sum α+β approaches unity as the number of available observations tends to infinity. We call such sequences near-integrated. We show that the asymptotic behavior of near-integrated GARCH(1,1) processes critically depends on the sign of γ:=α+β-1. We find assumptions under which the solutions exhibit increasing oscillations and show that these oscillations grow approximately like a power function if γ≤0 and exponentially if γ>0. We establish an additive representation for the near-integrated GARCH(1,1) processes which is more convenient to use than the traditional multiplicative Volterra series expansion.

Submitted 24 March, 2005; originally announced March 2005.

Comments: Published at http://dx.doi.org/10.1214/105051604000000783 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AAP-AAP069

MSC Class: 62M10; 91B84. (Primary)

Journal ref: Annals of Applied Probability 2005, Vol. 15, No. 1B, 890-913

Showing 1–16 of 16 results for author: Kokoszka, P