-
Sample Splitting and Assessing Goodness-of-fit of Time Series
Authors:
Richard A. Davis,
Leon Fernandes
Abstract:
A fundamental and often final step in time series modeling is to assess the quality of fit of a proposed model to the data. Since the underlying distribution of the innovations that generate a model is often not prescribed, goodness-of-fit tests typically take the form of testing the fitted residuals for serial independence. However, these fitted residuals are inherently dependent since they are based on the same parameter estimates, and thus standard tests of serial independence, such as those based on the autocorrelation function (ACF) or distance correlation function (ADCF) of the fitted residuals, need to be adjusted. The sample splitting procedure in Pfister et al. (2018) is one such fix for the case of models for independent data, but fails to work in the dependent setting. In this paper, sample splitting is leveraged in the time series setting to perform tests of serial dependence of fitted residuals using the ACF and ADCF. Here the first $f_n$ of the data points are used to estimate the parameters of the model and then, using these parameter estimates, the last $l_n$ of the data points are used to compute the estimated residuals. Tests for serial independence are then based on these $l_n$ residuals. As long as the overlap between the $f_n$ and $l_n$ data splits is asymptotically 1/2, the ACF and ADCF tests of serial independence often have the same limit distributions as though the underlying residuals are indeed iid. In particular, if the first half of the data is used to estimate the parameters and the estimated residuals are computed for the entire data set based on these parameter estimates, then the ACF and ADCF can have the same limit distributions as though the residuals were iid. This procedure ameliorates the need for adjustment in the construction of confidence bounds for both the ACF and ADCF in goodness-of-fit testing.
Submitted 11 March, 2024;
originally announced March 2024.
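The split described in the abstract above (estimate on the first half, compute residuals over the entire series) can be sketched as follows. This is only an illustration under stated assumptions: the AR(1) model, the least-squares estimator, the simulated data, and the helper names `fit_ar1`, `sample_acf`, and `split_residual_acf` are all hypothetical choices, not the authors' implementation, and the ADCF version of the test is not shown.

```python
import numpy as np


def fit_ar1(x):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + e[t] (assumed model)."""
    return float(np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1]))


def sample_acf(e, max_lag):
    """Sample autocorrelations of e at lags 1..max_lag."""
    e = e - e.mean()
    denom = np.dot(e, e)
    return np.array([np.dot(e[h:], e[:-h]) / denom for h in range(1, max_lag + 1)])


def split_residual_acf(x, max_lag=10):
    """Estimate phi on the first half, then compute residuals on the whole series."""
    f_n = len(x) // 2                      # first half used for estimation
    phi_hat = fit_ar1(x[:f_n])
    resid = x[1:] - phi_hat * x[:-1]       # residuals over the entire data set
    acf = sample_acf(resid, max_lag)
    bound = 1.96 / np.sqrt(len(resid))     # usual iid bound, no adjustment applied
    return phi_hat, acf, bound


# Toy data: a simulated AR(1) series with phi = 0.6 (an assumption for the demo).
rng = np.random.default_rng(0)
n = 2000
noise = rng.standard_normal(n)
x = np.empty(n)
x[0] = noise[0]
for t in range(1, n):
    x[t] = 0.6 * x[t - 1] + noise[t]

phi_hat, acf, bound = split_residual_acf(x)
print("phi_hat:", round(phi_hat, 3))
print("lags outside the iid bound:", int(np.sum(np.abs(acf) > bound)))
```

The point of the design is that the unadjusted bound `1.96 / sqrt(n)` is compared directly against the residual ACF, which the abstract suggests is often justified when the estimation and residual samples overlap by about one half.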
-
Clustering Multivariate Time Series using Energy Distance
Authors:
Richard A. Davis,
Leon Fernandes,
Konstantinos Fokianos
Abstract:
A novel methodology is proposed for clustering multivariate time series data using energy distance defined in Székely and Rizzo (2013). Specifically, a dissimilarity matrix is formed using the energy distance statistic to measure separation between the finite dimensional distributions for the component time series. Once the pairwise dissimilarity matrix is calculated, a hierarchical clustering method is then applied to obtain the dendrogram. This procedure is completely nonparametric as the dissimilarities between stationary distributions are directly calculated without making any model assumptions. In order to justify this procedure, asymptotic properties of the energy distance estimates are derived for general stationary and ergodic time series. The method is illustrated in a simulation study for various component time series that are either linear or nonlinear. Finally, the methodology is applied to two examples: one involves GDP of selected countries and the other is population size of various states in the U.S.A. in the years 1900-1999.
Submitted 24 March, 2023;
originally announced March 2023.
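The pipeline in the abstract above (pairwise energy distances, then hierarchical clustering) might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the block length `m = 2` for the finite dimensional distributions, the V-statistic form of the energy distance, average linkage, the toy series, and the helper names `lag_embed`, `energy_distance`, `energy_dissimilarity`, and `simulate_ar1`. It relies on SciPy's `cdist`, `squareform`, `linkage`, and `fcluster`.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import cdist, squareform


def lag_embed(x, m):
    """Stack overlapping blocks (x[t], ..., x[t+m-1]) as rows."""
    return np.column_stack([x[i:len(x) - m + 1 + i] for i in range(m)])


def energy_distance(a, b):
    """V-statistic energy distance: 2 E|A-B| - E|A-A'| - E|B-B'|."""
    value = 2.0 * cdist(a, b).mean() - cdist(a, a).mean() - cdist(b, b).mean()
    return max(value, 0.0)  # guard against tiny negative rounding error


def energy_dissimilarity(series, m=2):
    """Symmetric matrix of energy distances between lag-embedded series."""
    blocks = [lag_embed(s, m) for s in series]
    k = len(blocks)
    d = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d[i, j] = d[j, i] = energy_distance(blocks[i], blocks[j])
    return d


def simulate_ar1(phi, n, rng):
    """Toy AR(1) generator used only for this demonstration."""
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x


rng = np.random.default_rng(0)
n = 400
series = [
    simulate_ar1(0.5, n, rng),
    simulate_ar1(0.5, n, rng),
    4.0 * rng.standard_normal(n),   # iid noise with a much larger scale
    4.0 * rng.standard_normal(n),
]

D = energy_dissimilarity(series)
Z = linkage(squareform(D), method="average")   # dendrogram in linkage form
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

No model is fitted at any step, which matches the nonparametric character described in the abstract; the choice of block length and linkage method would need to be made with the data at hand.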
-
Kernel PCA for multivariate extremes
Authors:
Marco Avella-Medina,
Richard A. Davis,
Gennady Samorodnitsky
Abstract:
We propose kernel PCA as a method for analyzing the dependence structure of multivariate extremes and demonstrate that it can be a powerful tool for clustering and dimension reduction. Our work provides some theoretical insight into the preimages obtained by kernel PCA, demonstrating that under certain conditions they can effectively identify clusters in the data. We build on these new insights to characterize rigorously the performance of kernel PCA based on an extremal sample, i.e., the angular part of random vectors for which the radius exceeds a large threshold. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory and provide a careful analysis in the case where the extremes are generated from a linear factor model. We give theoretical guarantees on the performance of kernel PCA preimages of such extremes by leveraging their asymptotic distribution together with Davis-Kahan perturbation bounds. Our theoretical findings are complemented with numerical experiments illustrating the finite sample performance of our methods.
Submitted 23 November, 2022; v1 submitted 23 November, 2022;
originally announced November 2022.
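The "extremal sample" mentioned in the abstract above (the angular part of random vectors whose radius exceeds a large threshold) can be illustrated with a short sketch. The Euclidean norm, the 95% empirical quantile as threshold, the Student-t toy data, and the function name `extremal_angles` are assumptions for illustration only; the kernel PCA step itself is not shown.

```python
import numpy as np


def extremal_angles(x, quantile=0.95):
    """Return x / |x| for the rows whose norm exceeds an empirical quantile."""
    radius = np.linalg.norm(x, axis=1)
    threshold = np.quantile(radius, quantile)   # assumed threshold rule
    keep = radius > threshold
    return x[keep] / radius[keep, None]


# Heavy-tailed toy data (Student-t with 3 degrees of freedom), for the demo only.
rng = np.random.default_rng(1)
x = rng.standard_t(df=3, size=(5000, 2))
angles = extremal_angles(x)
print(angles.shape)
```

The returned rows lie on the unit sphere, so any clustering or dimension-reduction method applied afterwards works with directions of extremes rather than their magnitudes.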
-
Spectral learning of multivariate extremes
Authors:
Marco Avella Medina,
Richard A. Davis,
Gennady Samorodnitsky
Abstract:
We propose a spectral clustering algorithm for analyzing the dependence structure of multivariate extremes. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory. Our work studies the theoretical performance of spectral clustering based on a random $k$-nearest neighbor graph constructed from an extremal sample, i.e., the angular part of random vectors for which the radius exceeds a large threshold. In particular, we derive the asymptotic distribution of extremes arising from a linear factor model and prove that, under certain conditions, spectral clustering can consistently identify the clusters of extremes arising in this model. Leveraging this result we propose a simple consistent estimation strategy for learning the angular measure. Our theoretical findings are complemented with numerical experiments illustrating the finite sample performance of our methods.
Submitted 1 August, 2023; v1 submitted 15 November, 2021;
originally announced November 2021.
-
Modeling of time series using random forests: theoretical developments
Authors:
Richard A. Davis,
Mikkel S. Nielsen
Abstract:
In this paper we study asymptotic properties of random forests within the framework of nonlinear time series modeling. While random forests have been successfully applied in various fields, the theoretical justification has not been considered for their use in a time series setting. Under mild conditions, we prove a uniform concentration inequality for regression trees built on nonlinear autoregressive processes and, subsequently, we use this result to prove consistency for a large class of random forests. The results are supported by various simulations.
Submitted 6 August, 2020;
originally announced August 2020.
-
Goodness-of-Fit Testing for Time Series Models via Distance Covariance
Authors:
Phyllis Wan,
Richard A. Davis
Abstract:
In many statistical modeling frameworks, goodness-of-fit tests are typically administered to the estimated residuals. In the time series setting, whiteness of the residuals is assessed using the sample autocorrelation function. For many time series models, especially those used for financial time series, the key assumption on the residuals is that they are in fact independent and not just uncorrelated. In this paper, we apply the auto-distance covariance function (ADCV) to evaluate the serial dependence of the estimated residuals. Distance covariance can discriminate between dependence and independence of two random vectors. The limit behavior of the test statistic based on the ADCV is derived for a general class of time series models. One of the key aspects in this theory is adjusting for the dependence that arises due to parameter estimation. This adjustment has essentially the same form regardless of the model specification. We illustrate the results in simulated examples.
Submitted 2 March, 2019;
originally announced March 2019.
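As a rough illustration of the auto-distance covariance function (ADCV) named in the abstract above, the sketch below computes the standard double-centered sample distance covariance between a univariate series and its lagged copy. The function names, the toy series, and the restriction to scalar observations are assumptions; the adjustment for parameter estimation that the abstract describes is not implemented here.

```python
import numpy as np


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences."""
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic form) of two equal-length samples."""
    return float((_centered_distances(x) * _centered_distances(y)).mean())


def adcv(x, max_lag):
    """Squared sample distance covariance between x[t] and x[t+h], h = 1..max_lag."""
    return np.array(
        [distance_covariance_sq(x[:-h], x[h:]) for h in range(1, max_lag + 1)]
    )


rng = np.random.default_rng(2)
e = rng.standard_normal(500)              # iid toy "residuals"
z = e[1:] * np.abs(e[:-1])                # uncorrelated but serially dependent toy series

adcv_iid = adcv(e, max_lag=5)
adcv_dep = adcv(z, max_lag=5)
print(np.round(adcv_iid, 4))
print(np.round(adcv_dep, 4))
```

The second toy series is included because it is uncorrelated yet not independent across time, which is the kind of structure the abstract says the ACF alone cannot detect.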
-
Are Extreme Value Estimation Methods Useful for Network Data?
Authors:
Phyllis Wan,
Tiandong Wang,
Richard A. Davis,
Sidney I. Resnick
Abstract:
Preferential attachment is an appealing edge generating mechanism for modeling social networks. It provides both an intuitive description of network growth and an explanation for the observed power laws in degree distributions. However, there are often limitations in fitting parametric network models to data due to the complex nature of real-world networks. In this paper, we consider a semi-parametric estimation approach by looking at only the nodes with large in- or out-degrees of the network. This method examines the tail behavior of both the marginal and joint degree distributions and is based on extreme value theory. We compare it with the existing parametric approaches and demonstrate how it can provide more robust estimates of parameters associated with the network when the data are corrupted or when the model is misspecified.
Submitted 19 December, 2017;
originally announced December 2017.
-
Fitting the Linear Preferential Attachment Model
Authors:
Phyllis Wan,
Tiandong Wang,
Richard A. Davis,
Sidney I. Resnick
Abstract:
Preferential attachment is an appealing mechanism for modeling power-law behavior of the degree distributions in directed social networks. In this paper, we consider methods for fitting a 5-parameter linear preferential model to network data under two data scenarios. In the case where full history of the network formation is given, we derive the maximum likelihood estimator of the parameters and show that it is strongly consistent and asymptotically normal. In the case where only a single-time snapshot of the network is available, we propose an estimation method which combines method of moments with an approximation to the likelihood. The resulting estimator is also strongly consistent and performs quite well compared to the MLE. We illustrate both estimation procedures through simulated data, and explore the usage of this model in a real data example.
Submitted 27 August, 2017; v1 submitted 8 March, 2017;
originally announced March 2017.
-
Semiparametric estimation for isotropic max-stable space-time processes
Authors:
Sven Buhl,
Richard A. Davis,
Claudia Klüppelberg,
Christina Steinkohl
Abstract:
Regularly varying space-time processes have proved useful to study extremal dependence in space-time data. We propose a semiparametric estimation procedure based on a closed form expression of the extremogram to estimate parametric models of extremal dependence functions. We establish the asymptotic properties of the resulting parameter estimates and propose subsampling procedures to obtain asymptotically correct confidence intervals. A simulation study shows that the proposed procedure works well for moderate sample sizes and is robust to small departures from the underlying model. Finally, we apply this estimation procedure to fitting a max-stable process to radar rainfall measurements in a region in Florida. Complementary results and some proofs of key results are presented together with the simulation study in the supplement.
Submitted 15 July, 2018; v1 submitted 16 September, 2016;
originally announced September 2016.
-
Inference on the tail process with application to financial time series modelling
Authors:
R. A. Davis,
H. Drees,
J. Segers,
M. Warchoł
Abstract:
To draw inference on serial extremal dependence within heavy-tailed Markov chains, Drees, Segers and Warchoł [Extremes (2015) 18, 369--402] proposed nonparametric estimators of the spectral tail process. The methodology can be extended to the more general setting of a stationary, regularly varying time series. The large-sample distribution of the estimators is derived via empirical process theory for cluster functionals. The finite-sample performance of these estimators is evaluated via Monte Carlo simulations. Moreover, two different bootstrap schemes are employed which yield confidence intervals for the pre-asymptotic spectral tail process: the stationary bootstrap and the multiplier block bootstrap. The estimators are applied to stock price data to study the persistence of positive and negative shocks.
Submitted 29 January, 2018; v1 submitted 4 April, 2016;
originally announced April 2016.
-
Reduced-Rank Covariance Estimation in Vector Autoregressive Modeling
Authors:
Richard A. Davis,
Pengfei Zang,
Tian Zheng
Abstract:
We consider reduced-rank modeling of the white noise covariance matrix in a large dimensional vector autoregressive (VAR) model. We first propose the reduced-rank covariance estimator under the setting where independent observations are available. We derive the reduced-rank estimator based on a latent variable model for the vector observation and give the analytical form of its maximum likelihood estimate. Simulation results show that the reduced-rank covariance estimator outperforms two competing covariance estimators for estimating large dimensional covariance matrices from independent observations. Then we describe how to integrate the proposed reduced-rank estimator into the fitting of large dimensional VAR models, where we consider two scenarios that require different model fitting procedures. In the VAR modeling context, our reduced-rank covariance estimator not only provides interpretable descriptions of the dependence structure of VAR processes but also leads to improvement in model-fitting and forecasting over unrestricted covariance estimators. Two real data examples are presented to illustrate these fitting procedures.
Submitted 5 December, 2014;
originally announced December 2014.
-
Self-excited Threshold Poisson Autoregression
Authors:
Chao Wang,
Heng Liu,
Jian-Feng Yao,
Richard A. Davis,
Wai Keung Li
Abstract:
This paper studies theory and inference of an observation-driven model for time series of counts. It is assumed that the observations follow a Poisson distribution conditioned on an accompanying intensity process, which is equipped with a two-regime structure according to the magnitude of the lagged observations. The model remedies one of the drawbacks of the Poisson autoregression model by allowing possibly negative correlation in the observations. Classical Markov chain theory and Lyapunov's method are utilized to derive the conditions under which the process has a unique invariant probability measure and to show a strong law of large numbers of the intensity process. Moreover the asymptotic theory of the maximum likelihood estimates of the parameters is established. A simulation study and a real data application are considered, where the model is applied to the number of major earthquakes in the world.
Submitted 17 July, 2013;
originally announced July 2013.
-
Approximating the conditional density given large observed values via a multivariate extremes framework, with application to environmental data
Authors:
Daniel Cooley,
Richard A. Davis,
Philippe Naveau
Abstract:
Phenomena such as air pollution levels are of greatest interest when observations are large, but standard prediction methods are not specifically designed for large observations. We propose a method, rooted in extreme value theory, which approximates the conditional distribution of an unobserved component of a random vector given large observed values. Specifically, for $\mathbf{Z}=(Z_1,...,Z_d)^T$ and $\mathbf{Z}_{-d}=(Z_1,...,Z_{d-1})^T$, the method approximates the conditional distribution of $[Z_d|\mathbf{Z}_{-d}=\mathbf{z}_{-d}]$ when $|\mathbf{z}_{-d}|>r_*$. The approach is based on the assumption that $\mathbf{Z}$ is a multivariate regularly varying random vector of dimension $d$. The conditional distribution approximation relies on knowledge of the angular measure of $\mathbf{Z}$, which provides explicit structure for dependence in the distribution's tail. As the method produces a predictive distribution rather than just a point predictor, one can answer any question posed about the quantity being predicted, and, in particular, one can assess how well the extreme behavior is represented. Using a fitted model for the angular measure, we apply our method to nitrogen dioxide measurements in metropolitan Washington DC. We obtain a predictive distribution for the air pollutant at a location given the air pollutant's measurements at four nearby locations and given that the norm of the vector of the observed measurements is large.
Submitted 8 January, 2013;
originally announced January 2013.
-
Sparse Vector Autoregressive Modeling
Authors:
Richard A. Davis,
Pengfei Zang,
Tian Zheng
Abstract:
The vector autoregressive (VAR) model has been widely used for modeling temporal dependence in a multivariate time series. For large (and even moderate) dimensions, the number of AR coefficients can be prohibitively large, resulting in noisy estimates, unstable predictions and difficult-to-interpret temporal dependence. To overcome such drawbacks, we propose a 2-stage approach for fitting sparse VAR (sVAR) models in which many of the AR coefficients are zero. The first stage selects non-zero AR coefficients based on an estimate of the partial spectral coherence (PSC) together with the use of BIC. The PSC is useful for quantifying the conditional relationship between marginal series in a multivariate process. A refinement second stage is then applied to further reduce the number of parameters. The performance of this 2-stage approach is illustrated with simulation results. The 2-stage approach is also applied to two real data examples: the first is the Google Flu Trends data and the second is a time series of concentration levels of air pollutants.
Submitted 2 July, 2012;
originally announced July 2012.
-
Statistical inference for max-stable processes in space and time
Authors:
Richard A. Davis,
Claudia Klüppelberg,
Christina Steinkohl
Abstract:
Max-stable processes have proved to be useful for the statistical modelling of spatial extremes. Several representations of max-stable random fields have been proposed in the literature. One such representation is based on a limit of normalized and scaled pointwise maxima of stationary Gaussian processes that was first introduced by Kabluchko, Schlather and de Haan (2009).
This paper deals with statistical inference for max-stable space-time processes that are defined in an analogous fashion. We describe pairwise likelihood estimation, where the pairwise density of the process is used to estimate the model parameters and prove strong consistency and asymptotic normality of the parameter estimates for an increasing space-time dimension, i.e., as the joint number of spatial locations and time points tends to infinity. A simulation study shows that the proposed method works well for these models.
Submitted 25 April, 2012;
originally announced April 2012.
-
Estimating Extremal Dependence in Univariate and Multivariate Time Series via the Extremogram
Authors:
Richard A. Davis,
Thomas Mikosch,
Ivor Cribben
Abstract:
Davis and Mikosch [7] introduced the extremogram as a flexible quantitative tool for measuring various types of extremal dependence in a stationary time series. There we showed some standard statistical properties of the sample extremogram. A major difficulty was the construction of credible confidence bands for the extremogram. In this paper, we employ the stationary bootstrap to overcome this problem. Moreover, we introduce the cross extremogram as a measure of extremal serial dependence between two or more time series. We also study the extremogram for return times between extremal events. The use of the stationary bootstrap for the extremogram and the resulting interpretations are illustrated in several univariate and multivariate financial time series examples.
Submitted 27 July, 2011;
originally announced July 2011.
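For readers unfamiliar with the extremogram named in the abstract above, here is a minimal sketch of one common sample version: the fraction of threshold exceedances that are followed by another exceedance h steps later. The upper-tail choice of sets, the normalisation by the total number of exceedances, the function name `sample_extremogram`, and the toy data are assumptions for illustration; the stationary bootstrap confidence bands and the cross extremogram are not shown.

```python
import numpy as np


def sample_extremogram(x, threshold, max_lag):
    """Estimate P(x[t+h] > u | x[t] > u) for h = 1..max_lag (upper-tail sets)."""
    exceed = np.asarray(x) > threshold
    total = exceed.sum()
    if total == 0:
        raise ValueError("no observations exceed the threshold")
    return np.array(
        [np.sum(exceed[:-h] & exceed[h:]) / total for h in range(1, max_lag + 1)]
    )


# Tiny deterministic example: exceedances of u = 1 occur at positions 1, 2 and 4.
toy = np.array([0.0, 5.0, 5.0, 0.0, 5.0, 0.0])
rho = sample_extremogram(toy, threshold=1.0, max_lag=2)
print(rho)
```

In practice the threshold would typically be a high empirical quantile of the series, and the values would be compared against the level expected under extremal independence.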
-
Max-stable processes for modelling extremes observed in space and time
Authors:
Richard A. Davis,
Claudia Klüppelberg,
Christina Steinkohl
Abstract:
Max-stable processes have proved to be useful for the statistical modelling of spatial extremes. Several representations of max-stable random fields have been proposed in the literature. For statistical inference it is often assumed that there is no temporal dependence, i.e., the observations at spatial locations are independent in time. We use two representations of stationary max-stable spatial random fields and extend the concepts to the space-time domain. In a first approach, we extend the idea of constructing max-stable random fields as limits of normalized and rescaled pointwise maxima of independent Gaussian random fields, which was introduced by Kabluchko, Schlather and de Haan [2009], who construct max-stable random fields associated to a class of variograms. We use a similar approach based on a well-known result by Hüsler and Reiss and apply specific spatio-temporal covariance models for the underlying Gaussian random field, which satisfy weak regularity assumptions. Furthermore, we extend Smith's storm profile model to a space-time setting and provide explicit expressions for the bivariate distribution functions.
The tail dependence coefficient is an important measure of extremal dependence. We show how the spatio-temporal covariance function underlying the Gaussian random field can be interpreted in terms of the tail dependence coefficient. Within this context, we examine different concepts for constructing spatio-temporal covariance models and analyse several specific examples, including Gneiting's class of nonseparable stationary covariance functions.
Submitted 22 July, 2011;
originally announced July 2011.
-
Discussion of: A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?
Authors:
Richard A. Davis,
Jingchen Liu
Abstract:
Discussion of "A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?" by B.B. McShane and A.J. Wyner [arXiv:1104.4002]
Submitted 21 April, 2011;
originally announced April 2011.
-
A Conversation with Murray Rosenblatt
Authors:
David R. Brillinger,
Richard A. Davis
Abstract:
On an exquisite March day in 2006, David Brillinger and Richard Davis sat down with Murray and Ady Rosenblatt at their home in La Jolla, California for an enjoyable day of reminiscences and conversation. Our mentor, Murray Rosenblatt, was born on September 7, 1926 in New York City and attended City College of New York before entering graduate school at Cornell University in 1946. After completing his Ph.D. in 1949 under the direction of the renowned probabilist Mark Kac, the Rosenblatts moved to Chicago where Murray became an instructor/assistant professor in the Committee of Statistics at the University of Chicago. Murray's academic career then took him to the University of Indiana and Brown University before his joining the University of California at San Diego in 1964. Along the way, Murray established himself as one of the most celebrated and leading figures in probability and statistics with particular emphasis on time series and Markov processes. In addition to being a fellow of the Institute of Mathematical Statistics and American Association for the Advancement of Science, he was a Guggenheim fellow (1965--1966, 1971--1972) and was elected to the National Academy of Sciences in 1984. Among his many contributions, Murray conducted seminal work on density estimation, central limit theorems under strong mixing, spectral domain methods and long memory processes. Murray and Ady Rosenblatt were married in 1949 and have two children, Karin and Daniel.
Submitted 16 October, 2009;
originally announced October 2009.