-
Spatio-temporal point process intensity estimation using zero-deflated subsampling applied to a lightning strikes dataset in France
Authors:
Jean-François Coeurjolly,
Thibault Espinasse,
Anne-Laure Fougères,
Mathieu Ribatet
Abstract:
Cloud-to-ground lightning strikes observed in a specific geographical domain over time can be naturally modeled by a spatio-temporal point process. Our focus lies in the parametric estimation of its intensity function, incorporating both spatial factors (such as altitude) and spatio-temporal covariates (such as field temperature, precipitation, etc.). The events are observed in France over a span of three years. Spatio-temporal covariates are observed with resolution $0.1^\circ \times 0.1^\circ$ ($\approx 100$km$^2$) and six-hour periods. This results in an extensive dataset, further characterized by a significant excess of zeroes (i.e., spatio-temporal cells with no observed events). We reexamine composite likelihood methods commonly employed for spatial point processes, especially in situations where covariates are piecewise constant. Additionally, we extend these methods to account for zero-deflated subsampling, a strategy involving dependent subsampling, with a focus on selecting more cells in regions where events are observed. A simulation study is conducted to illustrate these novel methodologies, followed by their application to the dataset of lightning strikes.
Submitted 11 October, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Spatial modeling of extremes and an angular component
Authors:
Gaspard Tamagny,
Mathieu Ribatet
Abstract:
Many environmental processes such as rainfall, wind or snowfall are inherently spatial and the modelling of extremes has to take that feature into account. In addition, environmental processes often come with an associated angle, e.g., wind speed and direction or extreme snowfall and time of occurrence in the year. This article proposes a Bayesian hierarchical model with a conditional independence assumption that aims at modelling simultaneously spatial extremes and an angular component. The proposed model relies on extreme value theory as well as recent developments for handling directional statistics over a continuous domain. Working within a Bayesian setting, a Gibbs sampler is introduced whose performance is analysed through a simulation study. The paper ends with an application to extreme wind speed in France. Results show that extreme wind events in France come mainly from the West, except in the Mediterranean part of France and the Alps.
Submitted 3 July, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.
-
Full likelihood inference for max-stable data
Authors:
Raphaël Huser,
Clément Dombry,
Mathieu Ribatet,
Marc G. Genton
Abstract:
We show how to perform full likelihood inference for max-stable multivariate distributions or processes based on a stochastic Expectation-Maximisation algorithm, which combines statistical and computational efficiency in high dimensions. The good performance of this methodology is demonstrated by simulation based on the popular logistic and Brown--Resnick models, and it is shown to provide dramatic computational time improvements with respect to a direct computation of the likelihood. Strategies to further reduce the computational burden are also discussed.
Submitted 13 July, 2018; v1 submitted 25 March, 2017;
originally announced March 2017.
-
ABC random forests for Bayesian parameter inference
Authors:
Louis Raynal,
Jean-Michel Marin,
Pierre Pudlo,
Mathieu Ribatet,
Christian P. Robert,
Arnaud Estoup
Abstract:
This preprint has been reviewed and recommended by Peer Community In Evolutionary Biology (http://dx.doi.org/10.24072/pci.evolbiol.100036). Approximate Bayesian computation (ABC) has grown into a standard methodology that manages Bayesian inference for models associated with intractable likelihood functions. Most ABC implementations require the preliminary selection of a vector of informative statistics summarizing raw data. Furthermore, in almost all existing implementations, the tolerance level that separates acceptance from rejection of simulated parameter values needs to be calibrated. We propose to conduct likelihood-free Bayesian inferences about parameters with no prior selection of the relevant components of the summary statistics and bypassing the derivation of the associated tolerance level. The approach relies on the random forest methodology of Breiman (2001) applied in a (nonparametric) regression setting. We advocate the derivation of a new random forest for each component of the parameter vector of interest. When compared with earlier ABC solutions, this method offers significant gains in terms of robustness to the choice of the summary statistics, does not depend on any type of tolerance level, and is a good trade-off in terms of quality of point estimator precision and credible interval estimations for a given computing time. We illustrate the performance of our methodological proposal and compare it with earlier ABC methods on a Normal toy example and a population genetics example dealing with human population evolution. All methods designed here have been incorporated in the R package abcrf (version 1.7) available on CRAN.
Submitted 2 November, 2018; v1 submitted 18 May, 2016;
originally announced May 2016.
-
Conditional simulation of max-stable processes
Authors:
Clément Dombry,
Frédéric Éyi-Minko,
Mathieu Ribatet
Abstract:
Since many environmental processes such as heat waves or precipitation are spatial in extent, it is likely that a single extreme event affects several locations and the areal modelling of extremes is therefore essential if the spatial dependence of extremes has to be appropriately taken into account. This paper proposes a framework for conditional simulations of max-stable processes and gives closed forms for Brown-Resnick and Schlather processes. We test the method on simulated data and give an application to extreme rainfall around Zurich and extreme temperature in Switzerland. Results show that the proposed framework provides accurate conditional simulations and can handle real-sized problems.
Submitted 27 August, 2012;
originally announced August 2012.
-
Rejoinder to "Statistical Modeling of Spatial Extremes"
Authors:
A. C. Davison,
S. A. Padoan,
M. Ribatet
Abstract:
Rejoinder to "Statistical Modeling of Spatial Extremes" by A. C. Davison, S. A. Padoan and M. Ribatet [arXiv:1208.3378].
Submitted 17 August, 2012;
originally announced August 2012.
-
Statistical Modeling of Spatial Extremes
Authors:
A. C. Davison,
S. A. Padoan,
M. Ribatet
Abstract:
The areal modeling of the extremes of a natural process such as rainfall or temperature is important in environmental statistics; for example, understanding extreme areal rainfall is crucial in flood protection. This article reviews recent progress in the statistical modeling of spatial extremes, starting with sketches of the necessary elements of extreme value statistics and geostatistics. The main types of statistical models thus far proposed, based on latent variables, on copulas and on spatial max-stable processes, are described and then are compared by application to a data set on rainfall in Switzerland. Whereas latent variable modeling allows a better fit to marginal distributions, it fits the joint distributions of extremes poorly, so appropriately-chosen copula or max-stable models seem essential for successful spatial modeling of extremes.
Submitted 16 August, 2012;
originally announced August 2012.
-
Conditional simulations of Brown-Resnick processes
Authors:
Clément Dombry,
Frédéric Éyi-Minko,
Mathieu Ribatet
Abstract:
Since many environmental processes such as heat waves or precipitation are spatial in extent, it is likely that a single extreme event affects several locations and the areal modeling of extremes is therefore essential if the spatial dependence of extremes has to be appropriately taken into account. Although some progress has been made to develop a geostatistic of extremes, conditional simulation of max-stable processes is still in its early stage. This paper proposes a framework to get conditional simulations of Brown-Resnick processes. Although closed forms for the regular conditional distribution of Brown-Resnick processes were recently found, sampling from this conditional distribution is a considerable challenge as it leads quickly to a combinatorial explosion. To bypass this computational burden, a Markov chain Monte-Carlo algorithm is presented. We test the method on simulated data and give an application to extreme rainfall around Zurich. Results show that the proposed framework provides accurate conditional simulations of Brown-Resnick processes and can handle real-sized problems.
Submitted 27 August, 2012; v1 submitted 16 December, 2011;
originally announced December 2011.
-
Bayesian Inference from Composite Likelihoods, with an Application to Spatial Extremes
Authors:
Mathieu Ribatet,
Daniel Cooley,
Anthony C. Davison
Abstract:
Composite likelihoods are increasingly used in applications where the full likelihood is analytically unknown or computationally prohibitive. Although the maximum composite likelihood estimator has frequentist properties akin to those of the usual maximum likelihood estimator, Bayesian inference based on composite likelihoods has yet to be explored. In this paper we investigate the use of the Metropolis--Hastings algorithm to compute a pseudo-posterior distribution based on the composite likelihood. Two methodologies for adjusting the algorithm are presented and their performance on approximating the true posterior distribution is investigated using simulated data sets and real data on spatial extremes of rainfall.
Submitted 6 July, 2011; v1 submitted 27 November, 2009;
originally announced November 2009.
-
Likelihood-based inference for max-stable processes
Authors:
Simone A. Padoan,
Mathieu Ribatet,
Scott A. Sisson
Abstract:
The last decade has seen max-stable processes emerge as a common tool for the statistical modeling of spatial extremes. However, their application is complicated due to the unavailability of the multivariate density function, and so likelihood-based methods remain far from providing a complete and flexible framework for inference. In this article we develop inferentially practical, likelihood-based methods for fitting max-stable processes derived from a composite-likelihood approach. The procedure is sufficiently reliable and versatile to permit the simultaneous modeling of marginal and dependence parameters in the spatial context at a moderate computational cost. The utility of this methodology is examined via simulation, and illustrated by the analysis of U.S. precipitation extremes.
Submitted 23 February, 2009; v1 submitted 18 February, 2009;
originally announced February 2009.
-
Global sensitivity analysis of computer models with functional inputs
Authors:
Bertrand Iooss,
Mathieu Ribatet
Abstract:
Global sensitivity analysis is used to quantify the influence of uncertain input parameters on the response variability of a numerical model. The common quantitative methods are applicable to computer codes with scalar input variables. This paper aims to illustrate different variance-based sensitivity analysis techniques, based on the so-called Sobol indices, when some input variables are functional, such as stochastic processes or random spatial fields. In this work, we focus on computer codes with large CPU time, which need a preliminary meta-modeling step before performing the sensitivity analysis. We propose the use of the joint modeling approach, i.e., modeling simultaneously the mean and the dispersion of the code outputs using two interlinked Generalized Linear Models (GLM) or Generalized Additive Models (GAM). The "mean" model yields the sensitivity indices of each scalar input variable, while the "dispersion" model yields the total sensitivity index of the functional input variables. The proposed approach is compared to some classical SA methodologies on an analytical function. Lastly, the proposed methodology is applied to a concrete industrial computer code that simulates nuclear fuel irradiation.
Submitted 9 June, 2008; v1 submitted 7 February, 2008;
originally announced February 2008.
-
Usefulness of the Reversible Jump Markov Chain Monte Carlo Model in Regional Flood Frequency Analysis
Authors:
Mathieu Ribatet,
Eric Sauquet,
Jean-Michel Grésillon,
Taha B. M. J. Ouarda
Abstract:
Regional flood frequency analysis is a convenient way to reduce estimation uncertainty when few data are available at the gauging site. In this work, a model that allows a non-null probability to a regional fixed shape parameter is presented. This methodology is integrated within a Bayesian framework and uses reversible jump techniques. The performance on stochastic data of this new estimator is compared to two other models: a conventional Bayesian analysis and the index flood approach. Results show that the proposed estimator is absolutely suited to regional estimation when only a few data are available at the target site. Moreover, unlike the index flood estimator, target site index flood error estimation seems to have less impact on Bayesian estimators. Some suggestions about configurations of the pooling groups are also presented to increase the performance of each estimator.
Submitted 4 February, 2008;
originally announced February 2008.
-
Global Sensitivity Analysis of Stochastic Computer Models with joint metamodels
Authors:
Bertrand Iooss,
Mathieu Ribatet,
Amandine Marrel
Abstract:
The global sensitivity analysis method, used to quantify the influence of uncertain input variables on the response variability of a numerical model, is applicable to deterministic computer code (for which the same set of input variables always gives the same output value). This paper proposes a global sensitivity analysis methodology for stochastic computer code (having a variability induced by some uncontrollable variables). The framework of the joint modeling of the mean and dispersion of heteroscedastic data is used. To deal with the complexity of computer experiment outputs, nonparametric joint models (based on Generalized Additive Models and Gaussian processes) are discussed. The relevance of these new models is analyzed in terms of the obtained variance-based sensitivity indices with two case studies. Results show that the joint modeling approach leads to accurate sensitivity index estimations even when clear heteroscedasticity is present.
Submitted 8 June, 2009; v1 submitted 4 February, 2008;
originally announced February 2008.
-
Modeling All Exceedances Above a Threshold Using an Extremal Dependence Structure: Inferences on Several Flood Characteristics
Authors:
Mathieu Ribatet,
Taha B. M. J. Ouarda,
Eric Sauquet,
Jean-Michel Grésillon
Abstract:
Flood quantile estimation is of great importance for many engineering studies and policy decisions. However, practitioners must often work with little available data. Thus, the information must be used optimally. In the last decades, to reduce the waste of data, inferential methodology has evolved from annual maxima modeling to peaks-over-threshold modeling. To mitigate the lack of data, peaks over a threshold are sometimes combined with additional information, mostly regional and historical. However, whatever the extra information is, the most precious information for the practitioner is found at the target site. In this study, a model that allows inferences on the whole time series is introduced. In particular, the proposed model takes into account the dependence between successive extreme observations using an appropriate extremal dependence structure. Results show that this model leads to more accurate flood peak quantile estimates than conventional estimators. In addition, as the time dependence is taken into account, inferences on other flood characteristics can be performed. An illustration is given on flood duration. Our analysis shows that the accuracy of the proposed models in estimating the flood duration is related to specific catchment characteristics. Some suggestions to improve the flood duration predictions are introduced.
Submitted 4 February, 2008;
originally announced February 2008.
-
A regional Bayesian POT model for flood frequency analysis
Authors:
Mathieu Ribatet,
Eric Sauquet,
Jean-Michel Grésillon,
Taha B. M. J. Ouarda
Abstract:
Flood frequency analysis is usually based on the fitting of an extreme value distribution to the local streamflow series. However, when the local data series is short, frequency analysis results become unreliable. Regional frequency analysis is a convenient way to reduce the estimation uncertainty. In this work, we propose a regional Bayesian model for short record length sites. This model is less restrictive than the index flood model while preserving the formalism of "homogeneous regions". The performance of the proposed model is assessed on a set of gauging stations in France. The accuracy of quantile estimates as a function of the degree of homogeneity of the pooling group is also analysed. The results indicate that the regional Bayesian model outperforms the index flood model and local estimators. Furthermore, it seems that working with relatively large and homogeneous regions may lead to more accurate results than working with smaller and highly homogeneous regions.
Submitted 4 February, 2008;
originally announced February 2008.