-
Pointwise density estimation on metric spaces and applications in seismology
Authors:
Galatia Cleanthous,
Athanasios G. Georgiadis,
Philip A. White
Abstract:
We study the problem of estimating density in a wide range of metric spaces, including the Euclidean space, the sphere, the ball, and various Riemannian manifolds. Our framework involves a metric space with a doubling measure and a self-adjoint operator, whose heat kernel exhibits Gaussian behaviour. We begin by reviewing the construction of kernel density estimators and the related background information. As a novel result, we present a pointwise kernel density estimation for probability density functions that belong to general Hölder spaces. The study is accompanied by an application in seismology. Specifically, we analyze a globally-indexed dataset of earthquake occurrence and compare the out-of-sample performance of several approximated kernel density estimators indexed on the sphere.
Submitted 31 March, 2023;
originally announced April 2023.
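As a rough illustration of kernel density estimation on the sphere, the sketch below averages von Mises-Fisher kernels centred at the observed points. This is an assumption made for illustration only: the abstract describes estimators built from a heat kernel with Gaussian behaviour, and the von Mises-Fisher kernel is a common stand-in, not the authors' estimator. The function names and the concentration parameter `kappa` are hypothetical.

```python
import math

def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a unit vector on the sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def vmf_kernel(x, mu, kappa):
    """von Mises-Fisher density on the 2-sphere, centred at mu.

    Written as kappa / (2*pi*(1 - exp(-2*kappa))) * exp(kappa*(x.mu - 1)),
    which equals kappa / (4*pi*sinh(kappa)) * exp(kappa * x.mu) but avoids
    overflow for large kappa.
    """
    dot = sum(a * b for a, b in zip(x, mu))
    norm = kappa / (2.0 * math.pi * (1.0 - math.exp(-2.0 * kappa)))
    return norm * math.exp(kappa * (dot - 1.0))

def sphere_kde(x, data, kappa):
    """Pointwise density estimate at x: the average of kernels over the data."""
    return sum(vmf_kernel(x, mu, kappa) for mu in data) / len(data)

# Hypothetical event locations (latitude, longitude), not real earthquake data.
events = [to_unit_vector(lat, lon) for lat, lon in [(35.0, 139.0), (36.0, 140.0), (-33.0, -71.0)]]
estimate = sphere_kde(to_unit_vector(35.5, 139.5), events, kappa=50.0)
```

Larger `kappa` plays the role of a smaller bandwidth: each kernel concentrates more tightly around its data point.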
-
Joint Multivariate and Functional Modeling for Plant Traits and Reflectances
Authors:
Philip A. White,
Michael F. Christensen,
Henry Frye,
Alan E. Gelfand,
John A. Silander Jr
Abstract:
The investigation of leaf-level traits in response to varying environmental conditions has immense importance for understanding plant ecology. Remote sensing technology enables measurement of the reflectance of plants to make inferences about underlying traits along environmental gradients. While much focus has been placed on understanding how reflectance and traits are related at the leaf-level, the challenge of modelling the dependence of this relationship along environmental gradients has limited this line of inquiry. Here, we take up the problem of jointly modeling traits and reflectance given environment. Our objective is to assess not only response to environmental regressors but also dependence between trait levels and the reflectance spectrum in the context of this regression. This leads to joint modeling of a response vector of traits with reflectance arising as a functional response over the wavelength spectrum. To conduct this investigation, we employ a dataset from a global biodiversity hotspot, the Greater Cape Floristic Region in South Africa.
Submitted 1 October, 2022;
originally announced October 2022.
-
Nonseparable Space-Time Stationary Covariance Functions on Networks cross Time
Authors:
Emilio Porcu,
Philip A. White,
Marc G. Genton
Abstract:
The advent of data science has provided an increasing number of challenges with high data complexity. This paper addresses the challenge of space-time data where the spatial domain is not a planar surface, a sphere, or a linear network, but a generalized network (termed a graph with Euclidean edges). Additionally, data are repeatedly measured over different temporal instants. We provide new classes of nonseparable space-time stationary covariance functions where space can be a generalized network, a Euclidean tree, or a linear network, and where time can be linear or circular (seasonal). Because the construction principles are technical, we focus on illustrations that guide the reader through the construction of statistically interpretable examples. A simulation study demonstrates that we can recover the correct model when compared to misspecified models. In addition, our simulation studies show that we effectively recover simulation parameters. In our data analysis, we consider a traffic accident dataset that shows improved model performance based on covariance specifications and network-based metrics.
Submitted 5 August, 2022;
originally announced August 2022.
-
Spatial Functional Data Modeling of Plant Reflectances
Authors:
Philip A. White,
Henry Frye,
Michael F. Christensen,
Alan E. Gelfand,
John A. Silander Jr
Abstract:
Plant reflectance spectra - the profile of light reflected by leaves across different wavelengths - supply the spectral signature for a species at a spatial location to enable estimation of functional and taxonomic diversity for plants. We consider leaf spectra as "responses" to be explained spatially. These spectra/reflectances are functions over a wavelength band that respond to the environment.
Our motivating data are gathered for several families from the Cape Floristic Region (CFR) in South Africa and lead us to develop rich novel spatial models that can explain spectra for genera within families. Wavelength responses for an individual leaf are viewed as a function of wavelength, leading to functional data modeling. Local environmental features become covariates. We introduce wavelength-covariate interaction since the response to environmental regressors may vary with wavelength, as may the variance. Formal spatial modeling enables prediction of reflectances for genera at unobserved locations with known environmental features. We incorporate spatial dependence, wavelength dependence, and space-wavelength interaction (in the spirit of space-time interaction). We implement out-of-sample validation to select a best model, discovering that the model features listed above are all informative for the functional data analysis. We then supply interpretation of the results under the selected model.
Submitted 25 March, 2021; v1 submitted 5 February, 2021;
originally announced February 2021.
-
Hierarchical Integrated Spatial Process Modeling of Monotone West Antarctic Snow Density Curves
Authors:
Philip A. White,
Durban G. Keeler,
Summer Rupper
Abstract:
Snow density estimates below the surface, used with airplane-acquired ice-penetrating radar measurements, give a site-specific history of snow water accumulation. Because it is infeasible to drill snow cores across all of Antarctica to measure snow density and because it is critical to understand how climatic changes are affecting the world's largest freshwater reservoir, we develop methods that enable snow density estimation with uncertainty in regions where snow cores have not been drilled.
In inland West Antarctica, snow density increases monotonically as a function of depth, except for possible micro-scale variability or measurement error, and it cannot exceed the density of ice. We present a novel class of integrated spatial process models that allow interpolation of monotone snow density curves. For computational feasibility, we construct the space-depth process through kernel convolutions of log-Gaussian spatial processes. We discuss model comparison, model fitting, and prediction. Using this model, we extend estimates of snow density beyond the depth of the original core and estimate snow density curves where snow cores have not been drilled. Along flight lines with ice-penetrating radar, we use interpolated snow density curves to estimate recent water accumulation and find predominantly decreasing water accumulation over recent decades.
Submitted 19 July, 2021; v1 submitted 15 January, 2020;
originally announced January 2020.
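One way to see why integrating a log-Gaussian process yields a monotone curve: the exponential of any real-valued process is positive, so its running integral over depth can only increase. The sketch below is a minimal illustration under stated assumptions, not the authors' model. It takes arbitrary values `w` on a depth grid (standing in for a Gaussian process draw), uses the trapezoid rule, and does not enforce the upper bound at the density of ice or any spatial structure. The names `monotone_curve` and `rho0` are hypothetical.

```python
import math

def monotone_curve(depths, w, rho0):
    """Build an increasing curve rho(d) = rho0 + integral_0^d exp(w(s)) ds.

    depths: increasing grid of depths, starting at the surface
    w:      real values on the grid (e.g. one Gaussian process draw)
    rho0:   value of the curve at the first grid point
    """
    rates = [math.exp(v) for v in w]  # strictly positive rate of increase
    curve = [rho0]
    for i in range(1, len(depths)):
        # Trapezoid rule for the integral over [depths[i-1], depths[i]].
        step = 0.5 * (rates[i - 1] + rates[i]) * (depths[i] - depths[i - 1])
        curve.append(curve[-1] + step)
    return curve

# Hypothetical inputs: the values of w may go up or down, the curve still increases.
grid = [0.0, 1.0, 2.0, 3.0]
curve = monotone_curve(grid, [-3.0, -2.5, -4.0, -3.5], rho0=0.35)
```

Because the increments are positive by construction, monotonicity holds for every draw of `w` rather than being imposed after the fact.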
-
Generalized Evolutionary Point Processes: Model Specifications and Model Comparison
Authors:
Philip A. White,
Alan E. Gelfand
Abstract:
Generalized evolutionary point processes offer a class of point process models that allows for either excitation or inhibition based upon the history of the process. In this regard, we propose models that generalize the nonlinear Hawkes process. Working within a Bayesian framework, model fitting is implemented through Markov chain Monte Carlo. This entails discussion of computation of the likelihood for such point patterns. Furthermore, for this class of models, we discuss strategies for model comparison. Using simulation, we illustrate how well we can distinguish these models from point pattern specifications with conditionally independent event times, e.g., Poisson processes. Specifically, we demonstrate that these models can correctly identify true relationships (i.e., excitation or inhibition/control). Then, we consider a novel extension of the log Gaussian Cox process that incorporates evolutionary behavior and illustrate that our model comparison approach prefers the evolutionary log Gaussian Cox process compared to simpler models. We also examine a real dataset consisting of violent crime events from the 11th police district in Chicago from the year 2018. These data exhibit strong daily seasonality and changes across the year. After we account for these data attributes, we find significant but mild self-excitation, implying that event occurrence increases the intensity of future events.
Submitted 15 October, 2019;
originally announced October 2019.
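To make excitation and inhibition concrete, the sketch below evaluates the conditional intensity of a simple nonlinear Hawkes process. It is an illustration under assumptions, not the model class proposed in the paper: the exponential decay kernel, the rectifying link `max(0, .)`, and the parameter names `mu`, `alpha`, and `beta` are all choices made here. A positive `alpha` raises the intensity after each event (excitation) and a negative `alpha` lowers it (inhibition).

```python
import math

def linear_predictor(t, history, mu, alpha, beta):
    """Baseline mu plus exponentially decaying contributions of past events."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in history if s < t)

def intensity(t, history, mu, alpha, beta):
    """Conditional intensity at time t, rectified so that it is never negative."""
    return max(0.0, linear_predictor(t, history, mu, alpha, beta))

# Hypothetical event times, in arbitrary time units.
past_events = [0.2, 0.9, 1.1]
excited = intensity(1.5, past_events, mu=0.5, alpha=0.8, beta=1.0)
inhibited = intensity(1.5, past_events, mu=0.5, alpha=-0.3, beta=1.0)
```

With no past events the intensity reduces to the constant `mu`, which is the homogeneous Poisson case.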
-
Multivariate Functional Data Modeling with Time-varying Clustering
Authors:
Philip A. White,
Alan E. Gelfand
Abstract:
We consider the situation where multivariate functional data has been collected over time at each of a set of sites. Our illustrative setting is bivariate, monitoring ozone and PM$_{10}$ levels as a function of time over the course of a year at a set of monitoring sites. The data we work with is from 24 monitoring sites in Mexico City which record hourly ozone and PM$_{10}$ levels. We use the data for the year 2017. Hence, we have 48 functions to work with. Our objective is to implement model-based clustering of the functions across the sites. Using our example, such clustering can be considered for ozone and PM$_{10}$ individually or jointly. It may occur differentially for the two pollutants. More importantly for us, we allow that such clustering can vary with time.
We model the multivariate functions across sites using a multivariate Gaussian process. With many sites and several functions at each site, we use dimension reduction to provide a stochastic process specification for the distribution of the collection of multivariate functions over the say $n$ sites. Furthermore, to cluster the functions, either individually by component or jointly with all components, we use the Dirichlet process which enables shared labeling of the functions across the sites. Specifically, we cluster functions based on their response to exogenous variables. Though the functions arise in continuous time, clustering in continuous time is extremely computationally demanding and not of practical interest. Therefore, we employ a partitioning of the time scale to capture time-varying clustering.
Submitted 1 May, 2019; v1 submitted 25 April, 2019;
originally announced April 2019.
-
Modeling Daily Seasonality of Mexico City Ozone using Nonseparable Covariance Models on Circles Cross Time
Authors:
Philip A. White,
Emilio Porcu
Abstract:
Mexico City tracks ground-level ozone levels to assess compliance with national ambient air quality standards and to prevent environmental health emergencies. Ozone levels show distinct daily patterns, within the city, and over the course of the year. To model these data, we use covariance models over space, circular time, and linear time. We review existing models and develop new classes of nonseparable covariance models of this type, models appropriate for quasi-periodic data collected at many locations. With these covariance models, we use nearest-neighbor Gaussian processes to predict hourly ozone levels at unobserved locations in April and May, the peak ozone season, to infer compliance with Mexican air quality standards and to estimate respiratory health risk associated with ozone. Predicted compliance with air quality standards and estimated respiratory health risk vary greatly over space and time. In some regions, we predict exceedance of national standards for more than a third of the hours in April and May. On many days, we predict that nearly all of Mexico City exceeds nationally legislated ozone thresholds at least once. In peak regions, we estimate respiratory risk for ozone to be 55% higher on average than the annual average risk and as much as 170% higher on some days.
Submitted 15 July, 2018;
originally announced July 2018.
-
Non-separable Nearest-Neighbor Gaussian Process Model for Antarctic Surface Mass Balance and Ice Core Site Selection
Authors:
Philip A. White,
C. Shane Reese,
William F. Christensen,
Summer Rupper
Abstract:
Surface mass balance (SMB) is an important factor in the estimation of sea level change, and data are collected to estimate models for prediction of SMB over the Antarctic ice sheets. Using a quality-controlled aggregate dataset of SMB field measurements with significantly more observations than previous analyses, a fully Bayesian nearest-neighbor Gaussian process model is posed to estimate Antarctic SMB and propose new field measurement locations. A corresponding Antarctic SMB map is rendered using this model and is compared with previous estimates. A prediction uncertainty map is created to identify regions of high SMB uncertainty. The model estimates net SMB to be 2345 Gton $\text{yr}^{-1}$, with 95% credible interval (2273,2413) Gton $\text{yr}^{-1}$. Overall, these results suggest lower Antarctic SMB than previously reported. Using the model's uncertainty quantification, we propose 25 new measurement sites for field study utilizing a design to minimize integrated mean squared error.
Submitted 14 July, 2018;
originally announced July 2018.
-
Pollution State Modeling for Mexico City
Authors:
Philip A. White,
Alan E. Gelfand,
Eliane R. Rodrigues,
Guadalupe Tzintzun
Abstract:
Ground-level ozone and particulate matter pollutants are associated with a variety of health issues and increased mortality. For this reason, Mexican environmental agencies regulate pollutant levels. In addition, Mexico City defines pollution emergencies using thresholds that rely on regional maxima for ozone and particulate matter with diameter less than 10 micrometers ($\text{PM}_{10}$). To predict local pollution emergencies and to assess compliance to Mexican ambient air quality standards, we analyze hourly ozone and $\text{PM}_{10}$ measurements from 24 stations across Mexico City from 2017 using a bivariate spatiotemporal model. Using this model, we predict future pollutant levels using current weather conditions and recent pollutant concentrations. Using hourly pollutant projections, we predict regional maxima needed to estimate the probability of future pollution emergencies. We discuss how predicted compliance to legislated pollution limits varies across regions within Mexico City in 2017. We find that predicted probability of pollution emergencies is limited to a few time periods. In contrast, we show that predicted exceedance of Mexican ambient air quality standards is a common, nearly daily occurrence.
Submitted 10 July, 2018;
originally announced July 2018.
-
Modeling Efficiency of Foreign Aid Allocation in Malawi
Authors:
Philip A. White,
Candace Berrett,
E. Shannon Neeley-Tass,
Michael G. Findley
Abstract:
The Open Aid Malawi initiative has collected an unprecedented database that identifies as much location-specific information as possible for each of over 2500 individual foreign aid donations to Malawi since 2003. Ensuring efficient use and distribution of that aid is important to donors and to Malawi citizens. However, because of individual donor goals and difficulty in tracking donor coordination, determining presence or absence of efficient aid allocation is difficult. We compare several Bayesian spatial generalized linear mixed models to relate aid allocation to various economic indicators within seven donation sectors. We find that the spatial gamma regression model best predicts current aid allocation. Using this model, first we use inferences on coefficients to examine whether or not there is evidence of efficient aid allocation within each sector. Second, we use this model to determine a more efficient aid allocation scenario and compare this scenario to the current allocation to provide insight for future aid donations.
Submitted 1 November, 2017; v1 submitted 9 August, 2016;
originally announced August 2016.