-
Highly Multivariate Large-scale Spatial Stochastic Processes -- A Cross-Markov Random Field Approach
Authors:
Xiaoqing Chen,
Peter Diggle,
James V. Zidek,
Gavin Shaddick
Abstract:
Key challenges in the analysis of highly multivariate large-scale spatial stochastic processes, where both the number of components (p) and spatial locations (n) can be large, include achieving maximal sparsity in the joint precision matrix, ensuring efficient computational cost for its generation, accommodating asymmetric cross-covariance in the joint covariance matrix, and delivering scientific interpretability. We propose a cross-MRF model class, consisting of a mixed spatial graphical model framework and cross-MRF theory, to collectively address these challenges in one unified framework across two modelling stages. The first stage exploits scientifically informed conditional independence (CI) among the p component fields and allows for a step-wise parallel generation of the joint covariance and precision matrices, enabling simultaneous accommodation of asymmetric cross-covariance in the joint covariance matrix and sparsity in the joint precision matrix. The second stage extends the first-stage CI to doubly CI among both p and n and unearths the cross-MRF via an extended Hammersley-Clifford theorem for multivariate spatial stochastic processes. This results in the sparsest possible representation of the joint precision matrix and ensures its lowest generation complexity. We demonstrate the approach with 1D simulated comparative studies and 2D real-world data.
Submitted 9 March, 2025; v1 submitted 19 August, 2024;
originally announced August 2024.
-
Preferential monitoring site location in the Southern California Air Quality Basin
Authors:
Adrian Jones,
James V Zidek,
Joe Watson
Abstract:
The preferential siting of monitors of hazardous environmental fields can lead to serious underestimation of the impacts of those fields. In particular, human health effects can be severely underestimated when standard statistical methods are applied without appropriate adjustment. This report describes an extensive analysis of the siting of monitors for a network that measures PM10 air pollution in California's South Coast Air Basin (SOCAB). That analysis uses EPA data collected during the 1986 to 2019 period. Background descriptions, including those published by the US EPA, are provided. The analysis uses a very general and fast Monte Carlo test for preferential sampling developed by Dr Joe Watson, which confirms that the sites were preferentially sited, as would be expected given the intended purpose of the network to detect noncompliance with air quality standards. Our findings demonstrate the value of that algorithm for applications where such background knowledge is not available, and hence for situations in which standard statistical tools require modification.
Submitted 19 April, 2023;
originally announced April 2023.
-
Knots and their effect on the tensile strength of lumber: a case study
Authors:
Shuxian Fan,
Samuel W. K. Wong,
James V. Zidek
Abstract:
When assessing the strength of sawn lumber for use in engineering applications, the sizes and locations of knots are an important consideration. Knots, which result from the growth of tree branches, are the most common visual characteristic of lumber. Large individual knots, as well as clusters of distinct knots, are known to have strength-reducing effects. However, industry grading rules that govern knots are informed to some extent by subjective judgment, particularly regarding the spatial interaction of knots and their relationship with lumber strength. This case study reports the results of an experiment that investigated and modelled the strength-reducing effects of knots on a sample of Douglas Fir lumber. Experimental data were obtained by taking scans of lumber surfaces and applying tensile strength testing. The modelling approach presented incorporates all relevant knot information in a Bayesian framework, thereby contributing a more refined way of managing the quality of manufactured lumber.
Submitted 14 February, 2023; v1 submitted 10 January, 2022;
originally announced January 2022.
-
Approximately Optimal Spatial Design: How Good is it?
Authors:
Yu Wang,
Nhu D. Le,
James V. Zidek
Abstract:
The increasing recognition of the association between adverse human health conditions and many environmental substances as well as processes has led to the need to monitor them. An important problem that arises in environmental statistics is the design of the locations of the monitoring stations for those environmental processes of interest. One particular design criterion for monitoring networks that tries to reduce the uncertainty about predictions of unseen processes is called the maximum-entropy design. However, this design criterion involves a hard optimization problem that is computationally intractable for large data sets. Previous work of Wang et al. (2017) examined a probabilistic model that can be implemented efficiently to approximate the underlying optimization problem. In this paper, we attempt to establish statistically sound tools for assessing the quality of the approximations.
Submitted 3 February, 2020;
originally announced February 2020.
-
Data integration for high-resolution, continental-scale estimation of air pollution concentrations
Authors:
Matthew L. Thomas,
Gavin Shaddick,
Daniel Simpson,
Kees de Hoogh,
James V. Zidek
Abstract:
Air pollution constitutes the highest environmental risk factor in relation to health. In order to provide the evidence required for health impact analyses, to inform policy and to develop potential mitigation strategies, comprehensive information is required on the state of air pollution. Information on air pollution traditionally comes from ground monitoring (GM) networks, but these may not be able to provide sufficient coverage and may need to be supplemented with information from other sources (e.g. chemical transport models; CTMs). However, these may only be available on grids and may not capture micro-scale features that may be important in assessing air quality in areas of high population. We develop a model that allows calibration between multiple data sources available at different levels of support by allowing the coefficients of calibration equations to vary over space and time, enabling downscaling where the data is sufficient to support it. The model is used to produce high-resolution (1km $\times$ 1km) estimates of NO$_2$ and PM$_{2.5}$ across Western Europe for 2010-2016. Concentrations of both pollutants decreased over this period; however, there remain large populations exposed to levels exceeding the WHO Air Quality Guidelines, and thus air pollution remains a serious threat to health.
Submitted 9 October, 2019; v1 submitted 28 June, 2019;
originally announced July 2019.
-
A general theory for preferential sampling in environmental networks
Authors:
Joe Watson,
James V. Zidek,
Gavin Shaddick
Abstract:
This paper presents a general model framework for detecting the preferential sampling of environmental monitors recording an environmental process across space and/or time. This is achieved by considering the joint distribution of an environmental process with a site-selection process that considers where and when sites are placed to measure the process. The environmental process may be spatial, temporal or spatio-temporal in nature. By sharing random effects between the two processes, the joint model is able to establish whether site placement was stochastically dependent on the environmental process under study. The embedding into a spatio-temporal framework also allows for the modelling of the dynamic site-selection process itself. Real-world factors affecting both the size and location of the network can be easily modelled and quantified. Depending upon the choice of population of locations to consider for selection across space and time under the site-selection process, different insights about the precise nature of preferential sampling can be obtained. The general framework developed in the paper is designed to be easily and quickly fit using the R-INLA package. We apply this framework to a case study involving particulate air pollution over the UK where a major reduction in the size of a monitoring network through time occurred. It is demonstrated that a significant response-biased reduction in the air quality monitoring network occurred. We also show that the network was consistently unrepresentative of the levels of particulate matter seen across much of GB throughout the operating life of the network. Finally, we show that this may have led to a severe over-reporting of the population-average exposure levels experienced across GB. This could have great impacts on estimates of the health effects of black smoke levels.
Submitted 6 April, 2019; v1 submitted 13 September, 2018;
originally announced September 2018.
-
Approximately Optimal Subset Selection for Statistical Design and Modelling
Authors:
Yu Wang,
Nhu D. Le,
James V. Zidek
Abstract:
We study the problem of optimal subset selection from a set of correlated random variables. In particular, we consider the associated combinatorial optimization problem of maximizing the determinant of a symmetric positive definite matrix that characterizes the chosen subset. This problem arises in many domains, such as experimental design, regression modeling, and environmental statistics. We establish an efficient polynomial-time algorithm using a determinantal point process for approximating the optimal solution to the problem. We demonstrate the advantages of our methods by presenting computational results for both synthetic and real data sets.
Submitted 10 July, 2019; v1 submitted 1 September, 2017;
originally announced September 2017.
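The determinant-maximization objective described in the abstract above can be illustrated with a small sketch. This is not the determinantal point process algorithm of the paper: it only shows the objective, solved by exhaustive search and by a simple greedy heuristic that is swapped in here for illustration. All function names (`log_det`, `best_subset_exhaustive`, `greedy_subset`) and the example covariance matrix are hypothetical.

```python
import itertools
import math


def log_det(matrix):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def subset_log_det(cov, subset):
    """Log-determinant of the principal submatrix indexed by `subset`."""
    idx = list(subset)
    return log_det([[cov[i][j] for j in idx] for i in idx])


def best_subset_exhaustive(cov, k):
    """Exact maximizer by enumerating all size-k subsets (small n only)."""
    candidates = itertools.combinations(range(len(cov)), k)
    return max(candidates, key=lambda s: subset_log_det(cov, s))


def greedy_subset(cov, k):
    """Illustrative greedy heuristic: add the index that most increases log-det."""
    chosen = []
    for _ in range(k):
        remaining = [i for i in range(len(cov)) if i not in chosen]
        best = max(remaining, key=lambda i: subset_log_det(cov, chosen + [i]))
        chosen.append(best)
    return tuple(sorted(chosen))


# Hypothetical 3x3 covariance: variables 0 and 1 are correlated, 2 is independent.
example_cov = [[2.0, 1.0, 0.0],
               [1.0, 2.0, 0.0],
               [0.0, 0.0, 1.0]]
exact = best_subset_exhaustive(example_cov, 2)
greedy = greedy_subset(example_cov, 2)
```

Exhaustive search scales combinatorially in the number of variables, which is why an approximation is of interest; the greedy rule here carries no guarantee and serves only to make the objective concrete.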
-
Sequential Decision Model for Inference and Prediction on Non-Uniform Hypergraphs with Application to Knot Matching from Computational Forestry
Authors:
Seong-Hwan Jun,
Samuel W. K. Wong,
James V. Zidek,
Alexandre Bouchard-Côté
Abstract:
In this paper, we consider the knot matching problem arising in computational forestry. Solving this problem is important for advancing the state of the art in automatic strength prediction of lumber. We show that it can be formulated as a quadripartite matching problem and develop a sequential decision model that admits efficient parameter estimation, along with a sequential Monte Carlo sampler that can be used for rapid sampling of graph matchings. We demonstrate the effectiveness of our methods on 30 manually annotated boards and present findings from various simulation studies to provide further evidence supporting the efficacy of our methods.
Submitted 24 August, 2017;
originally announced August 2017.
-
The duration of load effect in lumber as stochastic degradation
Authors:
Samuel W. K. Wong,
James V. Zidek
Abstract:
This paper proposes a gamma process for modelling the damage that accumulates over time in the lumber used in structural engineering applications when stress is applied. The model separates the stochastic processes representing features internal to the piece of lumber on the one hand, from those representing external forces due to applied dead and live loads. The model applies those external forces through a time-varying population level function designed for time-varying loads. The application of this type of model, which is standard in reliability analysis, is novel in this context, which has been dominated by accumulated damage models (ADMs) over more than half a century. The proposed model is compared with one of the traditional ADMs. Our statistical results based on a Bayesian analysis of experimental data highlight the limitations of using accelerated testing data to assess long-term reliability, as seen in the wide posterior intervals. This suggests the need for more comprehensive testing in future applications, or to encode appropriate expert knowledge in the priors used for Bayesian analysis.
Submitted 23 August, 2017;
originally announced August 2017.
-
Dimensional and statistical foundations for accumulated damage models
Authors:
Samuel W. K. Wong,
James V. Zidek
Abstract:
This paper develops a framework for creating damage accumulation models for engineered wood products by invoking the classical theory of non-dimensionalization. The result is a general class of such models. Both the US and Canadian damage accumulation models are revisited. It is shown how the former may be generalized within that framework while deficiencies are discovered in the latter and overcome. Use of modern Bayesian statistical methods for estimating the parameters in these models is proposed along with an illustrative application of these methods to a ramp load dataset.
Submitted 9 August, 2017;
originally announced August 2017.
-
Bayesian analysis of accumulated damage models in lumber reliability
Authors:
Chun-Hao Yang,
James V. Zidek,
Samuel W. K. Wong
Abstract:
Wood products that are subjected to sustained stress over a long period may weaken, and this effect must be considered in models for the long-term reliability of lumber. The damage accumulation approach has been widely used for this purpose to set engineering standards. In this article, we revisit an accumulated damage model and propose a Bayesian framework for analysis. For parameter estimation and uncertainty quantification, we adopt approximate Bayesian computation (ABC) techniques to handle the complexities of the model. We demonstrate the effectiveness of our approach using both simulated and real data, and apply our fitted model to analyze long-term lumber reliability under a stochastic live loading scenario.
Submitted 14 June, 2017;
originally announced June 2017.
-
Monitoring test under nonparametric random effects model
Authors:
Jiahua Chen,
Pengfei Li,
Yukun Liu,
James V. Zidek
Abstract:
Factors such as climate change, forest fires and insect plagues lead to concerns about the mechanical strength of plantation materials. To address such concerns, these products must be closely monitored. This leads to the need to update, from time to time, the lumber quality monitoring procedures in American Society for Testing and Materials (ASTM) Standard D1990 (adopted in 1991). A key component of monitoring is an effective method for detecting change in the lower percentiles of solid lumber strength based on multiple samples. In a recent study by Verrill et al. (2015), eight statistical tests proposed by wood scientists were examined thoroughly based on real and simulated data sets. These tests were found unsatisfactory in differing respects, such as a seriously inflated false alarm rate when observations are clustered, suboptimal power properties, or inconvenient ad hoc rejection regions. A contributing factor behind the suboptimal performance is that most of these tests were not developed to detect change in quantiles. In this paper, we use a nonparametric random effects model to handle the within-cluster correlations, composite empirical likelihood to avoid explicit modelling of the correlation structure, and a density ratio model to combine the information from multiple samples. In addition, we propose a cluster-based bootstrapping procedure to construct a monitoring test on quantiles which satisfactorily controls the type I error in the presence of within-cluster correlation. The performance of the test is examined through simulation experiments and a real-world example. The new method is generally applicable and is not confined to the motivating example.
Submitted 18 October, 2016;
originally announced October 2016.
-
Data Integration Model for Air Quality: A Hierarchical Approach to the Global Estimation of Exposures to Ambient Air Pollution
Authors:
Gavin Shaddick,
Matthew L. Thomas,
Amelia Jobling,
Michael Brauer,
Aaron van Donkelaar,
Rick Burnett,
Howard Chang,
Aaron Cohen,
Rita Van Dingenen,
Carlos Dora,
Sophie Gumy,
Yang Liu,
Randall Martin,
Lance A. Waller,
Jason West,
James V. Zidek,
Annette Prüss-Ustün
Abstract:
Air pollution is a major risk factor for global health, with both ambient and household air pollution contributing substantial components of the overall global disease burden. One of the key drivers of adverse health effects is fine particulate matter ambient pollution (PM$_{2.5}$) to which an estimated 3 million deaths can be attributed annually. The primary source of information for estimating exposures has been measurements from ground monitoring networks but, although coverage is increasing, there remain regions in which monitoring is limited. Ground monitoring data therefore needs to be supplemented with information from other sources, such as satellite retrievals of aerosol optical depth and chemical transport models. A hierarchical modelling approach for integrating data from multiple sources is proposed allowing spatially-varying relationships between ground measurements and other factors that estimate air quality. Set within a Bayesian framework, the resulting Data Integration Model for Air Quality (DIMAQ) is used to estimate exposures, together with associated measures of uncertainty, on a high resolution grid covering the entire world. Bayesian analysis on this scale can be computationally challenging and here approximate Bayesian inference is performed using Integrated Nested Laplace Approximations. Model selection and assessment is performed by cross-validation with the final model offering substantial increases in predictive accuracy, particularly in regions where there is sparse ground monitoring, when compared to current approaches: root mean square error (RMSE) reduced from 17.1 to 10.7, and population weighted RMSE from 23.1 to 12.1 $\mu$g m$^{-3}$. Based on summaries of the posterior distributions for each grid cell, it is estimated that 92% of the world's population reside in areas exceeding the World Health Organization's Air Quality Guidelines.
Submitted 26 September, 2016; v1 submitted 1 September, 2016;
originally announced September 2016.
-
Spatio-temporal Modelling of Temperature Fields in the Pacific Northwest
Authors:
Camila M. Casquilho-Resende,
Nhu D. Le,
James V. Zidek
Abstract:
Modelling temperature fields matters beyond the need to understand a region's climate: it also serves as a starting point for understanding their socioeconomic and health consequences. The topography of the study region contributes much to the complexity of modelling these fields and demands flexible spatio-temporal models that are able to handle nonstationarity and changes in trend. In this paper, we develop a flexible stochastic spatio-temporal model for daily temperatures in the Pacific Northwest, and describe a methodology for performing Bayesian spatial prediction. A novel aspect of this model, an extension of the spatio-temporal model proposed in Le and Zidek (1992), is its incorporation of site-specific features of a spatio-temporal field in its spatio-temporal mean. Due to the often surprising Pacific Northwestern weather, the analysis reported in the paper shows the need to incorporate spatio-temporal interactions in that mean in order to understand the rapid changes in temperature observed in nearby locations and to get approximately stationary residuals for higher level analysis. No structure is assumed for the spatial covariance matrix of these residuals, thus allowing the model to capture any nonstationary spatial structures remaining in those residuals.
Submitted 2 April, 2016;
originally announced April 2016.
-
Reducing estimation bias in adaptively changing monitoring networks with preferential site selection
Authors:
James V. Zidek,
Gavin Shaddick,
Carolyn G. Taylor
Abstract:
This paper explores the topic of preferential sampling, specifically situations where monitoring sites in environmental networks are preferentially located by the designers. This means the data arising from such networks may not accurately characterize the spatio-temporal field they intend to monitor. Approaches that have been developed to mitigate the effects of preferential sampling in various contexts are reviewed and, building on these approaches, a general framework for dealing with the effects of preferential sampling in environmental monitoring is proposed. Strategies for implementation are proposed, leading to a method for improving the accuracy of official statistics used to report trends and inform regulatory policy. An essential feature of the method is its capacity to learn the preferential selection process over time and hence to reduce bias in these statistics. Simulation studies suggest dramatic reductions in bias are possible. A case study demonstrates use of the method in assessing the levels of air pollution due to black smoke in the UK over an extended period (1970-1996). In particular, dramatic reductions in the estimates of the number of sites out of compliance are observed.
Submitted 3 December, 2014;
originally announced December 2014.
-
Bayesian Melding of the Dead-Reckoned Path and GPS Measurements for an Accurate and High-Resolution Path of Marine Mammals
Authors:
Yang Liu,
Brian C. Battaile,
James V. Zidek,
Andrew W. Trites
Abstract:
With the recent advances in electrical engineering, devices attached to free-ranging marine mammals today can collect oceanographic data at remarkably high spatio-temporal resolution. However, those data cannot be fully utilized without a matching high-resolution and accurate path of the animal, which is currently missing in this field. In this paper, we develop a Bayesian melding approach based on a Brownian Bridge process to combine the fine-resolution but seriously biased Dead-Reckoned path and the precise but sparse GPS measurements, which results in an accurate and high-resolution estimated path together with credible bands as quantified uncertainty statements. We also exploit the properties of the underlying processes and some approximations to the likelihood to dramatically reduce the computational burden of handling those large, high-resolution data sets.
Submitted 28 December, 2014; v1 submitted 24 November, 2014;
originally announced November 2014.
-
Hypothesis testing in the presence of multiple samples under density ratio models
Authors:
Song Cai,
Jiahua Chen,
James V. Zidek
Abstract:
This paper presents a hypothesis testing method given independent samples from a number of connected populations. The method is motivated by a forestry project for monitoring change in the strength of lumber. Traditional practice has been built upon nonparametric methods which ignore the fact that these populations are connected. By pooling the information in multiple samples through a density ratio model, the proposed empirical likelihood method leads to more efficient inference and therefore reduces the cost in applications. The new test has a classical chi-square null limiting distribution. Its power function is obtained under a class of local alternatives. The local power is found to increase even when some underlying populations are unrelated to the hypothesis of interest. Simulation studies confirm that this test has better power properties than potential competitors, and is robust to model misspecification. An application example to lumber strength is included.
Submitted 14 May, 2015; v1 submitted 18 September, 2013;
originally announced September 2013.
-
Modeling Non-Stationary Processes Through Dimension Expansion
Authors:
Luke Bornn,
Gavin Shaddick,
James V Zidek
Abstract:
In this paper, we propose a novel approach to modeling nonstationary spatial fields. The proposed method works by expanding the geographic plane over which these processes evolve into higher dimensional spaces, transforming and clarifying complex patterns in the physical plane. By combining aspects of multi-dimensional scaling, group lasso, and latent variables models, a dimensionally sparse projection is found in which the originally nonstationary field exhibits stationarity. Following a comparison with existing methods in a simulated environment, dimension expansion is studied on a classic test-bed data set historically used to study nonstationary models. Following this, we explore the use of dimension expansion in modeling air pollution in the United Kingdom, a process known to be strongly influenced by rural/urban effects, amongst others, which gives rise to a nonstationary field.
Submitted 2 June, 2011; v1 submitted 10 November, 2010;
originally announced November 2010.
-
Predicting phenological events using event-history analysis
Authors:
Song Cai,
James V. Zidek,
Nathaniel Newlands
Abstract:
This paper presents an approach to phenology, one based on the use of a method developed by the authors for event history data. Of specific interest is the prediction of the so-called "bloom--date" of fruit trees in the agriculture industry and it is this application which we consider, although the method is much more broadly applicable. Our approach provides a sensible estimate for a parameter that interests phenologists -- Tbase, the thresholding parameter in the definition of the growing degree days (GDD). Our analysis supports scientists' empirical finding: the timing of a phenological event of a perennial crop is related to the cumulative sum of GDDs. Our predictions of future bloom--dates are quite accurate, but the predictive uncertainty is high, possibly due to our crude climate model for predicting future temperature, the time-dependent covariate in our regression model for phenological events. We found that if we can manage to get accurate predictions of future temperature, our prediction of bloom--date is more accurate and the predictive uncertainty is much lower.
Submitted 20 September, 2010;
originally announced September 2010.
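The role of Tbase in the abstract above can be illustrated with a minimal sketch of cumulative growing degree days. It assumes the common averaging definition (daily mean temperature minus Tbase, floored at zero), which the abstract does not spell out. The function names, the temperatures and the threshold are hypothetical, and the sketch is not the authors' event-history model.

```python
from itertools import accumulate


def daily_gdd(t_min, t_max, t_base):
    """GDD for one day: daily mean temperature above t_base, floored at zero.

    Assumption: the simple averaging definition; other definitions exist.
    """
    return max((t_min + t_max) / 2.0 - t_base, 0.0)


def cumulative_gdd(daily_min, daily_max, t_base):
    """Running total of daily GDD over a season."""
    return list(
        accumulate(daily_gdd(lo, hi, t_base) for lo, hi in zip(daily_min, daily_max))
    )


def first_day_reaching(cum_gdd, threshold):
    """Index of the first day whose cumulative GDD meets the threshold, else None."""
    for day, total in enumerate(cum_gdd):
        if total >= threshold:
            return day
    return None


# Hypothetical daily minimum and maximum temperatures in degrees Celsius.
daily_min = [2.0, 4.0, 6.0, 8.0, 10.0]
daily_max = [8.0, 12.0, 14.0, 18.0, 20.0]

cum = cumulative_gdd(daily_min, daily_max, t_base=5.0)
print(cum)
print(first_day_reaching(cum, threshold=10.0))
```

Changing `t_base` shifts every daily contribution, so the day on which a fixed cumulative threshold is reached moves with it. This is why the choice of Tbase matters for any prediction driven by accumulated GDD.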
-
Predicting Sequences of Progressive Events Times with Time-dependent Covariates
Authors:
Song Cai,
James V. Zidek,
Nathaniel Newlands
Abstract:
This paper presents an approach to modeling progressive event-history data when the overall objective is prediction based on time-dependent covariates. This approach does not model the hazard function directly. Instead, it models the process of the state indicators of the event history so that the time-dependent covariates can be incorporated and predictors of the future events easily formulated. Our model can be applied to a range of real-world problems in medical and agricultural science.
Submitted 5 September, 2010;
originally announced September 2010.
-
Estimating exposure response functions using ambient pollution concentrations
Authors:
Gavin Shaddick,
Duncan Lee,
James V. Zidek,
Ruth Salway
Abstract:
This paper presents an approach to estimating the health effects of an environmental hazard. The approach is general in nature, but is applied here to the case of air pollution. It uses a computer model involving ambient pollution and temperature inputs, to simulate the exposures experienced by individuals in an urban area, whilst incorporating the mechanisms that determine exposures. The output from the model comprises a set of daily exposures for a sample of individuals from the population of interest. These daily exposures are approximated by parametric distributions, so that the predictive exposure distribution of a randomly selected individual can be generated. These distributions are then incorporated into a hierarchical Bayesian framework (with inference using Markov Chain Monte Carlo simulation) in order to examine the relationship between short-term changes in exposures and health outcomes, whilst making allowance for long-term trends, seasonality, the effect of potential confounders and the possibility of ecological bias.
The paper applies this approach to particulate pollution (PM$_{10}$) and respiratory mortality counts for seniors in greater London ($\geq$65 years) during 1997. Within this substantive epidemiological study, the effects on health of ambient concentrations and (estimated) personal exposures are compared.
Submitted 31 October, 2007;
originally announced October 2007.
-
Modeling Hourly Ozone Concentration Fields
Authors:
Yiping Dou,
Nhu D Le,
James V Zidek
Abstract:
This paper presents a dynamic linear model for modeling hourly ozone concentrations over the eastern United States. That model, which is developed within a Bayesian hierarchical framework, inherits the important feature of such models that its coefficients, treated as states of the process, can change with time. Thus the model includes a time--varying site invariant mean field as well as time varying coefficients for 24- and 12-hour diurnal cycle components. This model's great flexibility comes at the cost of computational complexity, forcing us to use an MCMC approach and to restrict application of our model domain to a small number of monitoring sites. We critically assess this model and discover some of its weaknesses in this type of application.
Submitted 1 June, 2007;
originally announced June 2007.
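The diurnal cycle components mentioned in the abstract above can be pictured with a small sketch of harmonic regressors at 24- and 12-hour periods. This is an illustration under stated assumptions only: the cosine/sine parameterisation, the function names and the numeric values are hypothetical, and the coefficients are held fixed here, whereas in a dynamic linear model they would evolve over time as states.

```python
import math


def diurnal_harmonics(hour, periods=(24.0, 12.0)):
    """Cosine and sine regressors for each assumed cycle length (in hours)."""
    row = []
    for period in periods:
        angle = 2.0 * math.pi * hour / period
        row.extend([math.cos(angle), math.sin(angle)])
    return row


def mean_field(hour, level, coeffs):
    """Level plus a weighted sum of the harmonic regressors.

    Assumption: static coefficients, used only to show the structure.
    """
    return level + sum(c * h for c, h in zip(coeffs, diurnal_harmonics(hour)))


# Hypothetical level and coefficients, chosen only for illustration.
level = 40.0
coeffs = [-10.0, 5.0, 2.0, -1.0]

for hour in (0, 6, 12, 18):
    print(hour, round(mean_field(hour, level, coeffs), 3))
```

Each period contributes one cosine and one sine term, so two periods give four coefficients. Letting those four coefficients change from hour to hour is what lets the amplitude and phase of the daily pattern drift over time.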