
arXiv:2209.01479

Evaluation of Model-Based PM$_{2.5}$ Estimates for Exposure Assessment During Wildfire Smoke Episodes in the Western U.S

Authors: Ellen M. Considine, Jiayuan Hao, Priyanka deSouza, Danielle Braun, Colleen E. Reid, Rachel C. Nethery

Abstract: Investigating the health impacts of wildfire smoke requires data on people's exposure to fine particulate matter (PM$_{2.5}$) across space and time. In recent years, it has become common to use machine learning models to fill gaps in monitoring data. However, it remains unclear how well these models are able to capture spikes in PM$_{2.5}$ during and across wildfire events. Here, we evaluate the accuracy of two sets of high-coverage and high-resolution machine learning-derived PM$_{2.5}$ data sets created by Di et al. (2021) and Reid et al. (2021). In general, the Reid estimates are more accurate than the Di estimates when compared to independent validation data from mobile smoke monitors deployed by the US Forest Service. However, both models tend to severely under-predict PM$_{2.5}$ on high-pollution days. Our findings complement other recent studies calling for increased air pollution monitoring in the western US and support the inclusion of wildfire-specific monitoring observations and predictor variables in model-based estimates of PM$_{2.5}$. Lastly, we call for more rigorous error quantification of machine-learning derived exposure data sets, with special attention to extreme events.

Submitted 9 January, 2023; v1 submitted 3 September, 2022; originally announced September 2022.

Comments: Main text = 26 pages, including 3 figures and 2 tables. References + Supporting Information = 22 pages, including equations, supplemental notes, 2 figures, and 8 tables

arXiv:2110.01053

Treeging

Authors: Gregory L. Watson, Michael Jerrett, Colleen E. Reid, Donatello Telesca

Abstract: Treeging combines the flexible mean structure of regression trees with the covariance-based prediction strategy of kriging into the base learner of an ensemble prediction algorithm. In so doing, it combines the strengths of the two primary types of spatial and space-time prediction models: (1) models with flexible mean structures (often machine learning algorithms) that assume independently distributed data, and (2) kriging or Gaussian Process (GP) prediction models with rich covariance structures but simple mean structures. We investigate the predictive accuracy of treeging across a thorough and widely varied battery of spatial and space-time simulation scenarios, comparing it to ordinary kriging, random forest and ensembles of ordinary kriging base learners. Treeging performs well across the board, whereas kriging suffers when dependence is weak or in the presence of spurious covariates, and random forest suffers when the covariates are less informative. Treeging also outperforms these competitors in predicting atmospheric pollutants (ozone and PM$_{2.5}$) in several case studies. We examine sensitivity to tuning parameters (number of base learners and training data sampling proportion), finding they follow the familiar intuition of their random forest counterparts. We include a discussion of scalability, noting that any covariance approximation techniques that expedite kriging (GP) may be similarly applied to expedite treeging.

Submitted 3 October, 2021; originally announced October 2021.

arXiv:2012.13867

Prediction & Model Evaluation for Space-Time Data

Authors: Gregory L. Watson, Colleen E. Reid, Michael Jerrett, Donatello Telesca

Abstract: Evaluation metrics for prediction error, model selection and model averaging on space-time data are understudied and poorly understood. The absence of independent replication makes prediction ambiguous as a concept and renders evaluation procedures developed for independent data inappropriate for most space-time prediction problems. Motivated by air pollution data collected during California wildfires in 2008, this manuscript attempts a formalization of the true prediction error associated with spatial interpolation. We investigate a variety of cross-validation (CV) procedures employing both simulations and case studies to provide insight into the nature of the estimand targeted by alternative data partition strategies. Consistent with recent best practice, we find that location-based cross-validation is appropriate for estimating spatial interpolation error as in our analysis of the California wildfire data. Interestingly, commonly held notions of bias-variance trade-off of CV fold size do not trivially apply to dependent data, and we recommend leave-one-location-out (LOLO) CV as the preferred prediction error metric for spatial interpolation.

Submitted 4 November, 2022; v1 submitted 27 December, 2020; originally announced December 2020.

Showing 1–3 of 3 results for author: Reid, C E