A Hierarchical Bayesian Model for Stochastic Spatiotemporal SIR Modeling and Prediction of COVID-19 Cases and Hospitalizations
Authors:
Curtis B. Storlie,
Ricardo L. Rojas,
Gabriel O. Demuth,
Benjamin D. Pollock,
Patrick W. Johnson,
Patrick M. Wilson,
Ethan P. Heinzen,
Hongfang Liu,
Rickey E. Carter,
Sean C. Dowdy,
Shannon M. Dunlay,
Elizabeth B. Habermann,
Daryl J. Kor,
Matthew R. Neville,
Andrew H. Limper,
Katherine H. Noe,
Mohamad Bydon,
Pablo Moreno Franco,
Priya Sampathkumar,
Nilay D. Shah,
Henry H. Ting
Abstract:
Most COVID-19 predictive modeling efforts use statistical or mathematical models to predict national- and state-level COVID-19 cases or deaths in the future. These approaches assume parameters such as reproduction time, test positivity rate, hospitalization rate, and social intervention effectiveness (masking, distancing, and mobility) are constant. However, the one certainty with the COVID-19 pandemic is that these parameters change over time, as well as vary across counties and states. In fact, the rate of spread over region, hospitalization rate, hospital length of stay and mortality rate, the proportion of the population that is susceptible, test positivity rate, and social behaviors can all change significantly over time. Thus, the quantification of uncertainty becomes critical in making meaningful and accurate forecasts of the future. Bayesian approaches are a natural way to fully represent this uncertainty in mathematical models and have become particularly popular in physics and engineering models. The explicit integration of time-varying parameters and uncertainty quantification into a hierarchical Bayesian forecast model differentiates the Mayo COVID-19 model from other forecasting models. By accounting for all sources of uncertainty in both parameter estimation and future trends with a Bayesian approach, the Mayo COVID-19 model accurately forecasts future cases and hospitalizations, as well as the degree of uncertainty. This approach has been remarkably accurate and a linchpin in Mayo Clinic's response to managing the COVID-19 pandemic. The model accurately predicted the timing and extent of the summer and fall surges at Mayo Clinic sites, allowing hospital leadership to manage resources effectively to provide a successful pandemic response. This model has also proven to be very useful to the state of Minnesota to help guide difficult policy decisions.
Submitted 8 April, 2021;
originally announced April 2021.
Unsupervised Machine Learning for the Discovery of Latent Disease Clusters and Patient Subgroups Using Electronic Health Records
Authors:
Yanshan Wang,
Yiqing Zhao,
Terry M. Therneau,
Elizabeth J. Atkinson,
Ahmad P. Tafti,
Nan Zhang,
Shreyasee Amin,
Andrew H. Limper,
Hongfang Liu
Abstract:
Machine learning has become ubiquitous and a key technology for mining electronic health records (EHRs) to facilitate clinical research and practice. Unsupervised machine learning, as opposed to supervised learning, has shown promise in identifying novel patterns and relations from EHRs without using human-created labels. In this paper, we investigate the application of unsupervised machine learning models in discovering latent disease clusters and patient subgroups based on EHRs. We utilized Latent Dirichlet Allocation (LDA), a generative probabilistic model, and proposed a novel model named the Poisson Dirichlet Model (PDM), which extends the LDA approach using a Poisson distribution to model patients' disease diagnoses and to alleviate age and sex factors by considering both observed and expected observations. In the empirical experiments, we evaluated LDA and PDM on three patient cohorts with EHR data retrieved from the Rochester Epidemiology Project (REP), for the discovery of latent disease clusters and patient subgroups. We compared the effectiveness of LDA and PDM in identifying latent disease clusters through the visualization of disease representations learned by the two approaches. We also tested the performance of LDA and PDM in differentiating patient subgroups through survival analysis, as well as statistical analysis. The experimental results show that the proposed PDM could effectively identify distinct disease clusters by alleviating the impact of age and sex, and that LDA could stratify patients into more differentiable subgroups than PDM in terms of p-values. However, the subgroups discovered by PDM might imply the underlying patterns of diseases of greater interest in epidemiology research due to the alleviation of age and sex. Both unsupervised machine learning approaches could be leveraged to discover patient subgroups using EHRs but with different foci.
Submitted 17 May, 2019;
originally announced May 2019.