-
Modelling multiplex testing for outbreak control
Authors:
Martyn Fyles,
Christopher E. Overton,
Tom Ward,
Emma Bennett,
Tom Fowler,
Ian Hall
Abstract:
During the SARS-CoV-2 pandemic, polymerase chain reaction (PCR) and lateral flow device (LFD) tests were frequently deployed to detect the presence of SARS-CoV-2. Many of these tests were singleplex and tested for only a single pathogen. Multiplex tests can test for several pathogens using a single swab, which can allow for surveillance of more pathogens, targeting of antiviral interventions, a reduced burden of testing, and lower costs. However, test sensitivity, particularly for LFD tests, is highly conditional on the viral concentration dynamics of individuals. To inform the use of multiplex testing in outbreak detection, it is therefore necessary to investigate the interactions between outbreak detection strategies and the differing viral concentration trajectories of key pathogens. Viral concentration trajectories are estimated for SARS-CoV-2 and influenza A/B. Testing strategies for the first five symptomatic cases in an outbreak are then simulated and used to evaluate key performance indicators. Strategies that use a combination of multiplex LFD and PCR tests achieve high levels of detection, detect outbreaks rapidly, and have the lowest burden of testing across multiple pathogens. Influenza B was estimated to have lower rates of detection due to its modelled viral concentration dynamics.
Submitted 30 August, 2024;
originally announced August 2024.
-
Best practices for estimating and reporting epidemiological delay distributions of infectious diseases using public health surveillance and healthcare data
Authors:
Kelly Charniga,
Sang Woo Park,
Andrei R Akhmetzhanov,
Anne Cori,
Jonathan Dushoff,
Sebastian Funk,
Katelyn M Gostic,
Natalie M Linton,
Adrian Lison,
Christopher E Overton,
Juliet R C Pulliam,
Thomas Ward,
Simon Cauchemez,
Sam Abbott
Abstract:
Epidemiological delays, such as incubation periods, serial intervals, and hospital lengths of stay, are among key quantities in infectious disease epidemiology that inform public health policy and clinical practice. This information is used to inform mathematical and statistical models, which in turn can inform control strategies. There are three main challenges that make delay distributions difficult to estimate. First, the data are commonly censored (e.g., symptom onset may only be reported by date instead of the exact time of day). Second, delays are often right truncated when being estimated in real time (not all events that have occurred have been observed yet). Third, during a rapidly growing or declining outbreak, overrepresentation or underrepresentation, respectively, of recently infected cases in the data can lead to bias in estimates. Studies that estimate delays rarely address all these factors and sometimes report several estimates using different combinations of adjustments, which can lead to conflicting answers and confusion about which estimates are most accurate. In this work, we formulate a checklist of best practices for estimating and reporting epidemiological delays with a focus on the incubation period and serial interval. We also propose strategies for handling common biases and identify areas where more work is needed. Our recommendations can help improve the robustness and utility of reported estimates and provide guidance for the evaluation of estimates for downstream use in transmission models or other analyses.
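The right-truncation adjustment mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an exponential delay distribution purely for simplicity, and the function names, the rate value and the small line list are hypothetical rather than taken from the paper. Each observed delay is conditioned on being observable by the analysis date, by dividing its density by the probability that a delay starting at that primary event time would have completed in time. The sketch does not handle interval censoring or epidemic growth bias.

```python
import math

def exp_pdf(x, rate):
    """Density of an exponential delay (illustrative choice, not from the paper)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def exp_cdf(x, rate):
    """Cumulative distribution function of the same exponential delay."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def naive_loglik(delays, rate):
    """Log-likelihood that ignores right truncation."""
    return sum(math.log(exp_pdf(d, rate)) for d in delays)

def truncated_loglik(delays, primary_times, obs_time, rate):
    """Log-likelihood conditioned on each delay finishing before obs_time."""
    total = 0.0
    for d, s in zip(delays, primary_times):
        max_delay = obs_time - s  # longest delay that could have been observed
        total += math.log(exp_pdf(d, rate)) - math.log(exp_cdf(max_delay, rate))
    return total

# Hypothetical line list: primary event times and observed delays, in days.
primary_times = [0.0, 2.0, 5.0, 8.0]
delays = [3.0, 4.0, 2.0, 1.0]
obs_time = 10.0  # analysis date; every primary time + delay falls before it
rate = 0.25      # assumed rate, giving a mean delay of 4 days

naive = naive_loglik(delays, rate)
adjusted = truncated_loglik(delays, primary_times, obs_time, rate)
```

The correction term is largest for the most recent primary events, because they have had the least time for long delays to be observed. Those records are the ones that bias a naive real-time estimate towards short delays.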
Submitted 14 May, 2024;
originally announced May 2024.
-
Real-time COVID-19 hospital admissions forecasting with leading indicators and ensemble methods in England
Authors:
Jonathon Mellor,
Rachel Christie,
Robert S Paton,
Rhianna Leslie,
Maria Tang,
Martyn Fyles,
Sarah Deeny,
Thomas Ward,
Christopher E Overton
Abstract:
Hospitalisations from COVID-19 with Omicron sub-lineages have put sustained pressure on the English healthcare system. Understanding the expected healthcare demand enables more effective and timely planning from public health. We collect syndromic surveillance sources, which include online search data and NHS 111 telephonic and online triages. Incorporating these data, we explore generalised additive models, generalised linear mixed models, penalised generalised linear models and model ensemble methods to forecast over a two-week horizon at an NHS Trust level. Furthermore, we showcase how model combinations improve forecast scoring through a mean ensemble, weighted ensemble, and ensemble by regression. Validated over multiple Omicron waves, at different spatial scales, we show that leading indicators can improve the performance of forecasting models, particularly at epidemic changepoints. Using a variety of scoring rules, we show that ensemble approaches outperformed all individual models, providing higher performance at a 21-day window than the corresponding individual models at 14 days. We introduce a modelling structure used by public health officials in England in 2022 to inform NHS healthcare strategy and policy decision making. This paper explores the significance of ensemble methods to improve forecasting performance and how novel syndromic surveillance can be practically applied in epidemic forecasting.
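As a rough illustration of two of the combination schemes named in the abstract above, the following sketch builds a mean ensemble and a weighted ensemble from point forecasts. The inverse-error weighting rule, the function names and all numbers are assumptions made for the example. The abstract does not state how the weights were chosen, and the ensemble-by-regression variant is not shown.

```python
def mean_ensemble(forecasts):
    """Average the models' point forecasts at each horizon step."""
    n_models = len(forecasts)
    return [sum(step) / n_models for step in zip(*forecasts)]

def weighted_ensemble(forecasts, weights):
    """Weighted average of the models' forecasts; weights are normalised here."""
    total = sum(weights)
    return [
        sum(w * value for w, value in zip(weights, step)) / total
        for step in zip(*forecasts)
    ]

def inverse_error_weights(past_errors):
    """Assumed weighting rule: models with lower past error get more weight."""
    return [1.0 / err for err in past_errors]

# Hypothetical admissions forecasts from two models over three horizon steps.
model_forecasts = [
    [10.0, 12.0, 14.0],  # model A
    [14.0, 16.0, 18.0],  # model B
]
past_errors = [1.0, 3.0]  # hypothetical historical absolute errors per model

simple = mean_ensemble(model_forecasts)
weights = inverse_error_weights(past_errors)
weighted = weighted_ensemble(model_forecasts, weights)
```

With these inputs the mean ensemble sits halfway between the two models, while the weighted ensemble is pulled towards model A because its assumed past error is smaller.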
Submitted 16 August, 2023; v1 submitted 9 June, 2023;
originally announced June 2023.
-
Quantifying the risk of workplace COVID-19 clusters in terms of commuter, workplace, and population characteristics
Authors:
Christopher E. Overton,
Rachel Abbey,
Tarrion Baird,
Rachel Christie,
Owen Daniel,
Julie Day,
Matthew Gittins,
Owen Jones,
Robert Paton,
Maria Tang,
Tom Ward,
Jack Wilkinson,
Camilla Woodrow-Hill,
Tim Aldridge,
Yiqun Chen
Abstract:
Objectives: To identify and quantify risk factors that contribute to clusters of COVID-19 in the workplace.
Methods: We identified clusters of COVID-19 cases in the workplace and investigated the characteristics of the individuals, the workplaces, the areas where they work, and the methods of commute to work, through data linkages based on Middle Layer Super Output Areas (MSOAs) in England between 20/06/2021 and 20/02/2022. We estimated associations between potential risk factors and workplace clusters, adjusting for plausible confounders identified using a Directed Acyclic Graph (DAG).
Results: For most industries, increased physical proximity in the workplace was associated with increased risk of COVID-19 clusters, while increased vaccination was associated with reduced risk. Commuter demographic risk factors varied across industries, but for the majority of industries, a higher proportion of Black/African/Caribbean ethnicities and living in deprived areas were associated with increased cluster risk. A higher proportion of commuters in the 60-64 age group was associated with reduced cluster risk. There were significant associations between gender, work commute methods, and staff contract type with cluster risk, but these were highly variable across industries.
Conclusions: This study has used novel national data linkages to identify potential risk factors of workplace COVID-19 clusters, including possible protective effects of vaccination and increased physical distance at work. The same methodological approach can be applied to wider occupational and environmental health research.
Submitted 15 May, 2023;
originally announced May 2023.
-
Understanding the leading indicators of hospital admissions from COVID-19 across successive waves in the UK
Authors:
Jonathon Mellor,
Christopher E Overton,
Martyn Fyles,
Liam Chawner,
James Baxter,
Tarrion Baird,
Thomas Ward
Abstract:
Following the UK Government's Living with COVID-19 Strategy and the end of universal testing, hospital admissions are an increasingly important measure of COVID-19 pandemic pressure. Understanding leading indicators of admissions at National Health Service (NHS) Trust, regional and national geographies helps health services plan capacity needs and prepare for ongoing pressures. We explored the spatio-temporal relationships of leading indicators of hospital pressure across successive waves of SARS-CoV-2 incidence in England. This includes an analysis of internet search volume values from Google Trends, NHS triage calls and online queries, the NHS COVID-19 App, lateral flow devices and the ZOE App. Data sources were analysed for their feasibility as leading indicators using linear and non-linear methods: Granger causality, cross correlations and dynamic time warping at fine spatial scales. Consistent temporal and spatial relationships were found for some of the leading indicators assessed across resurgent waves of COVID-19. Google Trends and NHS queries consistently led admissions in over 70% of Trusts, with lead times ranging from 5 to 20 days, whereas an inconsistent relationship was found for the ZOE App, NHS COVID-19 App, and rapid testing, which diminished with granularity, showing limited autocorrelation of leads between -7 and 7 days. This work shows that novel syndromic surveillance data has utility for understanding the expected hospital burden at fine spatial scales. The analysis shows that, at low-level geographies, some surveillance sources can predict hospital admissions, though care must be taken in relying on the lead times and consistency between waves.
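Of the methods listed in the abstract above, cross-correlation is the simplest to sketch. The example below estimates a lead time by correlating an indicator series against a target series shifted by increasing lags and picking the lag with the highest Pearson correlation. The function names and the synthetic series are hypothetical, and this is only a toy version of the idea, not the authors' pipeline. Granger causality and dynamic time warping are not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlations(indicator, target, max_lag):
    """Correlate indicator[t] with target[t + lag] for lag = 0..max_lag."""
    scores = {}
    for lag in range(max_lag + 1):
        x = indicator[: len(indicator) - lag]  # drop the last `lag` points
        y = target[lag:]                       # drop the first `lag` points
        scores[lag] = pearson(x, y)
    return scores

# Synthetic data: the target repeats the indicator five days later.
true_lead = 5
indicator = [math.sin(i / 5) for i in range(60)]
target = [0.0] * true_lead + indicator[:-true_lead]

scores = lagged_correlations(indicator, target, max_lag=10)
best_lag = max(scores, key=scores.get)  # lag with the highest correlation
```

On real surveillance data the peak is rarely this clean, which is consistent with the abstract's caution about relying on lead times between waves.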
Submitted 16 August, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Forecasting influenza hospital admissions within English sub-regions using hierarchical generalised additive models
Authors:
Jonathon Mellor,
Rachel Christie,
Christopher E Overton,
Robert S Paton,
Rhianna Leslie,
Maria Tang,
Sarah Deeny,
Thomas Ward
Abstract:
Background: Seasonal influenza causes a substantial burden on healthcare services over the winter period when these systems are already under pressure. Policies during the COVID-19 pandemic suppressed the transmission of seasonal influenza, making the timing and magnitude of a potential resurgence difficult to predict.
Methods: We developed a hierarchical generalised additive model (GAM) for the short-term forecasting of hospital admissions with a positive test for the influenza virus sub-regionally across England. The model incorporates a multi-level structure of spatio-temporal splines, weekly seasonality, and spatial correlation. Using multiple performance metrics including interval score, coverage, bias, and median absolute error, the predictive performance is evaluated for the 2022/23 seasonal wave. Performance is measured against an autoregressive integrated moving average (ARIMA) time series model.
Results: The GAM method outperformed the ARIMA model across scoring rules at both high and low-level geographies, and across the different phases of the epidemic wave including the turning point. The performance of the GAM with a 14-day forecast horizon was comparable in error to the ARIMA at 7 days. The performance of the GAM is found to be most sensitive to the flexibility of the smoothing function that measures the national epidemic trend.
Interpretation: This study introduces a novel approach to short-term forecasting of hospital admissions with influenza using hierarchical, spatial, and temporal components. The model is data-driven and practical to deploy using information realistically available at time of prediction, addressing key limitations of epidemic forecasting approaches. This model was used across the winter for healthcare operational planning by the UK Health Security Agency and the National Health Service in England.
Submitted 23 February, 2023;
originally announced February 2023.
-
Adaptive data collection for intra-individual studies affected by adherence
Authors:
Greta Monacelli,
Lili Zhang,
Winfried Schlee,
Berthold Langguth,
Tomás E. Ward,
Thomas B. Murphy
Abstract:
Recently, the use of mobile technologies in Ecological Momentary Assessments (EMA) and Interventions (EMI) has made it easier to collect data suitable for intra-individual variability studies in the medical field. Nevertheless, especially when self-reports are used during the data collection process, there are difficulties in balancing data quality and the burden placed on the subjects. In this paper, we address this problem for a specific EMA setting which aims to submit a demanding task to subjects at high/low values of a self-reported variable. We adopt a dynamic approach inspired by control chart methods and design optimization techniques to obtain an EMA triggering mechanism for data collection which takes into account the individual variability of both the self-reported variable and the adherence rate. We test the algorithm in both a simulation setting and with real, large-scale data from a tinnitus longitudinal study. A Wilcoxon-Mann-Whitney Rank Sum Test shows that the algorithm tends to have both a higher F1 score and utility than a random schedule and a rule-based algorithm with static thresholds, which are the current state-of-the-art approaches. In conclusion, the algorithm is shown to be effective in balancing data quality and the burden placed on the participants, especially, as the analysis performed suggests, in studies where data collection is impacted by adherence.
Submitted 25 July, 2022;
originally announced July 2022.
-
Implementing the ICE Estimator in Multilayer Perceptron Classifiers
Authors:
Tyler Ward
Abstract:
This paper describes the techniques used to implement the ICE estimator for a multilayer perceptron model, and reviews the performance of the resulting models. The ICE estimator is implemented in the Apache Spark MultilayerPerceptronClassifier, and shown in cross-validation to outperform the stock MultilayerPerceptronClassifier that uses unadjusted MLE (cross-entropy) loss. The resulting models have identical runtime performance, and similar fitting performance to the stock MLP implementations. Additionally, this approach requires no hyper-parameters, and is therefore viable as a drop-in replacement for cross-entropy optimizing multilayer perceptron classifiers wherever overfitting may be a concern.
Submitted 12 July, 2020;
originally announced July 2020.
-
IROS 2019 Lifelong Robotic Vision Challenge -- Lifelong Object Recognition Report
Authors:
Qi She,
Fan Feng,
Qi Liu,
Rosa H. M. Chan,
Xinyue Hao,
Chuanlin Lan,
Qihan Yang,
Vincenzo Lomonaco,
German I. Parisi,
Heechul Bae,
Eoin Brophy,
Baoquan Chen,
Gabriele Graffieti,
Vidit Goel,
Hyonyoung Han,
Sathursan Kanagarajah,
Somesh Kumar,
Siew-Kei Lam,
Tin Lun Lam,
Liang Ma,
Davide Maltoni,
Lorenzo Pellegrini,
Duvindu Piyasena,
Shiliang Pu,
Debdoot Sheet, et al. (11 additional authors not shown)
Abstract:
This report summarizes the IROS 2019 Lifelong Robotic Vision Competition (Lifelong Object Recognition Challenge) with methods and results from the top 8 finalists (out of over 150 teams). The competition dataset (L)ifel(O)ng (R)obotic V(IS)ion (OpenLORIS) - Object Recognition (OpenLORIS-object) is designed for driving lifelong/continual learning research and application in the robotic vision domain, with everyday objects in home, office, campus, and mall scenarios. The dataset explicitly quantifies the variants of illumination, object occlusion, object size, camera-object distance/angles, and clutter information. Rules are designed to quantify the learning capability of the robotic vision system when faced with the objects appearing in the dynamic environments in the contest. Individual reports, dataset information, rules, and released source code can be found at the project homepage: "https://lifelong-robotic-vision.github.io/competition/".
Submitted 26 April, 2020;
originally announced April 2020.
-
Predicting Injectable Medication Adherence via a Smart Sharps Bin and Machine Learning
Authors:
Yingqi Gu,
Akshay Zalkikar,
Lara Kelly,
Kieran Daly,
Tomas E. Ward
Abstract:
Medication non-adherence is a widespread problem affecting over 50% of people who have a chronic illness and need chronic treatment. Non-adherence exacerbates health risks and drives significant increases in treatment costs. In order to address these challenges, the importance of predicting patients' adherence has been recognised. In other words, it is important to improve the efficiency of interventions in the current healthcare system by prioritizing resources for the patients who are most likely to be non-adherent. Our objective in this work is to make predictions regarding individual patients' behaviour in terms of taking their medication on time during their next scheduled medication opportunity. We do this by leveraging a number of machine learning models. In particular, we demonstrate the use of a connected IoT device, a "Smart Sharps Bin" invented by HealthBeacon Ltd., to monitor and track injection disposal of patients in their home environment. Using extensive data collected from these devices, five machine learning models, namely Extra Trees Classifier, Random Forest, XGBoost, Gradient Boosting and Multilayer Perceptron, were trained and evaluated on a large dataset comprising 165,223 historic injection disposal records collected from 5,915 HealthBeacon units over the course of 3 years. The testing work was conducted on real-time data generated by the smart device over a time period after the model training was complete, i.e. true future data. The proposed machine learning approach demonstrated very good predictive performance, exhibiting an Area Under the Receiver Operating Characteristic Curve (ROC AUC) of 0.86.
△ Less
Submitted 2 April, 2020;
originally announced April 2020.
-
Optimised Convolutional Neural Networks for Heart Rate Estimation and Human Activity Recognition in Wrist Worn Sensing Applications
Authors:
Eoin Brophy,
Willie Muehlhausen,
Alan F. Smeaton,
Tomas E. Ward
Abstract:
Wrist-worn smart devices are providing increased insights into human health, behaviour and performance through sophisticated analytics. However, battery life, device cost and sensor performance in the face of movement-related artefacts present challenges which must be further addressed to see effective applications and wider adoption through commoditisation of the technology. We address these challenges by demonstrating that a simple optical measurement, photoplethysmography (PPG), used conventionally for heart rate detection in wrist-worn sensors, can provide improved heart rate and human activity recognition (HAR) simultaneously at low sample rates, without an inertial measurement unit. This simplifies hardware design and reduces costs and power budgets. We apply two deep learning pipelines, one for human activity recognition and one for heart rate estimation. HAR is achieved through the application of a visual classification approach, capable of robust performance at low sample rates. Here, transfer learning is leveraged to retrain a convolutional neural network (CNN) to distinguish characteristics of the PPG during different human activities. For heart rate estimation we use a CNN adapted for regression which maps noisy optical signals to heart rate estimates. In both cases, comparisons are made with leading conventional approaches. Our results demonstrate that a low sampling frequency can achieve good performance without significant degradation of accuracy. 5 Hz and 10 Hz were shown to have 80.2% and 83.0% classification accuracy for HAR respectively. These same sampling frequencies also yielded a robust heart rate estimation which was comparable with that achieved at the more energy-intensive rate of 256 Hz.
Submitted 30 March, 2020;
originally announced April 2020.
-
Synthesis of Realistic ECG using Generative Adversarial Networks
Authors:
Anne Marie Delaney,
Eoin Brophy,
Tomas E. Ward
Abstract:
Access to medical data is highly restricted due to its sensitive nature, preventing communities from using this data for research or clinical training. Common methods of de-identification implemented to enable the sharing of data are sometimes inadequate to protect the individuals contained in the data. For our research, we investigate the ability of generative adversarial networks (GANs) to produce realistic medical time series data which can be used without concerns over privacy. The aim is to generate synthetic ECG signals representative of normal ECG waveforms. GANs have been used successfully to generate good quality synthetic time series and have been shown to prevent re-identification of individual records. In this work, a range of GAN architectures are developed to generate synthetic sine waves and synthetic ECG. Two evaluation metrics are then used to quantitatively assess how suitable the synthetic data is for real world applications such as clinical training and data analysis. Finally, we discuss the privacy concerns associated with sharing synthetic data produced by GANs and test their ability to withstand a simple membership inference attack. For the first time, we both quantitatively and qualitatively demonstrate that GAN architectures can successfully generate time series signals that are not only structurally similar to the training sets but also diverse in nature across generated samples. We also report on their ability to withstand a simple membership inference attack, protecting the privacy of the training set.
Submitted 19 September, 2019;
originally announced September 2019.
-
Quick and Easy Time Series Generation with Established Image-based GANs
Authors:
Eoin Brophy,
Zhengwei Wang,
Tomas E. Ward
Abstract:
In recent years, Generative Adversarial Networks (GANs) have demonstrated significant progress in generating authentic-looking data. In this work we introduce our simple method to exploit the advancements in well-established image-based GANs to synthesise single-channel time series data. We implement Wasserstein GANs (WGANs) with gradient penalty, due to their stability in training, to synthesise three different types of data: sinusoidal data, photoplethysmograph (PPG) data and electrocardiograph (ECG) data. The length of the returned time series data is limited only by the image resolution; we use an image size of 64x64 pixels, which yields 4096 data points. We present both visual and quantitative evidence that our novel method can successfully generate time series data using image-based GANs.
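The relationship between image size and series length stated in the abstract above (64 x 64 = 4096 points) can be made concrete with a small sketch. It assumes a plain row-major mapping between a one-dimensional series and a square single-channel image. The abstract does not say which pixel ordering the authors use, so the mapping and the function names here are illustrative assumptions only.

```python
import math

SIDE = 64  # image side length from the abstract; 64 * 64 = 4096 samples

def series_to_image(series, side=SIDE):
    """Pack a 1-D series into a side x side grid, row by row (assumed ordering)."""
    if len(series) != side * side:
        raise ValueError(f"expected {side * side} samples, got {len(series)}")
    return [list(series[row * side:(row + 1) * side]) for row in range(side)]

def image_to_series(image):
    """Flatten a grid back into a 1-D series using the same row-major order."""
    return [value for row in image for value in row]

# Synthetic sinusoid with a period of 256 samples, long enough to fill one image.
wave = [math.sin(2 * math.pi * i / 256) for i in range(SIDE * SIDE)]

image = series_to_image(wave)      # what an image-based GAN would be trained on
restored = image_to_series(image)  # how a generated image is read back as a series
```

Because the mapping is a pure reindexing, the round trip is lossless, and a larger image resolution would directly allow a longer series.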
Submitted 29 October, 2019; v1 submitted 14 February, 2019;
originally announced February 2019.
-
Information-Corrected Estimation: A Generalization Error Reducing Parameter Estimation Method
Authors:
Matthew Dixon,
Tyler Ward
Abstract:
Modern computational models in supervised machine learning are often highly parameterized universal approximators. As such, the value of the parameters is unimportant, and only the out-of-sample performance is considered. On the other hand, much of the literature on model estimation assumes that the parameters themselves have intrinsic value, and thus is concerned with bias and variance of parameter estimates, which may not have any simple relationship to out-of-sample model performance. Therefore, within supervised machine learning, heavy use is made of ridge regression (i.e., L2 regularization), which requires the estimation of hyperparameters and can be rendered ineffective by certain model parameterizations. We introduce an objective function, which we refer to as Information-Corrected Estimation (ICE), that reduces KL-divergence-based generalization error for supervised machine learning. ICE attempts to directly maximize a corrected likelihood function as an estimator of the KL divergence. Such an approach is proven, theoretically, to be effective for a wide class of models, with only mild regularity restrictions. Under finite sample sizes, this corrected estimation procedure is shown experimentally to lead to significant reduction in generalization error compared to maximum likelihood estimation and L2 regularization.
Submitted 3 November, 2021; v1 submitted 13 March, 2018;
originally announced March 2018.