-
Using Generative Models to Produce Realistic Populations of UK Windstorms
Authors:
Yee Chun Tsoi,
Kieran M. R. Hunt,
Len Shaffrey,
Atta Badii,
Richard Dixon,
Ludovico Nicotina
Abstract:
This study evaluates the potential of generative models, trained on historical ERA5 reanalysis data, for simulating windstorms over the UK. Four generative models (a standard GAN, a WGAN-GP, a U-net diffusion model, and a diffusion-GAN) were assessed based on their ability to replicate spatial and statistical characteristics of windstorms. Different models have distinct strengths and limitations. The standard GAN displayed broader variability and limited alignment on the PCA dimensions. The WGAN-GP had a more balanced performance but occasionally misrepresented extreme events. The U-net diffusion model produced high-quality spatial patterns but consistently underestimated windstorm intensities. The diffusion-GAN performed better than the other models in general but overestimated extremes. An ensemble approach combining the strengths of these models could potentially improve their overall reliability. This study provides a foundation for such generative models in meteorological research and could potentially be applied in windstorm analysis and risk assessment.
Submitted 27 January, 2025;
originally announced January 2025.
-
Hydra-LSTM: A semi-shared Machine Learning architecture for prediction across Watersheds
Authors:
Karan Ruparell,
Robert J. Marks,
Andy Wood,
Kieran M. R. Hunt,
Hannah L. Cloke,
Christel Prudhomme,
Florian Pappenberger,
Matthew Chantry
Abstract:
Long Short Term Memory networks (LSTMs) are used to build single models that predict river discharge across many catchments. These models offer greater accuracy than models trained on each catchment independently if using the same data. However, the same data is rarely available for all catchments. This prevents the use of variables available only in some catchments, such as historic river discharge or upstream discharge. The only existing method that allows for optional variables requires all variables to be considered in the initial training of the model, limiting its transferability to new catchments. To address this limitation, we develop the Hydra-LSTM. The Hydra-LSTM processes variables used across all catchments and variables used in only some catchments separately to allow general training and use of catchment-specific data in individual catchments. The bulk of the model can be shared across catchments, maintaining the benefits of multi-catchment models to generalise, while also benefitting from the advantages of using bespoke data. We apply this methodology to 1 day-ahead river discharge prediction in the Western US, as next-day river discharge prediction is the first step towards prediction across longer time scales. We obtain state-of-the-art performance, generating more accurate median and quantile predictions than Multi-Catchment and Single-Catchment LSTMs while allowing local forecasters to easily introduce and remove variables from their prediction set. We test the ability of the Hydra-LSTM to incorporate catchment-specific data by introducing historical river discharge as a catchment-specific input, outperforming state-of-the-art models without needing to train an entirely new model.
Submitted 21 October, 2024;
originally announced October 2024.
-
Do AI models produce better weather forecasts than physics-based models? A quantitative evaluation case study of Storm Ciarán
Authors:
Andrew J. Charlton-Perez,
Helen F. Dacre,
Simon Driscoll,
Suzanne L. Gray,
Ben Harvey,
Natalie J. Harvey,
Kieran M. R. Hunt,
Robert W. Lee,
Ranjini Swaminathan,
Remy Vandaele,
Ambrogio Volonté
Abstract:
There has been huge recent interest in the potential of making operational weather forecasts using machine learning techniques. As they become a part of the weather forecasting toolbox, there is a pressing need to understand how well current machine learning models can simulate high-impact weather events. We compare forecasts of Storm Ciarán, a European windstorm that caused sixteen deaths and extensive damage in Northern Europe, made by machine learning and numerical weather prediction models. The four machine learning models considered (FourCastNet, Pangu-Weather, GraphCast and FourCastNet-v2) produce forecasts that accurately capture the synoptic-scale structure of the cyclone including the position of the cloud head, shape of the warm sector and location of warm conveyor belt jet, and the large-scale dynamical drivers important for the rapid storm development such as the position of the storm relative to the upper-level jet exit. However, their ability to resolve the more detailed structures important for issuing weather warnings is more mixed. All of the machine learning models underestimate the peak amplitude of winds associated with the storm, only some machine learning models resolve the warm core seclusion and none of the machine learning models capture the sharp bent-back warm frontal gradient. Our study shows there is a great deal about the performance and properties of machine learning weather forecasts that can be derived from case studies of high-impact weather events such as Storm Ciarán.
Submitted 19 February, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
-
Predicting Clinical Intent from Free Text Electronic Health Records
Authors:
Kawsar Noor,
Katherine Smith,
Julia Bennett,
Jade OConnell,
Jessica Fisk,
Monika Hunt,
Gary Philippo,
Teresa Xu,
Simon Knight,
Luis Romao,
Richard JB Dobson,
Wai Keong Wong
Abstract:
After a patient consultation, a clinician determines the steps in the management of the patient. A clinician may for example request to see the patient again or refer them to a specialist. Whilst most clinicians will record their intent as "next steps" in the patient's clinical notes, in some cases the clinician may forget to indicate their intent as an order or request, e.g. failure to place the follow-up order. This consequently results in patients becoming lost to follow-up and may in some cases lead to adverse consequences. In this paper we train a machine learning model to detect a clinician's intent to follow up with a patient from the patient's clinical notes. Annotators systematically identified 22 possible types of clinical intent and annotated 3000 bariatric clinical notes. The annotation process revealed a class imbalance in the labeled data and we found that there was only sufficient labeled data to train 11 out of the 22 intents. We used the data to train a BERT-based multilabel classification model and reported the following average accuracy metrics for all intents: macro-precision: 0.91, macro-recall: 0.90, macro-f1: 0.90.
Submitted 25 March, 2022;
originally announced April 2022.
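The macro-precision, macro-recall and macro-f1 figures quoted in the abstract above are averages of per-intent scores. The following is a minimal sketch of how such macro-averaged metrics can be computed for a multilabel task, not the authors' evaluation code. The function name `macro_scores`, the label names, and the toy notes are hypothetical, and the sketch assumes the common convention that macro-F1 is the unweighted mean of the per-label F1 scores.

```python
from typing import Dict, List, Set


def macro_scores(
    y_true: List[Set[str]],
    y_pred: List[Set[str]],
    labels: List[str],
) -> Dict[str, float]:
    """Macro-averaged precision, recall and F1 for multilabel predictions.

    Each element of y_true / y_pred is the set of labels attached to one note.
    Assumption: a label with no predictions (or no true instances) scores 0.0.
    """
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)

        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Hypothetical toy data: three notes, two intent labels.
toy_true = [{"follow_up"}, {"referral"}, {"follow_up", "referral"}]
toy_pred = [{"follow_up"}, {"follow_up"}, {"referral"}]
toy_scores = macro_scores(toy_true, toy_pred, ["follow_up", "referral"])
```

Because every label contributes equally to a macro average, rare intents weigh as much as frequent ones, which is relevant given the class imbalance the abstract mentions.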
-
Sim2Ls: FAIR simulation workflows and data
Authors:
Martin Hunt,
Steven Clark,
Daniel Mejia,
Saaketh Desai,
Alejandro Strachan
Abstract:
Just like the scientific data they generate, simulation workflows for research should be findable, accessible, interoperable, and reusable (FAIR). However, while significant progress has been made towards FAIR data, the majority of science and engineering workflows used in research remain poorly documented and often unavailable, involving ad hoc scripts and manual steps, hindering reproducibility and stifling progress. We introduce Sim2Ls (pronounced simtools) and the Sim2L Python library that allow developers to create and share end-to-end computational workflows with well-defined and verified inputs and outputs. The Sim2L library makes Sim2Ls, their requirements, and their services discoverable, verifies inputs and outputs, and automatically stores results in a globally-accessible simulation cache and results database. This simulation ecosystem is available in nanoHUB, an open platform that also provides publication services for Sim2Ls, a computational environment for developers and users, and the hardware to execute runs and store results at no cost. We exemplify the use of Sim2Ls using two applications and discuss best practices towards FAIR simulation workflows and associated data.
Submitted 6 October, 2021;
originally announced October 2021.
-
Class Clown: Data Redaction in Machine Unlearning at Enterprise Scale
Authors:
Daniel L. Felps,
Amelia D. Schwickerath,
Joyce D. Williams,
Trung N. Vuong,
Alan Briggs,
Matthew Hunt,
Evan Sakmar,
David D. Saranchak,
Tyler Shumaker
Abstract:
Individuals are gaining more control of their personal data through recent data privacy laws such as the General Data Protection Regulation and the California Consumer Privacy Act. One aspect of these laws is the ability to request a business to delete private information, the so-called "right to be forgotten" or "right to erasure". These laws have serious financial implications for companies and organizations that train large, highly accurate deep neural networks (DNNs) using these valuable consumer data sets. However, a received redaction request poses complex technical challenges on how to comply with the law while fulfilling core business operations. We introduce a DNN model lifecycle maintenance process that establishes how to handle specific data redaction requests and minimize the need to completely retrain the model. Our process is based upon the membership inference attack as a compliance tool for every point in the training set. These attack models quantify the privacy risk of all training data points and form the basis of follow-on data redaction from an accurate deployed model; excision is implemented through incorrect label assignment within incremental model updates.
Submitted 8 December, 2020;
originally announced December 2020.
-
Bootstrap Aggregation for Point-based Generalized Membership Inference Attacks
Authors:
Daniel L. Felps,
Amelia D. Schwickerath,
Joyce D. Williams,
Trung N. Vuong,
Alan Briggs,
Matthew Hunt,
Evan Sakmar,
David D. Saranchak,
Tyler Shumaker
Abstract:
An efficient scheme is introduced that extends the generalized membership inference attack to every point in a model's training data set. Our approach leverages data partitioning to create variable sized training sets for the reference models. We then train an attack model for every single training example for a reference model configuration based upon output for each individual point. This allows us to quantify the membership inference attack vulnerability of each training data point. Using this approach, we discovered that smaller amounts of reference model training data led to a stronger attack. Furthermore, the reference models do not need to be of the same architecture as the target model, providing additional attack efficiencies. The attack may also be performed by an adversary even when they do not have the complete original data set.
Submitted 17 November, 2020;
originally announced November 2020.
-
Neural Network-Based Modeling of Phonetic Durations
Authors:
Xizi Wei,
Melvyn Hunt,
Adrian Skilling
Abstract:
A deep neural network (DNN)-based model has been developed to predict non-parametric distributions of durations of phonemes in specified phonetic contexts and used to explore which factors influence durations most. Major factors in US English are pre-pausal lengthening, lexical stress, and speaking rate. The model can be used to check that text-to-speech (TTS) training speech follows the script and words are pronounced as expected. Duration prediction is poorer with training speech for automatic speech recognition (ASR) because the training corpus typically consists of single utterances from many speakers and is often noisy or casually spoken. Low probability durations in ASR training material nevertheless mostly correspond to non-standard speech, with some having disfluencies. Children's speech is disproportionately present in these utterances, since children show much more variation in timing.
Submitted 6 September, 2019;
originally announced September 2019.
-
Comparison of Patch-Based Conditional Generative Adversarial Neural Net Models with Emphasis on Model Robustness for Use in Head and Neck Cases for MR-Only planning
Authors:
Peter Klages,
Ilyes Benslimane,
Sadegh Riyahi,
Jue Jiang,
Margie Hunt,
Joe Deasy,
Harini Veeraraghavan,
Neelam Tyagi
Abstract:
A total of twenty paired CT and MR images were used in this study to investigate two conditional generative adversarial networks, Pix2Pix and Cycle GAN, for generating synthetic CT (sCT) images for head and neck cancer cases. Ten of the patient cases were used for training and included such common artifacts as dental implants; the remaining ten cases were used for testing and included a larger range of image features commonly found in clinical head and neck cases. These features included strong metal artifacts from dental implants, one case with a metal implant, and one case with abnormal anatomy. The original CT images were deformably registered to the mDixon FFE MR images to minimize the effects of processing the MR images. The sCT generation accuracy and robustness were evaluated using Mean Absolute Error (MAE) based on the Hounsfield Units (HU) for three regions (whole body, bone, and air within the body), Mean Error (ME) to observe systematic average offset errors in the sCT generation, and dosimetric evaluation of all clinically relevant structures. For the test set, the MAEs for the Pix2Pix and Cycle GAN models were 92.4 $\pm$ 13.5 HU and 100.7 $\pm$ 14.6 HU, respectively, for the body region, 166.3 $\pm$ 31.8 HU and 184 $\pm$ 31.9 HU, respectively, for the bone region, and 183.7 $\pm$ 41.3 HU and 185.4 $\pm$ 37.9 HU, respectively, for the air regions. The MEs for Pix2Pix and Cycle GAN were 21.0 $\pm$ 11.8 HU and 37.5 $\pm$ 14.9 HU, respectively. Absolute Percent Mean/Max Dose Errors were less than 2% for the PTV and all critical structures for both models, and DRRs generated from these models looked qualitatively similar to CT-generated DRRs, showing that these methods are promising for MR-only planning.
Submitted 27 February, 2019; v1 submitted 1 February, 2019;
originally announced February 2019.
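The abstract above reports Mean Absolute Error (MAE) and Mean Error (ME) in Hounsfield Units over specific regions. As a rough illustration of how the two metrics differ, here is a minimal sketch that computes both over the voxels selected by a region mask. It is not the authors' pipeline: the function name `mae_and_me`, the flat-list voxel representation, and the sign convention (synthetic CT minus reference CT) are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def mae_and_me(
    sct_hu: Sequence[float],
    ct_hu: Sequence[float],
    region_mask: Sequence[bool],
) -> Tuple[float, float]:
    """Return (MAE, ME) in HU over the voxels where region_mask is True.

    Assumed sign convention: difference = synthetic CT - reference CT,
    so a positive ME means the synthetic CT is on average too high.
    """
    if not (len(sct_hu) == len(ct_hu) == len(region_mask)):
        raise ValueError("all inputs must have the same number of voxels")

    diffs: List[float] = [
        s - c for s, c, inside in zip(sct_hu, ct_hu, region_mask) if inside
    ]
    if not diffs:
        raise ValueError("region mask selects no voxels")

    mae = sum(abs(d) for d in diffs) / len(diffs)
    me = sum(diffs) / len(diffs)
    return mae, me


# Hypothetical toy voxels (HU); the last voxel lies outside the region.
toy_sct = [10.0, -990.0, 300.0, 50.0]
toy_ct = [0.0, -1000.0, 350.0, 40.0]
toy_mask = [True, True, True, False]
toy_mae, toy_me = mae_and_me(toy_sct, toy_ct, toy_mask)
```

MAE measures the typical size of the error regardless of direction, while ME lets positive and negative errors cancel and so exposes a systematic offset; this is why the abstract reports both.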
-
Open Access Policy: Numbers, Analysis, Effectiveness
Authors:
A. Swan,
Y. Gargouri,
M. Hunt,
S. Harnad
Abstract:
The PASTEUR4OA project analyses what makes an Open Access (OA) policy effective. The total number of institutional or funder OA policies worldwide is now 663 (March 2015), over half of them mandatory. ROARMAP, the policy registry, has been rebuilt to record more policy detail and provide more extensive search functionality. Deposit rates were measured for articles in institutions' repositories and compared to the total number of WoS-indexed articles published from those institutions. Average deposit rate was over four times as high for institutions with a mandatory policy. Six positive correlations were found between deposit rates and (1) Must-Deposit; (2) Cannot-Waive-Deposit; (3) Deposit-Linked-to-Research-Evaluation; (4) Cannot-Waive-Rights-Retention; (5) Must-Make-Deposit-OA (after allowable embargo) and (6) Can-Waive-OA. For deposit latency, there is a positive correlation between earlier deposit and (7) Must-Deposit-Immediately as well as with (4) Cannot-Waive-Rights-Retention and with mandate age. There are not yet enough OA policies to test whether still further policy conditions would contribute to mandate effectiveness but the present findings already suggest that it would be useful for current and future OA policies to adopt the seven positive conditions so as to accelerate and maximise the growth of OA.
Submitted 9 April, 2015;
originally announced April 2015.