-
Harnessing AI data-driven global weather models for climate attribution: An analysis of the 2017 Oroville Dam extreme atmospheric river
Authors:
Jorge Baño-Medina,
Agniv Sengupta,
Allison Michaelis,
Luca Delle Monache,
Julie Kalansky,
Duncan Watson-Parris
Abstract:
AI data-driven models (Graphcast, Pangu Weather, Fourcastnet, and SFNO) are explored for storyline-based climate attribution due to their short inference times, which can accelerate the number of events studied, and provide real time attributions when public attention is heightened. The analysis is framed on the extreme atmospheric river episode of February 2017 that contributed to the Oroville da…
▽ More
AI data-driven models (Graphcast, Pangu Weather, Fourcastnet, and SFNO) are explored for storyline-based climate attribution due to their short inference times, which can accelerate the number of events studied, and provide real time attributions when public attention is heightened. The analysis is framed on the extreme atmospheric river episode of February 2017 that contributed to the Oroville dam spillway incident in Northern California. Past and future simulations are generated by perturbing the initial conditions with the pre-industrial and the late-21st century temperature climate change signals, respectively. The simulations are compared to results from a dynamical model which represents plausible pseudo-realities under both climate environments. Overall, the AI models show promising results, projecting a 5-6 % increase in the integrated water vapor over the Oroville dam in the present day compared to the pre-industrial, in agreement with the dynamical model. Different geopotential-moisture-temperature dependencies are unveiled for each of the AI-models tested, providing valuable information for understanding the physicality of the attribution response. However, the AI models tend to simulate weaker attribution values than the pseudo-reality imagined by the dynamical model, suggesting some reduced extrapolation skill, especially for the late-21st century regime. Large ensembles generated with an AI model (>500 members) produced statistically significant present-day to pre-industrial attribution results, unlike the >20-member ensemble from the dynamical model. This analysis highlights the potential of AI models to conduct attribution analysis, while emphasizing future lines of work on explainable artificial intelligence to gain confidence in these tools, which can enable reliable attribution studies in real-time.
△ Less
Submitted 17 September, 2024;
originally announced September 2024.
-
Exploring Randomly Wired Neural Networks for Climate Model Emulation
Authors:
William Yik,
Sam J. Silva,
Andrew Geiss,
Duncan Watson-Parris
Abstract:
Exploring the climate impacts of various anthropogenic emissions scenarios is key to making informed decisions for climate change mitigation and adaptation. State-of-the-art Earth system models can provide detailed insight into these impacts, but have a large associated computational cost on a per-scenario basis. This large computational burden has driven recent interest in developing cheap machin…
▽ More
Exploring the climate impacts of various anthropogenic emissions scenarios is key to making informed decisions for climate change mitigation and adaptation. State-of-the-art Earth system models can provide detailed insight into these impacts, but have a large associated computational cost on a per-scenario basis. This large computational burden has driven recent interest in developing cheap machine learning models for the task of climate model emulation. In this manuscript, we explore the efficacy of randomly wired neural networks for this task. We describe how they can be constructed and compare them to their standard feedforward counterparts using the ClimateBench dataset. Specifically, we replace the serially connected dense layers in multilayer perceptrons, convolutional neural networks, and convolutional long short-term memory networks with randomly wired dense layers and assess the impact on model performance for models with 1 million and 10 million parameters. We find that models with less complex architectures see the greatest performance improvement with the addition of random wiring (up to 30.4% for multilayer perceptrons). Furthermore, out of 24 different model architecture, parameter count, and prediction task combinations, only one saw a statistically significant performance deficit in randomly wired networks compared to their standard counterparts, with 14 cases showing statistically significant improvement. We also find no significant difference in prediction speed between networks with standard feedforward dense layers and those with randomly wired layers. These findings indicate that randomly wired neural networks may be suitable direct replacements for traditional dense layers in many standard models.
△ Less
Submitted 21 January, 2024; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Pyrocast: a Machine Learning Pipeline to Forecast Pyrocumulonimbus (PyroCb) Clouds
Authors:
Kenza Tazi,
Emiliano Díaz Salas-Porras,
Ashwin Braude,
Daniel Okoh,
Kara D. Lamb,
Duncan Watson-Parris,
Paula Harder,
Nis Meinert
Abstract:
Pyrocumulonimbus (pyroCb) clouds are storm clouds generated by extreme wildfires. PyroCbs are associated with unpredictable, and therefore dangerous, wildfire spread. They can also inject smoke particles and trace gases into the upper troposphere and lower stratosphere, affecting the Earth's climate. As global temperatures increase, these previously rare events are becoming more common. Being able…
▽ More
Pyrocumulonimbus (pyroCb) clouds are storm clouds generated by extreme wildfires. PyroCbs are associated with unpredictable, and therefore dangerous, wildfire spread. They can also inject smoke particles and trace gases into the upper troposphere and lower stratosphere, affecting the Earth's climate. As global temperatures increase, these previously rare events are becoming more common. Being able to predict which fires are likely to generate pyroCb is therefore key to climate adaptation in wildfire-prone areas. This paper introduces Pyrocast, a pipeline for pyroCb analysis and forecasting. The pipeline's first two components, a pyroCb database and a pyroCb forecast model, are presented. The database brings together geostationary imagery and environmental data for over 148 pyroCb events across North America, Australia, and Russia between 2018 and 2022. Random Forests, Convolutional Neural Networks (CNNs), and CNNs pretrained with Auto-Encoders were tested to predict the generation of pyroCb for a given fire six hours in advance. The best model predicted pyroCb with an AUC of $0.90 \pm 0.04$.
△ Less
Submitted 22 November, 2022;
originally announced November 2022.
-
AODisaggregation: toward global aerosol vertical profiles
Authors:
Shahine Bouabid,
Duncan Watson-Parris,
Sofija Stefanović,
Athanasios Nenes,
Dino Sejdinovic
Abstract:
Aerosol-cloud interactions constitute the largest source of uncertainty in assessments of the anthropogenic climate change. This uncertainty arises in part from the difficulty in measuring the vertical distributions of aerosols, and only sporadic vertically resolved observations are available. We often have to settle for less informative vertically aggregated proxies such as aerosol optical depth…
▽ More
Aerosol-cloud interactions constitute the largest source of uncertainty in assessments of the anthropogenic climate change. This uncertainty arises in part from the difficulty in measuring the vertical distributions of aerosols, and only sporadic vertically resolved observations are available. We often have to settle for less informative vertically aggregated proxies such as aerosol optical depth (AOD). In this work, we develop a framework for the vertical disaggregation of AOD into extinction profiles, i.e. the measure of light extinction throughout an atmospheric column, using readily available vertically resolved meteorological predictors such as temperature, pressure or relative humidity. Using Bayesian nonparametric modelling, we devise a simple Gaussian process prior over aerosol vertical profiles and update it with AOD observations to infer a distribution over vertical extinction profiles. To validate our approach, we use ECHAM-HAM aerosol-climate model data which offers self-consistent simulations of meteorological covariates, AOD and extinction profiles. Our results show that, while very simple, our model is able to reconstruct realistic extinction profiles with well-calibrated uncertainty, outperforming by an order of magnitude the idealized baseline which is typically used in satellite AOD retrieval algorithms. In particular, the model demonstrates a faithful reconstruction of extinction patterns arising from aerosol water uptake in the boundary layer. Observations however suggest that other extinction patterns, due to aerosol mass concentration, particle size and radiative properties, might be more challenging to capture and require additional vertically resolved predictors.
△ Less
Submitted 23 May, 2022; v1 submitted 6 May, 2022;
originally announced May 2022.
-
Using Non-Linear Causal Models to Study Aerosol-Cloud Interactions in the Southeast Pacific
Authors:
Andrew Jesson,
Peter Manshausen,
Alyson Douglas,
Duncan Watson-Parris,
Yarin Gal,
Philip Stier
Abstract:
Aerosol-cloud interactions include a myriad of effects that all begin when aerosol enters a cloud and acts as cloud condensation nuclei (CCN). An increase in CCN results in a decrease in the mean cloud droplet size (r$_{e}$). The smaller droplet size leads to brighter, more expansive, and longer lasting clouds that reflect more incoming sunlight, thus cooling the earth. Globally, aerosol-cloud int…
▽ More
Aerosol-cloud interactions include a myriad of effects that all begin when aerosol enters a cloud and acts as cloud condensation nuclei (CCN). An increase in CCN results in a decrease in the mean cloud droplet size (r$_{e}$). The smaller droplet size leads to brighter, more expansive, and longer lasting clouds that reflect more incoming sunlight, thus cooling the earth. Globally, aerosol-cloud interactions cool the Earth, however the strength of the effect is heterogeneous over different meteorological regimes. Understanding how aerosol-cloud interactions evolve as a function of the local environment can help us better understand sources of error in our Earth system models, which currently fail to reproduce the observed relationships. In this work we use recent non-linear, causal machine learning methods to study the heterogeneous effects of aerosols on cloud droplet radius.
△ Less
Submitted 3 November, 2021; v1 submitted 28 October, 2021;
originally announced October 2021.
-
Model calibration using ESEm v1.0.0 -- an open, scalable Earth System Emulator
Authors:
Duncan Watson-Parris,
Andrew Williams,
Lucia Deaconu,
Philip Stier
Abstract:
Large computer models are ubiquitous in the earth sciences. These models often have tens or hundreds of tuneable parameters and can take thousands of core-hours to run to completion while generating terabytes of output. It is becoming common practice to develop emulators as fast approximations, or surrogates, of these models in order to explore the relationships between these inputs and outputs, u…
▽ More
Large computer models are ubiquitous in the earth sciences. These models often have tens or hundreds of tuneable parameters and can take thousands of core-hours to run to completion while generating terabytes of output. It is becoming common practice to develop emulators as fast approximations, or surrogates, of these models in order to explore the relationships between these inputs and outputs, understand uncertainties and generate large ensembles datasets. While the purpose of these surrogates may differ, their development is often very similar. Here we introduce ESEm: an open-source tool providing a general workflow for emulating and validating a wide variety of models and outputs. It includes efficient routines for sampling these emulators for the purpose of uncertainty quantification and model calibration. It is built on well-established, high-performance libraries to ensure robustness, extensibility and scalability. We demonstrate the flexibility of ESEm through three case-studies using ESEm to reduce parametric uncertainty in a general circulation model, explore precipitation sensitivity in a cloud resolving model and scenario uncertainty in the CMIP6 multi-model ensemble.
△ Less
Submitted 24 August, 2021;
originally announced August 2021.
-
RainBench: Towards Global Precipitation Forecasting from Satellite Imagery
Authors:
Christian Schroeder de Witt,
Catherine Tong,
Valentina Zantedeschi,
Daniele De Martini,
Freddie Kalaitzis,
Matthew Chantry,
Duncan Watson-Parris,
Piotr Bilinski
Abstract:
Extreme precipitation events, such as violent rainfall and hail storms, routinely ravage economies and livelihoods around the developing world. Climate change further aggravates this issue. Data-driven deep learning approaches could widen the access to accurate multi-day forecasts, to mitigate against such events. However, there is currently no benchmark dataset dedicated to the study of global pr…
▽ More
Extreme precipitation events, such as violent rainfall and hail storms, routinely ravage economies and livelihoods around the developing world. Climate change further aggravates this issue. Data-driven deep learning approaches could widen the access to accurate multi-day forecasts, to mitigate against such events. However, there is currently no benchmark dataset dedicated to the study of global precipitation forecasts. In this paper, we introduce \textbf{RainBench}, a new multi-modal benchmark dataset for data-driven precipitation forecasting. It includes simulated satellite data, a selection of relevant meteorological data from the ERA5 reanalysis product, and IMERG precipitation data. We also release \textbf{PyRain}, a library to process large precipitation datasets efficiently. We present an extensive analysis of our novel dataset and establish baseline results for two benchmark medium-range precipitation forecasting tasks. Finally, we discuss existing data-driven weather forecasting methodologies and suggest future research avenues.
△ Less
Submitted 17 December, 2020;
originally announced December 2020.
-
Machine learning for weather and climate are worlds apart
Authors:
Duncan Watson-Parris
Abstract:
Modern weather and climate models share a common heritage, and often even components, however they are used in different ways to answer fundamentally different questions. As such, attempts to emulate them using machine learning should reflect this. While the use of machine learning to emulate weather forecast models is a relatively new endeavour there is a rich history of climate model emulation.…
▽ More
Modern weather and climate models share a common heritage, and often even components, however they are used in different ways to answer fundamentally different questions. As such, attempts to emulate them using machine learning should reflect this. While the use of machine learning to emulate weather forecast models is a relatively new endeavour there is a rich history of climate model emulation. This is primarily because while weather modelling is an initial condition problem which intimately depends on the current state of the atmosphere, climate modelling is predominantly a boundary condition problem. In order to emulate the response of the climate to different drivers therefore, representation of the full dynamical evolution of the atmosphere is neither necessary, or in many cases, desirable. Climate scientists are typically interested in different questions also. Indeed emulating the steady-state climate response has been possible for many years and provides significant speed increases that allow solving inverse problems for e.g. parameter estimation. Nevertheless, the large datasets, non-linear relationships and limited training data make Climate a domain which is rich in interesting machine learning challenges.
Here I seek to set out the current state of climate model emulation and demonstrate how, despite some challenges, recent advances in machine learning provide new opportunities for creating useful statistical models of the climate.
△ Less
Submitted 29 October, 2020; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Building high accuracy emulators for scientific simulations with deep neural architecture search
Authors:
M. F. Kasim,
D. Watson-Parris,
L. Deaconu,
S. Oliver,
P. Hatfield,
D. H. Froula,
G. Gregori,
M. Jarvis,
S. Khatiwala,
J. Korenaga,
J. Topp-Mugglestone,
E. Viezzer,
S. M. Vinko
Abstract:
Computer simulations are invaluable tools for scientific discovery. However, accurate simulations are often slow to execute, which limits their applicability to extensive parameter exploration, large-scale data analysis, and uncertainty quantification. A promising route to accelerate simulations by building fast emulators with machine learning requires large training datasets, which can be prohibi…
▽ More
Computer simulations are invaluable tools for scientific discovery. However, accurate simulations are often slow to execute, which limits their applicability to extensive parameter exploration, large-scale data analysis, and uncertainty quantification. A promising route to accelerate simulations by building fast emulators with machine learning requires large training datasets, which can be prohibitively expensive to obtain with slow simulations. Here we present a method based on neural architecture search to build accurate emulators even with a limited number of training data. The method successfully accelerates simulations by up to 2 billion times in 10 scientific cases including astrophysics, climate science, biogeochemistry, high energy density physics, fusion energy, and seismology, using the same super-architecture, algorithm, and hyperparameters. Our approach also inherently provides emulator uncertainty estimation, adding further confidence in their use. We anticipate this work will accelerate research involving expensive simulations, allow more extensive parameters exploration, and enable new, previously unfeasible computational discovery.
△ Less
Submitted 8 October, 2020; v1 submitted 17 January, 2020;
originally announced January 2020.
-
Detecting anthropogenic cloud perturbations with deep learning
Authors:
Duncan Watson-Parris,
Samuel Sutherland,
Matthew Christensen,
Anthony Caterini,
Dino Sejdinovic,
Philip Stier
Abstract:
One of the most pressing questions in climate science is that of the effect of anthropogenic aerosol on the Earth's energy balance. Aerosols provide the `seeds' on which cloud droplets form, and changes in the amount of aerosol available to a cloud can change its brightness and other physical properties such as optical thickness and spatial extent. Clouds play a critical role in moderating global…
▽ More
One of the most pressing questions in climate science is that of the effect of anthropogenic aerosol on the Earth's energy balance. Aerosols provide the `seeds' on which cloud droplets form, and changes in the amount of aerosol available to a cloud can change its brightness and other physical properties such as optical thickness and spatial extent. Clouds play a critical role in moderating global temperatures and small perturbations can lead to significant amounts of cooling or warming. Uncertainty in this effect is so large it is not currently known if it is negligible, or provides a large enough cooling to largely negate present-day warming by CO2. This work uses deep convolutional neural networks to look for two particular perturbations in clouds due to anthropogenic aerosol and assess their properties and prevalence, providing valuable insights into their climatic effects.
△ Less
Submitted 29 November, 2019;
originally announced November 2019.
-
Cumulo: A Dataset for Learning Cloud Classes
Authors:
Valentina Zantedeschi,
Fabrizio Falasca,
Alyson Douglas,
Richard Strange,
Matt J. Kusner,
Duncan Watson-Parris
Abstract:
One of the greatest sources of uncertainty in future climate projections comes from limitations in modelling clouds and in understanding how different cloud types interact with the climate system. A key first step in reducing this uncertainty is to accurately classify cloud types at high spatial and temporal resolution. In this paper, we introduce Cumulo, a benchmark dataset for training and evalu…
▽ More
One of the greatest sources of uncertainty in future climate projections comes from limitations in modelling clouds and in understanding how different cloud types interact with the climate system. A key first step in reducing this uncertainty is to accurately classify cloud types at high spatial and temporal resolution. In this paper, we introduce Cumulo, a benchmark dataset for training and evaluating global cloud classification models. It consists of one year of 1km resolution MODIS hyperspectral imagery merged with pixel-width 'tracks' of CloudSat cloud labels. Bringing these complementary datasets together is a crucial first step, enabling the Machine-Learning community to develop innovative new techniques which could greatly benefit the Climate community. To showcase Cumulo, we provide baseline performance analysis using an invertible flow generative model (IResNet), which further allows us to discover new sub-classes for a given cloud class by exploring the latent space. To compare methods, we introduce a set of evaluation criteria, to identify models that are not only accurate, but also physically-realistic. CUMULO can be download from https://www.dropbox.com/sh/i3s9q2v2jjyk2it/AACxXnXfMF5wuIqLXqH4NJOra?dl=0 .
△ Less
Submitted 13 October, 2022; v1 submitted 5 November, 2019;
originally announced November 2019.