Search | arXiv e-print repository

Robustness of AI-based weather forecasts in a changing climate

Authors: Thomas Rackow, Nikolay Koldunov, Christian Lessig, Irina Sandu, Mihai Alexe, Matthew Chantry, Mariana Clare, Jesper Dramsch, Florian Pappenberger, Xabier Pedruzo-Bagazgoitia, Steffen Tietsche, Thomas Jung

Abstract: Data-driven machine learning models for weather forecasting have made transformational progress in the last 1-2 years, with state-of-the-art ones now outperforming the best physics-based models for a wide range of skill scores. Given the strong links between weather and climate modelling, this raises the question whether machine learning models could also revolutionize climate science, for example by informing mitigation and adaptation to climate change or to generate larger ensembles for more robust uncertainty estimates. Here, we show that current state-of-the-art machine learning models trained for weather forecasting in present-day climate produce skillful forecasts across different climate states corresponding to pre-industrial, present-day, and future 2.9K warmer climates. This indicates that the dynamics shaping the weather on short timescales may not differ fundamentally in a changing climate. It also demonstrates out-of-distribution generalization capabilities of the machine learning models that are a critical prerequisite for climate applications. Nonetheless, two of the models show a global-mean cold bias in the forecasts for the future warmer climate state, i.e. they drift towards the colder present-day climate they have been trained for. A similar result is obtained for the pre-industrial case where two out of three models show a warming.
We discuss possible remedies for these biases and analyze their spatial distribution, revealing complex warming and cooling patterns that are partly related to missing ocean-sea ice and land surface information in the training data. Despite these current limitations, our results suggest that data-driven machine learning models will provide powerful tools for climate science and transform established approaches by complementing conventional physics-based models.

Submitted 27 September, 2024; originally announced September 2024.

Comments: 14 pages, 4 figures

arXiv:2409.02891

Regional data-driven weather modeling with a global stretched-grid

Authors: Thomas Nils Nipen, Håvard Homleid Haugen, Magnus Sikora Ingstad, Even Marius Nordhagen, Aram Farhad Shafiq Salihi, Paulina Tedesco, Ivar Ambjørn Seierstad, Jørn Kristiansen, Simon Lang, Mihai Alexe, Jesper Dramsch, Baudouin Raoult, Gert Mertes, Matthew Chantry

Abstract: A data-driven model (DDM) suitable for regional weather forecasting applications is presented. The model extends the Artificial Intelligence Forecasting System by introducing a stretched-grid architecture that dedicates higher resolution over a regional area of interest and maintains a lower resolution elsewhere on the globe. The model is based on graph neural networks, which naturally affords arbitrary multi-resolution grid configurations. The model is applied to short-range weather prediction for the Nordics, producing forecasts at 2.5 km spatial and 6 h temporal resolution. The model is pre-trained on 43 years of global ERA5 data at 31 km resolution and is further refined using 3.3 years of 2.5 km resolution operational analyses from the MetCoOp Ensemble Prediction System (MEPS). The performance of the model is evaluated using surface observations from measurement stations across Norway and is compared to short-range weather forecasts from MEPS. The DDM outperforms both the control run and the ensemble mean of MEPS for 2 m temperature. The model also produces competitive precipitation and wind speed forecasts, but is shown to underestimate extreme events.

Submitted 4 September, 2024; originally announced September 2024.

arXiv:2006.13311

doi 10.1016/bs.agph.2020.08.002

70 years of machine learning in geoscience in review

Authors: Jesper Sören Dramsch

Abstract: This review gives an overview of the development of machine learning in geoscience. A thorough analysis of the co-developments of machine learning applications throughout the last 70 years relates the recent enthusiasm for machine learning to developments in geoscience. I explore the shift of kriging towards a mainstream machine learning method and the historic application of neural networks in geoscience, following the general trend of machine learning enthusiasm through the decades. Furthermore, this chapter explores the shift from mathematical fundamentals and knowledge in software development towards skills in model validation, applied statistics, and integrated subject matter expertise. The review is interspersed with code examples to complement the theoretical foundations and illustrate model validation and machine learning explainability for science. The scope of this review includes various shallow machine learning methods, e.g. Decision Trees, Random Forests, Support-Vector Machines, and Gaussian Processes, as well as, deep neural networks, including feed-forward neural networks, convolutional neural networks, recurrent neural networks and generative adversarial networks. Regarding geoscience, the review has a bias towards geophysics but aims to strike a balance with geochemistry, geostatistics, and geology, however excludes remote sensing, as this would exceed the scope.
In general, I aim to provide context for the recent enthusiasm surrounding deep learning with respect to research, hardware, and software developments that enable successful application of shallow and deep machine learning in all disciplines of Earth science.

Submitted 26 August, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 36 pages, 17 figures, book chapter

arXiv:1905.12321

doi 10.1016/j.cageo.2020.104643

Complex-valued neural networks for machine learning on non-stationary physical data

Authors: Jesper Sören Dramsch, Mikael Lüthje, Anders Nymark Christensen

Abstract: Deep learning has become an area of interest in most scientific areas, including physical sciences. Modern networks apply real-valued transformations on the data. Particularly, convolutions in convolutional neural networks discard phase information entirely. Many deterministic signals, such as seismic data or electrical signals, contain significant information in the phase of the signal. We explore complex-valued deep convolutional networks to leverage non-linear feature maps. Seismic data commonly has a lowcut filter applied, to attenuate noise from ocean waves and similar long wavelength contributions. Discarding the phase information leads to low-frequency aliasing analogous to the Nyquist-Shannon theorem for high frequencies. In non-stationary data, the phase content can stabilize training and improve the generalizability of neural networks. While it has been shown that phase content can be restored in deep neural networks, we show how including phase information in feature maps improves both training and inference from deterministic physical data. Furthermore, we show that the reduction of parameters in a complex network outperforms larger real-valued networks.

Submitted 26 November, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

Comments: 17 pages total, 15 pages, 2 pages references, paper, 11 figures, 28 networks

arXiv:1904.02254

doi 10.3997/2214-4609.201901967

Including Physics in Deep Learning -- An example from 4D seismic pressure saturation inversion

Authors: Jesper Sören Dramsch, Gustavo Corte, Hamed Amini, Colin MacBeth, Mikael Lüthje

Abstract: Geoscience data often have to rely on strong priors in the face of uncertainty. Additionally, we often try to detect or model anomalous sparse data that can appear as an outlier in machine learning models. These are classic examples of imbalanced learning. Approaching these problems can benefit from including prior information from physics models or transforming data to a beneficial domain. We show an example of including physical information in the architecture of a neural network as prior information. We go on to present noise injection at training time to successfully transfer the network from synthetic data to field data.

Submitted 3 April, 2019; originally announced April 2019.

Comments: 5 pages, 5 figures, workshop, extended abstract, EAGE 2019 Workshop Programme, European Association of Geoscientists and Engineers

arXiv:1805.08826

doi 10.3997/2214-4609.201800734

Rapid seismic domain transfer: Seismic velocity inversion and modeling using deep generative neural networks

Authors: Lukas Mosser, Wouter Kimman, Jesper Dramsch, Steve Purves, Alfredo De la Fuente, Graham Ganssle

Abstract: Traditional physics-based approaches to infer sub-surface properties such as full-waveform inversion or reflectivity inversion are time-consuming and computationally expensive. We present a deep-learning technique that eliminates the need for these computationally complex methods by posing the problem as one of domain transfer. Our solution is based on a deep convolutional generative adversarial network and dramatically reduces computation time. Training based on two different types of synthetic data produced a neural network that generates realistic velocity models when applied to a real dataset. The system's ability to generalize means it is robust against the inherent occurrence of velocity errors and artifacts in both training and test datasets.

Submitted 22 May, 2018; originally announced May 2018.

Comments: Extended abstract submitted to EAGE 2018, 5 pages, 3 figures

Showing 1–6 of 6 results for author: Dramsch, J