-
Climate in a Bottle: Towards a Generative Foundation Model for the Kilometer-Scale Global Atmosphere
Authors:
Noah D. Brenowitz,
Tao Ge,
Akshay Subramaniam,
Aayush Gupta,
David M. Hall,
Morteza Mardani,
Arash Vahdat,
Karthik Kashinath,
Michael S. Pritchard
Abstract:
AI emulators offer a path to compressing, boosting limited ensembles, and improving the latency of interacting with petabyte-scale climate prediction data. However, prevailing auto-regressive paradigms offer limited flexibility, and are challenging to train on climate time horizons due to drifts, instabilities and component-coupling challenges. Conditionally generative models offer an appealing al…
▽ More
AI emulators offer a path to compressing, boosting limited ensembles, and improving the latency of interacting with petabyte-scale climate prediction data. However, prevailing auto-regressive paradigms offer limited flexibility, and are challenging to train on climate time horizons due to drifts, instabilities and component-coupling challenges. Conditionally generative models offer an appealing alternative. In this context we demonstrate a generative diffusion-based framework -- Climate in a Bottle (cBottle) -- for emulating global km-scale climate simulations and reanalysis on the equal-area HEALPix grid. cBottle consists of two model stages: a globally-trained coarse-resolution image generator that generates 100km (50k-pixel) fields given monthly average sea surface temperatures and solar conditioning, followed by a locally-trained 16x super-resolution stage that generates 5km (12.5M-pixel) fields; global super-resolution is made affordable using an overlapping patch-based multi-diffusion. Overall, cBottle shows promise as an emulator across a battery of climate model diagnostics, including diurnal-to-seasonal scale variability, large-scale modes of variability, tropical cyclone statistics, and trends of climate change and weather extremes. Moreover, cBottle is a step towards a foundation model, by bridging multiple data modalities (reanalysis and simulation) with corresponding utility beyond emulation to tasks such as zero-shot bias correction, climate downscaling, and channel in-filling.
△ Less
Submitted 9 May, 2025;
originally announced May 2025.
-
Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling
Authors:
Jaideep Pathak,
Yair Cohen,
Piyush Garg,
Peter Harrington,
Noah Brenowitz,
Dale Durran,
Morteza Mardani,
Arash Vahdat,
Shaoming Xu,
Karthik Kashinath,
Michael Pritchard
Abstract:
Storm-scale convection-allowing models (CAMs) are an important tool for predicting the evolution of thunderstorms and mesoscale convective systems that result in damaging extreme weather. By explicitly resolving convective dynamics within the atmosphere they afford meteorologists the nuance needed to provide outlook on hazard. Deep learning models have thus far not proven skilful at km-scale atmos…
▽ More
Storm-scale convection-allowing models (CAMs) are an important tool for predicting the evolution of thunderstorms and mesoscale convective systems that result in damaging extreme weather. By explicitly resolving convective dynamics within the atmosphere they afford meteorologists the nuance needed to provide outlook on hazard. Deep learning models have thus far not proven skilful at km-scale atmospheric simulation, despite being competitive at coarser resolution with state-of-the-art global, medium-range weather forecasting. We present a generative diffusion model called StormCast, which emulates the high-resolution rapid refresh (HRRR) model-NOAA's state-of-the-art 3km operational CAM. StormCast autoregressively predicts 99 state variables at km scale using a 1-hour time step, with dense vertical resolution in the atmospheric boundary layer, conditioned on 26 synoptic variables. We present evidence of successfully learnt km-scale dynamics including competitive 1-6 hour forecast skill for composite radar reflectivity alongside physically realistic convective cluster evolution, moist updrafts, and cold pool morphology. StormCast predictions maintain realistic power spectra for multiple predicted variables across multi-hour forecasts. Together, these results establish the potential for autoregressive ML to emulate CAMs -- opening up new km-scale frontiers for regional ML weather prediction and future climate hazard dynamical downscaling.
△ Less
Submitted 20 August, 2024;
originally announced August 2024.
-
Huge Ensembles Part I: Design of Ensemble Weather Forecasts using Spherical Fourier Neural Operators
Authors:
Ankur Mahesh,
William Collins,
Boris Bonev,
Noah Brenowitz,
Yair Cohen,
Joshua Elms,
Peter Harrington,
Karthik Kashinath,
Thorsten Kurth,
Joshua North,
Travis OBrien,
Michael Pritchard,
David Pruitt,
Mark Risser,
Shashank Subramanian,
Jared Willard
Abstract:
Studying low-likelihood high-impact extreme weather events in a warming world is a significant and challenging task for current ensemble forecasting systems. While these systems presently use up to 100 members, larger ensembles could enrich the sampling of internal variability. They may capture the long tails associated with climate hazards better than traditional ensemble sizes. Due to computatio…
▽ More
Studying low-likelihood high-impact extreme weather events in a warming world is a significant and challenging task for current ensemble forecasting systems. While these systems presently use up to 100 members, larger ensembles could enrich the sampling of internal variability. They may capture the long tails associated with climate hazards better than traditional ensemble sizes. Due to computational constraints, it is infeasible to generate huge ensembles (comprised of 1,000-10,000 members) with traditional, physics-based numerical models. In this two-part paper, we replace traditional numerical simulations with machine learning (ML) to generate hindcasts of huge ensembles. In Part I, we construct an ensemble weather forecasting system based on Spherical Fourier Neural Operators (SFNO), and we discuss important design decisions for constructing such an ensemble. The ensemble represents model uncertainty through perturbed-parameter techniques, and it represents initial condition uncertainty through bred vectors, which sample the fastest growing modes of the forecast. Using the European Centre for Medium-Range Weather Forecasts Integrated Forecasting System (IFS) as a baseline, we develop an evaluation pipeline composed of mean, spectral, and extreme diagnostics. Using large-scale, distributed SFNOs with 1.1 billion learned parameters, we achieve calibrated probabilistic forecasts. As the trajectories of the individual members diverge, the ML ensemble mean spectra degrade with lead time, consistent with physical expectations. However, the individual ensemble members' spectra stay constant with lead time. Therefore, these members simulate realistic weather states, and the ML ensemble thus passes a crucial spectral test in the literature. The IFS and ML ensembles have similar Extreme Forecast Indices, and we show that the ML extreme weather forecasts are reliable and discriminating.
△ Less
Submitted 3 April, 2025; v1 submitted 6 August, 2024;
originally announced August 2024.
-
Huge Ensembles Part II: Properties of a Huge Ensemble of Hindcasts Generated with Spherical Fourier Neural Operators
Authors:
Ankur Mahesh,
William Collins,
Boris Bonev,
Noah Brenowitz,
Yair Cohen,
Peter Harrington,
Karthik Kashinath,
Thorsten Kurth,
Joshua North,
Travis OBrien,
Michael Pritchard,
David Pruitt,
Mark Risser,
Shashank Subramanian,
Jared Willard
Abstract:
In Part I, we created an ensemble based on Spherical Fourier Neural Operators. As initial condition perturbations, we used bred vectors, and as model perturbations, we used multiple checkpoints trained independently from scratch. Based on diagnostics that assess the ensemble's physical fidelity, our ensemble has comparable performance to operational weather forecasting systems. However, it require…
▽ More
In Part I, we created an ensemble based on Spherical Fourier Neural Operators. As initial condition perturbations, we used bred vectors, and as model perturbations, we used multiple checkpoints trained independently from scratch. Based on diagnostics that assess the ensemble's physical fidelity, our ensemble has comparable performance to operational weather forecasting systems. However, it requires orders of magnitude fewer computational resources. Here in Part II, we generate a huge ensemble (HENS), with 7,424 members initialized each day of summer 2023. We enumerate the technical requirements for running huge ensembles at this scale. HENS precisely samples the tails of the forecast distribution and presents a detailed sampling of internal variability. HENS has two primary applications: (1) as a large dataset with which to study the statistics and drivers of extreme weather and (2) as a weather forecasting system. For extreme climate statistics, HENS samples events 4$σ$ away from the ensemble mean. At each grid cell, HENS increases the skill of the most accurate ensemble member and enhances coverage of possible future trajectories. As a weather forecasting model, HENS issues extreme weather forecasts with better uncertainty quantification. It also reduces the probability of outlier events, in which the verification value lies outside the ensemble forecast distribution.
△ Less
Submitted 3 April, 2025; v1 submitted 2 August, 2024;
originally announced August 2024.
-
Generative Data Assimilation of Sparse Weather Station Observations at Kilometer Scales
Authors:
Peter Manshausen,
Yair Cohen,
Peter Harrington,
Jaideep Pathak,
Mike Pritchard,
Piyush Garg,
Morteza Mardani,
Karthik Kashinath,
Simon Byrne,
Noah Brenowitz
Abstract:
Data assimilation of observational data into full atmospheric states is essential for weather forecast model initialization. Recently, methods for deep generative data assimilation have been proposed which allow for using new input data without retraining the model. They could also dramatically accelerate the costly data assimilation process used in operational regional weather models. Here, in a…
▽ More
Data assimilation of observational data into full atmospheric states is essential for weather forecast model initialization. Recently, methods for deep generative data assimilation have been proposed which allow for using new input data without retraining the model. They could also dramatically accelerate the costly data assimilation process used in operational regional weather models. Here, in a central US testbed, we demonstrate the viability of score-based data assimilation in the context of realistically complex km-scale weather. We train an unconditional diffusion model to generate snapshots of a state-of-the-art km-scale analysis product, the High Resolution Rapid Refresh. Then, using score-based data assimilation to incorporate sparse weather station data, the model produces maps of precipitation and surface winds. The generated fields display physically plausible structures, such as gust fronts, and sensitivity tests confirm learnt physics through multivariate relationships. Preliminary skill analysis shows the approach already outperforms a naive baseline of the High-Resolution Rapid Refresh system itself. By incorporating observations from 40 weather stations, 10% lower RMSEs on left-out stations are attained. Despite some lingering imperfections such as insufficiently disperse ensemble DA estimates, we find the results overall an encouraging proof of concept, and the first at km-scale. It is a ripe time to explore extensions that combine increasingly ambitious regional state generators with an increasing set of in situ, ground-based, and satellite remote sensing data streams.
△ Less
Submitted 1 April, 2025; v1 submitted 19 June, 2024;
originally announced June 2024.
-
ACE: A fast, skillful learned global atmospheric model for climate prediction
Authors:
Oliver Watt-Meyer,
Gideon Dresdner,
Jeremy McGibbon,
Spencer K. Clark,
Brian Henn,
James Duncan,
Noah D. Brenowitz,
Karthik Kashinath,
Michael S. Pritchard,
Boris Bonev,
Matthew E. Peters,
Christopher S. Bretherton
Abstract:
Existing ML-based atmospheric models are not suitable for climate prediction, which requires long-term stability and physical consistency. We present ACE (AI2 Climate Emulator), a 200M-parameter, autoregressive machine learning emulator of an existing comprehensive 100-km resolution global atmospheric model. The formulation of ACE allows evaluation of physical laws such as the conservation of mass…
▽ More
Existing ML-based atmospheric models are not suitable for climate prediction, which requires long-term stability and physical consistency. We present ACE (AI2 Climate Emulator), a 200M-parameter, autoregressive machine learning emulator of an existing comprehensive 100-km resolution global atmospheric model. The formulation of ACE allows evaluation of physical laws such as the conservation of mass and moisture. The emulator is stable for 100 years, nearly conserves column moisture without explicit constraints and faithfully reproduces the reference model's climate, outperforming a challenging baseline on over 90% of tracked variables. ACE requires nearly 100x less wall clock time and is 100x more energy efficient than the reference model using typically available resources. Without fine-tuning, ACE can stably generalize to a previously unseen historical sea surface temperature dataset.
△ Less
Submitted 6 December, 2023; v1 submitted 3 October, 2023;
originally announced October 2023.
-
Residual Corrective Diffusion Modeling for Km-scale Atmospheric Downscaling
Authors:
Morteza Mardani,
Noah Brenowitz,
Yair Cohen,
Jaideep Pathak,
Chieh-Yu Chen,
Cheng-Chin Liu,
Arash Vahdat,
Mohammad Amin Nabian,
Tao Ge,
Akshay Subramaniam,
Karthik Kashinath,
Jan Kautz,
Mike Pritchard
Abstract:
The state of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs. Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative. The model is trained to predict 2km data from a regional weather model over T…
▽ More
The state of the art for physical hazard prediction from weather and climate requires expensive km-scale numerical simulations driven by coarser resolution global inputs. Here, a generative diffusion architecture is explored for downscaling such global inputs to km-scale, as a cost-effective machine learning alternative. The model is trained to predict 2km data from a regional weather model over Taiwan, conditioned on a 25km global reanalysis. To address the large resolution ratio, different physics involved at different scales and prediction of channels beyond those in the input data, we employ a two-step approach where a UNet predicts the mean and a corrector diffusion (CorrDiff) model predicts the residual. CorrDiff exhibits encouraging skill in bulk MAE and CRPS scores. The predicted spectra and distributions from CorrDiff faithfully recover important power law relationships in the target data. Case studies of coherent weather phenomena show that CorrDiff can help sharpen wind and temperature gradients that co-locate with intense rainfall in cold front, and can help intensify typhoons and synthesize rain band structures. Calibration of model uncertainty remains challenging. The prospect of unifying methods like CorrDiff with coarser resolution global weather models implies a potential for global-to-regional multi-scale machine learning simulation.
△ Less
Submitted 11 August, 2024; v1 submitted 24 September, 2023;
originally announced September 2023.
-
Earth Virtualization Engines -- A Technical Perspective
Authors:
Torsten Hoefler,
Bjorn Stevens,
Andreas F. Prein,
Johanna Baehr,
Thomas Schulthess,
Thomas F. Stocker,
John Taylor,
Daniel Klocke,
Pekka Manninen,
Piers M. Forster,
Tobias Kölling,
Nicolas Gruber,
Hartwig Anzt,
Claudia Frauen,
Florian Ziemen,
Milan Klöwer,
Karthik Kashinath,
Christoph Schär,
Oliver Fuhrer,
Bryan N. Lawrence
Abstract:
Participants of the Berlin Summit on Earth Virtualization Engines (EVEs) discussed ideas and concepts to improve our ability to cope with climate change. EVEs aim to provide interactive and accessible climate simulations and data for a wide range of users. They combine high-resolution physics-based models with machine learning techniques to improve the fidelity, efficiency, and interpretability of…
▽ More
Participants of the Berlin Summit on Earth Virtualization Engines (EVEs) discussed ideas and concepts to improve our ability to cope with climate change. EVEs aim to provide interactive and accessible climate simulations and data for a wide range of users. They combine high-resolution physics-based models with machine learning techniques to improve the fidelity, efficiency, and interpretability of climate projections. At their core, EVEs offer a federated data layer that enables simple and fast access to exabyte-sized climate data through simple interfaces. In this article, we summarize the technical challenges and opportunities for developing EVEs, and argue that they are essential for addressing the consequences of climate change.
△ Less
Submitted 16 September, 2023;
originally announced September 2023.
-
Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere
Authors:
Boris Bonev,
Thorsten Kurth,
Christian Hundt,
Jaideep Pathak,
Maximilian Baust,
Karthik Kashinath,
Anima Anandkumar
Abstract:
Fourier Neural Operators (FNOs) have proven to be an efficient and effective method for resolution-independent operator learning in a broad variety of application areas across scientific machine learning. A key reason for their success is their ability to accurately model long-range dependencies in spatio-temporal data by learning global convolutions in a computationally efficient manner. To this…
▽ More
Fourier Neural Operators (FNOs) have proven to be an efficient and effective method for resolution-independent operator learning in a broad variety of application areas across scientific machine learning. A key reason for their success is their ability to accurately model long-range dependencies in spatio-temporal data by learning global convolutions in a computationally efficient manner. To this end, FNOs rely on the discrete Fourier transform (DFT), however, DFTs cause visual and spectral artifacts as well as pronounced dissipation when learning operators in spherical coordinates since they incorrectly assume a flat geometry. To overcome this limitation, we generalize FNOs on the sphere, introducing Spherical FNOs (SFNOs) for learning operators on spherical geometries. We apply SFNOs to forecasting atmospheric dynamics, and demonstrate stable auto\-regressive rollouts for a year of simulated time (1,460 steps), while retaining physically plausible dynamics. The SFNO has important implications for machine learning-based simulation of climate dynamics that could eventually help accelerate our response to climate change.
△ Less
Submitted 6 June, 2023;
originally announced June 2023.
-
Unsupervised Discovery of Extreme Weather Events Using Universal Representations of Emergent Organization
Authors:
Adam Rupe,
Karthik Kashinath,
Nalini Kumar,
James P. Crutchfield
Abstract:
Spontaneous self-organization is ubiquitous in systems far from thermodynamic equilibrium. While organized structures that emerge dominate transport properties, universal representations that identify and describe these key objects remain elusive. Here, we introduce a theoretically-grounded framework for describing emergent organization that, via data-driven algorithms, is constructive in practice…
▽ More
Spontaneous self-organization is ubiquitous in systems far from thermodynamic equilibrium. While organized structures that emerge dominate transport properties, universal representations that identify and describe these key objects remain elusive. Here, we introduce a theoretically-grounded framework for describing emergent organization that, via data-driven algorithms, is constructive in practice. Its building blocks are spacetime lightcones that embody how information propagates across a system through local interactions. We show that predictive equivalence classes of lightcones -- local causal states -- capture organized behaviors and coherent structures in complex spatiotemporal systems. Employing an unsupervised physics-informed machine learning algorithm and a high-performance computing implementation, we demonstrate automatically discovering coherent structures in two real world domain science problems. We show that local causal states identify vortices and track their power-law decay behavior in two-dimensional fluid turbulence. We then show how to detect and track familiar extreme weather events -- hurricanes and atmospheric rivers -- and discover other novel coherent structures associated with precipitation extremes in high-resolution climate data at the grid-cell level.
△ Less
Submitted 28 September, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
DL-Corrector-Remapper: A grid-free bias-correction deep learning methodology for data-driven high-resolution global weather forecasting
Authors:
Tao Ge,
Jaideep Pathak,
Akshay Subramaniam,
Karthik Kashinath
Abstract:
Data-driven models, such as FourCastNet (FCN), have shown exemplary performance in high-resolution global weather forecasting. This performance, however, is based on supervision on mesh-gridded weather data without the utilization of raw climate observational data, the gold standard ground truth. In this work we develop a methodology to correct, remap, and fine-tune gridded uniform forecasts of FC…
▽ More
Data-driven models, such as FourCastNet (FCN), have shown exemplary performance in high-resolution global weather forecasting. This performance, however, is based on supervision on mesh-gridded weather data without the utilization of raw climate observational data, the gold standard ground truth. In this work we develop a methodology to correct, remap, and fine-tune gridded uniform forecasts of FCN so it can be directly compared against observational ground truth, which is sparse and non-uniform in space and time. This is akin to bias correction and post-processing of numerical weather prediction (NWP), a routine operation at meteorological and weather forecasting centers across the globe. The Adaptive Fourier Neural Operator (AFNO) architecture is used as the backbone to learn continuous representations of the atmosphere. The spatially and temporally non-uniform output is evaluated by the non-uniform discrete inverse Fourier transform (NUIDFT) given the output query locations. We call this network the Deep-Learning-Corrector-Remapper (DLCR). The improvement in DLCR's performance against the gold standard ground truth over the baseline's performance shows its potential to correct, remap, and fine-tune the mesh-gridded forecasts under the supervision of observations.
△ Less
Submitted 21 October, 2022;
originally announced October 2022.
-
FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators
Authors:
Thorsten Kurth,
Shashank Subramanian,
Peter Harrington,
Jaideep Pathak,
Morteza Mardani,
David Hall,
Andrea Miele,
Karthik Kashinath,
Animashree Anandkumar
Abstract:
Extreme weather amplified by climate change is causing increasingly devastating impacts across the globe. The current use of physics-based numerical weather prediction (NWP) limits accuracy due to high computational cost and strict time-to-solution limits. We report that a data-driven deep learning Earth system emulator, FourCastNet, can predict global weather and generate medium-range forecasts f…
▽ More
Extreme weather amplified by climate change is causing increasingly devastating impacts across the globe. The current use of physics-based numerical weather prediction (NWP) limits accuracy due to high computational cost and strict time-to-solution limits. We report that a data-driven deep learning Earth system emulator, FourCastNet, can predict global weather and generate medium-range forecasts five orders-of-magnitude faster than NWP while approaching state-of-the-art accuracy. FourCast-Net is optimized and scales efficiently on three supercomputing systems: Selene, Perlmutter, and JUWELS Booster up to 3,808 NVIDIA A100 GPUs, attaining 140.8 petaFLOPS in mixed precision (11.9%of peak at that scale). The time-to-solution for training FourCastNet measured on JUWELS Booster on 3,072GPUs is 67.4minutes, resulting in an 80,000times faster time-to-solution relative to state-of-the-art NWP, in inference. FourCastNet produces accurate instantaneous weather predictions for a week in advance, enables enormous ensembles that better capture weather extremes, and supports higher global forecast resolutions.
△ Less
Submitted 8 August, 2022;
originally announced August 2022.
-
FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators
Authors:
Jaideep Pathak,
Shashank Subramanian,
Peter Harrington,
Sanjeev Raja,
Ashesh Chattopadhyay,
Morteza Mardani,
Thorsten Kurth,
David Hall,
Zongyi Li,
Kamyar Azizzadenesheli,
Pedram Hassanzadeh,
Karthik Kashinath,
Animashree Anandkumar
Abstract:
FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short to medium-range global predictions at $0.25^{\circ}$ resolution. FourCastNet accurately forecasts high-resolution, fast-timescale variables such as the surface wind speed, precipitation, and atmospheric water vapor. It has important implications for planning win…
▽ More
FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short to medium-range global predictions at $0.25^{\circ}$ resolution. FourCastNet accurately forecasts high-resolution, fast-timescale variables such as the surface wind speed, precipitation, and atmospheric water vapor. It has important implications for planning wind energy resources, predicting extreme weather events such as tropical cyclones, extra-tropical cyclones, and atmospheric rivers. FourCastNet matches the forecasting accuracy of the ECMWF Integrated Forecasting System (IFS), a state-of-the-art Numerical Weather Prediction (NWP) model, at short lead times for large-scale variables, while outperforming IFS for variables with complex fine-scale structure, including precipitation. FourCastNet generates a week-long forecast in less than 2 seconds, orders of magnitude faster than IFS. The speed of FourCastNet enables the creation of rapid and inexpensive large-ensemble forecasts with thousands of ensemble-members for improving probabilistic forecasting. We discuss how data-driven deep learning models such as FourCastNet are a valuable addition to the meteorology toolkit to aid and augment NWP models.
△ Less
Submitted 22 February, 2022;
originally announced February 2022.
-
Towards physically consistent data-driven weather forecasting: Integrating data assimilation with equivariance-preserving deep spatial transformers
Authors:
Ashesh Chattopadhyay,
Mustafa Mustafa,
Pedram Hassanzadeh,
Eviatar Bach,
Karthik Kashinath
Abstract:
There is growing interest in data-driven weather prediction (DDWP), for example using convolutional neural networks such as U-NETs that are trained on data from models or reanalysis. Here, we propose 3 components to integrate with commonly used DDWP models in order to improve their physical consistency and forecast accuracy. These components are 1) a deep spatial transformer added to the latent sp…
▽ More
There is growing interest in data-driven weather prediction (DDWP), for example using convolutional neural networks such as U-NETs that are trained on data from models or reanalysis. Here, we propose 3 components to integrate with commonly used DDWP models in order to improve their physical consistency and forecast accuracy. These components are 1) a deep spatial transformer added to the latent space of the U-NETs to preserve a property called equivariance, which is related to correctly capturing rotations and scalings of features in spatio-temporal data, 2) a data-assimilation (DA) algorithm to ingest noisy observations and improve the initial conditions for next forecasts, and 3) a multi-time-step algorithm, which combines forecasts from DDWP models with different time steps through DA, improving the accuracy of forecasts at short intervals. To show the benefit/feasibility of each component, we use geopotential height at 500~hPa (Z500) from ERA5 reanalysis and examine the short-term forecast accuracy of specific setups of the DDWP framework. Results show that the equivariance-preserving networks (U-STNs) clearly outperform the U-NETs, for example improving the forecast skill by $45\%$. Using a sigma-point ensemble Kalman (SPEnKF) algorithm for DA and U-STN as the forward model, we show that stable, accurate DA cycles are achieved even with high observation noise. The DDWP+DA framework substantially benefits from large ($O(1000)$) ensembles that are inexpensively generated with the data-driven forward model in each DA cycle. The multi-time-step DDWP+DA framework also shows promises, e.g., it reduces the average error by factors of 2-3.
△ Less
Submitted 16 March, 2021;
originally announced March 2021.
-
Using Machine Learning to Augment Coarse-Grid Computational Fluid Dynamics Simulations
Authors:
Jaideep Pathak,
Mustafa Mustafa,
Karthik Kashinath,
Emmanuel Motheau,
Thorsten Kurth,
Marcus Day
Abstract:
Simulation of turbulent flows at high Reynolds number is a computationally challenging task relevant to a large number of engineering and scientific applications in diverse fields such as climate science, aerodynamics, and combustion. Turbulent flows are typically modeled by the Navier-Stokes equations. Direct Numerical Simulation (DNS) of the Navier-Stokes equations with sufficient numerical reso…
▽ More
Simulation of turbulent flows at high Reynolds number is a computationally challenging task relevant to a large number of engineering and scientific applications in diverse fields such as climate science, aerodynamics, and combustion. Turbulent flows are typically modeled by the Navier-Stokes equations. Direct Numerical Simulation (DNS) of the Navier-Stokes equations with sufficient numerical resolution to capture all the relevant scales of the turbulent motions can be prohibitively expensive. Simulation at lower-resolution on a coarse-grid introduces significant errors. We introduce a machine learning (ML) technique based on a deep neural network architecture that corrects the numerical errors induced by a coarse-grid simulation of turbulent flows at high-Reynolds numbers, while simultaneously recovering an estimate of the high-resolution fields. Our proposed simulation strategy is a hybrid ML-PDE solver that is capable of obtaining a meaningful high-resolution solution trajectory while solving the system PDE at a lower resolution. The approach has the potential to dramatically reduce the expense of turbulent flow simulations. As a proof-of-concept, we demonstrate our ML-PDE strategy on a two-dimensional turbulent (Rayleigh Number $Ra=10^9$) Rayleigh-Bénard Convection (RBC) problem.
△ Less
Submitted 3 October, 2020; v1 submitted 30 September, 2020;
originally announced October 2020.
-
MeshfreeFlowNet: A Physics-Constrained Deep Continuous Space-Time Super-Resolution Framework
Authors:
Chiyu Max Jiang,
Soheil Esmaeilzadeh,
Kamyar Azizzadenesheli,
Karthik Kashinath,
Mustafa Mustafa,
Hamdi A. Tchelepi,
Philip Marcus,
Prabhat,
Anima Anandkumar
Abstract:
We propose MeshfreeFlowNet, a novel deep learning-based super-resolution framework to generate continuous (grid-free) spatio-temporal solutions from the low-resolution inputs. While being computationally efficient, MeshfreeFlowNet accurately recovers the fine-scale quantities of interest. MeshfreeFlowNet allows for: (i) the output to be sampled at all spatio-temporal resolutions, (ii) a set of Par…
▽ More
We propose MeshfreeFlowNet, a novel deep learning-based super-resolution framework to generate continuous (grid-free) spatio-temporal solutions from the low-resolution inputs. While being computationally efficient, MeshfreeFlowNet accurately recovers the fine-scale quantities of interest. MeshfreeFlowNet allows for: (i) the output to be sampled at all spatio-temporal resolutions, (ii) a set of Partial Differential Equation (PDE) constraints to be imposed, and (iii) training on fixed-size inputs on arbitrarily sized spatio-temporal domains owing to its fully convolutional encoder. We empirically study the performance of MeshfreeFlowNet on the task of super-resolution of turbulent flows in the Rayleigh-Benard convection problem. Across a diverse set of evaluation metrics, we show that MeshfreeFlowNet significantly outperforms existing baselines. Furthermore, we provide a large scale implementation of MeshfreeFlowNet and show that it efficiently scales across large clusters, achieving 96.80% scaling efficiency on up to 128 GPUs and a training time of less than 4 minutes.
△ Less
Submitted 21 August, 2020; v1 submitted 1 May, 2020;
originally announced May 2020.
-
Towards Physics-informed Deep Learning for Turbulent Flow Prediction
Authors:
Rui Wang,
Karthik Kashinath,
Mustafa Mustafa,
Adrian Albert,
Rose Yu
Abstract:
While deep learning has shown tremendous success in a wide range of domains, it remains a grand challenge to incorporate physical principles in a systematic manner to the design, training, and inference of such models. In this paper, we aim to predict turbulent flow by learning its highly nonlinear dynamics from spatiotemporal velocity fields of large-scale fluid flow simulations of relevance to t…
▽ More
While deep learning has shown tremendous success in a wide range of domains, it remains a grand challenge to incorporate physical principles in a systematic manner to the design, training, and inference of such models. In this paper, we aim to predict turbulent flow by learning its highly nonlinear dynamics from spatiotemporal velocity fields of large-scale fluid flow simulations of relevance to turbulence modeling and climate modeling. We adopt a hybrid approach by marrying two well-established turbulent flow simulation techniques with deep learning. Specifically, we introduce trainable spectral filters in a coupled model of Reynolds-averaged Navier-Stokes (RANS) and Large Eddy Simulation (LES), followed by a specialized U-net for prediction. Our approach, which we call turbulent-Flow Net (TF-Net), is grounded in a principled physics model, yet offers the flexibility of learned representations. We compare our model, TF-Net, with state-of-the-art baselines and observe significant reductions in error for predictions 60 frames ahead. Most importantly, our method predicts physical fields that obey desirable physical characteristics, such as conservation of mass, whilst faithfully emulating the turbulent kinetic energy field and spectrum, which are critical for accurate prediction of turbulent flows.
△ Less
Submitted 13 June, 2020; v1 submitted 19 November, 2019;
originally announced November 2019.
-
DisCo: Physics-Based Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems
Authors:
Adam Rupe,
Nalini Kumar,
Vladislav Epifanov,
Karthik Kashinath,
Oleksandr Pavlyk,
Frank Schlimbach,
Mostofa Patwary,
Sergey Maidanov,
Victor Lee,
Prabhat,
James P. Crutchfield
Abstract:
Extracting actionable insight from complex unlabeled scientific data is an open challenge and key to unlocking data-driven discovery in science. Complementary and alternative to supervised machine learning approaches, unsupervised physics-based methods based on behavior-driven theories hold great promise. Due to computational limitations, practical application on real-world domain science problems…
▽ More
Extracting actionable insight from complex unlabeled scientific data is an open challenge and key to unlocking data-driven discovery in science. Complementary and alternative to supervised machine learning approaches, unsupervised physics-based methods based on behavior-driven theories hold great promise. Due to computational limitations, practical application on real-world domain science problems has lagged far behind theoretical development. We present our first step towards bridging this divide - DisCo - a high-performance distributed workflow for the behavior-driven local causal state theory. DisCo provides a scalable unsupervised physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by the latent local causal state variables. Complex spatiotemporal systems are generally highly structured and organize around a lower-dimensional skeleton of coherent structures, and in several firsts we demonstrate the efficacy of DisCo in capturing such structures from observational and simulated scientific data. To the best of our knowledge, DisCo is also the first application software developed entirely in Python to scale to over 1000 machine nodes, providing good performance along with ensuring domain scientists' productivity. We developed scalable, performant methods optimized for Intel many-core processors that will be upstreamed to open-source Python library packages. Our capstone experiment, using newly developed DisCo workflow and libraries, performs unsupervised spacetime segmentation analysis of CAM5.1 climate simulation data, processing an unprecedented 89.5 TB in 6.6 minutes end-to-end using 1024 Intel Haswell nodes on the Cori supercomputer obtaining 91% weak-scaling and 64% strong-scaling efficiency.
△ Less
Submitted 25 September, 2019;
originally announced September 2019.
-
Towards Unsupervised Segmentation of Extreme Weather Events
Authors:
Adam Rupe,
Karthik Kashinath,
Nalini Kumar,
Victor Lee,
Prabhat,
James P. Crutchfield
Abstract:
Extreme weather is one of the main mechanisms through which climate change will directly impact human society. Coping with such change as a global community requires markedly improved understanding of how global warming drives extreme weather events. While alternative climate scenarios can be simulated using sophisticated models, identifying extreme weather events in these simulations requires aut…
▽ More
Extreme weather is one of the main mechanisms through which climate change will directly impact human society. Coping with such change as a global community requires markedly improved understanding of how global warming drives extreme weather events. While alternative climate scenarios can be simulated using sophisticated models, identifying extreme weather events in these simulations requires automation due to the vast amounts of complex high-dimensional data produced. Atmospheric dynamics, and hydrodynamic flows more generally, are highly structured and largely organize around a lower dimensional skeleton of coherent structures. Indeed, extreme weather events are a special case of more general hydrodynamic coherent structures. We present a scalable physics-based representation learning method that decomposes spatiotemporal systems into their structurally relevant components, which are captured by latent variables known as local causal states. For complex fluid flows we show our method is capable of capturing known coherent structures, and with promising segmentation results on CAM5.1 water vapor data we outline the path to extreme weather identification from unlabeled climate model simulation data.
△ Less
Submitted 16 September, 2019;
originally announced September 2019.
-
Enforcing Statistical Constraints in Generative Adversarial Networks for Modeling Chaotic Dynamical Systems
Authors:
Jin-Long Wu,
Karthik Kashinath,
Adrian Albert,
Dragos Chirila,
Prabhat,
Heng Xiao
Abstract:
Simulating complex physical systems often involves solving partial differential equations (PDEs) with some closures due to the presence of multi-scale physics that cannot be fully resolved. Therefore, reliable and accurate closure models for unresolved physics remains an important requirement for many computational physics problems, e.g., turbulence simulation. Recently, several researchers have a…
▽ More
Simulating complex physical systems often involves solving partial differential equations (PDEs) with some closures due to the presence of multi-scale physics that cannot be fully resolved. Therefore, reliable and accurate closure models for unresolved physics remains an important requirement for many computational physics problems, e.g., turbulence simulation. Recently, several researchers have adopted generative adversarial networks (GANs), a novel paradigm of training machine learning models, to generate solutions of PDEs-governed complex systems without having to numerically solve these PDEs. However, GANs are known to be difficult in training and likely to converge to local minima, where the generated samples do not capture the true statistics of the training data. In this work, we present a statistical constrained generative adversarial network by enforcing constraints of covariance from the training data, which results in an improved machine-learning-based emulator to capture the statistics of the training data generated by solving fully resolved PDEs. We show that such a statistical regularization leads to better performance compared to standard GANs, measured by (1) the constrained model's ability to more faithfully emulate certain physical properties of the system and (2) the significantly reduced (by up to 80%) training time to reach the solution. We exemplify this approach on the Rayleigh-Benard convection, a turbulent flow system that is an idealized model of the Earth's atmosphere. With the growth of high-fidelity simulation databases of physical systems, this work suggests great potential for being an alternative to the explicit modeling of closures or parameterizations for unresolved physics, which are known to be a major source of uncertainty in simulating multi-scale physical systems, e.g., turbulence or Earth's climate.
△ Less
Submitted 13 May, 2019;
originally announced May 2019.
-
Testing the Reliability of Interpretable Neural Networks in Geoscience Using the Madden-Julian Oscillation
Authors:
Benjamin A. Toms,
Karthik Kashinath,
Prabhat,
Da Yang
Abstract:
We test the reliability of two neural network interpretation techniques, backward optimization and layerwise relevance propagation, within geoscientific applications by applying them to a commonly studied geophysical phenomenon, the Madden-Julian Oscillation. The Madden-Julian Oscillation is a multi-scale pattern within the tropical atmosphere that has been extensively studied over the past decade…
▽ More
We test the reliability of two neural network interpretation techniques, backward optimization and layerwise relevance propagation, within geoscientific applications by applying them to a commonly studied geophysical phenomenon, the Madden-Julian Oscillation. The Madden-Julian Oscillation is a multi-scale pattern within the tropical atmosphere that has been extensively studied over the past decades, which makes it an ideal test case to ensure the interpretability methods can recover the current state of knowledge regarding its spatial structure. The neural networks can, indeed, reproduce the current state of knowledge and can also provide new insights into the seasonality of the Madden-Julian Oscillation and its relationships with atmospheric state variables.
The neural network identifies the phase of the Madden-Julian Oscillation twice as accurately as linear regression, which means that nonlinearities used by the neural network are important to the structure of the Madden-Julian Oscillation. Interpretations of the neural network show that it accurately captures the spatial structures of the Madden-Julian Oscillation, suggest that the nonlinearities of the Madden-Julian Oscillation are manifested through the uniqueness of each event, and offer physically meaningful insights into its relationship with atmospheric state variables. We also use the interpretations to identify the seasonality of the Madden-Julian Oscillation, and find that the conventionally defined extended seasons should be shifted later by one month. More generally, this study suggests that neural networks can be reliably interpreted for geoscientific applications and may there
△ Less
Submitted 27 May, 2020; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Spherical CNNs on Unstructured Grids
Authors:
Chiyu "Max" Jiang,
Jingwei Huang,
Karthik Kashinath,
Prabhat,
Philip Marcus,
Matthias Niessner
Abstract:
We present an efficient convolution kernel for Convolutional Neural Networks (CNNs) on unstructured grids using parameterized differential operators while focusing on spherical signals such as panorama images or planetary signals. To this end, we replace conventional convolution kernels with linear combinations of differential operators that are weighted by learnable parameters. Differential opera…
▽ More
We present an efficient convolution kernel for Convolutional Neural Networks (CNNs) on unstructured grids using parameterized differential operators while focusing on spherical signals such as panorama images or planetary signals. To this end, we replace conventional convolution kernels with linear combinations of differential operators that are weighted by learnable parameters. Differential operators can be efficiently estimated on unstructured grids using one-ring neighbors, and learnable parameters can be optimized through standard back-propagation. As a result, we obtain extremely efficient neural networks that match or outperform state-of-the-art network architectures in terms of performance but with a significantly lower number of network parameters. We evaluate our algorithm in an extensive series of experiments on a variety of computer vision and climate science tasks, including shape classification, climate pattern segmentation, and omnidirectional image semantic segmentation. Overall, we present (1) a novel CNN approach on unstructured grids using parameterized differential operators for spherical signals, and (2) we show that our unique kernel parameterization allows our model to achieve the same or higher accuracy with significantly fewer network parameters.
△ Less
Submitted 7 January, 2019;
originally announced January 2019.
-
A Physics-Based Approach to Unsupervised Discovery of Coherent Structures in Spatiotemporal Systems
Authors:
A. Rupe,
J. P. Crutchfield,
K. Kashinath,
Prabhat
Abstract:
Given that observational and numerical climate data are being produced at ever more prodigious rates, increasingly sophisticated and automated analysis techniques have become essential. Deep learning is quickly becoming a standard approach for such analyses and, while great progress is being made, major challenges remain. Unlike commercial applications in which deep learning has led to surprising…
▽ More
Given that observational and numerical climate data are being produced at ever more prodigious rates, increasingly sophisticated and automated analysis techniques have become essential. Deep learning is quickly becoming a standard approach for such analyses and, while great progress is being made, major challenges remain. Unlike commercial applications in which deep learning has led to surprising successes, scientific data is highly complex and typically unlabeled. Moreover, interpretability and detecting new mechanisms are key to scientific discovery. To enhance discovery we present a complementary physics-based, data-driven approach that exploits the causal nature of spatiotemporal data sets generated by local dynamics (e.g. hydrodynamic flows). We illustrate how novel patterns and coherent structures can be discovered in cellular automata and outline the path from them to climate data.
△ Less
Submitted 10 September, 2017;
originally announced September 2017.