Search | arXiv e-print repository

Huge Ensembles Part I: Design of Ensemble Weather Forecasts using Spherical Fourier Neural Operators

Authors: Ankur Mahesh, William Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Joshua Elms, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis OBrien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, Jared Willard

Abstract: Studying low-likelihood high-impact extreme weather events in a warming world is a significant and challenging task for current ensemble forecasting systems. While these systems presently use up to 100 members, larger ensembles could enrich the sampling of internal variability. They may capture the long tails associated with climate hazards better than traditional ensemble sizes. Due to computational constraints, it is infeasible to generate huge ensembles (comprised of 1,000-10,000 members) with traditional, physics-based numerical models. In this two-part paper, we replace traditional numerical simulations with machine learning (ML) to generate hindcasts of huge ensembles. In Part I, we construct an ensemble weather forecasting system based on Spherical Fourier Neural Operators (SFNO), and we discuss important design decisions for constructing such an ensemble. The ensemble represents model uncertainty through perturbed-parameter techniques, and it represents initial condition uncertainty through bred vectors, which sample the fastest growing modes of the forecast. Using the European Centre for Medium-Range Weather Forecasts Integrated Forecasting System (IFS) as a baseline, we develop an evaluation pipeline composed of mean, spectral, and extreme diagnostics. Using large-scale, distributed SFNOs with 1.1 billion learned parameters, we achieve calibrated probabilistic forecasts. As the trajectories of the individual members diverge, the ML ensemble mean spectra degrade with lead time, consistent with physical expectations. However, the individual ensemble members' spectra stay constant with lead time. Therefore, these members simulate realistic weather states, and the ML ensemble thus passes a crucial spectral test in the literature. The IFS and ML ensembles have similar Extreme Forecast Indices, and we show that the ML extreme weather forecasts are reliable and discriminating.

Submitted 3 April, 2025; v1 submitted 6 August, 2024; originally announced August 2024.

arXiv:2408.01581

Huge Ensembles Part II: Properties of a Huge Ensemble of Hindcasts Generated with Spherical Fourier Neural Operators

Authors: Ankur Mahesh, William Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis OBrien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, Jared Willard

Abstract: In Part I, we created an ensemble based on Spherical Fourier Neural Operators. As initial condition perturbations, we used bred vectors, and as model perturbations, we used multiple checkpoints trained independently from scratch. Based on diagnostics that assess the ensemble's physical fidelity, our ensemble has comparable performance to operational weather forecasting systems. However, it requires orders of magnitude fewer computational resources. Here in Part II, we generate a huge ensemble (HENS), with 7,424 members initialized each day of summer 2023. We enumerate the technical requirements for running huge ensembles at this scale. HENS precisely samples the tails of the forecast distribution and presents a detailed sampling of internal variability. HENS has two primary applications: (1) as a large dataset with which to study the statistics and drivers of extreme weather and (2) as a weather forecasting system. For extreme climate statistics, HENS samples events 4$σ$ away from the ensemble mean. At each grid cell, HENS increases the skill of the most accurate ensemble member and enhances coverage of possible future trajectories. As a weather forecasting model, HENS issues extreme weather forecasts with better uncertainty quantification. It also reduces the probability of outlier events, in which the verification value lies outside the ensemble forecast distribution.

Submitted 3 April, 2025; v1 submitted 2 August, 2024; originally announced August 2024.

arXiv:1910.11370

A perspective on Microscopy Metadata: data provenance and quality control

Authors: Maximiliaan Huisman, Mathias Hammer, Alex Rigano, Ulrike Boehm, James J. Chambers, Nathalie Gaudreault, Alison J. North, Jaime A. Pimentel, Damir Sudar, Peter Bajcsy, Claire M. Brown, Alexander D. Corbett, Orestis Faklaris, Judith Lacoste, Alex Laude, Glyn Nelson, Roland Nitschke, David Grunwald, Caterina Strambio-De-Castillia

Abstract: The application of microscopy in biomedical research has come a long way since Antonie van Leeuwenhoek discovered unicellular organisms. Countless innovations have positioned light microscopy as a cornerstone of modern biology and a method of choice for connecting omics datasets to their biological and clinical correlates. Still, regardless of how convincing published imaging data looks, it does not always convey meaningful information about the conditions in which it was acquired, processed, and analyzed. Adequate record-keeping, reporting, and quality control are therefore essential to ensure experimental rigor and data fidelity, allow experiments to be reproducibly repeated, and promote the proper evaluation, interpretation, comparison, and re-use. To this end, microscopy images should be accompanied by complete descriptions detailing experimental procedures, biological samples, microscope hardware specifications, image acquisition parameters, and image analysis procedures, as well as metrics accounting for instrument performance and calibration. However, universal, community-accepted Microscopy Metadata standards and reporting specifications that would result in Findable Accessible Interoperable and Reproducible (FAIR) microscopy data have not yet been established. To understand this shortcoming and to propose a way forward, here we provide an overview of the nature of microscopy metadata and its importance for fostering data quality, reproducibility, scientific rigor, and sharing value in light microscopy. The proposal for tiered Microscopy Metadata Specifications that extend the OME Data Model put forth by the 4D Nucleome Initiative and by Bioimaging North America [1-3] as well as a suite of three complementary and interoperable tools are being developed to facilitate the process of image data documentation and are presented in related manuscripts [4-6].

Submitted 31 May, 2021; v1 submitted 24 October, 2019; originally announced October 2019.

Showing 1–3 of 3 results for author: North, J