Showing 1–23 of 23 results for author: Kurth, T

  1. arXiv:2505.11157  [pdf, other]

    cs.LG cs.AI

    Attention on the Sphere

    Authors: Boris Bonev, Max Rietmann, Andrea Paris, Alberto Carpentieri, Thorsten Kurth

    Abstract: We introduce a generalized attention mechanism for spherical domains, enabling Transformer architectures to natively process data defined on the two-dimensional sphere - a critical need in fields such as atmospheric physics, cosmology, and robotics, where preserving spherical symmetries and topology is essential for physical accuracy. By integrating numerical quadrature weights into the attention…

    Submitted 16 May, 2025; originally announced May 2025.

  2. arXiv:2410.18904  [pdf, other]

    physics.ao-ph cs.LG

    Modulated Adaptive Fourier Neural Operators for Temporal Interpolation of Weather Forecasts

    Authors: Jussi Leinonen, Boris Bonev, Thorsten Kurth, Yair Cohen

    Abstract: Weather and climate data are often available at limited temporal resolution, either due to storage limitations, or in the case of weather forecast models based on deep learning, their inherently long time steps. The coarse temporal resolution makes it difficult to capture rapidly evolving weather events. To address this limitation, we introduce an interpolation model that reconstructs the atmosphe…

    Submitted 24 October, 2024; originally announced October 2024.

    Comments: 8 pages, 5 figures

  3. arXiv:2409.01712  [pdf, other]

    q-bio.GN cs.AR cs.LG cs.MS cs.PF

    Toward Capturing Genetic Epistasis From Multivariate Genome-Wide Association Studies Using Mixed-Precision Kernel Ridge Regression

    Authors: Hatem Ltaief, Rabab Alomairy, Qinglei Cao, Jie Ren, Lotfi Slim, Thorsten Kurth, Benedikt Dorschner, Salim Bougouffa, Rached Abdelkhalak, David E. Keyes

    Abstract: We exploit the widening margin in tensor-core performance between [FP64/FP32/FP16/INT8,FP64/FP32/FP16/FP8/INT8] on NVIDIA [Ampere,Hopper] GPUs to boost the performance of output accuracy-preserving mixed-precision computation of Genome-Wide Association Studies (GWAS) of 305K patients from the UK BioBank, the largest-ever GWAS cohort studied for genetic epistasis using a multivariate approach. Tile…

    Submitted 3 September, 2024; originally announced September 2024.

  4. arXiv:2408.03100  [pdf, other]

    physics.ao-ph cs.LG

    Huge Ensembles Part I: Design of Ensemble Weather Forecasts using Spherical Fourier Neural Operators

    Authors: Ankur Mahesh, William Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Joshua Elms, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis OBrien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, Jared Willard

    Abstract: Studying low-likelihood high-impact extreme weather events in a warming world is a significant and challenging task for current ensemble forecasting systems. While these systems presently use up to 100 members, larger ensembles could enrich the sampling of internal variability. They may capture the long tails associated with climate hazards better than traditional ensemble sizes. Due to computatio…

    Submitted 3 April, 2025; v1 submitted 6 August, 2024; originally announced August 2024.

  5. arXiv:2408.01581  [pdf, other]

    cs.LG physics.ao-ph

    Huge Ensembles Part II: Properties of a Huge Ensemble of Hindcasts Generated with Spherical Fourier Neural Operators

    Authors: Ankur Mahesh, William Collins, Boris Bonev, Noah Brenowitz, Yair Cohen, Peter Harrington, Karthik Kashinath, Thorsten Kurth, Joshua North, Travis OBrien, Michael Pritchard, David Pruitt, Mark Risser, Shashank Subramanian, Jared Willard

    Abstract: In Part I, we created an ensemble based on Spherical Fourier Neural Operators. As initial condition perturbations, we used bred vectors, and as model perturbations, we used multiple checkpoints trained independently from scratch. Based on diagnostics that assess the ensemble's physical fidelity, our ensemble has comparable performance to operational weather forecasting systems. However, it require…

    Submitted 3 April, 2025; v1 submitted 2 August, 2024; originally announced August 2024.

  6. arXiv:2406.08632  [pdf, other]

    physics.ao-ph cs.LG

    Coupled Ocean-Atmosphere Dynamics in a Machine Learning Earth System Model

    Authors: Chenggong Wang, Michael S. Pritchard, Noah Brenowitz, Yair Cohen, Boris Bonev, Thorsten Kurth, Dale Durran, Jaideep Pathak

    Abstract: Seasonal climate forecasts are socioeconomically important for managing the impacts of extreme weather events and for planning in sectors like agriculture and energy. Climate predictability on seasonal timescales is tied to boundary effects of the ocean on the atmosphere and coupled interactions in the ocean-atmosphere system. We present the Ocean-linked-atmosphere (Ola) model, a high-resolution (…

    Submitted 12 June, 2024; originally announced June 2024.

  7. arXiv:2402.16845  [pdf, other]

    cs.LG cs.AI math.NA

    Neural Operators with Localized Integral and Differential Kernels

    Authors: Miguel Liu-Schiaffini, Julius Berner, Boris Bonev, Thorsten Kurth, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: Neural operators learn mappings between function spaces, which is practical for learning solution operators of PDEs and other scientific modeling applications. Among them, the Fourier neural operator (FNO) is a popular architecture that performs global convolutions in the Fourier space. However, such global operations are often prone to over-smoothing and may fail to capture local details. In cont…

    Submitted 8 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted at 2024 International Conference on Machine Learning

  8. arXiv:2401.15305  [pdf, other]

    physics.ao-ph cs.LG

    A Practical Probabilistic Benchmark for AI Weather Models

    Authors: Noah D. Brenowitz, Yair Cohen, Jaideep Pathak, Ankur Mahesh, Boris Bonev, Thorsten Kurth, Dale R. Durran, Peter Harrington, Michael S. Pritchard

    Abstract: Since the weather is chaotic, forecasts aim to predict the distribution of future states rather than make a single prediction. Recently, multiple data driven weather models have emerged claiming breakthroughs in skill. However, these have mostly been benchmarked using deterministic skill scores, and little is known about their probabilistic skill. Unfortunately, it is hard to fairly compare AI wea…

    Submitted 12 November, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

    Comments: 15 pages, 5 figures

    Journal ref: Geophysical Research Letters, 52, e2024GL113656 (2025)

  9. arXiv:2311.06253  [pdf, other]

    physics.ao-ph cs.LG

    Advancing Parsimonious Deep Learning Weather Prediction using the HEALPix Mesh

    Authors: Matthias Karlbauer, Nathaniel Cresswell-Clay, Dale R. Durran, Raul A. Moreno, Thorsten Kurth, Boris Bonev, Noah Brenowitz, Martin V. Butz

    Abstract: We present a parsimonious deep learning weather prediction model to forecast seven atmospheric variables with 3-h time resolution for up to one-year lead times on a 110-km global mesh using the Hierarchical Equal Area isoLatitude Pixelization (HEALPix). In comparison to state-of-the-art (SOTA) machine learning (ML) weather forecast models, such as Pangu-Weather and GraphCast, our DLWP-HPX model us…

    Submitted 19 June, 2024; v1 submitted 11 September, 2023; originally announced November 2023.

    Comments: Submitted to Journal of Advances in Modeling Earth Systems (JAMES)

    Journal ref: Journal of Advances in Modeling Earth Systems, 16, e2023MS004021

  10. arXiv:2306.03838  [pdf, other]

    cs.LG math.NA physics.ao-ph physics.comp-ph

    Spherical Fourier Neural Operators: Learning Stable Dynamics on the Sphere

    Authors: Boris Bonev, Thorsten Kurth, Christian Hundt, Jaideep Pathak, Maximilian Baust, Karthik Kashinath, Anima Anandkumar

    Abstract: Fourier Neural Operators (FNOs) have proven to be an efficient and effective method for resolution-independent operator learning in a broad variety of application areas across scientific machine learning. A key reason for their success is their ability to accurately model long-range dependencies in spatio-temporal data by learning global convolutions in a computationally efficient manner. To this…

    Submitted 6 June, 2023; originally announced June 2023.

  11. arXiv:2208.05419  [pdf, ps, other]

    physics.ao-ph cs.AI cs.CV cs.LG cs.PF

    FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators

    Authors: Thorsten Kurth, Shashank Subramanian, Peter Harrington, Jaideep Pathak, Morteza Mardani, David Hall, Andrea Miele, Karthik Kashinath, Animashree Anandkumar

    Abstract: Extreme weather amplified by climate change is causing increasingly devastating impacts across the globe. The current use of physics-based numerical weather prediction (NWP) limits accuracy due to high computational cost and strict time-to-solution limits. We report that a data-driven deep learning Earth system emulator, FourCastNet, can predict global weather and generate medium-range forecasts f…

    Submitted 8 August, 2022; originally announced August 2022.

  12. arXiv:2202.11214  [pdf, other]

    physics.ao-ph cs.LG

    FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators

    Authors: Jaideep Pathak, Shashank Subramanian, Peter Harrington, Sanjeev Raja, Ashesh Chattopadhyay, Morteza Mardani, Thorsten Kurth, David Hall, Zongyi Li, Kamyar Azizzadenesheli, Pedram Hassanzadeh, Karthik Kashinath, Animashree Anandkumar

    Abstract: FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short to medium-range global predictions at $0.25^{\circ}$ resolution. FourCastNet accurately forecasts high-resolution, fast-timescale variables such as the surface wind speed, precipitation, and atmospheric water vapor. It has important implications for planning win…

    Submitted 22 February, 2022; originally announced February 2022.

  13. arXiv:2110.11466  [pdf, other]

    cs.LG cs.DC

    MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems

    Authors: Steven Farrell, Murali Emani, Jacob Balma, Lukas Drescher, Aleksandr Drozd, Andreas Fink, Geoffrey Fox, David Kanter, Thorsten Kurth, Peter Mattson, Dawei Mu, Amit Ruhela, Kento Sato, Koichi Shirahata, Tsuguchika Tabaru, Aristeidis Tsaris, Jan Balewski, Ben Cumming, Takumi Danjo, Jens Domke, Takaaki Fukai, Naoto Fukumoto, Tatsuya Fukushi, Balazs Gerofi, Takumi Honda, et al. (18 additional authors not shown)

    Abstract: Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights. High performance computing systems are pushing the frontiers of performance with a rich diversity of hardware resources and massive scale-out capabilities. There is a critical need to understand fair and effective benchmarking of machine learning appli…

    Submitted 26 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

  14. arXiv:2010.06574  [pdf, other]

    cs.DC cs.CE q-bio.QM

    IMPECCABLE: Integrated Modeling PipelinE for COVID Cure by Assessing Better LEads

    Authors: Aymen Al Saadi, Dario Alfe, Yadu Babuji, Agastya Bhati, Ben Blaiszik, Thomas Brettin, Kyle Chard, Ryan Chard, Peter Coveney, Anda Trifan, Alex Brace, Austin Clyde, Ian Foster, Tom Gibbs, Shantenu Jha, Kristopher Keipert, Thorsten Kurth, Dieter Kranzlmüller, Hyungro Lee, Zhuozhao Li, Heng Ma, Andre Merzky, Gerald Mathias, Alexander Partin, Junqi Yin, et al. (11 additional authors not shown)

    Abstract: The drug discovery process currently employed in the pharmaceutical industry typically requires about 10 years and $2-3 billion to deliver one new drug. This is both too expensive and too slow, especially in emergencies like the COVID-19 pandemic. In silico methodologies need to be improved to better select lead compounds that can proceed to later stages of the drug discovery protocol accelerating…

    Submitted 13 October, 2020; originally announced October 2020.

  15. arXiv:2010.00072  [pdf, ps, other]

    physics.comp-ph cs.LG physics.geo-ph

    Using Machine Learning to Augment Coarse-Grid Computational Fluid Dynamics Simulations

    Authors: Jaideep Pathak, Mustafa Mustafa, Karthik Kashinath, Emmanuel Motheau, Thorsten Kurth, Marcus Day

    Abstract: Simulation of turbulent flows at high Reynolds number is a computationally challenging task relevant to a large number of engineering and scientific applications in diverse fields such as climate science, aerodynamics, and combustion. Turbulent flows are typically modeled by the Navier-Stokes equations. Direct Numerical Simulation (DNS) of the Navier-Stokes equations with sufficient numerical reso…

    Submitted 3 October, 2020; v1 submitted 30 September, 2020; originally announced October 2020.

    Comments: Corrected typographical errors in the previous version related to the incorrectly formatted accented character "é" appearing in various places in the manuscript

  16. arXiv:2009.05257  [pdf, ps, other]

    cs.DC cs.LG cs.PF

    Hierarchical Roofline Performance Analysis for Deep Learning Applications

    Authors: Charlene Yang, Yunsong Wang, Steven Farrell, Thorsten Kurth, Samuel Williams

    Abstract: This paper presents a practical methodology for collecting performance data necessary to conduct hierarchical Roofline analysis on NVIDIA GPUs. It discusses the extension of the Empirical Roofline Toolkit for broader support of a range of data precisions and Tensor Core support and introduces a Nsight Compute based method to accurately collect application performance information. This methodology…

    Submitted 24 November, 2020; v1 submitted 11 September, 2020; originally announced September 2020.

    Comments: 9 pages

  17. arXiv:2009.04598  [pdf, other]

    cs.DC cs.AR cs.LG cs.PF

    Time-Based Roofline for Deep Learning Performance Analysis

    Authors: Yunsong Wang, Charlene Yang, Steven Farrell, Yan Zhang, Thorsten Kurth, Samuel Williams

    Abstract: Deep learning applications are usually very compute-intensive and require a long run time for training and inference. This has been tackled by researchers from both hardware and software sides, and in this paper, we propose a Roofline-based approach to performance analysis to facilitate the optimization of these applications. This approach is an extension of the Roofline model widely used in tradi…

    Submitted 22 September, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

    Comments: 9 pages

  18. arXiv:1910.13444  [pdf, other]

    physics.comp-ph cs.LG stat.ML

    Highly-scalable, physics-informed GANs for learning solutions of stochastic PDEs

    Authors: Liu Yang, Sean Treichler, Thorsten Kurth, Keno Fischer, David Barajas-Solano, Josh Romero, Valentin Churavy, Alexandre Tartakovsky, Michael Houston, Prabhat, George Karniadakis

    Abstract: Uncertainty quantification for forward and inverse problems is a central challenge across physical and biomedical disciplines. We address this challenge for the problem of modeling subsurface flow at the Hanford Site by combining stochastic computational models with observational data using physics-informed GAN models. The geographic extent, spatial heterogeneity, and multiple correlation length s…

    Submitted 28 October, 2019; originally announced October 2019.

    Comments: 3rd Deep Learning on Supercomputers Workshop (DLS) at SC19

  19. arXiv:1810.01993  [pdf, other]

    cs.DC

    Exascale Deep Learning for Climate Analytics

    Authors: Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston

    Abstract: We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and the network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parall…

    Submitted 3 October, 2018; originally announced October 2018.

    Comments: 12 pages, 5 tables, 4 figures, Super Computing Conference November 11-16, 2018, Dallas, TX, USA

  20. arXiv:1810.01609  [pdf, other]

    hep-lat cs.DC nucl-th physics.comp-ph

    Simulating the weak death of the neutron in a femtoscale universe with near-Exascale computing

    Authors: Evan Berkowitz, M. A. Clark, Arjun Gambhir, Ken McElvain, Amy Nicholson, Enrico Rinaldi, Pavlos Vranas, André Walker-Loud, Chia Cheng Chang, Bálint Joó, Thorsten Kurth, Kostas Orginos

    Abstract: The fundamental particle theory called Quantum Chromodynamics (QCD) dictates everything about protons and neutrons, from their intrinsic properties to interactions that bind them into atomic nuclei. Quantities that cannot be fully resolved through experiment, such as the neutron lifetime (whose precise value is important for the existence of light-atomic elements that make the sun shine and life p…

    Submitted 10 October, 2018; v1 submitted 3 October, 2018; originally announced October 2018.

    Comments: 2018 Gordon Bell Finalist: 9 pages, 9 figures; v2: fixed 2 typos and appended acknowledgements

    Report number: LLNL-JRNL-749850, RIKEN-iTHEMS-Report-18

    ACM Class: C.1.4; D.1.3

    Journal ref: Supercomputing 2018, pp. 697-705

  21. arXiv:1712.09388  [pdf, other]

    cs.DC

    Scaling GRPC Tensorflow on 512 nodes of Cori Supercomputer

    Authors: Amrita Mathuriya, Thorsten Kurth, Vivek Rane, Mustafa Mustafa, Lei Shao, Debbie Bard, Prabhat, Victor W Lee

    Abstract: We explore scaling of the standard distributed Tensorflow with GRPC primitives on up to 512 Intel Xeon Phi (KNL) nodes of Cori supercomputer with synchronous stochastic gradient descent (SGD), and identify causes of scaling inefficiency at higher node counts. To our knowledge, this is the first exploration of distributed GRPC Tensorflow scalability on a HPC supercomputer at such large scale with s…

    Submitted 26 December, 2017; originally announced December 2017.

    Comments: Published as a poster in NIPS 2017 Workshop: Deep Learning At Supercomputer Scale

  22. arXiv:1711.03573  [pdf, other]

    hep-ex cs.DC cs.LG physics.data-an

    Deep Neural Networks for Physics Analysis on low-level whole-detector data at the LHC

    Authors: Wahid Bhimji, Steven Andrew Farrell, Thorsten Kurth, Michela Paganini, Prabhat, Evan Racah

    Abstract: There has been considerable recent activity applying deep convolutional neural nets (CNNs) to data from particle physics experiments. Current approaches on ATLAS/CMS have largely focussed on a subset of the calorimeter, and for identifying objects or particular particle types. We explore approaches that use the entire calorimeter, combined with track information, for directly conducting physics an…

    Submitted 29 November, 2017; v1 submitted 9 November, 2017; originally announced November 2017.

    Comments: Presented at ACAT 2017 Conference, Submitted to J. Phys. Conf. Ser

  23. arXiv:1708.05256  [pdf, other]

    cs.PF cs.CV cs.LG

    Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data

    Authors: Thorsten Kurth, Jian Zhang, Nadathur Satish, Ioannis Mitliagkas, Evan Racah, Mostofa Ali Patwary, Tareq Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey

    Abstract: This paper presents the first, 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy physics data as well as semi-supervised architectures for localizing and classifying extreme weather in climate data. Our Intelcaffe-based implementation…

    Submitted 17 August, 2017; originally announced August 2017.

    Comments: 12 pages, 9 figures