-
Multidimensional Distributional Neural Network Output Demonstrated in Super-Resolution of Surface Wind Speed
Authors:
Harrison J. Goldwyn,
Mitchell Krock,
Johann Rudi,
Daniel Getter,
Julie Bessac
Abstract:
Accurate quantification of uncertainty in neural network predictions remains a central challenge for scientific applications involving high-dimensional, correlated data. While existing methods capture either aleatoric or epistemic uncertainty, few offer closed-form, multidimensional distributions that preserve spatial correlation while remaining computationally tractable. In this work, we present a framework for training neural networks with a multidimensional Gaussian loss, generating closed-form predictive distributions over outputs with non-identically distributed and heteroscedastic structure. Our approach captures aleatoric uncertainty by iteratively estimating the means and covariance matrices, and is demonstrated on a super-resolution example. We leverage a Fourier representation of the covariance matrix to stabilize network training and preserve spatial correlation. We introduce a novel regularization strategy -- referred to as information sharing -- that interpolates between image-specific and global covariance estimates, enabling convergence of the super-resolution downscaling network trained on image-specific distributional loss functions. This framework allows for efficient sampling, explicit correlation modeling, and extensions to more complex distribution families, all without degrading prediction performance. We demonstrate the method on a surface wind speed downscaling task and discuss its broader applicability to uncertainty-aware prediction in scientific models.
Submitted 21 August, 2025;
originally announced August 2025.
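The two ingredients named in this abstract, a multidimensional Gaussian loss and an interpolation between image-specific and global covariance estimates, can be illustrated with a minimal NumPy sketch. The function names `gaussian_nll` and `share_information`, the weight `alpha`, and the linear (convex) form of the interpolation are assumptions made for illustration; the abstract does not state the exact interpolation rule, and the Fourier representation of the covariance is not reproduced here.

```python
import numpy as np


def gaussian_nll(y, mean, cov):
    """Negative log-likelihood of y under N(mean, cov), computed via Cholesky."""
    d = y.shape[0]
    chol = np.linalg.cholesky(cov)            # cov = chol @ chol.T
    z = np.linalg.solve(chol, y - mean)       # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return 0.5 * (z @ z + log_det + d * np.log(2.0 * np.pi))


def share_information(cov_image, cov_global, alpha):
    """Hypothetical information sharing: a convex blend of two covariances.

    alpha = 0 keeps the image-specific estimate, alpha = 1 uses the global one.
    This linear form is an assumption, not the paper's stated rule.
    """
    return (1.0 - alpha) * cov_image + alpha * cov_global


# Toy usage with made-up 2x2 covariances.
cov_image = np.array([[2.0, 0.5], [0.5, 1.0]])
cov_global = np.eye(2)
cov_blend = share_information(cov_image, cov_global, alpha=0.5)
loss = gaussian_nll(np.array([0.3, -0.1]), np.zeros(2), cov_blend)
```

A convex combination of two symmetric positive-definite matrices is itself positive definite, so the blended covariance remains valid for the Cholesky-based loss.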
-
CG-Kit: Code Generation Toolkit for Performant and Maintainable Variants of Source Code Applied to Flash-X Hydrodynamics Simulations
Authors:
Johann Rudi,
Youngjun Lee,
Aidan H. Chadha,
Mohamed Wahib,
Klaus Weide,
Jared P. O'Neal,
Anshu Dubey
Abstract:
CG-Kit is a new code generation toolkit that we propose as a solution for portability and maintainability for scientific computing applications. The development of CG-Kit is rooted in the urgent need created by the shifting landscape of high-performance computing platforms and the algorithmic complexities of a particular large-scale multiphysics application: Flash-X. This combination leads to unique challenges including handling an existing large code base in Fortran and/or C/C++, subdivision of code into a great variety of units supporting a wide range of physics and numerical methods, different parallelization techniques for distributed- and shared-memory systems and accelerator devices, and heterogeneity of computing platforms requiring coexisting variants of parallel algorithms. The challenges demand that developers determine custom abstractions and granularity for code generation. CG-Kit tackles this with standalone tools that can be combined into highly specific and, we argue, highly effective portability and maintainability tool chains. Here we present the design of our new tools: parametrized source trees, control flow graphs, and recipes. The tools are implemented in Python. Although the tools are agnostic to the programming language of the source code, we focus on C/C++ and Fortran. Code generation experiments demonstrate the generation of variants of parallel algorithms: first, multithreaded variants of the basic AXPY operation (scalar-vector multiplication and vector-vector addition) to introduce the application of CG-Kit tool chains; and second, variants of parallel algorithms within a hydrodynamics solver, called Spark, from Flash-X that operates on block-structured adaptive meshes. In summary, code generated by CG-Kit reduces the original C/C++/Fortran source code by over 60%.
Submitted 6 January, 2024;
originally announced January 2024.
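To make the AXPY example concrete, the sketch below computes y = a*x + y in plain Python and generates a serial and a multithreaded C variant of the same loop from one template. This is only an illustration of the general idea of producing code variants from a single parametrized source: it uses Python's standard `string.Template`, not CG-Kit's tools or API, and the names `AXPY_TEMPLATE`, `generate_axpy_variant`, and `axpy_reference` are hypothetical.

```python
from string import Template

# One C source template; the only varying part is an optional OpenMP pragma.
AXPY_TEMPLATE = Template(
    "void axpy(int n, double a, const double *x, double *y) {\n"
    "${pragma}"
    "    for (int i = 0; i < n; ++i) {\n"
    "        y[i] = a * x[i] + y[i];\n"
    "    }\n"
    "}\n"
)


def generate_axpy_variant(multithreaded):
    """Return C source for AXPY, with or without an OpenMP parallel loop."""
    pragma = "    #pragma omp parallel for\n" if multithreaded else ""
    return AXPY_TEMPLATE.substitute(pragma=pragma)


def axpy_reference(a, x, y):
    """Reference AXPY: scalar-vector multiplication, then vector-vector addition."""
    return [a * xi + yi for xi, yi in zip(x, y)]


serial_src = generate_axpy_variant(multithreaded=False)
openmp_src = generate_axpy_variant(multithreaded=True)
result = axpy_reference(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

Keeping the loop body in one place and varying only the parallelization directive is the kind of maintainability benefit the abstract describes, although the toolkit itself works with parametrized source trees, control flow graphs, and recipes rather than simple string substitution.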
-
Statistical treatment of convolutional neural network super-resolution of inland surface wind for subgrid-scale variability quantification
Authors:
Daniel Getter,
Julie Bessac,
Johann Rudi,
Yan Feng
Abstract:
Machine learning models have been employed to perform either physics-free data-driven or hybrid dynamical downscaling of climate data. Most of these implementations operate over relatively small downscaling factors because of the challenge of recovering fine-scale information from coarse data. This limits their ability to bridge many global climate model outputs, often available at $\sim$50--100 km resolution, to scales of interest such as cloud-resolving or urban scales. This study systematically examines the capability of convolutional neural networks (CNNs) to downscale surface wind speed data over land from different coarse resolutions (25 km, 48 km, and 100 km) to 3 km. For each downscaling factor, we consider three CNN configurations that generate super-resolved predictions of fine-scale wind speed and take between one and three input fields: coarse wind speed, fine-scale topography, and diurnal cycle. In addition to fine-scale wind speeds, probability density function parameters are generated, from which sample wind speeds can be drawn to account for the intrinsic stochasticity of wind speed. For generalizability assessment, CNN models are tested on regions with different topography and climate that are unseen during training. The evaluation of super-resolved predictions focuses on subgrid-scale variability and the recovery of extremes. Models with coarse wind and fine topography as inputs perform best among the model configurations operating across the same downscaling factor. Our diurnal cycle encoding results in lower out-of-sample generalizability compared with other input configurations.
Submitted 23 February, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Flash-X, a multiphysics simulation software instrument
Authors:
Anshu Dubey,
Klaus Weide,
Jared O'Neal,
Akash Dhruv,
Sean Couch,
J. Austin Harris,
Tom Klosterman,
Rajeev Jain,
Johann Rudi,
Bronson Messer,
Michael Pajkos,
Jared Carlson,
Ran Chu,
Mohamed Wahib,
Saurabh Chawdhary,
Paul M. Ricker,
Dongwook Lee,
Katie Antypas,
Katherine M. Riley,
Christopher Daley,
Murali Ganapathy,
Francis X. Timmes,
Dean M. Townsley,
Marcos Vanella,
John Bachan, et al. (6 additional authors not shown)
Abstract:
Flash-X is a highly composable multiphysics software system that can be used to simulate physical phenomena in several scientific domains. It derives some of its solvers from FLASH, which was first released in 2000. Flash-X has a new framework that relies on abstractions and asynchronous communications for performance portability across a range of increasingly heterogeneous hardware platforms. Flash-X is meant primarily for solving Eulerian formulations of applications with compressible and/or incompressible reactive flows. It also has a built-in, versatile Lagrangian framework that can be used in many different ways, including implementing tracers, particle-in-cell simulations, and immersed boundary methods.
Submitted 24 August, 2022;
originally announced August 2022.
-
Parameter Estimation with Dense and Convolutional Neural Networks Applied to the FitzHugh-Nagumo ODE
Authors:
Johann Rudi,
Julie Bessac,
Amanda Lenzi
Abstract:
Machine learning algorithms have been successfully used to approximate nonlinear maps under weak assumptions on the structure and properties of the maps. We present deep neural networks using dense and convolutional layers to solve an inverse problem, where we seek to estimate parameters of a FitzHugh-Nagumo model, which consists of a nonlinear system of ordinary differential equations (ODEs). We employ the neural networks to approximate reconstruction maps for model parameter estimation from observational data, where the data comes from the solution of the ODE and takes the form of a time series representing dynamically spiking membrane potential of a biological neuron. We target this dynamical model because of the computational challenges it poses in an inference setting, namely, having a highly nonlinear and nonconvex data misfit term and permitting only weakly informative priors on parameters. These challenges cause traditional optimization to fail and alternative algorithms to exhibit large computational costs. We quantify the prediction errors of model parameters obtained from the neural networks and investigate the effects of network architectures with and without the presence of noise in observational data. We generalize our framework for neural network-based reconstruction maps to simultaneously estimate ODE parameters and parameters of autocorrelated observational noise. Our results demonstrate that deep neural networks have the potential to estimate parameters in dynamical models and stochastic processes, and they are capable of predicting parameters accurately for the FitzHugh-Nagumo model.
Submitted 4 May, 2021; v1 submitted 11 December, 2020;
originally announced December 2020.
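For readers unfamiliar with the FitzHugh-Nagumo model, the sketch below simulates a spiking time series of the kind such a network would take as input. It assumes the common textbook parameterization (membrane potential `v`, recovery variable `w`, parameters `a`, `b`, `eps`, and a constant input `current`) and a fixed-step fourth-order Runge-Kutta integrator; the paper may use a different parameterization, and all parameter values here are illustrative.

```python
import numpy as np


def fhn_rhs(state, a, b, eps, current):
    """Right-hand side of the (textbook-form) FitzHugh-Nagumo ODE system."""
    v, w = state
    dv = v - v**3 / 3.0 - w + current   # fast membrane potential
    dw = eps * (v + a - b * w)          # slow recovery variable
    return np.array([dv, dw])


def simulate_fhn(a=0.7, b=0.8, eps=0.08, current=0.5,
                 v0=-1.0, w0=1.0, dt=0.05, n_steps=4000):
    """Integrate the system with classical RK4; returns an (n_steps + 1, 2) array."""
    traj = np.empty((n_steps + 1, 2))
    traj[0] = (v0, w0)
    for k in range(n_steps):
        s = traj[k]
        k1 = fhn_rhs(s, a, b, eps, current)
        k2 = fhn_rhs(s + 0.5 * dt * k1, a, b, eps, current)
        k3 = fhn_rhs(s + 0.5 * dt * k2, a, b, eps, current)
        k4 = fhn_rhs(s + dt * k3, a, b, eps, current)
        traj[k + 1] = s + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj


trajectory = simulate_fhn()
membrane_potential = trajectory[:, 0]   # the observable time series
```

In the inverse problem described above, a time series like `membrane_potential` (possibly with added noise) would be the network input and the ODE parameters would be the regression targets.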
-
Weighted BFBT Preconditioner for Stokes Flow Problems with Highly Heterogeneous Viscosity
Authors:
Johann Rudi,
Georg Stadler,
Omar Ghattas
Abstract:
We present a weighted BFBT approximation (w-BFBT) to the inverse Schur complement of a Stokes system with highly heterogeneous viscosity. When used as part of a Schur complement-based Stokes preconditioner, we observe robust fast convergence for Stokes problems with smooth but highly varying (up to 10 orders of magnitude) viscosities, optimal algorithmic scalability with respect to mesh refinement, and only a mild dependence on the polynomial order of high-order finite element discretizations ($Q_k \times P_{k-1}^{disc}$, order $k \ge 2$). For certain difficult problems, we demonstrate numerically that w-BFBT significantly improves Stokes solver convergence over the widely used inverse viscosity-weighted pressure mass matrix approximation of the Schur complement. In addition, we derive theoretical eigenvalue bounds to prove spectral equivalence of w-BFBT. Using detailed numerical experiments, we discuss modifications to w-BFBT at Dirichlet boundaries that decrease the number of iterations. The overall algorithmic performance of the Stokes solver is governed by the efficacy of w-BFBT as a Schur complement approximation and, in addition, by our parallel hybrid spectral-geometric-algebraic multigrid (HMG) method, which we use to approximate the inverses of the viscous block and variable-coefficient pressure Poisson operators within w-BFBT. Building on the scalability of HMG, our Stokes solver achieves a parallel efficiency of 90% while weak scaling over a more than 600-fold increase from 48 to all 30,000 cores of TACC's Lonestar 5 supercomputer.
Submitted 29 January, 2017; v1 submitted 13 July, 2016;
originally announced July 2016.
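For orientation, the objects named in this abstract can be written out as follows. The notation is an assumption for illustration: $A$ denotes the discrete viscous block, $B$ the discrete divergence operator, $M_p(1/\mu)$ the pressure mass matrix weighted by the inverse viscosity $\mu$, and $C$, $D$ generic weighting matrices. The abstract does not specify the particular weights that define w-BFBT, so they are left generic here.

$$
\begin{bmatrix} A & B^{T} \\ B & 0 \end{bmatrix}
\begin{bmatrix} u \\ p \end{bmatrix}
=
\begin{bmatrix} f \\ 0 \end{bmatrix},
\qquad
S = B A^{-1} B^{T}
$$

$$
S \approx M_p(1/\mu)
\quad \text{(inverse viscosity-weighted pressure mass matrix)}
$$

$$
S^{-1} \approx \left(B C^{-1} B^{T}\right)^{-1}
\left(B C^{-1} A D^{-1} B^{T}\right)
\left(B D^{-1} B^{T}\right)^{-1}
\quad \text{(BFBT form)}
$$

Applying the BFBT form requires two solves with Poisson-like operators $B C^{-1} B^{T}$ and $B D^{-1} B^{T}$, which is consistent with the abstract's mention of approximating the inverses of variable-coefficient pressure Poisson operators with multigrid.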