-
An Energy Stable High-Order Cut Cell Discontinuous Galerkin Method with State Redistribution for Wave Propagation
Authors:
Christina G. Taylor,
Lucas C. Wilcox,
Jesse Chan
Abstract:
Cut meshes are a type of mesh that is formed by allowing embedded boundaries to "cut" a simple underlying mesh resulting in a hybrid mesh of cut and standard elements. While cut meshes can allow complex boundaries to be represented well regardless of the mesh resolution, their arbitrarily shaped and sized cut elements can present issues such as the small cell problem, where small cut elements can result in a severely restricted CFL condition. State redistribution, a technique developed by Berger and Giuliani [1], can be used to address the small cell problem. In this work, we pair state redistribution with a high-order discontinuous Galerkin scheme that is $L_2$ energy stable for arbitrary quadrature. We prove that state redistribution can be added to a provably $L_2$ energy stable discontinuous Galerkin method on a cut mesh without damaging the scheme's $L_2$ stability. We numerically verify the high order accuracy and stability of our scheme on two-dimensional wave propagation problems.
Submitted 23 April, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Entropy Stable Discontinuous Galerkin Methods for Balance Laws in Non-Conservative Form: Applications to the Euler Equations with Gravity
Authors:
Maciej Waruszewski,
Jeremy E. Kozdon,
Lucas C. Wilcox,
Thomas H. Gibson,
Francis X. Giraldo
Abstract:
In this work a non-conservative balance law formulation is considered that encompasses the rotating, compressible Euler equations for dry atmospheric flows. We develop a semi-discretely entropy stable discontinuous Galerkin method on curvilinear meshes using a generalization of flux differencing for numerical fluxes in fluctuation form. The method uses the skew-hybridized formulation of the element operators to ensure that, even in the presence of under-integration on curvilinear meshes, the resulting discretization is entropy stable. Several atmospheric flow test cases in one, two, and three dimensions confirm the theoretical entropy stability results as well as show the high-order accuracy and robustness of the method.
Submitted 25 July, 2022; v1 submitted 29 October, 2021;
originally announced October 2021.
-
Large-eddy simulations with ClimateMachine: a new open-source code for atmospheric simulations on GPUs and CPUs
Authors:
Akshay Sridhar,
Yassine Tissaoui,
Simone Marras,
Zhaoyi Shen,
Charles Kawczynski,
Simon Byrne,
Kiran Pamnany,
Maciej Waruszewski,
Thomas H. Gibson,
Jeremy E. Kozdon,
Valentin Churavy,
Lucas C. Wilcox,
Francis X. Giraldo,
Tapio Schneider
Abstract:
We introduce ClimateMachine, a new open-source atmosphere modeling framework using the Julia language to be performance portable on central processing units (CPUs) and graphics processing units (GPUs). ClimateMachine uses a common framework both for coarser-resolution global simulations and for high-resolution, limited-area large-eddy simulations (LES). Here, we demonstrate the LES configuration of the atmosphere model in canonical benchmark cases and atmospheric flows, using an energy-conserving nodal discontinuous-Galerkin (DG) discretization of the governing equations. Resolution dependence, conservation characteristics and scaling metrics are examined in comparison with existing LES codes. They demonstrate the utility of ClimateMachine as a modelling tool for limited-area LES flow configurations.
Submitted 2 October, 2021;
originally announced October 2021.
-
Hybridized Summation-By-Parts Finite Difference Methods
Authors:
Jeremy E. Kozdon,
Brittany A. Erickson,
Lucas C. Wilcox
Abstract:
We present a hybridization technique for summation-by-parts finite difference methods with weak enforcement of interface and boundary conditions for second order, linear elliptic partial differential equations. The method is based on techniques from the hybridized discontinuous Galerkin literature where local and global problems are defined for the volume and trace grid points, respectively. By using a Schur complement technique the volume points can be eliminated, which drastically reduces the system size. We derive both the local and global problems, and show that the linear systems that must be solved are symmetric positive definite. The theoretical stability results are confirmed with numerical experiments as is the accuracy of the method.
Submitted 1 June, 2021; v1 submitted 31 January, 2020;
originally announced February 2020.
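The Schur complement elimination mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a generic symmetric positive definite block system with "volume" unknowns u and "trace" unknowns lam; the matrices here are random stand-ins, and the names `schur_solve`, `n_vol`, and `n_trace` are hypothetical, not taken from the paper.

```python
import numpy as np

def schur_solve(A, B, C, f, g):
    """Solve [[A, B], [B.T, C]] [u; lam] = [f; g] by eliminating u."""
    Ainv_B = np.linalg.solve(A, B)   # local solves on the volume block
    Ainv_f = np.linalg.solve(A, f)
    S = C - B.T @ Ainv_B             # Schur complement acting on trace unknowns
    lam = np.linalg.solve(S, g - B.T @ Ainv_f)  # small global problem
    u = Ainv_f - Ainv_B @ lam        # back-substitute for the volume unknowns
    return u, lam, S

# Random SPD stand-in system (an assumption for illustration only).
rng = np.random.default_rng(0)
n_vol, n_trace = 12, 3
G = rng.standard_normal((n_vol + n_trace, n_vol + n_trace))
K = G @ G.T + (n_vol + n_trace) * np.eye(n_vol + n_trace)
A, B, C = K[:n_vol, :n_vol], K[:n_vol, n_vol:], K[n_vol:, n_vol:]
rhs = rng.standard_normal(n_vol + n_trace)

u, lam, S = schur_solve(A, B, C, rhs[:n_vol], rhs[n_vol:])
```

In this sketch the global solve involves only the 3 trace unknowns instead of all 15, and the Schur complement of an SPD matrix is again SPD, which is consistent with the properties the abstract reports.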
-
Fast Mesh Refinement in Pseudospectral Optimal Control
Authors:
N. Koeppen,
I. M. Ross,
L. C. Wilcox,
R. J. Proulx
Abstract:
Mesh refinement in pseudospectral (PS) optimal control is embarrassingly easy --- simply increase the order $N$ of the Lagrange interpolating polynomial and the mathematics of convergence automates the distribution of the grid points. Unfortunately, as $N$ increases, the condition number of the resulting linear algebra increases as $N^2$; hence, spectral efficiency and accuracy are lost in practice. In this paper, we advance Birkhoff interpolation concepts over an arbitrary grid to generate well-conditioned PS optimal control discretizations. We show that the condition number increases only as $\sqrt{N}$ in general, but is independent of $N$ for the special case of one of the boundary points being fixed. Hence, spectral accuracy and efficiency are maintained as $N$ increases. The effectiveness of the resulting fast mesh refinement strategy is demonstrated by using polynomials of over a thousandth order to solve a low-thrust, long-duration orbit transfer problem.
Submitted 29 April, 2019;
originally announced April 2019.
-
Robust Approaches to Handling Complex Geometries with Galerkin Difference Methods
Authors:
Jeremy E. Kozdon,
Lucas C. Wilcox,
Thomas Hagstrom,
Jeffrey W. Banks
Abstract:
The Galerkin difference (GD) basis is a set of continuous, piecewise polynomials defined using a finite difference like grid of degrees of freedom. The one dimensional GD basis functions are naturally extended to multiple dimensions using the tensor product constructions to quadrilateral elements for discretizing partial differential equations. Here we propose two approaches to handling complex geometries using the GD basis within a discontinuous Galerkin finite element setting: (1) using non-conforming, curvilinear GD elements and (2) coupling affine GD elements with curvilinear simplicial elements. In both cases the (semidiscrete) discontinuous Galerkin method is provably energy stable even when variational crimes are committed and in both cases a weight-adjusted mass matrix is used, which ensures that only the reference mass matrix must be inverted. Additionally, we give sufficient conditions on the treatment of metric terms for the curvilinear, nonconforming GD elements to ensure that the scheme is both constant preserving and conservative. Numerical experiments confirm the stability results and demonstrate the accuracy of the coupled schemes.
Submitted 1 June, 2021; v1 submitted 15 June, 2018;
originally announced June 2018.
-
Discretely entropy stable weight-adjusted discontinuous Galerkin methods on curvilinear meshes
Authors:
Jesse Chan,
Lucas C. Wilcox
Abstract:
We construct entropy conservative and entropy stable high order accurate discontinuous Galerkin (DG) discretizations for time-dependent nonlinear hyperbolic conservation laws on curvilinear meshes. The resulting schemes preserve a semi-discrete quadrature approximation of a continuous global entropy inequality. The proof requires the satisfaction of a discrete geometric conservation law, which we enforce through an appropriate polynomial approximation. We extend the construction of entropy conservative and entropy stable DG schemes to the case when high order accurate curvilinear mass matrices are approximated using low-storage weight-adjusted approximations, and describe how to retain global conservation properties under such an approximation. The theoretical results are verified through numerical experiments for the compressible Euler equations on triangular and tetrahedral meshes.
Submitted 12 June, 2018; v1 submitted 28 May, 2018;
originally announced May 2018.
-
An Energy Stable Approach for Discretizing Hyperbolic Equations with Nonconforming Discontinuous Galerkin Methods
Authors:
Jeremy E. Kozdon,
Lucas C. Wilcox
Abstract:
When nonconforming discontinuous Galerkin methods are implemented for hyperbolic equations using quadrature, exponential energy growth can result even when the underlying scheme with exact integration does not support such growth. Using linear elasticity as a model problem, we propose a skew-symmetric formulation that has the same energy stability properties for both exact and inexact quadrature-based integration. These stability properties are maintained even when the material properties are variable and discontinuous, and the elements are non-affine (e.g., curved). Additionally, we show how the nonconforming scheme can be made conservative and constant preserving with variable material properties and curved elements. The analytic stability, conservation, and constant preserving results are confirmed through numerical experiments demonstrating the stability as well as the accuracy of the method.
Submitted 10 March, 2022; v1 submitted 1 June, 2017;
originally announced June 2017.
-
Acceleration of the Implicit-Explicit Non-hydrostatic Unified Model of the Atmosphere (NUMA) on Manycore Processors
Authors:
Daniel S. Abdi,
Francis X. Giraldo,
Emil M. Constantinescu,
Lester E. Carr III,
Lucas C. Wilcox,
Timothy C. Warburton
Abstract:
We present the acceleration of an IMplicit-EXplicit (IMEX) non-hydrostatic atmospheric model on manycore processors such as GPUs and Intel's MIC architecture. IMEX time integration methods sidestep the constraint imposed by the Courant-Friedrichs-Lewy condition on explicit methods through corrective implicit solves within each time step. In this work, we implement and evaluate the performance of IMEX on manycore processors relative to explicit methods. Using 3D-IMEX at Courant number C=15, we obtained a speedup of about 4X relative to an explicit time stepping method run with the maximum allowable C=1. In addition, we demonstrate a much larger speedup of 100X at C=150 using 1D-IMEX due to the unconditional stability of the method in the vertical direction. Several improvements on the IMEX procedure were necessary in order to outperform our results with explicit methods: a) reducing the number of degrees of freedom of the IMEX formulation by forming the Schur complement; b) formulating a horizontally-explicit vertically-implicit (HEVI) 1D-IMEX scheme that has a lower workload and potentially better scalability than 3D-IMEX; c) using high-order polynomial preconditioners to reduce the condition number of the resulting system; d) using a direct solver for the 1D-IMEX method by performing and storing LU factorizations once to obtain a constant cost for any Courant number. Without all of these improvements, explicit time integration methods turned out to be difficult to beat. We discuss in detail the IMEX infrastructure required for formulating and implementing efficient methods on manycore processors. Finally, we validate our results with standard benchmark problems in NWP and evaluate the performance and scalability of the IMEX method using up to 4192 GPUs and 16 Knights Landing processors.
Submitted 13 February, 2017;
originally announced February 2017.
-
Solving 1D Conservation Laws Using Pontryagin's Minimum Principle
Authors:
Wei Kang,
Lucas C. Wilcox
Abstract:
This paper discusses a connection between scalar convex conservation laws and Pontryagin's minimum principle. For flux functions for which an associated optimal control problem can be found, a minimum value solution of the conservation law is proposed. For scalar space-independent convex conservation laws such a control problem exists and the minimum value solution of the conservation law is equivalent to the entropy solution. This can be seen as a generalization of the Lax--Oleinik formula to convex (not necessarily uniformly convex) flux functions. Using Pontryagin's minimum principle, an algorithm for finding the minimum value solution pointwise of scalar convex conservation laws is given. Numerical examples of approximating the solution of both space-dependent and space-independent conservation laws are provided to demonstrate the accuracy and applicability of the proposed algorithm. Furthermore, a MATLAB routine using Chebfun is provided (along with demonstration code on how to use it) to approximately solve scalar convex conservation laws with space-independent flux functions.
Submitted 10 February, 2017; v1 submitted 14 May, 2016;
originally announced May 2016.
-
Array Program Transformation with Loo.py by Example: High-Order Finite Elements
Authors:
Andreas Klöckner,
Lucas C. Wilcox,
T. Warburton
Abstract:
To concisely and effectively demonstrate the capabilities of our program transformation system Loo.py, we examine a transformation path from two real-world Fortran subroutines as found in a weather model to a single high-performance computational kernel suitable for execution on modern GPU hardware. Along the transformation path, we encounter kernel fusion, vectorization, prefetching, parallelization, and algorithmic changes achieved by mechanized conversion between imperative and functional/substitution-based code, among a number more. We conclude with performance results that demonstrate the effects and support the effectiveness of the applied transformations.
Submitted 13 April, 2016;
originally announced April 2016.
-
Strong Scaling for Numerical Weather Prediction at Petascale with the Atmospheric Model NUMA
Authors:
Andreas Müller,
Michal A. Kopera,
Simone Marras,
Lucas C. Wilcox,
Tobin Isaac,
Francis X. Giraldo
Abstract:
Numerical weather prediction (NWP) has proven to be computationally challenging due to its inherent multiscale nature. Currently, the highest resolution NWP models use a horizontal resolution of about 10km. In order to increase the resolution of NWP models highly scalable atmospheric models are needed.
The Non-hydrostatic Unified Model of the Atmosphere (NUMA), developed by the authors at the Naval Postgraduate School, was designed to achieve this purpose. NUMA is used by the Naval Research Laboratory, Monterey as the engine inside its next generation weather prediction system NEPTUNE. NUMA solves the fully compressible Navier-Stokes equations by means of high-order Galerkin methods (both spectral element as well as discontinuous Galerkin methods can be used). Mesh generation is done using the p4est library. NUMA is capable of running middle and upper atmosphere simulations since it does not make use of the shallow-atmosphere approximation.
This paper presents the performance analysis and optimization of the spectral element version of NUMA. The performance at different optimization stages is analyzed using a theoretical performance model as well as measurements via hardware counters. Machine independent optimization is compared to machine specific optimization using BG/Q vector intrinsics. By using vector intrinsics the main computations reach 1.2 PFlops on the entire machine Mira (12% of the theoretical peak performance). The paper also presents scalability studies for two idealized test cases that are relevant for NWP applications. The atmospheric model NUMA delivers an excellent strong scaling efficiency of 99% on the entire supercomputer Mira using a mesh with 1.8 billion grid points. This makes it possible to run a global forecast of a baroclinic wave test case at 3km uniform horizontal resolution and double precision within the time frame required for operational weather prediction.
Submitted 8 September, 2016; v1 submitted 4 November, 2015;
originally announced November 2015.
-
Mitigating the Curse of Dimensionality: Sparse Grid Characteristics Method for Optimal Feedback Control and HJB Equations
Authors:
Wei Kang,
Lucas C. Wilcox
Abstract:
We address finding the semi-global solutions to optimal feedback control and the Hamilton--Jacobi--Bellman (HJB) equation. Using the solution of an HJB equation, a feedback optimal control law can be implemented in real-time with minimum computational load. However, except for systems with two or three state variables, using traditional techniques for numerically finding a semi-global solution to an HJB equation for general nonlinear systems is infeasible due to the curse of dimensionality. Here we present a new computational method for finding feedback optimal control and solving HJB equations which is able to mitigate the curse of dimensionality. We do not discretize the HJB equation directly; instead, we introduce a sparse grid in the state space and use Pontryagin's maximum principle to derive a set of necessary conditions in the form of a boundary value problem, also known as the characteristic equations, for each grid point. Using this approach, the method is spatially causality free, which enjoys the advantage of perfect parallelism on a sparse grid. Compared with dense grids, a sparse grid has a significantly reduced size which is feasible for systems with relatively high dimensions, such as the $6$-D system shown in the examples. Once the solution is obtained at each grid point, high-order accurate polynomial interpolation is used to approximate the feedback control at arbitrary points. We prove an upper bound for the approximation error and approximate it numerically. This sparse grid characteristics method is demonstrated with two examples of rigid body attitude control using momentum wheels.
Submitted 15 June, 2016; v1 submitted 16 July, 2015;
originally announced July 2015.
-
Stable Coupling of Nonconforming, High-Order Finite Difference Methods
Authors:
Jeremy E. Kozdon,
Lucas C. Wilcox
Abstract:
A methodology for handling block-to-block coupling of nonconforming, multiblock summation-by-parts finite difference methods is proposed. The coupling is based on the construction of projection operators that move a finite difference grid solution along an interface to a space of piecewise defined functions; we specifically consider discontinuous, piecewise polynomial functions. The constructed projection operators are compatible with the underlying summation-by-parts energy norm. Using the linear wave equation in two dimensions as a model problem, energy stability of the coupled numerical method is proven for the case of curved, nonconforming block-to-block interfaces. To further demonstrate the power of the coupling procedure, we show how it allows for the development of a provably energy stable coupling between curvilinear finite difference methods and a curved-triangle discontinuous Galerkin method. The theoretical results are verified through numerical simulations on curved meshes as well as eigenvalue analysis.
Submitted 28 September, 2015; v1 submitted 21 October, 2014;
originally announced October 2014.
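The summation-by-parts (SBP) property that the abstract above relies on can be checked numerically. The following is a minimal sketch using the classical second-order accurate SBP first-derivative operator on a uniform grid; the function name `sbp_second_order` and the test functions are assumptions for illustration and are not the operators or the projection construction used in the paper.

```python
import numpy as np

def sbp_second_order(n, h):
    """Second-order SBP first-derivative operator D = H^{-1} Q on n points."""
    H = h * np.eye(n)                      # diagonal norm (quadrature) matrix
    H[0, 0] = H[-1, -1] = h / 2
    Q = 0.5 * (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1))
    Q[0, 0], Q[-1, -1] = -0.5, 0.5         # boundary closures
    D = np.linalg.solve(H, Q)
    return H, Q, D

n = 11
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
H, Q, D = sbp_second_order(n, h)

# SBP property: Q + Q^T has only the two boundary entries -1 and +1.
B = Q + Q.T

# Discrete integration by parts for two arbitrary grid functions.
u, v = np.sin(x), np.cos(x)
lhs = u @ H @ (D @ v) + (D @ u) @ H @ v
rhs = u[-1] * v[-1] - u[0] * v[0]
```

Here `lhs` and `rhs` agree to rounding error, which is the discrete analogue of integration by parts; energy estimates of the kind the abstract describes are built on this identity.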
-
Recursive Algorithms for Distributed Forests of Octrees
Authors:
Tobin Isaac,
Carsten Burstedde,
Lucas C. Wilcox,
Omar Ghattas
Abstract:
The forest-of-octrees approach to parallel adaptive mesh refinement and coarsening (AMR) has recently been demonstrated in the context of a number of large-scale PDE-based applications. Although linear octrees, which store only leaf octants, have an underlying tree structure by definition, it is not often exploited in previously published mesh-related algorithms. This is because the branches are not explicitly stored, and because the topological relationships in meshes, such as the adjacency between cells, introduce dependencies that do not respect the octree hierarchy. In this work we combine hierarchical and topological relationships between octree branches to design efficient recursive algorithms.
We present three important algorithms with recursive implementations. The first is a parallel search for leaves matching any of a set of multiple search criteria. The second is a ghost layer construction algorithm that handles arbitrarily refined octrees that are not covered by previous algorithms, which require a 2:1 condition between neighboring leaves. The third is a universal mesh topology iterator. This iterator visits every cell in a domain partition, as well as every interface (face, edge and corner) between these cells. The iterator calculates the local topological information for every interface that it visits, taking into account the nonconforming interfaces that increase the complexity of describing the local topology. To demonstrate the utility of the topology iterator, we use it to compute the numbering and encoding of higher-order $C^0$ nodal basis functions.
We analyze the complexity of the new recursive algorithms theoretically, and assess their performance, both in terms of single-processor efficiency and in terms of parallel scalability, demonstrating good weak and strong scaling up to 458k cores of the JUQUEEN supercomputer.
Submitted 19 August, 2015; v1 submitted 31 May, 2014;
originally announced June 2014.
-
Discretely exact derivatives for hyperbolic PDE-constrained optimization problems discretized by the discontinuous Galerkin method
Authors:
Lucas C. Wilcox,
Georg Stadler,
Tan Bui-Thanh,
Omar Ghattas
Abstract:
This paper discusses the computation of derivatives for optimization problems governed by linear hyperbolic systems of partial differential equations (PDEs) that are discretized by the discontinuous Galerkin (dG) method. An efficient and accurate computation of these derivatives is important, for instance, in inverse problems and optimal control problems. This computation is usually based on an adjoint PDE system, and the question addressed in this paper is how the discretization of this adjoint system should relate to the dG discretization of the hyperbolic state equation. Adjoint-based derivatives can either be computed before or after discretization; these two options are often referred to as the optimize-then-discretize and discretize-then-optimize approaches. We discuss the relation between these two options for dG discretizations in space and Runge-Kutta time integration. Discretely exact discretizations for several hyperbolic optimization problems are derived, including the advection equation, Maxwell's equations and the coupled elastic-acoustic wave equation. We find that the discrete adjoint equation inherits a natural dG discretization from the discretization of the state equation and that the expressions for the discretely exact gradient often have to take into account contributions from element faces. For the coupled elastic-acoustic wave equation, the correctness and accuracy of our derivative expressions are illustrated by comparisons with finite difference gradients. The results show that a straightforward discretization of the continuous gradient differs from the discretely exact gradient, and thus is not consistent with the discretized objective. This inconsistency may cause difficulties in the convergence of gradient based algorithms for solving optimization problems.
Submitted 27 November, 2013;
originally announced November 2013.
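The comparison of adjoint-based gradients with finite difference gradients described in the abstract above can be sketched on a toy problem. This example assumes a small linear system advanced with forward Euler and an objective that depends on the initial condition; it is a discretize-then-optimize illustration only, not the dG and Runge-Kutta setting of the paper, and all names (`forward`, `adjoint_gradient`, `fd_gradient`) are hypothetical.

```python
import numpy as np

def forward(M, m, nsteps):
    """Advance u_{k+1} = M u_k from the initial condition u_0 = m."""
    u = m.copy()
    for _ in range(nsteps):
        u = M @ u
    return u

def objective(M, m, d, nsteps):
    """J(m) = 0.5 * ||u_N(m) - d||^2 for the discrete state equation."""
    r = forward(M, m, nsteps) - d
    return 0.5 * r @ r

def adjoint_gradient(M, m, d, nsteps):
    """Discretely exact gradient: march the discrete adjoint backward in time."""
    lam = forward(M, m, nsteps) - d        # terminal condition lam_N
    for _ in range(nsteps):
        lam = M.T @ lam                    # lam_k = M^T lam_{k+1}
    return lam                             # dJ/dm = lam_0

def fd_gradient(M, m, d, nsteps, eps=1e-6):
    """Central finite difference approximation of dJ/dm."""
    g = np.zeros_like(m)
    for i in range(m.size):
        e = np.zeros_like(m)
        e[i] = eps
        g[i] = (objective(M, m + e, d, nsteps)
                - objective(M, m - e, d, nsteps)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
n, nsteps, dt = 5, 20, 0.01
A = rng.standard_normal((n, n))            # stand-in for a spatial operator
M = np.eye(n) + dt * A                     # forward Euler step matrix
m = rng.standard_normal(n)
d = rng.standard_normal(n)

g_adj = adjoint_gradient(M, m, d, nsteps)
g_fd = fd_gradient(M, m, d, nsteps)
```

Because the adjoint recursion is derived from the discrete time-stepping scheme itself, `g_adj` matches `g_fd` up to finite difference and rounding error, which is the kind of consistency check the abstract refers to.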