Showing 1–39 of 39 results for author: Knepley, M G

Searching in archive cs.
  1. arXiv:2401.05868

    cs.DC cs.MS

    Efficient N-to-M Checkpointing Algorithm for Finite Element Simulations

    Authors: David A. Ham, Vaclav Hapla, Matthew G. Knepley, Lawrence Mitchell, Koki Sagiyama

    Abstract: In this work, we introduce a new algorithm for N-to-M checkpointing in finite element simulations. This new algorithm allows efficient saving/loading of functions representing physical quantities associated with the mesh representing the physical domain. Specifically, the algorithm allows for using different numbers of parallel processes for saving and loading, allowing for restarting and post-pro…

    Submitted 30 October, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: author accepted manuscript

  2. arXiv:2303.12620

    physics.plasm-ph cs.CE

    A Numerical Study of Landau Damping with PETSc-PIC

    Authors: Daniel S. Finn, Matthew G. Knepley, Joseph V. Pusztay, Mark F. Adams

    Abstract: We present a study of the standard plasma physics test, Landau damping, using the Particle-In-Cell (PIC) algorithm. The Landau damping phenomenon consists of the damping of small oscillations in plasmas without collisions. In the PIC method, a hybrid discretization is constructed with a grid of finitely supported basis functions to represent the electric, magnetic and/or gravitational fields, and…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 14 pages, 7 figures

  3. arXiv:2208.07128

    cs.CG math.NA

    Tetrahedralization of a Hexahedral Mesh

    Authors: Aman Timalsina, Matthew G. Knepley

    Abstract: Two important classes of three-dimensional elements in computational meshes are hexahedra and tetrahedra. While several efficient methods exist that convert a hexahedral element to tetrahedral elements, the existing algorithm for tetrahedralization of a hexahedral complex is the marching tetrahedron algorithm which limits pre-selection of face divisions. We generalize a procedure for tetrahedral…

    Submitted 19 January, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

    Comments: The previous version had an error in the proof of Observation 2.1, which has since been rectified in this version. Formatting and title changed

  4. arXiv:2201.02806

    cs.MS cs.GR

    Parallel Metric-Based Mesh Adaptation in PETSc using ParMmg

    Authors: Joseph G. Wallwork, Matthew G. Knepley, Nicolas Barral, Matthew D. Piggott

    Abstract: This research note documents the integration of the MPI-parallel metric-based mesh adaptation toolkit ParMmg into the solver library PETSc. This coupling brings robust, scalable anisotropic mesh adaptation to a wide community of PETSc users, as well as users of downstream packages. We demonstrate the new functionality via the solution of Poisson problems in three dimensions, with both uniform and…

    Submitted 27 July, 2022; v1 submitted 8 January, 2022; originally announced January 2022.

    Comments: 5 pages, 2 figures. Appeared as a research note in the 30th International Meshing Roundtable

    MSC Class: 35-04 ACM Class: G.4

  5. Understanding performance variability in standard and pipelined parallel Krylov solvers

    Authors: Hannah Morgan, Patrick Sanan, Matthew G. Knepley, Richard Tran Mills

    Abstract: In this work, we collect data from runs of Krylov subspace methods and pipelined Krylov algorithms in an effort to understand and model the impact of machine noise and other sources of variability on performance. We find large variability of Krylov iterations between compute nodes for standard methods that is reduced in pipelined algorithms, directly supporting conjecture, as well as large variati…

    Submitted 21 March, 2021; originally announced March 2021.

    Comments: 18 pages, 12 figures

    Journal ref: IJHPCA, 35(1), 2020

  6. arXiv:2004.08729

    cs.MS math.NA

    Fully Parallel Mesh I/O using PETSc DMPlex with an Application to Waveform Modeling

    Authors: Vaclav Hapla, Matthew G. Knepley, Michael Afanasiev, Christian Boehm, Martin van Driel, Lion Krischer, Andreas Fichtner

    Abstract: Large-scale PDE simulations using high-order finite-element methods on unstructured meshes are an indispensable tool in science and engineering. The widely used open-source PETSc library offers an efficient representation of generic unstructured meshes within its DMPlex module. This paper details our recent implementation of parallel mesh reading and topological interpolation (computation of edges…

    Submitted 15 September, 2020; v1 submitted 18 April, 2020; originally announced April 2020.

    Comments: 23 pages, 11 figures

    MSC Class: 65-04; 65Y05; 65M50; 05C90; 35L05

    Journal ref: SIAM J. Sci. Comput. 43 (2021) C127-C153

  7. arXiv:1912.08516

    cs.MS math.NA

    PCPATCH: software for the topological construction of multigrid relaxation methods

    Authors: Patrick E. Farrell, Matthew G. Knepley, Lawrence Mitchell, Florian Wechsung

    Abstract: Effective relaxation methods are necessary for good multigrid convergence. For many equations, standard Jacobi and Gauß-Seidel are inadequate, and more sophisticated space decompositions are required; examples include problems with semidefinite terms or saddle point structure. In this paper we present a unifying software abstraction, PCPATCH, for the topological construction of space decomposition…

    Submitted 5 July, 2021; v1 submitted 18 December, 2019; originally announced December 2019.

    Comments: 22 pages, minor fixes in bibliography

    Journal ref: ACM Transactions on Mathematical Software 47(3):25 (2021)

  8. arXiv:1809.00747

    cs.CE physics.comp-ph

    A high order hybridizable discontinuous Galerkin method for incompressible miscible displacement in heterogeneous media

    Authors: Maurice S. Fabien, Matthew G. Knepley, Beatrice M. Riviere

    Abstract: We present a new method for approximating solutions to the incompressible miscible displacement problem in porous media. At the discrete level, the coupled nonlinear system has been split into two linear systems that are solved sequentially. The method is based on a hybridizable discontinuous Galerkin method for the Darcy flow, which produces a mass-conservative flux approximation, and a hybridiz…

    Submitted 16 September, 2018; v1 submitted 3 September, 2018; originally announced September 2018.

  9. Composable block solvers for the four-field double porosity/permeability model

    Authors: M. S. Joshaghani, J. Chang, K. B. Nakshatrala, M. G. Knepley

    Abstract: The objective of this paper is twofold. First, we propose two composable block solver methodologies to solve the discrete systems that arise from finite element discretizations of the double porosity/permeability (DPP) model. The DPP model, which is a four-field mathematical model, describes the flow of a single-phase incompressible fluid in a porous medium with two distinct pore-networks and with…

    Submitted 24 August, 2018; originally announced August 2018.

  10. arXiv:1802.07832

    cs.MS cs.PF

    Comparative study of finite element methods using the Time-Accuracy-Size (TAS) spectrum analysis

    Authors: Justin Chang, Maurice S. Fabien, Matthew G. Knepley, Richard T. Mills

    Abstract: We present a performance analysis appropriate for comparing algorithms using different numerical discretizations. By taking into account the total time-to-solution, numerical accuracy with respect to an error norm, and the computation rate, a cost-benefit analysis can be performed to determine which algorithm and discretization are particularly suited for an application. This work extends the perf…

    Submitted 21 February, 2018; originally announced February 2018.

    MSC Class: 65Y05; 65Y20; 68N99

  11. arXiv:1802.06013

    cs.CE physics.comp-ph

    A hybridizable discontinuous Galerkin method for two-phase flow in heterogeneous porous media

    Authors: Maurice S. Fabien, Matthew G. Knepley, Beatrice M. Riviere

    Abstract: We present a new method for simulating incompressible immiscible two-phase flow in porous media. The semi-implicit method decouples the wetting phase pressure and saturation equations. The equations are discretized using a hybridizable discontinuous Galerkin (HDG) method. The proposed method is of high order, conserves global/local mass balance, and the number of globally coupled degrees of freedo…

    Submitted 16 February, 2018; originally announced February 2018.

    Comments: 20 pages, 39 figures, 2 tables

  12. arXiv:1705.03625

    cs.MS math.NA

    A performance spectrum for parallel computational frameworks that solve PDEs

    Authors: J. Chang, K. B. Nakshatrala, M. G. Knepley, L. Johnsson

    Abstract: Important computational physics problems are often large-scale in nature, and it is highly desirable to have robust and high performing computational frameworks that can quickly address these problems. However, it is no trivial task to determine whether a computational framework is performing efficiently or is scalable. The aim of this paper is to present various strategies for better understandin…

    Submitted 14 September, 2017; v1 submitted 10 May, 2017; originally announced May 2017.

  13. Landau Collision Integral Solver with Adaptive Mesh Refinement on Emerging Architectures

    Authors: M. F. Adams, E. Hirvijoki, M. G. Knepley, J. Brown, T. Isaac, R. Mills

    Abstract: The Landau collision integral is an accurate model for the small-angle dominated Coulomb collisions in fusion plasmas. We investigate a high order accurate, fully conservative, finite element discretization of the nonlinear multi-species Landau integral with adaptive mesh refinement using the PETSc library (www.mcs.anl.gov/petsc). We develop algorithms and techniques to efficiently utilize emergin…

    Submitted 28 February, 2017; v1 submitted 27 February, 2017; originally announced February 2017.

    Journal ref: SIAM Journal on Scientific Computing, 39 (6), 2017

  14. arXiv:1610.09874

    math.NA cs.CG cs.MS

    Anisotropic mesh adaptation in Firedrake with PETSc DMPlex

    Authors: Nicolas Barral, Matthew G. Knepley, Michael Lange, Matthew D. Piggott, Gerard J. Gorman

    Abstract: Despite decades of research in this area, mesh adaptation capabilities are still rarely found in numerical simulation software. We postulate that the primary reason for this is lack of usability. Integrating mesh adaptation into existing software is difficult as non-trivial operators, such as error metrics and interpolation operators, are required, and integrating available adaptive remeshers is n…

    Submitted 31 October, 2016; originally announced October 2016.

    Comments: 5 pages, 2 figures, Proceedings of the 25th International Meshing Roundtable, ed. Steve Owen and Hang Si, 2016

  15. arXiv:1607.04254

    math.NA cs.MS

    Composing Scalable Nonlinear Algebraic Solvers

    Authors: Peter R. Brune, Matthew G. Knepley, Barry F. Smith, Xuemin Tu

    Abstract: Most efficient linear solvers use composable algorithmic components, with the most common model being the combination of a Krylov accelerator and one or more preconditioners. A similar set of concepts may be used for nonlinear algebraic systems, where nonlinear composition of different nonlinear solvers may significantly improve the time to solution. We describe the basic concepts of nonlinear com…

    Submitted 14 July, 2016; originally announced July 2016.

    Comments: 29 pages, 14 figures, 13 tables

    MSC Class: 65F08; 65Y05; 65Y20; 68W10

    Journal ref: SIAM Review 57(4), 535-565, 2015

  16. arXiv:1607.04245

    cs.MS

    Finite Element Integration with Quadrature on the GPU

    Authors: Matthew G. Knepley, Karl Rupp, Andy R. Terrel

    Abstract: We present a novel, quadrature-based finite element integration method for low-order elements on GPUs, using a pattern we call thread transposition to avoid reductions while vectorizing aggressively. On the NVIDIA GTX580, which has a nominal single precision peak flop rate of 1.5 TF/s and a memory bandwidth of 192 GB/s, we achieve close to 300 GF/s for element integration on first-order d…

    Submitted 14 July, 2016; originally announced July 2016.

    Comments: 14 pages, 6 figures

    ACM Class: G.4; G.1.8

  17. arXiv:1604.07163

    cs.MS

    Extreme-scale Multigrid Components within PETSc

    Authors: Dave A. May, Patrick Sanan, Karl Rupp, Matthew G. Knepley, Barry F. Smith

    Abstract: Elliptic partial differential equations (PDEs) frequently arise in continuum descriptions of physical processes relevant to science and engineering. Multilevel preconditioners represent a family of scalable techniques for solving discrete PDEs of this type and thus are the method of choice for high-resolution simulations. The scalability and time-to-solution of massively parallel multilevel precon…

    Submitted 25 April, 2016; originally announced April 2016.

  18. arXiv:1602.04873

    cs.DC cs.PF

    A Stochastic Performance Model for Pipelined Krylov Methods

    Authors: Hannah Morgan, Matthew G. Knepley, Patrick Sanan, L. Ridgway Scott

    Abstract: Pipelined Krylov methods seek to ameliorate the latency due to inner products necessary for projection by overlapping it with the computation associated with sparse matrix-vector multiplication. We clarify a folk theorem that this can only result in a speedup of $2\times$ over the naive implementation. Examining many repeated runs, we show that stochastic noise also contributes to the latency, and…

    Submitted 15 February, 2016; originally announced February 2016.

  19. arXiv:1508.02470

    cs.MS

    Support for Non-conformal Meshes in PETSc's DMPlex Interface

    Authors: Tobin Isaac, Matthew G. Knepley

    Abstract: PETSc's DMPlex interface for unstructured meshes has been extended to support non-conformal meshes. The topological construct that DMPlex implements, the CW-complex, is by definition conformal, so representing non-conformal meshes in a way that hides complexity requires careful attention to the interface between DMPlex and numerical methods such as the finite element method. Our approach, whic…

    Submitted 10 August, 2015; originally announced August 2015.

    Comments: 16 pages, 13 figures, 5 code examples

  20. Efficient mesh management in Firedrake using PETSc-DMPlex

    Authors: Michael Lange, Lawrence Mitchell, Matthew G. Knepley, Gerard J. Gorman

    Abstract: The use of composable abstractions allows the application of new and established algorithms to a wide range of problems while automatically inheriting the benefits of well-known performance optimisations. This work highlights the composition of the PETSc DMPlex domain topology abstraction with the Firedrake automated finite element system to create a PDE solving environment that combines expressiv…

    Submitted 25 June, 2015; originally announced June 2015.

    Comments: 12 pages, 6 figures, submitted to SISC CSE Special Issue

    Journal ref: SIAM Journal on Scientific Computing 38(5):S143-S155 (2016)

  21. arXiv:1506.06194

    cs.MS cs.DC math.NA

    Unstructured Overlapping Mesh Distribution in Parallel

    Authors: Matthew G. Knepley, Michael Lange, Gerard J. Gorman

    Abstract: We present a simple mathematical framework and API for parallel mesh and data distribution, load balancing, and overlap generation. It relies on viewing the mesh as a Hasse diagram, abstracting away information such as cell shape, dimension, and coordinates. The high level of abstraction makes our interface both concise and powerful, as the same algorithm applies to any representable mesh, such as…

    Submitted 19 June, 2015; originally announced June 2015.

    Comments: 14 pages, 6 figures, submitted to TOMS

  22. arXiv:1505.04633

    cs.MS

    Flexible, Scalable Mesh and Data Management using PETSc DMPlex

    Authors: Michael Lange, Matthew G. Knepley, Gerard J. Gorman

    Abstract: Designing a scientific software stack to meet the needs of the next generation of mesh-based simulation demands not only scalable and efficient mesh and data management on a wide range of platforms, but also an abstraction layer that makes it useful for a wide range of application codes. Common utility tasks, such as file I/O, mesh distribution, and work partitioning, should be delegated to exter…

    Submitted 18 May, 2015; originally announced May 2015.

    Comments: 6 pages, 6 figures, to appear in EASC 2015

  23. arXiv:1409.7418

    physics.chem-ph cs.CE physics.comp-ph

    Modeling Charge-Sign Asymmetric Solvation Free Energies With Nonlinear Boundary Conditions

    Authors: Jaydeep P. Bardhan, Matthew G. Knepley

    Abstract: We show that charge-sign-dependent asymmetric hydration can be modeled accurately using linear Poisson theory but replacing the standard electric-displacement boundary condition with a simple nonlinear boundary condition. Using a single multiplicative scaling factor to determine atomic radii from molecular dynamics Lennard-Jones parameters, the new model accurately reproduces MD free-energy calcul…

    Submitted 25 September, 2014; originally announced September 2014.

    Comments: 7 pages, 2 figures, accepted to Journal of Chemical Physics

  24. arXiv:1407.2905

    cs.SE cs.CE cs.MS

    Run-time extensibility and librarization of simulation software

    Authors: Jed Brown, Matthew G. Knepley, Barry F. Smith

    Abstract: Build-time configuration and environment assumptions are hampering progress and usability in scientific software. That which would be utterly unacceptable in non-scientific software somehow passes for the norm in scientific packages. The community needs reusable software packages that are easy to use and flexible enough to accommodate next-generation simulation and analysis demands.

    Submitted 10 July, 2014; originally announced July 2014.

    Comments: 6 pages

  25. arXiv:1309.1204

    cs.MS cs.CE

    Achieving High Performance with Unified Residual Evaluation

    Authors: Matthew G. Knepley, Jed Brown, Karl Rupp, Barry F. Smith

    Abstract: We examine residual evaluation, perhaps the most basic operation in numerical simulation. By raising the level of abstraction in this operation, we can eliminate specialized code, enable optimization, and greatly increase the extensibility of existing code.

    Submitted 6 September, 2013; v1 submitted 4 September, 2013; originally announced September 2013.

    Comments: 4 pages, 1 figure

  26. arXiv:1308.5846

    physics.geo-ph cs.CE cs.MS

    A Domain Decomposition Approach to Implementing Fault Slip in Finite-Element Models of Quasi-static and Dynamic Crustal Deformation

    Authors: Brad T. Aagaard, Matthew G. Knepley, Charles A. Williams

    Abstract: We employ a domain decomposition approach with Lagrange multipliers to implement fault slip in a finite-element code, PyLith, for use in both quasi-static and dynamic crustal deformation applications. This integrated approach to solving both quasi-static and dynamic simulations leverages common finite-element data structures and implementations of various boundary conditions, discretization scheme…

    Submitted 27 August, 2013; originally announced August 2013.

    Comments: 14 pages, 15 figures

    Journal ref: Journal of Geophysical Research, 118(6), pp.3059-3079, 2013

  27. arXiv:1209.1711

    cs.PL cs.CE cs.MS

    Programming Languages for Scientific Computing

    Authors: Matthew G. Knepley

    Abstract: Scientific computation is a discipline that combines numerical analysis, physical understanding, algorithm development, and structured programming. Several yottacycles per year on the world's largest computers are spent simulating problems as diverse as weather prediction, the properties of material composites, the behavior of biomolecules in solution, and the quantum nature of chemical compounds.…

    Submitted 9 January, 2018; v1 submitted 8 September, 2012; originally announced September 2012.

    Comments: 21 pages

    Journal ref: Encyclopedia of Applied and Computational Mathematics, Springer, 2012

  28. arXiv:1208.3866

    physics.chem-ph cs.MS math.NA

    Analytical Nonlocal Electrostatics Using Eigenfunction Expansions of Boundary-Integral Operators

    Authors: Jaydeep P. Bardhan, Matthew G. Knepley, Peter R. Brune

    Abstract: In this paper, we present an analytical solution to nonlocal continuum electrostatics for an arbitrary charge distribution in a spherical solute. Our approach relies on two key steps: (1) re-formulating the PDE problem using boundary-integral equations, and (2) diagonalizing the boundary-integral operators using the fact that their eigenfunctions are the surface spherical harmonics. To introduce this u…

    Submitted 20 August, 2012; v1 submitted 19 August, 2012; originally announced August 2012.

    Comments: 19 pages, 7 figures

  29. arXiv:1204.0267

    cs.CE cs.MS physics.chem-ph physics.comp-ph

    Computational science and re-discovery: open-source implementations of ellipsoidal harmonics for problems in potential theory

    Authors: Jaydeep P. Bardhan, Matthew G. Knepley

    Abstract: We present two open-source (BSD) implementations of ellipsoidal harmonic expansions for solving problems of potential theory using separation of variables. Ellipsoidal harmonics are used surprisingly infrequently, considering their substantial value for problems ranging in scale from molecules to the entire solar system. In this article, we suggest two possible reasons for the paucity relative to…

    Submitted 3 April, 2012; v1 submitted 1 April, 2012; originally announced April 2012.

    Comments: 25 pages, 3 figures

    Journal ref: Computational Science & Discovery, 5:014006, 2012

  30. arXiv:1111.6583

    math.NA cs.DC cs.MS physics.comp-ph

    PyClaw: Accessible, Extensible, Scalable Tools for Wave Propagation Problems

    Authors: David I. Ketcheson, Kyle T. Mandli, Aron Ahmadia, Amal Alghamdi, Manuel Quezada, Matteo Parsani, Matthew G. Knepley, Matthew Emmett

    Abstract: Development of scientific software involves tradeoffs between ease of use, generality, and performance. We describe the design of a general hyperbolic PDE solver that can be operated with the convenience of MATLAB yet achieves efficiency near that of hand-coded Fortran and scales to the largest supercomputers. This is achieved by using Python for most of the code while employing automatically-wrap…

    Submitted 12 May, 2012; v1 submitted 27 November, 2011; originally announced November 2011.

    Journal ref: SISC 34(4):C210-C231 (2012)

  31. arXiv:1109.0651

    cs.CE physics.chem-ph physics.comp-ph

    Mathematical Analysis of the BIBEE Approximation for Molecular Solvation: Exact Results for Spherical Inclusions

    Authors: Jaydeep P. Bardhan, Matthew G. Knepley

    Abstract: We analyze the mathematically rigorous BIBEE (boundary-integral based electrostatics estimation) approximation of the mixed-dielectric continuum model of molecular electrostatics, using the analytically solvable case of a spherical solute containing an arbitrary charge distribution. Our analysis, which builds on Kirkwood's solution using spherical harmonics, clarifies important aspects of the appr…

    Submitted 3 September, 2011; originally announced September 2011.

    Comments: 33 pages, 5 figures

    Journal ref: Journal of Chemical Physics, 135(12):124107-124117, 2011

  32. arXiv:1107.5951

    cs.CE cs.DC physics.geo-ph

    Optimal, scalable forward models for computing gravity anomalies

    Authors: Dave A. May, Matthew G. Knepley

    Abstract: We describe three approaches for computing a gravity signal from a density anomaly. The first approach consists of the classical "summation" technique, whilst the remaining two methods solve the Poisson problem for the gravitational potential using either a Finite Element (FE) discretization employing a multilevel preconditioner, or a Green's function evaluated with the Fast Multipole Method (FMM)…

    Submitted 29 July, 2011; originally announced July 2011.

    Comments: 38 pages, 13 figures; accepted by Geophysical Journal International

    Journal ref: Geophysical Journal International, 187(1):161-177, 2011

  33. arXiv:1104.0261

    math.NA cs.CG

    Unstructured Geometric Multigrid in Two and Three Dimensions on Complex and Graded Meshes

    Authors: Peter R. Brune, Matthew G. Knepley, L. Ridgway Scott

    Abstract: The use of multigrid and related preconditioners with the finite element method is often limited by the difficulty of applying the algorithm effectively to a problem, especially when the domain has a complex shape or adaptive refinement. We introduce a simplification of a general topologically-motivated mesh coarsening algorithm for use in creating hierarchies of meshes for geometric unstructured…

    Submitted 5 April, 2011; v1 submitted 1 April, 2011; originally announced April 2011.

    Comments: 17 pages, 5 figures, 4 tables

    MSC Class: 65N30; 65M50; 65M55

    Journal ref: SIAM Journal on Scientific Computing, 35(1), A173-A191, 2013

  34. Finite Element Integration on GPUs

    Authors: Matthew G. Knepley, Andy R. Terrel

    Abstract: We present a novel finite element integration method for low order elements on GPUs. We achieve more than 100GF for element integration on first order discretizations of both the Laplacian and Elasticity operators.

    Submitted 28 February, 2011; originally announced March 2011.

    Comments: 16 pages, 3 figures

    ACM Class: G.4; G.1.8

    Journal ref: ACM Transactions on Mathematical Software, 39(2), 2013

  35. arXiv:1008.2410

    cs.CE math.NA physics.comp-ph

    Removing the Barrier to Scalability in Parallel FMM

    Authors: Matthew G. Knepley

    Abstract: The Fast Multipole Method (FMM) is well known to possess a bottleneck arising from decreasing workload on higher levels of the FMM tree [Greengard and Gropp, Comp. Math. Appl., 20(7), 1990]. We show that this potential bottleneck can be eliminated by overlapping multipole and local expansion computations with direct kernel evaluations on the finest level grid.

    Submitted 13 August, 2010; originally announced August 2010.

    Comments: 11 pages, 2 figures

  36. arXiv:1007.4591

    cs.CE physics.chem-ph physics.comp-ph

    Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns

    Authors: Rio Yokota, Jaydeep P. Bardhan, Matthew G. Knepley, L. A. Barba, Tsuyoshi Hamada

    Abstract: We present teraflop-scale calculations of biomolecular electrostatics enabled by the combination of algorithmic and hardware acceleration. The algorithmic acceleration is achieved with the fast multipole method (FMM) in conjunction with a boundary element method (BEM) formulation of the continuum electrostatic model, as well as the BIBEE approximation to BEM. The hardware acceleration is achieved…

    Submitted 10 February, 2011; v1 submitted 26 July, 2010; originally announced July 2010.

    Journal ref: Comput. Phys. Commun., 182(6):1271-1283 (2011)

  37. arXiv:0909.5413

    cs.MS cs.DC math.NA

    PetRBF--A parallel O(N) algorithm for radial basis function interpolation

    Authors: Rio Yokota, L. A. Barba, Matthew G. Knepley

    Abstract: We have developed a parallel algorithm for radial basis function (RBF) interpolation that exhibits O(N) complexity, requires O(N) storage, and scales excellently up to a thousand processes. The algorithm uses a GMRES iterative solver with a restricted additive Schwarz method (RASM) as a preconditioner and a fast matrix-vector algorithm. Previous fast RBF methods, achieving at most O(N log N) com…

    Submitted 29 September, 2009; originally announced September 2009.

    Comments: Submitted to Computer Methods in Applied Mechanics and Engineering

    Journal ref: Computer Methods in Applied Mechanics and Engineering, 199(25-28), pp. 1793-1804, 2010

  38. Mesh Algorithms for PDE with Sieve I: Mesh Distribution

    Authors: Matthew G. Knepley, Dmitry A. Karpeev

    Abstract: We have developed a new programming framework, called Sieve, to support parallel numerical PDE algorithms operating over distributed meshes. We have also developed a reference implementation of Sieve in C++ as a library of generic algorithms operating on distributed containers conforming to the Sieve interface. Sieve makes instances of the incidence relation, or arrows, the conceptual fir…

    Submitted 30 August, 2009; originally announced August 2009.

    Comments: 36 pages, 22 figures

    ACM Class: G.1.8; G.4; J.2; E.2

    Journal ref: Scientific Programming, 17(3), 215-230, 2009

  39. arXiv:0905.2637

    cs.DC cs.DS

    PetFMM--A dynamically load-balancing parallel fast multipole library

    Authors: Felipe A. Cruz, Matthew G. Knepley, L. A. Barba

    Abstract: Fast algorithms for the computation of $N$-body problems can be broadly classified into mesh-based interpolation methods, and hierarchical or multiresolution methods. To this last class belongs the well-known fast multipole method (FMM), which offers O(N) complexity. This paper presents an extensible parallel library for $N$-body interactions utilizing the FMM algorithm, built on the framework o…

    Submitted 15 May, 2009; originally announced May 2009.

    Comments: 28 pages, 9 figures

    Journal ref: Int. J. Num. Meth. Eng., 85(4): 403-428 (Jan. 2011)