Showing 1–33 of 33 results for author: Köstler, H

  1. arXiv:2504.06699  [pdf, other]

    cs.LG

    Benchmarking Convolutional Neural Network and Graph Neural Network based Surrogate Models on a Real-World Car External Aerodynamics Dataset

    Authors: Sam Jacob Jacob, Markus Mrosek, Carsten Othmer, Harald Köstler

    Abstract: Aerodynamic optimization is crucial for developing eco-friendly, aerodynamic, and stylish cars, which requires close collaboration between aerodynamicists and stylists, a collaboration impaired by the time-consuming nature of aerodynamic simulations. Surrogate models offer a viable solution to reduce this overhead, but they are untested in real-world aerodynamic datasets. We present a comparative…

    Submitted 9 April, 2025; originally announced April 2025.

  2. arXiv:2502.20049  [pdf, other]

    cs.DC

    Large-Scale Simulations of Fully Resolved Complex Moving Geometries with Partially Saturated Cells

    Authors: P. Suffa, S. Kemmler, H. Koestler, U. Ruede

    Abstract: We employ the Partially Saturated Cells Method (PSM) to model the interaction between the fluid flow and solid moving objects as an extension to the conventional lattice Boltzmann method. We introduce an efficient and accurate method for mapping complex moving geometries onto uniform Cartesian grids suitable for massively parallel processing. A validation of the physical accuracy of the solid-flui…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 13 pages, 16 figures

  3. arXiv:2412.08186  [pdf, other]

    cs.CE cs.AI math.NA

    Towards Automated Algebraic Multigrid Preconditioner Design Using Genetic Programming for Large-Scale Laser Beam Welding Simulations

    Authors: Dinesh Parthasarathy, Tommaso Bevilacqua, Martin Lanser, Axel Klawonn, Harald Köstler

    Abstract: Multigrid methods are asymptotically optimal algorithms ideal for large-scale simulations. But, they require making numerous algorithmic choices that significantly influence their efficiency. Unlike recent approaches that learn optimal multigrid components using machine learning techniques, we adopt a complementary strategy here, employing evolutionary algorithms to construct efficient multigrid c…

    Submitted 11 December, 2024; originally announced December 2024.

    MSC Class: 65M55 (Primary) 74F05; 65M60 (Secondary)

    ACM Class: I.2.2; G.1.8; J.2

  4. arXiv:2412.05852  [pdf, other]

    cs.CE cs.AI math.NA

    Evolving Algebraic Multigrid Methods Using Grammar-Guided Genetic Programming

    Authors: Dinesh Parthasarathy, Wayne Bradford Mitchell, Harald Köstler

    Abstract: Multigrid methods despite being known to be asymptotically optimal algorithms, depend on the careful selection of their individual components for efficiency. Also, they are mostly restricted to standard cycle types like V-, F-, and W-cycles. We use grammar rules to generate arbitrary-shaped cycles, wherein the smoothers and their relaxation weights are chosen independently at each step within the…

    Submitted 8 December, 2024; originally announced December 2024.

  5. arXiv:2409.07203  [pdf]

    physics.med-ph

    Phantom-based gradient waveform measurements with compensated variable-prephasing: Description and application to EPI at 7T

    Authors: Hannah Scholten, Tobias Wech, Istvan Homolya, Herbert Köstler

    Abstract: Purpose: Introducing "compensated variable-prephasing" (CVP), a phantom-based method for gradient waveform measurements. The technique is based on the "variable-prephasing" (VP) method, but takes into account the effects of all gradients involved in the measurement. Methods: We conducted measurements of a trapezoidal test gradient, and of an EPI readout gradient train with three approaches: VP,…

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: 24 pages, 5 figures

  6. arXiv:2408.06880  [pdf, other]

    cs.DC cs.PF

    Architecture Specific Generation of Large Scale Lattice Boltzmann Methods for Sparse Complex Geometries

    Authors: Philipp Suffa, Markus Holzer, Harald Köstler, Ulrich Rüde

    Abstract: We implement and analyse a sparse / indirect-addressing data structure for the Lattice Boltzmann Method to support efficient compute kernels for fluid dynamics problems with a high number of non-fluid nodes in the domain, such as in porous media flows. The data structure is integrated into a code generation pipeline to enable sparse Lattice Boltzmann Methods with a variety of stencils and collisio…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 16 pages, 19 figures

  7. arXiv:2404.08371  [pdf, other]

    cs.CE

    Code Generation and Performance Engineering for Matrix-Free Finite Element Methods on Hybrid Tetrahedral Grids

    Authors: Fabian Böhm, Daniel Bauer, Nils Kohl, Christie Alappat, Dominik Thönnes, Marcus Mohr, Harald Köstler, Ulrich Rüde

    Abstract: This paper introduces a code generator designed for node-level optimized, extreme-scalable, matrix-free finite element operators on hybrid tetrahedral grids. It optimizes the local evaluation of bilinear forms through various techniques including tabulation, relocation of loop invariants, and inter-element vectorization - implemented as transformations of an abstract syntax tree. A key contributio…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 22 pages

    MSC Class: 65F50; 65N30; 65N55; 65Y20; 65F10

  8. arXiv:2403.08063  [pdf, other]

    cs.CE

    Towards Code Generation for Octree-Based Multigrid Solvers

    Authors: Richard Angersbach, Sebastian Kuckuck, Harald Köstler

    Abstract: This paper presents a novel method designed to generate multigrid solvers optimized for octree-based software frameworks. Our approach focuses on accurately capturing local features within a domain while leveraging the efficiency inherent in multigrid techniques. We outline the essential steps involved in generating specialized kernels for local refinement and communication routines, integrating o…

    Submitted 6 May, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  9. A Continuous Benchmarking Infrastructure for High-Performance Computing Applications

    Authors: Christoph Alt, Martin Lanser, Jonas Plewinski, Atin Janki, Axel Klawonn, Harald Köstler, Michael Selzer, Ulrich Rüde

    Abstract: For scientific software, especially those used for large-scale simulations, achieving good performance and efficiently using the available hardware resources is essential. It is important to regularly perform benchmarks to ensure the efficient use of hardware and software when systems are changing and the software evolves. However, this can become quickly very tedious when many options for paramet…

    Submitted 3 March, 2024; originally announced March 2024.

    Journal ref: International Journal of Parallel, Emergent & Distributed Systems, 2024

  10. arXiv:2402.13171  [pdf, other]

    cs.CE cs.DC physics.flu-dyn

    waLBerla-wind: a lattice-Boltzmann-based high-performance flow solver for wind energy applications

    Authors: Helen Schottenhamml, Ani Anciaux-Sedrakian, Frédéric Blondel, Harald Köstler, Ulrich Rüde

    Abstract: This article presents the development of a new wind turbine simulation software to study wake flow physics. To this end, the design and development of waLBerla-wind, a new simulator based on the lattice-Boltzmann method that is known for its excellent performance and scaling properties, will be presented. Here it will be used for large eddy simulations (LES) coupled with actuator wind turbine mode…

    Submitted 8 December, 2023; originally announced February 2024.

    Journal ref: Concurrency Computat Pract Exper. 2024;e8117

  11. arXiv:2311.11348  [pdf, other]

    cs.MS cs.DC

    p-adaptive discontinuous Galerkin method for the shallow water equations on heterogeneous computing architectures

    Authors: Sara Faghih-Naini, Vadym Aizinger, Sebastian Kuckuk, Richard Angersbach, Harald Köstler

    Abstract: Heterogeneous computing and exploiting integrated CPU-GPU architectures has become a clear current trend since the flattening of Moore's Law. In this work, we propose a numerical and algorithmic re-design of a p-adaptive quadrature-free discontinuous Galerkin method (DG) for the shallow water equations (SWE). Our new approach separates the computations of the non-adaptive (lower-order) and adaptiv…

    Submitted 19 November, 2023; originally announced November 2023.

  12. arXiv:2307.01594  [pdf]

    physics.med-ph

    A pre-emphasis based on the gradient system transfer function reduces steady-state disruptions in bSSFP imaging caused by residual gradients

    Authors: Hannah Scholten, Herbert Köstler, Anne Slawig

    Abstract: Purpose: To examine whether an advanced gradient pre-emphasis approach based on the gradient system transfer function (GSTF) can mitigate artifacts caused by residual unbalanced gradients in Cartesian balanced steady-state free precession (bSSFP) imaging with non-linear line-ordering. Theory and Methods: We implemented a gradient pre-emphasis based on the GSTF for bSSFP sequences with linear, ce…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 20 pages with 6 figures

  13. arXiv:2306.10080  [pdf, ps, other]

    cs.CE cs.GT cs.LG

    AI Driven Near Real-time Locational Marginal Pricing Method: A Feasibility and Robustness Study

    Authors: Naga Venkata Sai Jitin Jami, Juraj Kardoš, Olaf Schenk, Harald Köstler

    Abstract: Accurate price predictions are essential for market participants in order to optimize their operational schedules and bidding strategies, especially in the current context where electricity prices become more volatile and less predictable using classical approaches. The Locational Marginal Pricing (LMP) pricing mechanism is used in many modern power markets, where the traditional approach utilizes…

    Submitted 2 October, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

  14. arXiv:2303.11811  [pdf, other]

    cs.CE

    Efficiency and scalability of fully-resolved fluid-particle simulations on heterogeneous CPU-GPU architectures

    Authors: Samuel Kemmler, Christoph Rettinger, Ulrich Rüde, Pablo Cuéllar, Harald Köstler

    Abstract: Current supercomputers often have a heterogeneous architecture using both CPUs and GPUs. At the same time, numerical simulation tasks frequently involve multiphysics scenarios whose components run on different hardware due to multiple reasons, e.g., architectural requirements, pragmatism, etc. This leads naturally to a software design where different simulation modules are mapped to different subs…

    Submitted 9 December, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

  15. arXiv:2302.14660  [pdf, other]

    physics.chem-ph cs.PF physics.comp-ph

    MD-Bench: Engineering the in-core performance of short-range molecular dynamics kernels from state-of-the-art simulation packages

    Authors: Rafael Ravedutti Lucio Machado, Jan Eitzinger, Jan Laukemann, Georg Hager, Harald Köstler, Gerhard Wellein

    Abstract: Molecular dynamics (MD) simulations provide considerable benefits for the investigation and experimentation of systems at atomic level. Their usage is widespread into several research fields, but their system size and timescale are also crucially limited by the computing power they can make use of. Performance engineering of MD kernels is therefore important to understand their bottlenecks and poi…

    Submitted 22 February, 2023; originally announced February 2023.

    Comments: 17 pages, 10 figures, 5 tables. arXiv admin note: text overlap with arXiv:2207.13094

  16. arXiv:2301.10674  [pdf, other]

    physics.flu-dyn physics.comp-ph physics.geo-ph

    Particle-resolved simulation of antidunes in free-surface flows

    Authors: Christoph Schwarzmeier, Christoph Rettinger, Samuel Kemmler, Jonas Plewinski, Francisco Núñez-González, Harald Köstler, Ulrich Rüde, Bernhard Vowinckel

    Abstract: The interaction of supercritical turbulent flows with granular sediment beds is challenging to study both experimentally and numerically; this challenging task has hampered the advances in understanding antidunes, the most characteristic bedform of supercritical flows. This article presents the first numerical attempt to simulate upstream-migrating antidunes with geometrically resolved particles a…

    Submitted 23 March, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Journal ref: Journal of Fluid Mechanics 961 (2023)

  17. arXiv:2207.13094  [pdf, other]

    physics.comp-ph cs.PF

    MD-Bench: A generic proxy-app toolbox for state-of-the-art molecular dynamics algorithms

    Authors: Rafael Ravedutti Lucio Machado, Jan Eitzinger, Harald Köstler, Gerhard Wellein

    Abstract: Proxy-apps, or mini-apps, are simple self-contained benchmark codes with performance-relevant kernels extracted from real applications. Initially used to facilitate software-hardware co-design, they are a crucial ingredient for serious performance engineering, especially when dealing with large-scale production codes. MD-Bench is a new proxy-app in the area of classical short-range molecular dynam…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: 12 Pages, 2 figures, submitted to PPAM22

  18. arXiv:2204.12846  [pdf, other]

    math.NA cs.AI cs.MS cs.NE

    Evolving Generalizable Multigrid-Based Helmholtz Preconditioners with Grammar-Guided Genetic Programming

    Authors: Jonas Schmitt, Harald Köstler

    Abstract: Solving the indefinite Helmholtz equation is not only crucial for the understanding of many physical phenomena but also represents an outstandingly-difficult benchmark problem for the successful application of numerical methods. Here we introduce a new approach for evolving efficient preconditioned iterative solvers for Helmholtz problems with multi-objective grammar-guided genetic programming. Ou…

    Submitted 28 April, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

    Journal ref: Proceedings of the 2022 Genetic and Evolutionary Computation Conference (Boston, USA) (GECCO '22)

  19. Deep Learning for Real-Time Aerodynamic Evaluations of Arbitrary Vehicle Shapes

    Authors: Sam Jacob Jacob, Markus Mrosek, Carsten Othmer, Harald Köstler

    Abstract: The aerodynamic optimization process of cars requires multiple iterations between aerodynamicists and stylists. Response Surface Modeling and Reduced-Order Modeling are commonly used to eliminate the overhead due to Computational Fluid Dynamics, leading to faster iterations. However, a primary drawback of these models is that they can work only on the parametrized geometric features they were trai…

    Submitted 12 August, 2021; originally announced August 2021.

  20. arXiv:2108.04543  [pdf, other]

    cs.LG cs.CV physics.med-ph

    Known Operator Learning and Hybrid Machine Learning in Medical Imaging -- A Review of the Past, the Present, and the Future

    Authors: Andreas Maier, Harald Köstler, Marco Heisig, Patrick Krauss, Seung Hee Yang

    Abstract: In this article, we perform a review of the state-of-the-art of hybrid machine learning in medical imaging. We start with a short summary of the general developments of the past in machine learning and how general and specialized approaches have been in competition in the past decades. A particular focus will be the theoretical and experimental evidence pro and contra hybrid modelling. Next, we in…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: 22 pages, 4 figures, submitted to "Progress in Biomedical Engineering"

    Journal ref: Prog. Biomed. Eng. 4 022002 (2022)

  21. arXiv:2009.07400  [pdf, other]

    cs.PF cs.DC cs.PL physics.comp-ph

    tinyMD: A Portable and Scalable Implementation for Pairwise Interactions Simulations

    Authors: Rafael Ravedutti L. Machado, Jonas Schmitt, Sebastian Eibl, Jan Eitzinger, Roland Leißa, Sebastian Hack, Arsène Pérard-Gayot, Richard Membarth, Harald Köstler

    Abstract: This paper investigates the suitability of the AnyDSL partial evaluation framework to implement tinyMD: an efficient, scalable, and portable simulation of pairwise interactions among particles. We compare tinyMD with the miniMD proxy application that scales very well on parallel supercomputers. We discuss the differences between both implementations and contrast miniMD's performance for single-nod…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: 35 pages, 8 figures, submitted to Journal of Computational Science

    MSC Class: B.8.2; D.1.3; D.3.3; J.2

  22. arXiv:2006.09127  [pdf, other]

    cs.ET quant-ph

    Quantum simulation and circuit design for solving multidimensional Poisson equations

    Authors: Michael Holzmann, Harald Koestler

    Abstract: Many methods solve Poisson equations by using grid techniques which discretize the problem in each dimension. Most of these algorithms are subject to the curse of dimensionality, so that they need exponential runtime. In the paper "Quantum algorithm and circuit design solving the Poisson equation" a quantum algorithm is shown running in polylog time to produce a quantum state representing the solu…

    Submitted 16 June, 2020; originally announced June 2020.

  23. arXiv:2001.11806  [pdf, other]

    cs.MS cs.CE cs.DC

    lbmpy: Automatic code generation for efficient parallel lattice Boltzmann methods

    Authors: Martin Bauer, Harald Köstler, Ulrich Rüde

    Abstract: Lattice Boltzmann methods are a popular mesoscopic alternative to macroscopic computational fluid dynamics solvers. Many variants have been developed that vary in complexity, accuracy, and computational cost. Extensions are available to simulate multi-phase, multi-component, turbulent, or non-Newtonian flows. In this work we present lbmpy, a code generation package that supports a wide variety of…

    Submitted 11 April, 2020; v1 submitted 31 January, 2020; originally announced January 2020.

  24. arXiv:1910.02749  [pdf, other]

    math.NA cs.MS cs.NE

    Optimizing Geometric Multigrid Methods with Evolutionary Computation

    Authors: Jonas Schmitt, Sebastian Kuckuk, Harald Köstler

    Abstract: For many linear and nonlinear systems that arise from the discretization of partial differential equations the construction of an efficient multigrid solver is a challenging task. Here we present a novel approach for the optimization of geometric multigrid methods that is based on evolutionary computation, a generic program optimization technique inspired by the principle of natural evolution. A m…

    Submitted 8 October, 2019; v1 submitted 7 October, 2019; originally announced October 2019.

  25. arXiv:1909.13772  [pdf, other]

    cs.DC cs.CE physics.comp-ph

    waLBerla: A block-structured high-performance framework for multiphysics simulations

    Authors: Martin Bauer, Sebastian Eibl, Christian Godenschwager, Nils Kohl, Michael Kuron, Christoph Rettinger, Florian Schornbaum, Christoph Schwarzmeier, Dominik Thönnes, Harald Köstler, Ulrich Rüde

    Abstract: Programming current supercomputers efficiently is a challenging task. Multiple levels of parallelism on the core, on the compute node, and between nodes need to be exploited to make full use of the system. Heterogeneous hardware architectures with accelerators further complicate the development process. waLBerla addresses these challenges by providing the user with highly efficient building blocks…

    Submitted 30 September, 2019; originally announced September 2019.

  26. arXiv:1904.08684  [pdf, other]

    cs.MS cs.CE

    Towards whole program generation of quadrature-free discontinuous Galerkin methods for the shallow water equations

    Authors: Sara Faghih-Naini, Sebastian Kuckuk, Vadym Aizinger, Daniel Zint, Roberto Grosso, Harald Köstler

    Abstract: The shallow water equations (SWE) are a commonly used model to study tsunamis, tides, and coastal ocean circulation. However, there exist various approaches to discretize and solve them efficiently. Which of them is best for a certain scenario is often not known and, in addition, depends heavily on the used HPC platform. From a simulation software perspective, this places a premium on the ability…

    Submitted 18 April, 2019; originally announced April 2019.

  27. Lattice Boltzmann Benchmark Kernels as a Testbed for Performance Analysis

    Authors: Markus Wittmann, Viktor Haag, Thomas Zeiser, Harald Köstler, Gerhard Wellein

    Abstract: Lattice Boltzmann methods (LBM) are an important part of current computational fluid dynamics (CFD). They allow easy implementations and boundary handling. However, competitive time to solution not only depends on the choice of a reasonable method, but also on an efficient implementation on modern hardware. Hence, performance optimization has a long history in the lattice Boltzmann community. A va…

    Submitted 30 November, 2017; originally announced November 2017.

    Comments: preprint, submitted to Computer & Fluids Special Issue DSFD2017

    Journal ref: Computers & Fluids, 2018

  28. arXiv:1708.08286  [pdf, other]

    cs.DC

    A Scalable and Extensible Checkpointing Scheme for Massively Parallel Simulations

    Authors: Nils Kohl, Johannes Hötzer, Florian Schornbaum, Martin Bauer, Christian Godenschwager, Harald Köstler, Britta Nestler, Ulrich Rüde

    Abstract: Realistic simulations in engineering or in the materials sciences can consume enormous computing resources and thus require the use of massively parallel supercomputers. The probability of a failure increases both with the runtime and with the number of system components. For future exascale systems it is therefore considered critical that strategies are developed to make software resilient agains…

    Submitted 29 January, 2018; v1 submitted 28 August, 2017; originally announced August 2017.

  29. A Python Extension for the Massively Parallel Multiphysics Simulation Framework waLBerla

    Authors: Martin Bauer, Florian Schornbaum, Christian Godenschwager, Matthias Markl, Daniela Anderl, Harald Köstler, Ulrich Rüde

    Abstract: We present a Python extension to the massively parallel HPC simulation toolkit waLBerla. waLBerla is a framework for stencil based algorithms operating on block-structured grids, with the main application field being fluid simulations in complex geometries using the lattice Boltzmann method. Careful performance engineering results in excellent node performance and good scalability to over 400,000…

    Submitted 23 November, 2015; originally announced November 2015.

  30. arXiv:1506.01684  [pdf, other]

    cs.DC physics.comp-ph

    Massively Parallel Phase-Field Simulations for Ternary Eutectic Directional Solidification

    Authors: Martin Bauer, Johannes Hötzer, Philipp Steinmetz, Marcus Jainta, Marco Berghoff, Florian Schornbaum, Christian Godenschwager, Harald Köstler, Britta Nestler, Ulrich Rüde

    Abstract: Microstructures forming during ternary eutectic directional solidification processes have significant influence on the macroscopic mechanical properties of metal alloys. For a realistic simulation, we use the well established thermodynamically consistent phase-field method and improve it with a new grand potential formulation to couple the concentration evolution. This extension is very compute in…

    Submitted 4 June, 2015; originally announced June 2015.

    Comments: submitted to Supercomputing 2015

  31. arXiv:1406.5369  [pdf, other]

    cs.MS

    A Scala Prototype to Generate Multigrid Solver Implementations for Different Problems and Target Multi-Core Platforms

    Authors: Harald Koestler, Christian Schmitt, Sebastian Kuckuk, Frank Hannig, Juergen Teich, Ulrich Ruede

    Abstract: Many problems in computational science and engineering involve partial differential equations and thus require the numerical solution of large, sparse (non)linear systems of equations. Multigrid is known to be one of the most efficient methods for this purpose. However, the concrete multigrid algorithm and its implementation highly depend on the underlying problem and hardware. Therefore, changes…

    Submitted 20 June, 2014; originally announced June 2014.

  32. arXiv:1112.0850  [pdf, ps, other]

    cs.PF

    Performance engineering for the Lattice Boltzmann method on GPGPUs: Architectural requirements and performance results

    Authors: Johannes Habich, Christian Feichtinger, Harald Köstler, Georg Hager, Gerhard Wellein

    Abstract: GPUs offer several times the floating point performance and memory bandwidth of current standard two socket CPU servers, e.g. NVIDIA C2070 vs. Intel Xeon Westmere X5650. The lattice Boltzmann method has been established as a flow solver in recent years and was one of the first flow solvers to be successfully ported and that performs well on GPUs. We demonstrate advanced optimization strategies for…

    Submitted 5 December, 2011; originally announced December 2011.

    Comments: 10 pages, 7 figures, 4 tables, preprint submitted to Computers and Fluids journal

  33. A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters

    Authors: Christian Feichtinger, Johannes Habich, Harald Koestler, Georg Hager, Ulrich Ruede, Gerhard Wellein

    Abstract: Sustaining a large fraction of single GPU performance in parallel computations is considered to be the major problem of GPU-based clusters. In this article, this topic is addressed in the context of a lattice Boltzmann flow solver that is integrated in the WaLBerla software framework. We propose a multi-GPU implementation using a block-structured MPI parallelization, suitable for load balancing an…

    Submitted 8 July, 2010; originally announced July 2010.

    Comments: 20 pages, 12 figures

    Journal ref: Parallel Computing 37(9), 536-549 (2011)