-
Madgraph on GPUs and vector CPUs: towards production (The 5-year journey to the first LO release CUDACPP v1.00.00)
Authors:
Andrea Valassi,
Taylor Childers,
Stephan Hageböck,
Daniele Massaro,
Olivier Mattelaer,
Nathan Nichols,
Filip Optolowicz,
Stefan Roiser,
Jørgen Teig,
Zenny Wettersten
Abstract:
The effort to speed up the Madgraph5_aMC@NLO generator by exploiting CPU vectorization and GPUs, which started at the beginning of 2020, delivered the first production release of the code for leading-order (LO) processes in October 2024. To achieve this goal, many new features, tests and fixes have been implemented in recent months. This process also benefitted from the early feedback of the CMS experiment. In this contribution, we report on these activities and on the status of the LO software at the time of CHEP2024.
Submitted 27 March, 2025;
originally announced March 2025.
-
Madgraph5_aMC@NLO on GPUs and vector CPUs Experience with the first alpha release
Authors:
Stephan Hageboeck,
Taylor Childers,
Walter Hopkins,
Olivier Mattelaer,
Nathan Nichols,
Stefan Roiser,
Jørgen Teig,
Andrea Valassi,
Carl Vuosalo,
Zenny Wettersten
Abstract:
Madgraph5_aMC@NLO is one of the most frequently used Monte Carlo event generators at the LHC, and an important consumer of compute resources. The software has been reengineered to maintain the overall look and feel of the user interface while speeding up event generation on CPUs and GPUs. The most computationally intensive part, the calculation of "matrix elements", is offloaded to new implementations optimised for GPUs and for CPU vector instructions, using event-level data parallelism. We present the work to support accelerated leading-order QCD processes, and discuss how this work is going to be released to Madgraph5_aMC@NLO's users.
Submitted 5 December, 2023;
originally announced December 2023.
-
Speeding up Madgraph5 aMC@NLO through CPU vectorization and GPU offloading: towards a first alpha release
Authors:
Andrea Valassi,
Taylor Childers,
Laurence Field,
Stephan Hageböck,
Walter Hopkins,
Olivier Mattelaer,
Nathan Nichols,
Stefan Roiser,
David Smith,
Jorgen Teig,
Carl Vuosalo,
Zenny Wettersten
Abstract:
The matrix element (ME) calculation in any Monte Carlo physics event generator is an ideal fit for implementing data parallelism with lockstep processing on GPUs and vector CPUs. For complex physics processes where the ME calculation is the computational bottleneck of event generation workflows, this can lead to large overall speedups by efficiently exploiting these hardware architectures, which are now largely underutilized in HEP. In this paper, we present the status of our work on the reengineering of the Madgraph5_aMC@NLO event generator at the time of the ACAT2022 conference. The progress achieved since our previous publication in the ICHEP2022 proceedings is discussed, for our implementations of the ME calculations in vectorized C++, in CUDA and in the SYCL framework, as well as in their integration into the existing MadEvent framework. The outlook towards a first alpha release of the software supporting QCD LO processes usable by the LHC experiments is also discussed.
Submitted 9 December, 2023; v1 submitted 31 March, 2023;
originally announced March 2023.
-
Developments in Performance and Portability for MadGraph5_aMC@NLO
Authors:
Andrea Valassi,
Taylor Childers,
Laurence Field,
Stefan Hageböck,
Walter Hopkins,
Olivier Mattelaer,
Nathan Nichols,
Stefan Roiser,
David Smith
Abstract:
Event generators simulate particle interactions using Monte Carlo techniques, providing the primary connection between experiment and theory in experimental high energy physics. These software packages, which are the first step in the simulation workflow of collider experiments, represent approximately 5 to 20% of the annual WLCG usage for the ATLAS and CMS experiments. With computing architectures becoming more heterogeneous, it is important to ensure that these key software frameworks can be run on future systems, large and small. In this contribution, recent progress on porting and speeding up the Madgraph5_aMC@NLO event generator on hybrid architectures, i.e. CPU with GPU accelerators, is discussed. The main focus of this work has been on the calculation of scattering amplitudes and "matrix elements", which is the computational bottleneck of an event generation application. For physics processes limited to QCD leading order, the code generation toolkit has been expanded to produce matrix element calculations using C++ vector instructions on CPUs and using CUDA for NVIDIA GPUs, as well as using Alpaka, Kokkos and SYCL for multiple CPU and GPU architectures. Performance is reported in terms of matrix element calculations per unit time on NVIDIA, Intel, and AMD devices. The status and outlook for the integration of this work into a production release usable by the LHC experiments, with the same functionalities and very similar user interfaces as the current Fortran version, are also described.
Submitted 20 October, 2022;
originally announced October 2022.
-
Design and engineering of a simplified workflow execution for the MG5aMC event generator on GPUs and vector CPUs
Authors:
Andrea Valassi,
Stefan Roiser,
Olivier Mattelaer,
Stephan Hageboeck
Abstract:
Physics event generators are essential components of the data analysis software chain of high energy physics experiments, and important consumers of their CPU resources. Improving the software performance of these packages on modern hardware architectures, such as those deployed at HPC centers, is essential in view of the upcoming HL-LHC physics programme. In this paper, we describe an ongoing activity to reengineer the Madgraph5_aMC@NLO physics event generator, primarily to port it and allow its efficient execution on GPUs, but also to modernize it and optimize its performance on vector CPUs. We describe the motivation, engineering process and software architecture design of our developments, as well as the current challenges and future directions for this project. This paper is based on our submission to vCHEP2021 in March 2021, complemented with a few preliminary results that we presented during the conference. Further details and updated results will be given in later publications.
Submitted 13 July, 2021; v1 submitted 23 June, 2021;
originally announced June 2021.