Quantum-centric Supercomputing for Physics Research

Authors: Vincent R. Pascuzzi, Antonio Córcoles

Abstract: This document summarizes the presentation on Quantum-centric Supercomputing given at the 22nd International Workshop on Advanced Computing and Analysis Techniques in Physics Research, hosted at Stony Brook University.

Submitted 21 August, 2024; originally announced August 2024.

Comments: Accepted to IOPscience Journal of Physics Conference Series

arXiv:2408.06469

Design and architecture of the IBM Quantum Engine Compiler

Authors: Michael B. Healy, Reza Jokar, Soolu Thomas, Vincent R. Pascuzzi, Kit Barton, Thomas A. Alexander, Roy Elkabetz, Brian C. Donovan, Hiroshi Horii, Marius Hillenbrand

Abstract: In this work, we describe the design and architecture of the open-source Quantum Engine Compiler (qe-compiler) currently used in production for IBM Quantum systems. The qe-compiler is built using LLVM's Multi-Level Intermediate Representation (MLIR) framework and includes definitions for several dialects to represent parameterized quantum computation at multiple levels of abstraction. The compiler also provides Python bindings and a diagnostic system. An open-source LALR lexer and parser built using Bison and Flex generates an Abstract Syntax Tree that is translated to a high-level MLIR dialect. An extensible hierarchical target system for modeling the heterogeneous nature of control systems at compilation time is included. Target-based and generic compilation passes are added using a pipeline interface to translate the input down to low-level intermediate representations (including LLVM IR) and can take advantage of LLVM backends and tooling to generate machine executable binaries. The qe-compiler is built to be extensible, maintainable, performant, and scalable to support the future of quantum computing.

Submitted 12 August, 2024; originally announced August 2024.

Comments: To be published in the proceedings of the IEEE International Conference on Quantum Computing and Engineering 2024 (QCE24)

arXiv:2407.20212

Distributed Quantum Approximate Optimization Algorithm on a Quantum-Centric Supercomputing Architecture

Authors: Seongmin Kim, Vincent R. Pascuzzi, Zhihao Xu, Tengfei Luo, Eungkyu Lee, In-Saeng Suh

Abstract: The quantum approximate optimization algorithm (QAOA) has shown promise in solving combinatorial optimization problems by providing quantum speedup on near-term gate-based quantum computing systems. However, QAOA faces challenges for high-dimensional problems due to the large number of qubits required and the complexity of deep circuits, limiting its scalability for real-world applications. In this study, we present a distributed QAOA (DQAOA), which leverages distributed computing strategies to decompose a large computational workload into smaller tasks that require fewer qubits and shallower circuits than are needed to solve the original problem. These sub-problems are processed using a combination of high-performance and quantum computing resources. The global solution is iteratively updated by aggregating sub-solutions, allowing convergence toward the optimal solution. We demonstrate that DQAOA can handle considerably large-scale optimization problems (e.g., a 1,000-bit problem), achieving a high approximation ratio ($\sim$99%) and short time-to-solution ($\sim$276 s), outperforming existing strategies. Furthermore, we realize DQAOA on a quantum-centric supercomputing architecture, paving the way for practical applications of gate-based quantum computers in real-world optimization tasks. To extend DQAOA's applicability to materials science, we further develop an active learning algorithm integrated with our DQAOA (AL-DQAOA), which involves machine learning, DQAOA, and active data production in an iterative loop. We successfully optimize photonic structures using AL-DQAOA, indicating that solving real-world optimization problems using gate-based quantum computing is feasible. We expect the proposed DQAOA to be applicable to a wide range of optimization problems and AL-DQAOA to find broader applications in material design.

Submitted 21 March, 2025; v1 submitted 29 July, 2024; originally announced July 2024.

arXiv:2312.09733

doi 10.1016/j.future.2024.04.060

Quantum-centric Supercomputing for Materials Science: A Perspective on Challenges and Future Directions

Authors: Yuri Alexeev, Maximilian Amsler, Paul Baity, Marco Antonio Barroca, Sanzio Bassini, Torey Battelle, Daan Camps, David Casanova, Young Jai Choi, Frederic T. Chong, Charles Chung, Chris Codella, Antonio D. Corcoles, James Cruise, Alberto Di Meglio, Jonathan Dubois, Ivan Duran, Thomas Eckl, Sophia Economou, Stephan Eidenbenz, Bruce Elmegreen, Clyde Fare, Ismael Faro, Cristina Sanz Fernández, Rodrigo Neumann Barros Ferreira, et al. (102 additional authors not shown)

Abstract: Computational models are an essential tool for the design, characterization, and discovery of novel materials. Hard computational tasks in materials science stretch the limits of existing high-performance supercomputing centers, consuming much of their simulation, analysis, and data resources. Quantum computing, on the other hand, is an emerging technology with the potential to accelerate many of the computational tasks needed for materials science. In order to do that, the quantum technology must interact with conventional high-performance computing in several ways: approximate results validation, identification of hard problems, and synergies in quantum-centric supercomputing. In this paper, we provide a perspective on how quantum-centric supercomputing can help address critical computational problems in materials science, the challenges to face in order to solve representative use cases, and new suggested directions.

Submitted 19 September, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 65 pages, 15 figures; comments welcome

Journal ref: Future Generation Computer Systems, Volume 160, November 2024, Pages 666-710

arXiv:2312.02272

doi 10.22331/q-2025-02-19-1638

Fermionic wave packet scattering: a quantum computing approach

Authors: Yahui Chai, Arianna Crippa, Karl Jansen, Stefan Kühn, Vincent R. Pascuzzi, Francesco Tacchino, Ivano Tavernelli

Abstract: Quantum computing provides a novel avenue towards simulating dynamical phenomena, and, in particular, scattering processes relevant for exploring the structure of matter. However, preparing and evolving particle wave packets on a quantum device is a nontrivial task. In this work, we propose a method to prepare Gaussian wave packets with momentum on top of the interacting ground state of a fermionic Hamiltonian. Using Givens rotation, we show how to efficiently obtain expectation values of observables throughout the evolution of the wave packets on digital quantum computers. We demonstrate our technique by applying it to the staggered lattice formulation of the Thirring model and studying the scattering of two wave packets. Monitoring the particle density and the entropy produced during the scattering process, we characterize the phenomenon and provide a first step towards studying more complicated collision processes on digital quantum computers. In addition, we perform a small-scale demonstration on IBM's quantum hardware, showing that our method is suitable for current and near-term quantum devices.

Submitted 16 January, 2025; v1 submitted 4 December, 2023; originally announced December 2023.

Journal ref: Quantum 9, 1638 (2025)

arXiv:2306.15869

Evaluating Portable Parallelization Strategies for Heterogeneous Architectures in High Energy Physics

Authors: Mohammad Atif, Meghna Bhattacharya, Paolo Calafiura, Taylor Childers, Mark Dewing, Zhihua Dong, Oliver Gutsche, Salman Habib, Kyle Knoepfel, Matti Kortelainen, Ka Hei Martin Kwok, Charles Leggett, Meifeng Lin, Vincent Pascuzzi, Alexei Strelchenko, Vakhtang Tsulaia, Brett Viren, Tianle Wang, Beomki Yeo, Haiwang Yu

Abstract: High-energy physics (HEP) experiments have developed millions of lines of code over decades that are optimized to run on traditional x86 CPU systems. However, we are seeing a rapidly increasing fraction of floating point computing power in leadership-class computing facilities and traditional data centers coming from new accelerator architectures, such as GPUs. HEP experiments are now faced with the untenable prospect of rewriting millions of lines of x86 CPU code for the increasingly dominant architectures found in these computational accelerators. This task is made more challenging by the architecture-specific languages and APIs promoted by manufacturers such as NVIDIA, Intel and AMD. Producing multiple, architecture-specific implementations is not a viable scenario, given the available person power and code maintenance issues. The Portable Parallelization Strategies team of the HEP Center for Computational Excellence is investigating the use of Kokkos, SYCL, OpenMP, std::execution::parallel and alpaka as potential portability solutions that promise to execute on multiple architectures from the same source code, using representative use cases from major HEP experiments, including the DUNE experiment of the Long Baseline Neutrino Facility, and the ATLAS and CMS experiments of the Large Hadron Collider. This cross-cutting evaluation of portability solutions using real applications will help inform and guide the HEP community when choosing their software and hardware suites for the next generation of experimental frameworks. We present the outcomes of our studies, including performance metrics, porting challenges, API evaluations, and build system integration.

Submitted 27 June, 2023; originally announced June 2023.

Comments: 18 pages, 9 Figures, 2 Tables

arXiv:2208.11745

AI-coupled HPC Workflows

Authors: Shantenu Jha, Vincent R. Pascuzzi, Matteo Turilli

Abstract: Increasingly, scientific discovery requires sophisticated and scalable workflows. Workflows have become the "new applications," wherein multi-scale computing campaigns comprise multiple and heterogeneous executable tasks. In particular, the introduction of AI/ML models into the traditional HPC workflows has been an enabler of highly accurate modeling, typically reducing computational needs compared to traditional methods. This chapter discusses various modes of integrating AI/ML models with HPC computations, resulting in diverse types of AI-coupled HPC workflows. The increasing need for coupling AI/ML and HPC across scientific domains is motivated, and then exemplified by a number of production-grade use cases for each mode. We additionally discuss the primary challenges of extreme-scale AI-coupled HPC campaigns (task heterogeneity, adaptivity, performance) and several framework and middleware solutions which aim to address them. While both HPC workflow and AI/ML computing paradigms are independently effective, we highlight how their integration, and ultimate convergence, is leading to significant improvements in scientific performance across a range of domains, ultimately resulting in scientific explorations otherwise unattainable.

Submitted 24 August, 2022; originally announced August 2022.

arXiv:2208.11069

Asynchronous Execution of Heterogeneous Tasks in ML-driven HPC Workflows

Authors: Vincent R. Pascuzzi, Ozgur O. Kilic, Matteo Turilli, Shantenu Jha

Abstract: Heterogeneous scientific workflows consist of numerous types of tasks that require executing on heterogeneous resources. Asynchronous execution of those tasks is crucial to improve resource utilization, task throughput and reduce workflows' makespan. Therefore, middleware capable of scheduling and executing different task types across heterogeneous resources must enable asynchronous execution of tasks. In this paper, we investigate the requirements and properties of the asynchronous task execution of machine learning (ML)-driven high performance computing (HPC) workflows. We model the degree of asynchronicity permitted for arbitrary workflows and propose key metrics that can be used to determine qualitative benefits when employing asynchronous execution. Our experiments represent relevant scientific drivers, we perform them at scale on Summit, and we show that the performance enhancements due to asynchronous execution are consistent with our model.

Submitted 27 June, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Published in the 26th edition of the workshop on Job Scheduling Strategies for Parallel Processing (JSSPP23)

arXiv:2205.00585

On the importance of scalability and resource estimation of quantum algorithms for domain sciences

Authors: Vincent R. Pascuzzi, Ning Bao, Ang Li

Abstract: The quantum information science community has seen a surge in new algorithmic developments across scientific domains. These developments have demonstrated polynomial or better improvements in computational and space complexity, incentivizing further research in the field. However, despite recent progress, many works fail to provide quantitative estimates on algorithmic scalability or quantum resources required -- e.g., number of logical qubits, error thresholds, etc. -- to realize the highly sought "quantum advantage." In this paper, we discuss several quantum algorithms and motivate the importance of such estimates. By example and under simple scaling assumptions, we approximate the computational expectations of a future quantum device for a high energy physics simulation algorithm and how it compares to its classical analog. We assert that a standard candle is necessary for claims of quantum advantage.

Submitted 1 May, 2022; originally announced May 2022.

Comments: Submitted to QCE 2022

arXiv:2203.09945

Portability: A Necessary Approach for Future Scientific Software

Authors: Meghna Bhattacharya, Paolo Calafiura, Taylor Childers, Mark Dewing, Zhihua Dong, Oliver Gutsche, Salman Habib, Xiangyang Ju, Michael Kirby, Kyle Knoepfel, Matti Kortelainen, Martin Kwok, Charles Leggett, Meifeng Lin, Vincent R. Pascuzzi, Alexei Strelchenko, Brett Viren, Beomki Yeo, Haiwang Yu

Abstract: Today's world of scientific software for High Energy Physics (HEP) is powered by x86 code, while the future will be much more reliant on accelerators like GPUs and FPGAs. The portable parallelization strategies (PPS) project of the High Energy Physics Center for Computational Excellence (HEP/CCE) is investigating solutions for portability techniques that will allow the coding of an algorithm once, and the ability to execute it on a variety of hardware products from many vendors, especially including accelerators. We think without these solutions, the scientific success of our experiments and endeavors is in danger, as software development could be expert driven and costly to be able to run on available hardware infrastructure. We think the best solution for the community would be an extension to the C++ standard with a very low entry bar for users, supporting all hardware forms and vendors. We are very far from that ideal though. We argue that in the future, as a community, we need to request and work on portability solutions and strive to reach this ideal.

Submitted 15 March, 2022; originally announced March 2022.

Comments: 13 pages, 1 figure. Contribution to Snowmass 2021

arXiv:2203.09467

Celeritas: GPU-accelerated particle transport for detector simulation in High Energy Physics experiments

Authors: S. C. Tognini, P. Canal, T. M. Evans, G. Lima, A. L. Lund, S. R. Johnson, S. Y. Jun, V. R. Pascuzzi, P. K. Romano

Abstract: Within the next decade, experimental High Energy Physics (HEP) will enter a new era of scientific discovery through a set of targeted programs recommended by the Particle Physics Project Prioritization Panel (P5), including the upcoming High Luminosity Large Hadron Collider (HL-LHC) upgrade and the Deep Underground Neutrino Experiment (DUNE). These efforts in the Energy and Intensity Frontiers will require an unprecedented amount of computational capacity on many fronts including Monte Carlo (MC) detector simulation. In order to alleviate this impending computational bottleneck, the Celeritas MC particle transport code is designed to leverage the new generation of heterogeneous computer architectures, including the exascale computing power of U.S. Department of Energy (DOE) Leadership Computing Facilities (LCFs), to model targeted HEP detector problems at the full fidelity of Geant4. This paper presents the planned roadmap for Celeritas, including its proposed code architecture, physics capabilities, and strategies for integrating it with existing and future experimental HEP computing workflows.

Submitted 22 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021

Report number: FERMILAB-FN-1159-SCD

arXiv:2203.09384

doi 10.1145/3529538.3529996

Benchmarking a Proof-of-Concept Performance Portable SYCL-based Fast Fourier Transformation Library

Authors: Vincent R. Pascuzzi, Mehdi Goli

Abstract: In this paper, we present an early version of a SYCL-based FFT library, capable of running on all major vendor hardware, including CPUs and GPUs from AMD, ARM, Intel and NVIDIA. Although preliminary, the aim of this work is to seed further developments for a rich set of features for calculating FFTs. It has the advantage over existing portable FFT libraries in that it is single-source, and therefore removes the complexities that arise due to abundant use of pre-process macros and auto-generated kernels to target different architectures. We exercise two SYCL-enabled compilers, Codeplay ComputeCpp and Intel's open-source LLVM project, to evaluate performance portability of our SYCL-based FFT on various heterogeneous architectures. The current limitations of our library are that it supports only single-dimension FFTs up to $2^{11}$ in length and base-2 input sequences. We compare our results with highly optimized vendor-specific FFT libraries and provide a detailed analysis to demonstrate a fair level of performance, as well as potential sources of performance bottlenecks.

Submitted 17 March, 2022; originally announced March 2022.

Comments: 12 pages, 6 figures, submitted to IWOCL 2022

Journal ref: IWOCL'22: International Workshop on OpenCL, May 2022, Article No.: 20, Pages 1-9

arXiv:2203.07645

Software and Computing for Small HEP Experiments

Authors: Dave Casper, Maria Elena Monzani, Benjamin Nachman, Costas Andreopoulos, Stephen Bailey, Deborah Bard, Wahid Bhimji, Giuseppe Cerati, Grigorios Chachamis, Jacob Daughhetee, Miriam Diamond, V. Daniel Elvira, Alden Fan, Krzysztof Genser, Paolo Girotti, Scott Kravitz, Robert Kutschke, Vincent R. Pascuzzi, Gabriel N. Perdue, Erica Snider, Elizabeth Sexton-Kennedy, Graeme Andrew Stewart, Matthew Szydagis, Eric Torrence, Christopher Tunnell

Abstract: This white paper briefly summarizes key conclusions of the recent US Community Study on the Future of Particle Physics (Snowmass 2021) workshop on Software and Computing for Small High Energy Physics Experiments.

Submitted 27 December, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021

Report number: FERMILAB-CONF-22-138

arXiv:2203.07614

Detector and Beamline Simulation for Next-Generation High Energy Physics Experiments

Authors: Sunanda Banerjee, D. N. Brown, David N. Brown, Paolo Calafiura, Jacob Calcutt, Philippe Canal, Miriam Diamond, Daniel Elvira, Thomas Evans, Renee Fatemi, Krzysztof Genser, Robert Hatcher, Alexander Himmel, Seth R. Johnson, Soon Yung Jun, Michael Kelsey, Evangelos Kourlitis, Robert K. Kutschke, Guilherme Lima, Kevin Lynch, Kendall Mahn, Zachary Marshall, Michael Mooney, Adam Para, Vincent R. Pascuzzi, et al. (9 additional authors not shown)

Abstract: The success of high energy physics programs relies heavily on accurate detector simulations and beam interaction modeling. The increasingly complex detector geometries and beam dynamics require sophisticated techniques in order to meet the demands of current and future experiments. Common software tools used today are unable to fully utilize modern computational resources, while data-recording rates are often orders of magnitude larger than what can be produced via simulation. In this paper, we describe the state, current and future needs of high energy physics detector and beamline simulations and related challenges, and we propose a number of possible ways to address them.

Submitted 20 April, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021

Report number: FERMILAB-FN-1151-ND-PPD-SCD

arXiv:2110.13338

doi 10.1103/PhysRevA.105.042406

Computationally Efficient Zero Noise Extrapolation for Quantum Gate Error Mitigation

Authors: Vincent R. Pascuzzi, Andre He, Christian W. Bauer, Wibe A. de Jong, Benjamin Nachman

Abstract: Zero noise extrapolation (ZNE) is a widely used technique for gate error mitigation on near-term quantum computers because it can be implemented in software and does not require knowledge of the quantum computer noise parameters. Traditional ZNE requires a significant resource overhead in terms of quantum operations. A recent proposal using a targeted (or random) instead of fixed identity insertion method (RIIM versus FIIM) requires significantly fewer quantum gates for the same formal precision. We start by showing that RIIM can allow for ZNE to be deployed on deeper circuits than FIIM, but requires many more measurements to maintain the same statistical uncertainty. We develop two extensions to FIIM and RIIM. The List Identity Insertion Method (LIIM) makes it possible to mitigate the error from certain CNOT gates, typically those with the largest error. The Set Identity Insertion Method (SIIM) naturally interpolates between the measurement-efficient FIIM and the gate-efficient RIIM, allowing one to trade off fewer CNOT gates for more measurements. Finally, we investigate a way to boost the number of measurements, namely to run ZNE in parallel, utilizing as many quantum devices as are available. We explore the performance of RIIM in a parallel setting where there is a non-trivial spread in noise across sets of qubits within or across quantum computers.

Submitted 9 March, 2022; v1 submitted 25 October, 2021; originally announced October 2021.

Comments: 10 pages, 10 figures

Journal ref: Phys. Rev. A 105, 042406 (2022)

arXiv:2109.01329

doi 10.1007/978-3-030-97759-7_2

Achieving near native runtime performance and cross-platform performance portability for random number generation through SYCL interoperability

Authors: Vincent R. Pascuzzi, Mehdi Goli

Abstract: High-performance computing (HPC) is a major driver accelerating scientific research and discovery, from quantum simulations to medical therapeutics. While the increasing availability of HPC resources is in many cases pivotal to successful science, even the largest collaborations lack the computational expertise required for maximal exploitation of current hardware capabilities. The need to maintain multiple platform-specific codebases further complicates matters, potentially adding constraints on machines that can be utilized. Fortunately, numerous programming models are under development that aim to facilitate portable codes for heterogeneous computing. One in particular is SYCL, an open standard, C++-based single-source programming paradigm. Among SYCL's features is interoperability, a mechanism through which applications and third-party libraries coordinate sharing data and execute collaboratively. In this paper, we leverage the SYCL programming model to demonstrate cross-platform performance portability across heterogeneous resources. We detail our NVIDIA and AMD random number generator extensions to the oneMKL open-source interfaces library. Performance portability is measured relative to platform-specific baseline applications executed on four major hardware platforms using two different compilers supporting SYCL. The utility of our extensions is exemplified in a real-world setting via a high-energy physics simulation application. We show that the performance of implementations that capitalize on SYCL interoperability is on par with native implementations, attesting to the cross-platform performance portability of a SYCL-based approach to scientific codes.

Submitted 18 October, 2021; v1 submitted 3 September, 2021; originally announced September 2021.

Comments: 24 pages, 5 figures, conference

arXiv:2103.14737

Porting HEP Parameterized Calorimeter Simulation Code to GPUs

Authors: Zhihua Dong, Heather Gray, Charles Leggett, Meifeng Lin, Vincent R. Pascuzzi, Kwangmin Yu

Abstract: The High Energy Physics (HEP) experiments, such as those at the Large Hadron Collider (LHC), traditionally consume large amounts of CPU cycles for detector simulations and data analysis, but rarely use compute accelerators such as GPUs. As the LHC is upgraded to allow for higher luminosity, resulting in much higher data rates, purely relying on CPUs may not provide enough computing power to support the simulation and data analysis needs. As a proof of concept, we investigate the feasibility of porting a HEP parameterized calorimeter simulation code to GPUs. We have chosen to use FastCaloSim, the ATLAS fast parametrized calorimeter simulation. While FastCaloSim is sufficiently fast such that it does not impose a bottleneck in detector simulations overall, significant speed-ups in the processing of large samples can be achieved from GPU parallelization at both the particle (intra-event) and event levels; this is especially beneficial in conditions expected at the high-luminosity LHC, where extremely high per-event particle multiplicities will result from the many simultaneous proton-proton collisions. We report our experience with porting FastCaloSim to NVIDIA GPUs using CUDA. A preliminary Kokkos implementation of FastCaloSim for portability to other parallel architectures is also described.

Submitted 18 May, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

Comments: 15 pages, 1 figure, 8 tables, 2 listings, submitted to Frontiers in Big Data (Big Data in AI and High Energy Physics)

arXiv:2103.08591 [pdf, other]

doi 10.1103/PhysRevLett.127.270502

Mitigating depolarizing noise on quantum computers with noise-estimation circuits

Authors: Miroslav Urbanek, Benjamin Nachman, Vincent R. Pascuzzi, Andre He, Christian W. Bauer, Wibe A. de Jong

Abstract: A significant problem for current quantum computers is noise. While there are many distinct noise channels, the depolarizing noise model often appropriately describes average noise for large circuits involving many qubits and gates. We present a method to mitigate the depolarizing noise by first estimating its rate with a noise-estimation circuit and then correcting the output of the target circuit using the estimated rate. The method is experimentally validated on the simulation of the Heisenberg model. We find that our approach in combination with readout-error correction, randomized compiling, and zero-noise extrapolation produces results close to exact results even for circuits containing hundreds of CNOT gates.

Submitted 15 March, 2021; originally announced March 2021.

arXiv:1712.06982 [pdf, other]

doi 10.1007/s41781-018-0018-8

A Roadmap for HEP Software and Computing R&D for the 2020s

Authors: Johannes Albrecht, Antonio Augusto Alves Jr, Guilherme Amadio, Giuseppe Andronico, Nguyen Anh-Ky, Laurent Aphecetche, John Apostolakis, Makoto Asai, Luca Atzori, Marian Babik, Giuseppe Bagliesi, Marilena Bandieramonte, Sunanda Banerjee, Martin Barisits, Lothar A. T. Bauerdick, Stefano Belforte, Douglas Benjamin, Catrin Bernius, Wahid Bhimji, Riccardo Maria Bianchi, Ian Bird, Catherine Biscarat, Jakob Blomer, Kenneth Bloom, Tommaso Boccali, et al. (285 additional authors not shown)

Abstract: Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.

Submitted 19 December, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

Report number: HSF-CWP-2017-01

Journal ref: Comput Softw Big Sci (2019) 3, 7

arXiv:1508.07258 [pdf, ps, other]

doi 10.1063/1.4952643

Some new aspects of first integrals and symmetries for central force dynamics

Authors: Stephen C. Anco, Tyler Meadows, Vincent Pascuzzi

Abstract: For the general central force equations of motion in $n>1$ dimensions, a complete set of $2n$ first integrals is derived in an explicit algorithmic way without the use of dynamical symmetries or Noether's theorem. The derivation uses the polar formulation of the equations of motion and yields energy, angular momentum, a generalized Laplace-Runge-Lenz vector, and a temporal quantity involving the time variable explicitly. A variant of the general Laplace-Runge-Lenz vector, which generalizes Hamilton's eccentricity vector, is also obtained. The physical meaning of the general Laplace-Runge-Lenz vector, its variant, and the temporal quantity are discussed for general central forces. Their properties are compared for precessing bounded trajectories versus non-precessing bounded trajectories, as well as unbounded trajectories, by considering an inverse-square force (Kepler problem) and a cubically perturbed inverse-square force (Newtonian revolving orbit problem).

Submitted 28 May, 2016; v1 submitted 28 August, 2015; originally announced August 2015.

Comments: 41 pages. Minor typos corrected. Published version

Journal ref: J. Math. Phys. 57 (2016), 062901 (35 pages)

Showing 1–20 of 20 results for author: Pascuzzi, V