-
Roadmap on Advancements of the FHI-aims Software Package
Authors:
Joseph W. Abbott,
Carlos Mera Acosta,
Alaa Akkoush,
Alberto Ambrosetti,
Viktor Atalla,
Alexej Bagrets,
Jörg Behler,
Daniel Berger,
Björn Bieniek,
Jonas Björk,
Volker Blum,
Saeed Bohloul,
Connor L. Box,
Nicholas Boyer,
Danilo Simoes Brambila,
Gabriel A. Bramley,
Kyle R. Bryenton,
María Camarasa-Gómez,
Christian Carbogno,
Fabio Caruso,
Sucismita Chutia,
Michele Ceriotti,
Gábor Csányi,
William Dawson,
Francisco A. Delesma
, et al. (177 additional authors not shown)
Abstract:
Electronic-structure theory is the foundation of the description of materials including multiscale modeling of their properties and functions. Obviously, without sufficient accuracy at the base, reliable predictions are unlikely at any level that follows. The software package FHI-aims has proven to be a game changer for accurate free-energy calculations because of its scalability, numerical precision, and its efficient handling of density functional theory (DFT) with hybrid functionals and van der Waals interactions. It treats molecules, clusters, and extended systems (solids and liquids) on an equal footing. Besides DFT, FHI-aims also includes quantum-chemistry methods, descriptions for excited states and vibrations, and calculations of various types of transport. Recent advancements address the integration of FHI-aims into an increasing number of workflows and various artificial intelligence (AI) methods. This Roadmap describes the state-of-the-art of FHI-aims and advancements that are currently ongoing or planned.
Submitted 5 June, 2025; v1 submitted 30 April, 2025;
originally announced May 2025.
-
Advances in quantum defect embedding theory
Authors:
Siyuan Chen,
Victor Wen-zhe Yu,
Yu Jin,
Marco Govoni,
Giulia Galli
Abstract:
Quantum defect embedding theory (QDET) is a many-body embedding method designed to describe condensed systems with strongly correlated electrons localized within a given region of space, for example spin defects in semiconductors and insulators. Although the QDET approach has been successful in predicting the electronic properties of several point defects, several limitations of the method remain. In this work, we propose multiple advances to the QDET formalism. We derive a double-counting correction that consistently treats the frequency dependence of the screened Coulomb interaction, and we illustrate the effect of including unoccupied orbitals in the active space. In addition, we propose a method to describe hybridization effects between the active space and the environment, and we compare the results of several impurity solvers, providing further insights into improving the reliability and applicability of the method. We present results for defects in diamond and for molecular qubits, including a detailed comparison with experiments.
Submitted 8 April, 2025;
originally announced April 2025.
-
Solvers for Large-Scale Electronic Structure Theory: ELPA and ELSI
Authors:
Petr Karpov,
Andreas Marek,
Tobias Melson,
Alexander Pöppl,
Victor Wen-zhe Yu,
Ben Hourahine,
Alberto Garcia,
William Dawson,
Yi Yao,
William Huhn,
Jonathan Moussa,
Sam Hall,
Reinhard Maurer,
Uthpala Herath,
Konstantin Lion,
Sebastian Kokott,
Volker Blum
Abstract:
In this contribution, we give an overview of the ELPA library and ELSI interface, which are crucial elements for large-scale electronic structure calculations in FHI-aims.
ELPA is a key solver library that provides efficient solutions for both standard and generalized eigenproblems, which are central to the Kohn-Sham formalism in density functional theory (DFT). It supports CPU and GPU architectures, with full support for NVIDIA and AMD GPUs, and ongoing development for Intel GPUs. Here we also report the results of recent optimizations, leading to significant improvements in GPU performance for the generalized eigenproblem.
ELSI is an open-source software interface layer that creates a well-defined connection between "user" electronic structure codes and "solver" libraries for the Kohn-Sham problem, abstracting the step between Hamilton and overlap matrices (as input to ELSI and the respective solvers) and eigenvalues and eigenvectors or density matrix solutions (as output to be passed back to the "user" electronic structure code). In addition to ELPA, ELSI supports solvers including LAPACK and MAGMA, the PEXSI and NTPoly libraries (which bypass an explicit eigenvalue solution), and several others.
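The mapping described above, from Hamiltonian and overlap matrices to eigenpairs or a density matrix, can be illustrated with a minimal dense, serial sketch. This is not the ELPA or ELSI API: the function names `solve_generalized_eigenproblem` and `density_matrix` are hypothetical, the matrices are assumed real symmetric with a positive-definite overlap, and a closed-shell occupation of 2 electrons per orbital is assumed.

```python
import numpy as np


def solve_generalized_eigenproblem(H, S):
    """Solve H C = S C diag(eps) for real symmetric H and positive-definite S.

    Illustrative only: reduces the generalized problem to a standard one
    through a Cholesky factorization of the overlap matrix.
    """
    L = np.linalg.cholesky(S)            # S = L L^T
    L_inv = np.linalg.inv(L)
    H_std = L_inv @ H @ L_inv.T          # standard symmetric eigenproblem
    eps, C_std = np.linalg.eigh(H_std)   # eigenvalues in ascending order
    C = L_inv.T @ C_std                  # back-transform, so C^T S C = I
    return eps, C


def density_matrix(C, n_occ, occupation=2.0):
    """Build the density matrix from the n_occ lowest eigenvectors.

    Assumes integer occupations (no smearing); occupation=2.0 corresponds
    to a closed-shell system.
    """
    C_occ = C[:, :n_occ]
    return occupation * (C_occ @ C_occ.T)
```

A caller would pass `H` and `S` in and receive either `(eps, C)` or the density matrix back, which mirrors the input/output split in the abstract. Production solvers distribute these steps across many processes and, for some methods, skip the explicit eigenvalue solution entirely.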
Submitted 4 February, 2025;
originally announced February 2025.
-
Strongly correlated states of transition metal spin defects: the case of an iron impurity in aluminum nitride
Authors:
Leon Otis,
Yu Jin,
Victor Wen-zhe Yu,
Siyuan Chen,
Laura Gagliardi,
Giulia Galli
Abstract:
We investigate the electronic properties of an exemplar transition metal impurity in an insulator, with the goal of accurately describing strongly correlated defect states. We consider iron in aluminum nitride, a material of interest for hybrid quantum technologies, and we carry out calculations with quantum embedding methods -- density matrix embedding theory (DMET) and quantum defect embedding theory (QDET) -- and with spin-flip time-dependent density functional theory (TDDFT). We show that both DMET and QDET accurately describe the ground state and low-lying excited states of the defect, and that TDDFT yields photoluminescence spectra in agreement with experiments. In addition, we provide a detailed discussion of the convergence of our results as a function of the active space used in the embedding methods, thus defining a protocol to obtain converged data, directly comparable with experiments.
Submitted 27 January, 2025;
originally announced January 2025.
-
GPU-Accelerated Solution of the Bethe-Salpeter Equation for Large and Heterogeneous Systems
Authors:
Victor Wen-zhe Yu,
Yu Jin,
Giulia Galli,
Marco Govoni
Abstract:
We present a massively parallel, GPU-accelerated implementation of the Bethe-Salpeter equation (BSE) for the calculation of the vertical excitation energies (VEEs) and optical absorption spectra of condensed and molecular systems, starting from single-particle eigenvalues and eigenvectors obtained with density functional theory. The algorithms adopted here circumvent the slowly converging sums over empty and occupied states and the inversion of large dielectric matrices, through a density matrix perturbation theory approach and a low-rank decomposition of the screened Coulomb interaction, respectively. Further computational savings are achieved by exploiting the nearsightedness of the density matrix of semiconductors and insulators to reduce the number of screened Coulomb integrals. We scale our calculations to thousands of GPUs with a hierarchical loop and data distribution strategy. The efficacy of our method is demonstrated by computing the VEEs of several spin defects in wide-band-gap materials, showing that supercells with up to 1000 atoms are necessary to obtain converged results. We discuss the validity of the common approximation that solves the BSE with truncated sums over empty and occupied states. We then apply our GW-BSE implementation to a diamond lattice with 1727 atoms to study the symmetry breaking of triplet states caused by the interaction of a point defect with an extended line defect.
Submitted 27 November, 2024; v1 submitted 23 September, 2024;
originally announced September 2024.
-
Many-body perturbation theory with hybrid density functional theory starting points accelerated by adaptively compressed exchange
Authors:
Victor Wen-zhe Yu,
Marco Govoni
Abstract:
We report on the use of the adaptively compressed exchange (ACE) operator to accelerate many-body perturbation theory (MBPT) calculations, including G$_0$W$_0$ and the Bethe Salpeter equation (BSE), for hybrid density functional theory starting points. We show that by approximating the exact exchange operator with the low-rank ACE operator, substantial computational savings can be achieved with systematically controllable errors in the quasiparticle energies computed with full-frequency G$_0$W$_0$ and the optical absorption spectra and vertical excitation energies computed by solving the BSE within density matrix perturbation theory. Our implementation makes use of the ACE-accelerated electronic Hamiltonian to carry out both G$_0$W$_0$ and BSE without explicitly computing empty states. We show the robustness of the approach and present the computational gains obtained on both the central processing unit and graphics processing unit nodes. Our work will facilitate the exploration and evaluation of fine-tuned hybrid starting points aimed at enhancing the accuracy of MBPT calculations without involving computationally demanding self-consistency in Hedin's equations.
Submitted 27 January, 2025; v1 submitted 22 September, 2024;
originally announced September 2024.
-
Quantum-centric Supercomputing for Materials Science: A Perspective on Challenges and Future Directions
Authors:
Yuri Alexeev,
Maximilian Amsler,
Paul Baity,
Marco Antonio Barroca,
Sanzio Bassini,
Torey Battelle,
Daan Camps,
David Casanova,
Young Jai Choi,
Frederic T. Chong,
Charles Chung,
Chris Codella,
Antonio D. Corcoles,
James Cruise,
Alberto Di Meglio,
Jonathan Dubois,
Ivan Duran,
Thomas Eckl,
Sophia Economou,
Stephan Eidenbenz,
Bruce Elmegreen,
Clyde Fare,
Ismael Faro,
Cristina Sanz Fernández,
Rodrigo Neumann Barros Ferreira
, et al. (102 additional authors not shown)
Abstract:
Computational models are an essential tool for the design, characterization, and discovery of novel materials. Hard computational tasks in materials science stretch the limits of existing high-performance supercomputing centers, consuming much of their simulation, analysis, and data resources. Quantum computing, on the other hand, is an emerging technology with the potential to accelerate many of the computational tasks needed for materials science. In order to do that, the quantum technology must interact with conventional high-performance computing in several ways: approximate results validation, identification of hard problems, and synergies in quantum-centric supercomputing. In this paper, we provide a perspective on how quantum-centric supercomputing can help address critical computational problems in materials science, the challenges to face in order to solve representative use cases, and new suggested directions.
Submitted 19 September, 2024; v1 submitted 14 December, 2023;
originally announced December 2023.
-
Excited state properties of point defects in semiconductors and insulators investigated with time-dependent density functional theory
Authors:
Yu Jin,
Victor Wen-zhe Yu,
Marco Govoni,
Andrew C Xu,
Giulia Galli
Abstract:
We present a formulation of spin-conserving and spin-flip, hybrid time-dependent density functional theory (TDDFT), including the calculation of analytical forces, which allows for efficient calculations of excited state properties of solid-state systems with hundreds to thousands of atoms. We discuss an implementation on both GPU and CPU based architectures, along with several acceleration techniques. We then apply our formulation to the study of several point defects in semiconductors and insulators, specifically the negatively charged nitrogen-vacancy and neutral silicon-vacancy centers in diamond, the neutral divacancy center in 4H silicon carbide, and the neutral oxygen-vacancy center in magnesium oxide. Our results highlight the importance of taking into account structural relaxations in excited states, in order to interpret and predict optical absorption and emission mechanisms in spin-defects.
Submitted 3 December, 2023; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Roadmap on Electronic Structure Codes in the Exascale Era
Authors:
Vikram Gavini,
Stefano Baroni,
Volker Blum,
David R. Bowler,
Alexander Buccheri,
James R. Chelikowsky,
Sambit Das,
William Dawson,
Pietro Delugas,
Mehmet Dogan,
Claudia Draxl,
Giulia Galli,
Luigi Genovese,
Paolo Giannozzi,
Matteo Giantomassi,
Xavier Gonze,
Marco Govoni,
Andris Gulans,
François Gygi,
John M. Herbert,
Sebastian Kokott,
Thomas D. Kühne,
Kai-Hsin Liou,
Tsuyoshi Miyazaki,
Phani Motamarri
, et al. (16 additional authors not shown)
Abstract:
Electronic structure calculations have been instrumental in providing many important insights into a range of physical and chemical properties of various molecular and solid-state systems. Their importance to various fields, including materials science, chemical sciences, computational chemistry and device physics, is underscored by the large fraction of available public supercomputing resources devoted to these calculations. As we enter the exascale era, exciting new opportunities to increase simulation numbers, sizes, and accuracies present themselves. In order to realize these promises, the community of electronic structure software developers will however first have to tackle a number of challenges pertaining to the efficient use of new architectures that will rely heavily on massive parallelism and hardware accelerators. This roadmap provides a broad overview of the state-of-the-art in electronic structure calculations and of the various new directions being pursued by the community. It covers 14 electronic structure codes, presenting their current status, their development priorities over the next five years, and their plans towards tackling the challenges and leveraging the opportunities presented by the advent of exascale computing.
Submitted 26 September, 2022;
originally announced September 2022.
-
GPU Acceleration of Large-Scale Full-Frequency GW Calculations
Authors:
Victor Wen-zhe Yu,
Marco Govoni
Abstract:
Many-body perturbation theory is a powerful method to simulate electronic excitations in molecules and materials starting from the output of density functional theory calculations. By implementing the theory efficiently so as to run at scale on the latest leadership high-performance computing systems it is possible to extend the scope of GW calculations. We present a GPU acceleration study of the full-frequency GW method as implemented in the WEST code. Excellent performance is achieved through the use of (i) optimized GPU libraries, e.g., cuFFT and cuBLAS, (ii) a hierarchical parallelization strategy that minimizes CPU-CPU, CPU-GPU, and GPU-GPU data transfer operations, (iii) nonblocking MPI communications that overlap with GPU computations, and (iv) mixed-precision in selected portions of the code. A series of performance benchmarks have been carried out on leadership high-performance computing systems, showing a substantial speedup of the GPU-accelerated version of WEST with respect to its CPU version. Good strong and weak scaling is demonstrated using up to 25920 GPUs. Finally, we showcase the capability of the GPU version of WEST for large-scale, full-frequency GW calculations of realistic systems, e.g., a nanostructure, an interface, and a defect, comprising up to 10368 valence electrons.
Submitted 9 August, 2022; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Boron nitride on SiC(0001)
Authors:
You-Ron Lin,
Markus Franke,
Shayan Parhizkar,
Miriam Raths,
Victor Wen-zhe Yu,
Tien-Lin Lee,
Serguei Soubatch,
Volker Blum,
F. Stefan Tautz,
Christian Kumpf,
François C. Bocquet
Abstract:
In the field of van der Waals heterostructures, the twist angle between stacked two-dimensional (2D) layers has been identified to be of utmost importance for the properties of the heterostructures. In this context, we previously reported the growth of a single layer of unconventionally oriented epitaxial graphene that forms in a surfactant atmosphere [F. C. Bocquet, et al., Phys. Rev. Lett. 125, 106102 (2020)]. The resulting G-R0$^\circ$ layer is aligned with the SiC lattice, and hence represents an important milestone towards high quality twisted bilayer graphene (tBLG), a frequently investigated model system in this field. Here, we focus on the surface structures obtained in the same surfactant atmosphere, but at lower preparation temperatures at which a boron nitride template layer forms on SiC(0001). In a comprehensive study based on complementary experimental and theoretical techniques, we find -- in contrast to the literature -- that this template layer is a hexagonal B$_x$N$_y$ layer, but not high-quality hBN. It is aligned with the SiC lattice and gradually replaced by low-quality graphene in the 0$^\circ$ orientation of the B$_x$N$_y$ template layer upon annealing.
Submitted 14 April, 2022; v1 submitted 2 March, 2022;
originally announced March 2022.
-
Accurate Frozen Core Approximation for All-Electron Density-Functional Theory
Authors:
Victor Wen-zhe Yu,
Jonathan Moussa,
Volker Blum
Abstract:
We implement and benchmark the frozen core approximation, a technique commonly adopted in electronic structure theory to reduce the computational cost by means of mathematically fixing the chemically inactive core electron states. The accuracy and efficiency of this approach are well controlled by a single parameter, the number of frozen orbitals. Explicit corrections for the frozen core orbitals and the unfrozen valence orbitals are introduced, safeguarding against seemingly minor numerical deviations from the assumed orthonormality conditions of the basis functions. A speedup of over two-fold can be achieved for the diagonalization step in all-electron density-functional theory simulations containing heavy elements, without any accuracy degradation in terms of the electron density, total energy, and atomic forces. This is demonstrated in a benchmark study covering 103 materials across the periodic table, and a large-scale simulation of CsPbBr3 with 2,560 atoms. Our study provides a rigorous benchmark of the precision of the frozen core approximation (sub-meV per atom for frozen core orbitals below -200 eV) for a wide range of test cases and for chemical elements ranging from Li to Po. The algorithms discussed here are implemented in the open-source Electronic Structure Infrastructure software package.
Submitted 11 June, 2021;
originally announced June 2021.
-
SIESTA: recent developments and applications
Authors:
Alberto García,
Nick Papior,
Arsalan Akhtar,
Emilio Artacho,
Volker Blum,
Emanuele Bosoni,
Pedro Brandimarte,
Mads Brandbyge,
J. I. Cerdá,
Fabiano Corsetti,
Ramón Cuadrado,
Vladimir Dikan,
Jaime Ferrer,
Julian Gale,
Pablo García-Fernández,
V. M. García-Suárez,
Sandra García,
Georg Huhs,
Sergio Illera,
Richard Korytár,
Peter Koval,
Irina Lebedeva,
Lin Lin,
Pablo López-Tarifa,
Sara G. Mayo
, et al. (11 additional authors not shown)
Abstract:
A review of the present status, recent enhancements, and applicability of the SIESTA program is presented. Since its debut in the mid-nineties, SIESTA's flexibility, efficiency and free distribution have given advanced materials simulation capabilities to many groups worldwide. The core methodological scheme of SIESTA combines finite-support pseudo-atomic orbitals as basis sets, norm-conserving pseudopotentials, and a real-space grid for the representation of charge density and potentials and the computation of their associated matrix elements. Here we describe the more recent implementations on top of that core scheme, which include: full spin-orbit interaction, non-repeated and multiple-contact ballistic electron transport, DFT+U and hybrid functionals, time-dependent DFT, novel reduced-scaling solvers, density-functional perturbation theory, efficient Van der Waals non-local density functionals, and enhanced molecular-dynamics options. In addition, a substantial effort has been made in enhancing interoperability and interfacing with other codes and utilities, such as Wannier90 and the second-principles modelling it can be used for, an AiiDA plugin for workflow automatization, interface to Lua for steering SIESTA runs, and various postprocessing utilities. SIESTA has also been engaged in the Electronic Structure Library effort from its inception, which has allowed the sharing of various low level libraries, as well as data standards and support for them, in particular the PSML definition and library for transferable pseudopotentials, and the interface to the ELSI library of solvers. Code sharing is made easier by the new open-source licensing model of the program. This review also presents examples of application of the capabilities of the code, as well as a view of on-going and future developments.
Submitted 1 June, 2020;
originally announced June 2020.
-
The CECAM Electronic Structure Library and the modular software development paradigm
Authors:
Micael J. T. Oliveira,
Nick Papior,
Yann Pouillon,
Volker Blum,
Emilio Artacho,
Damien Caliste,
Fabiano Corsetti,
Stefano de Gironcoli,
Alin M. Elena,
Alberto Garcia,
Victor M. Garcia-Suarez,
Luigi Genovese,
William P. Huhn,
Georg Huhs,
Sebastian Kokott,
Emine Kucukbenli,
Ask H. Larsen,
Alfio Lazzaro,
Irina V. Lebedeva,
Yingzhou Li,
David Lopez-Duran,
Pablo Lopez-Tarifa,
Martin Luders,
Miguel A. L. Marques,
Jan Minar
, et al. (12 additional authors not shown)
Abstract:
First-principles electronic structure calculations are very widely used thanks to the many successful software packages available. Their traditional coding paradigm is monolithic, i.e., regardless of how modular its internal structure may be, the code is built independently from others, from the compiler up, with the exception of linear-algebra and message-passing libraries. This model has been quite successful for decades. The rapid progress in methodology, however, has resulted in an ever increasing complexity of those programs, which implies a growing amount of replication in coding and in the recurrent re-engineering needed to adapt to evolving hardware architecture. The Electronic Structure Library (ESL) was initiated by CECAM (European Centre for Atomic and Molecular Calculations) to catalyze a paradigm shift away from the monolithic model and promote modularization, with the ambition to extract common tasks from electronic structure programs and redesign them as free, open-source libraries. They include "heavy-duty" ones with a high degree of parallelisation, and potential for adaptation to novel hardware within them, thereby separating the sophisticated computer science aspects of performance optimization and re-engineering from the computational science done by scientists when implementing new ideas. It is a community effort, undertaken by developers of various successful codes, now facing the challenges arising in the new model. This modular paradigm will improve overall coding efficiency and enable specialists (computer scientists or computational scientists) to use their skills more effectively. It will lead to a more sustainable and dynamic evolution of software as well as lower barriers to entry for new developers.
Submitted 24 June, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
GPU-Acceleration of the ELPA2 Distributed Eigensolver for Dense Symmetric and Hermitian Eigenproblems
Authors:
Victor Wen-zhe Yu,
Jonathan Moussa,
Pavel Kůs,
Andreas Marek,
Peter Messmer,
Mina Yoon,
Hermann Lederer,
Volker Blum
Abstract:
The solution of eigenproblems is often a key computational bottleneck that limits the tractable system size of numerical algorithms, among them electronic structure theory in chemistry and in condensed matter physics. Large eigenproblems can easily exceed the capacity of a single compute node, thus must be solved on distributed-memory parallel computers. We here present GPU-oriented optimizations of the ELPA two-stage tridiagonalization eigensolver (ELPA2). On top of cuBLAS-based GPU offloading, we add a CUDA kernel to speed up the back-transformation of eigenvectors, which can be the computationally most expensive part of the two-stage tridiagonalization algorithm. We benchmark the performance of this GPU-accelerated eigensolver on two hybrid CPU-GPU architectures, namely a compute cluster based on Intel Xeon Gold CPUs and NVIDIA Volta GPUs, and the Summit supercomputer based on IBM POWER9 CPUs and NVIDIA Volta GPUs. Consistent with previous benchmarks on CPU-only architectures, the GPU-accelerated two-stage solver exhibits a parallel performance superior to the one-stage counterpart. Finally, we demonstrate the performance of the GPU-accelerated eigensolver developed in this work for routine semi-local KS-DFT calculations comprising thousands of atoms.
Submitted 14 January, 2021; v1 submitted 25 February, 2020;
originally announced February 2020.
-
ELSI -- An Open Infrastructure for Electronic Structure Solvers
Authors:
Victor Wen-zhe Yu,
Carmen Campos,
William Dawson,
Alberto García,
Ville Havu,
Ben Hourahine,
William P Huhn,
Mathias Jacquelin,
Weile Jia,
Murat Keçeli,
Raul Laasner,
Yingzhou Li,
Lin Lin,
Jianfeng Lu,
Jonathan Moussa,
Jose E Roman,
Álvaro Vázquez-Mayagoitia,
Chao Yang,
Volker Blum
Abstract:
Routine applications of electronic structure theory to molecules and periodic systems need to compute the electron density from given Hamiltonian and, in case of non-orthogonal basis sets, overlap matrices. System sizes can range from few to thousands or, in some examples, millions of atoms. Different discretization schemes (basis sets) and different system geometries (finite non-periodic vs. infinite periodic boundary conditions) yield matrices with different structures. The ELectronic Structure Infrastructure (ELSI) project provides an open-source software interface to facilitate the implementation and optimal use of high-performance solver libraries covering cubic scaling eigensolvers, linear scaling density-matrix-based algorithms, and other reduced scaling methods in between. In this paper, we present recent improvements and developments inside ELSI, mainly covering (1) new solvers connected to the interface, (2) matrix layout and communication adapted for parallel calculations of periodic and/or spin-polarized systems, (3) routines for density matrix extrapolation in geometry optimization and molecular dynamics calculations, and (4) general utilities such as parallel matrix I/O and JSON output. The ELSI interface has been integrated into four electronic structure code projects (DFTB+, DGDFT, FHI-aims, SIESTA), allowing us to rigorously benchmark the performance of the solvers on an equal footing. Based on results of a systematic set of large-scale benchmarks performed with Kohn-Sham density-functional theory and density-functional tight-binding theory, we identify factors that strongly affect the efficiency of the solvers, and propose a decision layer that assists with the solver selection process. Finally, we describe a reverse communication interface encoding matrix-free iterative solver strategies that are amenable, e.g., for use with planewave basis sets.
Submitted 4 July, 2020; v1 submitted 31 December, 2019;
originally announced December 2019.
-
GPGPU Acceleration of All-Electron Electronic Structure Theory Using Localized Numeric Atom-Centered Basis Functions
Authors:
William Huhn,
Björn Lange,
Victor Wen-zhe Yu,
Mina Yoon,
Volker Blum
Abstract:
We present an implementation of all-electron density-functional theory for massively parallel GPGPU-based platforms, using localized atom-centered basis functions and real-space integration grids. Special attention is paid to domain decomposition of the problem on non-uniform grids, which enables compute- and memory-parallel execution across thousands of nodes for real-space operations, e.g., the update of the electron density, the integration of the real-space Hamiltonian matrix, and the calculation of Pulay forces. To assess the performance of our GPGPU implementation, we performed benchmarks on three different architectures using a 103-material test set. We find that operations which rely on dense serial linear algebra show dramatic speedups from GPGPU acceleration: in particular, SCF iterations including force and stress calculations exhibit speedups ranging from 4.5 to 6.6. For the architectures and problem types investigated here, this translates to an expected overall speedup between 3 and 4 for the entire calculation (including non-GPU accelerated parts), for problems featuring several tens to hundreds of atoms. Additional calculations for a 375-atom Bi$_2$Se$_3$ bilayer show that the present GPGPU strategy scales for large-scale distributed-parallel simulations.
Submitted 13 December, 2019;
originally announced December 2019.
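The GPGPU abstract above relates a per-iteration speedup (4.5 to 6.6) to an overall speedup (3 to 4) once non-accelerated parts are included. One way to read that relation is through Amdahl's law; the sketch below is only an illustration under that assumption. The function names are hypothetical, the pairing of 4.5 with 3 and 6.6 with 4 is a guess made for the arithmetic, and the resulting accelerated fractions are not values reported by the authors.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: total speedup when only part of the runtime is accelerated."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    remaining = (1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup
    return 1.0 / remaining


def implied_fraction(kernel_speedup, total_speedup):
    """Invert Amdahl's law: fraction of the original runtime that was accelerated."""
    return (1.0 - 1.0 / total_speedup) / (1.0 - 1.0 / kernel_speedup)


if __name__ == "__main__":
    # Assumed pairing of the quoted ranges (not stated in the abstract).
    for kernel, total in ((4.5, 3.0), (6.6, 4.0)):
        fraction = implied_fraction(kernel, total)
        print(f"kernel x{kernel}, overall x{total}: "
              f"accelerated fraction ~ {fraction:.3f}")
```

Under these assumptions, roughly 86 to 88 percent of the original runtime would have to sit in the accelerated operations for the two quoted ranges to be consistent.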
-
ELSI: A Unified Software Interface for Kohn-Sham Electronic Structure Solvers
Authors:
Victor Wen-zhe Yu,
Fabiano Corsetti,
Alberto García,
William P. Huhn,
Mathias Jacquelin,
Weile Jia,
Björn Lange,
Lin Lin,
Jianfeng Lu,
Wenhui Mi,
Ali Seifitokaldani,
Álvaro Vázquez-Mayagoitia,
Chao Yang,
Haizhao Yang,
Volker Blum
Abstract:
Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large-scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. Here we present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats; and, in the future, (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.
Submitted 31 May, 2017;
originally announced May 2017.
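The ELSI abstract above centers on the generalized eigenproblem H c = ε S c that arises with non-orthogonal basis sets. As a minimal illustration of what such a solver returns, the sketch below solves a 2x2 toy problem in closed form and builds a density matrix from the lowest state. It does not use the ELSI, ELPA, libOMM, or PEXSI APIs; all function names are hypothetical, and the matrix values are arbitrary numbers chosen for illustration.

```python
import math


def solve_generalized_2x2(h, s):
    """Solve H c = eps S c for symmetric 2x2 H and positive-definite 2x2 S.

    Returns [(eps, c), ...] sorted by eps, with c normalized so c^T S c = 1.
    The fully degenerate case (H proportional to S) is not handled.
    """
    (h11, h12), (_, h22) = h
    (s11, s12), (_, s22) = s
    # det(H - eps*S) = 0 is the quadratic a*eps^2 + b*eps + c0 = 0.
    a = s11 * s22 - s12 * s12
    b = -(h11 * s22 + h22 * s11 - 2.0 * h12 * s12)
    c0 = h11 * h22 - h12 * h12
    if s11 <= 0.0 or a <= 0.0:
        raise ValueError("overlap matrix must be positive definite")
    root = math.sqrt(max(b * b - 4.0 * a * c0, 0.0))
    states = []
    for eps in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        # Null vector of (H - eps*S) read off from its first row.
        v = [-(h12 - eps * s12), h11 - eps * s11]
        if abs(v[0]) + abs(v[1]) < 1e-14:
            v = [1.0, 0.0]  # first row (and column) vanish, so e_1 is a null vector
        norm = math.sqrt(v[0] * v[0] * s11 + 2.0 * v[0] * v[1] * s12
                         + v[1] * v[1] * s22)
        states.append((eps, [v[0] / norm, v[1] / norm]))
    return states


def density_matrix(c, occupation=2.0):
    """P = occupation * c c^T for a single occupied orbital."""
    return [[occupation * ci * cj for cj in c] for ci in c]


def trace_product(a, b):
    """Tr(A B) for small square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


if __name__ == "__main__":
    H = [[-0.5, -0.4], [-0.4, -0.5]]  # illustrative values only
    S = [[1.0, 0.5], [0.5, 1.0]]      # illustrative values only
    (eps_low, c_low), (eps_high, _) = solve_generalized_2x2(H, S)
    P = density_matrix(c_low)
    print("eigenvalues:", eps_low, eps_high)
    print("electron count Tr(PS):", trace_product(P, S))
    print("band energy Tr(PH):", trace_product(P, H))
```

For the toy matrices used here, Tr(PS) recovers the two electrons placed in the lowest orbital and Tr(PH) equals twice the lowest eigenvalue. Production solvers work on distributed matrices with far larger dimensions and do not use a closed-form determinant.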