DFT-FE 1.0: A massively parallel hybrid CPU-GPU density functional theory code using finite-element discretization

Das, S and Motamarri, P and Subramanian, V and Rogers, DM and Gavini, V (2022) DFT-FE 1.0: A massively parallel hybrid CPU-GPU density functional theory code using finite-element discretization. In: Computer Physics Communications, 280.

Official URL: https://doi.org/10.1016/j.cpc.2022.108473

Abstract

We present DFT-FE 1.0, building on DFT-FE 0.6 (Motamarri et al., 2020 [28]), to conduct fast and accurate large-scale density functional theory (DFT) calculations (reaching ∼100,000 electrons) on both many-core CPU and hybrid CPU-GPU computing architectures. This work involves improvements in the real-space formulation, via an improved treatment of the electrostatic interactions that substantially enhances the computational efficiency, as well as in high-performance computing aspects, including the GPU acceleration of all the key compute kernels in DFT-FE. We demonstrate the accuracy by comparing the ground-state energies, ionic forces and cell stresses on a wide range of benchmark systems against those obtained from widely used DFT codes. Further, we demonstrate the numerical efficiency of our GPU acceleration, which yields a ∼20× speed-up on hybrid CPU-GPU nodes of the Summit supercomputer. Notably, owing to the parallel scaling of the GPU implementation, we obtain wall-times of 80–140 seconds for full ground-state calculations, with stringent accuracy, on benchmark systems containing ∼6,000–15,000 electrons using 64–224 nodes of the Summit supercomputer.

Program summary

Program Title: DFT-FE
CPC Library link to program files: https://doi.org/10.17632/c5ghfc6ctn.1
Developer's repository link: https://github.com/dftfeDevelopers/dftfe
Licensing provisions: LGPL v3
Programming language: C/C++
External routines/libraries: p4est (http://www.p4est.org/), deal.II (https://www.dealii.org/), BLAS (http://www.netlib.org/blas/), LAPACK (http://www.netlib.org/lapack/), ELPA (https://elpa.mpcdf.mpg.de/), ScaLAPACK (http://www.netlib.org/scalapack/), Spglib (https://atztogo.github.io/spglib/), ALGLIB (http://www.alglib.net/), LIBXC (http://www.tddft.org/programs/libxc/), PETSc (https://www.mcs.anl.gov/petsc), SLEPc (http://slepc.upv.es), NCCL (optional, https://github.com/NVIDIA/nccl).
Nature of problem: Density functional theory calculations.
Solution method: We employ a local real-space variational formulation of Kohn-Sham density functional theory that is applicable to both pseudopotential and all-electron calculations on periodic, semi-periodic and non-periodic geometries. A higher-order adaptive spectral finite-element basis is used to discretize the Kohn-Sham equations. The Chebyshev polynomial filtered subspace iteration procedure (ChFSI) is employed to solve the nonlinear Kohn-Sham eigenvalue problem self-consistently. ChFSI in DFT-FE employs Cholesky-factorization-based orthonormalization and a spectrum-splitting-based Rayleigh-Ritz procedure, in conjunction with mixed-precision arithmetic. The configurational force approach is used to compute ionic forces and periodic cell stresses for geometry optimization.
Additional comments including restrictions and unusual features: Exchange-correlation functionals are restricted to the Local Density Approximation (LDA) and the Generalized Gradient Approximation (GGA), with and without spin. The pseudopotentials available are optimized norm-conserving Vanderbilt (ONCV) pseudopotentials and Troullier–Martins (TM) pseudopotentials. Calculations are non-relativistic. DFT-FE handles all-electron and pseudopotential calculations in the same framework, while accommodating periodic, non-periodic and semi-periodic boundary conditions.

© 2022 Elsevier B.V.
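
Illustrative sketch (not part of the published abstract and not taken from the DFT-FE source): the three ChFSI steps named in the solution method, namely Chebyshev filtering, Cholesky-based orthonormalization and a Rayleigh-Ritz projection, can be shown on a small fixed symmetric matrix with NumPy. Everything here is an assumption made for illustration: the function names, the 1D Laplacian used as a stand-in Hamiltonian, the Gershgorin upper bound on the spectrum, the buffer of extra states, and the use of plain double precision on a dense matrix. The sketch omits the self-consistent field loop, the spectrum splitting, the mixed-precision arithmetic and the finite-element discretization described above.

```python
import numpy as np


def chebyshev_filter(H, X, degree, a, b, a0):
    """Apply a scaled degree-`degree` Chebyshev polynomial of H to the block X.

    Components with eigenvalues in [a, b] are damped relative to those
    below a; a0 (an estimate of the lowest eigenvalue) sets the scaling.
    """
    e = (b - a) / 2.0          # half-width of the damped interval
    c = (b + a) / 2.0          # centre of the damped interval
    sigma = e / (a0 - c)
    sigma1 = sigma
    Y = (H @ X - c * X) * (sigma1 / e)
    for _ in range(2, degree + 1):
        sigma2 = 1.0 / (2.0 / sigma1 - sigma)
        Y_new = 2.0 * (H @ Y - c * Y) * (sigma2 / e) - (sigma * sigma2) * X
        X, Y, sigma = Y, Y_new, sigma2
    return Y


def cholesky_orthonormalize(X):
    """Orthonormalize the columns of X via the overlap matrix S = X^T X = L L^T."""
    L = np.linalg.cholesky(X.T @ X)
    return np.linalg.solve(L, X.T).T   # X L^{-T}


def rayleigh_ritz(H, X):
    """Diagonalize H projected onto span(X); return Ritz values and rotated vectors."""
    theta, Q = np.linalg.eigh(X.T @ H @ X)
    return theta, X @ Q


def chfsi(H, n_states, n_buffer=4, degree=20, n_iter=25, seed=0):
    """Lowest `n_states` eigenpairs of a fixed symmetric H (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    b = np.max(np.sum(np.abs(H), axis=1))   # Gershgorin upper bound (assumption)
    X = cholesky_orthonormalize(rng.standard_normal((n, n_states + n_buffer)))
    theta, X = rayleigh_ritz(H, X)
    for _ in range(n_iter):
        # Damp everything between the largest current Ritz value and b.
        X = chebyshev_filter(H, X, degree, a=theta[-1], b=b, a0=theta[0])
        X = cholesky_orthonormalize(X)
        theta, X = rayleigh_ritz(H, X)
    return theta[:n_states], X[:, :n_states]


def laplacian_1d(n):
    """Dense 1D finite-difference Laplacian, used only as a toy Hamiltonian."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


if __name__ == "__main__":
    H = laplacian_1d(60)
    values, vectors = chfsi(H, n_states=4)
    print("ChFSI :", values)
    print("direct:", np.linalg.eigvalsh(H)[:4])
```

The extra buffer states are a design choice of this sketch: the largest Ritz value sits at the edge of the damped interval and converges slowly, so only the lowest `n_states` values are returned.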

Item Type: Journal Article
Publication: Computer Physics Communications
Publisher: Elsevier B.V.
Additional Information: The copyright for this article belongs to the authors.
Keywords: All-electron; Electronic structure; GPU; Mixed-precision arithmetic; Pseudopotential; Real-space; Spectral finite-elements
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 07 Sep 2022 16:22
Last Modified: 07 Sep 2022 16:22
URI: https://eprints.iisc.ac.in/id/eprint/76423
