
Communication Overlapping Pipelined Conjugate Gradients for Distributed Memory Systems and Heterogeneous Architectures

Tiwari, M and Vadhiyar, S (2022) Communication Overlapping Pipelined Conjugate Gradients for Distributed Memory Systems and Heterogeneous Architectures. In: 27th International Conference on Parallel and Distributed Computing, Euro-Par 2021, 30 - 31 August 2021, Virtual, Online, pp. 535-539.

Official URL: https://doi.org/10.1007/978-3-031-06156-1_45


The Preconditioned Conjugate Gradient (PCG) method is one of the most widely used methods for solving sparse linear systems of equations. Pipelined PCG (PIPECG) attempts to eliminate dependencies between computations in the PCG algorithm and to overlap independent computations by reorganizing the traditional PCG code and using non-blocking allreduces. We have developed a novel pipelined PCG algorithm called PIPECG-OATI (One Allreduce per Two Iterations), which reduces the number of non-blocking allreduces to one per two iterations and provides a large overlap of global communication and computation at high core counts on distributed-memory CPU systems. PIPECG-OATI gives up to 3× speedup over PCG and 1.73× speedup over PIPECG at large core counts. For GPU-accelerated heterogeneous architectures, we have developed three methods for efficient execution of the PIPECG algorithm. These methods achieve both task and data parallelism. Our methods give considerable performance improvements over the CPU and GPU implementations of PCG in the Paralution and PETSc libraries.
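
For orientation, the following is a minimal serial NumPy sketch of the pipelined PCG recurrences in their commonly cited Ghysels-Vanroose form, which is assumed here to correspond to the PIPECG baseline named in the abstract. It is not the PIPECG-OATI algorithm and not the authors' code. The function name `pipecg`, the Jacobi (diagonal) preconditioner, and the stopping test are illustrative assumptions. Comments mark where a distributed implementation would post one non-blocking allreduce and overlap it with the preconditioner application and the matrix-vector product.

```python
import numpy as np

def pipecg(A, b, m_inv_diag, tol=1e-10, max_iter=500):
    """Serial sketch of pipelined PCG (Ghysels-Vanroose form).

    m_inv_diag holds the inverse diagonal of a Jacobi preconditioner
    (an assumption made for this sketch).
    """
    x = np.zeros_like(b)
    r = b - A @ x              # residual
    u = m_inv_diag * r         # preconditioned residual
    w = A @ u
    z = q = s = p = np.zeros_like(b)
    gamma_old = alpha_old = 1.0
    b_norm = np.linalg.norm(b)

    for i in range(max_iter):
        # Local dot products. In a distributed run these would be combined
        # into a single non-blocking allreduce (e.g. MPI_Iallreduce) ...
        gamma = r @ u
        delta = w @ u
        rr = r @ r

        # ... which is overlapped with the preconditioner application and
        # the matrix-vector product, since neither depends on its result.
        m = m_inv_diag * w
        n = A @ m

        # The reduction results are needed only from this point onward.
        if np.sqrt(rr) <= tol * b_norm:
            break
        if i > 0:
            beta = gamma / gamma_old
            alpha = gamma / (delta - beta * gamma / alpha_old)
        else:
            beta = 0.0
            alpha = gamma / delta

        # Recurrences that replace the extra matrix-vector products.
        z = n + beta * z
        q = m + beta * q
        s = w + beta * s
        p = u + beta * p
        x = x + alpha * p
        r = r - alpha * s
        u = u - alpha * q
        w = w - alpha * z
        gamma_old, alpha_old = gamma, alpha
    return x
```

In this sketch each iteration needs one reduction, whereas standard PCG needs two that cannot be hidden behind other work; the abstract's PIPECG-OATI goes further by requiring only one allreduce per two iterations.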

Item Type: Conference Paper
Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Science and Business Media Deutschland GmbH
Additional Information: The copyright of this article belongs to the Springer Science and Business Media Deutschland GmbH.
Keywords: Conjugate gradient method; Memory architecture; Pipelines; All-reduce; Distributed memory systems; Heterogeneous architectures; Linear systems of equations; Memory system architectures; Non-blocking; Overlapping communication and computations; Preconditioned conjugate gradient; Preconditioned conjugate gradient algorithms; Preconditioned conjugate gradient method; Linear systems
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 13 Jul 2022 06:45
Last Modified: 19 May 2023 10:06
URI: https://eprints.iisc.ac.in/id/eprint/74758
