Pericherla, Surendra Varma and Vadhiyar, Sathish (2017) High Performance and Enhanced Scalability for Parallel Applications using MPI-3's non-blocking Collectives. In: International Conference on Computational Science (ICCS), June 12-14, 2017, Zurich, Switzerland, pp. 2403-2407.
Abstract
Collective communications account for 20-90% of total execution time in many MPI applications. In this paper, we propose strategies for automatically identifying the most time-consuming collective operations that also act as scalability bottlenecks. We then explore the use of MPI-3's non-blocking collectives for these communications. We also restructure the code so that independent computations adequately overlap with the non-blocking collective communications. Applying these strategies to several graph and machine learning applications, we obtained performance improvements of up to 33% for large-scale runs on a Cray supercomputer. © 2017 The Authors. Published by Elsevier B.V.
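The restructuring described in the abstract (start the collective, do independent work, then wait for the result) can be illustrated with a minimal sketch. This is not the authors' code and does not use MPI: a background thread stands in for the non-blocking collective so that the example runs with the Python standard library alone. All function names, the 0.2 s delays, and the input values are assumptions made for illustration. In real MPI-3 code the start step would be a call such as `MPI_Iallreduce` and the completion step would be `MPI_Wait`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_allreduce(local_values):
    """Hypothetical stand-in for a collective: sum values after a fake delay."""
    time.sleep(0.2)  # assumed communication latency, for illustration only
    return sum(local_values)


def independent_compute(n):
    """Work that does not depend on the collective's result."""
    time.sleep(0.2)  # assumed computation cost, for illustration only
    return sum(i * i for i in range(n))


def blocking_version(local_values, n):
    # Analogous to a blocking MPI_Allreduce: nothing else runs until it returns.
    total = simulated_allreduce(local_values)
    partial = independent_compute(n)
    return total, partial


def overlapped_version(local_values, n):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Analogous to MPI_Iallreduce: start the operation and get a handle back.
        request = pool.submit(simulated_allreduce, local_values)
        # Independent computation proceeds while the "collective" is in flight.
        partial = independent_compute(n)
        # Analogous to MPI_Wait: block only when the result is actually needed.
        total = request.result()
    return total, partial


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    print(timed(blocking_version, [1, 2, 3], 4))
    print(timed(overlapped_version, [1, 2, 3], 4))
```

Both versions return the same values; only the elapsed time differs, because the overlapped version hides the simulated communication delay behind the independent computation. The overlap is only legal when the computation neither reads the collective's output nor modifies its input buffers before the wait completes.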
Item Type: | Conference Proceedings |
---|---|
Series: | Procedia Computer Science |
Additional Information: | Copyright for this article belongs to ELSEVIER SCIENCE BV, SARA BURGERHARTSTRAAT 25, PO BOX 211, 1000 AE AMSTERDAM, NETHERLANDS |
Department/Centre: | Division of Interdisciplinary Sciences > Supercomputer Education & Research Centre |
Date Deposited: | 12 Aug 2017 06:56 |
Last Modified: | 12 Aug 2017 06:56 |
URI: | http://eprints.iisc.ac.in/id/eprint/57631 |