Taming warp divergence

Anantpur, Jayvant and Govindarajan, R (2017) Taming warp divergence. In: 2017 International Symposium on Code Generation and Optimization, CGO 2017, 4 - 8 February 2017, Austin, pp. 50-60.


Official URL: https://doi.org/10.1109/CGO.2017.7863728

Abstract

Graphics Processing Units (GPUs) are designed to exploit large amounts of parallelism. However, warp-level divergence, caused by differing amounts of work, differing memory access latencies, etc., results in the warps of a thread block (TB) finishing kernel execution at different points in time. This, in effect, reduces the utilization of the resources of the streaming multiprocessors (SMs) and hence the performance of the GPU. We propose a simple and elegant technique to eliminate the waiting time of warps at the end of kernel execution and improve performance. The proposed technique uses the idea of persistent threads to define virtual thread blocks and virtual warps. This enables a virtual warp that finishes early to initiate the execution of another warp from a subsequent thread block, avoiding the unnecessary wait for sibling warps and the resulting resource underutilization. Further, this technique enables us to design a warp scheduling algorithm that is aware of the progress made by the virtual thread blocks and virtual warps, and uses this knowledge to prioritise warps effectively. The proposed approach is implemented using a simple source-to-source transformation and minimal hardware support. Evaluation of the proposed approach on a diverse set of kernels on the GPGPU-Sim simulator reveals a geometric mean improvement of 1.06x over the baseline architecture that uses the Greedy Then Oldest (GTO) warp scheduler and 1.09x over the Loose Round Robin (LRR) warp scheduler.
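The waiting-time effect described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's transformation or scheduler. It is a simplified simulation under stated assumptions: each warp has a fixed duration, there is no contention between warps, and barriers and shared memory are ignored. The function names and the sample workload are hypothetical. The baseline frees a thread-block slot only when the slowest warp of the block finishes, while the virtual-warp variant lets each warp slot pull the next pending warp of work as soon as it becomes free.

```python
import heapq


def baseline_makespan(blocks, slots):
    """Whole thread blocks occupy a slot until their slowest warp finishes."""
    free_at = [0] * slots          # time at which each block slot becomes free
    heapq.heapify(free_at)
    finish = 0
    for warps in blocks:
        start = heapq.heappop(free_at)   # earliest free block slot
        end = start + max(warps)         # fast warps wait for the slowest sibling
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


def virtual_warp_makespan(blocks, slots, warps_per_block):
    """Each warp slot takes the next warp of work as soon as it is free."""
    free_at = [0] * (slots * warps_per_block)  # one entry per warp slot
    heapq.heapify(free_at)
    finish = 0
    for warps in blocks:                 # work is handed out in block order
        for duration in warps:
            start = heapq.heappop(free_at)   # earliest free warp slot
            end = start + duration
            heapq.heappush(free_at, end)
            finish = max(finish, end)
    return finish


# Hypothetical workload: four blocks of two warps with unbalanced durations.
blocks = [[4, 1], [4, 1], [1, 4], [1, 4]]
print(baseline_makespan(blocks, slots=1))                         # 16
print(virtual_warp_makespan(blocks, slots=1, warps_per_block=2))  # 11
```

In this toy workload the baseline needs 16 time units because every block holds its slot for 4 units, whereas letting early-finishing warp slots start later work brings the total down to 11. The numbers only illustrate the mechanism and say nothing about the speedups reported in the paper.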

Item Type:	Conference Paper
Publisher:	Institute of Electrical and Electronics Engineers Inc.
Additional Information:	The copyright for this article belongs to the Institute of Electrical and Electronics Engineers Inc.
Keywords:	Divergence; GPU; Warp Scheduling
Department/Centre:	Division of Interdisciplinary Sciences > Supercomputer Education & Research Centre
Date Deposited:	08 Jun 2022 06:40
Last Modified:	08 Jun 2022 06:40
URI:	https://eprints.iisc.ac.in/id/eprint/73182
