
Single-Dimension Software Pipelining for Multidimensional Loops

Rong, Hongbo and Tang, Zhizhong and Govindarajan, R and Douillet, Alban and Gao, Guang R (2007) Single-Dimension Software Pipelining for Multidimensional Loops. In: ACM Transactions on Architecture and Code Optimization (TACO), 4 (1). pp. 1-44.

Official URL: http://delivery.acm.org/10.1145/1220000/1216550/p1...


Traditionally, software pipelining is applied either to the innermost loop of a given loop nest or from the innermost loop to outer loops. This paper proposes a three-step approach, called single-dimension software pipelining (SSP), to software pipeline a loop nest at an arbitrary loop level that has a rectangular iteration space and contains no sibling inner loops. The first step identifies the most profitable loop level for software pipelining in terms of initiation rate, data reuse potential, or any other optimization criterion. The second step simplifies the multidimensional data-dependence graph (DDG) of the selected loop level into a one-dimensional (1D) DDG and constructs a 1D schedule. Based on the 1D schedule, the third step derives a simple mapping function that specifies the schedule time for the operation instances in the multidimensional loop. Classical modulo scheduling is subsumed by SSP as a special case. SSP is also closely related to hyperplane scheduling and, in fact, extends it to be resource constrained. We prove that SSP schedules are correct and at least as efficient as those generated by traditional modulo scheduling methods. We extend SSP to schedule imperfect loop nests, which are most common at the instruction level. Multiple initiation intervals are naturally allowed to improve execution efficiency. Feasibility and correctness of our approach are verified by a prototype implementation in the ORC compiler for the IA-64 architecture, tested with loop nests from Livermore and SPEC2000 floating-point benchmarks. Preliminary experimental results reveal that, compared to modulo scheduling, software pipelining at an appropriate loop level results in significant performance improvement. Software pipelining is beneficial even with prior loop transformations.
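
As a rough illustration of the special case the abstract mentions, the sketch below shows how classical modulo scheduling maps a one-dimensional schedule onto operation instances of a single (innermost) loop: an operation placed at cycle sigma(op) within one iteration starts at sigma(op) + i * II in iteration i, where II is the initiation interval. This is not the SSP mapping function, which the abstract does not spell out. The operation names, the example schedule, and the helper names `schedule_time` and `modulo_conflicts` are assumptions made for the example, and the resource model (a single pool of identical functional units) is deliberately simplistic.

```python
from collections import Counter


def schedule_time(sigma, ii, op, iteration):
    """Start cycle of `op` in the given iteration under classical modulo scheduling.

    sigma maps each operation to its start cycle within one iteration;
    ii is the initiation interval (cycles between successive iteration starts).
    """
    return sigma[op] + iteration * ii


def modulo_conflicts(sigma, ii, units):
    """Return the modulo slots whose demand exceeds `units` identical functional units.

    In steady state, every operation occupies slot (start cycle mod ii),
    so operations sharing a slot compete for the same units.
    """
    usage = Counter(start % ii for start in sigma.values())
    return {slot: count for slot, count in usage.items() if count > units}


# Hypothetical 1D schedule for a three-operation loop body.
sigma = {"load": 0, "mul": 2, "store": 5}
ii = 2

print(schedule_time(sigma, ii, "mul", 3))
print(modulo_conflicts(sigma, ii, units=2))
print(modulo_conflicts(sigma, ii, units=1))
```

With one unit, `load` and `mul` collide in slot 0, so a scheduler would have to move one of them or raise II; with two units the same schedule fits.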

Item Type: Journal Article
Publication: ACM Transactions on Architecture and Code Optimization (TACO)
Publisher: ACM
Additional Information: Copyright of this article belongs to ACM
Keywords: Algorithms; Languages; Software pipelining; modulo scheduling; loop transformation
Department/Centre: Division of Interdisciplinary Sciences > Supercomputer Education & Research Centre
Date Deposited: 18 Dec 2008 07:53
Last Modified: 19 Sep 2010 04:52
URI: http://eprints.iisc.ac.in/id/eprint/16491
