PLUTO plus : Near-Complete Modeling of Affine Transformations for Parallelism and Locality

Acharya, Aravind and Bondhugula, Uday (2015) PLUTO plus : Near-Complete Modeling of Affine Transformations for Parallelism and Locality. In: ASSOC COMPUTING MACHINERY, 2 PENN PLAZA, STE 701, NEW YORK, NY 10121-0701 USA . pp. 54-64.

ACM_Sig_Not_50-8_54_2015.pdf - Published Version
Official URL: http://dx.doi.org/10.1145/2688500.2688512

Abstract

Affine transformations have proven to be very powerful for loop restructuring due to their ability to model a very wide range of transformations. A single multi-dimensional affine function can represent a long and complex sequence of simpler transformations. Existing affine transformation frameworks such as the Pluto algorithm, which include a cost function for modern multicore architectures where coarse-grained parallelism and locality are crucial, consider only a sub-space of transformations to avoid a combinatorial explosion in finding the transformations. The ensuing practical tradeoffs lead to the exclusion of certain useful transformations, in particular, transformation compositions involving loop reversals and loop skewing by negative factors. In this paper, we propose an approach to address this limitation by modeling a much larger space of affine transformations in conjunction with the Pluto algorithm's cost function. We perform an experimental evaluation of both the effect on compilation time and the performance of the generated code. The evaluation shows that our new framework, Pluto+, causes no performance degradation on any of the Polybench benchmarks. For Lattice Boltzmann Method (LBM) codes with periodic boundary conditions, it provides a mean speedup of 1.33x over Pluto. We also show that Pluto+ does not increase compile times significantly. Experimental results on Polybench show that Pluto+ increases overall polyhedral source-to-source optimization time by only 15%. In cases where it improves execution time significantly, it increases polyhedral optimization time by only 2.04x.

Item Type: Journal Article
Publication: ASSOC COMPUTING MACHINERY, 2 PENN PLAZA, STE 701, NEW YORK, NY 10121-0701 USA
Additional Information: Copyright for this article belongs to the ASSOC COMPUTING MACHINERY, 2 PENN PLAZA, STE 701, NEW YORK, NY 10121-0701 USA
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 08 Oct 2016 06:26
Last Modified: 08 Oct 2016 06:26
URI: http://eprints.iisc.ac.in/id/eprint/54720
