
Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs

Kong, Martin and Pop, Antoniu and Pouchet, Louis-Noel and Govindarajan, R and Cohen, Albert and Sadayappan, P (2014) Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs. In: ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 11 (4).

Official URL: http://dx.doi.org/10.1145/2687652

Abstract

Task-parallel languages are increasingly popular. Many of them provide expressive mechanisms for intertask synchronization. For example, OpenMP 4.0 will integrate data-driven execution semantics derived from the StarSs research language. Compared to the more restrictive data-parallel and fork-join concurrency models, the advanced features being introduced into task-parallel models in turn enable improved scalability through load balancing, memory latency hiding, mitigation of the pressure on memory bandwidth, and, as a side effect, reduced power consumption. In this article, we develop a systematic approach to compile loop nests into concurrent, dynamically constructed graphs of dependent tasks. We propose a simple and effective heuristic that selects the most profitable parallelization idiom for every dependence type and communication pattern. This heuristic enables the extraction of interband parallelism (cross-barrier parallelism) in a number of numerical computations that range from linear algebra to structured grids and image processing. The proposed static analysis and code generation alleviate the burden of a full-blown dependence resolver to track the readiness of tasks at runtime. We evaluate our approach and algorithms in the PPCG compiler, targeting OpenStream, a representative dataflow task-parallel language with explicit intertask dependences and a lightweight runtime. Experimental results demonstrate the effectiveness of the approach.
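
As a rough illustration of what "dynamically constructed graphs of dependent tasks" over tiles can mean, the following Python sketch runs one task per tile as soon as its predecessor tiles have finished, using point-to-point dependence counters instead of a barrier between wavefronts. It is not the article's algorithm and not OpenStream or PPCG code. The names `tile_deps`, `run_dataflow`, and `demo`, and the dependence pattern itself (each tile depends on the tile above and the tile to its left), are assumptions chosen for illustration only.

```python
from collections import deque


def tile_deps(i, j):
    """Predecessors of tile (i, j): the tile above and the tile to the left.

    Assumed dependence pattern, for illustration only.
    """
    deps = []
    if i > 0:
        deps.append((i - 1, j))
    if j > 0:
        deps.append((i, j - 1))
    return deps


def run_dataflow(n_rows, n_cols, work):
    """Execute work(i, j) for every tile once all of its predecessors are done."""
    # Count of unfinished predecessors for each tile.
    pending = {(i, j): len(tile_deps(i, j))
               for i in range(n_rows) for j in range(n_cols)}
    # Successor lists derived from the same dependence relation.
    succs = {t: [] for t in pending}
    for t in pending:
        for d in tile_deps(*t):
            succs[d].append(t)

    ready = deque(t for t, count in pending.items() if count == 0)
    order = []
    while ready:
        t = ready.popleft()
        work(*t)
        order.append(t)
        # Point-to-point notification: only direct successors are updated.
        for s in succs[t]:
            pending[s] -= 1
            if pending[s] == 0:
                ready.append(s)
    return order


def demo(n):
    """Fill an n-by-n grid where each cell is 1 + the cell above + the cell to the left."""
    grid = [[None] * n for _ in range(n)]

    def work(i, j):
        up = grid[i - 1][j] if i > 0 else 0
        left = grid[i][j - 1] if j > 0 else 0
        # Raises TypeError if a predecessor has not been computed yet.
        grid[i][j] = 1 + up + left

    order = run_dataflow(n, n, work)
    return grid, order
```

The sketch is sequential: a single loop drains the ready queue. In a real task-parallel runtime, several workers would take ready tiles concurrently, and, as the abstract notes, static analysis would remove the need for a general dependence resolver at run time.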

Item Type: Journal Article
Publication: ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION
Publisher: ASSOC COMPUTING MACHINERY
Additional Information: Copyright for this article belongs to the ASSOC COMPUTING MACHINERY, 2 PENN PLAZA, STE 701, NEW YORK, NY 10121-0701 USA
Keywords: Languages; Performance; Compilers; Task Parallelism; Dataflow; point-to-point synchronization; auto-parallelization; polyhedral framework; polyhedral compiler; tiling; dynamic wavefront; dependence partitioning; tile dependences
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 24 Feb 2015 06:39
Last Modified: 24 Feb 2015 06:39
URI: http://eprints.iisc.ac.in/id/eprint/50962
