
Tile Size and Loop Order Selection using Machine Learning for Multi-/Many-Core Architectures

Babalad, S and Shevade, S and Thazhuthaveetil, MJ and Govindarajan, R (2024) Tile Size and Loop Order Selection using Machine Learning for Multi-/Many-Core Architectures. In: 38th ACM International Conference on Supercomputing, ICS 2024, 4 June 2024 through 7 June 2024, Kyoto, Japan, pp. 388-399.

PDF: ICS_2024.pdf - Published Version (1MB)
Official URL: https://doi.org/10.1145/3650200.3656630

Abstract

Loop tiling and loop interchange (or permutation) are techniques that can expose task- and data-level parallelism and exploit the data locality available in multi-dimensional loop nests. Choosing an appropriate tile size and loop order is important for achieving a significant performance improvement. However, the effect of these transformations on the performance of a loop nest is not straightforward, owing to the complex interplay of several architectural features in multi-/many-core architectures. In this work, we propose a supervised learning technique and develop a Support Vector Machine (SVM) based hierarchical classifier to identify the best-performing tile size and loop order for a given loop nest. Our approach identifies tile sizes and loop orders whose performance, on average, is within 18% and 9% of the optimal performance for two sets of loop nests on the Intel Xeon Cascade Lake architecture. Further, our method outperforms the state-of-the-art techniques Pluto and Polly, with a geometric mean speedup of 1.35x to 1.58x. © 2024 Owner/Author.
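
As a minimal illustration of the two transformations named in the abstract, the sketch below tiles a matrix-multiplication loop nest and lets the caller pick the tile sizes and the order of the inter-tile loops. It is not the paper's classifier or its search space. The function name `tiled_matmul` and its `tile` and `order` parameters are hypothetical, chosen only for this example. The point it shows is that every legal tile size and loop order computes the same result, so the choice affects only performance.

```python
from itertools import permutations, product


def tiled_matmul(a, b, tile=(2, 2, 2), order="ijk"):
    """Multiply a (n x m) by b (m x p) with a tiled loop nest.

    tile  -- tile sizes for the i, j and k loops (illustrative parameter)
    order -- permutation of "ijk" giving the nesting of the inter-tile loops
    """
    n, m, p = len(a), len(b), len(b[0])
    extent = {"i": n, "j": p, "k": m}
    size = dict(zip("ijk", tile))
    c = [[0] * p for _ in range(n)]

    # Inter-tile loops: the first letter of `order` is the outermost loop,
    # because itertools.product advances its last argument fastest.
    starts = [range(0, extent[v], size[v]) for v in order]
    for origin in product(*starts):
        lo = dict(zip(order, origin))
        # Intra-tile loops; min() clips partial tiles at the matrix edges.
        for i in range(lo["i"], min(lo["i"] + size["i"], n)):
            for j in range(lo["j"], min(lo["j"] + size["j"], p)):
                for k in range(lo["k"], min(lo["k"] + size["k"], m)):
                    c[i][j] += a[i][k] * b[k][j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
reference = tiled_matmul(a, b, tile=(2, 2, 3), order="ijk")

# Every loop order and a different tile size give the same product.
same = all(
    tiled_matmul(a, b, tile=(1, 2, 2), order="".join(o)) == reference
    for o in permutations("ijk")
)
```

In pure Python the variants run at similar speed. In compiled code on a real cache hierarchy the same choices can change run time substantially, which is the selection problem the abstract describes.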

Item Type: Conference Paper
Publication: Proceedings of the International Conference on Supercomputing
Publisher: Association for Computing Machinery
Additional Information: The copyright for this article belongs to Authors.
Keywords: Architecture; Computer architecture; Learning systems; Hierarchical classifiers; Loop nests; Loop order; Loop transformation; Parallelizations; Performance; Support vectors machine; Tile size; Vectorization; Vectorization and parallelization; Support vector machines
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Division of Interdisciplinary Sciences > Supercomputer Education & Research Centre
Date Deposited: 30 Jul 2024 10:32
Last Modified: 30 Jul 2024 10:32
URI: http://eprints.iisc.ac.in/id/eprint/85619
