Extended histories: improving regularity and performance in correlation prefetchers

Manikantan, R and Govindarajan, R and Rajan, Kaushik (2011) Extended histories: improving regularity and performance in correlation prefetchers. In: HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers, 2011, New York, NY, USA.

Official URL: http://dx.doi.org/10.1145/1944862.1944875

Abstract

Data prefetchers identify and exploit any regularity present in the history (training) stream to predict future references and prefetch them into the cache. The training information is typically the stream of primary misses seen at a particular cache level, which is a filtered version of the accesses seen by that cache. In this work we demonstrate that extending the training information to include secondary misses and hits, along with primary misses, improves prefetcher performance. In addition to empirical evaluation, we use the information-theoretic metric of entropy to quantify the regularity present in extended histories. Entropy measurements indicate that extended histories are more regular than the default training stream of primary misses only, and they corroborate our empirical findings. With extended histories, further benefits can be achieved by also triggering prefetches on secondary misses. In this paper we explore the design space of extended prefetch histories and alternative prefetch trigger points for delta correlation prefetchers. We observe that different prefetch schemes benefit to different extents from extended histories and alternative trigger points, and that the best-performing design point varies from benchmark to benchmark. To address this, we propose a simple adaptive scheme that identifies the best-performing design point for a benchmark-prefetcher combination at runtime. On SPEC2000 benchmarks, using all L2 accesses as prefetcher history improves performance, in terms of both IPC and misses reduced, over techniques that use only primary misses as history. The adaptive scheme improves the performance of the CZone prefetcher over the baseline by 4.6% on average. These performance gains are accompanied by a moderate reduction in memory traffic requirements.
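
The abstract does not give the exact entropy formulation used, so the following is only an illustrative sketch. It assumes zeroth-order Shannon entropy computed over the deltas between consecutive block addresses, and it uses a contrived toy trace; the names `deltas`, `shannon_entropy`, `trace`, and the outcome labels are hypothetical and do not come from the paper. It shows how a primary-miss-only stream can look less regular than the full access stream it was filtered from.

```python
from collections import Counter
from math import log2


def deltas(addresses):
    """Differences between consecutive addresses in a stream."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


def shannon_entropy(symbols):
    """Zeroth-order Shannon entropy of a symbol sequence, in bits per symbol."""
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical toy trace of (block address, outcome) pairs for a unit-stride
# access pattern. Outcome labels are assumptions made for this sketch.
outcomes = ["miss", "secondary", "hit", "miss", "miss", "hit",
            "hit", "miss", "secondary", "miss", "hit", "hit"]
trace = [(100 + i, outcome) for i, outcome in enumerate(outcomes)]

# Default training stream: primary misses only.
primary_stream = [addr for addr, outcome in trace if outcome == "miss"]
# Extended history: primary misses, secondary misses, and hits.
extended_stream = [addr for addr, _ in trace]

primary_entropy = shannon_entropy(deltas(primary_stream))
extended_entropy = shannon_entropy(deltas(extended_stream))

print(f"primary-miss-only delta entropy: {primary_entropy:.2f} bits")
print(f"extended-history delta entropy:  {extended_entropy:.2f} bits")
```

In this toy trace the filtered stream hides the underlying unit stride, so its delta entropy is higher than that of the full stream. The measurements reported in the paper are on real benchmark traces and may use a different entropy definition.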

Item Type: Conference Proceedings
Publisher: Association for Computing Machinery
Additional Information: Copyright of this article belongs to Association for Computing Machinery.
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 19 Mar 2013 05:26
Last Modified: 19 Mar 2013 05:26
URI: http://eprints.iisc.ac.in/id/eprint/46027
