Bounding speculative execution of atomic regions to a single retry

Gómez Hernández, Eduardo José; Cebrián, Juan M.; Kaxiras, Stefanos; Ros Bardisa, Alberto

Please use this identifier to cite or link to this item: https://doi.org/10.1145/3622781.3674176


Title:	Bounding speculative execution of atomic regions to a single retry
Publication date:	1 May 2024
Publisher:	Association for Computing Machinery
ISBN:	979-8-4007-0391-1
Abstract:	Mutual exclusion has long served as a fundamental construct in parallel programs. Despite a long history of optimizing the lower-level lock and unlock operations used to enforce mutual exclusion, such operations largely dictate performance in parallel programs. Speculative Lock Elision, and more generally Hardware Transactional Memory, allow executing atomic regions (ARs) concurrently and speculatively, and ensure correctness by using conflict detection. However, practical implementations of these ideas are best-effort and, in case of conflicts, the execution of ARs is retried a predetermined number of times before falling back to mutual exclusion. This work explores the opportunities of using cacheline locking to bound the number of retries of speculative solutions. Our key insight is that ARs that access exactly the same set of addresses when re-executing can learn that set in the first execution and execute non-speculatively in the next one by performing an ordered cacheline locking. This way the speculative execution is bounded to a single retry. We first establish the conditions for ARs to be able to re-execute under a cacheline-locked mode. Based on these conditions, we propose cleAR, cacheline-locked executed AR, a novel technique that, on the first abort, forces the re-execution to use cacheline locking. The detection and conversion to cacheline-locking mode are transparent to software. Using gem5 running data-structure benchmarks and the STAMP benchmark suite, we show that the average number of ARs that succeed on the first retry grows from 35.4% in our baseline to 64.4% with cleAR, reducing the percentage of fallback (coarse-grain mutual exclusion) execution from 37.2% to 15.4%. These improvements reduce average execution time by 35.0% over a baseline configuration and by 23.3% over more elaborate approaches like PowerTM.
Main author(s):	Gómez Hernández, Eduardo José; Cebrián, Juan M.; Kaxiras, Stefanos; Ros Bardisa, Alberto
Part of:	ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Vol. 4, pp. 17-30
Publisher's version:	https://dl.acm.org/doi/10.1145/3622781.3674176
URI:	http://hdl.handle.net/10201/154740
DOI:	https://doi.org/10.1145/3622781.3674176
Document type:	info:eu-repo/semantics/article
Number of pages / Extent:	14
Rights:	info:eu-repo/semantics/openAccess Attribution-NonCommercial-NoDerivatives 4.0 International
Description:	© 2024 Copyright is held by the owner/author(s). This manuscript version is made available under the CC-BY-NC-ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/). This document is the Published Manuscript version of a Published Work that appeared in final form in ASPLOS '24. To access the final edited and published work, see https://doi.org/10.1145/3622781.3674176.
Appears in collections:	Articles
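
The key insight in the abstract (learn the set of accessed addresses during the first, aborted execution, then re-execute non-speculatively under ordered cacheline locking) can be illustrated with a small software analogy. This is only a sketch under stated assumptions, not the authors' hardware mechanism: cleAR works in the cache hierarchy and is transparent to software, whereas here Python locks stand in for locked cachelines. The 64-byte line size and every name below (`LineLocks`, `RecordingMemory`, `learn_footprint`, `run_locked`, `demo`) are hypothetical, and the first attempt is simply treated as aborted rather than detected by conflict detection.

```python
import threading

CACHELINE_BYTES = 64  # assumed line size, not taken from the paper


def cacheline_of(addr):
    """Map a byte address to the index of the cacheline that holds it."""
    return addr // CACHELINE_BYTES


class LineLocks:
    """One lock per cacheline, created on demand (stand-in for hardware line locking)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def get(self, line):
        with self._guard:
            return self._locks.setdefault(line, threading.Lock())


class RecordingMemory:
    """First attempt: records the lines touched and buffers writes, which are discarded."""

    def __init__(self, backing):
        self.backing, self.lines, self.writes = backing, set(), {}

    def load(self, addr):
        self.lines.add(cacheline_of(addr))
        return self.writes.get(addr, self.backing.get(addr, 0))

    def store(self, addr, value):
        self.lines.add(cacheline_of(addr))
        self.writes[addr] = value  # never reaches `backing`


class DirectMemory:
    """Retry: reads and writes go straight to the backing store."""

    def __init__(self, backing):
        self.backing = backing

    def load(self, addr):
        return self.backing.get(addr, 0)

    def store(self, addr, value):
        self.backing[addr] = value


def learn_footprint(region, backing):
    """Run the region once as if it aborted; return the set of cachelines it accessed."""
    recorder = RecordingMemory(backing)
    region(recorder)
    return recorder.lines


def run_locked(region, footprint, backing, line_locks):
    """Re-execute non-speculatively, locking the learned lines in ascending order."""
    held = [line_locks.get(line) for line in sorted(footprint)]  # global order avoids deadlock
    for lock in held:
        lock.acquire()
    try:
        region(DirectMemory(backing))
    finally:
        for lock in reversed(held):
            lock.release()


def make_transfer(src, dst, amount):
    """An atomic region whose address set does not depend on the data it reads."""
    def region(mem):
        mem.store(src, mem.load(src) - amount)
        mem.store(dst, mem.load(dst) + amount)
    return region


def demo(threads_per_direction=4, iterations=200):
    backing = {0: 1000, 128: 1000}  # two values on different cachelines
    locks = LineLocks()

    def worker(region):
        for _ in range(iterations):
            footprint = learn_footprint(region, backing)  # first execution, assumed aborted
            run_locked(region, footprint, backing, locks)  # the single retry

    regions = [make_transfer(0, 128, 1), make_transfer(128, 0, 2)]
    workers = [threading.Thread(target=worker, args=(r,))
               for r in regions for _ in range(threads_per_direction)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return backing


if __name__ == "__main__":
    print(demo())
```

The two transfer regions touch the same pair of lines in opposite order, so acquiring the locks in sorted order is what keeps the retries from deadlocking. The sketch relies on the condition the abstract names: the region must access exactly the same set of addresses when it re-executes, which holds here because the addresses are fixed and do not depend on loaded values.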

Files in this item:

File	Description	Size	Format
ejgomez-asplos24.pdf		719.08 kB	Adobe PDF


This item is licensed under a Creative Commons license.