Please use this identifier to cite or link to this item: 10.1016/j.jpdc.2022.11.007

Title: Speculative Inter-Thread Store-to-Load Forwarding in SMT Architectures
Publication date: 1-Mar-2023
Citation: Journal of Parallel and Distributed Computing (JPDC)
ISSN: 0743-7315
Keywords: Simultaneous multithreading
Memory consistency
Store-to-load forwarding
Multiple-copy atomicity
Abstract: Applications running on out-of-order cores have benefited for decades from store-to-load forwarding, which accelerates the communication of store values to loads of the same thread. Although threads running on a simultaneous multithreading (SMT) core could also access the load queues (LQs) and store queues (SQs) / store buffers (SBs) of other threads to enable inter-thread store-to-load forwarding, this opportunity has gone unexploited because allowing different SMT threads to communicate via their LQs and SQs/SBs may violate write atomicity with respect to the outside world, beyond the acceptable model of read-own-write-early multiple-copy atomicity (rMCA). In our prior work, we leveraged this idea to propose inter-thread store-to-load forwarding (ITSLF). ITSLF accelerates synchronization and communication between threads running on an SMT processor by allowing stores in the store queue of one thread to forward data to loads of another thread running on the same core without violating rMCA. In this work, we extend the original ITSLF mechanism to allow inter-thread forwarding from speculative stores (Spec-ITSLF). Spec-ITSLF forwards store values to other threads earlier, which further accelerates synchronization. Spec-ITSLF outperforms a baseline SMT core by 15%, which is 2% better on average (and up to 5% better for the TATP workload) than the original ITSLF mechanism. More importantly, Spec-ITSLF is on par with the original ITSLF mechanism regarding storage overhead but does not need to keep track of the speculative state of stores, which was an important source of overhead and complexity in the original mechanism.
Main author(s): Feliu, Josué
Ros, Alberto
Acacio, Manuel E.
Kaxiras, Stefanos
Faculty/Departments/Services: Faculties, Departments, Services and Schools::UMU Departments::Ingeniería y Tecnología de Computadores
URI: http://hdl.handle.net/10201/132503
DOI: 10.1016/j.jpdc.2022.11.007
Document type: info:eu-repo/semantics/article
Number of pages / Extent: 27
Rights: info:eu-repo/semantics/openAccess
Attribution-NonCommercial-NoDerivatives 4.0 International
Description: © 2023 ACM, Inc. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ This document is the Accepted Manuscript version of a Published Work that appeared in final form in Journal of Parallel and Distributed Computing. To access the final edited and published work see DOI: 10.1016/j.jpdc.2022.11.007
Appears in collections: Artículos: Ingeniería y Tecnología de Computadores

Files in this item:
File: Spec-ITSLF.pdf
Size: 961.99 kB
Format: Adobe PDF


This item is subject to a Creative Commons license.