Please use this identifier to cite or link to this item:
https://doi.org/10.1109/TC.2017.2769641
Title: | Static Instruction Scheduling for High Performance on Limited Hardware |
Publication date: | 1-Apr-2018 |
Publisher: | IEEE |
Bibliographic citation: | IEEE Transactions on Computers, vol. 67, no. 4, pp. 513-527 |
ISSN: | Print: 0018-9340 Electronic: 1557-9956 |
Keywords: | Compilers; Code generation; Memory management; Optimization |
Abstract: | Complex out-of-order (OoO) processors have been designed to overcome the restrictions of outstanding long-latency misses at the cost of increased energy consumption. Simple, limited OoO processors are a compromise in terms of energy consumption and performance, as they have fewer hardware resources to tolerate the penalties of long-latency loads. In the worst case, these loads may stall the processor entirely. We present Clairvoyance, a compiler-based technique that generates code able to hide memory latency and better utilize simple OoO processors. By clustering loads found across basic block boundaries, Clairvoyance overlaps the outstanding latencies to increase memory-level parallelism. We show that these simple OoO processors, equipped with the appropriate compiler support, can effectively hide long-latency loads and achieve performance improvements for memory-bound applications. To this end, Clairvoyance tackles (i) statically unknown dependencies, (ii) insufficient independent instructions, and (iii) register pressure. Clairvoyance achieves a geomean execution time improvement of 14 percent for memory-bound applications, on top of standard O3 optimizations, while maintaining the high performance of compute-bound applications. |
Main author(s): | Tran, Kim-Anh; Carlson, Trevor E.; Koukos, Konstantinos; Själander, Magnus; Spiliopoulos, Vasileios; Kaxiras, Stefanos; Jimborean, Alexandra |
Faculty/Departments/Services: | Facultades, Departamentos, Servicios y Escuelas::Departamentos de la UMU::Ingeniería y Tecnología de Computadores |
Publisher's version: | https://ieeexplore.ieee.org/document/8094900 |
URI: | http://hdl.handle.net/10201/138783 |
DOI: | https://doi.org/10.1109/TC.2017.2769641 |
Document type: | info:eu-repo/semantics/article |
Number of pages / Extent: | 15 |
Rights: | info:eu-repo/semantics/openAccess Attribution-NonCommercial-NoDerivatives 4.0 International |
Description: | © 2018 IEEE. This document is made available under the CC-BY-NC-ND 4.0 license: http://creativecommons.org/licenses/by-nc-nd/4.0. This document is the accepted version of a published work that appeared in final form in IEEE Transactions on Computers, vol. 67, no. 4, pp. 513-527. To access the final work, see DOI: https://doi.org/10.1109/TC.2017.2769641 |
Appears in collections: | Artículos: Ingeniería y Tecnología de Computadores |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
2018_TC_Kim_Anh_Tran.pdf | | 1.45 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons license.