Please use this identifier to cite or link to this item: https://doi.org/10.1007/s11227-024-06008-w

Full metadata record
DC Field | Value | Language
dc.contributor.author | Martínez Sánchez, Pablo Antonio | -
dc.contributor.author | Bernabé García, Gregorio | -
dc.contributor.author | García Carrasco, José Manuel | -
dc.date.accessioned | 2024-04-08T10:52:06Z | -
dc.date.available | 2024-04-08T10:52:06Z | -
dc.date.issued | 2024-03-25 | -
dc.identifier.citation | The Journal of Supercomputing, 2024 | es
dc.identifier.issn | Print: 0920-8542 | -
dc.identifier.issn | Electronic: 1573-0484 | -
dc.identifier.uri | http://hdl.handle.net/10201/140581 | -
dc.description | © The Author(s) 2024. This manuscript version is made available under the CC-BY 4.0 license http://creativecommons.org/licenses/by/4.0/ This document is the Published Manuscript version of a Published Work that appeared in final form in The Journal of Supercomputing. To access the final edited and published work see https://doi.org/10.1007/s11227-024-06008-w | -
dc.description.abstract | In the era of heterogeneous computing, a new paradigm called accelerator level parallelism (ALP) has emerged. In ALP, accelerators are used concurrently to provide unprecedented levels of performance and energy efficiency. Reaching that goal requires solving many problems, one of the most challenging being co-execution. In this paper, we present a new scheduling framework called POAS, a general method for providing co-execution to applications. Our proposal consists of four steps: predict, optimize, adapt and schedule. With POAS, an unseen application can be executed concurrently in ALP with little effort. We evaluate POAS on a heterogeneous environment consisting of CPUs, GPUs (CUDA cores), and XPUs (Tensor cores) on two different fields, namely linear algebra (matrix multiplication benchmark) and deep learning (convolution benchmark). Our experiments show that POAS provides excellent performance and completes the tasks within a time very close to the optimal time for the hardware and applications used, with a negligible execution time overhead. Moreover, the POAS predictor performed exceptionally well, achieving very low RMSE values for both use cases. Therefore, POAS can be a valuable tool for fully exploiting ALP and improving overall performance over offloading in heterogeneous settings. | es
dc.format | application/pdf | es
dc.format.extent | 28 | es
dc.language | eng | es
dc.publisher | Springer | -
dc.relation | Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. Grant No. (TED2021-129221B-I00) funded by MCIN/AEI/10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR,” and Grant No. (PID2022-136315OB-I00) funded by MCIN/AEI/10.13039/501100011033/ and by “ERDF A way of making Europe,” EU. | es
dc.rights | info:eu-repo/semantics/openAccess | es
dc.rights | Attribution 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | *
dc.subject | High performance computing | es
dc.subject | Heterogeneous computing | es
dc.subject | Accelerator level parallelism | es
dc.subject | Scheduling | es
dc.subject | Co execution | es
dc.title | POAS: a framework for exploiting accelerator level parallelism in heterogeneous environments | es
dc.type | info:eu-repo/semantics/article | es
dc.embargo.terms | 2025-03-25 | -
dc.identifier.doi | https://doi.org/10.1007/s11227-024-06008-w | -
dc.contributor.department | Departamento de Ingeniería y Tecnología de Computadores | -
Appears in collections: Articles

Files in this item:
File | Description | Size | Format
JS24pub.pdf | | 2.16 MB | Adobe PDF


This item is licensed under a Creative Commons license.