Publication: An autotuning approach to select the inter-GPU communication library on heterogeneous systems
Authors
Cámara, Jesús ; Cuenca Muñoz, Antonio Javier ; Cuenca, Javier ; Galindo, Víctor ; Vicente, Arturo ; Boratto, Murilo
Publisher
Springer
DOI
https://doi.org/10.1007/s11227-024-06794-3
Type
info:eu-repo/semantics/article
Description
Abstract
In this work, an automatic optimisation approach for parallel routines on multi-GPU systems is presented. Several inter-GPU communication libraries (such as CUDA-Aware MPI or NCCL) are used with a set of routines to perform the numerical operations among the GPUs located on the compute nodes. The main objective is the selection of the most appropriate communication library, the number of GPUs to be used and the workload to be distributed among them in order to reduce the cost of data movements, which represent a large percentage of the total execution time. To this end, a hierarchical modelling of the execution time of each routine to be optimised is proposed, combining experimental and theoretical approaches. The results show that near-optimal decisions are taken in all the scenarios analysed.
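The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the cost formula, the function names (`predicted_time`, `select_configuration`) and every value in `EXAMPLE_PARAMS` are hypothetical, chosen only to show how a modelled execution time can drive the choice of communication library and GPU count. Workload distribution among GPUs is assumed to be even and is not tuned here.

```python
from itertools import product

# Hypothetical parameters (assumed values, not measurements from the paper):
# seconds per floating-point operation, plus a per-message latency and a
# per-byte transfer cost for each communication library.
EXAMPLE_PARAMS = {
    "flop_time": 1e-12,
    "cuda_aware_mpi": {"latency": 2e-5, "per_byte": 1e-10},
    "nccl": {"latency": 1e-5, "per_byte": 5e-11},
}


def predicted_time(n, gpus, library, params):
    """Toy execution-time model for an n x n routine (assumption).

    Compute time shrinks with the number of GPUs; communication time
    grows with it and depends on the library's latency and bandwidth.
    """
    lib = params[library]
    compute = params["flop_time"] * n**3 / gpus
    if gpus == 1:
        return compute  # a single GPU needs no inter-GPU transfers
    bytes_per_gpu = 8 * n * n / gpus  # double-precision block per GPU
    comm = (gpus - 1) * (lib["latency"] + lib["per_byte"] * bytes_per_gpu)
    return compute + comm


def select_configuration(n, max_gpus, libraries, params):
    """Return the (library, gpus) pair with the lowest modelled time."""
    candidates = product(libraries, range(1, max_gpus + 1))
    return min(candidates, key=lambda c: predicted_time(n, c[1], c[0], params))


if __name__ == "__main__":
    libs = ["cuda_aware_mpi", "nccl"]
    for size in (100, 8000):
        print(size, select_configuration(size, 4, libs, EXAMPLE_PARAMS))
```

Under these assumed parameters, small problems stay on a single GPU because communication dominates, while large problems spread over all available GPUs with the library whose modelled transfer cost is lower.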
Citation
Journal of Supercomputing, 2025, Vol. 81, 283
This item is subject to a Creative Commons license. http://creativecommons.org/licenses/by/4.0/



