Dynamic and Speculative Polyhedral Parallelization Using Compiler-Generated Skeletons

Jimborean, Alexandra; Clauss, Philippe; Dollinger, Jean-François; Loechner, Vincent; Martinez Caamaño, Juan Manuel

Please use this identifier to cite or link to this item: 10.1007/s10766-013-0259-4


Title:	Dynamic and Speculative Polyhedral Parallelization Using Compiler-Generated Skeletons
Publication date:	9-Aug-2013
Publisher:	Springer
Citation:	International Journal of Parallel Programming (2014) 42:529–545
ISSN:	0885-7458
Keywords:	Algorithmic skeletons; Polytope model; Automatic parallelization; Dynamic parallelization; Loop nests; Compilation; Speculative parallelization
Abstract:	We propose a framework based on an original generation and use of algorithmic skeletons, and dedicated to speculative parallelization of scientific nested loop kernels, able to apply at run-time polyhedral transformations to the target code in order to exhibit parallelism and data locality. Parallel code generation is achieved almost at no cost by using binary algorithmic skeletons that are generated at compile-time, and that embed the original code and operations devoted to instantiate a polyhedral parallelizing transformation and to verify the speculations on dependences. The skeletons are patched at run-time to generate the executable code. The run-time process includes a transformation selection guided by online profiling phases on short samples, using an instrumented version of the code. During this phase, the accessed memory addresses are used to compute on-the-fly dependence distance vectors, and are also interpolated to build a predictor of the forthcoming accesses. Interpolating functions and distance vectors are then employed for dependence analysis to select a parallelizing transformation that, if the prediction is correct, does not induce any rollback during execution. In order to ensure that the rollback time overhead stays low, the code is executed in successive slices of the outermost original loop of the nest. Each slice can be either a parallel version which instantiates a skeleton, a sequential original version, or an instrumented version. Moreover, such slicing of the execution provides the opportunity of transforming differently the code to adapt to the observed execution phases, by patching differently one of the pre-built skeletons. The framework has been implemented with extensions of the LLVM compiler and an x86-64 runtime system. Significant speed-ups are shown on a set of benchmarks that could not have been handled efficiently by a compiler.
Main author(s):	Jimborean, Alexandra; Clauss, Philippe; Dollinger, Jean-François; Loechner, Vincent; Martinez Caamaño, Juan Manuel
URI:	http://hdl.handle.net/10201/138367
DOI:	10.1007/s10766-013-0259-4
Document type:	info:eu-repo/semantics/article
Number of pages / Extent:	14
Rights:	info:eu-repo/semantics/embargoedAccess
Appears in collections:	Articles: Computer Engineering and Technology
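
The profiling phase described in the abstract (deriving dependence distance vectors from the accessed addresses, and interpolating those addresses into a predictor) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which operates on binary skeletons generated with LLVM. The function names, the 2-deep loop nest, the base address, and the trace below are all hypothetical, chosen only to show the idea on a small sample.

```python
def profile_sample(n=4, m=4, base=4096, elem=8):
    """Hypothetical trace of A[i][j] = A[i-1][j] + A[i][j-1]
    for 1 <= i < n, 1 <= j < m. Entries are (i, j, address)."""
    def addr(i, j):
        return base + elem * (i * m + j)

    writes, reads = [], []
    for i in range(1, n):
        for j in range(1, m):
            reads.append((i, j, addr(i - 1, j)))
            reads.append((i, j, addr(i, j - 1)))
            writes.append((i, j, addr(i, j)))
    return writes, reads


def fit_affine(samples):
    """Interpolate addr = c0 + c1*i + c2*j for ONE memory reference.

    Returns (c0, c1, c2) if that function reproduces every sample,
    or None if the reference is not affine on this sample.
    """
    lookup = {(i, j): a for i, j, a in samples}
    i0, j0, a0 = samples[0]
    up, right = lookup.get((i0 + 1, j0)), lookup.get((i0, j0 + 1))
    if up is None or right is None:
        return None  # sample too small to interpolate
    c1, c2 = up - a0, right - a0      # finite differences per loop level
    c0 = a0 - c1 * i0 - c2 * j0
    if all(c0 + c1 * i + c2 * j == a for i, j, a in samples):
        return (c0, c1, c2)
    return None


def distance_vectors(writes, reads):
    """Distance vectors of flow (read-after-write) dependences only."""
    writers = {}
    for i, j, a in writes:
        writers.setdefault(a, []).append((i, j))
    dists = set()
    for i, j, a in reads:
        for wi, wj in writers.get(a, []):
            if (wi, wj) < (i, j):     # write precedes read lexicographically
                dists.add((i - wi, j - wj))
    return dists


writes, reads = profile_sample()
print(fit_affine(writes))
print(distance_vectors(writes, reads))
```

On this sample both distance vectors have a non-zero component, so neither loop level could be run in parallel as written; a transformation would have to be chosen that respects them, and the interpolated function would be used to check later accesses against the prediction.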

Files in this item:

File	Description	Size	Format
2014_IJPP_Alexandra_Jimborean.pdf		779.25 kB	Adobe PDF
