GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption

Shivdikar, Kaustubh; Agrawal, Rashmi; Jonatan, Gilbert; Abellán, José L.; Livesay, Neal; Joshi, Ajay; Bao, Yuhui; Shen, Michael; Mora, Evelio; Kim, John; Ingare, Alexander; Kaeli, David

Please use this identifier to cite or link to this item: https://doi.org/10.1145/3613424.3614279

Title:	GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption
Publication date:	2023
Defense / creation date:	2023
Publisher:	Association for Computing Machinery (ACM)
Bibliographic citation:	MICRO '23: 56th Annual IEEE/ACM International Symposium on Microarchitecture
ISBN:	979-8-4007-0329-4/23/10
Keywords:	Zero-trust frameworks; Fully Homomorphic Encryption (FHE); Custom accelerators; CU-side interconnects; Modular reduction
Abstract:	Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without decrypting it. FHE has garnered significant attention over the past decade as it supports secure outsourcing of data processing to remote cloud services. Despite its promise of strong data privacy and security guarantees, FHE introduces a slowdown of up to five orders of magnitude compared to the same computation on plaintext data. This overhead is presently a major barrier to the commercial adoption of FHE. While prior efforts recommend moving to custom accelerators to accelerate FHE computing, these solutions lack cost-effectiveness and scalability. In this work, we leverage GPUs to accelerate FHE, capitalizing on a well-established GPU ecosystem that is available in the cloud. We propose GME, which combines three key microarchitectural extensions along with a compile-time optimization to the current AMD CDNA GPU architecture. First, GME integrates a lightweight on-chip compute unit (CU)-side hierarchical interconnect to retain ciphertext in cache across FHE kernels, thus eliminating redundant memory transactions and improving performance. Second, to tackle compute bottlenecks, GME introduces special MOD-units that provide native custom hardware support for modular reduction operations, one of the most commonly executed sets of operations in FHE. Third, by integrating the MOD-unit with our novel pipelined 64-bit integer arithmetic cores (WMAC-units), GME further accelerates FHE workloads by 19%. Finally, we propose a Locality-Aware Block Scheduler (LABS) that improves FHE workload performance, exploiting the temporal locality available in FHE primitive blocks. Incorporating these microarchitectural features and compiler optimizations, we create a synergistic approach achieving average speedups of 796×, 14.2×, and 2.3× over Intel Xeon CPU, NVIDIA V100 GPU, and Xilinx FPGA implementations, respectively.
Main author(s):	Shivdikar, Kaustubh; Agrawal, Rashmi; Jonatan, Gilbert; Abellán, José L.; Livesay, Neal; Joshi, Ajay; Bao, Yuhui; Shen, Michael; Mora, Evelio; Kim, John; Ingare, Alexander; Kaeli, David
Is part of:	MICRO ’23, October 28-November 1, 2023, Toronto, ON, Canada
URI:	http://hdl.handle.net/10201/134004
DOI:	https://doi.org/10.1145/3613424.3614279
Document type:	info:eu-repo/semantics/article
Number of pages / Extent:	14
Rights:	info:eu-repo/semantics/openAccess Attribution 4.0 International
Description:	© 2023. The authors. This document is made available under the CC-BY 4.0 license (http://creativecommons.org/licenses/by/4.0/). This document is the published version of a work that appeared in final form in MICRO '23: 56th Annual IEEE/ACM International Symposium on Microarchitecture. To access the final work, see DOI: https://doi.org/10.1145/3613424.3614279
Appears in collections:	Articles

Files in this item:

File	Description	Size	Format
GME_MICRO_2023_Camera_Ready.pdf		1.09 MB	Adobe PDF

This item is licensed under a Creative Commons license.