Please use this identifier to cite or link to this item: https://doi.org/10.1145/3582016.3582069

Title: Flexagon: A Multi-Dataflow Sparse-Sparse Matrix Multiplication Accelerator for Efficient DNN Processing
Publication date: March 2023
Publisher: Association for Computing Machinery
Bibliographic citation: ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
ISBN: 978-1-4503-9918-0/23/03.
Keywords: Deep Neural Network Accelerators
Sparse-Sparse Matrix Multiplication
Dataflow
Merger-Reduction Network
Memory Hierarchy
Abstract: Sparsity is a growing trend in modern DNN models. Existing Sparse-Sparse Matrix Multiplication (SpMSpM) accelerators are tailored to a particular SpMSpM dataflow (i.e., Inner Product, Outer Product or Gustavson's), which determines their overall efficiency. We demonstrate that this static decision inherently results in a suboptimal dynamic solution. This is because different SpMSpM kernels show varying features (i.e., dimensions, sparsity pattern, sparsity degree), which makes each dataflow better suited to different data sets. In this work we present Flexagon, the first SpMSpM reconfigurable accelerator that is capable of performing SpMSpM computation by using the particular dataflow that best matches each case. The Flexagon accelerator is based on a novel Merger-Reduction Network (MRN) that unifies the concepts of reducing and merging in the same substrate, increasing efficiency. Additionally, Flexagon also includes a new L1 on-chip memory organization, specifically tailored to the different access characteristics of the input and output compressed matrices. Using detailed cycle-level simulation of contemporary DNN models from a variety of application domains, we show that Flexagon achieves average performance benefits of 4.59×, 1.71×, and 1.35× with respect to the state-of-the-art SIGMA-like, SpArch-like and GAMMA-like accelerators (265%, 67%, and 18%, respectively, in terms of average performance/area efficiency).
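
For readers unfamiliar with the three SpMSpM dataflows named in the abstract, the following is a minimal sketch, not taken from the paper: the function names and the dictionary-of-dicts sparse representation are illustrative assumptions. It contrasts the Inner Product, Outer Product, and Gustavson's loop orderings for computing C = A x B with sparse A (M x K) and B (K x N).

# Minimal sketch (illustrative only): three SpMSpM dataflows over
# dictionary-of-dicts sparse matrices, e.g. A = {row: {col: value}}.

def inner_product(A, B, M, K, N):
    """Inner Product: for each output C[m][n], intersect row m of A with column n of B."""
    C = {}
    for m in range(M):
        for n in range(N):
            acc = 0
            for k in range(K):
                acc += A.get(m, {}).get(k, 0) * B.get(k, {}).get(n, 0)
            if acc:
                C.setdefault(m, {})[n] = acc
    return C

def outer_product(A, B):
    """Outer Product: multiply column k of A by row k of B, then merge the partial matrices."""
    C = {}
    for k, b_row in B.items():                # row k of B
        for m, a_row in A.items():
            if k in a_row:                    # A[m][k] is nonzero
                for n, b in b_row.items():
                    C.setdefault(m, {})[n] = C.get(m, {}).get(n, 0) + a_row[k] * b
    return C

def gustavsons(A, B):
    """Gustavson's: scale and merge the rows of B selected by the nonzeros of each row of A."""
    C = {}
    for m, a_row in A.items():
        out_row = {}
        for k, a in a_row.items():
            for n, b in B.get(k, {}).items():
                out_row[n] = out_row.get(n, 0) + a * b
        if out_row:
            C[m] = out_row
    return C

As a quick check, with A = {0: {1: 2.0}} and B = {1: {3: 4.0}} (so M=1, K=2, N=4 for the inner-product variant), all three functions return {0: {3: 8.0}}; they differ only in loop order and in where partial results are reduced or merged, which is the distinction the paper's reconfigurable dataflow support targets.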
Main author(s): Muñoz-Martínez, Francisco
Abellán, José L.
Acacio, Manuel E.
Garg, Raveesh
Pellauer, Michael
Krishna, Tushar
Faculty/Departments/Services: Facultades, Departamentos, Servicios y Escuelas::Departamentos de la UMU::Ingeniería y Tecnología de Computadores
Part of: ASPLOS ’23, March 25–29, 2023, Vancouver, BC, Canada
Publisher's version: https://dl.acm.org/doi/proceedings/10.1145/3582016
URI: http://hdl.handle.net/10201/128556
DOI: https://doi.org/10.1145/3582016.3582069
Document type: info:eu-repo/semantics/article
Number of pages / Extent: 14
Rights: info:eu-repo/semantics/openAccess
Attribution-NonCommercial-NoDerivatives 4.0 International
Description: © 2023. The authors. This document is made available under the CC-BY-NC-ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0). This document is the accepted version of a published work that appeared in final form in ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3. To access the final work, see DOI: https://doi.org/10.1145/3582016.3582069
Appears in collections: Artículos: Ingeniería y Tecnología de Computadores

Files in this item:
File: asplosc23main-p1166-p-f253962830-63228-submitted.pdf
Size: 1.51 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.