Repositorio Institucional de la Universidad de Murcia

Browsing by Subject "Heterogeneous system coherence"

    Publication
    Embargo
    Enhanced system-level coherence for heterogeneous unified memory architectures
    (IEEE Computer Society, 2024-11-28) Nataraja, Anoop Mysore; Fernández Pascual, Ricardo; Ros Bardisa, Alberto; Ingeniería y Tecnología de Computadores
    Heterogeneous Unified Memory Architectures (HUMA) provide a unified memory space for on-die CPUs, GPUs, and other hardware accelerators. Such architectures improve performance and energy efficiency by obviating explicit data transfers between processors. An important feature of such architectures is Heterogeneous System Coherence (HSC) which simplifies the programming model by reducing the explicit synchronizations otherwise expected of the programmers of such systems. However, due to differences in the memory models and bandwidth requirements of CPUs and GPUs, hardware implementation of coherence for such systems is often complex and comes at high power, performance, and area trade-offs. This paper optimizes the existing heterogeneous coherence mechanism in early AMD Accelerated Processing Units, approximately modeled in the gem5 simulator. It introduces precise sharing information in the system-level directory, which monitors both CPU and GPU cache lines, and implements a new write-back shared last-level cache (LLC). The original implementation consisted of a stateless system-level directory and a write-through LLC. Our evaluation results with a set of collaborative heterogeneous benchmarks reveal, on average, a 14.4% performance improvement and 80.8% and 50.4% reduced probing traffic and main-memory interactions, respectively. Through optimizations and adaptation of the evaluated benchmarks, this work aims to reduce the barriers to entry into HSC research.
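
The abstract contrasts a stateless system-level directory with one that keeps precise sharing information. The toy model below is a rough illustration of why tracking sharers can cut probe traffic. It is not the paper's protocol or any gem5 code: the class names, the `count_probes` helper, and the four-cache trace are all hypothetical, and evictions, the LLC, and timing are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class StatelessDirectory:
    """Hypothetical directory with no sharing state: every request is broadcast."""
    num_caches: int
    probes_sent: int = 0

    def access(self, cache_id: int, line: int, is_write: bool) -> None:
        # With no sharer information, all other caches must be probed.
        self.probes_sent += self.num_caches - 1


@dataclass
class SharerTrackingDirectory:
    """Hypothetical directory that records which caches hold each line."""
    num_caches: int
    probes_sent: int = 0
    sharers: dict = field(default_factory=dict)  # line -> set of cache ids
    owner: dict = field(default_factory=dict)    # line -> cache id with a dirty copy

    def access(self, cache_id: int, line: int, is_write: bool) -> None:
        holders = self.sharers.setdefault(line, set())
        if is_write:
            # Invalidate only the caches that actually hold a copy.
            self.probes_sent += len(holders - {cache_id})
            holders.clear()
            self.owner[line] = cache_id
        else:
            dirty_owner = self.owner.get(line)
            if dirty_owner is not None and dirty_owner != cache_id:
                # Fetch the dirty data from its owner, which keeps a shared copy.
                self.probes_sent += 1
                del self.owner[line]
        holders.add(cache_id)


def count_probes(directory, trace):
    """Replay (cache_id, line, is_write) tuples and return the probe count."""
    for cache_id, line, is_write in trace:
        directory.access(cache_id, line, is_write)
    return directory.probes_sent


# Assumed trace: two readers, one writer, then a re-read of the same line.
TRACE = [(0, 0x40, False), (1, 0x40, False), (2, 0x40, True), (0, 0x40, False)]

if __name__ == "__main__":
    print("stateless:", count_probes(StatelessDirectory(4), TRACE))
    print("tracking: ", count_probes(SharerTrackingDirectory(4), TRACE))
```

On this trace the stateless model sends 12 probes and the tracking model sends 3. The ratio comes entirely from the made-up trace and says nothing about the 80.8% reduction reported in the abstract.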
