Performance Implications of Nonuniform Device Topologies in Scalable Heterogeneous Architectures (ACM)
This article considers trends in heterogeneous system design, particularly for GPUs. Using the Keeneland Initial Delivery System, the authors examine the performance implications of increased parallelism and specialized hardware on parallel scientific applications. They investigate how nonuniform data-transfer performance across the node-level topology can affect application performance. Finally, they help users of GPU-based systems avoid performance problems related to this nonuniformity.
Paper available at ACM.