Measuring GPU utilization one level deeper
Authors: Paul Elvinger, Foteini Strati, Natalie Enright Jerger, Ana Klimovic
Abstract:
GPU hardware is vastly underutilized. Even resource-intensive AI applications have diverse resource profiles that often leave parts of GPUs idle. While colocating applications can improve utilization, current spatial sharing systems lack performance guarantees. Providing predictable performance guarantees requires a deep understanding of how applications contend for shared GPU resources such as block schedulers, compute units, L1/L2 caches, and memory bandwidth. We propose a methodology to profile resource interference of GPU kernels across these dimensions and discuss how to build GPU schedulers that provide strict performance guarantees while colocating applications to minimize cost.
Submitted 12 February, 2025; v1 submitted 28 January, 2025; originally announced January 2025.