Work-in-Progress: Enabling Transparent Priority Scheduling for NVIDIA GPUs

Weaver, Noah
Bakita, Joshua
Department
Computer Science
Type
Conference proceeding
Abstract
Modern embedded systems increasingly rely on GPUs for safety-critical tasks such as object detection in autonomous vehicles and data visualization in medical diagnostics. These applications require running high-priority workloads alongside lower-priority tasks on shared GPU hardware due to cost, power, or space constraints. However, NVIDIA GPUs employ round-robin scheduling that treats all tasks equally, causing unpredictable delays for critical workloads. Existing GPU priority scheduling solutions require application modifications, introduce high overhead, or depend on specialized embedded hardware. We present a GPU-level static priority scheduler built on the nvsched framework that requires no application changes and runs on commodity GPUs. Our scheduler maintains tasks in a priority-ordered queue and uses a parameter communication system to enable dynamic priority updates from userspace. Experiments with MobileNetV3 inference under GPU contention demonstrate 32-41% lower maximum latency for high-priority tasks when using our scheduler compared to NVIDIA's default scheduler. Our approach provides practical priority control for GPU workloads while maintaining low scheduling overhead.
Citation
N. Weaver and J. Bakita, "Work-in-Progress: Enabling Transparent Priority Scheduling for NVIDIA GPUs," in 2025 IEEE Real-Time Systems Symposium (RTSS), IEEE, 2025, pp. 588-591.
Source
Proceedings Real Time Systems Symposium
Conference
2025 IEEE Real-Time Systems Symposium (RTSS)
Keywords
33 Built Environment and Design, 3301 Architecture, 46 Information and Computing Sciences
Publisher
IEEE