Title: HW-SW Co-design method to mitigate GPU Memory Safety Vulnerability

Date: Tuesday, September 3, 2024
Time: 1:00 PM - 3:00 PM EDT
Location:
     In-Person: KACB 2100
     Online: Click here to join the meeting

Jaewon Lee
Ph.D. Candidate
School of Computer Science
College of Computing
Georgia Institute of Technology

Committee:
Dr. Hyesoon Kim (advisor), School of Computer Science, Georgia Institute of Technology
Dr. Moinuddin Qureshi, School of Computer Science, Georgia Institute of Technology
Dr. Tushar Krishna, School of Electrical and Computer Engineering & School of Computer Science, Georgia Institute of Technology
Dr. Saibal Mukhopadhyay, School of Electrical and Computer Engineering, Georgia Institute of Technology
Dr. Jaekyu Lee, Arm

Abstract
Insecurity in Graphics Processing Units (GPUs) was once considered acceptable; however, the surge in Artificial Intelligence (AI) applications has given GPUs critical roles in decisions that affect lives and finances. Recent studies have demonstrated that attackers can induce failures in AI models by exploiting vulnerabilities in GPU memory.
In this dissertation, we first propose GPUShield, an efficient hardware/software co-designed GPU memory bounds-checking mechanism that fully leverages the characteristics of GPU programs. GPUShield introduces efficient hardware bounds-checking logic by utilizing pointer tagging methods. It minimizes metadata access by taking advantage of the GPU's region-based memory access, metadata caching, and memory coalescing.
Second, we propose a more lightweight and practical mechanism, LMI. LMI is a fine-grained memory safety mechanism that builds on the same pointer-tagging and GPU-specific ideas. Even when thousands of threads simultaneously allocate and access memory buffers in both the stack and the heap, LMI has negligible impact on performance and hardware cost. This is achieved by making buffers power-of-two-sized and aligned and by performing static analysis to identify pointer operations. This approach also allows metadata to be stored in the unused upper bits of pointers, even though the number of such bits is shrinking as the virtual address space expands. The unique characteristics of GPU programs make this approach feasible, unlike on CPUs, where the inherent complexity of programs poses challenges. Our evaluation demonstrates that LMI incurs extremely low hardware cost and near-zero performance overhead, making it a practical and efficient solution for enhancing GPU memory safety.
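
As a rough illustration of the general idea of power-of-two-sized, aligned buffers with metadata in the upper pointer bits, the following host-side C++ sketch may help. It is not LMI's actual encoding, which the abstract does not specify. The 6-bit tag position (kTagShift) and the helper names ceil_log2, tag_pointer, address_of, size_log2_of, and in_bounds are all hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: the top 6 bits of a 64-bit pointer are unused
// by the address and can hold log2 of the (rounded-up) buffer size.
constexpr unsigned kTagShift = 58;
constexpr std::uint64_t kAddrMask = (std::uint64_t{1} << kTagShift) - 1;

// Smallest k such that 2^k >= size (size must be at least 1).
inline unsigned ceil_log2(std::uint64_t size) {
    assert(size >= 1 && size <= (std::uint64_t{1} << (kTagShift - 1)));
    unsigned k = 0;
    while ((std::uint64_t{1} << k) < size) ++k;
    return k;
}

// Embed log2(rounded size) in the upper bits of a base address.
// The base must be aligned to the rounded-up power-of-two size.
inline std::uint64_t tag_pointer(std::uint64_t base, std::uint64_t size) {
    unsigned k = ceil_log2(size);
    assert((base & ~kAddrMask) == 0);                     // tag bits are free
    assert((base & ((std::uint64_t{1} << k) - 1)) == 0);  // base is aligned
    return base | (static_cast<std::uint64_t>(k) << kTagShift);
}

inline std::uint64_t address_of(std::uint64_t tagged) { return tagged & kAddrMask; }
inline unsigned size_log2_of(std::uint64_t tagged) {
    return static_cast<unsigned>(tagged >> kTagShift);
}

// A pointer derived by adding `offset` stays inside the buffer exactly when
// every address bit above the low k bits is unchanged, so no bounds table
// lookup is needed for the check.
inline bool in_bounds(std::uint64_t tagged, std::int64_t offset) {
    unsigned k = size_log2_of(tagged);
    std::uint64_t old_addr = address_of(tagged);
    std::uint64_t new_addr = (old_addr + static_cast<std::uint64_t>(offset)) & kAddrMask;
    return ((old_addr ^ new_addr) >> k) == 0;
}
```

For example, a 100-byte request at address 0x1000 would be rounded up to 128 bytes in this sketch, so offsets 0 through 127 pass the check while offsets 128 and -1 fail. The check is therefore exact only with respect to the rounded-up size, which is one of the trade-offs of a power-of-two scheme.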