Senior Compute Kernel Architect, GPU Power
NVIDIA
About This Role
NVIDIA is seeking a Compute Kernel Performance Architect with a unique blend of skills: someone who can write, profile, and analyze CUDA kernels with a laser focus on power consumption and current draw, and who understands how those kernels interact with the GPU's Power Delivery Network (PDN) at the system level. This is not a typical performance role. You will write stress workloads that deliberately push GPU power to its limits, partner with hardware architects to validate power integrity assumptions, and help ensure that our chips survive the harshest real-world di/dt scenarios. This position is an opportunity to have real impact on the GPU power architecture of future NVIDIA products, working at the boundary of GPU architecture, software, and silicon.
What You'll Be Doing
- Design and develop CUDA kernels purpose-built to maximize GPU power consumption, targeting worst-case current draw across compute, memory, and I/O subsystems
- Collaborate with hardware power architects to validate PDN assumptions and di/dt specifications, targeting known weak points
- Build and maintain a library of power stress microbenchmarks that sweep power profiles across GPU functional units (tensor cores, memory controllers, I/O interfaces) to stress PDN resonance and droop conditions across GPU families
- Analyze trade-offs between kernel throughput, power efficiency, and voltage stability, contributing insights that feed directly into future GPU architecture decisions
- Partner across teams (GPU architects, power circuit designers, silicon validation engineers) to ensure power stress methodologies are aligned from pre-silicon simulation through post-silicon bring-up
What We Need to See
- MS or PhD in Computer Science, Electrical Engineering, or Computer Engineering (or equivalent experience)
- 5+ years of experience in GPU kernel development, CUDA programming, or high-performance computing
- Strong CUDA and C++ programming skills, with hands-on experience writing and optimizing kernels at the assembly or PTX level
- Experience with GPU performance profiling tools: Nsight Compute, Nsight Systems, nvprof, or equivalent
- Solid understanding of GPU architecture (SMs, memory hierarchy, power states) and how it maps to current draw profiles
- Working knowledge of Power Delivery Networks (PDNs), including board-level PDN design, package inductance, decoupling capacitors, and their role in voltage droop and overshoot
- Conceptual understanding of di/dt: how rapid current transitions cause voltage transients, and how software workloads can be designed to control or stress those transitions
- Strong Python programming skills for scripting, data analysis, and automation of power characterization workflows, including effective use of AI-assisted ("vibe") coding
- Excellent communication skills and comfort working across hardware and software disciplines
Ways to Stand Out from the Crowd
- Hands-on experience writing GPU power stress microbenchmarks: synthetic workloads designed to hit worst-case power consumption on specific GPU functional units
- Direct experience with post-silicon power characterization: measuring VDD voltage droop, di/dt slew rates, and power supply transient response using oscilloscopes, sensors, or equivalent lab tools
- Experience with DVFS, AVFS, and noise mitigation features, and an understanding of how they interact with kernel behavior
- Knowledge of PDN impedance targets across die, package, and board domains, and how resonance frequencies map to observed voltage droop signatures
Our team works at the core of NVIDIA's GPU performance stack. We partner closely with the Compute Architecture group, Silicon Solutions, Power Architecture, and Deep Learning framework teams. Our work directly influences how future GPUs are designed, both in the silicon power delivery and in the software stack that runs on top of it. If you want your code to literally shape the next generation of NVIDIA hardware, this is the place.
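To make the di/dt requirement concrete, here is a minimal, hypothetical sketch of the kind of Python power-characterization script the role describes: it estimates per-sample di/dt from a uniformly sampled current trace and flags samples that exceed a slew budget. The trace shape, sample rate, and 100 A/µs limit are all invented for illustration, not NVIDIA specifications.

```python
import numpy as np

def didt_slew(current_a, sample_rate_hz):
    """Estimate per-sample di/dt (A/s) from a uniformly sampled current trace."""
    return np.diff(current_a) * sample_rate_hz

# Synthetic trace: a kernel ramping from idle (~50 A) to full load (~600 A)
# in a few samples, holding, then dropping back -- the classic di/dt stressor.
fs = 1e6  # 1 MS/s sampling rate (illustrative)
trace = np.concatenate([
    np.full(10, 50.0),            # idle
    np.linspace(50.0, 600.0, 5),  # fast ramp up
    np.full(10, 600.0),           # sustained load
    np.linspace(600.0, 50.0, 5),  # fast ramp down
])

slew = didt_slew(trace, fs)
limit = 1e8  # hypothetical 100 A/us budget
violations = np.flatnonzero(np.abs(slew) > limit)
print(f"peak |di/dt| = {np.abs(slew).max():.3g} A/s, "
      f"{violations.size} samples over budget")
```

In a real workflow the trace would come from an oscilloscope or on-board telemetry rather than a synthetic array, and the slew budget would come from the PDN spec under test.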
The GPU started out as the engine for simulating human imagination, conjuring up the amazing virtual worlds of video games and Hollywood films. Now, NVIDIA’s GPU runs deep learning algorithms, simulating human intelligence, and acts as the brain of computers, robots and self-driving cars that can perceive and understand the world. Just as human imagination and intelligence are linked, computer graphics and artificial intelligence come together in our architecture. Today, NVIDIA GPUs are used broadly for deep learning, and NVIDIA is increasingly known as “the AI computing company.”

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits. Applications for this job will be accepted at least until April 11, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA is the world leader in accelerated computing. NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society. Learn more about NVIDIA.
Responsibilities
Design and develop CUDA kernels to stress GPU power delivery networks and validate power integrity assumptions. Collaborate with hardware architects and silicon validation teams to analyze power consumption trade-offs and ensure robust chip performance.
Requirements
Requires an MS or PhD in Computer Science, Electrical Engineering, or Computer Engineering, with at least 5 years of experience in GPU kernel development. Candidates must have strong CUDA and C++ programming skills and a deep understanding of GPU architecture and power delivery systems.
Education
- postgraduate degree
Source: workday