Technical Program Manager, Cloud Infrastructure

Posted:
3/4/2026, 2:14:39 AM

Location(s):
California, United States ⋅ Washington, United States ⋅ Santa Clara, California, United States ⋅ Redmond, Washington, United States

Experience Level(s):
Mid Level ⋅ Senior

Field(s):
Product

NVIDIA's deep learning platforms are at the forefront of innovation, profoundly impacting various fields and widely adopted by leading academic institutions, startups, and major Internet companies globally. We're seeking an accomplished and highly skilled Technical Program Manager (TPM) to join our NVIDIA DGX Cloud team. This is an exciting opportunity for a passionate, results-oriented, and creative individual to deliver exceptional value to our DGX Cloud customers.

We are specifically looking for a TPM with extensive experience in cloud infrastructure bring-up with external partners. You'll be instrumental in partnering with emerging NVIDIA Cloud Providers (NCPs) and internal engineering teams to help build AI capacity and infrastructure across the globe.

What you'll be doing:

As a DGX Cloud Technical Program Manager, you'll be a key partner to our Engineering, Infrastructure, and Software teams and their leadership, driving critical programs related to AI capacity enablement and management. You'll play a pivotal role in developing and maturing foundational capabilities and processes for DGX Cloud, spanning critical areas such as cluster and capacity bring-up, including the CPU, storage, networking, and compute requirements needed to support GPUs. This is a dynamic, fast-paced environment where TPMs are expected to apply transferable skill sets to a range of high-impact programs across DGX Cloud.

  • Collaborating closely with storage engineering and network engineering teams to define and communicate requirements to Cloud Service Providers (CSPs) and NVIDIA Cloud Providers (NCPs), and driving alignment on a plan of record (POR) for capacity blocks based on workload needs.

  • Driving early engagement with CSPs and NCPs to understand their managed storage and network solutions and to influence alignment with the NVIDIA Cloud roadmap.

  • Gathering technical requirements, developing comprehensive roadmaps, establishing clear milestones, and ensuring adherence to our Product Lifecycle (PLC) process.

  • Managing ongoing capacity operations and the engineering engagement with CSP and NCP partners, collaborating closely with an SRE lead and focusing on availability, maintenance, and other critical performance indicators.

  • Partnering closely within NVIDIA to understand workload requirements and related HW and infra needs, including speeds/feeds, to optimize infrastructure readiness with CSPs and NCPs.

  • Leveraging Jira and other program management platforms to instill rigor and structure in the management of engineering deliverables.

  • Identifying and driving opportunities to adopt third-party and in-house cloud infrastructure solutions for deployment, support, security, compliance, and observability across DGX Cloud.

  • Establishing key performance indicators (KPIs) and quantitatively demonstrating the value and impact delivered by your programs.

  • Proactively identifying, resolving, and mitigating risks and issues that could affect scope, schedule, and quality across all program aspects.

  • Cultivating a culture of continuous improvement, consistently identifying opportunities for process enhancements within our cloud infrastructure operations.

What we need to see:

  • 12+ years of technical program management experience, specifically driving the planning and execution of large-scale cloud infrastructure programs with external partners, with a strong focus on software engineering projects within a matrixed organization.

  • Extensive hands-on experience in cloud infrastructure, preferably gained from working at a major Cloud Service Provider (CSP).

  • Domain knowledge in the bring-up and end-to-end operations of compute, storage, networking, and GPUs (including common failure points at the HW and SW levels).

  • Expert-level proficiency with Jira, Smartsheet, or similar program management tools, with the ability to confidently guide engineering teams on their use of the tools.

  • Exceptional strategic and tactical thinking abilities, coupled with a strong capacity to build consensus and drive program success.

  • Comfort and effectiveness in ambiguous environments.

  • Excellent communication and technical presentation skills, particularly for executive audiences.

  • BS or MS in Electrical Engineering or Computer Science, or equivalent experience. 

Ways to stand out from the crowd:

  • In-depth knowledge of NVIDIA GPU products, including deployment and bring-up.

  • Working knowledge of various cloud technologies (Kubernetes, API integration, Terraform, etc.).

  • A highly enthusiastic, energetic, responsive, and passionate individual with a keen eye for identifying process improvement opportunities.

  • Significant experience with productivity tools and process automation is a major plus.

  • Deep familiarity with cloud-native product and services environments, and with AI/ML infrastructure and cloud services.

NVIDIA is widely considered to be one of the technology world’s most desirable employers! We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 200,000 USD - 322,000 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until March 8, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.