Senior Solutions Architect, NPN

Posted:
6/17/2024, 5:00:00 PM

Location(s):
North Carolina, United States ⋅ Texas, United States ⋅ Kansas, United States ⋅ California, United States ⋅ Maryland, United States ⋅ Missouri, United States

Experience Level(s):
Expert or higher ⋅ Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Workplace Type:
Hybrid

Want to be part of a team that's revolutionizing the field of AI with data-center-scale solutions? We are looking for a hardworking Solutions Architect with experience in designing, building, and maintaining large-scale HPC and AI hybrid computing solutions to join our team at NVIDIA. As Solutions Architects on the NVIDIA Partner Network team, we help NVIDIA DGX and DGX SuperPOD solutions bring the benefits of large-scale AI to customers through our partners. We work closely with customers and partners to address unsolved problems in the industry and to deploy and operationalize AI solutions at scale.

What you'll be doing:

  • Our day-to-day work involves guiding partners in their adoption of end-to-end Machine Learning and Deep Learning solutions, using NVIDIA's compute, networking, and software stacks. This is not a high-level slideshow job: we are the voice of experience, using Kubernetes, SaaS, infrastructure-as-code tools, network debugging, and problem-solving skills to help build modern AI factories.

  • We also excel at sharing knowledge with others, whether it's delivering demos, assisting with proofs of concept, or writing papers and developer blogs. By collaborating with executives and engineering, we solve complex problems and help bring NVIDIA's premier technologies to life in the cloud and in the data center.

  • Our mission is to solve the problems that nobody else has solved yet, and we need someone to be an instrumental part of that!

What we need to see:

  • Strong foundational expertise and a BS, MS, or PhD in Engineering, Computer Science, or a related field, or equivalent experience.

  • Established track record working with AI and HPC clusters, both on-premises and cloud-based.

  • 12+ years of proven experience with cluster management and related tools, including Docker containers, Slurm, Kubernetes, and Ansible.

  • Hands-on experience with network, storage, and cluster configuration and debugging.

  • Strong analytical and problem-solving skills, along with an ability to articulate what you know to others.

  • Ability to multitask efficiently in a dynamic environment.

Ways to stand out from the crowd:

  • Strong coding and debugging skills, including experience with Python, C/C++, Bash, and Linux utilities.

  • Demonstrated expertise through projects or open-source contributions involving GPU workloads, Kubernetes, InfiniBand, Ethernet, or other areas related to high-performance clusters and hybrid cloud solutions.

  • Hands-on experience with NVIDIA AI Enterprise, Base Command Manager, and the NeMo cloud-native framework.

  • Willingness and ability to learn quickly and solve advanced problems.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

The base salary range is 220,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA

Website: https://www.nvidia.com/

Headquarters Location: Santa Clara, California, United States

Employee Count: 10001+

Year Founded: 1993

IPO Status: Public

Last Funding Type: Grant

Industries: Artificial Intelligence (AI) ⋅ GPU ⋅ Hardware ⋅ Software ⋅ Virtual Reality