Posted:
12/10/2024, 3:25:02 AM
Location(s):
Santa Clara, California, United States ⋅ California, United States
Experience Level(s):
Senior
Field(s):
Software Engineering
We are looking for an experienced software engineering manager to lead the development of NVIDIA’s distributed runtime stack for large-scale distributed computing, which aims to democratize scalable accelerated computing. Around the world, leading commercial and academic organizations are revolutionizing AI, scientific computing, and data analytics using data centers powered by GPUs. Applications of these technologies include LLMs, computer vision, autonomous vehicles, and countless others. Our team develops foundational distributed computing software that greatly simplifies the development of such applications!
In this role, you will lead an engineering team designing, developing, and optimizing the distributed task-based runtime software stack that includes Legate, Legion and Realm. Ideal candidates should have experience leading software product engineering teams, and be motivated to advance the state-of-the-art in a variety of accelerated computing domains. If this sounds exciting, we would love to meet you!
What you'll be doing:
Lead, mentor, and grow your distributed runtime engineering team, and be responsible for the planning and execution of projects as well as the quality and performance of the runtime stack.
Work closely with NVIDIA Research, Engineering, Developer Technology, and Product Management teams in the areas of scientific computing, data analytics, programming systems, and AI to help collect requirements for your products as well as contribute to the development of technology roadmaps.
Interact with external partners and researchers to understand their use cases and requirements.
What we need to see:
BS, MS or PhD degree in Computer Science, Electrical Engineering or related field (or equivalent experience)
8+ years of overall experience in developing distributed runtimes or at-scale high-performance software.
3+ years of experience recruiting, training and leading software engineering teams.
Background in high-performance computing and performance-critical applications
Experience implementing, tuning, and debugging runtimes and/or distributed systems for supercomputers or the cloud
Hands-on experience with design, development, testing, maintenance, and performance optimization of GPU-accelerated software using C, C++ or Python.
Strong collaboration, communication, and documentation habits.
Experience with agile software development practices using project management tools such as JIRA.
Ways to stand out from the crowd:
Experience with development of distributed runtimes such as Legion, Ray or Dask
Experience with parallel programming, ideally using CUDA, MPI or OpenMP
Good knowledge of CPU and/or GPU hardware architecture.
Development of domain-specific libraries/languages for high-performance computing
Good understanding of Machine Learning and Deep Learning technologies
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
Website: https://www.nvidia.com/
Headquarter Location: Santa Clara, California, United States
Employee Count: 10001+
Year Founded: 1993
IPO Status: Public
Last Funding Type: Grant
Industries: Artificial Intelligence (AI) ⋅ GPU ⋅ Hardware ⋅ Software ⋅ Virtual Reality