Senior System Software Engineer - Cloud Networking

Posted:
7/30/2024, 5:00:00 PM

Experience Level(s):
Expert or higher ⋅ Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Workplace Type:
Remote

We are looking for a Senior System Software Engineer, Cloud Networking, to design, implement and operate a scalable Software Defined Networking (SDN) service in NVIDIA GPU Cloud, which hosts NVIDIA's cloud services and applications for deep learning, inference, streaming of user experience and other GPU-accelerated applications.

NVIDIA’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing, with the GPU acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company”. We are looking to grow our company, and grow with the smartest people in the world. We are looking for you.

What you’ll be doing:

  • Design, implement and operate the next-generation scalable multi-tenant SDN network architecture, with a focus on performance and security, to support data center and cloud infrastructure at massive scale.

  • To complement the efficient networking architecture, you will help craft Infrastructure-as-a-Service for networking that can provision performant virtual networks on demand to support VMs and Kubernetes containers in a multi-tenant environment. This service will provide easy-to-use, high-level abstractions for users via Cloud APIs. You will also focus on operational aspects of the SDN service in production, including playbooks, integration with SRE tools, and triaging complex Cloud Infra issues related to the SDN service and its interaction with the rest of the Cloud Infra software stack.

  • You will also design and develop software for monitoring the overall network health for effective operations.

What we need to see:

  • BA/BS degree in Computer Science or a related technical discipline.

  • 10+ years of experience in host networking and K8s.

  • Hands-on experience with supporting and operating host networking services in large Clouds.

  • Deep understanding of various networking protocols with hands-on kernel development experience.

  • Some expertise in the Unix/Linux networking stack and networking drivers.

  • Experience in crafting network architecture for cloud/distributed systems.

  • Understanding of SR-IOV and Xen virtualization.

  • Hands-on experience with Open vSwitch (or an equivalent solution)

  • Hands-on experience with one or more SDN solutions

  • Experience in network management systems, network monitoring systems or network operations.

Ways to stand out from the crowd:

  • Hands-on experience with openvswitch.org code

  • Experience with hardware acceleration of networking data, control and orchestration layers

  • Experience with RDMA (InfiniBand or RoCE) fabrics

  • Deep understanding of container networking (networking for Kubernetes, Docker, etc.)

  • Coding skills in scripting languages like Python

  • Working knowledge of distributed databases and data structures