Service Reliability Operations Manager

Posted:
9/4/2024, 6:52:24 AM

Location(s):
Austin, Texas, United States ⋅ Texas, United States

Experience Level(s):
Mid Level ⋅ Senior

Field(s):
IT & Security ⋅ Operations & Logistics

Workplace Type:
Remote

Our technology has no boundaries! NVIDIA is building the world’s most groundbreaking, state-of-the-art compute platforms. It’s because of our work that scientists, researchers and engineers can advance their ideas. At its core, our visual computing technology not only enables an amazing computing experience, it is also energy efficient! We pioneered a supercharged form of computing loved by the most demanding computer users in the world - scientists, designers, artists, and gamers. It’s not just technology though! It is our people, some of the brightest in the world, and our company culture that make NVIDIA one of the most fun, innovative and dynamic places to work. At the center of NVIDIA's culture are our core values, like innovation, excellence, determination and team, which guide us to be the best we can be.

NVIDIA's NGC team is looking for a highly motivated Service Reliability Operations Manager to help lead, design, develop and implement a global, dynamic, state-of-the-art Service Reliability Operations Center (known as Mission Control) that provides extraordinary levels of support for our Cloud products and services. As a key leader in the Mission Control team, you will partner with other key members of our organization, including Site Reliability Engineering, the Security Operations Center, DevOps teams, and other datacenter operations partners, to help monitor, maintain, and grow our infrastructure. On the rare occasion that an incident occurs, you will be our front line to decrease the frequency and duration of any issue.

What you will be doing:

  • The SRO Manager role includes hands-on technical support work related to the overall health and maintenance of production/pre-production environments, as well as delegating tasks and supervising team members.

  • This position will perform tier-2 and tier-3 escalation support and act as a primary point of contact in Mission Control for inquiries from other departments.

  • The SRO Manager will manage all related partner/customer operational expectations.

  • Ensure continual process improvement within the SRO team including but not limited to automation of SRO tasks and reporting, implementation of enterprise-wide monitoring initiatives, and routine administration tasks.

  • Identify areas for process and efficiency improvement within the SRO team; recommend prioritized enhancements and oversee implementation.

  • Ensure that reports are accurate and delivered on time. Help discover incidents and issues, and initiate the incident management procedure.

  • Your interpersonal skills will help keep the team engaged through resolution and ensure our clients believe we value their time and effort.

  • You may perform other tasks that will help us provide extraordinary service levels for our customers.

What we need to see:

  • B.S. degree or equivalent experience

  • 5+ years of overall experience

  • Experience leading teams that run production Linux server-based services.

  • 3+ years of demonstrated history of successfully leading teams.

  • Experience with Jira, Confluence, and GitLab (or similar Git-based solutions).

  • Knowledge of Linux server provisioning and configuration at a near-expert level or above.

  • Ability to work independently as well as in a team environment.

  • Network services experience is desired.

  • Excellent leadership qualities.

  • Excellent skills in developing processes and procedures for client and in-house teams. Excellent oral communication, writing, and presentation skills. Ability to interact with clients in a professional, articulate manner.

The base salary range is 164,000 USD - 310,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA

Website: https://www.nvidia.com/

Headquarter Location: Santa Clara, California, United States

Employee Count: 10001+

Year Founded: 1993

IPO Status: Public

Last Funding Type: Grant

Industries: Artificial Intelligence (AI) ⋅ GPU ⋅ Hardware ⋅ Software ⋅ Virtual Reality