Site Reliability Engineer I

Posted:
11/12/2024, 4:00:00 PM

Location(s):
Oregon, United States ⋅ Beaverton, Oregon, United States

Experience Level(s):
Mid Level ⋅ Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

WHO YOU’LL WORK WITH

As a Software Engineer specializing in Resilience Engineering, you will play a critical role in maximizing the availability, observability, reliability, security, and performance of Nike’s digital experiences. This position requires a proactive approach to maintaining robust, consumer-facing systems that support millions of users worldwide.

In this role, you will lead in-depth problem analysis, identify infrastructure and code-level defects, establish observability processes for key performance indicators (KPIs), and collaborate closely with product delivery teams to design sustainable solutions to production challenges. Your expertise will be vital to enhancing Nike’s commitment to a seamless and resilient digital experience.

WHO WE ARE LOOKING FOR

Nike is seeking talented and driven full stack developers with expertise in cloud infrastructure and services. The ideal candidate will possess:

  • Bachelor’s degree in Computer Science, Information Systems, or a related field, or a combination of education and relevant professional experience
  • Proven experience in designing and developing applications using Java, Node.js, or similar languages
  • Familiarity with front-end frameworks such as React or Angular (advantageous)
  • Experience with modern programming languages such as Scala, Python, or Golang (preferred)
  • A solid understanding of DNS, networking, virtualization, and Linux operating systems
  • Demonstrated expertise in building and managing scalable, cloud-based microservices, ideally on AWS
  • Experience with Docker or serverless architectures
  • Proficiency in at least one NoSQL database (e.g., DynamoDB, Cassandra)
  • Strong understanding of RESTful APIs
  • Familiarity with service management, agile, and observability tools such as ServiceNow, Jira, Jenkins, Splunk, New Relic, and SignalFX

WHAT YOU’LL WORK ON

  • Observing, diagnosing, and quickly resolving production issues with precision to minimize service interruptions
  • Developing and implementing real-time monitoring solutions that deliver essential insights into system health and key performance indicators
  • Communicating technical issues and their business impacts clearly, ensuring alignment across teams and effective response strategies
  • Reporting high-value metrics and insights to leadership, demonstrating the impact of site reliability on consumer experience and overall business objectives
  • Managing IT service processes such as Incident, Problem, Change, and Knowledge Management to maintain service quality and reliability
  • Collaborating closely with both business and technical teams to analyze system performance, troubleshoot consumer-reported issues, and proactively optimize system efficiency
  • Leading initiatives to enhance application reliability for high-demand consumer web and mobile platforms, ensuring consistent performance
  • Leveraging negotiation and influence to foster alignment and drive collaborative solutions across multiple teams
  • Promoting a culture of growth by coaching, mentoring, and sharing knowledge, supporting continuous improvement and resilience across the team

Join us in delivering resilient, high-performance digital solutions that will empower millions of consumers around the world. Your skills and insights will be pivotal in driving Nike’s digital transformation.