SRE Engineer

Posted:
1/27/2026, 4:29:41 AM

Experience Level(s):
Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Workplace Type:
Remote

We’re looking for a Site Reliability Engineer (SRE) to join our Global SRE team at Resmed. In this role, you’ll blend software engineering and systems engineering to help ensure our large-scale, distributed digital products are reliable, scalable, and efficient. You’ll work closely with software, platform, and product teams to design, build, and operate systems that support Resmed’s customers worldwide.

Responsibilities

  • Ensure the reliability, availability, and resiliency of Resmed’s digital products by designing and operating fault-tolerant systems

  • Partner with product and platform teams to define and improve service health using operational and customer-experience metrics

  • Design, implement, and maintain monitoring, alerting, logging, and tracing solutions that provide real-time visibility into system behavior and customer experience

  • Analyze system performance, scalability, and capacity, and drive optimizations to improve efficiency and stability in cloud environments

  • Build automation and tooling to support deployments, scaling, incident response, and operational workflows

  • Participate in an on-call rotation as part of a globally distributed team, lead incident response efforts, troubleshoot production issues, conduct postmortems, and drive continuous improvement initiatives

  • Collaborate with security and compliance partners to support secure, privacy-aware, and compliant operations

  • Work closely with engineering teams to improve developer experience, operational maturity, and overall customer experience

Qualifications

  • Experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles

  • Experience operating Kubernetes-based production systems

  • Hands-on experience with AWS and infrastructure-as-code tools

  • Experience designing and supporting CI/CD pipelines and automated deployments

  • Proficiency in Python for automation, tooling, or backend services

  • Solid understanding of distributed systems and networking concepts

  • Experience with monitoring and observability platforms such as Datadog and CloudWatch

Joining us is more than saying “yes” to making the world a healthier place. It’s discovering a career that’s challenging, supportive, and inspiring, where a culture driven by excellence helps you not only meet your goals but also create new ones. We focus on creating a diverse and inclusive culture, encourage individual expression in the workplace, and thrive on the innovative ideas this generates. If this sounds like the workplace for you, apply now! We commit to responding to every applicant.