Staff Site Reliability Engineer

Posted:
6/10/2026, 6:49:42 PM

Location(s):
Gurgaon, Haryana, India ⋅ Haryana, India

Experience Level(s):
Mid Level ⋅ Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Work Flexibility: Hybrid

What You Will Do

  • Own and maintain highly available production systems, lead incident response (P1/P2), conduct root cause analyses (RCAs) and post-incident reviews (PIRs), and drive improvements to reliability, performance, and operational excellence.

  • Design, build, and manage scalable cloud infrastructure on AWS using Terraform, with strong ownership of Kubernetes (EKS), networking, security, and platform resilience.

  • Develop and optimize CI/CD and GitOps pipelines using GitLab CI and ArgoCD, while automating operational processes to improve efficiency and consistency.

  • Manage observability and on-call operations through tools such as PagerDuty/Zenduty, Prometheus, Grafana, ELK, and Datadog, ensuring actionable monitoring and effective alert management.

  • Collaborate with global engineering, security, and product teams; contribute to cloud architecture and compliance initiatives (SOC 2, ISO 27001); create operational documentation; and mentor team members.

What You Will Need

Required qualifications 

  • 5–8 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles. 

  • Strong hands-on experience with AWS, Terraform, Kubernetes, and ArgoCD in production environments. 

  • Experience operating EKS-based platforms, including networking, scaling, monitoring, and troubleshooting. 

  • Strong knowledge of CI/CD, GitOps, and automation practices, with hands-on use of GitLab CI and ArgoCD. 

  • Experience managing production systems in a 24×7 environment, including incident response and on-call practices. 

  • Solid Linux and cloud networking background. 

  • Experience with observability tools such as ELK and Prometheus. 

  • Strong scripting skills in Python, Bash, or Go. 

Preferred qualifications 

  • Engineering degree in computer science or equivalent.  

  • Cloud certifications such as AWS SysOps Administrator or AWS DevOps Engineer, or Kubernetes certifications such as CKA or CKAD.

  • Exposure to ITSM or change management processes in regulated industries (healthcare, fintech, or similar). 

 

Travel Percentage: 10%