Principal Site Reliability Engineer

Posted:
10/2/2024, 5:00:00 PM

Location(s):
Hyderabad, Telangana, India ⋅ Telangana, India

Experience Level(s):
Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Career Category

Information Systems

Job Description

Join Amgen’s Mission of Serving Patients

At Amgen, if you feel like you’re part of something bigger, it’s because you are. Our shared mission—to serve patients living with serious illnesses—drives all that we do.

Since 1980, we’ve helped pioneer the world of biotech in our fight against the world’s toughest diseases. With our focus on four therapeutic areas (Oncology, Inflammation, General Medicine, and Rare Disease), we reach millions of patients each year. As a member of the Amgen team, you’ll help make a lasting impact on the lives of patients as we research, manufacture, and deliver innovative medicines to help people live longer, fuller, happier lives.

Our award-winning culture is collaborative, innovative, and science-based. If you have a passion for challenges and the opportunities that lie within them, you’ll thrive as part of the Amgen team. Join us and transform the lives of patients while transforming your career.

What you will do

Let’s do this. Let’s change the world. In this vital role, you will drive operational excellence through automation, incident response, and proactive performance tuning, while also reducing infrastructure costs. You will work closely with multi-functional teams to establish standard methodologies for service availability, efficiency, and cost control.

Roles & Responsibilities:

  • Talent Management & Team Leadership: Lead, mentor, empower, and manage a hard-working team of 5-10 engineers to deliver exceptional results.
  • System Reliability, Performance Optimization & Cost Reduction: Ensure the reliability, scalability, and performance of Amgen’s infrastructure, platforms, and applications. Proactively identify and resolve performance bottlenecks, and implement long-term fixes. Continuously evaluate system design and usage to find opportunities for cost optimization, ensuring infrastructure efficiency without compromising reliability.
  • Automation & Infrastructure as Code (IaC): Drive the adoption of automation and Infrastructure as Code (IaC) across the organization to streamline operations, minimize manual interventions, and enhance scalability. Implement tools and frameworks (such as Terraform, Ansible, or Kubernetes) that increase efficiency and reduce infrastructure costs through optimized resource utilization.
  • Standardization of Processes & Tools: Establish standardized operational processes, tools, and frameworks across Amgen’s technology stack to ensure consistency, maintainability, and best-in-class reliability practices. Champion the use of industry standards to optimize performance and increase operational efficiency.
  • Monitoring, Incident Management & Continuous Improvement: Implement and maintain comprehensive monitoring, alerting, and logging systems to detect issues early and ensure rapid incident response. Lead the incident management process to minimize downtime, conduct root cause analysis, and implement preventive measures to avoid recurrence. Foster a culture of continuous improvement by using data from incidents and performance monitoring.
  • Collaboration & Multi-functional Leadership: Partner with software engineering, DevOps, and IT teams to integrate reliability, performance optimization, and cost-saving strategies throughout the development lifecycle. Act as a domain expert in SRE principles and advocate for standard methodologies across all teams.
  • Capacity Planning & Disaster Recovery: Develop and implement capacity planning processes to support future growth, performance, and cost management. Maintain disaster recovery strategies to ensure system reliability and minimize downtime in the event of failures.

What we expect of you

We are all different, yet we all use our unique contributions to serve patients.

Basic Qualifications:

  • Master’s degree and 8 to 10 years of Computer Science, Engineering, or related field experience OR
  • Bachelor’s degree and 10 to 14 years of Computer Science, Engineering, or related field experience OR
  • Diploma and 14 to 18 years of Computer Science, Engineering, or related field experience

Preferred Qualifications:

  • Performance Tuning & Cost Optimization: Expertise in identifying performance bottlenecks in large-scale distributed systems and implementing optimization strategies. Experience with cost management in cloud environments (AWS, Azure) to drive cost-effective infrastructure decisions.
  • Automation Tools & Infrastructure as Code: Deep expertise with automation tools such as Terraform, Ansible, or Puppet, and hands-on experience with Infrastructure as Code (IaC) to automate infrastructure provisioning and maintenance, enhancing both performance and cost efficiency.
  • Monitoring & Incident Management: Proficient in deploying and managing production monitoring solutions such as Dynatrace, Datadog, or New Relic to maintain high system performance and ensure rapid incident response. Proven experience with incident management.
  • Standardization & Best Practices: Strong background in creating and enforcing standardized processes, coding practices, and frameworks to ensure consistency, scalability, and improved system performance, and in evangelizing them through cross-team collaboration.

Good-to-Have Skills:

  • Experience with containerization (Docker) and orchestration tools (Kubernetes) to optimize resource usage and improve scalability.
  • Knowledge of cloud-native technologies and strategies for cost optimization in multi-cloud environments.
  • Familiarity with distributed systems, databases, and large-scale system architectures.

Certifications

  • AWS Certified DevOps Engineer - Professional: Recognizes sophisticated knowledge of AWS and DevOps standard methodologies to automate and optimize infrastructure and applications in AWS.
  • Certified Kubernetes Administrator (CKA): Validates skills required to design, build, and maintain production-grade Kubernetes clusters.

What you can expect of us

As we work to develop treatments that take care of others, we also work to care for your professional and personal growth and well-being. From our competitive benefits to our collaborative culture, we’ll support your journey every step of the way.

In addition to the base salary, Amgen offers competitive and comprehensive Total Rewards Plans that are aligned with local industry standards.

Apply now for a career that defies imagination

Objects in your future are closer than they appear. Join us.

careers.amgen.com
