Senior Site Reliability Engineer

Posted:
12/3/2024, 12:52:08 AM

Location(s):
Santiago Metropolitan Region, Chile

Experience Level(s):
Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Workplace Type:
Remote

Launchpad, a people-first technology company, is a leader in North America's rapidly growing tech sector. Through two solutions, Launchpad supports its clients with digital transformation:

  • Paasport™, our iPaaS solution, streamlines software integration and automates workflows.
  • Nearshore Staff Augmentation, our managed IT staffing service, connects top IT talent across various geographical regions, bringing industry expertise to leading clients.

Based in Vancouver, Canada, our operational footprint spans North and South America, with a second headquarters in Santiago, Chile.

In 2023, our unwavering dedication to innovation garnered recognition as a Deloitte Technology Fast 50™ Program Company. Our clientele boasts industry leaders such as Walmart, GM, TIME Magazine, Salesforce, Tableau, Splunk, Bolt.com, Freedom House, and more.

At Launchpad, we genuinely care about our people as individuals. If you are looking for a team that values growth, drive, and passion for your craft, if you’re seeking a place to achieve your goals and dreams with fairness and integrity, then we’d love to hear from you.

About the Role

We are seeking a Senior Site Reliability Engineer (SRE) to play a pivotal role in ensuring the reliability, scalability, and performance of our infrastructure. This is a mission-critical role, requiring someone who can address both external product reliability and internal platform demands while contributing strategically to organizational objectives.

You will balance hands-on technical work with leadership in reliability initiatives, driving improvements across our platform and collaborating with stakeholders at all levels. This position is crucial to maintaining operational excellence as we navigate complex compliance standards and evolving business needs.


Responsibilities

Strategic and Leadership Responsibilities

  • Drive the development and enhancement of reliability frameworks and processes.
  • Collaborate with VPs, managers, and cross-functional teams, presenting monthly reliability reports and real-time data analysis.
  • Lead initiatives to address key operational challenges and identify areas for process innovation.

Technical Responsibilities

  • Design, build, and maintain reliable and scalable infrastructure using Azure (90%) and AWS (10%).
  • Automate infrastructure and operational tasks with Kubernetes, Terraform, Jenkins, and GitHub Actions.
  • Develop and refine monitoring solutions using tools like Grafana, Prometheus, ELK Stack, and Azure Monitoring.
  • Manage incident response and conduct post-mortem analyses to improve system resiliency.
  • Provide Level 3 operational support, including on-call availability (preferred).
  • Address gaps in automation and optimize existing processes.

Compliance and Reporting

  • Ensure systems comply with ISO 27001 and SOC 2 standards.
  • Develop and improve reliability metrics and their communication to stakeholders.

Qualifications

Technical Skills

  • Expertise in cloud infrastructure, particularly Azure and AWS.
  • Strong experience with containerization and orchestration (Kubernetes) and infrastructure-as-code tools (Terraform).
  • Proficiency in CI/CD pipelines (Jenkins, GitHub Actions) and monitoring tools (Grafana, Prometheus).
  • Experience with secrets management tools such as HashiCorp Vault and incident management platforms like OpsGenie.

Experience

  • 7+ years in Site Reliability Engineering, DevOps, or similar roles.
  • Proven track record of managing complex cloud environments and driving operational improvements.
  • Familiarity with compliance frameworks such as ISO 27001 and SOC 2.
  • Experience presenting technical data and initiatives to executive-level stakeholders.

Soft Skills

  • Exceptional communication skills to collaborate across diverse teams and organizational levels.
  • Strong analytical mindset to address reliability challenges and identify innovative solutions.
  • Ability to work in a fast-paced environment and meet urgent deadlines.

 

Why work for Launchpad?

  • 100% remote
  • People-first culture
  • Excellent compensation in US Dollars
  • Hardware setup for working from home
  • Work with global teams and prominent brands based in North America, Europe, and Asia
  • Training allowances
  • Personal time off (PTO) for vacations, study leave, personal time, etc.
  • ...and more!

At Launchpad, we genuinely care about our people as individuals. If you are looking for a team that values growth, drive, and passion for your craft, if you’re seeking a place to achieve your goals and dreams with fairness and integrity, then you are the future of Launchpad. Launchpad is committed to fostering a diverse and representative workforce and an inclusive work environment where all employees are respected and treated equally.

Are you ready to elevate your career at Launchpad? We want to hear your story! Contact us today.