Manager, Cloud Reliability Engineering (United States)

Posted:
9/11/2024, 9:04:20 AM

Location(s):
San Francisco, California, United States ⋅ New York, United States ⋅ New York, New York, United States ⋅ California, United States ⋅ Washington, United States

Experience Level(s):
Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Workplace Type:
Remote

Do you want to empower organizations to fairly and equitably hire, promote, retain and compensate their employees? Syndio is a Series-C technology company committed to fairness in the workplace. Fueled by investments of $83M from Bessemer Ventures, Voyager Capital and social change organization Emerson Collective, Syndio is investing in growing our team and products.

This is a critical moment when organizations are looking for ways to take tangible action to fight gender and racial bias, and we believe creating diverse and inclusive workplaces for all starts with workplace equity.

About the role

#ubiquitous intro

This role will report to the Director, Platform Engineering with ownership of Site Reliability Engineering (SRE), Platform Engineering (PE), and Cloud Operations Engineering (COE) functions. In this role, you will help to define and drive a vision for the organization and align it with the broader engineering and business goals. You’ll develop and execute on Service Level Objectives (SLOs), capacity management and planning, and cost optimization. You’ll have a clear line of sight into the overall health of our SDLC pipelines and the reliability, scalability, and performance of our systems. Lastly, you’ll develop and maintain our Internal Development Platform (IDP) and related tooling, relentlessly focused on making the software development process quick and painless for our engineers. Kindly note, this is a remote role based within one of our talent hubs: NYC Tri-State, Greater Seattle, or San Francisco Bay Area.

#the gist

We are looking for someone who is passionate about SRE practices, understands the importance of platform engineering and overall developer experience, and values the operational needs of a distributed SaaS-based platform that caters to the largest enterprises in the world. As a startup in a fast-paced, high-growth environment, we are looking for a leader who is flexible and can grow beyond the traditional expectations of an engineering leader. You’ll be exposed to, develop skills in, and be responsible for work that spans a wide range of engineering disciplines and leadership functions.

#not awkward- just real
Before we get started, let’s be specific about the minimum skill set. This is a first-line manager leadership position. We use Kubernetes, Helm, and Terraform almost exclusively in a 100% cloud-based environment. You are a leader who is comfortable in your knowledge of these technologies and has relevant experience managing SaaS-based applications in an SRE, PE, Cloud Operations, or similar role.
 
Why this job is exciting
  • Guide the development and evolution of our cloud platform, ensuring it is scalable, reliable, and secure
  • Design, implement, and operate production systems using best practices in automation, monitoring, and observability
  • Recruit, mentor, and develop a high-performing team of SREs, Platform Engineers, and Cloud Operations Engineers
  • Foster strong partnerships with development teams, product management, security, and other stakeholders to ensure reliability is built into the entire product lifecycle
  • Define, track, and report on SLOs to ensure system reliability and performance meet or exceed customer expectations
  • Forecast and manage infrastructure capacity to ensure systems can scale to meet demand while optimizing costs
  • Identify and implement cost-saving measures to ensure cloud infrastructure spending is optimized
  • Proactively identify and address security vulnerabilities in the cloud environment
  • Stay abreast of emerging technologies and industry trends, and evaluate their potential to enhance Cloud Reliability Engineering practices
  • Experiment with cloud infrastructure environments and services
About you
  • #important stuff
    • 3+ years of experience as a manager leading a team of SREs, PEs, or COEs
    • 5+ years in an SRE, PE, COE, or similar role operationalizing and maintaining cloud services
    • Experience with continuous integration and delivery frameworks
  • #really important stuff
    • You are eager to learn, share knowledge and ideas, and grow with our team
    • You are available and willing to step in for emergency response and incident management
    • You are relentlessly curious, willing to explore all aspects of the product to identify gaps in stability, scalability, and performance
    • You want to own and drive initiatives within Syndio that leverage and build your skills and interests
  • #REALLY really important stuff
    • You assume positive intent, are humble and eager, expect the best from yourself, value partnership over perfection, and provide grace and understanding in stressful situations
    • You value a remote work environment and know that it requires greater intentionality on your part to build and maintain strong working relationships
    • While this is a remote position, you MUST currently reside within commuting distance of one of our talent hubs. Relocation is not currently offered. 
      • Kindly note, you must also be eligible to work there legally, as we are NOT able to provide visa support at this time.

Role progression

  • Within 1 month, you’ll complete a comprehensive and supportive onboarding process and be able to make isolated contributions to the product, developer tooling, and infrastructure
  • Within 3 months, you’ll have a grasp of the complete set of components (services, tools, configuration, etc.) that make up the product and infrastructure. You will continue to make isolated contributions.
  • Within 6 months, you’ll be driving and executing on a roadmap (partially of your own making) and implementing complex changes to the infrastructure, the developer tooling, and the scalability, reliability, and performance of our platform.
Why you'll love it here:
  • Check out our Employee Experience page for more information on our Mission & Values, Work-Life Balance, Pay Transparency, Diversity, Culture, and Benefits. 
  • 💰 Competitive Compensation. For this role our base salary is targeted at $168-200k CAD. Final offer amounts are determined by factors such as experience and expertise.
  • 🏆 Syndio Equity. So you can share in Syndio’s success.
  • 🏝 20 days of paid time off annually. We encourage our team to recharge when they need to, plus paid sick & safe time, compassion leave, and voting leave.
  • 🏦 Pension Contribution
  • 📍 Remote-First within talent hubs - NYC Tri-State, Greater Seattle, or San Francisco Bay Area - providing more opportunities to meet up

The Interview Overview

Below you'll find an outline of the interview plan for our Manager, Cloud Reliability Engineering position. Please note that this is what we expect the process to look like; we may ask you for supplemental information or require an additional step before making a final decision.

  • 30-minute interview with a member of our Talent Team
  • 30-minute Zoom interview with the Hiring Manager
  • Three video interviews with several key team members (1 hr 45 min in total)
  • 30-minute finalist interview with an executive

At Syndio, we're building a diverse team that values candor, curiosity, and community. If you share these values and are interested in joining us, we'd love to talk with you even if you don't meet 100% of the "About you" criteria listed here. We don't expect anyone to have all the answers, as long as you're willing to learn and grow with us.

Employees joining the Syndio team at this early stage of growth will impact this critical social issue and support a growing customer base (including Nordstrom, General Mills, Match Group, and others) to take tangible action on workplace fairness. 

Syndio is an Equal Opportunity Employer. We are building an inclusive and collaborative workplace as we grow, and we welcome team members regardless of gender/identity, sexual orientation, race or cultural background, religion, physical disability, or age.

Syndio

Website: https://synd.io/

Headquarter Location: Seattle, Washington, United States

Employee Count: 101-250

Year Founded: 2016

IPO Status: Private

Last Funding Type: Series C

Industries: Analytics ⋅ Human Resources ⋅ Software