Senior Engineer, Site Reliability

Posted:
6/2/2024, 5:00:00 PM

Location(s):
Chevy Chase, Maryland, United States ⋅ Maryland, United States

Experience Level(s):
Senior

Field(s):
Software Engineering

Workplace Type:
Hybrid

 

Position Summary  

GEICO is seeking an experienced Engineer with a passion for building high-performance, low-maintenance, zero-downtime platforms and applications. You will help drive our insurance business transformation as we transition from a traditional IT model to a tech organization with engineering excellence as its mission, while co-creating a culture of psychological safety and continuous improvement.

Position Description 

Our Senior SRE works with our Manager, Distinguished and Staff Engineers to innovate and build new systems, improve and enhance existing systems, and identify new opportunities to apply your knowledge to solve critical problems. You will lead the execution of a technical roadmap that will increase the velocity of delivering products and unlock new engineering capabilities. The ideal candidate has a deep understanding of technology, risk management, site reliability engineering principles, and strategic planning to design and implement resilient systems that safeguard our business from potential threats.

 

Position Responsibilities 

As an SRE, you will: 

  • Drive the overall strategy for the organization, aligning it with the organization's business goals and objectives 

  • Provide thought leadership in the organization, staying ahead of industry trends and emerging technologies to enhance the reliability of our services. 

  • Conduct comprehensive risk assessments to identify potential threats and vulnerabilities 

  • Design and implement robust strategies to ensure data safety, integrity, and correctness  

  • Lead the design and architecture of resilient and scalable systems, considering both on-premises and cloud-based solutions 

  • Collaborate with cross-functional teams to integrate data safeguard best practices into the development and deployment processes 

  • Develop and maintain comprehensive incident response plans to address various disaster scenarios on our backup/restore systems 

  • Conduct regular simulations and drills to ensure the readiness of the organization in the event of a disaster

  • Perform hands-on software engineering and follow SDLC best practices (Technical Review Documents, Architecture, Software Development, Software Reviews, Testing, Production Readiness Reviews, among others)

  • Evaluate, select, and implement cutting-edge technologies and tools to enhance our data safeguard capabilities including but not limited to processes, compliance, and visibility 

  • Stay current with industry best practices and emerging technologies to continuously improve our data safeguard capabilities 

  • Work closely with executive leadership, IT teams, and other stakeholders to communicate the importance of data safeguarding and foster a culture of resilience 

  • Act as a trusted advisor, providing guidance on best practices for managing and securing our data to technical and non-technical stakeholders

  • Be a role model and mentor, helping to coach and strengthen the technical expertise and know-how of our engineering and product community 

  • Influence and educate executives 

  • Analyze cost and forecast 

  • Determine and support resource requirements, evaluate operational processes, measure outcomes to ensure desired results, demonstrate adaptability, and sponsor continuous learning

 

Qualifications 

  • Deep knowledge of SRE practices, methodologies, and principles, along with a solid understanding of on-premises and public cloud-based network, compute, and storage technologies

  • Fluency and specialization in software development and best practices using modern programming languages such as Go and Python  

  • Understanding of SQL and NoSQL databases, including stateful services management and storage 

  • Understanding of networking, caches, key/value stores, load balancing, global load balancing, queues, DNS and CDN 

  • In-depth knowledge of hybrid cloud architecture, IaaS and PaaS technologies, container orchestration platforms (e.g., Kubernetes), cloud efficiency, and observability

  • Strong background in incident management 

  • Ability to create incident response playbooks, runbooks, incident triaging strategies, and post-incident analysis to drive continuous improvement in system reliability and availability 

  • Experience with open-source management and monitoring tools 

  • Experience with infrastructure automation, tooling, and configuration management frameworks (e.g., Puppet, Chef, Ansible, Pulumi, Terraform)

  • Familiarity with cloud security best practices and compliance standards 

  • Excellent leadership skills with a passion for mentoring and fostering professional growth 

  • Detail-oriented, with a drive for operational excellence

  • Visionary thinker with the ability to anticipate future challenges and opportunities 

  • Excellent communication skills 

  • Strong analytical and problem-solving capabilities 

  • Proven track record of successfully leading and building software in large and complex organizations 

 

Experience 

  • 4+ years of professional experience in software development, platform architecture, and administration and maintenance of software and its ecosystem

  • 3+ years of experience with architecture and design  

  • 3+ years of experience with AWS, GCP, Azure, or hybrid data center  

  • 2+ years of experience in open-source frameworks  

Education 

  • Bachelor's degree in Computer Science, Information Systems, or equivalent education or work experience


 

Annual Salary

$82,000.00 - $204,500.00

The above annual salary range is a general guideline. Multiple factors are taken into consideration to arrive at the final hourly rate/annual salary to be offered to the selected candidate. Factors include, but are not limited to, the scope and responsibilities of the role; the selected candidate’s work experience, education, and training; the work location; and market and business considerations.


 

At this time, GEICO will not sponsor a new applicant for employment authorization for this position.


 

Benefits:

As an Associate, you’ll enjoy our Total Rewards Program* to help secure your financial future and preserve your health and well-being, including:

  • Premier Medical, Dental and Vision Insurance with no waiting period**
  • Paid Vacation, Sick and Parental Leave
  • 401(k) Plan
  • Tuition Reimbursement
  • Paid Training and Licensures

*Benefits may be different by location.  Benefit eligibility requirements vary and may include length of service.

**Coverage begins on the date of hire. Must enroll in New Hire Benefits within 30 days of the date of hire for coverage to take effect.

The equal employment opportunity policy of the GEICO Companies provides for a fair and equal employment opportunity for all associates and job applicants regardless of race, color, religious creed, national origin, ancestry, age, gender, pregnancy, sexual orientation, gender identity, marital status, familial status, disability or genetic information, in compliance with applicable federal, state and local law. GEICO hires and promotes individuals solely on the basis of their qualifications for the job to be filled.

GEICO reasonably accommodates qualified individuals with disabilities to enable them to receive equal employment opportunity and/or perform the essential functions of the job, unless the accommodation would impose an undue hardship to the Company. This applies to all applicants and associates. GEICO also provides a work environment in which each associate is able to be productive and work to the best of their ability. We do not condone or tolerate an atmosphere of intimidation or harassment. We expect and require the cooperation of all associates in maintaining an atmosphere free from discrimination and harassment with mutual respect by and for all associates and applicants.