Site Reliability Engineer

Posted:
8/20/2025, 10:15:03 PM

Location(s):
Haryana, India ⋅ Tamil Nadu, India ⋅ Chennai, Tamil Nadu, India ⋅ Gurugram, Haryana, India

Experience Level(s):
Mid Level ⋅ Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Join us as a Site Reliability Engineer

In this key role, you’ll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services
You’ll enjoy significant stakeholder interaction, working in collaboration with engineers to ensure a principled approach to delivering change in a safe and secure way
This is a chance to join an inclusive team with a collaborative ethos and a commitment to innovation and professional development
We’re offering this role at senior analyst level

What you'll do

As a Site Reliability Engineer, you’ll be supporting colleagues and feature team members to meet defined service level objectives and continually improve systems and environments. You’ll also be proactively contributing new ideas and innovations to meet short-term and longer-term goals while balancing and managing risk.

We’ll look to you to ensure the availability, performance, and scalability of the services, as well as monitoring systems and applications to proactively identify and resolve issues before they impact end-users. You’ll also be responding to incidents promptly and effectively, using ITIL principles to manage escalations, root cause analysis, and resolution.

A typical day will involve:

Documenting incidents thoroughly for future reference and improvement
Implementing and enhancing monitoring, logging, and alerting systems to provide full visibility into the health and performance of applications
Automating repetitive, manual tasks and processes to reduce toil and improve operational efficiency
Working closely with development and operations teams to understand application flow and provide support in troubleshooting complex technical issues
Participating in post-incident reviews and proposing actionable improvements based on findings

The skills you'll need

We’re looking for someone with at least four years of experience as a Site Reliability Engineer or in a similar role, ideally in the banking domain, with a solid understanding of production support. You’ll need basic proficiency in SQL for database interactions, as well as an understanding of application flow and architecture, such as Java microservices, to troubleshoot effectively.

You’ll bring incident, problem and change management experience, paired with production support experience. You’ll also need knowledge of Cloud Services, preferably AWS, as well as experience of monitoring and observability tools such as Splunk, DX-APM or similar technologies.

Additionally, you'll need:

Familiarity with ITIL frameworks, particularly in incident and problem management
Knowledge of scripting languages, such as Python or Bash, to automate repetitive tasks and improve operational functions
The ability to provide on-call support on a rotation basis
Excellent problem-solving abilities, strong communication skills, and a collaborative mindset to work effectively within teams
Experience of root cause analysis of incidents, as well as coordinating with development, infrastructure, and platform teams
Experience of non-production and production environment deployments, and CI/CD support

Hours

Job Posting Closing Date:

28/08/2025


Digi Ventures Ltd
