Posted: 2/26/2026, 7:24:35 PM
Location(s): Karnataka, India ⋅ Bengaluru, Karnataka, India
Experience Level(s): Senior
Field(s): DevOps & Infrastructure ⋅ Software Engineering
Job Title: Senior Site Reliability Engineer (SRE) – DBaaS Platform (Automation)
Location: Bangalore
Department: Customer Success
Reports To: VP Customer Success
Role Overview
We are seeking a highly skilled Senior SRE to lead reliability engineering for our
cloud-native Database-as-a-Service (DBaaS) platform. This role will drive automation-first
operations, SRE agent architecture, AI-enabled incident acceleration, and SLO-driven
reliability governance across AWS, Azure, and GCP environments.
You will operate at the intersection of platform engineering, cloud infrastructure,
database reliability, and automation — building self-healing, scalable, and cost-efficient
systems.
Key Responsibilities
1. SRE Agent Architecture & Technical Ownership
Design and own SRE automation agents for proactive monitoring, remediation,
and performance optimization.
Build event-driven reliability frameworks integrated with observability platforms.
Define extensible architectures for auto-detection, auto-healing, and intelligent
alert reduction.
2. Automation Roadmap Leadership
Own the automation strategy across the DBaaS lifecycle (provisioning, scaling,
patching, backup, and disaster recovery (DR)).
Drive infrastructure and operational automation maturity.
Eliminate toil through scripting, tooling, and CI/CD integration.
3. Engineering-Driven Reliability & SLO Governance
Define and manage SLIs, SLOs, and error budgets.
Implement reliability scorecards and availability governance.
Partner with Product and Engineering to embed SRE practices into platform
design.
4. AI-Enabled Operational Acceleration
Integrate AI/ML-based anomaly detection and predictive scaling.
Enable automated root cause analysis (RCA) enrichment using log analytics and telemetry intelligence.
Drive AI-assisted runbooks and decision frameworks.
5. Strong Programming Expertise
Develop automation frameworks using Python and/or Go.
Build scalable microservices for reliability orchestration.
Contribute to platform APIs and reliability tooling.
6. Infrastructure as Code (IaC) Mastery
Architect and manage infrastructure using Terraform.
Implement policy-as-code and compliance automation.
Ensure consistent multi-cloud deployments.
7. Multi-Cloud Expertise
Deep hands-on experience with AWS, Azure, and GCP.
Design high-availability, multi-region architectures.
Implement secure, scalable network and storage solutions across clouds.
8. Containerization & Orchestration
Strong hands-on experience with Docker and Kubernetes.
Build and manage stateful workloads in Kubernetes.
Implement scaling, failover, and resilience patterns.
9. Cloud Networking & Security
Strong understanding of VPC/VNet, peering, routing, firewalls, IAM, and encryption.
Implement Zero-Trust and least-privilege access models.
Embed security into reliability workflows.
10. Database Reliability & High Availability
Experience managing HA architectures for relational and NoSQL databases.
Strong knowledge of replication, failover, backup, DR, and point-in-time recovery (PITR).
Performance tuning and capacity planning expertise.
11. Incident Leadership & RCA Excellence
Lead critical incident response (P1/P2).
Conduct structured RCA and preventive action planning.
Build post-incident automation improvements.
12. Cost Optimization & Operational Efficiency
Implement FinOps practices for DBaaS workloads.
Optimize compute, storage, and licensing costs.
Drive performance-per-dollar improvements.
13. Cross-Team Technical Leadership
Mentor junior SREs and platform engineers.
Collaborate with Product, DBA, Security, and Dev teams.
Influence architecture decisions with a reliability-first mindset.
Required Qualifications
8+ years in SRE / DevOps / Platform Engineering roles.
3+ years in multi-cloud production environments.
Strong programming expertise in Python and/or Go.
Deep experience with Terraform and infrastructure automation.
Hands-on Kubernetes production experience.
Experience managing large-scale database platforms.
Strong understanding of observability (metrics, logs, traces).
Preferred Qualifications
Experience in DBaaS or SaaS platform companies.
Experience with AI-driven monitoring/operations.
Knowledge of distributed systems internals.
Experience implementing SRE best practices at scale.
Key Competencies
Systems thinking
Automation-first mindset
Bias for engineering over manual ops
Data-driven decision making
Strong ownership and accountability
Executive-level communication during incidents
Website: https://www.tessell.com/
Headquarters Location: San Ramon, California, United States
Employee Count: 51-100
Year Founded: 2021
IPO Status: Private
Last Funding Type: Series A
Industries: Database ⋅ PaaS ⋅ SaaS ⋅ Software