SRE Application Support Lead

Posted:
10/27/2025, 7:34:39 PM

Location(s):
Chennai, Tamil Nadu, India ⋅ Tamil Nadu, India

Experience Level(s):
Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Workplace Type:
Hybrid

What We'll Bring:

We are seeking a highly skilled and motivated SRE Application Support Lead / Sr. Lead to join our 24x7 support team. This role is critical to ensuring the stability, performance, and reliability of mission-critical applications deployed across modern platforms including Docker, Kubernetes, and cloud environments. The ideal candidate will possess strong technical expertise, leadership capabilities, and a proactive mindset to drive operational excellence.

What You'll Bring:

Key Responsibilities

Team Leadership & Management

  • Lead and mentor a team of SRE/Application Support Engineers.

  • Assign tasks, set goals, and ensure smooth day-to-day operations.

  • Foster a culture of ownership, accountability, and continuous improvement.

Incident & Problem Management

  • Own and manage critical incidents end-to-end.

  • Perform root cause analysis and drive permanent resolutions.

  • Collaborate with cross-functional teams and vendors for quick recovery.

Monitoring & Observability

  • Use tools such as Splunk, Grafana, AppDynamics, and Spotfire to monitor application health.

  • Set up proactive alerting and dashboards for performance tracking.

Automation & Tooling

  • Develop scripts (Shell, Python) to automate routine tasks.

  • Build and maintain internal tools to improve support efficiency.

Cloud & DevOps Integration

  • Support applications deployed in Docker, Kubernetes, and cloud platforms.

  • Collaborate with DevOps teams for CI/CD pipeline support and release validations.

Change & Release Management

  • Perform pre- and post-release validations.

  • Ensure production stability during deployments.

Documentation & Knowledge Management

  • Maintain runbooks, SOPs, and knowledge base articles.

  • Keep onboarding materials and troubleshooting guides up to date.

Stakeholder Communication

  • Provide timely updates to leadership and business teams.

  • Present metrics, incident summaries, and improvement plans.

SRE Mindset

  • Apply SRE principles to improve reliability, scalability, and performance of supported applications through proactive monitoring and automation.

  • Focus on reducing toil by automating repetitive tasks and improving operational efficiency.

  • Participate in blameless postmortems and contribute to continuous improvement initiatives based on incident learnings.

  • Drive observability enhancements by integrating metrics, logs, and traces into monitoring dashboards.

  • Collaborate with engineering teams to define and measure SLIs/SLOs, ensuring alignment with business availability goals.

Required Skills:

  • Strong Incident Management (IM) expertise: Proven ability to lead and coordinate high-severity incidents, including real-time triaging, root cause identification, and resolution tracking.

  • Bridge Call Management: Experience in initiating and leading bridge calls, ensuring timely updates, stakeholder alignment, and effective resolution.

  • Stakeholder Communication & Coordination: Ability to interact with cross-functional teams, vendors, and leadership during incidents and planned changes.

  • Monitoring & Observability Tools: Proficient in Splunk, Grafana, AppDynamics, Spotfire, and other monitoring platforms.

  • Technical Proficiency: Strong hands-on experience with Linux, SQL, and Shell scripting; Python preferred.

  • Cloud & Containerization: Exposure to cloud platforms (AWS, Azure, GCP), Docker, and Kubernetes.

  • Automation & Tooling: Experience in automating support tasks and building internal tools to improve operational efficiency.

  • Change & Problem Management: Familiarity with ITIL processes, including change, incident, and problem management.

  • Certifications: ITIL, AWS, Azure, Kubernetes, or other relevant technical/process certifications are a plus.

  • Excellent Communication Skills: Strong verbal and written communication for effective collaboration and reporting.

  • Team Leadership: Experience in managing and mentoring support teams, driving performance, and ensuring 24x7 operational readiness.

Impact You'll Make:

  • Lead 24x7 SRE/Application Support operations, ensuring high availability and performance of critical applications.
  • Drive Incident Management processes including triage, resolution, and post-incident reviews.
  • Initiate and lead bridge calls during high-severity incidents, ensuring timely updates and coordination across teams.
  • Act as the primary point of contact for stakeholder communication during incidents and planned changes.
  • Oversee monitoring and observability using tools like Splunk, Grafana, AppDynamics, and Spotfire.
  • Support applications deployed in Docker, Kubernetes, and cloud platforms (AWS/Azure/GCP).
  • Lead automation initiatives using Shell scripting and Python to improve operational efficiency.
  • Collaborate with DevOps and Engineering teams for CI/CD and release management.
  • Ensure compliance with ITIL processes (Incident, Problem, Change Management).
  • Maintain documentation including runbooks, SOPs, and knowledge base articles.
  • Tools & Technologies: Linux, SQL, Docker, Kubernetes, Splunk, Grafana, AppDynamics, Spotfire, Shell, Python
  • Certifications Preferred: ITIL, AWS/Azure/GCP, Kubernetes, DevOps
  • Work Mode: Hybrid (as per team policy)
  • Shift Type: Rotational (24x7 coverage)

This is a hybrid position that involves regularly performing job responsibilities both virtually and in person at an assigned TU office location for a minimum of two days a week.

TransUnion Job Title

Sr Lead, Applications Support

Commerce Signals Inc

Website: https://commercesignals.com/

Headquarters Location: Palo Alto, California, United States

Employee Count: 11-50

Year Founded: 2012

IPO Status: Private

Last Funding Type: Debt Financing

Industries: Analytics ⋅ Mobile ⋅ Retail ⋅ Software