Project Role : Service Management Lead
Project Role Description : Lead the delivery of programs, projects or managed services. Coordinate projects through contract management and shared service coordination. Develop and maintain relationships with key stakeholders and sponsors to ensure high levels of commitment and enable the strategic agenda.
Must have skills : Site Reliability Engineering
Good to have skills : Cloud Infrastructure
Minimum 5 year(s) of experience is required
Educational Qualification : 15 years full time education
ROLE SUMMARY:
As a highly skilled Site Reliability Engineer III, you will be responsible for managing and scaling our application observability platforms and tools. The ideal candidate will have a solid understanding of how to implement tooling, monitoring, and logging so that we maintain optimal system and application health, measured by availability and performance. You will also script, troubleshoot, and set up alerting for monitors and logs. This role requires a solid foundation in automation, scripting, and problem-solving to ensure that our tools deliver information efficiently and reliably.
KEY RESPONSIBILITIES:
• Implement and maintain an observability platform using OpenTelemetry with Grafana Cloud (Logs, Metrics and Traces).
• Integrate additional monitoring tools like Pingdom/SolarWinds, PagerDuty and Nagios to monitor and analyze system health and events.
• Collaborate with development and operations teams to improve system reliability, scalability, and performance.
• Analyze system and application performance metrics to identify and resolve performance bottlenecks.
• Develop and execute automation scripts for health checks, alerts, and auto-remediation of events.
• Manage SaaS and cloud hosting environments to ensure optimal performance, security, and compliance.
• Conduct root cause analysis for system failures and implement preventive measures.
• Create and maintain comprehensive technical documentation for standards, templates, and processes.
• Provide technical guidance to team members on migration and critical projects.
REQUIRED QUALIFICATIONS
• 5-8 years of prior experience in an SRE role is required for this position.
• BS Degree in Computer Science, Software Engineering, or related software engineering field
• Extensive experience with observability tools and practices, including Grafana Cloud, Prometheus, Loki and related technologies.
• Proficiency with OpenTelemetry for comprehensive monitoring and performance tracking with Metrics and Traces.
• Advanced knowledge of, and experience with, monitoring tools like Pingdom/SolarWinds Observability, Nagios and PagerDuty.
• Advanced skills in managing infrastructure on Windows Server 2016 and 2022, as well as Linux distributions (CentOS 7+, AlmaLinux).
• Advanced skills with scripting and automation (e.g., PowerShell, Python), including health checks and auto-restoration.
• Strong understanding of SaaS and cloud hosting environments with virtualization technologies (VMware) and Azure DevOps.
• Experience with incident/outage response, disaster recovery and service restoration.
• Strong problem-solving and troubleshooting skills and the ability to resolve complex outage/performance issues.
• Ability to work independently and drive projects/tasks effectively.
• Willingness to work off-hours when necessary.
About Accenture
Accenture is a leading global professional services company that helps the world’s leading businesses, governments and other organizations build their digital core, optimize their operations, accelerate revenue growth and enhance citizen services—creating tangible value at speed and scale. We are a talent- and innovation-led company with 750,000 people serving clients in more than 120 countries. Technology is at the core of change today, and we are one of the world’s leaders in helping drive that change, with strong ecosystem relationships. We combine our strength in technology and leadership in cloud, data and AI with unmatched industry experience, functional expertise and global delivery capability. We are uniquely able to deliver tangible outcomes because of our broad range of services, solutions and assets across Strategy & Consulting, Technology, Operations, Industry X and Song. These capabilities, together with our culture of shared success and commitment to creating 360° value, enable us to help our clients reinvent and build trusted, lasting relationships. We measure our success by the 360° value we create for our clients, each other, our shareholders, partners and communities. Visit us at www.accenture.com
Equal Employment Opportunity Statement
All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law.
Job candidates will not be obligated to disclose sealed or expunged records of conviction or arrest as part of the hiring process.
Accenture is committed to providing veteran employment opportunities to our service men and women.