Cloud DevOps Engineer – AI Ops

Posted:
5/16/2026, 4:55:40 AM

Location(s):
Kerala, India ⋅ Mayyanad, Kerala, India

Experience Level(s):
Expert or higher ⋅ Senior

Field(s):
DevOps & Infrastructure ⋅ Software Engineering

Department:
Technology

Our Company Promise


We are committed to provide our Employees a stable work environment with equal opportunity for learning and personal growth. Creativity and innovation are encouraged for improving the effectiveness of Southwest Airlines. Above all, Employees will be provided the same concern, respect, and caring attitude within the organization that they are expected to share externally with every Southwest Customer.

Job Description:

As a Cloud DevOps Engineer supporting Southwest’s AI Platform Operations Pod, you’ll help build and operate the cloud automation and reliability backbone that keeps AI and agentic platforms running 24×7. This role focuses on designing and maintaining Infrastructure as Code, CI/CD pipelines, and observability standards that enable reliable, secure, and scalable AI workloads in AWS. You’ll partner closely with Application, Security, and Architecture Teams to create reusable cloud patterns, support business continuity and disaster recovery, and continuously improve platform stability and performance. Working hands‑on with container platforms, automation tooling, and monitoring systems, you’ll help establish DevOps, observability, and FinOps practices for a growing AI Ops organization, balancing reliability, cost, and security for production AI systems. This role offers the opportunity to build deep cloud operations expertise while supporting AI platforms that must perform consistently as they scale across the Company.

Responsibilities
  • Operate in a DevOps culture and team, responsible for architecture, design, development, implementation, and ongoing operations of new and emerging technology platforms
  • Implement automation tools and frameworks for automatic code deployment (CI/CD)
  • Lead the design, development, and evolution of cloud infrastructure
  • Establish and/or follow procedures and standards to ensure high quality and quantity of work
  • Establish or follow prioritization processes to drive work, and maintain a sense of urgency about getting work completed
  • Consult with stakeholders to specify requirements and solutions which address business challenges and opportunities
  • Serve as a subject matter expert in cloud infrastructure, performing design reviews and consulting with your teams to ensure design best practices are followed
  • Maintain business continuity and disaster recovery processes
  • Participate in system and acceptance testing to ensure that systems are functionally appropriate, technically sound, and well-integrated
  • Test and implement systems and enhancements using techniques that preserve system integrity
  • May perform other job duties as directed by Employee's Leaders

Knowledge, Skills and Abilities
  • Knowledge of AWS Infrastructure as a Service (IaaS) automation tools such as CloudFormation
  • Knowledge of AWS compute, data sources, and security technologies
  • Knowledge of infrastructure-as-code (IaC)
  • Skilled in deployment strategies using Docker for containerization
  • Skilled in monitoring tools
  • Ability to mentor and guide team members' learning, including introducing new ideas and technologies
  • Ability to anticipate business needs and lead a team towards identifying and solving cross-domain problems

Education
  • Required: Bachelor's degree in Computer Science, Engineering, Information Systems or related field and/or equivalent formal training

Experience
  • Required: Intermediate-level experience, with fully functioning, broad knowledge of cloud software or DevOps
    • 2-5 years of relevant work-related experience
    • 3+ years of hands-on experience with AWS—EKS/ECS, Lambda, IAM, VPC, CloudWatch
    • 2+ years of experience with IaC—Terraform or CloudFormation
    • 3+ years of experience with CI/CD pipelines—Jenkins, GitHub Actions, GitLab CI
    • 2+ years of experience with Containers—Docker + Kubernetes
    • 2+ years of experience with observability & incident response—logs, metrics, traces
  • Preferred:
    • Experience with secrets management (Vault, AWS Secrets Manager)
    • Experience with FinOps for AI workloads
    • Experience deploying LLM/agentic workloads (Bedrock, SageMaker)
    • Knowledge of GPU infrastructure
    • Familiarity with SRE practices—SLOs, error budgets

Other Qualifications
  • Must meet confidentiality expectations regarding confidential, proprietary, and sensitive Company information
  • Ability to work extended hours as needed

Southwest Airlines is an Equal Opportunity Employer.
Please print/save this job description because it won't be available after you apply.