About Tencent
Tencent is an Internet-based platform company founded in Shenzhen, China, in 1998. We use technology to enrich the lives of Internet users and assist the digital upgrade of enterprises. Our mission is "Value for Users, Tech for Good". We embrace a culture of teamwork and creativity and are driven by our values: Integrity, Proactivity, Collaboration, and Creativity.
We are rapidly expanding our international operations and are looking for top talent to propel us forward. Combining the results-oriented nature of a startup with the resources of a profitable and leading Internet company, Tencent offers a unique opportunity for aspiring individuals to thrive.
About WeChat
With over 1.2 billion users worldwide, WeChat is changing the mobile landscape by connecting people, services, and businesses in China and around the world. The WeChat team in Singapore is responsible for managing and growing our core product, including messaging and social networking, for users around the world (excluding the Chinese mainland).
Join the WeChat team and play an impactful role in keeping people around the world connected. Help redefine how people use their mobile devices to communicate and interact online, and better understand the behavior and preferences of users worldwide.
Roles & Responsibilities
- Design, deploy, and manage CI/CD pipelines to ensure smooth and consistent delivery of software to production.
- Administer, scale, and optimize Kubernetes (k8s) deployments, ensuring high availability and fault tolerance for all applications.
- Architect and maintain microservices infrastructure, ensuring seamless communication, efficient scaling, and robust security.
- Implement, customize, and oversee monitoring solutions for proactive detection of system anomalies and performance bottlenecks.
- Lead capacity planning and resource management efforts, leveraging pressure testing to benchmark, tune, and enhance system performance.
- Swiftly identify, troubleshoot, and remediate issues affecting critical service operations, ensuring maximum uptime and minimal disruption to users.
- Proactively participate in the creation, maintenance, and enhancement of Standard Operating Procedures (SOPs) to ensure uniformity in operational tasks.
- Manage high-severity incidents and events with significant customer impact, focusing on rapid detection, analysis, and recovery while coordinating with cross-functional teams.
- Innovate and develop automated operational tools and systems to minimize manual interventions and streamline processes.
- Collaborate closely with development, QA, and business teams to align infrastructure and operations with organizational goals and customer needs.
Qualifications
- Bachelor's or higher degree in Computer Science, Information Systems or related fields
- Hands-on experience with at least one of the following programming languages: Bash, Go, or Python
- Good command of the Linux environment, with a deep understanding of the Linux operating system, including the kernel, memory, processes, threads, static/shared libraries, IPC, and signals
- Understanding of standard networking protocols such as HTTP, DNS, SSL, TCP/IP, ICMP
- Experience in large-scale distributed environments and familiarity with distributed systems concepts, including the CAP theorem and microservices
- Experience with container technologies such as Docker and Kubernetes
- Experience with monitoring tools such as Prometheus and Zabbix
- Demonstrated strong sense of ownership, customer service, and integrity
- Passion for eliminating repetitive manual processes using automation
- Fast learner and a good team player
- Fluency in both English and Mandarin to work with international stakeholders and stakeholders based at HQ