Posted:
12/16/2025, 8:32:44 PM
Location(s):
New South Wales, Australia ⋅ Sydney, New South Wales, Australia
Experience Level(s):
Senior
Field(s):
Operations & Logistics
At The Trade Desk, we recognize that a seamless customer experience is driven by operational excellence. To keep improving the reliability of our platform, we have established a global Systems Operations team. This team's core mission is to vigilantly monitor The Trade Desk platform services, refine our incident response methodologies, and guarantee a robust and highly available customer experience. If you're passionate about system reliability, process improvement, and making an essential customer impact, we invite you to play a critical role in this next evolution of our on-call experience.
Act as a technical expert and advisor to more junior Systems Operations Engineers
At an escalated tier, monitor the state and stability of platform services via telemetry and alerts; triage issues and escalate to engineering teams as needed
Work collaboratively with development teams to facilitate issue remediation
Manage remediation task workflow
Proactively update and improve Systems Operations documentation and runbooks
Increase the effectiveness of the incident response process by defining and measuring relevant metrics
There may be periodic weekend coverage requirements
Undergraduate degree or relevant substitute experience
6+ years of relevant work experience in Technical and/or Application Support, with strong knowledge of services support and troubleshooting
The Systems Operations Engineer will either possess or be excited to learn a number of skills...
Technical Proficiency:
Understanding of large-scale distributed system architectures (e.g., databases, web services, application services).
Familiarity with monitoring tools (e.g., Prometheus, Grafana, Nagios).
Ability to configure and fine-tune alerts.
Proficiency in, or ability to learn, programming languages including SQL, Python, and C# to automate repetitive tasks.
Incident Management and Troubleshooting:
Ability to prioritize and manage incidents based on severity, with a focus on customer impact.
Ability to remain calm under pressure and quickly diagnose issues.
Understanding of system logs, metrics, and telemetry.
Communication Skills:
Ability to communicate effectively with stakeholders during an incident.
Clear and concise documentation skills.
Ability to maintain and update troubleshooting guides (TSGs) and operational documentation.
Ability to explain complex technical issues and platform outages to non-technical stakeholders.
As an Equal Opportunity Employer, The Trade Desk is committed to creating an inclusive hiring experience where everyone has the opportunity to thrive.
Please reach out to us at accommodations@thetradedesk.com to request an accommodation or to discuss any accessibility needs you may have in accessing our company website or navigating any part of the hiring process.
When you contact us, please include your preferred contact details and specify the nature of your accommodation request or questions. Any information you share will be handled confidentially and will not impact our hiring decisions.
Website: http://thetradedesk.com/
Headquarters Location: Ventura, California, United States
Employee Count: 501-1000
Year Founded: 2009
IPO Status: Public
Last Funding Type: Post-IPO Equity
Industries: Advertising ⋅ Digital Media ⋅ Information Technology ⋅ Internet ⋅ Mobile ⋅ Native Advertising ⋅ Social ⋅ Software ⋅ Video Advertising