Senior Research Computing Cloud SRE

Posted:
5/6/2024, 1:21:43 AM

Location(s):
New York, New York, United States

Experience Level(s):
Senior

Field(s):
Software Engineering

Pay:
$336/hr or $698,880 total comp

The Research Computing HPC team is a group of experts solving computing problems in the critical path of Research. We work directly with Research and Model Implementation teams and provide them with tools and computing resources to take their ideas from inception to real tradable products. We are looking for an ambitious and operationally minded software engineer to join our team as we mature and scale our cloud HPC platform into the next iteration of our firm-wide Research platform.

Why join us? 
PDT Partners has a stellar 30+ year track record and a reputation for excellence. Our goal is to be the best quantitative investment manager in the world, measured by the quality of our products, not their size. PDT’s very high employee retention and mobility speak for themselves. Our people are intellectually extraordinary, and our community is close-knit, down-to-earth, and diverse. Our engineers love to work on challenging and complicated problems, and in return, they have a chance to make a direct impact on our bottom line, without the attitude and bureaucracy of a typical Wall Street firm.
 
Responsibilities: 

We are a small, flat team sitting at the intersection of research, implementation, and platform infrastructure. Our team responsibilities span many areas. Below is a sampling of the types of work you will be expected to do:

  • Design and implementation of cloud-based HPC systems:
    • Our projects require equal parts engineering and operations to succeed in our fast-moving environment. You will be expected to conceive and implement projects small and large.
  • Running our HPC plant day-to-day:
    • Our research environment is up 24/7, and we want to keep it that way. Everybody on the team contributes to the support of our platform, which thankfully is light because of our automation and quality work.
  • Implementing automation:
    • We will always choose working smart over working hard. You will be responsible for the conception and implementation of automation, from CI/CD pipelines to production metrics and monitoring of our cloud HPC platform.
  • Capacity management and benchmark optimization:
    • Our demand for compute is constant and involves challenging problems focused on scaling our compute, optimizing workloads, and choosing the right type of accelerators to target.
  • Obsessive User Focus:
    • All members of the team are expected to partner with researchers and engineers to deliver high-quality cloud HPC systems that are efficient and reliable. This includes leading projects to evolve those systems as our needs change.
  • Design, implement, and deliver scalable and performant systems:
    • Projects typically require equal parts engineering and operations to succeed in our fast-moving environment. You will be expected to do both for projects small and large, working with a mix of open-source and proprietary tools.
  • Implementing automation:
    • We will always choose working smart over working hard. You will be responsible for the conception and implementation of new automation, from CI/CD pipelines to production metrics to other automation for the platform infrastructure that your team owns.
  • Obsessive User Focus:
    • All members of platform teams collaborate closely with peer engineers and/or researchers to build high-quality, efficient, and reliable systems. This includes adapting to change, and at times diving into new domains to deeply understand stakeholder needs.
  • Capacity management and benchmark optimization:
    • Our demand for scale and performance is constant and involves challenging optimization problems for workloads critical to research and trading.
  • Running our platform systems day-to-day:
    • Our platforms are mission-critical for the firm’s success and are very stable, and we want to keep them that way. Everybody on the team contributes to the support of our platforms, which we strive to make light through automation and quality work.

Below is a list of skills and experiences we think are relevant. Even if you don’t think you’re a perfect match, we still encourage you to apply because we are committed to developing our people.

  • Experience with systems programming and/or software engineering
  • Practical experience supporting, debugging, and improving production systems and services
  • Experience using Linux and other Open Source Software
  • Experience with configuration management and infrastructure-as-code frameworks
  • Production experience working with a public cloud, AWS preferred
  • Qualified candidates will have at least one area of specialty platform knowledge: HPC, Trading, CI/CD, Kubernetes, Linux, Cloud Infrastructure, or Networking 

Education:  
Bachelor’s or Master’s degree in an Engineering or Applied Sciences field from a rigorous academic program, or equivalent professional experience.

The salary range for this role is between $195,000 and $225,000. This range is not inclusive of any potential bonus amounts. Factors that may impact the agreed-upon salary within the range for a particular candidate include years of experience, level of education obtained, skill set, and other external factors.

PRIVACY STATEMENT: For information on ways PDT may collect, use, and process your personal information, please see PDT’s privacy notices.