Senior Software Engineer, ReOps

Posted:
9/25/2024, 7:12:09 AM

Location(s):
Seattle, Washington, United States ⋅ Washington, United States

Experience Level(s):
Senior

Field(s):
AI & Machine Learning ⋅ Software Engineering

Workplace Type:
On-site

Hybrid: Individuals in this role are expected to live in the Greater Seattle area and are encouraged to spend 1-3 days per week on-site in our Seattle offices. 
Compensation Range: $140,230 - $213,150

Who You Are:

You’re a talented, self-directed software engineer and operator who thrives in a highly collaborative, fast-paced environment. You’re well versed in developing and operating distributed systems and comfortable working in the lowest levels of the stack to deploy and debug software on Linux servers. You’re an excellent technical communicator who can translate ambiguous requirements into crisp, pragmatic designs. You’re comfortable wearing many hats and are a natural technical leader and mentor who enjoys frequent collaboration.

Who We Are: 

The Research Ops (ReOps) team maintains the software and servers that allow research teams at Ai2 to execute machine learning workloads on high-performance, on-premise GPU clusters. Most of our time is spent contributing to Beaker, a user-friendly and GPU-first job orchestration system that was authored at the institute. We’re also responsible for configuring Ai2’s on-premise servers and maintaining a suite of internal tools, systems and services that teams across the organization depend on.

Your Next Challenge:

You’ll be responsible for developing and operating core infrastructure and software that’s used across Ai2 to train, evaluate and serve state-of-the-art machine learning models. Most of your time will be spent writing code, but you’ll also spend a fair amount of time taking care of operational tasks. As part of this, you’ll be expected to join our regular on-call cycle, which involves supporting end users and diagnosing and resolving system issues.

As a senior individual contributor, you’ll be expected to quickly implement high-quality software, and lead by example in doing so. You’ll need to work both autonomously and collaboratively to deliver features – and play an important part in maintaining a healthy, high-performing engineering culture. You’ll be responsible for designing key systems and driving larger, complicated projects with broad impact.

The essential functions include, but are not limited to, the following:

  • Making improvements to Beaker, a distributed system written in Go
  • Debugging problems with machine learning workloads and/or the underlying infrastructure (servers, network, storage, etc.)
  • Preparing technical specifications and design documents for new functionality
  • Conducting design reviews and delivering features, end-to-end
  • Operating and configuring on-premise Linux servers
  • Mentoring engineers on the team and providing technical guidance
  • Collaborating with team members to review code, discuss designs and pair on issues
  • Driving technical vision for new infrastructure and software changes with broad impact
  • Designing critical systems and architectures with team-wide, long-term impact, or exhibiting clear leadership in identifying and applying impactful research
  • Providing valuable input into annual and quarterly planning
  • Contributing to a healthy, high-performance engineering culture

What You’ll Need:

  • 6+ years developing highly available software in a professional setting
  • Proficiency in Golang, Python, SQL, shell scripting and Linux server administration
  • A strong understanding of running containerized workloads (Docker)
  • Experience debugging live systems
  • Familiarity with cloud infrastructure (GCP, AWS)
  • Excellent communication and collaboration skills

Bonus Qualifications:

  • Experience operating GPU clusters or developing distributed ML workloads
  • Deep systems administration expertise
  • Familiarity with Kubernetes
  • Experience hosting models for inference

Physical Demands and Work Environment:

The physical demands described here are representative of those that must be met by a team member to successfully perform the essential functions of this position. Reasonable accommodations may be made to enable individuals with disabilities to perform the functions.

  • Must be able to remain in a stationary position for long periods of time. 
  • The ability to communicate information and ideas so others will understand. Must be able to exchange accurate information in these situations. 
  • The ability to observe details at close range.
  • Can work under deadlines.

A Little More About Ai2:

Ai2 is a Seattle-based non-profit AI research institute founded in 2014 by the late Paul Allen. Our mission is building breakthrough AI to solve the world’s biggest problems. We develop foundational AI research and innovation to deliver real-world impact through large-scale open models, data, robotics, conservation, and beyond.

In addition to Ai2’s core mission, we also aim to contribute to humanity through our treatment of each member of the Ai2 Team. Some highlights are:

  • We are a learning organization – because everything Ai2 does is ground-breaking, we are learning every day. Similarly, through weekly Ai2 Academy lectures, a wide variety of world-class AI experts as guest speakers, and our commitment to your personal ongoing education, Ai2 is a place where you will have opportunities to continue learning alongside your coworkers.
  • We value diversity - We seek to hire, support, and promote people from all genders, ethnicities, and all levels of experience regardless of age. We particularly encourage applications from women, non-binary individuals, people of color, members of the LGBTQA+ community, and people with disabilities of any kind. 
  • We value inclusion - We understand the value that people's individual experiences and perspectives can bring to an organization, and we are building a culture in which all voices are heard, respected and considered.
  • We emphasize a healthy work/life balance – we believe our team members are happiest and most productive when their work/life balance is optimized. While we value powerful research results which drive our mission forward, we also value dinner with family, weekend time, and vacation time. We offer generous paid vacation and sick leave as well as family leave.
  • We are collaborative and transparent – we consider ourselves a team, all moving with a common purpose. We are quick to cheer our successes, and even quicker to share and jointly problem solve our failures.
  • We are in Seattle – and our office is on the water! We have mountains, we have lakes, we have four seasons, we bike to work, we have a vibrant theater scene, and we have so much else. We even have kayaks for you to paddle right outside our front door. We welcome interest from applicants from outside of the United States.
  • We are friendly – chances are you will like every one of the 200+ (and growing) people who work here. We do.

Ai2 is proud to be an Equal Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. You may view the related Know Your Rights compliance poster and the Pay Transparency Nondiscrimination Provision by clicking on their corresponding links. 

This employer participates in E-Verify and will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S. If E-Verify cannot confirm that you are authorized to work, this employer is required to give you written instructions and an opportunity to contact the Department of Homeland Security (DHS) or Social Security Administration (SSA) so you can begin to resolve the issue before the employer can take any action against you, including terminating your employment. Employers can only use E-Verify once you have accepted a job offer and completed the Form I-9.

We are committed to providing reasonable accommodations to employees and applicants with disabilities to the full extent required by the Americans with Disabilities Act (ADA). If you feel you need a reasonable accommodation pursuant to the ADA, you are encouraged to contact us at [email protected].

Benefits: 

  • Team members and their families are covered by medical, dental, vision, basic life insurance, basic accidental death and dismemberment insurance, short-term disability, long-term disability, and an employee assistance program. 
  • Team members are able to enroll in our voluntary life insurance program, our voluntary accidental death and dismemberment program, our health savings account plan, our healthcare reimbursement arrangement plan, and our health care and dependent care flexible spending account plans. 
  • Team members are able to enroll in our company’s 401k plan. 
  • Team members will receive $125 per month to assist with commuting or internet expenses and will also receive $200 per month for fitness and wellbeing expenses. 
  • Team members will also receive up to ten sick days per year, up to seven personal days per year, up to 20 vacation days per year and twelve paid holidays throughout the calendar year.
  • Team members will be able to receive annual bonuses and can participate in the long-term incentive plan.

Note: This job description in no way states or implies that these are the only duties to be performed by the team member(s) of this position. Team members will be required to follow any other job-related instructions and to perform any other job-related duties requested by any person authorized to give instructions or assignments. All duties and responsibilities are essential functions and requirements and are subject to possible modification to reasonably accommodate individuals with disabilities. To perform this job successfully, the team member(s) will possess the skills, aptitudes, and abilities to perform each duty proficiently. Some requirements may exclude individuals who pose a direct threat or significant risk to the health or safety of themselves or others. The requirements listed in this document are the minimum levels of knowledge, skills, or abilities. This document does not create an employment contract, implied or otherwise, other than an at-will relationship.

The Allen Institute for AI

Website: https://allenai.org/

Headquarters Location: Seattle, Washington, United States

Employee Count: 101-250

Year Founded: 2014

IPO Status: Private

Industries: Artificial Intelligence (AI) ⋅ Finance ⋅ Financial Services ⋅ Incubators ⋅ Venture Capital