Engineer, Fleet Monitoring & Analysis

Posted:
10/3/2024, 7:42:56 AM

Location(s):
New Jersey, United States ⋅ New York, New York, United States ⋅ Seattle, Washington, United States ⋅ New York, United States ⋅ Roseland, New Jersey, United States ⋅ Washington, United States ⋅ Sunnyvale, California, United States ⋅ California, United States

Experience Level(s):
Junior

Field(s):
AI & Machine Learning

Workplace Type:
Hybrid

CoreWeave is a specialized cloud provider, delivering a massive scale of GPU compute resources on top of the industry’s fastest and most flexible infrastructure. CoreWeave builds cloud solutions for compute-intensive use cases (VFX and rendering, machine learning and AI, batch processing, and Pixel Streaming) that are up to 35 times faster and 80% less expensive than the large, generalized public clouds. Learn more at www.coreweave.com.

About the role:

The Fleet Monitoring & Analysis Team contributes to the automated provisioning and management of CoreWeave’s ever-expanding fleet of hardware nodes and node types by continually improving node and environmental monitoring and observability. Playing a central role in CoreWeave’s growth strategy, this team is a critical piece of our cohesive, zero-touch, and high-reliability fleet management engine.

We seek an Engineer to join the Fleet Monitoring & Analysis team to help us build, run, and refine our metrics, alerts, visualizations, and data-driven insights. This individual will join a team of mixed-skill engineers focused on elevating the art of managing high-performance hardware at scale. As a team member, you would have the opportunity to:

  • Design and implement solutions to large-scale server observability to continually improve the stability of CoreWeave’s global hardware fleet.
  • Adapt, extend, and implement open-source solutions to augment the depth and breadth of our visibility into our operating environment.
  • Generate and maintain custom reports, alarms, and visualizations to help teams understand and respond to our growth and changes.
  • Create test plans, deployment automation, dashboards, alerts, and insights into our fleet operations, as well as participate in the Fleet Engineering Developers’ on-call rotation.
  • Grow, change, invest in your teammates, be invested in, share your ideas, listen to others, be curious, have fun, and, above all, be yourself.

Wondering if you’re a good fit? We believe in investing in our people, and we value candidates who can bring their own diverse experiences to our teams, even if you aren’t a 100% skill or experience match. Here are some qualities we’ve found compatible with our team. If a portion of this resonates with you, we’d love to talk.

  • You have 2 or more years of experience in the software or infrastructure engineering industry.
  • You have experience in the domains of automation and orchestration workflows and are knowledgeable about server hardware, components, and related technologies and strategies for the management of physical infrastructure at scale.
  • You have experience implementing metrics collection and alerting on standard platforms.
  • You believe in the value of automation and will champion practices that drive reliability and prioritize the CoreWeave customer experience.

 

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $160,000-$185,000. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience.

Hybrid Workplace

Successful candidates will be expected to attend onboarding training at our NJ Headquarters for up to 2 weeks within their first month of employment, with subsequent quarterly travel of 1 week each.

If you reside within a 30-mile radius of our New Jersey, New York, Philadelphia, Sunnyvale, or Bellevue offices, we're excited for you to join us at the office at least three times a week, in keeping with the significance we place on fostering connections, collaboration, and creativity within our office culture. Our commitment to operating as a hybrid workplace underscores our dedication to enabling our employees to tailor their work-life balance to their individual preferences.

Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast!  We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values: 

  • Be Curious at your Core
  • Act like an Owner
  • Empower Employees
  • Deliver Best In-Class Client Experience 
  • Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for takeoff, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

Benefits

We offer a competitive salary and benefits, including:

  • Medical, dental and vision insurance - 100% paid for the employee
  • Company paid Life Insurance 
  • Voluntary supplemental life insurance 
  • Short and long-term disability insurance 
  • Flexible Spending Account
  • Tuition Reimbursement 
  • Mental Wellness Benefits through Spring Health 
  • Family-Forming support provided by Carrot
  • Paid Parental Leave 
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our offices
  • A casual work environment
  • Work culture focused on innovative disruption

California Consumer Privacy Act - California applicants only

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

As part of this commitment and consistent with the Americans with Disabilities Act (ADA), CoreWeave will ensure that qualified applicants and candidates with disabilities are provided reasonable accommodations for the hiring process, unless such accommodation would cause an undue hardship. If reasonable accommodation is needed, please contact: [email protected]

 

CoreWeave

Website: https://coreweave.com/

Headquarters Location: Roseland, New Jersey, United States

Employee Count: 251-500

Year Founded: 2017

Last Funding Type: Secondary Market

Industries: Cloud Computing ⋅ Cloud Infrastructure ⋅ Information Technology ⋅ Machine Learning