Backend Engineer - Supernova

Posted:
7/25/2024, 9:07:53 AM

Location(s):
New York, New York, United States

Experience Level(s):
Senior

Field(s):
Software Engineering

Workplace Type:
Remote

About the role

Our Supernova team looks after all of Hex’s compute backend, and is responsible for scaling our architecture to support fast data analysis at any data scale.

This is a hugely challenging problem. Our users do everything from analyzing 100-line CSV files to querying terabyte- and petabyte-scale Snowflake tables, and in all cases they expect to do so without worrying about how it's done under the hood.

Our job is to define and build the how.

A few of the things we work on:

  • Distributing compute to underlying warehouses where appropriate
  • Supporting Hex’s polyglot functionality (interoperation between Python and SQL) while keeping execution as fast as possible
  • Operating on datasets that are too large to pull into memory
  • Expanding our features to work correctly at enterprise scale, including scaling infrastructure (with help from our Infra team) and evolving our architecture to support things like permissions on datasets and metadata

What you will do:

The following are a few examples of the kind of work you can expect to do in this role:

  • Evolve our compute architecture, including building entirely new compute components that support our goal of ultra-fast execution by routing each query or operation to the most appropriate executor
  • Refine our caching strategies to minimize round trips to data warehouses
  • Propose designs for extensions, improvements, or entirely new parts of our backend, then implement them
  • Collaborate with other teams to support new features — for example, building and exposing an internal query language that compiles to the appropriate dialect for any given connected warehouse

About You

  • 6+ years of software engineering experience
  • Excited about making a product as fast as it can be. Hex’s version of this is complex — our users can execute their own code against their own data!
  • A strong eye for detail, and a desire to deeply understand what’s going on under the hood. We’ve found significant performance gains in the past by poring over execution logs and thoroughly understanding the internals of things like the Jupyter kernel and specific data warehouse connection drivers
  • Experience working on a remote team, and the ability to operate and communicate effectively in that context
  • A desire to collaborate through technical RFCs and through writing and reviewing code
  • An instinct for strategic thinking: aligning with business and product goals while balancing velocity and engineering excellence
  • Interest in the data space, and a love of shipping great products and building tools that empower end users to do more
  • An ability to lead initiatives while collaborating with and mentoring fellow engineers
  • A keen interest in the latest compute, caching, and related technological advances in our field. Some things we’ve been interested in lately include DuckDB, SQLGlot, Calcite, and Polars

Our stack

Our product is a web-based notebook and app authoring platform. Our frontend is built with TypeScript and React, using a combination of Apollo GraphQL and Redux for managing application state and data. On the backend, we also use TypeScript to power an Express/Apollo GraphQL server that interacts with Postgres, Redis, and Kubernetes to manage our database and Python kernels. Our backend is tightly integrated with our infrastructure and CI/CD, where we use a combination of Terraform, Helm, and AWS to deploy and maintain our stack.