About Karius
Karius is a venture-backed life science startup that is transforming the way pathogens and other microbes are observed throughout the body. By unlocking the information present in microbial cell-free DNA, we're helping doctors quickly solve their most challenging cases, providing industry partners with access to thousands of biomarkers to accelerate clinical trials, discovering new microbes, and reducing patient suffering worldwide.
Position Summary
We are seeking a seasoned Staff Data Engineer to drive data platform initiatives across the data value chain at Karius. We develop and operate AI-driven data analytics pipelines to deliver life-saving results in the highly complex infectious disease landscape. In this role, you will develop and optimize the data platform so that our users can extract insights from large amounts of commercial, operational, genomic, and clinical data, ultimately providing actionable insights that serve the business and patients. You will be able to incorporate the next generation of AI technologies and tools (e.g. Generative AI) into the Karius data platform to significantly increase the value delivered to internal stakeholders and our customers.
Why Should You Join Us?
Karius aims to conquer infectious diseases through innovations around genomic sequencing and machine learning. The company’s platform is already delivering unprecedented insights into the microbial landscape, providing clinicians with a comprehensive test capable of identifying more than a thousand pathogens directly from blood, and helping industry accelerate the development of therapeutic solutions. The Karius test we provide today is one of the most advanced solutions available to physicians who aim to deliver better care to many otherwise ineffectively treated patients. Our test is the result of some incredible work done by our scientists, statisticians, engineers, and physicians, all driven by the same mission. You, as part of the Karius team, will see directly how your work has a life-changing impact on people, at scale.
Reports to: Sr. Manager, Analytical Systems & Data Insights
*This position is not open to agency support at this time. Agencies, please contact [email protected] and not the hiring manager for inquiries.
Location: Redwood City, CA (Hybrid)
Primary Responsibilities
• Data Platform Development: Design, develop, test, deploy, and maintain production-grade platforms and tooling that add value across the data lifecycle (ingest, transform, serve) for various use cases such as reporting, data analytics, machine learning, and bioinformatics. Examples of projects you will own and drive:
- Data Architecture: Ensure centralized, standardized, and secure data access across business domains.
- Data Pipelines and Reporting: Combine internal (including PHI) and external data sources via ETL/ELT tooling to calculate operational and commercial KPIs for data analysis and reporting use-cases, catalyzing insight generation.
- ML/AI Data Platform Capabilities: Provide support for computational teams to drive value from clinical and genomic data assets through the ML lifecycle and data science tooling.
- Generative AI Integrations: Implement generative AI solutions to increase the value and usability of Karius data assets across use-cases such as data pipelining, data discovery, knowledgebase search, and conversational analytics.
• Project Management: Proactively interface with cross-functional technical and non-technical stakeholders to identify unmet needs, ensure alignment with data initiatives (scope, timelines, deliverables), and communicate results and outcomes.
• Collaboration: Coordinate with the engineering and IT domains to understand, interface with, and extend production data and software systems using engineering best practices.
• Data Governance: Inform, implement, and follow data governance best practices and policies in conjunction with the Security and Compliance team to meet regulatory and legal requirements.
• Continuous Improvement: Foster a growth mindset and self-starter attitude, continually seeking opportunities for process and system improvements with a focus on quality, practicality, and delivered value.
*Note that job duties and responsibilities may evolve based on company needs and technological advancements.
What’s Fun About the Job?
Karius is operating at the edge of what is now known to be possible in infectious disease diagnostics. With that comes a wave of new and incredible challenges and opportunities. To deliver on that value, you will be tapping into some of the most advanced technologies, architecting and innovating where the current solutions simply don't suffice. You will get to see how much your work really matters.
Travel: No travel required
Physical Requirements
Subject to extended periods of sitting and/or standing, vision to monitor, and moderate noise levels. Work is generally performed in an office environment.
Position Requirements
We are seeking a data engineer with exceptional system thinking. Critical to this role is the ability to grasp business needs, identify the complexity and interconnections of data elements, and determine the desired insights to extract. The ideal candidate will excel in translating business requirements into a technical roadmap and developing remarkable solutions to satisfy those needs.
Educational Background
• B.S. degree in Computer Science, Software Engineering, Electrical Engineering, Bioengineering, or related technical fields involving algorithms or coding (e.g., Physics or Mathematics).
Professional Experience
• 10+ years of data engineering / software development experience with at least 5 years of relevant experience in building enterprise-scale data platforms.
Technical Skills
• Data Platforms and Cloud Services: Hands-on experience with data platforms (e.g. Databricks - strongly preferred, Snowflake) and cloud services (e.g. AWS - strongly preferred, GCP, Azure).
• Data Integration and Pipelines:
- ETL/ELT Tooling: Experience with ETL/ELT tools (e.g. Fivetran, Stitch, Airbyte) for integrating internal and third-party data sources.
- Batch and Stream Processing: Experience in building scalable infrastructure for batch processing (e.g. Spark, Hadoop) and stream processing (e.g. Kafka, Kinesis) for large volumes of data.
• Developer Toolset: Proficiency in programming languages for data engineering (i.e. Python and SQL) applied in conjunction with SDLC principles and developer practices (e.g. code/data version control, containerization, CI/CD, IaC, automated testing, monitoring/alerting).
• Data Modeling and Architecture: Strong conceptual understanding of data modeling and practical experience with enterprise data models and data architecture components (e.g. databases, warehouse, lake, lakehouse, catalog).
• Reporting and Visualization: Experience with reporting and dashboard tools (e.g. Looker, Streamlit, Tableau, PowerBI, Hex, Dash).
• ML Tooling: Familiarity with data science tooling such as notebooks, standard data processing/visualization libraries (e.g. pyspark, pandas, numpy, scipy, plotly, seaborn, matplotlib, altair), and ML tooling (e.g. MLflow, SageMaker).
• Generative AI: Working knowledge of generative AI concepts and hands-on experience with frameworks and tooling (e.g. LangChain, LlamaIndex, OpenAPI, RAG, vector databases, agents, Bedrock).
• Data Governance and Compliance: Demonstrated experience in implementing and maintaining data governance and compliance frameworks, including handling Protected Health Information (PHI) and adhering to regulatory standards.
Non-Technical Skills
• Ability to work in a fast-paced, dynamic startup environment.
• Ability to balance quality and speed when building engineering systems.
• Strong organizational and time management abilities.
• Excellent communication and collaboration skills.
• Attention to detail and commitment to delivering high-quality solutions.
Nice-to-haves
• Experience working in the healthcare or life sciences industries, especially within a diagnostics setting.
• Experience deploying generative AI products into production environments.
• Familiarity with genomics datasets and bioinformatics pipelines.
Personal Qualifications
We want to add a humble, curious, and collaborative member to our team. On the Karius engineering team, we highly value deep domain expertise, a drive for innovation, a desire to collaborate, openness to learning and unlearning, and a passion for solving hard problems with a meaningful impact on the world. A sense of ownership and personal/group accountability allows us to be a productive and high-performing team. If you share our vision, we would like to have you on board.
At Karius, we value a diverse and inclusive workplace and provide equal employment opportunities for all applicants and employees. We are committed to honoring and investing in the full diversity of people in our hiring, recruiting, and development of employees across the Company. All qualified applicants for employment are encouraged to apply and will be considered without regard to an individual’s race, color, sex, gender identity and gender expression (including transgender individuals who are transitioning, have transitioned, or are perceived to be transitioning to the gender with which they identify), religion, age, national origin or ancestry, citizenship, physical or mental disability, medical condition, family care status, marital status, domestic partner status, sexual orientation, genetic information, military or veteran status, or any other basis protected by federal, state, or local laws. If you are unable to submit your application due to a disability, please contact us at [email protected] and we will accommodate qualified individuals with disabilities.