Principal Engineer – Data Platform

Posted:
12/11/2025, 2:34:02 PM

Location(s):
Karnataka, India ⋅ Bengaluru, Karnataka, India

Experience Level(s):
Expert or higher ⋅ Senior

Field(s):
Software Engineering

About McKesson Compile

Established in 1833, McKesson is a US Fortune 10 global leader in healthcare supply chain management solutions, retail pharmacy, healthcare technology, community oncology, and specialty care. We partner with life sciences companies, manufacturers, providers, pharmacies, governments, and other healthcare organizations to help provide the right medicines, medical products, and healthcare services to the right patients at the right time, safely and cost-effectively.

McKesson Compile is based in Bangalore, India. Its data is a comprehensive, fully linked system of record for the US healthcare market, with intelligence on 2M+ healthcare professionals (HCPs) and over 800K facilities. Compile’s data includes high-capture medical and pharmacy claims and closed-capture Medicare claims (100%), along with best-in-class provider affiliations and customer master.

At McKesson, we deliver careers with purpose and potential. Our focus on better health starts with creating an inclusive environment with strong values where you can build a fulfilling career. You can count on us to provide you with resources and opportunities to grow and be your best, while contributing to our pursuit of improving lives.

About Us

At Compile (a McKesson company), we’re transforming fragmented healthcare data into powerful intelligence that drives real-world impact — from mapping patient journeys to optimizing go-to-market strategies for life sciences.

We’re building a modern, scalable, and secure data platform that powers data products across the organization. As a Principal Engineer, you’ll be the hands-on technical leader driving the design and development of this foundational platform.

If you’re passionate about clean architecture, distributed systems, and solving real-world data challenges — especially in healthcare — this is your opportunity to make a deep impact.

What You’ll Do

  • Architect and lead development of a reusable, scalable data platform framework

  • Design robust ETL/ELT pipelines for structured and semi-structured healthcare data

  • Build APIs and internal tools using Django, focused on performance and maintainability

  • Use Prefect for orchestration, and Ray or Spark for distributed compute

  • Leverage Databricks for testing and validation of data pipelines (not for primary compute)

  • Enforce data quality, observability, and reliability using Metaplane or similar tools

  • Integrate and manage data across Postgres, Snowflake, and Snowflake Shares

  • Optimize for scalability and performance in a cloud-native Azure environment

  • Mentor engineers and collaborate with product, data, and platform teams

Tech Stack

  • Languages & Frameworks: Python (Django, FastAPI), SQL

  • Orchestration & Compute: Prefect, Ray, Apache Spark

  • Data Storage & Transformation: Postgres, Snowflake, Snowflake Shares, dbt

  • Cloud Platform: Azure (Blob Storage, Data Factory, Azure Functions)

  • Testing & CI/CD: Pytest, GitHub Actions, Databricks (for test pipelines)

  • Observability: Metaplane or similar data observability tooling

  • Nice-to-Have: Apache Iceberg, Airbyte, familiarity with GenAI/LLM concepts (e.g., RAG, embeddings, vector stores)

What We’re Looking For

  • 15+ years of experience in data engineering, platform architecture, or backend systems

  • Proven experience designing and building modular data infrastructure

  • Hands-on expertise with ETL frameworks, orchestration tools (Prefect), and distributed compute (Ray/Spark)

  • Strong experience in Django-based API development

  • Deep understanding of data modeling, warehousing, and pipeline reliability

  • Experience with Azure cloud services and managing large datasets across Snowflake and Postgres

  • Familiarity with data observability and monitoring tools (e.g., Metaplane)

  • Nice-to-have: Exposure to GenAI/LLM systems such as vector search or RAG pipelines

  • Experience with healthcare or life sciences data is a strong plus

Work Environment

  • Location: Bangalore (Hybrid – 3 days/week in office)

  • High-ownership, collaborative engineering culture

  • Lean, fast-moving team solving tough technical and domain problems

  • Backed by McKesson, one of the world’s largest healthcare companies