Context Trainer (Developer)

Posted:
2/26/2026, 12:09:42 AM

Location(s):
Karnataka, India

Experience Level(s):
Junior ⋅ Mid Level

Field(s):
Software Engineering

By clicking the “Apply” button, I understand that my employment application process with Takeda will commence and that the information I provide in my application will be processed in line with Takeda’s Privacy Notice and Terms of Use. I further attest that all information I submit in my employment application is true to the best of my knowledge.

Job Description

THE OPPORTUNITY 

Curate and govern the context layer (RAG knowledge bases, embeddings, metadata, labeling) to improve answer quality and minimize hallucinations while protecting data and PII.

RESPONSIBILITIES 

Curation & Labeling 

  • Extract and curate content from enterprise sources (Confluence, Jira, SharePoint, ServiceNow, qTest) using APIs and automation. 

  • Define chunking and metadata schemas, labeling guidelines, and golden Q&A and evaluation sets.

  • Implement chunking strategies for diverse content types (code repositories, technical documentation, tickets, test cases). 

  • Implement curation workflows and retention policies. 

Retrieval Quality 

  • Run A/B experiments across vector stores; monitor answer quality against cost and latency; recommend defaults.

  • Analyze failure cases and propose data-driven improvements. 

Data Governance 

  • Enforce data minimization, retention, and access controls; maintain lineage and approvals per RAI (Responsible AI). 

  • Document data sources and usage for audit readiness. 

SKILLS & QUALIFICATIONS 

Required 

  • 3+ years of data/ML experience, including embeddings and retrieval expertise; strong documentation and runbook skills.

  • Experience with content transformation, metadata extraction, and labeling workflows. 

  • Familiarity with privacy and data governance principles. 

  • Hands-on experience with vector stores (OpenSearch/pgvector/Kendra/Chroma) and labeling tools. 

  • Experience with REST APIs and data extraction from enterprise systems. 

  • Python coding proficiency for data pipelines and automation. 

Preferred/Nice to have 

  • Experience designing golden datasets and evaluation pipelines. 

  • AWS Bedrock Knowledge Bases experience. 

  • Familiarity with the software development lifecycle and technical documentation patterns.

Locations

IND - Bengaluru

Worker Type

Employee

Worker Sub-Type

Regular

Time Type

Full time