About the role
As a Member of Technical Staff specializing in Applied Research (Extraction and Document Understanding), you will work on state-of-the-art research and apply it to production systems for RAG 2.0 (Contextual Language Models + Fine Tuning + Alignment). Your focus will include advanced document understanding, efficient chunking strategies, and multimodal extraction techniques.
What you'll do
- Conduct research on state-of-the-art retrieval-augmented language models, fine-tuning, preference alignment, and document understanding algorithms.
- Develop and implement advanced chunking strategies for efficient document processing and information retrieval.
- Design and implement multimodal extraction techniques to process and analyze text, images, and other data types within documents.
- Collaborate closely with ML researchers, product managers, and designers to understand requirements and translate them into technical solutions.
- Integrate this research into the product through software development.
- Architect and build scalable and efficient backend services, APIs, and databases to support the platform's functionality and performance requirements, with a focus on document processing and multimodal data handling.
- Ensure seamless integration with machine learning models and pipelines, enabling efficient model deployment and management for document understanding tasks.
- Collaborate with cross-functional teams to continuously improve the platform's functionality, usability, and user experience, particularly in areas related to document processing and information extraction.
- Mentor and provide technical guidance to junior team members, promoting knowledge sharing and professional growth in areas of document understanding and multimodal extraction.
What we're seeking
- Bachelor's degree in Computer Science, Software Engineering, or a related field. Master's or PhD preferred.
- Detailed knowledge of machine learning concepts and frameworks, with experience in document understanding and natural language processing.
- Familiarity with chunking techniques and their application in information retrieval and document understanding.
- Experience with language modeling, document processing libraries, OCR technologies, and multimodal modeling and data handling is a plus.
- Experience with cloud platforms, such as AWS, Azure, or GCP, and familiarity with deploying applications on the cloud, particularly for large-scale document processing.
- Strong proficiency in a programming language such as Python, JavaScript, or Java, and in backend software development.
- Strong problem-solving skills and the ability to work effectively in a fast-paced, collaborative environment.
- Excellent communication and interpersonal skills, with the ability to work closely with cross-functional teams and explain complex document understanding concepts to non-technical stakeholders.
Location: Mountain View, CA
Salary Range for California-Based Applicants: $150,000 - $400,000 + equity + benefits (actual compensation will be determined based on experience, location, and other factors permitted by law).
Equal Employment
Contextual AI provides equal employment opportunities to all qualified individuals without regard to race, color, religion, sex, gender identity, sexual orientation, pregnancy, age, national origin, physical or mental disability, military or veteran status, genetic information, or any other protected classification. Equal employment opportunity includes, but is not limited to, hiring, training, promotion, demotion, transfer, leaves of absence, and termination. Contextual AI takes allegations of discrimination, harassment, and retaliation seriously, and will promptly investigate when such behavior is reported.