Important Information
Location: India
Experience: 8+ years
Job Mode: Full-time
Job Summary
The Data Engineer will play a vital role in ensuring the integrity, performance, and scalability of data pipelines and platforms in the data modernization project. This role focuses on building testing frameworks, automating data validation processes, and optimizing system performance using tools like Apache Spark. The engineer will collaborate closely with the QA Lead, DBA/DB Migration Specialist, and ETL Specialist to maintain high-quality data workflows while contributing to the scalability of the data platform.
Responsibilities and Duties
- Testing Framework Development & Data Validation:
  - Design and implement automated testing frameworks for validating data pipelines and transformations.
  - Integrate tools like Great Expectations to enforce data quality standards and ensure consistency across systems.
  - Conduct performance and validation tests for migrated data to identify discrepancies and maintain integrity.
- Performance Optimization & Scalability:
  - Use Apache Spark to optimize data transformation workflows for scalability and efficiency.
  - Monitor and enhance pipeline performance to meet system throughput and latency benchmarks.
  - Collaborate with the Data Architect to implement strategies for scaling the data platform to handle future growth.
- Collaboration & Alignment:
  - Work closely with the QA Lead to align testing frameworks with project quality assurance processes.
  - Collaborate with the DBA/DB Migration Specialist to validate database performance and identify opportunities for optimization.
  - Support the ETL Specialist by providing insights on performance bottlenecks and scalability improvements.
- Monitoring & Issue Resolution:
  - Develop and implement monitoring solutions to proactively identify and resolve data processing issues.
  - Support ongoing system performance analysis and make recommendations for resource optimization.
- Documentation & Knowledge Sharing:
  - Create comprehensive documentation for testing frameworks, validation procedures, and scalability improvements.
  - Share best practices and lessons learned with the team to enhance overall project execution.
Qualifications and Skills
- 8+ years of experience in data engineering with a focus on testing frameworks, performance optimization, and scalability.
- Hands-on expertise with Apache Spark for scalable data processing and transformation.
- Experience with data validation tools like Great Expectations and data monitoring frameworks.
- Strong understanding of data architecture principles, particularly within Snowflake and PostgreSQL environments.
- Proven track record of implementing scalable solutions in cloud-native environments, particularly AWS.
- Exceptional attention to detail and commitment to maintaining data integrity and quality.
- Strong collaboration and communication skills to align with cross-functional teams.
- Ability to work independently and manage multiple priorities in a fast-paced environment.
Additional Requirements
- Familiarity with ETL tools like AWS Glue and dbt for data workflows.
- Experience working with orchestration tools like Apache Airflow.
- Knowledge of advanced performance testing techniques for distributed data platforms.
- Strong analytical and problem-solving skills for identifying and resolving data quality issues.
About Encora
Encora is a global company that offers Software and Digital Engineering solutions. Our practices include Cloud Services, Product Engineering & Application Modernization, Data & Analytics, Digital Experience & Design Services, DevSecOps, Cybersecurity, Quality Engineering, AI & LLM Engineering, among others.
At Encora, we hire professionals based solely on their skills and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.