Project Role : Data Engineer
Project Role Description : Design, develop and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform and load) processes to migrate and deploy data across systems.
Must have skills : Databricks Unified Data Analytics Platform
Good to have skills : NA
Minimum 7.5 year(s) of experience is required
Educational Qualification : 15 years full time education
Summary:-
The ideal candidate will have experience building:
Reusable Python/PySpark frameworks for standardizing data engineering workflows
Test frameworks to ensure pipeline reliability and correctness
Data quality frameworks for monitoring and validation
Additionally, hands-on experience with Datadog or similar observability tools is required to monitor pipeline performance, optimize resource usage, and ensure system reliability.
You will work within a cross-functional team, building scalable, production-grade data pipelines on cloud platforms such as AWS, Azure, or GCP.
Roles & Responsibilities:-
Data Engineering & Framework Development
Develop and maintain ETL/ELT pipelines in Databricks using PySpark and Python.
Build reusable, modular frameworks to accelerate development and enforce standards across pipelines.
Implement test frameworks for automated unit, integration, and regression testing of pipelines.
Design and maintain data quality frameworks to validate ingestion, transformation, and output.
Optimize Spark jobs for performance, scalability, and cost-efficiency.
Collaborate with data architects to define robust data models and design patterns.
Cloud & Platform Integration
Integrate Databricks pipelines with cloud storage and data platforms (e.g., S3, ADLS, Snowflake).
Implement CI/CD pipelines for Databricks notebooks and jobs using Git, Jenkins, or Azure DevOps.
Ensure pipelines follow best practices for modularity, reusability, and maintainability.
Monitoring, Observability & Optimization
Use Datadog to monitor pipeline performance, resource utilization, and system health.
Build dashboards and alerts for proactive monitoring and troubleshooting.
Analyze metrics and logs to identify bottlenecks and improve reliability.
Collaboration & Delivery
Partner with data scientists, analysts, and business stakeholders to translate requirements into scalable solutions.
Conduct code reviews, enforce best practices, and mentor junior engineers.
Promote knowledge-sharing of reusable frameworks, testing practices, and data quality approaches.
Professional & Technical Skills:-
Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
5–8 years of experience in data engineering or software development.
3+ years hands-on experience with Databricks and PySpark.
Strong Python programming skills, including writing reusable libraries and frameworks.
Experience designing and implementing test frameworks for ETL/ELT pipelines.
Experience building data quality frameworks for validation, monitoring, and anomaly detection.
Proficiency in SQL and experience with cloud data warehouses (Snowflake, Redshift, BigQuery).
Familiarity with Datadog or similar monitoring tools for metrics, dashboards, and alerts.
Experience integrating Databricks with AWS, Azure, or GCP services.
Working knowledge of CI/CD, Git, Docker/Kubernetes, and automated testing.
Strong understanding of data architecture patterns; medallion/lakehouse architectures preferred.
Nice to Have
Experience with Airflow, Prefect, or Azure Data Factory for orchestration.
Exposure to infrastructure-as-code tools (Terraform, CloudFormation).
Familiarity with MLflow, Delta Live Tables, or Unity Catalog.
Experience designing frameworks for logging, error handling, or observability.
Knowledge of data security, access control, and compliance standards.
Soft Skills
Strong problem-solving and analytical skills.
Excellent verbal and written communication.
Ability to work in agile, cross-functional teams.
Ownership mindset, proactive, and self-driven.
Additional Information:-
- The candidate should have a minimum of 5 years of experience in Large Language Models.
- This position is based at our Bengaluru office.
- 15 years of full-time education is required.
About Accenture
Accenture is a leading global professional services company that helps the world’s leading businesses, governments and other organizations build their digital core, optimize their operations, accelerate revenue growth and enhance citizen services—creating tangible value at speed and scale. We are a talent- and innovation-led company with approximately 791,000 people serving clients in more than 120 countries. Technology is at the core of change today, and we are one of the world’s leaders in helping drive that change, with strong ecosystem relationships. We combine our strength in technology and leadership in cloud, data and AI with unmatched industry experience, functional expertise and global delivery capability. Our broad range of services, solutions and assets across Strategy & Consulting, Technology, Operations, Industry X and Song, together with our culture of shared success and commitment to creating 360° value, enable us to help our clients reinvent and build trusted, lasting relationships. We measure our success by the 360° value we create for our clients, each other, our shareholders, partners and communities.
Visit us at www.accenture.com
Equal Employment Opportunity Statement
We believe that no one should be discriminated against because of their differences. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, military veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by applicable law. Our rich diversity makes us more innovative, more competitive, and more creative, which helps us better serve our clients and our communities.