Principal Data Engineer

Posted:
10/21/2024, 2:34:46 AM

Location(s):
Dallas, Texas, United States ⋅ Texas, United States

Experience Level(s):
Expert or higher ⋅ Senior

Field(s):
Data & Analytics

Workplace Type:
Hybrid

About the job

The world’s most critical, and most at-risk, business applications have been neglected for far too long. Onapsis eliminates this blind spot by providing cybersecurity solutions dedicated to business-critical applications. Whether running on premises, in the cloud, or in a hybrid environment, Onapsis helps nearly 30% of the Forbes Global 100 understand the threats and risks across their SAP and Oracle landscapes.

We are seeking a Senior Data Engineer to join our mission-driven team. This role is ideal for experienced data engineers with a proven track record in architecting scalable data pipelines, leveraging cloud technologies, and contributing to high-impact cybersecurity solutions. You will be responsible for building high-performance ETL frameworks, optimizing data platforms, and contributing directly to the enhancement of our customers' threat detection, response, and remediation capabilities.

 

What you will be doing (your legacy):

You will work directly with the company's Principal Engineers to evaluate, scope, propose, and build features that fulfill business solution requirements and protect our customers. You will help set the foundation of a new product. You will also work with Engineering and DevOps to deliver high-quality products and services, and closely with security and IT professionals to ensure that secure best practices are followed.

Responsibilities:

  • Architect and Design Scalable Data Solutions: Design, develop, and maintain highly scalable ETL/ELT pipelines across diverse data domains using cloud technologies like AWS (Glue, Redshift, Lambda, EMR, S3) and Azure (Data Factory, Synapse, Databricks).
  • Data Pipeline Development: Implement data models and data processing frameworks (Spark, Kafka, Snowflake) to ingest, transform, and load large datasets (100+ TB), ensuring high availability and reliability of data.
  • Advanced Data Integration: Develop solutions that integrate multiple data sources into Snowflake or similar data warehouses to enable real-time analytics and reporting across dashboards.
  • AI/ML Integration: Collaborate with cross-functional teams to co-develop AI-driven features like text summarization and chatbot functionalities using AWS Bedrock, SageMaker, or similar AI/ML technologies, reducing response times and enhancing decision-making capabilities.
  • Compliance and Security: Ensure compliance with industry standards (SOX, SOC 1/2) and secure best practices by implementing data governance frameworks, monitoring data pipelines, and optimizing cloud database architectures to protect sensitive information.
  • Stakeholder Collaboration: Work closely with stakeholders, including analysts, engineers, and product managers, to understand their data needs, propose solutions, and drive data-driven decision-making by delivering actionable insights.
  • Data Infrastructure Monitoring: Continuously monitor, troubleshoot, and enhance data pipelines, leveraging CI/CD tools (Docker, Jenkins, GitHub Actions) and orchestrating workflows using Apache Airflow to maintain operational efficiency.
  • Leadership and Mentorship: Provide technical leadership within the data platform organization, leading the implementation of cutting-edge cloud technologies and mentoring junior data engineers in best practices and advanced data management techniques.
  • Cloud Migration: Lead large-scale database migrations from on-premises environments (Oracle, SQL Server) to cloud-based solutions like Snowflake and AWS, improving query performance and reducing technical debt.
  • Documentation and Governance: Establish comprehensive documentation for data architecture, governance, and processes to ensure scalability, compliance, and security.

 

Qualifications:

  • 5+ years of proven experience as a Data Engineer or in a similar role with a deep understanding of data architecture and cloud-based ETL/ELT frameworks.
  • Strong experience with AWS and/or Azure cloud services, particularly with Glue, Redshift, Lambda, Step Functions, Databricks, Synapse, and Snowflake.
  • Proficiency in big data technologies such as Apache Spark, Kafka, Hadoop, and Databricks for distributed data processing.
  • Strong programming skills in Python and SQL, with experience in advanced data modeling (star, snowflake schemas) and partitioning techniques.
  • Hands-on experience in building real-time data processing and AI/ML-driven analytics solutions (SageMaker, Bedrock, NLP, Power BI).
  • Proven ability to architect and manage data warehouse solutions (e.g., Snowflake, Redshift) for enterprise-grade performance and reliability.
  • Familiarity with compliance and audit requirements (SOX, SOC 1/2, GDPR) and implementing data governance and security frameworks.
  • Strong problem-solving skills with a focus on data integrity, scalability, and performance optimization.
  • Experience with CI/CD tools (Jenkins, GitHub Actions, Docker) and data orchestration platforms (Apache Airflow).

Preferred Qualifications:

  • Experience with advanced data architecture principles (medallion architecture, materialized views, task scheduling).
  • Proven track record of successful cloud migrations for large datasets and optimizing query performance in Snowflake or similar platforms.
  • Familiarity with real-time analytics using Tableau, Power BI, and other BI tools to drive decision-making and reduce reporting lag.
  • Leadership experience, including mentoring junior engineers and leading technical projects.

 

Location: Dallas, TX, US. This is a hybrid role. 

About Onapsis:

Onapsis protects the business applications that run the global economy. The Onapsis Platform delivers vulnerability management, change assurance, and continuous compliance for business applications from leading vendors such as SAP, Oracle, and others. The Onapsis Platform is powered by the Onapsis Research Labs, the team responsible for the discovery and mitigation of more than 1,000 zero-day vulnerabilities in business applications.

Onapsis is headquartered in Boston, MA, with offices in Heidelberg, Germany and Buenos Aires, Argentina, and proudly serves hundreds of the world’s leading brands, including close to 30% of the Forbes Global 100, six of the top 10 automotive companies, five of the top 10 chemical companies, four of the top 10 technology companies, and three of the top 10 oil and gas companies.

For more information, connect with Onapsis on LinkedIn or visit https://www.onapsis.com.

Onapsis

Website: https://www.onapsis.com/

Headquarters Location: Boston, Massachusetts, United States

Employee Count: 251-500

Year Founded: 2009

IPO Status: Private

Last Funding Type: Series D

Industries: Cloud Data Services ⋅ Cyber Security ⋅ Enterprise Resource Planning (ERP) ⋅ Network Security ⋅ Security