Job description
A rapidly growing technology organization is seeking a Senior Data Engineer to help design and build the data infrastructure that powers advanced analytics and machine learning systems across the business.
This role focuses on transforming complex, large-scale datasets into reliable, production-ready data products used by analytics teams, machine learning engineers, and business stakeholders. The ideal candidate enjoys solving difficult data problems, building scalable systems, and working closely with both engineering and analytics teams.
Many of the data challenges involve high-volume transactional and operational data commonly found in regulated industries such as insurance and financial services, where accuracy, data governance, and reliability are critical.
Responsibilities
- Design and build scalable data pipelines to ingest, process, and transform large datasets
- Develop distributed data processing workflows using Python, Spark, and Scala
- Build data models and transformation layers that support analytics and machine learning applications
- Collaborate with data scientists and ML engineers to prepare datasets for predictive modeling and advanced analytics
- Improve data quality, monitoring, and reliability across the data platform
- Optimize performance of large-scale data pipelines and processing frameworks
- Contribute to the design of data architecture, schema standards, and governance practices
- Work with cross-functional teams to integrate data from multiple internal and external sources
Requirements
- 6+ years of experience in data engineering or backend data platform development
- Strong programming experience in Python
- Experience building distributed data pipelines using Apache Spark
- Working knowledge of Scala in production data environments
- Experience handling large-scale datasets in cloud or distributed environments
- Familiarity with machine learning data preparation, feature pipelines, or ML infrastructure
- Strong SQL and data modeling skills
- Experience working with modern data platforms (data lakes, distributed processing frameworks, etc.)
Nice to Have
- Experience supporting machine learning workflows or feature engineering pipelines
- Exposure to insurance, financial services, or other regulated industry data environments
- Experience working with cloud-based data platforms
- Familiarity with streaming or real-time data processing
What You'll Work On
- Building the core data infrastructure that supports analytics and predictive modeling
- Enabling machine learning teams with reliable, high-quality datasets
- Designing scalable pipelines capable of processing large volumes of operational data
- Improving the reliability and usability of enterprise data assets