Data Engineer
Auburn Hills, MI 
Posted 10 days ago
Job Description

The Data Engineer will build, manage, and optimize data pipelines and move them effectively into production for key data and analytics consumers. Data Engineers must also ensure compliance with data governance and data security requirements while creating, improving, and operationalizing these integrated and reusable data pipelines. The newly hired Data Engineer will be the key interface in operationalizing data and analytics. Candidates need to be self-driven, curious, and creative.

As a Data Engineer, your responsibilities include, but are not limited to:

  • Source, load, transform, and store data
  • Create, maintain, and optimize data pipelines to move workloads from development to production for specific use cases
  • Implement and automate data quality checks
  • Drive automation using innovative and modern tools, techniques, and architectures to minimize manual and error-prone processes and improve productivity
  • Assist with renovating the data management infrastructure to drive automation and ensure data integrity
  • Learn and use modern data preparation, integration, and AI-enabled metadata management tools and techniques
  • Track data consumption patterns
  • Perform intelligent sampling and caching
  • Recommend and automate existing and future integration flows
  • Ensure data is provisioned and used responsibly through data governance and compliance initiatives
  • Work with data governance teams (and information stewards within these teams) and participate in vetting and promoting content created in the business and by data scientists to the curated data catalog for governed reuse

Basic Qualifications:

  • Bachelor's degree
  • 5 years of work experience in data management disciplines including data integration, modeling, optimization, and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks
  • 3 years of experience working in cross-functional teams and collaborating with business stakeholders in support of a departmental and/or multi-departmental data management and analytics initiative
  • Strong experience with advanced analytics tools for object-oriented/object function scripting using languages such as R, Python, Java, C++, Scala and others
  • Strong ability to design, build and manage data pipelines for data structures encompassing data transformation, data models, schemas, metadata, and workload management
  • Ability to work with both IT and business in integrating analytics and data science output into business processes and workflows
  • Strong experience with popular database programming languages including SQL, PL/SQL, and others for relational databases, and certifications on upcoming NoSQL/Hadoop-oriented databases such as MongoDB, Cassandra, and others for non-relational databases
  • Strong experience in working with large, heterogeneous datasets in building and optimizing data pipelines, pipeline architectures, and integrated datasets using traditional data integration technologies including ETL/ELT, data replication/CDC, message-oriented data movement, and API design and access, as well as upcoming data ingestion and integration technologies such as stream data integration, CEP, and data virtualization
  • Strong experience in working with SQL-on-Hadoop tools and technologies, including Hive, Impala, Presto, etc. from an open-source perspective and Hortonworks Data Flow (HDF), Dremio, Informatica, Talend, and others from a commercial vendor perspective
  • Basic experience in working with data governance, data quality and data security teams and specifically information stewards and privacy and security officers in moving data pipelines into production with appropriate data quality, governance, security standards, and certification

Preferred Qualifications:

  • Master's degree or Ph.D. in Computer Science, Statistics, Applied Mathematics, Data Management, Information Systems, Information Science or a related quantitative field
  • Ability to work across multiple deployment environments (cloud, on-premises, and hybrid) and multiple operating systems, and with containerization technologies such as Docker, Kubernetes, Amazon Elastic Container Service, and others
  • Strong experience in working with both open-source and commercial message queuing technologies such as Kafka, JMS, Azure Service Bus, and Amazon Simple Queue Service; stream data integration technologies such as Apache NiFi, Apache Beam, Apache Kafka Streams, Amazon Kinesis, etc.; and stream analytics technologies such as Apache Kafka KSQL, Apache Spark Streaming, Apache Samza, etc.
  • Basic experience working with popular data discovery, analytics, and BI software tools such as Power BI, Qlik, Sisense, Tableau, etc. for semantic-layer-based data discovery
  • Strong experience in working with data science teams in refining and optimizing data science and machine learning models and algorithms
  • Adept in agile methodologies and well-versed in applying DevOps/MLOps methods to the construction of ML and data science pipelines
  • Knowledge of industry standard BA tools, including Cognos, QlikView, Business Objects, and other tools that could be used for enterprise solutions
  • Superior communication skills, including storytelling and other techniques to guide, inspire, and explain analytics capabilities and techniques to the organization
  • Strong experience in working with tools like GitHub for version control and source code management
  • Domain knowledge and experience working with automotive and consumer data

Equal Opportunity Employer Minorities/Women/Protected Veterans/Disabled.


Job Summary
Start Date: As soon as possible
Employment Term and Type: Regular, Full Time
Required Education: Bachelor's Degree
Required Experience: 5+ years