This role is responsible for data collection procedures, including gathering accurate and relevant data for machine learning models and extracting and analyzing data from primary and secondary databases. The role ensures data integrity and compliance by performing data cleansing and data validation. The role performs root-cause analysis and recommends or executes corrective actions when data-related system problems occur. The role applies developed subject matter knowledge to solve common and complex business issues within established guidelines and recommends appropriate alternatives.
Responsibilities
Designs and establishes secure and performant data architectures, enhancements, updates, and programming changes for portions and subsystems of data pipelines, repositories or models for structured/unstructured data.
Analyzes design and determines coding, programming, and integration activities required based on general objectives and knowledge of overall architecture of product or solution.
Writes and executes complete testing plans, protocols, and documentation for the assigned portion of the data system or component; identifies, debugs, and creates solutions for issues with code and its integration into the data system architecture.
Leads a project team of other data engineers to develop reliable, cost-effective, and high-quality solutions for the assigned data system, model, or component.
Collaborates and communicates with project team regarding project progress and issue resolution.
Analyzes data inaccuracies, identifies opportunities, and supports the development of automated solutions to enhance the overall quality of enterprise data.
Identifies problematic areas and conducts research to determine the best course of action to correct the data; identifies, analyzes and interprets trends and patterns in complex datasets.
Works cross-functionally with different departments to assess, define, and develop report deliverables.
Represents the software data engineering team for all phases of larger and more complex development projects.
Provides guidance and mentoring to less experienced staff members.
Education & Experience Recommended
Four-year or Graduate Degree in Computer Science, Information Technology, Software Engineering, Statistics/Mathematics, or any other related discipline, or commensurate work experience or demonstrated competence.
Typically has 4-7 years of work experience, preferably in data analytics, data engineering, data modeling, or a related field, or an advanced degree with 3-5 years of work experience.
Preferred Certifications
Programming Language Certification(s) (SQL, Python, or similar)
Knowledge & Skills
Agile Methodology
Amazon Web Services
Apache Hadoop
Apache Kafka
Apache Spark
Big Data
Computer Science
Data Analysis
Data Engineering
Data Modeling
Data Pipelines
Data Warehousing
Extract Transform Load (ETL)
Java (Programming Language)
Machine Learning
Microsoft Azure
Python (Programming Language)
Scala (Programming Language)
Scalability
SQL (Programming Language)
Cross-Org Skills
Effective Communication
Results Orientation
Learning Agility
Digital Fluency
Customer Centricity
Impact & Scope
Impacts multiple teams and may act as a team or project leader, providing direction to team activities and facilitating information validation and the team decision-making process.
Complexity
Responds to moderately complex issues within established guidelines.
Disclaimer
This job description describes the general nature and level of work performed in this role. It is not intended to be an exhaustive list of all duties, skills, responsibilities, knowledge, etc. These may be subject to change and additional functions may be assigned as needed by management.