We are a growing grocery app looking for an experienced Data Engineer to join our data team. As a key member of our team, you'll be responsible for building and maintaining our data infrastructure, optimizing our data processes, and ensuring data quality and accessibility across the organization.
Key Responsibilities:
1. Advanced Airflow Environment Development
Build and maintain a robust Airflow environment to run the scripts that create data marts and automate data logic
Implement best practices including CI/CD, lint checks, secrets management, and Airflow variables
Set up and manage Docker-based Airflow deployments for improved isolation and portability
Configure and optimize the Kubernetes executor for scalable, efficient task execution
Implement advanced scheduling techniques, including dynamic task generation and cross-DAG dependencies
Set up comprehensive monitoring and alerting for the Airflow environment
Implement effective logging strategies for improved debuggability
Ensure high availability and fault tolerance of the Airflow cluster
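To illustrate the dynamic task generation mentioned above, here is a minimal sketch of the underlying pattern in plain Python. The table names and the dict-based task definitions are hypothetical stand-ins; in a real DAG file each generated definition would become an Airflow task (for example via dynamic task mapping or a loop that instantiates operators).

```python
# Hypothetical list of source tables that each need their own load task.
TABLES = ["orders", "customers", "inventory"]

def build_tasks(tables):
    """Generate one task definition per source table.

    In an actual Airflow DAG this loop would create operators (or use
    dynamic task mapping); here a plain dict stands in for each task so
    the pattern is visible without an Airflow installation.
    """
    return [
        {
            "task_id": f"load_{table}",
            "sql": f"SELECT * FROM raw.{table}",  # placeholder query
        }
        for table in tables
    ]

tasks = build_tasks(TABLES)
```

The point of the pattern is that adding a new data mart source becomes a one-line change to the table list rather than a copy-pasted task definition.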
2. Data Pipeline Migration and Optimization
Migrate existing data team scripts from the old Airflow environment to the new one
Improve script quality and optimize performance during the migration process
Implement data quality checks and SLAs for critical pipelines
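As a minimal sketch of the kind of data quality checks referred to above (the check functions and row format are assumptions, not a prescribed framework), two common checks are a not-null validation and a minimum-row-count threshold:

```python
def check_not_null(rows, column):
    """Return (passed, bad_row_indices) for a not-null check on one column."""
    bad = [i for i, row in enumerate(rows) if row.get(column) is None]
    return (len(bad) == 0, bad)

def check_min_rows(rows, minimum):
    """Flag suspiciously small batches, e.g. a silently failed upstream load."""
    return len(rows) >= minimum
```

In a pipeline, checks like these would typically run as a gating step after a load, failing the run (and alerting) when a critical check does not pass.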
3. Looker Governance and Optimization
Manage Looker Enterprise implementation
Develop and implement a strategy for tailored access to specific Looker explores for different teams
Optimize Looker performance and ensure proper data governance
Set up and maintain LookML CI/CD pipelines
4. Web Scraping Projects
Design and implement various web scraping projects to collect relevant external data
Ensure the quality and reliability of scraped data
Implement robust error handling and retry mechanisms for web scraping pipelines
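A minimal sketch of the retry mechanism mentioned above, using only the standard library. The `fetch` callable and the delay parameters are illustrative assumptions; the idea is exponential backoff with jitter so that transient failures (rate limits, flaky endpoints) are retried without hammering the target site.

```python
import random
import time

def fetch_with_retry(fetch, max_attempts=4, base_delay=1.0):
    """Call fetch(); on failure, sleep base_delay * 2**attempt plus jitter, then retry.

    Re-raises the last exception once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

In practice this would wrap the HTTP request of a scraping job, often combined with per-status-code handling so that permanent errors (e.g. 404) are not retried.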
5. General Data Engineering Tasks
Collaborate with data scientists, analysts, and other stakeholders to understand data requirements
Design, build, and maintain scalable data pipelines and ETL processes
Build and maintain cloud-based data warehouses (e.g., BigQuery), including schema design, optimization, and management
Implement data modeling best practices for efficient querying and analysis
Ensure data quality, reliability, and accessibility across the organization
Optimize data warehouse performance and cost-efficiency
Develop and maintain data documentation and metadata management systems
Requirements:
Bachelor's degree in Computer Science, Data Science, or a related field
3+ years of experience in data engineering or a similar role
Strong proficiency in Python and SQL
Extensive experience with Apache Airflow, including advanced features and best practices
Experience with Docker and container orchestration, preferably Kubernetes
Proven experience in building and maintaining cloud data warehouses (especially BigQuery)
Familiarity with Looker or similar BI tools
Knowledge of web scraping techniques and tools
Experience with cloud platforms (especially GCP)
Strong problem-solving skills and attention to detail
Excellent communication skills and ability to work in a team environment
Hired Today.com and the Company will not ask for payment of any kind during the recruitment process. Please report to us immediately if you are invited to an interview and asked to pay any sum of money.