DE - Test
Objective:
To evaluate the candidate's proficiency in data engineering,
including data pipelines, ETL processes, data storage
solutions, and distributed systems.
Duration:
2-3 hours
Submission:
Candidates should submit their code via a GitHub repository.
Scenario:
You work for a data product company that needs to process
user interaction data. The data arrives in CSV format and
must be cleaned, transformed, and loaded into a data
warehouse for analysis.
Steps:
1. Data Ingestion:
2. Data Cleaning:
3. Data Transformation:
4. Data Loading:
5. Documentation:
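For orientation only, the ingestion, cleaning, transformation, and loading steps above can be sketched end to end as shown below. This is a minimal illustration under stated assumptions, not a reference solution: the column names (user_id, action, timestamp), the cleaning rules, the daily aggregation, and the use of SQLite as a stand-in for the data warehouse are all hypothetical, since the brief does not specify them.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical sample input; the real file and its columns are not specified.
RAW_CSV = """user_id,action,timestamp
1,click,2024-01-01 10:00:00
1,click,2024-01-01 10:00:00
2,view,2024-01-01 11:30:00
,click,2024-01-02 09:15:00
3,purchase,not-a-date
2,click,2024-01-02 12:00:00
"""


def ingest(source):
    """Read CSV rows from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def clean(rows):
    """Drop rows with a missing user_id, a bad timestamp, or an exact duplicate."""
    seen, cleaned = set(), []
    for row in rows:
        key = (row["user_id"], row["action"], row["timestamp"])
        if not row["user_id"] or key in seen:
            continue
        try:
            datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # assumed rule: unparseable timestamps are discarded
        seen.add(key)
        cleaned.append(row)
    return cleaned


def transform(rows):
    """Count interactions per (user_id, action, day)."""
    counts = {}
    for row in rows:
        key = (int(row["user_id"]), row["action"], row["timestamp"][:10])
        counts[key] = counts.get(key, 0) + 1
    return [(*key, n) for key, n in sorted(counts.items())]


def load(conn, records):
    """Write aggregated records to a table (SQLite stands in for the warehouse)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_interactions "
        "(user_id INTEGER, action TEXT, day TEXT, interactions INTEGER)"
    )
    conn.executemany(
        "INSERT INTO daily_interactions VALUES (?, ?, ?, ?)", records
    )
    conn.commit()


def run_pipeline(source, conn):
    """Run ingest -> clean -> transform -> load and return the loaded records."""
    records = transform(clean(ingest(source)))
    load(conn, records)
    return records
```

To try it, call run_pipeline(io.StringIO(RAW_CSV), sqlite3.connect(":memory:")). Keeping each step as a separate function makes it straightforward to unit test the steps individually and to swap the SQLite loader for a real warehouse client.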
Deliverables:
Steps:
1. Set Up Airflow:
3. Task Scheduling:
4. Error Handling:
Deliverables:
Steps:
1. Data Storage:
2. Data Retrieval:
3. Optimization:
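As a rough illustration of how the storage, retrieval, and optimization steps above relate, the sketch below stores interaction rows, retrieves them by user, and then adds an index so the lookup no longer scans the whole table. Everything here is an assumption made for the example: the brief does not name a storage engine, so SQLite is used as a stand-in, and the table, column, and index names are hypothetical.

```python
import sqlite3

# Assumed query used for retrieval; the brief does not define access patterns.
USER_QUERY = (
    "SELECT action, ts FROM interactions WHERE user_id = ? ORDER BY ts"
)


def create_store(conn):
    """Create a hypothetical interactions table."""
    conn.execute(
        "CREATE TABLE interactions (user_id INTEGER, action TEXT, ts TEXT)"
    )


def store(conn, rows):
    """Insert (user_id, action, ts) tuples."""
    conn.executemany("INSERT INTO interactions VALUES (?, ?, ?)", rows)
    conn.commit()


def actions_for_user(conn, user_id):
    """Return one user's interactions ordered by timestamp."""
    return conn.execute(USER_QUERY, (user_id,)).fetchall()


def add_user_index(conn):
    """Index the columns the retrieval query filters and sorts on."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_interactions_user "
        "ON interactions (user_id, ts)"
    )


def query_plan(conn, user_id):
    """Return SQLite's plan description for the retrieval query as one string."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + USER_QUERY, (user_id,)
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)
```

Comparing query_plan before and after add_user_index is one simple way to show that an optimization actually changed how the query executes, rather than asserting it did.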
Deliverables:
Evaluation Criteria: