• Data Engineering •
Career Course 2022
DESCRIPTION
This course consists of two modules: analytics engineering and data engineering.
In the analytics engineering module, students learn coding fundamentals in Python and SQL, and begin building data pipelines.
In the data engineering module, students deepen their Python and SQL skills by building APIs in Flask, and learn cloud computing and data orchestration skills.
During the data engineering module, students are paired with companies to perform externships (more below).
Teaching Style
All classes are conducted live online via Zoom.
Days consist of 30-minute live lectures followed by interactive readings and labs performed either individually or in pairs.
Quizzes are regularly delivered to assess understanding.
Halfway through the course, students are assigned to outside companies to complete an externship where they perform data engineering work.
Schedule - 24 Weeks
Tues, Weds and Thurs 6:30 - 9:30 pm EST
Sundays 12:30 pm - 9:30 pm EST
Module 1: Analytics Engineering (Weeks 1 - 12)
Python: Data Structures, Functions, APIs, OOP
SQL: Postgres, Snowflake
ELT Pipelines: Fivetran, DBT, Mode
Module 2: Data Engineering (Weeks 13 - 24)
Flask: ORMs, MVC, Adapter Pattern
Cloud: Docker, AWS EC2
Pipelines: Airflow, AWS S3, RDS, Redshift
• Module 1: Analytics Engineering •
Weeks 1 - 12
DESCRIPTION
This module prepares students for their externship by training them in two subjects: (1)
software engineering fundamentals and (2) data pipelines.
Software Engineering Fundamentals
Students learn to retrieve and manipulate data in Python, write clean functions, and model data
with object-oriented programming. Then, with SQL, students learn single-table queries,
relational queries, and advanced SQL techniques.
Tools: Python, APIs, OOP, SQL, Postgres
Data Pipeline Fundamentals
Students learn to build data pipelines that move data from a transactional database to an analytics
database, along with the data modeling and queries those pipelines require.
Tools: Fivetran, Snowflake, DBT, and Mode
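As a rough illustration of the pipeline idea described above, the sketch below copies raw rows from a transactional database into an analytics database and then transforms them there with SQL. It is a minimal sketch under stated assumptions, not course material: in-memory SQLite stands in for Postgres and Snowflake, a plain Python copy stands in for an ingestion tool such as Fivetran, and the table names (orders, raw_orders, customer_revenue) and sample rows are hypothetical.

```python
import sqlite3

# Stand-ins for the two databases (assumption: in-memory SQLite replaces the
# transactional and analytics databases so the sketch runs without any setup).
oltp = sqlite3.connect(":memory:")
olap = sqlite3.connect(":memory:")

# Hypothetical transactional table with a few sample orders.
oltp.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
oltp.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ada", 20.0), ("ada", 5.0), ("grace", 12.5)],
)

# Extract and load: copy the raw rows into the analytics database unchanged.
rows = oltp.execute("SELECT id, customer, amount FROM orders").fetchall()
olap.execute("CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount REAL)")
olap.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)

# Transform: build a reporting table inside the analytics database with SQL.
olap.execute(
    """
    CREATE TABLE customer_revenue AS
    SELECT customer, SUM(amount) AS total_amount, COUNT(*) AS order_count
    FROM raw_orders
    GROUP BY customer
    """
)

revenue = olap.execute(
    "SELECT customer, total_amount, order_count "
    "FROM customer_revenue ORDER BY customer"
).fetchall()
print(revenue)
```

The raw copy is loaded first and the aggregation happens afterwards inside the analytics database, which is the load-then-transform ordering that the "ELT" label refers to.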
Analytics Engineering Data Pipeline
Sources: Marketing Data, Product Data, Sales Data
Ingestion: Fivetran, Meltano, Stitch
Data Warehouse: Snowflake, Redshift, BigQuery
Transform Data: DBT
Dashboard: Mode, PowerBI, Looker
Week Number Topic
Week 1 Data Structures in Python
Week 2 Functional Python, Git, Bash
Week 3 Object Oriented Programming
Week 4 Python Projects
Week 5 SQL Single Table Queries
Week 6 SQL Relational Queries
Week 7 Coercing Data with Advanced SQL
Week 8 Data Modeling with OLTPs vs OLAPs
Week 9 ETL in Snowflake
Week 10 Intro to DBT
Week 11 DBT Pipelines
Week 12 Review and Data Dashboards
• Module 2: Data Engineering •
Weeks 13 - 24
DESCRIPTION
This module teaches design patterns in backend web programming and how to build data engineering
pipelines. Throughout the module, we allocate 1 - 2 hours per week for technical interview prep
and six hours per week for students to work on externships.
Backend Engineering
We’ll learn the classic web programming pattern of model-view-controller (MVC), perform ETL in
Python with the adapter pattern, and build out an object-relational mapper (ORM).
Tools: Scraping, Flask, MVC, Adapter Pattern, ORMs
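To give a feel for the adapter pattern in an ETL setting, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the Venue model, the VenueAdapter class, the payload field names (venue_name, location, stars), and the hard-coded sample records standing in for an external API response. A list stands in for the database.

```python
from dataclasses import dataclass


@dataclass
class Venue:
    """Internal model: the shape the rest of the application expects."""
    name: str
    city: str
    rating: float


class VenueAdapter:
    """Translates one record from a hypothetical external API into a Venue."""

    def __init__(self, payload: dict):
        self.payload = payload

    def to_model(self) -> Venue:
        # Transform: rename fields, clean strings, and fill in missing values.
        location = self.payload.get("location", {})
        return Venue(
            name=self.payload["venue_name"].strip(),
            city=location.get("city", "unknown"),
            rating=float(self.payload.get("stars", 0)),
        )


def extract() -> list:
    # Extract: stand-in for a client calling an external API (sample data).
    return [
        {"venue_name": " Blue Note ", "location": {"city": "New York"}, "stars": "4.5"},
        {"venue_name": "Village Vanguard", "stars": 5},
    ]


def load(venues: list, table: list) -> None:
    # Load: stand-in for a database insert.
    table.extend(venues)


table = []
venues = [VenueAdapter(record).to_model() for record in extract()]
load(venues, table)
print(table)
```

The adapter is the only place that knows the external field names, so a change in the upstream payload would only require changes in VenueAdapter rather than in the model or the loading code.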
Flask API Design
MVC: Model, View, Controller
Adapter Pattern: External APIs -> Client -> Adapter -> Model -> Database
ETL Pattern: Extract, Transform, Load
Data Pipelines with Cloud Computing
In the cloud computing portion, we'll learn how to deploy our backend APIs on AWS. Then
we'll learn how to automate requests for external data and load that data into an analytics
database.
Tools: AWS EC2, S3, RDS, Redshift, Docker, Airflow
Airflow Data Pipeline
Components: External Data, Client, Staging, Data Warehouse
Services shown: Docker, Flask, AWS S3, AWS RDS, AWS Redshift
Week Number Topic
Week 13 HTML, CSS, and Scraping
Week 14 Intro to Flask
Week 15 Model - View - Controller
Week 16 The Adapter Pattern
Week 17 MVC with the Adapter Pattern
Week 18 Reviewing MVC with Adapters
Week 19 Docker Containers
Week 20 Building Docker Images
Week 21 Deploying Websites with AWS
Week 22 Working with RDS and Advanced Bash
Week 23 ETL with RDS, S3, and Redshift
Week 24 Performing ETL in Airflow