Data Engineering 6 Months Plan

The document outlines a learning path for data engineering fundamentals. It recommends courses and resources to learn: 1. Computer science fundamentals like Python programming and algorithms. 2. Programming in Python and practicing web scraping and building a calculator. 3. SQL for querying and analyzing data through courses and practice on Hackerrank. 4. Linux basics like commands through courses and a beginner portfolio project. 5. Big data and data warehousing concepts through Coursera specializations. 6. Building batch and real-time data pipelines with Spark, Kafka, and Airflow. 7. Data visualization with tools like Tableau or using Python code. 8. Cloud

Uploaded by

iti

1. Computer Science Fundamentals (If you don’t have a CS background)

a. edX - Introduction to Computer Science and Programming Using Python
b. edX - CS50's Introduction to Computer Science
c. Coursera - Computer Communications Specialization
d. Book - Grokking Algorithms: An illustrated guide

2. Programming Language
Take any of these courses. Your main goal here is to understand how to write basic
Python code and how to work with different datasets!

a. Darshil - Python for Data Engineering (Recommended)
b. DataCamp - Data Engineering With Python
c. Coursera - Python for Everybody Specialization (Do this if you don’t know
anything about Python)
d. edX - Python Basics for Data Science
e. Udemy - Python Bootcamps: Learn Python Programming and Code Training

Practice Projects:
● Scrape data using the BeautifulSoup library, e.g. from Amazon, Covid statistics,
Wikipedia, or any website you like
● Build A Calculator Using Python
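
For the calculator project, one possible starting point is sketched below. It is only a minimal sketch, not a reference solution: the names `OPERATIONS`, `calculate`, and `evaluate` are made up for illustration, and it assumes input of the form "number operator number" separated by spaces.

```python
import operator

# Map each supported symbol to the function that implements it.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(left, symbol, right):
    """Apply one arithmetic operation to two numbers."""
    if symbol not in OPERATIONS:
        raise ValueError(f"Unsupported operator: {symbol}")
    if symbol == "/" and right == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return OPERATIONS[symbol](left, right)


def evaluate(expression):
    """Evaluate a string such as '3 * 4' (number, operator, number)."""
    left, symbol, right = expression.split()
    return calculate(float(left), symbol, float(right))


if __name__ == "__main__":
    print(evaluate("3 * 4"))
```

Keeping the operators in a dictionary means a new operation (for example `**`) can be added with one extra entry. Natural next steps are reading input in a loop and handling malformed expressions.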

3. SQL (Structured Query Language)
Learn the basics of SQL and how to write queries. Once you complete the course,
make sure you do hands-on practice on HackerRank or any website you like!

a. Udemy - The Complete SQL Bootcamp for the Manipulation and Analysis of
Data (Recommended)
b. Coursera - SQL for Data Science
c. DataCamp - Intro to SQL

Practice SQL here:
● HackerRank SQL
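
If you want to practice queries locally as well, a small sandbox can be built with Python's built-in `sqlite3` module. The sketch below is illustrative only: the `orders` table, its columns, and the sample rows are invented for this example, and SQLite's dialect differs in places from the databases used in the courses above.

```python
import sqlite3


def customer_totals():
    """Build a tiny in-memory table and summarize orders per customer."""
    conn = sqlite3.connect(":memory:")  # throwaway database, nothing is saved
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # Hypothetical sample data, used only to have something to query.
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("alice", 20.0), ("bob", 35.5), ("alice", 14.5)],
    )
    query = """
        SELECT customer, COUNT(*) AS order_count, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        ORDER BY total DESC
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for customer, order_count, total in customer_totals():
        print(customer, order_count, total)
```

The query covers the pieces most beginner exercises revolve around: `SELECT`, aggregate functions, `GROUP BY`, and `ORDER BY`.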

4. Basics Of Linux
Why Linux? Because you will be working with many remote machines, connecting to
them over SSH, and performing operations on them, so it’s important to learn the
basic commands.

You don’t have to memorize all the commands; just understand what they do and
how to write them.

a. Udemy - Linux for Beginners: Linux Basics (Recommended)
b. Coursera - Linux Fundamentals

Do Hands-On Project
● Beginner Data Engineering Portfolio Project (Recommended)

5. Big Data Fundamentals
This section is theoretical. You need to understand how big data systems work
and the history behind them.

a. Coursera - Big Data Specialization (Recommended)
b. edX - Big Data Fundamentals
c. Udemy - Learn Big Data: The Hadoop Ecosystem Masterclass (Do this if you want to
learn about legacy systems)

6. Data Warehouse Fundamentals
Same as the previous section: more theory and understanding of concepts.

a. Coursera - Data Warehousing for Business Intelligence Specialization
(recommended for deep dive)
b. Udemy - Data Warehouse Fundamentals for Beginners (recommended for quick
learning)

7. Learn Batch/Realtime Streaming Pipeline Building
a. Batch Pipeline (Spark)
i. DataCamp - Big Data Fundamentals with PySpark (recommended)
ii. Udemy - Spark and Python for Big Data with PySpark
b. Realtime Streaming (Kafka)
i. Udemy - Apache Kafka Course for Beginners: Learn Kafka Online (check
this)
ii. edX - Building ETL and Data Pipelines with Bash, Airflow, and Kafka

8. Data Orchestration (Airflow)

a. Udemy - The Complete Hands-On Introduction to Apache Airflow
b. DataCamp - Airflow

9. Dashboard Tool
There are two ways to visualize data: one using code and the other using a
dedicated tool, so I have added both here.

a. Udemy - Python Data Analysis & Visualization Masterclass (Using Code)
b. Udemy - Tableau 10: Training on How to Use Tableau For Data Science (Using
Tool)
c. Coursera - Data Visualization with Tableau Specialization
d. Udemy - Microsoft Power BI with Desktop Training Course

10. Cloud Computing
This is an advanced section. Do the courses, then take the certification to add
value to your resume. If you are new to the cloud, start with AWS, but if you
already know another cloud, you can go with that one instead!

a. AWS (Amazon Web Services)
i. Udemy - Ultimate AWS Certified Cloud Practitioner
ii. Udemy - Ultimate AWS Certified Solutions Architect Associate (SAA)
b. GCP (Google Cloud Platform)
i. Coursera - Cloud Data Engineer Professional Certificate
c. Microsoft Azure
i. Udemy - AZ-900: Microsoft Azure Fundamentals
ii. Udemy - Azure Data Engineer Certified: 8 Course Bundle

Once you have learned about the different services, consider doing some hands-on
projects:
Do Hands-On - Data Engineering Cloud Project Series (AWS)
Do Hands-On - YouTube Data Analysis Project (AWS)
