POST GRADUATE PROGRAM IN
Data Engineering
Master Data Engineering skills and take your career to
the next level!
1.2 Million Learners
1:1 Personalized Mentorship
55% Average Salary Hike
IND: +91 7022374614 US: 1-800-216-8930 [email protected] www.intellipaat.com
Post Graduate Program in
Data Engineering
Launch your career in data engineering with this post graduate program designed by experts in
the domain. The program is integrated with MITxMicroMasters, and you will learn from leading
faculty at MIT along with live sessions by industry experts. Get in-depth training in core data
engineering skills such as Python, SQL, AWS, Spark, Kafka, Apache Airflow, etc.
Hottest Job of 21st Century
1.1 Million Job Postings
There is a global estimate of 1.1 million job postings for Data Engineers in 2022
Skill Development
Data Engineers are equipped with various relevant skills that fetch lucrative job offers
Growing Data Industry
50% CAGR in the global data industry
Future-oriented Career
Data Engineering is a budding field; a head start will prove to be beneficial
Popular Degree
46% of Data Engineers have a master’s degree
High Demand
In 2022, India and the US will face a demand-supply gap of 400,000 Data Engineers
Our Credentials
1.2 Million+ Aspiring Active Students
1,000+ Industry-expert Instructors
400+ Hiring Partners
200+ Corporates Upskilled
55% Average Salary Hike
155+ Countries’ Learners
About Program
Intellipaat’s PGP Certification in Data Engineering gives you in-depth knowledge of SQL,
Python, data pipelines, data transformation, Spark, and the cloud services of AWS and Azure.
You will work on multiple real-world projects to learn how to build production-ready ETL
pipelines that pull data from multiple sources, including real-time streaming services, and
load it into cloud data warehouses.
Learning Format: Online Boot Camp
7 Months of Live Classes
Career Services by Intellipaat
MITxMicroMasters Certification
Key Highlights
7 Months of Live Sessions by Industry Experts
Career Guidance
200 Hrs of Self-paced Videos
One-on-One with Industry Mentors
24*7 Support
50+ Industry Projects & Case Studies
Integrated with MITxMicroMasters
E-learning Videos from MIT faculty
Flexible Schedule
Lifetime Free Upgrade
Soft Skills Essential Training cost
Program Pedagogy
Instructor-led Training: Attend live classes and get trained by the world's top faculty and industry experts
Hackathons: Get a sense of how real projects are built
Innovative LMS: For an effective online learning experience
Dedicated Learning Managers: To help you with your learning needs
Peer Networking and Group Learning: Improve your professional network and learn from peers
Learn by Doing: Hands-on exercises, project work, quizzes, and capstone projects
Self-paced Videos: Learn at your own pace with world-class content
Gamified Learning: Get involved in hackathons and group activities to solve real-world problems
24*7 Support
Projects and Exercises: Get real-world experience of projects
1:1 Personalized Learning: Hands-on exercises, project work, quizzes, and capstone projects
Who Can Apply for the Course?
Freshers and Undergraduates willing to pursue a career in data engineering
Anyone looking for a career transition to data engineering
IT Professionals
Experienced Professionals willing to learn data engineering
Technical and Nontechnical Professionals with basic-level programming knowledge can also apply
Project Managers
Application Process
The application process consists of three simple steps. Candidates have to submit their application.
An offer of admission will be made to the selected candidates, and their application will be accepted
upon the payment of the admission fee.
1. SUBMIT APPLICATION
Tell us a bit about yourself and why you want to join
2. ADMISSION TEST & APPLICATION REVIEW
Clear the admission test and have a personal interview with our interview panel
3. ADMISSION LETTER
Shortlisted candidates would be offered the admission letter
Learning Path
Live Courses
Start of the Course
1. Preparatory Sessions: Python and Linux
2. Data Wrangling with SQL
3. Introduction to Data Engineering
4. Big Data Engineering with Apache Spark and Kafka
5. Data Modeling
6. Cloud Data Warehouses
7. Mastering ETL Tool: Informatica
8. Data Engineering on the Cloud
9. Schedule and Automate Data Pipelines with Apache Airflow
10. Data Virtualization and Containerization
11. Capstone Project
PGP in Data Engineering
Self-paced Course
Azure Data Factory
Program Curriculum
Module 1
Preparatory Sessions: Python & Linux
Python
Introduction to Python and IDEs
Python Basics
Object Oriented Programming
Hands-on Sessions and Assignments for Practice
Linux
Introduction to Linux
Linux Basics
Hands-on Sessions and Assignments for Practice

Module 2
Data Wrangling with SQL
SQL Basics
Advanced SQL
Deep Dive into User Defined Functions
SQL Optimization and Performance

Module 3
Introduction to Data Engineering
What is Data Engineering, Use Cases, and Applications?
Data Engineer or Data Scientist?
Data Engineering Problems
Tools of a Data Engineer
Working with Different Databases
Processing Tasks, Scheduling Tools, and Different Cloud Providers
Why Cloud Computing, Use Cases, and Applications?
Different Cloud Services

Module 4
Big Data Engineering with Apache Spark and Kafka
Learn the big data ecosystem and Apache Spark to load large volumes of data. Work with
Spark SQL to query data and optimize the queries. Build an ETL pipeline that pulls data from
different sources, such as HDFS and S3, and loads it from files in different formats, such as
CSV, TXT, JSON, fixed format, and streaming data. Work on a Spark cluster using AWS.
Introduction to HDFS and Apache Spark
Spark Basics
Working with RDDs in Spark
Aggregating Data with Pair RDDs
Writing and Deploying Spark Applications
Parallel Processing
Spark RDD Persistence
Integrating Apache Flume and Apache Kafka
Spark Streaming
Improving Spark Performance
Spark SQL and Data Frames
Scheduling or Partitioning

Module 5
Data Modeling
Understand the difference between SQL and NoSQL. Create relational data models and
NoSQL-based data models based on business reporting requirements. Work with ETL tools to
push the data to the model. Work on MS SQL and Cassandra to create databases, and use ETL
tools to extract, transform, and load data into the models.
Project 1: Data Modeling using Relational Databases
Project 2: Data Modeling using Apache Cassandra

Module 6
Cloud Data Warehouses
Master the skills of building a highly scalable data warehouse on AWS. Work with Redshift,
pull the data from RDS and other media services of AWS using an ETL pipeline, and load the
data into the data warehouse.

Module 7
Mastering ETL Tool: Informatica
What is ETL, Use Cases, and Applications?
Why We Need ETL Tools?
Working with Different Data Sources: Relational Databases, NoSQL, HDFS, Stream Data, CSV
Files, TXT Files, JSON or XML Files, and Fixed File Formats
Transformation of Data
Loading Data into a Data Model or File System
Using SQL for Data Transformation
Optimizing ETL Processes
Understanding ETL Architecture for Tracking the Data Flow and Data Pipelines
Understanding Data Quality Checks

Module 8
Data Engineering on the Cloud
Master the skills of building a highly scalable data warehouse on AWS. Work with Redshift,
pull the data from RDS and other media services of AWS using an ETL pipeline, and load the
data into the data warehouse.
AWS Data Storage Services: S3, S3 Glacier, Amazon DynamoDB
AWS Processing Services: AWS EMR, EMR Cluster, Hadoop, Hue with EMR, Spark with EMR, AWS
Lambda, HCatalog, Glue, and Glue Lab
AWS Data Analysis Services: Amazon Redshift, Tuning Query Performance, Amazon ML, Amazon
Athena, Amazon Elasticsearch, and ES Domain

Module 9
Schedule and Automate Data Pipelines with Apache Airflow
Learn to schedule, automate, and monitor ETL pipelines with Apache Airflow, Luigi, and Cron.
Learn how to implement data quality checks and processes for running ETL in a production
environment. Understand and create a strong process and architecture to avoid ETL failures
caused by data quality issues. Learn how to handle ETL failures in a production environment.

Module 10
Data Virtualization and Containerization
Use Docker to convert your applications and data pipelines into container-based applications
Orchestrate containers to deliver scalable and reliable performance using Kubernetes

Module 11
Capstone Project
Implement the concepts learnt in the program and create a highly scalable data warehouse
architecture that loads data from different sources and uses a NoSQL database to answer the
queries raised by the analytics team. Use an AWS cluster to deploy your data processing
solution.

Module 12
Azure Data Factory
Non-Relational Data Stores and Azure Data Lake Storage
Data Lake and Azure Cosmos DB
Relational Data Stores
Why Azure SQL?
Azure Batch
Azure Data Factory
Azure Databricks
Azure Stream Analytics
Monitoring & Security

Module 13
Machine Learning with Python
Learn Basic Statistics Required for Data Science
Master Data Science Algorithms
Learn Linear Regression and Work on Recommender Problems, Collaborative Filtering
Non-linear Classification, Kernels
Deep Learning Introduction and Neural Networks
RNN & CNN
Unsupervised Learning: Clustering
Generative Models, Mixtures
Learning to Control: Reinforcement Learning
Natural Language Processing

Skills to Master
SQL
NoSQL (MongoDB)
Data Warehousing
OLAP
OLTP
ETL
Python Programming
Hadoop
Spark
Spark Streaming
AWS
Redshift
RDS
EMR
Apache Airflow
S3
S3 Glacier
Glue
Docker
Kubernetes
Program Partners
About Intellipaat
Intellipaat is one of the leading online training providers with more than 1.2 million learners
in over 155 countries. We are on a mission to democratize education as we believe that
everyone has the right to quality education.
To improve employability, we create courses in collaboration with top universities and MNCs
such as IIT Madras, University of Essex, University of Liverpool, IIT Roorkee, IIT Guwahati,
IBM, and Microsoft.
Our courses are delivered by SMEs, and our pedagogy enables quick learning of difficult
topics. 24/7 technical support and career services help learners jump-start their careers.
About MIT and MIT IDSS
The Institute for Data, Systems, and Society (IDSS) is a cross-disciplinary unit made up of
faculty from across the Massachusetts Institute of Technology (MIT). IDSS advances
education and research in cutting-edge data analysis, statistics, and machine learning,
and applies these tools in collaboration with social scientists, community stakeholders, and
policy makers to address complex societal challenges across diverse sectors.
On the completion of this program, you will:
Receive an industry-recognized certification in Data Engineering from Intellipaat
Receive a course completion certification by MITxMicroMasters on the completion of the modules by MIT
Get your dream job in just 6 months from the completion of the program
1.2 Million Learners & 200+ Corporates across 155+ countries
upskilling on Intellipaat Platform
Contact Us
INDIA
AMR Tech Park 3, Ground Floor, Tower B, Hongasandra Village,
Bommanahalli, Hosur Road, Bangalore, Karnataka 560068, India
Phone No: +91-7022374614
UK
Flat 16 Bluepoint Court, 203 Station Road, Harrow,
Middlesex HA1 2TS, UK
USA
1219 E. Hillsdale Blvd. Suite 205, Foster City, CA 94404
Phone No: 1-800-216-8930
[email protected]
www.intellipaat.com