Data-Engineering Course Structure

The document advertises an online post-graduate program in data engineering. It highlights that the 7-month program provides in-depth training in core skills such as Python, SQL, AWS, Spark and Kafka. The program is integrated with MITxMicroMasters and offers career services. The document cites an average salary hike of 55% and over 1 million job openings for data engineers in 2022. The program includes live sessions, industry projects, mentorship and career guidance.

Uploaded by

Kiran Chinta

POST GRADUATE PROGRAM IN

Data Engineering
Master Data Engineering skills and take your career to
the next level!

1.2 Million Learners
1:1 Personalized Mentorship
55% Average Salary Hike

IND: +91 7022374614 US: 1-800-216-8930 [email protected] www.intellipaat.com


Post Graduate Program in
Data Engineering
Launch your career in data engineering with this post graduate program designed by experts in
the domain. The program is integrated with MITxMicroMasters, and you will learn from leading
faculty at MIT along with live sessions by industry experts. Get in-depth training in core data
engineering skills such as Python, SQL, AWS, Spark, Kafka, Apache Airflow, etc.

Hottest Job of 21st Century


1.1 Million Job Postings
There is a global estimate of 1.1 million job postings for Data Engineers in 2022

Skill Development
Data Engineers are equipped with various relevant skills that fetch lucrative job offers

Growing Data Industry
50% CAGR in the global data industry

Future-oriented Career
Data Engineering is a budding field; a head start will prove to be beneficial

Popular Degree
46% of Data Engineers have a master’s degree

High Demand
In 2022, India and the US will face a demand-supply gap of 400,000 Data Engineers

Our Credentials
1.2 Million+ Aspiring Active Students
1,000+ Industry-expert Instructors
400+ Hiring Partners
200+ Corporates Upskilled
55% Average Salary Hike
155+ Countries’ Learners



About Program
Intellipaat’s PGP Certification in Data Engineering will give you in-depth knowledge of SQL,
Python, data pipelines, data transformation, Spark, and the cloud services of AWS and Azure. Work
on multiple real-world projects to learn how to create production-ready ETL, pull data from
multiple data sources, including real-time streaming services, and load it into cloud data
warehouses.

Learning Format: Online Boot Camp
7 Months: Live Classes
Career Services: by Intellipaat
MITxMicroMasters: Certification

Key Highlights

7 Months of Live Sessions by Industry Experts
Career Guidance
200 Hrs of Self-paced Videos
One-on-One with Industry Mentors
24*7 Support
50+ Industry Projects & Case Studies
Integrated with MITxMicroMasters
E-learning Videos from MIT faculty
Flexible Schedule
Lifetime Free Upgrade

Soft Skills Essential Training cost

Program Pedagogy

Instructor-led Training: Attend live classes and get trained by top faculty and industry experts
Hackathons: Get a sense of how real projects are built
Innovative LMS: For an effective online learning experience
Learn by Doing: Hands-on exercises, project work, quizzes, and capstone projects
Dedicated Learning Managers: To help you with your learning needs
Peer Networking and Group Learning: Improve your professional network and learn from peers
Self-paced Videos: Learn at your own pace with world-class content
Gamified Learning: Get involved in group activities to solve real-world problems
Projects and Exercises: Get real-world experience of projects
1:1 Personalized Learning
24*7 Support



Who Can Apply for the Course?
Freshers and Undergraduates willing to pursue a career in data engineering

Anyone looking for a career transition to data engineering

IT Professionals

Experienced Professionals willing to learn data engineering

Technical and Nontechnical Professionals with basic-level programming knowledge can also apply

Project Managers

Application Process
The application process consists of three simple steps. Candidates have to submit their application.
An offer of admission will be made to the selected candidates, and their application will be accepted
upon the payment of the admission fee.

1. SUBMIT APPLICATION
Tell us a bit about yourself and why you want to join

2. ADMISSION TEST & APPLICATION REVIEW
Clear the admission test and have a personal interview with our interview panel

3. ADMISSION LETTER
Shortlisted candidates would be offered the admission letter



Learning Path

Live Courses

Start of the Course

1. Preparatory Sessions: Python and Linux
2. Data Wrangling with SQL
3. Introduction to Data Engineering
4. Big Data Engineering with Apache Spark and Kafka
5. Data Modeling
6. Cloud Data Warehouses
7. Mastering ETL Tool: Informatica
8. Data Engineering on the Cloud
9. Schedule and Automate Data Pipelines with Apache Airflow
10. Data Virtualization and Containerization
11. Capstone Project

PGP in Data Engineering

Self-paced Course
Azure Data Factory



Program Curriculum

Module 1
Preparatory Sessions: Python & Linux

Python
Introduction to Python and IDEs
Python Basics
Object Oriented Programming
Hands-on Sessions and Assignments for Practice

Linux
Introduction to Linux
Linux Basics
Hands-on Sessions and Assignments for Practice

Module 2
Data Wrangling with SQL
SQL Basics
Advanced SQL
Deep Dive into User Defined Functions
SQL Optimization and Performance

Module 3
Introduction to Data Engineering
What is Data Engineering, Use Cases, and Applications?
Data Engineer or Data Scientist?
Data Engineering Problems
Tools of a Data Engineer
Working with Different Databases
Processing Tasks, Scheduling Tools, and Different Cloud Providers
Why Cloud Computing, Use Cases, and Applications?
Different Cloud Services

Module 4
Big Data Engineering with Apache Spark and Kafka
Learn the big data ecosystem and Apache Spark to load large volumes of data. Work with Spark SQL for querying data and optimizing the same. Build an ETL pipeline to pull the data from different data sources, such as HDFS and S3, and use different format data files, such as csv, txt, json, fixed format, streaming data, etc., to load the data. Work on Spark Cluster using AWS.
Introduction to HDFS and Apache Spark
Spark Basics
Working with RDDs in Spark
Aggregating Data with Pair RDDs
Writing and Deploying Spark Applications
Parallel Processing
Spark RDD Persistence
Integrating Apache Flume and Apache Kafka
Spark Streaming
Improving Spark Performance
Spark SQL and Data Frames
Scheduling or Partitioning

Module 5
Data Modeling
Understand the difference between SQL and NoSQL. Create relational data models and NoSQL-based data models on business reporting requirements. Work with ETL tools to push the data to the model. Work on MS SQL and Cassandra for creating databases and using ETL tools for data extraction, transformation, and loading to the models.
Project 1: Data Modeling using Relational Databases
Project 2: Data Modeling using Apache Cassandra

Module 6
Cloud Data Warehouses
Master the skills of building a highly scalable data warehouse on AWS. Work with Redshift and pull the data from RDS and other media services of AWS using ETL pipeline and load the data into the data warehouse.

Module 7
Mastering ETL Tool: Informatica
What is ETL, Use Cases, and Applications?
Why We Need ETL Tools?
Working with Different Data Sources: Relational Databases, NoSQL, HDFS, Stream Data, CSV Files, TXT Files, JSON or XML Files, and Fixed File Formats
Transformation of Data
Loading Data into a Data Model or File System
Using SQL for Data Transformation
Optimizing ETL Processes
Understanding ETL Architecture for Tracking the Data Flow and Data Pipelines
Understanding Data Quality Checks

Module 8
Data Engineering on the Cloud
Master the skills of building a highly scalable data warehouse on AWS. Work with Redshift and pull the data from RDS and other media services of AWS using ETL


pipeline and load the data into the data warehouse.
AWS Data Storage Services: S3, S3 Glacier, Amazon DynamoDB
AWS Processing Services: AWS EMR, EMR Cluster, Hadoop, Hue with EMR, Spark with EMR, AWS Lambda, HCatalog, Glue, and Glue Lab
AWS Data Analysis Services: Amazon Redshift, Tuning Query Performance, Amazon ML, Amazon Athena, Amazon Elasticsearch, and ES Domain

Module 9
Schedule and Automate Data Pipelines with Apache Airflow
Learn to schedule, automate, and monitor ETL pipelines with Apache Airflow, Luigi, and Cron. Learn and master how to implement data quality checks and processes for running the ETL in a production environment. Understand and create a strong process and architecture to avoid ETL failure due to data quality issues. Learn how to handle ETL failure issues in a production environment.

Module 10
Data Virtualization and Containerization
Use Docker for converting your applications and data pipelines to container-based applications
Orchestrate containers to deliver scalable and reliable performance using Kubernetes

Module 11
Capstone Project
Implement the concepts learnt in the program and create a highly scalable data warehouse architecture for loading data from different sources and use NoSQL database for query to provide data results asked by the analytics team. Use AWS cluster to deploy your data processing solution.

Module 12
Azure Data Factory
Non-Relational Data Stores and Azure Data Lake Storage
Data Lake and Azure Cosmos DB
Relational Data Stores
Why Azure SQL?
Azure Batch
Azure Data Factory
Azure Databricks
Azure Stream Analytics
Monitoring & Security

Module 13
Machine Learning with Python
Learn Basic statistics required for Data Science
Master Data Science Algorithms
Learn Linear regression and work on Recommender problems, collaborative filtering
Non-linear classification, kernels
Deep Learning Introduction and Neural networks
RNN & CNN
Unsupervised learning: clustering
Generative models, mixtures
Learning to control: Reinforcement learning
Natural Language Processing

Skills to Master
SQL
NoSQL (MongoDB)
Data Warehousing
OLAP
OLTP
ETL
Python Programming
Hadoop
Spark
Spark Streaming
AWS
Redshift
RDS
EMR
Apache Airflow
S3
S3 Glacier
Glue
Docker
Kubernetes



Program Partners

About Intellipaat

Intellipaat is one of the leading online training providers with more than 1.2 million learners
in over 155 countries. We are on a mission to democratize education as we believe that
everyone has the right to quality education.

To improve employability, we create courses in collaboration with top universities and MNCs
such as IIT Madras, University of Essex, University of Liverpool, IIT Roorkee, IIT Guwahati,
IBM, and Microsoft.

Our courses are delivered by SMEs, and our pedagogy enables quick learning of difficult
topics. 24/7 technical support and career services help learners jump-start their careers.

About MIT and MIT IDSS

The Institute for Data, Systems, and Society (IDSS) is a cross-disciplinary unit made up of
faculty from across the Massachusetts Institute of Technology (MIT). IDSS advances
education and research in cutting-edge data analysis, statistics, and machine learning,
and applies these tools in collaboration with social scientists, community stakeholders, and
policy makers to address complex societal challenges across diverse sectors.

On the completion of this program, you will:

Receive an industry-recognized certification in Data Engineering from Intellipaat

Receive a course completion certification by MITxMicroMasters on the completion of the modules by MIT

Get your dream job in just 6 months from the completion of the program



1.2 Million Learners & 200+ Corporates across 155+ countries upskilling on Intellipaat Platform

Contact Us
INDIA

AMR Tech Park 3, Ground Floor, Tower B, Hongasandra Village,
Bommanahalli, Hosur Road, Bangalore, Karnataka 560068, India

Phone No: +91-7022374614


UK

Flat 16 Bluepoint Court, 203 Station Road, Harrow,

Middlesex HA1 2TS, UK


USA

1219 E. Hillsdale Blvd. Suite 205, Foster City, CA 94404

Phone No: 1-800-216-8930

[email protected]

www.intellipaat.com
