Post Graduate Program in Data Engineering

The document summarizes a Post Graduate Program in Data Engineering offered in partnership between Simplilearn, Purdue University, and IBM. The 180-hour blended learning program provides 14 hands-on projects and a capstone project to help learners master skills in Hadoop, Spark, Scala, AWS, Azure, and data engineering. It is designed for graduates and professionals from diverse backgrounds seeking careers in data engineering. Upon completing the program, learners will receive a joint certification from Purdue and Simplilearn.


Post Graduate Program

in Data Engineering
In collaboration with IBM

1 | www.simplilearn.com
Table of Contents

About the Program 03

Key Features of the Post Graduate Program in Data Engineering 04

About the Post Graduate Program in Data Engineering in Partnership with Purdue University 05

About Simplilearn 05

Program Eligibility Criteria and Application Process 06

Learning Path Visualization 08

Program Outcomes 09

Who Should Enroll in this Program 10

Courses

Step 1 - Big Data for Data Engineering 11

Step 2 - Data Engineering with Hadoop 12

Step 3 - Data Engineering with Scala 13

Step 4 - Big Data Hadoop and Spark Developer 14

Step 5 - AWS Tech Essentials 16

Step 6 - Big Data on AWS 17

Step 7 - Azure Fundamentals 18

Step 8 - Azure Data Engineer 19

Step 9 - Data Engineering Capstone 20

Electives 21

Certification 24

Advisory Board Members 25

About the Program

Accelerate your career with this acclaimed Post Graduate Program in Data Engineering, offered in partnership with Purdue University and in collaboration with IBM. The program provides a perfect mix of theory, case studies, and extensive hands-on practice. Learners will receive a comprehensive data engineering education, leveraging Purdue's academic excellence in data engineering and Simplilearn's partnership with IBM.

This post graduate program is designed to give experienced professionals from diverse backgrounds an extensive data engineering education through a blend of online self-paced videos, live virtual classes, hands-on projects, and labs. Learners will also get access to mentorship sessions that provide a high-engagement learning experience and real-world applications to help them master essential data engineering skills. You'll learn to implement data engineering concepts such as distributed processing using the Hadoop framework, large-scale data processing using Spark, building data pipelines with Kafka, and working with databases on both on-premises and cloud (AWS and Azure) infrastructures.

Key Features of the Post Graduate
Program in Data Engineering in
Partnership with Purdue University

Purdue Post Graduate Program Certification

180+ hours of Blended Learning

14+ hands-on projects

Masterclasses from Purdue faculty

Purdue Alumni Association membership

Capstone project in 3 domains

Curriculum aligned with Microsoft Azure (DP-200), AWS (DAS-C01), and Cloudera (CCA175) certifications

About the Post Graduate Program in
Data Engineering in Partnership with
Purdue University
Purdue University, a top public research institution, offers higher education at its highest proven value. Committed to student success, Purdue is changing the student experience with a greater focus on faculty-student interaction and creative use of technology.

This Post Graduate Program in Data Engineering in partnership with Purdue University will open pathways for you in the data engineering field, which has a presence in all industry sectors and verticals, including manufacturing, healthcare, and ecommerce.

Upon completing this program, you will receive a joint Purdue-Simplilearn certification of completion.

About Simplilearn
Simplilearn is a leader in digital skills training, focused on the emerging technologies
that are transforming our world. Our unique Blended Learning approach drives
learner engagement and is backed by the industry’s highest course completion rates.
Partnering with professionals and companies, we identify their unique needs and
provide outcome-centric solutions to help them achieve their professional goals.

Program Eligibility Criteria and
Application Process
Those wishing to enroll in this Post Graduate Program in Data Engineering in
partnership with Purdue University will be required to apply for admission.

Eligibility Criteria
For admission to this Post Graduate Program in Data Engineering,
candidates should have:
A bachelor’s degree with an average of 50% or higher marks
2+ years of work experience (preferred)
Basic understanding of object-oriented programming (preferred)

Application Process
The application process consists of three simple steps. An offer of admission will be made to selected candidates, who can accept it by paying the admission fee.

STEP 1 - SUBMIT APPLICATION

Complete the application and include a brief statement of purpose. The latter informs our admissions counselors why you're interested in and qualified for the program.

STEP 2 - APPLICATION REVIEW

A panel of admissions counselors will review your application and statement of purpose to determine whether you qualify for acceptance.

STEP 3 - ADMISSION

An offer of admission will be made to qualified candidates. You can accept this offer by paying the program fee.

Talk to an Admissions Counselor
We have a team of dedicated admissions counselors who are here to help
guide you in applying to the program. They are available to:

Address questions related to the application

Assist with financial aid (if required)

Help you resolve your questions and understand the program

Learning Path

Big Data for Data Engineering

Data Engineering with Hadoop

Data Engineering with Scala

Big Data Hadoop and Spark Developer

AWS Technical Essentials

Big Data on AWS

Azure Fundamentals

Azure Data Engineer

Data Engineering Capstone

Electives

Python for Data Science
PySpark Training
Apache Kafka
MongoDB Developer and Administrator
GCP Fundamentals
SQL Training
Java Training
Spark for Scala Analytics
Academic master class - Data Engineering - Purdue University

Program Outcomes

Gather data requirements, access data from multiple sources, process data for business needs, and store data on the cloud as well as on-premises

Gain insights into how to improve business productivity by processing Big Data on platforms that can handle its volume, velocity, variety, and veracity

Get a solid understanding of the fundamentals of the Scala language, its tooling, and the development process

Master the various components of the Hadoop ecosystem, such as Hadoop, YARN, MapReduce, Pig, Hive, Impala, HBase, ZooKeeper, Oozie, Sqoop, and Flume

Identify AWS concepts, terminologies, benefits, and deployment options to meet business requirements

Understand how to use Amazon EMR for processing data using Hadoop ecosystem tools

Understand how to use Amazon Kinesis for Big Data processing in real time

Analyze and transform Big Data using Kinesis Streams, and visualize data and perform queries using Amazon QuickSight

Implement data storage solutions; manage and develop data processing; and monitor and optimize data solutions using Azure Cosmos DB, Azure SQL Database, Azure Synapse Analytics, Azure Data Lake Storage, Azure Data Factory, Azure Stream Analytics, Azure Databricks, and Azure Blob Storage services

Apply the knowledge, skills, and capabilities gathered throughout the program to build an industry-ready product
Who Should Enroll in this Program?
This program caters to graduates in any discipline and working professionals from diverse backgrounds who have basic programming knowledge. The diversity of our students adds richness to class discussions and interactions.

A data engineer builds and maintains data structures and architectures for data ingestion, processing, and deployment for large-scale, data-intensive applications. It's a promising career for both new and experienced professionals with a passion for data, including:

IT professionals

Database administrators

Beginners in the data engineering domain

BI developers

Data science professionals who want to expand their skill set

Students in UG/PG programs

STEP 1

Big Data for Data Engineering

The Big Data for Data Engineering course from IBM covers the basic concepts and terminologies of Big Data and its real-life applications across industries. You will gain insights on how to improve business productivity by processing large volumes of data and extracting valuable information from them.

Key Learning Objectives

Understand what Big Data is, sources of Big Data, and real-life examples

Learn the key difference between Big Data and data science
Master the usage of Big Data for operational analysis and better
customer service

Gain knowledge of the ecosystem of Big Data and the Hadoop


framework

Course curriculum
Lesson 01 - What is Big Data

Lesson 02 - Beyond the Hype

Lesson 03 - Big data and Data Science

Lesson 04 - Big Data Use Cases

Lesson 05 - Processing Big Data

STEP 2

Data Engineering with Hadoop

The Data Engineering with Hadoop course by IBM will give you an overview of what Hadoop is and its components, such as MapReduce and the Hadoop Distributed File System (HDFS). Additionally, this course will teach you to explore large data sets and use Hadoop's method of distributed processing.

Key Learning Objectives

Understand Hadoop's architecture and primary components, such as MapReduce and HDFS

Add and remove nodes from Hadoop clusters, check the available disk space on each node, and modify configuration parameters
Learn about Apache projects that are part of the Hadoop ecosystem,
including Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, and Flume

Course curriculum
Lesson 01 - Introduction to Hadoop

Lesson 02 - Hadoop Architecture and HDFS

Lesson 03 - Hadoop Administration

Lesson 04 - Hadoop Components

STEP 3

Data Engineering with Scala

The Data Engineering with Scala course, carefully crafted by IBM, focuses on Scala programming. You will learn to write Scala code, perform Big Data analysis using Scala, and create your own Scala projects.

Key Learning Objectives

Create your own Scala project

Understand basic object-oriented programming methodologies in Scala

Work with data in Scala, including pattern matching, applying synthetic methods, and handling options, failures, and futures

Course curriculum
Lesson 01 - Introduction

Lesson 02 - Basic Object Oriented Programming

Lesson 03 - Case Objects and Classes

Lesson 04 - Collections

Lesson 05 - Idiomatic Scala

STEP 4

Big Data Hadoop and Spark Developer

This Big Data Hadoop and Spark Developer course helps you master the concepts of the Hadoop framework, Big Data, and Hadoop ecosystem tools such as HDFS, YARN, MapReduce, Hive, Impala, Pig, HBase, Spark, Flume, and Sqoop, along with additional concepts of the Big Data processing life cycle. This course is aligned with Cloudera's CCA175 Big Data certification.

Key Learning Objectives

Learn how to navigate the Hadoop ecosystem and understand how to optimize its use

Ingest data using Sqoop, Flume, and Kafka

Implement partitioning, bucketing, and indexing in Hive

Work with RDDs in Apache Spark

Process real-time streaming data

Perform DataFrame operations in Spark using SQL queries

Implement User-Defined Functions (UDF) and User-Defined Aggregate Functions (UDAF) in Spark

Course curriculum
Lesson 01 - Course Introduction

Lesson 02 - Introduction to Big Data and Hadoop

Lesson 03 - Hadoop Architecture, Distributed Storage


(HDFS), and YARN

Lesson 04 - Data Ingestion into Big Data Systems and ETL

Lesson 05 - Distributed Processing - MapReduce Framework and Pig

Lesson 06 - Apache Hive

Lesson 07 - NoSQL Databases - HBase

Lesson 08 - Basics of Functional Programming and Scala

Lesson 09 - Apache Spark Next Generation Big Data Framework

Lesson 10 - Spark Core - Processing RDD

Lesson 11 - Spark SQL - Processing DataFrames

Lesson 12 - Spark MLlib - Modeling Big Data with Spark

Lesson 13 - Stream Processing Frameworks and Spark Streaming

Lesson 14 - Spark GraphX

STEP 5

AWS Tech Essentials

The AWS Technical Essentials course teaches you how to navigate the AWS Management Console, understand AWS security measures, storage, and database options, and gain expertise in web services like RDS and EBS. This course helps you become proficient in identifying and efficiently using AWS services.

Key Learning Objectives

Understand the fundamental concepts of the AWS platform and cloud computing

Identify AWS concepts, terminologies, benefits, and deployment options to meet business requirements
Identify deployment and network options in AWS

Course curriculum
Lesson 01 - Introduction to Cloud Computing

Lesson 02 - Introduction to AWS

Lesson 03 - Storage and Content Delivery

Lesson 04 - Compute Services and Networking

Lesson 05 - AWS Managed Services and Databases

Lesson 06 - Deployment and Management

STEP 6

Big Data on AWS

The Big Data on AWS course helps you understand the Amazon Web Services cloud platform, Kinesis Analytics, AWS Big Data storage, processing, analysis, visualization, and security services, EMR, AWS Lambda, Glue, and machine learning algorithms.

Key Learning Objectives

Understand how to use Amazon EMR for processing data using Hadoop ecosystem tools

Understand how to use Amazon Kinesis for Big Data processing in real time, and analyze and transform Big Data using Kinesis Streams

Visualize data and perform queries using Amazon QuickSight

Course curriculum
Lesson 01 - Big Data on AWS Certification Course Overview

Lesson 02 - Big Data on AWS Introduction

Lesson 03 - AWS Big Data Collection Services

Lesson 04 - AWS Big Data Storage Services

Lesson 05 - AWS Big Data Processing Services

Lesson 06 - Analysis

Lesson 07 - Visualization

Lesson 08 - Security

STEP 7

Azure Fundamentals

The Azure Fundamentals course covers the main principles of cloud computing and how they have been implemented in Microsoft Azure. You will work on the concepts of Azure services, security, privacy, compliance, trust, pricing, and support, and learn how to create the most common Azure services, including virtual machines, web apps, SQL databases, features of Azure Active Directory, and methods of integrating it with on-premises Active Directory.

Key Learning Objectives

Describe Azure storage and create Azure web apps

Deploy databases in Azure

Understand Azure AD, cloud computing, Azure, and Azure subscriptions

Create and configure VMs in Microsoft Azure

Course curriculum
Lesson 01 - Cloud Concepts

Lesson 02 - Core Azure Services

Lesson 03 - Security, Privacy, Compliance, and Trust

Lesson 04 - Azure Pricing and Support

STEP 8

Azure Data Engineer

The Azure Data Engineer course focuses on data-related implementation, including provisioning data storage services, ingesting streaming and batch data, transforming data, implementing security requirements, implementing data retention policies, identifying performance bottlenecks, and accessing external data sources.

Key Learning Objectives

Implement data storage solutions using Azure Cosmos DB, Azure SQL Database, Azure Synapse Analytics, Azure Data Lake Storage, Azure Data Factory, Azure Stream Analytics, Azure Databricks, and Azure Blob Storage services

Develop batch processing and streaming solutions

Monitor data storage and data processing

Optimize Azure data solutions

Course curriculum
Implement data storage solutions

Manage and develop data processing

Monitor and Optimize Data Solutions

STEP 9

Data Engineering Capstone

The data engineering capstone project will give you an opportunity to implement the skills you learned throughout this program. Through dedicated mentoring sessions, you'll learn how to solve a real-world, industry-aligned data engineering problem, from setting up configuration, ETL, data streaming, and data analysis to data visualization. This project is the final step in the learning path and will enable you to showcase your expertise in data engineering to future employers.

You can choose to work on projects that cover the most relevant domains (ecommerce, BFSI, video sharing) to make your practice more relevant.

Elective Courses

Python for Data Science


The Python for Data Science course, carefully crafted by IBM, helps you understand how to integrate Python using the PySpark interface. This course enables you to write your own Python scripts, perform fundamental hands-on data analysis using the Jupyter-based lab environment, and create your own data science projects using IBM Watson.

PySpark
This PySpark course provides an overview of Apache
Spark, the open-source query engine for processing
large datasets, and how to integrate it with Python
using the PySpark interface. You will learn to build
and implement data-intensive applications as you
dive into the world of high-performance machine
learning and leverage Spark RDD, Spark SQL, Spark
MLlib, Spark Streaming, HDFS, Sqoop, Flume, Spark
GraphX, and Kafka.

Apache Kafka
In this Apache Kafka course, you will learn the
architecture, installation, configuration, and
interfaces of Kafka open-source messaging. You
will gain a fair understanding of the basics of Apache
ZooKeeper as a centralized service and develop the
skills to deploy Kafka for real-time messaging.

MongoDB Developer and Administrator
This course provides you with in-depth knowledge of
NoSQL, data modeling, ingestion, querying, sharding,
and data replication.

GCP Fundamentals
In the GCP Fundamentals course, you will learn how
to analyze and deploy infrastructure components
such as networks, storage systems, and application
services in Google Cloud Platform. This course
covers IAM, networking, and cloud storage and
introduces you to the flexible infrastructure and
platform services provided by Google Cloud
Platform.

SQL Training
This SQL course covers all the information you need
to successfully start working with SQL databases
and make use of the database in your applications.
You will learn to structure your database, author
efficient SQL statements and clauses, and manage
your SQL database for scalable growth.

Java Training
This Java course covers the concepts of Java, from
introductory techniques to advanced programming
skills, and provides you with knowledge of Core
Java 8, operators, arrays, loops, methods, and
constructors, along with JDBC and the JUnit framework.

Spark for Scala Analytics

This Spark for Scala Analytics course provides you with an overview of the history of Apache Spark, how it evolved, how to build applications with Spark, RDDs and DataFrames, and Spark's associated ecosystem. It will teach you to leverage the core RDD and DataFrame APIs to perform analytics on datasets with Scala.

Academic Masterclass - Data Engineering - Purdue University
Attend an online interactive Masterclass and get insights into the world of
data engineering.

Certification

Upon completion of this Post Graduate Program in Data Engineering in partnership with Purdue University, you will receive the Post Graduate Program Certification from Purdue University. You will also receive certificates from Simplilearn for the courses in the learning path. These certificates will testify to your skills as a data engineering expert.

Advisory Board Members

Aly El Gamal
Assistant Professor, Purdue University

Aly El Gamal has a Ph.D. in Electrical


and Computer Engineering and an M.S. in
Mathematics from the University of Illinois. Dr.
El Gamal specializes in the areas of information
theory and machine learning and has received
multiple commendations for his research and
teaching expertise.

Ronald Van Loon


Big Data Expert, Director - Adversitement

Named by Onalytica as one of the three most


influential people in Big Data, Ronald van Loon
is an author for a number of leading Big Data
and Data Science websites, including Datafloq,
Data Science Central, and The Guardian. He is
also a renowned speaker at industry events.

USA

Simplilearn Americas, Inc.


201 Spear Street, Suite 1100, San Francisco, CA 94105
United States
Phone No: +1-844-532-7688

INDIA

Simplilearn Solutions Pvt Ltd.


# 53/1 C, Manoj Arcade, 24th Main, Harlkunte
2nd Sector, HSR Layout
Bangalore - 560102
Call us at: 1800-212-7688

www.simplilearn.com

Disclaimer: All programs are offered on a non-credit basis and are not transferable to a degree.

