
Data Engineer - Bootcamp Brochure

This document summarizes a 3-4 month intensive bootcamp for data engineering. By the end of the bootcamp, students will be able to build data engineering pipelines, analyze and extract insights from structured and unstructured data using tools like SQL, Pandas, Kafka and Spark, and store and process large datasets on AWS. The bootcamp covers in-demand skills like data scraping, cleaning, visualization, and streaming. It is suitable for people of all backgrounds interested in a fast-growing career in data engineering. The curriculum is divided into 10 milestones covering topics such as Linux, Python, databases, MongoDB, data warehousing, Hadoop, Spark, Kafka, AWS, and Big Data security.

Uploaded by roopini8819

Data Engineer Bootcamp
A 3-4 Month Intensive Bootcamp with Dedicated Career Assistance
By the End of This Bootcamp, You’ll Be Able To:

• Build Data Engineering pipelines
• Scrape data
• Analyze data and get meaningful insights
• Extract data from databases using SQL queries
• Clean, reshape and transform datasets using Pandas
• Retrieve, manipulate, and explore unstructured data
• Import and work with data from streaming APIs
• Implement parallel computing techniques on larger datasets
• Build compelling visualizations with the seaborn package
• Schedule Big Data jobs in Oozie
• Master development of Spark applications
• Process streams with Kafka
• Secure Kafka
• Store and process large datasets on AWS
• Secure Big Data applications on AWS and combat Big Data threats

Master In-Demand Tools and Technologies


Join the Data Engineering Bandwagon
And become a part of the trending Big Data Market

• US$ 274 Billion: Size of Big Data and Analytics market (Source: Harvard Business Review)
• 97.2%: Organizations investing in Big Data and AI (Source: NewVantage Partners Survey)
• #6: U.S. Jobs on the Rise Report (2021) (Source: LinkedIn)

Did You Know?

• 2.7 Million: Job listings for Data Science and Analytics (2022) (Source: Forbes)
• $112,493: Average Salary for Data Engineers in the U.S. (Source: Glassdoor)

Data Engineering is one of the fastest-growing tech occupations, and you can tap into this
lucrative market. (Dice’s 2020 Tech Job Report)

Fastest-growing tech occupations (year-over-year growth)

• Data Engineer: 50%
• Back End Developer: 38%
• Senior Data Scientist: 32%
• CRM Developer: 29%
• UI Developer: 24%
• Python Developer: 22%
• Android Developer: 22%
• DevOps Engineer: 21%
• Front End Developer: 21%
• Cloud Architect: 16%


Bootcamp Prerequisites
There are no prerequisites for attending this Bootcamp. If you’re passionate
about learning, growing, and thriving as a Data Engineer, then this intensive
Bootcamp is the right option for you.

Who is This Bootcamp for?


This Bootcamp is open to tech and non-tech graduates and professionals from
any background who want to get certified as a Data Engineer.

• Research and Analytics Professionals
• Banking and Finance Professionals
• Marketing Professionals
• Supply Chain Executives
• Business Analysts
• Students interested in a career in Data Engineering
• Developers looking to transition into Data Engineering
• Anyone working in the IT industry
• Novices who want to become Data Engineers

Bootcamp Curriculum
Milestone 1: Basecamp
Build your foundation from the ground up and gain critical conceptual and practical skills
with the Basecamp. Get set to embrace the technologies and workflows that follow.

Linux Essentials
• Introduction
• Linux Command Line
• Files & Directories
• Creating and Editing Files
• User, Group and Permissions
• Other Essential Features
• Processes in Linux
• Networking in Linux
• Shell Scripts

Milestone 2: Essentials of Python for Data Analysis


Get started on your Python learning journey through in-depth theory and practical lessons.
Master the Python language across the basic, intermediate, and advanced levels used for
Data Analysis.

Python for Data Engineering
• Introduction to Python
• Code and Data
• Building Blocks
• Strings
• Data Structures
• Flow Control
• Functions
• Modules
• Files
• NumPy
• Pandas
• Regular Expressions
• Visualization

Milestone 3: Deep Dive – Relational Databases and SQL
Build a strong understanding of the design and architecture of Relational Databases and
SQL. Go beyond the basics and learn how to use Excel for Data Extraction and
Performance Analysis.

Relational Databases and SQL
• Introduction to Relational Database
• Architecture of Relational Database
• Important Aspects of Relational Databases
• Database Structure and Design
• Database Design
• Data Modelling Methodologies
• SQL Components
• Transaction and Concurrency
• Database Joins and Performance Tuning
• Backup and Recovery
• On-Prem vs Cloud Databases
• What is SQL and Why is it Important?
• SQL Database Admin Commands
• The Basics of SQL
• Filtering Data Using WHERE Clause in SQL
• Aggregation and Summary Functions in SQL
• Miscellaneous Analysis in SQL
• Table to Table Relationship in SQL
• Combining Tables
• Advanced SQL Data Analysis
• Making Efficient Analysis

Milestone 4: NoSQL – MongoDB
Gain complete, end-to-end knowledge of MongoDB concepts, ranging from CRUD
operations to MongoDB on the Cloud.

MongoDB
• Introduction to MongoDB
• MongoDB Fundamentals
• CRUD Operations
• Schema Design and Modelling
• Advanced Operations
• Replication and Sharding
• Administration and Security
• MongoDB with other Applications
• MongoDB in the Cloud

Milestone 5: Data Warehousing


Learn about Data Warehousing - the latest Data Storage trend implemented by leading IT
professionals around the globe. Learn how to integrate Data while understanding the
applications of Data Warehousing, its challenges, and its future.

Data Warehousing
• Concept of a Data Warehouse
• The Different Implementation Methods of the Data Warehouse
• Data Integration
• Data Model for a Data Warehouse
• Designing Dimension Models
• Managing History in Data Warehouse
• Ecosystem for Data Warehouse
• Business Intelligence
• Industry Examples

Milestone 6: Big Data Processing using Hadoop
Master the concepts of distributed processing frameworks like Hadoop MapReduce and
Spark, along with tools like Hive, Impala, Pig, Sqoop, Flume, and Oozie. Learn the
nuts and bolts of NoSQL databases like HBase that are widely used to store distributed
data in Hadoop.

Big Data Processing using Hadoop
• Introduction to Big Data and Hadoop
• Hadoop Distributed File System – HDFS and YARN
• MapReduce Processing in Hadoop
• Data Ingestion into and Egress from Hadoop
• Data Processing in Hadoop
• NoSQL and HBase
• Apache Oozie
• Introduction to Spark
• Hadoop Cloud on Amazon/Elastic MapReduce

Milestone 7: Streaming Big Data with Spark


Build a robust foundation of Spark programming in the RDD, DataFrame and Spark SQL
APIs. Gain a comprehensive overview of Spark Streaming and Structured Streaming with
Spark using Python, including integration with Apache Kafka, Amazon Kinesis and more.

Streaming Big Data with Spark
• The Spark Runtime
• ETL with Spark
• NLP SparkSQL and DataFrames
• Introduction to Stream Processing with Spark
• Stateful Processing with Spark Streaming
• Sliding Window Operations with Spark Streaming
• Introduction to Structured Streaming
• Introduction to Apache Kafka
• Kafka Integration with Spark Streaming
• Kafka Integration with Structured Streaming
• Using Spark Streaming with Kinesis
• Additional Spark Streaming Integrations

Milestone 8: Apache Kafka
Understand the ‘where’ and ‘how’ of Kafka and how it fits in the Big Data space, then
learn about the Kafka architecture. You’ll learn about the Kafka Cluster, its components,
and how to configure clusters.

Apache Kafka
• Why Kafka?
• First Steps with Kafka
• Kafka as a Distributed Log
• Reliability and Performance
• Setting up a Development Project
• Producing Messages with Kafka
• Consuming Messages
• Improving the Reliability and Performance of Our Clients
• What is Kafka Connect?
• Kafka Streams
• Stateless Stream Processing
• Stateful Stream Processing
• Securing Apache Kafka
• Real-world Use Cases of Apache Kafka

Milestone 9: AWS in Big Data


Build your own state-of-the-art Big Data solutions by getting an overview of all the different
services that can be leveraged within AWS. Learn how to store and process Big Data
on AWS.

AWS in Big Data
• Big Data and AWS
• Data Collection, Catalogs, and Preparation
• Storing Large Data on AWS
• Processing Your Data on AWS
• Advanced Topics on Big Data

Milestone 10: Big Data Security
Address security and privacy from a Big Data Analytics and contemporary Data Architecture
perspective. Learn about the regulations, standards, challenges, and solutions that
come into play when protecting data.

Big Data Security
• Introduction / Overview
• Data Security / Privacy Standards and Regulations
• Threat Sources and Types
• Security Concepts
• Data Understanding and Governance
• Big Data Pipeline
• Big Data Storage
• Big Data End User Access
• Using Big Data to Combat Big Data Threats
• Big Data Security and Privacy Implementation

See What Our Learners Are Saying

• 4.8 (3,851 Reviews)
• 4.9 (1,850 Reviews)
• 4.7 (4,303 Reviews)
• 4.9 (220 Reviews)

There are very few Data Science bootcamps available out there that meet the gold standard.
To my pleasant surprise, KnowledgeHut turned out to be a great steppingstone in my career!
The hands-on tools and tech that you’ll learn in this bootcamp are impressive. Thanks to this
bootcamp I’ve greater clarity when it comes to data visualization, analysis, and leveraging my
data skills to meet my business goals.

- Brett Weaver
Data Mining Engineer

Being a working mom, it was very difficult for me to juggle a full-time job, upskilling,
and my family. But thanks to KnowledgeHut’s flexible learning modes, I could get certified.
The instructors were very patient, and the weekly doubt-clearing sessions resolved all my queries.
I recommend this course to every learning enthusiast.

- Alivia Wilkins
Application Developer

Being from a Statistics background, I’m very particular about accuracy of data. KnowledgeHut’s Data
Science bootcamp helped me to refine my methodologies and taught me the latest in-trend
practices. I can’t wait to apply my learnings in the real world.

- Harrison Houston
Statistician

Even though it’s an advanced course, I opted for this one because I have successfully cleared all the
beginner levels and from KnowledgeHut itself. It’s the best Data Science bootcamp for learners who
are self-motivated and want to acquire the latest Data Science toolkit. Thanks to the whole team for
such an enriching experience!

- Alexia Bernard
Data Scientist

KnowledgeHut is a global ed-tech company, equipping the world’s workforce with the
skills of the future via immersive learning. A trusted skills transformation partner to
4,500+ organizations across 100+ countries, KnowledgeHut is the skills solutions provider
that organizations and individuals count on to innovate faster and create progress.

• 350,000+ Professionals Trained
• 250+ Workshops Every Month
• 100+ Countries and Counting

US (Headquarters)
+1-469-442-0620
[email protected]

India
+91-80-41520045, Toll-Free: 1800-121-9232
[email protected]

Canada
+1-613-707-0763
[email protected]

UK
+44-2033320846
[email protected]

New Zealand
+64-36694791
[email protected]

Singapore
+65-315-83941
[email protected]

Australia
+61-290995641
[email protected]

Malaysia
+601548770914
[email protected]

UAE
Toll Free 8000180860
[email protected]

Ready to skill up?

[email protected] www.knowledgehut.com
