Bigdata Engineer Complete Syllabus: Presented by
Topic 5 – Spark (RDD, DF, SQL & ML)
Topic 12 – Statistics Fundamentals
NumPy, Pandas
Topic - 2
Hadoop Introduction
❑ Brief Introduction to Big Data
❑ Why Bigdata needed now?
❑ 4V’s of Bigdata
❑ Google Concepts
❑ History of Databases
❑ Hadoop VS Google Architecture
❑ History of Hadoop & Ecosystems
❑ Hadoop 1x vs 2x vs 3x
❑ Hadoop Layers
❑ Hadoop Daemons:
❑ Name Node
❑ Secondary Name Node
❑ Data Node
❑ Resource Manager / Job Tracker
❑ Node Manager / Task Tracker
❑ Heart Beat
❑ Block Report
❑ High Availability (HA)
❑ Replication versus Erasure Encoding
❑ Block size
❑ Input Split
❑ Special File format
Topic - 2
BigData Hadoop & YARN (cont..)
❑ Hadoop Commands Hands-on
❑ Application Master
❑ Container
❑ YARN
❑ Hadoop Job Opportunities
Topic - 3
Hive Introduction
❑ Introduction on Hive
❑ Hive Architecture
❑ Hive Meta Store
❑ Hive Server 1 vs 2
❑ Beeline VS Hive CLI
❑ Partitioning
❑ Static Partition
❑ Dynamic Partition
❑ Bucketing
❑ SerDe
❑ Hive Joins
❑ Create a DataFrame using Hive Tables
❑ Create a DataFrame using JSON
❑ Sample Project 2
Topic - 5
Apache Spark Introduction (cont..)
❑ Offset
❑ Consumer
❑ Broker
❑ In Sync Replica
❑ Kafka Serialization
❑ Bit shift upstream and Downstream
❑ Sensors
❑ Executors
❑ Data Profiling
❑ Adhoc Queries
❑ Sample Project 6
Topic - 10
Azure Services
❑ Introduction of Azure services
❑ Data Analytics Services
❑ Virtual Machine (VM)
❑ Blob Storage
❑ Sample Project 7
Topic - 11
Databricks
❑ Introduction of Databricks
❑ Integrate with Azure
❑ Create an Automated Cluster
❑ Create an Interactive Cluster
❑ DBFS
❑ Magic functions
❑ Sample Project 8
Topic - 12
Statistics Fundamentals
❑ Mean, Median, Mode
❑ VAR, Std Dev and IQR
❑ Skewness
❑ Kurtosis
❑ Inferential Statistics
❑ Underfitting
❑ Confusion Matrix
❑ Project 9: Build a Model using Python Libraries
❑ Project 10: Build a Model Using Spark ML Libraries
❑ Tips and Tricks for Cloudera Certification for Spark and Hadoop
Developer (CCA 175).