T06 YARN
Hadoop 1 Components
◼ HDFS Daemons
  ◼ NameNode (NN)
  ◼ Secondary NameNode (SNN)
  ◼ DataNode (DN)
◼ MapReduce Daemons
  ◼ JobTracker (JT)
  ◼ TaskTracker (TT)
Hadoop 1: MapReduce
Limitations of Hadoop 1
◼ It is suitable only for batch processing, not for real-time data processing or data streaming.
◼ It scales to at most about 4,000 nodes per cluster.
◼ A single component, the JobTracker, handles many activities: resource management, job scheduling, job monitoring, re-scheduling of jobs, and so on.
◼ It supports only one NameNode and one JobTracker, each of which is a single point of failure.
◼ It runs only MapReduce jobs.
◼ Its resource management is inefficient: each TaskTracker has a fixed number of map slots and reduce slots, so an idle slot of one type cannot run a task of the other type.
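The resource-management limitation can be made concrete with a small calculation. The sketch below is only an illustration under simplifying assumptions: the function names and the node sizes are hypothetical, every task is assumed to need exactly one slot or one equally sized container, and the container-based model stands in for the approach YARN takes in Hadoop 2.

```python
def slot_utilization(map_slots, reduce_slots, map_tasks, reduce_tasks):
    """Hadoop 1 style: slots are typed, so an idle reduce slot cannot run a map task."""
    running = min(map_slots, map_tasks) + min(reduce_slots, reduce_tasks)
    return running / (map_slots + reduce_slots)


def container_utilization(total_capacity, map_tasks, reduce_tasks):
    """Container style: any free container can host any pending task."""
    running = min(total_capacity, map_tasks + reduce_tasks)
    return running / total_capacity


# Hypothetical node with 4 map slots and 4 reduce slots, during the map phase
# of a job (8 map tasks pending, no reduce tasks yet).
print(slot_utilization(4, 4, map_tasks=8, reduce_tasks=0))    # 0.5
print(container_utilization(8, map_tasks=8, reduce_tasks=0))  # 1.0
```

With typed slots, half of the node sits idle during the map phase; with generic containers the same capacity is fully used.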
Hadoop 1 Vs Hadoop 2
Sample Applications That Can Run on YARN
Benefits of YARN
Hadoop 2 components
◼ HDFS
  ◼ DataNode
  ◼ Secondary NameNode
  ◼ Active NameNode
  ◼ Standby (passive) NameNode
  ◼ ZooKeeper (used for automatic NameNode failover)
◼ YARN (Yet Another Resource Negotiator)
  ◼ Resource Manager
    ◼ Applications Manager
    ◼ Scheduler
  ◼ Node Manager
  ◼ Application Master
  ◼ Container
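How the YARN components relate to each other can be sketched as a toy model. This is not the Hadoop API: the class and method names below are hypothetical, memory is the only resource tracked, the scheduler simply picks the first node with enough free memory, and the Applications Manager role (accepting a submission and starting the Application Master) is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    node: str
    memory_mb: int


@dataclass
class NodeManager:
    name: str
    free_mb: int

    def launch(self, memory_mb):
        # Reserve memory on this node and hand back a container.
        self.free_mb -= memory_mb
        return Container(self.name, memory_mb)


@dataclass
class ResourceManager:
    nodes: list

    def allocate(self, memory_mb):
        # Scheduler role: pick the first node with enough free memory.
        for node in self.nodes:
            if node.free_mb >= memory_mb:
                return node.launch(memory_mb)
        return None  # no capacity; a real scheduler would queue the request


@dataclass
class ApplicationMaster:
    rm: ResourceManager
    containers: list = field(default_factory=list)

    def request(self, count, memory_mb):
        # Negotiate containers with the Resource Manager for this application's tasks.
        for _ in range(count):
            container = self.rm.allocate(memory_mb)
            if container is not None:
                self.containers.append(container)
        return self.containers


rm = ResourceManager([NodeManager("node1", 2048), NodeManager("node2", 2048)])
am = ApplicationMaster(rm)
granted = am.request(count=3, memory_mb=1024)
print([c.node for c in granted])  # ['node1', 'node1', 'node2']
```

The point of the split is that the Resource Manager only hands out resources, while each application's own Application Master decides what to run in the containers it receives.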
Components of YARN
Components of Resource Manager
Components of Node Manager
Applications on YARN
Running an Application in YARN
Refs
◼ https://blog.cloudera.com/untangling-apache-hadoop-yarn-part-1-cluster-and-yarn-basics/
Acknowledgment
◼ Some of the figures in these slides were created by Simplilearn and Edureka.