Hadoop
RAO ATIF MAHMOOD
ROLL NO : 85
YARN (Yet Another Resource Negotiator)
• YARN is the resource management layer of the Hadoop ecosystem. It
was introduced in Hadoop 2.0 to take over the resource management and
job scheduling that the MapReduce framework handled itself in Hadoop 1.0.
MapReduce continues to run as one application on top of YARN. YARN
provides a more scalable and flexible way to manage resources and
execute jobs on a Hadoop cluster.
• Components of YARN:
• Resource Manager: The central component that manages cluster-wide resources and
schedules applications.
• Node Manager: Runs on each node in the cluster and manages that node's resources
and the containers running on it.
• Application Master: Responsible for managing the execution of a specific job
(application).
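The division of responsibilities above can be sketched as a toy data model. This is only an illustration, not YARN's API: the class names mirror the three components, while the fields (`host`, `total_mb`, `used_mb`, `app_id`) and the methods are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-node agent: knows only its own capacity and usage."""
    host: str
    total_mb: int
    used_mb: int = 0

    def free_mb(self) -> int:
        return self.total_mb - self.used_mb


@dataclass
class ResourceManager:
    """Cluster-wide view, built from the Node Managers that register."""
    nodes: dict = field(default_factory=dict)

    def register_node(self, nm: NodeManager) -> None:
        self.nodes[nm.host] = nm

    def cluster_free_mb(self) -> int:
        return sum(nm.free_mb() for nm in self.nodes.values())


@dataclass
class ApplicationMaster:
    """One per application: tracks that application's containers only."""
    app_id: str
    containers: list = field(default_factory=list)


rm = ResourceManager()
rm.register_node(NodeManager("node1", total_mb=8192))
rm.register_node(NodeManager("node2", total_mb=4096, used_mb=1024))
print(rm.cluster_free_mb())
```

The point of the sketch is that only the Resource Manager holds a cluster-wide view, each Node Manager is limited to its own node, and each Application Master is limited to its own application.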
Hadoop YARN Architecture
• YARN stands for “Yet Another Resource Negotiator”. It was
introduced in Hadoop 2.0 to remove the Job Tracker bottleneck
that was present in Hadoop 1.0. YARN was described as a
“Redesigned Resource Manager” at the time of its launch, but it
has since evolved into what is often described as a large-scale
distributed operating system for Big Data processing.
• The YARN architecture separates the resource management layer from
the processing layer. The responsibilities that the Job Tracker held in
Hadoop 1.0 are split in YARN between the Resource Manager and a
per-application Application Master.
HADOOP 2.0
YARN also allows different data processing
engines like graph processing, interactive
processing, stream processing as well as batch
processing to run and process data stored in
HDFS (Hadoop Distributed File System) thus
making the system much more efficient.
Through its components, YARN can
dynamically allocate resources and
schedule application processing. For large-volume
data processing, the available resources must be
managed properly so
that every application can make use of them.
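As a rough illustration of allocating shared resources among several engines, the sketch below places memory requests on nodes using a simple first-fit rule. This is an assumption made for the example, not the algorithm used by YARN's actual schedulers, and the function name `allocate`, the node names, and the request sizes are all hypothetical.

```python
def allocate(free_mb, requests):
    """Place each (app, mb) request on the first node with enough free memory.

    free_mb: dict of node name -> free memory in MB (updated in place).
    Returns (placements, pending); pending holds requests that did not fit.
    """
    placements, pending = [], []
    for app, mb in requests:
        for node, free in free_mb.items():
            if free >= mb:
                free_mb[node] = free - mb
                placements.append((app, node, mb))
                break
        else:
            # No node had room; the request waits for resources to free up.
            pending.append((app, mb))
    return placements, pending


free = {"node1": 4096, "node2": 2048}
requests = [("batch", 3072), ("stream", 2048), ("graph", 2048)]
placements, pending = allocate(free, requests)
print(placements)
print(pending)
```

Here the batch and stream requests fit, while the graph request stays pending until memory is released, which is the kind of contention a resource manager has to arbitrate.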
YARN Features:
• Scalability: The scheduler in the Resource Manager allows Hadoop to scale to
clusters of thousands of nodes.
• Compatibility: YARN supports existing MapReduce applications without
disruption, which keeps it compatible with Hadoop 1.0 as well.
• Cluster Utilization: YARN allocates cluster resources dynamically, which
improves cluster utilization.
• Multi-tenancy: It allows multiple processing engines to access the same cluster,
giving organizations the benefit of multi-tenancy.
Application workflow in Hadoop YARN:
1. The client submits an application.
2. The Resource Manager allocates a container to start the
Application Master.
3. The Application Master registers itself with the Resource
Manager.
4. The Application Master negotiates containers from the
Resource Manager.
5. The Application Master notifies the Node Managers to
launch the containers.
6. The application code is executed in the containers.
7. The client contacts the Resource Manager or the Application Master to
monitor the application's status.
8. Once processing is complete, the Application Master
unregisters from the Resource Manager.
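The eight steps above can be traced with a small simulation that records each event in order. It is a toy model under stated assumptions, not the YARN client API: the function `run_application`, the container names, and the abbreviations RM, AM, and NM (Resource Manager, Application Master, Node Manager) in the messages are introduced only for this example.

```python
def run_application(app_id, num_containers):
    """Return the ordered events of one application's life cycle (toy model)."""
    events = []
    events.append(f"client submits {app_id}")                      # step 1
    events.append("RM allocates a container for the AM")           # step 2
    events.append("AM registers with RM")                          # step 3
    granted = [f"container_{i}" for i in range(1, num_containers + 1)]
    events.append(f"AM negotiates {len(granted)} containers")      # step 4
    for container in granted:
        events.append(f"AM asks NM to launch {container}")         # step 5
        events.append(f"{container} runs application code")        # step 6
    events.append("client polls RM/AM for status")                 # step 7
    events.append("AM unregisters from RM")                        # step 8
    return events


events = run_application("app_1", 2)
for line in events:
    print(line)
```

Steps 5 and 6 repeat once per granted container, so the trace grows with the number of containers while the other steps occur once per application.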