MapReduce Workflows
01 Job Submission Phase
02 Job Initialization Phase
03 Task Assignment Phase
04 Task Execution Phase
05 Progress Update Phase
06 Failure Recovery
• To run an MR program, Hadoop uses the command ‘yarn jar client.jar job-class HDFS-input HDFS-output-directory’, where yarn is the utility and jar is the command.
• The client.jar file and the job class are written by the developer.
• When we execute this command on the terminal, YARN initiates a set of actions.
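To make the shape of the submission command concrete, here is a minimal sketch that assembles its argument list in Python. The jar name, class name, and HDFS paths are hypothetical placeholders; only the ordering (yarn, jar, the jar file, the job class, input, output) comes from the command above.

```python
import shlex


def build_submit_command(jar, job_class, hdfs_input, hdfs_output):
    """Return the argument list for: yarn jar <jar> <job-class> <input> <output>."""
    return ["yarn", "jar", jar, job_class, hdfs_input, hdfs_output]


# Hypothetical values, for illustration only.
cmd = build_submit_command(
    "client.jar",
    "com.example.WordCount",
    "/user/demo/input",
    "/user/demo/output",
)

# Show the command as it would be typed on the terminal.
print(shlex.join(cmd))
```

On a machine where Hadoop is installed, the same list could be handed to subprocess.run to submit the job; the sketch only prints it.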
Steps Hadoop takes to run an MR job
1. CLIENT - submits the MapReduce job.
2. JOB TRACKER - a Java application that coordinates the job run; its main class is JobTracker.
3. TASK TRACKER - a Java application that runs the tasks the job has been split into; its main class is TaskTracker.
4. DISTRIBUTED FILE SYSTEM - used for sharing job files between the other entities.
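The roles of the four entities can be illustrated with a toy, in-process simulation. This is only a sketch under stated assumptions: the Python classes and method names below are hypothetical stand-ins (not the Hadoop API), the distributed file system is replaced by a plain list of input splits, and tasks are assigned round-robin.

```python
from collections import defaultdict
from itertools import cycle


class TaskTracker:
    """Toy task tracker: runs the map tasks it is handed (word count)."""

    def __init__(self, name):
        self.name = name

    def run_map_task(self, split):
        # Emit a (word, 1) pair for every word in the split.
        return [(word, 1) for word in split.split()]


class JobTracker:
    """Toy job tracker: splits the job into tasks and coordinates the run."""

    def __init__(self, trackers):
        self.trackers = trackers

    def run_job(self, splits):
        assignments = {}
        intermediate = []
        # One map task per split. Round-robin assignment is a simplification;
        # a real scheduler also weighs data locality and free task slots.
        for task_id, (split, tracker) in enumerate(zip(splits, cycle(self.trackers))):
            assignments[task_id] = tracker.name
            intermediate.extend(tracker.run_map_task(split))
        # Reduce step: sum the counts for each word.
        counts = defaultdict(int)
        for word, one in intermediate:
            counts[word] += one
        return dict(counts), assignments


# The client: submits a job over three input splits.
splits = ["big data big", "data node", "big node"]
job_tracker = JobTracker([TaskTracker("tt-1"), TaskTracker("tt-2")])
counts, assignments = job_tracker.run_job(splits)

print(counts)       # {'big': 3, 'data': 2, 'node': 2}
print(assignments)  # {0: 'tt-1', 1: 'tt-2', 2: 'tt-1'}
```

Everything here runs in a single process; in a real cluster the job tracker and task trackers are separate Java processes on different machines.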
JOB RUN - CLASSIC MAP REDUCE
How job reduce is carried out in
layer.
HADOOP YARN
Resource Manager.
• The Application Manager negotiates containers
from the Resource Manager.
Application workflow in Hadoop YARN
5. The Application Manager notifies the Node Manager to launch containers.
6. Application code is executed in the container.
7. The client contacts the Resource Manager/Application Manager to monitor the application’s status.
8. Once the processing is complete, the Application Manager un-registers with the Resource Manager.
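The later workflow steps (container negotiation, launch, execution, monitoring, un-registration) can be sketched as a small simulation. All class and method names are hypothetical and do not reflect the YARN client API; note that the Apache Hadoop documentation calls the per-application component the ApplicationMaster.

```python
class ResourceManager:
    """Toy resource manager: tracks free containers and registered apps."""

    def __init__(self, free_containers):
        self.free_containers = free_containers
        self.registered = set()

    def register(self, app_id):
        self.registered.add(app_id)

    def allocate(self, requested):
        # Grant as many containers as are free, up to the request.
        granted = min(requested, self.free_containers)
        self.free_containers -= granted
        return granted

    def release(self, count):
        self.free_containers += count

    def unregister(self, app_id):
        self.registered.discard(app_id)


class NodeManager:
    """Toy node manager: launches a container and runs the task in it."""

    def launch(self, task):
        return task()  # Step 6: application code executes in the container.


class ApplicationManager:
    def __init__(self, app_id, rm, nm):
        self.app_id, self.rm, self.nm = app_id, rm, nm
        self.status = "NEW"

    def run(self, tasks):
        self.rm.register(self.app_id)
        self.status = "RUNNING"
        pending, results = list(tasks), []
        while pending:
            granted = self.rm.allocate(len(pending))  # negotiate containers
            if granted == 0:
                raise RuntimeError("no containers available")
            batch, pending = pending[:granted], pending[granted:]
            results.extend(self.nm.launch(t) for t in batch)  # Step 5
            self.rm.release(granted)
        self.rm.unregister(self.app_id)  # Step 8
        self.status = "FINISHED"
        return results


rm = ResourceManager(free_containers=2)
am = ApplicationManager("app-1", rm, NodeManager())
results = am.run([lambda: 1, lambda: 4, lambda: 9])

# Step 7: the client polls the status.
print(am.status, results)
```

With two free containers and three tasks, the sketch runs the tasks in two waves and returns the containers to the pool after each wave.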
FAILURE CASES IN YARN