MapReduce Architecture
Components of MapReduce Architecture:
1. Client: The MapReduce client submits the job to MapReduce for
processing. Multiple clients may continuously submit jobs for
processing to the Hadoop MapReduce Master.
2. Job: The MapReduce job is the actual work the client wants to
perform. It is composed of many smaller tasks that the client
wants to process or execute.
3. Hadoop MapReduce Master: It divides the job into multiple
smaller job-parts.
4. Job-Parts: The tasks or sub-jobs obtained by dividing the main
job. The results of all the job-parts are combined to produce
the final output.
5. Input Data: The data set fed to MapReduce for processing.
6. Output Data: The final result obtained after processing.
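The split-and-combine flow above can be sketched in plain Python. This is an illustrative simulation, not the Hadoop API; the function names (split_job, run_part, run_job) are hypothetical, chosen only to mirror the roles of the master and the job-parts.

```python
def split_job(input_data, num_parts):
    """The master divides the input data set into roughly equal job-parts."""
    size = max(1, len(input_data) // num_parts)
    return [input_data[i:i + size] for i in range(0, len(input_data), size)]

def run_part(part):
    """Process one job-part; here we simply count words in that part."""
    return sum(len(line.split()) for line in part)

def run_job(input_data, num_parts=3):
    parts = split_job(input_data, num_parts)
    # The results of all the job-parts are combined into the final output.
    return sum(run_part(p) for p in parts)

lines = ["hello world", "map reduce", "hadoop processes big data"]
print(run_job(lines))  # prints 8, the total word count across all job-parts
```

In real Hadoop the job-parts run in parallel on different nodes; the sequential loop here only illustrates the divide-then-combine structure.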
A MapReduce task is mainly divided into two phases: the Map phase
and the Reduce phase.
1. Map: As the name suggests, its main use is to map the input
data into key-value pairs. The input to a map task is itself a
key-value pair, where the key may be an identifier such as an
address and the value is the actual data it holds. The Map()
function is executed on each of these input key-value pairs and
generates intermediate key-value pairs, which serve as the input
for the Reducer or Reduce() function.
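The Map step described above can be sketched in plain Python. This is a minimal word-count illustration, not the Hadoop API; map_fn and reduce_pairs are hypothetical names standing in for the Map() and Reduce() functions.

```python
from collections import defaultdict

def map_fn(record):
    """Map(): emit (word, 1) intermediate key-value pairs for one input line."""
    for word in record.split():
        yield (word, 1)

def reduce_pairs(pairs):
    """Reduce(): sum the values of each intermediate key into the final output."""
    grouped = defaultdict(int)
    for key, value in pairs:
        grouped[key] += value
    return dict(grouped)

records = ["deer bear river", "car car river"]
# Intermediate key-value pairs produced by the Map phase...
intermediate = [pair for r in records for pair in map_fn(r)]
# ...become the input for the Reduce phase.
print(reduce_pairs(intermediate))  # prints {'deer': 1, 'bear': 1, 'river': 2, 'car': 2}
```

In Hadoop the framework also shuffles and sorts the intermediate pairs by key between the two phases; that step is folded into the grouping inside reduce_pairs here.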
How the Job Tracker and the Task Tracker deal with MapReduce: