Unit 4
• The number of blocks in the input file determines the number of map tasks
in the Hadoop map phase, which can be calculated with the help of the
formula below.
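A commonly quoted rule of thumb, assumed here rather than taken from this unit, is that the number of map tasks is roughly the input size divided by the split size, rounded up, with the split size defaulting to the HDFS block size. The sketch below only illustrates that arithmetic; the function name `estimate_map_tasks` and the file sizes are hypothetical, and the real count depends on the input format and split settings of the job.

```python
def estimate_map_tasks(input_size_bytes: int, split_size_bytes: int) -> int:
    """Estimate map tasks as ceil(input size / split size).

    Assumption: one map task per input split, and the split size equals
    the block size. This is an approximation, not Hadoop's exact logic.
    """
    if input_size_bytes < 0:
        raise ValueError("input size must not be negative")
    if split_size_bytes <= 0:
        raise ValueError("split size must be positive")
    # Integer ceiling division avoids floating-point rounding issues.
    return (input_size_bytes + split_size_bytes - 1) // split_size_bytes


MB = 1024 * 1024

# Hypothetical 1 GB input file with a 128 MB block size.
one_gb_tasks = estimate_map_tasks(1024 * MB, 128 * MB)

# Hypothetical 200 MB input file: one full block plus a partial block.
partial_tasks = estimate_map_tasks(200 * MB, 128 * MB)

print(one_gb_tasks, partial_tasks)
```

Under these assumptions, a file that does not fill its last block still needs one extra map task for the remainder.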
• Combiner: The Combiner class is used between the Map class and the
Reduce class to reduce the volume of data transferred between Map and
Reduce. Usually, the output of the map task is large, so the amount of
data transferred to the reduce task is high.
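The effect of a combiner can be sketched with a plain Python word count. This is a simulation, not Hadoop code: `map_words` and `combine` are hypothetical helper names, and the combiner is assumed to apply the same summing logic a reducer would, locally on the output of a single map task.

```python
from collections import Counter


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner step: sum the values per key on one map task's output."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    # Sort by key so the result is deterministic.
    return sorted(totals.items())


map_output = map_words("to be or not to be")
combined = combine(map_output)

# Number of records that would be sent to the reducers
# without and with the combiner.
print(len(map_output), len(combined))
```

In this sketch the combiner merges repeated keys before the transfer, so fewer records leave the map side while the final totals stay the same.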
• The Mapper produces its output by returning new key-value pairs. The
input data has to be converted to key-value pairs first, because the
Mapper cannot process raw input records directly. The Mapper also
generates some small blocks of data while processing the input
records as key-value pairs. We will discuss the various processes
that occur in the Mapper, its key features, and how the key-value
pairs are generated in the Mapper.
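How raw input becomes key-value pairs, and how a mapper turns them into new pairs, can be sketched as follows. This is a plain Python simulation under stated assumptions: `to_records` and `lowercase_word_mapper` are hypothetical names, and the (byte offset, line) record shape mirrors what Hadoop's TextInputFormat passes to a Mapper.

```python
def to_records(text):
    """Convert raw text into (byte_offset, line) key-value pairs."""
    records = []
    offset = 0
    for line in text.splitlines(keepends=True):
        records.append((offset, line.rstrip("\r\n")))
        # The next key is the byte position where the next line starts.
        offset += len(line.encode("utf-8"))
    return records


def lowercase_word_mapper(offset, line):
    """Mapper: emit a new (word, 1) pair per word; the offset key is unused."""
    for word in line.split():
        yield (word.lower(), 1)


records = to_records("Hello Hadoop\nhello world\n")
intermediate = [
    pair
    for offset, line in records
    for pair in lowercase_word_mapper(offset, line)
]

print(records)
print(intermediate)
```

The input keys (offsets) and the output keys (words) are unrelated here, which illustrates that a mapper is free to return entirely new key-value pairs.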
• The number of partitions is equal to the number of reducers. That means a partitioner
will divide the data according to the number of reducers. Therefore, the data
from a single partition is processed by a single reducer. The partitioner partitions the
data using a user-defined condition, which works like a hash function. The
total number of partitions is the same as the number of Reducer tasks for the job. Let us take