
Unit I - Map Reduce

MapReduce is a programming model designed for processing large datasets using a distributed algorithm across a cluster. It consists of three main operations: Map, Shuffle, and Reduce, which collectively enable efficient data processing. An example of its application is the Word Count program, which utilizes Map and Reduce functions to generate key-value pairs and merge results.

Uploaded by

studybunkers

Map Reduce

Programming model for Big Data Processing

Map Reduce
MapReduce is a programming model, and an associated framework, for processing
and generating large datasets with a parallel, distributed algorithm on a cluster.
Map Reduce
MapReduce typically consists of three main operations:
Map: Each worker node applies the map function to its local data and writes the
output to temporary storage. A master node ensures that, where input data is
stored redundantly, only one copy of it is processed.
Shuffle: Worker nodes redistribute the data based on the output keys produced by
the map function, ensuring that all data associated with a particular key is located
on the same worker node.
Reduce: Worker nodes process each group of output data, per key, in parallel.
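
The Shuffle step above can be sketched in Python as follows. This is a minimal single-process illustration, not how any particular framework implements it: the names partition and shuffle, the use of a CRC32 hash, and the plain lists standing in for worker nodes are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


def partition(key, num_workers):
    """Pick the worker for a key (hypothetical helper).

    A deterministic hash means the same key always maps to the same worker.
    """
    return zlib.crc32(key.encode("utf-8")) % num_workers


def shuffle(map_outputs, num_workers):
    """Regroup map output by key.

    map_outputs: one list of (key, value) pairs per map worker.
    Returns one dict per reduce worker, mapping key -> list of values.
    """
    workers = [defaultdict(list) for _ in range(num_workers)]
    for pairs in map_outputs:
        for key, value in pairs:
            workers[partition(key, num_workers)][key].append(value)
    return workers
```

Because the worker is chosen from the key alone, every value for a given key ends up in the same worker's dictionary, which is what lets the Reduce step handle each key independently.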
Example: Word Count
The program is built from two main functions, Map and Reduce:
Map Function: Takes input and produces a set of intermediate key-value pairs.
Reduce Function: Merges all intermediate values associated with the same key.
Example: Word Count Code
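
A minimal Python sketch of Word Count, assuming a single-process simulation rather than a real cluster. The names map_function, reduce_function and word_count are illustrative, and the in-memory grouping stands in for the Shuffle step.

```python
from collections import defaultdict


def map_function(document):
    """Map: emit an intermediate (word, 1) pair for every word in the input."""
    for word in document.lower().split():
        yield (word, 1)


def reduce_function(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return (word, sum(counts))


def word_count(documents):
    # Shuffle (simulated): group intermediate values by key.
    grouped = defaultdict(list)
    for document in documents:
        for word, count in map_function(document):
            grouped[word].append(count)
    # Reduce each key's group independently.
    return dict(reduce_function(word, counts) for word, counts in grouped.items())


result = word_count(["the quick brown fox", "the lazy dog", "the fox"])
print(result)
# prints {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```

Each input string plays the role of one worker's local data. On a cluster, the calls to map_function and reduce_function would run in parallel on different nodes.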
