Map Reduce

MapReduce is a software framework for processing large datasets in a distributed computing environment. It allows for parallel processing of data across clusters of computers. The framework includes Mapper and Reducer classes that can be customized by developers. Mappers process input records to generate intermediate key-value pairs, which are then shuffled and sorted for input to Reducers. Reducers consolidate these intermediate pairs to produce the final output data. MapReduce thus provides an easy way to write applications that process vast amounts of structured and unstructured data stored in Hadoop Distributed Filesystem.


Hadoop

MapReduce
By: Vishakha Sharma
7CSE 8X
A2305218493
What is MapReduce?

MapReduce is the data processing layer of Hadoop. It is a software framework for easily
writing applications that process vast amounts of structured and unstructured data stored
in the Hadoop Distributed Filesystem (HDFS). It processes huge amounts of data in
parallel by dividing the submitted job into a set of independent sub-tasks. This parallel
processing improves both the speed and the reliability of the cluster.

The framework also provides predefined Mapper and Reducer classes, which developers
extend and customize as per the organization's requirements.
Why MapReduce?
• The emergence of massive datasets presents both challenges and
opportunities in data storage and analysis.
• These "big data" workloads challenge traditional analytic tools and
increasingly require new solutions.
• MapReduce is used to write scalable applications that process large
amounts of data in parallel on large clusters of commodity hardware
servers.
Mapper

The Mapper is the initial piece of code that interacts with the input
dataset.
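As a minimal sketch (plain Python, not the actual Hadoop Java API), a word-count Mapper reads one input record and emits intermediate key-value pairs; the function name `word_count_mapper` is illustrative:

```python
# Illustrative word-count mapper: for each input record (a line of
# text), emit an intermediate (word, 1) key-value pair per word.
def word_count_mapper(record):
    for word in record.split():
        yield (word.lower(), 1)

pairs = list(word_count_mapper("Hadoop stores data and Hadoop processes data"))
# pairs contains ("hadoop", 1) twice and ("data", 1) twice
```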
Reducer

The Reducer is the second part of the MapReduce programming model.
The Mapper produces output in the form of key-value pairs, which
serves as input for the Reducer.
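Continuing the plain-Python sketch (the real Hadoop Reducer is a Java class), a word-count Reducer receives one key together with all the intermediate values grouped under that key and consolidates them into a single output pair:

```python
# Illustrative word-count reducer: sum all intermediate counts that
# were grouped under one key, producing the final (word, total) pair.
def word_count_reducer(key, values):
    return (key, sum(values))

result = word_count_reducer("hadoop", [1, 1, 1])
# result == ("hadoop", 3)
```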
MapReduce flow

input → Map → Shuffling and sorting → Reduce → output

• Input: The data for a MapReduce task is stored in input files, which
typically live in HDFS.
• Map: The Mapper processes each input record and generates a new
key-value pair; the pairs generated by the Mapper can be completely
different from the input pair.
• Shuffling and sorting: The Mapper output is shuffled to the reduce
nodes. Shuffling is the physical movement of the data, which is done
over the network.
• Reduce: The Reducer takes the set of intermediate key-value pairs
produced by the Mappers as input and runs a reducer function on each
of them to generate the output.
• Output: The final output is obtained after the Reduce stage.
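The whole flow above can be simulated in a few lines of plain Python (a single-process sketch; real Hadoop distributes these stages across cluster nodes, and the names `mapper`, `reducer`, and `run_job` are illustrative):

```python
from itertools import groupby
from operator import itemgetter

def mapper(record):
    # Map: each input record becomes intermediate (word, 1) pairs.
    for word in record.split():
        yield (word.lower(), 1)

def reducer(key, values):
    # Reduce: consolidate all values grouped under one key.
    return (key, sum(values))

def run_job(records):
    intermediate = [pair for rec in records for pair in mapper(rec)]
    # Shuffle and sort: group pairs by key. In Hadoop this step
    # physically moves data over the network to the reduce nodes.
    intermediate.sort(key=itemgetter(0))
    return [reducer(k, [v for _, v in grp])
            for k, grp in groupby(intermediate, key=itemgetter(0))]

out = run_job(["big data", "big cluster"])
# out == [("big", 2), ("cluster", 1), ("data", 1)]
```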
Steps of data flow
THANK YOU
