
MapReduce

1
Distributed File System (DFS)
• For very large files: TBs, PBs
• Each file is partitioned into chunks, typically 64 MB
• Each chunk is replicated several times (≥3), on different racks, for fault tolerance
• Implementations:
– Google’s DFS: GFS, proprietary
– Hadoop’s DFS: HDFS, open source
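As a rough illustration of the chunking arithmetic (a sketch, not GFS/HDFS code; the names and the placement rule are simplified stand-ins for the real rack-aware policies):

  CHUNK_SIZE = 64 * 1024 * 1024   # the typical 64 MB chunk size mentioned above

  def chunk_index(byte_offset):
      # Which chunk of the file holds this byte offset
      return byte_offset // CHUNK_SIZE

  def place_replicas(chunk_id, racks, n=3):
      # Toy placement: n replicas, each on a different rack (assumed policy)
      assert n <= len(racks), "need at least n distinct racks"
      return [racks[(chunk_id + i) % len(racks)][0] for i in range(n)]

  print(chunk_index(200 * 1024 * 1024))   # byte 200 MB falls in chunk 3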

2
MapReduce
• Google: paper published 2004
• Free variant: Hadoop

• MapReduce = a high-level programming model and implementation for large-scale parallel data processing

3
Typical Problems Solved by MR
• Read a lot of data
• Map: extract something you care about from each record
• Shuffle and Sort
• Reduce: aggregate, summarize, filter, transform
• Write the results

The paradigm stays the same; change the map and reduce functions for different problems.

4
slide source: Jeff Dean
Data Model
Files!

A file = a bag of (key, value) pairs

A MapReduce program:
• Input: a bag of (inputkey, value) pairs
• Output: a bag of (outputkey, value) pairs

5
Step 1: the MAP Phase
User provides the MAP function:
• Input: (input key, value)
• Output: bag of (intermediate key, value) pairs

The system applies the map function in parallel to all (input key, value) pairs in the input file.

6
Step 2: the REDUCE Phase

User provides the REDUCE function:
• Input: (intermediate key, bag of values)
• Output: bag of output values

The system groups all pairs with the same intermediate key, and passes the bag of values to the REDUCE function.
7
Example
• Counting the number of occurrences of each word in a large collection of documents
• Each document:
– The key = document id (did)
– The value = set of words (word)
map(String key, String value):
  // key: document name
  // value: document contents
  for each word w in value:
    EmitIntermediate(w, “1”);

reduce(String key, Iterator values):
  // key: a word
  // values: a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(AsString(result));
8
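A minimal, runnable Python sketch of the same word count, with the shuffle simulated in-process (the driver run_mapreduce and all names here are illustrative, not part of any MapReduce API):

  from collections import defaultdict

  def map_fn(doc_id, contents):
      # key: document name, value: document contents
      for word in contents.split():
          yield (word, 1)

  def reduce_fn(word, counts):
      # key: a word, values: a list of counts
      yield (word, sum(counts))

  def run_mapreduce(inputs, map_fn, reduce_fn):
      groups = defaultdict(list)          # the "shuffle": group by intermediate key
      for key, value in inputs:
          for ikey, ivalue in map_fn(key, value):
              groups[ikey].append(ivalue)
      output = []
      for ikey, ivalues in groups.items():
          output.extend(reduce_fn(ikey, ivalues))
      return output

  docs = [("did1", "a rose is a rose"), ("did2", "a b")]
  print(run_mapreduce(docs, map_fn, reduce_fn))
  # [('a', 3), ('rose', 2), ('is', 1), ('b', 1)]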
MAP                    Shuffle                    REDUCE

[Diagram: each map task reads a document (did1,v1), (did2,v2), (did3,v3), … and emits one (word, 1) pair per occurrence, e.g. (w1,1), (w2,1), (w3,1), …; the shuffle groups the pairs by word, e.g. (w1, (1,1,1,…,1)), (w2, (1,1,…)), (w3, (1,…)); the reduce tasks sum each group, e.g. (w1, 25), (w2, 77), (w3, 12).]
9
Jobs vs. Tasks
• A MapReduce Job
– One single “query”, e.g. count the words in all docs
– More complex queries may consist of multiple jobs

• A Map Task, or a Reduce Task
– A group of instantiations of the map- or reduce-function, which are scheduled on a single worker

10
Workers
• A worker is a process that executes one task at a time
• Typically there is one worker per processor, hence 4 or 8 per node

11
Fault Tolerance
• If one server fails once every year...
...then a job with 10,000 servers will fail in less than one hour
(one year ≈ 8,760 hours, so the expected time to the first failure is about 8,760 / 10,000 ≈ 0.9 hours)

• MapReduce handles fault tolerance by writing intermediate files to disk:
– Mappers write their files to local disk
– Reducers read the files (= reshuffling); if a server fails, the reduce task is restarted on another server
12
MAP Tasks                    Shuffle                    REDUCE Tasks

[Same diagram as before, now viewed as tasks: map tasks read (did1,v1), (did2,v2), (did3,v3), … and emit (word, 1) pairs; the shuffle groups them by word; reduce tasks produce (w1, 25), (w2, 77), (w3, 12), ….]
13
MapReduce Execution Details
[Diagram: the execution pipeline, bottom to top]
• Input is read from the file system (GFS or HDFS); the data is not necessarily local to the map task
• Map Task
• Intermediate data goes to local disk, as M × R files (the Shuffle). Why M × R?
• Reduce Task
• Output is written to disk, replicated in the cluster
14
Implementation
• There is one master node
• Master partitions the input file into M splits, by key
• Master assigns workers (= servers) to the M map tasks, and keeps track of their progress
• Workers write their output to local disk, partitioned into R regions (see the sketch below)
• Master assigns workers to the R reduce tasks
• Reduce workers read the regions from the map workers’ local disks
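The partitioning into R regions is typically done by hashing the intermediate key; a minimal sketch (Python’s built-in hash is used only for illustration — a real system needs a hash that is stable across machines and runs):

  def partition(intermediate_key, R):
      # Region (i.e. reduce task) that receives this key. All pairs with
      # the same key land in the same region, so each reduce task sees
      # complete groups.
      return hash(intermediate_key) % R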
16
Interesting Implementation Details

Worker failure:
• The master pings workers periodically
• If a worker is down, the master reassigns its task to another worker

17
Interesting Implementation Details

Backup tasks:
• Straggler = a machine that takes an unusually long time to complete one of the last tasks, e.g.:
– A bad disk forces frequent correctable errors (30 MB/s → 1 MB/s)
– The cluster scheduler has scheduled other tasks on that machine
• Stragglers are a main reason for slowdown
• Solution: pre-emptive backup execution of the last few remaining in-progress tasks (sketched below)
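A toy illustration of the idea in Python: run a duplicate of a lingering task and take whichever copy finishes first (a sketch only, not how real schedulers pick backup candidates; the task must be idempotent):

  from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

  def run_with_backup(task, pool):
      primary = pool.submit(task)   # the straggling attempt
      backup = pool.submit(task)    # the pre-emptive backup copy
      done, pending = wait([primary, backup], return_when=FIRST_COMPLETED)
      for f in pending:
          f.cancel()                # best effort; a running copy is just ignored
      return next(iter(done)).result()

  pool = ThreadPoolExecutor(max_workers=4)
  print(run_with_backup(lambda: sum(range(1000)), pool))   # 499500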

18
Parallel Data Processing @ 2010

19
Issues with MapReduce
• Difficult to write more complex queries

• More complex queries need multiple MapReduce jobs, which dramatically slows them down because every job writes all of its results to disk

• Next lecture: Spark

20
Relational Operators in
MapReduce
Given relations R(A,B) and S(C,D), compute:

• Selection: σA=123(R)

• Group-by: γA,sum(B)(R)

• Join: R ⋈B=C S

21
Selection σA=123(R)

map(String value):
  if value.A = 123:
    EmitIntermediate(value.key, value);

reduce(String k, Iterator values):
  for each v in values:
    Emit(v);

22
Selection σA=123(R)

map(String value):
  if value.A = 123:
    EmitIntermediate(value.key, value);

reduce(String k, Iterator values):
  for each v in values:
    Emit(v);

No need for reduce here, but it takes system hacking to remove the reduce phase from MapReduce (a map-only sketch follows below).
23
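A map-only version of the selection in Python (a sketch; the record layout is an assumption, and in Hadoop the effect corresponds to setting the number of reduce tasks to 0):

  from collections import namedtuple

  # Assumed layout for R's records; 'key' plays the role of value.key above
  RRecord = namedtuple("RRecord", ["key", "A", "B"])

  def select_map(record):
      # sigma_{A=123}(R): pass the record through iff A = 123
      if record.A == 123:
          yield (record.key, record)

  rows = [RRecord("k1", 123, "x"), RRecord("k2", 7, "y")]
  print([pair for r in rows for pair in select_map(r)])
  # [('k1', RRecord(key='k1', A=123, B='x'))]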
Group By γA,sum(B)(R)

map(String value):
  EmitIntermediate(value.A, value.B);

reduce(String k, Iterator values):
  s = 0
  for each v in values:
    s = s + v
  Emit(k, s);
24
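A runnable in-process sketch of γA,sum(B)(R) in Python (the record layout is assumed; the shuffle is simulated by the dictionary):

  from collections import defaultdict, namedtuple

  R = namedtuple("R", ["A", "B"])

  def groupby_sum(rows):
      groups = defaultdict(int)   # shuffle: group by A, keeping a running sum
      for r in rows:
          groups[r.A] += r.B      # map emits (A, B); reduce sums the B values
      return list(groups.items())

  print(groupby_sum([R("x", 1), R("x", 2), R("y", 5)]))
  # [('x', 3), ('y', 5)]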
Join
Two simple parallel join algorithms:

• Partitioned hash-join (we saw it, will recap)

• Broadcast join

25
R(A,B) ⋈B=C S(C,D)

Partitioned Hash-Join

Initially, both R and S are horizontally partitioned:

R1, S1   R2, S2   . . .   RP, SP

Reshuffle R on R.B and S on S.C:

R’1, S’1   R’2, S’2   . . .   R’P, S’P

Each server computes the join locally.
26
R(A,B) ⋈B=C S(C,D)

Partitioned Hash-Join
map(String value):
  case value.relationName of
    ‘R’: EmitIntermediate(value.B, (‘R’, value));
    ‘S’: EmitIntermediate(value.C, (‘S’, value));

reduce(String k, Iterator values):
  R = empty; S = empty;
  for each v in values:
    case v.type of
      ‘R’: R.insert(v);
      ‘S’: S.insert(v);
  for v1 in R, for v2 in S:
    Emit(v1, v2);
27
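The same algorithm as an in-process Python sketch: tag each tuple with its relation, group on the join key, then take the cross product within each group (record layouts assumed):

  from collections import defaultdict, namedtuple

  R = namedtuple("R", ["A", "B"])
  S = namedtuple("S", ["C", "D"])

  def partitioned_hash_join(r_rows, s_rows):
      groups = defaultdict(lambda: ([], []))   # join key -> (R tuples, S tuples)
      for r in r_rows:
          groups[r.B][0].append(r)             # map: EmitIntermediate(r.B, ('R', r))
      for s in s_rows:
          groups[s.C][1].append(s)             # map: EmitIntermediate(s.C, ('S', s))
      # reduce: local cross product within each group
      return [(r, s) for rs, ss in groups.values() for r in rs for s in ss]

  print(partitioned_hash_join([R(1, "b1"), R(2, "b2")],
                              [S("b1", 10), S("b1", 20)]))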
R(A,B) ⋈B=C S(C,D)

Broadcast Join
Broadcast S

Reshuffle R on R.B

R1   R2   . . .   RP        S

R’1, S   R’2, S   . . .   R’P, S
28
R(A,B) ⋈B=C S(C,D)

Broadcast Join
Note: here map should read several records of R at a time: value = some group of records.

map(String value):
  open(S);                    /* over the network */
  hashTbl = new();            /* read the entire table S, build a hash table */
  for each w in S:
    hashTbl.insert(w.C, w);
  close(S);

  for each v in value:
    for each w in hashTbl.find(v.B):
      Emit(v, w);

reduce(…):
  /* empty: map-side only */
29
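And the broadcast join as a Python sketch: build a hash table over the whole of the small relation S once, then stream each chunk of R through it; there is no reduce phase (record layouts assumed):

  from collections import defaultdict, namedtuple

  R = namedtuple("R", ["A", "B"])
  S = namedtuple("S", ["C", "D"])

  def broadcast_join(r_chunk, s_all):
      table = defaultdict(list)       # hash table over all of S
      for s in s_all:
          table[s.C].append(s)
      # stream the local chunk of R through the table (map-side only)
      return [(r, s) for r in r_chunk for s in table.get(r.B, [])]

  print(broadcast_join([R(1, "b1")], [S("b1", 10), S("b2", 20)]))
  # [(R(A=1, B='b1'), S(C='b1', D=10))]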
Conclusions
• MapReduce offers a simple abstraction, and handles distribution + fault tolerance
• Speedup/scaleup is achieved by dynamically allocating map tasks and reduce tasks to the available servers. However, skew is possible (e.g. one huge reduce task)
• Writing intermediate results to disk is necessary for fault tolerance, but very slow. Spark replaces this with “Resilient Distributed Datasets” = main memory + lineage
