
BIG DATA ANALYTICS

Complexity Analysis of MapReduce Algorithms


Communication Cost Model

◻ The model we will use:
Communication cost = sum of the input sizes to each stage
◻ Output sizes are ignored
If the output is large, it is likely to be the input to another stage
The real outputs are typically small, e.g. some summary statistics
◻ Reading from disk is part of the communication cost
e.g. the input to a map stage can come from the disk of a reduce task at a different node
◻ Analysis is independent of scheduling decisions
e.g. map and reduce tasks may or may not be assigned to the same node

[Pipeline: Input → Map → Reduce → Map → Reduce → Output]
Definitions: Replication Rate & Reducer Size

◻ Replication rate: the average number of key-value pairs generated by the Map tasks per input
Determines the communication cost between Map and Reduce
Denoted as r
◻ Reducer size: an upper bound on the size of the value list associated with a single key
Denoted as q
Choose q small enough such that:
1. there are many reducers, allowing a high level of parallelism
2. the data for one reducer fits into the main memory of a node
◻ Typically q and r are inversely proportional
Tradeoff between communication cost and parallelism/memory requirements
Example: Join with MapReduce

◻ Map:
  for each input tuple R(a, b):
    generate <key = b, value = ('R', a)>
  for each input tuple S(b, c):
    generate <key = b, value = ('S', c)>

◻ Reduce:
  Input: <b, value_list>
  In the value_list:
    pair each entry of the form ('R', a) with each entry of the form ('S', c),
    and output <a, b, c>

Replication rate: r = 1
Communication cost: 2(|R| + |S|)
Reducer size (worst case): q = |R| + |S|
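For concreteness, here is a minimal Python sketch (not part of the original slides) that simulates this join in memory; the relations R and S are made-up toy data.

from collections import defaultdict

# Hypothetical example relations R(a, b) and S(b, c)
R = [(1, 'x'), (2, 'y'), (3, 'x')]
S = [('x', 10), ('y', 20), ('x', 30)]

def map_phase():
    """Emit one key-value pair per input tuple (replication rate r = 1)."""
    for a, b in R:
        yield b, ('R', a)
    for b, c in S:
        yield b, ('S', c)

def reduce_phase(pairs):
    """Group by key b, then pair every ('R', a) with every ('S', c)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    for b, values in groups.items():
        r_side = [a for tag, a in values if tag == 'R']
        s_side = [c for tag, c in values if tag == 'S']
        for a in r_side:
            for c in s_side:
                yield (a, b, c)

print(list(reduce_phase(map_phase())))
# [(1, 'x', 10), (1, 'x', 30), (3, 'x', 10), (3, 'x', 30), (2, 'y', 20)]
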
Example: Single-Step Matrix-Matrix Multiplication

Assume both M and N have size n×n.

◻ Map(input):
  for each mij entry from matrix M:
    for k = 1 to n:
      generate <key = (i, k), value = ('M', j, mij)>
  for each njk entry from matrix N:
    for i = 1 to n:
      generate <key = (i, k), value = ('N', j, njk)>

◻ Reduce(key, value_list):
  sum ← 0
  for each pair ('M', j, mij) and ('N', j, njk) in value_list:
    sum += mij · njk
  output (key, sum)

Replication rate: r = n
Communication cost: 2n² + 2n³
Reducer size: q = 2n
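A minimal Python sketch of this single-step algorithm (not from the original slides), simulating the shuffle with a dictionary; the 3×3 matrices are made-up toy data.

from collections import defaultdict

n = 3
M = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
N = [[2, 0, 1], [1, 1, 0], [0, 2, 2]]

def map_phase():
    """Each mij is replicated to n reducers (i, k); likewise each njk (r = n)."""
    for i in range(n):
        for j in range(n):
            for k in range(n):
                yield (i, k), ('M', j, M[i][j])
    for j in range(n):
        for k in range(n):
            for i in range(n):
                yield (i, k), ('N', j, N[j][k])

def reduce_phase(pairs):
    """Reducer (i, k) holds row i of M and column k of N (q = 2n values)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    result = {}
    for (i, k), values in groups.items():
        m_row = {j: v for tag, j, v in values if tag == 'M'}
        n_col = {j: v for tag, j, v in values if tag == 'N'}
        result[(i, k)] = sum(m_row[j] * n_col[j] for j in range(n))
    return result

P = reduce_phase(map_phase())
assert all(P[(i, k)] == sum(M[i][j] * N[j][k] for j in range(n))
           for i in range(n) for k in range(n))
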
A Graph Model for MapReduce Algorithms

[Figure: bipartite graph with input vertices on the left and output vertices on the right]

◻ Define a vertex for each input and output
◻ Define edges reflecting which inputs each output needs
◻ Every MapReduce algorithm has a schema that assigns outputs to reducers
◻ Assume that the max reducer size is q
◻ Assignment requirements:
1. No reducer can be assigned more than q inputs.
2. Each output is assigned to at least one reducer that receives all inputs needed for that output.
Example: Single-Step Matrix-Matrix Multiplication

We have assigned each output to a single reducer.
The replication rate is r = n.
The reducer size is q = 2n.
Application: Naïve Similarity Join
◻ Objective: Given a large set of elements X and a similarity measure s(x1, x2), output the pairs whose similarity is above a given threshold.
Locality-sensitive hashing is not used, for the sake of this example.

◻ Example:
Each element is an image of ~1 MB
There are 1M images in the set
About 5×10¹¹ (500 billion) image comparisons to make
Similarity Join with MapReduce (First Try)

◻ Let n be the number of pictures in the set.

◻ Map:
  for each picture Pi:
    for each j = 1 to n (except i):
      generate <key = (i, j), value = Pi>

◻ Reduce(key, value_list):
  compute sim(Pi, Pj)
  output (i, j) if the similarity is above the threshold

Replication rate: r = n-1
Reducer size: q = 2
Communication cost: n + n(n-1)
# of reducers: n(n-1)/2
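A minimal Python sketch of this first attempt (not from the original slides); the similarity function is a placeholder, and the key is normalized to an unordered pair so that the reducer count matches the n(n-1)/2 figure above.

from collections import defaultdict

def sim(p1, p2):
    """Placeholder similarity measure; a real one would compare image contents."""
    return 1.0 if p1 == p2 else 0.0

def similarity_join(pictures, threshold=0.5):
    """pictures: hypothetical dict {i: image_bytes}. Returns similar pairs."""
    # Map: each picture Pi is sent to the n-1 reducers that compare it (r = n-1).
    reducer_inputs = defaultdict(dict)
    for i, pi in pictures.items():
        for j in pictures:
            if j != i:
                key = (min(i, j), max(i, j))   # one reducer per unordered pair
                reducer_inputs[key][i] = pi
    # Reduce: each reducer holds exactly two pictures (q = 2) and compares them.
    out = []
    for (i, j), pair in reducer_inputs.items():
        if sim(pair[i], pair[j]) > threshold:
            out.append((i, j))
    return out

print(similarity_join({1: b'aa', 2: b'bb', 3: b'aa'}))  # -> [(1, 3)]
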
Example: 1M pictures with 1MByte size each

◻ Communication cost:
n(n-1) pictures communicated from Map to Reduce
total # of bytes transferred ≈ 10¹⁸

◻ Assume gigabit Ethernet:
time to transfer 10¹⁸ bytes ≈ 10¹⁰ seconds (~300 years)

◻ Replication rate r = n-1
◻ Reducer size q = 2
◻ Communication cost = n + n(n-1)
◻ # of reducers = n(n-1)/2
Graph Model

Our MapReduce algorithm:
One reducer per output.
Pi must be sent to the reducer of every output that involves it.
Replication rate r = n-1
Reducer size q = 2

What if a reducer covers multiple outputs?
Graph Model: Multiple Outputs per Reducer

Replication rate and communication cost are reduced.

How should we do the grouping?
Grouping Outputs

◻ Define g intervals between 1 and n.
◻ Reducer (u, v) will be responsible for comparing all inputs in interval u with all inputs in interval v.

Example: Reducer (2, 3) will compare all entries in interval 2 with all entries in interval 3.
[Figure: an n×n grid of element pairs, with the block at row interval 2 and column interval 3 highlighted]
Similarity Join with Grouping

◻ Let n be the number of inputs, and g be the number of groups.

◻ Map:
  for each Pi in the input:
    let u be the group to which i belongs
    for v = 1 to g:
      generate <key = (u, v), value = (i, Pi)>

◻ Reduce(key = (u, v), value_list):
  for each i that belongs to group u in value_list:
    for each j that belongs to group v in value_list:
      compute sim(Pi, Pj), and output (i, j) if it is above the threshold

Problem: Pi will be sent to reducer (gi, gj), while Pj will be sent to reducer (gj, gi); these are different keys, so no single reducer receives both pictures of the pair.
Similarity Join with Grouping

◻ Let n be the number of inputs, and g be the number of groups.

◻ Map:
  for each Pi in the input:
    let u be the group to which i belongs
    for v = 1 to g:
      generate <key = [min(u, v), max(u, v)], value = (i, Pi)>

A single key is generated for (u, v) and (v, u), so both pictures of a pair reach the same reducer.

◻ Reduce(key = (u, v), value_list):
  for each i that belongs to group u in value_list:
    for each j that belongs to group v in value_list:
      compute sim(Pi, Pj), and output (i, j) if it is above the threshold
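A minimal Python sketch of the grouped algorithm with the min/max key fix (not from the original slides); indices are 0-based here, and group_of is a hypothetical helper that assigns element i to one of g equal-width groups.

from collections import defaultdict

def group_of(i, n, g):
    """Map element index i (0-based) to one of g equal-width groups."""
    return i * g // n

def grouped_similarity_join(pictures, g, sim, threshold):
    """pictures: hypothetical list of images indexed 0..n-1."""
    n = len(pictures)
    # Map: Pi goes to the g reducers [min(u, v), max(u, v)] for v = 1..g (r = g).
    reducer_inputs = defaultdict(list)
    for i, pi in enumerate(pictures):
        u = group_of(i, n, g)
        for v in range(g):
            key = (min(u, v), max(u, v))
            reducer_inputs[key].append((i, pi))
    # Reduce: reducer (u, v) compares every element of group u with every element
    # of group v; within-group pairs are handled only by the diagonal reducer (u, u).
    out = []
    for (u, v), values in reducer_inputs.items():
        for a, (i, pi) in enumerate(values):
            for j, pj in values[a + 1:]:
                if group_of(i, n, g) != group_of(j, n, g) or u == v:
                    if sim(pi, pj) > threshold:
                        out.append((i, j))
    return out

similar = grouped_similarity_join([b'aa', b'ab', b'aa', b'bb'], g=2,
                                  sim=lambda x, y: 1.0 if x == y else 0.0,
                                  threshold=0.5)
print(similar)  # -> [(0, 2)]
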
Example

Example: If g = 4, the highlighted comparisons will be performed.
There will be a reducer for each key (u, v), where u ≤ v.
[Figure: the n×n grid partitioned into 4×4 blocks, with the upper-triangular blocks (u ≤ v) highlighted]
Example

Which reducers will receive and use Pi when i is in group 2?
Reducers: (1, 2), (2, 2), (2, 3), (2, 4)
[Figure: the n×n grid with the row and column blocks corresponding to group 2 highlighted]
Complexity Analysis

◻ Replication rate:
r = g (each picture is sent to one reducer for each of the g groups)
◻ Reducer size:
q = 2n/g (a reducer holds the pictures of two intervals, each of size n/g)
◻ Communication cost:
n + ng (map input plus reduce input)
◻ # of reducers:
g(g+1)/2
Example: 1M pictures with 1MByte size each

◻ Let g = 1000

◻ Reducer size q = 2n/g
memory needed at one node: ~2 GB (reasonable)
◻ Communication cost = n + ng
total # of bytes transferred ≈ 10¹⁵ (still a lot, but 1000× less than before)
◻ # of reducers = g(g+1)/2
there are ~500K reducers (enough parallelism for 1000s of nodes)
◻ What if g = 100? (See the sketch below.)
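A quick back-of-the-envelope check of these numbers, and of the g = 100 question, assuming ~1 MB per picture as stated above:

n = 1_000_000              # number of pictures
picture_bytes = 1_000_000  # ~1 MB per picture (assumption from the slides)

for g in (1000, 100):
    q = 2 * n // g                            # pictures per reducer
    reducer_mem = q * picture_bytes           # memory needed at one node
    comm_bytes = (n + n * g) * picture_bytes  # communication cost in bytes
    num_reducers = g * (g + 1) // 2
    print(f"g={g}: q={q} pictures (~{reducer_mem/1e9:.0f} GB/node), "
          f"comm ~{comm_bytes:.1e} bytes, {num_reducers} reducers")

# g=1000: q=2000 pictures (~2 GB/node), comm ~1.0e+15 bytes, 500500 reducers
# g=100:  q=20000 pictures (~20 GB/node), comm ~1.0e+14 bytes, 5050 reducers

So with g = 100 the communication drops by another factor of 10, but each reducer now needs roughly 20 GB of memory, which may exceed what a single node can hold.
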
Tradeoff Between Replication Rate and Reducer Size

Replication rate: r = g
Reducer size: q = 2n/g, so q = 2n/r, i.e. qr = 2n

◻ Replication rate and reducer size are inversely proportional.

◻ Reducing the replication rate will reduce communication, but will increase the reducer size.
Extreme case: r = 1 and q = 2n. There is a single reducer doing all the comparisons.
Extreme case: r = n and q = 2. There is a reducer for each pair of inputs.

◻ Need to choose q small enough that the data fits into local DRAM and there is enough parallelism.
Application: Matrix-Matrix Multiplication
with 1D Decomposition
Reminder: Matrix-Matrix Multiplication without Grouping

[Figure: M (entry mij) × N (entry njk) = P (entry pik)]

Each mij needs to be sent to reducer (i, k) for every k.
Reminder: Matrix-Matrix Multiplication without Grouping

[Figure: M (entry mij) × N (entry njk) = P (entry pik)]

Each njk needs to be sent to reducer (i, k) for every i.

Replication rate r = n
Multiple Outputs per Reducer

[Figure: M and N partitioned into g row/column stripes; P partitioned into g×g blocks]

Notation:
j: row/column index of an individual matrix entry
J: set of indices that belong to the Jth interval

Let reducer (I, K) be responsible for computing all pik where i ∈ I and k ∈ K.
Multiple Outputs per Reducer

[Figure: row stripe I of M highlighted]

Which reducers need mij?
Reducers (I, K) for all 1 ≤ K ≤ g, where i ∈ I.

Replication rate r = g
Multiple Outputs per Reducer

[Figure: column stripe K of N highlighted]

Which reducers need njk?
Reducers (I, K) for all 1 ≤ I ≤ g, where k ∈ K.

Replication rate r = g
1D Matrix Decomposition

[Figure: row stripe I of M and column stripe K of N combine to produce block (I, K) of P]

Which matrix elements will reducer (I, K) receive?
The Ith row stripe of M and the Kth column stripe of N.
MapReduce Formulation

◻ Map:
  for each element mij from matrix M:
    for K = 1 to g:
      generate <key = (I, K), value = ('M', i, j, mij)>
  for each element njk from matrix N:
    for I = 1 to g:
      generate <key = (I, K), value = ('N', j, k, njk)>

◻ Reduce(key = (I, K), value_list):
  for each i ∈ I and for each k ∈ K:
    pik = 0
    for j = 1 to n:
      pik += mij · njk
    output <key = (i, k), value = pik>

Replication rate: r = g
Communication cost: 2n² + 2gn²
Reducer size: q = 2n²/g
# of reducers: g²
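A minimal Python sketch of this 1D (stripe) formulation (not from the original slides); it assumes g divides n and uses small toy matrices.

from collections import defaultdict

def mm_1d(M, N, g):
    """Single-step matrix multiply with stripe reducers (1D decomposition).

    M and N are n x n nested lists; g is assumed to divide n. Reducer (I, K)
    receives the Ith row stripe of M and the Kth column stripe of N.
    """
    n = len(M)
    s = n // g
    stripe = lambda x: x // s

    # Map: each mij goes to (I, K) for all K; each njk goes to (I, K) for all I (r = g).
    inputs = defaultdict(list)
    for i in range(n):
        for j in range(n):
            for K in range(g):
                inputs[(stripe(i), K)].append(('M', i, j, M[i][j]))
    for j in range(n):
        for k in range(n):
            for I in range(g):
                inputs[(I, stripe(k))].append(('N', j, k, N[j][k]))

    # Reduce: reducer (I, K) computes every pik with i in stripe I and k in stripe K.
    P = [[0] * n for _ in range(n)]
    for (I, K), values in inputs.items():
        m_vals = {(i, j): v for tag, i, j, v in values if tag == 'M'}
        n_vals = {(j, k): v for tag, j, k, v in values if tag == 'N'}
        for i in range(I * s, (I + 1) * s):
            for k in range(K * s, (K + 1) * s):
                P[i][k] = sum(m_vals[(i, j)] * n_vals[(j, k)] for j in range(n))
    return P

M = [[1, 2], [3, 4]]
N = [[5, 6], [7, 8]]
assert mm_1d(M, N, g=2) == [[19, 22], [43, 50]]
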
Communication Cost vs. Reducer Size

Replication rate vs. reducer size:
q = 2n²/g ⟹ q = 2n²/r ⟹ qr = 2n²

Communication cost vs. reducer size:
cost = 2n² + 2gn² = 2n² + 4n⁴/q

There is an inverse relation between communication cost and reducer size.
Reminder: the q value chosen should be small enough that:
local memory is sufficient
there is enough parallelism

Replication rate: r = g
Communication cost: 2n² + 2gn²
Reducer size: q = 2n²/g
# of reducers: g²
Application: Matrix-Matrix Multiplication
with 2D Decomposition
Two Stage MapReduce Algorithm

◻ What are we trying to achieve?
A better tradeoff between replication rate r and reducer size q
The previous algorithm: qr = 2n²
We will show that we can achieve qr² = 2n²
For the same reducer size, the replication rate will be smaller

◻ Reminder: two-stage MapReduce without grouping:
Stage 1: "join" the matrix entries that need to be multiplied together
Stage 2: sum up the products to compute the final results

◻ Use a similar idea, but for sub-blocks of the matrices instead of individual elements
2D Matrix Decomposition

[Figure: M partitioned into g×g blocks MIJ, N into blocks NJK, P into blocks PIK]

Assume that M and N are each partitioned into g horizontal and g vertical stripes, i.e. into g×g blocks.
Computing the Product at Stripe (I, K)

[Figure: block PIK is computed as the sum over J of the block products MIJ × NJK]

Note: MIJ × NJK is a multiplication of two sub-matrices.
How to Define Reducers?

[Figure: stripe J of M and stripe J of N contribute to every block of P]

What if we define a reducer for each (I, K)?
It would be identical to the 1D decomposition.
What if we define a reducer for each J?
Exercise: derive the communication cost as a function of n and q.
How to Define Reducers?

[Figure: blocks MIJ and NJK combine into the Jth partial sum for block PIK]

What if we define a reducer for each (I, J, K)?
Smaller reducer size.
Reducer (I, J, K) will be responsible for computing the Jth partial sum for block PIK.
First MapReduce Step

[Figure: block MIJ (from M) times block NJK (from N) yields the Jth partial sum of block PIK]
MapReduce Step 1: Map

[Figure: row of blocks MIJ in M highlighted]

Block MIJ will be sent to the reducers (I, J, K) for all K.

Reminder: Reducer (I, J, K) is responsible for computing the Jth partial sum for block PIK.
MapReduce Step 1: Map

[Figure: column of blocks NJK in N highlighted]

Block NJK will be sent to the reducers (I, J, K) for all I.

Reminder: Reducer (I, J, K) is responsible for computing the Jth partial sum for block PIK.
MapReduce Step 1: Reduce

[Figure: blocks MIJ and NJK routed to reducer (I, J, K)]

Reducer (I, J, K) will receive the MIJ and NJK blocks and will compute the Jth partial sum for block PIK.
MapReduce Step 1: Reducer Output

[Figure: block PIK of P highlighted]

For each pik ∈ PIK, there are g reducers that compute a partial sum (one for each key (I, J, K)).

The reduce output corresponding to pik is <key = (i, k), value = xJik>, where xJik denotes the Jth partial sum of pik.
MapReduce Step 2

◻ Map:
  for each input <key = (i, k), value = xJik>:
    generate <key = (i, k), value = xJik>   (identity map)

◻ Reduce(key = (i, k), value_list):
  pik = 0
  for each xJik in value_list:
    pik += xJik
  output <key = (i, k), value = pik>
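A minimal Python sketch of the two-stage algorithm with (I, J, K) block reducers (not from the original slides); both MapReduce steps are simulated in memory, g is assumed to divide n, and the matrices are toy data.

from collections import defaultdict

def mm_2d(M, N, g):
    """Two-stage multiply with block reducers (I, J, K) (2D decomposition).

    Step-1 reducer (I, J, K) multiplies block MIJ by block NJK; step 2 sums
    the g partial results for every output entry pik.
    """
    n = len(M)
    s = n // g
    blk = lambda x: x // s

    # Step 1 map: MIJ -> reducers (I, J, K) for all K; NJK -> (I, J, K) for all I.
    step1 = defaultdict(list)
    for i in range(n):
        for j in range(n):
            for K in range(g):
                step1[(blk(i), blk(j), K)].append(('M', i, j, M[i][j]))
    for j in range(n):
        for k in range(n):
            for I in range(g):
                step1[(I, blk(j), blk(k))].append(('N', j, k, N[j][k]))

    # Step 1 reduce: emit the Jth partial sum for each pik in block PIK.
    step2 = defaultdict(list)
    for (I, J, K), values in step1.items():
        m_vals = {(i, j): v for tag, i, j, v in values if tag == 'M'}
        n_vals = {(j, k): v for tag, j, k, v in values if tag == 'N'}
        for i in range(I * s, (I + 1) * s):
            for k in range(K * s, (K + 1) * s):
                partial = sum(m_vals[(i, j)] * n_vals[(j, k)]
                              for j in range(J * s, (J + 1) * s))
                step2[(i, k)].append(partial)   # step 2 map is the identity

    # Step 2 reduce: accumulate the g partial sums for each (i, k).
    P = [[0] * n for _ in range(n)]
    for (i, k), partials in step2.items():
        P[i][k] = sum(partials)
    return P

M = [[1, 2], [3, 4]]
N = [[5, 6], [7, 8]]
assert mm_2d(M, N, g=2) == [[19, 22], [43, 50]]
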
Complexity Analysis: Step 1

Replication rate: r1 = g
Communication cost: 2n² + 2gn²
Reducer size: q1 = 2n²/g²
# of reducers: g³
Complexity Analysis: MapReduce Step 2

◻ Map:
  for each input <key = (i, k), value = xJik>:
    generate <key = (i, k), value = xJik>

◻ Reduce(key = (i, k), value_list):
  pik = 0
  for each xJik in value_list:
    pik += xJik
  output <key = (i, k), value = pik>

Replication rate: r2 = 1
Communication cost: gn²
Reducer size: q2 = g
# of reducers: n²
Complexity Analysis

◻ Step 1
Replication rate: r1 = g
Communication cost: 2n² + 2gn²
Reducer size: q1 = 2n²/g²
# of reducers: g³

◻ Step 2
Replication rate: r2 = 1
Communication cost: gn²
Reducer size: q2 = g
# of reducers: n²
Tradeoff Between Communication Cost and Reducer Size

◻ To decrease the communication cost:
Choose g small enough
◻ To decrease the reducer size:
Choose g large enough to reduce q1
The size of q2 is less of a concern. Why?
The reduce operation in step 2 simply accumulates the values;
each value is used only once, so
the value_list does not have to fit into local memory.

◻ Conclusion: Use the communication cost formula as a function of q1 to determine the right tradeoff.
Matrix-Matrix Multiplication
1D Decomposition vs. 2D Decomposition
Comparison: Parallelism

[Table comparing the number of reducers of the 1D and 2D decompositions]

For the same number of groups, the 2D decomposition has better parallelism.
Comparison: Reducer Size

[Table comparing the reducer sizes of the 1D and 2D decompositions]
Comparison: Communication Costs

[Table comparing the communication costs of the 1D and 2D decompositions]

Note: We have control over how we choose the g values for the 1D and 2D decompositions. However, the max q value is limited by the available local memory size. So, it makes more sense to use the same q value for the 1D and 2D decompositions.
Comparison: Communication Costs (when reducer sizes are equal)

[Table comparing the communication costs of the two decompositions for equal reducer sizes; see the sketch below]
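The comparison tables on these slides did not survive extraction. As a rough substitute, here is a small Python calculation, derived from the cost formulas stated earlier in the deck, comparing the two decompositions when the reducer size q is held fixed; the example values of n and q are made up.

import math

def costs_for_equal_q(n, q):
    """Compare 1D and 2D communication costs for the same reducer size q.

    Uses the formulas from earlier slides: 1D has q = 2n^2/g and cost
    2n^2 + 2gn^2; 2D (step 1) has q = 2n^2/g^2 and cost 2n^2 + 2gn^2,
    plus gn^2 for step 2.
    """
    g1 = 2 * n * n / q              # groups needed by the 1D decomposition
    g2 = math.sqrt(2 * n * n / q)   # groups needed by the 2D decomposition
    cost_1d = 2 * n * n + 2 * g1 * n * n
    cost_2d = 2 * n * n + 2 * g2 * n * n + g2 * n * n
    return cost_1d, cost_2d

# Example: n = 10,000 and q = 2,000,000 matrix entries per reducer.
c1, c2 = costs_for_equal_q(n=10_000, q=2_000_000)
print(f"1D: ~{c1:.1e} values moved, 2D: ~{c2:.1e} values moved")
# Here 1D needs g = 100 stripes (~2.0e+10 values), while 2D needs g = 10
# stripes (~3.2e+09 values), so for equal reducer size the 2D decomposition
# communicates far less.
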
Conclusions
References

◻ Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge University Press, Second Edition, 2014.
◻ http://mmds.org/
◻ http://www.cs.bilkent.edu.tr/~mustafa.ozdal/cs425/
