02 Hadoop
Source: Barroso and Hölzle (2009); performance figures from late 2007
What about communication?
• Nodes need to talk to each other!
◦ SMP: latencies ~100 ns
◦ LAN: latencies ~100 µs
• Scaling “up” vs. scaling “out”
◦ Smaller cluster of SMP machines vs. larger cluster of commodity machines
◦ E.g., 8 128-core machines vs. 128 8-core machines
◦ Note: no single SMP machine is big enough
• Let’s model communication overhead…
Source: analysis on this and subsequent slides from Barroso and Hölzle (2009)
Modeling Communication Costs
• Simple execution cost model (a sketch of one formalization follows below):
◦ Total cost = cost of computation + cost to access global data
◦ Fraction of local accesses inversely proportional to size of cluster
◦ n nodes (ignore cores for now)
◦ Light communication: f = 1
◦ Medium communication: f = 10
◦ Heavy communication: f = 100
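The slide gives the ingredients of the model but not a formula. A minimal sketch, assuming each unit of computation issues f accesses to global data, of which a fraction 1/n stay local (cost t_local) and the rest cross the network (cost t_remote); the functional form and these symbols are our reconstruction, not taken from the slide:

```latex
% Sketch only: the exact form and the symbols t_local, t_remote
% are assumptions layered on the bullets above.
\[
  T_{\text{total}}
    = T_{\text{compute}}
    + f \left[ \frac{1}{n}\, t_{\text{local}}
    + \left(1 - \frac{1}{n}\right) t_{\text{remote}} \right]
\]
```

With t_remote far larger than t_local (≈100 µs over a LAN vs. ≈100 ns within an SMP, per the earlier bullets), the communication term dominates as n and f grow, which is what the light/medium/heavy cases (f = 1, 10, 100) are meant to probe.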
So why not?
Seeks vs. Scans
• Consider a 1 TB database with 100-byte records
◦ We want to update 1% of the records
• Scenario 1: random access
◦ Each update takes ~30 ms (seek, read, write)
◦ 1% updates = ~35 days
• Scenario 2: rewrite all records
◦ Assume 100 MB/s throughput
◦ Time = 5.6 hours(!)
• Lesson: avoid random seeks! (arithmetic worked out below)
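The figures follow directly from the sizes involved; a quick check, assuming that in Scenario 2 the full terabyte must be both read and written (which is what yields 5.6 hours rather than 2.8):

```latex
% Scenario 1: 1 TB / 100 B per record = 10^{10} records; 1% = 10^{8} updates
\[
  10^{8} \times 30\ \text{ms} = 3 \times 10^{6}\ \text{s} \approx 35\ \text{days}
\]
% Scenario 2: read 1 TB + write 1 TB sequentially at 100 MB/s
\[
  \frac{2 \times 10^{12}\ \text{B}}{10^{8}\ \text{B/s}} = 2 \times 10^{4}\ \text{s} \approx 5.6\ \text{hours}
\]
```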
Branch mispredict 5 ns
L2 cache reference 7 ns
Mutex lock/unlock 25 ns
Main memory reference 100 ns
Send 2K bytes over 1 Gbps network 20,000 ns
Read 1 MB sequentially from memory 250,000 ns
Round trip within same datacenter 500,000 ns
Disk seek 10,000,000 ns
Read 1 MB sequentially from disk 20,000,000 ns
Send packet CA → Netherlands → CA 150,000,000 ns
[Figure: MapReduce data flow. Five input splits (split 0 through split 4) are processed by map tasks, shuffled, and reduced into two output partitions, Output 0 and Output 1.]
[Figure: HDFS architecture. A Client interacts with the NameNode (with a Secondary NameNode alongside); the NameNode tracks cluster membership.]
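For concreteness, a minimal sketch of how a client uses this architecture through the standard Hadoop FileSystem API; the NameNode address, port, and class name here are placeholders, not anything from the slides. The client asks the NameNode where a file's blocks live, then streams the data:

```java
// Sketch: print a file stored in HDFS to stdout.
// "hdfs://namenode:8020" and the class name HdfsCat are hypothetical.
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCat {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point the client at the NameNode (placeholder address).
    conf.set("fs.defaultFS", "hdfs://namenode:8020");
    try (FileSystem fs = FileSystem.get(conf);
         BufferedReader in = new BufferedReader(
             new InputStreamReader(fs.open(new Path(args[0]))))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);   // data is streamed from DataNodes
      }
    }
  }
}
```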
[Figure: word count example. Input lines “see bob throw” and “see spot run”; the Map function emits (see, 1), (bob, 1), (throw, 1) and (see, 1), (spot, 1), (run, 1); the Reduce function sums the counts per word to produce (bob, 1), (run, 1), (see, 2), (spot, 1), (throw, 1).]
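The same pipeline as the figure, written against the standard Hadoop MapReduce (org.apache.hadoop.mapreduce) API; a minimal sketch, with the class and job names chosen here rather than taken from the slides:

```java
// Word count: Map emits (word, 1); Reduce sums the 1s for each word.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map function: for each token in the input line, emit (token, 1).
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce function: sum the counts grouped under each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The framework performs the shuffle between the two phases, grouping every (word, 1) pair by key before the reducer sums them, exactly the step the figure shows between the map and reduce outputs.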