2012 IN4392 Lecture-5 CloudProgrammingModels
Cloud Computing (IN4392). D.H.J. Epema and A. Iosup. With contributions from Claudio Martella and Bogdan Ghit. 2012-2013
Cost model (Efficiency) = cost/performance, overheads, …
Scalability
Fault-tolerance
Support for specific services
Control model, e.g., fine-grained many-task scheduling
Data model, including partitioning and placement, out-of-memory data access, etc.
Synchronization model
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
4. Programming Models for Big Data
5. Summary
Today's Challenges
eScience: The Fourth Paradigm
The Data Deluge and Big Data
Possibly others
Source: Jim Gray and Alex Szalay, eScience -- A Transformed Scientific Method, http://research.microsoft.com/en-us/um/people/gray/talks/NRC-CSTB_eScience.ppt
IT support
Infrastructure
Examples: LHC Grid, Open Science Grid, DAS, NorduGrid, …
From programming models to infrastructure management tools
From Hypothesis to Data
Thousand years ago: science was empirical, describing natural phenomena
Last few hundred years: theoretical branch, using models, generalizations
$\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$
Last few decades: a computational branch, simulating complex phenomena
Today (the Fourth Paradigm): data exploration, unifying theory, experiment, and simulation
  Data captured by instruments or generated by simulator
  Processed by software
  Information/Knowledge stored in computer
  Scientist analyzes results using data management and statistics
Q1: What is the Fourth Paradigm?
Q2: What are the dangers of the Fourth Paradigm?
Vannevar Bush in the 1940s: record your life
MIT Media Laboratory: The Human Speechome Project/TotalRecall, data mining/analysis/visio
Deb Roy and Rupal Patel record practically every waking moment of their son's first three years (20% privacy time… Is this even legal?! Should it be?!)
11x1MP/14fps cameras, 14x16b-48KHz mics, 4.4TB RAID + tapes, 10 computers; 200k hours audio-video
Data size: 200GB/day, 1.5PB total
http://fta.inria.fr
Feb 2012
Velocity
  Speed of the feedback loop
  Gain competitive advantage: fast recommendations
  Identify fraud, predict customer churn faster
Variety
  The data can become messy: text, video, audio, etc.
  Difficult to integrate into applications
2011-2012
Adapted from: Doug Laney, 3D data management, META Group/Gartner report, Feb 2001. http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
   1. Bags of Tasks
   2. Workflows
   3. Parallel Programming Models
4. Programming Models for Big Data
5. Summary
Why Bag of Tasks?
From the perspective of the user, jobs in a set are just tasks of a larger job
A single useful result from the complete BoT
Result can be a combination of all tasks, or a selection of the results of most or even a single task
Iosup et al., The Characteristics and Performance of Groups of Jobs in Grids, Euro-Par, LNCS, vol. 4641, pp. 382-393, 2007.
Iosup and Epema, Grid Computing Workloads. IEEE Internet Computing 15(2): 19-26 (2011).
Complex SLAs can be specified easily:
    Requirements = OpSys == "WINNT61" &&
                   Arch == "INTEL" &&
                   (Disk >= DiskUsage) &&
                   ((Memory * 1024) >= ImageSize)
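The requirements expression can be read as a predicate that a matchmaker evaluates against each machine's advertised attributes. The following is a minimal Python sketch of that idea, not Condor code: the dictionaries, the machine names, and the function `satisfies_requirements` are hypothetical, and the units (Memory in MB, ImageSize and disk values in KB) are an assumption.

```python
def satisfies_requirements(machine, job):
    """Return True if the machine attributes meet the job's requirements.

    Mirrors the expression on the slide; attribute values are illustrative.
    """
    return (
        machine["OpSys"] == "WINNT61"
        and machine["Arch"] == "INTEL"
        and machine["Disk"] >= job["DiskUsage"]
        and machine["Memory"] * 1024 >= job["ImageSize"]
    )


# Hypothetical machine advertisements and one job description.
machines = [
    {"Name": "node-a", "OpSys": "WINNT61", "Arch": "INTEL", "Disk": 500_000, "Memory": 2048},
    {"Name": "node-b", "OpSys": "LINUX", "Arch": "INTEL", "Disk": 900_000, "Memory": 4096},
    {"Name": "node-c", "OpSys": "WINNT61", "Arch": "INTEL", "Disk": 10_000, "Memory": 2048},
]
job = {"DiskUsage": 100_000, "ImageSize": 1_000_000}

# Keep only the machines on which the job may run.
matches = [m["Name"] for m in machines if satisfies_requirements(m, job)]
print(matches)
```

Only the first machine passes all four conditions here: the second fails the operating-system test and the third does not have enough disk.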
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
   1. Bags of Tasks
   2. Workflows
   3. Parallel Programming Models
4. Programming Models for Big Data
5. Summary
What is a Workflow?
Adapted from: Carole Goble and David de Roure, Chapter in The Fourth Paradigm, http://research.microsoft.com/en-us/collaboration/fourthparadigm/
Workflows Existed in Grids, but Did Not Become a Dominant Programming Model
Traces
Selected Findings
  Loose coupling
  Graph with 3-4 levels
  Average WF size is 30/44 jobs
  75%+ WFs are sized 40 jobs or less, 95% are sized 200 jobs or less
Ostermann et al., On the Characteristics of Grid Workflows, CoreGRID Integrated Research in Grid Computing (CGIW), 2008.
Bioinformatics in Taverna
Source: Carole Goble and David de Roure, Chapter in The Fourth Paradigm, http://research.microsoft.com/en-us/collaboration/fourthparadigm/
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
   1. Bags of Tasks
   2. Workflows
   3. Parallel Programming Models
4. Programming Models for Big Data
5. Summary
Abstract machines
(Distributed) shared memory
Distributed memory: MPI
Task (groups of 5, 5 minutes): discuss parallel programming in clouds
Varbanescu et al., Towards an Effective Unified Programming Model for Many-Cores. IPDPS WS 2012
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
4. Programming Models for Big Data
5. Summary
Programming Model: PACT, MapReduce Model, Pregel, Dataflow, Algebrix
Execution Engine: Flume Engine, Dremel Service Tree, Tera Data Engine, Azure Engine, Nephele, Haloop, Hadoop/YARN, Giraph, MPI/Erlang, Dryad, Hyracks
Storage Engine: S3, GFS, Tera Data Store, Azure Data Store, HDFS, Voldemort, LFS, CosmosFS, Asterix B-tree
Adapted from: Dagstuhl Seminar on Information Management in the Cloud, http://www.dagstuhl.de/program/calendar/partlist/?semnr=11321&SUOG
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
4. Programming Models for Big Data
   1. MapReduce
   2. Graph Processing
   3. Other Big Data Programming Models
5. Summary
MapReduce
Model for processing and generating large data sets
Enables a functional-like programming model
Splits computations into independent parallel tasks
Makes efficient use of large commodity clusters
Hides the details of parallelization, fault-tolerance, data distribution, monitoring and load balancing
3. Reduce Phase:
Combines all intermediate values for a given key
Produces a set of merged output values
    reduce(out_key, list(interm_value)) -> list(out_value)
Wordcount Example
File 1: The big data is big.
File 2: MapReduce tames big data.
Map Output:
  Mapper-1: (The, 1), (big, 1), (data, 1), (is, 1), (big, 1)
  Mapper-2: (MapReduce, 1), (tames, 1), (big, 1), (data, 1)
Reduce Input:
  Reducer-1: (The, 1)
  Reducer-2: (big, 1), (big, 1), (big, 1)
  Reducer-3: (data, 1), (data, 1)
  Reducer-4: (is, 1)
  Reducer-5: (MapReduce, 1)
  Reducer-6: (tames, 1)
Reduce Output:
  Reducer-1: (The, 1)
  Reducer-2: (big, 3)
  Reducer-3: (data, 2)
  Reducer-4: (is, 1)
  Reducer-5: (MapReduce, 1)
  Reducer-6: (tames, 1)
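The three stages of the wordcount example (map, grouping by key, reduce) can be sketched in a few lines of single-process Python. This is only an illustration of the programming model, not Hadoop code; the function names `map_phase`, `shuffle`, and `reduce_phase` are made up for this sketch, and punctuation handling is simplified.

```python
from collections import defaultdict


def map_phase(document):
    """Emit (word, 1) for every word; trailing punctuation is stripped."""
    return [(word.strip(".,"), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all intermediate values for one key into a single count."""
    return key, sum(values)


files = ["The big data is big.", "MapReduce tames big data."]

# Each file would be handled by its own mapper; here they run one after another.
intermediate = [pair for doc in files for pair in map_phase(doc)]
counts = dict(reduce_phase(k, vs) for k, vs in shuffle(intermediate).items())
print(counts)
```

Each word ends up with a count equal to the number of times it occurs across the two files; in a real cluster the groups produced by `shuffle` would be distributed over several reducers.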
MapReduce solution
Map: extract 5-word sequences => count from document
Reduce: combine counts, and keep if count large enough
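The map and reduce steps for counting word sequences can be sketched as follows. This is a minimal single-process illustration; the function names, the `threshold` parameter, and the toy documents are assumptions made for the example.

```python
from collections import Counter


def map_sequences(document, n=5):
    """Emit (n-word sequence, 1) for every window of n consecutive words."""
    words = document.split()
    return [(tuple(words[i:i + n]), 1) for i in range(len(words) - n + 1)]


def reduce_sequences(pairs, threshold):
    """Sum the counts per sequence and keep those that reach the threshold."""
    counts = Counter()
    for sequence, count in pairs:
        counts[sequence] += count
    return {seq: c for seq, c in counts.items() if c >= threshold}


documents = ["a b c d e f", "a b c d e g"]
pairs = [p for doc in documents for p in map_sequences(doc)]
frequent = reduce_sequences(pairs, threshold=2)
print(frequent)
```

With these two toy documents, only the sequence shared by both of them survives the threshold.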
http://www.slideshare.net/jhammerb/mapreduce-pact06-keynote
MapReduce solution
Map: extract host name from URL, lookup per-host info, combine with per-doc data and emit
Reduce: identity function (emit key/value directly)
http://www.slideshare.net/jhammerb/mapreduce-pact06-keynote
What is Hadoop?
Inspired by Google, supported by Yahoo!
Designed to perform fast and reliable analysis of big data
Large expansion in many domains, such as:
  Finance, technology, telecom, media, entertainment, government, research institutions
Hadoop @ Yahoo
When you visit Yahoo, you are interacting with data processed with Hadoop!
http://cloud.berkeley.edu/data/hdfs.pdf
HDFS Architecture
Master/slave architecture
NameNode (NN)
  Manages the file system namespace
  Regulates access to files by clients
  Open/close/rename files or directories
  Mapping of blocks to DataNodes
DataNode (DN)
  One per node in the cluster
  Manages local storage of the node
  Block creation/deletion/replication initiated by NN
  Serve read/write requests from clients
HDFS Internals
Replica Placement Policy
  First replica on one node in the local rack
  Second replica on a different node in the local rack
  Third replica on a different node in a different rack
  Improved write performance (2/3 are on the local rack)
  Preserves data reliability and read performance
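The placement rule listed above can be sketched as a small function. This is a simplified illustration rather than the HDFS implementation: the topology dictionary, the node and rack names, and the choice of the writer's own node for the first replica are assumptions, and it presumes at least two nodes in the local rack and at least one other rack.

```python
import random


def place_replicas(writer, topology, rng=random):
    """Pick three (rack, node) locations following the rule on the slide.

    writer   -- (rack, node) of the client writing the block (assumed to be a cluster node)
    topology -- dict mapping rack name to the list of its nodes
    """
    local_rack, local_node = writer
    # First replica: a node in the local rack (here, the writer's own node).
    first = (local_rack, local_node)
    # Second replica: a different node in the same rack.
    second = (local_rack, rng.choice([n for n in topology[local_rack] if n != local_node]))
    # Third replica: any node in a different rack.
    remote_rack = rng.choice([r for r in topology if r != local_rack])
    third = (remote_rack, rng.choice(topology[remote_rack]))
    return [first, second, third]


topology = {"rack1": ["n1", "n2", "n3"], "rack2": ["n4", "n5"]}
replicas = place_replicas(("rack1", "n1"), topology, random.Random(0))
print(replicas)
```

Two of the three copies stay within one rack, so only one copy crosses the inter-rack link during a write, while the remote copy still protects against the loss of a whole rack.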
Communication Protocols
  Layered on top of TCP/IP
  Client Protocol: client - NameNode machine
  DataNode Protocol: DataNodes - NameNode
  NameNode responds to RPC requests issued by DataNodes / clients
Hadoop Scheduler
Job divided into several independent tasks executed in parallel
  The input file is split into chunks of 64 / 128 MB
  Each chunk is assigned to a map task
  Reduce tasks aggregate the output of the map tasks
Data locality: execute tasks close to their data
Speculative execution: re-launch slow tasks
LATE Scheduler
  Longest Approximate Time to End
  Speculatively execute the task that will finish farthest into the future
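One way to read "finish farthest into the future" is to estimate each running task's remaining time from its progress so far and pick the largest estimate. The sketch below assumes a simple progress-rate estimate (time left = (1 - progress) / (progress / elapsed)); the task names and numbers are invented, and details of a real scheduler such as caps on the number of speculative tasks are left out.

```python
def estimated_time_left(progress, elapsed):
    """Estimate remaining seconds from a progress score in (0, 1) and elapsed seconds."""
    rate = progress / elapsed          # progress made per second so far
    return (1.0 - progress) / rate     # remaining work divided by the observed rate


def pick_speculative_task(tasks):
    """Return the running task with the longest estimated time to end, or None."""
    estimates = {
        name: estimated_time_left(progress, elapsed)
        for name, (progress, elapsed) in tasks.items()
        if 0.0 < progress < 1.0
    }
    return max(estimates, key=estimates.get) if estimates else None


# Hypothetical running tasks: name -> (progress score, seconds running).
running = {"t1": (0.9, 90.0), "t2": (0.2, 80.0), "t3": (0.5, 50.0)}
chosen = pick_speculative_task(running)
print(chosen)
```

Task `t2` is chosen here: it has run almost as long as `t1` but has made far less progress, so its estimated finish time is the latest.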
Zaharia et al., Improving MapReduce performance in heterogeneous environments. OSDI 2008.
FAIR Scheduling
Isolation and statistical multiplexing
Two-level architecture
  First, allocates task slots across pools
  Second, each pool allocates its slots among multiple jobs
  Based on a max-min fairness policy
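Max-min fairness can be computed with a simple "water-filling" loop: capacity is split equally among the parties that still want more, and whatever a satisfied party leaves unused is redistributed. The sketch below applies the same function at both levels (slots to pools, then a pool's slots to its jobs); the pool names, job names, and demands are hypothetical, and fractional slots are allowed for simplicity.

```python
def max_min_share(capacity, demands):
    """Return a max-min fair allocation of `capacity` for the given demands."""
    allocation = {name: 0.0 for name in demands}
    unsatisfied = {name: float(d) for name, d in demands.items() if d > 0}
    remaining = float(capacity)
    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)   # equal split of what is left
        for name in list(unsatisfied):
            give = min(share, unsatisfied[name])
            allocation[name] += give
            unsatisfied[name] -= give
            remaining -= give
            if unsatisfied[name] <= 1e-9:      # demand met: stop giving to it
                del unsatisfied[name]
    return allocation


# Level 1: 100 task slots shared by three pools with different demands.
pools = max_min_share(100, {"ads": 20, "research": 50, "etl": 80})
# Level 2: the slots of one pool shared by the jobs inside it.
research_jobs = max_min_share(pools["research"], {"j1": 10, "j2": 100})
print(pools, research_jobs)
```

The small pool receives its full demand, and the two larger pools split the rest evenly; within the second level the same rule gives the small job everything it asks for.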
Zaharia et al., Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. EuroSys 2010. Also TR EECS-2009-55
Delay Scheduling
Data locality issues
  Head-of-line scheduling: FIFO, FAIR
  Low probability for small jobs to achieve data locality
  58% of jobs @ Facebook have < 25 maps
  Only 5% achieve node locality vs. 59% rack locality
Zaharia et al., Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. EuroSys 2010. Also TR EECS-2009-55
Delay Scheduling
Solution
  Skip the head-of-line job until you find a job that can launch a local task
  Wait T1 seconds before launching tasks on the same rack
  Wait T2 seconds before launching tasks off-rack
  T1 = T2 = 15 seconds => 80% data locality
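The skipping rule can be sketched as a function that is called whenever a node has a free slot. This is a simplified illustration and not the scheduler from the cited paper: the job dictionaries, the `rack_of` map, and the function names are hypothetical, and T2 is interpreted here as an additional wait after T1 has expired.

```python
T1 = 15.0  # seconds a job waits before accepting a rack-local slot (value from the slide)
T2 = 15.0  # further seconds before accepting an off-rack slot (interpretation assumed)


def allowed_locality(waited):
    """Loosest locality level a job accepts after waiting `waited` seconds."""
    if waited < T1:
        return "node"
    if waited < T1 + T2:
        return "rack"
    return "any"


def pick_job(jobs, free_node, rack_of, now):
    """Walk the queue in order and return (job name, locality) for the free node."""
    for job in jobs:
        level = allowed_locality(now - job["waiting_since"])
        if free_node in job["input_nodes"]:
            return job["name"], "node"
        same_rack = any(rack_of[n] == rack_of[free_node] for n in job["input_nodes"])
        if same_rack and level in ("rack", "any"):
            return job["name"], "rack"
        if level == "any":
            return job["name"], "off-rack"
        # Otherwise skip this job for now and look at the next one in the queue.
    return None


rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
jobs = [
    {"name": "head", "input_nodes": {"n3"}, "waiting_since": 0.0},
    {"name": "small", "input_nodes": {"n1"}, "waiting_since": 0.0},
]
early = pick_job(jobs, "n1", rack_of, now=5.0)    # head is skipped, small runs locally
middle = pick_job(jobs, "n2", rack_of, now=20.0)  # small accepts a rack-local slot
late = pick_job(jobs, "n1", rack_of, now=40.0)    # head has waited long enough
print(early, middle, late)
```

The head-of-line job is skipped while it has no local data on the free node, and it is only launched off-rack once both waiting periods have passed.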
Zaharia et al., Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling. EuroSys 2010. Also TR EECS-2009-55
MR-Runner
  Configuration & deployment
  MR cluster monitoring
  Grow/Shrink mechanism
On-demand MR clusters
  Performance isolation
  Data isolation
  Failure isolation
  Version isolation
Resizing Mechanism
Two types of nodes
  Core nodes: fully-functional nodes, with TaskTracker and DataNode (local disk access)
  Transient nodes: compute nodes, with TaskTracker
Parameters
  F = number of running tasks per number of available slots
  Predefined Fmin and Fmax thresholds
  Predefined constant step growStep / shrinkStep
  T = time elapsed between two successive resource offers
Three policies
  Grow-Shrink Policy (GSP): grow-shrink but maintain F between Fmin and Fmax
  Greedy-Grow Policy (GGP): grow, shrink when workload done
  Greedy-Grow-with-Data Policy (GGDP): GGP with core nodes (local disk access)
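A single decision of the Grow-Shrink Policy can be sketched as a comparison of F against the two thresholds. The function below is a minimal illustration under assumptions: the threshold values, step sizes, and the `min_nodes` floor are invented for the example, and the sketch ignores the interval T between resource offers.

```python
def gsp_resize(running_tasks, available_slots, nodes,
               f_min, f_max, grow_step, shrink_step, min_nodes):
    """Return the cluster size after one resource offer under a grow-shrink rule."""
    f = running_tasks / available_slots   # F: running tasks per available slot
    if f > f_max:
        return nodes + grow_step          # cluster is busy: add nodes
    if f < f_min:
        return max(min_nodes, nodes - shrink_step)  # cluster is idle: release nodes
    return nodes                          # F is within [f_min, f_max]: keep the size


# Hypothetical cluster with 80 slots and 14 nodes, observed at three load levels.
sizes = [
    gsp_resize(running, 80, 14, f_min=0.25, f_max=0.75,
               grow_step=2, shrink_step=1, min_nodes=10)
    for running in (70, 40, 10)
]
print(sizes)
```

The greedy policies differ only in when they shrink: they would keep growing on demand and release nodes once the workload is done.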
Workloads
98% of jobs @ Facebook process 6.9 MB and take less than a minute [1]
Google reported in 2004 computations with TB of data on 1000s of machines [2]
[1] Y. Chen, S. Alspaugh, D. Borthakur, and R. Katz, Energy Efficiency for Large-Scale MapReduce Workloads with Significant Interactive Analysis, pp. 43-56, 2012.
[2] J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Comm. of the ACM, Vol. 51, no. 1, pp. 107-113, 2008.
100 GB input data
10 core nodes with 8 map slots each
800 map tasks executed in 7 waves
Wordcount is CPU-bound in the map phase
Performed by the MR framework during the shuffling phase
Intermediate key/value pairs are processed in increasing key order
Short map phase with 40%-60% CPU utilization
Long reduce phase which is highly disk intensive
a) Wordcount (Type 1)
Speedup relative to an MR cluster with 10 core nodes
Close to linear speedup on core nodes
Input data of 40 GB
Wordcount output data = 20 KB
Sort output data = 40 GB
Wordcount scales better than Sort on transient nodes
Stream of 50 MR jobs
MR cluster of 20 core nodes + 20 transient nodes
GGP increases the size of the data transferred across the network
GSP grows/shrinks based on the resource utilization of the cluster
GGDP enables local writes on the disks of provisioned nodes
Solution
  Reserve the nodes for several days, import and process the data?
Our approach
  Split the data into multiple subsets
  Smaller data sets => faster import and processing
  Set up multiple MR clusters, one for each subset
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
4. Programming Models for Big Data
   1. MapReduce
   2. Graph Processing
   3. Other Big Data Programming Models
5. Summary
Initial dataset
  A: <0, (B, 5), (D, 3)>
  B: <inf, (E, 1)>
  C: <inf, (F, 5)>
  D: <inf, (B, 1), (C, 3), (E, 4), (F, 4)>
  ...
Batch-oriented processing
Runs in-memory
Vertex-centric API
Fault-tolerant
Runs on Master-Slave architecture
N processing units with fast local memory
Shared communication medium
Series of Supersteps
  Local Computation
  Communication
  Barrier Synchronization (global)
Ends when all voteToHalt
Termination condition
  All vertices inactive
  All messages have been transmitted
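The superstep structure and the termination condition can be illustrated with a single-process sketch of vertex-centric shortest paths. This is not Pregel or Giraph code: the function name `pregel_sssp`, the message dictionaries, and the small weighted graph are assumptions made for the example.

```python
INF = float("inf")


def pregel_sssp(graph, source):
    """Single-source shortest paths as a sequence of supersteps.

    graph -- dict mapping a vertex to a list of (neighbor, edge weight)
    Returns (distances, number of supersteps executed).
    """
    vertices = set(graph) | {n for edges in graph.values() for n, _ in edges}
    dist = {v: INF for v in vertices}
    inbox = {source: [0]}      # messages delivered at the start of a superstep
    supersteps = 0
    while inbox:               # no messages in flight means every vertex is inactive
        outbox = {}
        for vertex, messages in inbox.items():   # a message reactivates its vertex
            best = min(messages)
            if best < dist[vertex]:              # local computation
                dist[vertex] = best
                for neighbor, weight in graph.get(vertex, []):
                    # communication: propose a distance to each neighbor
                    outbox.setdefault(neighbor, []).append(best + weight)
            # otherwise the vertex sends nothing, i.e. it votes to halt
        inbox = outbox         # barrier: messages become visible in the next superstep
        supersteps += 1
    return dist, supersteps


graph = {
    "A": [("B", 5), ("D", 3)],
    "B": [("E", 1)],
    "C": [("F", 5)],
    "D": [("B", 1), ("C", 3), ("E", 4), ("F", 4)],
}
dist, supersteps = pregel_sssp(graph, "A")
print(dist, supersteps)
```

The loop stops exactly when both termination conditions hold: no vertex improved its value in the last superstep, and no messages remain to be delivered.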
Master coordinates Supersteps
Master coordinates Checkpoints
Workers execute vertices compute()
Workers exchange messages directly
[Figure: Master and Workers 1 … k]
[Figure: three Tasktrackers with two Map Slots each, ZooKeeper, NN & JT, Master]
Loose implementation of Pregel
Strong community (Facebook, Twitter, LinkedIn)
Runs 100% on existing Hadoop clusters
Single Map-only job
http://incubator.apache.org/giraph/
org.apache.giraph.benchmark.PageRankBenchmark
  Generates data, number of edges, number of vertices, # of supersteps
  1 master/ZooKeeper
  20 supersteps
  No checkpoints
  1 random edge per vertex
Vertex/worker scalability
[Figure: total seconds and # of vertices (100's of millions) vs. # of workers]
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
4. Programming Models for Big Data
   1. MapReduce
   2. Graph Processing
   3. Other Big Data Programming Models
5. Summary
Stratosphere
Meteor query language, Sopremo operator framework
Parallelization Contracts (PACTs) programming model
  Extended set of 2nd-order functions (vs MapReduce)
  Declarative definition of data parallelism
[Figure label: Also in MapReduce]
Stratosphere Nephele
Stratosphere vs MapReduce
PACT extends MapReduce
  Both propose 2nd-order functions (5 PACTs vs Map & Reduce)
  Both require from user 1st-order functions (what's inside the Map)
  Both can benefit from higher-level languages
  PACT ecosystem has IaaS support
Source: Fabian Hueske, Large Scale Data Analysis Beyond MapReduce, Hadoop Get Together, Feb 2012.
4. Triads to triangles
Source: Fabian Hueske, Large Scale Data Analysis Beyond MapReduce, Hadoop Get Together, Feb 2012 and Stratosphere example.
Agenda
1. Introduction
2. Cloud Programming in Practice (The Problem)
3. Programming Models for Compute-Intensive Workloads
4. Programming Models for Big Data
5. Summary
The Fourth Paradigm
  The Fourth Paradigm, http://research.microsoft.com/en-us/collaboration/fourthparadigm/
Programming Models for Compute-Intensive Workloads
  Alexandru Iosup, Mathieu Jan, Omer Ozan Sonmez, Dick H. J. Epema: The Characteristics and Performance of Groups of Jobs in Grids. Euro-Par 2007: 382-393
  Ostermann et al., On the Characteristics of Grid Workflows, CoreGRID Integrated Research in Grid Computing (CGIW), 2008. http://www.pds.ewi.tudelft.nl/~iosup/wftraces07chars_camera.pdf
  Corina Stratan, Alexandru Iosup, Dick H. J. Epema: A performance study of grid workflow engines. GRID 2008: 25-32