0% found this document useful (0 votes)
62 views

2012 IN4392 Lecture-5 CloudProgrammingModels

The document discusses cloud programming models and introduces key concepts such as bags of tasks, workflows, and parallel programming models. It provides examples of how bags of tasks can be used for parameter sweeps and Monte Carlo simulations. The document also discusses how bags of tasks became the dominant programming model for grid computing systems in the past.

Uploaded by

akbisoi1
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
62 views

2012 IN4392 Lecture-5 CloudProgrammingModels

The document discusses cloud programming models and introduces key concepts such as bags of tasks, workflows, and parallel programming models. It provides examples of how bags of tasks can be used for parameter sweeps and Monte Carlo simulations. The document also discusses how bags of tasks became the dominant programming model for grid computing systems in the past.

Uploaded by

akbisoi1
Copyright
© Attribution Non-Commercial (BY-NC)
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 95

IN4392 Cloud Computing Cloud Programming Models

Cloud Computing (IN4392) D. .!. "pema and #. Iosup. $it% &ontri'utions (rom Claudio Martella and )ogdan *%it. 2012-2013
1

Parallel and Distri'uted *roup+,- Del(t

,erms (or ,oda./s Dis&ussion


Programming model = language + libraries + runtime system that create a model of computation (an abstract machine) = an abstraction of a computer system Wikipedia Examples message!passing "s shared memory# data! "s task!parallelism# $ #'stra&tion le0el % What is the best abstraction le"el& = distance from physical machine Examples Assembly lo*!le"el "s Java is high le"el +any design trade!offs performance# ease!of!use# common!task optimi,ation# programming paradigm# $
'(1'!'(1) '

C%ara&teristi&s o( a Cloud Programming Model

1')234-

.ost model (Efficiency) = cost/performance# o"erheads# $ 0calability 1ault!tolerance 0upport for specific ser"ices .ontrol model# e-g-# fine!grained many!task scheduling 5ata model# including partitioning and placement# out!of! memory data access# etc6- 0ynchroni,ation model

'(1'!'(1)

#genda
12. )237ntroduction Cloud Programming in Pra&ti&e (,%e Pro'lem) 8rogramming +odels for .ompute!7ntensi"e Workloads 8rogramming +odels for 9ig 5ata 0ummary

'(1'!'(1)

,oda./s C%allenges

e0cience :he 1ourth 8aradigm :he 5ata 5eluge and 9ig 5ata 8ossibly others

'(1'!'(1)

e1&ien&e2 ,%e $%.


0cience experiments already cost '3;3(< budget
$ and perhaps incur 63< of the delays

+illions of lines of code *ith similar functionality


=ittle code reuse across pro>ects and application domains $ but last t*o decades? science is "ery similar in structure

+ost results difficult to share and reuse


.ase!in!point 0loan 5igital 0ky 0ur"ey digital map of '3< of the sky x spectra 2(:9+ sky sur"ey data '((++ astro!ob>ects (images) 1++ ob>ects *ith spectrum (spectra) @o* to make it *ork for this and the next generation of scientists&
'(1'!'(1) 4

0ource Aim Bray and Clex 0,alay#e0cience !! C :ransformed 0cientific +ethod# http //research-microsoft-com/en!us/um/people/gray/talks/DE.!.0:9Fe0cience-ppt

e1&ien&e (!o%n ,a.lor3 -4 1&i.,e&%.3 1999)


# ne5 s&ienti(i& met%od
.ombine science *ith 7: 1ull scientific process control scientific instrument or produce data from simulations# gather and reduce data# analy,e and model results# "isuali,e results +ostly compute!intensi"e# e-g-# simulation of complex phenomena

I, support
7nfrastructure =@. Brid# Gpen 0cience Brid# 5C0# DorduBrid# $ 1rom programming models to infrastructure management tools

"6amples
'(1'!'(1)

% Why is .omp0ci an example here&


6

H physics# 9ioinformatics# +aterial science# Engineering# Comp1&i

,%e 7ourt% Paradigm2 ,%e $%. (#n #ne&dotal "6ample)

,%e 80er5%elming *ro5t% o( 4no5ledge


When 1' men founded the Dumber of 1II) 1II6 Eoyal 0ociety in 144(# it *as 8ublications 1II6 '((1 possible for an educated person to encompass all of scientific kno*ledge- J$K 7n the last 3( years# such has been the pace of scientific ad"ance that e"en the best scientists cannot keep up *ith disco"eries at frontiers outside their o*n field- :ony 9lair# 8+ 0peech# +ay '((' 5ata Ling#:he scientific impact of nations#Dature?(2-

,%e 7ourt% Paradigm2 ,%e $%at

7rom

.pot%esis to Data
2

:housand years ago science *as empiri&al describing natural phenomena =ast fe* hundred years t%eoreti&al branch using models# generali,ations
. a 4G c2 a = 3 2 a

=ast fe* decades a &omputational branch simulating complex phenomena :oday (t%e 7ourt% Paradigm) data e6ploration %1 What is the 1ourth 8aradigm&

unify theory# experiment# and simulation 5ata captured by instruments or generated by simulator %' What are the dangers 8rocessed by soft*are of the 1ourth 8aradigm& 7nformation/Lno*ledge stored in computer 0cientist analy,es results using data management and statistics
'(1'!'(1)

0ource Aim Bray and :he 1ourth 8aradigm# http //research-microsoft-com/en!us/collaboration/fourthparadigm/

,%e 9Data Deluge:2 ,%e $%.


ME"ery*here you look# the Nuantity of information in the *orld is soaring- Cccording to one estimate# mankind created 13( exabytes (billion gigabytes) of data in '((3- :his year# it *ill create 1#'(( exabytes- +erely keeping up *ith this flood# and storing the bits that might be useful# is difficult enoughCnalysing it# to spot patterns and extract useful information# is harder still- :he 5ata 5eluge# :he Economist# '3 1ebruary '(1(
1(

9Data Deluge:2 ,%e Personal Meme6 "6ample

Oanne"ar 9ush in the 1I2(s record your life +7: +edia =aboratory :he @uman 0peechome 8ro>ect/:otalEecall# data mining/analysis/"isio 5eb Eoy and Eupal 8atel record practically e"ery *aking moment of their son?s first three years ('(< pri"acy time$7s this e"en legal&P 0hould it be&P) 11x1+8/12fps cameras# 12x14b!2QL@, mics# 2-2:9 EC75 + tapes# 1( computersR '((k hours audio!"ideo 5ata si,e '((B9/day# 1-389 total
11

9Data Deluge:2 ,%e *aming #nal.ti&s "6ample

E% 77 '(:9/year all logs @alo) 1-289 ser"ed statistics on player logs


1'

9Data Deluge:2 Datasets in Comp.1&i.


5ataset 0i,e http //g*a-e*i-tudelft-nl :he 1ailure :race Crchi"e 1:9/yr 1:9 1((B9 1(B9 Bam:C 8'8:C

http //fta-inria-fr

8eer!to!8eer :race Crchi"e $ 8WC# 7:C# .ECW5C5# $

1B9 T(4 T(I T1( T11 Sear


1)

1#(((s of scientists 1rom theory to practice


1)

9Data Deluge:2 ,%e Pro(essional $orld *ets Conne&ted

1eb '(1'
'(1'!'(1) 12

0ource Oincen,o .osen,a# :he 0tate of =inked7n# http //"incos-it/the!state!of!linkedin/

$%at is 9)ig Data:;


Oery large# distributed aggregations of loosely structured data# often incomplete and inaccessible Easily exceeds the processing capacity of con"entional database systems 8rinciple of 9ig 5ata When you can# keep e"erythingP :oo big# too fast# and doesn?t comply *ith the traditional database architectures
'(11!'(1' 13

,%e ,%ree 9<:s o( )ig Data


Oolume
+ore data "s- better models 5ata gro*s exponentially Cnalysis in near!real time to extract "alue 0calable storage and distributed Nueries

Oelocity
0peed of the feedback loop Bain competiti"e ad"antage fast recommendations 7dentify fraud# predict customer churn faster

Oariety
:he data can become messy text# "ideo# audio# etc 5ifficult to integrate into applications
'(11!'(1' 14

Cdapted from 5oug =aney# )5 data management# +E:C Broup/Bartner report# 1eb '((1- http //blogs-gartner-com/doug!laney/files/'(1'/(1/adI2I!)5!5ata! +anagement!.ontrolling!5ata!Oolume!Oelocity!and!Oariety-pdf

Data $are%ouse 0s. )ig Data

'(11!'(1'

16

0ource http //*ikibon-org/

#genda
1- 7ntroduction '- .loud 8rogramming in 8ractice (:he 8roblem) 3. Programming Models (or Compute-Intensi0e $or=loads 1. )ags o( ,as=s '- Workflo*s )- 8arallel 8rogramming +odels 2- 8rogramming +odels for 9ig 5ata 3- 0ummary
'(1'!'(1) 1Q

$%at is a )ag o( ,as=s ()o,); # 1.stem <ie5


9o: = set of >obs sent by a user$

$that start at most Us after the first >ob

% What is the user?s "ie*& :ime JunitsK

Why 9ag of :asks& 1rom the perspecti"e of the user# >obs in set are >ust tasks of a larger >ob C single useful result from the complete 9o: Eesult can be combination of all tasks# or a selection of the results of most or e"en a single task
'(1'!'(1) Iosup et al., The Characteristics and Performance of Groups of Jobs in Grids, Euro-Par, LNCS, ol.!"!#, pp. $%&-$'$, '((6.

1I

#ppli&ations o( t%e )o, Programming Model


8arameter s*eeps
.omprehensi"e# possibly exhausti"e in"estigation of a model Oery useful in engineering and simulation!based science

+onte .arlo simulations


0imulation *ith random elements fixed time yet limited inaccuracy Oery useful in engineering and simulation!based science

+any other types of batch processing


8eriodic computation# .ycle sca"enging Oery useful to automate operations and reduce *aste
'(1'!'(1) '(

)o,s )e&ame t%e Dominant Programming Model (or *rid Computing


(V0) :eraBrid!' D.0C (V0) .ondor V-Wisc(EV) EBEE (.C) [email protected]: (V0) Brid) (V0) B=GW (VL) EC= (DG#0E) DorduBrid (1E) BridW3((( (D=) 5C0!'

7rom >o's ?@A (


(V0) :eraBrid!' D.0C (V0) .ondor V-Wisc(EV) EBEE (.C) [email protected]: (V0) Brid) (V0) B=GW (VL) EC= (DG#0E) DorduBrid (1E) BridW3((( (D=) 5C0!'

'(

2(

4(

Q(

1((

7rom CP-,ime ( ?@A


'(1'!'(1)

'(

2(

4(

Q( '1

1((

Iosup and Epema( Grid Computin) *or+loads. IEEE Internet Computin) #,-&.( #'-&" -&/##.

Pra&ti&al #ppli&ations o( t%e )o, Programming Model

Parameter 15eeps in Condor ?1+4A


0ue the scientist *ants to 1ind the "alue of 1(x#y#,) for 1( "alues for x and y# and 4 "alues for , 1olution Eun a parameter s5eep# *ith 1( x 1( x 4 = 4(( parameter "alues 8roblem of the solution
0ue runs one >ob (a combination of x# y# and ,) on her lo*!end machine- 7t takes 4 hours :hat?s 1B0 da.s uninterrupted computation on 0ue?s machineP
'(1'!'(1) ''

0ource .ondor :eam# .ondor Vser?s :utorialhttp //cs-u*isc-edu/condor

Pra&ti&al #ppli&ations o( t%e )o, Programming Model

Parameter 15eeps in Condor ?2+4A


Universe Executable Input Output Error Log = = = = = = vanilla sim.exe input.txt output.txt error.txt sim.log

Requirements = OpSys == WINNT61 && .omplex 0=Cs can be specified easily Arch == INTEL && (Disk >= DiskUsage) && ((Memory * 1024)>=ImageSize)

InitialDir = run_$(Process) Clso passed as Queue 600 parameter to sim.exe


'(1'!'(1) ')

0ource .ondor :eam# .ondor Vser?s :utorialhttp //cs-u*isc-edu/condor

Pra&ti&al #ppli&ations o( t%e )o, Programming Model

Parameter 15eeps in Condor ?3+4A


% condor_submit sim.submit Submitting job(s) ................................................ ................................................ ................................................ ................................................ ................................................ ............... Logging submit event(s) ................................................ ................................................ ................................................ ................................................ ................................................ ............... 600 job(s) submitted to cluster 3.
'(1'!'(1) '2

0ource .ondor :eam# .ondor Vser?s :utorialhttp //cs-u*isc-edu/condor

Pra&ti&al #ppli&ations o( t%e )o, Programming Model

Parameter 15eeps in Condor ?4+4A


% condor_q -- Submitter: x.cs.wisc.edu : <128.105.121.53:510> : x.cs.wisc.edu ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD 3.0 frieda 4/20 12:08 0+00:00:05 R 0 9.8 sim.exe 3.1 frieda 4/20 12:08 0+00:00:03 I 0 9.8 sim.exe 3.2 frieda 4/20 12:08 0+00:00:01 I 0 9.8 sim.exe 3.3 frieda 4/20 12:08 0+00:00:00 I 0 9.8 sim.exe ... 3.598 frieda 4/20 12:08 0+00:00:00 I 0 9.8 sim.exe 3.599 frieda 4/20 12:08 0+00:00:00 I 0 9.8 sim.exe 600 jobs; 599 idle, 1 running, 0 held

'(1'!'(1)

'3

0ource .ondor :eam# .ondor Vser?s :utorialhttp //cs-u*isc-edu/condor

#genda
1- 7ntroduction '- .loud 8rogramming in 8ractice (:he 8roblem) 3. Programming Models (or Compute-Intensi0e $or=loads 1- 9ags of :asks 2. $or=(lo5s )- 8arallel 8rogramming +odels 2- 8rogramming +odels for 9ig 5ata 3- 0ummary
'(1'!'(1) '4

$%at is a $o=(lo5;

W1 = set of >obs *ith precedence (think 5irect Ccyclic Braph)

'(1'!'(1)

'6

#ppli&ations o( t%e $or=(lo5 Programming Model


.omplex applications
.omplex filtering of data .omplex analysis of instrument measurements

Cpplications created by non!.0 scientistsH


Workflo*s ha"e a natural correspondence in the real!*orld# as descriptions of a scientific procedure Oisual model of a graph sometimes easier to program

8recursor of the +apEeduce 8rogramming +odel (next slides)


'(1'!'(1) 'Q

HCdapted from .arole Boble and 5a"id de Eoure# .hapter in :he 1ourth 8aradigm# http //research-microsoft-com/en!us/collaboration/fourthparadigm/

$or=(lo5s "6isted in *rids3 'ut Did Not )e&ome a Dominant Programming Model
:races

0elected 1indings

=oose coupling Braph *ith )!2 le"els C"erage W1 si,e is )(/22 >obs 63<+ W1s are si,ed 2( >obs or less# I3< are si,ed '(( >obs or less
'I

'(1'!'(1) 0stermann et al., 0n the Characteristics of Grid *or+flo1s, CoreG2I3 Inte)rated 2esearch in Grid Computin) -CGI*., '((Q.

Pra&ti&al #ppli&ations o( t%e $7 Programming Model

)ioin(ormati&s in ,a0erna

'(1'!'(1)

)(

0ource .arole Boble and 5a"id de Eoure# .hapter in :he 1ourth 8aradigm# http //research-microsoft-com/en!us/collaboration/fourthparadigm/

#genda
1- 7ntroduction '- .loud 8rogramming in 8ractice (:he 8roblem) 3. Programming Models (or Compute-Intensi0e $or=loads 1- 9ags of :asks '- Workflo*s 3. Parallel Programming Models 2- 8rogramming +odels for 9ig 5ata 3- 0ummary
'(1'!'(1) )1

Parallel Programming Models

,-D &ourse in4049 Introdu&tion to PC

Cbstract machines ,as= (groups o( B3 B minutes)2 (5istributed) shared memory dis&uss parallel programming in &louds
5istributed memory +87

,as= (inter (inter-group dis&ussion)2 dis&uss. models .onceptual programming


+aster/*orker 5i"ide and conNuer 5ata / :ask parallelism 908

0ystem!le"el programming models


:hreads on B8Vs and other multi!cores
'(1'!'(1) )'

4arbanescu et al.( To1ards an Effecti e 5nified Pro)rammin) 6odel for 6an7-Cores. IP3PS *S &/#&

#genda
1')4. 37ntroduction .loud 8rogramming in 8ractice (:he 8roblem) 8rogramming +odels for .ompute!7ntensi"e Workloads Programming Models (or )ig Data 0ummary

'(1'!'(1)

))

"&os.stems o( )ig-Data Programming Models


% Where does +E!on!demand %fit& Where does 8regel!on!B8Vs fit&
@igh!=e"el =anguage
1lume 9ig%uery 0%= +eteor AC%= @i"e 8ig 0a*,all 0cope 5ryad=7D% C%=

8rogramming +odel
P#C, MapCedu&e Model Pregel 5ataflo* Clgebrix

Execution Engine
1lume 5remel :era C,ure Nep%ele @aloop Engine 0er"ice 5ata Engine :ree Engine adoop+ *irap% +87/ 5ryad D#CN Erlang @yracks

0torage Engine
0) B10 :era C,ure 5ata 5ata 0tore 0tore D71 Ooldemort = .osmos10 1 0 Csterix 9!tree
)2

H 8lus Yookeeper# .5D# etc-

'(1'!'(1)

Cdapted from 5agstuhl 0eminar on 7nformation +anagement in the .loud# http //***-dagstuhl-de/program/calendar/partlist/&semnr=11)'1X0VGB

#genda
1')4. 7ntroduction .loud 8rogramming in 8ractice (:he 8roblem) 8rogramming +odels for .ompute!7ntensi"e Workloads Programming Models (or )ig Data 1. MapCedu&e '- Braph 8rocessing )- Gther 9ig 5ata 8rogramming +odels 3- 0ummary

'(1'!'(1)

)3

MapCedu&e
+odel for processing and generating large data sets Enables a functional!like programming model 0plits computations into independent parallel tasks +akes efficient use of large commodity clusters @ides the details of paralleli,ation# fault!tolerance# data distribution# monitoring and load balancing
'(11!'(1' )4

MapCedu&e2 ,%e Programming Model


C programmig model# not a programming languageP 1- 7nput/Gutput 0et of key/"alue pairs '- +ap 8hase
8rocesses input key/"alue pair 8roduces set of intermediate pairs map (in_key, in_value) -> list(out_key, interm_value)

3. Reduce Phase:
Combines all intermediate values for a given key Produces a set of merged output values reduce(out_key, list(interm_value)) -> list(out_value)
'(11!'(1' )6

MapCedu&e in t%e Cloud


1acebook use case
0ocial!net*orking ser"ices Cnaly,e connections in the graph of friendships to recommend ne* connections

Boogle use case


Web!base email ser"ices# Boogle5ocs Cnaly,e messages and user beha"ior to optimi,e ad selection and placement

Soutube use case


Oideo!sharing sites Cnaly,e user preferences to gi"e better stream suggestions

'(11!'(1'

)Q

$ord&ount "6ample
1ile 1 :he big data is big- 1ile ' +apEeduce tames big data- Map 8utput2
+apper!1 (:he# 1)# (big# 1)# (data# 1)# (is# 1)# (big# 1) +apper!' (+apEeduce# 1)# (tames# 1)# (big# 1)# (data# 1)

Cedu&e Input
Eeducer!1 Eeducer!' Eeducer!) Eeducer!2 Eeducer!3 Eeducer!4 (:he# 1) (big# 1)# (big# 1)# (big# 1) (data# 1)# (data# 1) (is# 1) (+apEeduce# 1)# (+apEeduce# 1) (tames# 1)

Cedu&e 8utput
Eeducer!1 Eeducer!' Eeducer!) Eeducer!2 Eeducer!3 Eeducer!4 (:he# 1) (big# )) (data# ') (is# 1) (+apEeduce# ') (tames# 1)
)I

'(11!'(1'

Colored 1Euare Counter

'(11!'(1'

2(

MapCedu&e F *oogle2 "6ample 1

Benerating language model statistics


.ount Z of times e"ery 3!*ord seNuence occurs in large corpus of documents (and keep all those *here count [= 2)

+apEeduce solution
+ap extract 3!*ord seNuences =[ count from document Eeduce combine counts# and keep if count large enough

'(11!'(1'

21

http //***-slideshare-net/>hammerb/mapreduce!pact(4!keynote

MapCedu&e F *oogle2 "6ample 2

Aoining *ith other data


Benerate per!doc summary# but include per!host summary (e-g-# Z of pages on host# important terms on host)

+apEeduce solution
+ap extract host name from VE=# lookup per!host info# combine *ith per!doc data and emit Eeduce identity function (emit key/"alue directly)

'(11!'(1'

2'

http //***-slideshare-net/>hammerb/mapreduce!pact(4!keynote

More "6amples o( MapCedu&e #ppli&ations


5istributed Brep .ount of VE= Cccess 1reNuency Ee"erse Web!=ink Braph :erm!Oector per @ost 5istributed 0ort 7n"erted indices 8age ranking +achine learning algorithms $

'(11!'(1'

2)

"6e&ution 80er0ie5 (1+2)


% What is the performance problem raised by this step&

'(11!'(1'

22

"6e&ution 80er0ie5 (2+2)


Gne master# many *orkers example
7nput data split into + map tasks (42 / 1'Q +9) '((#((( maps Eeduce phase partitioned into E reduce tasks 2#((( reduces 5ynamically assign tasks to *orkers '#((( *orkers

+aster assigns each map / reduce task to an idle *orker


+ap *orker 5ata locality a*areness Eead input and generate E local files *ith key/"alue pairs Eeduce *orker Eead intermediate output from mappers 0ort and reduce to produce the output
'(11!'(1' 23

7ailures and )a&=-up ,as=s

'(11!'(1'

24

$%at is

adoop; # MapCedu&e "6e&. "ngine

7nspired by Boogle# supported by SahooP 5esigned to perform fast and reliable analysis of the big data =arge expansion in many domains such as
1inance# technology# telecom# media# entertainment# go"ernment# research institutions

'(11!'(1'

26

adoop F Da%oo
When you "isit yahoo# you are interacting *ith data processed *ith @adoopP

'(11!'(1'

2Q

https //open&irrus-org/system/files/8penCirrus adoop'((I-ppt

adoop -se Cases


1- 0earch Sahoo# Cma,on# Y"ents '- =og processing 1acebook# Sahoo# Aoost# =ast-fm )- Eecommendation 0ystems 1acebook 2- 5ata Warehouse 1acebook# CG= 3- Oideo and 7mage Cnalysis De* Sork :imes# Eyealike
'(11!'(1' 2I

http //cloud-berkeley-edu/data/hdfs-pdf

adoop Distri'uted 7ile 1.stem


Cssumptions and goals
5ata distributed across hundreds or thousands of machines 5etection of faults and automatic reco"ery 5esigned for batch processing "s- interacti"e use @igh throughput of data access "s- lo* latency of data access @igh aggregate band*idth and high scalability Write!once!read!many file access model +o"ing computations is cheaper than mo"ing data minimi,e net*ork congestion and increase throughput of the system

'(11!'(1'

3(

D71 #r&%ite&ture
+aster/sla"e architecture DameDode (DD)
+anages the file system namespace Eegulates access to files by clients Gpen/close/rename files or directories +apping of blocks to 5ataDodes Gne per node in the cluster +anages local storage of the node 9lock creation/deletion/replication initiated by DD 0er"e read/*rite reNuests from clients
'(11!'(1' 31

5ataDode (5D)

D71 Internals
Eeplica 8lacement 8olicy
1irst replica on one node in the local rack 0econd replica on a different node in the local rack :hird replica on a different node in a different rack impro"ed *rite performance ('/) are on the local rack) preser"es data reliability and read performance

.ommunication 8rotocols
=ayered on top of :.8/78 .lient 8rotocol client \ DameDode machine 5ataDode 8rotocol 5ataDodes \ DameDode DameDode responds to E8. reNuests issued by 5ataDodes / clients
'(11!'(1' 3'

adoop 1&%eduler
Aob di"ided into se"eral independent tasks executed in parallel
:he input file is split into chunks of 42 / 1'Q +9 Each chunk is assigned to a map task Eeduce task aggregate the output of the map tasks

:he master assigns tasks to the *orkers in 171G order


Aob:racker maintains a Nueue of running >obs# the states of the :ask:rackers# the tasks assignments :ask:rackers report their state "ia a heartbeat mechanism

5ata =ocality execute tasks close to their data 0peculati"e execution re!launch slo* tasks
'(11!'(1' 3)

MapCedu&e "0olution ?1+BA

adoop is Maturing2 Important Contri'utors


Dumber of patches
'(1'!'(1)

32

0ource http //***-theregister-co-uk/'(1'/(Q/16/communityFhadoop/

MapCedu&e "0olution ?2+BA

Gate 1&%eduler
=C:E 0cheduler
=ongest Cpproximate :ime to End 0peculati"ely execute the task that *ill finish farthest into the future

timeFleft = (1!progress0core) / progressEate


:asks make progress at a roughly constant rate Eobust to node heterogeneity

E.' 0ort running times


=C:E "s- @adoop "s- Do spec-

'(11!'(1'

33

8aharia et al.( Impro in) 6ap2educe performance in hetero)eneous en ironments. 0S3I &//%.

MapCedu&e "0olution ?3+BA

7#IC 1&%eduling
7solation and statistical multiplexing :*o!le"el architecture
1irst# allocates task slots across pools 0econd# each pool allocates its slots among multiple >obs 9ased on a max!min fairness policy

'(11!'(1'

34

8aharia et al.( 3ela7 schedulin)( a simple techni9ue for achie in) localit7 and fairness in cluster schedulin). EuroS7s &/#/. :lso T2 EECS-&//'-,,

MapCedu&e "0olution ?4+BA

Dela. 1&%eduling
5ata locality issues
@ead!of!line scheduling 171G# 1C7E =o* probability for small >obs to achie"e data locality 3Q< of >obs ] 1C.E9GGL ha"e ^ '3 maps Gnly 3< achie"e node locality "s- 3I< rack locality

'(11!'(1'

36

8aharia et al.( 3ela7 schedulin)( a simple techni9ue for achie in) localit7 and fairness in cluster schedulin). EuroS7s &/#/. :lso T2 EECS-&//'-,,

MapCedu&e "0olution ?B+BA

Dela. 1&%eduling
0olution
0kip the head!of!line >ob until you find a >ob that can launch a local task Wait :1 seconds before launching tasks on the same rack Wait :' seconds before launching tasks off!rack :1 = :' = 13 seconds =[ Q(< data locality

'(11!'(1'

3Q

8aharia et al.( 3ela7 schedulin)( a simple techni9ue for achie in) localit7 and fairness in cluster schedulin). EuroS7s &/#/. :lso T2 EECS-&//'-,,

48#G# and MapCedu&e ?1+9A


LGC=C
8lacement X allocation .entral for all +E clusters +aintains +E cluster metadata

+E!Eunner
.onfiguration X deployment +E cluster monitoring Bro*/0hrink mechanism

Gn!demand +E clusters
8erformance isolation 5ata isolation 1ailure isolation Oersion isolation

'(11!'(1'

3I

4oala and MapCedu&e ?2+9A

CesiHing Me&%anism
:*o types of nodes
Core nodes2 fully!functional nodes# *ith :ask:racker and 5ataDode (local disk access) ,ransient nodes2 compute nodes# *ith :ask:racker

8arameters
7 = Dumber of running tasks per number of a"ailable slots 8redefined 7min and 7ma6 thresholds 8redefined constant step gro51tep / s%rin=1tep , = time elapsed bet*een t*o successi"e resource offers

:hree policies
*ro5-1%rin= Poli&. (*1P)2 gro*!shrink but maintain 1 bet*een 7min and 7ma6 *reed.-*ro5 Poli&. (**P)2 gro*# shrink *hen *orkload done *reed.-*ro5-5it%-Data Poli&. (**DP)2 BB8 *ith core nodes (local disk access)
4(

'(11!'(1'

4oala and MapCedu&e ?3+9A

$or=loads


J1K J'K

IQ< of >obs ] 1acebook process 4-I +9 and take less than a minute J1K Boogle reported in '((2 computations *ith :9 of data on 1(((s of machines J'K
S- .hen# 0- Clspaugh# 5- 9orthakur# and E- Lat,# Energy Efficiency for =arge!0cale +apEeduce Workloads *ith 0ignificant 7nteracti"e Cnalysis# pp- 2)\34# '(1'A- 5ean and 0- Bhema*at# +apreduce 0implified 5ata 8rocessing on =arge .lusters# .omm- of the C.+# Ool- 31# no- 1# pp- 1(6\11)# '((Q-

'(11!'(1'

41

4oala and MapCedu&e ?4+9A

$ord&ount (,.pe 03 (ull s.stem)


.8V 5isk

1(( B9 input data 1( core nodes *ith Q map slots each Q(( map tasks executed in 6 *a"es Wordcount is .8V!bound in the map phase
'(11!'(1' 4'

4oala and MapCedu&e ?B+9A

1ort (,.pe 03 (ull s.stem)


.8V 5isk

8erformed by the +E frame*ork during the shuffling phase 7ntermediate key/"alue pairs are processed in increasing key order 0hort map phase *ith 2(<!4(< .8V utili,ation =ong reduce phase *hich is highly disk intensi"e
'(11!'(1' 4)

4oala and MapCedu&e ?I+9A

1peedup ((ull s.stem)


Gptimum

a) Wordcount

(:ype 1)

b) 0ort (:ype ')

0peedup relati"e to an +E cluster *ith 1( core nodes Close to linear speedup on &ore nodes

'(11!'(1'

42

4oala and MapCedu&e ?J+9A

"6e&ution ,ime 0s. ,ransient Nodes

7nput data of 2( B9 Wordcount output data = '( L9 0ort output data = 2( B9 Wordcount scales better than 0ort on transient nodes
'(11!'(1' 43

4oala and MapCedu&e ?K+9A

Per(orman&e o( t%e CesiHing Me&%anism


7minL0.2B 7ma6L1.2B gro51tep=3 s%rin=1tep=' ,*1P =)( s ,**(D)P =1'( s

0tream of 3( +E >obs +E cluster of '( core nodes + '( transient nodes BB8 increases the si,e of the data transferred across the net*ork B08 gro*s/shrinks based on the resource utili,ation of the cluster BB58 enables local *rites on the disks of pro"isioned nodes
'(11!'(1' 44

4oala and MapCedu&e ?9+9A

-se Case F ,-Del(t


Boal 8rocess 1(s :9 of 8'8 traces 5C0!2 constraints
:ime limit per >ob (13 min) Don!persistent storage

0olution
Eeser"e the nodes for se"eral days# import and process the data&

Gur approach
0plit the data into multiple subsets 0maller data sets =[ faster import and processing 0etup multiple +E clusters# one for each subset
46

'(11!'(1'

#genda
1')4. 7ntroduction .loud 8rogramming in 8ractice (:he 8roblem) 8rogramming +odels for .ompute!7ntensi"e Workloads Programming Models (or )ig Data 1- +apEeduce 2. *rap% Pro&essing )- Gther 9ig 5ata 8rogramming +odels 3- 0ummary

'(1'!'(1)

4Q

*rap% Pro&essing "6ample 1ingle-1our&e 1%ortest Pat% (111P)


5i>kstra?s algorithm
0elect node *ith minimal distance Vpdate neighbors G(_E_ + _O_`log_O_) *ith 1ibo@eap

7nitial dataset
C 9 . 5 --'(1'!'(1)

^(# (9# 3)# (5# ))[ ^inf# (E# 1)[ ^inf# (1# 3)[ ^inf# (9# 1)# (.# ))# (E# 2)# (1# 2)[
4I

0ource .laudio +artella# 8resentation on Biraph at :V 5elft# Cpr '(1'-

*rap% Pro&essing "6ample 111P in MapCedu&e


% What is the performance problem& +apper output distances
^9# 3[# ^5# )[# ^.# inf[# ---

9ut also graph structure


^C# ^(# (9# 3)# (5# ))[ ---

Eeducer input distances


9 ^inf# 3[# 5 ^inf# )[ ---

9ut also graph structure D >obs# *here D is the graph diameter


'(1'!'(1)

9 ^inf# (E# 1)[ --6(

0ource .laudio +artella# 8resentation on Biraph at :V 5elft# Cpr '(1'-

,%e Pregel Programming Model (or *rap% Pro&essing

9atch!oriented processing Euns in!memory Oertex!centric C87 1ault!tolerant Euns on +aster!0la"e architecture

GL# the actual model follo*s in the next slides


'(1'!'(1) 61

0ource .laudio +artella# 8resentation on Biraph at :V 5elft# Cpr '(1'-

,%e Pregel Programming Model (or *rap% Pro&essing


8rocessors

9ased on Oaliant?s 9ulk 0ynchronous 8arallel (908)


=ocal .omputation

D processing units *ith fast local memory 0hared communication medium 0eries of 1upersteps Blobal 0ynchroni,ation 9arrier .ommunication Ends *hen all "ote:o@alt

8regel executes initiali,ation# one or se"eral supersteps# shutdo*n


'(1'!'(1)

9arrier 0ynchroni,ation
6'

0ource .laudio +artella# 8resentation on Biraph at :V 5elft# Cpr '(1'-

Pregel ,%e 1uperstep


Each Oertex (execution in parallel)
Eecei"e messages from other "ertices 8erform o*n computation (user!defined function) +odify o*n state or state of outgoing messages +utate topology of the graph 0end messages to other "ertices

:ermination condition
Cll "ertices inacti"e Cll messages ha"e been transmitted

'(1'!'(1)

6)

0ource 8regel article-

Pregel2 ,%e <erte6-)ased #PI


7nput message 7mplements processing algorithm Gutput message

'(1'!'(1)

62

0ource 8regel article-

,%e Pregel #r&%ite&ture Master-$or=er


+aster assigns "ertices to Workers
Braph partitioning

+aster Worker 1
3

+aster coordinates 0upersteps +aster coordinates .heckpoints Workers execute "ertices compute() Workers exchange messages directly
1 ' 2 )
'(1'!'(1)

Worker

k
63

0ource 8regel article-

Pregel Per(orman&e 111P on 1 )illion-<erte6 )inar. ,ree

'(1'!'(1)

64

0ource 8regel article-

Pregel Per(orman&e 111P on Candom *rap%s3 <arious 1iHes

'(1'!'(1)

66

0ource 8regel article-

#pa&%e *irap% #n 8pen-1our&e Implementation o( Pregel


:asktracker
+ap 0lot +ap 0lot

:asktracker
+ap 0lot +ap 0lot

:asktracker
+ap 0lot +ap 0lot

:asktracker
+ap 0lot +ap 0lot

8ersistent computation state

Yookeeper

DD X A:

+aster

=oose implementation of 8regel 0trong community (1acebook# :*itter# =inked7n) Euns 1((< on existing @adoop clusters 0ingle +ap!only >ob
6Q

'(1'!'(1)

0ource .laudio +artella# 8resentation on Biraph at :V 5elft# Cpr '(1'-

%ttp2++in&u'ator.apa&%e.org+girap%+

Page ran= 'en&%mar=s


:iberium :an
Clmost 2((( nodes# shared among numerous groups in SahooP @adoop (-'(-'(2 (secure @adoop) 'x %uad .ore '-2B@,# '2 B9 EC+# 1x 4:9 @5

org-apache-giraph-benchmark-8ageEank9enchmark
Benerates data# number of edges# number of "ertices# Z of supersteps 1 master/YooLeeper '( supersteps Do checkpoints 1 random edge per "ertex
6I

0ource C"ery .hing presentation at @ortonWorkshttp //***-slideshare-net/a"eryching/'(111(12horton*orks/

$or=er s&ala'ilit. (2B0M 0erti&es)


3((( 23(( 2((( ,otal se&onds )3(( )((( '3(( '((( 13(( 1((( 3(( ( ( 3( 1(( 13( '(( M o( 5or=ers '3( )(( )3(
Q(

0ource C"ery .hing presentation at @ortonWorkshttp //***-slideshare-net/a"eryching/'(111(12horton*orks/

<erte6 s&ala'ilit. (300 5or=ers)


'3(( '((( ,otal se&onds 13(( 1((( 3(( ( ( 1(( '(( )(( 2(( 3(( M o( 0erti&es (in 100Ns o( millions) 4((
Q1

0ource C"ery .hing presentation at @ortonWorkshttp //***-slideshare-net/a"eryching/'(111(12horton*orks/

<erte6+5or=er s&ala'ilit.
M o( 0erti&es (100Ns o( millions) '3(( '((( ,otal se&onds 13(( )(( 1((( '(( 3(( ( ( 3( 1(( 13( '(( M o( 5or=ers '3( )(( )3(
Q'

4(( 3(( 2((

1(( (

0ource C"ery .hing presentation at @ortonWorkshttp //***-slideshare-net/a"eryching/'(111(12horton*orks/

#genda
1')4. 7ntroduction .loud 8rogramming in 8ractice (:he 8roblem) 8rogramming +odels for .ompute!7ntensi"e Workloads Programming Models (or )ig Data 1- +apEeduce '- Braph 8rocessing 3. 8t%er )ig Data Programming Models 3- 0ummary

'(1'!'(1)

Q)

1tratosp%ere
+eteor Nuery language# 0upremo operator frame*ork 8rogramming .ontracts (8C.:s) programming model
Extended set of 'nd order functions ("s +apEeduce) 5eclarati"e definition of data parallelism

Dephele execution engine


0chedules multiple dataflo*s simultaneously 0upports 7aa0 en"ironments based on Cma,on E.'# Eucalyptus

@510 storage engine


'(1'!'(1) Q2

1tratosp%ere Programming Contra&ts (P#C,s) ?1+2A


'nd!order function

Clso in +apEeduce
'(1'!'(1) Q3

0ource 8C.: o"er"ie*# https //stratosphere-eu/*iki/doku-php/*iki pactpm

1tratosp%ere Programming Contra&ts (P#C,s) ?2+2A

% @o* can 8C.:s optimi,e data processing&

'(1'!'(1)

Q4

0ource 8C.: o"er"ie*# https //stratosphere-eu/*iki/doku-php/*iki pactpm

1tratosp%ere Nep%ele

'(1'!'(1)

Q6

1tratosp%ere 0s MapCedu&e
8C.: extends +apEeduce
9oth propose 'nd!order functions (3 8C.:s "s +ap X Eeduce) 9oth reNuire from user 1st!order functions (*hat?s inside the +ap) 9oth can benefit from higher!le"el languages 8C.: ecosystem has 7aa0 support

Ley!"alue data model

'(1'!'(1)

QQ

0ource 1abian @ueske# =arge 0cale 5ata Cnalysis 9eyond +apEeduce# @adoop Bet :ogether# 1eb '(1'-

1tratosp%ere 0s MapCedu&e Pair5ise 1%ortest Pat%s ?1+3A


1loyd!Warshall Clgorithm
For k from 1 to n For i from 1 to n For j from 1 to n Di,j min( Di,j , Di,k + Dk,j )

'(1'!'(1)

QI

0ource 0tratosphere example# https //stratosphere-eu/*iki/lib/exe/detail-php/*iki all'allF spFtaskdescription-png&id=*iki<)Ca'aspexample

1tratosp%ere 0s MapCedu&e )- Aoin edges to triads ,riangle "numeration ?1+3A


7nput undirected graph edge!based Gutput triangles 1- Eead input graph '- 0ort by smaller "ertex 75

2- :riads to triangles

'(1'!'(1)

I(

0ource 1abian @ueske# =arge 0cale 5ata Cnalysis 9eyond +apEeduce# @adoop Bet :ogether# 1eb '(1' and 0tratosphere example-

1tratosp%ere 0s MapCedu&e ,riangle "numeration ?2+3A


+apEeduce 1ormulation

'(1'!'(1)

I1

0ource 1abian @ueske# =arge 0cale 5ata Cnalysis 9eyond +apEeduce# @adoop Bet :ogether# 1eb '(1' and 0tratosphere example-

1tratosp%ere 0s MapCedu&e ,riangle "numeration ?3+3A


0tratosphere 1ormulation

'(1'!'(1)

I'

0ource 1abian @ueske# =arge 0cale 5ata Cnalysis 9eyond +apEeduce# @adoop Bet :ogether# 1eb '(1' and 0tratosphere example-

#genda
1')2B. 7ntroduction .loud 8rogramming in 8ractice (:he 8roblem) 8rogramming +odels for .ompute!7ntensi"e Workloads 8rogramming +odels for 9ig 5ata 1ummar.

'(1'!'(1)

I)

Con&lusion ,a=e,a=e- ome Message


Programming model L &omputer s.stem a'stra&tion Programming Models (or Compute-Intensi0e $or=loads
+any trade!offs# fe* dominant programming models +odels bags of tasks# *orkflo*s# master/*orker# 908# $

Programming Models (or )ig Data


)ig data programming models %a0e e&os.stems +any trade!offs# many programming models +odels +apEeduce# 8regel# 8C.:# 5ryad# $ Execution engines @adoop# Loala++E# Biraph# 8C.:/Dephele# 5ryad# $

Cealit. &%e&=2 &loud programming is maturing


Gctober 1(# '(1' http //***-flickr-com/photos/dimitrisotiropoulos/2'(264421Q/ I2

Ceading Material (Ceall. #&ti0e 7ield)


$or=loads
Clexandru 7osup# 5ick @- A- Epema Brid .omputing Workloads- 7EEE 7nternet .omputing 13(') 1I!'4 ('(11)

,%e 7ourt% Paradigm :he 1ourth 8aradigm# http //research-microsoft-com/en!us/collaboration/fourthparadigm/ Programming Models (or Compute-Intensi0e $or=loads
Clexandru 7osup# +athieu Aan# Gmer G,an 0onme,# 5ick @- A- Epema :he .haracteristics and 8erformance of Broups of Aobs in BridsEuro!8ar '((6 )Q'!)I) Gstermann et al-# Gn the .haracteristics of Brid Workflo*s# .oreBE75 7ntegrated Eesearch in Brid .omputing (.B7W)# '((Qhttp //***-pds-e*i-tudelft-nl/aiosup/*ftraces(6charsFcamera-pdf .orina 0tratan# Clexandru 7osup# 5ick @- A- Epema C performance study of grid *orkflo* engines- BE75 '((Q '3!)'

Programming Models (or )ig Data


Aeffrey 5ean# 0an>ay Bhema*at +apEeduce 0implified 5ata 8rocessing on =arge .lusters- G057 '((2 1)6!13( Aeffrey 5ean# 0an>ay Bhema*at +apEeduce a flexible data processing tool- .ommun- C.+ 3)(1) 6'!66 ('(1() +atei Yaharia# Cndy Lon*inski# Cnthony 5- Aoseph# Eandy Lat,# and 7on 0toica- '((Q- 7mpro"ing +apEeduce performance in heterogeneous en"ironments- 7n 8roceedings of the Qth V0ED7b conference on Gperating systems design and implementation (G057W(Q)V0ED7b Cssociation# 9erkeley# .C# V0C# 'I!2'+atei Yaharia# 5hruba 9orthakur# Aoydeep 0en 0arma# Lhaled Elmeleegy# 0cott 0henker# 7on 0toica 5elay scheduling a simple techniNue for achie"ing locality and fairness in cluster scheduling- Euro0ys '(1( '43!'6Q :yson .ondie# Deil .on*ay# 8eter Cl"aro# Aoseph +- @ellerstein# Lhaled Elmeleegy# Eussell 0ears +apEeduce Gnline- D057 '(1( )1)!)'Q Br,egor, +ale*ic,# +atthe* @- Custern# Cart A- .- 9ik# Aames .- 5ehnert# 7lan @orn# Daty =eiser# Br,egor, .,a>ko*ski 8regel a system for large!scale graph processing- 07B+G5 .onference '(1( 1)3!124 5ominic 9attrc# 0tephan E*en# 1abian @ueske# Gde> Lao# Oolker +arkl# 5aniel Warneke Dephele/8C.:s a programming model and execution frame*ork for *eb!scale analytical processing- 0o.. '(1( 11I!1)(
'(1'!'(1) I3

You might also like