Intro: Distributed Systems
24 October 2013
Lecture 1
Slide Credits: Maarten van Steen
24 October 2013
ISE 437/SE 424: Distributed (Information) Systems
Topics for Today
Course Introduction and Syllabus
Definitions
Goals
Transparency
Openness
Scaling
Pitfalls
Types of Distributed Systems
Distributed Computing Systems
Distributed Information Systems
Source: TvS 1.1-1.3.1
ISE 437: Distributed Information Systems
Two-course series: ISE 437 and ISE 441: Distributed Algorithms in Communication Networks
This semester: Systems design principles
Definitions, kinds, technologies, techniques, tools
More: Practical code, applications
Less: Formal analysis, proofs, complexity, theory
Next semester: Algorithmic properties and theory
Classes of algorithms, models, theories
More: Formal analysis, proofs, complexity, theory
Less (or none): coding and applications
Distributed System: Definition
A distributed system is a piece of software that ensures that:
a collection of independent computers appears to its users as a single coherent system
Two aspects: (1) independent computers and (2) single system, which leads to middleware.
Goals of Distributed Systems
Making resources available
Distribution transparency
Openness
Scalability
Distribution Transparency
Transparency   Description
Access         Hides differences in data representation and invocation mechanisms
Location       Hides where an object resides
Migration      Hides from an object the ability of a system to change that object's location
Relocation     Hides from a client the ability of a system to change the location of an object to which the client is bound
Replication    Hides the fact that an object or its state may be replicated and that replicas reside at different locations
Concurrency    Hides the coordination of activities between objects to achieve consistency at a higher level
Failure        Hides failure and possible recovery of objects
Note: Distribution transparency is a nice goal, but achieving it is a different story.
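Location and relocation transparency can be sketched with a client-side proxy that resolves a name on every call. This is only an illustration under simplifying assumptions: `Server`, `NameService`, `ObjectProxy`, and `migrate` are hypothetical names that do not come from the slides, and a real system would resolve names over the network.

```python
class Server:
    """Hypothetical machine that stores objects; `site` is its location."""

    def __init__(self, site):
        self.site = site
        self.objects = {}


class NameService:
    """Hypothetical registry mapping an object name to its current server."""

    def __init__(self):
        self._location = {}

    def bind(self, name, server):
        self._location[name] = server

    def resolve(self, name):
        return self._location[name]


class ObjectProxy:
    """Client-side handle: the client knows the name, never the location."""

    def __init__(self, names, name):
        self._names = names
        self._name = name

    def read(self):
        # Resolve on every call, so a moved object is still found.
        server = self._names.resolve(self._name)
        return server.objects[self._name]


def migrate(names, name, source, target):
    """Move an object and rebind its name; clients are not informed."""
    target.objects[name] = source.objects.pop(name)
    names.bind(name, target)
```

A client that holds an `ObjectProxy` keeps working after `migrate` runs, which is the sense in which the move is hidden. The price is an extra lookup per call.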
Degree of Transparency
Observation: Aiming at full distribution transparency may be too much.
Users may be located in different continents
Completely hiding failures of networks and nodes is (theoretically and practically) impossible
You cannot distinguish a slow computer from a failing one
You can never be sure that a server actually performed an operation before a crash
Full transparency will cost performance, exposing distribution of the system
Keeping Web caches exactly up-to-date with the master
Immediately flushing write operations to disk for fault tolerance
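The claim that a slow computer cannot be distinguished from a failing one can be seen from the client's side with a timeout. A minimal sketch, assuming two simulated servers and a hypothetical helper `call_with_timeout` (none of these names appear in the slides):

```python
import concurrent.futures
import time


def slow_server():
    """Simulated server that is alive but needs 0.5 s to answer."""
    time.sleep(0.5)
    return "done"


def crashed_server():
    """Simulated crashed server: the long sleep stands in for a reply
    that does not arrive while the client is still waiting."""
    time.sleep(1.0)
    return None


def call_with_timeout(server, timeout):
    """Return the reply, or "timeout" if none arrived within `timeout` seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(server)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return "timeout"
    finally:
        # Do not block on the worker thread; the client has given up on it.
        pool.shutdown(wait=False)
```

With a short timeout both calls yield the same observation, "timeout", so the client cannot tell which case it is in. A longer timeout only moves the boundary; it does not remove the ambiguity.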
Openness of Distributed Systems
Open distributed system
Be able to interact with services from other open systems, irrespective of the underlying environment:
Systems should conform to well-defined interfaces
Systems should support portability of applications
Systems should easily interoperate
Achieving openness
At least make the distributed system independent from heterogeneity of the underlying environment:
Hardware
Platforms
Languages
Policy versus Mechanisms
Implementing openness
Requires support for different policies:
What level of consistency do we require for client-cached data?
Which operations do we allow downloaded code to perform?
Which QoS requirements do we adjust in the face of varying bandwidth?
What level of secrecy do we require for communication?
Implementing openness
Ideally, a distributed system provides only mechanisms:
Allow (dynamic) setting of caching policies
Support different levels of trust for mobile code
Provide adjustable QoS parameters per data stream
Offer different encryption algorithms
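The separation of policy from mechanism can be illustrated with the caching example. In this sketch the cache is the mechanism and the freshness rule is a policy that can be replaced at run time. All names (`ClientCache`, `max_age`, `always_revalidate`) are hypothetical and chosen for illustration only.

```python
import time
from typing import Callable, Dict, Tuple

# A policy decides whether a cached entry of a given age (in seconds) may be used.
Policy = Callable[[float], bool]


def always_revalidate(age: float) -> bool:
    """Strict policy: never trust the cache."""
    return False


def max_age(seconds: float) -> Policy:
    """Relaxed policy: trust entries younger than `seconds`."""
    def policy(age: float) -> bool:
        return age <= seconds
    return policy


class ClientCache:
    """Mechanism only: stores entries and asks the current policy about freshness."""

    def __init__(self, fetch: Callable[[str], str], policy: Policy,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._policy = policy
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def set_policy(self, policy: Policy) -> None:
        """Dynamic setting of the caching policy."""
        self._policy = policy

    def get(self, key: str) -> str:
        now = self._clock()
        if key in self._entries:
            value, stored_at = self._entries[key]
            if self._policy(now - stored_at):
                return value
        # Missing or considered stale: go back to the origin.
        value = self._fetch(key)
        self._entries[key] = (value, now)
        return value
```

The cache code never changes when the consistency requirement does; only the policy object passed to `set_policy` changes.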
Scale in Distributed Systems
Observation
Many developers of modern distributed systems easily use the adjective "scalable" without making clear why their system actually scales.
Scalability
At least three components:
Number of users and/or processes (size scalability)
Maximum distance between nodes (geographical scalability)
Number of administrative domains (administrative scalability)
Observation
Most systems account only, to a certain extent, for size scalability. The (non)solution: powerful servers. Today, the challenge lies in geographical and administrative scalability.
Techniques for Scaling
Hide communication latencies
Avoid waiting for responses; do something else:
Make use of asynchronous communication
Have separate handler for incoming response
Problem: not every application fits this model
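Asynchronous communication with a separate response handler might look as follows. This is a sketch using Python's `asyncio`; `remote_call` is a hypothetical stand-in that simulates network latency with a sleep rather than contacting a real server.

```python
import asyncio


async def remote_call(request: str, delay: float) -> str:
    """Stand-in for a remote request; `delay` simulates network latency."""
    await asyncio.sleep(delay)
    return f"reply to {request}"


async def client() -> list:
    log = []

    def on_response(task: asyncio.Task) -> None:
        # Separate handler, run by the event loop when the reply arrives.
        log.append(task.result())

    task = asyncio.create_task(remote_call("query", 0.05))
    task.add_done_callback(on_response)

    # Do something else instead of blocking on the reply.
    log.append("local work done")

    await task              # only so the demo ends after the reply arrived
    await asyncio.sleep(0)  # give the handler a chance to run
    return log


def run_demo() -> list:
    return asyncio.run(client())
```

Calling `run_demo()` returns `["local work done", "reply to query"]`: the local work completes while the request is still in flight. An interactive application that needs the reply before it can do anything useful gains nothing from this, which is the problem noted above.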
Techniques for Scaling
Distribution
Partition data and computations across multiple machines:
Move computations to clients (Java applets)
Decentralized naming services (DNS)
Decentralized information systems (WWW)
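One simple way to partition data across machines is to hash each key to a node. This is an assumed technique chosen for illustration; the slides do not prescribe it, and DNS in particular partitions by name hierarchy rather than by hashing. The function names `node_for` and `partition` are hypothetical.

```python
import hashlib


def node_for(key: str, nodes: list) -> str:
    """Pick the node responsible for `key` from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def partition(keys, nodes):
    """Group keys by the node that stores them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for(key, nodes)].append(key)
    return placement
```

Every client computes the same placement without consulting a central directory. The drawback of this naive scheme is that changing the number of nodes moves most keys.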
Techniques for Scaling
Replication/caching
Make copies of data available at different machines:
Replicated file servers and databases
Mirrored Web sites
Web caches (in browsers and proxies)
File caching (at server and client)
Scaling: The Problem
Observation
Applying scaling techniques is easy, except for one thing:
Having multiple copies (cached or replicated) leads to inconsistencies: modifying one copy makes that copy different from the rest.
Always keeping copies consistent and in a general way requires global synchronization on each modification.
Global synchronization precludes large-scale solutions.
Observation
If we can tolerate inconsistencies, we may reduce the need for global synchronization, but tolerating inconsistencies is application dependent.
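The trade-off can be made concrete with a toy replicated store. This is a sketch under simplifying assumptions: replicas are in-process dictionaries, and `ReplicatedStore`, `write_local`, and `write_all` are hypothetical names.

```python
class ReplicatedStore:
    """Toy store with several copies of the same data."""

    def __init__(self, n_replicas: int):
        self.replicas = [{} for _ in range(n_replicas)]
        self.messages_sent = 0  # counts update messages, as a rough cost measure

    def write_local(self, replica_id: int, key: str, value) -> None:
        """Update one copy only: cheap, but the other copies become stale."""
        self.replicas[replica_id][key] = value

    def write_all(self, key: str, value) -> None:
        """Update every copy before returning: consistent, but the cost
        grows with the number of replicas on each modification."""
        for replica in self.replicas:
            replica[key] = value
            self.messages_sent += 1

    def read(self, replica_id: int, key: str):
        return self.replicas[replica_id].get(key)
```

After `write_local(0, "x", 2)`, a read at replica 1 still returns the old value. Whether that is acceptable depends on the application: a cached Web page can usually be a little stale, a bank balance usually cannot.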
Developing Distributed Systems: Pitfalls
Observation
Many distributed systems are needlessly complex because of mistakes that required patching later on. There are many false assumptions:
The network is reliable
The network is secure
The network is homogeneous
The topology does not change
Latency is zero
Bandwidth is infinite
Transport cost is zero
There is one administrator
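Code that does not assume a reliable network typically bounds its waiting and retries failed calls. A minimal sketch, assuming a hypothetical helper `call_with_retries` and an operation that signals network trouble by raising `ConnectionError`:

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Run `operation`, retrying on ConnectionError with exponential backoff.

    Retrying is only safe if the operation is idempotent, because a failed
    call may still have been executed by the server.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as error:
            last_error = error
            if attempt < attempts - 1:
                # Wait longer after each failure: 1x, 2x, 4x, ... base_delay.
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

The `sleep` parameter exists only so the waiting can be replaced in tests. The other assumptions in the list need different measures, for example encryption for an insecure network.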
Types of Distributed Systems
Distributed Computing Systems
Distributed Information Systems
Distributed Pervasive Systems
Distributed Computing Systems
Observation
Many distributed systems are configured for High-Performance Computing
Cluster Computing
Essentially a group of high-end systems connected through a LAN:
Homogeneous: same OS, near-identical hardware
Single managing node
Distributed Computing Systems
Grid Computing
The next step: lots of nodes from everywhere:
Heterogeneous
Dispersed across several organizations
Can easily span a wide-area network
Note
To allow for collaborations, grids generally use virtual organizations. In essence, this is a grouping of users (or better: their IDs) that will allow for authorization on resource allocation.
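A virtual organization as described in the note can be modelled as a set of user IDs plus the resources those users may be allocated. This is a deliberately small sketch; `VirtualOrganization` and `may_use` are hypothetical names, and real grid middleware relies on certificates and site-level policies that are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganization:
    """Grouping of user IDs, possibly from several real organizations."""
    name: str
    member_ids: set = field(default_factory=set)
    allowed_resources: set = field(default_factory=set)

    def may_use(self, user_id: str, resource: str) -> bool:
        # Authorization follows from membership, not from the user's home site.
        return user_id in self.member_ids and resource in self.allowed_resources
```

A resource owner then grants access to the virtual organization once, instead of to each user individually.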
31 October 2013
ISE 437/SE424: Distributed (Information) Systems
Distributed Information Systems
Observation
A vast number of distributed systems in use today are forms of traditional information systems that now integrate legacy systems.
Example: Transaction processing systems.
Atomicity: Either all operations succeed, or all of them fail. When the transaction fails, the state of the object will remain unaffected by the transaction.
Consistency: A transaction establishes a valid state transition. This does not exclude the possibility of invalid, intermediate states during the transaction's execution.
Isolation: Concurrent transactions do not interfere with each other. It appears to each transaction T that other transactions occur either before T, or after T, but never both.
Durability: After the execution of a transaction, its effects are made permanent: changes to the state survive failures.
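Atomicity and consistency can be observed with a single-node database. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `account` table and the helper names are assumptions made for this example, and nothing here is distributed.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    """Create a toy bank whose consistency rule is: no negative balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def transfer(conn, source: str, target: str, amount: int) -> bool:
    """Move `amount` from source to target as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Intermediate state: the money exists twice until the debit runs.
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, target))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, source))
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was undone too.
        return False


def balances(conn) -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A transfer of 500 from an account holding 100 fails on the second statement, and the credit from the first statement is rolled back with it, so the balances are unchanged.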
Transaction Processing Monitor
Observation
In many cases, the data involved in a transaction is distributed across several servers. A TP Monitor is responsible for coordinating the execution of a transaction.
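The slide does not say how the coordination is done. One common approach is a two-phase commit, sketched below as an assumption rather than as the TP monitor's actual protocol. `Participant` and `run_transaction` are hypothetical names, and failures of the coordinator itself are ignored.

```python
class Participant:
    """Hypothetical server holding part of the transaction's data."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        """Phase 1: vote on whether the local part can be committed."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def run_transaction(participants) -> bool:
    """Coordinator role: commit everywhere or nowhere."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all voted yes, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

A single "no" vote is enough to abort the transaction on every server, which preserves atomicity across machines.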
Distr. IS: Enterprise Application Integration
Problem
A TP monitor doesn't separate apps from their databases. Also needed are facilities for direct communication between apps.
Remote Procedure Call (RPC)
Message-Oriented Middleware (MOM)
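The difference between the two styles can be sketched in one process. Here a plain function call stands in for RPC and a `queue.Queue` stands in for the message broker; `billing_service`, `place_order_rpc`, `place_order_mom`, and `billing_worker` are hypothetical names, and no real network or middleware product is involved.

```python
import queue


def billing_service(order: dict) -> str:
    """Hypothetical application that other applications want to reach."""
    return f"invoice for {order['item']}"


def place_order_rpc(order: dict) -> str:
    """RPC style: the caller blocks until the callee returns a result.
    In a real system a stub would marshal this call over the network."""
    return billing_service(order)


def place_order_mom(order: dict, mailbox: queue.Queue) -> None:
    """MOM style: drop a message in a queue and continue immediately.
    The receiver does not have to be running at this moment."""
    mailbox.put(order)


def billing_worker(mailbox: queue.Queue, results: list) -> None:
    """Receiver: processes queued messages until it sees the None sentinel."""
    while True:
        order = mailbox.get()
        if order is None:
            break
        results.append(billing_service(order))
```

With RPC both applications must be up at the same time. With the queue, an order can be placed before `billing_worker` is started and is processed whenever the worker runs.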
Conclusion
Course Introduction and Syllabus
Definitions
Goals
Transparency
Openness
Scaling
Pitfalls
Types of Distributed Systems
Distributed Computing Systems
Distributed Information Systems