Classification of Distributed Computing Systems
• Unlike a cluster or grid, a P2P network does not use a dedicated
interconnection network.
Distributed file sharing: content distribution of MP3 music, video, etc.,
e.g. Gnutella, Napster, BitTorrent.
• Scalability is the ability of a system to handle a growing amount of work
in a capable/efficient manner, or its ability to be enlarged to
accommodate that growth.
• For example, it can refer to the capability of a system to increase total
throughput under an increased load when resources (typically hardware)
are added.
Scalability
• One form of scalability for parallel and distributed systems is:
• Size Scalability
This refers to achieving higher performance or more functionality by
increasing the machine size. Size in this case refers to adding processors,
cache, memory, storage, or I/O channels.
• Scale Horizontally and Vertically
Methods of adding more resources for a particular application fall into two
broad categories:
Scale Horizontally
To scale horizontally (or scale out) means to add more nodes to a system,
such as adding a new computer to a distributed software application. An
example might be scaling out from one Web server system to three.
The scale-out model has created an increased demand for shared data
storage with very high I/O performance, especially where processing of
large amounts of data is required.
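The scale-out idea above can be sketched in a few lines: instead of sending every request to one server, a dispatcher spreads requests across several nodes. This is a minimal illustrative sketch; the server names and round-robin policy are assumptions, not part of any particular system.

```python
from itertools import cycle

# Hypothetical scale-out setup: requests are spread round-robin across
# three web server nodes instead of one (node names are illustrative).
servers = ["web1", "web2", "web3"]

def round_robin_dispatcher(nodes):
    """Yield the next node to receive a request, cycling endlessly."""
    return cycle(nodes)

dispatcher = round_robin_dispatcher(servers)

# Assign six incoming requests; each node receives every third request.
assignments = [next(dispatcher) for _ in range(6)]
print(assignments)
```

Adding capacity under this model means appending another name to the `servers` list rather than replacing the hardware of an existing node.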
Scale Vertically
To scale vertically (or scale up) means to add resources to a single node in
a system, typically involving the addition of CPUs or memory to a single
computer.
Tradeoffs
There are tradeoffs between the two models. A larger number of computers
means increased management complexity, as well as a more complex
programming model and issues such as throughput and latency between
nodes.
Also, some applications do not lend themselves to a distributed computing
model.
In the past, the price difference between the two models has favored "scale
up" computing for those applications that fit its paradigm, but recent
advances in virtualization technology have blurred that advantage, since
deploying a new virtual system/server over a hypervisor is almost always
less expensive than actually buying and installing a real one.
Amdahl’s Law
It is typically cheaper to add a new node to a system in order to achieve
improved performance than to perform performance tuning to improve the
capacity that each node can handle. But this approach can have diminishing
returns as indicated by Amdahl’s Law.
Let α be the fraction of the workload that must execute sequentially. On n
processors the total execution time is

T(n) = α T + (1 − α) T / n

where the first term is the sequential execution time on a single processor
and the second term is the parallel execution time on n processing nodes.
The speedup is therefore

Speedup, S = T / T(n) = 1 / [α + (1 − α) / n]

For example, with α = 0.3, four processors give S ≈ 2.11 while eight give
S ≈ 2.58. Doubling the processing power has only improved the speedup by roughly
one-fifth.
System Efficiency, E = S / n = 1 / [α n + (1 - α ) ]
System efficiency can be rather low if the cluster size is very large.
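The diminishing returns described above are easy to see numerically. The sketch below computes speedup and efficiency from the formulas in this section; the sequential fraction α = 0.3 is an illustrative assumption, not a property of any real workload.

```python
# Amdahl's Law: speedup and efficiency when a fraction alpha of the
# workload is inherently sequential (alpha = 0.3 below is illustrative).

def speedup(alpha: float, n: int) -> float:
    """S = 1 / [alpha + (1 - alpha) / n]."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

def efficiency(alpha: float, n: int) -> float:
    """E = S / n = 1 / [alpha * n + (1 - alpha)]."""
    return speedup(alpha, n) / n

# Diminishing returns: doubling n from 4 to 8 raises the speedup only
# from about 2.11 to about 2.58, and efficiency keeps falling as the
# cluster size grows.
for n in (1, 4, 8, 64, 1024):
    print(n, round(speedup(0.3, n), 2), round(efficiency(0.3, n), 3))
```

Note how the speedup is bounded by 1/α (here about 3.33) no matter how many nodes are added, while efficiency for a very large cluster approaches zero.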
• All hardware, software, and network components may fail. Single points of
failure that bring down the entire system must be avoided when designing
distributed systems.