Distributed Computing
02/08/22 1
Textbook & Readings
Assignments:
This course will involve listening to lectures, reading papers,
writing reviews of papers, participating in class discussions,
presenting papers and leading class discussions.
Students will be required to write a review of each paper they read.
Distributed System
Distributed Systems - what and why?
Architecture
Challenges
Overview - principles and paradigms
Distributed system, distributed computing
Early computing was performed on a single
processor. Uni-processor computing can be called
centralized computing.
A distributed system is a collection of independent
computers, interconnected via a network, capable of
collaborating on a task.
Distributed computing is computing performed in a
distributed system.
Distributed Systems
[Figure: workstations on a local network, connected through a network host to the Internet]
DISTRIBUTED SYSTEMS
What is a distributed system?
Andrew Tanenbaum defines it as follows:
“A distributed system is a collection of independent computers that appear to
its users as a single coherent system.”
Is there any such system?
We will learn about the challenges in building “true” distributed
systems
“A distributed system is a collection of independent computers that are used jointly to
perform a single task or to provide a single service.”
Why Distributed?
Resource and Data Sharing
Printers, databases, multimedia servers etc.
Availability, Reliability
The loss of some instances can be hidden
Scalability, Extensibility
System grows with demands (e.g. extra servers)
Performance
Large combined capacity (CPU, memory, etc.) available
Horizontal distribution (the same logical level is distributed)
Inherent distribution, communication
Organizational distribution, e-mail, video conferencing
Vertical distribution (corresponding to organizational structure)
Problems of Distribution
Concurrency, Security
Clients must not disturb each other
Partial failure
We often cannot tell where the error occurred (e.g. in an RPC)
Location, Migration, Replication
Clients must be able to find their servers
Heterogeneity
Hardware, platforms, languages, management
Convergence
Between distributed systems and telecommunication
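The partial-failure problem above can be illustrated with a minimal sketch. It assumes a hypothetical in-process "RPC" stub (`call`) whose `fault` argument stands in for what goes wrong on the network. None of these names come from a real RPC library. The point is that the client sees the same timeout whether the request was lost before the server ran or the reply was lost afterwards.

```python
from dataclasses import dataclass
from typing import Tuple


class Timeout(Exception):
    """Raised when no reply arrives; the caller cannot tell why."""


@dataclass
class Server:
    counter: int = 0

    def increment(self) -> int:
        self.counter += 1
        return self.counter


def call(server: Server, fault: str) -> int:
    """Hypothetical RPC stub; `fault` selects what goes wrong on the wire."""
    if fault == "request_lost":
        raise Timeout()          # the server never saw the request
    result = server.increment()  # the server executes the operation
    if fault == "reply_lost":
        raise Timeout()          # the operation ran, but the reply vanished
    return result


def observe(fault: str) -> Tuple[str, int]:
    """Return what the client saw and the server's real state afterwards."""
    server = Server()
    try:
        call(server, fault)
        outcome = "ok"
    except Timeout:
        outcome = "timeout"
    return outcome, server.counter


if __name__ == "__main__":
    for fault in ("none", "request_lost", "reply_lost"):
        print(fault, observe(fault))
```

In both failure cases the client observes only "timeout", yet the server state differs. This is why blindly retrying a non-idempotent operation after a timeout can be unsafe.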
Examples of distributed systems
Network of workstations (NOW): a group of networked
personal workstations connected to one or more server
machines.
The Internet
An intranet: a network of computers and workstations within
an organization, separated from the Internet via a protective
device (a firewall).
A large-scale distributed system – eBay
Collection of Web servers: distributed database of hypertext
and multimedia documents.
Distributed file system on a LAN
Domain Name Service (DNS)
A small-scale distributed system – a smart home
Computers in a Distributed System
Workstations: computers used by end-users to
perform computing
Server machines: computers which provide
resources and services
Personal digital assistants (PDAs): handheld computers
connected to the system via a wireless
communication link.
THE ADVANTAGES OF DISTRIBUTED SYSTEMS
What are economic and technical reasons for having distributed
systems?
Economics:
Performance: By using the combined processing and storage capacity
of many nodes, performance levels can be reached that are beyond the
reach of centralized machines.
Scalability: Resources such as processing and storage capacity can be
increased incrementally.
Inherent distribution: Some applications, such as email and the Web
(where users are spread out over the whole world), are naturally
distributed. This includes cases where users are geographically
dispersed as well as when single resources (e.g., printers, data) need to
be shared.
Reliability: By having redundant components, the impact of hardware
and software faults on users can be reduced.
Better flexibility in meeting users’ needs.
THE DISADVANTAGES OF DISTRIBUTED SYSTEMS
Multiple points of failure: the failure of one or
more participating computers, or one or more
network links, can disrupt the system.
Security concerns: In a distributed system, there
are more opportunities for unauthorized access and attack.
Software complexity: Distributed software is more
complex and harder to develop than conventional
software.
Distributed systems are hard to build and
understand.
Centralized vs. Distributed Computing
[Figure: centralized computing (terminals attached to a mainframe computer) versus distributed computing (workstations and network hosts joined by network links)]
DISTRIBUTED SYSTEM ARCHITECTURE
Hardware Architecture:
§ Uniprocessor
§ Multiprocessor
§ Multicomputer
Software Architecture:
§ Uniprocessor OS
§ Multiprocessor OS
§ Network OS (NOS)
§ Distributed OS (DOS)
§ Middleware
A key characteristic of distributed systems is that they combine
a hardware aspect (independent computers)
with a software aspect (performing a task or providing a service).
HARDWARE ARCHITECTURE
Uniprocessor:
Properties:
1. Single processor
2. Direct memory access
Multiprocessor: Parallel machines fall into two groups: those that have shared
memory, usually called multiprocessors, and those that do not, sometimes called
multicomputers. Correspondingly, two major types of parallel architecture are
common in the industry: Shared Memory Architecture and Distributed
Memory Architecture. Shared Memory Architecture, in turn, is of two types:
Uniform Memory Access (UMA) and Non-Uniform Memory Access
(NUMA).
Properties:
1. Multiple processors
2. Direct memory access
3. Uniform memory access: access time to a memory location is independent
of which processor makes the request
4. Non-uniform memory access: the memory access time depends on the
memory location relative to a processor
Multicomputer:
Properties:
1. Multiple computers
2. Network Connection (to support higher bandwidth)
3. Homogeneous (all nodes have the same physical architecture) vs.
Heterogeneous (nodes differ in physical architecture)
SOFTWARE ARCHITECTURE
Uniprocessor OS:
Multiprocessor OS:
Similar to a uniprocessor OS but:
Kernel designed to handle multiple CPUs
Communication uses same primitives as uniprocessor OS
Tightly-coupled software on tightly-coupled hardware
Examples: high-performance servers
shared memory
single run queue
Network OS:
Properties:
1. No single system image (Users are aware of multiplicity of machines)
2. Individual nodes are highly autonomous
3. Access to resources of various machines is done explicitly by:
Logging in remotely to the appropriate machine (telnet)
Transferring data from remote machines to local machines via the File Transfer
Protocol (FTP)
4. Examples: Linux, Windows
Distributed OS:
Properties:
1. Users not aware of multiplicity of machines
• Access to remote resources similar to access to local resources
2. High degree of transparency (single system image)
3. Supports distributed services (these may include distributed shared memory, assignment
of tasks to processors, distributed storage, interprocess communication, transparent sharing of
resources, distributed resource management, etc.)
4. Homogeneous hardware
Middleware: Middleware enables the components to
coordinate their activities.
Properties:
1. System independent interface for distributed programming
2. Improves transparency (e.g., hides heterogeneity)
3. Provides services (e.g., naming service, transactions, etc.)
4. Provides programming model (e.g., distributed objects)
Why is Middleware ‘Winning’?
1. Builds on commonly available abstractions of
network OS
2. Examples: RPC, NFS, CORBA
3. Usually runs in user space
4. Less error-prone
5. Independence from OS, network protocol,
programming language, etc.
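As a rough illustration of RPC-style middleware running in user space, here is a minimal sketch using Python's standard `xmlrpc` modules. The `add` procedure, the `demo` helper and the use of a loopback address with an OS-chosen port are assumptions made for the example. They are not part of any system described in these slides.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(x: int, y: int) -> int:
    """The remote procedure; it runs in the server thread."""
    return x + y


def demo() -> int:
    # Port 0 asks the OS for any free port on the loopback interface.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    host, port = server.server_address
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The proxy marshals the call into an HTTP request and
        # unmarshals the reply, so the call site looks like a local call.
        with ServerProxy(f"http://{host}:{port}") as proxy:
            return proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()
        thread.join()


if __name__ == "__main__":
    print(demo())
```

The caller never touches sockets or message formats. That is the kind of system-independent programming interface the middleware layer is meant to provide.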
Comparison between Systems
Software Concepts
System      Description                                     Main Goal
DOS         Tightly-coupled operating system for            Hide and manage
            multiprocessors and homogeneous multicomputers  hardware resources
NOS         Loosely-coupled operating system for            Offer local services
            heterogeneous multicomputers (LAN and WAN)      to remote clients
Middleware  Additional layer on top of NOS implementing     Provide distribution
            general-purpose services                        transparency
Openness
An open distributed system is a system that offers services
according to standard rules that describe syntax and
semantics of the services.
Systems should conform to well-defined Interfaces
System should support Interoperability
Components of different origin can communicate
System should support Portability
Components work on different platforms
Standards – a necessity
DISTRIBUTED SYSTEMS IN CONTEXT
Networking:
Network Protocols, routing protocols, etc.
Distributed Systems: make use of networks
Operating Systems:
Resource management for single systems
Distributed Systems: management of distributed resources
BASIC PROBLEMS AND CHALLENGES IN
DISTRIBUTED SYSTEMS
The distributed nature of a system gives rise to the
following challenges:
Transparency
Scalability
Dependability
Performance
Flexibility
TRANSPARENCY
Concealment of the separation of the components of a distributed system
(single image view).
There are a number of forms of transparency:
Access: Local and remote resources are accessed in the same way.
Users should not need, or be able, to tell whether a resource (hardware
or software) is remote or local.
This is facilitated by a distributed operating system.
Location: Users unaware of location of resources
Migration: Resources can migrate without name change
Replication: Users unaware of existence of multiple copies
Failure: Users unaware of the failure of individual components
Concurrency: Users unaware of sharing resources with others
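Several of these forms of transparency can be sketched together. The example below is a minimal in-process model. `NameService`, `Replica` and `Proxy` are hypothetical names invented for illustration. The client reads through a logical name, and the proxy hides which replica answers, whether a replica has failed, and whether the resource has moved.

```python
from typing import Dict, List


class ReplicaDown(Exception):
    """Raised when a replica cannot serve a request."""


class Replica:
    """One copy of the data, living on some (hypothetical) host."""

    def __init__(self, host: str, data: Dict[str, str], up: bool = True) -> None:
        self.host, self.data, self.up = host, dict(data), up

    def read(self, key: str) -> str:
        if not self.up:
            raise ReplicaDown(self.host)
        return self.data[key]


class NameService:
    """Maps a logical name to the replicas that currently serve it."""

    def __init__(self) -> None:
        self._table: Dict[str, List[Replica]] = {}

    def bind(self, name: str, replicas: List[Replica]) -> None:
        self._table[name] = list(replicas)

    def lookup(self, name: str) -> List[Replica]:
        return list(self._table[name])


class Proxy:
    """Client-side stub: the caller only ever sees the logical name."""

    def __init__(self, names: NameService, name: str) -> None:
        self._names, self._name = names, name

    def read(self, key: str) -> str:
        # Resolve the name on every call, so rebinding (migration) is invisible.
        for replica in self._names.lookup(self._name):
            try:
                return replica.read(key)   # any live copy will do (replication)
            except ReplicaDown:
                continue                   # mask the failure and try the next copy
        raise ReplicaDown(f"no replica of {self._name!r} is reachable")


if __name__ == "__main__":
    data = {"motd": "hello"}
    names = NameService()
    names.bind("files", [Replica("host-a", data, up=False), Replica("host-b", data)])
    files = Proxy(names, "files")
    print(files.read("motd"))              # served by host-b; host-a's failure is hidden
    names.bind("files", [Replica("host-c", data)])
    print(files.read("motd"))              # resource migrated; the name is unchanged
```

Full transparency is not always achievable. If every replica is down, the proxy has no choice but to surface the failure to the caller.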
SCALABILITY
A system is said to be scalable if it can handle the addition of users and
resources without suffering a noticeable loss of performance or
increase in administrative complexity
Scale has three dimensions:
Size: number of users and resources (problem: overloading)
Geography: distance between users and resources (problem:
communication)
Administration: number of organizations that apply administrative
control over parts of the system (problem: administrative mess)
Note:
Scalability often conflicts with (small system) performance
DEPENDABILITY
Dependability of distributed systems is a double-edged sword:
• Distributed systems promise higher availability:
-Replication
• But availability may degrade:
-More potential points of failure
Dependability requires consistency, security, and fault tolerance
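The double-edged nature of availability can be made concrete with a little arithmetic. The sketch below assumes that components fail independently and that each is available with the same probability `a`. Both are simplifying assumptions for illustration, and the function names are hypothetical. With `n` replicas the service is up if any one replica is up. With `n` components that are all required, the service is up only if every one is up.

```python
def replicated_availability(a: float, n: int) -> float:
    """Service is up if at least one of n independent replicas is up."""
    return 1.0 - (1.0 - a) ** n


def chained_availability(a: float, n: int) -> float:
    """Service is up only if all n independent components are up."""
    return a ** n


if __name__ == "__main__":
    a, n = 0.9, 3
    print(f"one component:       {a:.3f}")
    print(f"{n} replicas (any):    {replicated_availability(a, n):.3f}")
    print(f"{n} components (all):  {chained_availability(a, n):.3f}")
```

With `a = 0.9` and `n = 3`, replication raises availability to 0.999, while a chain of three required components lowers it to 0.729. Adding machines helps only when they are redundant rather than additional dependencies.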
PERFORMANCE
Any system should strive for maximum performance
In distributed systems, performance directly conflicts with some other
desirable properties:
Transparency
Security
Dependability
Scalability
FLEXIBILITY
“A flexible distributed system can be configured to provide exactly the
services that a user or programmer needs”
Flexibility generally provides a number of key properties:
Extensibility: Components/services can be changed or added
Openness of interfaces and specifications
Interoperability
PRINCIPLES
There are several key principles underlying all distributed
systems. As such, any distributed system can be described
based on how those principles apply to that system.
• System Architecture
• Communication
• Synchronization
• Replication and Consistency
• Fault Tolerance
• Security
• Naming
During the rest of the course we will examine each of these principles in
detail.
PARADIGMS
Most distributed systems are built based on a particular
paradigm (or model)
•Shared memory
•Distributed objects
•Distributed file system
•Shared documents
•Distributed coordination
•Agents
This course will cover the first three in detail.
MISCELLANEOUS “RULES OF THUMB”
Here we present some rules of thumb that are relevant to the study and design of
distributed systems
•Trade-offs - Many of the challenges impose conflicting requirements. For example,
better scalability can cause worse overall performance. Designers have to decide
which property matters more.
•Separation of Concerns - Split a problem into individual concerns and address each
separately.
•End-to-End Argument - Some communication functions can only be reliably
implemented at the application level.
•Mechanism and Policy - A system should build mechanisms that allow flexible application of
policies.
•Keep It Simple - Make things as simple as possible, but no simpler.
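The mechanism rule above can be sketched as follows. The `Dispatcher` class is the mechanism: it tracks load and assigns requests. The rule for choosing a server is a policy passed in as a function. All names here are hypothetical and chosen only for illustration.

```python
from itertools import count
from typing import Callable, Dict, List

# A policy looks at the current load table and names the server to use.
Policy = Callable[[Dict[str, int]], str]


class Dispatcher:
    """Mechanism: tracks per-server load and assigns requests.

    Which server gets the next request is decided by a pluggable policy.
    """

    def __init__(self, servers: List[str], policy: Policy) -> None:
        self.load: Dict[str, int] = {s: 0 for s in servers}
        self.policy = policy

    def assign(self) -> str:
        server = self.policy(self.load)
        self.load[server] += 1
        return server


def least_loaded(load: Dict[str, int]) -> str:
    """Policy: pick the server with the fewest assigned requests."""
    return min(load, key=load.get)


def round_robin() -> Policy:
    """Policy factory: cycle through servers in name order, ignoring load."""
    ticket = count()

    def choose(load: Dict[str, int]) -> str:
        names = sorted(load)
        return names[next(ticket) % len(names)]

    return choose


if __name__ == "__main__":
    for policy in (least_loaded, round_robin()):
        d = Dispatcher(["a", "b"], policy)
        d.load["a"] = 5                    # pretend server "a" is already busy
        print([d.assign() for _ in range(3)])
```

Swapping the policy changes the behaviour without touching the dispatcher, which is the flexibility the rule asks for.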
OVERVIEW OF COURSE
Introduction
System Architecture and Communication
Replication and Consistency and Distributed Shared Memory
Middleware and Distributed Objects
Synchronization and Coordination
Fault Tolerance
Security
Naming
Distributed File Systems
Extras:
Some reading materials
Research papers