Lecture 1 Introduction To Distributed Systems - 034922
Examples:
• ATM
• Internet
• Distributed manufacturing system (e.g., automated assembly line)
• Network of branch office computers
• In distributed computing, each processor has its own private memory (distributed
memory). Information is exchanged by passing messages between the processors.
• Distributed systems
• Middleware
• Distributed applications
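The message-passing model described above (private memory, information exchanged only through messages) can be sketched as follows. This is a minimal illustration, not a real distributed system: a thread stands in for a remote processor, and the names `worker` and `run_demo` are hypothetical.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """A stand-in 'processor': its state is private and it only talks via messages."""
    private_total = 0  # local memory, never read directly by anyone else
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: no more work
            break
        private_total += msg
    outbox.put(private_total)  # the result leaves only as a message


def run_demo(values):
    """Send each value as a message, then collect the single reply message."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for v in values:
        inbox.put(v)
    inbox.put(None)
    t.join()
    return outbox.get()


if __name__ == "__main__":
    print(run_demo([1, 2, 3]))  # prints 6
```

In a real system the two queues would be replaced by network channels, which is where the delays and failures discussed later come from.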
Hardware Concepts
a. Tightly coupled systems (multiprocessors) – have shared memory; intermachine
delay is short and the data rate is high
b. Loosely coupled systems (multicomputers) – have private memory; intermachine
delay is long and the data rate is low
Figure: (a)–(b) a distributed system; (c) a parallel system.
Software Concepts
• For users, the software matters more than the hardware, because it largely
determines what the system looks like to them
• There are three types:
1. Network Operating Systems
2. Distributed Systems
3. Multiprocessor Time Sharing
Network Operating Systems
• loosely-coupled software on loosely-coupled hardware
• a network of workstations connected by a LAN
• each machine has a high degree of autonomy
• a few system-wide requirements: format and meaning of all the messages
exchanged
NFS
NFS Architecture – servers export directories and clients mount the exported
directories
NFS Protocols – one for handling mounting and one for file access (read/write,
open/close jobs).
Transparency
• How to achieve the single-system image, i.e., how to make a collection of
computers appear as a single computer.
• Hiding all the distribution from the users as well as the application programs can
be achieved at two levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to programs.
Types of transparency
– Location Transparency: users cannot tell where hardware and software resources
such as CPUs, printers, files, and databases are located.
– Migration Transparency: resources must be free to move from one location to
another without their names changing.
– Replication Transparency: the OS can make additional copies of files and
resources without users noticing.
– Concurrency Transparency: users are not aware of the existence of other users.
Multiple users must be able to access the same resource concurrently; locking and
unlocking the resource provides mutual exclusion.
– Parallelism Transparency: parallelism is used automatically, without the
programmer having to program it explicitly.
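The lock/unlock idea behind concurrency transparency can be sketched as below. This is a minimal single-machine illustration using threads as stand-ins for users; `SharedCounter` and `run_users` are hypothetical names, and a distributed lock would need a protocol that this sketch does not attempt.

```python
import threading


class SharedCounter:
    """A shared resource whose updates are protected by a lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # 'with' acquires the lock on entry and releases it on exit,
        # so only one user at a time performs the read-modify-write.
        with self._lock:
            self.value += 1


def run_users(num_users: int, increments_per_user: int) -> int:
    """Let several 'users' update the same counter concurrently."""
    counter = SharedCounter()

    def user() -> None:
        for _ in range(increments_per_user):
            counter.increment()

    threads = [threading.Thread(target=user) for _ in range(num_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_users(4, 1000))  # prints 4000
```

Each user calls `increment` as if it were alone; the locking is hidden inside the resource, which is the point of this kind of transparency.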
Flexibility
• The system should be easy to change
• Monolithic kernel: system calls are trapped and executed by the kernel. All
system calls are served by the kernel, e.g., UNIX.
• Microkernel: provides only minimal services such as interprocess
communication (IPC), memory management, low-level process management and
scheduling, and low-level I/O. Other services run outside the kernel, which makes
it easier to support, e.g., multiple file systems or multiple system interfaces.
Reliability
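• A distributed system should be more reliable than a single system. Example: 3
machines, each with a 95% probability of being up. If failures are independent and
any one machine can provide the service, the probability that the service is up
is 1 - 0.05^3 = 0.999875.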
– Availability: fraction of time the system is usable. Redundancy improves
it.
– Need to maintain consistency
– Need to be secure
– Fault tolerance: need to mask failures, recover from errors.
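The arithmetic in the 3-machine example can be checked with a short sketch. It assumes that machine failures are independent, which the notes do not state; the function names are hypothetical. The second function shows the opposite case, where the service needs every machine, to make clear that redundancy only helps when any one replica is enough.

```python
def availability_any_up(p_up: float, n: int) -> float:
    """Probability that at least one of n independent machines is up."""
    return 1.0 - (1.0 - p_up) ** n


def availability_all_up(p_up: float, n: int) -> float:
    """Probability that all n independent machines are up at once."""
    return p_up ** n


if __name__ == "__main__":
    # Roughly 0.999875 when any single machine can serve requests.
    print(availability_any_up(0.95, 3))
    # Roughly 0.857 when the service needs all three machines.
    print(availability_all_up(0.95, 3))
```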
Performance
• Performance loss due to communication delays:
– fine-grain parallelism: high degree of interaction, so communication delays
dominate
– coarse-grain parallelism: low degree of interaction, a better fit for
distributed systems
• Performance loss due to making the system fault tolerant.
Scalability
• Systems grow with time or become obsolete. Techniques whose resource
requirements grow linearly with the size of the system are not scalable, e.g., a
broadcast-based query won't work in a large distributed system.
• Examples of bottlenecks
o Centralized components: a single mail server
o Centralized tables: a single URL address book
o Centralized algorithms: routing based on complete information
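To see why a broadcast-based query does not scale, the message count can be sketched as below. The cost model is an assumption made for illustration: one point-to-point message per recipient, replies not counted. The function names are hypothetical.

```python
def broadcast_query_messages(n_nodes: int) -> int:
    """Messages needed for ONE node to send a query to every other node."""
    return max(n_nodes - 1, 0)


def total_messages(n_nodes: int, queries_per_node: int = 1) -> int:
    """Messages in the whole system when every node issues queries."""
    return n_nodes * queries_per_node * broadcast_query_messages(n_nodes)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, broadcast_query_messages(n), total_messages(n))
    # prints:
    # 10 9 90
    # 100 99 9900
    # 1000 999 999000
```

The per-query cost grows linearly with the number of nodes, and the system-wide load grows quadratically once every node issues queries.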
Architectures
• Client–server: smart client code contacts the server for data, then formats and
displays it to the user. Input at the client is committed back to the server when it
represents a permanent change.
• 3-tier architecture: three-tier systems move the client intelligence to a middle
tier so that stateless clients can be used. This simplifies application deployment.
Most web applications are 3-tier.
• n-tier architecture: n-tier typically refers to web applications that forward
their requests to other enterprise services. This type of application is the
one most responsible for the success of application servers.
• Tightly coupled (clustered): typically refers to a cluster of machines that work
closely together, running a shared process in parallel. The task is subdivided into
parts, each machine processes its part individually, and the partial results are
then combined into the final result.
• Peer-to-peer: an architecture where there is no special machine or machines that
provide a service or manage the network resources. Instead all responsibilities are
uniformly divided among all machines, known as peers. Peers can serve both as
clients and servers.
• Space based: refers to an infrastructure that creates the illusion (virtualization) of
one single address-space. Data are transparently replicated according to
application needs. Decoupling in time, space and reference is achieved.
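The subdivide, process, and combine pattern described for tightly coupled (clustered) systems can be sketched as below. Threads in a pool stand in for cluster machines, which is an assumption made to keep the example self-contained; `split` and `cluster_sum` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Subdivide the task: cut data into at most `parts` contiguous chunks."""
    size = max(1, (len(data) + parts - 1) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def cluster_sum(data, workers=4):
    """Each worker sums its own chunk; the partial sums are then combined."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))  # one partial result per chunk
    return sum(partials)  # put the parts back together


if __name__ == "__main__":
    print(cluster_sum(list(range(101))))  # prints 5050
```

On a real cluster the chunks and partial results would travel over the network, so the chunk size has to be large enough to outweigh the communication cost.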
Models
• Interaction model - Reflects the assumptions about the processes and the
communication channels in the distributed system
• Failure model - Distinguishes between the types of failures of the processes and
the communication channels
• Security model - States the assumptions about the principals and the adversary
Interaction Models
Synchronous Distributed Systems: a system in which the following bounds are defined
• The time to execute each step of a process has known lower and upper bounds
• Each message transmitted over a channel is received within a known bounded
delay
• Each process has a local clock whose drift rate from real time has a known bound
Asynchronous Distributed Systems: a system with no bounds on process speed,
message delay, or clock drift
▪ Each step of a process can take an arbitrary time
▪ Message delivery time is arbitrary
▪ Clock drift rates are arbitrary
Implications
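▪ In a synchronous system, timeouts can be used to detect failures: a reply that
does not arrive within the known bound means the process or channel has failed
▪ In an asynchronous system, a crashed process cannot be reliably distinguished
from a slow one, so failures cannot be detected with certainty and “reaching
agreement” cannot be guaranteed if even one process may crash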
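A timeout-based failure detector of the kind these implications refer to can be sketched as below. The class name, method names, and the heartbeat scheme are assumptions made for illustration. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class TimeoutFailureDetector:
    """Suspects any process whose last heartbeat is older than `timeout`."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen = {}  # process id -> time of the last heartbeat

    def heartbeat(self, process_id: str, now: float) -> None:
        """Record that a heartbeat message from process_id arrived at `now`."""
        self.last_seen[process_id] = now

    def suspected(self, now: float):
        """Return the sorted ids of processes silent for longer than the timeout."""
        return sorted(
            pid for pid, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


if __name__ == "__main__":
    detector = TimeoutFailureDetector(timeout=2.0)
    detector.heartbeat("a", now=0.0)
    detector.heartbeat("b", now=0.0)
    detector.heartbeat("a", now=3.0)    # "a" keeps reporting, "b" goes silent
    print(detector.suspected(now=4.0))  # prints ['b']
```

In a synchronous system, choosing `timeout` above the known bounds on message delay and processing time makes a suspicion a correct detection. In an asynchronous system no such value exists, so a suspected process may simply be slow.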