Lecture 1 Introduction To Distributed Systems - 034922

This document provides an introduction to distributed systems. It defines a distributed system as a collection of independent computers that appear as a single system to users. Examples include ATMs, the internet, and manufacturing systems. Distributed systems offer advantages like economics, speed, reliability, and incremental growth over centralized systems. The document discusses properties, classifications, components, advantages/disadvantages, concepts, design issues, and models of distributed systems.

Uploaded by

leonardkigen100

Introduction to Distributed Systems – Lecture 1

What is a distributed system?


A distributed system is a collection of independent computers interconnected via
a network that appear to the users of the system as a single system and are capable of
collaborating on a task.
A distributed system runs on a collection of computers that do not have shared
memory, yet looks like a single computer to its users.

Distributed computing is computing performed in a distributed system. A computer
program that runs in a distributed system is called a distributed program, and
distributed programming is the process of writing such programs.

Examples:
• ATM
• Internet
• Distributed manufacturing system (e.g., automated assembly line)
• Network of branch office computers

Advantages of Distributed Systems over Centralized Systems


• Economics/costs: a collection of microprocessors offers better price/performance
than a mainframe, making it a cost-effective way to increase computing power.
• Speed: a distributed system may have more total computing power than a mainframe,
and it can improve performance by distributing the load.
• Inherent distribution: some applications are inherently distributed, e.g., a
supermarket chain.
• Reliability: if one machine crashes, the system as a whole can still survive, giving
higher availability and improved reliability.
• Incremental growth: computing power can be added in small increments (modular
expandability).
• Collaboration: a large number of personal computers already exist, and people need
to collaborate and share information.

Properties of distributed systems include:
• Each computer may have its own user with individual needs.
• The system has to tolerate failures in individual computers.
• The structure of the system (network topology, network latency, number of
computers) is not known in advance. The system may consist of different kinds of
computers and network links, and it may change during the execution of a
distributed program.
• Each computer has only a limited, incomplete view of the system. Each computer
may know only one part of the input.

It is possible to roughly classify concurrent systems as "parallel" or "distributed" using
the following criteria:

• In parallel computing, all processors have access to a shared memory. Shared
memory can be used to exchange information between processors.

• In distributed computing, each processor has its own private memory (distributed
memory). Information is exchanged by passing messages between the processors.
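
The message-passing style can be illustrated with a minimal Python sketch. It is not
a real distributed program: threads on one machine stand in for separate processors,
and a queue.Queue stands in for the network. The names worker, channel and
message_passing_sum are hypothetical and chosen only for this illustration.

```python
import queue
import threading


def worker(chunk, channel):
    """Compute on private data, then send the result as a message."""
    partial = sum(chunk)   # local variable: private to this worker
    channel.put(partial)   # the only way to share it is to send it


def message_passing_sum(values, workers=2):
    """Split the input among workers and combine the messages they send back."""
    channel = queue.Queue()  # assumption: stands in for the network
    chunks = [values[i::workers] for i in range(workers)]
    threads = [
        threading.Thread(target=worker, args=(chunk, channel))
        for chunk in chunks
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # The coordinator never reads a worker's memory; it only receives messages.
    return sum(channel.get() for _ in range(workers))
```

Each worker knows only its own part of the input, which also matches the "limited,
incomplete view" property listed earlier.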

Components of Distributed Software Systems

• Distributed systems
• Middleware
• Distributed applications

Advantages of Distributed Systems over Independent PCs


• Data sharing: allows many users to access a common database
• Resource sharing: expensive peripherals such as color printers can be shared
• Communication: enhances human-to-human communication, e.g., email, chat
• Flexibility: spreads the workload over the available machines

Disadvantages of Distributed Systems
• Software: it is difficult to develop software for distributed systems
• Network: saturation, lossy transmissions
• Security: easy access also applies to secret data

Hardware Concepts
a. Tightly coupled systems (multiprocessors): have shared memory; the intermachine
delay is short and the data rate is high
b. Loosely coupled systems (multicomputers): have private memory; the intermachine
delay is long and the data rate is low

[Figure: (a)–(b) a distributed system; (c) a parallel system.]

Software Concepts
• Software is more important for users
• There are three types:
1. Network Operating Systems
2. Distributed Systems
3. Multiprocessor Time Sharing
Network Operating Systems
• loosely-coupled software on loosely-coupled hardware
• A network of workstations connected by LAN
• each machine has a high degree of autonomy
• a few system-wide requirements: format and meaning of all the messages
exchanged
Example: NFS
• NFS architecture: servers export directories and clients mount the exported
directories
• NFS protocols: one for handling mounting, and one for file access (read/write,
open/close jobs)

Multiprocessor Operating Systems


• Tightly-coupled software on tightly-coupled hardware
• Examples: high-performance servers with shared memory and a single run queue

Design Issues of Distributed Systems


• Transparency
• Flexibility
• Reliability
• Performance
• Scalability

Transparency
• How to achieve the single-system image, i.e., how to make a collection of
computers appear as a single computer.
• Hiding all the distribution from the users as well as the application programs can
be achieved at two levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to programs.

Types of transparency
– Location Transparency: users cannot tell where hardware and software resources
such as CPUs, printers, files and databases are located.
– Migration Transparency: resources must be free to move from one location to
another without their names changing.
– Replication Transparency: the OS can make additional copies of files and resources
without users noticing.
– Concurrency Transparency: users are not aware of the existence of other
users. The system must allow multiple users to access the same resource
concurrently, e.g., by locking and unlocking it for mutual exclusion.
– Parallelism Transparency: parallelism is used automatically, without the
programmer having to code it explicitly.
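
The lock/unlock idea behind concurrency transparency can be sketched as follows. This
is a single-machine illustration under stated assumptions: threads stand in for users,
and SharedCounter and run_users are hypothetical names, not part of any system
described in the lecture.

```python
import threading


class SharedCounter:
    """A resource that several users update concurrently."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Lock before touching the shared state; the lock is released on exit.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_users(counter, users=4, updates_per_user=1000):
    """Start one thread per user and wait for all of them to finish."""
    def user():
        for _ in range(updates_per_user):
            counter.increment()

    threads = [threading.Thread(target=user) for _ in range(users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Callers of increment never see the lock, so each user can behave as if it were the
only one using the resource.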

Flexibility
• Make the system easier to change
• Monolithic kernel: system calls are trapped and executed by the kernel. All
system calls are served by the kernel, e.g., UNIX.
• Microkernel: provides minimal services such as interprocess
communication (IPC), memory management, low-level process management and
scheduling, and low-level I/O. Other services run outside the kernel, which makes it
possible to support, e.g., multiple file systems and multiple system interfaces.
Reliability
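• A distributed system should be more reliable than a single system. Example: 3
machines, each with a 95% probability of being up. If failures are independent and
one working machine is enough, the probability that the system is up is
1 - (0.05)^3 = 0.999875.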
– Availability: fraction of time the system is usable. Redundancy improves
it.
– Need to maintain consistency
– Need to be secure
– Fault tolerance: need to mask failures, recover from errors.
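
The arithmetic behind the three-machine example can be checked with a short sketch.
It assumes that machines fail independently, which the lecture does not state
explicitly, and the function names are hypothetical. The second function shows the
opposite case, where every machine is needed, to make clear that redundancy rather
than distribution itself is what improves availability.

```python
def availability_any(machine_up_probability, machines):
    """Probability that at least one of `machines` independent machines is up."""
    all_down = (1.0 - machine_up_probability) ** machines
    return 1.0 - all_down


def availability_all(machine_up_probability, machines):
    """Probability that every one of `machines` independent machines is up."""
    return machine_up_probability ** machines
```

With a per-machine probability of 0.95 and 3 machines, availability_any gives about
0.999875, while availability_all gives about 0.857375.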
Performance
• Performance loss due to communication delays:
– fine-grain parallelism: high degree of interaction
– coarse-grain parallelism
• Performance loss due to making the system fault tolerant.
Scalability
• Systems grow with time or become obsolete. Techniques that require resources
linearly in terms of the size of the system are not scalable. e.g., broadcast based
query won't work for large distributed systems.
• Examples of bottlenecks
o Centralized components: a single mail server

o Centralized tables: a single URL address book
o Centralized algorithms: routing based on complete information
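
A rough count shows why a broadcast-based query does not scale. The sketch below
assumes the simplest possible cost model, one message per recipient and no replies,
and its function names are hypothetical.

```python
def broadcast_messages_per_query(nodes):
    """A broadcast query is sent to every node except the sender."""
    return max(nodes - 1, 0)


def total_broadcast_messages(nodes, queries_per_node):
    """Total traffic when every node issues the same number of queries."""
    return nodes * queries_per_node * broadcast_messages_per_query(nodes)
```

Under this model the cost of a single query grows linearly with the number of nodes,
and the total traffic grows quadratically when every node issues queries.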

Packet Switching versus Circuit Switching


i) Circuit switching: a dedicated path is set up between a source and a destination,
e.g., a telephone connection. It wastes bandwidth while the path is idle (bandwidth =
amount of data transmitted in a given time period).
ii) Packet switching: a message is broken into packets and the packets are routed
independently. This gives better network utilization, at the cost of disassembly and
reassembly overheads.
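
The disassembly and reassembly steps of packet switching can be sketched as follows.
This is a simulation under stated assumptions: a seeded shuffle stands in for
independent routing, no packet is lost, and all function names are hypothetical.

```python
import random


def packetize(message, payload_size):
    """Split a message into (sequence_number, payload) packets."""
    offsets = range(0, len(message), payload_size)
    return [
        (seq, message[start:start + payload_size])
        for seq, start in enumerate(offsets)
    ]


def route_independently(packets, seed=0):
    """Simulate independent routing: packets may arrive in any order."""
    arrived = list(packets)
    random.Random(seed).shuffle(arrived)
    return arrived


def reassemble(packets):
    """Restore the original order using the sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)
```

The sequence number carried by each packet is the overhead that makes reassembly
possible; a circuit-switched connection would not need it because data arrives in
order on the dedicated path.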

Distributed programming typically falls into one of several basic architectures or
categories: client–server, 3-tier architecture, n-tier architecture, distributed objects, loose
coupling, or tight coupling.

• Client–server: Smart client code contacts the server for data then formats and
displays it to the user. Input at the client is committed back to the server when it
represents a permanent change.
• 3-tier architecture: Three tier systems move the client intelligence to a middle tier
so that stateless clients can be used. This simplifies application deployment. Most
web applications are 3-Tier.
• n-tier architecture: n-tier refers typically to web applications which further
forward their requests to other enterprise services. This type of application is the
one most responsible for the success of application servers.
• Tightly coupled (clustered): refers typically to a cluster of machines that work
closely together, running a shared process in parallel. The task is subdivided into
parts that each machine processes individually and that are then put back together
to produce the final result.
• Peer-to-peer: an architecture where there is no special machine or machines that
provide a service or manage the network resources. Instead all responsibilities are
uniformly divided among all machines, known as peers. Peers can serve both as
clients and servers.
• Space based: refers to an infrastructure that creates the illusion (virtualization) of
one single address-space. Data are transparently replicated according to
application needs. Decoupling in time, space and reference is achieved.

Models

• A model captures the essential ingredients that we need to consider to understand
and reason about a system's behavior.
• It addresses the following questions: What are the main entities in the system?
How do they interact? What are the characteristics that affect their collective and
individual behavior?

Three models exist

• Interaction model - Reflects the assumptions about the processes and the
communication channels in the distributed system

• Failure model - Distinguishes between the types of failures of the processes and the
communication channels
• Security model - States the assumptions about the principals and the adversary

Interaction Models

Synchronous distributed system: a system in which the following bounds are defined
• The time to execute each step of a process has a known lower and upper bound
• Each message transmitted over a channel is received within a known bounded
delay
• Each process has a local clock whose drift rate from real time has a known bound
Asynchronous distributed system: a system in which no such bounds exist
• Each step of a process can take an arbitrary time
• Message delivery time is arbitrary
• Clock drift rates are arbitrary
Implications
• In a synchronous system, timeouts can be used to detect failures
• In an asynchronous system, a timeout cannot reliably distinguish a crashed process
from a slow one, so it is impossible to detect failures with certainty or to guarantee
that processes “reach agreement”
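
Timeout-based failure detection can be sketched as a heartbeat monitor. This is a
minimal illustration, not a production design: the class and method names are
hypothetical, and timestamps are passed in explicitly so that the behaviour is
deterministic.

```python
class TimeoutFailureDetector:
    """Suspects a process when no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heartbeat = {}  # process id -> time of the latest heartbeat

    def heartbeat(self, process_id, now):
        """Record that a heartbeat from `process_id` was received at time `now`."""
        self.last_heartbeat[process_id] = now

    def suspected(self, now):
        """Return the processes whose latest heartbeat is older than the timeout."""
        return sorted(
            process_id
            for process_id, seen in self.last_heartbeat.items()
            if now - seen > self.timeout
        )
```

In a synchronous system the timeout can be derived from the known bounds on message
delay and process step time, so a suspected process has really failed. In an
asynchronous system the same code yields only a suspicion, because a late heartbeat
may simply be delayed.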
