
ln13 Ds

This document provides an overview of key concepts in distributed systems including: - Reasons for developing distributed systems including availability of cheap processors and need for collaboration. - A distributed system appears as a single system to users but is actually a collection of independent computers. - Advantages of distributed systems over centralized and independent systems including economics, performance, reliability, and flexibility. - Challenges of distributed systems including complex software, network issues, and security.

Uploaded by

Munish Kharb
Copyright
© Attribution Non-Commercial (BY-NC)

University of Pennsylvania

Distributed Systems
CSE 380 Lecture Note 13 Insup Lee

11/14/00

CSE 380

Introduction to Distributed Systems


Why do we develop distributed systems?
o availability of powerful yet cheap microprocessors (PCs, workstations)
o continuing advances in communication technology


What is a distributed system?


A distributed system is a collection of independent computers that appear to the users of the system as a single system.

Examples:
o Network of workstations
o Distributed manufacturing system (e.g., automated assembly line)
o Network of branch office computers


Advantages of Distributed Systems over Centralized Systems


Economics: a collection of microprocessors offers better price/performance than a mainframe.
o Low price/performance ratio: a cost-effective way to increase computing power.
Speed: a distributed system may have more total computing power than a mainframe.
o Ex. 10,000 CPU chips, each running at 50 MIPS, give 500,000 MIPS in total. It is not possible to build a 500,000 MIPS single processor, since it would require a 0.002 nsec instruction cycle.
o Enhanced performance through load distribution.
Inherent distribution: some applications are inherently distributed. Ex. a supermarket chain.
Reliability: if one machine crashes, the system as a whole can still survive.
o Higher availability and improved reliability.
Incremental growth: computing power can be added in small increments.
o Modular expandability.
Another driving force: the existence of a large number of personal computers and the need for people to collaborate and share information.
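The speed example above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming one instruction completes per cycle; the function names are illustrative and not from the lecture.

```python
# Figures taken from the slide's example.
NUM_CPUS = 10_000
MIPS_PER_CPU = 50


def aggregate_mips(num_cpus, mips_per_cpu):
    """Total millions of instructions per second across all CPUs."""
    return num_cpus * mips_per_cpu


def cycle_time_ns(mips):
    """Nanoseconds per instruction, assuming one instruction per cycle.

    1 MIPS is one instruction per microsecond, i.e. 1000 ns per instruction.
    """
    return 1000.0 / mips


total = aggregate_mips(NUM_CPUS, MIPS_PER_CPU)
print(total)                 # 500000
print(cycle_time_ns(total))  # 0.002
```

A single processor matching the aggregate rate would need the 0.002 nsec cycle quoted on the slide, which is why the slide treats the collection of slower CPUs as the practical route.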


Advantages of Distributed Systems over Independent PCs


Data sharing: allow many users to access a common database
Resource sharing: share expensive peripherals such as color printers
Communication: enhance human-to-human communication, e.g., email, chat
Flexibility: spread the workload over the available machines


Disadvantages of Distributed Systems


Software: it is difficult to develop software for distributed systems
Network: saturation, lossy transmissions
Security: easy access also applies to secret data


Hardware Concepts
Taxonomy (Fig. 9-4)
MIMD (Multiple-Instruction Multiple-Data)
Tightly Coupled versus Loosely Coupled
Tightly coupled systems (multiprocessors)
o shared memory
o intermachine delay short, data rate high
Loosely coupled systems (multicomputers)
o private memory
o intermachine delay long, data rate low


Bus versus Switched MIMD


Bus: a single network, backplane, bus, cable or other medium that connects all machines. E.g., cable TV

Switched: individual wires from machine to machine, with many different wiring patterns in use.
Multiprocessors (shared memory)
o Bus
o Switched
Multicomputers (private memory)
o Bus
o Switched


Switched Multiprocessors


Switched Multiprocessors (Fig. 9-6)
o for connecting a large number (say over 64) of processors
o crossbar switch: n**2 switch points
o omega network: built from 2x2 switches; for n CPUs and n memories it has log2 n switching stages, each with n/2 switches, for a total of (n log2 n)/2 switches
o delay problem: e.g., with n=1024 there are 10 switching stages from CPU to memory, so a request and its reply cross a total of 20 switching stages. At 100 MIPS (10 nsec instruction execution time), this needs a 0.5 nsec switching time.
o NUMA (Non-Uniform Memory Access): placement of program and data matters
o building a large, tightly-coupled, shared memory multiprocessor is possible, but is difficult and expensive
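The switch counts and the delay figure above follow from a short calculation. The sketch below assumes n is a power of two, that a memory reference crosses the network once in each direction, and that the round trip must fit within one instruction time; the function names are illustrative.

```python
def crossbar_switch_points(n):
    """A crossbar needs one switch point per CPU/memory pair."""
    return n * n


def omega_stages(n):
    """Number of switching stages, log2(n), for n a power of two."""
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two and at least 2")
    return n.bit_length() - 1


def omega_total_switches(n):
    """Each stage holds n/2 two-by-two switches."""
    return omega_stages(n) * (n // 2)


def required_switch_time_ns(n, mips):
    """Per-switch delay budget if a round trip must fit in one instruction."""
    instruction_time_ns = 1000.0 / mips   # 100 MIPS -> 10 ns
    round_trip_stages = 2 * omega_stages(n)
    return instruction_time_ns / round_trip_stages


print(crossbar_switch_points(1024))        # 1048576
print(omega_stages(1024))                  # 10
print(omega_total_switches(1024))          # 5120
print(required_switch_time_ns(1024, 100))  # 0.5
```

For n=1024 the omega network uses 5,120 switches where a crossbar would need over a million switch points, but each switch then has only 0.5 nsec to do its work.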


Multicomputers


Bus-Based Multicomputers (Fig. 9-7)
o easy to build
o communication volume much smaller
o relatively slow speed LAN (10-100 Mbps, compared to 300 Mbps and up for a backplane bus)

Switched Multicomputers (Fig. 9-8)
o interconnection networks: e.g., grid, hypercube
o hypercube: n-dimensional cube
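One common way to reason about a hypercube is to label its 2**n nodes with n-bit numbers, so that two nodes are wired together exactly when their labels differ in one bit. The slide does not spell this labeling out; the sketch below assumes it, and the function names are illustrative.

```python
def hypercube_neighbors(node, dimensions):
    """Labels of the nodes directly wired to `node`: flip one bit at a time."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hop_distance(a, b):
    """Minimum number of links between two nodes: count of differing bits."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b000, 3))  # [1, 2, 4]
print(hypercube_neighbors(0b101, 3))  # [4, 7, 1]
print(hop_distance(0b000, 0b111))     # 3
```

Under this labeling each node has n links, and no two nodes are more than n hops apart.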


Software Concepts
Software more important for users


Three types:
1. Network Operating Systems
2. (True) Distributed Systems
3. Multiprocessor Time Sharing


Network Operating Systems


loosely-coupled software on loosely-coupled hardware
A network of workstations connected by a LAN
Each machine has a high degree of autonomy
o rlogin machine
o rcp machine1:file1 machine2:file2
File servers: client and server model
Clients mount directories on file servers
Best known network OS:
o Sun's NFS (Network File System) for shared file systems (Fig. 9-11)
A few system-wide requirements: format and meaning of all the messages exchanged


NFS
NFS Architecture
o Server exports directories
o Clients mount exported directories
NFS Protocols
o For handling mounting
o For read/write: no open/close, stateless
NFS Implementation
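To illustrate what "no open/close, stateless" means for read/write, here is a toy in-memory sketch. It is not the real NFS protocol or wire format: the class, the string file handles, and the method signatures are all hypothetical. The point is only that every request names the file, the offset, and the length, so the server keeps no per-client open-file table or file position.

```python
class StatelessFileServer:
    """Toy server: each request is self-describing, nothing is remembered."""

    def __init__(self, files):
        # Maps a (hypothetical) file handle to the file's bytes.
        self._files = dict(files)

    def read(self, handle, offset, count):
        """Return up to `count` bytes starting at `offset`."""
        data = self._files[handle]
        return data[offset:offset + count]

    def write(self, handle, offset, payload):
        """Overwrite bytes at `offset`, zero-padding if writing past the end."""
        data = self._files[handle].ljust(offset, b"\0")
        end = offset + len(payload)
        self._files[handle] = data[:offset] + payload + data[end:]
        return len(payload)


server = StatelessFileServer({"fh1": b"hello world"})
print(server.read("fh1", 6, 5))   # b'world'
print(server.read("fh1", 6, 5))   # b'world' again: a retry is harmless
server.write("fh1", 0, b"J")
print(server.read("fh1", 0, 11))  # b'Jello world'
```

Because a repeated request gives the same result, a client can simply retry after a timeout, and a server that restarts has no open-file state to rebuild.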


(True) Distributed Systems


tightly-coupled software on loosely-coupled hardware

provide a single-system image or a virtual uniprocessor


a single, global interprocess communication mechanism, process management, file system; the same system call interface everywhere

Ideal definition:
A distributed system runs on a collection of computers that do not have shared memory, yet looks like a single computer to its users.


Multiprocessor Operating Systems


(Fig. 9-12)


Tightly-coupled software on tightly-coupled hardware
Examples: high-performance servers
o shared memory
o single run queue
o traditional file system as on a single-processor system: central block cache
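The "single run queue" idea can be sketched with threads standing in for CPUs. This is only an analogy under stated assumptions: the threads play the role of processors, the shared `queue.Queue` plays the role of the run queue kept in shared memory, and `run_on_cpus` is a hypothetical name.

```python
import queue
import threading


def run_on_cpus(task_names, num_cpus):
    """Let `num_cpus` worker threads pull tasks from one shared run queue."""
    run_queue = queue.Queue()
    for name in task_names:
        run_queue.put(name)

    completed = []                      # (cpu_id, task_name) pairs
    completed_lock = threading.Lock()   # protects the shared result list

    def cpu(cpu_id):
        while True:
            try:
                name = run_queue.get_nowait()   # whichever CPU is idle takes it
            except queue.Empty:
                return
            with completed_lock:
                completed.append((cpu_id, name))

    workers = [threading.Thread(target=cpu, args=(i,)) for i in range(num_cpus)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return completed


result = run_on_cpus(["a", "b", "c", "d"], num_cpus=2)
print(sorted(name for _, name in result))  # ['a', 'b', 'c', 'd']
```

Which CPU runs which task varies from run to run; the only guarantee is that each task is taken exactly once.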


Fig. 9-13 for comparisons


Design Issues of Distributed Systems


1. Transparency
2. Flexibility
3. Reliability
4. Performance
5. Scalability


1. Transparency


How to achieve the single-system image, i.e., how to make a collection of computers appear as a single computer.
Hiding all the distribution from the users as well as the application programs can be achieved at two levels:
1) hide the distribution from users
2) at a lower level, make the system look transparent to programs.
Both 1) and 2) require uniform interfaces, such as for access to files and communication.


Types of transparency


Location Transparency: users cannot tell where hardware and software resources such as CPUs, printers, files, and databases are located.
Migration Transparency: resources must be free to move from one location to another without their names changing. E.g., /usr/lee, /central/usr/lee
Replication Transparency: the OS can make additional copies of files and resources without users noticing.
Concurrency Transparency: users are not aware of the existence of other users. Need to allow multiple users to concurrently access the same resource. Lock and unlock for mutual exclusion.
Parallelism Transparency: automatic use of parallelism without having to program explicitly. The holy grail for distributed and parallel system designers.
Users do not always want complete transparency: e.g., a fancy printer 1000 miles away
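The "lock and unlock for mutual exclusion" remark under concurrency transparency can be illustrated on a single machine. This is a minimal sketch using threads as stand-ins for concurrent users; a real distributed system would need a distributed locking scheme, which is not shown here. `SharedCounter` is a hypothetical name.

```python
import threading


class SharedCounter:
    """A resource that several users update without seeing each other."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Lock before touching the shared value, unlock on leaving the block.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def simulate_users(num_users, updates_per_user):
    """Run `num_users` threads that each update the counter repeatedly."""
    counter = SharedCounter()

    def user():
        for _ in range(updates_per_user):
            counter.increment()

    threads = [threading.Thread(target=user) for _ in range(num_users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(simulate_users(4, 1000))  # 4000
```

Each user's code calls `increment()` as if it were alone; the lock inside the resource is what keeps concurrent updates from being lost.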

2. Flexibility
Make it easier to change


Monolithic Kernel: system calls are trapped and executed by the kernel. All system calls are served by the kernel, e.g., UNIX.
Microkernel: provides minimal services. (Fig 9-15)
1) IPC
2) some memory management
3) some low-level process management and scheduling
4) low-level I/O
E.g., Mach can support multiple file systems and multiple system interfaces.


3. Reliability


A distributed system should be more reliable than a single system.
Example: 3 machines, each with .95 probability of being up. Assuming independent failures, the probability that at least one is up is 1-.05**3 = .999875.
Availability: fraction of time the system is usable. Redundancy improves it.
Need to maintain consistency
Need to be secure
Fault tolerance: need to mask failures, recover from errors.
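The three-machine example can be generalized to any number of replicas. The sketch below assumes that machines fail independently and that the system counts as up when at least one machine is up; both are simplifying assumptions, and the function name is illustrative.

```python
def availability_with_replicas(p_up, replicas):
    """Probability that at least one of `replicas` machines is up.

    The system is down only if every machine is down at once,
    which under independence has probability (1 - p_up) ** replicas.
    """
    return 1.0 - (1.0 - p_up) ** replicas


for n in (1, 2, 3):
    print(n, f"{availability_with_replicas(0.95, n):.6f}")
# 1 0.950000
# 2 0.997500
# 3 0.999875
```

Correlated failures, such as a shared power supply or network, would make the real figure lower than this estimate.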


4. Performance


Without a gain here, why bother with distributed systems?
Performance loss due to communication delays:
o fine-grain parallelism: high degree of interaction
o coarse-grain parallelism
Performance loss due to making the system fault tolerant.


5. Scalability


Systems grow with time or become obsolete. Techniques that require resources linearly in terms of the size of the system are not scalable, e.g., a broadcast-based query won't work for large distributed systems.
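A rough message count shows why broadcast-based queries scale poorly. This sketch assumes a simple model that is not from the lecture: each query is sent once to every other node, and every node issues the same number of queries. The function names are illustrative.

```python
def messages_per_broadcast_query(num_nodes):
    """One query is delivered to every node except the sender."""
    return num_nodes - 1


def total_broadcast_messages(num_nodes, queries_per_node=1):
    """Messages in the whole system when every node issues queries."""
    return num_nodes * queries_per_node * messages_per_broadcast_query(num_nodes)


for n in (10, 100, 1000):
    print(n, messages_per_broadcast_query(n), total_broadcast_messages(n))
# 10 9 90
# 100 99 9900
# 1000 999 999000
```

The cost of one query grows linearly with the number of nodes, and under this model the total traffic grows roughly with its square.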

Examples of bottlenecks
o Centralized components: a single mail server
o Centralized tables: a single URL address book
o Centralized algorithms: routing based on complete information
