Lecture 19

This document provides an overview of multiple processor systems including multiprocessors and multicomputers. It discusses uniform memory access and non-uniform memory access multiprocessor hardware configurations. It also covers multiprocessor operating system paradigms, synchronization techniques, and scheduling approaches. For multicomputers, it describes interconnection topologies, network interfaces, and techniques for user-level communication and distributed shared memory.

Uploaded by

api-3801184
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
144 views20 pages

Lecture 19

This document provides an overview of multiple processor systems including multiprocessors and multicomputers. It discusses uniform memory access and non-uniform memory access multiprocessor hardware configurations. It also covers multiprocessor operating system paradigms, synchronization techniques, and scheduling approaches. For multicomputers, it describes interconnection topologies, network interfaces, and techniques for user-level communication and distributed shared memory.

Uploaded by

api-3801184
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

Lecture Overview

• Multiple processors
– Multiprocessors
• UMA versus NUMA
• Hardware configurations
• OS configurations
• Process scheduling
– Multicomputers
• Interconnection configurations
• Network interface
• User-level communication
• Distributed shared memory
• Load balancing
– Distributed Systems
Operating Systems - July 5, 2001

Multiple Processors

Continuous need for faster computers


a) Shared memory multiprocessor
b) Message passing multicomputer
c) Wide-area distributed system

Multiprocessor System

Definition
A computer system in which two or more
CPUs share full access to a common
RAM

Multiprocessor System

Two types of multiprocessor systems


• Uniform Memory Access (UMA)
– All memory addresses are reachable as fast as
any other address
• Non-uniform Memory Access (NUMA)
– Some memory addresses are slower than others

UMA Multiprocessor Hardware

UMA bus-based multiprocessors


a) CPUs communicate via bus to RAM
b) CPUs have a local cache to reduce bus access
c) CPUs have private memory, shared memory
access via bus

UMA Multiprocessor Hardware

UMA multiprocessor using a crossbar switch


– Alleviates bus contention, but is expensive (crosspoint count grows as n²)

UMA Multiprocessor Hardware

• UMA multiprocessors using multistage switching
networks can be built from 2 × 2 switches
– Input to switches is in the form of a message

a) 2 × 2 switch
b) Message format

UMA Multiprocessor Hardware

UMA omega switching network


– Less costly than crossbar switch
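
Routing through an omega network needs no routing tables: each 2 × 2 switch inspects one bit of the destination address, one bit per stage. A minimal C sketch of this destination-tag routing (the function and its output are illustrative, not from the slides):

    #include <stdio.h>

    /* Destination-tag routing through a log2(n)-stage omega network:
     * at each stage, the next bit of the destination address (MSB first)
     * selects the upper (0) or lower (1) output port of the 2 × 2 switch. */
    static void omega_route(unsigned dest, int stages)
    {
        for (int s = stages - 1; s >= 0; s--) {
            int port = (dest >> s) & 1;
            printf("stage %d: take output port %d\n", stages - 1 - s, port);
        }
    }

    int main(void)
    {
        omega_route(6, 3);   /* 8-CPU network, destination 110: lower, lower, upper */
        return 0;
    }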

NUMA Multiprocessor Hardware
NUMA Multiprocessor Characteristics
• Single address space visible to all CPUs
• Access to remote memory via ordinary LOAD and
STORE instructions
• Access to remote memory slower than to local

NC-NUMA versus CC-NUMA


– No cache versus cache-coherent

NUMA Multiprocessor Hardware

a) 256-node directory-based multiprocessor


b) Fields of 32-bit memory address
c) Directory at node 36
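
For scale, assuming 64-byte cache lines as in the classic textbook version of this example: the 32-bit address splits into an 8-bit node field (2^8 = 256 nodes), an 18-bit block field, and a 6-bit offset within the block, so each node holds 2^18 × 64 bytes = 16 MB of local memory and its directory needs one entry per local block.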

Multiprocessor OS Paradigms

Each CPU has its own operating system


– Allows sharing of devices
– Efficient process communication via shared memory
– Since each OS is independent
• No process sharing among CPUs
• No page sharing among CPUs
• Makes disk buffering very difficult

Multiprocessor OS Paradigms

Master-slave multiprocessors
– OS and all tables are on one CPU
• Process sharing, so no idle CPUs
• Page sharing
• Disk buffers
– Master CPU becomes a bottleneck

Multiprocessor OS Paradigms

Symmetric multiprocessors
– One copy of OS, but any CPU can run it
– Balances processes and memory dynamically
– Eliminates the master CPU bottleneck
– More complicated; requires reasonably fine-grained
synchronization to avoid bottleneck and deadlock issues

Multiprocessor Synchronization

Locking
– Test-and-set instructions fail if the bus cannot be locked
– Creates contention for the bus; caching doesn’t help since
test-and-set is a write instruction
– Could read first, before the test-and-set; can also use
exponential back-off (see the sketch below)
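
A minimal sketch of this test-and-test-and-set idea with exponential back-off, using C11 atomics (all names are illustrative):

    #include <stdatomic.h>

    void spin_lock(atomic_int *l)
    {
        unsigned delay = 1;
        for (;;) {
            /* Read before test-and-set: spinning on a cached copy of the
             * lock generates no bus traffic until the holder releases it. */
            while (atomic_load_explicit(l, memory_order_relaxed) != 0)
                ;
            /* The actual test-and-set (an atomic write). */
            if (atomic_exchange_explicit(l, 1, memory_order_acquire) == 0)
                return;                           /* lock acquired */
            /* Lost the race: back off exponentially to cut contention. */
            for (volatile unsigned i = 0; i < delay; i++)
                ;
            if (delay < 1024)
                delay *= 2;
        }
    }

    void spin_unlock(atomic_int *l)
    {
        atomic_store_explicit(l, 0, memory_order_release);
    }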

Multiprocessor Synchronization

Locking with multiple private locks


– Try to lock first; if that fails, create a private lock and put
it at the end of a list of CPUs waiting for the lock
– The lock holder releases the original lock and frees the private
lock of the first CPU on the waiting list (sketched below)
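
This scheme is essentially a queue lock in the style of MCS. A C11 sketch with illustrative names; each CPU spins only on its own private flag, so waiting generates no traffic on the shared interconnect:

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct qnode {
        struct qnode *_Atomic next;
        atomic_bool locked;              /* this CPU's private lock */
    } qnode;

    typedef struct { qnode *_Atomic tail; } qlock;

    void qlock_acquire(qlock *lk, qnode *me)
    {
        atomic_store(&me->next, NULL);
        atomic_store(&me->locked, true);
        /* Put ourselves at the end of the list of waiting CPUs. */
        qnode *prev = atomic_exchange(&lk->tail, me);
        if (prev == NULL)
            return;                      /* lock was free: we hold it */
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))
            ;                            /* spin on our own private flag */
    }

    void qlock_release(qlock *lk, qnode *me)
    {
        qnode *succ = atomic_load(&me->next);
        if (succ == NULL) {
            /* No visible successor: try to mark the lock free. */
            qnode *expected = me;
            if (atomic_compare_exchange_strong(&lk->tail, &expected, NULL))
                return;
            while ((succ = atomic_load(&me->next)) == NULL)
                ;                        /* a successor is mid-enqueue */
        }
        /* Free the private lock of the first CPU on the waiting list. */
        atomic_store(&succ->locked, false);
    }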

Multiprocessor Synchronization
Spinning versus Switching
• In some cases CPU must wait
– For example, must wait to acquire ready list
• In other cases a choice exists
– Spinning wastes CPU cycles
– Switching uses up CPU cycles also
– Possible to make a separate decision each time a locked
mutex is encountered (see the sketch below)
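
One common compromise is a two-phase (“spin, then switch”) lock: spin for roughly the cost of a context switch, then give up the CPU. A C sketch; SPIN_LIMIT is an assumed tuning constant:

    #include <stdatomic.h>
    #include <sched.h>

    #define SPIN_LIMIT 100   /* tune to roughly one context-switch cost */

    void mutex_acquire(atomic_int *m)
    {
        for (;;) {
            /* Phase 1: spin briefly, hoping the holder finishes soon. */
            for (int i = 0; i < SPIN_LIMIT; i++)
                if (atomic_load(m) == 0 && atomic_exchange(m, 1) == 0)
                    return;
            /* Phase 2: switch instead of burning more cycles. */
            sched_yield();
        }
    }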

Multiprocessor Scheduling

• Timesharing
– Non-related processes
– Note use of single data structure for scheduling
• Provides automatic load balancing
• Contention for process list

Multiprocessor Scheduling
• What about processes holding a spin lock?
– Does not make sense to block such a process
• In general, all CPUs are equal, but some are more
equal than others
– The CPU cache may hold blocks of a process that was
previously running on it
– The CPU TLB may hold pages of a process that was
previously running on it
– Use affinity scheduling to try to keep processes on the same
CPU (see the sketch below)
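
A toy sketch of affinity scheduling; the process structure and the scheduler interface are hypothetical:

    #include <stddef.h>

    struct proc {
        int last_cpu;                    /* CPU this process last ran on */
        int ready;
    };

    /* Prefer a ready process whose cache/TLB state may still be warm on
     * this CPU; fall back to any ready process. */
    struct proc *pick_next(struct proc *rq[], int n, int this_cpu)
    {
        struct proc *fallback = NULL;
        for (int i = 0; i < n; i++) {
            if (!rq[i]->ready)
                continue;
            if (rq[i]->last_cpu == this_cpu)
                return rq[i];            /* affinity hit */
            if (fallback == NULL)
                fallback = rq[i];
        }
        return fallback;                 /* no affinity match: take any */
    }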

Multiprocessor Scheduling

• Space sharing
– Related processes/threads
– Multiple threads at same time across multiple CPUs
• A group of threads is created and assigned to CPUs as a block
• Group runs until completion
– Eliminates multiprogramming and context switches
– Potentially wastes CPU time when CPUs are left idle

Multiprocessor Scheduling

• Potential communication problem when scheduling
threads independently
– A0 and A1 both belong to process A
– Both running out-of-phase

Multiprocessor Scheduling
• Need to avoid wasting idle CPUs and out-of-
phase thread communication
• Solution: Gang Scheduling
– Groups of related threads scheduled as a unit (a gang)
– All members of gang run simultaneously on different
timeshared CPUs
– All gang members start and end time slices together

Multiprocessor Scheduling

Gang scheduling
• All CPUs scheduled synchronously
• Still has some idle time and out-of-phase, but reduced

Multicomputers
Definition
Tightly-coupled CPUs that do not share memory

Also known as
– Cluster computers
– Clusters of workstations (COWs)

Multicomputer Interconnection

Interconnection topologies
a) single switch
b) ring
c) grid
d) double torus
e) cube
f) hypercube

Multicomputer Interconnection

• Switching schemes
– Store-and-forward packet switching
• Send a complete packet to first switch
• The complete packet is received and forwarded to the next switch
• Repeated until the packet arrives at its destination
• Increases latency due to all the copying
– Circuit switching
• Establishes a path through switches (i.e., a circuit)
• Pumps packet bits non-stop to destination
• No intermediate buffer
• Requires set-up and tear-down time

Multicomputer Network Interface

• Interface boards usually contain a buffer for packets
– Need to control the flow onto the interconnection network when
sending and receiving packets
• Interface boards can use DMA to copy packets into main RAM

Multicomputer Network Interface
• Must avoid unnecessary copying of packets
– Problematic if interface board is mapped into kernel memory

• Map interface board into process memory


• If several processes are running on node
– Each needs network access to send packets …
– Must have sharing/synchronization mechanism

• If kernel needs access to network …


• One possible solution is to use two network boards
– One for user space, one for kernel space

Multicomputer Network Interface

Node to network interface communication


• Complicated when the user is controlling DMA
• If the interface has its own CPU, then it must coordinate
with the main CPU
– Use send & receive rings

Multicomputer User-Level Communication
• Bare minimum: send and receive
– Blocking versus non-blocking
• Choices (the first three are sketched below)
– Blocking send (CPU idle during message transmission)
– Non-blocking send with copy (wastes CPU time on an extra copy)
– Non-blocking send with interrupt (makes programming difficult)
– Copy on write (extra copy eventually)
– Pop-up thread
• Creates a thread spontaneously when a message arrives
– Active messages
• Message handler code is run directly in the interrupt handler
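
To make the trade-offs concrete, here is what the contracts of the first three variants might look like; these signatures are hypothetical, not a real API:

    #include <stddef.h>

    /* Blocking send: returns only once the message is on the wire, so the
     * buffer is immediately reusable, but the CPU idles meanwhile. */
    int send_blocking(int dest, const void *buf, size_t len);

    /* Non-blocking send with copy: the kernel copies buf into its own
     * buffer and returns at once, at the cost of an extra copy. */
    int send_copy(int dest, const void *buf, size_t len);

    /* Non-blocking send with interrupt: returns at once; buf must not be
     * touched until the completion callback fires (hard to program). */
    int send_async(int dest, const void *buf, size_t len,
                   void (*done)(void *ctx), void *ctx);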

Multicomputer User-Level Communication

• The send/receive primitives are the wrong paradigm
for most programmers

• Remote procedure call (RPC) maintains the procedural
paradigm
– Splits a procedure into a client stub and a server stub

Multicomputer User-Level Communication
RPC implementation issues
• Cannot pass pointers
– Call by reference becomes copy-restore (but might fail)
• Weakly typed languages
– Client stub cannot determine the size of parameters
• Not always possible to determine parameter types
– Think about printf(…) with variable parameters
• Cannot use global variables
– The procedure may get moved to a remote machine, where
globals are not shared
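
A minimal sketch of a client stub for a remote int add(int, int), assuming hypothetical blocking send/receive primitives; everything is marshaled by value, which is exactly why pointers and globals are troublesome:

    #include <stdint.h>
    #include <stddef.h>

    int send_blocking(int dest, const void *buf, size_t len);
    int recv_blocking(int src, void *buf, size_t len);

    enum { PROC_ADD = 1, SERVER_NODE = 0 };   /* made-up protocol values */

    /* Client stub: looks like a local call, but marshals the procedure
     * number and parameters into a message and waits for the reply. */
    int add(int a, int b)
    {
        int32_t req[3] = { PROC_ADD, a, b };  /* marshal by value */
        int32_t reply;

        send_blocking(SERVER_NODE, req, sizeof req);
        recv_blocking(SERVER_NODE, &reply, sizeof reply);
        return reply;                          /* unmarshal the result */
    }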

Multicomputer Distributed Shared Memory

• Layers where shared memory can be implemented


– Hardware (multiprocessors)
– Operating system

Multicomputer Distributed Shared Memory

Replication
a) Pages distributed on 4
machines

b) CPU 0 reads page 10

c) CPU 1 reads page 10

Multicomputer Distributed Shared Memory

• False sharing: unrelated data on the same page ping-pong
between machines (see the sketch below)
• Must also achieve sequential consistency (i.e., the
cache coherency problem)
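
A common mitigation is padding, shown in this C sketch; PAGE_SIZE and the GCC alignment attribute are assumptions, and the same idea applies to cache lines on a CC-NUMA machine:

    #define PAGE_SIZE 4096   /* assumed DSM page size */

    /* Two unrelated counters on the same DSM page would ping-pong between
     * nodes even though no datum is actually shared. Padding each counter
     * to a full page keeps writers on different nodes out of each other's
     * way. */
    struct padded_counter {
        long value;
        char pad[PAGE_SIZE - sizeof(long)];
    };

    /* One counter per node, each on its own page. */
    struct padded_counter counters[64] __attribute__((aligned(PAGE_SIZE)));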

Multicomputer Process Scheduling
• On a multicomputer, each node has its own
memory and its own set of processes
– This is very similar to a uniprocessor, so process scheduling
can use similar algorithms
• Unless you have multiprocessors as nodes
• The critical aspect of multicomputer scheduling is
allocating processes to processors
– Processor allocation algorithms
• Use various metrics to determine process “load” and how to
properly allocate processes to processors
– These are “load” balancing algorithms

Multicomputer Load Balancing


• Graph-theoretic deterministic algorithm


– Know processes, CPU and memory requirements, and average
communication traffic among processes
– Partition the graph to minimize network traffic and to meet constraints
on CPU and memory (a toy version is sketched below)
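
As a toy illustration only (a real deterministic algorithm computes a proper minimum-cut partition), a greedy C sketch that places each process on the node where it adds the least off-node traffic, subject to a per-node capacity:

    #define NPROC  8
    #define NNODES 2
    #define CAP (NPROC / NNODES)         /* assumed capacity constraint */

    /* traffic[i][j]: average traffic between processes i and j.
     * node_of[i]:    output, the node assigned to process i. */
    void allocate(int traffic[NPROC][NPROC], int node_of[NPROC])
    {
        int load[NNODES] = {0};
        for (int p = 0; p < NPROC; p++) {
            int best = -1, best_cut = 1 << 30;
            for (int n = 0; n < NNODES; n++) {
                if (load[n] >= CAP)
                    continue;            /* node is full */
                int cut = 0;             /* traffic p would send off-node */
                for (int q = 0; q < p; q++)
                    if (node_of[q] != n)
                        cut += traffic[p][q];
                if (cut < best_cut) {
                    best_cut = cut;
                    best = n;
                }
            }
            node_of[p] = best;
            load[best]++;
        }
    }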

Multicomputer Load Balancing

• Sender-initiated distributed heuristic algorithm


– An overloaded node probes for an underloaded node
– Probes happen during heavy load, which adds even more load
(see the sketch below)
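
A sketch of the sender-initiated heuristic in C; probe(), migrate(), the thresholds, and the probe limit are all illustrative assumptions:

    #include <stdlib.h>

    #define MAX_PROBES 3
    #define HIGH_WATER 8                 /* "I am overloaded" */
    #define LOW_WATER  2                 /* "target is underloaded" */

    int probe(int node);                 /* ask a node for its current load */
    void migrate(int proc, int node);    /* move a process to that node */

    void offload(int my_load, int victim_proc, int nnodes)
    {
        if (my_load <= HIGH_WATER)
            return;                      /* not overloaded: do nothing */
        for (int i = 0; i < MAX_PROBES; i++) {
            int target = rand() % nnodes;
            if (probe(target) <= LOW_WATER) {
                migrate(victim_proc, target);
                return;
            }
        }
        /* All probes failed; note each probe itself added traffic to an
         * already heavily loaded system (the drawback noted above). */
    }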

Multicomputer Load Balancing

• Receiver-initiated distributed heuristic algorithm


– An underloaded node probes for an overloaded node
– Probes happen during light load, when the extra traffic is harmless

Distributed Systems

Comparison of three kinds of multiple CPU systems

Distributed System Middleware

Achieving uniformity with middleware
