Unit-5 Part-2
COUPLING OF PROCESSORS
Tightly Coupled System
- Tasks and/or processors communicate in a highly synchronized fashion
- Communicate through a common shared memory
- Shared memory system
Loosely Coupled System
- Tasks or processors do not communicate in a synchronized fashion
- Communicate by passing message packets
- Overhead for data exchange is high
- Distributed memory system
Multiprocessors Characteristics of Multiprocessors
MEMORY
Shared (Global) Memory
- A Global Memory Space accessible by all processors
- Processors may also have some local memory
Distributed (Local, Message-Passing) Memory
- All memory units are associated with processors
- To retrieve information from another processor's memory, a message must be sent there
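The contrast between shared memory and message passing can be sketched in a few lines. This is only an illustration under stated assumptions: Python threads stand in for processors, a dictionary stands in for the global memory, and a `queue.Queue` stands in for the interconnection network. The function names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(values):
    """Shared-memory style: workers update one global location under a lock."""
    total = {"value": 0}       # stands in for the global memory space
    lock = threading.Lock()    # keeps the updates synchronized

    def worker(chunk):
        for v in chunk:
            with lock:
                total["value"] += v

    half = len(values) // 2
    threads = [
        threading.Thread(target=worker, args=(values[:half],)),
        threading.Thread(target=worker, args=(values[half:],)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


def message_passing_sum(values):
    """Distributed-memory style: each worker keeps local data and sends a message."""
    mailbox = queue.Queue()    # stands in for the network between processors

    def worker(chunk):
        local_total = sum(chunk)   # computed entirely in the worker's local memory
        mailbox.put(local_total)   # the result leaves only as a message

    half = len(values) // 2
    threads = [
        threading.Thread(target=worker, args=(values[:half],)),
        threading.Thread(target=worker, args=(values[half:],)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mailbox.get() + mailbox.get()
```

Both functions compute the same sum. The difference is where the data lives and whether a worker reaches it directly or has to receive it in a message.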
Uniform Memory
- All processors take the same time to reach all memory locations
Nonuniform (NUMA) Memory
- Memory access is not uniform
[Figure: SHARED MEMORY (Processors, Network, Memory) and DISTRIBUTED MEMORY (Processors/Memory, Network)]
Multiprocessors Interconnection Structure
INTERCONNECTION STRUCTURES
* Time-Shared Common Bus
* Multiport Memory
* Crossbar Switch
* Multistage Switching Network
* Hypercube System
Bus
- All processors (and memory) are connected to a common bus or buses
- Memory access is fairly uniform, but not very scalable
BUS
- A collection of signal lines that carry module-to-module communication
- Data highways connecting several digital system elements
Operations of Bus Devices
[Figure: bus devices M3, S7, M6, S5, M4, and S2 attached to a common bus]
M3 wishes to communicate with S5
[1] M3 sends signals (address) on the bus that cause S5 to respond
[2] M3 sends data to S5 or S5 sends data to M3 (determined by the command line)
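The two steps above can be sketched as a toy model. This is not a description of any real bus protocol: the `Bus` and `Device` classes, the integer addresses, and the `"read"`/`"write"` command strings are all assumptions made for illustration.

```python
class Device:
    """A responding device with a single data register (hypothetical model)."""

    def __init__(self):
        self.register = 0


class Bus:
    """Toy bus: an address selects the responder, a command sets the direction."""

    def __init__(self):
        self.devices = {}

    def attach(self, address, device):
        self.devices[address] = device

    def transfer(self, address, command, data=None):
        # Step 1: the address placed on the bus causes one device to respond.
        device = self.devices[address]
        # Step 2: the command line decides which way the data moves.
        if command == "write":      # initiator -> responder
            device.register = data
            return None
        if command == "read":       # responder -> initiator
            return device.register
        raise ValueError(f"unknown command: {command}")
```

For example, after `bus.attach(5, s5)`, the call `bus.transfer(5, "write", 42)` plays the role of M3 sending data to S5, and `bus.transfer(5, "read")` plays the role of S5 sending data back.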
Local Bus
Disadvantages
- Memory control logic
- Large number of cables and connections
[Figure labels: CPU 1, CPU 2, CPU 3, CPU 4]
[Figure: a memory module (data, address, R/W, memory enable) fed by multiplexers and arbitration logic that select among the data, address, and control lines from each CPU; inputs from CPU 2, CPU 3, and CPU 4 are labeled]
MULTISTAGE SWITCHING NETWORK
Interstage Switch
[Figure: a 2 x 2 switch with inputs A and B and outputs 0 and 1, shown in its four settings]
- A connected to 0
- A connected to 1
- B connected to 0
- B connected to 1
MULTISTAGE INTERCONNECTION NETWORK
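Binary Tree with 2 x 2 Switches
[Figure: processors P1 and P2 feed a binary tree of 2 x 2 switches; each switch output is labeled 0 or 1, and the leaves are destinations 000 to 111]
8x8 Omega Switching Network
[Figure: inputs 0 to 7 connected through stages of 2 x 2 switches to outputs 000 to 111]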
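A path through the 8x8 omega network can be traced with destination-tag routing. The sketch below is a minimal illustration, assuming the usual perfect-shuffle wiring in front of every stage and 2 x 2 switches whose upper output is 0 and lower output is 1. The function name `omega_route` is hypothetical.

```python
def omega_route(source, destination, n_bits=3):
    """Trace a message through an omega network with 2**n_bits inputs.

    Returns the line number the message occupies after each stage.
    """
    size = 1 << n_bits
    line = source
    path = []
    for stage in range(n_bits):
        # Perfect shuffle between stages: rotate the line number left by one bit.
        line = ((line << 1) | (line >> (n_bits - 1))) & (size - 1)
        # The 2 x 2 switch picks output 0 or 1 from the destination bit,
        # most significant bit first.
        bit = (destination >> (n_bits - 1 - stage)) & 1
        line = (line & ~1) | bit
        path.append(line)
    return path
```

Each stage consumes one bit of the destination address, so after three stages the message sits on the output line whose number equals the destination, whichever input it started from.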
HYPERCUBE INTERCONNECTION
n-dimensional hypercube (binary n-cube)
- p = 2^n processors
- processors are conceptually on the corners of an n-dimensional hypercube, and each is directly connected to its n neighboring nodes
- Degree = n
[Figure: One-cube (nodes 0, 1), Two-cube (nodes 00, 01, 10, 11), Three-cube (nodes 000 to 111)]
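With binary node labels, two nodes of an n-cube are neighbors exactly when their labels differ in one bit. The sketch below uses that property to list a node's neighbors and to route a message by correcting one differing bit per hop. The function names and the lowest-dimension-first order are choices made for illustration.

```python
def neighbors(node, n):
    """Return the n nodes directly connected to `node` in a binary n-cube."""
    return [node ^ (1 << d) for d in range(n)]


def route(source, destination, n):
    """Route by flipping each differing address bit in turn, lowest dimension first."""
    path = [source]
    current = source
    for d in range(n):
        if (current ^ destination) & (1 << d):
            current ^= 1 << d      # move along dimension d
            path.append(current)
    return path
```

For the three-cube, `neighbors(0b000, 3)` gives nodes 001, 010, and 100, and `route(0b000, 0b111, 3)` visits 000, 001, 011, 111. The number of hops equals the number of bits in which source and destination differ, so it is never more than n.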
Multiprocessors Interprocessor Arbitration
INTERPROCESSOR ARBITRATION
Bus
Board level bus
Backplane level bus
Interface level bus
* Strobe pulse - supplied by one of the units to indicate to the other unit when the data transfer has to occur
e.g. IEEE standard 796 bus
- 86 lines
  Data: 16 (multiple of 8)
  Address: 24
  Control: 26
  Power: 20
BUS SIGNALS
Bus signal allocation
- address
- data
- control
- arbitration
- interrupt
- timing
- power, ground
IEEE Standard 796 Multibus Signals
Data and address
  Data lines (16 lines)        DATA0 - DATA15
  Address lines (24 lines)     ADRS0 - ADRS23
Data transfer
  Memory read                  MRDC
  Memory write                 MWTC
  IO read                      IORC
  IO write                     IOWC
  Transfer acknowledge         TACK (XACK)
Interrupt control
  Interrupt request            INT0 - INT7
  Interrupt acknowledge        INTA
BUS SIGNALS
IEEE Standard 796 Multibus Signals (Cont’d)
Miscellaneous control
  Master clock                 CCLK
  System initialization        INIT
  Byte high enable             BHEN
  Memory inhibit (2 lines)     INH1 - INH2
  Bus lock                     LOCK
Bus arbitration
  Bus request                  BREQ
  Common bus request           CBRQ
  Bus busy                     BUSY
  Bus clock                    BCLK
  Bus priority in              BPRN
  Bus priority out             BPRO
Power and ground (20 lines)
STATIC ARBITRATION
Serial Arbitration
Procedure
Parallel Arbitration
Procedure
DYNAMIC ARBITRATION
Priorities of the units can be changed dynamically while the system is in operation
Time Slice
A fixed-length time slice is given to each processor in turn, in round-robin fashion
Polling
Unit address polling - The bus controller advances the address to identify the requesting unit
LRU - The least recently used (LRU) algorithm gives the highest priority to the requesting device that has not used the bus for the longest interval
FIFO - In the first-come, first-served scheme, requests are served in the order received
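The LRU rule can be sketched with a list ordered from least to most recently used. This is a software illustration only, since a real arbiter is a hardware circuit. The `LRUArbiter` class and its `grant` method are hypothetical names.

```python
class LRUArbiter:
    """Grant the bus to the requester that has gone longest without using it."""

    def __init__(self, units):
        # Front of the list = least recently used = highest priority.
        self.order = list(units)

    def grant(self, requests):
        """Return the winning unit among `requests`, or None if nobody asked."""
        for unit in self.order:
            if unit in requests:
                # The winner becomes the most recently used unit.
                self.order.remove(unit)
                self.order.append(unit)
                return unit
        return None
```

Because every grant moves the winner to the back of the list, a unit that keeps requesting cannot shut out the others. A FIFO arbiter would differ only in ordering the units by the arrival time of their requests instead of by their last use of the bus.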
Rotating Daisy Chain
• Conventional Daisy Chain - Highest priority to the unit nearest to the bus controller
• Rotating Daisy Chain - Highest priority to the unit that is nearest to the unit that has most recently accessed the bus (it becomes the bus controller)
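The rotating rule amounts to scanning the chain from the position just past the unit that last held the bus. The sketch below assumes the units are numbered 0 to n_units - 1 in chain order and that the chain wraps around. The function name is hypothetical.

```python
def rotating_daisy_chain(n_units, requests, last_granted):
    """Pick the requesting unit nearest (downstream) to the last bus user.

    Units are numbered 0..n_units-1 around the chain. The unit that used the
    bus most recently is scanned last, so it has the lowest priority.
    """
    for offset in range(1, n_units + 1):
        unit = (last_granted + offset) % n_units
        if unit in requests:
            return unit
    return None
```

A conventional daisy chain corresponds to always starting the scan at the same fixed point, so the unit nearest the controller always wins. In the rotating version the starting point moves after every grant.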
Multiprocessors Interprocessor Communication and Synchronization
INTERPROCESSOR COMMUNICATION
INTERPROCESSOR SYNCHRONIZATION
Synchronization
- Communication of control information between processors
- To enforce the correct sequence of processes
- To ensure mutually exclusive access to shared writable data
Hardware Implementation
Mutual Exclusion with a Semaphore
Mutual Exclusion
- One processor excludes (locks out) other processors from a shared resource while it is in a Critical Section
- A Critical Section is a program sequence that, once begun, must complete execution before another processor accesses the same shared resource
Semaphore
- A binary variable
- 1: A processor is executing a critical section, so the resource is not available to other processors
  0: Available to any requesting processor
- A software-controlled flag stored in memory that all processors can access
SEMAPHORE
Testing and Setting the Semaphore
- Two or more processors must not test or set the same semaphore at the same time
- Otherwise two or more processors may enter the same critical section at the same time
- Testing and setting must therefore be implemented as an indivisible operation
The test and the set are performed while locked, so that other processors cannot test and set the semaphore while the current processor is executing these instructions
If R=1 (R holds the semaphore value that was read by the test), another processor is executing the critical section, so the processor that executed this instruction does not access the shared memory
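The test-and-set idea can be simulated in software. The sketch below is an illustration under stated assumptions, not the hardware mechanism itself: a `threading.Lock` stands in for the hardware lock that makes the test and the set indivisible, threads stand in for processors, and the register-transfer comments (`R <- M[SEM]`, `M[SEM] <- 1`) are an assumed reading of the instruction, since the original listing is not present in the text. All class and function names are hypothetical.

```python
import threading
import time


class SharedMemory:
    """Holds the semaphore word SEM that every 'processor' can access."""

    def __init__(self):
        self.sem = 0                        # 0: available, 1: critical section in use
        self._bus_lock = threading.Lock()   # models the hardware lock (assumption)

    def test_and_set(self):
        # Both steps happen while locked, so no other thread can interleave.
        with self._bus_lock:
            r = self.sem    # R <- M[SEM]   (test)
            self.sem = 1    # M[SEM] <- 1   (set)
        return r

    def clear(self):
        self.sem = 0        # leaving the critical section frees the semaphore


def run(n_threads=4, n_iters=200):
    """Each thread increments a shared counter inside the critical section."""
    mem = SharedMemory()
    counter = {"value": 0}

    def worker():
        for _ in range(n_iters):
            while mem.test_and_set() == 1:  # R = 1: someone else is inside, retry
                time.sleep(0)               # yield so the holder can make progress
            counter["value"] += 1           # critical section
            mem.clear()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

A thread that reads R = 0 knows it was the one that changed the semaphore from 0 to 1, so it alone enters the critical section. Every other thread sees R = 1 and retries until the semaphore is cleared.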