Unit-5 Part-2


Multiprocessors Characteristics of Multiprocessors

COUPLING OF PROCESSORS
Tightly Coupled System
- Tasks and/or processors communicate in a highly synchronized fashion
- Communicate through a common shared memory
- Shared memory system
Loosely Coupled System
- Tasks or processors do not communicate in a synchronized fashion
- Communicate by passing message packets
- Overhead for data exchange is high
- Distributed memory system
MEMORY
Shared (Global) Memory
- A Global Memory Space accessible by all processors
- Processors may also have some local memory
Distributed (Local, Message-Passing) Memory
- All memory units are associated with processors
- To retrieve information from another processor's memory, a message must be sent there
Uniform Memory
- All processors take the same time to reach all memory locations
Nonuniform (NUMA) Memory
- Memory access is not uniform
[Figure: SHARED MEMORY - Processors, Network, Memory; DISTRIBUTED MEMORY - Processors/Memory, Network]
Multiprocessors Interconnection Structure
INTERCONNECTION STRUCTURES
* Time-Shared Common Bus
* Multiport Memory
* Crossbar Switch
* Multistage Switching Network
* Hypercube System

Bus
All processors (and memory) are connected to a common bus or buses
- Memory access is fairly uniform, but not very scalable
BUS
- A collection of signal lines that carry module-to-module communication
- Data highways connecting several digital system elements
Operations of Bus Devices
[Figure: devices M3, S7, M6, S5, M4, and S2 attached to a common bus]
M3 wishes to communicate with S5
[1] M3 sends signals (address) on the bus that cause S5 to respond
[2] M3 sends data to S5, or S5 sends data to M3 (determined by the command line)

Master Device: Device that initiates and controls the communication
Slave Device: Responding device
Multiple-master buses
-> Bus conflict
SYSTEM BUS STRUCTURE FOR MULTIPROCESSORS

[Figure: several processor groups, each with a Local Bus connecting its CPU, IOP (where present), and Local Memory; each group attaches through a System Bus Controller to the SYSTEM BUS, which also connects the Common Shared Memory]
MULTIPORT MEMORY
Multiport Memory Module
- Each port serves a CPU
Memory Module Control Logic
- Each memory module has control logic
- Resolve memory module conflicts: fixed priority among CPUs
Advantages
- Multiple paths -> high transfer rate
Disadvantages
- Memory control logic
- Large number of cables and connections
[Figure: memory modules MM 1 - MM 4, each with one port per CPU, connected to CPU 1 - CPU 4]
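The conflict resolution described above can be sketched as follows. This is a minimal illustration, not the actual control logic: it assumes that a lower CPU number means higher priority, and the function names `grant_port` and `serve_cycle` are hypothetical.

```python
def grant_port(requesting_cpus):
    """Fixed priority inside one memory module.

    Assumption: the lowest CPU number has the highest priority.
    Returns the CPU that is granted the module, or None if nobody asked.
    """
    return min(requesting_cpus) if requesting_cpus else None


def serve_cycle(requests_by_module):
    """Resolve one memory cycle for every module independently.

    requests_by_module maps a module number to the set of CPUs requesting it.
    Each module has its own control logic, so different modules can serve
    different CPUs in the same cycle (the "multiple paths" advantage).
    """
    return {module: grant_port(cpus) for module, cpus in requests_by_module.items()}


# CPU 2 and CPU 4 collide on module 1; CPU 3 has module 2 to itself.
grants = serve_cycle({1: {2, 4}, 2: {3}, 3: set()})
print(grants)
```

Only requests to the same module conflict; requests to different modules proceed in parallel.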

CROSSBAR SWITCH
[Figure: crossbar connecting CPU1 - CPU4 to memory modules MM1 - MM4]
Block Diagram of Crossbar Switch
[Figure: data, address, and control lines from CPU 1 - CPU 4 feed the multiplexers and arbitration logic, which drive the memory module's data, address, R/W, and memory enable inputs]
MULTISTAGE SWITCHING NETWORK

Interstage Switch
- Inputs A and B, outputs 0 and 1
- Four possible states:
  A connected to 0
  A connected to 1
  B connected to 0
  B connected to 1
MULTISTAGE INTERCONNECTION NETWORK
Binary Tree with 2 x 2 Switches
[Figure: sources P1 and P2 connect through three levels of 2 x 2 switches, each with outputs 0 and 1, to destinations 000 - 111]
8x8 Omega Switching Network
[Figure: inputs 0 - 7 connect through the switching stages to outputs 000 - 111]
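As a rough illustration of how a path is set up through an 8x8 omega network, the sketch below assumes the usual construction: a perfect-shuffle wiring before each stage of 2 x 2 switches, with each switch steered by one bit of the destination address (most significant bit first). The slides show only the figure, so this wiring and the function name `omega_route` are assumptions.

```python
def omega_route(src, dst, n_bits=3):
    """Trace a path from input line src to output line dst.

    Returns a list of (stage, switch_output, line_after_stage) tuples.
    Assumes a perfect shuffle before every stage and destination-bit routing.
    """
    mask = (1 << n_bits) - 1
    line = src
    path = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line address left by one bit.
        line = ((line << 1) | (line >> (n_bits - 1))) & mask
        # The 2 x 2 switch picks output 0 or 1 from the next destination bit.
        out = (dst >> (n_bits - 1 - stage)) & 1
        line = (line & ~1) | out
        path.append((stage, out, line))
    return path


for stage, out, line in omega_route(0b010, 0b101):
    print(f"stage {stage}: take output {out}, now on line {line:03b}")
```

After the last stage the line address equals the destination, because every source bit has been shifted out and replaced by a destination bit.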
HYPERCUBE INTERCONNECTION
n-dimensional hypercube (binary n-cube)
- p = 2^n processors
- Processors are conceptually on the corners of an n-dimensional hypercube, and each is directly connected to the n neighboring nodes
- Degree = n
[Figure: One-cube (nodes 0, 1), Two-cube (nodes 00, 01, 10, 11), Three-cube (nodes 000 - 111)]
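The node labelling in the figure can be made concrete with a short sketch. Neighboring nodes differ in exactly one address bit, so a message can be routed by correcting the differing bits one at a time. The function names and the lowest-bit-first order are illustrative choices, not something the slides prescribe.

```python
def neighbors(node, n):
    """The n nodes directly connected to `node` in a binary n-cube."""
    return [node ^ (1 << d) for d in range(n)]


def route(src, dst, n):
    """Path from src to dst, flipping one differing address bit per hop.

    Bits are corrected from the lowest dimension up (an arbitrary choice).
    """
    path = [src]
    current = src
    for d in range(n):
        if (current ^ dst) & (1 << d):
            current ^= 1 << d
            path.append(current)
    return path


print([f"{x:03b}" for x in neighbors(0b010, 3)])
print([f"{x:03b}" for x in route(0b010, 0b001, 3)])
```

The number of hops equals the number of bits in which the source and destination addresses differ, so it never exceeds n.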
Multiprocessors Interprocessor Arbitration
INTERPROCESSOR ARBITRATION

Bus
- Board level bus
- Backplane level bus
- Interface level bus

System Bus - A Backplane level bus
- Printed Circuit Board
- Connects CPU, IOP, and Memory
- Each of the CPU, IOP, and Memory boards can be plugged into a slot in the backplane (system bus)
- Bus signals are grouped into 3 groups: Data, Address, and Control (plus power)
- Only one of CPU, IOP, and Memory can be granted use of the bus at a time
- Arbitration mechanism is needed to handle multiple requests
SYNCHRONOUS & ASYNCHRONOUS DATA TRANSFER
Synchronous Bus
* Each data item is transferred during a time slice known to both the source and destination units
- Common clock source
- Or separate clocks, with a synchronization signal transmitted periodically to synchronize the clocks in the system
Asynchronous Bus
* Each data item is transferred by a handshake mechanism
- The unit that transmits the data also transmits a control signal that indicates the presence of data
- The unit that receives the data responds with another control signal to acknowledge receipt of the data
* Strobe pulse - supplied by one of the units to indicate to the other unit when the data transfer has to occur
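The handshake sequence above can be sketched as a simple trace of control-signal changes. The signal names `data_valid` and `data_accepted` are assumptions chosen for illustration, and the model is sequential code rather than real bus hardware.

```python
def handshake_transfer(items):
    """Transfer items one at a time with a two-signal handshake.

    Returns (received, trace), where trace records each control-signal change.
    """
    received, trace = [], []
    for item in items:
        data_lines = item                    # source places the item on the data lines
        trace.append(("data_valid", 1))      # source: data is present
        received.append(data_lines)          # destination latches the data
        trace.append(("data_accepted", 1))   # destination: acknowledges receipt
        trace.append(("data_valid", 0))      # source withdraws the data
        trace.append(("data_accepted", 0))   # destination is ready for the next item
    return received, trace


received, trace = handshake_transfer([0x12, 0x34])
print(received)
print(trace[:4])
```

No shared clock is needed: each unit advances only after it sees the other unit's signal change.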
BUS SIGNALS
Bus signal allocation
- address
- data
- control
- arbitration
- interrupt
- timing
- power, ground
e.g. IEEE standard 796 bus
- 86 lines
- Data: 16 (multiple of 8)
- Address: 24
- Control: 26
- Power: 20
IEEE Standard 796 Multibus Signals
Data and address
  Data lines (16 lines)         DATA0 - DATA15
  Address lines (24 lines)      ADRS0 - ADRS23
Data transfer
  Memory read                   MRDC
  Memory write                  MWTC
  IO read                       IORC
  IO write                      IOWC
  Transfer acknowledge          TACK (XACK)
Interrupt control
  Interrupt request             INT0 - INT7
  Interrupt acknowledge         INTA
BUS SIGNALS
IEEE Standard 796 Multibus Signals (Cont'd)
Miscellaneous control
  Master clock                  CCLK
  System initialization         INIT
  Byte high enable              BHEN
  Memory inhibit (2 lines)      INH1 - INH2
  Bus lock                      LOCK
Bus arbitration
  Bus request                   BREQ
  Common bus request            CBRQ
  Bus busy                      BUSY
  Bus clock                     BCLK
  Bus priority in               BPRN
  Bus priority out              BPRO
Power and ground (20 lines)
INTERPROCESSOR ARBITRATION STATIC ARBITRATION
Serial Arbitration Procedure
Parallel Arbitration Procedure
INTERPROCESSOR ARBITRATION DYNAMIC ARBITRATION
Priorities of the units can change dynamically while the system is in operation
Time Slice
- A fixed-length time slice is given sequentially to each processor, in round-robin fashion
Polling
- Unit address polling: the bus controller advances the address to identify the requesting unit
LRU
- The least recently used (LRU) algorithm gives the highest priority to the requesting device that has not used the bus for the longest interval
FIFO
- In the first-come, first-served scheme, requests are served in the order received
Rotating Daisy Chain
- Conventional Daisy Chain: highest priority to the unit nearest to the bus controller
- Rotating Daisy Chain: highest priority to the unit that is nearest to the unit that has most recently accessed the bus (it becomes the bus controller)
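Two of the dynamic schemes above, LRU and the rotating daisy chain, can be sketched in a few lines. This is a software model under stated assumptions: units are numbered 0 to n-1 around the chain, `last_used` holds the time each unit last held the bus, and the function names are hypothetical.

```python
def lru_grant(requests, last_used):
    """LRU: grant the requester that has gone longest without the bus.

    last_used maps unit -> time of its most recent bus access (smaller = longer ago).
    """
    if not requests:
        return None
    return min(requests, key=lambda unit: last_used[unit])


def rotating_daisy_chain_grant(requests, last_owner, n_units):
    """Rotating daisy chain: scan the ring starting just after the last bus owner.

    The unit nearest to the last owner (in chain order) wins.
    """
    for offset in range(1, n_units + 1):
        unit = (last_owner + offset) % n_units
        if unit in requests:
            return unit
    return None


print(lru_grant({0, 2, 3}, {0: 5, 1: 1, 2: 3, 3: 9}))
print(rotating_daisy_chain_grant({0, 3}, last_owner=1, n_units=4))
print(rotating_daisy_chain_grant({0, 3}, last_owner=3, n_units=4))
```

In both schemes a unit's priority drops right after it uses the bus, which is what keeps any one unit from monopolizing it.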
Multiprocessors Interprocessor Communication and Synchronization
INTERPROCESSOR COMMUNICATION

INTERPROCESSOR SYNCHRONIZATION
Synchronization
Communication of control information between processors
- To enforce the correct sequence of processes
- To ensure mutually exclusive access to shared writable data

Hardware Implementation
Mutual Exclusion with a Semaphore
Mutual Exclusion
- One processor excludes (locks out) access to a shared resource by other processors while it is in a Critical Section
- A Critical Section is a program sequence that, once begun, must complete execution before another processor accesses the same shared resource

Semaphore
- A binary variable
- 1: A processor is executing a critical section; the resource is not available to other processors
  0: Available to any requesting processor
- Software-controlled flag stored in memory that all processors can access
SEMAPHORE
Testing and Setting the Semaphore
- Two or more processors must be prevented from testing or setting the same semaphore at the same time
- Otherwise, two or more processors may enter the same critical section at the same time
- Must be implemented with an indivisible operation

R <- M[SEM] / Test semaphore /
M[SEM] <- 1 / Set semaphore /

These two operations are done while the bus is locked, so that other processors cannot test and set the semaphore while the current processor is executing them

If R=1, another processor is executing the critical section; the processor that executed this instruction does not access the shared memory

If R=0, the shared memory is available for access; the semaphore has been set to 1, and the processor proceeds to access it

The last instruction in the program must clear the semaphore
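The test-and-set sequence can be imitated in software as a rough sketch. Here a `threading.Lock` stands in for the hardware bus lock that makes the two register transfers indivisible, and threads stand in for processors; the class `SemaphoreFlag` and the function `run_demo` are hypothetical names introduced for this illustration.

```python
import threading
import time


class SemaphoreFlag:
    """Binary semaphore word SEM with an indivisible test-and-set."""

    def __init__(self):
        self.sem = 0
        self._bus_lock = threading.Lock()   # stand-in for the hardware lock signal

    def test_and_set(self):
        with self._bus_lock:                # both steps happen while locked
            r = self.sem                    # R <- M[SEM]   (test)
            self.sem = 1                    # M[SEM] <- 1   (set)
        return r

    def clear(self):
        self.sem = 0                        # last step of the critical section


def run_demo(n_threads=4, n_iters=200):
    """Each 'processor' increments a shared counter inside the critical section."""
    flag = SemaphoreFlag()
    shared = {"count": 0}

    def processor():
        for _ in range(n_iters):
            while flag.test_and_set() == 1:   # R = 1: someone else is inside
                time.sleep(0)                 # yield and retry
            shared["count"] += 1              # critical section (R was 0)
            flag.clear()

    threads = [threading.Thread(target=processor) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]


print(run_demo())
```

If the test and the set were separate unlocked steps, two processors could both read 0 and both enter the critical section, which is exactly the failure the indivisible operation prevents.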
CACHE COHERENCE
MAINTAINING CACHE COHERENCY
Shared Cache
- Disallow private cache
- Access time delay
Software Approaches
* Read-Only Data are Cacheable
- Private Cache is for Read-Only data
- Shared Writable Data are not cacheable
- Compiler tags data as cacheable and noncacheable
- Degrades performance due to software overhead
* Centralized Global Table
- Status of each memory block is maintained in CGT: RO(Read-Only); RW(Read and Write)
- All caches can have copies of RO blocks
- Only one cache can have a copy of RW block
Hardware Approaches
* Snoopy Cache Controller
- Cache Controllers monitor all the bus requests from CPUs and IOPs
- All caches attached to the bus monitor the write operations
- When a word in a cache is written, memory is also updated (write through)
- Local snoopy controllers in all other caches check their memory to determine if they have a copy of that word; if they have, that location is marked invalid (a future reference to this location causes a cache miss)
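The snoopy write-through behaviour described above can be sketched as a small software model. It assumes word-sized cache entries and a single shared bus; the class and method names are hypothetical, and a real controller does this in hardware.

```python
class Bus:
    """Shared bus with main memory and the caches attached to it."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def broadcast_write(self, writer, addr, value):
        self.memory[addr] = value            # write-through: memory is updated too
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(addr)      # every other cache sees the write


class SnoopyCache:
    """Private cache whose controller monitors writes on the bus."""

    def __init__(self, bus):
        self.lines = {}                      # addr -> value, valid entries only
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch the word from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)

    def snoop_write(self, addr):
        self.lines.pop(addr, None)           # mark the local copy invalid


bus = Bus()
c1, c2 = SnoopyCache(bus), SnoopyCache(bus)
bus.memory[0x10] = 7
print(c1.read(0x10), c2.read(0x10))          # both caches now hold a copy
c1.write(0x10, 9)                            # c2's copy is invalidated
print(c2.read(0x10))                         # miss, reloads the new value
```

Invalidation means the second cache pays for one extra miss but never returns the stale value.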
