Multiprocessing: Flynn's Classification (1966)
IT6030 1
11/1/2010
[Figure: cache coherence example. P1, P2 and P3 each have a private cache ($) and share memory and I/O devices. Memory holds u:5; events are numbered 1 to 5, with cached copies u:5 and u:5, a write u = 7, and two reads marked u = ?.]
– Processors see different values for u after event 3
– With write-back caches, the value written back to memory depends on the happenstance of which cache flushes or writes back its value, and when
» Processes accessing main memory may see a very stale value
– Unacceptable for programming, and it's frequent!
[Figure: memory hierarchy P, L1, L2, Memory, Disk, with copies of address 100 holding different values: L1 100:67, L2 100:35, Disk 100:34.]
• Reading an address should return the last value written to that address
– Easy in uniprocessors, except for I/O
• Too vague and simplistic; 2 issues
1. Coherence defines values returned by a read
2. Consistency determines when a written value will be returned by a read
• Coherence defines behavior to same location, Consistency defines behavior to other locations
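The stale-value problem in the P1/P2/P3 example can be reproduced with a small toy model. This is only a sketch under stated assumptions: the event order (P1 reads u, P3 reads u, P3 writes u = 7, P1 reads u, P2 reads u) is the reading of the figure assumed here, and the names `make_system`, `read`, `write` and `run_events` are hypothetical helpers, not part of the course material.

```python
# Toy model: three private write-back caches and NO coherence protocol.
# Assumption: each cache block holds a single word, addressed by name.

def make_system():
    """Return fresh (memory, caches) state with u = 5 in memory."""
    memory = {"u": 5}
    caches = {"P1": {}, "P2": {}, "P3": {}}
    return memory, caches

def read(memory, caches, proc, addr):
    """Read through the processor's private cache; fill from memory on a miss."""
    cache = caches[proc]
    if addr not in cache:
        cache[addr] = memory[addr]
    return cache[addr]

def write(caches, proc, addr, value):
    """Write-back behavior: update only the local copy, memory is untouched."""
    caches[proc][addr] = value

def run_events():
    """Replay the assumed events 1 to 5 and report what each party sees."""
    memory, caches = make_system()
    read(memory, caches, "P1", "u")          # event 1: P1 caches u = 5
    read(memory, caches, "P3", "u")          # event 2: P3 caches u = 5
    write(caches, "P3", "u", 7)              # event 3: P3 writes u = 7 locally
    p1 = read(memory, caches, "P1", "u")     # event 4: hit on P1's stale copy
    p2 = read(memory, caches, "P2", "u")     # event 5: miss, memory is stale
    return {"P1": p1, "P2": p2, "P3": caches["P3"]["u"], "memory": memory["u"]}

if __name__ == "__main__":
    print(run_events())
```

In this sketch P1 and P2 still observe 5 while P3 holds 7, because nothing forces P3's cache to write back or the other caches to drop their copies.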
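[Figure: bus snooping. Processors P1 to Pn each have a cache ($) whose entries hold State, Address and Data; each cache snoops the shared bus, which also connects Mem and I/O devices and carries cache-memory transactions. The P1/P2/P3 example (u:5, u:5, u = 7, reads marked u = ?, events 3 to 5) is shown alongside.]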
• Cache block state transition diagram
– FSM specifying how disposition of block changes
» invalid, valid, dirty
• Broadcast Medium Transactions (e.g., bus)
– Fundamental system design abstraction
– Logically single set of wires connect several devices
– Protocol: arbitration, command/addr, data
⇒ Every device observes every transaction
• Broadcast medium enforces serialization of read or write accesses ⇒ Write serialization
– 1st processor to get medium invalidates others' copies
– Implies cannot complete write until it obtains bus
– All coherence schemes require serializing accesses to same cache block
• Also need to find up-to-date copy of cache block
• Write-through: get up-to-date copy from memory
– Write-through simpler if enough memory BW
• Write-back harder
– Most recent copy can be in a cache
• Can use same snooping mechanism
1. Snoop every address placed on the bus
2. If a processor has dirty copy of requested cache block, it provides it in response to a read request and aborts the memory access
– Complexity from retrieving cache block from a processor cache, which can take longer than retrieving it from memory
• Write-back needs lower memory bandwidth
⇒ Support larger numbers of faster processors
⇒ Most multiprocessors use write-back
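The snooping scheme described above (invalid/valid/dirty states, invalidation on write, dirty owner supplying the block) can be sketched as a tiny simulation. This is a minimal illustration, not a real protocol implementation: the classes `Bus` and `Cache`, their methods, and `demo` are hypothetical names, each block is assumed to hold a single word, and the sketch also assumes that memory is updated when a dirty owner supplies a block.

```python
from enum import Enum

class State(Enum):
    INVALID = "invalid"
    VALID = "valid"
    DIRTY = "dirty"

class Bus:
    """Broadcast medium: every attached cache observes every transaction."""
    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def read_miss(self, requester, addr):
        # All other caches snoop the address; a dirty owner supplies the
        # block, so the (stale) memory copy is not used.
        for cache in self.caches:
            if cache is not requester:
                supplied = cache.snoop_read(addr)
                if supplied is not None:
                    self.memory[addr] = supplied  # assumption: memory updated too
                    return supplied
        return self.memory[addr]

    def invalidate(self, requester, addr):
        # The writer holds the bus; all other copies become invalid.
        for cache in self.caches:
            if cache is not requester:
                cache.snoop_invalidate(addr)

class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}  # addr -> (State, value)
        bus.caches.append(self)

    def read(self, addr):
        state, _ = self.lines.get(addr, (State.INVALID, None))
        if state is State.INVALID:
            self.lines[addr] = (State.VALID, self.bus.read_miss(self, addr))
        return self.lines[addr][1]

    def write(self, addr, value):
        # The write completes only after the bus transaction (serialization).
        self.bus.invalidate(self, addr)
        self.lines[addr] = (State.DIRTY, value)

    def snoop_read(self, addr):
        state, value = self.lines.get(addr, (State.INVALID, None))
        if state is State.DIRTY:
            self.lines[addr] = (State.VALID, value)  # dirty -> valid after supplying
            return value
        return None

    def snoop_invalidate(self, addr):
        if addr in self.lines:
            self.lines[addr] = (State.INVALID, None)

def demo():
    """Replay the u example: P3 writes 7, then P1 and P2 read u."""
    bus = Bus({"u": 5})
    p1, p2, p3 = Cache(bus), Cache(bus), Cache(bus)
    p1.read("u")
    p3.read("u")
    p3.write("u", 7)   # invalidates P1's copy
    return p1.read("u"), p2.read("u")

if __name__ == "__main__":
    print(demo())
```

With the invalidate-on-write step in place, both later reads in `demo` return 7: P1 misses because its copy was invalidated, and P3's dirty copy answers the bus read.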