Lecture 4 (Shared Memory - "According to Access")
Issued by:
Dr. Ameer Mosa Thoeny Al-Sadi
Lecture 4
Classification of Distributed Systems
(1 - Shared memory - "according to access")
Outlines
• Forms of parallelism.
• Classification of architectures:
1. SISD.
2. SIMD.
3. MISD.
4. MIMD.
1. Shared memory:
- According to access: UMA, NUMA, COMA.
- According to connection: buses, crossbar switch, multistage network.
2. Message-passing systems:
- According to network topology.
Two main problems need to be addressed when designing a shared memory system:
performance degradation due to contention, and coherence problems.
Coherence problem: The copies in the caches are coherent if they are all equal to
the same value. However, if one of the processors writes over the value of one of
the copies, the copies become inconsistent, because the updated copy no longer
equals the others.
In the NUMA system, each processor has part of the shared memory attached to it. The
memory has a single address space; therefore, any processor can access any memory
location directly using its real address. However, the access time to a module
depends on its distance from the requesting processor.
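The non-uniform access cost can be sketched as follows. This is a minimal model, not a real machine: the latencies, module size, and function names are illustrative assumptions.

```python
# Sketch of NUMA access cost: one address space, but latency
# depends on which node's memory module holds the address.
# LOCAL_NS / REMOTE_NS are assumed, illustrative latencies.

LOCAL_NS, REMOTE_NS = 100, 400

def access_time(requesting_node: int, address: int,
                module_size: int = 1024) -> int:
    """Return the assumed latency for one memory access.

    Every address is directly addressable by every processor,
    but the cost depends on which node's module owns it.
    """
    owning_node = address // module_size   # module holding the address
    return LOCAL_NS if owning_node == requesting_node else REMOTE_NS

print(access_time(0, 500))    # node 0 reads its own module (local)
print(access_time(0, 2500))   # node 0 reads node 2's module (remote)
```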
In the extreme, the bus contention might be reduced to zero after the cache memories
are loaded from the global memory, because it is possible for all instructions and
data to be completely contained within the cache.
Similar to the NUMA, each processor has part of the shared memory in the COMA.
However, in this case the shared memory consists of cache memory. A COMA
system requires that data be migrated to the processor requesting it. There is no
memory hierarchy and the address space is made of all the caches. There is a cache
directory (D) that helps in remote cache access. The Kendall Square Research KSR-1
machine is an example of such an architecture.
Register - M0
Cache - M1:
- Almost always on-chip with the processor.
- Speeds up access to main memory; access time ~10 ns.
- Relatively small (~10² KB).
- Transparent to software; access is controlled by hardware (the MMU).
Main memory - M2:
- On the motherboard; DRAM or DDR-RAM technology.
- Relatively fast (~10-10² ns).
- A level bigger than cache (~GB).
- Access controlled by the OS and the MMU.
Hard disk - M3:
- Inside the computer.
- Long access time (~10 ms).
- Big capacity (~10² GB).
- Access controlled by the OS.
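The effective access time of such a hierarchy can be sketched with a small calculation. The hit ratios below are assumptions for illustration; the latencies are the approximate values from the notes (cache ~10 ns, RAM ~10² ns, disk ~10 ms).

```python
# Sketch: effective access time of the M1..M3 hierarchy.
# Hit ratios are assumed values; latencies follow the notes.

def effective_access_ns(hit_ratios, latencies_ns):
    """Expected access time: levels are tried in order; every access
    that reaches a level pays that level's latency, and the fraction
    that misses goes on to the next (slower) level."""
    expected, p_reach = 0.0, 1.0
    for hit, lat in zip(hit_ratios, latencies_ns):
        expected += p_reach * lat        # cost of probing this level
        p_reach *= (1.0 - hit)           # fraction that misses
    return expected

# cache 10 ns, RAM 100 ns, disk 10 ms = 1e7 ns
print(effective_access_ns([0.95, 0.999, 1.0], [10, 100, 1e7]))
```

Even with a 99.9% main-memory hit ratio, the rare disk accesses dominate the average, which is why the slowest level's latency matters so much.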
Inclusive:
- Each level's contents are a subset of the level below it (e.g., everything in the cache is also in main memory).
Coherence:
- Copies of the same data held at different levels must agree on its value.
Locality:
- Programs tend to reuse recently accessed data and nearby addresses, which is what makes the hierarchy effective.
Cache–Memory Coherence:
In a single-cache system, coherence between memory and the cache is maintained
using one of two policies: (1) write-through and (2) write-back. When a task
running on a processor P requests the data in memory location X, for example, the
contents of X are copied to the cache, from where they are passed on to P. When P
updates the value of X in the cache, the copy in memory also needs to be updated
in order to maintain consistency. In write-through, the memory is updated every
time the cache is updated; in write-back, the memory is updated only when the
modified block is replaced or invalidated.
Coherence of Multiple Caches
Caches play a key role in all cases:
1. They reduce the average data access time.
2. They reduce the bandwidth demand placed on the shared interconnect.
Two coherency policies: write-invalidate and write-update.
• Snooping protocols: are based on watching bus activities and carry out the
appropriate coherency commands when necessary.
• Global memory is moved in blocks, and each block has a state associated
with it, which determines what happens to the entire contents of the block.
• The state of a block might change as a result of the operations:
1. Read-Miss,
2. Read-Hit,
3. Write-Miss,
4. and Write-Hit.
Multiple processors can read block copies from main memory safely until one
processor updates its copy. At this time, all cache copies are invalidated and the
memory is updated to remain consistent.
A valid block can be owned by memory and shared in multiple caches that can
contain only the shared copies of the block. Multiple processors can safely read these
blocks from their caches until one processor updates its copy. At this time, the writer
becomes the only owner of the valid block and all other copies are invalidated.
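The write-invalidate behaviour described above can be sketched as a toy snooping protocol for a single block. States and class names are simplified and illustrative, not a full MSI/MESI implementation:

```python
# Sketch of a write-invalidate snooping protocol for one block "X".

class Bus:
    def __init__(self, caches):
        self.caches = caches
    def broadcast_invalidate(self, writer):
        # every other cache "snoops" the bus and invalidates its copy
        for c in self.caches:
            if c is not writer:
                c.state = "INVALID"

class Cache:
    def __init__(self):
        self.state, self.value = "INVALID", None
    def read(self, memory):
        if self.state == "INVALID":            # Read-Miss: fetch block
            self.value, self.state = memory["X"], "SHARED"
        return self.value                      # Read-Hit otherwise
    def write(self, value, bus, memory):       # Write-Hit/Miss
        bus.broadcast_invalidate(self)         # invalidate other copies
        self.value, self.state = value, "MODIFIED"   # writer owns block
        memory["X"] = value                    # memory kept consistent

memory = {"X": 0}
c0, c1 = Cache(), Cache()
bus = Bus([c0, c1])
c0.read(memory)            # both processors read safely:
c1.read(memory)            # block is SHARED in both caches
c0.write(42, bus, memory)  # c0 becomes owner; c1's copy invalidated
print(c1.state)            # INVALID
print(c1.read(memory))     # re-fetch gets the updated value: 42
```

A real protocol distinguishes more states and defers the memory update in the write-back case; this sketch follows the notes, where memory is updated at write time to remain consistent.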
1) Temporal locality: a block of data that was needed a moment ago is likely to be
needed again soon.
Homework