Average Memory Access Time
Average memory access time (AMAT) = hit time + miss rate × miss penalty. To reduce it, we can target each of these factors, and study how each can be reduced.
[Timeline figure: cache accesses (TC) alternating with buffered writes (W) over time]
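The three factors combine multiplicatively only in the last term, so each is a separate lever. A minimal sketch in Python; the cycle counts are illustrative assumptions, not values from the text:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Illustrative numbers (assumed): 1-cycle hit, 5% miss rate,
# 100-cycle miss penalty.
base = amat(1, 0.05, 100)             # 6.0 cycles
# Halving the miss rate ...
halved_rate = amat(1, 0.025, 100)     # 3.5 cycles
# ... or halving the miss penalty cuts AMAT by the same amount here.
halved_penalty = amat(1, 0.05, 50)    # 3.5 cycles
```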
Possibility of early reading: data can be read while previous, unfinished writes are still pending in the write buffer, although this can potentially complicate life.
Only for systems supporting virtual memory: the virtual address is translated to a physical address, and the physical address is used to check the cache tags. Why not store the virtual addresses as tags?
[Figure: memory hierarchy, cache (C), main memory (M), disk]
After a context switch: either the cache must be flushed, since the new process may try to use the old page numbers; or the tags must contain process IDs that can be used to distinguish between processes.
Method 1: Give priority to read misses over writes. Consider a direct-mapped cache using write-through, and assume that addresses 512 and 1024 map to the same cache block.

M[512] ← R3    *value of R3 placed in the write buffer*
R1 ← M[1024]   *read miss, fetch M[1024]*
R2 ← M[512]    *read miss, fetch M[512]; value of R3 not yet written to memory, so R2 ≠ R3*

A simple solution makes the read miss wait until the write buffer is empty. To reduce the wait, let the read miss check the write buffer: if there is no conflicting address, read M to get the data; else, read the data from the write buffer.
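Method 1 can be sketched as follows. The class and method names (WriteThroughBuffer, read_miss, drain) are my own, and scanning the FIFO newest-first is one possible realization of "read from the write buffer":

```python
class WriteThroughBuffer:
    """Sketch: a read miss checks pending writes before going to memory."""
    def __init__(self, memory):
        self.memory = memory          # dict: address -> value
        self.pending = []             # FIFO of (address, value) not yet written

    def write(self, addr, value):
        self.pending.append((addr, value))   # buffered; memory updated later

    def read_miss(self, addr):
        # Scan the buffer newest-first for a conflicting address.
        for a, v in reversed(self.pending):
            if a == addr:
                return v              # conflict: forward from the write buffer
        return self.memory[addr]      # no conflict: read M directly

    def drain(self):
        for a, v in self.pending:
            self.memory[a] = v
        self.pending.clear()

mem = {512: 5, 1024: 7}
buf = WriteThroughBuffer(mem)
buf.write(512, 10)         # M[512] <- R3 (=10), still in the buffer
r1 = buf.read_miss(1024)   # 7: no conflict, read memory
r2 = buf.read_miss(512)    # 10: forwarded from the buffer, not the stale 5
```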
Method 2: Early restart and critical word first. Do not wait for the whole block to be loaded into the cache. Early restart: as soon as the requested word arrives, send it to the CPU and let the CPU continue while the rest of the cache block is being filled. Critical word first: ask memory to transfer the missing word first.
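A rough timing model of the combined technique; the latency parameters and function name are assumptions for illustration:

```python
def cycles_until_cpu_restart(block_words, requested_index,
                             first_word_latency, per_word_latency,
                             critical_word_first):
    """Cycles from the miss until the CPU receives the word it asked for.

    Without critical-word-first, words arrive in block order, so the CPU
    waits for the requested word's turn; with it, the requested word is
    transferred first and the CPU restarts immediately (early restart).
    """
    assert 0 <= requested_index < block_words
    position = 0 if critical_word_first else requested_index
    return first_word_latency + position * per_word_latency

# Assumed: 8-word block, word 6 requested, 30 cycles to the first word,
# 4 cycles per subsequent word.
plain = cycles_until_cpu_restart(8, 6, 30, 4, critical_word_first=False)  # 54
cwf   = cycles_until_cpu_restart(8, 6, 30, 4, critical_word_first=True)   # 30
```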
Method 4: Nonblocking cache. Instruction 1 causes a cache miss. Instruction 2 is a hit, but should it wait for instruction 1?
In dynamic instruction scheduling, a stalled instruction does not necessarily block the subsequent instructions. So, instruction 2 can pass instruction 1.
[Timing figure: instruction 1 misses (cache access, then transfer from M); instruction 2 hits and reads the cache (C) while the miss is still outstanding]
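The benefit of hit under miss can be shown with a toy timing model (one access issued per cycle; the latencies are assumed, not from the text):

```python
def total_cycles(latencies, nonblocking):
    """Finish time for a sequence of memory accesses issued in order.

    latencies: one entry per access (1 for a hit, larger for a miss).
    A blocking cache serializes every access behind an outstanding miss;
    a nonblocking cache lets a later hit proceed while the miss is pending.
    """
    if nonblocking:
        # Access i issues at cycle i and completes independently.
        return max(i + lat for i, lat in enumerate(latencies))
    # Blocking: each access starts only after the previous one completes.
    total = 0
    for lat in latencies:
        total += lat
    return total

miss_then_hit = [100, 1]   # instruction 1 misses, instruction 2 hits
blocking      = total_cycles(miss_then_hit, nonblocking=False)  # 101
hit_under_miss = total_cycles(miss_then_hit, nonblocking=True)  # 100
```

The hit is completely hidden under the miss: the nonblocking total is just the miss latency itself.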
With a packet-switched bus (i.e., a split-transaction bus), it is possible to implement hit under miss and even miss under miss.
[Figure: instructions 1, 2, 3 with multiple outstanding misses accessing M]
Nonblocking caches are used extensively in high-performance computer systems. A miss buffer stores the missing cache lines fetched from M until they are transferred into the cache by the cache controller.
[Figure: main memory M holds x; P1 and P2 hold private cached copies x1 and x2]
P1 writes X := 10 using write-through. P2 now reads X, but it uses its local copy x2 and finds that X is still 5. P2 does not know that P1 modified X.
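The scenario can be replayed in a few lines of Python; dictionaries stand in for main memory and the two private caches, and the names x1, x2 follow the figure:

```python
# Write-through keeps main memory current, but nothing updates or
# invalidates P2's private copy: that is the coherence problem.
memory = {'X': 5}
x1 = {'X': 5}   # P1's cached copy
x2 = {'X': 5}   # P2's cached copy

def p1_write_through(addr, value):
    x1[addr] = value        # update P1's cache ...
    memory[addr] = value    # ... and main memory (write-through)
    # x2 is untouched and silently becomes stale.

p1_write_through('X', 10)
stale = x2['X']             # P2 reads its local copy: still 5
current = memory['X']       # memory already holds 10
```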
[Figure: two ways of attaching an I/O device. Configuration 1: the I/O device connects on the processor side through the cache controller, so I/O data passes through the cache C. Configuration 2: the I/O device connects to main memory M through the memory controller.]

Configuration 1
No cache coherence problem, but there is a risk of data overrun when the controller cannot handle the data traffic.
Configuration 2
When the I/O device (or the processor) wants to read a block that is dirty in the cache, the cache controller first writes the block back, so that the memory controller can supply a clean, up-to-date copy from the main memory.
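One way to realize this flush-then-read protocol in code, assuming a write-back cache; the names (Cache, flush_if_dirty, io_read) are invented for the sketch:

```python
class Cache:
    """Write-back cache: a block may be dirty, i.e. newer than memory."""
    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}               # addr -> (value, dirty)

    def cpu_write(self, addr, value):
        self.blocks[addr] = (value, True)   # write-back: memory not updated yet

    def flush_if_dirty(self, addr):
        if addr in self.blocks:
            value, dirty = self.blocks[addr]
            if dirty:
                self.memory[addr] = value           # write the block back
                self.blocks[addr] = (value, False)  # block is now clean

def io_read(memory, cache, addr):
    # Configuration 2: before memory serves the I/O read, the cache
    # controller writes back any dirty copy so the data supplied is current.
    cache.flush_if_dirty(addr)
    return memory[addr]

mem = {0x100: 5}
c = Cache(mem)
c.cpu_write(0x100, 10)           # dirty in the cache; memory still holds 5
value = io_read(mem, c, 0x100)   # 10: flushed, then served from memory
```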