Lec 34
Improvement
Dr. A. Sahu
Dept of Comp. Sc. & Engg.
Indian Institute of Technology Guwahati
Outline
• Eleven Advanced Cache Performance Optimizations
– Prev: Reducing hit time & increasing bandwidth
– Prev: Reducing miss penalty
– Reducing miss rate
– Reducing miss penalty * miss rate
Eleven Advanced Optimizations for Cache Performance
• Reducing hit time
• Reducing miss penalty
• Reducing miss rate
• Reducing miss penalty * miss rate
$[C]_{L \times M} = [A]_{L \times N} \times [B]_{N \times M}$
Cache Organization for the example
            C        A        B
accesses    LM       LMN      LMN
misses      LM/4     LMN/4    LMN

            C        A        B
accesses    LMN      LN       LMN
misses      LMN/4    LN       LMN/4

            C        A        B
accesses    LMN      LN       LMN
misses      LMN/4    LN/4     LMN/4
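The three access/miss tables above look like three loop orderings of the same product. The slides do not name the orderings, so the mapping used here (ijk, then kij, then ikj) is an inference from the access counts, assuming row-major storage and a cache block of 4 elements (suggested by the /4 factors). The sketch below counts accesses per matrix for each ordering; the sizes and all function names are illustrative, not from the lecture.

```c
#include <stdio.h>

enum { L = 4, N = 8, M = 12 };      /* illustrative sizes, not from the slides */

static long acc_c, acc_a, acc_b;    /* per-matrix access counters */

/* Zero the result matrix and the counters before each run. */
static void reset(double c[L][M]) {
    acc_c = acc_a = acc_b = 0;
    for (int i = 0; i < L; i++)
        for (int j = 0; j < M; j++)
            c[i][j] = 0.0;
}

/* ijk: inner loop walks a row of A (stride 1) and a column of B (stride M). */
static void mm_ijk(double a[L][N], double b[N][M], double c[L][M]) {
    for (int i = 0; i < L; i++)
        for (int j = 0; j < M; j++) {
            double sum = 0.0;       /* accumulator stays in a register */
            for (int k = 0; k < N; k++) {
                sum += a[i][k] * b[k][j];
                acc_a++; acc_b++;
            }
            c[i][j] = sum; acc_c++;
        }
}

/* kij: A is read once per (k,i) going down a column of A;
   inner loop walks rows of B and C (stride 1). */
static void mm_kij(double a[L][N], double b[N][M], double c[L][M]) {
    for (int k = 0; k < N; k++)
        for (int i = 0; i < L; i++) {
            double r = a[i][k]; acc_a++;
            for (int j = 0; j < M; j++) {
                c[i][j] += r * b[k][j];
                acc_c++; acc_b++;
            }
        }
}

/* ikj: like kij, but A is now read along a row (stride 1). */
static void mm_ikj(double a[L][N], double b[N][M], double c[L][M]) {
    for (int i = 0; i < L; i++)
        for (int k = 0; k < N; k++) {
            double r = a[i][k]; acc_a++;
            for (int j = 0; j < M; j++) {
                c[i][j] += r * b[k][j];
                acc_c++; acc_b++;
            }
        }
}

/* Runs all three orderings on the same inputs. Returns 1 if the products
   agree and the access counts match the "accesses" rows of the tables. */
int orders_agree(void) {
    static double a[L][N], b[N][M], c1[L][M], c2[L][M], c3[L][M];
    int ok = 1;
    for (int i = 0; i < L; i++)
        for (int k = 0; k < N; k++)
            a[i][k] = i + k;
    for (int k = 0; k < N; k++)
        for (int j = 0; j < M; j++)
            b[k][j] = k - j;

    reset(c1); mm_ijk(a, b, c1);
    printf("ijk: C=%ld A=%ld B=%ld\n", acc_c, acc_a, acc_b);
    ok &= (acc_c == L * M && acc_a == L * M * N && acc_b == L * M * N);

    reset(c2); mm_kij(a, b, c2);
    printf("kij: C=%ld A=%ld B=%ld\n", acc_c, acc_a, acc_b);
    ok &= (acc_c == L * M * N && acc_a == L * N && acc_b == L * M * N);

    reset(c3); mm_ikj(a, b, c3);
    printf("ikj: C=%ld A=%ld B=%ld\n", acc_c, acc_a, acc_b);
    ok &= (acc_c == L * M * N && acc_a == L * N && acc_b == L * M * N);

    for (int i = 0; i < L; i++)
        for (int j = 0; j < M; j++)
            ok &= (c1[i][j] == c2[i][j] && c2[i][j] == c3[i][j]);
    return ok;
}
```

Calling orders_agree() from a main() prints the per-matrix access counts for each ordering. The miss counts in the tables are not simulated here; they follow from whether each matrix is walked with stride 1 (one miss per 4 accesses) or down a column (one miss per access).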
• Non‐blocking cache
• Hardware prefetching
• Compiler controlled prefetching
Non‐blocking Cache
In OOO processor
[Figure: cache with a prefetch buffer filled from memory]
Compiler Controlled Pre‐fetching
• Semantically invisible (no change in registers
or cache contents)
• Makes sense if processor doesn’t stall while
prefetching (non‐blocking cache)
• Overhead of prefetch instruction should not
exceed the benefit
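A minimal sketch of a software prefetch in a loop, assuming a GCC or Clang toolchain, which provide the `__builtin_prefetch` builtin. The function name and the prefetch distance are hypothetical choices for illustration; a useful distance depends on memory latency and the cost of the loop body.

```c
#include <stddef.h>

/* Hypothetical prefetch distance in elements; must be tuned per machine. */
#define PREFETCH_AHEAD 16

/* Sums x[0..n-1], hinting the cache to fetch data a few iterations ahead. */
double sum_with_prefetch(const double *x, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* A hint only: the computed result is the same with or without it.
           Arguments: address, 0 = prefetch for read, 3 = high temporal locality. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&x[i + PREFETCH_AHEAD], 0, 3);
        sum += x[i];
    }
    return sum;
}
```

The bounds check and the prefetch itself are extra instructions in every iteration, which is the overhead the last bullet refers to; the hint pays off only if the loop would otherwise wait on memory.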
SW Prefetch Example
• 8 KB direct mapped, write back data cache with
16 byte blocks.
• a is 3 × 100, b is 101 × 3