Improving and Measuring Cache Performance

Memory is organized in a hierarchy, with caches sitting between the CPU and main memory. Caches are small, fast memories that store frequently used data from main memory. When the CPU requests data, the cache is checked first. If the data is present (a cache hit), it is retrieved quickly from the cache. If not (a cache miss), the data is fetched from main memory into the cache before being sent to the CPU. Caches use tags to identify which blocks of main memory are stored in each cache slot. Multilevel caches with different sizes and speeds further reduce the memory access penalty by satisfying more requests in the lower-level caches rather than in main memory.


Memory Hierarchy: Terminology

[Figure: a two-level hierarchy. The processor exchanges data with the upper-level memory, which holds block X; the upper level in turn exchanges blocks with the lower-level memory, which holds block Y.]
Cache
• Small amount of fast memory
• Sits between normal main memory and CPU
• May be located on CPU chip or module
Cache operation - overview

• CPU requests contents of a memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read the required block from main memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which block of main memory is in each cache slot
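
A minimal sketch of this lookup for a direct-mapped cache, in C. The slot count, block size, and memory size are illustrative assumptions, not from the slides, and main memory is simulated with a plain array:

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define NUM_SLOTS  256                 /* cache slots (assumed)          */
#define BLOCK_SIZE 64                  /* bytes per block (assumed)      */
#define MEM_SIZE   (1 << 20)           /* simulated main memory: 1 MiB   */

typedef struct {
    bool     valid;                    /* does this slot hold real data? */
    uint32_t tag;                      /* which memory block is cached   */
    uint8_t  data[BLOCK_SIZE];         /* the cached block itself        */
} CacheSlot;

static CacheSlot cache[NUM_SLOTS];
static uint8_t   main_memory[MEM_SIZE];

static uint8_t cache_read(uint32_t addr)
{
    uint32_t block  = addr / BLOCK_SIZE;   /* main-memory block number   */
    uint32_t slot   = block % NUM_SLOTS;   /* slot this block maps to    */
    uint32_t tag    = block / NUM_SLOTS;   /* high bits stored as tag    */
    uint32_t offset = addr % BLOCK_SIZE;   /* byte within the block      */

    if (cache[slot].valid && cache[slot].tag == tag)
        return cache[slot].data[offset];   /* hit: serve from cache      */

    /* Miss: read the required block from main memory into the slot,
     * record its tag, then deliver the byte from the cache. */
    memcpy(cache[slot].data, &main_memory[block * BLOCK_SIZE], BLOCK_SIZE);
    cache[slot].valid = true;
    cache[slot].tag   = tag;
    return cache[slot].data[offset];
}

int main(void)
{
    main_memory[12345] = 42;
    printf("first read: %d (miss)\n", cache_read(12345));
    printf("second read: %d (hit)\n", cache_read(12345));
    return 0;
}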
Cache Misses and Cache Hits
Measuring Cache Performance
• CPU time = (CPU execution clock cycles + Memory stall clock cycles) × Clock cycle time
• Memory stall clock cycles = Read-stall cycles + Write-stall cycles
• Read-stall cycles = Reads/program × Read miss rate × Read miss penalty
• Write-stall cycles = (Writes/program × Write miss rate × Write miss penalty) + Write buffer stalls (assumes a write-through cache)
• If write buffer stalls are negligible and the write and read miss penalties are equal (both are the cost to fetch a block from memory), this simplifies to:
• Memory stall clock cycles = Memory accesses/program × Miss rate × Miss penalty
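
To make the simplified formula concrete, a small numeric sketch in C, applying it per instruction rather than per program; the workload figures (1.36 memory accesses per instruction, 4% miss rate, 100-cycle miss penalty) are assumptions for illustration:

#include <stdio.h>

int main(void)
{
    double base_cpi          = 1.0;   /* CPI assuming every access hits     */
    double accesses_per_inst = 1.36;  /* 1 fetch + 0.36 data refs (assumed) */
    double miss_rate         = 0.04;  /* combined miss rate (assumed)       */
    double miss_penalty      = 100.0; /* cycles to fetch a block (assumed)  */

    /* Memory stall cycles per instruction =
       accesses/instruction × miss rate × miss penalty */
    double stall_cpi = accesses_per_inst * miss_rate * miss_penalty;

    printf("Stall cycles per instruction: %.2f\n", stall_cpi);
    printf("Effective CPI:                %.2f\n", base_cpi + stall_cpi);
    return 0;
}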
Reducing the Miss Penalty using Multilevel Caches
• To further close the gap between the fast clock rates of CPUs and the relatively long time needed to access main memory, additional levels of cache are used (level-two and level-three caches).
• The primary cache is optimized for a fast hit time, which implies a relatively small size.
• A secondary cache is optimized to reduce the miss rate, and with it the penalty of going to main memory.
• Example:
– Assume CPI = 1 (with all hits) and a 5 GHz clock
– 100 ns main memory access time
– 2% miss rate for the primary cache
– Secondary cache with a 5 ns access time and a 0.5% miss rate
– What is the total CPI with and without the secondary cache?
– How much of an improvement does the secondary cache provide?
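
A worked solution, applying the stall-cycle formula from the previous slide:
– At 5 GHz, one clock cycle is 0.2 ns, so a main-memory access costs 100 ns / 0.2 ns = 500 cycles.
– Without the secondary cache: total CPI = 1 + 2% × 500 = 1 + 10 = 11.
– With the secondary cache: a primary miss that hits in the secondary cache costs 5 ns / 0.2 ns = 25 cycles, so total CPI = 1 + 2% × 25 + 0.5% × 500 = 1 + 0.5 + 2.5 = 4.0.
– The secondary cache therefore improves performance by a factor of 11 / 4.0 = 2.8.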
Improving Performance of Cache
• Reducing the hit time – Small and simple first-level caches and way prediction. Both techniques also generally decrease power consumption.
• Increasing cache bandwidth – Pipelined caches, multi-banked caches, and non-blocking caches. These techniques have varying impacts on power consumption.
Improving Performance of Cache
• Reducing the miss penalty – Critical word first and merging write buffers. These optimizations have little impact on power.
• Reducing the miss rate – Compiler optimizations (see the loop-interchange sketch below). Any improvement made at compile time obviously also improves power consumption.
• Reducing the miss penalty or miss rate via parallelism – Hardware prefetching and compiler prefetching. These optimizations generally increase power consumption, primarily due to prefetched data that go unused.
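
A minimal sketch of one such compiler optimization, loop interchange, in C; the array size is an arbitrary assumption. Because C stores arrays in row-major order, putting the column index in the inner loop visits memory sequentially, so every byte of each fetched cache block is used before the block is evicted, lowering the miss rate:

#include <stddef.h>

#define N 1024   /* arbitrary matrix dimension (assumed) */

/* Before: j-major traversal strides N*sizeof(double) bytes between
 * consecutive accesses, touching a new cache block nearly every time. */
double sum_before(double x[N][N])
{
    double total = 0.0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            total += x[i][j];
    return total;
}

/* After interchange: i-major traversal reads consecutive doubles, so
 * each fetched cache block is fully consumed before moving on. */
double sum_after(double x[N][N])
{
    double total = 0.0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            total += x[i][j];
    return total;
}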
