Computer Organization & Architecture

This document discusses cache performance and organization. It describes the different types of cache misses like compulsory, capacity and conflict misses. It explains hit ratio and how average access time is calculated based on cache hit time and main memory access time. Write policies like write-through and write-back are covered. The benefits of multi-level caches and split versus unified caches are summarized. Cache line size impact on hit ratio is also addressed.

Computer Organization & Architecture
BITS Pilani, Pilani Campus
Virendra Singh Shekhawat
Department of Computer Science and Information Systems

Module-4 (Lecture-3)
[Ref. Computer Organization and Architecture, 8th Ed. by William Stallings]
Topics: Cache Performance Measurement, Cache Miss Types, Hit Ratio, Write Policy, Multilevel Caches
Cache Performance

• The more processor references are found in the cache, the better the performance
– A reference found in the cache is a HIT; otherwise it is a MISS
– A penalty is associated with each MISS
– More hits reduce the average access time

3
BITS Pilani, Pilani Campus
Where do misses come from?
• Compulsory—Initially the cache is empty or holds no valid data, so
the first access to a block is always a miss. Also called cold-start
misses or first-reference misses.

• Capacity—If the cache is not large enough to hold all the blocks
needed during execution of a program, frequent misses will occur.
Capacity misses are those that occur regardless of associativity or
block size.

• Conflict—If the block-placement strategy is set-associative or
direct-mapped, conflict misses (in addition to compulsory and capacity
misses) will occur, because a block can be discarded and later
retrieved if too many blocks map to its set. Also called collision
misses.
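The three categories above can be made concrete with a small simulation: a direct-mapped cache is compared against a fully associative LRU cache of the same size, which is the usual way to separate conflict misses from capacity misses. A minimal sketch; the function, block addresses, and cache size are illustrative assumptions, not from the slides.

```python
from collections import OrderedDict

def classify_misses(accesses, num_lines):
    """Label each block access as 'hit', 'compulsory', 'capacity', or 'conflict'.

    A conflict miss is one that occurs in the direct-mapped cache but would
    not occur in a fully associative LRU cache with the same number of lines.
    """
    labels = []
    seen = set()          # blocks referenced before (compulsory check)
    direct = {}           # direct-mapped cache: line index -> stored block
    lru = OrderedDict()   # fully associative LRU cache of equal capacity

    for block in accesses:
        # Simulate the fully associative cache first.
        fa_hit = block in lru
        if fa_hit:
            lru.move_to_end(block)
        else:
            if len(lru) >= num_lines:
                lru.popitem(last=False)        # evict least recently used
            lru[block] = True

        # Simulate the direct-mapped cache and classify the access.
        index = block % num_lines              # direct-mapped placement
        if direct.get(index) == block:
            labels.append("hit")
        elif block not in seen:
            labels.append("compulsory")        # first reference: cold miss
        elif fa_hit:
            labels.append("conflict")          # only the mapping caused it
        else:
            labels.append("capacity")          # full associativity misses too
        direct[index] = block
        seen.add(block)
    return labels

# Blocks 0 and 4 collide in a 4-line direct-mapped cache (0 % 4 == 4 % 4):
print(classify_misses([0, 4, 0, 4], num_lines=4))
# → ['compulsory', 'compulsory', 'conflict', 'conflict']
```

A fully associative cache would keep both blocks resident, so the last two misses are pure conflict misses caused by the placement policy.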
Operation of Two-Level Memory System (1)
• Hit Ratio (H)
– Ratio of the number of references found in the higher-level
memory (M1, i.e. the cache) to the total number of references

• Average Access Time
– Ts = H*T1 + (1-H)*(T1+T2)
– T1 = access time of M1 (cache)
– T2 = access time of M2 (main memory)
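The formula above can be checked with a short calculation; the timing values are illustrative assumptions, not from the slides.

```python
def average_access_time(hit_ratio, t1, t2):
    """Ts = H*T1 + (1-H)*(T1+T2): every access pays T1, and misses add T2."""
    return hit_ratio * t1 + (1 - hit_ratio) * (t1 + t2)

# T1 = 1 ns cache, T2 = 50 ns main memory, 95% hit ratio:
print(round(average_access_time(0.95, 1, 50), 2))
# → 3.5
```

Note how strongly Ts depends on H: with these numbers, raising the hit ratio from 0.95 to 0.99 drops the average access time from 3.5 ns to 1.5 ns.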

Operation of Two-Level Memory System (2)
• Cost
– Cs = (C1*S1 + C2*S2) / (S1 + S2)
– C1, C2 = cost per byte of M1 and M2; S1, S2 = their sizes

• Access Efficiency (T1/Ts)
– A measure of how close the average access time is to the M1
access time
– On-chip cache access is about 25 to 50 times faster than main
memory access
– Off-chip cache access is about 5 to 15 times faster than main
memory access
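Both quantities on this slide can be computed directly; the cost and size figures below are illustrative assumptions.

```python
def average_cost(c1, s1, c2, s2):
    """Cs = (C1*S1 + C2*S2) / (S1 + S2): per-byte cost of the combined system."""
    return (c1 * s1 + c2 * s2) / (s1 + s2)

def access_efficiency(hit_ratio, t1, t2):
    """T1/Ts: how close the average access time gets to pure cache speed."""
    ts = hit_ratio * t1 + (1 - hit_ratio) * (t1 + t2)
    return t1 / ts

# 32 KB cache at 100 units/KB combined with 1024 KB memory at 1 unit/KB:
print(round(average_cost(100, 32, 1, 1024), 2))   # → 4.0
# Main memory 50x slower than cache, 99% hit ratio:
print(round(access_efficiency(0.99, 1, 50), 2))   # → 0.67
```

The cost stays close to that of the cheap, large memory, while a high hit ratio pulls the speed toward that of the small, fast memory; that combination is the whole point of the hierarchy.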
Main Memory Inconsistency

• What happens when data is modified in the cache?

• A cache block must not be overwritten on replacement unless main
memory is up to date

• Multiple CPUs may each have their own cache

Write Through Policy

• All writes go to main memory as well as to the cache
• Multiple CPUs must monitor main memory traffic to keep their
local caches up to date
• Implications
– Generates lots of memory traffic and slows down writes

Write Back Policy

• Updates are initially made in the cache only
• An update (dirty) bit for the cache slot is set when an update
occurs
• When a block is to be replaced, it is written back to main
memory only if its dirty bit is set
• Implications
– Other caches can get out of sync
– I/O must access main memory through the cache
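The dirty-bit bookkeeping above can be sketched as follows. The class layout and names are illustrative assumptions, and the read path is omitted for brevity.

```python
class CacheLine:
    """One cache slot: which block it holds and whether it was modified."""
    def __init__(self, block, data):
        self.block = block
        self.data = data
        self.dirty = False        # the "update bit" from the slide

class WriteBackCache:
    """Direct-mapped write-back cache that counts writes to main memory."""
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = {}           # line index -> CacheLine
        self.memory_writes = 0    # traffic actually reaching main memory

    def write(self, block, data):
        index = block % self.num_lines
        line = self.lines.get(index)
        if line is None or line.block != block:
            line = self._replace(index, block, data)
        line.data = data
        line.dirty = True         # update made in the cache only

    def _replace(self, index, block, data):
        victim = self.lines.get(index)
        if victim is not None and victim.dirty:
            self.memory_writes += 1   # write back only if the dirty bit is set
        line = CacheLine(block, data)
        self.lines[index] = line
        return line

cache = WriteBackCache(num_lines=4)
cache.write(0, "a")
cache.write(0, "b")               # repeated writes: still no memory traffic
print(cache.memory_writes)        # → 0
cache.write(4, "c")               # block 4 evicts dirty block 0
print(cache.memory_writes)        # → 1
```

Repeated writes to a cached block generate no memory traffic at all; main memory is touched once, when the dirty block is finally evicted. This is the opposite trade-off from write-through.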

Hit Ratio vs. Cache Line Size
• On each miss, not only the desired word but a number of adjacent
words are retrieved
• Increasing the block size increases the hit ratio at first
– Due to the principle of locality
• The hit ratio decreases as the block becomes even bigger
– The probability of using the newly fetched information becomes
less than the probability of reusing the information it replaced
• Larger blocks
– Reduce the number of blocks that fit in the cache
– Cause data to be overwritten shortly after being fetched
– Each additional word is less local, so less likely to be needed
• No definitive optimum value has been found
• 8 to 64 bytes seems reasonable
Multi-Level Caches
• An on-chip cache improves performance. Why?
• Will more than one level of cache improve performance further?
• The simplest organization is a two-level cache: on-chip (L1) and
external (L2)
• In many designs, a separate bus is used to transfer data to an
off-chip (L2) cache
• Nowadays the L2 cache is also available on chip, thanks to
shrinking processor feature sizes
• So we now have one more level of cache, i.e. an off-chip L3 cache
Multilevel Cache Performance
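The two-level access-time formula from earlier extends naturally to an L1/L2/main-memory hierarchy. A minimal sketch; all hit ratios and timings below are illustrative assumptions, not from the slides.

```python
def multilevel_access_time(h1, t1, h2, t2, t_mem):
    """Average access time when L1 misses fall through to L2, then to memory.

    Mirrors Ts = H*T1 + (1-H)*(T1+T2): each miss pays the next level's
    time on top of the time already spent searching the faster levels.
    """
    miss_l1 = 1 - h1
    return (h1 * t1
            + miss_l1 * h2 * (t1 + t2)
            + miss_l1 * (1 - h2) * (t1 + t2 + t_mem))

# L1: 1 ns, 90% hits; L2: 10 ns, catches 80% of L1 misses; memory: 100 ns.
print(round(multilevel_access_time(0.9, 1, 0.8, 10, 100), 2))  # → 4.0
```

With these assumed numbers, adding the L2 cuts the average from 11 ns (L1 plus memory only: 0.9*1 + 0.1*101) to about 4 ns, which is why extra cache levels pay off.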

Unified vs. Split Caches
• Earlier, the same cache was used for both data and instructions,
i.e. a unified cache
• Now we have separate caches for data and instructions, i.e. a
split cache
• Advantages of a unified cache
– It automatically balances the load between data and instructions
• Advantages of a split cache
– Useful for parallel instruction execution
– Eliminates contention between the instruction fetch/decode unit
and the execution unit
– E.g. superscalar machines such as the Pentium and PowerPC
Caches and External Connections in the P-3 Processor
[Figure: the processing units connect to a 16-KB L1 instruction cache
(2-way set associative) and a 16-KB L1 data cache (4-way set
associative). A bus interface unit links the system bus, which reaches
main memory and I/O, and a dedicated cache bus to the 512-KB, 2-way
set associative L2 cache.]
Cache Memory Evolution

Processor       Type                            Year  L1 cache       L2 cache        L3 cache
IBM 360/85      Mainframe                       1968  16 to 32 KB    —               —
PDP-11/70       Minicomputer                    1975  1 KB           —               —
VAX 11/780      Minicomputer                    1978  16 KB          —               —
IBM 3033        Mainframe                       1978  64 KB          —               —
IBM 3090        Mainframe                       1985  128 to 256 KB  —               —
Intel 80486     PC                              1989  8 KB           —               —
Pentium         PC                              1993  8 KB/8 KB      256 to 512 KB   —
PowerPC 601     PC                              1993  32 KB          —               —
PowerPC 620     PC                              1996  32 KB/32 KB    —               —
PowerPC G4      PC/server                       1999  32 KB/32 KB    256 KB to 1 MB  2 MB
IBM S/390 G4    Mainframe                       1997  32 KB          256 KB          2 MB
IBM S/390 G6    Mainframe                       1999  256 KB         8 MB            —
Pentium 4       PC/server                       2000  8 KB/8 KB      256 KB          —
IBM SP          High-end server/supercomputer   2000  64 KB/32 KB    8 MB            —
CRAY MTA        Supercomputer                   2000  8 KB           2 MB            —
Itanium         PC/server                       2001  16 KB/16 KB    96 KB           4 MB
SGI Origin 2001 High-end server                 2001  32 KB/32 KB    4 MB            —
Itanium 2       PC/server                       2002  32 KB          256 KB          6 MB
IBM POWER5      High-end server                 2003  64 KB          1.9 MB          36 MB
CRAY XD-1       Supercomputer                   2004  64 KB/64 KB    1 MB            —

Summary

• Cache Performance
– Hit Ratio
– Average Access Time
– Access Efficiency
– Write Policy
– Line Size
– Multiple Levels of Cache
– Unified Cache and Split Cache

Thank You!

