
10 Cache


Cache

COMPILED BY:
SAROSH SHAHID
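Cache Miss
[Figure: the processor executes MOV AL, [08]. The data is not found in the cache (CACHE MISS), so it is looked up in memory, where address 08 holds 57. The value 57 is placed in the cache and passed to the processor.]
Memory (address: value): 00: 47, 01: 25, 02: 5, 03: 6, 04: 88, 05: 99, 06: 21, 07: 58, 08: 57, 09: 98, 0A: 65, 0B: 22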
Cache Hit
[Figure: the processor executes MOV AL, [08]. The data 57 is found in the cache (CACHE HIT) and passed to the processor; memory (address 08 holds 57) is not searched.]
Cache Miss
[Figure: the processor executes MOV AL, [08]. The data is not found in the L1 cache (L1 CACHE MISS) and not found in the L2 cache (L2 CACHE MISS), so it is looked up in memory, where address 08 holds 57. The value 57 is placed in the L2 cache, then in the L1 cache, and passed to the processor.]
Cache Miss
[Figure: the processor executes MOV AL, [08]. The data is not found in the L1 cache (L1 CACHE MISS) but is found in the L2 cache (L2 CACHE HIT). The value 57 is placed in the L1 cache and passed to the processor; memory (address 08 holds 57) is not searched.]
Locality
Temporal Locality:
◦ Locality in time
◦ If data was used recently, it is likely to be used again soon
◦ How to exploit: keep recently accessed data in higher levels of the memory hierarchy
Spatial Locality:
◦ Locality in space
◦ If data was used recently, nearby data is likely to be used soon
◦ How to exploit: when accessing data, bring nearby data into higher levels of the memory hierarchy too
Locality
for (int i = 0; i < 10000; i++)
    a = A[i] + a;

Temporal Locality:
If data was used recently, it is likely to be used again soon: variables i and a
Spatial Locality:
If data was used recently, nearby data is likely to be used soon: array A
What data is held in cache?
Ideally, the cache anticipates the data the processor will need and holds it in advance
But it is impossible to predict the future
Use the past to predict the future – temporal and spatial locality:
◦ Temporal locality: copy newly accessed data into cache
◦ Spatial locality: copy neighboring data into cache too
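The effect of copying neighboring data can be illustrated with a toy model. This is only a sketch under simplifying assumptions that are not in the slides: the cache has unlimited capacity, and the function name `count_hits` and the address lists are made up for illustration.

```python
def count_hits(addresses, block_size):
    """Count hits for a toy cache of unlimited capacity that loads whole blocks."""
    cached_blocks = set()  # block numbers currently held in the toy cache
    hits = 0
    for addr in addresses:
        block = addr // block_size  # neighbors in the same block share this number
        if block in cached_blocks:
            hits += 1  # the data was brought in by an earlier access
        else:
            cached_blocks.add(block)  # miss: copy the whole block into the cache
    return hits


# Sequential walk over 16 byte addresses, similar to stepping through an array.
sequential = list(range(16))
print(count_hits(sequential, block_size=1))  # only the accessed byte is copied
print(count_hits(sequential, block_size=4))  # three neighbors come along each time

# Re-reading one address, similar to a loop counter (temporal locality).
print(count_hits([8, 8, 8, 8], block_size=1))
```

With a block size of 1 the sequential walk never hits, while a block size of 4 turns three out of every four accesses into hits. The repeated access hits every time after the first.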
Cache Performance Analysis
$$\text{Miss Rate} = \frac{\text{Number of misses}}{\text{Number of total memory accesses}} = 1 - \text{Hit Rate}$$

$$\text{Hit Rate} = \frac{\text{Number of hits}}{\text{Number of total memory accesses}} = 1 - \text{Miss Rate}$$
Cache Performance Analysis
Example
Suppose a program has 2000 data access instructions (loads or stores), and 1250 of the requested data values are found in the cache. The other 750 data values are supplied to the processor by main memory or disk memory. What are the miss and hit rates for the cache?
Solution
$$\text{Miss Rate} = \frac{\text{Number of misses}}{\text{Number of total memory accesses}} = \frac{750}{2000} = 0.375 = 37.5\%$$

$$\text{Hit Rate} = 1 - \text{Miss Rate} = 1 - 0.375 = 0.625 = 62.5\%$$
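The same arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name `miss_and_hit_rate` is hypothetical and chosen only for this illustration.

```python
def miss_and_hit_rate(misses, total_accesses):
    """Return (miss rate, hit rate) as fractions between 0 and 1."""
    miss_rate = misses / total_accesses
    hit_rate = 1.0 - miss_rate  # hits and misses together cover every access
    return miss_rate, hit_rate


# Numbers from the example: 2000 accesses, 750 of them served by memory or disk.
miss_rate, hit_rate = miss_and_hit_rate(misses=750, total_accesses=2000)
print(f"Miss rate = {miss_rate:.1%}, hit rate = {hit_rate:.1%}")
```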


Cache Terminologies
•Cache Capacity C = Total number of bytes that can be stored in the cache
•Data is transferred between memory and cache in blocks of fixed size called cache lines or cache blocks.
•b = Block size (the size of one cache line)
[Figure: a cache drawn as eight cache lines, Block 0 to Block 7.]
Cache Terminologies
b = 32 bits = 4B
C = 64 bits = 8B

[Figure: a cache made of two cache lines, Block 0 = 4B and Block 1 = 4B.]
Block Size
b = 32 bits = 4B
C = 64 bits = 8B
[Figure: a cache made of two cache lines, Block 0 = 4B and Block 1 = 4B, next to a 16-byte memory divided into four 4-byte blocks.]
Memory (address: value)
Block 0: 00: 47, 01: 25, 02: 5, 03: 6
Block 1: 04: 88, 05: 99, 06: 21, 07: 58
Block 2: 08: 57, 09: 98, 0A: 65, 0B: 22
Block 3: 0C: 66, 0D: 8, 0E: 55, 0F: 1
One block of the cache can hold 4 bytes, so memory is also logically divided into equal chunks of 4 bytes, and all 4 bytes of a chunk are brought into the cache at once.
Block Size
b = 32 bits = 4B
C = 64 bits = 8B
MOV AL, [09]
Instead of bringing in only [09], the cache brings in the whole block.
Block size = 4B, so 4 bytes are brought in. Address [09] is in memory Block 2 (08: 57, 09: 98, 0A: 65, 0B: 22), which maps to the highlighted (yellow) block of the cache. (Spatial Locality)
Cache after the access: Block 0 = 57 98 65 22
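The block fetch for MOV AL, [09] can be sketched as follows. The memory values are copied from the figure and treated as plain integers, which is an assumption (the slides do not say whether they are decimal or hexadecimal); the names `MEMORY` and `fetch_block` are hypothetical.

```python
# Memory contents from the figure, addresses 00 to 0F (values assumed to be plain integers).
MEMORY = [47, 25, 5, 6, 88, 99, 21, 58, 57, 98, 65, 22, 66, 8, 55, 1]
BLOCK_SIZE = 4  # b = 4B


def fetch_block(memory, address, block_size=BLOCK_SIZE):
    """Return (block number, bytes of the whole block) for the block holding `address`."""
    block_number = address // block_size  # which 4-byte chunk the address falls in
    base = block_number * block_size      # first address of that chunk
    return block_number, memory[base:base + block_size]


block_number, line = fetch_block(MEMORY, 0x09)
print(block_number, line)
```

For address 09 this selects Block 2 and returns all four of its bytes, not only the byte at 09.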
Offset Bits
b = 32 bits = 4B
C = 64 bits = 8B
$\text{Offset bits} = \log_2 b = \log_2 4 = 2$
MOV AL, [09]
Cache line Block 0 after the access (holding memory Block 2, addresses 08 to 0B):
Offset: 00 01 10 11
Data: 57 98 65 22
Offset bits are used to identify the data within a block.
[09] is at offset “01b or 1”
[0A] is at offset “10b or 2” and so on
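A short sketch of the offset calculation, assuming the block size is a power of two (as it is in every example here). The helper names `offset_bits` and `block_offset` are hypothetical.

```python
def offset_bits(block_size):
    """Number of offset bits, i.e. log2 of a power-of-two block size."""
    return block_size.bit_length() - 1


def block_offset(address, block_size):
    """Position of `address` inside its block: the lowest offset bits of the address."""
    return address & (block_size - 1)


print(offset_bits(4))
for addr in (0x08, 0x09, 0x0A, 0x0B):
    print(f"[{addr:02X}] is at offset {block_offset(addr, 4):02b}")
```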
Cache Terminology
Capacity (C):
◦ number of data bytes in cache
Block size (b):
◦ bytes of data brought into cache at once
Number of blocks (B = C/b):
◦ number of blocks in cache
Degree of associativity (N):
◦ number of blocks in a set
Number of sets (S = B/N):
◦ each memory address maps to exactly one cache set
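These definitions translate directly into a small calculation. The sketch below assumes the common convention that a block's set is its block number modulo the number of sets; the slides do not state this rule explicitly, and the function names are hypothetical.

```python
def cache_geometry(capacity, block_size, ways):
    """Return (B, S): number of blocks and number of sets."""
    num_blocks = capacity // block_size  # B = C / b
    num_sets = num_blocks // ways        # S = B / N
    return num_blocks, num_sets


def set_index(address, block_size, num_sets):
    """Set that `address` maps to (assumed rule: block number modulo number of sets)."""
    return (address // block_size) % num_sets


# The 8-byte cache with 4-byte blocks used earlier, treated as direct mapped (N = 1).
blocks, sets = cache_geometry(capacity=8, block_size=4, ways=1)
print(blocks, sets)
print(set_index(0x09, block_size=4, num_sets=sets))
```

Under that assumed rule, address 09 (memory Block 2) lands in set 0 of the two-block cache.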
Cache Types
Direct Mapped Cache (one-way associative): 1 block per set
N-way Associative Cache: N blocks per set
Fully Associative Cache: all cache blocks in one set
[Figure: an eight-block cache drawn as a one-way associative cache (Set # 0 to Set # 7), a two-way associative cache (Set # 0 to Set # 3), and a four-way associative cache.]
Design Cache
b = 4B
C = 64B
N = 1 (Direct Mapped)
B = C/b = 64/4 = 16
S = B/N = 16/1 = 16
$\text{Offset bits} = \log_2 b = \log_2 4 = 2$
$\text{Set/index bits} = \log_2 S = \log_2 16 = 4$
[Figure: 16 sets indexed 0000 to 1111, one block per set, with byte offsets 00, 01, 10, 11 in each block.]
Design Cache
b = 32B
C = 128B
N = 1 (Direct Mapped)
B = C/b = 128/32 = 4
S = B/N = 4/1 = 4
$\text{Offset bits} = \log_2 b = \log_2 32 = 5$
$\text{Set/index bits} = \log_2 S = \log_2 4 = 2$
[Figure: 4 sets indexed 00 to 11, one block per set, with byte offsets 00000 to 11111 in each block.]
Design Cache
b = 4B
C = 32B
N = 4 (4-way associative)
B = C/b = 32/4 = 8
S = B/N = 8/4 = 2
$\text{Offset bits} = \log_2 b = \log_2 4 = 2$
$\text{Set/index bits} = \log_2 S = \log_2 2 = 1$
[Figure: 2 sets indexed 0 and 1, four blocks per set, with byte offsets 00, 01, 10, 11 in each block.]
Direct Memory
Design Cache
b = 8B
Fully Associative Cache: a single set with 4 ways (N = 4)
C = 8B * 4 (ways) = 32B
B = C/b = 32/8 = 4
S = B/N = 4/4 = 1
$\text{Offset bits} = \log_2 b = \log_2 8 = 3$
$\text{Set/index bits} = \log_2 S = \log_2 1 = 0$
[Figure: one set, indexed 0, holding four blocks, with byte offsets 000 to 111 in each block.]
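The four designs can be cross-checked with one helper. This is a sketch that assumes all sizes are powers of two; `log2_exact` and `field_widths` are hypothetical names, not part of the slides.

```python
def log2_exact(n):
    """log2 of a power of two; raises if n is not a power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


def field_widths(capacity, block_size, ways):
    """Return (offset bits, set/index bits) for a cache with the given C, b and N."""
    num_sets = capacity // block_size // ways  # S = (C / b) / N
    return log2_exact(block_size), log2_exact(num_sets)


# (C, b, N) for the four designs above.
designs = {
    "direct mapped, b = 4B, C = 64B": (64, 4, 1),
    "direct mapped, b = 32B, C = 128B": (128, 32, 1),
    "4-way associative, b = 4B, C = 32B": (32, 4, 4),
    "fully associative, b = 8B, C = 32B": (32, 8, 4),
}
for name, (capacity, block_size, ways) in designs.items():
    print(name, field_widths(capacity, block_size, ways))
```

The fully associative design needs no index bits at all, because a single set leaves nothing to select.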
Cache Terminologies
Tag Bits - A unique identifier for a group of data. Because different regions of memory may be mapped into a single block, the tag is used to differentiate between them.
We need to add tags to the cache; they supply the rest of the address bits so that we can distinguish between different memory locations that map to the same cache block.

Valid bit (V) - A bit of information that indicates whether the data in a block is valid (1) or not (0).
1. At the very start, the cache is empty and does not contain valid data.
2. We account for this by adding a valid bit for each cache block.
3. When the system is initialized, all the valid bits are set to 0. When data is loaded into a particular cache block, the corresponding valid bit is set to 1.
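One way to picture the tag and the valid bit is as fields stored alongside each cache block. The sketch below is only an illustration; the class `CacheBlock` and the helpers `lookup` and `fill` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CacheBlock:
    valid: bool = False  # V bit: 0 at initialization
    tag: int = 0         # upper address bits of the block stored here
    data: bytes = b""    # the cached bytes


def lookup(block, tag):
    """A block matches only if it is valid AND its tag equals the requested tag."""
    return block.valid and block.tag == tag


def fill(block, tag, data):
    """Load data into a block and mark it valid."""
    block.tag = tag
    block.data = data
    block.valid = True


cache = [CacheBlock() for _ in range(8)]  # all valid bits start at 0
print(lookup(cache[5], tag=0))  # tag field happens to be 0, but the block is not valid
fill(cache[5], tag=0, data=bytes([0x00, 0x00, 0x00, 0x14]))
print(lookup(cache[5], tag=0))
```

The first lookup shows why the valid bit is needed: without it, an empty block whose tag field happens to equal the requested tag would be mistaken for a hit.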
Example
b = 4B
C = 32B
N = 1 (Direct Mapped)
Addresses are 12 bits
B = C/b = 8
S = B/N = 8/1 = 8
$\text{Offset bits} = \log_2 b = \log_2 4 = 2$
$\text{Set/index bits} = \log_2 S = \log_2 8 = 3$
$\text{Tag bits} = \text{Address bits} - \text{Offset bits} - \text{Set bits} = 12 - 2 - 3 = 7$
Processor Memory Access Sequence
1. 0x014
2. 0xFF0
3. 0x0F7
4. 0xC03
5. 0x0F3
[Figure: memory drawn as 4-byte words at addresses 000, 004, 008, …, FFC, each holding its own address as data (000: 00000000, 004: 00000004, …, FFC: 00000FFC), next to a cache with sets 000 to 111, each with a V bit, a tag and four bytes at offsets 00, 01, 10, 11. All V bits start at 0.]
Access 1: 0x014
Convert to binary: 0000 0001 0100
Tag bits = 0000000, Set bits = 101, Offset bits = 00
Cache after this access: set 101: V = 1, tag 0000000, data 00 00 00 14; all other sets: V = 0.
Cache Miss = 1 Cache Hit = 0
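The conversion from an address to its tag, set and offset fields can be sketched with shifts and masks. The function name `split_address` is hypothetical; the default field widths are the 2 offset bits and 3 set bits computed in this example.

```python
def split_address(address, offset_bits=2, set_bits=3):
    """Split an address into (tag, set index, offset)."""
    offset = address & ((1 << offset_bits) - 1)                   # lowest bits
    set_index = (address >> offset_bits) & ((1 << set_bits) - 1)  # middle bits
    tag = address >> (offset_bits + set_bits)                     # remaining upper bits
    return tag, set_index, offset


for addr in (0x014, 0xFF0, 0x0F7, 0xC03, 0x0F3):
    tag, set_index, offset = split_address(addr)
    print(f"0x{addr:03X}: tag={tag:07b} set={set_index:03b} offset={offset:02b}")
```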
Example
Access 2: 0xFF0
Convert to binary: 1111 1111 0000
Tag bits = 1111111, Set bits = 100, Offset bits = 00
Cache after this access: set 100: V = 1, tag 1111111, data 00 00 0F F0; set 101: V = 1, tag 0000000, data 00 00 00 14; all other sets: V = 0.
Cache Miss = 2 Cache Hit = 0
Example
Access 3: 0x0F7
Convert to binary: 0000 1111 0111
Tag bits = 0000111, Set bits = 101, Offset bits = 11
Data is already present in set “101”. To check whether it is the same data, we need to compare the tag bits. As 0000000 ≠ 0000111, the stored block is a different one and has to be replaced.
Cache after this access: set 100: V = 1, tag 1111111; set 101: V = 1, tag 0000111 (the block at addresses 0x0F4 to 0x0F7, which contains 0x0F7); all other sets: V = 0.
Cache Miss = 3 Cache Hit = 0
Example
Access 4: 0xC03
Convert to binary: 1100 0000 0011
Tag bits = 1100000, Set bits = 000, Offset bits = 11
Cache after this access: set 000: V = 1, tag 1100000 (the block at addresses 0xC00 to 0xC03, which contains 0xC03); set 100: V = 1, tag 1111111; set 101: V = 1, tag 0000111; all other sets: V = 0.
Cache Miss = 4 Cache Hit = 0
Example
Access 5: 0x0F3
Convert to binary: 0000 1111 0011
Tag bits = 0000111, Set bits = 100, Offset bits = 11
Data is already present in set “100” (tag 1111111, loaded by access 2, 0xFF0). To check whether it is the same data, we need to compare the tag bits. As 1111111 ≠ 0000111, the stored block is a different one and has to be replaced, so this is a CACHE MISS. Note that 0x0F3 lies in the block at addresses 0x0F0 to 0x0F3, not in the block at 0x0F4 to 0x0F7 that access 3 loaded into set “101”, so the earlier access to 0x0F7 does not produce a hit here.
Cache after this access: set 000: V = 1, tag 1100000; set 100: V = 1, tag 0000111; set 101: V = 1, tag 0000111; all other sets: V = 0.
Cache Miss = 5 Cache Hit = 0
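The whole access sequence can be replayed with a small direct-mapped cache model. This is a sketch under the example's parameters (8 sets, 4-byte blocks, 12-bit addresses); `simulate_direct_mapped` is a hypothetical name, and the model tracks only valid bits and tags, not the data bytes. With 3 set bits and 2 offset bits, 0x0F3 selects set 100 while 0x0F7 selects set 101, so the model counts the access to 0x0F3 as a miss.

```python
def simulate_direct_mapped(addresses, num_sets=8, block_size=4):
    """Return a list of 'hit' / 'miss' strings, one per access."""
    valid = [False] * num_sets  # valid bits, all 0 at initialization
    tags = [0] * num_sets       # stored tag per set
    results = []
    for addr in addresses:
        block_number = addr // block_size  # drops the offset bits
        index = block_number % num_sets    # set/index bits
        tag = block_number // num_sets     # remaining upper bits
        if valid[index] and tags[index] == tag:
            results.append("hit")
        else:
            results.append("miss")
            valid[index] = True            # load (or replace) the block
            tags[index] = tag
    return results


sequence = [0x014, 0xFF0, 0x0F7, 0xC03, 0x0F3]
results = simulate_direct_mapped(sequence)
for addr, outcome in zip(sequence, results):
    print(f"0x{addr:03X}: {outcome}")
print("misses:", results.count("miss"), "hits:", results.count("hit"))
```

For contrast, a hypothetical sixth access to 0x0F5 would hit in this model, because it shares the block at 0x0F4 to 0x0F7 (set 101, tag 0000111) with the earlier access to 0x0F7.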
