
Lec 26


Great Ideas in Computer Architecture (a.k.a. Machine Structures)

UC Berkeley Teaching Professor Dan Garcia
UC Berkeley Professor Bora Nikolić

Caches III

Garcia, Nikolić

cs61c.org
Accessing data in a direct mapped cache
§ Ex.: 16KB of data, direct-mapped, 4 word blocks
ú Can you work out height, width, area?
§ Read 4 addresses
1. 0x00000014
2. 0x0000001C
3. 0x00000034
4. 0x00008014
§ Memory values here:
Memory Address (hex)   Value of Word
...                    ...
00000010               a
00000014               b
00000018               c
0000001C               d
...                    ...
00000030               e
00000034               f
00000038               g
0000003C               h
...                    ...
00008010               i
00008014               j
00008018               k
0000801C               l
...                    ...


Accessing data in a direct mapped cache
§ 4 Addresses:
ú 0x00000014, 0x0000001C, 0x00000034, 0x00008014
§ 4 Addresses divided (for convenience) into Tag, Index, Byte Offset fields
Tag                  Index        Offset
000000000000000000   0000000001   0100
000000000000000000   0000000001   1100
000000000000000000   0000000011   0100
000000000000000010   0000000001   0100
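
The field split above can be reproduced with a few shifts and masks. This is a minimal sketch, assuming 32-bit byte addresses and the example's geometry (16 KB of data, 16 B blocks, so 4 offset bits and 10 index bits); the name `split_address` is hypothetical and not from the lecture.

```python
# Assumed geometry from the example: 32-bit addresses, 16 KB of data,
# 16 B blocks, direct-mapped.
OFFSET_BITS = 4                                # log2(16 B per block)
INDEX_BITS = 10                                # log2(16 KB / 16 B) = log2(1024 rows)
TAG_BITS = 32 - INDEX_BITS - OFFSET_BITS       # remaining 18 bits


def split_address(addr):
    """Return (tag, index, offset) for a 32-bit byte address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


for addr in (0x00000014, 0x0000001C, 0x00000034, 0x00008014):
    tag, index, offset = split_address(addr)
    print(f"0x{addr:08X}: tag={tag:018b} index={index:010b} offset={offset:04b}")
```

Each printed line should match the corresponding row of the Tag / Index / Offset table.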


Example: 16 KB Direct-Mapped Cache, 16B blocks
§ Valid bit: determines whether anything is stored in that row (when the
computer is initially powered up, all entries are invalid)
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


1. Read 0x00000014
§ 000000000000000000 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


So we read block 1 (0000000001)
§ 000000000000000000 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


No valid data
§ 000000000000000000 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


So load that data into cache, setting tag, valid
§ 000000000000000000 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


Read from cache at offset, return word b
§ 000000000000000000 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


2. Read 0x0000001C = 0…00 0…001 1100
§ 000000000000000000 0000000001 1100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


Index is Valid
§ 000000000000000000 0000000001 1100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


Index is Valid, Tag Matches
§ 000000000000000000 0000000001 1100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


Index is Valid, Tag Matches, return d
§ 000000000000000000 0000000001 1100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


3. Read 0x00000034 = 0…00 0…011 0100
§ 000000000000000000 0000000011 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


So read block 3
§ 000000000000000000 0000000011 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


No valid data
§ 000000000000000000 0000000011 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 0
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


Load that cache block, return word f
§ 000000000000000000 0000000011 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 1 0 h g f e
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


4. Read 0x00008014 = 0…10 0…001 0100
§ 000000000000000010 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 1 0 h g f e
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


So read Cache Block 1, Data is Valid
§ 000000000000000010 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 1 0 h g f e
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


Cache Block 1 Tag does not match (0 ≠ 2)
§ 000000000000000010 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 0 d c b a
2 0
3 1 0 h g f e
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


Miss, so replace block 1 with new data & tag
§ 000000000000000010 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 2 l k j i
2 0
3 1 0 h g f e
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0


And return word j
§ 000000000000000010 0000000001 0100
Tag Index Offset
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 2 l k j i
2 0
3 1 0 h g f e
4 0
5 0
6 0
7 0
... ...
1022 0
1023 0
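
The four reads walked through above can be replayed in a small simulator. This is a sketch under stated assumptions, not the lecture's code: the class name `DirectMappedCache`, the dictionary `MEMORY`, and the outcome strings are hypothetical, and memory holds only the twelve words listed in the example.

```python
# Memory contents from the example: byte address of each word -> value.
MEMORY = {
    0x10: "a", 0x14: "b", 0x18: "c", 0x1C: "d",
    0x30: "e", 0x34: "f", 0x38: "g", 0x3C: "h",
    0x8010: "i", 0x8014: "j", 0x8018: "k", 0x801C: "l",
}

BLOCK_BYTES = 16   # 4 words per block
NUM_ROWS = 1024    # 16 KB of data / 16 B per block


class DirectMappedCache:
    def __init__(self):
        # Every row starts invalid, as on power-up.
        self.rows = [{"valid": False, "tag": None, "words": None}
                     for _ in range(NUM_ROWS)]

    def read(self, addr):
        """Return (outcome, word) for a word read at byte address addr."""
        offset = addr % BLOCK_BYTES
        index = (addr // BLOCK_BYTES) % NUM_ROWS
        tag = addr // (BLOCK_BYTES * NUM_ROWS)
        row = self.rows[index]
        if row["valid"] and row["tag"] == tag:
            outcome = "hit"
        else:
            outcome = "miss w. replace" if row["valid"] else "miss"
            base = addr - offset                     # first byte of the block
            row["words"] = [MEMORY.get(base + 4 * i) for i in range(4)]
            row["valid"], row["tag"] = True, tag
        return outcome, row["words"][offset // 4]


cache = DirectMappedCache()
results = [cache.read(a) for a in (0x00000014, 0x0000001C, 0x00000034, 0x00008014)]
print(results)
```

The four results correspond to the four numbered reads: a miss returning b, a hit returning d, a miss returning f, and a miss with replacement returning j.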


Do an example yourself. What happens?
§ Choose from: Cache: Hit, Miss, Miss w. replace
Values returned: a, b, c, d, e, ..., k, l
§ Read address 0x00000030 ?
000000000000000000 0000000011 0000
§ Read address 0x0000001c ?
000000000000000000 0000000001 1100
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0 0
1 1 2 l k j i
2 0
3 1 0 h g f e
4 0
5 0
6 0
7 0

Answers
§ 0x00000030 a hit
Index = 3, Tag matches, Offset = 0, value = e
§ 0x0000001c a miss
Index = 1, Tag mismatch, so replace from memory, Offset = 0xc, value = d
§ Since reads, values must = memory values whether or not cached:
ú 0x00000030 = e
ú 0x0000001c = d
Memory Address (hex)   Value of Word
...                    ...
00000010               a
00000014               b
00000018               c
0000001C               d
...                    ...
00000030               e
00000034               f
00000038               g
0000003C               h
...                    ...
00008010               i
00008014               j
00008018               k
0000801C               l
...                    ...


Multiword-Block Direct-Mapped Cache
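§ Four words/block, cache size = 4K words
[Figure: a 32-bit address split into Tag (18 bits, bits 31-14), Index (10 bits, bits 13-4), Block offset (2 bits, bits 3-2), and Byte offset (2 bits, bits 1-0). The Index selects one of 1024 rows (0 to 1023), each holding Valid, Tag, and Data. The 18-bit tag comparison is ANDed with Valid to produce Hit, and the 2-bit Block offset drives a 4→1 multiplexor (“MUX”) that selects the 32-bit Data word.]
What kind of locality are we taking advantage of?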
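
The field widths in the diagram (18 / 10 / 2 / 2) follow from the cache geometry. A minimal sketch of that arithmetic, assuming 32-bit addresses, 4-byte words, and power-of-two sizes; the function name `field_widths` is hypothetical.

```python
def field_widths(cache_words, words_per_block, address_bits=32, bytes_per_word=4):
    """Bit widths of each address field for a direct-mapped cache.

    Assumes every size is a power of two, so (n - 1).bit_length() == log2(n).
    """
    num_blocks = cache_words // words_per_block
    byte_offset = (bytes_per_word - 1).bit_length()
    block_offset = (words_per_block - 1).bit_length()
    index = (num_blocks - 1).bit_length()
    tag = address_bits - index - block_offset - byte_offset
    return {"tag": tag, "index": index,
            "block_offset": block_offset, "byte_offset": byte_offset}


widths = field_widths(cache_words=4 * 1024, words_per_block=4)
print(widths)
```

With 4K words and four words per block there are 1024 blocks, which gives the 10-bit index and leaves 18 bits of tag.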


What to do on a write hit?
§ Write-through
ú Update both cache and memory
§ Write-back
ú Update word in cache block
ú Allow memory word to be “stale”
ú Add ‘dirty’ bit to block
Memory & cache are inconsistent
Memory needs to be updated when block is replaced
ú …OS flushes cache before I/O…
§ Performance trade-offs?
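
One way to see the trade-off is to count memory writes under each policy. The sketch below is an illustration only: `TinyCache` and `run_policy` are hypothetical names, the cache holds a single one-word block, and a write miss is assumed to allocate the block (a write-miss policy the slide does not cover).

```python
class TinyCache:
    """A single one-word cache block, used only to contrast write policies."""

    def __init__(self, policy):
        assert policy in ("write-through", "write-back")
        self.policy = policy
        self.addr = None          # address currently cached (None = invalid)
        self.value = None
        self.dirty = False        # only meaningful for write-back
        self.memory = {}
        self.memory_writes = 0

    def _write_memory(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1

    def _evict(self):
        # Write-back: memory is updated only when a dirty block leaves the cache.
        if self.policy == "write-back" and self.dirty:
            self._write_memory(self.addr, self.value)
        self.dirty = False

    def write(self, addr, value):
        if self.addr != addr:     # write miss: allocate the block (assumption)
            self._evict()
            self.addr = addr
        self.value = value
        if self.policy == "write-through":
            self._write_memory(addr, value)   # update cache and memory together
        else:
            self.dirty = True                 # memory is now stale

    def flush(self):
        self._evict()


def run_policy(policy):
    cache = TinyCache(policy)
    for value in range(5):
        cache.write(0x10, value)  # five writes that hit the same word
    cache.write(0x20, 99)         # forces the block at 0x10 to be replaced
    cache.flush()                 # e.g. before I/O
    return cache.memory_writes, cache.memory


for policy in ("write-through", "write-back"):
    print(policy, run_policy(policy))
```

Both policies leave memory with the same final contents; write-back gets there with fewer memory writes at the cost of the dirty bit and a flush.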

Block Size Tradeoff
§ Benefits of Larger Block Size
ú Spatial Locality: if we access a given word, we’re likely to
access other nearby words soon
ú Very applicable with Stored-Program Concept
ú Works well for sequential array accesses
§ Drawbacks of Larger Block Size
ú Larger block size means larger miss penalty
On a miss, it takes longer to load a new block from the next level
ú If block size is too big relative to cache size, then there are
too few blocks
Result: miss rate goes up


Extreme Example: One Big Block
Valid Bit   Tag   Cache Data: B3 B2 B1 B0

§ Cache Size = 4 bytes, Block Size = 4 bytes
ú Only ONE entry (row) in the cache!
§ If an item is accessed, it is likely to be accessed again soon
ú But unlikely to be accessed again immediately!
§ The next access will likely be a miss again
ú Continually loading data into the cache but discarding it (forcing it out) before using it again
ú Nightmare for cache designer: Ping Pong Effect
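
The ping-pong effect is easy to reproduce with a one-entry cache. A minimal sketch, assuming a trace that alternates between two addresses in different blocks; `miss_count_single_block` and the trace itself are made up for illustration.

```python
def miss_count_single_block(addresses, block_bytes=4):
    """Count misses in a cache that holds exactly one block."""
    cached_block = None           # None = invalid
    misses = 0
    for addr in addresses:
        block = addr // block_bytes
        if block != cached_block:
            misses += 1
            cached_block = block  # the old block is forced out
    return misses


trace = [0x00, 0x40, 0x00, 0x40, 0x00, 0x40]
print(miss_count_single_block(trace), "misses out of", len(trace))
```

Every access misses, even though only two distinct blocks are ever touched.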


Block Size Tradeoff Conclusions
[Figure: three plots, each against Block Size.
Miss Penalty vs. Block Size.
Miss Rate vs. Block Size, annotated “Exploits Spatial Locality” and “Fewer blocks: compromises temporal locality”.
Average Access Time vs. Block Size, annotated “Increased Miss Penalty & Miss Rate”.]
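
The Average Access Time plot combines the other two. The slide does not spell out a formula; the relation below is the usual average memory access time expression, stated here as an assumption about what the plot shows:

```latex
\[
\text{Average Access Time} = \text{Hit Time} + \text{Miss Rate} \times \text{Miss Penalty}
\]
```

Since miss penalty grows with block size and miss rate eventually rises as well, the product grows for large blocks.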


Types of Cache Misses (1/2)
§ “Three Cs” Model of Misses
§ 1st C: Compulsory Misses
ú occur when a program is first started
ú cache does not contain any of that program’s data
yet, so misses are bound to occur
ú can’t be avoided easily, so won’t focus on these in
this course
ú Every block of memory will have one compulsory
miss (NOT only every block of the cache)


Types of Cache Misses (2/2)
§ 2nd C: Conflict Misses
ú miss that occurs because two distinct memory addresses
map to the same cache location
ú two blocks (which happen to map to the same location)
can keep overwriting each other
ú big problem in direct-mapped caches
ú how do we lessen the effect of these?
§ Dealing with Conflict Misses
ú Solution 1: Make the cache size bigger
Fails at some point
ú Solution 2: Multiple distinct blocks can fit in the same
cache Index?

Fully Associative Cache (1/3)
§ Memory address fields:
ú Tag: same as before
ú Offset: same as before
ú Index: nonexistent
§ What does this mean?
ú no “rows”: any block can go anywhere in the
cache
ú must compare with all tags in entire cache
to see if data is there

Fully Associative Cache (2/3)
§ Fully Associative Cache (e.g., 32 B block)
ú compare tags in parallel

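[Figure: a 32-bit address split into Cache Tag (27 bits long, bits 31-5) and Byte Offset (bits 4-0). Each cache entry holds Cache Tag, Valid, and Cache Data (bytes B31 ... B1 B0), and each entry has its own “=” comparator on the tag.]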

Fully Associative Cache (3/3)
§ Benefit of Fully Assoc Cache
ú No Conflict Misses (since data can go
anywhere)
§ Drawbacks of Fully Assoc Cache
ú Need hardware comparator for every single
entry: if we have 64KB of data in the cache
with 4B entries, we need 16K comparators:
infeasible
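
The comparator count is just the number of entries, and the lookup compares the address tag against every one of them. A sketch under stated assumptions: `fully_associative_lookup` is a hypothetical name, entries are modelled as `(valid, tag, data)` tuples, and the loop is a sequential stand-in for what hardware does in parallel.

```python
def fully_associative_lookup(entries, addr, block_bytes=4):
    """Return the cached data for addr, or None on a miss.

    There is no index field: the tag is everything above the byte offset,
    and it must be checked against every valid entry.
    """
    tag = addr // block_bytes
    for valid, entry_tag, data in entries:
        if valid and entry_tag == tag:
            return data
    return None


# One comparator per entry: 64 KB of data with 4 B entries.
num_comparators = (64 * 1024) // 4
print(num_comparators)

entries = [(True, 0x14 // 4, "b"), (False, 0, None), (True, 0x8014 // 4, "j")]
print(fully_associative_lookup(entries, 0x8014))
```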


Final Type of Cache Miss
§ 3rd C: Capacity Misses
ú miss that occurs because the cache has a
limited size
ú miss that would not occur if we increase the
size of the cache
ú sketchy definition, so just get the general
idea
§ This is the primary type of miss for
Fully Associative caches.

How to categorize misses
§ Run an address trace against a set of caches:
1. First, consider an infinite-size, fully-associative cache. For every miss that occurs now, consider it a compulsory miss.
2. Next, consider a finite-sized cache (of the size you want to examine) with full-associativity. Every miss that is not in #1 is a capacity miss.
3. Finally, consider a finite-size cache with finite-associativity. All of the remaining misses that are not #1 or #2 are conflict misses.
(Thanks to Prof. Kubiatowicz for the algorithm)
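
The classification procedure above can be sketched by running all three caches side by side over one trace. Assumptions that are not in the slide: the finite fully associative cache uses LRU replacement, the cache under study is direct-mapped, and only accesses that miss in the direct-mapped cache are classified. The name `classify_misses` is hypothetical.

```python
from collections import OrderedDict


def classify_misses(addresses, num_blocks, block_bytes):
    """Label each access as hit, compulsory, capacity, or conflict miss."""
    seen = set()                     # step 1: infinite fully associative cache
    lru = OrderedDict()              # step 2: finite fully associative, LRU (assumed)
    direct = [None] * num_blocks     # step 3: direct-mapped, block number per row
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for addr in addresses:
        block = addr // block_bytes

        first_touch = block not in seen
        seen.add(block)

        fa_hit = block in lru
        if fa_hit:
            lru.move_to_end(block)           # mark as most recently used
        else:
            if len(lru) == num_blocks:
                lru.popitem(last=False)      # evict the least recently used block
            lru[block] = True

        row = block % num_blocks
        dm_hit = direct[row] == block
        direct[row] = block

        if dm_hit:
            counts["hit"] += 1
        elif first_touch:
            counts["compulsory"] += 1
        elif not fa_hit:
            counts["capacity"] += 1
        else:
            counts["conflict"] += 1
    return counts


# Two blocks that map to the same row of a 4-block direct-mapped cache.
print(classify_misses([0x00, 0x40, 0x00, 0x40], num_blocks=4, block_bytes=16))
# {'hit': 0, 'compulsory': 2, 'capacity': 0, 'conflict': 2}
```

In this trace the fully associative cache would hold both blocks, so the repeated misses are conflict misses rather than capacity misses.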


And in Conclusion…
1. Divide into T I O bits, go to Index = I, check valid
   1. If 0, load block, set valid and tag (COMPULSORY MISS) and use offset to return the right chunk (1, 2, or 4 bytes)
   2. If 1, check tag
      1. If match (HIT), use offset to return the right chunk
      2. If not (CONFLICT MISS), load block, set valid and tag, use offset to return the right chunk
address: tag index offset
000000000000000000 0000000001 1100
Index Valid Tag 0xc-f 0x8-b 0x4-7 0x0-3
0
1 1 0 d c b a
2
3
...
