Cache Organization
Topics
Generic cache memory organization
Direct mapped caches
Set associative caches
Impact of caches on programming
Cache Vocabulary
Capacity
Cache block (aka cache line)
Associativity
Cache set
Index
Tag
Hit rate
Miss rate
Replacement policy
General Org of a Cache Memory
A cache is an array of S = 2^s sets.
Each set contains E lines (one or more lines per set).
Each line holds a block of B = 2^b bytes of data, plus a valid bit and t tag bits.
[Figure: S sets, each containing E lines of the form: valid | tag | byte 0 ... byte B-1]
An m-bit address is partitioned into <tag> (t bits), <set index> (s bits), and <block offset> (b bits).
The word at address A is in the cache if the tag bits in one of the valid lines in set <set index> match <tag>.
The word contents begin at offset <block offset> bytes from the beginning of the block.
Direct-Mapped Cache
Simplest kind of cache
Characterized by exactly one line per set (E = 1).
Accessing Direct-Mapped Caches
Set selection
Use the set index bits to determine the set of interest.
Accessing Direct-Mapped Caches
Line matching and word selection
Line matching: Find a valid line in the selected set with a matching tag.
Word selection: Then extract the word.
(1) The valid bit must be set.
(2) The tag bits in the cache line must match the tag bits in the address.
(3) If (1) and (2), then cache hit, and the block offset selects the starting byte.
[Figure: address <tag = 0110, set index = i, block offset = 100> compared (=?)
against the single line of set i, whose block holds bytes 0-7.]
Direct-Mapped Cache Simulation
M = 16 byte addresses (m = 4 bits), B = 2 bytes/block, S = 4 sets, E = 1 line/set
Address bits: t = 1 (tag), s = 2 (set index), b = 1 (block offset)
Address trace (reads):
0  [0000]  miss  (loads M[0-1] into set 00)
1  [0001]  hit
13 [1101]  miss  (loads M[12-13] into set 10)
8  [1000]  miss  (evicts M[0-1], loads M[8-9] into set 00)
0  [0000]  miss  (evicts M[8-9], reloads M[0-1])
Why Use Middle Bits as Index?
Consider a 4-line cache indexed with either the high-order or the middle-order address bits.
High-Order Bit Indexing
Adjacent memory lines would map to the same cache entry
Poor use of spatial locality
Middle-Order Bit Indexing
Consecutive memory lines map to different cache lines
Can hold a C-byte region of the address space in the cache at one time
[Figure: the 16 memory lines 0000-1111; under high-order indexing each quarter of
memory maps to a single cache line, while under middle-order indexing consecutive
lines cycle through all four cache lines.]
Set Associative Caches
Characterized by more than one line per set (E > 1)
[Figure: each of the S sets holds E lines of the form: valid | tag | cache block]
Accessing Set Associative Caches
Set selection
Identical to direct-mapped cache: the s set index bits select one of the S sets.
[Figure: address split into t tag bits, s set index bits, and b block offset bits;
the set index selects one set, each containing E lines of the form: valid | tag | cache block.]
Accessing Set Associative Caches
Line matching and word selection
Must compare the tag in each valid line in the selected set.
(1) The valid bit must be set.
(2) The tag bits in one of the cache lines must match the tag bits in the address.
(3) If (1) and (2), then cache hit, and the block offset selects the starting byte.
[Figure: selected set i holds two valid lines with tags 1001 and 0110; the address
<tag = 0110, set index = i, block offset = 100> matches the second line, and the
offset selects the starting byte within its block (bytes 0-7, words w0-w3).]
Cache Performance Metrics
Miss Rate
Fraction of memory references not found in cache
(misses/references)
Typical numbers:
3-10% for L1
can be quite small (e.g., < 1%) for L2, depending on size, etc.
Hit Time
Time to deliver a line in the cache to the processor (includes
time to determine whether the line is in the cache)
Typical numbers:
1-3 clock cycles for L1
5-12 clock cycles for L2
Miss Penalty
Additional time required because of a miss
Typically 100-300 cycles for main memory
Memory System Performance
Average Memory Access Time (AMAT)
AMAT = hit time + miss rate x miss penalty
Assume a 1-level cache, 90% hit rate, 1 cycle hit time, 200 cycle miss penalty
AMAT = 1 + 0.10 x 200 = 21 cycles!!! - even though 90% of accesses take only one cycle
Memory System Performance - II
How does AMAT affect overall performance?
Recall the CPI equation (pipeline efficiency):
CPI = 1.0 + lp + mp + rp
The load/use penalty (lp) assumed a memory access of 1 cycle
Further, we assumed that all load instructions were 1 cycle
A more realistic AMAT (20+ cycles) really hurts CPI and overall performance

Cause       Name  Instr. Freq.  Cond. Freq.  Stalls  Product
Load        lp    0.30          0.7          21      4.41
Load/Use    lp    0.30          0.3          21+1    1.98
Mispredict  mp    0.20          0.4          2       0.16
Return      rp    0.02          1.0          3       0.06
Total penalty                                        6.61
Memory System Performance - III
How to reduce AMAT?
Reduce miss rate
Reduce miss penalty
Reduce hit time
Writing Cache Friendly Code
Can write code to improve miss rate
Repeated references to variables are good (temporal locality)
Stride-1 reference patterns are good (spatial locality)
Examples assume: cold cache, 4-byte words, 4-word cache blocks
Concluding Observations
Programmer can optimize for cache performance
How data structures are organized
How data are accessed
Nested loop structure
Blocking is a general technique