Microprocessor & Computer Architecture (μpCA): Unit 4: Cache Memory

Uploaded by

Pranathi Praveen


Microprocessor & Computer

Architecture (μpCA)

Unit 4: Cache Memory

4th & 5th Optimization

UE19CS252

Session : 4.2
MPCA - Fourth optimization:
Multilevel Caches to Reduce Miss Penalty

• Consider the cache performance formula:

– Avg. memory access time = Hit time + Miss rate x Miss penalty
• An improvement in miss penalty is just as beneficial as an improvement in miss rate.
• The performance gap between processor & memory raises a question:
– Should the cache be made faster to keep pace with the speed of the processor? Or
– Should the cache be made larger to overcome the widening performance gap between the processor & the main memory?
• The answer is to do both.
– Adding another level of cache between the original cache & memory simplifies the decision.
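A minimal sketch of why miss penalty matters as much as miss rate. The function name `amat` and the baseline numbers are hypothetical, chosen only for illustration; they do not come from the slides.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time in clock cycles."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical baseline: 1-cycle hit, 4% miss rate, 100-cycle miss penalty.
baseline = amat(1, 0.04, 100)

# Halving either factor of the product removes the same number of cycles.
half_miss_rate = amat(1, 0.02, 100)
half_miss_penalty = amat(1, 0.04, 50)

print(baseline, half_miss_rate, half_miss_penalty)
```

Because miss rate and miss penalty only appear as a product, halving either one gives the same average memory access time.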

• The first-level cache can be small enough to match the clock cycle time of the processor.
• The second-level cache can be large enough to capture many accesses that would otherwise go to main memory, thus reducing the effective miss penalty.
• A multilevel cache complicates performance analysis.
• Consider the average memory access time (AMAT) for a two-level cache, using the subscripts L1 & L2 to refer to the first & second level, respectively.
• The original formula is:
AMAT = Hit time_L1 + Miss rate_L1 x Miss penalty_L1
and
Miss penalty_L1 = Hit time_L2 + Miss rate_L2 x Miss penalty_L2

Substituting into the original equation, we get:

AMAT = Hit time_L1 + Miss rate_L1 x [Hit time_L2 + Miss rate_L2 x Miss penalty_L2]
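The substitution can be sketched in a few lines of Python. The function names and the sample numbers are hypothetical; `miss_rate_l2` is the miss rate measured on the accesses that reach L2, that is, the leftovers from L1.

```python
def miss_penalty_l1(hit_time_l2, miss_rate_l2, miss_penalty_l2):
    """Cost of an L1 miss: go to L2, and sometimes on to main memory."""
    return hit_time_l2 + miss_rate_l2 * miss_penalty_l2


def amat_two_level(hit_time_l1, miss_rate_l1,
                   hit_time_l2, miss_rate_l2, miss_penalty_l2):
    """AMAT for a two-level cache, obtained by substituting the L1 miss penalty."""
    penalty = miss_penalty_l1(hit_time_l2, miss_rate_l2, miss_penalty_l2)
    return hit_time_l1 + miss_rate_l1 * penalty


# Hypothetical numbers (clock cycles and fractions), for illustration only.
example_penalty = miss_penalty_l1(8, 0.4, 100)
example_amat = amat_two_level(1, 0.05, 8, 0.4, 100)
print(example_penalty, example_amat)
```

Keeping the L1 miss penalty as its own function mirrors the two-step derivation: the outer formula is unchanged, only the penalty term is expanded.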

Here,
Second level miss rate is measured on the leftovers from the first level cache.

To avoid ambiguity, the following terms are used for a two level cache system.
– Local miss rate
– Global miss rate
Local miss rate:
• The number of misses in the cache divided by the total number of memory accesses to this cache.
Ex: For the first-level cache it is Miss rate_L1.
For the second-level cache it is Miss rate_L2.
Global miss rate:
• The number of misses in the cache divided by the total number of memory accesses generated by the processor.
Ex: The global miss rate for the first-level cache is still Miss rate_L1,
but for the second-level cache it is Miss rate_L1 x Miss rate_L2.
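The two definitions can be sketched from raw counts as follows. The function name `miss_rates` and the counts are hypothetical; the sketch assumes every L1 miss becomes exactly one L2 access.

```python
def miss_rates(cpu_accesses, l1_misses, l2_misses):
    """Return local and global miss rates for a two-level cache."""
    local_l1 = l1_misses / cpu_accesses   # L1 sees every processor access
    local_l2 = l2_misses / l1_misses      # L2 only sees the L1 misses
    return {
        "local_l1": local_l1,
        "global_l1": local_l1,            # identical for the first level
        "local_l2": local_l2,
        "global_l2": l2_misses / cpu_accesses,
    }


# Hypothetical counts: 2000 accesses, 100 L1 misses, 25 L2 misses.
rates = miss_rates(2000, 100, 25)
print(rates)
```

The global L2 miss rate computed from the counts equals the product of the two local miss rates.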

• The local miss rate is large for second-level caches because

– The first-level cache skims the cream of the memory accesses.
• The global miss rate is the more useful measure.
– It indicates what fraction of the memory accesses that leave the processor go all the way to memory.
• Here, the misses per instruction metric shines.
Expanding the memory stalls per instruction to add the impact of a second level:

Avg. memory stalls per instruction = Misses per instruction_L1 x Hit time_L2 + Misses per instruction_L2 x Miss penalty_L2

Suppose that in 1000 memory references there are 40 misses in the first-level cache and 20 misses in the second-level cache. What are the various miss rates?
Assume the miss penalty from the L2 cache to memory is 200 clock cycles, the hit time of the L2 cache is 10 clock cycles, the hit time of the L1 cache is 1 clock cycle, and there are 1.5 memory references per instruction.
What are the average memory access time and the average stall cycles per instruction?
Ignore the impact of writes.
Answer
The miss rate (either local or global) for the first-level cache is 40/1000 = 4%.
The local miss rate for the second-level cache is 20/40 = 50%.
The global miss rate of the second-level cache is 20/1000 = 2%.
Then,
AMAT = Hit time_L1 + Miss rate_L1 x [Hit time_L2 + Miss rate_L2 x Miss penalty_L2]
= 1 + 4% x (10 + 50% x 200) = 1 + 4% x 110 = 5.4 clock cycles.
# of instructions = # of memory references / # of memory references per instruction
= 1000 / 1.5 = 667 instructions.
To convert misses per 1000 memory references into misses per 1000 instructions, multiply by 1.5:
L1 misses per 1000 instructions = 40 x 1.5 = 60 misses
L2 misses per 1000 instructions = 20 x 1.5 = 30 misses
Average memory stalls per instruction = Misses per instruction_L1 x Hit time_L2 + Misses per instruction_L2 x Miss penalty_L2
= (60/1000) x 10 + (30/1000) x 200
= 0.06 x 10 + 0.03 x 200 = 0.6 + 6.0
= 6.6 clock cycles.
Alternatively,
Average memory stalls per instruction = (AMAT - Hit time_L1) x Average # of memory references per instruction
= (5.4 - 1.0) x 1.5 = 6.6 clock cycles.
Note: Both methods give the same memory stalls per instruction.
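The worked example can be checked numerically. This is a small sketch that uses the numbers given in the problem; the variable names are chosen here for illustration.

```python
# Numbers from the worked example.
memory_refs = 1000
l1_misses = 40
l2_misses = 20
hit_time_l1 = 1
hit_time_l2 = 10
miss_penalty_l2 = 200
refs_per_instruction = 1.5

miss_rate_l1 = l1_misses / memory_refs        # 4% (local and global)
local_miss_rate_l2 = l2_misses / l1_misses    # 50%

amat = hit_time_l1 + miss_rate_l1 * (
    hit_time_l2 + local_miss_rate_l2 * miss_penalty_l2
)

# Method 1: misses per instruction at each level.
instructions = memory_refs / refs_per_instruction
stalls_from_misses = (
    (l1_misses / instructions) * hit_time_l2
    + (l2_misses / instructions) * miss_penalty_l2
)

# Method 2: stall part of the AMAT times memory references per instruction.
stalls_from_amat = (amat - hit_time_l1) * refs_per_instruction

print(amat, stalls_from_misses, stalls_from_amat)
```

Both methods count the same stall cycles, once per instruction directly and once per memory reference scaled by 1.5, so they agree.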

First perspective:

• The global cache miss rate is very similar to the miss rate that a single cache of the second-level cache's size would have,
provided that the second-level cache is much larger than the first-level cache.

Second perspective:

• The local cache miss rate is not a good measure of secondary caches.
• It is a function of the miss rate of the first-level cache.
• It can therefore vary when the first-level cache is changed.

Note: The global cache miss rate should be used when evaluating second-level caches.
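A small sketch of why the local miss rate misleads. It assumes, hypothetically, that changing the first-level cache changes the number of L1 misses but not which references finally miss in a much larger L2; the function name and the counts are illustrative.

```python
def l2_miss_rates(cpu_accesses, l1_misses, l2_misses):
    """Return (local, global) miss rate of the second-level cache."""
    return l2_misses / l1_misses, l2_misses / cpu_accesses


# Same L2 behavior (20 misses per 1000 accesses), two different L1 caches.
results = {}
for l1_misses in (40, 80):
    results[l1_misses] = l2_miss_rates(1000, l1_misses, 20)
    local_rate, global_rate = results[l1_misses]
    print(f"L1 misses={l1_misses}: local L2={local_rate:.2f}, global L2={global_rate:.2f}")
```

Under this assumption the local L2 miss rate halves when the L1 cache gets worse, although the traffic to main memory is unchanged; the global miss rate stays the same.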

Parameters for second level caches:

• The foremost difference between the two levels is that the speed of the first-level cache affects the clock rate of the processor,
• while the speed of the second-level cache only affects the miss penalty of the first-level cache.
• Thus, many alternatives can be considered for the second-level cache that would be ill chosen for the first-level cache.

Two major questions for the design of the second level cache:

1. Will it lower the average memory access time ?

2. How much does it cost ?

• The initial decision is the size of the second-level cache.

• Everything in the first-level cache is likely to be in the second-level cache as well.
• The second-level cache should therefore be much larger than the first.
• If the second-level cache is only a little bigger, its local miss rate will be high.
• This observation inspires the design of huge second-level caches,
• comparable to the size of main memory in older computers!

NOTE:
• Multilevel inclusion is the natural policy for memory hierarchies.
• With inclusion, L1 data is always present in L2.

The essence of all cache designs is:

• Balancing fast hits and few misses.

• For second-level caches,
• There are far fewer hits than in the first-level cache.
• The emphasis shifts to fewer misses.

• This insight leads to much larger caches and to techniques that lower the miss rate,
• such as higher associativity & larger blocks.
MPCA - Fifth optimization:
Prioritizes reads over writes

• This optimization serves reads before earlier writes have completed.

• Complexities of a write buffer with a write-through cache:
• The most important improvement is a write buffer of the proper size.
• Write buffers complicate memory accesses, because the buffer may hold the updated value of a location needed on a read miss.
To resolve this,
• Make the read miss wait until the write buffer is empty,
or
• Check the contents of the write buffer on a read miss; if there are no conflicts and the memory system is available,
• let the read miss continue.
• Virtually all processors use the latter approach.
• It gives reads priority over writes.

• Write-back cache:
• The cost of writes by the processor can also be reduced in a write-back cache.
• Consider a read miss replacing a dirty block.
• Instead of writing the dirty block to memory and then reading memory, we could copy the dirty block to a buffer, then read memory, and then write memory. The read, for which the processor is probably waiting, will then finish sooner.
• As before, if a read miss occurs, the processor can either stall until the buffer is empty or
• check the addresses of the words in the buffer for conflicts.

Consider the following code sequence.

Ex: SW R3, 512(R0)
LW R1, 1024(R0)
LW R2, 512(R0)
Assume a direct-mapped, write-through cache that maps 512 and 1024 to the same block, and a four-word write buffer that is not checked on a read miss. Will the value in R2 always be equal to the value in R3?
Ans:
• This is a read-after-write data hazard in memory.
• The data in R3 are placed into the write buffer after the SW.
• The following LW uses the same cache index and is therefore a miss.
• The second LW tries to put the value in location 512 into register R2.
• This also results in a miss.
• If the write buffer hasn't completed writing to location 512 in memory,
• the read of location 512 will put the old, wrong value into the cache block and then into R2.
• Without proper precautions, R3 would not be equal to R2!
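The hazard and both remedies can be sketched with a toy model. Everything here is an assumption made for illustration: one-word blocks, a cache index of `addr % 512` so that 512 and 1024 collide, a store that also updates the cached copy, made-up memory contents, and a write buffer that has not drained by the time of the reads. The class and policy names are hypothetical.

```python
from collections import deque

CACHE_SIZE = 512  # hypothetical: makes 512 and 1024 share a cache index


class WriteThroughCache:
    def __init__(self, memory, policy):
        self.memory = memory          # backing store: address -> value
        self.lines = {}               # cache index -> (address, value)
        self.write_buffer = deque()   # pending (address, value) writes
        self.policy = policy          # "unchecked", "drain" or "check"

    def store(self, addr, value):
        # Write-through: update the cached copy and queue the memory write.
        self.lines[addr % CACHE_SIZE] = (addr, value)
        self.write_buffer.append((addr, value))

    def drain(self):
        # Complete every pending write, oldest first.
        while self.write_buffer:
            addr, value = self.write_buffer.popleft()
            self.memory[addr] = value

    def load(self, addr):
        index = addr % CACHE_SIZE
        line = self.lines.get(index)
        if line is not None and line[0] == addr:
            return line[1]            # cache hit
        if self.policy == "drain":
            self.drain()              # read miss waits for the buffer to empty
        value = self.memory[addr]
        if self.policy == "check":
            # The newest pending write to this address overrides stale memory.
            for pending_addr, pending_value in self.write_buffer:
                if pending_addr == addr:
                    value = pending_value
        self.lines[index] = (addr, value)
        return value


def run_sequence(policy):
    memory = {512: 111, 1024: 222}    # old contents (made-up values)
    cache = WriteThroughCache(memory, policy)
    r3 = 999
    cache.store(512, r3)              # SW R3, 512(R0)
    cache.load(1024)                  # LW R1, 1024(R0): evicts the block for 512
    r2 = cache.load(512)              # LW R2, 512(R0): misses again
    return r2 == r3


for policy in ("unchecked", "drain", "check"):
    print(policy, run_sequence(policy))
```

In this model the unchecked buffer returns the stale value, while draining the buffer or checking it on a read miss both make R2 equal R3. Checking avoids stalling the read behind unrelated writes, which is why it gives reads priority.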
Optimization 5
THANK YOU

Team MPCA
Department of Computer Science and
Engineering
