
Memory Hierarchies

Forecast
• Memory (B5)
• Motivation for memory hierarchy
• Cache
• ECC
• Virtual memory

© 2000 by Mark D. Hill CS/ECE 552 Lecture Notes: Chapter 7 1

Background
Mem Element   Size     Speed       Price/MB
Register      small    1-5 ns      high ??
SRAM          medium   5-25 ns     $??
DRAM          large    60-120 ns   $1
Disk          large    10-20 ms    $0.20



Background
Need basic element to store a bit - latch, flip-flop, capacitor

Memory is logically a 2D array of #locations x data-width


• e.g., 16 registers 32 bits each is a 16 x 32 memory
• (4 address bits; 32 bits of data)
• today’s main memory chips are 8M x 8
• (23 address bits; 8 bits of data)
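The arithmetic above (locations to address bits) can be checked with a small Python sketch; `memory_dimensions` is a name of my own:

```python
import math

def memory_dimensions(locations, data_bits):
    """Return (address bits, data bits) for a locations x data_bits memory."""
    return math.ceil(math.log2(locations)), data_bits

# 16 registers of 32 bits each: a 16 x 32 memory
print(memory_dimensions(16, 32))        # (4, 32)
# an 8M x 8 main memory chip
print(memory_dimensions(8 * 2**20, 8))  # (23, 8)
```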


Register File
32 FF in parallel => one register
16 registers

one 16-way mux per read port


decode write enable

can use tri-state drivers and a shared bus for each port



SRAM
Static RAM
• retains data without refresh (unlike DRAM)
• 6T CMOS cell
• pass transistors as switch
• bit lines, word lines

SRAM interface

Today - 2M x 8 in 5-15ns

Typical large implementations (512 x 64) x 8


DRAM
Dense memory
• 1T cell (one transistor plus a capacitor)
• forgets data on read (destructive read) and after a while (charge leakage)
• e.g., 16M x 1 in a 4k x 4k array
• 24 address bits - 12 for row and 12 for column

Implementation

writeback row to restore destroyed value

Refresh - in background, march through reading all rows

Interface reflects internal organization - addr/2, RAS, CAS, data


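The row/column addressing above, as a quick Python sketch (`split_dram_address` is a name of my own):

```python
def split_dram_address(addr, row_bits=12, col_bits=12):
    """Split a flat DRAM address into (row, column) for a 4k x 4k array."""
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    col = addr & ((1 << col_bits) - 1)
    return row, col

# a 16M x 1 part: 24 address bits, sent as 12 row bits (RAS) then 12 column bits (CAS)
print(split_dram_address(0x0ABCDE))  # (0x0AB, 0xCDE)
```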
Optimizations
Give faster access to some bits of row
• static column - change column address
• page mode - change column address & CAS hit (EDO)
• nibble mode - fast access to 4 bits

Bigger changes in future


• bandwidth inside >> external bandwidth
• 8 kb/50 ns per chip >> 8 b/50 ns per chip
• 164 Gb/s >> 160 Mb/s
• RAMBUS, IRAM, etc


Motivation for Hierarchy


CPU wants
• (references/insn) * (bytes/reference) * IPC / cycle-time
• 1.2 * 4 * 1 / 2 ns = 2.4 GB/s

CPU can go only as fast as memory can supply
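The demand calculation above, spelled out in Python (numbers taken from the slide; variable names my own):

```python
refs_per_insn = 1.2    # memory references per instruction
bytes_per_ref = 4      # one word per reference
ipc = 1.0              # instructions per cycle
cycle_time_ns = 2.0    # 500 MHz clock

# GB/s, since bytes per ns == GB per s
bandwidth_gb_s = refs_per_insn * bytes_per_ref * ipc / cycle_time_ns
print(bandwidth_gb_s)  # 2.4
```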



Motivation for Hierarchy
Want memory with
• fast access (e.g., one 500 ps CPU cycle)
• large capacity (10 GB)
• inexpensive ($1/MB)

Incompatible requirements

Fortunately memory references are not random!


Motivation for Hierarchy


Locality in time (temporal locality)

if a datum is recently referenced,
it is likely to be referenced again soon

Locality in space (spatial locality)

if a datum is recently referenced,
neighbouring data is likely to be referenced soon



Motivation for Hierarchy
E.g.,
• researching a term paper - don’t look at all books at random
• if you look at a chapter in one book
• temporal - may re-read the chapter again
• spatial - may read neighbouring chapters
• Solution - leave the book on desk for a while
• hit - book on desk
• miss - book not on desk
• miss ratio - fraction not on desk


Motivation for Hierarchy

Memory access time = access-desk + miss-ratio * access-shelf

• 1 + 0.05 * 100 = 6
• 6 << 100

Extend this to several levels of hierarchy
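The desk/shelf arithmetic above, as a quick Python check (`avg_access` is a name of my own):

```python
def avg_access(hit_time, miss_ratio, miss_time):
    """Average access time for a two-level hierarchy."""
    return hit_time + miss_ratio * miss_time

# desk access = 1, 5% of lookups go to the shelf at cost 100
print(avg_access(1, 0.05, 100))  # 6.0, much less than 100
```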



Memory Hierarchy
(figure: hierarchy pyramid, CPU at top, Ln at bottom)
• CPU at the top, backed by L1, L2, L3, ..., Ln
• top level: small, fast, expensive memory
• each level down is larger, slower, and cheaper
• bottom level: largest, slowest, cheapest memory


Memory Hierarchy
Type          Size       Speed (ns)
Register      < 1 KB     0.5
L1 Cache      < 128 KB   1
L2 Cache      < 16 MB    20
Main memory   < 4 GB     100
Disk          > 10 GB    10 x 10^6


Memory Hierarchy
Registers <-> Main memory: managed by compiler/programmer
• holds expression temporaries
• holds variables - more aggressive
• register allocation
• spill when needed
• hard!


Memory Hierarchy
Main memory <-> Disk: managed by
• program - explicit I/O
• operating system - virtual memory
• illusion of larger memory
• protection
• transparent to user



Cache
cache managed by hardware

(figure: CPU <-> $ <-> Main Memory)

keep recently accessed block
• temporal locality

break memory into blocks (several bytes)
• spatial locality

transfer data to/from cache in blocks


Cache
put block in “block frame”
• state (e.g., valid)
• address tag
• data



Cache
on memory access
• if incoming tag == stored tag then HIT
• else MISS
• << replace old block >>
• get block from memory
• put block in cache
• return appropriate word within block


Cache Example
Memory words:
0x11c  0xe0e0e0e0
0x120  0xffffffff
0x124  0x00000001
0x128  0x00000007
0x12c  0x00000003
0x130  0xabababab



Cache Example
a 16-byte cache block frame:
• state    tag    data
• invalid  0x??   ???

lw $4, 0x128

Is tag 0x120 in cache? (block address: 0x128 & 0xfffffff0 = 0x120)

No, get block

• state    tag    data
• valid    0x120  0xffffffff, 0x1, 0x7, 0x3


Cache Example
Return 0x7 to CPU to put in $4
lw $5, 0x124

Is tag 0x120 in cache?


Yes, return 0x1 to CPU
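The two accesses above can be replayed with a toy one-frame cache model in Python (class and function names my own):

```python
class BlockFrame:
    """One cache block frame: state (valid), address tag, and data."""
    def __init__(self):
        self.valid, self.tag, self.data = False, None, None

def access(frame, addr, memory, block_size=16):
    """Return (HIT/MISS, word) for a load, filling the frame on a miss."""
    tag = addr & ~(block_size - 1)          # block-aligned address serves as the tag
    if frame.valid and frame.tag == tag:
        outcome = "HIT"
    else:
        outcome = "MISS"                    # replace old block, fetch from memory
        frame.valid, frame.tag = True, tag
        frame.data = [memory[tag + i] for i in range(0, block_size, 4)]
    return outcome, frame.data[(addr - tag) // 4]

memory = {0x120: 0xffffffff, 0x124: 0x1, 0x128: 0x7, 0x12c: 0x3}
frame = BlockFrame()
print(access(frame, 0x128, memory))  # ('MISS', 7)  -- fetches block 0x120
print(access(frame, 0x124, memory))  # ('HIT', 1)   -- tag 0x120 already present
```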



Cache Example
Often
• cache 1 cycle
• main memory 20 cycles

Performance for data accesses with miss ratio 0.01

mean access = cache access + miss ratio * main memory access

= 1 + 0.01 * 20 = 1.2

Typically caches 64K, main memory 64M


• 20 times faster
• 1/1000 capacity but contains 98% of references


Cache
4 questions
• Where is block placed?
• How is block found?
• Which block is replaced?
• What happens on a write?



Cache Design
Simple cache first (figure: direct-mapped cache)
• block size = 1 word
• “direct-mapped”
• 16K words (64KB) => 16K entries
• address split: tag 16 bits | index 14 bits | byte offset 2 bits
• each entry holds a valid bit, 16-bit tag, 32-bit data
• indexed entry’s tag is compared with the address tag to produce Hit

Consider
• hit & miss
• place & replace
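The field widths above can be checked with a small Python sketch (`decode_direct_mapped` is a name of my own):

```python
def decode_direct_mapped(addr, index_bits=14, offset_bits=2):
    """Split a 32-bit address for a 16K-word direct-mapped cache of 1-word blocks."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

# tag = upper 16 bits, index = middle 14 bits, byte offset = low 2 bits
print(tuple(hex(x) for x in decode_direct_mapped(0x12345678)))  # ('0x1234', '0x159e', '0x0')
```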

Cache Design w/ 16-byte blocks (7.10)

(figure 7.10: direct-mapped cache with 16-byte blocks)
• address split: tag 16 bits | index 12 bits | block offset 2 bits | byte offset 2 bits
• 4K entries, each with a valid bit, 16-bit tag, and 128-bit (4-word) data
• tag compare produces Hit; a mux uses the block offset to select the requested 32-bit word


Cache Design
What if blocks conflict?
• Fully associative cache
• CAM cells hold D and D’; incoming bits B and B’
• match = AND over all bits i of (B_i*D_i + B_i’*D_i’)
• compromise - set associative cache
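The match equation above is a per-bit XNOR ANDed across the word; a minimal Python sketch (`cam_match` is a name of my own):

```python
def cam_match(stored, incoming):
    """match = AND over i of (B_i*D_i + B_i'*D_i'): XNOR each bit pair, AND the results."""
    return all((b & d) | ((1 - b) & (1 - d)) for b, d in zip(incoming, stored))

print(cam_match([1, 0, 1], [1, 0, 1]))  # True  -- every bit matches
print(cam_match([1, 0, 1], [1, 1, 1]))  # False -- middle bit differs
```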


Cache Design w/ 4-way set-assoc. (7.19)


(figure 7.19: 4-way set-associative cache)
• address split: tag 22 bits | index 8 bits | byte offset 2 bits
• 256 sets (index 0..255), each with 4 ways of {valid, tag, data}
• 4 tag comparators operate in parallel; a 4-to-1 multiplexor selects the hitting way’s data to produce Hit and Data

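A toy Python model of the 4-way lookup above (names my own; hardware compares the 4 tags in parallel, the loop here stands in for that):

```python
def lookup_4way(sets, addr, index_bits=8, offset_bits=2):
    """Look up addr in a 4-way set-associative cache: 256 sets, 22-bit tags."""
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    for way in sets[index]:                 # 4 tag comparisons, parallel in hardware
        if way["valid"] and way["tag"] == tag:
            return "HIT", way["data"]
    return "MISS", None

sets = [[{"valid": False, "tag": 0, "data": None} for _ in range(4)] for _ in range(256)]
sets[0x42][1] = {"valid": True, "tag": 0x1F, "data": 0xDEADBEEF}
addr = (0x1F << 10) | (0x42 << 2)           # tag 0x1F, index 0x42, offset 0
print(lookup_4way(sets, addr)[0])  # HIT
```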


Cache Design
3C model
• Conflict
• Capacity
• Compulsory

Q3. Which block is replaced


• LRU
• random


Cache Design
Q4. What happens on a write?
• write hit must be slower (the tag must be checked before the data is written)
• propagate to memory?
• immediately - write-through
• on replacement - write-back

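The two propagation policies above, as a toy Python model (dictionary-based; names my own):

```python
def write(frame, addr, value, memory, policy="write-back"):
    """Handle a write hit under the two propagation policies."""
    frame["data"][addr] = value
    if policy == "write-through":
        memory[addr] = value          # propagate to memory immediately
    else:
        frame["dirty"] = True         # propagate later, on replacement

def evict(frame, memory):
    """On replacement, a write-back cache flushes a dirty block to memory."""
    if frame.get("dirty"):
        memory.update(frame["data"])
    frame["data"], frame["dirty"] = {}, False

memory = {0x100: 0}
frame = {"data": {0x100: 0}, "dirty": False}
write(frame, 0x100, 7, memory)        # write-back: memory still stale
print(memory[0x100])                  # 0
evict(frame, memory)
print(memory[0x100])                  # 7
```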


Cache Design
Exploit spatial locality
• bigger block size
• may increase miss penalty

Reduce conflicts
• more associativity
• may increase cache hit time


Cache Design
Unified vs. split instruction and data cache
Example
• consider building 16K I and D cache
• or a 32K unified cache
• let tcache be 1 cycle and tmemory be 10 cycles



Cache Design
I and D split cache
• Imiss is 5% and Dmiss is 6%
• 75% references are instruction fetches
• tavg = (1 + 0.05*10)*0.75 + (1 + 0.06*10) * 0.25 = 1.5

Unified cache
• tavg = 1 + 0.04*10 = 1.4 WRONG!
• tavg = 1.4 + cycles-lost-to-interference
• will cycles-lost-to-interference be < 0.1?
• NOT for modern pipelined processors!
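The split-cache tavg above, as a Python check (`tavg_split` is a name of my own; the slide rounds 1.525 to 1.5):

```python
def tavg_split(t_cache, t_mem, i_miss, d_miss, i_frac):
    """Average access time with split I/D caches, weighted by reference mix."""
    return (t_cache + i_miss * t_mem) * i_frac + (t_cache + d_miss * t_mem) * (1 - i_frac)

print(tavg_split(1, 10, 0.05, 0.06, 0.75))  # 1.525
print(1 + 0.04 * 10)  # 1.4 for the unified cache -- before adding interference cycles
```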


Cache Design
Multi-level caches
Many systems today have a cache hierarchy

E.g.,
• 16K I-cache
• 16K D-cache
• 1M L2-cache



Cache Design
Why?
• Processors getting faster w.r.t. main memory
• want larger caches to reduce frequency of costly misses
• but larger caches are slower!

Solution: Reduce cost of misses with a second level cache

Beginning to appear: 3 cache levels


Split L1 instruction & data on chip
Unified L2 on chip
Unified L3 on board

CPU and Cache Performance


Cache only
• miss ratio
• average access time

Integrate - assume cache hits are part of the pipeline

Time/prog = insn/prog * cycles/insn * sec/cycle

CPI = (execution cycles + stall cycles)/insn

CPI = execution cycles/insn + stall cycles/insn



CPU and Cache Performance
Stall cycles/insn =
• read stall cycles/insn + write stall cycles/insn

read stall cycles/insn =


• read/insn * miss ratio * read miss penalty

write stall cycles/insn =


• more complex - write through, write back, write buffer?


CPU and Cache Performance


Example
• CPI with ideal memory is 1.5
• Assume IF and write never stall
• How is CPI degraded if loads are 25% of all insns
• loads miss 10% and miss cost is 20 cycles

CPI = 1.5 + 0.25*0.10*20 = 2


• 2/1.5 = 1.33, i.e., 33% slower

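The CPI degradation above, as a Python check (`cpi_with_misses` is a name of my own):

```python
def cpi_with_misses(base_cpi, loads_per_insn, miss_ratio, miss_penalty):
    """CPI = execution cycles/insn + stall cycles/insn."""
    return base_cpi + loads_per_insn * miss_ratio * miss_penalty

cpi = cpi_with_misses(1.5, 0.25, 0.10, 20)
print(cpi)                # 2.0
print(cpi / 1.5 - 1)      # ~0.33, i.e., 33% slower than the ideal-memory CPI
```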
