Lecture 04 IS064

Uploaded by fredrickodara

Memory and Caching

The Memory Hierarchy


Hierarchy List
• Registers
• L1 Cache
• L2 Cache
• Main memory
• Disk cache
• Disk
• Optical
• Tape
As one goes down the hierarchy:
– Decreasing cost per bit
– Increasing capacity
– Increasing access time
– Decreasing frequency of access of the memory by the
processor (locality of reference)
So you want fast?
• It is possible to build a computer which uses
only static RAM (see later)
• This would be very fast
• This would need no cache
– How can you cache cache?
• This would cost a very large amount
Locality of Reference
• Temporal Locality
– Programs tend to reference the same memory locations
at a future point in time
– Due to loops and iteration, programs spend a lot of
time in one section of code
• Spatial Locality
– Programs tend to reference memory locations that are
near other recently-referenced memory locations
– Due to the way contiguous memory is referenced, e.g.
an array or the instructions that make up a program
• Locality of reference does not always hold, but it
usually holds
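Both kinds of locality show up in even the simplest loop over an array; a minimal Python sketch (the array `data` is illustrative, not from the slides):

```python
# Summing an array exhibits both kinds of locality:
data = list(range(1000))

total = 0
for i in range(len(data)):   # temporal locality: the loop body and the
    total += data[i]         #   variable `total` are re-referenced every pass
                             # spatial locality: data[i] walks through
                             #   consecutive memory locations
print(total)  # 499500
```

A cache exploits exactly this pattern: the loop instructions stay resident (temporal), and fetching one array element brings its neighbours into the cache (spatial).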
Cache Example
• Consider a Level 1 cache capable of holding 1000
words with a 0.1 s access time. Level 2 is memory
with a 1 s access time.
• If 95% of memory accesses hit the cache:
– T = (0.95)*(0.1 s) + (0.05)*(0.1+1 s) = 0.15 s
• If 5% of memory accesses hit the cache:
– T = (0.05)*(0.1 s) + (0.95)*(0.1+1 s) = 1.05 s
• Want as many cache hits as possible!
[Figure: average access time vs. cache hit ratio, falling from 1.1 s at a 0% hit ratio to 0.1 s at 100%]
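The two cases above come from one formula; a small sketch that reproduces them (units follow the slide, and the miss path is assumed to cost the Level 1 time plus the Level 2 time):

```python
# Average access time for a two-level memory:
# hits cost t_l1; misses cost t_l1 (failed lookup) + t_l2 (memory access).
def avg_access_time(hit_rate, t_l1=0.1, t_l2=1.0):
    """Average access time, in the same unit as t_l1 and t_l2."""
    return hit_rate * t_l1 + (1 - hit_rate) * (t_l1 + t_l2)

print(round(avg_access_time(0.95), 3))  # 0.15 -> 95% of accesses hit
print(round(avg_access_time(0.05), 3))  # 1.05 -> only 5% hit
```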
Semiconductor Memory
• RAM – Random Access Memory
– Misnamed as all semiconductor memory is
random access
– Read/Write
– Volatile
– Temporary storage
– Two main types: Static or Dynamic
Dynamic RAM
• Bits stored as charge in semiconductor capacitors
• Charges leak
• Need refreshing even when powered
• Simpler construction
• Smaller per bit
• Less expensive
• Need refresh circuits (every few milliseconds)
• Slower
• Main memory
Static RAM
• Bits stored as on/off switches via flip-flops
• No charges to leak
• No refreshing needed when powered
• More complex construction
• Larger per bit
• More expensive
• Does not need refresh circuits
• Faster
• Cache
Read Only Memory (ROM)
• Permanent storage
• Microprogramming
• Library subroutines
• Systems programs (BIOS)
• Function tables
Types of ROM
• Written during manufacture
– Very expensive for small runs
• Programmable (once)
– PROM
– Needs special equipment to program
• Read “mostly”
– Erasable Programmable (EPROM)
• Erased by UV
– Electrically Erasable (EEPROM)
• Takes much longer to write than read
– Flash memory
• Erase whole memory electrically
Characteristics of Memory
“Access method”
• Based on the hardware implementation of
the storage device
• Four types
– Sequential
– Direct
– Random
– Associative
Sequential Access Method
• Start at the beginning and read through in
order
• Access time depends on location of data and
previous location
• Example: tape
Direct Access Method
• Individual blocks have unique address
• Access is by jumping to vicinity then
performing a sequential search
• Access time depends on location of data
within "block" and previous location
• Example: hard disk
Random Access Method
• Individual addresses identify locations exactly
• Access time is consistent across all locations
and is independent of previous access
• Example: RAM
Cache
• Small amount of fast memory
• Sits between normal main memory and CPU
• May be located on CPU chip or module
– An entire block of data is copied from memory to the cache
because the principle of locality tells us that once a byte is accessed,
it is likely that a nearby data element will be needed soon.
Cache operation - overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from main
memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which block of
main memory is in each cache slot
Cache Structure
• Cache includes tags to identify the address of the
block of main memory contained in a line of the
cache
• Each word in main memory has a unique n-bit
address
• There are M = 2^n / K blocks of K words in main
memory
• Cache contains C lines of K words each, plus a tag
uniquely identifying the block of K words
Cache Structure (continued)
[Figure: cache laid out as lines numbered 0 to C−1, each line holding a tag plus a block of K words (the block length)]
Memory Divided into Blocks
[Figure: main memory addresses 0 to 2^n − 1, one word per address, grouped into blocks of K words each]
Cache Performance Metrics
Miss Rate
– Fraction of memory references not found in cache (misses / accesses)
• Miss rate = 1 – hit rate
– Typical numbers (in percentages):
• 3-10% for L1
• can be quite small (e.g., < 1%) for L2, depending on size,
etc.
Hit Time
– Time to deliver a line in the cache to the processor
• includes time to determine whether the line is in the cache
– Typical numbers:
• 1-2 clock cycles for L1
• 5-20 clock cycles for L2
Cache Performance Metrics
Miss Penalty
– Additional time required because of a miss
• typically 50-200 cycles for main memory (Trend:
increasing!)
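The three metrics combine into the average memory access time (AMAT = hit time + miss rate × miss penalty); a quick sketch using figures in the typical ranges quoted above (the specific cycle counts are illustrative assumptions, not measurements of any particular CPU):

```python
# Average memory access time from the three cache performance metrics.
def amat(hit_time, miss_rate, miss_penalty):
    """Average cycles per access = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Assumed L1 figures: 2-cycle hit, 5% miss rate, 100-cycle penalty
# to reach main memory.
print(amat(hit_time=2, miss_rate=0.05, miss_penalty=100))  # 7.0
```

Note how a small miss rate still dominates the average when the penalty is large, which is why the trend of increasing miss penalties matters.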
Associative Access Method
• Addressing information must be stored with
data in a general data location
• A specific data element is located by
comparing the desired address with the address
portion of stored elements
• Access time is independent of location or
previous access
• Example: cache
Mapping Functions
• A mapping function is the method used to locate a
memory address within a cache
• It is used when copying a block from main
memory to the cache and it is used again when
trying to retrieve data from the cache
• There are three kinds of mapping functions
– Direct
– Associative
– Set Associative
Mapping Function
• Suppose we have the following configuration
– Word size of 1 byte
– Cache of 16 bytes
– Cache line / Block size is 2 bytes
• i.e. cache is 16/2 = 8 (2^3) lines of 2 bytes per line
• Will need 3 bits to address one of the 8 lines in the cache
– Main memory of 64 bytes
• 6 bit address needed to reference 64 bytes
• (2^6 = 64)
• 64 bytes / 2 bytes-per-block = 32 memory blocks
– Somehow we have to map the 32 memory blocks to the
8 lines in the cache. Multiple memory blocks will have
to map to the same line in the cache!
Mapping Function – 64K Cache
Example
• Suppose we have the following configuration
– Word size of 1 byte
– Cache of 64 KBytes
– Cache line / Block size is 4 bytes
• i.e. cache is 64 KB / 4 bytes = 16K (2^14) lines of 4 bytes
– Main memory of 16 MBytes
• 24 bit address
• (2^24 = 16M)
• 16 MB / 4 bytes-per-block = 4M memory blocks
– Somehow we have to map the 4M blocks in
memory onto the 16K lines in the cache. Multiple
memory blocks will have to map to the same line in the
cache!
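The derived counts in this example follow mechanically from the three sizes; a quick sketch that checks the arithmetic (sizes taken from the slide):

```python
# Checking the 64 KB / 16 MB example's arithmetic.
cache_bytes  = 64 * 1024          # cache size
line_bytes   = 4                  # cache line / block size
memory_bytes = 16 * 1024 * 1024   # main memory size

lines     = cache_bytes // line_bytes        # lines in the cache
blocks    = memory_bytes // line_bytes       # blocks in main memory
addr_bits = memory_bytes.bit_length() - 1    # bits to address every byte

print(lines)      # 16384   (16K lines)
print(blocks)     # 4194304 (4M blocks)
print(addr_bits)  # 24
```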
Direct Mapping
• Simplest mapping technique - each block of main
memory maps to only one cache line
– i.e. if a block is in cache, it must be in one specific
place
• Formula to map a memory block to a cache line:
– i = j mod c
• i=Cache Line Number
• j=Main Memory Block Number
• c=Number of Lines in Cache
– i.e. we divide the memory block number by the number of
cache lines and the remainder is the cache line number
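The mapping formula is a one-liner; a minimal sketch using the earlier 8-line cache:

```python
# Direct mapping: block j of main memory always lands in line j mod c.
def cache_line(j, c):
    """Cache line for memory block j in a cache with c lines."""
    return j % c

# With c = 8 lines (the 16-byte cache example), memory blocks 3, 11
# and 19 all compete for the same line:
print([cache_line(j, 8) for j in (3, 11, 19)])  # [3, 3, 3]
```

This collision is exactly what causes the thrashing problem described below: two hot blocks that share a line evict each other on every access.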
Direct Mapping with C=4
• Shrinking our example to a cache of 4
slots (each slot/line still contains a block of 4 words):
– Cache Line Memory Block Held
• 0 0, 4, 8, …
• 1 1, 5, 9, …
• 2 2, 6, 10, …
• 3 3, 7, 11, …
– In general:
• 0 0, C, 2C, 3C, …
• 1 1, C+1, 2C+1, 3C+1, …
• 2 2, C+2, 2C+2, 3C+2, …
• 3 3, C+3, 2C+3, 3C+3, …
Direct Mapping with C=4
[Figure: 4-slot cache, each slot holding Valid, Dirty and Tag bits plus a block of K words, alongside main memory blocks 0-7; the tag identifies which memory block is currently in the slot]
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
– If a program repeatedly accesses 2 blocks that map to
the same line, cache misses are very
high – a condition called thrashing
Fully Associative Mapping
• A fully associative mapping scheme can overcome the
problems of the direct mapping scheme
– A main memory block can load into any line of cache
– Memory address is interpreted as tag and word
– Tag uniquely identifies block of memory
– Every line’s tag is examined for a match
– Also need a Dirty and Valid bit
• But Cache searching gets expensive!
– Ideally need circuitry that can simultaneously examine all tags
for a match
– Lots of circuitry needed, high cost
• Need replacement policies now that anything can get
thrown out of the cache (will look at this shortly)
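In hardware every line's tag is compared simultaneously; in software a dictionary keyed by tag models the same "any block can go in any line" behaviour. A minimal sketch (class and tag values are illustrative, and the replacement policy is deliberately naive):

```python
# Software model of a fully associative cache: any tag can occupy any line.
class FullyAssociativeCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = {}             # tag -> block data

    def lookup(self, tag):
        """Return the cached block, or None on a miss."""
        return self.lines.get(tag)

    def fill(self, tag, block):
        """Load a block; evict something if the cache is full."""
        if len(self.lines) >= self.num_lines and tag not in self.lines:
            # A replacement policy must pick a victim; evict an arbitrary
            # line here for simplicity (real caches use LRU, random, etc.)
            self.lines.pop(next(iter(self.lines)))
        self.lines[tag] = block

cache = FullyAssociativeCache(num_lines=4)
cache.fill(tag=0x2C, block=b"\xde\xad\xbe\xef")
print(cache.lookup(0x2C))  # the block -> a hit
print(cache.lookup(0x01))  # None -> a miss
```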
Associative Mapping Example
[Figure: 4-slot cache (Valid, Dirty and Tag bits per slot) alongside main memory blocks 0-7; a block can map to any slot, the tag identifies which block is in which slot, and all slots are searched in parallel for the target]
Associative Mapping Traits
• A main memory block can load into any line of
cache
• Memory address is interpreted as:
– Least significant w bits = word position within block
– Most significant s bits = tag used to identify which
block is stored in a particular line of cache
• Every line's tag must be examined for a match
• Cache searching gets expensive and slower
Associative Mapping Address Structure
Example

Tag – s bits (22 in example) | Word – w bits (2 in example)

• 22 bit tag stored with each 32 bit block of data
• Compare tag field with tag entry in cache to
check for hit
• Least significant 2 bits of address identify which
of the four 8 bit words is required from the 32 bit
data block
Fully Associative Cache Organization
[Figure: fully associative cache organization]
Fully Associative Mapping Example
Assume that a portion of the tags in the cache in our example
looks like the table below. Which of the following addresses are
contained in the cache?

a.) 438EE8₁₆  b.) F18EFF₁₆  c.) 6B8EF3₁₆  d.) AD8EF3₁₆

Tag (binary)                  Word positions within block (00 01 10 11)
0101 0011 1000 1110 1110 10
1110 1101 1100 1001 1011 01
1010 1101 1000 1110 1111 00
0110 1011 1000 1110 1111 11
1011 0101 0101 1001 0010 00
1111 0001 1000 1110 1111 11
Associative Mapping Summary

• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or
bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = undetermined
• Size of tag = s bits
Set Associative Mapping
• Compromise between fully-associative and direct-
mapped cache
– Cache is divided into a number of sets
– Each set contains a number of lines
– A given block maps to any line in a specific set
• Use direct-mapping to determine which set in the cache
corresponds to a set in memory
• Memory block could then be in any line of that set
– e.g. 2 lines per set
• 2 way associative mapping
• A given block can be in either of 2 lines in a specific set
– e.g. K lines per set
• K way associative mapping
• A given block can be in one of K lines in a specific set
• Much easier to simultaneously search one set than all lines
Set Associative Mapping
• To compute the cache set number:
– SetNum = j mod v
• j = main memory block number
• v = number of sets in cache
[Figure: 2-way set associative cache with Set 0 (slots 0-1) and Set 1 (slots 2-3) alongside main memory blocks 0-5]
Set Associative Mapping
64K Cache Example
Tag: 9 bits | Set: 13 bits | Word: 2 bits

• E.g. given our 64 KB cache with a line size of 4 bytes, we have
16384 lines. Say that we decide to create 8192 sets, where each set
contains 2 lines. Then we need 13 bits to identify a set (2^13 = 8192)
• Use set field to determine cache set to look in
• Compare tag field of all slots in the set to see if we have a hit, e.g.:
– Address = 16339C = 0001 0110 0011 0011 1001 1100
• Tag = 0 0010 1100 = 02C
• Set = 0 1100 1110 0111 = 0CE7
• Word = 00 = 0
– Address = 008004 = 0000 0000 1000 0000 0000 0100
• Tag = 0 0000 0001 = 001
• Set = 0 0000 0000 0001 = 0001
• Word = 00 = 0
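The field extraction above is just shifting and masking; a small sketch that reproduces both worked addresses (field widths are the 9/13/2 split from this slide):

```python
# Decompose a 24-bit address into tag / set / word fields for the
# 2-way set associative 64 KB cache (9-bit tag, 13-bit set, 2-bit word).
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2

def split_address(addr):
    """Return (tag, set, word) for a 24-bit byte address."""
    word = addr & ((1 << WORD_BITS) - 1)
    set_ = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag  = addr >> (WORD_BITS + SET_BITS)
    return tag, set_, word

# The two worked addresses from the slide:
print([hex(f) for f in split_address(0x16339C)])  # ['0x2c', '0xce7', '0x0']
print([hex(f) for f in split_address(0x008004)])  # ['0x1', '0x1', '0x0']
```

On a lookup, the set field selects one set and only the 2 tags in that set are compared, which is what makes set associative search so much cheaper than a fully associative one.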
K-Way Set Associative
• Two-way set associative gives much better
performance than direct mapping
– Just one extra slot avoids the thrashing problem
• Four-way set associative gives only slightly
better performance over two-way
• Further increases in the size of the set have
little effect other than increased cost of the
hardware!
