Cache Memory


Cache memory

Direct Cache Memory

Associative Cache Memory

Set Associative Cache Memory

How can one get fast memory with less expense?


It is possible to build a computer that uses only
static RAM (a large capacity of fast memory)
This would be a very fast computer
But it would be very costly

Alternatively, use a small, fast memory to hold the
data currently being read and written:
add a cache memory

Locality of Reference Principle


During the course of the execution of a program,
memory references tend to cluster
- e.g. program code: loops, nested calls
- e.g. data: strings, lists, arrays

This can be exploited with a Cache memory
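This clustering can be seen in ordinary code. As an illustrative sketch (not from the original slides), the C fragment below shows both kinds of locality a cache exploits: the running total is reused on every iteration (temporal locality), and the array is walked in address order (spatial locality).

#include <stddef.h>

/* Locality of reference, illustrated (assumed example).
   Temporal locality: sum is referenced on every iteration.
   Spatial locality: a[i] touches consecutive addresses, so after one
   cache miss the rest of that block is already in the cache. */
long sum_array(const long *a, size_t n)
{
    long sum = 0;                 /* reused each iteration: temporal locality */
    for (size_t i = 0; i < n; i++)
        sum += a[i];              /* sequential access: spatial locality */
    return sum;
}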

Cache Memory Organization


Cache - a small amount of fast memory
Sits between normal main memory and the CPU
May be located on the CPU chip or elsewhere in the system
Objective is to make the slower memory system look like fast memory

There may be multiple levels of cache (L1, L2, ...)

Cache Operation Overview


CPU requests the contents of a memory location
Cache is checked for this data
If present, get it from the cache (fast)
If not present, read the required block from main memory into the cache
Then deliver from the cache to the CPU
Cache includes tags to identify which blocks of main memory
are in the cache
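A minimal C sketch of this read flow, assuming a direct-mapped cache; the sizes and the read_block_from_memory helper are hypothetical stand-ins for the memory interface, not part of the original slides.

#include <stdbool.h>
#include <stdint.h>

#define NLINES 1024              /* assumed cache size: 1024 lines */
#define BLOCK  16                /* assumed block size in bytes    */

/* One cache line: valid flag, tag identifying the resident block, data. */
struct line { bool valid; uint32_t tag; uint8_t data[BLOCK]; };
static struct line cache[NLINES];

extern void read_block_from_memory(uint32_t block_addr, uint8_t *dst);

/* The read flow from the slide: check the cache; on a miss, fetch the
   whole block from main memory into the cache; then always deliver
   the requested byte from the cache. */
uint8_t cache_read(uint32_t addr)
{
    uint32_t index = (addr / BLOCK) % NLINES;    /* which line to check  */
    uint32_t tag   = (addr / BLOCK) / NLINES;    /* identifies the block */
    struct line *ln = &cache[index];

    if (!ln->valid || ln->tag != tag) {          /* miss */
        read_block_from_memory(addr - addr % BLOCK, ln->data);
        ln->tag = tag;
        ln->valid = true;
    }
    return ln->data[addr % BLOCK];               /* hit: deliver from cache */
}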

Cache Read Operation - Flowchart

Cache Design Parameters


Size of cache
Size of blocks in cache
Mapping function - how to assign blocks to lines
Replacement algorithm - which block to evict when a line
must be reused
Write policy - when to write blocks back to main memory

Size Does Matter

Cost
More cache is expensive

Speed
More cache is faster (up to a point)
Checking cache for data takes time

Typical Cache Organization

Cache/Main Memory Structure - Direct Caching

Direct Mapping Cache Organization

Direct Mapping Summary


Each block of main memory maps to only one cache line
i.e. if a block is in the cache, it must be in one specific place

Address is split into two parts

- Least significant w bits identify a unique word within a block
- Most significant s bits specify one memory block

The s bits are further split into

- a cache line field of r bits and
- a tag of s-r bits (most significant)

Example Direct Mapping Function


16 MBytes main memory
i.e. memory address is 24 bits
- (2^24 = 16M) bytes of memory

Cache of 64 KBytes
i.e. cache is 16K
- (2^14) lines of 4 bytes each

Cache block of 4 bytes
i.e. block is 4 bytes
- (2^2) bytes of data per block

Example Direct Mapping Address Structure


The 24-bit address is divided into three fields:

Tag (s-r): 8 bits (= 22 - 14)
Line or slot (r): 14 bits
Word (w): 2 bits (4-byte block; in practice blocks, and hence
this field, would likely be wider)

The tag and line fields together form the 22-bit block identifier

No two blocks sharing the same line have the same tag field

Check the contents of the cache by finding the line and comparing the tag

Illustration of Example
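A small, self-contained C program (an assumed illustration; the sample address is arbitrary) that extracts the three fields exactly as laid out in this example:

#include <stdint.h>
#include <stdio.h>

/* Field widths from the example: 24-bit address =
   8-bit tag + 14-bit line + 2-bit word. */
#define WORD_BITS 2
#define LINE_BITS 14

int main(void)
{
    uint32_t addr = 0x16339C;                        /* arbitrary 24-bit address */

    uint32_t word  = addr & ((1u << WORD_BITS) - 1); /* low 2 bits      */
    uint32_t line  = (addr >> WORD_BITS) & ((1u << LINE_BITS) - 1);
    uint32_t tag   = addr >> (WORD_BITS + LINE_BITS);/* high 8 bits     */
    uint32_t block = addr >> WORD_BITS;              /* 22-bit block id */

    printf("tag=%02X line=%04X word=%u block=%06X\n",
           (unsigned)tag, (unsigned)line, (unsigned)word, (unsigned)block);
    return 0;
}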

Direct Mapping pros & cons


Pros:
Simple
Inexpensive

Cons:
One fixed location for a given block
If a program repeatedly accesses 2 blocks that map to
the same line, cache misses are very high (thrashing)

Associative Cache Mapping


A main memory block can load into any line of the cache
The memory address is interpreted as tag and word
The tag uniquely identifies a block of memory
Every line's tag is examined for a match
Cache searching gets expensive (complex hardware) or slow
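A C sketch (an assumed illustration) of why the search is costly: every line's tag must be compared. Hardware does all the comparisons in parallel, one comparator per line; software can only loop.

#include <stdbool.h>
#include <stdint.h>

#define NLINES 16384             /* the 16K-line cache of the running example */

struct line { bool valid; uint32_t tag; };
static struct line cache[NLINES];

/* Fully associative lookup: with a 24-bit address and 4-byte blocks,
   the tag is the whole 22-bit block identifier, and the block may be
   in any line. */
bool assoc_lookup(uint32_t addr)
{
    uint32_t tag = addr >> 2;            /* drop the 2-bit word field */
    for (int i = 0; i < NLINES; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return true;                 /* hit */
    return false;                        /* miss */
}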

Fully Associative Cache Organization

Associative Caching Example

Comparison of Associative to Direct Caching


Direct Cache Example:
8 bit tag
14 bit line
2 bit word
Associative Cache Example:
22 bit tag
2 bit word

Set Associative Mapping


Cache is divided into a number of sets
Each set contains a number of lines
A given block maps to any line in a given set
e.g. Block B can be in any line of set i

e.g. with 2 lines per set


We have 2-way set associative mapping
A given block can be in one of 2 lines, in only one set

Two Way Set Associative Cache Organization

2 Way Set Assoc Example

Comparison of Direct, Assoc, Set Assoc Caching


Direct Cache Example (16K lines):
8 bit tag
14 bit line
2 bit word
Associative Cache Example (16K lines):
22 bit tag
2 bit word
Set Associative Cache Example (16K lines, 2 lines/set):
9 bit tag
13 bit set
2 bit word
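A C sketch (an assumed illustration) of the two-way set associative lookup with the 9/13/2-bit split above: only the two lines of one set need their tags compared.

#include <stdbool.h>
#include <stdint.h>

#define NSETS 8192               /* 2^13 sets */
#define WAYS  2                  /* 2 lines per set */

struct line { bool valid; uint32_t tag; };
static struct line cache[NSETS][WAYS];

/* Two-way set associative lookup: the set field selects one set,
   and only that set's WAYS tags are compared. */
bool set_assoc_lookup(uint32_t addr)
{
    uint32_t set = (addr >> 2) & (NSETS - 1);   /* 13-bit set index */
    uint32_t tag = addr >> 15;                  /* 9-bit tag        */
    for (int w = 0; w < WAYS; w++)
        if (cache[set][w].valid && cache[set][w].tag == tag)
            return true;                        /* hit */
    return false;                               /* miss */
}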

Replacement Algorithms (1)


Direct mapping
No choice
Each block only maps to one line
Replace that line

Replacement Algorithms (2)


Associative & Set Associative
Hardware-implemented algorithm (for speed)

First in first out (FIFO)
replace the block that has been in the cache longest

Least frequently used (LFU)
replace the block which has had the fewest hits

Random
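As a hedged sketch of one of these policies, the C function below picks an LFU victim within one set; the struct layout and per-line hit counter are assumptions (real caches keep this state in hardware).

#include <stdbool.h>
#include <stdint.h>

#define WAYS 4                   /* assumed 4-way set associative cache */

/* Per-line state: the hit counter is what LFU needs. */
struct line { bool valid; uint32_t tag; unsigned hits; };

/* Least frequently used: prefer an invalid (free) line; otherwise
   evict the line with the fewest hits. */
int lfu_victim(const struct line set[WAYS])
{
    int victim = 0;
    for (int w = 0; w < WAYS; w++) {
        if (!set[w].valid)
            return w;                           /* free line: use it */
        if (set[w].hits < set[victim].hits)
            victim = w;
    }
    return victim;
}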

Write Policy Challenges


Must not overwrite a cache block unless main memory
is up to date
Multiple CPUs/processes may have the block cached
I/O may address main memory directly
(so it may not be allowed to cache I/O buffers)

Write through
All writes go to main memory as well as to the cache
(typically 15% or less of memory references are writes)

Challenges:
Multiple CPUs must monitor main memory traffic to
keep the local (to each CPU) cache up to date
Lots of traffic may cause bottlenecks
Potentially slows down writes

Write back
Updates are initially made in the cache only
(an update bit for the cache slot is set when an update
occurs; other caches must then be updated)
If a block is to be replaced, memory is overwritten only if the
update bit is set
(15% or less of memory references are writes)
I/O must access main memory through the cache, or must
update the cache
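A C sketch (an assumed illustration) of write back using the update (dirty) bit; write_block_to_memory is a hypothetical stand-in for the memory interface.

#include <stdbool.h>
#include <stdint.h>

struct line { bool valid; bool dirty; uint32_t tag; uint8_t data[4]; };

extern void write_block_to_memory(uint32_t tag, const uint8_t *src);

/* Write back: a write updates only the cache and sets the update bit. */
void cache_write(struct line *ln, uint32_t offset, uint8_t value)
{
    ln->data[offset] = value;
    ln->dirty = true;                  /* update bit set when update occurs */
}

/* On replacement, memory is overwritten only if the update bit is set. */
void evict(struct line *ln)
{
    if (ln->valid && ln->dirty)
        write_block_to_memory(ln->tag, ln->data);
    ln->valid = false;
    ln->dirty = false;
}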

Coherency with Multiple Caches


Bus watching with write through:
1) mark a block as invalid when another cache
writes back that block, or
2) update the cache block in parallel with the
memory write
Hardware transparency:
all caches are updated simultaneously;
I/O must access main memory through the cache or update the cache(s)
Non-cacheable memory:
multiple processors & I/O share only memory blocks
marked non-cacheable

Choosing Line (block) size


8 to 64 bytes is typically an optimal block size
(though this obviously depends upon the program)
Larger blocks reduce the number of blocks that fit in a given
cache size, while bringing in additional words that may or may
not be accessed soon
An alternative is to also fetch adjacent blocks when a line is
loaded into the cache
Another alternative is to have the program loader decide the
cache strategy for a particular program

Multi-level Cache Systems


As logic density increases, it has become advantageous and
practical to create multi-level caches:
1) on chip
2) off chip

The L2 cache may avoid the system bus, making caching faster

L2 can potentially be moved onto the chip, even if it
doesn't use the system bus
Contemporary designs now incorporate an on-chip
L3 cache as well

Split Cache Systems


Split cache into:
1) Data cache
2) Program cache
Advantage:
Likely increased hit rates
- data and program accesses display different behavior
Disadvantage:
Complexity

Intel Caches
80386 - no on-chip cache
80486 - 8 KB, using 16-byte lines and a four-way set associative organization
Pentium (all versions) - two on-chip L1 caches
Data & instructions

Pentium III - L3 cache added off chip


Pentium 4
L1 caches

8 KBytes
64 byte lines
four way set associative

L2 cache

Feeding both L1 caches

256 KBytes
128 byte lines
8 way set associative

L3 cache on chip

Pentium 4 Block Diagram

Intel Cache Evolution


Problem: External memory slower than the system bus.
Solution: Add external cache using faster memory technology.
First appears: 386

Problem: Increased processor speed results in the external bus becoming
a bottleneck for cache access.
Solution: Move external cache on chip, operating at the same speed as
the processor.
First appears: 486

Problem: Internal cache is rather small, due to limited space on chip.
Solution: Add external L2 cache using faster technology than main memory.
First appears: 486

Problem: Contention occurs when both the Instruction Prefetcher and the
Execution Unit simultaneously require access to the cache. In that case,
the Prefetcher is stalled while the Execution Unit's data access takes place.
Solution: Create separate data and instruction caches.
First appears: Pentium

Problem: Increased processor speed results in the external bus becoming
a bottleneck for L2 cache access.
Solution: Create a separate back-side bus that runs at higher speed than
the main (front-side) external bus. The BSB is dedicated to the L2 cache.
First appears: Pentium Pro
Solution: Move the L2 cache onto the processor chip.
First appears: Pentium II

Problem: Some applications deal with massive databases and must have
rapid access to large amounts of data. The on-chip caches are too small.
Solution: Add external L3 cache.
First appears: Pentium III
Solution: Move the L3 cache on chip.
First appears: Pentium 4

PowerPC Cache Organization (Apple-IBM-Motorola)

601 - single 32 KB cache, 8-way set associative

603 - 16 KB (2 x 8 KB), two-way set associative
604 - 32 KB
620 - 64 KB

G3 & G4:
64 KB L1 cache, 8-way set associative
256 KB, 512 KB or 1 MB L2 cache, two-way set associative

G5:
32 KB instruction cache
64 KB data cache

PowerPC G5 Block Diagram

Comparison of Cache Sizes


Processor | Type | Year of Introduction | L1 cache | L2 cache | L3 cache
IBM 360/85 | Mainframe | 1968 | 16 to 32 KB | - | -
PDP-11/70 | Minicomputer | 1975 | 1 KB | - | -
VAX 11/780 | Minicomputer | 1978 | 16 KB | - | -
IBM 3033 | Mainframe | 1978 | 64 KB | - | -
IBM 3090 | Mainframe | 1985 | 128 to 256 KB | - | -
Intel 80486 | PC | 1989 | 8 KB | - | -
Pentium | PC | 1993 | 8 KB / 8 KB | 256 to 512 KB | -
PowerPC 601 | PC | 1993 | 32 KB | - | -
PowerPC 620 | PC | 1996 | 32 KB / 32 KB | - | -
PowerPC G4 | PC/server | 1999 | 32 KB / 32 KB | 256 KB to 1 MB | 2 MB
IBM S/390 G4 | Mainframe | 1997 | 32 KB | 256 KB | 2 MB
IBM S/390 G6 | Mainframe | 1999 | 256 KB | 8 MB | -
Pentium 4 | PC/server | 2000 | 8 KB / 8 KB | 256 KB | -
IBM SP | High-end server/supercomputer | 2000 | 64 KB / 32 KB | 8 MB | -
CRAY MTAb | Supercomputer | 2000 | 8 KB | 2 MB | -
Itanium | PC/server | 2001 | 16 KB / 16 KB | 96 KB | 4 MB
SGI Origin 2001 | High-end server | 2001 | 32 KB / 32 KB | 4 MB | -
Itanium 2 | PC/server | 2002 | 32 KB | 256 KB | 6 MB
IBM POWER5 | High-end server | 2003 | 64 KB | 1.9 MB | 36 MB
CRAY XD1 | Supercomputer | 2004 | 64 KB / 64 KB | 1 MB | -
