
Computer Organization & Architecture: Cache Memory

This document discusses cache memory and its role in computer organization and architecture. It begins by describing the characteristics of computer memory, including location, capacity, unit of transfer, access method, performance, physical type, and organization. It then discusses the memory hierarchy from registers to disk storage. The document outlines different cache design parameters like size, mapping function, replacement algorithm, write policy, and block size. It provides examples of cache implementations in Intel and IBM processors.

Uploaded by

muhammad farooq

Computer Organization & Architecture

Chapter 4
Cache Memory
Characteristics of Computer Memory
• Location
• Capacity
• Unit of transfer
• Access method
• Performance
• Physical type
• Physical characteristics
• Organisation
Characteristics of Computer Memory
Location of Memory
• CPU
—Registers
• Internal
—Cache, Main Memory
• External
—Accessible via I/O module
—Hard disk, Optical disk
Memory Hierarchy - Diagram
Memory Hierarchy List
• Registers
• L1 Cache
• L2 Cache
• Main memory
• Virtual memory (OS)
• Disk
• Optical
• Tape
Memory Capacity
• Word size
—The natural unit of organisation
—Typically equal to the no. of bits used to
represent an integer and to the instruction length
• Number of words
—or Bytes
• Word length = 8, 16, 32 bits
Unit of Transfer
• Internal
—Unit of transfer = no. of lines in data bus
—32, 64, 128, 256 bits
• External
—Usually a block which is much larger than a
word
• Addressable unit
—Smallest location which can be uniquely
addressed
—Word or block
Access Methods (1)
• Sequential
—Start at the beginning and read through in
order
—Access time depends on location of data and
previous location
—e.g. tape
• Direct
—Individual blocks have unique addresses
—Access is by jumping to vicinity plus sequential
search
—Access time depends on location and previous
location
—e.g. disk
Access Methods (2)
• Random
—Individual addresses identify locations exactly
—Access time is independent of location or
previous access
—e.g. RAM
• Associative
—Data is located by a comparison with contents
of a portion of the store
—Access time is independent of location or
previous access
—Word retrieved based on a portion of its
contents rather than its address
—e.g. cache
Performance Units
• Access time (latency)
—Time between presenting the address and
getting the valid data
• Memory Cycle time
—Time may be required for the memory to
“recover” before next access
—Cycle time = access time + recovery time
• Transfer Rate
—Rate at which data can be moved
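The relation between these units can be checked with a few lines of arithmetic. This is a minimal sketch: the two timings are hypothetical values chosen for illustration, and the transfer-rate line assumes random-access memory that moves one word per cycle.

```python
# Hypothetical timings for illustration only; they are not taken from the slides.
access_time_ns = 50.0      # address presented -> valid data available
recovery_time_ns = 20.0    # time the memory needs before the next access

# Cycle time = access time + recovery time.
cycle_time_ns = access_time_ns + recovery_time_ns

# Assumption: random-access memory delivering one word per cycle,
# so the transfer rate is the reciprocal of the cycle time.
transfer_rate_words_per_s = 1.0 / (cycle_time_ns * 1e-9)

print(f"cycle time   : {cycle_time_ns:.0f} ns")
print(f"transfer rate: {transfer_rate_words_per_s / 1e6:.1f} M words/s")
```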
Physical Types
• Semiconductor
—RAM
• Magnetic
—Disk & Tape
• Optical
—CD & DVD
• Others
Physical Characteristics
• Decay (requires refresh circuitry)
• Volatility (requires power to retain data)
• Erasable (re-writeability)
• Power consumption

• Organization
—Physical arrangement of bits in word
The Bottom Line
• How much?
—Memory Capacity
• How fast?
—Transfer Time
• How expensive?
—Monetary cost
So you want fast?
• It is possible to build a computer which
uses only static RAM
• This would be very fast
• This would need no cache
• This would be very expensive
Locality of Reference
• During the execution of a program,
memory references tend to cluster
—Sequential execution
—Consecutive instructions
—Repetitive access to variables
—Loops
• If one instruction is being executed,
then it is very likely that nearby
instructions will also be executed
—Fetch block of instructions rather than a single
instruction
Spatial & Temporal Locality
• Spatial Locality
—Tendency of execution to involve a number of
memory locations that are clustered
—Sequential execution of instructions
—Sequential access to data values e.g. from a
table
• Temporal Locality
—Tendency of a processor to access memory
locations that have been used recently
—Loop execution
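One rough way to see locality is to count how many distinct memory blocks an access pattern touches. The sketch below assumes 4-byte blocks and three made-up address traces; `blocks_touched` is a hypothetical helper, not something defined in the slides.

```python
BLOCK_SIZE = 4  # bytes per block (assumed)

def blocks_touched(addresses, block_size=BLOCK_SIZE):
    """Return the number of distinct memory blocks covered by the addresses."""
    return len({addr // block_size for addr in addresses})

# Spatial locality: 64 consecutive byte addresses fall into few blocks.
sequential = list(range(0, 64))
# Poor spatial locality: 64 addresses spread 256 bytes apart.
scattered = [i * 256 for i in range(64)]
# Temporal locality: a loop re-reading the same 8 addresses 8 times.
loop = [addr for _ in range(8) for addr in range(0, 8)]

for name, trace in [("sequential", sequential),
                    ("scattered", scattered),
                    ("loop", loop)]:
    print(f"{name:10s}: {len(trace)} accesses, "
          f"{blocks_touched(trace)} distinct blocks")
```

All three traces make 64 accesses, but the fewer distinct blocks a trace touches, the more of its accesses a block-based cache can serve without going to main memory.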
Cache
• Small amount of fast memory
• Sits between CPU and main memory
• May be located on CPU chip
Cache Hierarchy
• L1 Cache -> Closest to the CPU
• L2 Cache -> Next
• L3 Cache -> Farthest from CPU
Cache Hierarchy: On-Chip Cache
Cache Hierarchy
Instruction Cache (iCache) vs. Data Cache (dCache)
Cache/Main Memory Structure

Cache operation – overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from
main memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which
block of main memory is in each cache
slot
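The read steps above can be sketched as a small simulation. Everything here is an assumption made for illustration: the block size, the number of slots, the memory contents, and the rule that a block goes to slot `block_number % NUM_SLOTS`.

```python
BLOCK_SIZE = 4          # words per block (assumed)
NUM_SLOTS = 8           # cache slots (assumed)

# Hypothetical memory contents: word i holds the value 1000 + i.
main_memory = list(range(1000, 1000 + 256))

# Each slot holds (tag, block) or None. The tag is simply the block number,
# and the slot is chosen as block_number % NUM_SLOTS (assumed placement rule).
cache = [None] * NUM_SLOTS
stats = {"hits": 0, "misses": 0}

def read(address):
    """Return the word at `address`, going through the cache."""
    block_number, offset = divmod(address, BLOCK_SIZE)
    slot = block_number % NUM_SLOTS
    entry = cache[slot]
    if entry is not None and entry[0] == block_number:
        stats["hits"] += 1                    # present: serve from cache
    else:
        stats["misses"] += 1                  # absent: fetch the whole block
        start = block_number * BLOCK_SIZE
        cache[slot] = (block_number, main_memory[start:start + BLOCK_SIZE])
    return cache[slot][1][offset]             # always deliver from the cache

values = [read(a) for a in (0, 1, 2, 3, 4, 0)]
print(values, stats)
```

The first access to each block misses and loads the block; later accesses to words in the same block hit.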
Cache Read Operation - Flowchart
Typical Cache Interconnection
Cache Design Parameters
• Size
• Mapping Function
• Replacement Algorithm
• Write Policy
• Block Size
• Number of Caches

Size does matter
• Cost
—More cache is expensive
• Speed
—More cache is faster (up to a point)
—Checking cache for data takes time

Comparison of Cache Sizes

Finding Cache Size on Computer

http://www.cpuid.com/downloads/cpu-z/1.57-setup-en.exe
Typical Mapping Function
• Cache of 64kByte
—Cache block of 4 bytes
—i.e. cache is 16k (2^14) lines of 4 bytes
• 16MBytes main memory
—24 bit address
—(2^24 = 16M)
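The figures on this slide follow directly from the sizes. A short check, using only the numbers given above (the variable names are illustrative):

```python
cache_bytes = 64 * 1024          # 64 kByte cache
block_bytes = 4                  # 4-byte blocks
memory_bytes = 16 * 1024 * 1024  # 16 MByte main memory

num_lines = cache_bytes // block_bytes           # cache lines
address_bits = (memory_bytes - 1).bit_length()   # bits to address every byte
num_blocks = memory_bytes // block_bytes         # blocks in main memory

print(f"{num_lines} lines, {address_bits}-bit address, "
      f"{num_blocks} main memory blocks")
```

With 2^22 memory blocks competing for 2^14 cache lines, the mapping function decides which blocks may occupy which lines.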

Direct Mapping
• Each block of main memory maps to only
one cache line
—i.e. if a block is in cache, it must be in one
specific place
• Address is in two parts
• Least Significant w bits identify unique
word
• Most Significant s bits specify one memory
block
• The MSBs are split into a cache line field r
and a tag of s-r (most significant)
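As an illustration, the address split can be written as a few shifts and masks. This sketch assumes the 64 kByte cache with 4-byte blocks and the 24-bit address from the earlier example, giving w = 2, r = 14 and a tag of s - r = 8 bits; `split_direct` and the sample address are hypothetical.

```python
WORD_BITS = 2    # w: 4-byte block -> 2 bits select the byte within the block
LINE_BITS = 14   # r: 16k lines -> 14 bits select the cache line
TAG_BITS = 24 - LINE_BITS - WORD_BITS   # s - r = 8 bits of tag

def split_direct(address):
    """Split a 24-bit address into (tag, line, word) for direct mapping."""
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word

tag, line, word = split_direct(0x16339C)   # sample address chosen for illustration
print(f"tag={tag:#04x} line={line:#06x} word={word}")
```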
Direct Mapping

Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
—If a program repeatedly accesses 2 blocks that
map to the same line, each access evicts the
other block and the miss rate is very high
(this is called thrashing)
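Thrashing is easy to reproduce in a toy model. The sketch below assumes the 16k-line, 4-byte-block cache from the earlier example and tracks only the tag held in each line; `miss_count` and the addresses are made up for illustration.

```python
NUM_LINES = 16 * 1024   # 16k lines, as in the 64 kByte example
BLOCK_BYTES = 4

def miss_count(addresses):
    """Count misses for a direct-mapped cache that stores one tag per line."""
    tags = {}                       # line index -> tag currently held
    misses = 0
    for addr in addresses:
        block = addr // BLOCK_BYTES
        line, tag = block % NUM_LINES, block // NUM_LINES
        if tags.get(line) != tag:
            misses += 1
            tags[line] = tag        # evict whatever was in this line
    return misses

a = 0x000100
b = a + NUM_LINES * BLOCK_BYTES     # one cache size apart: same line, different tag
c = a + BLOCK_BYTES                 # neighbouring block: different line

conflicting = miss_count([a, b] * 100)   # the two blocks keep evicting each other
separate = miss_count([a, c] * 100)      # each block keeps its own line
print(conflicting, separate)
```

In the conflicting case every one of the 200 accesses misses, even though the rest of the cache is empty; in the other case only the first access to each block misses.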

Associative Mapping
• A main memory block can load into any
line of cache
• Every line’s tag is examined for a match
• Cache searching gets expensive
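A minimal sketch of an associative lookup, assuming the same 24-bit address and 4-byte blocks as before. With no line field, the tag is the whole 22-bit block number. The loop stands in for what hardware would do with one comparator per line working in parallel; the function names and sample addresses are hypothetical.

```python
WORD_BITS = 2                 # 4-byte blocks
TAG_BITS = 24 - WORD_BITS     # no line field: the tag is the whole block number

def split_associative(address):
    """Split a 24-bit address into (tag, word)."""
    return address >> WORD_BITS, address & ((1 << WORD_BITS) - 1)

def lookup(cache_lines, address):
    """Compare the tag against every line; return the matching index or None."""
    tag, _ = split_associative(address)
    for index, stored_tag in enumerate(cache_lines):
        if stored_tag == tag:
            return index
    return None

# Three lines holding the tags of three arbitrary sample addresses.
cache_lines = [split_associative(a)[0] for a in (0x16339C, 0xFFFFF8, 0x000000)]
hit_index = lookup(cache_lines, 0x16339E)    # same block as 0x16339C
miss_index = lookup(cache_lines, 0x123456)   # block not in the cache
print(TAG_BITS, hit_index, miss_index)
```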

Set Associative Mapping
• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given
set
—e.g. Block B can be in any line of set i
• e.g. 2 lines per set
—2 way associative mapping
—A given block can be in one of 2 lines in only
one set
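For the 2-way case, the address fields can be worked out as follows. This sketch assumes the 16k-line cache, 4-byte blocks and 24-bit address from the earlier example, which gives 8k sets, a 13-bit set field and a 9-bit tag; the names and the sample address are illustrative.

```python
ADDRESS_BITS = 24
WORD_BITS = 2
NUM_LINES = 16 * 1024
WAYS = 2                                  # 2 lines per set

num_sets = NUM_LINES // WAYS
SET_BITS = num_sets.bit_length() - 1      # log2, valid because num_sets is a power of two
TAG_BITS = ADDRESS_BITS - SET_BITS - WORD_BITS

def split_set_associative(address):
    """Split a 24-bit address into (tag, set index, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_index = (address >> WORD_BITS) & (num_sets - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_index, word

tag, set_index, word = split_set_associative(0x16339C)
print(num_sets, SET_BITS, TAG_BITS, hex(tag), hex(set_index), word)
```

The block may sit in either of the 2 lines of the selected set, so only 2 tags need to be compared on each access.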

Cache Hit Ratio & L2 Cache Size

Cache Misses & Associativity

Replacement Algorithms (1): Direct mapping
• No choice
• Each block only maps to one line
• Replace that line

Replacement Algorithms (2)
Associative & Set Associative
• Hardware implemented algorithm (fast)
• Least recently used (LRU)
—e.g. in 2 way set associative
—Which of the 2 blocks is LRU?
• First in first out (FIFO)
—replace block that has been in cache longest
• Least frequently used
—replace block which has fewest hits
• Others
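The difference between LRU and FIFO shows up on a short trace. The sketch below models a single 2-line set in software; real caches implement these policies in hardware, and the function names and trace are made up for illustration.

```python
from collections import OrderedDict, deque

def lru_misses(trace, ways):
    """Count misses when the least recently used block is evicted."""
    lines = OrderedDict()                 # block -> None, least recent first
    misses = 0
    for block in trace:
        if block in lines:
            lines.move_to_end(block)      # a hit refreshes recency
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False) # evict least recently used
            lines[block] = None
    return misses

def fifo_misses(trace, ways):
    """Count misses when the block loaded earliest is evicted."""
    lines = deque()
    misses = 0
    for block in trace:
        if block not in lines:            # hits do not change the order
            misses += 1
            if len(lines) == ways:
                lines.popleft()           # evict the block in cache longest
            lines.append(block)
    return misses

trace = ["A", "B", "A", "C", "A", "B"]
print(lru_misses(trace, ways=2), fifo_misses(trace, ways=2))
```

On this trace LRU keeps the frequently reused block A, while FIFO evicts it simply because it was loaded first.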

Write Policy
• Must not overwrite a cache block unless
main memory is up to date
• Multiple CPUs may have individual caches
• I/O may address main memory directly

Write Through Policy
• All writes go to main memory as well as
cache
• Multiple CPUs can monitor main memory
traffic to keep local cache up to date
• Lots of traffic
• Slows down writes

Write Back Policy
• Updates are initially made in cache only
• Update bit for cache slot is set when
update occurs
• If block is to be replaced, write to main
memory only if update bit is set
• I/O must access main memory through
cache
• 15% of memory references are writes
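To compare the traffic the two write policies generate, the sketch below counts main-memory writes for a cache with a single slot. It assumes a block is loaded into the cache on a write miss; `memory_writes` and the trace are hypothetical.

```python
def memory_writes(trace, policy):
    """Count main-memory writes for a one-slot cache under a write policy.

    trace: list of (op, block) pairs with op "r" (read) or "w" (write).
    policy: "write-through" or "write-back".
    """
    resident, dirty, writes = None, False, 0
    for op, block in trace:
        if block != resident:                 # miss: replace the resident block
            if policy == "write-back" and dirty:
                writes += 1                   # flush only if the update bit is set
            resident, dirty = block, False
        if op == "w":
            if policy == "write-through":
                writes += 1                   # every write also goes to memory
            else:
                dirty = True                  # just set the update bit
    return writes

# 10 writes to block 1, then 5 writes to block 2, then a read of block 3.
trace = [("w", 1)] * 10 + [("r", 2)] + [("w", 2)] * 5 + [("r", 3)]
print(memory_writes(trace, "write-through"), memory_writes(trace, "write-back"))
```

Write through sends all 15 writes to main memory; write back sends only one write per replaced block whose update bit was set.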

Intel Cache Evolution

Pentium 4 Cache
• Pentium (all versions) – two on chip L1 caches
— Data & instructions
• Pentium III – L3 cache added off chip
• Pentium 4
— L1 caches
– 8k bytes
– four way set associative
— L2 cache
– Feeding both L1 caches
– 256k
– 8 way set associative
— L3 cache on chip

Pentium 4 Block Diagram

Pentium 4 Core Processor
• Fetch/Decode Unit
— Fetches instructions from L2 cache
— Decode into micro-ops
— Store micro-ops in L1 cache
• Out of order execution logic
— Schedules micro-ops
— Based on data dependence and resources
— May speculatively execute
• Execution units
— Execute micro-ops
— Data from L1 cache
— Results in registers
• Memory subsystem
— L2 cache and systems bus
Intel Core i7 Block Diagram

IBM PowerPC Cache Organization
• 601 – single 32kB 8 way set associative
• 603 – 16kB (2 x 8kB) two way set
associative
• 604 – 32kB
• 620 – 64kB
• G3 & G4
—64kB L1 cache
– 8 way set associative
—256k, 512k or 1M L2 cache
– two way set associative
• G5
—32kB instruction cache
—64kB data cache
Questions ???

Virtual Memory
• An Operating System construct

• Virtual memory combines your computer’s
RAM with temporary space on your hard
disk.
• When RAM runs low, virtual memory
moves data from RAM to a space called
a paging file.

Virtual Memory
