7 Mem Cache

The document summarizes computer memory organization and caching techniques. Memory systems are organized in a hierarchy from fastest and smallest (registers, cache) to slower and larger (main memory, disk). Caches improve performance by exploiting locality and buffering frequently accessed data from main memory. There are different mapping techniques to determine where data is stored in cache, including direct mapping, set associative mapping, and fully associative mapping. Writing modified data back to main memory properly is important for cache coherence.

Uploaded by

Markonda Rajesh
Copyright
© Attribution Non-Commercial (BY-NC)

Memory Organization

• In a typical computer system, the storage system is
  organized according to the following hierarchy
  (from top to bottom, cost/bit increases while access time decreases):

  – Archival storage (magnetic tape or photographic):
    slow access (1-5 s.) and large capacity (almost unlimited)
  – Moving head disk (magnetic or optical)
  – High speed drum
  – Charge coupled device
  – Main memory
  – Cache
  – Internal registers:
    fast access (1-20 ns.) and small capacity (1-4K byte)

B. Ross COSC 3p92 1


Memory speed

• Access time (Ta) - the average time taken to read a unit of information
  e.g., 100 ns (100 x 10**-9 s)

• Access rate (Ra) = 1/Ta (bits/second)
  e.g., 1/100 ns = 10 Mb/s

• Cycle time (Tc) - the average time lapse between two successive read
  operations
  e.g., 500 ns (500 x 10**-9 s)

• Bandwidth or transfer rate (Rc) = 1/Tc (bits/second)
  e.g., 1/500 ns = 2 Mb/s
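The definitions above can be checked numerically; this sketch just evaluates Ra = 1/Ta and Rc = 1/Tc with the slide's example values (the variable names are illustrative):

```python
# Values from the slide's examples
ACCESS_TIME = 100e-9   # Ta: 100 ns per read
CYCLE_TIME = 500e-9    # Tc: 500 ns between successive reads

access_rate = 1 / ACCESS_TIME   # Ra = 1/Ta -> 10 M accesses/s
bandwidth = 1 / CYCLE_TIME      # Rc = 1/Tc -> 2 M accesses/s

print(f"Ra = {access_rate:.0e} per second")
print(f"Rc = {bandwidth:.0e} per second")
```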



Classes of Memory

• RAM (“normal memory”)
• Direct-access storage: HD, CD-ROM, DVD
• Sequential-access storage (tapes): DAT
• Associative (content-addressable) memory: searches for
  data via bit patterns
• ROM



Categories of RAM and ROM

• primary memory is divided into RAM and ROM:

  – RAM: magnetic core, or semiconductor (bipolar or MOS),
    either static or dynamic
  – ROM: bipolar (mask ROM, PROM) or MOS (mask ROM, PROM,
    EPROM, EAROM)



Main Memory Design

• A 1K x 4 RAM chip has a 10-bit address input (A9-A0), 4-bit
  bidirectional data lines (D3-D0), a write-enable input (WE),
  and a chip-select input (CS):

  CS  WE  Mode           Bidirectional data lines D3-D0   Status of the power
  H   X   not selected   High impedance                   Standby
  L   L   Write          Acts as input bus                Active
  L   H   Read           Acts as output bus               Active



Main Memory Design

Q. How do we build a 4K x 4 RAM using four 1K x 4
   RAM chips?

  Chip  A11 A10  A9 A8 A7 ... A0   Range
  0     0   0    x  x  x  ... x    0000 to 1023
  1     0   1    x  x  x  ... x    1024 to 2047
  2     1   0    x  x  x  ... x    2048 to 3071
  3     1   1    x  x  x  ... x    3072 to 4095
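The decoding the table implies can be sketched as follows (the function name is illustrative): the top two address bits select the chip and the low ten bits address a word within it.

```python
def decode(addr):
    """Split a 12-bit address into (chip select, offset) for four
    1K x 4 chips: A11-A10 pick the chip, A9-A0 address within it."""
    assert 0 <= addr < 4096
    chip = addr >> 10        # A11..A10
    offset = addr & 0x3FF    # A9..A0
    return chip, offset

print(decode(0))      # (0, 0)
print(decode(2048))   # (2, 0)
print(decode(4095))   # (3, 1023)
```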



Main Memory Design

Q. How do we build a 256KB RAM system with a 16-bit
   address bus and four 64KB RAM chips?

• Memory bank-switching: log2 n extra address bits from the
  processor drive a 1-of-n decoder; each decoder output
  (Enable 1 .. Enable n) enables one memory bank, and the banks
  sit on the address bus in parallel



Main memory design

• Memory address extension: a 4-bit base register (loaded from
  the data bus) is combined with the 16-bit address bus (the
  offset) to form a 20-bit physical address to memory
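One simple interpretation of the scheme above, sketched in code (the function name is illustrative, and the base-in-the-high-bits layout is an assumption consistent with 4 + 16 = 20 bits):

```python
def physical_address(base, offset):
    """Combine a 4-bit base register with a 16-bit offset to form a
    20-bit physical address: the base selects one of 16 64K segments."""
    assert 0 <= base < 16 and 0 <= offset < 65536
    return (base << 16) | offset

print(hex(physical_address(3, 0x1234)))   # 0x31234
```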



Cache Memory

CPU <-> cache <-> main memory <-> external storage

• Cache: fast-access memory buffer


• locality principle: programs usually use limited memory areas, in contrast to
totally random access
– spatial: location, address
– temporal: time accessed
– if commonly used memory can be buffered in high-speed cache, overall
performance enhanced
– cache takes form of small amount of store, with hardware support for
maintenance and lookup
– each cache cell saves a cache line - block of main memory (4-64 words)
• cache hit: requested memory resides in cache
• cache miss: requested memory not in cache, and must be fetched from main
memory and put into cache
• unified cache: instns, data share same cache
• split cache: separate instn, data caches
– parallel access: double the bandwidth
• level 2 cache: between instn/data cache and main memory
• cache maintenance algorithms similar in spirit to virtual memory ideas at
operating system level; main difference is that cache is hardware-supported,
whereas v.m. is software implemented



Measuring cache performance

• c - cache access time

• m - main memory access time

• hr - hit ratio (0 <= hr <= 1):
  # cache hits / total memory requests

• mr - miss ratio (1-hr)

• mean access time = c + (1-hr)m

• if hr --> 1 then m.a.t. = c

• if hr --> 0 then m.a.t. = c + m



example:

let c = 160 ns

m = 960 ns

hr = .90 (common)

mean = 160 + (1-.90)960


= 256 ns

efficiency = c / mean
= 160/256 = 62.5%
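The arithmetic above can be checked with the mean-access-time formula from the previous slide (the function name is illustrative):

```python
def mean_access_time(c, m, hr):
    # mean access time = c + (1 - hr) * m
    return c + (1 - hr) * m

mat = mean_access_time(160, 960, 0.90)   # 160 + 0.1 * 960
eff = 160 / mat                          # efficiency = c / mean
print(round(mat), round(eff, 3))   # 256 0.625
```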



Direct mapping

• Main memory has M blocks (0 to M-1); the cache has N slots
  (0 to N-1)
• memory block i maps to cache slot i mod N



Direct mapping

• [4.38]
• use a hash function to find cache location
• normally, modulo some bit field of address, then just use
low end field
• cache fields:
– valid bit
– tag - block # being held
– value - data block
• scheme:
• memory request:
– compute cache slot (low n bits)
– check block (tag) field
• hit: return value
• miss: fetch block from memory, give to CPU,
and put into that computed slot (replace existing
item if there)
• can occasionally produce thrashing
  – e.g., addresses that are multiples of the cache size (64K)
    will reside at the same entry
  – split instn/data cache helps avoid thrashing
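The lookup scheme above can be sketched as follows. This is a minimal, word-addressed model with illustrative parameters and names, not the slides' own notation: each slot holds a valid bit, a tag (the block number), and the data block.

```python
NUM_SLOTS = 8
BLOCK_SIZE = 4  # words per block

cache = [{"valid": False, "tag": None, "block": None} for _ in range(NUM_SLOTS)]

def lookup(addr, memory):
    block_no = addr // BLOCK_SIZE
    slot = block_no % NUM_SLOTS          # hash: low bits of the block number
    line = cache[slot]
    if line["valid"] and line["tag"] == block_no:
        return line["block"][addr % BLOCK_SIZE], "hit"
    # miss: fetch the whole block, replacing whatever occupied the slot
    start = block_no * BLOCK_SIZE
    line.update(valid=True, tag=block_no, block=memory[start:start + BLOCK_SIZE])
    return line["block"][addr % BLOCK_SIZE], "miss"

mem = list(range(64))
print(lookup(5, mem))   # (5, 'miss')
print(lookup(6, mem))   # (6, 'hit')  - same block: spatial locality
```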



Set associative mapping

• Main memory has M blocks (0 to M-1); the cache is divided into
  sets (Set 0 to Set N/S - 1), with S blocks per set
• memory block i maps to set i mod (N/S)

[4.30]



Set associative mapping

• [4.39]
• use same hash function as direct mapping, except that each
cache slot holds multiple data blocks
– usually max. 4 blocks (“4-way”)
• searching blocks in a slot done associatively: simultaneous
pattern matching
• more flexible than direct: multiple blocks in set
• uses a smaller tag than fully associative mapping, so the
  associative matching is cheaper to implement
• commonly used in larger systems (VAX 11-780)
• which line should be replaced when slot full?
– eg. LRU (least recently used)
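A minimal sketch of the scheme above, with LRU replacement, under assumed parameters (2-way, 4 sets; all names illustrative). Each set is an ordered map from tag to block, ordered from least to most recently used:

```python
from collections import OrderedDict

NUM_SETS = 4
WAYS = 2          # blocks per set ("2-way")
BLOCK_SIZE = 4

sets = [OrderedDict() for _ in range(NUM_SETS)]

def access(addr, memory):
    block_no = addr // BLOCK_SIZE
    s = sets[block_no % NUM_SETS]        # same hash as direct mapping
    if block_no in s:
        s.move_to_end(block_no)          # mark most recently used
        return "hit"
    if len(s) >= WAYS:
        s.popitem(last=False)            # evict the least recently used line
    start = block_no * BLOCK_SIZE
    s[block_no] = memory[start:start + BLOCK_SIZE]
    return "miss"

mem = list(range(64))
print(access(0, mem), access(16, mem), access(0, mem))  # miss miss hit
```

Both blocks 0 and 4 hash to set 0, yet can coexist: the flexibility direct mapping lacks.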



Writing back to the memory

• only write to memory if the cache data has been modified

• write back (write-deferred):
  – use a modified (dirty) bit; when swapping out a cache slot or
    ending the job, write the slot back if its modified bit is set

• write through:
  – whenever modifying data, always write it back to
    main memory as well
  – have to do this if memory is being shared in a DMA or
    multiprocessing system
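The two policies can be contrasted in a minimal sketch (hypothetical helper names; a single cache line with tag 3 holding a 4-word block):

```python
line = {"tag": 3, "block": [0, 0, 0, 0], "dirty": False}
memory = [0] * 16

def store_write_back(offset, value):
    line["block"][offset] = value
    line["dirty"] = True                       # defer the memory update

def evict():
    if line["dirty"]:                          # write back only if modified
        start = line["tag"] * 4
        memory[start:start + 4] = line["block"]
        line["dirty"] = False

def store_write_through(offset, value):
    line["block"][offset] = value
    memory[line["tag"] * 4 + offset] = value   # always update main memory too

store_write_back(1, 42)
print(memory[13])   # 0  - main memory is stale until eviction
evict()
print(memory[13])   # 42
```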



Example: direct mapping

4 byte blocks
1 byte words
8 slots in cache
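For this configuration, an address splits into a 2-bit byte offset (4-byte blocks), a 3-bit slot index (8 slots), and a tag from the remaining high bits. A small sketch (the function name is illustrative):

```python
def split(addr, offset_bits=2, index_bits=3):
    """Split an address into (tag, slot index, byte offset) for
    4-byte blocks and 8 cache slots."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

print(split(0b1101101))   # address 109 -> (3, 3, 1): tag 3, slot 3, byte 1
```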



Example (cont)

