Lecture 04 IS064
Semiconductor Memory
• RAM – Random Access Memory
– Misnamed as all semiconductor memory is
random access
– Read/Write
– Volatile
– Temporary storage
– Two main types: Static or Dynamic
Dynamic RAM
• Bits stored as charge in semiconductor capacitors
• Charges leak
• Need refreshing even when powered
• Simpler construction
• Smaller per bit
• Less expensive
• Need refresh circuits (every few milliseconds)
• Slower
• Main memory
Static RAM
• Bits stored as on/off switches via flip-flops
• No charges to leak
• No refreshing needed when powered
• More complex construction
• Larger per bit
• More expensive
• Does not need refresh circuits
• Faster
• Cache
Read Only Memory (ROM)
• Permanent storage
• Microprogramming
• Library subroutines
• Systems programs (BIOS)
• Function tables
Types of ROM
• Written during manufacture
– Very expensive for small runs
• Programmable (once)
– PROM
– Needs special equipment to program
• Read “mostly”
– Erasable Programmable (EPROM)
• Erased by UV
– Electrically Erasable (EEPROM)
• Takes much longer to write than read
– Flash memory
• Erase whole memory electrically
Characteristics of Memory
“Access method”
• Based on the hardware implementation of
the storage device
• Four types
– Sequential
– Direct
– Random
– Associative
Sequential Access Method
• Start at the beginning and read through in
order
• Access time depends on location of data and
previous location
• Example: tape
Direct Access Method
• Individual blocks have unique address
• Access is by jumping to vicinity then
performing a sequential search
• Access time depends on location of data
within "block" and previous location
• Example: hard disk
Random Access Method
• Each addressable location has its own unique, wired-in addressing mechanism
• Access time is independent of location or previous access
• Example: main memory (RAM)
Memory Divided into Blocks
[Figure: main memory of 2^n addressable words (addresses 0 to 2^n - 1), each of a fixed word length, grouped into blocks of K words each]
Cache Performance Metrics
Miss Rate
– Fraction of memory references not found in cache (misses / accesses)
• 1 – hit rate
– Typical numbers (in percentages):
• 3-10% for L1
• can be quite small (e.g., < 1%) for L2, depending on size,
etc.
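As a quick worked example of these two formulas, the snippet below computes a miss rate from hypothetical hit/miss counters (the counts are made up for illustration):

```python
# Hypothetical counters from a profiling run (numbers are made up):
accesses = 1_000_000
misses = 42_000
miss_rate = misses / accesses    # 0.042 -> 4.2%, inside the typical L1 range above
hit_rate = 1 - miss_rate         # 0.958
print(f"miss rate = {miss_rate:.1%}, hit rate = {hit_rate:.1%}")   # 4.2%, 95.8%
```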
Hit Time
– Time to deliver a line in the cache to the processor
• includes time to determine whether the line is in the cache
– Typical numbers:
• 1-2 clock cycles for L1
• 5-20 clock cycles for L2
Cache Performance Metrics
Miss Penalty
– Additional time required because of a miss
• typically 50-200 cycles for main memory (Trend:
increasing!)
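These three metrics are commonly combined into an average memory access time (AMAT = hit time + miss rate × miss penalty). A minimal sketch using illustrative values picked from the typical ranges above (not measurements from any real machine):

```python
# Average memory access time: AMAT = hit time + miss rate * miss penalty.
# The values below are illustrative picks from the typical ranges above.

def amat(hit_time, miss_rate, miss_penalty):
    """All times in clock cycles; miss_rate is a fraction (0..1)."""
    return hit_time + miss_rate * miss_penalty

l1_only = amat(hit_time=2, miss_rate=0.05, miss_penalty=100)
print(f"L1 only: {l1_only:.1f} cycles per access")        # 2 + 0.05*100 = 7.0

# With an L2 cache, the L1 miss penalty is itself an AMAT for L2.
l2_amat = amat(hit_time=10, miss_rate=0.01, miss_penalty=100)
two_level = amat(hit_time=2, miss_rate=0.05, miss_penalty=l2_amat)
print(f"L1 + L2: {two_level:.2f} cycles per access")       # 2 + 0.05*11 = 2.55
```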
Associative Access Method
• Addressing information must be stored with
data in a general data location
• A specific data element is located by comparing the
desired address with the address portion of the stored
elements
• Access time is independent of location or
previous access
• Example: cache
Mapping Functions
• A mapping function is the method used to locate a
memory address within a cache
• It is used when copying a block from main
memory to the cache and it is used again when
trying to retrieve data from the cache
• There are three kinds of mapping functions
– Direct
– Associative
– Set Associative
Mapping Function
• Suppose we have the following configuration
– Word size of 1 byte
– Cache of 16 bytes
– Cache line / Block size is 2 bytes
• i.e. cache is 16/2 = 8 (2^3) lines of 2 bytes per line
• Will need 3 bits to address a line in the cache (2^3 = 8 lines)
– Main memory of 64 bytes
• 6 bit address needed to reference 64 bytes
• (2^6 = 64)
• 64 bytes / 2 bytes-per-block = 32 memory blocks
– Somehow we have to map the 32 memory blocks to the
8 lines in the cache. Multiple memory blocks will have
to map to the same line in the cache!
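The arithmetic in this example can be checked with a small helper; cache_geometry is a throwaway function written for this lecture, not a library routine:

```python
from math import log2

def cache_geometry(cache_bytes, block_bytes, memory_bytes):
    """Derive line/block counts and address-field widths for a simple cache."""
    lines = cache_bytes // block_bytes       # slots (lines) in the cache
    blocks = memory_bytes // block_bytes     # blocks in main memory
    addr_bits = int(log2(memory_bytes))      # bits to address one byte of memory
    word_bits = int(log2(block_bytes))       # bits to pick a byte within a block
    line_bits = int(log2(lines))             # bits to pick a cache line
    return dict(lines=lines, blocks=blocks, addr_bits=addr_bits,
                word_bits=word_bits, line_bits=line_bits)

# The small example: 16-byte cache, 2-byte blocks, 64-byte main memory.
print(cache_geometry(16, 2, 64))
# {'lines': 8, 'blocks': 32, 'addr_bits': 6, 'word_bits': 1, 'line_bits': 3}
```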
Mapping Function – 64K Cache Example
• Suppose we have the following configuration
– Word size of 1 byte
– Cache of 64 KByte
– Cache line / Block size is 4 bytes
• i.e. cache is 64 KByte / 4 bytes = 16K (2^14) lines of 4 bytes per line
– Main memory of 16 MBytes
• 24 bit address
• (2^24 = 16M)
• 16 MBytes / 4 bytes-per-block = 4M memory blocks
– Somehow we have to map the 4 Meg of blocks in
memory onto the 16K of lines in the cache. Multiple
memory blocks will have to map to the same line in the
cache!
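Reusing the cache_geometry sketch from the smaller example with the 64K parameters confirms the counts above:

```python
# 64 KByte cache, 4-byte blocks, 16 MByte main memory.
print(cache_geometry(64 * 1024, 4, 16 * 1024 * 1024))
# {'lines': 16384, 'blocks': 4194304, 'addr_bits': 24, 'word_bits': 2, 'line_bits': 14}
```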
Direct Mapping
• Simplest mapping technique - each block of main
memory maps to only one cache line
– i.e. if a block is in cache, it must be in one specific
place
• Formula to map a memory block to a cache line:
– i = j mod c
• i=Cache Line Number
• j=Main Memory Block Number
• c=Number of Lines in Cache
– i.e. we divide the memory block number by the number of
cache lines and the remainder is the cache line number
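The formula maps directly to code; a minimal sketch (direct_map_line is a name invented for this example):

```python
def direct_map_line(block_number, num_lines):
    """Direct mapping: i = j mod c."""
    return block_number % num_lines

# Memory block 9 in an 8-line cache lands in line 1 (9 mod 8 = 1).
print(direct_map_line(9, 8))   # 1
```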
Direct Mapping with C=4
• Shrinking our example to a cache of 4
slots/lines (each slot/line/block still contains 4 words):
– Cache Line Memory Block Held
• 0 0, 4, 8, …
• 1 1, 5, 9, …
• 2 2, 6, 10, …
• 3 3, 7, 11, …
– In general:
• 0 0, C, 2C, 3C, …
• 1 1, C+1, 2C+1, 3C+1, …
• 2 2, C+2, 2C+2, 3C+2, …
• 3 3, C+3, 2C+3, 3C+3, …
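The table above can be regenerated with the direct_map_line sketch from the previous slide:

```python
C = 4  # number of cache lines/slots
for line in range(C):
    blocks = [j for j in range(16) if direct_map_line(j, C) == line]
    print(f"line {line}: blocks {blocks} ...")
# line 0: blocks [0, 4, 8, 12] ...
# line 1: blocks [1, 5, 9, 13] ...
# line 2: blocks [2, 6, 10, 14] ...
# line 3: blocks [3, 7, 11, 15] ...
```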
Direct Mapping with C=4
[Figure: direct-mapped cache of four slots — each slot holds Valid, Dirty, and Tag fields plus one block of data; each main memory block (Block 0, 1, 2, ...) maps to exactly one slot]
Fully Associative Mapping
[Figure: fully associative cache of four slots shown against main memory blocks 0-7]
• Block can map to any slot
• Tag used to identify which block is in which slot
• All slots searched in parallel for target
Associative Mapping Traits
• A main memory block can load into any line of
cache
• Memory address is interpreted as:
– Least significant w bits = word position within block
– Most significant s bits = tag used to identify which
block is stored in a particular line of cache
• Every line's tag must be examined for a match
• Cache searching gets expensive and slower
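In software, a fully associative lookup can be sketched as a tag search over the resident blocks; the dictionary below only illustrates the idea (real hardware compares every stored tag in parallel), and the block contents and tag are made-up values:

```python
WORD_BITS = 2   # 4-byte blocks: low 2 bits select the byte within a block

def associative_lookup(cache, address):
    """cache maps tag -> block data; everything above the word bits is the tag."""
    tag = address >> WORD_BITS
    word = address & ((1 << WORD_BITS) - 1)
    if tag in cache:                  # hardware would compare all stored tags at once
        return cache[tag][word]       # hit: return the requested byte
    return None                       # miss: block must be fetched from main memory

cache = {0x2C0D: b"\x10\x11\x12\x13"}    # one resident block (tag chosen arbitrarily)
print(associative_lookup(cache, (0x2C0D << WORD_BITS) | 0x2))   # hit -> 0x12 (18)
print(associative_lookup(cache, 0x000004))                      # miss -> None
```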
Associative Mapping Address Structure Example
• For the 64 KByte cache example: 24-bit address = 22-bit tag + 2-bit word
– Word = 2 bits to select a byte within the 4-byte block
– Tag = remaining 22 bits, compared against the tag of every line in the cache
Set Associative Mapping
64K Cache Example
• Address structure: Tag = 9 bits | Set = 13 bits | Word = 2 bits
• E.g. Given our 64 KByte cache, with a line size of 4 bytes, we have
16384 lines. Say that we decide to create 8192 sets, where each set
contains 2 lines. Then we need 13 bits to identify a set (2^13 = 8192)
• Use set field to determine cache set to look in
• Compare tag field of all slots in the set to see if we have a hit, e.g.:
– Address = 16339C = 0001 0110 0011 0011 1001 1100
• Tag = 0 0010 1100 = 02C
• Set = 0 1100 1110 0111 = 0CE7
• Word = 00 = 0
– Address = 008004 = 0000 0000 1000 0000 0000 0100
• Tag = 0 0000 0001 = 001
• Set = 0 0000 0000 0001 = 0001
• Word = 00 = 0
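The field extraction for these two addresses can be checked with a few shifts and masks; the 9/13/2 field widths come from the slide, while split_address is just an illustrative helper:

```python
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2

def split_address(addr):
    """Split a 24-bit address into (tag, set, word) for the 2-way 64K cache example."""
    word = addr & ((1 << WORD_BITS) - 1)
    set_idx = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (WORD_BITS + SET_BITS)
    assert tag < (1 << TAG_BITS)       # sanity check: tag fits in 9 bits
    return tag, set_idx, word

for addr in (0x16339C, 0x008004):
    tag, set_idx, word = split_address(addr)
    print(f"{addr:06X}: tag={tag:03X} set={set_idx:04X} word={word}")
# 16339C: tag=02C set=0CE7 word=0
# 008004: tag=001 set=0001 word=0
```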
K-Way Set Associative
• Two-way set associative gives much better
performance than direct mapping
– Just one extra slot avoids the thrashing problem (see the sketch after this list)
• Four-way set associative gives only slightly
better performance over two-way
• Further increases in the size of the set have
little effect other than increased cost of the
hardware!
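The thrashing point can be illustrated with a toy simulation: two blocks that collide in a direct-mapped cache are accessed alternately, while a 2-way set associative cache with LRU replacement keeps both resident. The trace and sizes are made up for the illustration:

```python
def misses_direct(trace, num_lines):
    """Direct mapped: each line holds one block (i = j mod c)."""
    lines = [None] * num_lines
    misses = 0
    for block in trace:
        line = block % num_lines
        if lines[line] != block:
            misses += 1
            lines[line] = block
    return misses

def misses_two_way(trace, num_lines):
    """2-way set associative with LRU: each set holds two blocks."""
    num_sets = num_lines // 2
    sets = [[] for _ in range(num_sets)]   # most recently used block kept at the end
    misses = 0
    for block in trace:
        s = sets[block % num_sets]
        if block in s:
            s.remove(block)                # refresh LRU order on a hit
        else:
            misses += 1
            if len(s) == 2:
                s.pop(0)                   # evict the least recently used block
        s.append(block)
    return misses

trace = [0, 8, 0, 8, 0, 8, 0, 8]           # blocks 0 and 8 collide when c = 8
print(misses_direct(trace, 8))             # 8 misses: every access evicts the other block
print(misses_two_way(trace, 8))            # 2 misses: both blocks fit in the same set
```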