Module 4: Memory System Organization & Architecture
• TEMPORAL LOCALITY
• The principle stating that if a data location is referenced then it will
tend to be referenced again soon.
MEMORY HIERARCHY
Conceptual Organisation of Multilevel memory
[Figure: CPU with Register File, Cache (level 1), Cache (level 2), Main Memory, Secondary Memory. Labels: IC1 (Microprocessor); ICs 2:m; ICs m:n]
• SRAM stores each bit of data on four transistors arranged as two cross-coupled inverters
• SRAM is best suited to small, fast storage such as the CPU’s cache memory
and registers
• SRAM’s cycle time is much shorter than DRAM’s because it does not need to
pause between accesses to refresh
What is DRAM?
• Dynamic random-access memory (DRAM) is a type of random-
access memory that stores each bit of data in a separate capacitor
within an integrated circuit.
• Miss penalty : The time required to fetch a block into a level of the
memory hierarchy from the lower level, including the time to
access the block, transmit it from one level to the other, and insert
it in the level that experienced the miss.
CACHES
Basics of Cache
• Cache was the name chosen to represent the level of the
memory hierarchy between the processor and main
memory
• Associative Mapped
• Set Associative
Direct Mapping
• A cache structure in which each memory location is mapped to
exactly one location in the cache.
• Hence the cache may be accessed directly with the low-order bits
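A minimal sketch of the direct-mapped placement rule described above. It assumes a hypothetical cache of 8 blocks (a power of two, so the index is exactly the low-order bits of the block address); the function name and sizes are illustrative and not taken from the slides.

```python
def direct_mapped_index(block_address: int, num_cache_blocks: int) -> int:
    """Return the single cache block that a memory block can occupy."""
    # The low-order-bits trick only works when the block count is a power of two.
    if num_cache_blocks <= 0 or num_cache_blocks & (num_cache_blocks - 1):
        raise ValueError("num_cache_blocks must be a power of two")
    # Masking keeps the low-order log2(num_cache_blocks) bits,
    # which equals block_address % num_cache_blocks.
    return block_address & (num_cache_blocks - 1)


# Assumed example: 8 cache blocks, memory block addresses 0..31.
for addr in (1, 9, 17, 25):
    print(addr, "->", direct_mapped_index(addr, 8))  # all four map to block 1
```

Because several memory blocks share one cache location, a real cache also stores a tag to tell them apart.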
Example
Continued…
• Each cache location can contain the contents of a number of
different memory locations
• Example:
• No of Cache blocks =9
• No of blocks in main memory = 32
Multi-word cache
• If a cache has 2^n blocks and each block has 2^m words, the
number of bits required for representing the cache
address will be “n + m”
• n to address the block within the cache
• m to address the word within the block
• So if a cache has 2^n blocks, each block has 2^m words, and the main memory address is X
bits, then the number of tag bits can be calculated as,
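The tag is whatever remains of the address once the block index and word offset are removed. The sketch below assumes X counts a word address. For a byte-addressed machine with 4-byte words, 2 further byte-offset bits are subtracted as well; the `byte_offset_bits` parameter is an addition here to cover that case.

```python
def tag_bits(address_bits: int, n: int, m: int, byte_offset_bits: int = 0) -> int:
    """Tag width for a cache with 2^n blocks of 2^m words each."""
    tag = address_bits - (n + m + byte_offset_bits)
    if tag < 0:
        raise ValueError("address is too short for this cache geometry")
    return tag


print(tag_bits(32, 10, 2))                      # word-addressed: 20
print(tag_bits(32, 10, 2, byte_offset_bits=2))  # byte-addressed: 18
```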
• From the above problem it can be seen that a cache holding 16 KB of data actually needs
18.4 KB of storage once the tag and valid bits are counted. By convention, the cache size
names only the data storage, so it is still called a 16 KB cache
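The 18.4 KB figure can be reproduced with a short calculation. The problem statement itself is not shown here, so the parameters below (32-bit byte addresses, 4-word blocks, 4-byte words, one valid bit per block, direct mapping) are assumptions chosen because they give the stated result.

```python
def cache_total_bits(data_bytes, words_per_block, address_bits=32, word_bytes=4):
    """Total storage (data + tag + valid bit) of a direct-mapped cache, in bits."""
    block_bytes = words_per_block * word_bytes
    num_blocks = data_bytes // block_bytes
    index_bits = num_blocks.bit_length() - 1      # log2, valid for powers of two
    offset_bits = block_bytes.bit_length() - 1    # word offset + byte offset
    tag = address_bits - index_bits - offset_bits
    bits_per_block = block_bytes * 8 + tag + 1    # data + tag + valid
    return num_blocks * bits_per_block


total = cache_total_bits(16 * 1024, words_per_block=4)
print(total, "bits =", total / 8 / 1024, "KB")  # 150528 bits = 18.375 KB
```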
Three types of misses in Cache
• Referred to as Three C’s
• Compulsory misses: These are cache misses caused by the first access
to a block that has never been in the cache. These are also called cold-
start misses.
• Capacity misses: These are cache misses caused when the cache cannot
contain all the blocks needed during execution of a program. Capacity
misses occur when blocks are replaced and then later retrieved.
• The control unit must detect a miss and process the miss
by fetching the requested data from memory
3. Write the cache entry, putting the data from memory in the
data portion of the entry, writing the upper bits of the address
(from the ALU) into the tag field, and turning the valid bit on.
• After writing the data into the cache and into the write
buffer, the processor can continue execution
• Even if the write rate is lower than the rate at which memory can accept writes,
stalls can still occur if the writes come in bursts
Continued…
• Solution to the above problem:
• Use of Write back cache
• In a write-back scheme, when a write occurs, the new value is written only
to the block in the cache.
• The modified block is written to the lower level of the hierarchy when it is
replaced
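A minimal sketch of the write-back policy, assuming a direct-mapped cache with one-word blocks and a dirty bit per entry. The dirty bit is the usual mechanism for tracking modified blocks but is not named in the slides, and the class and attribute names are hypothetical.

```python
class WriteBackCache:
    """Direct-mapped, one word per block, write-back with a dirty bit."""

    def __init__(self, num_blocks, memory):
        self.num_blocks = num_blocks
        self.memory = memory                  # dict: address -> value (lower level)
        self.lines = [None] * num_blocks      # each line: [tag, value, dirty]
        self.memory_writes = 0

    def _evict(self, index):
        line = self.lines[index]
        if line is not None and line[2]:      # write back only modified blocks
            address = line[0] * self.num_blocks + index
            self.memory[address] = line[1]
            self.memory_writes += 1

    def write(self, address, value):
        index, tag = address % self.num_blocks, address // self.num_blocks
        line = self.lines[index]
        if line is None or line[0] != tag:    # miss: replace the resident block
            self._evict(index)
        self.lines[index] = [tag, value, True]  # update the cache only


memory = {}
cache = WriteBackCache(4, memory)
for v in range(3):
    cache.write(0, v)        # three writes to the same block: no memory traffic
cache.write(4, 99)           # address 4 also maps to index 0, evicting address 0
print(memory, cache.memory_writes)   # {0: 2} 1
```

Three writes to the same block reach memory only once, when the block is replaced.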
• It is difficult to reduce the latency to fetch the first word from memory
• The clock rate of the bus is usually much slower than the
processor, by as much as a factor of 10.
(or)
• Drawback:
• Cost overhead due to the increased bus width and the mux and control logic
needed to select the appropriate word to write
Interleaved Memory Organization
Continued…
• So if each bank is 4 words wide and the bus is still 1 word
wide, the miss penalty would be,
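The miss penalty for the different memory organizations can be compared with a small calculation. The timings below (1 bus cycle to send the address, 15 per DRAM access, 1 per bus transfer) and the 4-word block are illustrative assumptions, not values taken from the slides.

```python
ADDRESS_CYCLES = 1    # assumed: send the address
ACCESS_CYCLES = 15    # assumed: one DRAM access
TRANSFER_CYCLES = 1   # assumed: move one bus-width of data


def miss_penalty(block_words, memory_width_words, bus_width_words, banks=1):
    """Bus cycles to fetch one block; with banks > 1 the banks are read in parallel."""
    words_per_access = memory_width_words * banks
    accesses = -(-block_words // words_per_access)    # ceiling division
    transfers = -(-block_words // bus_width_words)
    return ADDRESS_CYCLES + accesses * ACCESS_CYCLES + transfers * TRANSFER_CYCLES


print(miss_penalty(4, 1, 1))            # one-word-wide memory and bus: 65
print(miss_penalty(4, 4, 4))            # four-word-wide memory and bus: 17
print(miss_penalty(4, 4, 1))            # four-word-wide access, one-word bus: 20
print(miss_penalty(4, 1, 1, banks=4))   # four interleaved banks, one-word bus: 20
```

Under these assumptions, interleaving gets close to the wide-memory penalty while keeping the narrow bus.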
• In fully associative
• As many comparators as there are blocks
Implementation of 4-way set associative cache
Replacement in Fully Associative and Set Associative Caches
• Which block should be replaced when there is a miss in a set-associative
or fully associative cache?
Eg: Consider 16 blocks grouped into 4 sets in a 4-way set
associative cache. Each block contains 1 byte.
4-Way Set Associative
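A minimal sketch of replacement in a cache with 4 sets of 4 ways (16 blocks in total). It assumes least-recently-used (LRU) replacement, which is one common policy; the slides do not say which policy is used. Only tags are tracked, not data, and the class name is hypothetical.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Tag store of a set-associative cache with LRU replacement (no data kept)."""

    def __init__(self, num_sets=4, ways=4):
        self.num_sets, self.ways = num_sets, ways
        # One OrderedDict per set, ordered least- to most-recently used.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, block_address):
        """Return True on a hit, False on a miss (filling or replacing a block)."""
        index, tag = block_address % self.num_sets, block_address // self.num_sets
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)        # mark as most recently used
            return True
        if len(ways) == self.ways:
            ways.popitem(last=False)     # evict the least recently used tag
        ways[tag] = True
        return False


cache = SetAssociativeCache()
# Blocks 0, 4, 8, 12 and 16 all map to set 0; the fifth one forces a replacement.
trace = [0, 4, 8, 12, 0, 16, 4]
print([cache.access(b) for b in trace])
# [False, False, False, False, True, False, False]
```

Block 4 misses at the end because it was the least recently used block in set 0 when block 16 arrived.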
• Each entry in the table contains the physical page number for that virtual
page if the page is currently in memory.
• Page table register : To indicate the location of the page table in memory,
the hardware includes a register that points to the start of the page table
• Assume for now that the page table is in a fixed and contiguous area of
memory.
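A minimal sketch of address translation through a flat page table, assuming 4 KB pages and a table held as a list indexed by virtual page number. The page size, the table contents and the `PageFault` name are illustrative assumptions.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (2^12)


class PageFault(Exception):
    """Raised when the requested virtual page is not in memory."""


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address via a flat page table.

    page_table is indexed by virtual page number; an entry is the physical
    page number, or None when the page is not currently in memory.
    """
    virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
    if virtual_page >= len(page_table) or page_table[virtual_page] is None:
        raise PageFault(f"virtual page {virtual_page} is not in memory")
    return page_table[virtual_page] * PAGE_SIZE + offset


page_table = [5, None, 7]                   # hypothetical contents
print(hex(translate(0x2ABC, page_table)))   # virtual page 2 -> physical page 7: 0x7abc
print(hex(translate(0x0123, page_table)))   # virtual page 0 -> physical page 5: 0x5123
```

The page offset passes through unchanged; only the page number is replaced.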
TLB
• A cache that keeps track of recently used address
mappings to avoid an access to the page table.
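A minimal sketch of a TLB as a small cache in front of the page table. It assumes a fully associative TLB with LRU replacement and a page table held as a dict; the entry count, replacement policy and names are illustrative assumptions.

```python
from collections import OrderedDict


class TLB:
    """Small fully associative cache of virtual-page -> physical-page mappings."""

    def __init__(self, page_table, entries=4):
        self.page_table = page_table      # dict: virtual page -> physical page
        self.entries = entries
        self.cache = OrderedDict()        # least recently used mapping first
        self.hits = self.misses = 0

    def lookup(self, virtual_page):
        if virtual_page in self.cache:
            self.hits += 1
            self.cache.move_to_end(virtual_page)
            return self.cache[virtual_page]
        self.misses += 1                  # only a miss touches the page table
        physical_page = self.page_table[virtual_page]
        if len(self.cache) == self.entries:
            self.cache.popitem(last=False)
        self.cache[virtual_page] = physical_page
        return physical_page


tlb = TLB({0: 5, 1: 9, 2: 7}, entries=2)
print([tlb.lookup(p) for p in (0, 0, 1, 0, 2, 1)], tlb.hits, tlb.misses)
# [5, 5, 9, 5, 7, 9] 2 4
```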
Error Detection Code: Parity Bit
• When a word is written into memory, the parity bit is also
written
• Then, when the word is read out, the parity bit is read and
checked.
• If the parity of the memory word and the stored parity bit
do not match, an error has occurred
• A 1-bit parity scheme reliably detects only a single-bit error in a
data item (more generally, any odd number of bit errors)
• If there are 2 bits of error, then a 1-bit parity scheme will
not detect any errors, since the parity will match the data
with two errors
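The write, read and check steps can be sketched as follows, assuming even parity (the slides do not say whether even or odd parity is used). Function names are hypothetical.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of 1s, else 0."""
    return bin(word).count("1") % 2


def check(word: int, stored_parity: int) -> bool:
    """Return True if the word read back agrees with its stored parity bit."""
    return parity_bit(word) == stored_parity


word = 0b1011_0010
p = parity_bit(word)                     # written alongside the word
print(check(word, p))                    # no error: True
print(check(word ^ 0b0000_0100, p))      # one bit flipped: False (detected)
print(check(word ^ 0b0001_0100, p))      # two bits flipped: True (undetected)
```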
What is a code word?
• An n-bit unit containing data and check-bits is often
referred to as an n-bit codeword
Continued…
• In the above table, moving from one codeword to another
requires changing at least 2 bits, i.e. the minimum distance
(Hamming distance) is 2
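A minimal sketch of measuring the minimum Hamming distance of a code. The table referred to above is not reproduced here, so the codewords below (every 2-bit data value followed by an even-parity bit) are an assumed stand-in with the same minimum distance of 2.

```python
from itertools import combinations


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Assumed stand-in table: 2 data bits followed by one even-parity bit.
codewords = [(d << 1) | (bin(d).count("1") % 2) for d in range(4)]
print([format(c, "03b") for c in codewords])   # ['000', '011', '101', '110']
print(min(hamming_distance(a, b) for a, b in combinations(codewords, 2)))  # 2
```

A minimum distance of 2 means any single-bit error produces a non-codeword, so it can be detected but not corrected.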
• Example:
• If d is 7, the smallest value of r that satisfies the above relation is 4
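The relation itself is not reproduced here. The sketch below assumes it is the usual single-error-correction condition 2^r >= d + r + 1, where d is the number of data bits and r the number of redundant bits; that form gives r = 4 for d = 7, matching the example.

```python
def min_redundant_bits(d: int) -> int:
    """Smallest r with 2**r >= d + r + 1 (assumed form of the relation)."""
    r = 0
    while 2 ** r < d + r + 1:
        r += 1
    return r


print(min_redundant_bits(7))   # 4
print(min_redundant_bits(4))   # 3
```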
Continued…
Example
• Let us consider 4 data bits. So r will be 3
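• Bit layout (position 7 down to position 1): d4 d3 d2 r4 d1 r2 r1
• Data bits (d4 d3 d2 d1): 1 0 1 0
• Codeword after adding the redundant bits, chosen so that each parity
check comes out to ‘0’: 1 0 1 0 0 1 0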
Continued…
• If received data is
1110010
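A minimal sketch of encoding the 4 data bits and locating the error in the received word. It assumes the bit layout d4 d3 d2 r4 d1 r2 r1 (positions 7 down to 1) with even parity, where r1, r2 and r4 sit at positions 1, 2 and 4. Function names are hypothetical.

```python
def hamming74_encode(d4, d3, d2, d1):
    """Return the codeword as a list for positions 7..1: d4 d3 d2 r4 d1 r2 r1."""
    r1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    r2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    r4 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [d4, d3, d2, r4, d1, r2, r1]


def hamming74_correct(received):
    """Return (error_position, corrected_word); position 0 means no error."""
    bit = {pos: received[7 - pos] for pos in range(1, 8)}   # position -> bit
    c1 = bit[1] ^ bit[3] ^ bit[5] ^ bit[7]
    c2 = bit[2] ^ bit[3] ^ bit[6] ^ bit[7]
    c4 = bit[4] ^ bit[5] ^ bit[6] ^ bit[7]
    position = c4 * 4 + c2 * 2 + c1      # the failed checks spell out the position
    corrected = list(received)
    if position:
        corrected[7 - position] ^= 1     # flip the erroneous bit back
    return position, corrected


print(hamming74_encode(1, 0, 1, 0))               # [1, 0, 1, 0, 0, 1, 0]
print(hamming74_correct([1, 1, 1, 0, 0, 1, 0]))   # (6, [1, 0, 1, 0, 0, 1, 0])
```

For the received word 1110010, the checks for r2 and r4 fail, which points to position 6; flipping that bit restores 1010010.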