Unit-4 Memory Systems
• For example, a computer that generates 16-bit addresses is capable of addressing up to 2^16 = 64K
memory locations.
• Machines whose instructions generate 32-bit addresses can utilize a memory that contains up to 2^32 = 4G memory locations.
• The number of locations represents the size of the address space of the computer.
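The address-space arithmetic above can be checked with a short sketch (the helper name is my own, not from the text):

```python
# Address-space size as a function of address width, assuming one
# addressable location per distinct address value.

def address_space(bits):
    """Number of distinct locations a `bits`-wide address can name."""
    return 2 ** bits

print(address_space(16))  # 65536 locations (64K)
print(address_space(32))  # 4294967296 locations (4G)
```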
• The memory is usually designed to store and retrieve data in word-length quantities; memory transfers usually occur at word granularity.
The connection between the processor and its memory consists of address, data, and control lines
Measures for memory speed:
Memory access time: Time elapsed between the initiation of an operation to transfer data from/to
memory and the completion of that operation
Memory cycle time: Minimum time required between the initiation of two successive memory accesses, for example, the time between two successive Read operations.
• Most important issue in memory systems design is to provide a computer with as large and fast a
memory as possible, within a given cost target
• Cost of a memory device depends on both its capacity (total no. of bits) and its density (bits per unit
area)
Random Access Memory (RAM)
A memory unit is called a random-access memory (RAM), if the access time to any location is the same,
independent of the location’s address
Cache and Virtual Memory
Cache memory is a chip-based computer component that makes retrieving data from the
computer's memory more efficient. It acts as a temporary storage area that the computer's
processor can retrieve data from easily. This temporary storage area, known as a cache, is more
readily available to the processor than the computer's main memory.
The virtual memory is a logical memory. It is a memory management technique handled by the
operating system. Virtual memory allows the programmer to use more memory for a program
than the available main memory.
For example, assume that a computer has a main memory of 4GB and a virtual memory of 16GB.
The user can use this 16GB to execute the program. Therefore, the user can execute programs
which require more memory than the capacity of the main memory.
Block Transfers
The data move frequently between the main memory and the cache and between the main
memory and the disk. These transfers do not occur one word at a time. Data are always
transferred in contiguous blocks involving tens, hundreds, or thousands of words.
Data transfers between the main memory and high-speed devices such as a graphic display or an
Ethernet interface also involve large blocks of data.
Semiconductor RAM Memories
• Semiconductor memories are volatile storage devices that hold programs and data only as long as the power supply to the system is on.
• The cycle time of these semiconductor memories ranges from about 10 ns to 100 ns.
• The cycle time is the time from the start of one access to the start of the next access to the
memory.
• Almost all the memory units are made of semiconductor material, especially silicon.
Semiconductor memories are used for storing digital data as they can be accessed faster.
Internal Organization of Memory Chips
• Memory cells are usually organized in the form of an array, in which each cell is capable of
storing one bit of information.
• Each row of cells constitutes a memory word, and all cells of a row are connected to a
common line referred to as the word line, which is driven by the address decoder on the
chip.
• The cells in each column are connected to a Sense/Write circuit by two bit lines, and the
Sense/Write circuits are connected to the data input/output lines of the chip
The figure below represents the memory circuit with 16 words (W0 to W15) where each word
has a word length of 8 bits (b0 to b7). So, this is referred to as 16×8 memory organization.
The data input and the data output of each Sense/Write circuit are connected to a single bidirectional data
line that can be connected to the data lines of a computer.
Two control lines, R/W and CS, are provided. The R/W (Read/Write) input specifies the required
operation, and the CS (Chip Select) input selects a given chip in a multichip memory system.
16×8 Memory Circuit (128 memory cells) requires 14 external connections:
for address - 4
for data - 8
for control - 2
• It also needs 2 lines for power supply and ground connection.
Consider now a slightly larger memory circuit, one that has 1K (1024) memory cells.
This circuit can be organized as a 128 × 8 memory, requiring a total of 19 external connections.
Alternatively, the same number of cells can be organized into a 1K × 1 format. In this case, a 10-bit
address is needed, but there is only one data line, resulting in 15 external connections
The required 10-bit address is divided into two groups of 5 bits each to form the row and
column addresses for the cell array.
A row address selects a row of 32 cells, all of which are accessed in parallel. But, only one of
these cells is connected to the external data line, based on the column address.
Large chips have essentially the same organization as Figure 8.3, but use a larger memory
cell array and have more external connections.
For example, a 1G-bit chip may have a 256M × 4 organization, in which case a 28-bit
address is needed and 4 bits are transferred to or from the chip.
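The pin-counting arithmetic in the examples above can be sketched as follows (the function is my own; it folds in the 2 power/ground lines, whereas the text counts 14 signal connections for the 16×8 chip and adds the 2 power lines separately):

```python
import math

def external_connections(num_words, bits_per_word):
    """Total pins: address + data + 2 control (R/W, CS) + 2 power/ground."""
    address_lines = math.ceil(math.log2(num_words))
    return address_lines + bits_per_word + 2 + 2

print(external_connections(16, 8))        # 16x8:   4 + 8 + 2 + 2 = 16
print(external_connections(128, 8))       # 128x8:  7 + 8 + 2 + 2 = 19
print(external_connections(1024, 1))      # 1Kx1:  10 + 1 + 2 + 2 = 15
print(external_connections(256 * 2**20, 4))  # 256Mx4: 28 + 4 + 2 + 2 = 36
```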
There are two types of semiconductor memories:
1. Static Random Access Memory (SRAM)
2. Dynamic Random Access Memory (DRAM)
   I. Synchronous DRAM
   II. Asynchronous DRAM
• Present-day main memories are designed with DRAM.
• Caches are designed with SRAM.
STATIC MEMORIES
• In static RAM (SRAM), the content of a memory cell is retained as long as power is supplied to the memory chip. If the power supply to the chip is interrupted, the contents of the memory cells are lost.
• When power is restored to the chip, there is no guarantee that the memory cells hold the same contents they had before the interruption. This is why static RAM is volatile in nature.
• Two inverters are cross-connected to form a latch. The latch is connected to two bit lines by
transistors T1 and T2. These transistors act as switches that can be opened or closed under
control of the word line.
• When the word line is at ground level, the transistors are turned off and the latch retains its
state.
• For example, if the logic value at point X is 1 and at point Y is 0, this state is maintained as
long as the signal on the word line is at ground level. Assume that this state represents the
value 1.
Read Operation
In order to read the state of the SRAM cell, the word line is activated to close switches T1 and
T2. If the cell is in state 1, the signal on bit line b is high and the signal on bit line b’ is low. The
opposite is true if the cell is in state 0. Thus, b and b’ are always complements of each other.
Write Operation
During a Write operation, the Sense/Write circuit drives bit lines b and b’ , instead of sensing
their state. It places the appropriate value on bit line b and its complement on b’ and activates
the word line. This forces the cell into the corresponding state, which the cell retains when the
word line is deactivated.
CMOS Cell
CMOS SRAMs, i.e., complementary metal-oxide-semiconductor memories, consume very little power, because power is supplied to a cell only when the cell is being accessed.
6-Transistor Static Memory Cell or CMOS
• A 1-bit SRAM cell uses six transistors in modern SRAM implementations.
• Transistor pairs (T3, T5) and (T4, T6) form the inverters in the latch.
• X → input of T4 and T6.
• Y → input of T3 and T5.
In state 0:
The voltage at X is low and the voltage at Y is high, so T4 and T5 are conducting.
When the word line is activated, T1 and T2 turn on; bit line b carries 0 and b′ carries 1.
In state 1:
The voltage at X is high and the voltage at Y is low, so T3 and T6 are conducting.
When the word line is activated, T1 and T2 turn on; bit line b carries 1 and b′ carries 0.
A major advantage of CMOS SRAMs is their very low power consumption, because current
flows in the cell only when the cell is being accessed.
Otherwise, T1, T2, and one transistor in each inverter are turned off, ensuring that there is no
continuous electrical path between Vsupply and ground.
Advantages:
• SRAM can be accessed quickly.
• SRAMs are fast, but their cells require several transistors.
Disadvantages:
• SRAMs are volatile memories.
• Their contents are lost when power is interrupted.
Dynamic RAMs
Though the static RAM is faster its memory cells require several transistors which makes it
expensive. So, to design a less expensive and higher density RAM we have to implement it
using simpler cells.
However, such a simpler cell cannot hold its data for long unless the cell is accessed (for read or write) frequently enough; its contents must be refreshed periodically. Memory circuits implemented using such simpler cells are referred to as dynamic RAMs.
An example of a dynamic memory cell that consists of a capacitor, C, and a transistor, T, is
shown in Figure.
To store information in this cell, transistor T is turned on and an appropriate voltage is applied
to the bit line. This causes a known amount of charge to be stored in the capacitor.
•In dynamic RAM the information is stored in the memory cell in the form of charge stored on
a capacitor.
•So it requires periodic refresh
The capacitor's charge is restored (refreshed) whenever the contents are read from the cell or new information is written to the cell.
• Less expensive than SRAM.
• Requires less hardware (one transistor and one capacitor per cell).
• A sense amplifier connected to the bit line senses the charge stored in the capacitor.
Read Operation:
• After the transistor is turned on, the sense amplifier connected to the bit line senses the charge stored in the capacitor.
• If the charge is above the threshold, the bit line is driven to a high voltage, which represents logic 1.
• If the charge is below the threshold, the bit line is driven to a low voltage, which represents logic 0.
Write Operation:
• The transistor is turned on.
• Depending on the value to be written (0 or 1), an appropriate voltage is applied to the bit line.
• The capacitor is charged to the required voltage level.
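The threshold-based read (with its refresh-on-read side effect) can be modeled as a toy sketch; the names and the 0-to-1.0 charge scale are illustrative assumptions, not from the text:

```python
# Toy model of a DRAM cell read: the sense amplifier compares the
# capacitor charge against a threshold and restores the full value,
# which is how reading also refreshes the cell.

THRESHOLD = 0.5       # sense-amplifier decision point (assumed)
FULL, EMPTY = 1.0, 0.0

def read_cell(charge):
    """Return (sensed bit, restored charge after the read)."""
    if charge > THRESHOLD:
        return 1, FULL    # bit line driven high; capacitor recharged
    return 0, EMPTY       # bit line driven low; capacitor discharged

bit, charge = read_cell(0.8)   # charge leaked from 1.0, still above threshold
print(bit, charge)             # 1 1.0
```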
Dynamic Random Access Memory cell organization
Address Input (A24–A11 and A10–A0):
The memory address is divided into two parts. The higher-order address bits (A24–A11) are used to select a row
in the cell array, while the lower-order bits (A10–A0) are used to select a column.
Row Address Latch:
When the Row Address Strobe (RAS) signal is activated (logic low), the higher-order address bits (A24–A11) are
latched into the Row Address Latch.
Row Decoder:
The Row Decoder takes the latched row address and activates one of the rows in the Cell Array.
Cell Array:
The cell array is organized as 16,384 rows by 2,048 bytes (or 2 KB) per row. This array stores the actual data in
the memory.
After selecting the appropriate row, the entire row of memory cells becomes accessible for reading or writing.
Column Address Latch:
When the Column Address Strobe (CAS) signal is activated (logic low), the lower-order address bits (A10–A0) are
latched into the Column Address Latch.
Column Decoder:
The Column Decoder takes the latched column address and selects a particular column within the activated
row of the Cell Array.
Sense/Write Circuits:
• Once the row and column are selected, the Sense/Write Circuits are used to either read the data from the
memory cell or write data to it.
• The Read/Write (R/W) signal controls whether the data is being read from or written to the selected
memory cell.
• The Chip Select (CS) signal determines whether the memory chip is enabled for a read or write operation.
Data I/O Lines (D0 to D7):
• After selecting the row and column, and enabling the chip, the data is transferred through the Data I/O
lines (D0 to D7). This is an 8-bit wide data path.
• During a read operation, data is read out of the selected cell and placed on the D0 to D7 lines.
• During a write operation, data from the D0 to D7 lines is written into the selected cell.
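The row/column multiplexing of the address described above can be sketched as a bit-field split; the 14-bit row and 11-bit column widths follow from the 16,384 × 2,048 organization in the text, while the helper name and the example address are my own:

```python
ROW_BITS = 14   # 16,384 rows   (address bits A24-A11)
COL_BITS = 11   # 2,048 columns (address bits A10-A0)

def split_address(addr):
    """Split a 25-bit address into the (row, column) pair that the
    RAS/CAS sequence latches in two steps."""
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)
    col = addr & ((1 << COL_BITS) - 1)
    return row, col

row, col = split_address(0x12345)
print(row, col)   # 36 837  (0x12345 = 36*2048 + 837)
```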
Synchronous DRAMs
In the early 1990s, developments in memory technology resulted in DRAMs whose operation
is synchronized with a clock signal. Such memories are known as synchronous DRAMs
(SDRAMs).
Their structure is shown in Figure. The cell array is the same as in asynchronous DRAMs. The
distinguishing feature of an SDRAM is the use of a clock signal, the availability of which makes
it possible to incorporate control circuitry on the chip that provides many useful features.
For example, SDRAMs have built-in refresh circuitry, with a refresh counter to provide the
addresses of the rows to be selected for refreshing.
As a result, the dynamic nature of these memory chips is almost invisible to the user.
The address and data connections of an SDRAM may be buffered by means of registers
•Operation is directly synchronized with processor clock signal.
•The outputs of the sense circuits are connected to a latch.
•During a Read operation, the contents of the cells in a row are loaded onto the latches.
•During a refresh operation, the contents of the cells are refreshed without changing the
contents of the latches.
•Data held in the latches that correspond to the selected columns are transferred to the output.
•In burst mode, successive columns are selected using a column address counter and the clock, so the CAS signal need not be generated externally for each column. New data are placed on the data lines at each rising edge of the clock.
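The burst-mode read above can be sketched with a simple model in which an on-chip counter supplies successive column addresses from a single starting column (an illustrative model, not timing-accurate):

```python
# After a row is opened and the starting column is latched, the column
# address counter steps through `burst_length` consecutive columns, one
# per clock edge, without CAS being reasserted.

def burst_read(row, start_col, burst_length, mem):
    """Return burst_length consecutive words beginning at start_col."""
    return [mem[row][start_col + i] for i in range(burst_length)]

mem = {0: list(range(100, 120))}   # one open row with sample data
print(burst_read(0, 4, 4, mem))    # [104, 105, 106, 107]
```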
READ-ONLY MEMORIES
Both static and dynamic RAM chips are volatile, which means that they retain information only
while power is turned on. There are many applications requiring memory devices that retain
the stored information when power is turned off.
What is ROM?
• ROM, which stands for Read-Only Memory: the user can only read, not write.
• It is non-volatile memory (information is stored permanently and cannot be erased).
• The memory in ROM is written at the time of manufacture.
• ROM chips are found in computers and other electronic products such as washing machines, digital cameras, microwave ovens, etc.
• A logic value 0 is stored in the cell if the transistor is connected to ground at point P; otherwise, a 1 is stored. The bit line is connected through a resistor to the power supply. To read the state of the cell, the word line is activated.
• Data are written into a ROM only once at the time of manufacture.
Types of ROM:
– Programmable ROM (PROM)
– Erasable Programmable ROM (EPROM)
– Electrically Erasable Programmable ROM (EEPROM)
– Flash ROM (memory)
Programmable Read Only Memory (PROM):
• PROM is a blank version of ROM. It is manufactured as blank memory and programmed after
manufacturing.
• The user has the opportunity to program it or to add data and instructions as per his
requirement. Due to this reason, it is also known as the user-programmed ROM as a user can
program it.
• To write data onto a PROM chip; a device called PROM programmer or PROM burner is used.
• Once it is programmed, the data cannot be modified later, so it is also called as one-time
programmable device.
• PROMs provide flexibility and convenience not available with ROMs.
• PROMs provide a faster and considerably less expensive approach because they can be
programmed directly by the user
Uses: It is used in cell phones, video game consoles, medical devices, RFID tags, and more.
Erasable and Programmable Read Only Memory (EPROM)
• EPROM is a type of ROM that can be reprogrammed and erased many times.
• Data are erased by exposure to ultraviolet light, which can take up to 40 minutes.
• A special device called a PROM programmer or PROM burner is needed to reprogram the EPROM.
• It provides considerable flexibility during the development phase of digital systems.
Uses:
Early computers
Game cartridges
Embedded systems & Micro Controllers
Electrically Erasable and Programmable Read Only Memory (EEPROM):
• Data can be erased electrically by applying an electric field.
• EEPROM is a type of read-only memory that can be erased and reprogrammed repeatedly, up to about 10,000 times.
• It is erased and reprogrammed electrically, without using ultraviolet light. Access time is between 45 and 200 nanoseconds.
• It can erase all locations or only selected locations of data, and can then be reprogrammed.
Direct Memory Access (DMA)
• In program-controlled I/O, a data transfer instruction loads data from an I/O device into a processor register; the data that are read are then stored into a memory location.
• The reverse process takes place for transferring data from memory to an I/O device.
• Data transfer instruction is executed only after the processor determines that the I/O device is ready,
either by polling its status register or by waiting for an interrupt request.
• In either case, several program instructions must be executed for data transfer.
• In both cases, the processor is involved in the data transfer between memory and I/O devices.
DMA: an alternative approach to transfer blocks of data directly between the main memory and I/O
devices without processor intervention.
Direct Memory Access (Contd.)
• The unit that controls DMA transfer is referred to as DMA controller.
• The DMA controller performs data transfer between main memory & I/O devices without
intervention of processor.
• To initiate the data transfer, the processor sends the DMA controller the starting address, the number of words in the block, and the direction of the transfer.
• The DMA controller then proceeds to perform the requested operation.
• When entire block has been transferred, it informs the processor by raising an interrupt.
DMA Controller Registers:
• Two registers are used for storing the starting address and the word count.
• The third register contains status and control flags.
• The R/W bit determines the direction of the operation. When this bit is 1, the controller performs a Read operation, i.e., it transfers data from the memory to the I/O device.
• Otherwise, it performs a Write operation.
• When the controller has completed transferring a block of data, it sets the DONE flag to 1.
• Bit 30 is the Interrupt-Enable (IE) flag. When this flag is set to 1, it causes the controller to raise an interrupt after it has completed transferring a block of data.
• The controller sets the IRQ bit to 1 when it has requested an interrupt.
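The status/control register above can be sketched as bit flags. Bit 30 (IE) comes from the text; the positions chosen here for R/W, DONE, and IRQ are assumptions for illustration only:

```python
# Bit-flag view of the DMA controller's status/control register.
RW   = 1 << 0    # 1 = Read (memory -> I/O device), 0 = Write (assumed position)
DONE = 1 << 1    # set by controller when the block transfer completes (assumed)
IE   = 1 << 30   # interrupt-enable flag (bit 30, per the text)
IRQ  = 1 << 31   # set when the controller has requested an interrupt (assumed)

def finish_transfer(status):
    """Controller action at end of a block: set DONE; raise IRQ if IE is set."""
    status |= DONE
    if status & IE:
        status |= IRQ
    return status

s = finish_transfer(RW | IE)          # read transfer with interrupts enabled
print(bool(s & DONE), bool(s & IRQ))  # True True
```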
Figure shows how DMA controllers may be used in a computer system.
• One DMA controller connects a high-speed Ethernet to the computer’s I/O bus.
• The disk controller, which controls two disks, also has DMA capability and provides two DMA
channels.
• It can perform two independent DMA operations, as if each disk had its own DMA controller.
• The registers needed to store the memory address, the word count, and so on, are
duplicated, so that one set can be used with each disk.
Memory Hierarchy
[Figure: the memory hierarchy. From top to bottom: processor registers, primary (L1) cache, secondary (L2) cache, main memory, and magnetic-disk secondary memory. Moving down the hierarchy, size increases while speed and cost per bit decrease.]
• Fastest access is to the data held in processor registers. Registers are at the top of the memory
hierarchy.
• Relatively small amount of memory that can be implemented on the processor chip. This is
processor cache.
• Two levels of cache. Level 1 (L1) cache is on the processor chip. Level 2 (L2) cache is in between
main memory and processor.
• Next level is main memory, implemented as DIMMs. Much larger, but much slower than cache
memory.
• Next level is magnetic disk: a huge amount of inexpensive storage, but slower than main memory.
• Speed of memory access is critical, the idea is to bring instructions and data that will be used in
the near future as close to the processor as possible.
CACHE MEMORIES
• Processor is much faster than the main memory.
• As a result, the processor has to spend much of its time waiting while instructions and data are being fetched from the
main memory.
• When the processor issues a Read request, a block of words is transferred from the main memory to the cache, one word at a time.
• Subsequent references to the data in this block of words are found in the cache.
• At any given time, only some blocks in the main memory are held in the cache. Which
blocks in the main memory are in the cache is determined by a “mapping function”.
• When the cache is full, and a block of words needs to be transferred from the main
memory, some block of words in the cache must be replaced. This is determined by a
“replacement algorithm”.
Cache HIT
• Existence of a cache is transparent to the processor.
• The processor issues Read and Write requests in the same manner.
• If the data is in the cache it is called a Read or Write hit.
• Read hit: The data is obtained from the cache.
• Write hit:
Cache has a replica of the contents of the main memory.
Contents of the cache and the main memory may be updated simultaneously. This is the write-through protocol.
Update the contents of the cache, and mark it as updated by setting a bit known as the dirty bit
or modified bit.
The contents of the main memory are updated when this block is replaced. This is write-back or
copy-back protocol.
Cache MISS
• If the data is not present in the cache, then a Read miss or Write miss occurs.
• READ MISS:
Block of words containing this requested word is transferred from the memory.
After the block is transferred, the desired word is forwarded to the processor.
The desired word may also be forwarded to the processor as soon as it is transferred without
waiting for the entire block to be transferred. This is called load-through or early-restart.
• WRITE-MISS:
• There are two protocols for Write MISS
If the write-through protocol is used, the contents of the main memory are updated directly.
If write-back protocol is used, the block containing the addressed word is first brought into the
cache. The desired word is overwritten with new information.
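The write-through and write-back policies above can be sketched with a minimal word-granularity model (a real cache tracks whole blocks; class and method names are my own). It only illustrates *when* main memory gets updated:

```python
class Cache:
    """Toy cache illustrating write-through vs. write-back timing."""

    def __init__(self, write_back=False):
        self.write_back = write_back
        self.data = {}      # addr -> value currently cached
        self.dirty = set()  # addrs modified but not yet written to memory
        self.memory = {}    # stand-in for main memory

    def write(self, addr, value):
        self.data[addr] = value
        if self.write_back:
            self.dirty.add(addr)        # defer the memory update (dirty bit)
        else:
            self.memory[addr] = value   # write-through: update both at once

    def evict(self, addr):
        if addr in self.dirty:          # write-back happens at replacement
            self.memory[addr] = self.data[addr]
            self.dirty.discard(addr)
        self.data.pop(addr, None)

wb = Cache(write_back=True)
wb.write(10, 99)
print(wb.memory.get(10))   # None: memory not yet updated
wb.evict(10)
print(wb.memory.get(10))   # 99: updated when the block is replaced
```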
CACHE MEMORY MAPPING TECHNIQUES:
Mapping functions determine how memory blocks are placed in the cache.
A simple processor example:
Cache consisting of 128 blocks of 16 words each.
Total size of cache is 2048 (2K) words.
Main memory is addressable by a 16-bit address.
Main memory has 64K words.
Main memory has 4K blocks of 16 words each.
Three mapping functions:
Direct mapping
Associative mapping
Set-associative mapping.
DIRECT MAPPING
It is the simplest technique.
It maps each block of main memory into only one possible cache line.
It is less expensive and fast, since only a single tag comparison is required when searching for a word.
Associative mapping
A block of main memory can map to any line of the cache that is freely available at the moment
Set-associative mapping
Cache lines are grouped into sets
A particular block of main memory can map to only one particular set of cache.
Direct Mapping Cache:
• A particular block of main memory can map to only one particular line of the cache:
Cache line number = (main memory block address) modulo (number of lines in the cache)
A simple processor example:
• Cache consisting of 128 blocks of 16 words each; words per block = 16 = 2^4.
• Main memory is addressable by a 16-bit address (64K words), so main memory contains 2^16 / 2^4 = 2^12 = 4096 blocks (Block 0 to Block 4095).
• Cache size = 128 × 16 = 2048 = 2^11 words.
Thus, whenever one of the main memory blocks 0, 128, 256, ... is loaded into the cache, it is stored in cache block 0; blocks 1, 129, 257, ... are stored in cache block 1, and so on.
The 16-bit memory address is divided into three fields (Tag | Block | Word = 5 | 7 | 4 bits):
• Word field (4 bits): selects one of the 16 words in a block.
• Block field (7 bits): when a new block is brought into the cache, these next 7 bits determine which cache block the new block is placed in.
• Tag field (high-order 5 bits): identifies which of the 32 main-memory blocks that map to this cache position is currently present in the cache.
Associative mapping:
• A main-memory block can be placed in any cache block, so the address contains only two fields (Tag | Word = 12 | 4 bits).
• The high-order 12 bits, or tag bits, identify a memory block when it is resident in the cache.
• Replacement algorithms can be used to replace an existing block in the cache when the cache is full.
[Figures: the direct-mapped and associative cache organizations for this example, showing cache blocks 0-127 and main memory blocks 0-4095, each with its tag.]
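The 5 | 7 | 4 field split of the direct-mapped example can be sketched as a bit-field extraction (the function name is my own):

```python
# 16-bit address = 5-bit tag | 7-bit block (index) | 4-bit word (offset).

def split_direct(addr):
    word  = addr & 0xF          # low 4 bits: word within the block
    block = (addr >> 4) & 0x7F  # next 7 bits: cache block number
    tag   = addr >> 11          # high 5 bits: tag
    return tag, block, word

# Main-memory block 128 starts at address 128*16 = 2048; it maps to
# cache block 0 with tag 1, while main-memory block 0 maps to the same
# cache block with tag 0 - the tag tells them apart.
print(split_direct(2048))   # (1, 0, 0)
print(split_direct(0))      # (0, 0, 0)
```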
What happens if the data in the disk and main memory changes and the write-back protocol is being used?
• In this case, the data in the cache may also have changed and is indicated by the dirty bit.
• The copies of the data in the cache, and the main memory are different. This is called the cache coherence
problem.
• One option is to force a write-back before the main memory is updated from the disk.
Replacement Algorithms
• In a direct mapping method, the position of each block is pre-determined and there is no need of
replacement strategy.
• In Associative and Set- Associative methods, the block position is not pre-determined. If the Cache is
full and new blocks are brought into the Cache, then the Cache-controller must decide which of the old
block has to be replaced.
• This is an important issue, because the decision can be a strong determining factor in system
performance
• The most commonly used algorithm is LRU (Least Recently Used). This algorithm chooses the least recently used block for replacement.
• To use the LRU algorithm, the cache controller must track references to all blocks as computation
proceeds.
• Suppose it is required to track the LRU block of a four-block set in a set-associative cache.
• A 2-bit counter can be used for each block.
Cache Behavior:
• On a Hit:
– When a cache hit occurs for a referenced block:
• The counter of that block is reset to 0.
• Counters of blocks that had lower values than the referenced block's original value are incremented by 1.
• The counters of blocks with values equal to or higher than the referenced block remain unchanged.
• On a Miss (when the set is not full):
– If there is a cache miss and the set can accommodate a new block:
• The counter for the new block is set to 0.
• All other counters in the set are incremented by 1.
• On a Miss (when the set is full):
– If a miss occurs and the cache set is full:
• The block with a counter value of 3 (which is the highest in a 2-bit counter system) is evicted.
• The new block from main memory replaces it, with its counter set to 0.
• The remaining counters for the other three blocks are incremented by 1.
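The 2-bit counter scheme above, for one four-block set, can be simulated directly (counter 0 = most recently used, 3 = least; only the hit and full-miss cases are shown, and the function names are my own):

```python
def hit(counters, ref):
    """On a hit: referenced block's counter -> 0; counters that were
    lower than its old value are incremented; the rest are unchanged."""
    old = counters[ref]
    for b in counters:
        if counters[b] < old:
            counters[b] += 1
    counters[ref] = 0

def miss_full(counters, new_block):
    """On a miss with a full set: evict the block whose counter is 3,
    install the new block with counter 0, increment the other three."""
    victim = next(b for b, c in counters.items() if c == 3)
    del counters[victim]
    for b in counters:
        counters[b] += 1
    counters[new_block] = 0
    return victim

counters = {"A": 0, "B": 1, "C": 2, "D": 3}   # A most recent, D least
hit(counters, "C")
print(counters)                  # {'A': 1, 'B': 2, 'C': 0, 'D': 3}
print(miss_full(counters, "E"))  # evicts 'D'
print(counters)                  # {'A': 2, 'B': 3, 'C': 1, 'E': 0}
```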
Performance considerations
• Performance of a processor depends on:
– How fast machine instructions can be brought into the processor for execution.
• When a Cache is used , the processor is able to access instructions and data more quickly, when that data is
available in Cache.
• Therefore, the extent to which caches improve performance depends on how frequently the requested instructions and data are found in the cache.
• Hit rate and miss penalty: when the CPU references a word and finds it in the cache, this is called a hit, i.e., a successful access to data in the cache.
• Hit rate = (number of hits) / (total number of references to memory).
• Miss penalty: the extra time needed, when a miss occurs, to bring the desired information into the cache from main memory.
• Average memory access time: t_avg = hC + (1 − h)M, where h is the hit rate, C the cache access time, and M the miss penalty.
• So the miss rate = 1 − h.
• One possibility is to make the cache larger, but this increases the cost.
• In high-performance processors, two levels of caches are normally used:
t_avg = h1C1 + (1 − h1)h2C2 + (1 − h1)(1 − h2)M
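A quick numeric check of the two formulas above (the hit rates and access times are made-up example values, in cycles):

```python
def t_avg_1(h, C, M):
    """Average access time with one cache level."""
    return h * C + (1 - h) * M

def t_avg_2(h1, C1, h2, C2, M):
    """Average access time with two cache levels (L1 then L2)."""
    return h1 * C1 + (1 - h1) * h2 * C2 + (1 - h1) * (1 - h2) * M

# h = 0.95, C = 1 cycle, M = 100 cycles: 0.95 + 0.05*100 = 5.95 cycles
print(t_avg_1(0.95, 1, 100))
# Adding an L2 (h2 = 0.9, C2 = 10): 0.95 + 0.45 + 0.5 = 1.9 cycles
print(t_avg_2(0.95, 1, 0.9, 10, 100))
```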
Other Performance Enhancements
Write buffer
Write-through:
• Each write operation involves writing to the main memory.
• If the processor has to wait for the write operation to be complete, it slows down
the processor.
• Processor does not depend on the results of the write operation.
• Write buffer can be included for temporary storage of write requests.
• Processor places each write request into the buffer and continues execution.
• If a subsequent Read request references data which is still in the write buffer, then
this data is referenced in the write buffer.
Write-back:
• Block is written back to the main memory when it is replaced.
• If the processor waits for this write to complete, before reading the new block, it is
slowed down.
• Fast write buffer can hold the block to be written, and the new block can be read
first.
Prefetching
• By default, new data are brought into the cache only when they are first needed, so the processor has to wait until the data transfer is complete.
• Prefetch the data into the cache before they are actually needed, or a before a Read miss
occurs.
• Prefetching can be accomplished through software by including a special instruction in the
machine language of the processor.
• Inclusion of prefetch instructions increases the length of the programs.
• Prefetching can also be done in hardware, using circuitry that attempts to discover a pattern in memory references and then prefetches data according to this pattern.