Chapter – 6
Memory System
6.1 Microcomputer Memory
Memory is an essential component of the microcomputer system.
It stores binary instructions and data for the microcomputer.
The memory is the place where the computer holds the programs and data that are currently in use.
No single technology is optimal in satisfying the memory requirements of a computer system.
Computer memory exhibits perhaps the widest range of type, technology, organization,
performance and cost of any feature of a computer system.
The memory unit that communicates directly with the CPU is called main memory.
Devices that provide backup storage are called auxiliary memory or secondary memory.
6.2 Characteristics of memory systems
A memory system can be characterised by its Location, Capacity, Unit of transfer,
Access method, Performance, Physical type, Physical characteristics and Organisation.
Location
• Processor memory: Memory such as registers that is included within the processor is
termed processor memory.
• Internal memory: It is often termed main memory and is directly accessible by the CPU.
• External memory: It consists of peripheral storage devices such as disk and magnetic tape.
Capacity
• Word size: The natural unit of organisation. Common word lengths are 8, 16 and 32 bits.
• Number of words (or bytes): Capacity is expressed in terms of words or bytes.
Unit of Transfer
• Internal: For internal memory, the unit of transfer is equal to the number of data lines
into and out of the memory module.
• Addressable unit
- Smallest location which can be uniquely addressed
- Word internally
- Cluster on magnetic disks
Compiled By: Er. Hari Aryal [[email protected]] Reference: W. Stallings & M. Mano | 1
Computer Organization and Architecture Chapter 6 : Memory System
Access Method
• Sequential access: Access must start at the beginning and proceed through a
specific linear sequence. The access time of a data unit therefore depends on the position
of the record (unit of data) and on the previous location.
- e.g. tape
• Direct access: Individual blocks or records have a unique address based on location.
Access is accomplished by jumping directly to the general vicinity, followed by a
sequential search to reach the final location.
- e.g. disk
• Random access: The time to access a given location is constant and independent of the
sequence of prior accesses. Thus any location can be selected at random and
directly addressed and accessed.
- e.g. RAM
• Associative access: This is a random access type of memory that compares desired bit
locations within a word against a specified match, and does this
for all words simultaneously.
- e.g. cache
Performance
• Access time: For random access memory, access time is the time it takes to perform a
read or write operation, i.e. the time taken to address a memory location plus the time to
read from or write to it. For non-random access memory, it is the time needed to
position the read/write mechanism at the desired location.
- Time between presenting the address and getting the valid data
• Memory cycle time: It is the access time plus any additional time required before the next
memory access can begin.
• Transfer rate: It is the rate at which data can be transferred into or out of a
memory unit.
Physical Types
• Semiconductor
- RAM
• Magnetic
- Disk & Tape
• Optical
- CD & DVD
• Others
- Bubble
- Hologram
Physical Characteristics
• Decay: Stored information fades over time, leading to data loss.
• Volatility: Information is lost when electrical power is switched off.
• Erasable: The contents can be erased and rewritten.
• Power consumption: How much power the memory consumes.
Organization
• Physical arrangement of bits into words
• Not always obvious
- e.g. interleaved
6.3 The Memory Hierarchy
Capacity, cost and speed of different types of memory play a vital role in designing a
memory system for computers.
If the memory has a larger capacity, more applications have space to run smoothly.
The memory should be as fast as possible to achieve greater performance.
Moreover, for a practical system, the cost should be reasonable.
There is a tradeoff between these three characteristics: cost, capacity and access time. One
cannot achieve all of them in the same memory module because
If capacity increases, access time increases (slower) and cost per bit decreases.
If access time decreases (faster), capacity decreases and cost per bit increases.
The designer tries to increase capacity because cost per bit decreases and more
application programs can be accommodated. But at the same time, access time increases
and hence performance decreases.
Therefore, it is more economical to use low-cost storage devices to serve as a backup for
storing the information that is not currently used by the CPU.
The memory unit that directly communicates with the CPU is called the main memory.
The memory hierarchy system consists of all storage devices employed in a computer
system, from the slow but high-capacity auxiliary memory to a relatively faster main
memory, to an even smaller and faster cache memory.
The main memory occupies a central position by being able to communicate directly with
the CPU and with auxiliary memory devices through an I/O processor.
A special very-high-speed memory called cache is used to increase the speed of
processing by making current programs and data available to the CPU at a rapid rate.
CPU logic is usually faster than main memory access time, with the result that processing
speed is limited primarily by the speed of main memory.
The cache is used for storing segments of programs currently being executed in the CPU
and temporary data frequently needed in the present calculations.
The memory hierarchy system consists of all storage devices employed in a computer
system, from slow but high-capacity auxiliary memory to a relatively faster cache memory
accessible to high-speed processing logic. The figure below illustrates the memory hierarchy.
Fig: Memory hierarchy
Hierarchy List
Registers
L1 Cache
L2 Cache
Main memory
Disk cache
Disk
Optical
Tape
6.4 Internal and External memory
Internal or Main Memory
The main memory is the central unit of the computer system. It is a relatively large
and fast memory used to store programs and data during computer operation. These
memories employ semiconductor integrated circuits. The basic element of the
semiconductor memory is the memory cell.
The memory cell has three functional terminals which carry electrical signals.
o The select terminal: It selects the cell.
o The data in terminal: It is used to input data as 0 or 1; the data out or sense
terminal is used for the output of the cell's state.
o The control terminal: It controls the function, i.e. it indicates read or
write.
Most of the main memory in a general-purpose computer is made up of RAM
integrated circuit chips, but a portion of the memory may be constructed with
ROM chips.
Integrated RAM chips are available in two possible operating modes, static and
dynamic.
Static RAM (SRAM)
The static RAM consists of flip-flops that store binary information, and this stored
information remains valid as long as power is applied to the unit.
In logic state 1, point C1 is high and point C2 is low. In this state, T1 & T4 are off
and T2 & T3 are on.
In logic state 0, point C1 is low and C2 is high. In this state, T1 & T4 are on and
T2 & T3 are off.
When a signal is applied to the address line, the two transistors it controls are
switched on, allowing a read or write operation.
For a write operation, the desired bit value is applied to line B while its
complement is applied to line B complement. This forces the four transistors T1,
T2, T3 and T4 into the proper state.
Dynamic RAM (DRAM)
Fig: DRAM structure
The address line is activated when the bit value from this cell is to be read or
written.
The transistor acts as a switch that is closed (allowing current to flow) if a voltage
is applied to the address line, and open (no current flows) if no voltage is
applied to the address line.
If the data bus is high, then +5V is applied on the bit line and current will flow through
the transistor and charge the capacitor.
... raise the voltage on the bit line. The amplifier will store the voltage and place a 1 on ...
Static RAM
o Faster, digital device
o Expensive, big in size
o Doesn't require refreshing circuit
o Used in cache memory
Dynamic RAM
o Uses capacitor to store information
o More dense i.e. more cells can be accommodated per unit area
o Slower, analog device
o Less expensive, small in size
o Needs refreshing circuit
o Used in main memory, larger memory units
ROM– Read Only memory
Read only memory (ROM) contains a permanent pattern of data that cannot be
changed.
Types of ROM
Programmable ROM (PROM)
o It is non-volatile and may be written into only once. The writing process is ...
o ... place, using ordinary bus control, address and data lines. EEPROM is
more expensive than EPROM and is also less dense, supporting fewer bits
per chip.
Flash Memory
o Flash memory is also a semiconductor memory; it is termed flash because of the
speed with which it can be reprogrammed. It is intermediate
between EPROM and EEPROM in both cost and functionality. Like
EEPROM, flash memory uses an electrical erasing technology. An entire
flash memory can be erased in one or a few seconds, which is much faster
than EPROM. In addition, it is possible to erase just blocks of memory
rather than an entire chip. However, flash memory doesn't provide byte-level
erasure; a section of memory cells is erased in a single action or 'flash'.
External Memory
The devices that provide backup storage are called external memory or auxiliary
memory. It includes serial access types such as magnetic tape and direct access
types such as magnetic disks.
Magnetic Tape
A magnetic tape is a strip of plastic coated with a magnetic recording medium.
Data can be recorded and read as a sequence of characters through a read/write
head.
Magnetic Disk
A magnetic disk is a circular plate constructed of metal or plastic and coated with
magnetic material. Often both sides of the disk are used, and several disks may be
stacked on one spindle with a read/write head available on each surface. All disks rotate
together at high speed. Bits are stored on the magnetized surface in spots along concentric
circles called tracks. The tracks are commonly divided into sections called
sectors. After the read/write heads are positioned on the specified track, the system has
to wait until the rotating disk brings the specified sector under the read/write head.
Information transfer is very fast once the beginning of the sector has been reached.
Disks that are permanently attached to the unit assembly and cannot be removed by the
occasional user are called hard disks. A disk drive with removable disks is called a floppy disk.
Optical Disk
The huge commercial success of the CD enabled the development of low-cost optical
disk storage technology that has revolutionized computer data storage. The disk is
formed from a resin such as polycarbonate. Digitally recorded information is
imprinted as a series of microscopic pits on the surface of the polycarbonate. This is
done with a finely focused, high-intensity laser. The pitted surface is then
coated with a reflecting surface, usually aluminum or gold. The shiny surface is
protected against dust and scratches by a top coat of acrylic.
Information is retrieved from a CD by a low-power laser. The intensity of the reflected
laser light changes as it encounters a pit. Specifically, if the laser beam falls on a
pit, which has a somewhat rough surface, the light scatters and a low intensity is
reflected back to the source. The areas between pits are called lands. A land is a
smooth surface which reflects light back at higher intensity. The change between pits
and lands is detected by a photosensor and converted into a digital signal. The sensor
tests the surface at regular intervals.
DVD-Technology
Multi-layer
Very high capacity (4.7 GB per layer)
Full length movie on single disk
Using MPEG compression
Finally standardized (honest!)
Movies carry regional coding
Players only play correct region films
DVD-Writable
Principles
o Intended to give memory speed approaching that of the fastest memories available but with
large size, at close to the price of slower memories
o Each line includes a tag (usually a portion of the main memory address) which identifies
which particular block is being stored
o Locality of reference implies that future references will likely come from this block of
memory, so that cache line will probably be utilized repeatedly.
o The proportion of memory references which are found already stored in cache is called
the hit ratio.
Cache memory is intended to give memory speed approaching that of the fastest
memories available, and at the same time provide a large memory size at the price of less
expensive types of semiconductor memories. There is a relatively large and slow main
memory together with a smaller, faster cache memory that contains a copy of portions of
main memory.
When the processor attempts to read a word of memory, a check is made to determine if
the word is in the cache. If so, the word is delivered to the processor. If not, a block of
main memory, consisting of a fixed number of words, is read into the cache and then the
word is delivered to the processor.
The locality of reference property states that over a short interval of time, the addresses
generated by a typical program refer to a few localized areas of memory repeatedly. So if
programs and data which are accessed frequently are placed in a fast memory, the
average access time can be reduced. This type of small, fast memory is called cache
memory, and it is placed between the CPU and the main memory.
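The claim that a cache reduces the average access time can be illustrated with a small calculation. The sketch below is only an illustration: the formula assumes that on a miss the cache is checked first and main memory is accessed afterwards, and the function name, the hit ratios and the 10 ns / 100 ns timings are hypothetical values that do not come from these notes.

```python
def average_access_time(hit_ratio, t_cache_ns, t_main_ns):
    """Average access time of a two-level (cache + main memory) system.

    Assumption: a hit costs t_cache_ns; a miss costs t_cache_ns + t_main_ns
    because the cache is examined before main memory is accessed.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * t_cache_ns + miss_ratio * (t_cache_ns + t_main_ns)


# Hypothetical timings: 10 ns cache, 100 ns main memory.
for h in (0.0, 0.5, 0.9, 0.99):
    print(f"hit ratio {h:.2f}: {average_access_time(h, 10, 100):.1f} ns")
```

With these made-up numbers, the average time approaches the cache access time as the fraction of references found in the cache (the hit ratio) approaches 1.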
When the CPU needs to access memory, the cache is examined. If the word is found in the
cache, it is read from the cache; if the word is not found in the cache, main memory is
accessed to read the word. A block of words containing the one just accessed is then
transferred from main memory to the cache.
The cache connects to the processor via data, control and address lines. The data and address
lines also attach to data and address buffers, which attach to a system bus from which
main memory is reached.
When a cache hit occurs, the data and address buffers are disabled and the
communication is only between processor and cache, with no system bus traffic. When a
cache miss occurs, the desired word is first read into the cache and then transferred from
cache to processor. In the latter case, the cache is physically interposed between the
processor and main memory for all data, address and control lines.
Cache Operation Overview
If not present, access and read the required block from main memory into the cache.
Allocate a cache line for this newly fetched block.
Fig: Flowchart for cache read operation
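The read operation described above (check the cache, and on a miss bring the whole block in before delivering the word) can be sketched as follows. This is a minimal model under stated assumptions, not the hardware design: the class name `SimpleCache`, the 4-word block size and the sample memory contents are all hypothetical, and the cache is given unlimited capacity so that no replacement is needed.

```python
BLOCK_WORDS = 4  # hypothetical block size, in words


class SimpleCache:
    """Toy cache: block number -> list of words. Capacity is unlimited here."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.blocks = {}
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_number, offset = divmod(address, BLOCK_WORDS)
        if block_number in self.blocks:
            self.hits += 1            # word already in cache
        else:
            self.misses += 1          # fetch the whole block from main memory
            start = block_number * BLOCK_WORDS
            self.blocks[block_number] = self.main_memory[start:start + BLOCK_WORDS]
        return self.blocks[block_number][offset]


memory = list(range(100, 132))        # 32 words of made-up data
cache = SimpleCache(memory)
values = [cache.read(a) for a in range(8)]
print(values, "hits:", cache.hits, "misses:", cache.misses)
```

Reading eight consecutive addresses touches only two blocks, so only the first access to each block goes to main memory.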
Locality of Reference
The references to memory in any given interval of time tend to be confined within
a few localized areas of memory. This property is called locality of reference. This
happens because program loops and subroutine calls are encountered
frequently. When a program loop is executed, the CPU executes the same portion of the
program repeatedly. Similarly, when a subroutine is called, the CPU fetches the
starting address of the subroutine and executes the subroutine program. Thus loops
and subroutines localize references to memory.
This principle states that memory references tend to cluster. Over a long period of
time the clusters in use change, but over a short period of time the processor is
primarily working with fixed clusters of memory references.
Spatial Locality
It refers to the tendency of execution to involve a number of memory locations that
are clustered together, e.g. instructions or array elements stored one after another.
Temporal Locality
It refers to the tendency for a processor to access memory locations that have been
used recently. For example, iteration loops execute the same set of instructions
repeatedly.
The larger the cache, the larger the number of gates involved in addressing the
cache.
Large caches tend to be slightly slower than small ones, even when built with the
same integrated circuit technology and put in the same place on chip and circuit
board.
The available chip and board area also limits cache size.
The transformation of data from main memory to cache memory is referred to as the
memory mapping process.
Because there are fewer cache lines than main memory blocks, an algorithm is
needed for mapping main memory blocks into cache lines.
There are three types of mapping function in common use: direct,
associative and set associative. All three include the following elements in each
example.
o The cache can hold 64 Kbytes
o Data is transferred between main memory and the cache in blocks of 4
bytes each. This means that the cache is organized as 16K = 2^14 lines
of 4 bytes each.
Direct Mapping
It is the simplest technique: it maps each block of main memory into only one possible
cache line, i.e. a given main memory block can be placed in one and only one place in the
cache.
i = j modulo m
where i = cache line number; j = main memory block number; m = number of lines in
the cache
The mapping function is easily implemented using the address. For purposes of cache
access, each main memory address can be viewed as consisting of three fields.
The least significant w bits identify a unique word or byte within a block of main
memory. The remaining s bits specify one of the 2^s blocks of main memory.
The cache logic interprets these s bits as a tag of (s-r) bits in the most significant
positions and a line field of r bits. The latter field identifies one of the m = 2^r lines
of the cache.
Block size = line size = 2^w words or bytes
Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
Number of lines in cache = m = 2^r
Size of tag = (s - r) bits
24 bit address
2 bit word identifier (4 byte block)
22 bit block identifier
8 bit tag (=22-14), 14 bit slot or line
No two blocks in the same line have the same Tag field
Check contents of cache by finding line and checking Tag
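The field widths listed above (24-bit address, 8-bit tag, 14-bit line, 2-bit word) can be checked with a short sketch that splits an address using shifts and masks. The function name `split_direct` and the sample address are hypothetical and chosen only for illustration.

```python
CACHE_BYTES = 64 * 1024   # 64 Kbyte cache from the example
BLOCK_BYTES = 4           # 4-byte blocks

WORD_BITS = 2             # selects a byte within the 4-byte block
LINE_BITS = 14            # selects one of 2^14 cache lines
TAG_BITS = 8              # 24 - 14 - 2

# The number of lines follows from cache size / block size.
assert CACHE_BYTES // BLOCK_BYTES == 1 << LINE_BITS


def split_direct(address):
    """Split a 24-bit address into (tag, line, word) for direct mapping."""
    if not 0 <= address < 1 << (TAG_BITS + LINE_BITS + WORD_BITS):
        raise ValueError("address must fit in 24 bits")
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word


# Hypothetical address used only to exercise the function.
tag, line, word = split_direct(0x16339C)
print(f"tag={tag:02X} line={line:04X} word={word}")
```

On a lookup, the line field selects the cache line and the stored tag is compared with the tag field of the address.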
Cache line    Main memory blocks held
0             0, m, 2m, 3m, ..., 2^s - m
1             1, m+1, 2m+1, ..., 2^s - m + 1
...
m-1           m-1, 2m-1, 3m-1, ..., 2^s - 1
Cache Line             0   1   2   3   4
Main Memory Block      0   1   2   3   4
                       5   6   7   8   9
                      10  11  12  13  14
                      15  16  17  18  19
                      20  21  22  23  24
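The small table above follows directly from i = j modulo m with m = 5 cache lines and 25 main memory blocks. A short sketch that regenerates it (the function and variable names are hypothetical):

```python
def cache_line_for_block(block_number, num_lines):
    """Direct mapping: i = j modulo m."""
    return block_number % num_lines


NUM_LINES = 5     # m, as in the table
NUM_BLOCKS = 25   # main memory blocks 0..24

blocks_per_line = {line: [] for line in range(NUM_LINES)}
for j in range(NUM_BLOCKS):
    blocks_per_line[cache_line_for_block(j, NUM_LINES)].append(j)

for line, blocks in blocks_per_line.items():
    print(f"line {line}: blocks {blocks}")
```

Each column of the table is the list of blocks that compete for the same cache line.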
Note that
o all locations in a single block of memory have the same higher order bits (call them the
block number), so the lower order bits can be used to find a particular word in the block.
o within those higher-order bits, their lower-order bits obey the modulo mapping given
above (assuming that the number of cache lines is a power of 2), so they can be used to
get the cache line for that block
o the remaining bits of the block number become a tag, stored with each cache line, and
used to distinguish one block from another that could fit into that same cache
line.
Fig: Direct mapping structure
Inexpensive
Fixed location for given block
o If a program repeatedly accesses 2 blocks that map to the same line, the cache
miss rate is very high
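This drawback can be demonstrated with a small simulation. The sketch below is a simplified model that tracks only which block each line holds; the 4-line cache and the two reference patterns are hypothetical.

```python
def count_misses_direct(block_refs, num_lines):
    """Count misses for a sequence of block numbers in a direct-mapped cache."""
    lines = [None] * num_lines       # block currently held by each line
    misses = 0
    for block in block_refs:
        line = block % num_lines     # i = j modulo m
        if lines[line] != block:     # miss: the resident block is overwritten
            misses += 1
            lines[line] = block
    return misses


NUM_LINES = 4
conflicting = [0, 4] * 5   # blocks 0 and 4 both map to line 0
friendly = [0, 1] * 5      # blocks 0 and 1 map to different lines

print("conflicting pattern misses:", count_misses_direct(conflicting, NUM_LINES))
print("friendly pattern misses:", count_misses_direct(friendly, NUM_LINES))
```

In the conflicting pattern every reference evicts the block needed next, even though three of the four lines stay empty.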
Associative Mapping
It overcomes the disadvantage of direct mapping by permitting each main memory block
to be loaded into any line of the cache.
Cache control logic interprets a memory address simply as a tag and a word field
Tag uniquely identifies block of memory
Cache control logic must simultaneously examine every line's tag for a match, which
requires fully associative memory
Very complex circuitry; complexity increases exponentially with size
Cache searching gets expensive
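A rough software model of associative lookup is sketched below. It assumes the 24-bit address and 4-byte block of the running example, so the tag is the upper 22 bits and the word field is the lower 2 bits. A Python dictionary stands in for the parallel tag comparison that real hardware performs; the class and method names are hypothetical.

```python
WORD_BITS = 2  # 4-byte block, as in the running example


def split_associative(address):
    """Split an address into (tag, word); the tag is the whole block number."""
    return address >> WORD_BITS, address & ((1 << WORD_BITS) - 1)


class AssociativeCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = {}   # tag -> block data; any block may occupy any line

    def lookup(self, address):
        """Return the addressed byte on a hit, or None on a miss."""
        tag, word = split_associative(address)
        block = self.lines.get(tag)
        if block is None:
            return None
        return block[word]

    def fill(self, address, block):
        tag, _ = split_associative(address)
        if tag not in self.lines and len(self.lines) >= self.num_lines:
            raise RuntimeError("cache full: a replacement algorithm is needed")
        self.lines[tag] = block


demo = AssociativeCache(num_lines=2)
demo.fill(0x16339C, bytes([1, 2, 3, 4]))
print(demo.lookup(0x16339D), demo.lookup(0x000000))
```

Because no line field exists, two blocks never conflict until the whole cache is full, at which point a replacement algorithm has to choose a victim.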
Least significant 2 bits of address identify which byte is required from the
32 bit (4 byte) data block
Set Associative Mapping
Cache is divided into v sets, each of which has k lines; number of cache lines = vk
m = v × k
i = j modulo v
where i = cache set number; j = main memory block number; m = number of lines in the
cache
So a given block will map directly to a particular set, but can occupy any line in that set
(associative mapping is used within the set)
Cache control logic interprets a memory address as three fields: tag, set and word.
The d set bits specify one of v = 2^d sets. Thus the s bits of the tag and set fields specify
one of the 2^s blocks of main memory.
The most common set associative mapping is 2 lines per set, and is called two-way set
associative. It significantly improves hit ratio over direct mapping, and the associative
hardware is not too expensive.
e.g.
Address     Tag   Data       Set number
1FF 7FFC    1FF   12345678   1FFF
001 7FFC    001   11223344   1FFF
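The two rows of the example can be reproduced with a short sketch. It rests on assumptions inferred from the table rather than stated in the text: a 24-bit address split into a 9-bit tag, a 13-bit set field and a 2-bit word field, with the Address column read as the tag followed by the remaining 15 bits. The function names are hypothetical.

```python
WORD_BITS = 2    # assumed: 4-byte block
SET_BITS = 13    # assumed: 2^13 sets (two-way, 2^14 lines)


def split_set_associative(address):
    """Split a 24-bit address into (tag, set_number, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_number = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_number, word


def join_table_address(tag, low15):
    """Rebuild a full address from the 'tag + low 15 bits' form in the table."""
    return (tag << (WORD_BITS + SET_BITS)) | low15


for tag, low15 in ((0x1FF, 0x7FFC), (0x001, 0x7FFC)):
    t, s, w = split_set_associative(join_table_address(tag, low15))
    print(f"tag={t:03X} set={s:04X} word={w}")
```

Both addresses fall in set 1FFF but carry different tags, so a two-way set can hold both blocks at the same time.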
6.6.3 Replacement algorithm
When all lines are occupied, bringing in a new block requires that an existing line be
overwritten.
Direct mapping
No choice possible with direct mapping
Each block only maps to one line
Replace that line
Associative and Set Associative mapping
o Least recently used (LRU): replace that block in the set which has been in cache
longest with no reference to it
o Implementation: with 2-way set associative, have a USE bit for each line
in a set. When a block is read into cache, use the line whose USE bit is
0, then set its USE bit to 1 and the other line's USE bit to 0. A reference to
a line updates the USE bits in the same way.
o Probably the most effective method
o Least frequently used (LFU): replace that block in the set which has experienced
the fewest references or hits
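The USE-bit scheme for a two-way set can be sketched as below. This is a minimal model of a single set that tracks only tags, not data. It assumes, as is usual for LRU, that a hit also updates the USE bits; the class name and the reference string are hypothetical.

```python
class TwoWaySet:
    """One set of a two-way set-associative cache with a USE bit per line."""

    def __init__(self):
        self.tags = [None, None]
        self.use = [0, 0]

    def _touch(self, way):
        # The referenced line gets USE = 1, the other line gets USE = 0.
        self.use[way] = 1
        self.use[1 - way] = 0

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        for way in (0, 1):
            if self.tags[way] == tag:
                self._touch(way)
                return True
        victim = self.use.index(0)   # replace the line whose USE bit is 0
        self.tags[victim] = tag
        self._touch(victim)
        return False


lru_set = TwoWaySet()
results = [lru_set.access(tag) for tag in "ABACA"]
print(results, lru_set.tags)
```

In this trace block C replaces B rather than A, because A was referenced more recently than B.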
Must not overwrite a cache block unless main memory is up to date
I/O modules may be able to read/write directly to memory
Multiple CPUs may be attached to the same bus, each with their own cache
Write Through
All write operations are made to main memory as well as to cache, so main
memory is always valid
Other CPUs monitor traffic to main memory to update their caches when needed
This generates substantial memory traffic and may create a bottleneck
Anytime a word in cache is changed, it is also changed in main memory
Both copies always agree
Generates lots of memory writes to main memory
Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache up
to date
Lots of traffic
Slows down writes
Remember bogus write through caches!
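The memory traffic generated by write through can be illustrated with a toy model. For contrast, the sketch also includes a write-back mode, in which main memory is updated only when a modified line is replaced. Everything here is a simplifying assumption: the cache has a single line, each block holds one value, and the names `TinyCache`, `Line` and `run` are hypothetical.

```python
class Line:
    def __init__(self):
        self.block = None     # block number held, None if empty
        self.value = None
        self.update = False   # UPDATE (dirty) bit, used by write back only


class TinyCache:
    """Single-line cache that counts the writes sent to main memory."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.line = Line()
        self.memory = {}
        self.memory_writes = 0

    def _write_memory(self, block, value):
        self.memory[block] = value
        self.memory_writes += 1

    def write(self, block, value):
        line = self.line
        if line.block != block:
            # Replacing the line: write back must flush a modified block first.
            if self.write_back and line.update:
                self._write_memory(line.block, line.value)
            line.block, line.update = block, False
        line.value = value
        if self.write_back:
            line.update = True                  # memory is now out of date
        else:
            self._write_memory(block, value)    # write through: update memory too


def run(cache, writes):
    for block, value in writes:
        cache.write(block, value)
    return cache


writes = [(0, 1), (0, 2), (0, 3), (1, 9)]
wt = run(TinyCache(write_back=False), writes)
wb = run(TinyCache(write_back=True), writes)
print("write through memory writes:", wt.memory_writes)
print("write back memory writes:", wb.memory_writes)
```

Write through keeps main memory valid after every write, at the cost of one memory write per update; in the write-back run, block 1 is still only in the cache when the trace ends.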
Write back
When an update occurs, an UPDATE bit associated with that slot is set, so when
the block is replaced it is written back to main memory only if the UPDATE bit is set.
... if a match
o Hardware Transparency - additional hardware links multiple caches so ...
Requires no bus operation for cache hits
Short data paths and same speed as other CPU transactions
Off-chip cache (L2 Cache)
It is the external cache, located outside the processor chip. If there is no L2 cache and the
processor makes an access request for a memory location not in the L1 cache, then the
processor must access DRAM or ROM memory across the bus. Due to the
typically slow bus speed and slow memory access time, this results in poor
performance. On the other hand, if an L2 SRAM cache is used, then frequently
the missing information can be quickly retrieved.
It can be much larger
It can be used with a local bus to buffer the CPU cache-misses from the system
bus
Unified and Split Cache
Unified Cache
o Single cache contains both instructions and data. Cache is flexible and can ...
o Has a higher hit rate than split cache, because it automatically balances
load between data and instructions (if an execution pattern involves more
instruction fetches than data fetches, the cache will fill up with more
instructions than data)
Split Cache
o Cache splits into two parts, first for instructions and second for data. Can ...
designs
o Eliminates cache contention between instruction processor and the
execution unit (which uses data)