Memory Subsystem Notes-1
we have a hit at that level. If not, we have a miss; the request is propagated to
the next level down, and the block containing the requested data is copied into this level when the data is found. This ensures that the next time this (or nearby) data is accessed there will be a hit at this level. The fraction of memory references found at a level is called the hit rate.
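For example, if 95 out of every 100 references to a cache are satisfied by that cache, its hit rate is 95%; the remaining 5% of references miss and must be serviced by the next level down.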
Computer memory systems typically consist of a four-level hierarchy: Registers, Cache, Main Memory and Backing Store (i.e. disk). Note that data is moved between the cache, main memory and backing store transparently, but data movement between registers and the rest of the memory system is explicitly under the control of the program. It is up to the compiler to decide which data items should be moved to registers, and when, and to compile the relevant load and store instructions into the code.
In practice, many computers introduce another level into the hierarchy: they
have two levels of cache; a small (perhaps 64Kbyte), very fast cache on the pro-
cessor chip (called the first level cache), and a larger (perhaps 512Kbyte) second
level cache, either on the processor chip or on separate chips, and intermediate
in speed between the on-chip cache and the main memory.
In a direct-mapped cache, each memory block can be held in exactly one cache location, determined by the block address modulo the number of cache entries, which is equivalent to indexing the cache with some of the low bits of the memory address. Because part of the address has already been used to identify the cache location, the tag field needs to hold only the ‘unused’ part of the address, so the total amount of tag storage required in the cache is lower than in a fully-associative cache.
Figure VIII.1 (fig 7.7 in P&H 3/e) shows a 1Kword direct-mapped cache and
the block diagram of the search mechanism: Bits 11-2 of the address are used as
an index to read the cache line where the requested word may be stored. The tag
field of the cache location is compared with bits 31-12 of the address and if they
are equal and the valid bit is set, the access is declared a hit and the data can be
used by the processor.
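As a concrete illustration of this search mechanism, here is a minimal C sketch of the 1K-word direct-mapped cache of Figure VIII.1. The structure layout and the name cache_read are assumptions made for the example, not a description of real hardware; the dirty flag is included only for the write-back discussion below.

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_LINES 1024                    /* 1K-word cache, one word per line   */

    struct cache_line {
        bool     valid;                       /* valid bit                          */
        bool     dirty;                       /* used only by a write-back cache    */
        uint32_t tag;                         /* bits 31-12 of the word's address   */
        uint32_t data;                        /* the cached word                    */
    };

    static struct cache_line cache[NUM_LINES];

    /* Returns true on a hit and places the word in *word_out; on a miss a real
     * cache would fetch the block from the next level down.                     */
    bool cache_read(uint32_t addr, uint32_t *word_out)
    {
        uint32_t index = (addr >> 2) & 0x3FF; /* bits 11-2 select the cache line   */
        uint32_t tag   = addr >> 12;          /* bits 31-12 form the tag           */

        if (cache[index].valid && cache[index].tag == tag) {
            *word_out = cache[index].data;    /* hit: use the cached word          */
            return true;
        }
        return false;                         /* miss                              */
    }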
Write accesses to memory, which are much less frequent than reads, are more
complicated, and different caches handle them in different ways. In a write-
through cache, if the word being written to is in the cache, both the copy in
the cache and the copy in main memory are written to at the same time. This
obviously takes as long as a main memory access, but is simpler to implement than a write-back cache, which writes a modified block back to main memory only when the block is evicted from the cache. Caches also vary in their behaviour when a write access misses: some caches load the missing block into the cache; others do not.
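Continuing the sketch above, the two write policies can be contrasted as follows; main_memory_write() is an assumed helper standing in for an access to the next level down, not a real interface.

    extern void main_memory_write(uint32_t addr, uint32_t word);   /* hypothetical */

    /* Write-through: update the cache (if the word is present) and always update
     * main memory, so the write takes a full memory access time.                */
    void write_through(uint32_t addr, uint32_t word)
    {
        uint32_t index = (addr >> 2) & 0x3FF;
        uint32_t tag   = addr >> 12;

        if (cache[index].valid && cache[index].tag == tag)
            cache[index].data = word;
        main_memory_write(addr, word);        /* memory is always kept up to date  */
    }

    /* Write-back: update only the cached copy and mark it dirty; main memory is
     * updated later, when the line is evicted.  On a miss this sketch writes
     * straight to memory; a write-allocate cache would load the block first.    */
    void write_back(uint32_t addr, uint32_t word)
    {
        uint32_t index = (addr >> 2) & 0x3FF;
        uint32_t tag   = addr >> 12;

        if (cache[index].valid && cache[index].tag == tag) {
            cache[index].data  = word;
            cache[index].dirty = true;
        } else {
            main_memory_write(addr, word);
        }
    }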
[Figure: virtual-to-physical address translation via the page table, which is held in the system area of main memory. The virtual address is split into a page number P and an offset D within the page; each page table entry holds a Residence bit (R), a Modified bit (M), an Accessed bit (A) and a Frame number (F). The 16-bit frame number concatenated with the 10-bit offset D forms the real address.]
Virtual memory removes the need for relocation, as all addresses in the program
are virtual and guaranteed to be unique to that program.
Paging
The virtual memory space is divided into a large number of equally sized
chunks called pages. Page sizes vary between computer systems, but typical sizes
are 1K and 4K bytes, and the 32 bit virtual address therefore consists of two
fields, the page number (the most significant 22 bits) and the page offset, the num-
ber of the byte within the page (the least significant 10 bits for 1K byte pages).
The main memory of the machine is also divided up into chunks, of the same
size as pages, called page frames, so the physical address, which might be 26 bits
long, will consist of a 16 bit frame number and a 10 bit byte address within the
frame. The OS maintains the currently needed parts of the program code and
data areas in physical memory by loading the required pages of the program’s
virtual memory space into a set of (not necessarily contiguous) page frames in
physical memory.
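For example, with 1K byte pages the 32-bit virtual address 0x00012ABC splits into page number 0x4A (0x12ABC shifted right by 10 bits) and page offset 0x2BC (the low 10 bits); if the OS happened to place that page in frame 0x0131, the corresponding 26-bit physical address would be (0x0131 << 10) | 0x2BC = 0x4C6BC.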
The page table is held in the system area of main memory, and a register in the MMU holds the address of the start of the page table. When a virtual address is presented to the MMU for
translation, the page number P is added to the page table base address, to access
a table entry describing the location of the page. The F field of that table entry
is simply the number of the frame in physical memory containing the page, and
so if it is concatenated with the lower 10 bits (D) of the virtual address, we get
the corresponding physical memory address.
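A minimal C sketch of this translation, assuming the page table entry layout shown in the figure above; the names page_table, page_fault and translate are illustrative, not part of any particular MMU.

    #include <stdint.h>

    #define PAGE_SHIFT  10                    /* 1K byte pages                     */
    #define OFFSET_MASK ((1u << PAGE_SHIFT) - 1)

    /* One page table entry, with the fields described above. */
    struct pte {
        unsigned r : 1;                       /* Residence bit                     */
        unsigned m : 1;                       /* Modified (dirty) bit              */
        unsigned a : 1;                       /* Accessed bit                      */
        unsigned f : 16;                      /* Frame number                      */
    };

    extern struct pte page_table[];           /* indexed by page number            */
    extern void page_fault(uint32_t page);    /* OS page fault handler (see below) */

    uint32_t translate(uint32_t vaddr)
    {
        uint32_t p = vaddr >> PAGE_SHIFT;     /* page number P                     */
        uint32_t d = vaddr & OFFSET_MASK;     /* offset D within the page          */

        if (!page_table[p].r)                 /* page not resident in main memory  */
            page_fault(p);                    /* raise a page fault                */

        page_table[p].a = 1;                  /* record that the page was accessed */
        return ((uint32_t)page_table[p].f << PAGE_SHIFT) | d;  /* concatenate F, D */
    }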
What happens if the process tries to access a page which is not held in main
memory, but is on disk? There is a bit in the page table entry, the R bit, which is
set only if the page is in main memory. If the R bit for the page being accessed is
not set, a page fault exception occurs, interrupting the process and invoking the
OS page fault handler. The handler fetches the missing page from disk and loads
it into an unused frame in main memory, then sets the R bit and the F field in the
page table entry. The interrupted process can then resume. This technique of
loading pages into memory when they are first accessed is called demand paging.
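A sketch of this demand-paging fault handler, under the same illustrative assumptions as the translation sketch above; find_free_frame() and disk_read_page() are hypothetical helpers.

    extern uint32_t find_free_frame(void);                          /* hypothetical */
    extern void disk_read_page(uint32_t page, uint32_t frame);      /* hypothetical */

    void page_fault(uint32_t page)
    {
        uint32_t frame = find_free_frame();   /* an unused frame in main memory    */
        disk_read_page(page, frame);          /* fetch the missing page from disk  */
        page_table[page].f = frame;           /* set the F field ...               */
        page_table[page].r = 1;               /* ... and the R bit                 */
        /* the interrupted process can now resume and retry the access */
    }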
Page replacement
Each process in a multi-tasking system has allocated to it a certain fraction
of all the page frames in the physical memory space. What happens if a page
fault occurs and all the process’s page frames are already in use? The OS must
pick one of the page frames for re-use, and load the new page on top of the page
currently in that frame, replacing it. If the replaced page is a data page, and any
location in the page has been written to since the page was loaded into the frame,
the old page must be written out to disk, updating the copy on disk, before the
new page is loaded over it. A bit in the page table entry for the page, the M
(Modified, alternatively called dirty) bit, is set if a write has been performed to
the page since it was loaded into the frame — if the M bit is unset, the page can
be overwritten without being saved to disk, as the previous copy on disk is still
valid.
If the page which was overwritten is accessed again by the process, it has to
be reloaded from disk (into any available frame), so ideally the OS should choose
pages for replacement which will not be accessed again for a long time. Using the
locality principle, the OS approximates this criterion by replacing pages which
have not been accessed recently. This is achieved by using a further bit in the
page table entry for each page, the A (Accessed) bit. Whenever any address
in a page is accessed, the A bit is set. All the A bits in the page table are reset
periodically by the OS. Thus any page which has an unset A bit has not been
used since the last reset of the A bits, and is a good candidate for replacement
with the incoming page.
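Putting the M and A bits together, here is one possible replacement routine, sketched under the same assumptions as before; frames[] is taken to hold the page numbers of the pages currently occupying this process's frames, and disk_write_page() is another hypothetical helper.

    extern void disk_write_page(uint32_t page, uint32_t frame);     /* hypothetical */

    /* Prefer a victim whose A bit is clear: it has not been accessed since the
     * OS last reset the A bits.                                                 */
    static uint32_t choose_victim(const uint32_t *frames, int nframes)
    {
        for (int i = 0; i < nframes; i++)
            if (!page_table[frames[i]].a)
                return frames[i];
        return frames[0];                     /* all recently used: pick any page  */
    }

    void replace_page(uint32_t new_page, uint32_t *frames, int nframes)
    {
        uint32_t victim = choose_victim(frames, nframes);
        uint32_t frame  = page_table[victim].f;

        if (page_table[victim].m)             /* dirty: the copy on disk is stale  */
            disk_write_page(victim, frame);   /* so write the old page out first   */

        page_table[victim].r = 0;             /* the victim is no longer resident  */
        disk_read_page(new_page, frame);      /* load the new page over it         */
        page_table[new_page].f = frame;
        page_table[new_page].r = 1;
        page_table[new_page].m = 0;
        page_table[new_page].a = 0;
        /* bookkeeping to record which page now occupies the frame is omitted */
    }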