Memory Subsystem Notes-1

Chapter VIII

The memory subsystem

VIII.1 Memory hierarchy

The programmer’s view of the computer memory is that it is a flat, large storage
space. Obviously we want it to be as fast as possible, since instruction fetch and
data access take a significant part of a processor’s cycle.
Unfortunately this is not possible with any known technology. We can make
very fast memories by using expensive, static RAM circuits, but when we put
too many of them together to create a large memory, its speed drops as the mem-
ory size increases. Not even supercomputers, where high cost is not a problem,
can have a large, flat and fast memory.
At the same time we have technologies that can provide us with large, but
relatively slow, semiconductor memory. This memory is called Dynamic RAM (DRAM);
each bit is stored using a single transistor and a capacitor, compared with six
transistors per bit of SRAM. We also have huge magnetic memory, in the form of
hard disks, but it is very slow compared to a processor.
The principle of locality of memory references in computer programs comes in
handy here. It says that if a program accesses a particular memory address, it is
likely that the next few accesses will be to nearby addresses (spatial locality),
and also that the same address is likely to be accessed again within a short time
(temporal locality). This is true for instruction fetches, and also for data reads
and writes.
We can take advantage of the locality of references, to create a hierarchical
memory with multiple levels of memory of different speed and size (and tech-
nology). At the top of the hierarchy, we have a fast but small memory which is
directly connected to the processor. The memory sizes increase (but the speed
drops) as we move to lower levels of the hierarchy further away from the pro-
cessor. At the bottom, we have slow magnetic memory. The data held at a level
close to the processor are a subset of the data held at any level further away.
Assuming that the data is transferred between memory levels in a way that ensures
that most processor accesses are handled at the top level, the whole memory
system will appear almost as fast as the top memory level and as big as the
bottom level.
The minimum amount of data transferred between two adjacent memory
levels is called a block or line. Although this could be as small as one word, the
spatial locality principle suggests that larger blocks are better.
If the data requested by the processor is found at a memory level, we say that
we have a hit at that level. If not, we have a miss; the request is propagated to
the next level down and, when the data is found, the block containing it is
copied to this level. This ensures that the next time this (or nearby) data
is accessed there will be a hit at this level. The fraction of memory references
found at a level is called the hit rate.
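The effect of the hit rate on apparent speed can be illustrated with a small calculation. The sketch below is not part of the original notes: it uses the common model in which every access pays the hit time and each miss additionally pays a miss penalty. The function names and the cycle counts in the usage note are illustrative assumptions.

```python
def hit_rate(hits, accesses):
    """Fraction of memory references found at this level."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    return hits / accesses


def average_access_time(rate, hit_time, miss_penalty):
    """Average time per access for one level backed by a slower level.

    Assumed model: every access pays hit_time; a miss also pays
    miss_penalty, the extra time needed to fetch the block from below.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return hit_time + (1.0 - rate) * miss_penalty


if __name__ == "__main__":
    # Illustrative numbers only: 1-cycle hit, 100-cycle miss penalty.
    rate = hit_rate(hits=95, accesses=100)
    print(average_access_time(rate, hit_time=1, miss_penalty=100))
```

With these assumed numbers a 95% hit rate gives an average of about 6 cycles per access, far closer to the 1-cycle top level than to the 100-cycle level below it.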
Computer memory systems typically consist of a four-level hierarchy: Reg-
isters — Cache — Main Memory — Backing Store (i.e. disk). Note that data is
moved between the cache, main memory and backing store transparently, but
data movement between registers and the rest of memory is explicitly under
the control of the program. It is up to the compiler to decide which data items
should be moved to registers, and when, and the compiler must compile into the
code the relevant load and store instructions.
In practice, many computers introduce another level into the hierarchy: they
have two levels of cache; a small (perhaps 64Kbyte), very fast cache on the pro-
cessor chip (called the first level cache), and a larger (perhaps 512Kbyte) second
level cache, either on the processor chip or on separate chips, and intermediate
in speed between the on-chip cache and the main memory.

VIII.2 Cache basics

The main issue in designing a cache is how to determine whether a data item
is present in the cache and where it is stored.
As the storage space of a cache is much smaller than that of the main memory,
each cache location can hold the contents of a number of different memory
locations. Since data items are identified by their memory address, in order to
ensure that a specific location in the cache indeed holds the required data, the
address must be stored in a special field at the cache location together with the
data. This special field is called the tag.
When a cache location is unoccupied, the tag field could still match a
requested memory address, causing the program to malfunction. Therefore, a
single bit valid field is also added in each location signifying whether the data
held there is valid or not.
The cache organisation described above is called fully-associative. In order to
find the referenced data, we have to compare the tag field of every cache loca-
tion to the memory address requested by the processor. It is quite expensive to
implement these parallel comparisons in circuits, therefore full-associativity is
practical only for caches with a very small number of entries.
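As a minimal sketch of the fully-associative search, the code below checks every entry for a valid, matching tag. The names Entry and fa_lookup are hypothetical, and the sequential loop stands in for what the hardware does with parallel comparators.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    valid: bool = False  # valid bit: is the data in this location meaningful?
    tag: int = 0         # fully-associative: the tag is the whole address
    data: int = 0


def fa_lookup(entries, address):
    """Return (hit, data) for a fully-associative cache.

    Hardware compares all tags at once; this loop is a sequential
    stand-in for those parallel comparisons.
    """
    for entry in entries:
        if entry.valid and entry.tag == address:
            return True, entry.data
    return False, None


if __name__ == "__main__":
    cache = [Entry() for _ in range(4)]
    cache[2] = Entry(valid=True, tag=0x1000, data=42)
    print(fa_lookup(cache, 0x1000))  # hit
    print(fa_lookup(cache, 0x0000))  # miss: tag 0 matches only invalid entries
```

The second lookup shows why the valid bit is needed: the empty entries hold tag 0, which would otherwise be mistaken for a hit on address 0.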
Direct-mapped caches are at the opposite extreme of cache organisation. Each
data item can be stored only at one cache location. The location is determined
by the memory address and the size of the cache as follows:
cache location = (memory address) MOD (cache size)
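[Figure: the 32-bit address is split into a 20-bit tag (bits 31-12), a 10-bit
index (bits 11-2) and a 2-bit byte offset (bits 1-0). The index selects one of
1024 entries (0 to 1023), each holding a valid bit, a 20-bit tag and 32 bits of
data. The stored tag is compared with the address tag to produce the hit signal.]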

Figure VIII.1: Example of direct-mapped cache.

which is equivalent to indexing the cache with some of the low bits of the mem-
ory address. Because part of the address has already been used to identify the
cache location, the tag field needs to hold only the ‘unused’ part of the address,
therefore the total amount of storage required in the cache is lower in compari-
son to fully-associative caches.
Figure VIII.1 (fig 7.7 in P&H 3/e) shows a 1Kword direct-mapped cache and
the block diagram of the search mechanism: Bits 11-2 of the address are used as
an index to read the cache line where the requested word may be stored. The tag
field of the cache location is compared with bits 31-12 of the address and if they
are equal and the valid bit is set, the access is declared a hit and the data can be
used by the processor.
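The lookup just described can be sketched in a few lines. The bit widths follow Figure VIII.1 (20-bit tag, 10-bit index, 2-bit byte offset); the class and function names are illustrative assumptions, and each cache line holds a single word as in the figure.

```python
OFFSET_BITS = 2                # bits 1-0: byte within the word
INDEX_BITS = 10                # bits 11-2: which cache line
NUM_LINES = 1 << INDEX_BITS    # 1024 lines of one word each


def split_address(addr):
    """Split a 32-bit byte address into (tag, index, byte_offset)."""
    byte_offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)   # bits 31-12
    return tag, index, byte_offset


class DirectMappedCache:
    def __init__(self):
        self.valid = [False] * NUM_LINES
        self.tags = [0] * NUM_LINES
        self.data = [0] * NUM_LINES

    def read(self, addr):
        """Return (hit, word); a hit needs a set valid bit and equal tags."""
        tag, index, _ = split_address(addr)
        if self.valid[index] and self.tags[index] == tag:
            return True, self.data[index]
        return False, None

    def fill(self, addr, word):
        """Store a word fetched from the next level, replacing the old line."""
        tag, index, _ = split_address(addr)
        self.valid[index] = True
        self.tags[index] = tag
        self.data[index] = word


if __name__ == "__main__":
    cache = DirectMappedCache()
    cache.fill(0x12345678, 99)
    print(cache.read(0x12345678))  # hit
    print(cache.read(0x00000678))  # same index, different tag: miss
```

Addresses 0x12345678 and 0x00000678 share index 414, so in a direct-mapped cache they compete for the same line and only the tag tells them apart.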
Write accesses to memory, which are much less frequent than reads, are more
complicated, and different caches handle them in different ways. In a write-
through cache, if the word being written to is in the cache, both the copy in
the cache and the copy in main memory are written to at the same time. This
obviously takes as long as a main memory access, but is simpler to implement
than write-back caches, which store a block into the main memory only when it
is thrown out of the cache. Caches also vary in their behaviour if a write access
misses: some caches load the missing block into the cache; others do not.
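The difference between the two write policies can be seen by counting main memory writes. The following sketch rests on several simplifying assumptions that are not from the notes: blocks are one word, cache capacity is ignored, the write-through cache does not load a block on a write miss, and the write-back cache does. All class names are hypothetical.

```python
class Memory:
    """Main memory that counts how often it is written."""

    def __init__(self):
        self.words = {}
        self.writes = 0

    def write(self, addr, value):
        self.words[addr] = value
        self.writes += 1


class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # addr -> value

    def write(self, addr, value):
        if addr in self.lines:          # update the cached copy on a hit
            self.lines[addr] = value
        self.memory.write(addr, value)  # always write main memory as well


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # addr -> [value, dirty]

    def write(self, addr, value):
        # Assumption: a write miss allocates the line in the cache.
        self.lines[addr] = [value, True]

    def evict(self, addr):
        """Throw a block out; store it to memory only if it was written."""
        value, dirty = self.lines.pop(addr)
        if dirty:
            self.memory.write(addr, value)


if __name__ == "__main__":
    wt_mem, wb_mem = Memory(), Memory()
    wt, wb = WriteThroughCache(wt_mem), WriteBackCache(wb_mem)
    for value in range(3):
        wt.write(8, value)
        wb.write(8, value)
    wb.evict(8)
    print(wt_mem.writes, wb_mem.writes)
```

Three writes to the same word cost three memory writes with write-through, but only one with write-back, at the price of tracking which blocks have been written.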

VIII.3 Virtual memory

The principle of virtual memory is that those parts of the code and data areas
of each active process which are actually being used are kept in main memory,
while those parts not being accessed during the current phase of execution are
kept on backing store, i.e. disk. The operation of the virtual memory system is
transparent to the programmer — the transfer of parts of the program between
main memory and disk happens automatically under control of the operating
system, as needed. There are two main benefits associated with virtual memory:
programs can run on computers with less memory than the total code and data
of the program, and the main memory is shared efficiently and safely among
different processes.
The compiler compiles the program assuming that the machine has a very
large memory, all available to the program — the virtual address space. All ad-
dresses in the program refer to this space, i.e. are virtual addresses, typically 32
bits long. The program in execution uses virtual addresses to refer to data and
to instructions — the program counter, stack pointer and pointers to other data
items contain virtual addresses.
The real memory space of the computer is usually smaller than the virtual
memory. The operating system keeps some chunks of the process code and data
in main memory, leaving the rest on disk. The key to efficiency here is to keep
the active parts of the program in main memory. This is very similar to the
operation of caches.
All addresses output by the processor as it fetches instructions or accesses
data must be translated from virtual addresses to the real addresses (called phys-
ical addresses) of the locations at which the instructions or data currently lie.
This is called address translation, and is done by a piece of hardware between the
processor and memory called a memory management unit (MMU). The address
translation can happen before or after accessing the cache; there are advantages
and disadvantages to both approaches, but this is out of the scope of Inf2C.
When a process is loaded into memory in order to run, the place where it is loaded
depends on what other processes are already active. The compiler cannot know
in advance where in memory the process will be loaded. The address at which
each instruction and each data item reside in memory will be different each
time the process is loaded, and that leads to difficulties with instructions that
contain absolute addresses, such as absolute jumps. Any instruction that refers
to another instruction or to a data item using its absolute address (as opposed
to a relative offset) will need to be modified in some way, to take account of the
actual position in memory the process is loaded at. This modification is known
as relocation, and is inconvenient for systems which do not use virtual memory.
[Figure: the virtual address being accessed consists of a 22-bit page number P
and a 10-bit byte-within-page field D. P is added to the page table base address
(held in an MMU register) to select an entry in the page table, which is in the
system area of main memory and has one entry per page. Each entry holds the
residence bit R, the modified bit M, the accessed bit A and the frame number F.
The real address is the 16-bit F followed by the 10-bit D.]

Figure VIII.2: Address translation.

Virtual memory removes the need for relocation, as all addresses in the program
are virtual and guaranteed to be unique to that program.
The virtual memory space is divided into a large number of equally sized
chunks called pages. Page sizes vary between computer systems, but typical sizes
are 1K and 4K bytes, and the 32 bit virtual address therefore consists of two
fields, the page number (the most significant 22 bits) and the page offset, the num-
ber of the byte within the page (the least significant 10 bits for 1K byte pages).
The main memory of the machine is also divided up into chunks, of the same
size as pages, called page frames, so the physical address, which might be 26 bits
long, will consist of a 16 bit frame number and a 10 bit byte address within the
frame. The OS maintains the currently needed parts of the program code and
data areas in physical memory by loading the required pages of the program’s
virtual memory space into a set of (not necessarily contiguous) page frames in
physical memory.
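The field layout for 1K byte pages can be made concrete with a little bit arithmetic. This sketch assumes the sizes used in the text (32-bit virtual address, 22-bit page number, 10-bit offset, 16-bit frame number); the function names are illustrative.

```python
OFFSET_BITS = 10                       # 1K byte pages
OFFSET_MASK = (1 << OFFSET_BITS) - 1   # low 10 bits
FRAME_BITS = 16                        # 26-bit physical address


def split_virtual(vaddr):
    """Return (page_number, page_offset) of a 32-bit virtual address."""
    return vaddr >> OFFSET_BITS, vaddr & OFFSET_MASK


def make_physical(frame, offset):
    """Concatenate a 16-bit frame number with a 10-bit byte offset."""
    if not 0 <= frame < (1 << FRAME_BITS):
        raise ValueError("frame number does not fit in 16 bits")
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in 10 bits")
    return (frame << OFFSET_BITS) | offset


if __name__ == "__main__":
    page, offset = split_virtual(0x1234)
    print(page, hex(offset))               # page 4, offset 0x234
    print(hex(make_physical(7, offset)))   # 0x1e34 if page 4 sits in frame 7
```

The offset passes through unchanged; only the page number is replaced by a frame number, which is why pages and frames must be the same size.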

VIII.3.1 Address translation

Figure VIII.2 shows the address translation mechanism in the memory man-
agement unit (MMU). The page table contains information about each page in
the virtual memory space used by the running process. The table itself is in
physical memory, in that part of the memory reserved for use by the operating
system. In the MMU there is a register containing the (physical) address of the
start of the page table. When a virtual address is presented to the MMU for
translation, the page number P is added to the page table base address, to access
a table entry describing the location of the page. The F field of that table entry
is simply the number of the frame in physical memory containing the page, and
so if it is concatenated with the lower 10 bits (D) of the virtual address, we get
the corresponding physical memory address.
What happens if the process tries to access a page which is not held in main
memory, but is on disk? There is a bit in the page table entry, the R bit, which is
set only if the page is in main memory. If the R bit for the page being accessed is
not set, a page fault exception occurs, interrupting the process and invoking the
OS page fault handler. The handler fetches the missing page from disk and loads
it into an unused frame in main memory, then sets the R bit and the F field in the
page table entry. The interrupted process can then resume. This technique of
loading pages into memory when they are first accessed is called demand paging.
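A minimal sketch of translation with demand paging follows. It models the page table as a Python dict from page number to entry, signals a page fault with an exception, and has a handler that takes any free frame. The disk transfer is omitted, and the case where no frame is free is not handled. All names (PageTableEntry, PageFault, translate, access) are hypothetical.

```python
from dataclasses import dataclass

OFFSET_BITS = 10
OFFSET_MASK = (1 << OFFSET_BITS) - 1


@dataclass
class PageTableEntry:
    resident: bool = False   # R bit
    modified: bool = False   # M bit
    accessed: bool = False   # A bit
    frame: int = 0           # F field


class PageFault(Exception):
    def __init__(self, page):
        super().__init__(f"page {page} is not resident")
        self.page = page


def translate(page_table, vaddr, is_write=False):
    """What the MMU does: map a virtual address to a physical one."""
    page, offset = vaddr >> OFFSET_BITS, vaddr & OFFSET_MASK
    entry = page_table[page]
    if not entry.resident:
        raise PageFault(page)
    entry.accessed = True
    if is_write:
        entry.modified = True
    return (entry.frame << OFFSET_BITS) | offset


def access(page_table, free_frames, vaddr, is_write=False):
    """Translate, running a toy OS page fault handler on a fault."""
    try:
        return translate(page_table, vaddr, is_write)
    except PageFault as fault:
        entry = page_table[fault.page]
        entry.frame = free_frames.pop()   # disk read into the frame omitted
        entry.resident = True
        return translate(page_table, vaddr, is_write)  # resume the access


if __name__ == "__main__":
    table = {4: PageTableEntry()}
    print(hex(access(table, [7], 0x1234)))  # faults once, then 0x1e34
```

The first access to page 4 faults and claims frame 7; later accesses to the same page translate directly because the R bit is now set.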
Each process in a multi-tasking system has allocated to it a certain fraction
of all the page frames in the physical memory space. What happens if a page
fault occurs and all the process’s page frames are already in use? The OS must
pick one of the page frames for re-use, and load the new page on top of the page
currently in that frame, replacing it. If the replaced page is a data page, and any
location in the page has been written to since the page was loaded into the frame,
the old page must be written out to disk, updating the copy on disk, before the
new page is loaded over it. A bit in the page table entry for the page, the M
(Modified, alternatively called dirty) bit, is set if a write has been performed to
the page since it was loaded into the frame — if the M bit is unset, the page can
be overwritten without being saved to disk, as the previous copy on disk is still
valid.
If the page which was overwritten is accessed again by the process, it has to
be reloaded from disk (into any available frame), so ideally the OS should choose
pages for replacement which will not be accessed again for a long time. Using the
locality principle, the OS approximates this criterion by replacing pages which
have not been accessed recently. This is achieved by using a further bit in the
page table entry for each page, the A (Accessed) bit [1]. Whenever any address
in a page is accessed, the A bit is set. All the A bits in the page table are reset
periodically by the OS. Thus any page which has an unset A bit has not been
used since the last reset of the A bits, and is a good candidate for replacement
with the incoming page.
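One way to picture the replacement choice is the sketch below. It picks a resident page whose A bit is unset, and among those prefers one whose M bit is also unset so that no disk write is needed. That tie-break is an assumption added here, not something the notes prescribe. The page table is modelled as a dict and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    resident: bool = False   # R bit
    modified: bool = False   # M bit
    accessed: bool = False   # A bit
    frame: int = 0           # F field


def choose_victim(page_table):
    """Pick a resident page, preferring A unset, then M unset."""
    resident = [(page, e) for page, e in page_table.items() if e.resident]
    if not resident:
        raise ValueError("no resident pages to replace")
    # False sorts before True, so unaccessed (then unmodified) pages win.
    page, _ = min(resident, key=lambda item: (item[1].accessed, item[1].modified))
    return page


def replace_page(page_table, victim, new_page):
    """Give the victim's frame to new_page.

    Returns True if the old page was modified and therefore has to be
    written out to disk first (the disk transfer itself is omitted).
    """
    old, new = page_table[victim], page_table[new_page]
    needs_write_back = old.modified
    new.frame, new.resident = old.frame, True
    new.accessed = new.modified = False
    old.resident = old.modified = old.accessed = False
    return needs_write_back


def reset_accessed_bits(page_table):
    """Done periodically by the OS so that A reflects recent use only."""
    for entry in page_table.values():
        entry.accessed = False
```

For example, with three resident pages of which only one has both A and M unset, choose_victim returns that page and replace_page reports that no write to disk is needed.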

[1] Called the reference or use bit in P&H.

VIII.3.2 Efficiency in address translation

Address translation in a paged virtual memory system involves an access to
the page table for every access by the process to memory. The page tables them-
selves are held in the main memory of the machine, so whenever a process ac-
cesses memory, the system must make two memory accesses, one for the page
table, and one to the process’s requested location. Since main memory speed is a
major limiting factor determining the speed of a computer, this would slow the
entire computer down by a factor of up to two.
To solve this problem, the MMU contains a fast memory which holds the
necessary information from the page table entries of the most recently accessed
pages. This fast memory is called the translation lookaside buffer (TLB) and typi-
cally holds 16 to 512 entries. Each entry contains the page number of the virtual
page, the M (Modified) bit from the page table entry for that page, and the frame
number of the memory frame holding the page. When the processor outputs a
virtual memory address to the MMU, the page number is simultaneously com-
pared against the page numbers of all the TLB entries. If one of the TLB entries
matches the requested virtual address, the frame number of the page is available
immediately from that entry, and address translation is very fast. If there is no
match in the TLB, the page table lookup must proceed by accessing the ‘full’
page table, kept in main memory, and the information is loaded into the TLB,
replacing a TLB entry which has not been used recently [2].
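The TLB behaviour can be sketched as a small cache of page-to-frame mappings. This is a simplified model under stated assumptions: the page table is a dict from page number to frame number with every page resident, the M bit is left out, and least-recently-used replacement stands in for whatever the hardware actually tracks. The class name and its default capacity of 16 (the low end of the range quoted above) are illustrative.

```python
from collections import OrderedDict

OFFSET_BITS = 10
OFFSET_MASK = (1 << OFFSET_BITS) - 1


class TLB:
    def __init__(self, capacity=16):
        self.capacity = capacity
        self.entries = OrderedDict()   # page -> frame, least recent first
        self.hits = 0
        self.misses = 0

    def translate(self, page_table, vaddr):
        page, offset = vaddr >> OFFSET_BITS, vaddr & OFFSET_MASK
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)        # mark as recently used
        else:
            self.misses += 1
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # drop least recently used
            # The slow path: an extra main memory access to the page table.
            self.entries[page] = page_table[page]
        return (self.entries[page] << OFFSET_BITS) | offset


if __name__ == "__main__":
    table = {4: 7, 5: 9, 6: 1}
    tlb = TLB(capacity=2)
    for vaddr in (0x1234, 0x1234, 0x1400, 0x1800):
        tlb.translate(table, vaddr)
    print(tlb.hits, tlb.misses, list(tlb.entries))
```

Only a TLB miss pays for the second memory access, so with good locality most translations avoid the factor-of-two slowdown described above.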

[2] Hardware associated with the TLB can keep track of which entries have been
used most recently.