Memory hierarchy
The computer memory hierarchy is usually pictured as a pyramid that describes the differences among memory types. It organizes computer storage into levels:
Level 0: CPU registers
Level 1: Cache memory
Level 2: Main memory or primary memory
Level 3: Magnetic disks or secondary memory
Level 4: Optical disks or magnetic types or tertiary Memory
In the memory hierarchy, cost per bit and speed decrease while capacity increases as we move down the levels; the devices are arranged from fast to slow, from the registers down to tertiary memory.
Let us discuss each level in detail:
Level-0 − Registers
The registers are present inside the CPU and therefore have the shortest access time. They are the most expensive and the smallest in size, generally amounting to no more than a few kilobytes, and they are implemented using flip-flops.
Level-1 − Cache
Cache memory is used to store the segments of a program that are frequently accessed by the processor. It is expensive and small in size, generally a few megabytes, and is implemented using static RAM (SRAM).
Level-2 − Primary or Main Memory
Main memory communicates directly with the CPU, and with auxiliary memory devices through an I/O processor. It is less expensive than cache memory and larger in size, generally a few gigabytes, and is implemented using dynamic RAM (DRAM).
Level-3 − Secondary storage
Secondary storage devices such as magnetic disks are present at level 3. They are used as backing storage; they are cheaper than main memory and larger in size, generally a few terabytes.
Level-4 − Tertiary storage
Tertiary storage devices like magnetic tape are present at level 4. They are used to store
removable files and are the cheapest and largest in size (1-20 TB).
The memory levels can also be compared in terms of size, access time, and bandwidth.
References
Reference Books:
J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Reference Website
Memory Hierarchy Design and its Characteristics - GeeksforGeeks
What is memory hierarchy? (tutorialspoint.com)
A faster and smaller segment of memory whose access time is close to that of the registers is known as cache memory. In the memory hierarchy, cache memory has a shorter access time than primary memory. Cache memory is generally very small and is therefore used as a buffer.
Cache performance
The performance of a cache is measured by the ratio of the number of cache hits to the number of searches; this measure of performance is known as the hit ratio.
Hit ratio = (Number of cache hits) / (Number of searches)
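As a quick illustration of this formula, here is a minimal Python sketch; the address trace and the set of cached blocks are invented purely for the example.

```python
# Minimal sketch: computing the hit ratio over a simple access trace.
# The trace and the cache contents below are illustrative values only.
def hit_ratio(accesses, cached_blocks):
    hits = sum(1 for addr in accesses if addr in cached_blocks)
    return hits / len(accesses)

trace = [0x10, 0x20, 0x30, 0x40, 0x10, 0x20]   # hypothetical address trace
cached = {0x10, 0x20}                           # blocks assumed to be in the cache
print(hit_ratio(trace, cached))                 # 4 hits / 6 searches ~= 0.67
```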
L1 or Level 1 Cache: It is the first level of cache memory that is present inside the
processor. It is present in a small amount inside every core of the processor separately. The
size of this memory ranges from 2KB to 64 KB.
L2 or Level 2 Cache: It is the second level of cache memory, which may be present inside or outside the CPU core. If it is not present inside the core, it can be shared between two cores, depending on the architecture, and is connected to the processor by a high-speed bus. Its size ranges from 256 KB to 512 KB.
L3 or Level 3 Cache: It is the third level of cache memory, present outside the individual cores and shared by all the cores of the CPU. Only some high-end processors have this cache. It is used to improve the performance of the L1 and L2 caches. Its size ranges from 1 MB to 8 MB.
Cache vs RAM
Although cache and RAM are both used to increase the performance of the system, there are many differences in how they operate. For example:
RAM: the OS interacts with secondary memory to get data to be stored in primary memory (RAM).
Cache: the OS interacts with primary memory to get data to be stored in the cache.
Associative Memory
An associative memory is a memory unit whose stored information can be accessed by the content of the information itself rather than by an address or memory location. Associative memory is also known as Content Addressable Memory (CAM).
The block diagram of associative memory is shown in the figure. It includes a memory array and logic for m words with n bits per word. The argument register A and key register K each have n bits, one for each bit of a word.
The match register M has m bits, one for each memory word. Each word in memory is compared in parallel with the content of the argument register.
The words that match the bits of the argument register set a corresponding bit in the match register. After the matching process, the bits that have been set in the match register indicate that their corresponding words matched the argument.
Reading is accomplished by sequentially accessing memory for those words whose corresponding bits in the match register have been set.
The key register provides a mask for selecting a particular field or key in the argument word. The entire argument is compared with each memory word if the key register contains all 1's.
Otherwise, only those bits of the argument that have 1's in the corresponding positions of the key register are compared. Thus the key provides a mask for identifying a piece of information, and it determines how the reference to memory is made.
The following figure shows the relation between the memory array and the external registers in an associative memory.
The cells in the array are denoted by the letter C with two subscripts. The first subscript gives the word number and the second gives the bit position in the word. Therefore, cell Cij is the cell for bit j in word i.
A bit in the argument register is compared with all the bits in column j of the array, provided that Kj = 1. This is done for all columns j = 1, 2, ..., n.
If a match occurs between all the unmasked bits of the argument and the bits in word i, the corresponding bit Mi in the match register is set to 1. If one or more unmasked bits of the argument and the word do not match, Mi is cleared to 0.
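The matching operation described above can be sketched in a few lines of Python. The word values, argument and key below are invented for illustration only; bit positions where the key register K holds a 1 are the ones that take part in the comparison.

```python
# Sketch of associative (CAM) matching: M[i] is set to 1 when word i agrees
# with the argument register A in every bit position where the key register K is 1.
def cam_match(words, A, K):
    return [1 if (w & K) == (A & K) else 0 for w in words]

# Hypothetical memory array of m = 4 words, n = 8 bits per word
words = [0b10101100, 0b10100001, 0b01101100, 0b10101111]
A = 0b10100000          # argument register
K = 0b11110000          # key register: compare only the upper four bits
print(cam_match(words, A, K))   # [1, 1, 0, 1] -> words 1, 2 and 4 match
```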
Associative Cache
A fully associative cache is a type of cache designed to solve the problem of cache contention that plagues the direct mapped cache. In a fully associative cache, a data block from any memory address may be stored in any cache line, and the whole address is used as the cache tag: hence, when looking for a match, all the tags must be compared simultaneously with the requested address, which demands expensive extra hardware. However, contention is avoided completely, as no block ever needs to be flushed unless the whole cache is full, and then the least recently used block may be chosen for replacement.
A set-associative cache is a compromise solution in which the cache lines are divided into
sets, and the middle bits of its address determine which set a block will be stored in: within
each set the cache remains fully associative. A cache that has two lines per set is called two-
way set-associative and requires only two tag comparisons per access, which reduces the
extra hardware required. A direct mapped cache can be thought of as being one-way set-associative, while a fully associative cache is n-way set-associative, where n is the total number of cache lines. Finding the right balance between associativity and total cache capacity for a particular processor is a fine art; various current CPUs employ 2-way, 4-way and 8-way designs.
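The way an address is split up for these organizations can be sketched briefly in Python. The cache geometry below (32 KB capacity, 64-byte lines, 4-way set-associative) is an assumed example, not a configuration taken from the text.

```python
# Sketch: splitting a byte address into tag, set index and block offset
# for a set-associative cache. The geometry below is assumed for illustration.
CACHE_SIZE = 32 * 1024      # 32 KB cache
BLOCK_SIZE = 64             # 64-byte cache lines
WAYS = 4                    # 4-way set-associative

num_lines = CACHE_SIZE // BLOCK_SIZE        # 512 lines in total
num_sets = num_lines // WAYS                # 128 sets

def split_address(addr):
    offset = addr % BLOCK_SIZE                    # byte within the block
    set_index = (addr // BLOCK_SIZE) % num_sets   # middle bits select the set
    tag = addr // (BLOCK_SIZE * num_sets)         # remaining high-order bits
    return tag, set_index, offset

print(split_address(0x12F40))   # (9, 61, 0) for this geometry
```

Setting WAYS to 1 reduces this to a direct mapped cache, while setting WAYS equal to num_lines makes it fully associative (a single set, so the whole address above the offset becomes the tag).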
Associative memory is also known as content addressable memory (CAM) or associative
storage or associative array. It is a special type of memory that is optimized for performing
searches through data, as opposed to providing a simple direct access to the data based on the
address.
It can store a set of patterns as memories; when the associative memory is presented with a key pattern, it responds by producing whichever stored pattern most closely resembles or relates to the key pattern.
This can be viewed as data correlation: the input data are correlated with the data stored in the CAM.
It is built from conventional semiconductor memory (usually RAM) with added comparison circuitry that enables a search operation to complete in a single clock cycle. It is, in effect, a hardware search engine, a special type of computer memory used in certain very-high-speed searching applications.
Applications of associative memory:
It is used in memory allocation and management.
It is widely used in database management systems, etc.
Advantages of Associative memory:
1. It is used where search time needs to be less or short.
2. It is suitable for parallel searches.
3. It is often used to speed up databases.
4. It is used in page tables used by the virtual memory and used in neural networks.
Disadvantages of Associative memory:
1. It is more expensive than RAM.
2. Each cell must have storage capability and logic circuits for matching its content with an external argument.
Associative Memory
The time required to find an item stored in memory can be significantly reduced if the stored data can be identified by its content rather than by its address. A memory unit accessed by content is known as an associative memory or content addressable memory (CAM). This type of memory is accessed simultaneously and in parallel on the basis of the data content rather than a specific address or location. If a word is written into associative memory, no address is given: the memory is capable of finding an empty, unused location in which to store the word. When reading, the memory detects all words that match the specified content, or the specified part of it, and marks them for reading.
Cache Memory
If the active part of the program and data can be kept in fast memory, the total execution
time can be reduced significantly. Such memory is known as cache memory, which is
inserted between the CPU and the main memory. To make this arrangement effective, the cache needs to be much faster than main memory. This approach is more economical than using fast memory devices to implement the entire main memory.
Associative memory and cache memory can be compared as follows:
Associative memory reduces the time required to find an item stored in memory; cache memory reduces the average memory access time.
In associative memory, data are accessed by their content; in cache memory, data are accessed by their address.
The basic characteristic of associative memory is its logic circuitry for matching content; the basic characteristic of cache memory is its fast access.
Associative memory is not as expensive as cache memory; cache memory is expensive as compared to associative memory.
References
Reference Books:
J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Reference Website
Differences between Associative and Cache Memory - GeeksforGeeks
2.2.3 Cache Size vs Block Size
CACHE PERFORMANCE
When the processor needs to read or write a location in main memory, it first checks for a
corresponding entry in the cache.
If the processor finds that the memory location is in the cache, a cache hit has occurred and the data is read from the cache.
If the processor does not find the memory location in the cache, a cache miss has
occurred. For a cache miss, the cache allocates a new entry and copies in data from
main memory, then the request is fulfilled from the contents of the cache.
The performance of cache memory is frequently measured in terms of a quantity called Hit
ratio.
Cache performance can be improved by using a larger cache block size, using higher associativity, reducing the miss rate, reducing the miss penalty, and reducing the time to hit in the cache.
CACHE LINES
Cache memory is divided into equal-size partitions called cache lines.
While designing a computer's cache system, the size of the cache lines is an important parameter, and it affects many other parameters of the caching system.
The following results discuss the effect of changing the cache block (or line) size in a caching system.
Result-01: Effect of Changing Block Size on Spatial Locality-
The larger the block size, the better the spatial locality.
Explanation-
A larger block brings more neighboring words into the cache on each miss, so nearby references are more likely to hit.
Result-02: Effect of Changing Block Size on Cache Tag in Direct Mapped Cache-
In a direct mapped cache, changing the block size does not affect the cache tag.
Explanation-
Increasing the block size decreases the number of lines in the cache.
With the increase in block size, the number of bits in the block offset increases.
However, with the decrease in the number of cache lines, the number of bits in the line number decreases.
Thus, the total (number of bits in line number + number of bits in block offset) remains constant, and there is no effect on the cache tag. The same argument applies in reverse when the block size is decreased.
Result-03: Effect of Changing Block Size on Cache Tag in Fully Associative Cache-
In a fully associative cache, decreasing the block size enlarges the cache tag, and increasing the block size shrinks it.
Explanation-
Decreasing the block size decreases the number of bits in the block offset; with fewer bits in the block offset, the number of bits in the tag increases.
Increasing the block size increases the number of bits in the block offset; with more bits in the block offset, the number of bits in the tag decreases.
Result-04: Effect of Changing Block Size on Cache Tag in Set Associative Cache-
In a set associative cache, changing the block size does not affect the cache tag, for the same reason as in the direct mapped cache: the bits gained or lost by the block offset are exactly balanced by the bits lost or gained by the set number.
Result-05: Effect of Changing Block Size on Cache Miss Penalty-
A larger block size increases the cache miss penalty.
Explanation-
When a cache miss occurs, the block containing the required word has to be brought in from main memory.
If the block size is small, the time taken to bring the block into the cache is less, so a smaller miss penalty is incurred.
If the block size is large, the time taken to bring the block into the cache is more, so a larger miss penalty is incurred.
Result-06: Effect of Changing Block Size on Cache Hit Time-
Explanation-
Cache hit time is the time required to find out whether the required block is in the cache or not; it involves comparing the tag of the generated address with the tags of the cache lines.
The smaller the cache tag, the less time taken to perform the comparisons; hence a smaller cache tag ensures a lower cache hit time.
On the other hand, the larger the cache tag, the more time taken to perform the comparisons; thus a larger cache tag results in a higher cache hit time.
To summarize the effect of block size on the cache tag and hit time:
In a direct mapped cache and in a set associative cache, changing the block size has no effect on the cache tag.
In a fully associative cache, decreasing the block size makes the cache tag larger.
Thus, a smaller block size does not imply a smaller cache tag in any organization; "a smaller block size implies a larger cache tag" holds only for a fully associative cache.
A larger cache tag does not imply a lower cache hit time; rather, the cache hit time is increased.
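These results can be checked with a short computation. The 16-bit address width and 256-byte cache used below are assumed values chosen only to show how the tag width behaves as the block size changes.

```python
# Sketch: how the tag width varies with block size in a direct mapped cache
# versus a fully associative cache (assumed 16-bit addresses, 256-byte cache).
from math import log2

ADDRESS_BITS = 16
CACHE_SIZE = 256   # bytes (assumed)

for block_size in (8, 16, 32):
    lines = CACHE_SIZE // block_size
    offset_bits = int(log2(block_size))
    line_bits = int(log2(lines))                   # line-number bits (direct mapped)
    dm_tag = ADDRESS_BITS - line_bits - offset_bits
    fa_tag = ADDRESS_BITS - offset_bits            # fully associative: no line number
    print(block_size, dm_tag, fa_tag)
# Output: block 8 -> tags (8, 13), block 16 -> (8, 12), block 32 -> (8, 11):
# the direct mapped tag stays constant while the fully associative tag shrinks.
```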
To see the cache sizes on a Windows PC: right-click on the Start button and click on Task Manager. On the Task Manager screen, click on the Performance tab, then click on CPU in the left pane. In the right pane, the L1, L2 and L3 cache sizes are listed near the "Virtualization" section.
Data is transferred between main memory and the cache in fixed-size chunks; the size of these chunks is called the cache line size. Common cache line sizes are 32, 64 and 128 bytes. A cache can only hold a limited number of lines, determined by the cache size. For example, a 64-kilobyte cache with 64-byte lines has 1024 cache lines.
Cache block – the basic unit of cache storage; it may contain multiple bytes or words of data. Because different regions of memory may be mapped into a block, the tag is used to differentiate between them.
Valid bit – a bit of information that indicates whether the data in a block is valid (1) or not (0).
For example, with byte addressing, a cache block size of 32 bytes and four-byte words, each block holds 8 words; if a trace of 12 accesses to such a cache produces four hits, the hit rate is 4/12, or about 33%. Similarly, a cache memory with a line size of eight 64-bit words and a capacity of 4K words has 4096 / 8 = 512 lines.
The main elements of cache design are: cache size, block size, mapping function, replacement algorithm, and write policy. These are explained below.
1. Cache Size:
Even a moderately small cache can have a significant impact on performance.
2. Block Size:
Block size is the unit of data exchanged between the cache and main memory. As the block size increases from very small to larger sizes, the hit ratio at first increases because of the principle of locality: data in the neighborhood of a referenced word are likely to be referenced in the near future, so a larger block brings more useful data into the cache.
a. The hit ratio begins to decrease, however, as the block becomes even larger and the probability of using the newly fetched data becomes less than the probability of reusing the data that has to be evicted from the cache to make room for the new block.
3. Mapping Function:
When a new block of data is read into the cache, the mapping function determines which cache location the block will occupy. Two constraints affect the design of the mapping function. First, when one block is read in, another may have to be replaced.
a. We would like to do this in a way that minimizes the probability of replacing a block that will be needed in the near future. The more flexible the mapping function, the more scope we have to design a replacement algorithm that maximizes the hit ratio. Second, the more flexible the mapping function, the more complex the circuitry required to search the cache to determine whether a given block is present.
4. Replacement Algorithm:
When a new block must be brought into a full cache (or a full set), the replacement algorithm chooses which existing block to evict; common policies such as LRU, FIFO, LFU and random replacement are described later in this section.
5. Write Policy:
If the contents of a block in the cache have been altered, it is necessary to write the block back to main memory before replacing it. The write policy dictates when this memory write operation takes place. At one extreme, the writing can occur whenever the block is updated (write-through).
a. At the other extreme, the writing occurs only when the block is replaced (write-back). The latter policy minimizes memory write operations but leaves main memory in an obsolete state; this can interfere with multiple-processor operation and with direct access by I/O hardware modules.
References
Reference Books:
J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson Education.
Text Books:
Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Other References
knowledgeburrow.com
Cache Memory Design - GeeksforGeeks
https://www.gatevidyalay.com/cache-line-cache-line-size-cache-memory/
https://www.geeksforgeeks.org/cache-memory-in-computer-organization/
https://stackoverflow.com/questions/8107965/concept-of-block-size-in-a-cache
Memory Hierarchy
The cache is the part of the hierarchy present next to the CPU. It is used to store frequently used data and instructions. Cache is generally very costly: the larger the cache memory, the higher the cost. Hence, it is used in smaller capacities to minimize costs, and to make up for its limited capacity it must be used to its full potential. Optimization of cache performance ensures that it is utilized efficiently, to its full potential.
AMAT helps in analyzing the Cache memory and its performance. The lesser the AMAT,
the better the performance is. AMAT can be calculated as,
AMAT = (Hit ratio * Cache access time) + (Miss ratio * Miss time)
= (h * tc) + (1-h) * (tc + tm)
Note: Main memory is accessed only when a cache miss occurs, so the cache access time tc is included in the miss term along with the main memory access time tm.
Example 1: What is the average memory access time for a machine with a cache hit rate of 75%, a cache access time of 3 ns and a main memory access time of 110 ns?
Solution:
Average Memory Access Time (AMAT) = (h * tc) + (1-h) * (tc + tm)
Given,
Hit Ratio (h) = 75/100 = 3/4 = 0.75
Miss Ratio (1-h) = 1 - 0.75 = 0.25
Cache access time (tc) = 3 ns
Main memory access time (tm) = 110 ns
AMAT = (0.75 * 3) + 0.25 * (3 + 110) = 2.25 + 28.25 = 30.5 ns
Note: AMAT can also be calculated as Hit Time + (Miss Rate * Miss Penalty)
Example 2: Calculate AMAT when Hit Time is 0.9 ns, Miss Rate is 0.04, and Miss
Penalty is 80 ns.
Solution :
Average Memory Access Time(AMAT) = Hit Time + (Miss Rate * Miss Penalty)
Here, Given,
Hit time = 0.9 ns
Miss Rate = 0.04
Miss Penalty = 80 ns
Average Memory Access Time(AMAT) = 0.9 + (0.04*80)
= 0.9 + 3.2
= 4.1 ns
Hence, if Hit time, Miss Rate, and Miss Penalty are reduced, the AMAT reduces which in
turn ensures optimal performance of the cache.
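Both ways of expressing AMAT used in the examples above can be checked with a few lines of Python; the numbers plugged in are simply those given in Examples 1 and 2.

```python
# AMAT computed with the two equivalent formulations used above.
def amat_hit_miss(h, t_cache, t_main):
    # AMAT = h*tc + (1-h)*(tc + tm): main memory is accessed only on a miss,
    # and the cache lookup time tc is paid in either case.
    return h * t_cache + (1 - h) * (t_cache + t_main)

def amat_penalty(hit_time, miss_rate, miss_penalty):
    # AMAT = hit time + miss rate * miss penalty
    return hit_time + miss_rate * miss_penalty

print(amat_hit_miss(0.75, 3, 110))    # Example 1 -> 30.5 ns
print(amat_penalty(0.9, 0.04, 80))    # Example 2 -> 4.1 ns
```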
Methods for reducing Hit Time, Miss Rate, and Miss Penalty:
1. Larger block size: If the block size is increased, spatial locality can be exploited more effectively, which reduces the miss rate. However, it may increase the miss penalty, and the block size cannot be increased beyond a certain point without the miss rate starting to rise again: a larger block size means fewer blocks in the cache, which results in more conflict misses.
2. Larger cache size: Increasing the cache size reduces capacity misses, thereby decreasing the miss rate. But a larger cache increases the hit time and the power consumption.
3. Higher associativity: Higher associativity reduces conflict misses, thereby helping to reduce the miss rate.
1. Multi-Level Caches: If there is only one level of cache, then we need to decide between
keeping the cache size small in order to reduce the hit time or making it larger so that the
miss rate can be reduced. Both of them can be achieved simultaneously by introducing
cache at the next levels.
Suppose a two-level cache is considered:
The first-level cache is smaller in size and has a cycle time comparable to that of the CPU.
The second-level cache is larger than the first-level cache but still much faster than main memory. Its larger size helps prevent many accesses from going all the way to main memory, thereby reducing the miss penalty (a two-level AMAT sketch follows the figure below).
Hierarchical representation of Memory
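The same idea nests for a two-level cache: an L1 miss costs an L2 lookup, and only an L2 miss goes all the way to main memory. The timing and miss-rate numbers in this small sketch are assumed values, used only to show the shape of the calculation.

```python
# Sketch: AMAT for a two-level cache hierarchy (all numbers are assumed).
def two_level_amat(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, mem_time):
    # An L1 miss costs an L2 lookup; an L2 miss additionally costs a memory access.
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * mem_time)

# Assumed: 1 ns L1 hit, 5% L1 misses, 6 ns L2 hit, 20% L2 local misses, 80 ns memory.
print(two_level_amat(1, 0.05, 6, 0.20, 80))   # 1 + 0.05 * (6 + 16) = 2.1 ns
```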
2. Critical word first and Early Restart: Generally, the processor requires one word of
the block at a time. So, there is no need of waiting until the full block is loaded before
sending the requested word. This is achieved using:
The critical word first: It is also called a requested word first. In this method,
the exact word required is requested from the memory and as soon as it arrives,
it is sent to the processor. In this way, two things are achieved, the processor
continues execution, and the other words in the block are read at the same time.
Early Restart: In this method, the words are fetched in the normal order. When
the requested word arrives, it is immediately sent to the processor which
continues execution with the requested word.
These are the basic methods through which the performance of cache can be optimized.
1. Reducing the hit time – Small and simple first-level caches and way-prediction. Both
techniques also generally decrease power consumption.
2. Increasing cache bandwidth – Pipelined caches, multi-banked caches, and non-
blocking caches. These techniques have varying impacts on power consumption.
3. Reducing the miss penalty – Critical word first and merging write buffers. These
optimizations have little impact on power.
4. Reducing the miss rate – Compiler optimizations. Obviously, any improvement at
compile time improves power consumption.
5. Reducing the miss penalty or miss rate via parallelism – Hardware prefetching and
compiler prefetching. These optimizations generally increase power consumption,
primarily due to prefetched data that are unused.
References
Reference Books:
J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson
Education.
Text Books:
Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Reference Website
Basic Cache Optimization Techniques - GeeksforGeeks
Optimizing Cache Memory Performance (And the Math Behind It All) (aberdeen.com)
Replacement algorithms are used when there is no available space in the cache in which to place new data. Four of the most common cache replacement algorithms are described below:
Least Recently Used (LRU):
The LRU algorithm selects for replacement the item that has been least recently used by the CPU.
First-In-First-Out (FIFO):
The FIFO algorithm selects for replacement the item that has been in the cache for the longest time.
Least Frequently Used (LFU):
The LFU algorithm selects for replacement the item that has been least frequently used by the CPU.
Random:
The random algorithm selects for replacement a cache line chosen at random. In direct mapping, no replacement algorithm is needed, since each block can be placed in only one cache line.
Example: For the reference sequence 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 and a cache memory with 3 lines, the LRU replacement policy gives a total of 8 misses (the only hits are the two accesses to 0 that follow its initial load).
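The miss count above can be verified with a short simulation. The sketch below models a fully associative cache with an LRU policy; the reference sequence and line count come from the example, everything else is just illustrative Python.

```python
# Sketch: simulating LRU replacement for the example reference sequence.
from collections import OrderedDict

def lru_misses(sequence, num_lines):
    cache = OrderedDict()              # keys kept in least-recently-used order
    misses = 0
    for block in sequence:
        if block in cache:
            cache.move_to_end(block)        # hit: mark as most recently used
        else:
            misses += 1                     # miss: bring the block in
            if len(cache) == num_lines:
                cache.popitem(last=False)   # evict the least recently used block
            cache[block] = True
    return misses

print(lru_misses([7, 0, 1, 2, 0, 3, 0, 4, 2, 3], 3))   # 8 misses (2 hits)
```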
Belady's Anomaly: For some cache replacement algorithms (notably FIFO), the page fault or miss rate can increase as the number of allocated frames increases.
Example: Consider the same sequence 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 with a cache memory of 4 lines.
Least Frequently Used in more detail: this cache algorithm uses a counter to keep track of how often an entry is accessed; the entry with the lowest count is removed first. The method isn't used that often, as it does not account for an item that had an initially high access rate but has not been accessed for a long time.
References
Reference Books:
J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson
Education.
Text Books:
Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Other References
What is Associative Cache? - Computer Notes (ecomputernotes.com)
http://www.eazynotes.com/notes/computer-system-architecture/slides/cache-
memory.pdf
https://searchstorage.techtarget.com/definition/cache-algorithm
https://www.includehelp.com/cso/types-of-cache-replacement-policies.aspx
Paging and segmentation are processes by which data is stored to, then retrieved from, a
computer's storage disk.
Paging is a computer memory management function that presents storage locations to the
computer's CPU as additional memory, called virtual memory. Each piece of data needs a
storage address.
Managing computer memory is a basic operating system function -- both paging and
segmentation are basic functions of the OS. No system can efficiently rely on limited RAM
alone. So, the computer’s memory management unit (MMU) uses the storage
disk, HDD or SSD, as virtual memory to supplement RAM.
WHAT IS PAGING?
As mentioned above, the memory management function called paging specifies storage
locations to the CPU as additional memory, called virtual memory. The CPU cannot directly
access storage disk, so the MMU emulates memory by mapping pages to frames that are in
RAM.
Before we launch into a more detailed explanation of pages and frames, let’s define some
technical terms.
Page: A fixed-length contiguous block of virtual memory residing on disk.
Frame: A fixed-length contiguous block located in RAM, whose size is identical to that of a page.
Physical memory: The computer’s random access memory (RAM), typically
contained in DIMM cards attached to the computer’s motherboard.
Virtual memory: Virtual memory is a portion of an HDD or SSD that is reserved to
emulate RAM. The MMU serves up virtual memory from disk to the CPU to reduce
the workload on physical memory.
Virtual address: The CPU generates a virtual address for each active process. The
MMU maps the virtual address to a physical location in RAM and passes the address
to the bus. A virtual address space is the range of virtual addresses under CPU
control.
Physical address: The physical address is a location in RAM. The physical address
space is the set of all physical addresses corresponding to the CPU’s virtual addresses.
A physical address space is the range of physical addresses under MMU control.
By assigning an address to a piece of data using a "page table" between the CPU and the
computer's physical memory, a computer's MMU enables the system to retrieve that data
whenever needed.
Paging
Page number (p) -- the page number is used as an index into the page table, which contains the base address of each page in physical memory.
Page offset (d) -- the page offset is combined with the base address to define the physical memory address.
Fig 2.8.2 Paging Table Architecture
A page table stores the definition of each page. When an active process requests data, the
MMU retrieves corresponding pages into frames located in physical memory for faster
processing. The process is called paging.
The MMU uses page tables to translate virtual addresses to physical ones. Each table entry
indicates where a page is located: in RAM or on disk as virtual memory. A system may use a single-level or a multi-level page table, for example with separate tables for applications and segments.
However, constant table lookups can slow down the MMU. A memory cache called the
Translation Lookaside Buffer (TLB) stores recent translations of virtual to physical addresses
for rapid retrieval. Many systems have multiple TLBs, which may reside at different
locations, including between the CPU and RAM, or between multiple page table levels.
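A single-level page-table translation can be sketched as follows. The page size, the page-table contents and the virtual address are invented for illustration; a real MMU would consult the TLB before walking the table and would trigger a page fault for pages that are not resident in RAM.

```python
# Sketch: translating a virtual address with a single-level page table.
# Page size, page-table contents and the example address are assumed values.
PAGE_SIZE = 4096                     # 4 KB pages

# page number -> frame number (None means the page is on disk, not in RAM)
page_table = {0: 5, 1: 2, 2: None, 3: 7}

def translate(virtual_address):
    page_number = virtual_address // PAGE_SIZE   # index into the page table
    offset = virtual_address % PAGE_SIZE         # offset is unchanged by translation
    frame = page_table.get(page_number)
    if frame is None:
        raise LookupError("page fault: page %d is not in RAM" % page_number)
    return frame * PAGE_SIZE + offset            # physical address

print(hex(translate(0x1234)))   # page 1, offset 0x234 -> frame 2 -> 0x2234
```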
Different frame sizes are available for data sets with larger or smaller pages and matching-
sized frames. 4KB to 2MB are common sizes, and GB-sized frames are available in high-
performance servers.
WHAT IS SEGMENTATION?
The process known as segmentation is a virtual process that creates address spaces of various
sizes in a computer system, called segments. Each segment is a different virtual address space
that directly corresponds to process objects.
When a process executes, segmentation assigns related data into segments for faster
processing. The segmentation function maintains a segment table that includes physical
addresses of the segment, size, and other data.
Segmentation is a technique to break memory into logical pieces where each piece
represents a group of related information.
For example, there may be a code segment and data segments for each process, a data segment for the operating system, and so on.
Segmentation can be implemented with or without paging.
Unlike pages, segments have varying sizes, which eliminates internal fragmentation.
External fragmentation still exists, but to a lesser extent.
Fig 2.8.4 Logical Address Space
Segment number (s) -- the segment number is used as an index into the segment table, which contains the base address of each segment in physical memory and the limit (length) of the segment.
Segment offset (o) -- the segment offset is first checked against the limit and is then combined with the base address to define the physical memory address.
The CPU generates virtual addresses for running processes. Segmentation translates the CPU-
generated virtual addresses into physical addresses that refer to a unique physical memory
location. The translation is not strictly one-to-one: different virtual addresses can map to the
same physical address.
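Segment translation with the limit check described above can be sketched in the same way; the segment-table entries and the example address are assumed values.

```python
# Sketch: translating a (segment number, offset) pair using a segment table.
# Each entry holds the segment's base physical address and its limit (length).
segment_table = {0: (0x1000, 0x400),    # e.g. code segment: base 0x1000, limit 1 KB
                 1: (0x5000, 0x200),    # e.g. data segment: base 0x5000, limit 512 B
                 2: (0x8000, 0x100)}    # assumed values throughout

def translate(segment, offset):
    base, limit = segment_table[segment]
    if offset >= limit:                  # the offset is first checked against the limit
        raise MemoryError("segmentation fault: offset beyond segment limit")
    return base + offset                 # then combined with the base address

print(hex(translate(1, 0x1F0)))   # 0x5000 + 0x1F0 = 0x51f0
# translate(1, 0x250) would fail: 0x250 exceeds segment 1's limit of 0x200
```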
SEGMENTED PAGING
Some modern computers use a function called segmented paging. Main memory is divided
into variably-sized segments, which are then divided into smaller fixed-size pages on disk.
Each segment contains a page table, and there are multiple page tables per process.
Each of the tables contains information on every segment page, while the segment table has
information about every segment. Segment tables are mapped to page tables, and page tables
are mapped to individual pages within a segment.
Advantages include less memory usage, more flexibility on page sizes, simplified memory
allocation, and an additional level of data access security over paging. The process does not
cause external fragmentation.
Paging Advantages
On the programmer level, paging is a transparent function and does not require
intervention.
No external fragmentation.
No internal fragmentation on updated OS’s.
Frames do not have to be contiguous.
Segmentation Advantages
No internal fragmentation.
Segment tables consume less space compared to page tables, so they take up less memory.
Average segment sizes are larger than most page sizes, which allows segments to store more process data.
Less processing overhead.
It is simpler to relocate segments than to relocate contiguous address spaces on disk.
PAGING VS. SEGMENTATION
Size:
Paging: Fixed block size for pages and frames. Computer hardware determines
page/frame sizes.
Segmentation: Variable size segments are user-specified.
Fragmentation:
Paging: Older systems were subject to internal fragmentation by not allocating entire
pages to memory. Modern OS’s no longer have this problem.
Segmentation: Segmentation leads to external fragmentation.
Tables:
Paging: Page tables direct the MMU to the page location and status. This is a slower process than segment-table lookup, but the TLB memory cache accelerates it.
Segmentation: Segmentation tables contain segment ID and information, and are
faster than direct paging table lookups.
Availability:
Paging: Widely available on CPUs and as MMU chips.
Segmentation: Windows servers may support backwards compatibility, while Linux
has very limited support.
References
Reference Books:
J.P. Hayes, “Computer Architecture and Organization”, Third Edition.
Mano, M., “Computer System Architecture”, Third Edition, Prentice Hall.
Stallings, W., “Computer Organization and Architecture”, Eighth Edition, Pearson
Education.
Text Books:
Carpinelli, J.D., “Computer Systems Organization & Architecture”, Fourth Edition, Addison Wesley.
Patterson and Hennessy, “Computer Architecture”, Fifth Edition, Morgan Kaufmann.
Other References
https://www.ques10.com/p/10067/what-is-virtual-memory-explain-the-role-of-pagin-
1/
https://www.enterprisestorageforum.com/storage-hardware/paging-and-
segmentation.html
https://www.cmpe.boun.edu.tr/~uskudarli/courses/cmpe235/Virtual%20Memory.pdf