
MODULE 4: MEMORY SYSTEM ORGANIZATION & ARCHITECTURE


Ideal Memory Characteristics
• CPU should have rapid, uninterrupted access to external
memories.
• Memory speed should match CPU speed.
• Unfortunately, such high-speed memories are very expensive.
• So the general approach is to distribute information over various memory types that differ in performance and cost.
Principles of Locality
• SPATIAL LOCALITY
• The locality principle stating that if a data location is referenced,
data locations with nearby addresses will tend to be referenced
soon

• TEMPORAL LOCALITY
• The principle stating that if a data location is referenced, it will tend to be referenced again soon.
MEMORY HIERARCHY
Conceptual Organisation of Multilevel memory

[Figure: the CPU (IC 1, the microprocessor) contains the register file and the level-1 cache; the level-2 cache occupies ICs 2 to m; main memory occupies ICs m to n; secondary memory is hard disk, etc.]


Memory Types
• CPU registers
• High speed -- costly
• Working memory for temporary storage of instructions and data
• Usually general-purpose registers are used
• Size: comparatively small, e.g. 32 data words
• They can be accessed within a clock cycle
• Main (primary) memory
• Fairly fast external memory
• Storage addressed by load and store instructions
• Generally sized in megabytes; nowadays we have GBs
• 1 MB = 2^20 bytes and 1 GB = 2^10 MB
• Access time: five or more clock cycles
Continued…
• Secondary Memory
• Larger capacity – many gigabytes
• Slower
• Acts as overflow memory when the capacity of main memory is exceeded
• Accessed via I/O programs
• Eg: hard disks, CD-ROMs, etc.
• Cache
• Positioned logically between registers and main memory
• Capacity less than main memory and greater than registers; speed greater than main memory and less than registers
• External memory → main memory and cache memory together
Continued…
• Random Access Memory

• Sequential Access Memory

• Semi Random Access memory


Understanding RAM and ROM (video available on YouTube)
• Refer to the uploaded video Memory Concepts 1

• Or refer to the link

• https://www.youtube.com/watch?v=ufGRoLOvM9I
Questions for discussion
• When will RAM be used?

• When will ROM be used?


Random Access in RAM

• Refer to the link


• https://www.youtube.com/watch?v=Kav6oOFDQSA
Question for discussion
• How is random access possible in RAM?

• How can a random-access memory be modified to be accessed sequentially?
Addressing and Data (Read/Write) in RAM
• Refer to the uploaded video Memory Concepts 2 (in Moodle)

• Or refer to the link

• https://www.youtube.com/watch?v=UaFSsD0LPS8
Questions for discussion
• When an address is given, how is only that address accessed for a read/write?
Basic Structure of Memory hierarchy
Continued…
Continued…
• There are three primary technologies used in building memory hierarchies:
• Main memory is implemented from DRAM (Dynamic Random Access Memory)
• Caches use SRAM (Static Random Access Memory)
• Secondary memory is implemented with magnetic disk (and, increasingly, flash)


Continued…
What is SRAM?
• SRAM is a type of RAM that holds data in a static form, that is, for as long as the memory has power

• An SRAM cell stores a bit of data on two cross-coupled inverters (four transistors), plus access transistors

• SRAM is best suited for fast, small memories such as the CPU's cache and register storage

• SRAM is also often found in hard drives as disk cache

• SRAM access time is about 10 nanoseconds

• SRAM's cycle time is much shorter than DRAM's because it does not need to refresh: it never has to stop between accesses to refresh
What is DRAM?
• Dynamic random-access memory (DRAM) is a type of random-access memory that stores each bit of data in a separate capacitor within an integrated circuit.

• The capacitor can be either charged or discharged; these two states are taken to represent the two values of a bit, conventionally called 0 and 1.

• The capacitors slowly discharge, and the information eventually fades unless the capacitor charge is refreshed periodically.

• The main memory (the "RAM") in personal computers is dynamic RAM (DRAM).
Memory Hierarchy and the components used
• Register files → fast static RAMs with multiple ports. Such RAMs are distinguished by having dedicated read and write ports, whereas ordinary multi-ported SRAMs usually read and write through the same ports

• Cache → SRAM with a single read/write port

• Main memory → DRAM

• Secondary memory → hard disks and CDs

Upper and Lower level in Memory Hierarchy
Continued…
• Block: the minimum unit of information that can be either present or not present in a level of the hierarchy

• Hit: the data requested by the processor appears in some block in the upper level
• Eg: L1 cache hit, L2 cache hit, memory hit, etc.

• Miss: the data is not found in the upper level
• Eg: L1 cache miss, L2 cache miss, memory miss, etc.
• The lower level in the hierarchy is then accessed to retrieve the block containing the requested data.
Continued…
• Hit rate: the fraction of memory accesses found in a level of the memory hierarchy

• Miss rate: the fraction of memory accesses not found in a level of the memory hierarchy

• Hit time: the time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or a miss

• Miss penalty: the time required to fetch a block into a level of the memory hierarchy from the lower level, including the time to access the block, transmit it from one level to the other, and insert it in the level that experienced the miss.
CACHES
Basics of Cache
• Cache was the name chosen to represent the level of the
memory hierarchy between the processor and main
memory

• The term is also used to refer to any storage managed to


take advantage of locality of access.

• Let’s consider a cache with blocks of one word


Cache before and after reference to word Xn
Continued…
• Before the request, Xn is not in the cache.

• This results in a cache miss

• So Xn is brought into the cache from main memory


Continued…
• How do we know if a data item is in the cache?

• How do we find it?


Mapping between Cache & Main Memory
• Direct Mapped

• Associative Mapped

• Set Associative
Direct Mapping
• A cache structure in which each memory location is mapped to
exactly one location in the cache.

• Most popular mapping:


(Block address) modulo (Number of cache blocks in the cache)

• This mapping is attractive because if the number of entries in the cache is a power of two, then the modulo can be computed simply by using the low-order log2(cache size in blocks) bits of the address

• Hence the cache may be accessed directly with the low-order bits
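A minimal Python sketch (not from the slides) of this equivalence: when the number of blocks is a power of two, the modulo mapping is just the low-order bits of the block address.

NUM_BLOCKS = 8  # 2^3 blocks, so the index is the low-order 3 bits

def cache_index(block_address):
    # (Block address) modulo (Number of blocks in the cache)
    return block_address % NUM_BLOCKS

def cache_index_low_bits(block_address):
    # Same result by masking off the low-order log2(NUM_BLOCKS) bits
    return block_address & (NUM_BLOCKS - 1)

for addr in (1, 13, 22, 29):
    assert cache_index(addr) == cache_index_low_bits(addr)
    print(addr, "->", cache_index(addr))   # 22 -> 6, as in the later example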
Example
Continued…
• Each cache location can contain the contents of a number of different memory locations

• How do we know whether the data in the cache corresponds to a requested word?
• Tags are added for this purpose

• Tags contain the address information required to identify whether a word in the cache corresponds to the requested word

• The tag is a field in a table (a part of the cache) that holds this information
Continued….
• In the previous example, the upper 2 bits of the 5-bit address are used as the tag

• Now we know whether it is the requested block or not
• But how do we know whether the requested block in the cache is valid (not corrupt, and up to date with respect to a particular program)?

• Valid bits are used for this purpose

• Valid bit: a field in the tables of a memory hierarchy that indicates that the associated block in the hierarchy contains valid data
• If the bit is not set, there cannot be a match for this block.
Example Continued….
For the above example, the block references in main memory are as follows (in the same order):
1. Initial state of the cache after power-on
Continued…
2. The first reference is to 22, so 22 mod 8 = 6 = 110₂ (this is where the data from main memory goes)
(22)₁₀ = 10110₂, so tag = 10. The valid bit is set.
Continued…
Continued…
Direct mapping hardware
Question???
• How do we map if the number of blocks in the cache is not a power of 2?

• Example:
• No. of cache blocks = 9
• No. of blocks in main memory = 32
Multi-word cache
• If a cache has 2^n blocks and each block has 2^m words, the number of bits required to represent the cache address will be n + m
• n → to address the block within the cache
• m → to address the word within the block

• The number of address bits required for the above cache, including the bytes within words, will be n + m + 2 (a word is 4 bytes, so 2 bits are enough to address the byte within a word)
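As an illustration, the sketch below splits a 32-bit address into the four fields just described; the parameters n = 10 and m = 2 are assumed (they are the values used in the later worked example).

N, M = 10, 2   # 2^10 blocks of 2^2 words; 4-byte words give 2 byte-offset bits

def split_address(addr):
    byte_offset = addr & 0b11                    # 2 bits: byte within word
    word_offset = (addr >> 2) & ((1 << M) - 1)   # m bits: word within block
    index = (addr >> (2 + M)) & ((1 << N) - 1)   # n bits: block within cache
    tag = addr >> (2 + M + N)                    # remaining 32-(n+m+2) bits
    return tag, index, word_offset, byte_offset

print(split_address(0x00401234))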
Bits in a cache
• The total number of bits needed for a cache is a function of the cache size and the address size, because the cache includes both the storage for the data and the tags.

• So if a cache has 2^n blocks, each block has 2^m words, and the main memory address is X bits, then the tag size can be calculated as

Tag size = X − (n + m + 2)

• The total number of bits in the above cache will be
= No. of cache blocks × ((No. of words in a block × 32) + tag size + valid field size)
= 2^n × ((2^m × 32) + tag size + valid field size)
= 2^n × ((2^m × 32) + (X − (n + m + 2)) + 1)
Example
• How many total bits are required for a direct-mapped cache with 16 KB of data and 4-word blocks, assuming a 32-bit address?
• Solution:
• Data in cache: 16 KB = 4K words = 2^12 words
• No. of words in a block = 4 words = 2^2 words
• If 2^12 words are divided into blocks of 2^2 words,
No. of blocks in cache = 2^12 / 2^2 = 2^10
• So in the above problem,
• n = 10, m = 2 and X = 32 bits
• So tag size = 32 − 10 − 2 − 2 = 18 bits
• Valid field = 1 bit
• Total bits for the cache in the given problem:
2^10 × ((2^2 × 32) + 18 + 1)
= 2^10 × (128 + 18 + 1) = 147 Kbits
= 18.375 Kbytes

• From the above problem it can be seen that the actual capacity of a 16 KB data cache is about 18.4 KB. Excluding the bits required for the tag and valid fields, most naming conventions specify the cache as 16 KB (the actual data storage).
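A small sketch applying the formula above; running it reproduces the 147 Kbits of this example.

def cache_total_bits(n, m, x):
    # Total bits for 2^n blocks of 2^m 32-bit words with X-bit addresses
    tag_bits = x - (n + m + 2)
    bits_per_block = (2 ** m) * 32 + tag_bits + 1   # data + tag + valid bit
    return (2 ** n) * bits_per_block

total = cache_total_bits(n=10, m=2, x=32)
print(total, "bits =", total / 1024, "Kbits")       # 150528 bits = 147.0 Kbits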
Three types of misses in Cache
• Referred to as the three C's

• Compulsory misses: cache misses caused by the first access to a block that has never been in the cache. These are also called cold-start misses.

• Capacity misses: cache misses caused when the cache cannot contain all the blocks needed during execution of a program. Capacity misses occur when blocks are replaced and then later retrieved.

• Conflict misses: cache misses that occur in set-associative or direct-mapped caches when multiple blocks compete for the same set. Conflict misses are those misses in a direct-mapped or set-associative cache that are eliminated in a fully associative cache of the same size. These are also called collision misses.
Ideal Block Size
• Compulsory misses decrease with increased block size

• So how big can a block be?

• The miss rate may eventually go up if the block size becomes a significant fraction of the cache size, because this can cause capacity misses
• In addition, the miss penalty also increases with block size

• So blocks should ideally be big enough to reduce compulsory misses, but not so large that they cause capacity misses
Continued…
Early Restart
• The miss penalty cannot be reduced, but it can be masked with early restart

• Early restart: resume execution as soon as the requested word of the block is returned, rather than waiting for the entire block

• Early restart is particularly useful for instruction access and less effective for the data cache
• Because instruction fetch is largely sequential (at least in in-order and out-of-order pipelines)

• Requested word first: more sophisticated than early restart. The requested word is transferred to the cache first; the rest of the block is transferred later
Handling Cache Misses
• Cache miss: a request for data from the cache that cannot be filled because the data is not present in the cache

• The control unit must detect a miss and process it by fetching the requested data from memory

• Let the memories in the data path discussed earlier be changed to caches (an instruction cache and a data cache)

• Modifying the control of a processor to handle a hit is trivial
Continued…
• Cache miss handling is done with the processor control unit

• A separate controller initiates the memory access and refills the cache

• The processing of a cache miss creates a stall (no-op)


When a Cache miss occurs…
• Stall the entire processor

• Freeze the contents of the temporary and programmer-visible registers
• What are programmer-invisible registers?
Steps to perform with instruction miss
• If an instruction access results in a miss, the content of the instruction register is invalid

• To get the proper instruction into the cache, we must be able to instruct the lower level in the memory hierarchy to perform a read

• The address of the instruction that generated the instruction cache miss is the value of the program counter minus 4 (since the PC was incremented by 4 during instruction fetch, before the miss was detected)
Continued…
1. Send the original PC value (current PC − 4) to the memory.

2. Instruct main memory to perform a read and wait for the memory to complete its access.

3. Write the cache entry, putting the data from memory in the data portion of the entry, writing the upper bits of the address (from the ALU) into the tag field, and turning the valid bit on.

4. Restart the instruction execution at the first step, which will re-fetch the instruction, this time finding it in the cache.
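The sketch below restates the four steps schematically; the dictionary-based cache and memory, and the field names, are illustrative assumptions, not the actual control-unit hardware.

def handle_instruction_miss(cache, memory, current_pc, n_index_bits=10):
    addr = current_pc - 4                     # Step 1: original PC value
    block = memory[addr]                      # Step 2: read main memory and wait
    index = (addr >> 2) & ((1 << n_index_bits) - 1)
    cache[index] = {                          # Step 3: write the cache entry
        "valid": True,
        "tag": addr >> (2 + n_index_bits),    # upper bits of the address
        "data": block,
    }
    return addr                               # Step 4: re-fetch; now it hits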
When there is Data Miss
• A data miss happens on a load or store

• After the effective address is calculated and the access misses in the cache, stall the processor until the memory access fetches the required data
Handling Writes
• When there is a write to the cache (to some address), how do we maintain consistency between the cache and memory?
• Write-through cache: always write the data into both the memory and the cache

• When there is a write miss (write-through with no buffer):
• Fetch the words of the block from memory
• After the block is fetched and placed into the cache, overwrite the word that caused the miss into the cache block
• Also write the word to main memory.
Continued…
• But write-through with no buffer takes more time
• To improve write-miss performance and address the above problem:
• Use a write buffer: a queue that holds data while the data are waiting to be written to memory

• After writing the data into the cache and into the write buffer, the processor can continue execution

• When a write to main memory completes, the entry in the write buffer is freed.
Continued…
• If the write buffer is full when the processor reaches a write, the processor must stall until there is an empty position in the write buffer

• Problems with the write buffer:
• If the memory writes from the buffer are slow, the buffer may not help much

• Even if the write rate is low, stalls may still occur if the writes come in bursts
Continued…
• Solution to the above problem:
• Use a write-back cache

• In a write-back scheme, when a write occurs, the new value is written only to the block in the cache.

• The modified block is written to the lower level of the hierarchy when it is replaced.

• Write-back schemes can improve performance, especially when processors can generate writes as fast as or faster than main memory can handle them.

• A write-back scheme is more complex to implement than write-through.
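A minimal sketch (illustrative names and structures, not a full simulator) contrasting the two write policies on a write hit:

from collections import deque

write_buffer = deque()   # write-through: writes waiting to go to memory

def write_hit(cache, index, addr, value, policy="write-through"):
    cache[index]["data"] = value
    if policy == "write-through":
        write_buffer.append((addr, value))   # processor continues immediately
    else:                                    # write-back
        cache[index]["dirty"] = True         # memory updated only on replacement

def memory_write_completed(memory):
    # When main memory finishes a write, the write-buffer entry is freed
    if write_buffer:
        addr, value = write_buffer.popleft()
        memory[addr] = value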
Designing Memory System to support Caches
• Cache misses are satisfied from main memory, which is constructed from DRAMs.

• DRAMs are designed with the primary emphasis on density rather than access time

• It is difficult to reduce the latency to fetch the first word from memory

• But we can reduce the miss penalty if we increase the bandwidth from the memory to the cache.

• This reduction allows larger block sizes to be used (with reduced miss penalty)
Design to increase bandwidth of main memory
• The processor is typically connected to memory over a
bus

• The clock rate of the bus is usually much slower than the
processor, by as much as a factor of 10.

• The speed of this bus affects the miss penalty.


Continued…
• Widening the memory and the buses between the
processor and memory

(or)

• widening the memory but not the interconnection bus


One word wide organization
Example
• Assume
■ 1 memory bus clock cycle to send the address
■ 15 memory bus clock cycles for each DRAM access initiated
■ 1 memory bus clock cycle to send a word of data

• If we have a cache block of four words and a one-word-wide bank of DRAMs, then

Miss penalty = 1 + 4 × 15 + 4 × 1 = 65 memory bus clock cycles


Wide Memory Organization
Continued…
• Allows parallel access to all words of the block

• So if bus is 4 words wide and bank is also 4 word wide

• Miss Penalty = 1 + 1*15 + 1 =17 mem. Clocks

• Draw back:
• Cost overhead due to increased bus width and control logic to
select mux to write in to appropriate word
Interleaved Memory Organization
Continued…
• So if each bank is 4 word wide and still the bus is 1 word
wide the miss penalty would be,

• Miss Penalty = 1 + 15 + 4*1 = 20 mem clk cycles
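The three miss penalties can be checked with a short sketch using the assumptions above (1 cycle to send the address, 15 cycles per DRAM access, 1 cycle per bus transfer):

SEND_ADDR, DRAM_ACCESS, BUS_TRANSFER = 1, 15, 1
BLOCK_WORDS = 4

one_word_wide = SEND_ADDR + BLOCK_WORDS * DRAM_ACCESS + BLOCK_WORDS * BUS_TRANSFER
wide_four_words = SEND_ADDR + DRAM_ACCESS + BUS_TRANSFER
interleaved_four_banks = SEND_ADDR + DRAM_ACCESS + BLOCK_WORDS * BUS_TRANSFER

print(one_word_wide, wide_four_words, interleaved_four_banks)   # 65 17 20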


Split Cache
• A scheme in which a level of the memory hierarchy is composed of two independent caches that operate in parallel, one handling instructions and the other handling data.
Measuring Cache Performance
• CPU time = (CPU execution clock cycles + Memory-stall clock cycles) × Clock cycle time

• Memory-stall clock cycles = Read-stall cycles + Write-stall cycles

• Read-stall cycles = (Reads/Program) × Read miss rate × Read miss penalty

• For a write-through scheme, there are two sources of write stalls:
• Write misses, which usually require that we fetch the block before continuing the write
• Write-buffer stalls, which occur when the write buffer is full when a write occurs

• Write-stall cycles = (Writes/Program) × Write miss rate × Write miss penalty + Write-buffer stalls
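A worked illustration of these formulas; all the numbers below (instruction count, access counts, miss rates, penalty) are assumed for the example, not taken from the slides.

instructions = 1_000_000
base_cpi = 1.0
reads, writes = 250_000, 100_000
read_miss_rate = write_miss_rate = 0.04
miss_penalty = 100            # clock cycles
write_buffer_stalls = 0       # assume the write buffer never fills

read_stalls = reads * read_miss_rate * miss_penalty
write_stalls = writes * write_miss_rate * miss_penalty + write_buffer_stalls
total_cycles = instructions * base_cpi + read_stalls + write_stalls
print(total_cycles)   # 2,400,000; multiply by the clock cycle time for CPU time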
Fully Associative Mapping
• A cache structure in which a block can be placed in any location in the cache.

• To find a given block in a fully associative cache, all the entries in the cache must be searched, because a block can be placed in any one of them.

• To make the search practical, it is done in parallel with a comparator associated with each cache entry

• These comparators significantly increase the hardware cost, effectively making fully associative placement practical only for caches with small numbers of blocks.
SET ASSOCIATIVE
• A cache that has a fixed number of locations (at least two)
where each block can be placed.

• (Block number) modulo (Number of sets in the cache)


Direct Mapped or One-Way Set Associative
Two Way Set Associative
Four Way Set Associative
8-Way Set Associative
Locating a block in the Set Associative cache

• Tag: compared to see whether it matches the block address from the processor

• The index value is used to select the set containing the address of interest

• Because speed is of the essence, all the tags in the selected set are searched in parallel
Continued…
• If the total cache size is kept the same,
• Increasing the associativity increases the number of blocks per set
• This increases the number of simultaneous compares needed to
perform the search in parallel
No. of Comparators needed
• In direct mapped cache,
• 1 comparator needed

• In an m-way set-associative cache,
• m comparators are needed
• An m-to-1 multiplexor is also needed

• In fully associative
• As many comparators as there are blocks
Implementation of 4-way set associative cache
Replacement in Fully Associative and Set
Associative Caches
• What should be replaced when there is a MISS in a set-associative / fully associative cache?
Eg: Consider 16 blocks grouped into 4 sets in a 4-way set-associative cache. Each block contains 1 byte.

4-Way Set Associative
Set 0: Block 0 | Block 1 | Block 2 | Block 3
Set 1: Block 0 | Block 1 | Block 2 | Block 3
Set 2: Block 0 | Block 1 | Block 2 | Block 3
Set 3: Block 0 | Block 1 | Block 2 | Block 3
Continued…
• If the references are 0, 1, 2, 3, 7, 16, 20, 24:
• Since the largest reference (24) fits in 5 bits, we consider an address of size 5; 2 of the bits select the set
1. 0 → 00000 {goes to Set 0, any block} (Block 0)
2. 1 → 00001 {Set 0; any block except 0} (Block 1)
3. 2 → 00010 {Set 0; any block except 0, 1} (Block 2)
4. 3 → 00011 {Set 0; any block except 0, 1, 2} (Block 3)
5. 7 → 00111 {Set 0; any block} → replacement needed
What to replace?
• Many algorithms exist → LRU is popular
• LRU: Least Recently Used
• So in the above scenario, replace Block 0
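A minimal LRU sketch for a 4-set, 4-way cache. Note that it uses the standard (block number) modulo (number of sets) mapping, so the reference stream 0, 4, 8, 12, 16 is chosen to land every block in set 0 and force a replacement, mirroring the scenario above.

NUM_SETS, WAYS = 4, 4
sets = [[] for _ in range(NUM_SETS)]   # each set: blocks ordered LRU -> MRU

def access(block_number):
    s = sets[block_number % NUM_SETS]
    if block_number in s:
        s.remove(block_number)         # hit: move to most-recently-used end
        s.append(block_number)
        return "hit"
    if len(s) == WAYS:
        s.pop(0)                       # set full: evict the least recently used
    s.append(block_number)
    return "miss"

for ref in (0, 4, 8, 12, 16):          # the fifth reference evicts block 0
    print(ref, access(ref))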
Virtual Memory
• A virtual memory block is called a page, and a virtual memory miss is called a page fault

• With virtual memory, the processor produces a virtual address, which is translated by a combination of hardware and software to a physical address, which in turn can be used to access main memory.

• This process is called address mapping or address translation
Mapping from Virtual Address to Physical Address
Page Table
• The table containing the virtual-to-physical address translations in a virtual memory system.

• The table, which is stored in memory, is typically indexed by the virtual page number

• Each entry in the table contains the physical page number for that virtual page, if the page is currently in memory.

• Page table register: to indicate the location of the page table in memory, the hardware includes a register that points to the start of the page table

• Assume for now that the page table is in a fixed and contiguous area of memory.
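A sketch of the translation just described, with a single-level page table; the page size and table contents are assumed for illustration.

PAGE_OFFSET_BITS = 12                  # 4 KB pages
page_table = {0: (True, 5), 1: (False, None), 2: (True, 9)}  # VPN -> (valid, PPN)

def translate(virtual_address):
    vpn = virtual_address >> PAGE_OFFSET_BITS          # virtual page number
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    valid, ppn = page_table.get(vpn, (False, None))
    if not valid:
        raise RuntimeError("page fault")               # handled by the OS
    return (ppn << PAGE_OFFSET_BITS) | offset

print(hex(translate(0x2ABC)))          # VPN 2 -> PPN 9, giving 0x9abc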
TLB (Translation-Lookaside Buffer)
• A cache that keeps track of recently used address mappings to avoid an access to the page table.
Error Detection Code → Parity Bit
• When a word is written into memory, the parity bit is also written
• Then, when the word is read out, the parity bit is read and checked.
• If the parity of the memory word and the stored parity bit do not match, an error has occurred
• A 1-bit parity scheme can detect at most 1 bit of error in a data item
• If there are 2 bits of error, a 1-bit parity scheme will not detect any error, since the parity will match the data with two errors
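A small sketch of even parity over a byte, showing why a single-bit error is detected but a double-bit error is missed:

def parity(word):
    return bin(word).count("1") % 2        # 0 if the number of 1s is even

data = 0b1011_0010
stored_parity = parity(data)
one_bit_error = data ^ 0b0000_0001         # flip one bit
two_bit_error = data ^ 0b0000_0011         # flip two bits
print(parity(one_bit_error) != stored_parity)   # True: error detected
print(parity(two_bit_error) != stored_parity)   # False: error missed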
What is a code word?
• An n-bit unit containing data and check-bits is often
referred to as an n-bit codeword
Continued…
• In the above table, to move from one code word to another code word, a minimum (Hamming) distance of 2 is needed

• A parity code cannot tell which bit in a data item is in error

• A 1-bit parity scheme is an error detection code (EDC)

• To correct errors, an ECC (error correcting code) is used

• A 1-bit parity code is a distance-2 code
• There is a distance of two between legal combinations of parity and data
Error Correcting Code
• These codes work by using more bits to encode the data

• To detect more than one error, or to correct an error, we need a distance-3 code
Hamming Code
• A Hamming code contains redundant bits (r) and data bits (d)

• To calculate the number of redundant bits (r) required to protect d data bits:
• Total number of bits to be transmitted → d + r
• So r must be able to indicate at least d + r + 1 different values
• 2^r ≥ d + r + 1

• Example:
• If d is 7, the smallest value of r that satisfies the above relation is 4
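The smallest r can be found directly from the inequality, as in this sketch:

def redundant_bits(d):
    # Smallest r with 2^r >= d + r + 1
    r = 1
    while 2 ** r < d + r + 1:
        r += 1
    return r

print(redundant_bits(7))   # 4, matching the example above
print(redundant_bits(4))   # 3, used in the worked example that follows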
Continued…
Example
• Let us consider 4 data bits, so r will be 3 (2^3 ≥ 4 + 3 + 1)

• Let the data to be sent be 1010 (even parity), i.e. d4 d3 d2 d1 = 1 0 1 0

• The parity bits occupy positions 1, 2 and 4 of the 7-bit codeword:

Position: 7  6  5  4  3  2  1
Bit:      d4 d3 d2 r4 d1 r2 r1
Value:    1  0  1  0  0  1  0

• After adding the redundant bits (each chosen for even parity), the transmitted codeword is 1010010
Continued…
• If the received codeword is 1110010:

C1: parity of bit positions 1, 3, 5, 7 → 0
C2: parity of bit positions 2, 3, 6, 7 → 1
C3: parity of bit positions 4, 5, 6, 7 → 1
So C3C2C1 = 110₂ = 6
→ Error in bit position 6
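A sketch that recomputes the even-parity check bits for the received codeword and reproduces the result above:

def syndrome(bits):
    # bits: dict mapping bit position (1..7) to its received value
    c1 = sum(bits[p] for p in (1, 3, 5, 7)) % 2
    c2 = sum(bits[p] for p in (2, 3, 6, 7)) % 2
    c3 = sum(bits[p] for p in (4, 5, 6, 7)) % 2
    return (c3 << 2) | (c2 << 1) | c1      # C3 C2 C1 read as a binary number

received = "1110010"                        # positions 7 down to 1
bits = {7 - i: int(b) for i, b in enumerate(received)}
print(syndrome(bits))                       # 6 -> error in bit position 6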
Continued…
• The basic approach for error detection and correction using a Hamming code is as follows:
• To each group of m information bits, k parity bits are added to form an (m + k)-bit code
• The location of each of the (m + k) digits is assigned a decimal value
• The k parity bits are placed in positions 1, 2, 4, …, 2^(k−1)
• k parity checks are performed on selected digits of each codeword
• At the receiving end, the parity bits are recalculated. The decimal value of the k parity bits gives the bit position in error, if any
