Associative Mapping - Cache Memory

The document describes set associative mapping algorithms for CPU caches. A cache is divided into sets, with each set containing k blocks/lines. Each block of main memory maps to only one cache set, but it may occupy any of that set's k lines. This is called k-way set associative mapping. In a two-way set, when a block is referenced its USE bit is set while its partner's bit is cleared. Replacement algorithms such as LRU use this information to determine which block to replace when a new one is loaded into a full set.


Set Associative Mapping Algorithm

POINTS OF INTEREST:

- Address length is s + w bits.
- The cache is divided into a number of sets, v = 2^d.
- k blocks/lines can be contained within each set.
- k lines per set is called k-way set associative mapping.
- Number of lines in the cache = vk = k * 2^d.
- Size of tag = (s - d) bits.
- Each block of main memory maps to only one cache set, but k lines can occupy that set at the same time.
- Two lines per set is the most common organization.
  o This is called 2-way set associative mapping.
  o A given block can be in one of 2 lines in only one specific set.
  o A significant improvement over direct mapping.
  o The replacement algorithm simply uses LRU with a USE bit: when one block is referenced, its USE bit is set while its partner's in the set is cleared.

[Figure: a set associative cache; Set 0 holds the lines for Tag0 through Tag3, Set 1 the lines for Tag4 through Tag7, and so on through Tag511; on the memory side, blocks 0, 128, 256, 384, ... map to the same set.]

Effect of Cache Set Size on Address Partitioning:

  Tag bits   Set ID bits   Word ID bits   Organization
  18 bits    9 bits        3 bits         Direct mapping (1 line/set)
  19 bits    8 bits        3 bits         2-way set associative (2^1 lines/set)
  20 bits    7 bits        3 bits         4-way set associative (2^2 lines/set)
  21 bits    6 bits        3 bits         8-way set associative (2^3 lines/set)
  ...
  25 bits    2 bits        3 bits         128-way set associative (2^7 lines/set)
  26 bits    1 bit         3 bits         256-way set associative (2^8 lines/set)
  27 bits    -             3 bits         Fully associative (1 big set)

Memory Blocks

DEFINITION: A block is a group of neighboring words in memory identified by the bits of the address excluding the "w" word-ID bits.

EXAMPLE: Assume a block uses three word-ID bits, i.e., w = 3. Memory addresses for this system are therefore broken up as shown below:

  a29 a28 a27 ... a4 a3 | a2 a1 a0

  "Block Address" (a29 ... a3): bits identifying the block
  "Word ID" (a2 a1 a0): bits identifying the offset into the block

EXERCISE: Which of the addresses below are contained in the same block as the address 0x546A5 for a block size of 8 words?
a.) 0x536A5 b.) 0x546B5 c.) 0x546AF d.) 0x546A0 e.) 0x546C7 f.) 0x546A2

Locality of Reference Principle


DEFINITION: During execution, memory references of both data and instructions tend to cluster together over a short period of time. Examples of instructions that cluster together are iterative loops and functions. It may even be possible for a compiler to organize data so that data elements that are accessed together are contained in the same block.

EXAMPLE: Identify how many times the processor "touches" each piece of data and each line of code in the following snippet of code.

    int values[6] = {9, 34, 67, 23, 7, 3};
    int count;
    int sum = 0;
    for (count = 0; count < 6; count++)
        sum += values[count];

Writing to a Cache
- Must not overwrite a cache block unless main memory is up to date.
- Two main problems: if the cache is written to, main memory is invalid; if main memory is written to, the cache is invalid.
  o Can occur if I/O can address main memory directly.
  o Multiple CPUs may have individual caches; once one cache is written to, the copies in all other caches are invalid.
- Write through: all writes go to main memory as well as the cache.
  o Multiple CPUs can monitor main memory traffic to keep their local (to the CPU) caches up to date.
  o Generates lots of traffic and slows writes.
- Write back: updates are initially made in the cache only.
  o An update bit for the cache slot is set when an update occurs.
  o If a block is to be replaced, it is written to main memory only if its update bit is set.
  o Other caches get out of sync.
  o I/O must access main memory through the cache.
- Research shows that 15% of memory references are writes.
- Multiple processors with multiple caches: even if a write through policy is used, other processors may have invalid data in their caches.

General Organization of a Cache


POINTS OF INTEREST:
- Tags are unique identifiers derived from the address of the block contained in the corresponding line.
- When one word is loaded into the cache, all of the words in the same block are loaded into a single line.
- The number of lines in a cache equals the size of the cache divided by the number of words in a block.

[Figure: a cache of n lines; line i holds TAGi alongside the block of words corresponding to TAGi; each block holds 2^w words, where w = the number of word-ID bits.]

Direct Mapping Algorithm


POINTS OF INTEREST:
- Each block of main memory maps to only one cache line, i.e., if a block is in the cache, it will always be found in the same place.
- The line number is calculated using the function i = j modulo m, where i = cache line number, j = block number derived from the address, and m = number of lines in the cache.
- The memory address is divided into three parts, from right to left: the word ID, the bits identifying the cache line number where the block is stored, and the tag.
- 2^l = number of lines in the cache; 2^w = number of words in a block; 2^t = number of blocks in memory that map to the same line in the cache.
[Figure: a direct-mapped cache of 512 lines (Line 0 through Line 511, each holding one tag and its block); memory blocks 0, 512, 1024, 1536, ... all map to line 0, blocks 1, 513, 1025, 1537, ... to line 1, and so on.]

Fully Associative Mapping Algorithm

POINTS OF INTEREST:
- A main memory block can load into any line of the cache.
- The memory address is interpreted as two fields:
  o the least significant w bits = the word position within the block;
  o the most significant s bits = the tag used to identify which block is stored in a particular line of the cache.
- Every line's tag must be examined for a match.
- The algorithm for storing is independent of the size of the cache.
- Cache searching gets expensive and slower as the number of lines grows.

Fully Associative Mapping - Partitioning of the Memory Address:

  s bits (tag identifying the block) | w bits (word offset into the block)

Direct Mapping - Partitioning of the Memory Address:

  t bits (tag) | l bits (row in cache) | w bits (word offset into the block)

EXAMPLE: Assume that a portion of the tags in the cache in our example looks like the table below. Which of the following addresses are contained in the cache?
a.) 0x438EE8 b.) 0xF18EFF c.) 0x6B8EF3 d.) 0xAD8EF3

[Table: binary tags currently held in the cache, each listed with the four word offsets 00, 01, 10, 11 of the addresses within its block; the column layout of the table was lost in extraction.]

EXAMPLE: What cache line number will the following addresses be stored to, and what will the minimum address and the maximum address of each block they are in be, if we have a cache with 2^12 = 4K lines of 2^4 = 16 words to a block in a 2^28 = 256 Meg memory space?
a.) 0x9ABCDEF b.) 0x1234567 c.) 0xD43F6C2

EXAMPLE: Assume that a portion of the tags in the cache in our example looks like the table below. Which of the following addresses are contained in the cache?
a.) 0x438EE8 b.) 0xF18EFF c.) 0x6B8EF3 d.) 0xAD8EF3

[Table: binary tags currently held in the cache, each listed with its line number (binary) and the four word offsets 00, 01, 10, 11 within its block; the column layout of the table was lost in extraction.]

EXAMPLE: For the previous example, how many lines does the cache contain? How many blocks can be mapped to a single line of the cache?

PROS of direct mapping: simple and inexpensive to implement.
CONS: if a program repeatedly accesses two blocks that map to the same line, cache misses are very high (thrashing).

Replacement Algorithms

There must be a method for selecting which line in the cache is going to be replaced when there's no room for a new line.

POINTS OF INTEREST:
- The algorithm is implemented in hardware for speed.
- There is no need for a replacement algorithm with direct mapping, since each block maps to only one line: just replace the line that is in the way.
- Types of replacement algorithms:
  o Least recently used (LRU): replace the block that hasn't been touched in the longest period of time.
  o First in, first out (FIFO): replace the block that has been in the cache longest.
  o Least frequently used (LFU): replace the block that has had the fewest hits.
  o Random: just pick one; performance is only slightly lower than that of the use-based algorithms (LRU, FIFO, and LFU).
