
Chapter 4 (Continued): Caching; Testing Memory Modules

The document discusses various techniques for memory organization and caching in computer systems. It covers direct mapping, associative mapping, and block-set associative caching strategies. It also discusses memory testing techniques like cyclic redundancy checks to test read-only memory contents for faults. Error checking techniques like parity bits and Hamming codes are explained for detecting and correcting single bit errors in transmitted messages.

Uploaded by

newhondacity
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter 4 (continued):

Caching;

Testing Memory Modules

Memory organization:
Typical Memory map

fig_04_30

For power loss

Memory hierarchy

fig_04_31

Paging / Caching
Why it typically works:
--locality of reference (spatial/temporal)
--working set
Note: in real-time embedded systems, behavior may be atypical, but caching may still be a useful technique
fig_04_32

Here we consider caching external to the CPU; the CPU may have one or more levels of caching built in

Typical memory system with cache: hit rate (miss rate) important

Remember! Registers here

fig_04_33

Basic caching strategies:

Direct-mapped

Associative

Block-set associative

questions: what is associative memory?

what is overhead?
what is efficiency (hit rate)? is bigger cache better?

Associative memory: storage location is related to the data stored
Example: hashing
--When a software program is compiled or assembled, a symbol table must be created to link addresses with symbolic names
--The table may be large; even a binary search of names may be too slow
--Convert each name to a number associated with the name; this number will be the symbol table index
For example, let a = 1, b = 2, c = 3, ...
Then cab has value 1 + 2 + 3 = 6
ababab has value 3 * (1 + 2) = 9
and vvvvv has value 5 * 22 = 110
The address is taken modulo a prime p. If we expect about 50 unique identifiers, we can take p = 101 (make storage about twice as large as the number of items to be stored, to reduce collisions). Then vvvvv maps to 110 mod 101 = 9
Now the array of names in the symbol table will look like:
0--->
1--->
2--->
...
6--->cab
...
9--->ababab--->vvvvv
Here there is one collision, at address 9; the two items are stored in a linked list
Access time for an identifier <= (time to compute address) + 1 + (length of longest linked list) ~ constant
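The hashing example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names letter_sum, bucket, and SymbolTable are hypothetical, identifiers are assumed to be lowercase letters only, and a Python list stands in for each linked list.

```python
P = 101  # prime table size, about twice the ~50 expected identifiers

def letter_sum(name):
    """Sum of letter values with a = 1, b = 2, ..., z = 26 (lowercase only)."""
    return sum(ord(ch) - ord("a") + 1 for ch in name)

def bucket(name, p=P):
    """Symbol table index: the letter sum reduced modulo the prime p."""
    return letter_sum(name) % p

class SymbolTable:
    """Hypothetical chained hash table keyed by identifier name."""

    def __init__(self, p=P):
        self.p = p
        self.slots = [[] for _ in range(p)]  # one chain per table index

    def insert(self, name, address):
        chain = self.slots[bucket(name, self.p)]
        for i, (existing, _) in enumerate(chain):
            if existing == name:
                chain[i] = (name, address)  # update an existing entry
                return
        chain.append((name, address))       # new entry, or a collision

    def lookup(self, name):
        for existing, address in self.slots[bucket(name, self.p)]:
            if existing == name:
                return address
        return None

table = SymbolTable()
for i, ident in enumerate(["cab", "ababab", "vvvvv"]):
    table.insert(ident, 0x1000 + 4 * i)   # addresses are made up for the demo
print(bucket("cab"), bucket("ababab"), bucket("vvvvv"))  # 6 9 9
```

A lookup costs one hash computation plus a walk along one chain, which stays short as long as the table is kept about half empty.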

Caching: the basic process (note the OVERHEAD for each task)
--program needs information M that is not in the CPU
--cache is checked for M
    how do we know if M is in the cache?
--hit: M is in cache and can be retrieved and used by CPU
--miss: M is not in cache (M in RAM or in secondary memory)
    where is M?
  * M must be brought into cache
  * if there is room, M is copied into cache
      how do we know if there is room?
  * if there is no room, must overwrite some info M'
      how do we select M'?
    ++ if M' has not been modified, overwrite it
        how do we know if M' has been modified?
    ++ if M' has been modified, must save changes
        how do we save changes to M'?

Example: direct mapping
32-bit words; cache holds 64K words, in 128 blocks of 0.5K words each
Memory addresses: 32 bits
Main memory: 128M words; 2K pages, each holds 128 blocks (~ cache)
Tag table: 128 entries (one for each block in the cache). Contains:
--Tag: page the block came from
--Valid bit: does this block contain data?
Write policies:
--write-through: any change is propagated immediately to main memory
--delayed write: since this data may change again soon, do not propagate the change to main memory immediately; this saves overhead. Instead, set the dirty bit
--intermediate: use a queue, update periodically
When a new block is brought in, if the valid bit is true and the dirty bit is true, the old block must first be copied into main memory
Replacement algorithm: none; each block has only one valid cache location
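A minimal sketch of the tag table with delayed write, working at block granularity. The class name DirectMappedCache and its counters are hypothetical; the sketch assumes main memory blocks are numbered consecutively, so that block number mod 128 gives the cache line and block number div 128 gives the page (tag).

```python
NUM_LINES = 128  # blocks in the cache, as in the example

class DirectMappedCache:
    """Hypothetical model: one tag, valid bit, and dirty bit per cache line."""

    def __init__(self, num_lines=NUM_LINES):
        self.num_lines = num_lines
        self.tag = [0] * num_lines
        self.valid = [False] * num_lines
        self.dirty = [False] * num_lines
        self.hits = self.misses = self.writebacks = 0

    def access(self, block_number, write=False):
        index = block_number % self.num_lines  # the only line this block may use
        tag = block_number // self.num_lines   # page the block came from
        hit = self.valid[index] and self.tag[index] == tag
        if hit:
            self.hits += 1
        else:
            self.misses += 1
            if self.valid[index] and self.dirty[index]:
                self.writebacks += 1           # old block copied to main memory first
            self.tag[index] = tag
            self.valid[index] = True
            self.dirty[index] = False
        if write:
            self.dirty[index] = True           # delayed write: mark, do not propagate
        return hit

cache = DirectMappedCache()
cache.access(5)               # miss: line 5 was empty
cache.access(5, write=True)   # hit: line 5 becomes dirty
cache.access(5 + 128)         # miss: same line, other page, forces a write-back
print(cache.hits, cache.misses, cache.writebacks)  # 1 2 1
```

No replacement decision appears anywhere in the code, because the index alone fixes where a block must go.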

fig_04_34 fig_04_35 fig_04_36

2 bits--byte; 9 bits--word address; 7 bits--block address (index); 11 (of 15)--tag (page block is from)
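The field widths above can be checked with a small sketch that splits an address into its parts. The function name split_address is hypothetical, and the sketch assumes the fields are packed from the least significant bit upward in the order byte, word, index, tag, with only the 11 tag bits in use.

```python
BYTE_BITS, WORD_BITS, INDEX_BITS, TAG_BITS = 2, 9, 7, 11

def split_address(addr):
    """Return (tag, index, word, byte) for an address, low-order fields first."""
    byte = addr & ((1 << BYTE_BITS) - 1)
    addr >>= BYTE_BITS
    word = addr & ((1 << WORD_BITS) - 1)
    addr >>= WORD_BITS
    index = addr & ((1 << INDEX_BITS) - 1)
    addr >>= INDEX_BITS
    tag = addr & ((1 << TAG_BITS) - 1)
    return tag, index, word, byte

# Build an address for page 3, block 5, word 7, byte 1 and take it apart again.
addr = (3 << 18) | (5 << 11) | (7 << 2) | 1
print(split_address(addr))  # (3, 5, 7, 1)
```

The widths agree with the example: 2^9 = 512 words per block, 2^7 = 128 blocks in the cache, and 2^11 = 2K pages.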

Problem with direct mapping: two frequently used parts of code can be in different Block 0s, so repeated swapping would be necessary; this can degrade performance unacceptably, especially in real-time systems (similar to thrashing in an operating system's virtual memory system)
Another method: associative mapping. Put a new block anywhere in the cache; now we need an algorithm to decide which block should be removed if the cache is full

fig_04_37

Step 1: locate the desired block within the cache; must search the tag table. Linear search may be too slow; search all entries in parallel or use hashing
Step 2: if miss, decide which block to replace
a. Add time accessed to tag table info, use temporal locality:
   Least recently used (LRU): a FIFO-type algorithm
   Most recently used (MRU): a LIFO-type algorithm
b. Choose a block at random
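The two steps can be sketched with an LRU policy as follows. This is an illustration only: the class name AssociativeCache is hypothetical, and Python's OrderedDict stands in for both the hashed tag lookup (Step 1) and the record of access order (Step 2a).

```python
from collections import OrderedDict

class AssociativeCache:
    """Hypothetical fully associative cache with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block numbers, least recently used first

    def access(self, block_number):
        # Step 1: look for the block anywhere in the cache (hashed lookup).
        if block_number in self.blocks:
            self.blocks.move_to_end(block_number)  # now the most recently used
            return True
        # Step 2: miss; if the cache is full, evict the least recently used block.
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)
        self.blocks[block_number] = None
        return False

cache = AssociativeCache(capacity=2)
trace = [0, 128, 0, 128, 0, 128]
print([cache.access(b) for b in trace])  # [False, False, True, True, True, True]
```

Blocks 0 and 128 would compete for the same line in the 128-line direct-mapped example; here they coexist, so only the first two accesses miss.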

Drawbacks:
--long search times
--complexity and cost of supporting logic
Advantages:
--more flexibility in managing cache contents

fig_04_38

Intermediate method: block-set associative cache
Each index now specifies a set of blocks
Main memory: divided into blocks organized into n groups
Group number of block m = m mod n
Cache set number ~ main memory group number
A block from main memory group j can go into cache set j
Search time is less, since the search space is smaller
How many blocks per set: answer by simulation (one rule of thumb: doubling associativity ~ doubling cache size; > 4-way probably not efficient)

Two-way set-associative scheme


fig_04_39

Example: 256K memory, 64 groups, 512 blocks

Block                                Group (m mod 64)
0    64   128  . . .  384  448       0
1    65   129  . . .  385  449       1
2    66   130  . . .  386  450       2
. . .
63   127  191  . . .  447  511       63
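The group mapping and a two-way set can be sketched as below. The names group_of and SetAssociativeCache are hypothetical, and LRU replacement within a set is an assumption; the slides leave the choice of policy open.

```python
NUM_GROUPS = 64  # groups in main memory = sets in the cache

def group_of(block, n=NUM_GROUPS):
    """Main memory group (and cache set) of block number m: m mod n."""
    return block % n

class SetAssociativeCache:
    """Hypothetical k-way cache; each set is searched on its own."""

    def __init__(self, num_sets=NUM_GROUPS, ways=2):
        self.num_sets = num_sets
        self.ways = ways
        # Each set lists its blocks from least to most recently used.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, block):
        blocks = self.sets[group_of(block, self.num_sets)]
        if block in blocks:           # search only this set, not the whole cache
            blocks.remove(block)
            blocks.append(block)
            return True
        if len(blocks) >= self.ways:  # set full: evict its least recently used block
            blocks.pop(0)
        blocks.append(block)
        return False

print(group_of(0), group_of(64), group_of(448), group_of(511))  # 0 0 0 63
cache = SetAssociativeCache()
print([cache.access(b) for b in [0, 64, 0, 64, 128, 64, 0]])
# [False, False, True, True, False, True, False]
```

Blocks 0 and 64 share set 0 without conflict; the third block of the group (128) forces an eviction, which is where the choice of associativity starts to matter.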

Dynamic memory allocation (virtual storage):
--for programs larger than main memory
--for multiple processes in main memory
--for multiple programs in main memory
General strategies may not work well because of hard deadlines for real-time systems in embedded applications; general strategies are nondeterministic
Simple setup: can swap processes/programs and their contexts
--Need storage (may be in firmware)
--Need small swap time compared to run time
--Need determinism
Ex: chemical processing, thermal control
fig_04_40

Overlays (pre-virtual storage):
--Segment program into one main section and a set of overlays (kept in ROM?)
--Swap overlays
--Choose segmentation carefully to prevent thrashing

fig_04_41

Multiprogramming: similar to paging


Fixed partition size: can get memory fragmentation
Example: each partition is 2K and we have 3 jobs: J1 = 1.5K, J2 = 0.5K, J3 = 2.1K
Allocate to successive partitions (4 in total)
--J2 is using only 0.5K of its partition
--J3 is using 2 partitions, one of which holds only 0.1K
If a new job of size 1K enters the system, there is no place for it, even though there is actually enough unused memory for it
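The arithmetic behind the example can be sketched as follows. The function name allocate_fixed is hypothetical, and the sketch assumes that a partition is always handed out whole and that the system has only the four partitions mentioned.

```python
import math

PARTITION_K = 2.0  # partition size from the example, in K

def allocate_fixed(jobs, partition=PARTITION_K):
    """Map each job to (partitions used, unused K inside those partitions)."""
    plan = {}
    for name, size in jobs.items():
        parts = math.ceil(size / partition)  # whole partitions only
        plan[name] = (parts, parts * partition - size)
    return plan

jobs = {"J1": 1.5, "J2": 0.5, "J3": 2.1}
plan = allocate_fixed(jobs)
used = sum(parts for parts, _ in plan.values())
wasted = sum(hole for _, hole in plan.values())
print(used, round(wasted, 1))  # 4 3.9
```

All four partitions are taken, so the 1K job has nowhere to go, although about 3.9K sits unused inside partitions that are already allocated.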

fig_04_42

Variable size:
--Use a scheme like paging
--Include compaction
--Choose parameters carefully to prevent thrashing

Memory testing: Components and basic architecture

fig_04_43

Faults to test: data and address lines; stuck-at and bridging (if we assume no internal manufacturing defects)

fig_04_45

ROM testing: stuck-at faults, bridging faults, correct data stored
Method: CRC (cyclic redundancy check) or signature analysis
--Use an LFSR to compress a data stream into a K-bit pattern, similar to error checking (Q: how is error checking done?)
--ROM contents modeled as an N*M-bit data stream, N = address size (number of words), M = word size (bits per word)
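A minimal sketch of signature analysis: every bit of the ROM image is shifted through a K-bit LFSR, and the final register contents are compared with a signature computed from known-good data. The function name lfsr_signature is hypothetical, and K = 16 with feedback polynomial 0x1021 is an assumption chosen for illustration; the slides do not specify either.

```python
def lfsr_signature(words, word_bits=8, poly=0x1021, k=16):
    """Compress N words of word_bits each into a k-bit signature (MSB first)."""
    mask = (1 << k) - 1
    state = 0
    for w in words:
        for i in reversed(range(word_bits)):
            bit = (w >> i) & 1
            feedback = ((state >> (k - 1)) & 1) ^ bit  # top bit XOR incoming bit
            state = (state << 1) & mask
            if feedback:
                state ^= poly                          # apply the feedback taps
    return state

rom = [0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0]  # made-up ROM contents
golden = lfsr_signature(rom)

faulty = list(rom)
faulty[3] ^= 0x04  # model one data bit read incorrectly
print(lfsr_signature(rom) == golden, lfsr_signature(faulty) == golden)  # True False
```

A single flipped bit always changes the signature; more complicated fault patterns can alias to the good signature with a probability of roughly 2^-K.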

fig_04_49

Error checking: simple examples
1. Detect a one-bit error: add a parity bit
2. Correct a one-bit error: Hamming code
Example: send m message bits + r parity bits
The number of possible error positions is m + r, plus 1 for the no-error case, so we need 2^r >= m + r + 1
If m = 8, need r = 4; r_i checks the parity of the bits whose position number has bit i set in its binary representation
Pattern:
Bit #:  1   2   3   4   5   6   7   8   9   10  11  12
Info:   r0  r1  m1  r2  m2  m3  m4  r3  m5  m6  m7  m8
Value:  --  --  1   --  1   0   0   --  0   1   1   1
Set parity = 0 for each group:
r0: bits 1 + 3 + 5 + 7 + 9 + 11 = r0 + 1 + 1 + 0 + 0 + 1, so r0 = 1
r1: bits 2 + 3 + 6 + 7 + 10 + 11 = r1 + 1 + 0 + 0 + 1 + 1, so r1 = 1
r2: bits 4 + 5 + 6 + 7 + 12 = r2 + 1 + 0 + 0 + 1, so r2 = 0
r3: bits 8 + 9 + 10 + 11 + 12 = r3 + 0 + 1 + 1 + 1, so r3 = 1
Exercise: suppose the message is sent and 1 bit is flipped in the received message. Compute the parity bits to see which bit is incorrect
Addition: add an overall parity bit to the end of the message to also detect two errors
Note:
a. this is just one example; a more general formulation of Hamming codes using finite field arithmetic can also be given
b. this is one example of how error correcting codes can be obtained; there are many more complex examples, e.g., Reed-Solomon codes used in CD players
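The Hamming example and the exercise can be tried out with a short sketch. The function names hamming_encode and hamming_correct are hypothetical; the sketch assumes even parity for every group and at most one flipped bit, and it does not include the optional overall parity bit.

```python
def hamming_encode(data_bits):
    """Put message bits in non-power-of-two positions, then fill the parity bits."""
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:   # smallest r with 2^r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)          # index 0 unused, so indices match bit numbers
    bits = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):       # not a power of two: a message position
            code[pos] = next(bits)
    for i in range(r):
        p = 1 << i                # r_i sits at bit p and covers positions with bit i set
        code[p] = sum(code[pos] for pos in range(1, n + 1) if pos & p) % 2
    return code[1:]

def hamming_correct(received):
    """Return (corrected codeword, number of the flipped bit, or 0 if none)."""
    code = [0] + list(received)
    n = len(received)
    syndrome = 0
    p = 1
    while p <= n:
        if sum(code[pos] for pos in range(1, n + 1) if pos & p) % 2:
            syndrome += p         # this parity group fails
        p <<= 1
    if 0 < syndrome <= n:
        code[syndrome] ^= 1       # failing groups add up to the bad bit number
    return code[1:], syndrome

sent = hamming_encode([1, 1, 0, 0, 0, 1, 1, 1])  # m1..m8 from the example
print(sent)                       # [1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
received = list(sent)
received[5] ^= 1                  # flip bit 6 in transit
fixed, bad_bit = hamming_correct(received)
print(bad_bit, fixed == sent)     # 6 True
```

The failing parity groups, read as a binary number, give the position of the flipped bit, which is why the parity bits are placed at positions 1, 2, 4, and 8.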
