cs325 Fall10 Final Exam
1. Indicate whether each of the following statements is true or false. If false, correct the statement (points will be taken off if false statements are not clearly corrected).
a. [3%] For the same total cache size, set-associative caches typically have higher hit rate than fully-associative caches.
False. For the same total cache size, a fully-associative cache typically has the higher hit rate, since it eliminates conflict misses.
b. [3%] For the same total cache size, increasing the cache block size will always result in higher hit rates.
c. [3%] Using virtual addresses to access caches effectively provides a larger cache space to the application than is physically available.
False. Virtual memory creates the illusion of more memory, but the cache size does not change, so virtual addressing does not give the application a larger cache space.
d. [3%] The main problem with virtually-addressed caches is reduced hit rate.
e. [3%] It is impossible to overlap a cache access and a TLB access in a system that uses fully-associative caches.
True. A fully-associative cache has no index field, so there is nothing to overlap with translation: the overlap trick works by indexing the cache with untranslated page-offset bits while the TLB translates the page number. With no index, every tag in the cache must be compared against the physical address, so the access has to wait until that address arrives from the TLB.
f. [3%] Direct-mapped caches have lower access delay compared to both set-associative and fully-associative caches.
True. With only one possible location per address, a direct-mapped cache can read the data out in parallel with its single tag comparison and needs no way-selection multiplexer, so its access delay is lower than that of set-associative and fully-associative caches.
2. Consider two different cache designs, as detailed below:
Design 1: a 64KB, 4-way set-associative cache with a cache block size of 32 bytes.
Design 2: a 64KB, 2-way set-associative cache with a cache block size of 64 bytes.
a) [3%] How many sets are there in each of these caches? Explain your answer.
b) [3%] Which of these caches has a lower ratio of the tag bits to the data bits in the cache?
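A hedged worked check for both parts, assuming 32-bit addresses (the problem does not state the address width, so the tag widths below depend on that assumption):

```python
from math import log2

def cache_geometry(total_bytes, ways, block_bytes, addr_bits=32):
    """Return (sets, tag bits per line, data bits per line)."""
    sets = total_bytes // (ways * block_bytes)
    index_bits = int(log2(sets))
    offset_bits = int(log2(block_bytes))
    tag_bits = addr_bits - index_bits - offset_bits  # tag bits per cache line
    data_bits = block_bytes * 8                      # data bits per cache line
    return sets, tag_bits, data_bits

for name, ways, block in [("Design 1", 4, 32), ("Design 2", 2, 64)]:
    sets, tag, data = cache_geometry(64 * 1024, ways, block)
    print(f"{name}: {sets} sets, tag/data ratio = {tag}/{data}")
```

Both designs end up with 512 sets; Design 2 stores fewer tag bits per data bit (17/512 vs. 18/256), so it has the lower tag-to-data ratio under the 32-bit assumption.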
3. [6%] Provide a short stream of word addresses to illustrate the situation where a direct-mapped cache has a higher hit rate than a 2-way set-associative cache of the same total size. For example, assume that the total size of each of the two caches is 8 words of data and the cache block size is one word (therefore, each cache contains 8 blocks).
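One way to sanity-check a candidate stream is to simulate both caches. A minimal sketch, assuming one-word blocks, LRU replacement in the 2-way cache, and the hypothetical stream 0, 4, 8, 0, 4, 8 (all three addresses map to set 0 of the 2-way cache, but to two different blocks of the direct-mapped cache):

```python
def direct_mapped_hits(stream, num_blocks=8):
    blocks = [None] * num_blocks
    hits = 0
    for addr in stream:
        idx = addr % num_blocks          # one-word blocks: index = addr mod 8
        if blocks[idx] == addr:
            hits += 1
        else:
            blocks[idx] = addr
    return hits

def two_way_lru_hits(stream, num_sets=4):
    sets = [[] for _ in range(num_sets)]  # each set ordered LRU -> MRU
    hits = 0
    for addr in stream:
        s = sets[addr % num_sets]
        if addr in s:
            hits += 1
            s.remove(addr)
        elif len(s) == 2:
            s.pop(0)                      # evict the least recently used block
        s.append(addr)
    return hits

stream = [0, 4, 8, 0, 4, 8]
print(direct_mapped_hits(stream), two_way_lru_hits(stream))  # 1 0
```

On this stream the direct-mapped cache gets one hit (address 4 survives in block 4) while LRU thrashing in set 0 leaves the 2-way cache with none.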
4. [5%] Consider a memory system with a physically-addressed 128KB cache. Assume that the memory is byte-addressable and the address space is 4GB. Further assume that the cache line size is 64B and the cache is 16-way set-associative. Determine the minimum page size that allows for the overlap of cache accesses with TLB accesses. Explain your answers!
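A sketch of the calculation: the overlap requires the cache index and block offset to come entirely from the untranslated page-offset bits, so the page size must cover at least index_bits + offset_bits bits of the address.

```python
from math import log2

cache_bytes, ways, line_bytes = 128 * 1024, 16, 64
sets = cache_bytes // (ways * line_bytes)    # 128 sets
index_bits = int(log2(sets))                 # 7 index bits
offset_bits = int(log2(line_bytes))          # 6 block-offset bits
min_page = 2 ** (index_bits + offset_bits)   # 2^13 bytes = 8 KB
print(min_page)  # 8192
```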
5. [10%] Consider a direct-mapped cache design with 4 words per block and a total data size of 8 words. Assume that the memory is word-addressable, and consider the following sequence of word addresses: 1, 3, 2, 5, 56, 3, 57 and 6. Indicate which of these accesses will be cache hits and which will be cache misses. Show the contents of the cache after every access. Assume that the cache is initially empty.
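A minimal trace simulator for this configuration (8 words of data in two direct-mapped blocks of 4 words each, word-addressable memory):

```python
def simulate_direct_mapped(stream, num_blocks=2, words_per_block=4):
    lines = [None] * num_blocks              # stores the block address per line
    trace = []
    for addr in stream:
        block = addr // words_per_block      # block address
        idx = block % num_blocks             # direct-mapped index
        if lines[idx] == block:
            trace.append("hit")
        else:
            trace.append("miss")
            lines[idx] = block
    return trace

print(simulate_direct_mapped([1, 3, 2, 5, 56, 3, 57, 6]))
# ['miss', 'hit', 'hit', 'miss', 'miss', 'miss', 'miss', 'hit']
```

Addresses 1, 3 and 2 share block 0; 56 and 57 (block 14) conflict with block 0 in line 0, turning the second access to 3 and the access to 57 into misses, while 5 and 6 (block 1) sit undisturbed in line 1.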
6. [10%] Repeat problem 5 assuming a fully-associative cache with LRU replacement this time.
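The same trace can be checked for the fully-associative case (still two 4-word blocks, now with LRU replacement) with a small sketch:

```python
def simulate_fully_assoc_lru(stream, num_blocks=2, words_per_block=4):
    cache = []                               # block addresses, LRU -> MRU
    trace = []
    for addr in stream:
        block = addr // words_per_block
        if block in cache:
            trace.append("hit")
            cache.remove(block)              # move to MRU position below
        else:
            trace.append("miss")
            if len(cache) == num_blocks:
                cache.pop(0)                 # evict the least recently used
        cache.append(block)
    return trace

print(simulate_fully_assoc_lru([1, 3, 2, 5, 56, 3, 57, 6]))
# ['miss', 'hit', 'hit', 'miss', 'miss', 'miss', 'hit', 'miss']
```

The hit count (3) matches the direct-mapped cache here, but the hits land on different accesses: 57 now hits because block 14 is still resident, while 6 misses because block 1 was the LRU victim.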
7. [10%] Consider a pipelined processor with a 2-level cache hierarchy, where the first level data and instruction caches have 1-cycle hit latency (no extra
penalty on a hit). The hit rate to the L1 instruction cache is 99%, and the hit rate of the L1 Data cache is 90%. The L2 cache (unified for both data and instructions) has a hit rate of 50% and hit latency of 20 cycles. The main memory access time is 150 cycles. Assume that 50% of all instructions are memory instructions. Estimate the CPI of this processor, assuming that cache misses represent the only source of possible pipeline bubbles for this processor.
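One hedged way to set up the CPI estimate, assuming the 20-cycle L2 latency and the 150-cycle memory time are additive on an L2 miss (if 150 cycles is instead meant as the total L2-miss penalty, substitute that value accordingly):

```python
l1i_miss, l1d_miss = 0.01, 0.10   # L1 instruction / data miss rates
l2_miss = 0.50                    # L2 miss rate
l2_hit_lat, mem_lat = 20, 150     # cycles
mem_frac = 0.50                   # fraction of memory instructions

# Average penalty of an L1 miss: L2 lookup, plus memory on an L2 miss.
l1_miss_penalty = l2_hit_lat + l2_miss * mem_lat   # 20 + 0.5*150 = 95 cycles

# Every instruction is fetched; only memory instructions access the D-cache.
cpi = 1 + l1i_miss * l1_miss_penalty + mem_frac * l1d_miss * l1_miss_penalty
print(round(cpi, 2))  # 1 + 0.95 + 4.75 = 6.7
```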
8. [10%] Consider a pipelined processor with just one level of cache. Assume that in the absence of memory instructions and possible long memory access delays, the CPI of this processor is 1 (i.e. there are no other bubbles or delays in the pipeline). Assume that the percentage of memory instructions in a typical program executed on this CPU is 50% and the memory access latency is 150 cycles. Consider the following two alternatives for the cache design: Alternative 1: A small cache with a hit rate of 94% and hit access time of 1 cycle (assume that no stalled cycles are introduced to the pipeline on a cache hit in this case). Alternative 2: A larger cache with a hit rate of 98% and the hit access time of 2 cycles (assume that every memory instruction that hits in the cache will result in one-cycle pipeline bubble in this case).
Estimate the CPI metric for both of these designs and establish which of these two designs provides a better performance. Explain your answers!
Would your answer change if the memory access latency is reduced to 50 cycles? Explain!
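A sketch of the comparison, assuming a cache miss costs the full memory access latency and that Alternative 2's one-cycle bubble applies on hits (misses paying the memory latency instead):

```python
def cpi(hit_rate, hit_bubble, mem_lat, mem_frac=0.50):
    # Base CPI of 1 plus stalls contributed by memory instructions only.
    return 1 + mem_frac * (hit_rate * hit_bubble + (1 - hit_rate) * mem_lat)

for mem_lat in (150, 50):
    cpi1 = cpi(0.94, 0, mem_lat)   # small cache: 1-cycle hit, no bubble
    cpi2 = cpi(0.98, 1, mem_lat)   # larger cache: one bubble per hit
    print(mem_lat, round(cpi1, 2), round(cpi2, 2))
```

Under these assumptions Alternative 2 wins at 150 cycles (2.99 vs. 5.5) and still wins, more narrowly, at 50 cycles (1.99 vs. 2.5): the shorter memory latency shrinks the miss-rate advantage but does not erase it.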
9. [5%] Consider a memory system that uses paged virtual memory and physically addressed set-associative cache. Assume that virtual address space is 4 Gbytes, memory is byte-addressable and page size is 4 Kbytes. Further assume that cache block size is 64 bytes and data size of the cache is 64 Kbytes. Determine the minimum associativity of the cache that would allow for the overlap of the cache access with the TLB access. Explain your answer showing all calculations.
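A sketch of the arithmetic: the set index plus block offset must fit inside the 12-bit page offset, which caps the number of sets and therefore forces a minimum associativity.

```python
from math import log2

page_bytes, cache_bytes, block_bytes = 4 * 1024, 64 * 1024, 64
page_offset_bits = int(log2(page_bytes))                  # 12 untranslated bits
block_offset_bits = int(log2(block_bytes))                # 6
max_sets = 2 ** (page_offset_bits - block_offset_bits)    # at most 64 sets
min_ways = cache_bytes // (block_bytes * max_sets)        # 1024 blocks / 64 sets
print(min_ways)  # 16
```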
10. What are the advantages and drawbacks of directly using virtual addresses to access caches (i.e. using virtually-addressed caches)? Limit your answer to at most 4 sentences. [3%] Advantage(s):__________________________________
[3%] Drawback(s):____________________________________
11. What are the advantages and disadvantages of increasing the cache line size (while keeping the total size of the cache the same)?
[3%] Advantage(s):__________________________________
[3%] Drawback(s):____________________________________
12. Consider a processor where 32-bit addresses are used to access caches. Assume that the cache is 2-way set-associative, the cache line size is 64 bytes and the total size of the cache is 64 Kbytes.
a. [2%] What is the total number of sets in this cache? Explain.
b. [2%] What is the total number of tag bits needed to implement this cache? Explain.
c. [2%] What is the total number of tag comparators used in this cache? What is the width of each comparator? Explain.
d. [2%] Show the format of the address. Indicate the number of bits in tag, index and byte offset fields.
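The numbers for all four parts follow from the same geometry calculation; a sketch:

```python
from math import log2

addr_bits, ways, line_bytes, cache_bytes = 32, 2, 64, 64 * 1024
sets = cache_bytes // (ways * line_bytes)        # a) 512 sets
offset_bits = int(log2(line_bytes))              # 6 byte-offset bits
index_bits = int(log2(sets))                     # 9 index bits
tag_bits = addr_bits - index_bits - offset_bits  # 17 tag bits per line
total_tag_bits = tag_bits * sets * ways          # b) 17 * 1024 lines = 17408
# c) one comparator per way: 2 comparators, each 17 bits wide
# d) address format: | tag: 17 bits | index: 9 bits | byte offset: 6 bits |
print(sets, total_tag_bits, tag_bits)
```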