
Computer Architecture

A Quantitative Approach, Fifth Edition

Chapter 2
Memory Hierarchy Design

Copyright 2012, Elsevier Inc. All rights reserved.

Introduction

Programmers want unlimited amounts of memory with low latency
Fast memory technology is more expensive per bit than slower memory
Solution: organize the memory system into a hierarchy
  Entire addressable memory space available in the largest, slowest memory
  Incrementally smaller and faster memories, each containing a subset of the memory below it, proceed in steps up toward the processor
Temporal and spatial locality ensure that nearly all references can be found in the smaller memories
  Gives the illusion of a large, fast memory being presented to the processor


Memory Hierarchy


Memory Performance Gap

Memory Hierarchy Design

Memory hierarchy design becomes more crucial with recent multi-core processors:
Aggregate peak bandwidth grows with # cores:
  Intel Core i7 can generate two references per core per clock
  Four cores and 3.2 GHz clock
  25.6 billion 64-bit data references/second +
  12.8 billion 128-bit instruction references
  = 409.6 GB/s!
DRAM bandwidth is only 6% of this (25 GB/s)
Requires:
  Multi-port, pipelined caches
  Two levels of cache per core
  Shared third-level cache on chip

Performance and Power

High-end microprocessors have >10 MB on-chip cache
Consumes large amount of area and power budget


Memory Hierarchy Basics

When a word is not found in the cache, a miss occurs:
  Fetch word from lower level in hierarchy, requiring a higher latency reference
  Lower level may be another cache or the main memory
  Also fetch the other words contained within the block
    Takes advantage of spatial locality

Place block into cache in any location within its set, determined by address:
  block address MOD number of sets
n blocks per set => n-way set associative
  Direct-mapped cache => one block per set
  Fully associative => one set

Writing to cache: two strategies
  Write-through: immediately update lower levels of hierarchy
  Write-back: only update lower levels of hierarchy when an updated block is replaced
Both strategies use a write buffer to make writes asynchronous

Memory Hierarchy Basics

Miss rate: fraction of cache accesses that result in a miss

Causes of misses:
  Compulsory: first reference to a block
  Capacity: blocks discarded and later retrieved
  Conflict: program makes repeated references to multiple addresses from different blocks that map to the same location in the cache

Note that speculative and multithreaded processors may execute other instructions during a miss
  Reduces performance impact of misses

Memory Hierarchy Basics

Six basic cache optimizations:
  Larger block size
    Reduces compulsory misses
    Increases capacity and conflict misses, increases miss penalty
  Larger total cache capacity to reduce miss rate
    Increases hit time, increases power consumption
  Higher associativity
    Reduces conflict misses
    Increases hit time, increases power consumption
  Higher number of cache levels
    Reduces overall memory access time
  Giving priority to read misses over writes
    Reduces miss penalty
  Avoiding address translation in cache indexing
    Reduces hit time
