L05 Memory
Lecture 5 – Memory
Chris Fletcher
Electrical Engineering and Computer Sciences
University of California at Berkeley
https://cwfletcher.github.io/
http://inst.eecs.berkeley.edu/~cs152
Last time in Lecture 4
▪ Handling exceptions in pipelined machines by passing
exceptions down pipeline until instructions cross commit
point in order
▪ Can use values before commit through bypass network
▪ Four different pipeline categories: *-stage pipelined,
decoupled, out-of-order, superscalar
▪ Pipeline hazards can be avoided through software
techniques: scheduling, loop unrolling
▪ Decoupled architectures use queues between “access”
and “execute” pipelines to tolerate long memory latency
▪ Regularizing all functional units to have same latency
simplifies more complex pipeline design by avoiding
structural hazards, can be expanded to in-order
superscalar designs
Where are we
▪ ISA
▪ Microarchitecture
– Control & Datapath
• Fixed control & Pipelined
▪ *-Stage in-order pipelines
▪ Decoupled
▪ [Limited] Out-of-order
▪ [Limited] Out-of-order + superscalar
– Memory
• Today
Early Read-Only Memory Technologies
[Figures: IBM Balanced Capacitor ROS; IBM Card Capacitor ROS]
Early Read/Write Main Memory Technologies
[Figures: Babbage, 1800s: digits stored on mechanical wheels; Williams Tube, Manchester Mark 1, 1947]
Core Memory
▪ Core memory was the first large-scale reliable main memory
– invented by Forrester in the late 40s/early 50s at MIT for the Whirlwind project
▪ Bits stored as magnetization polarity on small ferrite cores threaded onto a two-dimensional grid of wires
▪ Coincident current pulses on X and Y wires would write a cell and also sense its original state (destructive reads)
▪ Robust, non-volatile storage
▪ Used on Space Shuttle computers
▪ Cores threaded onto wires by hand (25 billion a year at peak production)
▪ Core access time ~1µs
[Figure: DEC PDP-8/E board, 4K words x 12 bits (1968)]
Semiconductor Memory
▪ Semiconductor memory began to be competitive in the early 1970s
– Intel was formed to exploit the market for semiconductor memory
– Early semiconductor memory was Static RAM (SRAM). SRAM cell internals are similar to a latch (cross-coupled inverters).
▪ Good overview of the commercial memory technology
war
▪ And other things (litho basics, geopolitics, supply chain…)
Memory organization
▪ Notation Y -> X: a Y is made up of Xs
▪ DRAM: DIMM -> Rank -> Chip -> Bank -> Array -> Cell
▪ SRAM: similar story
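The hierarchy above implies that a physical address decomposes into per-level coordinates. A minimal sketch, assuming hypothetical field widths and ordering (real controllers interleave these bits to maximize rank/bank parallelism; none of the widths below are from the slides):

```python
# Hypothetical DRAM address map: field order and widths are illustrative only.
# Real controllers interleave these bits to spread traffic across banks/ranks.
FIELDS = [("column", 10), ("bank", 2), ("chip", 3), ("rank", 1), ("row", 14)]

def split_address(paddr):
    """Peel fields off the low end of a physical address, lowest field first."""
    coords = {}
    for name, bits in FIELDS:
        coords[name] = paddr & ((1 << bits) - 1)  # extract this field
        paddr >>= bits                            # shift to the next field
    return coords
```

For example, `split_address(1 << 10)` lands in column 0 of bank 1, since the 10 column bits sit below the 2 bank bits in this assumed layout.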
One-Transistor Dynamic RAM [Dennard, IBM]
[Figure: 1-T DRAM cell. The word line gates an access transistor connecting the bit line to a storage capacitor (implemented as a FET gate, trench, or stack): TiN top electrode at VREF, Ta2O5 dielectric, poly-W bottom electrode]
Modern DRAM Structure
[Figure: an N-bit row address is decoded to select one of 2^N rows; an M-bit column address drives the column decoder and sense amplifiers; each memory cell stores one bit; N+M total address bits, D data bits out]
Double-Data Rate (DDR2) DRAM
[Figure: 200MHz clock with data transferred on both edges, giving a 400Mb/s per-pin data rate. Micron, 256Mb DDR2 SDRAM datasheet]
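The "double data rate" in the name is exactly the factor of two between the clock and the per-pin rate; a one-line check using the slide's figures:

```python
clock_mhz = 200          # DDR2 bus clock from the slide
transfers_per_cycle = 2  # double data rate: data moves on both clock edges
per_pin_mbps = clock_mhz * transfers_per_cycle
print(per_pin_mbps)  # 400, matching the 400Mb/s per-pin rate
```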
DRAM vs. SRAM: Cell level
[Figure: DRAM cell (word line, single bit line) vs. SRAM cell (word line, complementary bit line and ~bit line)]
DRAM vs. SRAM: Cell level
▪ DRAM: optimized for density first, then speed
▪ SRAM: optimized for speed first, then density
Relative Memory Cell Sizes
[Figure: on-chip SRAM in a logic chip vs. DRAM in a memory chip. Foss, “Implementing Application-Specific Memory”, ISSCC 1996]
Administrivia
▪ PS 2 out today, due 2/20
▪ Lab 2 out Thursday
▪ Nafea Bshara guest lecture 2/25
▪ Midterm 3/4
CPU-Memory Bottleneck
[Figure: CPU connected to Memory]
Factors influencing modern memory
system design
▪ Latency
– As a function of capacity/size
▪ Memory improvement vs. CPU improvement
▪ Bandwidth
– Modern packaging
▪ Memory access pattern characteristics
Processor-DRAM Gap (latency)
[Figure: log-scale performance vs. time, 1980–2000. CPU (µProc) performance improves 60%/year; DRAM improves 7%/year; the processor-memory performance gap grows ~50%/year]
A four-issue 3GHz superscalar accessing 100ns DRAM could execute 1,200 instructions during the time for one memory access!
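The 1,200-instruction figure is simply issue width times clock rate times memory latency:

```python
clock_hz = 3e9           # 3GHz core
issue_width = 4          # four-issue superscalar
dram_latency_s = 100e-9  # 100ns DRAM access
instructions = issue_width * clock_hz * dram_latency_s
print(int(instructions))  # 1200
```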
Physical Size Affects Latency
[Figure: a CPU next to a small memory vs. a CPU next to a big memory; the bigger array spans more physical distance, so accesses take longer]
DRAM Packaging
(Laptops/Desktops/Servers)
[Figure: signals into a DRAM chip: ~7 clock and control signals; ~12 multiplexed row/column address lines; a data bus (4b, 8b, 16b, or 32b)]
DRAM Packaging, Apple M1
• 128b databus, running at 4.2Gb/s
• 68GB/s bandwidth
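Assuming the 4.2Gb/s figure is per pin, the quoted bandwidth is just bus width times per-pin rate:

```python
bus_width_bits = 128  # M1 databus width from the slide
per_pin_gbps = 4.2    # per-pin transfer rate (assumed per-pin; slide says 4.2Gb/s)
total_gbytes_per_s = bus_width_bits * per_pin_gbps / 8  # bits -> bytes
# ~67.2 GB/s, quoted on the slide (rounded) as 68GB/s
```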
High-Bandwidth Memory in SX-Aurora
Typical Memory Reference Patterns
[Figure: memory address vs. time: instruction fetches (sequential, with jumps at subroutine call/return), stack accesses (argument access), and data accesses (scalar accesses)]
Two predictable properties of memory references:
▪ Temporal Locality: If a location is referenced, it is likely to be referenced again in the near future.
▪ Spatial Locality: If a location is referenced, it is likely that locations near it will be referenced in the near future.
Memory Reference Patterns
[Figure: memory address (one dot per access) vs. time, annotated with regions of temporal and spatial locality. Donald J. Hatfield, Jeanette Gerald: Program Restructuring for Virtual Memory. IBM Systems Journal 10(3): 168-192 (1971)]
Factors influencing modern memory
system design
▪ Latency
– As a function of capacity/size
▪ Memory improvement vs. CPU improvement
▪ Bandwidth
– Modern packaging
▪ Memory access pattern characteristics
Punchline:
Memory hierarchy,
Arranged in small/fast → big/slow pyramid
Policies to exploit data locality
Memory Hierarchy
[Figure: CPU reads and writes a small, fast memory (RF, SRAM) backed by a big, slow memory (DRAM); data items A and B are copied between the two levels]
Management of Memory Hierarchy
▪ Small/fast storage, e.g., registers
– Address usually specified in instruction
– Generally implemented directly as a register file
• but hardware might do things behind software’s back, e.g., stack management, register renaming
▪ Large/slower storage, e.g., main memory
– Address usually computed from values in registers
– Generally implemented as a hardware-managed cache hierarchy (hardware decides what is kept in fast memory)
Inside a Cache
[Figure: Processor exchanges addresses and data with the CACHE, which exchanges addresses and data with Main Memory; each cache entry pairs an address tag with a data block]
Cache Algorithm (Read)
Look at Processor Address, search cache tags to find match. Then either:
▪ HIT: return a copy of the data from the cache
▪ MISS: read the block from Main Memory, install it in the cache (evicting a victim line if needed), and return the data
Another view:
Cache = HW-Optimized Hash table
[Figure: the address splits into tag (t bits), index (k bits), and block offset (b bits); the index bits act as the hash, selecting one of 2^k lines (bins); each line (slot) holds a valid bit, tag (key), and data block (value); a hit requires the stored tag to equal the address tag]
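The analogy can be made concrete in a few lines. A minimal direct-mapped sketch; the sizes (k=4 index bits, b=6 offset bits) are illustrative choices, not from the slide:

```python
K_BITS, B_BITS = 4, 6  # illustrative: 2^4 = 16 lines of 2^6 = 64-byte blocks

class Line:
    """One cache line: valid bit, tag (the key), data block (the value)."""
    def __init__(self):
        self.valid, self.tag, self.block = False, 0, None

lines = [Line() for _ in range(1 << K_BITS)]  # the 2^k "bins"

def lookup(addr):
    index = (addr >> B_BITS) & ((1 << K_BITS) - 1)  # index bits = the hash
    tag = addr >> (B_BITS + K_BITS)                 # tag bits = the key
    line = lines[index]
    return line.valid and line.tag == tag           # hit iff stored key matches

def fill(addr, block):
    line = lines[(addr >> B_BITS) & ((1 << K_BITS) - 1)]
    line.valid, line.tag, line.block = True, addr >> (B_BITS + K_BITS), block
```

After `fill(0x1234, data)`, any address in the same 64-byte block hits, while an address with the same index but a different tag misses (a conflict): unlike a software hash table, a direct-mapped cache has exactly one slot per bin.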
Direct Map Address Selection
[Figure: direct-mapped cache indexed by either higher-order or lower-order address bits; k index bits select one of 2^k lines and a t-bit tag comparison determines the hit]
2-Way Set-Associative Cache
[Figure: address split into Tag (t), Index (k), and Block Offset (b); the index selects one set containing two ways, each with {V, Tag, Data Block}; both stored tags are compared against the address tag in parallel, a match selects the data word or byte, and HIT is raised]
Fully Associative Cache
[Figure: no index bits; every line's stored tag (t bits) is compared against the address tag in parallel; any match raises HIT, and the offset (b bits) selects the word or byte from the matching data block]
Replacement Policy
In an associative cache, which block from a set should
be evicted when the set becomes full?
• Random
• Least-Recently Used (LRU)
• LRU cache state must be updated on every access
• true implementation only feasible for small sets (2-way)
• pseudo-LRU binary tree often used for 4-8 way
• First-In, First-Out (FIFO) a.k.a. Round-Robin
• used in highly associative caches
• Not-Most-Recently Used (NMRU)
• FIFO with exception for most-recently used block or blocks
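The pseudo-LRU binary tree mentioned above can be sketched for a single 4-way set: three bits stand in for the full LRU ordering, each pointing toward the subtree to evict from next. Class and method names here are illustrative:

```python
class TreePLRU4:
    """Tree pseudo-LRU for one 4-way set: 3 bits instead of a full LRU order."""
    def __init__(self):
        self.root = 0   # 0: evict from ways {0,1}, 1: evict from ways {2,3}
        self.left = 0   # within {0,1}: which way to evict next
        self.right = 0  # within {2,3}: 0 -> evict way 2, 1 -> evict way 3

    def touch(self, way):
        """On an access, flip the bits on the path to point away from `way`."""
        if way < 2:
            self.root = 1        # next victim comes from the other pair
            self.left = 1 - way  # ...and the other way within this pair
        else:
            self.root = 0
            self.right = 3 - way

    def victim(self):
        """Follow the bits to the pseudo-least-recently-used way."""
        return self.left if self.root == 0 else 2 + self.right
```

Accessing ways 0, 2, 1, 3 in order leaves way 0 as the victim, matching true LRU in this case; the approximation only diverges on some longer patterns, which is the price of 3 bits versus the state a full 4-way LRU ordering needs.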
Block Size and Spatial Locality
A block is the unit of transfer between the cache and memory
Acknowledgements
▪ This course is partly inspired by previous MIT 6.823 and
Berkeley CS252 computer architecture courses created by
my collaborators and colleagues:
– Arvind (MIT)
– Joel Emer (Intel/MIT)
– James Hoe (CMU)
– John Kubiatowicz (UCB)
– David Patterson (UCB)
– Krste Asanovic (UCB)
– Sophia Shao (UCB)