EE6304 Lecture8 Mem Hierarchy
Memory Semiconductor Business - 2020
Memory Classifications
Memory Types
Memory Arrays
Memory Capacity, Memory Organization, Speed
Example
Memory Hierarchy
• Memory size vs. speed vs. cost trade-offs
PC Memories Hierarchy
Figure: PC memory hierarchy. On-chip levels from the top: registers, L1 cache (SRAM-CAM), L2 cache (SRAM). Axis labels: speed, cost per bit ($).
On-Chip : Cache ($) Level 1 (L1) and Level 2 (L2)
On-Chip : Cache ($) Level 3 (L3)
• Third level of cache.
• Normally used in multi-core processors or between
CPU and Graphics unit as shared on-chip memory
Main Memory System
DRAM Types
• DRAM generations differ in peak data rate
– DDR: 3.2 Gbytes/second
– DDR2: 8.5 Gbytes/second; can be installed in pairs to increase throughput
– DDR3: 12.8 Gbytes/second; can be installed in pairs or groups of 3
• The usable memory type is limited by the motherboard type
• RAM vendors offer online memory scanning utilities (www.crucial.com)
SDRAM vs. DDR RAM
• SDR DRAM (Single Data Rate)
– First-generation synchronous DRAM
– Transfers a single data word per clock cycle (the word size depends on the design of the memory system, typically 32 or 64 bits)
• DDR DRAM (Double Data Rate)
– Two memory transfers per clock cycle
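The relation between transfers per clock and peak data rate can be sketched as below. This is a minimal illustration, not a vendor formula: the function name, the 200 MHz clock, and the 64-bit bus width are assumptions chosen so that the double-data-rate case lands on the 3.2 Gbytes/second figure quoted for DDR.

```python
def peak_bandwidth_gbytes(clock_mhz, transfers_per_clock, bus_width_bits=64):
    """Peak data rate in Gbytes/second (1 Gbyte = 10**9 bytes).

    clock_mhz and bus_width_bits are illustrative assumptions,
    not values taken from the slides.
    """
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Same clock and bus width, one vs. two transfers per clock cycle.
sdr = peak_bandwidth_gbytes(200, transfers_per_clock=1)
ddr = peak_bandwidth_gbytes(200, transfers_per_clock=2)
print(f"SDR: {sdr:.1f} Gbytes/s, DDR: {ddr:.1f} Gbytes/s")
```

Doubling the transfers per clock doubles the peak rate without raising the clock frequency, which is the point of the DDR design.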
Static RAM (SRAM)
• SRAM retains data for as long as the power supply is applied.
• No special action (except power) is required to retain stored data.
• The access time of an SRAM is much shorter than that of a DRAM or secondary storage.
– Built from flip-flops for storage (each cell uses 4-6 transistors to hold 1 bit)
• Normally used for CPU internal memories (registers, caches, etc.)
6116 SRAM (cont.)
Memory Organization
• The most common organization is the random-access architecture (RAM)
– Any memory location can be accessed in random order at a fixed rate
– Independent of physical location
– For reading or writing
• Access a cell (R/W) by selecting its row (wordline) and column (bitline)
• Bit selection is done using a multiplexer circuit that directs the cell output to a register. A total of 2^n x 2^m cells can be stored
Figure: memory array block diagram. Labels: bitline conditioning, wordlines, bitlines, row decoder (n-k address bits), memory cells (2^(n-k) rows x 2^(m+k) columns), column circuitry, column decoder (k address bits), 2^m output bits.
Memory Organization
• If n = m = 8, the memory has a total of 2^8 x 2^8 = 65,536 cells
• The memory uses a 16-bit address to produce a single-bit output (2^16 = 65,536 → 16 address bits)
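The 16-bit address in this example can be split into a row (wordline) index and a column (bitline) index. The sketch below assumes the upper 8 bits select the row and the lower 8 bits select the column; the slides do not state the bit assignment, so that ordering and the function name are illustrative only.

```python
def split_address(addr, row_bits=8, col_bits=8):
    """Split a flat address into (row, column) indices.

    Assumption: the upper bits select the wordline (row) and the
    lower bits select the bitline (column).
    """
    total_bits = row_bits + col_bits
    if not 0 <= addr < (1 << total_bits):
        raise ValueError(f"address must fit in {total_bits} bits")
    row = addr >> col_bits               # upper bits go to the row decoder
    col = addr & ((1 << col_bits) - 1)   # lower bits go to the column multiplexer
    return row, col


total_cells = (1 << 8) * (1 << 8)        # 2^8 x 2^8 cells
row, col = split_address(0xA53C)
print(total_cells, row, col)
```

Here address 0xA53C selects row 0xA5 and column 0x3C of the 256 x 256 array.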
SRAM cell = 6 transistors
Ref: Hodges, Jackson, Saleh, Analysis and Design of Digital Integrated Circuits
SRAM – 6T Cell
• Cell size accounts for most of array size
– Reduce cell size at expense of complexity
• 6T SRAM Cell
– Used in most commercial chips
– Data stored in cross-coupled inverters
• Value is stored symmetrically—both true and
complement are stored on cross-coupled transistors
• Write:
– Drive data onto bit, bit’
– Raise wordline (select)
• Read:
– Raise wordline (select)
– Read data on bit, bit’
SRAM - Write
– Drive data onto bit, bit’
– Raise wordline (select)
– Lower wordline (select)
SRAM - READ
– Raise wordline (select)
– Read data on bit, bit’
– Lower wordline
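The write and read sequences can be mirrored in a small behavioral model. This is a logic-level sketch only, with no transistor sizing or timing; the class and method names are hypothetical, and the wordline is modeled implicitly as comments inside each operation.

```python
class SramCell:
    """Logic-level model of a 6T SRAM cell (illustrative, not a circuit model)."""

    def __init__(self):
        # Cross-coupled inverters hold a value and its complement.
        self.q = 0
        self.q_bar = 1

    def write(self, value):
        if value not in (0, 1):
            raise ValueError("value must be 0 or 1")
        bit, bit_bar = value, 1 - value        # drive data onto bit, bit'
        # Wordline raised: access transistors connect the bitlines to the cell.
        self.q, self.q_bar = bit, bit_bar
        # Wordline lowered: the cell is isolated and holds the value.

    def read(self):
        # Wordline raised: the cell drives bit and bit'; reading does not
        # disturb the stored value.
        bit, bit_bar = self.q, self.q_bar
        # Wordline lowered.
        return bit, bit_bar


cell = SramCell()
cell.write(1)
print(cell.read())   # true and complement values
```

Because both the value and its complement are stored, a read always returns a complementary pair, which is what a differential sense amplifier relies on.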
Content-addressable Memory (CAM)
• Based on SRAM
• Also called associative memories
• Compares the desired information against the entire list of pre-stored entries simultaneously → extremely fast (typically 1 clock cycle)
– Matching scheme vs. decoding scheme
• Supply the desired data (tag) → the CAM returns the address where it is stored (in some architectures, it also returns the contents of that storage address)
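The difference from a normal RAM lookup (address in, data out) can be illustrated with a small software model. The class and method names are hypothetical, and the loop only stands in for the comparison that the hardware performs on all entries at once.

```python
class Cam:
    """Software model of a content-addressable memory (illustrative only)."""

    def __init__(self, size):
        self.entries = [None] * size     # one stored tag per address

    def store(self, address, tag):
        self.entries[address] = tag

    def search(self, tag):
        # Hardware compares every entry in parallel in about one cycle;
        # this sequential scan only models the result, not the speed.
        return [addr for addr, stored in enumerate(self.entries)
                if stored is not None and stored == tag]


cam = Cam(size=8)
cam.store(2, 0xBEEF)
cam.store(5, 0xCAFE)
print(cam.search(0xCAFE))   # addresses whose content matches the tag
print(cam.search(0x1234))   # no match gives an empty list
```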
Dynamic Random-Access Memories (DRAM)
• Smaller-area memories reduce the cost per bit
• SRAM requires a 6-transistor cell and 4-5 lines connecting each cell
• DRAM stores data as charge on a capacitance
• Leakage current removes the stored charge → DRAM requires periodic refreshing of the charge
• The refresh circuit makes DRAM:
– Slower than SRAM
– Consume more power
DRAM 1T cell
• Reading/writing is done by turning transistor M1 on
• Data is stored as a high/low voltage on C1
• C1 is kept as small as possible, at the cost of refreshing more often
• During a read cycle the data is destroyed (destructive read-out) because C1 discharges → it needs to be regenerated
• After the data is read out, the sense amplifier must immediately write it back
DRAM 1T cell
• Read
– Sensing circuitry measures the amount of charge that flows
into the capacitor and determines whether it is zero or a
one.
– The capacitor is refreshed by either fully charging it or completely depleting it of charge, depending on which state it was in initially
• Write
– Wordline and bitline are brought to a high voltage → the transistor is on
– Charge can flow to the capacitor
• If the capacitor initially had no charge (stored zero) → charge flows into the capacitor
• If the capacitor was initially charged (stored one) → very little charge flows into the capacitor
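Leakage, destructive read-out, and write-back can be tied together in a toy model of the 1T cell. Everything numeric here is an assumption made for illustration: charge is a fraction of full charge, the sensing threshold is 0.5, and leakage is applied as a fixed fractional loss per step.

```python
class DramCell:
    """Toy 1T DRAM cell: charge as a fraction of full charge (illustrative)."""

    FULL = 1.0
    THRESHOLD = 0.5          # assumed sensing threshold

    def __init__(self):
        self.charge = 0.0

    def write(self, bit):
        # Transistor on: the capacitor is fully charged or fully depleted.
        self.charge = self.FULL if bit else 0.0

    def leak(self, fraction):
        # Leakage current removes part of the stored charge.
        self.charge *= 1.0 - fraction

    def read(self):
        bit = 1 if self.charge > self.THRESHOLD else 0
        self.charge = 0.0    # destructive read-out: the stored charge is lost
        self.write(bit)      # the sense amplifier writes the value back
        return bit

    def refresh(self):
        self.read()          # a refresh is a read followed by write-back


refreshed, neglected = DramCell(), DramCell()
refreshed.write(1)
neglected.write(1)
for _ in range(2):
    refreshed.leak(0.3)
    refreshed.refresh()      # charge restored before it drops too far
    neglected.leak(0.3)      # never refreshed
print(refreshed.read(), neglected.read())
```

With these assumed numbers the refreshed cell still reads as one, while the neglected cell has decayed below the threshold and reads as zero.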
Data Retention in Memory
• The retention time profile of DRAM is:
– Location dependent
– Stored-value-pattern dependent
– Time dependent
DRAM – CAS/RAS
• Used to reduce the number of address pins → DRAM is used as main computer memory (currently ~8 Gbytes)
• Send the first address half (row address, strobed by RAS)
• Send the second address half (column address, strobed by CAS)
• DRAM has an internal latch to store the full address
→ Complex interface
→ Refresh circuit
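The pin saving and the role of the internal latch can be sketched as follows. The names are hypothetical and the 8-bit row and column halves are assumed values for illustration; real devices have their own address widths and timing rules.

```python
def address_pins(row_bits, col_bits, multiplexed=True):
    """Address pins needed with and without RAS/CAS multiplexing."""
    return max(row_bits, col_bits) if multiplexed else row_bits + col_bits


class AddressLatch:
    """Models the internal latch that rebuilds the full address."""

    def __init__(self, col_bits):
        self.col_bits = col_bits
        self.row = None

    def ras(self, row_half):
        # First half arrives on the shared pins and is latched.
        self.row = row_half

    def cas(self, col_half):
        # Second half arrives on the same pins and completes the address.
        if self.row is None:
            raise RuntimeError("RAS must come before CAS")
        return (self.row << self.col_bits) | col_half


print(address_pins(8, 8, multiplexed=False), address_pins(8, 8))

latch = AddressLatch(col_bits=8)
latch.ras(0xA5)
full_address = latch.cas(0x3C)
print(hex(full_address))
```

Sending the address in two halves halves the address pin count in this example, at the price of the extra latch and a two-step interface.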
Ref. Wikipedia
DRAM Read Cycle
• Simplified read Cycle
Example
Example
Memory Organization – 1 Byte
• Only 1 bit can be read or stored per array → multiple arrays are needed, e.g., 8 arrays to store 1 byte
• A group of memory arrays organized like this is called a bank
Bank Organization
• A single chip contains 4, 8, or 16 banks
• Part of the memory address is allocated to the bank address: 3 bits for 8 banks, 4 bits for 16 banks
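The number of address bits needed to select a bank is the base-2 logarithm of the bank count. A minimal check, using a hypothetical helper name and assuming the bank count is a power of two:

```python
def bank_address_bits(num_banks):
    """Address bits needed to select one of num_banks banks.

    Assumes num_banks is a power of two, as in the 4/8/16-bank chips
    mentioned above.
    """
    if num_banks < 1 or num_banks & (num_banks - 1):
        raise ValueError("num_banks must be a power of two")
    return num_banks.bit_length() - 1    # log2 for powers of two


for banks in (4, 8, 16):
    print(banks, "banks ->", bank_address_bits(banks), "bank address bits")
```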
DIMM Organization
• 8 banks fitted on the same circuit board: a stick of RAM, memory module, or DIMM
• The DIMM is connected to the CPU through a memory channel
– All DIMMs connected to the same memory channel are called a RANK (typically each DIMM is on a separate channel: single-RANK DIMM)
• A typical PC has four DIMM slots
– If all four DIMMs can be accessed individually: quad-channel mode (not all PCs support quad-channel mode → check with a utility, e.g., CPU-Z)
DRAM Capacity, Bandwidth, Latency Trends
Figure: DRAM capacity, bandwidth, and latency trends, 1999-2017 (log scale from 1 to 100; annotated improvements of 20x and 1.3x).
Figure: CPU with a DRAM controller and a PCM controller.
• DRAM: fast, durable; small, leaky, volatile, high-cost
• Phase Change Memory (or Tech. X): large, non-volatile, low-cost; slow, wears out, high active energy
Recent DRAMs are More Vulnerable
• All modules from 2012-2013 become vulnerable
Figure annotation: first appearance
Taking over the Computer
RowHammer
• Like breaking into an apartment by repeatedly slamming a neighbor's door until the vibrations open the door you were after
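The analogy corresponds to repeatedly activating one DRAM row until bits flip in the physically adjacent rows. The toy model below only illustrates that idea: the class name, the per-row disturbance counter, and the flip threshold of 1000 activations are all assumptions, not measured values, and real devices behave far less regularly.

```python
class ToyDramBank:
    """Toy disturbance model of RowHammer (all parameters are assumptions)."""

    def __init__(self, num_rows, flip_threshold):
        self.data = [1] * num_rows            # one stored bit per row
        self.disturbance = [0] * num_rows     # activations seen by neighbors
        self.flip_threshold = flip_threshold  # hypothetical value

    def activate(self, row):
        # Activating a row disturbs the rows directly above and below it.
        for victim in (row - 1, row + 1):
            if 0 <= victim < len(self.data):
                self.disturbance[victim] += 1
                if self.disturbance[victim] == self.flip_threshold:
                    self.data[victim] ^= 1    # bit flip in the victim row

    def refresh_all(self):
        # Refresh restores the charge, so accumulated disturbance is cleared.
        self.disturbance = [0] * len(self.data)


bank = ToyDramBank(num_rows=8, flip_threshold=1000)
for _ in range(1000):
    bank.activate(3)                          # hammer the aggressor row
print(bank.data)                              # rows 2 and 4 have flipped
```

In this model a refresh issued before the threshold is reached prevents the flip, which is why more frequent refresh is one of the mitigations usually discussed.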
Some Potential Solutions
http://www.aimemory.com.tw/index.php/technology/near-memory-computing/
Processing in-Memory
• The current von Neumann architecture spends more time moving data than processing it
• Accelerators don’t help (enough) if they use the same architecture
→ Need new types of active memory that can both store data and process it
Summary
• Memory classification
• Need for memory hierarchies
• Memory capacity and organization
• SRAM
• DRAM
• Problems with new DRAM memories
– Rowhammer
• In-memory/near-memory computing