DICD Fall 2024 Lecture 10 Memory

The document is a lecture on Digital Integrated Circuit Design focusing on memory types including SRAM, DRAM, and Flash Memory, along with emerging memory technologies. It discusses memory hierarchy, classification, parameters, and architecture, detailing the operations and constraints of SRAM and DRAM. The lecture also covers the principles of Flash memory and compares NAND and NOR Flash, concluding with a brief mention of resistance-based memories.

EE-808 Fall 2024

Digital Integrated Circuit Design

Lecture # 10
Memory

Muhammad Imran
[email protected]
Acknowledgement

 Content from the following resources has been used in these lectures:
 Digital Integrated Circuits, Adam Teman, BIU, Israel
 Jan M. Rabaey, Digital Integrated Circuits, 2nd Ed.
 Abu Sebastian, IBM Research
 Christian Weis, RPTU
 N. Wehn, RPTU
 C. Sudarshan
Contents

 Memory – An Introduction
 SRAM
 DRAM
 Flash Memory
 Emerging Memory Technologies
 In-Memory Computing
Memory – An Introduction
Significance of Memory

 Intel Pentium-M (2001) – 2MB L3 Cache
 Intel 10th Gen “Comet Lake” (2020) – 20MB L3 Cache
 Source: Intel
Memory Hierarchy

 [Figure: the memory hierarchy]
 Source: Pavlov, Sachdev
Memory Classification

 Memory Arrays
 Random Access Memory
 Read/Write Memory (RAM) (Volatile)
 Static RAM (SRAM)
 Dynamic RAM (DRAM)
 Read Only Memory (ROM) (Nonvolatile)
 Mask ROM
 Programmable ROM (PROM)
 Erasable Programmable ROM (EPROM)
 Electrically Erasable Programmable ROM (EEPROM)
 Flash ROM
 Serial Access Memory
 Shift Registers
 Serial In Parallel Out (SIPO)
 Parallel In Serial Out (PISO)
 Queues
 First In First Out (FIFO)
 Last In First Out (LIFO)
 Content Addressable Memory (CAM)
Memory Parameters

 Size:
 Bits, Bytes, Words
 Timing Parameters:
 Read access, write access, cycle time
 Function:
 Read Only (ROM) – non-volatile
 Read/Write (RWM) – volatile
 NVRWM – non-volatile Read/Write
 Access Pattern:
 Random Access, FIFO, LIFO, Shift Register, CAM
 I/O Architecture:
 Single Port, Multi-port
 Application:
 Embedded, External, Secondary
Random Access Chip Architecture

 Conceptual: linear array
 Each box holds some data
 But this leads to a long and skinny shape

 Let’s say we want to make a 1MB memory:
 1MB = 2^20 words × 8 bits = 2^23 bits, each word in a separate row
 A decoder would reduce the number of access pins from 2^20 access pins to 20 address lines
 The output lines (= bit lines) would be extremely long, as would the delay of the huge decoder
 The array’s height is about 128,000 times larger than its width (2^20/2^3)
Square Ratio

 Instead, let’s make the array square:
 1MB = 2^23 bits = 2^12 rows × 2^11 columns
 There are 4096 rows, so we need a 12-bit row address decoder (to select a single row)
 There are 2048 columns, representing 256 8-bit words
 We need to select only one of the 256 words through a column address decoder (or multiplexer)
 We call the row lines “Word Lines” and the column lines “Bit Lines”
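The sizing arithmetic above can be sketched in a few lines of Python (an illustrative helper, not part of the lecture):

```python
# Sketch: splitting a memory into a near-square array.
import math

def array_dims(total_bits: int, word_bits: int):
    """Split total_bits into 2^r rows x 2^c columns with r + c = log2(total_bits),
    choosing the squarest split (r = ceil(n/2))."""
    n = int(math.log2(total_bits))
    row_addr_bits = (n + 1) // 2          # row-address width
    col_bits = n - row_addr_bits          # column-dimension exponent
    rows, cols = 2 ** row_addr_bits, 2 ** col_bits
    words_per_row = cols // word_bits     # words multiplexed in each row
    return rows, cols, words_per_row

# 1MB = 2^23 bits of 8-bit words -> 4096 rows x 2048 columns, 256 words/row
print(array_dims(2 ** 23, 8))
```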
Memory Architecture

 Memory Size: W words of C bits = W × C bits
 Address bus: A bits, W = 2^A
 Number of words in a row: 2^M (multiplexing factor: M)
 Number of rows: 2^(A−M)
 Number of columns: C × 2^M
 Row Decoder: A−M address bits → 2^(A−M) word lines
 Column Decoder: M address bits → 2^M
 [Figure: array of storage cells; row decoder on address bits ADD(A−1):ADD(M) drives the word lines; sense amplifiers/drivers and a column decoder on ADD(M−1):ADD(0) select the word; Input/Output is C bits]
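These relationships can be captured in a small sketch (the helper name is illustrative):

```python
# Sketch of the slide's architecture parameters.
def memory_architecture(A: int, C: int, M: int):
    """A address bits, C-bit words, 2^M words per row."""
    return {
        "words":    2 ** A,        # W = 2^A
        "rows":     2 ** (A - M),  # word lines out of the row decoder
        "columns":  C * 2 ** M,    # bit lines
        "row_addr": A - M,         # bits to the row decoder
        "col_addr": M,             # bits to the column decoder
    }

# 1MB of 8-bit words (A = 20) with 256 words per row (M = 8):
print(memory_architecture(20, 8, 8))
```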
SRAM
Basic Static Memory Element

 [Figure: two cross-coupled inverters storing complementary nodes Q and QB]
Writing into a Cross-Coupled Pair

 The write operation is ratioed
 The access transistor must overcome the feedback
 [Figure: data D driven through an access transistor (enable En) onto node Q of the cross-coupled pair]
How should we write a ‘1’

 Option 1: nMOS Access Transistor
 Passes a “weak ‘1’”, bad at pulling up against the feedback
 Option 2: pMOS Access Transistor
 Passes a “weak ‘0’”, bad at pulling down against the feedback
 Option 3: Transmission Gate
 Writes well, but how do we…
 Solution: Differential nMOS Write
Reading from a 6T SRAM Cell

6-Transistor CMOS SRAM Cell

 [Figure: 6T cell – BL through access transistor M2 and BLB through access transistor M5, both gated by WL; pull-ups M3/M6 and pull-downs M1/M4 form the cross-coupled inverters holding Q and QB]
6T SRAM Operation
SRAM Operation – Hold

 [Figure: 6T cell with WL low – access transistors M2/M5 off, the cross-coupled inverters hold Q and QB]
SRAM Operation – Read

 Cell stores Q = VDD, QB = 0; BL and BLB are at VDD when WL rises
 QB side: BLB discharges through M5 and M4 – an “nMOS” inverter – so the QB voltage rises to ΔV
 Q side: No change!
SRAM Operation – Read

 [Figure: read half-circuit – BLB through access transistor M5 onto QB = ΔV, with pull-down M4]
 Cell Ratio: CR = (W4/L4) / (W5/L5)
Cell Ratio – Read Constraint

 [Figure: read half-circuit – BLB through M5 onto QB = ΔV, with pull-down M4]
 So we need the pull-down transistor to be much stronger than the access transistor:
 CR = (W4/L4) / (W5/L5)
SRAM Operation – Write

 Cell stores Q = 0, QB = VDD; to write the opposite value, drive BL = VDD, BLB = 0, and raise WL
 Q side: BL = VDD through M2 against pull-down M1 raises Q = ΔV only – same as during read, designed so ΔV < VM
 QB side: BLB = 0 through M5 against pull-up M6 – a pseudo-nMOS inverter! – pulls QB down to VOLmin
SRAM Operation – Write

 [Figure: write half-circuit – BLB = 0 through access transistor M5 pulls QB = VOLmin against pull-up M6]
 Pull-Up Ratio: PR = (W6/L6) / (W5/L5)
Pull-Up Ratio – Write Constraint

 [Figure: write half-circuit – M5 pulling QB = VOLmin against pull-up M6]
 So we need the access transistor to be much stronger than the pull-up transistor:
 PR = (W6/L6) / (W5/L5)
Summary – SRAM Sizing Constraints

 Read Constraint (pull-down vs. access):
 CR = (W1/L1)/(W2/L2) = (W4/L4)/(W5/L5), K_PDN > K_access
 Write Constraint (access vs. pull-up):
 PR = (W3/L3)/(W2/L2) = (W6/L6)/(W5/L5), K_access > K_PUN
 Overall: K_PDN > K_access > K_PUN
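A toy check of this ordering, with drive strength modeled crudely as W/L scaled by an assumed pMOS mobility factor (the 0.5 value is an example, not from the lecture):

```python
# Illustrative check of the SRAM sizing ordering K_PDN > K_access > K_PUN.
def drive(w: float, l: float, pmos: bool = False) -> float:
    # Crude drive-strength model: K ~ mobility_factor * W / L
    return (0.5 if pmos else 1.0) * w / l

def sizing_ok(w_pdn: float, w_acc: float, w_pun: float, l: float = 1.0) -> bool:
    k_pdn = drive(w_pdn, l)             # pull-down nMOS (M1/M4)
    k_acc = drive(w_acc, l)             # access nMOS (M2/M5)
    k_pun = drive(w_pun, l, pmos=True)  # pull-up pMOS (M3/M6)
    return k_pdn > k_acc > k_pun

print(sizing_ok(2.0, 1.0, 1.0))  # wide pull-down, minimum access/pull-up
```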
Multi-Port SRAM – Dual Port SRAM
29

Dual Port
SRAM
DRAM
DRAM Invention

 By Robert Heath Dennard
 In 1966
 The 1-transistor DRAM cell!

 What else is he known for?
 Dennard Scaling!
DRAM Cell

 Capacitor can be a real capacitor or a transistor with gate-to-substrate capacitance!
 Capacitor charged = logic 1
 Capacitor discharged = logic 0
 Multi-level cells have been suggested but require complex technology
Writing to DRAM cell

 Drive bit line to VDD or GND
 Turn on access transistor
 Capacitor is either charged (1) or discharged (0)
Reading from DRAM cell

 Bit line is at a reference voltage!
 Turn on access transistor
 If 1 is stored, charge is transferred from cell to bit line, increasing the bit-line voltage slightly!
 If 0 is stored, charge is transferred from bit line to the cell, decreasing the bit-line voltage slightly!
 The voltage change (positive/negative relative to the reference) is amplified by a sense amplifier to detect logic 1 or 0
Reading from DRAM cell

 Two challenges:
 The capacitor loses its value while the read operation is performed!
 Reads are destructive!
 The capacitance of the bit line is huge compared to the cell’s capacitance!
 Because the bit line is attached to multiple cells!
 The cell must still be able to change the bit-line voltage consistently!
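The charge-sharing arithmetic behind the small read signal can be sketched as follows (the capacitance and voltage values are assumed examples, not from the lecture):

```python
# Charge sharing on a DRAM read: with the bit line precharged to VDD/2,
# the sensed swing is dV = (V_cell - VDD/2) * C_cell / (C_cell + C_bitline).
def read_swing(v_cell: float, vdd: float, c_cell: float, c_bl: float) -> float:
    return (v_cell - vdd / 2) * c_cell / (c_cell + c_bl)

# Assumed example: 30 fF cell, 300 fF bit line, VDD = 1.2 V
dv_one  = read_swing(1.2, 1.2, 30e-15, 300e-15)   # stored '1'
dv_zero = read_swing(0.0, 1.2, 30e-15, 300e-15)   # stored '0'
print(f"{dv_one*1e3:+.1f} mV, {dv_zero*1e3:+.1f} mV")  # → +54.5 mV, -54.5 mV
```

The 10:1 bit-line-to-cell capacitance ratio illustrates why only tens of millivolts reach the sense amplifier.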
Restore and Precharge

 Solutions to the two challenges:
 Destructive read solution:
 Each time a row is selected and read
 Amplifiers on the columns recharge the capacitors in the storage cells and regenerate their previous levels
 Dealing with bit-line capacitance:
 Before row access, bit lines are precharged to a known intermediate voltage level
 Only a minimum charge transfer from the cell is then required to change the value read from a column
 The capacitor’s level is compared to the applied middle voltage to determine 0 or 1
Sense Amplifier, Precharge and Write Drivers
Refresh

 DRAM capacitor loses charge over time
 Because the access transistor isn’t a perfect switch!
 Need periodic refresh
 Typically every 64 ms these days!
 Makes DRAM consume a significant amount of overall power!
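A back-of-the-envelope estimate of that overhead (the row count and per-row refresh time below are assumed example values):

```python
# Rough refresh-overhead estimate: every row must be refreshed once per 64 ms.
def refresh_overhead(rows_per_bank: int, t_refresh_s: float,
                     t_row_cycle_s: float) -> float:
    """Fraction of time the bank is busy refreshing."""
    return rows_per_bank * t_row_cycle_s / t_refresh_s

# Assumed: 65536 rows, 64 ms retention, ~50 ns per row refresh cycle
print(f"{refresh_overhead(65536, 64e-3, 50e-9):.1%}")  # → 5.1%
```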
DRAM Organization

DRAM Rows and Columns

 Row is selected using the row address!
 Column is selected using the column address!
Address Multiplexing

 DRAM uses address multiplexing to reduce the number of pins for the address
 The address is divided into a row address and a column address
 The row address is provided first and latched
 An entire row is selected!
 Then the column address is provided to select one cell within the row, or one bit from the row buffer!
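The row/column split can be sketched as follows (the 11/11 bit widths are illustrative):

```python
# Sketch of address multiplexing: one 22-bit address presented as two
# 11-bit halves on shared pins.
ROW_BITS, COL_BITS = 11, 11

def split_address(addr: int):
    row = addr >> COL_BITS              # latched first (on nRAS)
    col = addr & ((1 << COL_BITS) - 1)  # presented second (on nCAS)
    return row, col

row, col = split_address(0b10101010101_00110011001)
print(row, col)  # the same pins carry 11 bits twice instead of 22 bits once
```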
Address Multiplexing

 nRAS – Row Address Strobe, a pulse to latch the row address
 nCAS – Column Address Strobe
 nWE – Write Enable
From Single to Multi-bit DRAMs

 At the outset:
 DRAMs had a data width of 1 bit
 Din and Dout were separate buses
 A single control signal nWE was used to select between read/write
 With wider data widths (multiple banks with multiple row/column organizations):
 A single tri-state DQ bus was used!
 A separate nOE selects output to the data pins!
 With 16-bit DRAMs:
 nUCAS (upper) and nLCAS (lower) allow addressing the upper/lower byte separately!
From Single to Multi-bit DRAMs

 Multiple chips combine to meet the processor data-bus width!
 E.g., a 64-bit processor would need 4 of x16-type DRAM chips or 16 of x4 DRAM chips!
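The chip-count arithmetic is simply the bus width divided by the chip data width:

```python
# How many DRAM chips of a given data width fill a processor data bus.
def chips_needed(bus_width: int, chip_width: int) -> int:
    assert bus_width % chip_width == 0  # widths must divide evenly
    return bus_width // chip_width

print(chips_needed(64, 16))  # 4 of x16
print(chips_needed(64, 4))   # 16 of x4
```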
From Single to Multi-bit DRAMs

 Connecting an 8-bit processor with eight 1-bit-wide DRAMs
DRAM in General-Purpose Computers

 There are more rows than columns
 Rows require less peripheral circuitry than columns
 Typically 16384 rows and 1024 columns!

 Multiple array (row × column) structures (e.g., 8) form one bank
 Allows, e.g., a 1-byte read or write!
 The same address is applied to all arrays

 Multiple banks combine to form a DRAM chip
 Some address bits are needed to select a bank within a chip!

 Several chips combine to form a rank
 There are typically two ranks on a Dual-Inline Memory Module (DIMM)!
 Each rank transfers data through a channel to the memory controller!
 Channels may be shared between DIMMs or dedicated, depending on the processor!
Flash Memory
Flash Memory

 Working principle:
 Trapping electrons in the floating gate of a floating-gate MOSFET (Flash cell)
Flash Memory

 Erase operation – Writing all 1’s – NOR Flash
 Source: Digital Integrated Circuits – Prof. Yoonmyung Lee (이윤명), SKKU
Flash Memory

 Write operation – Writing 0 – NOR Flash
Flash Memory

 Read operation – NOR Flash

Flash Memory

 Erasure is block-level rather than byte- or bit-level!
 Erasure is necessary before writes!
 Performed by applying an electric field to the whole block, making the process much quicker!
 Erasure resets every bit in the block to binary 1
 By removing electrons from the floating gate!
 Writing involves generating binary 0s as required!
 By trapping electrons in the floating gate!
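These erase/program semantics can be modeled in a few lines (an illustrative software model, not a device driver): erase sets a whole block to all 1s, and programming can only clear bits.

```python
# Flash erase-before-write model: program pulls bits 1 -> 0, never 0 -> 1.
BLOCK_BYTES = 4  # toy block size

def erase(block: list) -> None:
    block[:] = [0xFF] * len(block)   # every bit becomes 1

def program(block: list, offset: int, value: int) -> None:
    block[offset] &= value           # can only pull bits to 0

blk = [0] * BLOCK_BYTES
erase(blk)
program(blk, 0, 0b10100101)
print(f"{blk[0]:08b}")
# Re-programming offset 0 with 0xFF does NOT restore the cleared bits:
program(blk, 0, 0xFF)
print(blk[0] == 0b10100101)  # True – an erase is needed first
```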
NAND vs NOR Flash

 https://www.embedded.com/flash-101-nand-flash-vs-nor-flash/
NAND vs NOR Flash

 After block erasure, NOR allows byte writes!
 NAND writes a page (or part-page) at a time!
 Write speed about 6 times faster than NOR!

 Byte-level addressing of NOR allows (~3×) faster read access
 Large sequential reads will be faster in NAND!
 NAND has higher density and is cheaper!
 More errors in NAND than NOR!
 NAND is useful for large storage, cameras, USB sticks!
 NOR is useful in small sizes with frequent reads, such as for the OS of embedded and mobile devices!
Emerging Memory Technologies

Resistance-Based Memories

 ReRAM, PCM, STT-MRAM


Resistance-Based Memories

 Reliability concerns in emerging memories
 Resistance drift in PCM
 [Figure: resistance vs. time – the amorphous (high-resistance) state drifts upward; an SLC cell (levels 0/1) keeps its margin, but in an MLC cell (levels 00/01/10/11) a drifting level crosses a boundary – Error!]
Resistance-Based Memories

 Reliability concerns in emerging memories
 Resistance drift in PCM
 Drift is data-dependent
 [Figure: when drift-prone levels are more frequent, Invert/Complement encodings remap the data to other levels]
 Imran et al., ICCAD (2019)

Resistance-Based Memories
60

 Reliability concerns in emerging memories


 Write Disturbance in PCM
 Write disturbance is also data-dependent

Resistance

RESE
Amorphous
RESET
State
T
Crystalline
SET
State
Resistance-Based Memories

 Reliability concerns in emerging memories
 Write Disturbance in PCM
 [Figure: example – binary 1 1 1 1 = 8+4+2+1 = (15)d; a run of RESETs is instead programmed to an intermediate state]
 Imran et al., IEEE Transactions on Computers (2021)
Resistance-Based Memories
In-Memory Computing

Memory Wall

 HW throughput grew faster than memory BW
 Memory-intensive applications, e.g., neural networks
 Memory and communication energy dominates
 Source: AI and Memory Wall (Medium post); Source: Google
In-Memory Computing - Emerging Memories

 Resistive-memory-based architectures
 MAP operands to conductance values and read voltages, DECIPHER the result from the current
 Multiply & Add operation (@8-bit) in a 60×60 array consumes <0.001 pJ per operation
 Digital Int8 ADD → 0.007 pJ
 Burr et al., Adv. Phys. X (2017), Xia and Yang, Nature Materials (2019)
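That mapping can be modeled in software: Ohm's law does each multiplication and Kirchhoff's current law does the addition down each column (the conductance and voltage values below are illustrative, not device data):

```python
# Analog matrix-vector multiply in a resistive crossbar, modeled in software.
G = [[1.0, 0.5],   # conductances stored in the array (the "weights")
     [0.2, 0.8]]
V = [0.3, 0.6]     # voltages applied on the rows (the inputs)

# Current collected on column j: I_j = sum_i G[i][j] * V[i]
I = [sum(G[i][j] * V[i] for i in range(len(V))) for j in range(len(G[0]))]
print(I)
```

The whole matrix-vector product happens in one step, in place, which is where the energy advantage over digital Multiply & Add comes from.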
In-Memory Computing - SRAM

 Logic operations in SRAM
 [Figure: two operand rows activated together; the sense amplifier (SA) resolves 1 or 0 depending on the operand combination]
 Aga et al., HPCA (2017), Jeloka et al., JSSC (2016)
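The multi-row activation idea can be modeled in software (a sketch of the behavior reported by Jeloka et al., not their circuit): with several word lines raised onto precharged bit lines, BL stays high only if every activated cell stores 1, and BLB stays high only if every cell stores 0.

```python
# Multi-row SRAM read: BL gives AND, BLB gives NOR of the activated cells.
def multi_row_read(cells: list) -> tuple:
    bl  = int(all(c == 1 for c in cells))   # any stored 0 discharges BL
    blb = int(all(c == 0 for c in cells))   # any stored 1 discharges BLB
    return bl, blb                          # (AND, NOR) of the operands

for a in (0, 1):
    for b in (0, 1):
        res = multi_row_read([a, b])
        print(a, b, "-> AND:", res[0], "NOR:", res[1])
```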


In-Memory Computing - SRAM

 Addition, multiplication and convolution in SRAM
 Eckert et al., ISCA (2018)
In-Memory Computing - SRAM

 Matrix-Vector Multiplication using SRAM + Capacitor
 MAP operands to SRAM content and capacitor voltages, DECIPHER the result from the voltage along the BL
 Biswas et al., ISSCC (2018)
 Valavi et al., JSSC (2019)
 Khaddam-Aljameh et al., TVLSI (2020)
In-Memory Computing - DRAM

 RowClone in DRAM – Fast Parallel Mode
 Seshadri et al., MICRO (2013)
In-Memory Computing - DRAM

 RowClone in DRAM – Pipelined Serial Mode
 Reading from one bank
 Writing to another bank
 Seshadri et al., MICRO (2013)
In-Memory Computing - DRAM

 AND/OR operations in DRAM
 [Figure: operand rows plus a 0/1 control row activated together; the sense amplifier (SA) resolves 1 or 0 depending on the operand combination]
 Seshadri et al., MICRO (2017), Li et al., MICRO (2017)
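The triple-row activation described by Seshadri et al. can be modeled as a majority function (software sketch only): charge sharing among three simultaneously activated cells makes the sense amplifier resolve their majority value, and fixing the third (control) row to 0 or 1 selects AND or OR.

```python
# Ambit-style triple-row activation: majority of three cells.
def triple_row(a: int, b: int, c: int) -> int:
    return int(a + b + c >= 2)   # sense amp resolves the majority charge

def dram_and(a: int, b: int) -> int:
    return triple_row(a, b, 0)   # control row pre-set to 0

def dram_or(a: int, b: int) -> int:
    return triple_row(a, b, 1)   # control row pre-set to 1

print([dram_and(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 0, 0, 1]
print([dram_or(a, b) for a in (0, 1) for b in (0, 1)])   # [0, 1, 1, 1]
```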


In-Memory Computing - DRAM

 NOT operation in DRAM
 Seshadri et al., MICRO (2017)
In-Memory Computing - DRAM

 Bit-wise operations in DRAM with a modified sense amplifier
 Xin et al., HPCA (2020)
In-Memory Computing - DRAM

 Bit-wise operations in DRAM with a modified sense amplifier
 OR operation …
 Xin et al., HPCA (2020)
Relevant Reading

 Jan M. Rabaey, Digital Integrated Circuits, 2nd Ed.
 Chapter 12