Introduction To DDRX Technology 1732457404

Uploaded by

neethu
RAGHAVENDRA ANJANAPPA Introduction to DDRx Technology

INTRODUCTION TO DDRX TECHNOLOGY

Raghavendra Anjanappa
▪ Image source: Distrelec International. CC: Vince Balogh https://www.behance.net/gallery/75625645/Processor

AGENDA
► What is a Memory? Memory History and Application Space

► What is a RAM? SRAM vs DRAM, Types of DRAM

► SDRAM, DDR SDRAM, QDR SDRAM

► RAM & CPU Machine Cycle, Computer System Memory Hierarchy

► DDR Evolution, DDR1/2/3/4, DDR comparison, SDRAM Packages, DIMM & SO-DIMM Packages

► Overview and Comparison of LPDDR & GDDR

► DDR Topologies, DDR Bus Options, Buffered & Unbuffered DIMM, DDR3/4 Signal Groups, DDR4 Key Changes

► DDR3/4 Routing Guidelines Address, Command, Control, Clock & Data

► DDR Timing Specifications JEDEC & DDR5 Overview


WHAT IS A MEMORY?

Memory refers to the processes that are used to acquire, store, retain, and later retrieve information.
MEMORY - HISTORY
Selectron Tube (1946)
► Jan A. Rajchman invented the Selectron tube, which was a digital computer memory
► It stored digital data as electrostatic charges
► The Selectron tube was limited to 256 bits and was more expensive

Williams Tube (1947)
► Fred Williams invented the Williams tube, which was the first random-access computer memory
► The Williams tube was able to store more information than the Selectron tube
► Less expensive


MEMORY - APPLICATION SPACE
WHAT IS A RAM?
RAM (Random Access Memory), also known as working memory, is primarily used by the
operating system or the CPU for fast access to data that is constantly being used.

Static RAM Dynamic RAM


SRAM VS DRAM

Static RAM (SRAM)                    Dynamic RAM (DRAM)
Made up of flip-flops                Made up of transistors and capacitors
Data is stored as flip-flop state    Data is stored as electrical charge
Bigger in size                       Smaller in size
Faster access speed                  Slower access speed
Expensive                            Relatively cheaper
Data retention is good               Data leakage is possible
Used for cache memory                Used as main memory
TYPES OF RAM
ASYNCHRONOUS DRAM
ADRAM (Asynchronous DRAM)

I/O BUS

RAM is not synchronized with the CPU clock


SYNCHRONOUS DRAM
SDRAM (Synchronous DRAM)

I/O BUS

RAM is synchronized with the CPU clock


SYNCHRONOUS DRAM
SDRAM (Synchronous DRAM)

I/O BUS

PC100: I/O clock frequency 100 MHz

Maximum bandwidth = 100 MHz × 64 bits ÷ 8 bits per byte = 800 MB/s
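The PC100 figure above can be reproduced with a few lines of Python. This is a minimal sketch; the function name and the `transfers_per_clock` parameter are illustrative and do not come from the slides.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_width_bits, transfers_per_clock=1):
    """Peak bus bandwidth in MB/s.

    transfers_per_clock is 1 when data is clocked once per clock period.
    """
    bytes_per_transfer = bus_width_bits / 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer


# PC100 SDRAM: 100 MHz I/O clock, 64-bit bus, one transfer per clock
pc100 = peak_bandwidth_mb_s(100, 64)
print(f"PC100 peak bandwidth: {pc100:.0f} MB/s")
```

Passing `transfers_per_clock=2` models a bus that transfers data twice per clock period.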
SDRAM
► Early synchronous memory used single data rate (SDR)
► Data was clocked once per clock period
► Processors got faster but DRAM chips did not

Rising Edge

CHANNEL A – 100MHz
DDR SDRAM
► DDR was designed to be able to clock data on
both rising and falling edge of the clock
► Throughput was doubled
► DDR, DDR2, DDR3, DDR4, DDR5

[Figure: data clocked on the rising edge and the falling edge. Channel A at 933 MHz and Channel B at 933 MHz give 1866 MHz]
QDR SDRAM
► Quad data rate (QDR, or quad pumping) is a communication signaling technique wherein data are
transmitted at four points in the clock cycle: on the rising and falling edges, and at two intermediate
points between them
► The intermediate points are defined by a second clock that is 90° out of phase from the first. The effect
is to deliver four bits of data per signal line per clock cycle

[Figure: QDR clocking on rising and falling edges. Channels A and B give 1866 MHz, channels C and D give 1866 MHz, for a total of 3732 MHz]
RAM & CPU MACHINE CYCLE

Fetch → Decode → Execute → Store (the cycle then repeats)
COMPUTER SYSTEM MEMORY HIERARCHY

High Speed Memory: processor cache, Level-2 cache, Level-3 cache
Primary Memory: RAM
Secondary Memory: hard disk
DDR EVOLUTION
[Figure: timeline with year markers 2007, 2015, and 2020]

SDRAM   1998
DDR     2000
DDR2    2003
DDR3    2007
DDR4    2012
DDR5    2020
DDR

2 bit Prefetch
I/O BUS

Voltage – 2.5V

RAM Internal Clock Frequency   Clock Frequency   Data Rate   Module
133 MHz                        133 MHz           266 MT/s    PC-2100
200 MHz                        200 MHz           400 MT/s    PC-3200
DDR2

4 bit Prefetch
I/O BUS

Voltage – 1.8V

RAM Internal Clock Frequency   Clock Frequency   Data Rate   Module
100 MHz                        200 MHz           400 MT/s    PC2-3200
200 MHz                        400 MHz           800 MT/s    PC2-6400
DDR3

8 bit Prefetch
I/O BUS

Voltage – 1.5V/1.35V

RAM Internal Clock Frequency   Clock Frequency   Data Rate     Module
100 MHz                        400 MHz           800 MT/s      PC3-6400
266.6 MHz                      1066.6 MHz        2133.3 MT/s   PC3-17000
DDR4

8 bit Prefetch
I/O BUS

Voltage – 1.2V

RAM Internal Clock Frequency   Clock Frequency   Data Rate   Module
200 MHz                        800 MHz           1600 MT/s   PC4-12800
400 MHz                        1600 MHz          3200 MT/s   PC4-25600
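The DDR through DDR4 tables above all follow the same arithmetic: the data rate is the internal clock multiplied by the prefetch depth, the I/O clock is half the data rate, and the number in the module name is roughly the peak bandwidth of a 64-bit module in MB/s. The sketch below works this out; the function names are illustrative, and the prefetch values are the ones given on the slides.

```python
# Prefetch depth per generation, as listed on the slides above.
PREFETCH = {"DDR": 2, "DDR2": 4, "DDR3": 8, "DDR4": 8}


def data_rate_mt_s(internal_clock_mhz, generation):
    """Data rate in MT/s = internal clock (MHz) x prefetch depth."""
    return internal_clock_mhz * PREFETCH[generation]


def io_clock_mhz(internal_clock_mhz, generation):
    """I/O clock in MHz; data moves on both clock edges, so two transfers per clock."""
    return data_rate_mt_s(internal_clock_mhz, generation) / 2


def module_bandwidth_mb_s(internal_clock_mhz, generation, bus_width_bits=64):
    """Peak module bandwidth in MB/s for the given bus width (64 bits assumed)."""
    return data_rate_mt_s(internal_clock_mhz, generation) * bus_width_bits / 8


for gen, clk in [("DDR", 200), ("DDR2", 200), ("DDR3", 100), ("DDR4", 200)]:
    print(gen, data_rate_mt_s(clk, gen), io_clock_mhz(clk, gen),
          module_bandwidth_mb_s(clk, gen))
```

Module names are rounded in some cases, so the computed bandwidth does not always match the name exactly (for example PC-2100 and PC3-17000).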
DDR COMPARISON
DDR Version             DDR           DDR2          DDR3                      DDR4          DDR5
Operating Voltage       2.5V SSTL     1.8V SSTL     1.5V                      1.2V POD      1.1V
Pre-fetch Buffer Size   2             4             8                         8             16
Chip Densities          128Mb – 1Gb   128Mb – 4Gb   512Mb – 8Gb               2Gb – 16Gb    8Gb – 64Gb
Data Rate (MT/s)        200 - 400     400 - 800     800 - 2133                1600 - 3200   3200 - 6400
Bank Groups             0             0             0                         4             8
Termination / ODT       On Board      ODT Enabled   Nominal, Dynamic modes    Park Modes    Nominal Wr/Rd
IO Clock (MHz)          100-200       200-533       533-1200                  1066-2400     2133-3200
DIMM Pins               184           240           240                       288           288
Channel Width           64            64            64                        64            2x32
SDRAM PACKAGES
Dual In-Line Package

Single In-Line Module

Dual In-Line Module


DIMM PACKAGES

Generation   Voltage   Pin Count
DDR          2.5V      184 Pins
DDR2         1.8V      240 Pins
DDR3         1.5V      240 Pins
DDR4         1.2V      288 Pins


SO-DIMM PACKAGES

200 Pins

200 Pins

204 Pins
LOW POWER DDR
LPDDR, an abbreviation for Low-Power Double Data Rate and also known as LPDDR
SDRAM, is a type of synchronous dynamic random-access memory that consumes less
power and is targeted at mobile computers. Older variants are also known as Mobile
DDR, abbreviated as mDDR.

MDDR/LPDDR : Low Power DDR


► LPDDR1
► LPDDR2
► LPDDR3
► LPDDR4/LPDDR4X
LOW POWER DDR COMPARISON

Features               LPDDR2          LPDDR3           LPDDR4
Supply Voltage         1.8V            1.2V             1.1V
Speed                  800/1066 Mbps   1600/2133 Mbps   3200/4267 Mbps
Prefetch Buffer Size   4/2 bit         8 bit            16 bit
Chip Densities         64Mb – 8Gb      1Gb – 32Gb       4Gb – 32Gb
Burst Length           16, 8 & 4       8 only           16 & 32
CA Pins                10 (CA0:9)      10 (CA0:9)       6 (CA0:5)
GDDR – GRAPHICS DOUBLE DATA RATE
GDDR (Graphics Double Data Rate) is double data rate (DDR) memory specialized for fast
rendering on graphics cards (GPUs).
Introduced in 2000, GDDR is the primary graphics RAM in use today.
GDDR is technically "GDDR SDRAM" and supersedes VRAM and WRAM.

GDDR : Graphics DDR


► GDDR1
► GDDR2
► GDDR3
► GDDR4
► GDDR5
► GDDR6
GDDR COMPARISONS
GDDR4
GDDR GDDR2 GDDR3
nVIDIA ATi
Maximum Frequency   200 MHz         500 MHz         900 MHz         1.2 GHz         1.4 GHz
Configuration       4Mx32           4Mx32           8Mx32           8Mx32           16Mx32
Prefetch            2n              4n              4n              4n              8n
Package             144 Ball FBGA   144 Ball FBGA   144 Ball FBGA   136 Ball FBGA   136 Ball FBGA
Interface           SSTL-2          SSTL-18         POD-18          POD-15          POD-15
Bank                4               4               4               8               8
Voltage (V)         2.5             2.5             1.8             1.8             1.8
Addressing          Single          Single          Single          Single/Double   Double
DDR TOPOLOGIES
FLY BY TOPOLOGY
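[Figure: fly-by topology. Memory controller with Strobe/Data routing and Clock/Address/Command routing terminated at VTT, with arrival delays ∆T, ∆T1, ∆T2, ∆T3 at successive memory devices]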
BALANCED SINGLE T TOPOLOGY

Strobe/Data
Strobe/Data Clock/Address/
Command

MEMORY CONTROLLER
BALANCED DOUBLE T TOPOLOGY

Strobe/Data
Strobe/Data

Clock/Address/ Clock/Address/
Command Command

MEMORY CONTROLLER
CHOOSING TOPOLOGY BASED ON DIE PACKAGE
Single die package: T topology or fly-by topology
Double die package: double balanced T topology
Quad die package: fly-by topology
TWO SDRAM ADDR / CMD / CNTRL TOPOLOGIES
Tree Topology Daisy Chain Topology

Fly By Topology
FOUR SDRAM ADDR / CMD / CNTRL TOPOLOGIES
Tree Topology

DDR3 Fly By Topology


ASYMMETRICAL & SYMMETRICAL T TOPOLOGY
DDR BUS OPTIONS

Advantages & Disadvantages

Bus Characteristics   T Topology                             Fly By Topology
Routing               Difficult                              Easy
Performance           Excellent, but offers low bandwidth    Good, but offers high bandwidth
Load Handling         Difficult & sensitive to large loads   Easy and unaffected by large loads
Timing Skews          Minimal issues                         Issues require leveling
Signal Integrity      Good                                   Best
BUFFERED & UNBUFFERED DIMM
BUFFERED VS UNBUFFERED DIMM
Buffered DIMM
► Has a register between the DRAM and the system's memory controller
► Gives the system more stability
► More costly than unbuffered memory
► Also known as registered memory
► Lessens the electrical load on the memory controller
► High reliability of stored data
► Used for servers and other mission-critical systems that require a stable operating environment
► Adds one clock cycle of latency

Unbuffered DIMM
► Does not have a register between the DRAM and the system's memory controller
► Gives the system less stability
► Less costly than buffered memory
► Also known as unregistered or conventional memory
► Places more electrical load on the memory controller
► Less reliability of stored data
► Used for regular desktops, laptops, and similar systems
► Has no clock cycle penalty
DDR3/4 SIGNAL GROUPS & DDR4 KEY CHANGES
DDR3 SIGNALING GROUPS
[Figure: DDR3 functional signal groups, including data, between controller and memory]
DDR4 SIGNALING GROUPS
[Figure: DDR4 functional signal groups, including data, between controller and memory]
DDR4 KEY CHANGES
► New VPP supply
► Removed VREFDQ reference input
► Changed I/O buffer from SSTL to POD
► DBI (Data Bus Inversion)
► Added ACT_n control
NEW VPP SUPPLY
► External VPP supplies the word line voltage
► DDR3 uses an on-die voltage pump to generate the higher word line voltage
► DDR4 uses a separate VPP voltage rail
► Externally supplied VPP at 2.5V enables a more energy-efficient system
► Reduces power draw and die area
REMOVED VREFDQ REFERENCE INPUT
► The VREFDQ reference input supply was removed from the package interface; VREFDQ is now internally generated by the DRAM
► This means VREFDQ can be set to any value over a wide range; no specific value is defined
► This means the DRAM controller must set the DRAM's VREFDQ settings to the proper value, which introduces the need for VREFDQ calibration
► JEDEC does not provide a specific routine on how to perform VREFDQ calibration
[Figure: VREFDQ in DDR3 SDRAM vs DDR4 SDRAM]
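Since no standard routine is defined, the following is only one plausible sketch of a calibration sweep, not a JEDEC or vendor procedure: try each candidate setting, record which ones pass a training test, and pick the center of the widest passing window. The `passes` callback, the candidate range, and the pass window are all hypothetical and invented for illustration.

```python
def calibrate_vrefdq(settings, passes):
    """Return the setting at the center of the widest contiguous passing window.

    settings: ordered list of candidate VREFDQ settings
    passes:   callable(setting) -> bool, a hypothetical read/write training test
    """
    best_start, best_len = None, 0
    run_start, run_len = None, 0
    for i, setting in enumerate(settings):
        if passes(setting):
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    if best_len == 0:
        raise RuntimeError("no passing VREFDQ setting found")
    return settings[best_start + (best_len - 1) // 2]


# Simulated example: candidates are percentages of VDDQ (made-up range),
# and the fake training test passes only between 68% and 76%.
candidates = list(range(60, 91))
chosen = calibrate_vrefdq(candidates, lambda pct: 68 <= pct <= 76)
print(f"chosen VREFDQ setting: {chosen}% of VDDQ")
```

Centering in the passing window leaves equal margin on both sides, which is the usual motivation for this kind of sweep.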
CHANGED I/O BUFFER FROM SSTL TO POD
► DDR4’s new memory interface employs pseudo open drain (POD) termination
► With POD, a DDR4 signal line draws termination current only when it is pulled down to a logical 0 (Low)
► The I/O buffer has been converted from push-pull to pseudo open drain (POD)
► By being terminated to VDDQ instead of 1/2 of VDDQ, the size and center of the signal swing can be custom-tailored to each design’s needs
► POD enables reduced switching current when driving data since only 0s consume power, and additional switching current savings can be realized with DBI enabled
► An additional benefit with DBI enabled is a reduction in crosstalk, resulting in a larger data eye
DATA BUS INVERSION
► Ensure more 1’s than 0’s
► The capacitive charge and discharge of all data lines switching together at high speed creates a problem called simultaneous switching output (SSO), which stresses the DRAM’s power-distribution system on the chip, in the package, and on the board
► Because of capacitive coupling, the data lines making the zero-to-one transition become aggressors that try to induce the lone holdout bit to make the same transition
► If more than 4 bits in a byte are 0, toggle (invert) the bits
► The DBI bit indicates that the data bus has been inverted

       Without DBI   With DBI
DQ0    1 0 1 0       1 1 1 1
DQ1    1 0 1 0       1 1 1 1
DQ2    1 0 1 0       1 1 1 1
DQ3    1 0 0 1       1 1 0 0
DQ4    1 0 1 0       1 1 1 1
DQ5    1 0 1 0       1 1 1 1
DQ6    1 0 1 0       1 1 1 1
DQ7    1 0 1 0       1 1 1 1
DBI                  0 1 0 1

Without DBI: SSO problem and aggressor/victim problem
With DBI: bus inverted, fewer SSO, reduced aggressor problem
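The inversion rule on this slide can be sketched in a few lines of Python. The function names are illustrative, and the flag polarity (1 means the byte was inverted) follows the slide's table rather than any particular pin definition.

```python
def dbi_encode(byte):
    """Return (transmitted_byte, dbi_flag) for one 8-bit data byte.

    If more than 4 of the 8 bits are 0, every bit is inverted and the flag is 1.
    """
    byte &= 0xFF
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        return (~byte) & 0xFF, 1
    return byte, 0


def dbi_decode(transmitted, dbi_flag):
    """Recover the original byte at the receiver."""
    return (~transmitted) & 0xFF if dbi_flag else transmitted & 0xFF


# The four time steps from the slide's table, with DQ7..DQ0 packed into one byte.
steps = [0xFF, 0x00, 0xF7, 0x08]
encoded = [dbi_encode(b) for b in steps]
for original, (sent, flag) in zip(steps, encoded):
    print(f"{original:08b} -> {sent:08b} DBI={flag}")
```

With this rule, no transmitted byte ever carries more than four 0s, which is what limits the current drawn through the termination.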
ADDED ACT_N CONTROL
► DDR4 has, for the first time, multiplexed some of its address pins
► ACT_n determines whether RAS_n/A16, CAS_n/A15, and WE_n/A14 are to
be treated as control pins or as address pins

ACT_n HIGH ACT_n LOW
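A minimal sketch of how the three shared pins might be interpreted, assuming the usual DDR4 convention that ACT_n LOW marks an ACTIVATE command (pins carry row address bits A16, A15, A14) and ACT_n HIGH means the pins carry RAS_n, CAS_n, and WE_n. The function and dictionary keys are illustrative only.

```python
def interpret_shared_pins(act_n, ras_n_a16, cas_n_a15, we_n_a14):
    """Interpret the three multiplexed pins based on the ACT_n level (0 or 1)."""
    if act_n == 0:
        # ACT_n LOW: the pins are treated as address inputs.
        return {"role": "address", "A16": ras_n_a16, "A15": cas_n_a15, "A14": we_n_a14}
    # ACT_n HIGH: the pins are treated as command (control) inputs.
    return {"role": "command", "RAS_n": ras_n_a16, "CAS_n": cas_n_a15, "WE_n": we_n_a14}


print(interpret_shared_pins(0, 1, 0, 1))
print(interpret_shared_pins(1, 1, 0, 1))
```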


DDR3/4 ROUTING GUIDELINES
DDR3/4 PLACEMENT & SIGNAL GROUPS
The DDR memory signals can be divided into the following signal groups for the purpose of the design guide:
► Data
► Address/Command/Control
► Clocks
[Figure: placement of memory devices A1, A2, B1, B2, and B3 around the memory controller]
DDR3/4 SIGNAL ROUTING ORDER
For proper optimization of the DDR interface, we recommend routing the DDR memory channel in the following sequence:
► Route data signals
► Route address/command signals
► Route control signals
► Route clock signals
► Route power planes
DDR GENERAL ROUTING GUIDELINES
► Use 45° angles (not 90° corners)
► Avoid T-Junctions for critical nets or clocks
► Avoid T-junctions greater than 75 ps (approximately 25 mils)
► Do not allow signals across split planes
► Restrict routing other signals close to system reset signals
► Avoid routing memory signals closer than 25 mils to PCI or system clocks
► Match all signals within a given DQ group with a maximum skew of ±10 ps
or approximately ±50 mils and route on the same layer
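The time and length tolerances in these guidelines are paired at roughly 5 mils per picosecond (±10 ps is given as about ±50 mils). The sketch below uses that ratio as an assumption; the real propagation velocity depends on the board stackup, so the constant should be replaced with a value from the actual design. The function names are illustrative.

```python
# Assumed conversion, taken from the "±10 ps or approximately ±50 mils" pairing.
MILS_PER_PS = 5.0


def ps_to_mils(ps, mils_per_ps=MILS_PER_PS):
    return ps * mils_per_ps


def mils_to_ps(mils, mils_per_ps=MILS_PER_PS):
    return mils / mils_per_ps


def within_tolerance(length_mils, target_mils, tolerance_ps, mils_per_ps=MILS_PER_PS):
    """True if a trace is within ±tolerance_ps of the target trace length."""
    return abs(mils_to_ps(length_mils - target_mils, mils_per_ps)) <= tolerance_ps


print(ps_to_mils(10))                      # length budget for a ±10 ps match
print(within_tolerance(1230, 1200, 10))    # 30 mils off target
print(within_tolerance(1260, 1200, 10))    # 60 mils off target
```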
PARALLELISM RULES FOR DQ, DQS, AND DM
► 5 mils for parallel runs < 0.5 inches
(1× spacing relative to plane distance)
► 10 mils for parallel runs between 0.5 and 1.0 inches
(2× spacing relative to plane distance)
► 15 mils for parallel runs between 1.0 and 6.0 inches
(3× spacing relative to plane distance)
DQ/DQS/DM DATA LENGTH MATCHING
► All signals within a given byte-lane
group must be matched in length with a
maximum deviation of ±10 ps or
approximately ± 50 mils
► Ensure all signals within a given byte
lane group are routed on the same layer
to avoid layer to layer transmission
velocity differences, which otherwise
increase the skew within the group
► DQ, DQS, and DM signals do not need to be “in
order”, meaning it is not required that
signals in group 1 arrive later than group
0, group 2 later than group 1, and so on
DQS TO CLOCK LENGTH MATCHING
► For memory interfaces with leveling, the
timing between the DQS and clock signals
on each device calibrates dynamically to
meet tDQSS. To make sure the skew is not
too large for the leveling circuit’s capability, follow the rules below
► Propagation delay of clock signal must not
be shorter than propagation delay of DQS
signal at every device
► CKi – DQSi > 0
0 < i < number of components – 1
► Total skew of CLK and DQS signal between
groups is less than one clock cycle
► Max(CKi + DQSi) – min(CKi + DQSi) < 1 × tCK
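The two rules above can be checked mechanically once per-device propagation delays are known. This is a sketch with invented delay values; the function name and the example numbers are assumptions, not data from the slides.

```python
def check_dqs_clock(ck_delays_ps, dqs_delays_ps, tck_ps):
    """Check the two DQS-to-clock rules for a list of devices.

    Rule 1: CKi - DQSi > 0 at every device.
    Rule 2: max(CKi + DQSi) - min(CKi + DQSi) < 1 x tCK.
    Returns (rule1_ok, rule2_ok).
    """
    pairs = list(zip(ck_delays_ps, dqs_delays_ps))
    rule1_ok = all(ck - dqs > 0 for ck, dqs in pairs)
    sums = [ck + dqs for ck, dqs in pairs]
    rule2_ok = (max(sums) - min(sums)) < tck_ps
    return rule1_ok, rule2_ok


# Invented example: four devices, 800 MHz clock (tCK = 1250 ps).
ck = [500, 600, 700, 800]
dqs = [400, 420, 410, 430]
print(check_dqs_clock(ck, dqs, tck_ps=1250))
```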

DATA ROUTING
ADDRESS AND COMMAND ROUTING GUIDELINES
► Route all addresses and commands to match the clock signals to within ±25
ps or approximately ±125 mils to each discrete memory component
► Similar to the clock signals in DDR3 SDRAM, address and command signals
are routed in a daisy chain topology from the first SDRAM to the last
SDRAM. Ensure that each net maintains the same consecutive order
► Unbuffered DIMMs are more susceptible to crosstalk and are generally
noisier than buffered DIMMs. Route the address and command signals of
unbuffered DIMMs on a different layer than DQ and DM, and with greater spacing
► Do not route differential clock and clock enable signals close to address
signals
PARALLELISM RULES FOR ADD/CMD/CLK SIGNALS
► 5 mils for parallel runs < 0.1 inch
(1× spacing relative to plane distance)
► 10 mils for parallel runs < 0.5 inch
(2× spacing relative to plane distance)
► 15 mils for parallel runs between 0.5 and 1.0 inches
(3× spacing relative to plane distance)
► 20 mils for parallel runs between 1.0 and 6.0 inches
(4× spacing relative to plane distance)
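The spacing rules for address, command, and clock signals above amount to a lookup on the parallel run length. The sketch below encodes them; the slide does not say which rule applies exactly at a boundary (0.5 or 1.0 inch), so this version assumes the larger spacing there. The function name is illustrative.

```python
def min_spacing_mils(parallel_run_in):
    """Minimum spacing (mils) for ADD/CMD/CLK signals given the parallel run in inches.

    Boundary values are assigned to the larger spacing (an assumption).
    """
    if parallel_run_in < 0:
        raise ValueError("run length cannot be negative")
    if parallel_run_in < 0.1:
        return 5
    if parallel_run_in < 0.5:
        return 10
    if parallel_run_in < 1.0:
        return 15
    if parallel_run_in <= 6.0:
        return 20
    raise ValueError("parallel runs longer than 6.0 inches are not covered by the rules")


for run in (0.05, 0.3, 0.75, 3.0):
    print(f"{run} in -> {min_spacing_mils(run)} mils")
```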
ADDRESS AND COMMAND SIGNALS
CLOCK ROUTING GUIDELINES
► Route clocks on inner layers with outer-layer run lengths held to under 500 mils (12.7 mm)
► The space between differential pairs must be at least 2× the trace width of the differential pair
to minimize loss and maximize interconnect density. For example, differential clocks must be
routed differentially (5 mil trace width, 10-15 mil space on centers, and equal in length to
signals in the Address/Command Group)
► Take care with the via pattern used for clock traces. To avoid transmission-line-to-via
mismatches, it is recommended that your clock via pattern be a Ground-Signal-Signal-Ground
(GSSG) topology (via topology: GND | CLKP | CLKN | GND)
CLOCK SIGNALS LENGTH MATCHING
► Clocks must maintain length matching between clock pairs of ±5 ps or
approximately ±25 mils
► Differential clocks need to maintain length matching between positive and
negative signals of ±2 ps or approximately ±10 mils, routed in parallel
SPACING GUIDELINES
Avoid routing two signal layers next to each other
► Always make sure that the signals related to the memory interface are routed between appropriate GND or power layers
For DQ/DQS/DM traces
► Maintain at least 3H spacing between the edges (air gap) of these traces
For Address/Command/Control traces
► Maintain at least 3H spacing between the edges (air gap) of these traces
For Clock traces
► Maintain at least 5H spacing between two clock pairs, or between a clock pair and any other memory interface trace
Note: H is the vertical distance to the closest return path for that particular trace
[Figure: PCB cross-section showing traces at height H between GND or power layers, with 3H and 5H edge-to-edge spacing]
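As a worked example of the 3H and 5H rules, the sketch below converts a dielectric height H into minimum edge-to-edge spacing. The group names and the 4 mil example height are assumptions chosen for illustration.

```python
# Spacing multipliers from the guidelines above (in units of H).
SPACING_MULTIPLIER = {
    "data": 3,                      # DQ/DQS/DM
    "address_command_control": 3,
    "clock": 5,
}


def min_edge_spacing_mils(group, h_mils):
    """Minimum edge-to-edge (air gap) spacing for a signal group, given H in mils."""
    return SPACING_MULTIPLIER[group] * h_mils


# Example: H = 4 mils to the nearest return plane (made-up stackup value).
for group in SPACING_MULTIPLIER:
    print(group, min_edge_spacing_mils(group, 4), "mils")
```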
DDR POWER PLANES
DESIGN GUIDELINES-CRITICAL CONSTRAINTS
► Clock nets and DQS (strobes) are routed differentially; DQ (data) nets are single-ended. Maximum length 4.5”, matched to ±25 mils
► Net length from driver to first DIMM or chip: 2” to 3” maximum, depending on load
► Net length between DIMMs or chips: 0.5”
► Net length from last DIMM or chip to the VTT termination: 0.2” to 0.55”
► All DQS/DQ (data strobe and data) lengths should be minimized to reduce the skew within groups (or lanes) and
across groups
► Skew between address nets should be within 200 mils. Address and command nets are daisy chained,
with a VTT pull-up for the termination
► Zo for DDR3 is 50 ohm. Zdiff is 100 ohm
DESIGN GUIDELINES-CRITICAL CONSTRAINTS
► Route the differential clocks (CK/CK#) and data strobe (DQS/DQS#) with a length matching
between P and N signals of ±2 ps
► Route the DQS/DQS# associated with a DQ group on the same PCB layer. Match
these DQS pairs to within ±5 ps
► Set the DQS/DQS# as the target trace propagation delay for the associated data and
data mask signals
► Route the data and data mask signals for the DQ group ideally on the same layer as the
associated DQS/DQS# to within ±10 ps skew of the target DQS/DQS#
► Route the CK/CK# clocks and set as the target trace propagation delays for the DQ group.
Match the CK/CK# clock to within ±50 ps of all the DQS/DQS#
► Route the address/control signal group (address, CS, CKE) ideally on the same layer as the
CK/CK# clocks, to within ±20 ps skew of the CK/CK# traces
DDR TIMING SPECIFICATIONS
The JEDEC memory standards are the specifications for semiconductor memory circuits and similar storage devices promulgated by the Joint Electron Device Engineering Council (JEDEC) Solid State Technology Association, a semiconductor trade and engineering standardization organization.

DDRx     JEDEC Specification
DDR      JESD79F
DDR2     JESD79-2F
DDR3     JESD79-3F
DDR3L    JESD79-3-1A
DDR4     JESD79-4
LPDDR    JESD209B
LPDDR2   JESD209-2E
LPDDR3   JESD209-3
GDDR5    JESD212A
ADDRESS/COMMAND/CONTROL - TIMING
► Clock: the differential clock signal is the reference
► Transition points are where the clock crosses VREF
► 1T timing: 1 clock cycle for latching
► 2T timing: 2 clock cycles for latching when the
address bus is heavily loaded
DATA STROBE (DQS) & DATA - TIMING
► DQS: the differential strobe (clock) is the reference
► Data is clocked at both edges of DQS
► Setup time is referenced to the VIL(AC)/VIH(AC) crossing
► Hold time is referenced to the VIL(DC)/VIH(DC) crossing
OVERSHOOT & UNDERSHOOT - MARGIN
DDR5 OVERVIEW
KEY CHANGES
DDR5 KEY CHANGES
► VRM on DIMM
PMIC5000 and PMIC5010
For DDR5 RDIMM, LRDIMM and NVDIMM
► On-Die ECC (Error Correction Code)
DRAM dynamically corrects single-bit errors
► Duty Cycle Adjuster
Memory controller can adjust DRAM internal DQS/DQ duty cycle
► Decision Feedback Equalization
4-Tap DFE equalizes DQ signals
DDR4 VS DDR5
Features               DDR4                            DDR5                            DDR5 Advantages
Speed                  1.6 to 3.2 Gbps data rate       4.8 to 6.4 Gbps data rate       Higher bandwidth
                       0.8 to 1.6 GHz clock rate       1.6 to 3.2 GHz clock rate       DDR5-4800 initial designs
IO Voltage             1.2V                            1.1V                            Lower power
Power Management       On motherboard                  On DIMM PMIC                    Better power efficiency
                                                                                       Better scalability
Channel Architecture   72-bit data channel             40-bit data channel             Higher memory efficiency
                       (64 data + 8 ECC)               (32 data + 8 ECC)               Lower latency
                       1 channel per DIMM              2 channels per DIMM
Burst Length           BC4, BL8                        BC8, BL16                       Higher memory efficiency
Maximum Die Density    16Gb                            64Gb                            Higher capacity DIMMs
