Introduction to DDRx Technology
Raghavendra Anjanappa
▪ Image source: Distrelec International, CC: Vince Balogh, https://www.behance.net/gallery/75625645/Processor
AGENDA
► What is Memory? Memory History and Application Space
► DDR Evolution, DDR1/2/3/4, DDR comparison, SDRAM Packages, DIMM & SO-DIMM Packages
► DDR Topologies, DDR Bus Options, Buffered & Unbuffered DIMM, DDR3/4 Signal Groups, DDR4 Key Changes
[Figure: single data rate SDRAM, data clocked on the rising edge of the I/O bus clock only; Channel A at 100 MHz]
DDR SDRAM
► DDR was designed to clock data on both the rising and falling edges of the clock
► Throughput is doubled relative to single data rate at the same clock frequency
► Generations: DDR, DDR2, DDR3, DDR4, DDR5
[Figure: DDR clocks data on both rising and falling edges; Channels A and B at 933 MHz deliver 1866 MT/s]
QDR SDRAM
► Quad data rate (QDR, or quad pumping) is a communication signaling technique wherein data are
transmitted at four points in the clock cycle: on the rising and falling edges, and at two intermediate
points between them
► The intermediate points are defined by a second clock that is 90° out of phase from the first. The effect
is to deliver four bits of data per signal line per clock cycle
[Figure: QDR signaling on the rising and falling edges of two 90°-offset clocks; Channels A to D, 1866 MT/s per channel, over 3732 MT/s combined]
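To make the pumping arithmetic concrete, here is a minimal sketch in C that computes the effective transfer rate as clock frequency times transfers per cycle; the 100 MHz and 933 MHz inputs are the illustrative channel clocks from the figures above.

```c
#include <stdio.h>

/* Effective transfer rate for single-, double-, and quad-pumped buses:
   rate (MT/s) = clock (MHz) * transfers per clock cycle. */
static double transfer_rate_mts(double clock_mhz, int transfers_per_cycle)
{
    return clock_mhz * transfers_per_cycle;
}

int main(void)
{
    printf("SDR @ 100 MHz: %4.0f MT/s\n", transfer_rate_mts(100.0, 1));
    printf("DDR @ 933 MHz: %4.0f MT/s\n", transfer_rate_mts(933.0, 2)); /* 1866 */
    printf("QDR @ 933 MHz: %4.0f MT/s\n", transfer_rate_mts(933.0, 4)); /* 3732 */
    return 0;
}
```

At the same clock, quad pumping delivers twice DDR's rate, which is where the 1866 and 3732 MT/s figures above come from.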
RAM & CPU MACHINE CYCLE
[Figure: machine cycle: Fetch → Decode → Execute → Store]
COMPUTER SYSTEM MEMORY HIERARCHY
[Figure: memory hierarchy: high-speed cache (Level-2, Level-3), primary memory, secondary memory]
DDR EVOLUTION
[Figure: DDR generation timeline, 2007 to 2020: DDR3, DDR4, DDR5]
                 DDR      DDR2     DDR3             DDR4
Prefetch         2-bit    4-bit    8-bit            8-bit
I/O Bus Voltage  2.5 V    1.8 V    1.5 V / 1.35 V   1.2 V
SO-DIMM Pins     200      200      204              260
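The prefetch row explains the I/O data-rate scaling: transfers per second equal the core (array) clock times the prefetch depth. A small sketch, assuming an illustrative 200 MHz core clock (DDR4 additionally raises rates via bank groups and faster clocks):

```c
#include <stdio.h>

int main(void)
{
    /* I/O data rate = core (array) clock * prefetch depth.
       The 200 MHz core clock is an illustrative assumption. */
    const char *gen[] = {"DDR", "DDR2", "DDR3", "DDR4"};
    int prefetch[]    = {2, 4, 8, 8};
    double core_mhz   = 200.0;

    for (int i = 0; i < 4; i++)
        printf("%-4s: %d-bit prefetch -> %4.0f MT/s\n",
               gen[i], prefetch[i], core_mhz * prefetch[i]);
    return 0;
}
```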
LOW POWER DDR
LPDDR (Low-Power Double Data Rate), also known as LPDDR SDRAM, is a type of synchronous dynamic random-access memory that consumes less power and targets mobile computers. Older variants are also known as Mobile DDR, abbreviated mDDR.
[Figure: flight-time skews ∆T, ∆T1, ∆T2, ∆T3 from the memory controller on the Strobe/Data and Clock/Address/Command nets]
BALANCED SINGLE T TOPOLOGY
[Figure: memory controller with point-to-point Strobe/Data and Clock/Address/Command branching at a single T-point]
BALANCED DOUBLE T TOPOLOGY
[Figure: memory controller with Strobe/Data per device and Clock/Address/Command branching at two T-points]
CHOOSING TOPOLOGY BASED ON DIE PACKAGE
[Figure: single-die package, served by a T topology or a fly-by topology]
TWO SDRAM ADDR / CMD / CNTRL TOPOLOGIES
[Figure: tree topology vs. daisy-chain (fly-by) topology]
FOUR SDRAM ADDR / CMD / CNTRL TOPOLOGIES
[Figure: tree topology with four SDRAMs]
DDR3 DATA SIGNALING GROUPS
[Figure: controller-to-memory functional signal groups]
DDR4 DATA SIGNALING GROUPS
[Figure: controller-to-memory functional signal groups]
DDR4 KEY CHANGES
► New VPP supply
► Removed VREFDQ reference input
► Changed I/O buffer from SSTL to POD
► DBI (Data Bus Inversion)
► Added ACT_n control
NEW VPP SUPPLY
► External VPP supply for the word-line voltage
► DDR3 uses an on-die voltage pump to generate the higher word-line voltage
► DDR4 uses a separate VPP voltage rail
► An externally supplied VPP at 2.5 V enables a more energy-efficient system
► Reduces power draw and die area
REMOVED VREFDQ REFERENCE INPUT
► The VREFDQ reference input supply was removed from the package interface; VREFDQ is now generated internally by the DRAM
► This means VREFDQ can be set to any value over a wide range; there is no specific value defined
► The DRAM controller must therefore set the DRAM’s VREFDQ to the proper value, which introduces the need for VREFDQ calibration
► JEDEC does not provide a specific routine for how to perform VREFDQ calibration
[Figure: DDR3 SDRAM with external VREFDQ vs. DDR4 SDRAM with internal VREFDQ]
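Because JEDEC leaves the routine open, the sweep below is a hypothetical sketch of one common approach: step the internal VREFDQ through its range, run a read/write test at each setting, and center the reference in the passing window. dram_data_test() is a placeholder hook with a simulated passing window, not a real API.

```c
#include <stdio.h>

/* Simulated pass/fail result of a write/read pattern test at one
   VREFDQ step (placeholder for real controller plumbing). */
static int dram_data_test(int vref_step)
{
    return vref_step >= 18 && vref_step <= 42; /* simulated eye */
}

/* Sweep all VREFDQ steps and return the center of the passing window. */
static int train_vrefdq(int steps)
{
    int first = -1, last = -1;
    for (int s = 0; s < steps; s++) {
        if (dram_data_test(s)) {
            if (first < 0) first = s;
            last = s;
        }
    }
    return (first < 0) ? -1 : (first + last) / 2;
}

int main(void)
{
    printf("trained VREFDQ step: %d\n", train_vrefdq(64)); /* -> 30 */
    return 0;
}
```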
CHANGED I/O BUFFER FROM SSTL TO POD
► DDR4’s new memory interface employs pseudo open-drain (POD) termination
► DDR4 consumes power only when the bus is pulled down to a logical 0 (low)
► The I/O buffer has been converted from push-pull to pseudo open drain (POD)
► By terminating to VDDQ instead of 1/2 VDDQ, the size and center of the signal swing can be custom-tailored to each design’s needs
► POD reduces switching current when driving data, since only 0s consume power; additional switching-current savings can be realized with DBI enabled
► An additional benefit with DBI enabled is a reduction in crosstalk, resulting in a larger data eye
DATA BUS INVERSION
► The capacitive charge and discharge of all data lines switching at high speed creates a problem called simultaneous switching output (SSO), which stresses the DRAM’s power-distribution system on the chip, in the package, and on the board
► Because of capacitive coupling, the data lines making the zero-to-one transition become aggressors that try to induce the lone holdout bit to make the transition as well, even though it should not
► DBI ensures more 1s than 0s on the bus: if more than 4 bits in a byte are 0, all bits are toggled
► The DBI bit indicates that the data bus has been inverted

       Without DBI    With DBI
DQ0    1 0 1 0        1 1 1 1
DQ1    1 0 1 0        1 1 1 1
DQ2    1 0 1 0        1 1 1 1
DQ3    1 0 0 1        1 1 0 0
DQ4    1 0 1 0        1 1 1 1
DQ5    1 0 1 0        1 1 1 1
DQ6    1 0 1 0        1 1 1 1
DQ7    1 0 1 0        1 1 1 1
DBI                   0 1 0 1

[Figure: SSO problem with many aggressor/victim pairs vs. reduced SSO and fewer aggressors once the bus is inverted]
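A minimal sketch of the inversion rule just described: count the zeros in each byte, invert when more than four, and flag the inversion with the DBI bit. The names and struct are illustrative, not from any controller API.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint8_t data; /* byte driven on DQ[7:0] */
    uint8_t dbi;  /* 1 = byte was inverted on the bus */
} dbi_word;

/* If more than 4 of the 8 bits are 0, transmit the inverted byte. */
static dbi_word dbi_encode(uint8_t byte)
{
    int zeros = 0;
    for (int i = 0; i < 8; i++)
        zeros += !((byte >> i) & 1);

    dbi_word w;
    w.dbi  = (zeros > 4);
    w.data = w.dbi ? (uint8_t)~byte : byte;
    return w;
}

/* Receiver side: undo the inversion when the DBI bit is set. */
static uint8_t dbi_decode(dbi_word w)
{
    return w.dbi ? (uint8_t)~w.data : w.data;
}

int main(void)
{
    dbi_word w = dbi_encode(0x08); /* seven zeros -> inverted */
    printf("bus=0x%02X dbi=%u restored=0x%02X\n",
           w.data, w.dbi, dbi_decode(w));
    return 0;
}
```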
ADDED ACT_N CONTROL
► DDR4 has, for the first time, multiplexed some of its address pins
► The ACT_n signal determines whether RAS_n/A16, CAS_n/A15, and WE_n/A14 are treated as control pins or as address pins
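A simplified illustration of that multiplexing (not a full DDR4 command decoder): the same three pins carry row address bits when ACT_n is low, and act as command pins otherwise.

```c
#include <stdio.h>

/* ACT_n low: RAS_n/A16, CAS_n/A15, WE_n/A14 are address bits of an
   ACTIVATE command. ACT_n high: they are ordinary command pins. */
static void decode_pins(int act_n, int ras_a16, int cas_a15, int we_a14)
{
    if (act_n == 0)
        printf("ACTIVATE: A16=%d A15=%d A14=%d\n",
               ras_a16, cas_a15, we_a14);
    else
        printf("command: RAS_n=%d CAS_n=%d WE_n=%d\n",
               ras_a16, cas_a15, we_a14);
}

int main(void)
{
    decode_pins(0, 1, 0, 1); /* pins read as row address bits */
    decode_pins(1, 0, 1, 1); /* pins read as RAS/CAS/WE */
    return 0;
}
```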
DDR3/4 SIGNAL ROUTING ORDER
For proper optimization of the DDR interface, we recommend the following sequence when routing the DDR memory channel:
[Flow: Route Data Signals → Route Address/Command Signals → Route Control Signals → Route Power Planes → Route Clock Signals]
DDR GENERAL ROUTING GUIDELINES
► Use 45° angles (not 90° corners)
► Avoid T-junctions for critical nets or clocks
► Avoid T-junctions greater than 75 ps (approximately 25 mils)
► Do not allow signals across split planes
► Restrict routing other signals close to system reset signals
► Avoid routing memory signals closer than 25 mils to PCI or system clocks
► Match all signals within a given DQ group with a maximum skew of ±10 ps, or approximately ±50 mils, and route them on the same layer
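The ±10 ps ≈ ±50 mils pairing implies roughly 5 mils per picosecond of delay. A small helper built on that rule of thumb (the exact mils-per-ps figure depends on your stackup and is an assumption here):

```c
#include <stdio.h>

#define MILS_PER_PS 5.0 /* assumed from the 10 ps ~ 50 mils rule above */

/* Skew, in ps, between two routed lengths given in mils. */
static double skew_ps(double len_a_mils, double len_b_mils)
{
    return (len_a_mils - len_b_mils) / MILS_PER_PS;
}

int main(void)
{
    double dq  = 1230.0; /* hypothetical routed lengths, in mils */
    double dqs = 1270.0;
    printf("DQ vs DQS skew: %.1f ps (budget +/-10 ps)\n",
           skew_ps(dq, dqs));
    return 0;
}
```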
PARALLELISM RULES FOR DQ, DQS, AND DM
► 5 mils for parallel runs < 0.5 inches
(1× spacing relative to plane distance)
► 10 mils for parallel runs between 0.5 and 1.0 inches
(2× spacing relative to plane distance)
► 15 mils for parallel runs between 1.0 and 6.0 inches
(3× spacing relative to plane distance)
DQ/DQS/DM DATA LENGTH MATCHING
► All signals within a given byte-lane group must be matched in length with a maximum deviation of ±10 ps, or approximately ±50 mils
► Ensure all signals within a given byte-lane group are routed on the same layer to avoid layer-to-layer propagation velocity differences, which otherwise increase the skew within the group
► DQ, DQS, and DM signals do not need to be “in order”, meaning it is not required that signals in group 1 arrive later than group 0, group 2 later than group 1, and so on
DQS TO CLOCK LENGTH MATCHING
► For memory interfaces with leveling, the timing between the DQS and clock signals on each device calibrates dynamically to meet tDQSS, ensuring the skew is not too large for the leveling circuit’s capability
► The propagation delay of the clock signal must not be shorter than the propagation delay of the DQS signal at any device:
   CKi – DQSi > 0, 0 < i < number of components – 1
► The total skew of the CLK and DQS signals between groups must be less than one clock cycle:
   max(CKi + DQSi) – min(CKi + DQSi) < 1 × tCK
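Both constraints are easy to check from extracted propagation delays; here is a sketch over hypothetical per-device delays, using the 1250 ps tCK of DDR3-1600 as an illustrative clock period:

```c
#include <stdio.h>

/* Returns 1 when CKi - DQSi > 0 at every device and the spread of
   (CKi + DQSi) across devices stays under one clock period. */
static int leveling_ok(const double ck[], const double dqs[],
                       int n, double tck_ps)
{
    double lo = 1e30, hi = -1e30;
    for (int i = 0; i < n; i++) {
        if (ck[i] - dqs[i] <= 0.0)
            return 0;
        double s = ck[i] + dqs[i];
        if (s < lo) lo = s;
        if (s > hi) hi = s;
    }
    return (hi - lo) < tck_ps;
}

int main(void)
{
    double ck[4]  = {510, 630, 760, 880}; /* hypothetical delays, ps */
    double dqs[4] = {450, 580, 700, 820};
    printf("leveling constraints %s\n",
           leveling_ok(ck, dqs, 4, 1250.0) ? "met" : "violated");
    return 0;
}
```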
DATA ROUTING
ADDRESS AND COMMAND ROUTING GUIDELINES
► Route all address and command signals to match the clock signals to within ±25 ps, or approximately ±125 mils, at each discrete memory component
► Similar to the clock signals in DDR3 SDRAM, address and command signals
are routed in a daisy chain topology from the first SDRAM to the last
SDRAM. Ensure that each net maintains the same consecutive order
► Unbuffered DIMMs are more susceptible to crosstalk and are generally noisier than buffered DIMMs. Route the address and command signals of unbuffered DIMMs on a different layer than DQ and DM, and with greater spacing
► Do not route differential clock and clock enable signals close to address
signals
PARALLELISM RULES FOR ADD/CMD/CLK SIGNALS
► 5 mils for parallel runs < 0.1 inch
(1× spacing relative to plane distance)
► 10 mils for parallel runs < 0.5 inch
(2× spacing relative to plane distance)
► 15 mils for parallel runs between 0.5 and 1.0 inches
(3× spacing relative to plane distance)
► 20 mils for parallel runs between 1.0 and 6.0 inches
(4× spacing relative to plane distance)
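These break points reduce to a simple lookup; a sketch with run lengths in inches and minimum spacing in mils, per the list above:

```c
#include <stdio.h>

/* Minimum spacing (mils) for address/command/clock parallel runs. */
static int min_spacing_mils(double run_inches)
{
    if (run_inches < 0.1)  return 5;  /* 1x plane distance */
    if (run_inches < 0.5)  return 10; /* 2x */
    if (run_inches <= 1.0) return 15; /* 3x */
    if (run_inches <= 6.0) return 20; /* 4x */
    return -1;                        /* outside the rule's range */
}

int main(void)
{
    printf("2.4 in run: >= %d mils spacing\n", min_spacing_mils(2.4));
    return 0;
}
```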
ADDRESS AND COMMAND SIGNALS
CLOCK ROUTING GUIDELINES
► Route clocks on inner layers with outer-layer run lengths held to under 500 mils (12.7 mm)
► The space between differential pairs must be at least 2× the trace width of the differential pair
to minimize loss and maximize interconnect density. For example, differential clocks must be
routed differentially (5 mil trace width, 10-15 mil space on centers, and equal in length to
signals in the Address/Command Group)
► Take care with the via pattern used for clock traces. To avoid transmission-line-to-via
mismatches, it is recommended that your clock via pattern be a Ground-Signal-Signal-Ground
(GSSG) topology (via topology: GND | CLKP | CLKN | GND)
CLOCK SIGNALS LENGTH MATCHING
► Clocks must maintain length matching between clock pairs of ±5 ps, or approximately ±25 mils
► Differential clocks must maintain length matching between positive and negative signals of ±2 ps, or approximately ±10 mils, routed in parallel
SPACING GUIDELINES
► Avoid routing two signal layers next to each other
► Always make sure that signals related to memory reference GND or power layers
[Figure: spacing relative to dielectric height H: 3H between signal groups, 5H to unrelated signals, GND or power reference planes]
Maximum die density: 16 Gb vs. 64 Gb, enabling higher-capacity DIMMs