DRAM Lecture2
Switching element
DRAM Memory System: Lecture 2 Spring 2003 Bruce Jacob David Wang University of Maryland
Memory Array
Switching element
Row of DRAM
1 2 3
4 5 6
Sense and Amplify
6 Rows shown
4 5 6 Vcc/2
Sense and Amplify
4 5 6
Sense and Amplify Wordline Driven
Vcc/2
Row Decoder
Memory Array
AKA: OPEN a DRAM Page/Row or ACT (Activate a DRAM Page/Row) or RAS (Row Address Strobe)
once the data is valid on ALL of the bit lines, you can select a subset of the bits and send them to the output buffers ... CAS picks one of the bits. big point: cannot do another RAS or precharge of the lines until finished reading the column data ... can't change the values on the bit lines or the output of the sense amps until the data has been read by the memory controller
One Row of DRAM
* SDRAM means SDRAM and its variants, e.g. DDR SDRAM
Row Decoder
Memory Array
then the data is valid on the data bus ... depending on what you are using for in/out buffers, you might be able to overlap a little or a lot of the data transfer with the next CAS to the same page (this is PAGE MODE)
Row Decoder
Memory Array
Data Out ... with optional additional CAS: Column Address Strobe
NOTE
Row Decoder
Memory Array
tRCD
RCD (Row Command Delay)
DRAM Column Decoder Sense Amps ... Bit Lines... Row Decoder
. .. Word Lines ...
Memory Array
CAS: Column Address Strobe
CASL: Column Address Strobe Latency
CL: Column Address Strobe Latency
Row Decoder
Memory Array
Row Decoder
Memory Array
tRP
RP (Row Precharge Delay)
Row Decoder
Memory Array
RAS: Row Address Strobe
CAS: Column Address Strobe
RCD: Row Command Delay
RAC: Random Access Delay
RP: Row Precharge Delay
RC: Row Cycle Time
PC133 SDRAM    133
DDR 266        133 * 2
PC800 RDRAM    400 * 2
FCRAM          200 * 2
RLDRAM         300 * 2
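As a rough illustration of what these speed figures mean for throughput, the sketch below converts three of them into peak bandwidth. It reads each entry as clock rate in MHz times transfers per clock, which is an interpretation of the slide. The bus widths (64 bits for an SDRAM or DDR module, 16 bits for an RDRAM channel) are assumptions that do not appear on the slide, and FCRAM and RLDRAM are left out because their widths vary by part.

```python
# Peak bandwidth = transfers per second x bytes per transfer.
# Speeds come from the slide; bus widths are assumptions for illustration.
PARTS = {
    # name: (clock_mhz, transfers_per_clock, assumed_bus_width_bits)
    "PC133 SDRAM": (133, 1, 64),
    "DDR 266": (133, 2, 64),
    "PC800 RDRAM": (400, 2, 16),
}


def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Return peak bandwidth in MB/s, taking 1 MB as 10**6 bytes."""
    return clock_mhz * transfers_per_clock * bus_width_bits // 8


for name, params in PARTS.items():
    print(f"{name}: {peak_bandwidth_mb_s(*params)} MB/s")
```

These are peak numbers only; they ignore row activation, precharge, and every other overhead discussed in this lecture.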
DRAM is slow
But doesn't have to be
tRC < 10ns achievable
Higher die cost
Not commodity
Not adopted in standard
Expensive
DRAM latency isn't deterministic: an access may need only a CAS, or RAS + CAS, and there may be significant queuing delays within the CPU and the memory controller. Each transaction has some overhead. Some types of overhead cannot be pipelined. This means that in general, longer bursts are more efficient.
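A minimal sketch of why longer bursts amortize overhead better, assuming each transaction pays a fixed number of cycles that cannot be overlapped with anything else. The function name and the overhead value are hypothetical and chosen only for illustration.

```python
def bus_efficiency(burst_length, overhead_cycles):
    """Fraction of cycles that carry data when every burst pays a fixed,
    non-pipelined overhead (both arguments are in bus cycles)."""
    return burst_length / (burst_length + overhead_cycles)


OVERHEAD = 4  # hypothetical non-overlappable cycles per transaction

for burst_length in (1, 2, 4, 8):
    share = bus_efficiency(burst_length, OVERHEAD)
    print(f"burst of {burst_length}: {share:.0%} of cycles carry data")
```

Under this simple model, efficiency rises with burst length but never reaches 100%, because the fixed overhead is paid once per transaction regardless of how much data follows.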
DRAM latency
[Figure: CPU, Mem Controller, and DRAM, with latency components A, B, C, D, E1, E2/E3, and F marked]
A: Transaction request may be delayed in Queue
B: Transaction request sent to Memory Controller
C: Transaction converted to Command Sequences (may be queued)
D: Command/s Sent to DRAM
E1: Requires only a CAS or
E2: Requires RAS + CAS or
E3: Requires PRE + RAS + CAS
F: Transaction sent back to CPU
DRAM Latency = A + B + C + D + E + F
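The breakdown above can be sketched as a small model in which component E depends on the state of the bank's row buffer. The `Timing` class, the function names, and all timing values are hypothetical; only the E1/E2/E3 cases and the final sum come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Timing:
    # All values are hypothetical, in nanoseconds.
    t_rp: float = 20.0   # PRE: row precharge
    t_rcd: float = 20.0  # RAS: row activation until a column command is allowed
    t_cl: float = 20.0   # CAS: column access latency


def dram_access_time(open_row: Optional[int], target_row: int, t: Timing) -> float:
    """Return component E for one access to a bank.

    open_row is the row currently held in the sense amps, or None if the
    bank is precharged and no row is open.
    """
    if open_row == target_row:
        return t.t_cl                     # E1: CAS only
    if open_row is None:
        return t.t_rcd + t.t_cl           # E2: RAS + CAS
    return t.t_rp + t.t_rcd + t.t_cl      # E3: PRE + RAS + CAS


def total_latency(a, b, c, d, e, f):
    """DRAM Latency = A + B + C + D + E + F."""
    return a + b + c + d + e + f


timing = Timing()
for label, open_row in (("E1", 5), ("E2", None), ("E3", 3)):
    print(label, dram_access_time(open_row, 5, timing), "ns")
```

The same request can therefore see three different values of E depending on what the bank did last, before any queuing delay in A or C is counted.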
NOTE
[Figure: x2 DRAM, x4 DRAM, and x8 DRAM organizations, each with its own Row Decoder and Memory Array]
let's look at the interface another way ... the way the data sheets portray it. [explain] main point: the RAS\ and CAS\ signals directly control the latches that hold the row and column addresses ...
[Timing diagram: RAS, CAS, Address (Row Address, Column Address, Row Address, Column Address), and DQ (two Valid Dataout windows), with the Data Transfer phase marked]
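Because the row and column addresses share the same address pins and are latched at different times, the controller has to split each cell address into two halves. A minimal sketch of that split follows; `split_address` and the 10-bit column width are hypothetical choices for illustration.

```python
def split_address(addr, col_bits):
    """Split a flat cell address into (row, column) halves.

    The row half would be presented first and latched by the RAS signal,
    the column half second and latched by the CAS signal.
    """
    row = addr >> col_bits
    col = addr & ((1 << col_bits) - 1)
    return row, col


COL_BITS = 10  # hypothetical: 1024 columns per row

row, col = split_address(0x12345, COL_BITS)
print(f"row={row:#x} col={col:#x}")
```

Consecutive addresses that differ only in the low `COL_BITS` bits fall in the same row, which is what the page-mode variants later in the lecture exploit.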
since DRAM's inception, there has been a stream of changes to the design, from FPM to EDO to Burst EDO to SDRAM. the changes are largely structural modifications -- minor -- that target THROUGHPUT. [discuss FPM up to SDRAM] Everything up to and including SDRAM has been relatively inexpensive, especially when considering the pay-off (FPM was essentially free, EDO cost a latch, PBEDO cost a counter, SDRAM cost a slight re-design). however, we've run out of free ideas, and now all changes are considered expensive ... thus there is no consensus on new directions and a myriad of choices has appeared [do LATENCY mods starting with ESDRAM ... and then the INTERFACE mods]
FPM
EDO
P/BEDO
SDRAM
ESDRAM
NOTE
DRAM Evolution
Read Timing for Conventional DRAM
[Timing diagram: RAS, CAS, and DQ traces; legend: Row Access, Column Access, Transfer Overlap, Data Transfer; DQ shows two Valid Dataout windows]
FPM allows you to keep the sense amps active for multiple CAS commands ... much better throughput. problem: cannot latch a new value in the column address buffer until the read-out of the data is complete
DRAM Evolution
Read Timing for Fast Page Mode
[Timing diagram: RAS, CAS, and DQ traces; legend: Row Access, Column Access, Transfer Overlap, Data Transfer; DQ shows three Valid Dataout windows]
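To make the throughput gain of fast page mode concrete, here is a minimal sketch that compares the total time for several reads to the same row. It assumes a conventional DRAM repeats the full row cycle for every read, while FPM opens the row once; the function names and the 30 ns timing values are hypothetical.

```python
def conventional_read_time(n_reads, t_rcd, t_cl, t_rp):
    """Every read pays a full row cycle: RAS, CAS, then precharge."""
    return n_reads * (t_rcd + t_cl + t_rp)


def fast_page_mode_read_time(n_reads, t_rcd, t_cl, t_rp):
    """One RAS opens the row; each read to that row then needs only a CAS."""
    return t_rcd + n_reads * t_cl + t_rp


# Hypothetical timings in nanoseconds.
T_RCD, T_CL, T_RP = 30, 30, 30

for n in (1, 4, 8):
    conv = conventional_read_time(n, T_RCD, T_CL, T_RP)
    fpm = fast_page_mode_read_time(n, T_RCD, T_CL, T_RP)
    print(f"{n} reads: conventional {conv} ns, FPM {fpm} ns")
```

With a single read the two are identical; the advantage appears only when several column accesses hit the same open row.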
solution to that problem -- instead of simple tri-state buffers, use a latch as well. by putting a latch after the column mux, the next column address command can begin sooner
DRAM Evolution
Read Timing for Extended Data Out
[Timing diagram: RAS, CAS, and DQ traces; legend: Row Access, Column Access, Transfer Overlap, Data Transfer; DQ shows three Valid Dataout windows]
by driving the col-addr latch from an internal counter rather than an external signal, the minimum cycle time for driving the output bus was reduced by roughly 30%
DRAM Evolution
Read Timing for Burst EDO
[Timing diagram: RAS, CAS, and DQ traces; legend: Row Access, Column Access, Transfer Overlap, Data Transfer; DQ shows four Valid Data windows]
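The internal counter can be sketched as a generator that takes one externally supplied column address and produces the rest of the burst itself. The burst length of 4 and the wrap-around inside an aligned block are assumptions made for illustration; the slide states only that the column address latch is driven by an internal counter.

```python
def burst_column_addresses(start_col, burst_length=4):
    """Yield the column addresses of one burst.

    Only start_col comes from outside the chip; the remaining addresses
    are produced by a counter. Wrapping inside the aligned burst_length
    block is an assumption of this sketch.
    """
    base = start_col - (start_col % burst_length)
    for i in range(burst_length):
        yield base + (start_col + i) % burst_length


print(list(burst_column_addresses(6)))  # [6, 7, 4, 5]
print(list(burst_column_addresses(8)))  # [8, 9, 10, 11]
```

Since the controller no longer has to drive a new column address for every word, the address bus is idle during the burst and the per-word cycle time can shrink.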
pipeline refers to the setting up of the read pipeline ... first CAS\ toggle latches the column address, all following CAS\ toggles drive data out onto the bus. therefore data stops coming when the memory controller stops toggling CAS\
DRAM Evolution
Read Timing for Pipeline Burst EDO
[Timing diagram: RAS, CAS, and DQ traces; legend: Row Access, Column Access, Transfer Overlap, Data Transfer; DQ shows four Valid Data windows]
main benefit: frees up the CPU or memory controller from having to control the DRAM's internal latches directly ... the controller/CPU can go off and do other things during the idle cycles instead of waiting ... even though the time-to-first-word latency actually gets worse, the scheme increases system throughput
DRAM Evolution
Read Timing for Synchronous DRAM
[Timing diagram: Clock, RAS, CAS, and DQ traces; DQ shows four Valid Data windows]
output latch on EDO allowed you to start CAS sooner for the next access (to same row). latching the whole row in ESDRAM allows you to start precharge & RAS sooner for the next page access -- HIDE THE PRECHARGE OVERHEAD.
DRAM Evolution
Inter-Row Read Timing for ESDRAM
Regular CAS-2 SDRAM, R/R to same bank
[Timing diagram, first case: Clock, Command (ACT, READ, PRE, ACT, READ), Address (Row Addr, Col Addr, Bank, Row Addr, Col Addr), and DQ with eight Valid Data windows]
[Timing diagram, second case: same Command and Address sequence, DQ with eight Valid Data windows]
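A minimal sketch of how latching the whole row can hide the precharge, using a deliberately simplified cycle model. CAS latency 2 and RCD 2 follow the notes; the precharge time, the burst length, and the rule for when PRE may be issued are assumptions made for illustration and are not taken from a data sheet.

```python
def second_read_data_start(t_rcd, cl, t_rp, burst, row_cache):
    """Cycle at which data from a second row (same bank) starts on DQ.

    Cycle 0 is the first ACT. Simplified model: without a row cache, PRE
    waits until the first burst has left the sense amps; with a row cache,
    PRE is issued the cycle after the first READ.
    """
    first_read = t_rcd
    first_data_end = first_read + cl + burst
    pre = first_read + 1 if row_cache else first_data_end
    second_act = pre + t_rp
    second_data = second_act + t_rcd + cl
    # The data bus is shared, so the second burst cannot start early.
    return max(second_data, first_data_end)


T_RCD, CL = 2, 2      # from the notes: RCD-2, CAS-2
T_RP, BURST = 2, 4    # assumed for this sketch
first_data_end = T_RCD + CL + BURST

for name, cached in (("SDRAM", False), ("ESDRAM", True)):
    start = second_read_data_start(T_RCD, CL, T_RP, BURST, cached)
    print(f"{name}: second burst at cycle {start}, "
          f"{start - first_data_end} idle cycles on DQ")
```

In this model the plain SDRAM leaves the data bus idle for the whole precharge, activate, and CAS sequence, while the row-cached part overlaps most of that sequence with the first burst.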
neat feature of this type of buffering: write-around
DRAM Evolution
Write-Around in ESDRAM
Regular CAS-2 SDRAM, R/W/R to same bank, rows 0/1/0
[Timing diagram, first case: Clock, Command (ACT, READ, PRE, ACT, WRITE, PRE, ACT, READ), Address (Row Addr, Col Addr, Bank, Row Addr, Col Addr, Bank, Row Addr, Col Addr), and DQ with eleven Valid Data windows]
[Timing diagram, second case: Command (ACT, READ, PRE, ACT, WRITE, READ), Address (Row Addr, Col Addr, Bank, Row Addr, Col Addr, Col Addr), and DQ with four Valid Data windows]
main thing ... it is like having a bunch of open row buffers (a la rambus), but the problem is that you must deal with the cache directly (move into and out of it), not the DRAM banks ... adds an extra couple of cycles of latency ... however, you get good bandwidth if the data you want is in the cache, and you can prefetch into the cache ahead of when you want it ... originally targeted at reducing latency; now that SDRAM is CAS-2 and RCD-2, this makes sense only in a throughput way
DRAM Evolution
[Figure: block diagram with Row Decoder, Sense Amps, a cache ($) of 16 Channels (segments) built from 2Kb Segments, Sel/Dec, Input/Output Buffer, and DQs; a 2Kbit path and # DQs width are marked; operations shown: Activate, Prefetch, Restore, Read, Write]
FCRAM opts to break up the data array .. only activate a portion of the word line
DRAM Evolution
Internal Structure of Fast Cycle RAM
8K rows require 13 bits to select ... FCRAM uses 15 (assuming the array is 8K x 1K ... the data sheet does not specify)
[Figure: SDRAM with a 13-bit Row Decoder beside FCRAM with a 15-bit Row Decoder; each drives an 8M Array (?) with Sense Amps]
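The address-bit arithmetic in the note can be checked with a short sketch. It assumes, as the note does, an 8K x 1K array, and it reads the two extra FCRAM row bits as selecting one segment of the word line, which is an interpretation rather than something the data sheet confirms.

```python
def address_bits(n):
    """Number of address bits needed to select one of n items."""
    return (n - 1).bit_length()


ROWS = 8 * 1024      # 8K rows
COLS = 1024          # 1K columns, assumed (the data sheet does not specify)
FCRAM_ROW_BITS = 15  # from the slide

sdram_row_bits = address_bits(ROWS)
extra_bits = FCRAM_ROW_BITS - sdram_row_bits
segments_per_word_line = 2 ** extra_bits
columns_activated = COLS // segments_per_word_line

print(f"SDRAM row bits: {sdram_row_bits}")
print(f"extra FCRAM row bits: {extra_bits}")
print(f"word line split into {segments_per_word_line} segments "
      f"of {columns_activated} columns each")
```

Under these assumptions each activation drives a quarter of the word line, which is consistent with the note that FCRAM activates only a portion of it.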
MoSys takes this one step further ... DRAM with an SRAM interface & speed but DRAM energy [physical partitioning: 72 banks] auto refresh -- how to do this transparently? the logic moves through the arrays, refreshing them when not active. but what if one bank gets repeated access for a long duration? all other banks will be refreshed, but that one will not. solution: they have a bank-sized CACHE of lines ... in theory, should never have a problem (magic)
[Figure: row of banks with a Bank Select input]
DRAM Evolution