Evolution & Future - Memory Technology

ESE 345 covers computer architecture and memory technology. Early memory technologies included punched cards and paper tape. Core memory, using magnetized ferrite cores, was a reliable main memory technology until being replaced by semiconductor memory in the 1970s. Semiconductor memory includes static RAM and dynamic RAM. DRAM is the predominant main memory technology due to its high density and low cost, though it has slower access times than SRAM.

Computer Architecture

ESE 345 Computer Architecture


Memory Technology

Memory technology 1
Early Read-Only Memory Technologies

• Punched cards: from the early 1700s through the Jacquard Loom, Babbage, and then IBM
• Punched paper tape: instruction stream in Harvard Mk 1
• Diode matrix: EDSAC-2 µcode store
• IBM Balanced Capacitor ROS
• IBM Card Capacitor ROS

Early Read/Write Main Memory Technologies
• Babbage, 1800s: digits stored on mechanical wheels
• Williams Tube, Manchester Mark 1, 1947
• Mercury Delay Line, Univac 1, 1951
• Also, regenerative capacitor memory on Atanasoff-Berry computer, and rotating magnetic drum memory on IBM 650

MIT Whirlwind Core Memory

Core Memory
° Core memory was the first large-scale reliable main memory
• Invented by Forrester in late 40s/early 50s at MIT for the Whirlwind project
° Bits stored as magnetization polarity on small ferrite cores threaded onto a two-dimensional grid of wires
° Coincident current pulses on X and Y wires would write a cell and also sense its original state (destructive reads)
° Robust, non-volatile storage
° Used on space shuttle computers until recently
° Cores threaded onto wires by hand (25 billion a year at peak production)
° Core access time ~ 1µs

[Figure: DEC PDP-8/E board, 4K words x 12 bits (1968)]

Semiconductor Memory
° Semiconductor memory began to be competitive in early 1970s
• Intel formed to exploit market for semiconductor memory
• Early semiconductor memory was Static RAM (SRAM). SRAM cell internals similar to a latch (cross-coupled inverters).
° First commercial Dynamic RAM (DRAM) was Intel 1103
• 1Kbit of storage on single chip
• Charge on a capacitor used to hold value
° Semiconductor memory quickly replaced core in '70s

Memory Hierarchy Technology
° Random Access:
• “Random” is good: access time is the same for all locations
• DRAM: Dynamic Random Access Memory
- High density, low power, cheap, slow
- Dynamic: needs to be “refreshed” regularly
• SRAM: Static Random Access Memory
- Low density, high power, expensive, fast
- Static: content will last “forever” (until power is lost)
° “Not-so-random” Access Technology:
• Access time varies from location to location and from time to time
• Examples: Disk, CDROM, DRAM page-mode access
° Sequential Access Technology: access time linear in location (e.g., Tape)
° Today’s lecture will concentrate on random access technology
• The Main Memory: DRAMs + Caches: SRAMs

Main Memory Background
° Performance of Main Memory:
• Latency: Cache Miss Penalty
- Access Time: time between a request and the arrival of the word
- Cycle Time: minimum time between successive requests
• Bandwidth: I/O & Large Block Miss Penalty (L2)
° Main Memory is DRAM: Dynamic Random Access Memory
• Dynamic since it needs to be refreshed periodically (8 ms)
• Addresses divided into 2 halves (Memory as a 2D matrix):
- RAS or Row Address Strobe
- CAS or Column Address Strobe
° Cache uses SRAM: Static Random Access Memory
• No refresh (6 transistors/bit vs. 1 transistor)
• Size: DRAM/SRAM ratio of 4-8
• Cost/Cycle time: SRAM/DRAM ratio of 8-16

Random Access Memory (RAM) Technology
° Why do computer designers need to know about RAM
technology?
• Processor performance is usually limited by memory bandwidth
• As IC densities increase, lots of memory will fit on processor chip
- Tailor on-chip memory to specific needs
- Instruction cache
- Data cache
- Write buffer

° What makes RAM different from a bunch of flip-flops?


• Density: RAM is much denser

Static RAM Cell
6-Transistor SRAM Cell
[Figure: 6-transistor SRAM cell with a word line (row select) and two complementary bit lines, bit and bit-bar]

° Write:
1. Drive bit lines (bit = 1, bit-bar = 0)
2. Select row
° Read:
1. Precharge bit and bit-bar to Vdd or Vdd/2 => make sure equal!
2. Select row
3. Cell pulls one line low
4. Sense amp on column detects difference between bit and bit-bar

Typical SRAM Organization: 16-word x 4-bit
[Figure: 16-word x 4-bit SRAM array. Din 3 to Din 0 each feed a write driver and precharger (controlled by WrEn and Precharge); an address decoder on A0-A3 selects one of Word 0 to Word 15; each column of SRAM cells ends in a sense amp driving Dout 3 to Dout 0.]
Q: Which is longer: word line or bit line?

Logic Diagram of a Typical SRAM
[Figure: 2^N words x M bit SRAM with N address lines (A), M data lines (D), and control inputs WE_L and OE_L]

° Write Enable is usually active low (WE_L)
° Din and Dout are combined to save pins:
• A new control signal, output enable (OE_L) is needed
• WE_L is asserted (Low), OE_L is deasserted (High)
- D serves as the data input pin
• WE_L is deasserted (High), OE_L is asserted (Low)
- D is the data output pin
• Both WE_L and OE_L are asserted:
- Result is unknown. Don’t do that!!!
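
The WE_L/OE_L rules above can be written as a small truth-table function. This is only an illustrative sketch: the function name `d_pin_role` and the "idle" result for the case where neither signal is asserted are assumptions, since the slide does not describe that case.

```python
def d_pin_role(we_l: int, oe_l: int) -> str:
    """Role of the shared D pin for given active-low control levels.

    we_l, oe_l: 0 means asserted (Low), 1 means deasserted (High).
    """
    writing = (we_l == 0)
    reading = (oe_l == 0)
    if writing and reading:
        return "undefined"   # both asserted: result is unknown
    if writing:
        return "input"       # D serves as the data input pin
    if reading:
        return "output"      # D is the data output pin
    return "idle"            # assumption: pin not driven when neither is asserted
```

For example, `d_pin_role(0, 1)` gives `"input"` and `d_pin_role(1, 0)` gives `"output"`.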

1-Transistor Memory Cell (DRAM)
[Figure: 1-transistor DRAM cell with a row select line and a bit line]

° Write:
• 1. Drive bit line
• 2. Select row
° Read:
• 1. Precharge bit line to Vdd/2
• 2. Select row
• 3. Cell and bit line share charges
- Very small voltage changes on the bit line
• 4. Sense (fancy sense amp)
- Can detect changes of ~1 million electrons
• 5. Write: restore the value
° Refresh
• 1. Just do a dummy read to every cell.
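
To see why the read in step 3 produces only a very small voltage change, charge conservation between the cell capacitor and the precharged bit line can be worked through numerically. This is a rough sketch: the supply voltage and both capacitance values below are assumed purely for illustration and do not come from the slides.

```python
def bitline_swing(v_cell, v_precharge, c_cell, c_bitline):
    """Voltage change on the bit line after it shares charge with the cell.

    Charge conservation gives
        V_final = (C_cell * V_cell + C_bit * V_pre) / (C_cell + C_bit)
    so the swing is V_final - V_pre.
    """
    return (v_cell - v_precharge) * c_cell / (c_cell + c_bitline)

# Assumed, illustrative values (not from the slides).
VDD = 1.2          # volts
C_CELL = 30.0      # femtofarads
C_BITLINE = 300.0  # femtofarads

swing_one = bitline_swing(VDD, VDD / 2, C_CELL, C_BITLINE)   # cell stored a 1
swing_zero = bitline_swing(0.0, VDD / 2, C_CELL, C_BITLINE)  # cell stored a 0
print(f"{swing_one * 1000:.1f} mV, {swing_zero * 1000:.1f} mV")  # 54.5 mV, -54.5 mV
```

With these assumed numbers the bit line moves by only about 55 mV in either direction, which is why a sensitive sense amplifier and a restoring write are needed.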

DRAM Chip Organization
[Figure: DRAM chip organization. A row decoder takes the row address and drives the word lines of the memory cell array (each cell is a transistor plus a capacitor at a wordline/bitline crossing); the bitlines feed the sense amps and row buffer; a column decoder takes the column address and connects the selected bits to the data bus.]

° Optimized for density, not speed
° Data stored as charge in capacitor
° Discharge on reads => destructive reads
° Charge leaks over time
• Refresh every 64ms
° Cycle time roughly twice access time
° Need to precharge bitlines before access
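
The cost of the 64ms refresh requirement can be estimated with simple arithmetic. In this sketch the row count (8192) and the time to refresh one row (50 ns) are assumed for illustration; only the 64ms period comes from the slide.

```python
def refresh_interval_us(retention_ms: float, rows: int) -> float:
    """Average spacing between row refreshes (in µs) when every row
    must be refreshed once per retention period."""
    return retention_ms * 1000.0 / rows

def refresh_overhead(retention_ms: float, rows: int, row_refresh_ns: float) -> float:
    """Fraction of time the array is busy refreshing instead of serving accesses."""
    return rows * row_refresh_ns / (retention_ms * 1e6)

# Assumed: 8192 rows to refresh, 50 ns per row refresh.
print(refresh_interval_us(64, 8192))             # 7.8125 (µs between refreshes)
print(f"{refresh_overhead(64, 8192, 50):.2%}")   # 0.64%
```

Under these assumptions a refresh is issued roughly every 7.8 µs and costs well under one percent of the available time.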

DRAM Chip Organization
[Figure: DRAM chip organization, with row decoder, memory cell array, sense amps, row buffer, column decoder, and data bus]

° DRAM in 2014
• 8Gbit @25nm
• 266 MHz synchronous interface
• Data clock 4x (1066MHz), double-data rate so 2133 MT/s
° Address pins are time-multiplexed
• Row address strobe (RAS)
• Column address strobe (CAS)

DRAM Chip Organization
[Figure: DRAM chip organization, with row decoder, memory cell array, sense amps, row buffer, column decoder, and data bus]

° New RAS results in:
• Bitline precharge
• Row decode, sense
• Row buffer write (up to 8K)
° New CAS
• Read from row buffer
• Much faster (3x)

DRAM Generations
Year   Capacity   $/GB
1980   64Kbit     $1500000
1983   256Kbit    $500000
1985   1Mbit      $200000
1989   4Mbit      $50000
1992   16Mbit     $15000
1996   64Mbit     $10000
1998   128Mbit    $4000
2000   256Mbit    $1000
2004   512Mbit    $250
2007   1Gbit      $50
2010   2Gbit      $30
2012   4Gbit      $1
2014   8Gbit      ?
2019   16Gbit     ?

[Figure: Row and column access times (Trac and Tcac) in ns, plotted for '80 through '07 on a 0-300 ns axis]
In 2012, row and column access times are 35 ns and 0.8 ns, respectively.

DRAM Packaging
(Laptops/Desktops/Servers)
[Figure: DRAM chip with ~7 clock and control signals, ~12 address lines (multiplexed row/column address), and a data bus (4b, 8b, 16b, 32b)]

° DIMM (Dual Inline Memory Module) contains multiple chips with clock/control/address signals connected in parallel (sometimes need buffers to drive signals to all chips)
° Data pins work together to return wide word (e.g., 64-bit data bus using 16x4-bit parts)

DRAM Packaging, Mobile Devices
[ Apple A4 package on circuit board ]
[ Apple A4 package cross-section, iFixit 2010 ]
• Two stacked DRAM die
• Processor plus logic die

DRAM name based on Peak Chip Transfers / Sec
DIMM name based on Peak DIMM MBytes / Sec

Standard  Clock Rate (MHz)  M transfers/second  DRAM Name   MBytes/s/DIMM  DIMM Name
DDR       133               266                 DDR266      2128           PC2100
DDR       150               300                 DDR300      2400           PC2400
DDR       200               400                 DDR400      3200           PC3200
DDR2      266               533                 DDR2-533    4264           PC4300
DDR2      333               667                 DDR2-667    5336           PC5300
DDR2      400               800                 DDR2-800    6400           PC6400
DDR3      533               1066                DDR3-1066   8528           PC8500
DDR3      666               1333                DDR3-1333   10664          PC10700
DDR3      800               1600                DDR3-1600   12800          PC12800
DDR4      1600              3200                DDR4-3200   25600          PC25600

M transfers/second = 2 x clock rate; MBytes/s/DIMM = 8 x M transfers/second
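
The x2 and x8 relationships in the table can be checked with a few lines of arithmetic. This is a small sketch with made-up function names; the factor of 8 assumes a 64-bit (8-byte) DIMM data bus. Note that the tabulated clock rates are rounded (266 stands for 266.67 MHz), and the DIMM names are rounded labels rather than exact products.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit DIMM data bus

def transfers_from_clock(clock_mhz: float) -> float:
    """Double data rate: one transfer on each clock edge."""
    return 2 * clock_mhz

def dimm_mbytes_per_s(mega_transfers_per_s: int) -> int:
    """Peak DIMM bandwidth in MBytes/s for a given transfer rate."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER

print(transfers_from_clock(200))   # 400  -> DDR400
print(dimm_mbytes_per_s(400))      # 3200 -> PC3200
print(dimm_mbytes_per_s(1066))     # 8528 -> sold as PC8500
```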
Advanced DRAM Organization
° Bits in a DRAM are organized as a rectangular array
• DRAM accesses an entire row
• Burst mode: supply successive words from a row with reduced latency
° Double data rate (DDR) DRAM
• Transfer on rising and falling clock edges
° Quad data rate (QDR) DRAM
• Separate DDR inputs and outputs

DDR SDRAM Control
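° Commands
• Activate row
- Read row into row buffer
• Column access
- Read data from addressed row
• Bank Precharge
- Get ready for new row access

[Figure: banks 0 to N-1, each with a row decoder, memory array, sense amplifiers, row buffer, and column decoder, sharing address and data lines]
[Figure: bank state diagram with states Idle, Active, and Column Access; Row Activation moves a bank from Idle to Active, and Bank Precharge returns it to Idle]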
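
The command rules for a single bank can be sketched as a tiny state machine. The class and method names here are hypothetical, and timing is ignored; the sketch only enforces the ordering implied by the three commands (activate before column access, precharge before activating another row).

```python
class Bank:
    """One DRAM bank: Idle when no row is open, Active otherwise."""

    def __init__(self):
        self.open_row = None  # None means Idle (precharged)

    def activate(self, row: int) -> None:
        """Read a row into the row buffer; only legal from Idle."""
        if self.open_row is not None:
            raise RuntimeError("bank must be precharged before activating a new row")
        self.open_row = row

    def column_access(self, column: int) -> tuple:
        """Access data in the open row; only legal when a row is active."""
        if self.open_row is None:
            raise RuntimeError("no active row: issue an activate first")
        return (self.open_row, column)

    def precharge(self) -> None:
        """Close the open row and get ready for a new row access."""
        self.open_row = None
```

A typical sequence is `activate(5)`, one or more `column_access(...)` calls, then `precharge()` before a different row is activated.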

DRAM Operation
Three steps in read/write access to a given bank
° Row access (RAS)
• decode row address, enable addressed row (often multiple Kb in row)
• bitlines share charge with storage cell
• small change in voltage detected by sense amplifiers which latch whole row of bits
• sense amplifiers drive bitlines full rail to recharge storage cells
° Column access (CAS)
• decode column address to select small number of sense amplifier latches (4, 8, 16, or 32 bits depending on DRAM package)
• on read, send latched bits out to chip pins
• on write, change sense amplifier latches which then charge storage cells to required value
• can perform multiple column accesses on same row without another row access (burst mode)
° Precharge
• charges bit lines to known value, required before next row access

Each step has a latency of around 15-20ns in modern DRAMs.
Various DRAM standards (DDR, RDRAM) have different ways of encoding the signals for transmission to the DRAM, but all share the same core architecture.
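
Since each of the three steps costs roughly the same time, the latency of an access depends on which steps it needs. The sketch below assumes 15 ns per step (the low end of the 15-20ns range above); the function and parameter names are chosen for illustration only.

```python
def access_latency_ns(open_row, requested_row, t_row=15, t_col=15, t_pre=15):
    """Latency of one access to a bank, given the row currently open in it.

    open_row is None when the bank is idle (already precharged).
    """
    if open_row is None:
        return t_row + t_col            # row access, then column access
    if open_row == requested_row:
        return t_col                    # row already in the row buffer
    return t_pre + t_row + t_col        # close old row, open new row, then column

print(access_latency_ns(7, 7))      # 15: same row is open
print(access_latency_ns(None, 7))   # 30: bank idle
print(access_latency_ns(3, 7))      # 45: a different row is open
```

Under this assumption, a column access to an already open row is about three times faster than an access that first has to precharge and open a new row.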
Double-Data Rate (DDR2) DRAM
[Figure: timing diagram with a 200MHz clock; command sequence Row, Column, Precharge, Row'; data transferred at a 400Mb/s data rate]
[ Micron, 256Mb DDR2 SDRAM datasheet ]

Constructing a Memory System
° Combine chips in parallel to increase access width
• E.g. 4 16-bit wide DRAMs for a 64-bit parallel access
• DIMM: Dual Inline Memory Module
° Combine DIMMs to form multiple ranks
° Attach a number of DIMMs to a memory channel
• Memory Controller manages a channel (or two lock-step channels)
° Interleave patterns:
• Rank, Row, Bank, Column, [byte]
• Row, Rank, Bank, Column, [byte]
- Better dispersion of addresses
- Works better with power-of-two ranks

DDR SDRAM Memory System Example

° 3 ranks = 3 DIMMs (Dual Inline Memory Modules)
° 4 16-bit SDRAM DDR chips per DIMM (64 bits per DIMM access)
° 4 memory banks (B0-3) per chip on DIMM
° Real memory address partitioning: rank|row|bank|column|[byte]
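
Address partitioning amounts to slicing a physical address into bit fields. The sketch below is hypothetical: the field widths (2 rank bits, 13 row bits, 10 column bits) are assumed for illustration, while 2 bank bits and 3 byte bits follow from the 4 banks and 64-bit access stated above. It also compares the two field orders (rank-first and row-first) used as interleave patterns.

```python
def split_address(addr: int, layout):
    """Split addr into fields; layout lists (name, bits) from MSB to LSB."""
    fields = {}
    for name, bits in reversed(layout):     # peel fields off the low end
        fields[name] = addr & ((1 << bits) - 1)
        addr >>= bits
    return fields

# Assumed widths; only bank (4 banks) and byte (8 bytes) come from the slide.
RANK_FIRST = [("rank", 2), ("row", 13), ("bank", 2), ("column", 10), ("byte", 3)]
ROW_FIRST = [("row", 13), ("rank", 2), ("bank", 2), ("column", 10), ("byte", 3)]

stride = 1 << 15  # one step past the bank, column, and byte bits
print(split_address(stride, RANK_FIRST))  # row becomes 1, rank stays 0
print(split_address(stride, ROW_FIRST))   # rank becomes 1, row stays 0
```

With the row-first order, addresses that are one such stride apart land in different ranks, which is one way to read the "better dispersion of addresses" remark for that interleave pattern.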
Multi-core Memory Systems
° Memory controller can be centralized
• Mostly in smaller systems
° More often distributed in larger (multi-core) chip multiprocessing (CMP) systems

[Figure: four CMPs (CMP0 to CMP3), each with its own memory controller and channel to its own memory (Memory 0 to Memory 3)]

SDRAM Memory Controller
° Interface between a cache hierarchy and main memory
° Translates read and write requests into sequences of SDRAM commands
° Memory scheduler
• keeps track of the state of memory banks,
• reorders and interleaves memory requests to optimize memory latency and bandwidth utilization
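
One simple way a scheduler might reorder requests is to prefer those that hit a row already open in some bank, and otherwise serve the oldest request. The slides do not name a specific policy, so this row-hit-first rule (in the spirit of what is often called FR-FCFS) is an assumption, as are the function name and the request representation.

```python
def pick_next(queue, open_rows):
    """Pick the next request to issue.

    queue: non-empty list of (bank, row) requests, oldest first (modified in place).
    open_rows: dict mapping bank -> row currently open in that bank.
    """
    for i, (bank, row) in enumerate(queue):
        if open_rows.get(bank) == row:
            return queue.pop(i)      # oldest request that hits an open row
    return queue.pop(0)              # no row hit: fall back to the oldest request

pending = [(0, 1), (1, 7), (0, 3)]
state = {0: 3, 1: 2}                 # bank 0 has row 3 open, bank 1 has row 2 open
print(pick_next(pending, state))     # (0, 3): served first because it is a row hit
print(pick_next(pending, state))     # (0, 1): no hits remain, so the oldest goes next
```

A real controller would also track timing constraints and avoid starving old requests; this sketch only shows the reordering idea.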

3D DRAM Stacking Technologies

Hybrid Memory Cube (HMC)
° Micron proposal [Pawlowski, Hot Chips 11]
° www.hybridmemorycube.org

Hybrid Memory Cube MCM

Network of DRAM
° Traditional DRAM: star topology
° HMC: mesh, etc. are feasible

Hybrid Memory Cube
° High-speed logic segregated in chip stack
° 3D TSV for bandwidth

High Bandwidth Memory (HBM)
° High-speed serial links vs. 2.5D silicon interposer

3D DRAM Stacking in GPUs

Xeon Phi MCDRAM

Summary
° SRAM is fast but expensive and not very dense:
• Good choice for providing the user FAST access time.
° DRAM is slow but cheap and dense:
• Good choice for presenting the user with a BIG memory system
° New stacked DRAM to replace traditional DRAM in high-performance systems first
