
15-213

Memory Technology
Oct 5, 2000

Topics
• Memory Hierarchy Basics
• Static RAM
• Dynamic RAM
• Magnetic Disks
• Access Time Gap

class12.ppt
Impact of Technology
Moore’s Law
• Observation by Gordon Moore, Intel co-founder, in 1965
• Transistors / Chip doubles every 18 months
– Has expanded to include processor speed, disk capacity, …
We Owe a Lot to the Technologists
• Computer science has ridden the wave
Things Aren’t Over Yet
• Technology will continue to progress along current growth curves
• For at least 7–10 more years
• Difficult technical challenges in doing so
Even Technologists Can’t Beat Laws of Physics
• Quantum effects create fundamental limits as devices approach the atomic scale
• Opportunities for new devices

class12.ppt –2– CS 213 F’00


Impact of Moore’s Law
Moore’s Law
• Performance factors of systems built with integrated circuit
technology follow exponential curve
• E.g., computer speed / memory capacities double every 1.5 years
Implications
• Computers 10 years from now will run 100 X faster
• Problems that appear intractable today will be straightforward
• Must not limit future planning with today’s technology
Example Application Domains
• Speech recognition
– Will be routinely done with handheld devices
• Breaking secret codes
– Need to use large enough keys
• Digital Video
– Will stream just like today’s MP3’s



Computer System

[Diagram: the Processor (with registers) and Cache connect through the
memory-I/O bus to Memory and to I/O controllers for two Disks, a Display,
and a Network interface.]


Levels in Memory Hierarchy

[Diagram: the CPU registers exchange 8 B blocks with the cache; the cache
exchanges 32 B blocks with memory; virtual memory moves 8 KB pages between
memory and disk.]

              Register   Cache          Memory     Disk Memory
size:         200 B      32 KB – 4 MB   128 MB     30 GB
speed:        2 ns       4 ns           60 ns      8 ms
$/Mbyte:                 $100/MB        $1.50/MB   $0.05/MB
block size:   8 B        32 B           8 KB

larger, slower, cheaper →


Dimensions

[Log-scale ruler from 1 cm down to 1 Å:]
• Chip size: 1 cm
• Diameter of human hair: 25 µm
• 1996 devices: 0.35 µm
• Deep UV wavelength: 0.248 µm
• 2000 devices: 0.18 µm
• 2007 devices: 0.1 µm
• X-ray wavelength: 0.6 nm
• Silicon atom radius: 1.17 Å


Scaling to 0.1 µm
• Semiconductor Industry Association, 1992 Technology Workshop
  – Projected future technology based on past trends

                     1992   1995   1998   2001   2004   2007
Feature size (µm):   0.5    0.35   0.25   0.18   0.12   0.10
DRAM capacity:       16M    64M    256M   1G     4G     16G
Chip area (cm²):     2.5    4.0    6.0    8.0    10.0   12.5

• Feature size: industry is slightly ahead of projection
• DRAM capacity: doubles every 1.5 years; prediction on track
• Chip area: way off! Chips staying small


Static RAM (SRAM)
Fast
• ~4 nsec access time
Persistent
• as long as power is supplied
• no refresh required
Expensive
• ~$100/MByte
• 6 transistors/bit
Stable
• High immunity to noise and environmental disturbances
Technology for caches



Anatomy of an SRAM Cell

[Diagram: two cross-coupled inverters (6 transistors) store complementary
values b and b’; access transistors gated by the word line connect them to a
pair of bit lines. Stable configurations: 0/1 and 1/0.]

Terminology:
• bit line: carries data
• word line: used for addressing

Write:
1. Set bit lines to new data value
   – b’ is set to the opposite of b
2. Raise word line to “high”
⇒ sets cell to new state (may involve flipping relative to old state)

Read:
1. Set bit lines high
2. Set word line high
3. See which bit line goes low


SRAM Cell Principle

Inverter Amplifies
• Negative gain
• Slope < –1 in middle
• Saturates at ends

Inverter Pair Amplifies
• Positive gain
• Slope > 1 in middle
• Saturates at ends

[Plot: transfer curves V1 and V2 vs. Vin for two cascaded inverters; both
saturate near 0 and 1, with high gain in the middle.]


Bistable Element

Stability
• Require Vin = V2
• Stable at endpoints
  – recover from perturbation
• Metastable in middle
  – fall out when perturbed

[Plot: V2 vs. Vin for the inverter pair crosses the line V2 = Vin at three
points: two stable endpoints and one metastable midpoint. Ball-on-ramp
analogy: stable valleys at either end, metastable peak in the middle.]


Example SRAM Configuration (16 × 8)

[Diagram: a 4-bit address A0–A3 feeds an address decoder that drives word
lines W0–W15; each row holds 8 memory cells on bit-line pairs b7/b7’ … b0/b0’;
sense/write amps, controlled by R/W, connect the bit lines to input/output
lines d7 … d0.]
Dynamic RAM (DRAM)
Slower than SRAM
• access time ~60 nsec
Not persistent
• every row must be accessed every ~1 ms (refreshed)
Cheaper than SRAM
• ~$1.50 / MByte
• 1 transistor/bit
Fragile
• electrical noise, light, radiation
Workhorse memory technology



Anatomy of a DRAM Cell

[Diagram: a single access transistor, gated by the word line, connects the
bit line (capacitance CBL) to a storage node capacitor (Cnode).]

Writing: raise the word line and drive the bit line to V; the storage node
charges to V.
Reading: raise the word line; the bit-line voltage shifts by an amount
proportional to Cnode / CBL.


Addressing Arrays with Bits
Array Size
• R rows, R = 2^r
• C columns, C = 2^c
• N = R * C bits of memory
Addressing
• Addresses are n bits, where N = 2^n and n = r + c
• The address is the row bits followed by the column bits: address = row ∥ col
• row(address) = address / C
  – leftmost r bits of address
• col(address) = address % C
  – rightmost c bits of address
Example
• R = 2, C = 4
• address = 6 (binary 110)

        col 0   col 1   col 2   col 3
row 0    000     001     010     011
row 1    100     101     110     111

⇒ row 1, col 2
Example 2-Level Decode DRAM (64K × 1)

Provide the 16-bit address in two 8-bit chunks on A7–A0.

[Diagram: RAS strobes the first chunk into the row address latch; the row
decoder selects one of 256 rows in a 256 × 256 cell array. CAS strobes the
second chunk into the column address latch; the column latch/decoder and the
sense/write amps (controlled by R/W’) select one of the 256 columns,
connecting it to Dout / Din.]
DRAM Operation
Row Address (~50ns)
• Set Row address on address lines & strobe RAS
• Entire row read & stored in column latches
• Contents of row of memory cells destroyed
Column Address (~10ns)
• Set Column address on address lines & strobe CAS
• Access selected bit
– READ: transfer from selected column latch to Dout
– WRITE: Set selected column latch to Din
Rewrite (~30ns)
• Write back entire row



Observations About DRAMs
Timing
• Access time (= 60ns) < cycle time (= 90ns)
• Need to rewrite row
Must Refresh Periodically
• Perform complete memory cycle for each row
• Approximately once every 1ms
• Sqrt(n) cycles
• Handled in background by memory controller
Inefficient Way to Get a Single Bit
• Effectively read entire row of Sqrt(n) bits



Enhanced Performance DRAMs
Conventional Access
• Row + Col
• RAS CAS RAS CAS ...
Page Mode
• Row + series of columns
• RAS CAS CAS CAS ...
• Gives successive bits

[Diagram: the same 2-level decode organization as before; the entire row is
buffered in the column latches, so in page mode successive CAS strobes select
successive columns without a new row access.]

Other Acronyms
• EDORAM – “Extended data output”
• SDRAM – “Synchronous DRAM”
Typical Performance
row access time   col access time   cycle time   page-mode cycle time
50 ns             10 ns             90 ns        25 ns
Video RAM
Performance Enhanced for Video / Graphics Operations
• Frame buffer to hold graphics image
Writing
• Random access of bits
• Also supports rectangle fill operations
  – Set all bits in region to 0 or 1
Reading
• Load entire row into shift register
• Shift out at video rates

[Diagram: a 256 × 256 cell array with column sense/write amps; an added shift
register below the array captures an entire row and shifts it out as the
video stream output.]

Performance Example
• 1200 X 1800 pixels / frame
• 24 bits / pixel
• 60 frames / second
• ≈ 2.9 GBits / second
DRAM Driving Forces
Capacity
• 4X per generation
  – Square array of cells
• Typical scaling
  – Lithography dimensions 0.7X
    » Areal density 2X
  – Cell function packing 1.5X
  – Chip area 1.33X
• Scaling challenge
  – Typically Cnode / CBL = 0.1–0.2
  – Must keep Cnode high as cell size shrinks
Retention Time
• Typically 16–256 ms
• Want higher for low-power applications


DRAM Storage Capacitor

[Diagram: parallel-plate capacitor with plate area A, plate separation d, and
a dielectric material of dielectric constant ε: C = εA/d.]

Planar Capacitor
• Up to 1 Mb
• C decreases linearly with feature size
Trench Capacitor
• 4 Mb – 1 Gb
• Lining of hole in substrate
Stacked Cell
• ≥ 1 Gb
• On top of substrate
• Use high-ε dielectric


Trench Capacitor
Process
• Etch deep hole in substrate
  – ~5 µm deep
  – ~0.5 µm diameter
  – Becomes reference plate
• Grow oxide on walls
  – Dielectric
• Fill with polysilicon plug
  – Tied to storage node

[Diagram: cross section showing the SiO2 dielectric lining the trench, the
polysilicon storage plate inside it, and the substrate reference plate
outside.]


IBM DRAM Cell
• IBM J. R&D, Jan/Mar ’95
• Evolution from 4 – 256 Mb

[Figure: 4 Mb cell structure]


IBM DRAM Evolution
• IBM J. R&D, Jan/Mar ’95
• Evolution from 4 – 256 Mb cell layouts
• 256 Mb uses cell with area 0.6 µm²

[Figure: relative sizes of the 4 Mb, 16 Mb, 64 Mb, and 256 Mb cell layouts]


Mitsubishi Stacked Cell DRAM
• IEDM ’95
• Claim suitable for 1 – 4 Gb
Technology
• 0.14 µm process
• 8 nm gate oxide
• 0.29 µm² cell
Storage Capacitor
• Fabricated on top of everything else
• Ruthenium electrodes
• High-dielectric insulator
  – 50X higher than SiO2
  – 25 nm thick
• Cell capacitance 25 femtofarads

[Figure: cross section of 2 cells]


Mitsubishi DRAM Pictures

[Figures: die micrographs of the Mitsubishi stacked-cell DRAM]


Magnetic Disks

[Diagram: a disk surface spinning under a read/write head mounted on an arm.]

• The disk surface spins at 3,600–15,000 RPM
• The surface consists of a set of concentric magnetized rings called tracks
• Each track is divided into sectors
• The read/write head floats over the disk surface and moves back and forth
  on an arm from track to track
Disk Parameter Example (18 GB Capacity)
• Number of platters: 12
• Surfaces / platter: 2
• Number of tracks: 6,962
• Sectors / track: 213
• Bytes / sector: 512
Total bytes: 18,221,948,928



Disk Operation
Operation
• Read or write complete sector
Seek
• Position head over proper track
• Typically 6-9ms
Rotational Latency
• Wait until desired sector passes under head
• Worst case: complete rotation
  – 10,025 RPM ⇒ 6 ms
Read or Write Bits
• Transfer rate depends on # bits per track and rotational speed
• E.g., 213 * 512 bytes @10,025RPM = 18 MB/sec.
• Modern disks have external transfer rates of up to 100 MB/sec



Disk Performance
Getting First Byte
• Seek + rotational latency = 7,000 – 19,000 µsec
Getting Successive Bytes
• ~ 0.06 µsec each
– roughly 100,000 times faster than getting the first byte!
Optimizing Performance:
• Large block transfers are more efficient
• Try to do other things while waiting for first byte
– switch context to other computing task
– processor is interrupted when transfer completes



Disk / System Interface

1. Processor Signals Controller
• Read sector X and store starting at memory address Y
2. Read Occurs
• “Direct Memory Access” (DMA) transfer
• Under control of I/O controller
3. I/O Controller Signals Completion
• Interrupts processor
• Can resume suspended process

[Diagram: (1) the processor initiates the sector read over the memory-I/O
bus; (2) the I/O controller performs the DMA transfer from disk to memory;
(3) the controller signals “read done” by interrupting the processor.]


Magnetic Disk Technology
Seagate ST-12550N Barracuda 2 Disk
• Linear density: 52,187 bits per inch (BPI)
  – Bit spacing 0.5 µm
• Track density: 3,047 tracks per inch (TPI)
  – Track spacing 8.3 µm
• Total tracks: 2,707
• Rotational speed: 7,200 RPM
• Avg linear speed: 86.4 kilometers / hour
• Head floating height: 0.13 microns
Analogy:
• Put the Sears Tower on its side
• Fly it around the world, 2.5 cm above the ground
• Each complete orbit of the earth takes 8 seconds


CD Read-Only Memory (CDROM)
Basis
• Optical recording technology developed for audio CDs
  – 74 minutes playing time
  – 44,100 samples / second
  – 2 × 16 bits / sample (stereo)
  ⇒ Raw bit rate = 172 KB / second
• Add extra 288 bytes of error correction for every 2048 bytes of data
  – Cannot tolerate any errors in digital data, whereas OK for audio
Bit Rate
• 172 * 2048 / (288 + 2048) = 150 KB / second
– For 1X CDROM
– N X CDROM gives bit rate of N * 150
– E.g., 12X CDROM gives 1.76 MB / second
Capacity
• 74 Minutes * 150 KB / second * 60 seconds / minute = 650 MB
Storage Trends

SRAM   metric             1980     1985    1990   1995    2000    2000:1980
       $/MB               19,200   2,900   320    256     100     190
       access (ns)        300      150     35     15      2       100

DRAM   metric             1980     1985    1990   1995    2000    2000:1980
       $/MB               8,000    880     100    30      1.5     5,300
       access (ns)        375      200     100    70      60      6
       typical size (MB)  0.064    0.256   4      16      64      1,000

Disk   metric             1980     1985    1990   1995    2000    2000:1980
       $/MB               500      100     8      0.30    0.05    10,000
       access (ms)        87       75      28     10      8       11
       typical size (MB)  1        10      160    1,000   9,000   9,000

(Culled from back issues of Byte and PC Magazine)
Storage Price ($/MByte)

[Plot: log-scale $/MB for SRAM, DRAM, and disk, 1980–2000; all three fall by
several orders of magnitude, with disk cheapest throughout.]


Storage Access Times (nsec)

[Plot: log-scale access times for SRAM, DRAM, and disk, 1980–2000; SRAM and
DRAM improve modestly while disk remains millions of times slower.]


Processor Clock Rates

metric                1980   1985   1990   1995      2000   2000:1980
typical clock (MHz)   1      6      20     150       750    750
processor             8080   286    386    Pentium   P-III

(Culled from back issues of Byte and PC Magazine)
The CPU vs. DRAM Latency Gap (ns)

[Plot: log-scale latency, 1980–2000; CPU cycle time falls steeply while SRAM
and especially DRAM latency fall slowly, so the gap keeps widening.]


Memory Technology Summary
Cost and Density Improving at Enormous Rates
Speed Lagging Processor Performance
Memory Hierarchies Help Narrow the Gap:
• Small fast SRAMS (cache) at upper levels
• Large slow DRAMS (main memory) at lower levels
• Incredibly large & slow disks to back it all up
Locality of Reference Makes It All Work
• Keep most frequently accessed data in fastest memory

