Module 4 Memory Hierarchy
Today
● Storage technologies and trends
● Locality of reference
● Caching in the memory hierarchy
Nonvolatile Memories
● DRAM and SRAM are volatile memories
○ Lose information if powered off.
● Nonvolatile memories retain value even if powered off
○ Read-only memory (ROM): programmed during production
○ Programmable ROM (PROM): can be programmed once
○ Erasable PROM (EPROM): can be bulk erased (UV, X-ray)
○ Electrically erasable PROM (EEPROM): electronic erase capability
○ Flash memory: EEPROMs with partial (block-level) erase capability
■ Wears out after about 100,000 erasings
● Uses for Nonvolatile Memories
○ Firmware programs stored in a ROM (BIOS, controllers for disks, network cards,
graphics accelerators)
○ Solid state disks
○ Disk caches
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition 5
Plaksha Univ
[Figure: the CPU chip (register file, ALU, bus interface) connects over the system bus to the I/O bridge, which connects over the memory bus to main memory.]
[Figure: Load operation: movq A, %rax. The word x stored at address A in main memory travels over the bus, through the I/O bridge and bus interface, into register %rax.]
[Figure: Store operation: movq %rax, A. The word y held in register %rax travels over the bus and is written to address A in main memory.]
[Figure: disk drive with labeled parts: actuator, electronics (including a processor and memory!), SCSI connector.]
Disk Geometry
● Disks consist of platters, each with two surfaces.
● Each surface consists of concentric rings called tracks.
● Each track consists of sectors separated by gaps.
[Figure: a disk surface showing tracks, track k, sectors, gaps, and the spindle.]
Disk Capacity
● Capacity: maximum number of bits that can be stored.
○ Vendors express capacity in units of gigabytes (GB), where
1 GB = 10^9 bytes.
● Capacity is determined by these technology factors:
○ Recording density (bits/in): number of bits that can be squeezed into a 1
inch segment of a track.
○ Track density (tracks/in): number of tracks that can be squeezed into a 1
inch radial segment.
○ Areal density (bits/in^2): product of recording and track density.
Recording zones
● Modern disks partition tracks into disjoint
subsets called recording zones
○ Each track in a zone has the same
number of sectors, determined by
the circumference of the innermost
track.
○ Each zone has a different number of
sectors/track; outer zones have more
sectors/track than inner zones.
○ So we use average number of
sectors/track when computing
capacity.
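The capacity computation mentioned above can be sketched in C. This is a minimal illustration assuming the usual product of geometry factors (bytes/sector, average sectors/track, tracks/surface, surfaces/platter, platters/disk); the function name disk_capacity_bytes and all parameter values are hypothetical, not taken from the slides.

```c
#include <stdint.h>

/* Disk capacity in bytes as a product of geometry factors (assumed formula).
   The average sectors/track is used because zoned disks store more sectors
   on outer tracks than on inner ones. */
uint64_t disk_capacity_bytes(uint64_t bytes_per_sector,
                             uint64_t avg_sectors_per_track,
                             uint64_t tracks_per_surface,
                             uint64_t surfaces_per_platter,
                             uint64_t platters_per_disk)
{
    return bytes_per_sector * avg_sectors_per_track * tracks_per_surface
         * surfaces_per_platter * platters_per_disk;
}
```

With illustrative values of 512 bytes/sector, 300 sectors/track on average, 20,000 tracks/surface, 2 surfaces/platter, and 5 platters, the call disk_capacity_bytes(512, 300, 20000, 2, 5) returns 30,720,000,000 bytes, which is 30.72 GB in vendor units.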
By moving radially, the arm can position the read/write head over any track.
[Figure: the arm and the spindle.]
Disk Access
Rotation is counter-clockwise.
[Figure: stages of a disk access: after BLUE read, seek for RED, rotational latency, after RED read.]
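The stages named in the access figure (seek, rotational latency, then the read itself) suggest a simple cost model. The sketch below assumes the average access time is the sum of an average seek time, half a rotation on average, and the time for one sector to pass under the head. The function name avg_access_ms and the sample parameters are hypothetical.

```c
/* Average time to access one sector, in milliseconds (assumed model):
   seek + average rotational latency + transfer of one sector. */
double avg_access_ms(double avg_seek_ms, double rpm,
                     double avg_sectors_per_track)
{
    double ms_per_rotation = 60.0 * 1000.0 / rpm;
    double avg_rotation_ms = ms_per_rotation / 2.0;   /* wait half a turn on average */
    double transfer_ms = ms_per_rotation / avg_sectors_per_track;
    return avg_seek_ms + avg_rotation_ms + transfer_ms;
}
```

For an assumed 7,200 RPM disk with a 9 ms average seek and 400 sectors/track, avg_access_ms(9.0, 7200.0, 400.0) is about 13.19 ms. Nearly all of that is seek and rotational latency; the transfer itself takes about 0.02 ms.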
I/O Bus
[Figure: the CPU chip (register file, ALU, bus interface) connects over the system bus to the I/O bridge, which connects to main memory over the memory bus and to the I/O bus.]
Sequential read throughput   550 MB/s    Sequential write throughput   470 MB/s
Random read throughput       365 MB/s    Random write throughput       303 MB/s
Avg. sequential read time    50 us       Avg. sequential write time    60 us
[Figure: plot with series for Disk, SSD, DRAM, and CPU.]
Today
● Storage technologies and trends
● Locality of reference
● Caching in the memory hierarchy
Locality
● Principle of Locality: Programs tend to use data and instructions with
addresses near or equal to those they have used recently
● Temporal locality:
○ Recently referenced items are likely
to be referenced again in the near future
● Spatial locality:
○ Items with nearby addresses tend
to be referenced close together in time
Locality Example
sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];
return sum;
● Data references
○ Reference array elements in succession (stride-1 reference pattern): spatial locality
○ Reference variable sum each iteration: temporal locality
● Instruction references
○ Reference instructions in sequence: spatial locality
○ Cycle through loop repeatedly: temporal locality
Locality Example
● Question: Does this function have good locality with respect to array a?
Locality Example
● Question: Does this function have good locality with respect to array a?
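The functions these questions refer to are not preserved in this text. As an assumed illustration, the sketch below contrasts two ways of summing a 2-d array. The names sum_array_rows and sum_array_cols and the dimensions M and N are hypothetical. C stores arrays in row-major order, so the loop order decides the stride.

```c
#define M 4
#define N 3

/* Row-wise scan: the inner loop varies the last index, so consecutive
   accesses touch adjacent memory (stride-1, good spatial locality). */
int sum_array_rows(int a[M][N])
{
    int sum = 0;
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

/* Column-wise scan: the inner loop varies the first index, so consecutive
   accesses are N elements apart (stride-N, poor spatial locality). */
int sum_array_cols(int a[M][N])
{
    int sum = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}
```

Both functions return the same sum; only the order of memory references differs, and with it the locality with respect to a.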
Locality Example
● Question: Can you permute the loops so that the function scans the 3-d array a
with a stride-1 reference pattern (and thus has good spatial locality)?
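The original function for this question is also missing from the text. One possible answer, sketched under the assumption that the array is indexed as a[k][i][j], is to nest the loops in the same order as the indices so that the innermost loop varies the last index. The name sum_array_3d and the dimensions D0, D1, D2 are hypothetical.

```c
#define D0 2
#define D1 3
#define D2 4

/* Loop nesting k, i, j matches the index order a[k][i][j], so the
   innermost loop walks adjacent elements: a stride-1 reference pattern. */
int sum_array_3d(int a[D0][D1][D2])
{
    int sum = 0;
    for (int k = 0; k < D0; k++)
        for (int i = 0; i < D1; i++)
            for (int j = 0; j < D2; j++)
                sum += a[k][i][j];
    return sum;
}
```

The general rule of thumb: the index that changes fastest in the loop nest should be the rightmost index of the array reference.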
Memory Hierarchies
● Some fundamental and enduring properties of hardware and software:
○ Fast storage technologies cost more per byte, have less capacity, and
require more power (heat!).
○ The gap between CPU and main memory speed is widening.
○ Well-written programs tend to exhibit good locality.
Today
● Storage technologies and trends
● Locality of reference
● Caching in the memory hierarchy
Caches
● Cache: A smaller, faster storage device that acts as a staging area for a subset of the
data in a larger, slower device.
● Fundamental idea of a memory hierarchy:
○ For each k, the faster, smaller device at level k serves as a cache for the larger,
slower device at level k+1.
● Why do memory hierarchies work?
○ Because of locality, programs tend to access the data at level k more often than
they access the data at level k+1.
○ Thus, the storage at level k+1 can be slower, and thus larger and cheaper per bit.
● Big Idea: The memory hierarchy creates a large pool of storage that costs as much as
the cheap storage near the bottom, but that serves data to programs at the rate of
the fast storage near the top.
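A rough way to see why the hierarchy can run near the speed of its top level is to estimate the average access time at level k. The sketch below assumes a simple model in which every access pays the hit time, and the fraction of accesses that miss also pays a penalty to reach level k+1. The function name avg_access_cycles and the numbers used are hypothetical.

```c
/* Average access time in cycles under the assumed model:
   every access pays hit_time; a miss_rate fraction also pays miss_penalty. */
double avg_access_cycles(double hit_time, double miss_rate,
                         double miss_penalty)
{
    return hit_time + miss_rate * miss_penalty;
}
```

With an assumed 1-cycle hit time and 100-cycle miss penalty, a 3% miss rate gives about 4 cycles on average and a 1% miss rate gives about 2 cycles, so good locality keeps the average close to the fast level's speed.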
[Figure: a cache holding blocks 8, 9, 14, and 3, in front of a memory holding blocks 0 through 15. Block 12 appears in the miss case.]
● Block b is in cache: hit!
● Block b is not in cache: miss!
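The hit and miss cases can be sketched with a tiny simulation. It assumes a 4-slot cache and a placement policy that puts block b in slot b mod 4; that policy is an assumption, chosen because it is consistent with blocks 8, 9, 14, and 3 occupying four distinct slots. The names cache_t, cache_init, cache_access, and NUM_SLOTS are hypothetical, and block numbers are assumed to be non-negative.

```c
#include <stdbool.h>

#define NUM_SLOTS 4
#define EMPTY (-1)

typedef struct {
    int slot[NUM_SLOTS];   /* block number held in each slot, or EMPTY */
} cache_t;

/* Start with every slot empty. */
void cache_init(cache_t *c)
{
    for (int i = 0; i < NUM_SLOTS; i++)
        c->slot[i] = EMPTY;
}

/* Returns true on a hit. On a miss, block b is fetched into its slot,
   replacing whatever block was there before. */
bool cache_access(cache_t *c, int b)
{
    int i = b % NUM_SLOTS;   /* assumed placement policy: slot = b mod 4 */
    if (c->slot[i] == b)
        return true;
    c->slot[i] = b;
    return false;
}
```

After loading blocks 8, 9, 14, and 3, a request for block 14 hits. A request for block 12 misses and, under this placement policy, replaces block 8 in slot 0, so a later request for block 8 misses again.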
Web cache: caches Web pages; stored on remote server disks; latency about 1,000,000,000 cycles; managed by a Web proxy server.
Summary
● The speed gap between CPU, memory and mass storage continues to widen.