ECE/CS 250 Computer Architecture Summer 2021
Computer Architecture
Summer 2021
I/O
Tyler Bletsch
Duke University
Includes material adapted from Dan Sorin (Duke) and Amir Roth (Penn).
SSD material from Andrew Bondi (Colorado State).
Where We Are in This Course Right Now
• So far:
• We know how to design a processor that can fetch, decode, and
execute the instructions in an ISA
• We understand how to design caches and memory
• Now:
• We learn about the lowest level of storage (disks)
• We learn about input/output in general
• Next:
• Faster processor cores
• Multicore processors
This Unit: I/O
Readings
Computers Interact with Outside World
• Input/output (I/O)
• Otherwise, how will we ever tell a computer what to do…
• …or exploit the results of its work?
• Computers without I/O are not useful
• ICQ: What kinds of I/O do computers have?
One Instance of I/O
• Figure labels: L2, main memory, disk (swap)
A More General/Realistic I/O System
• A computer system
• CPU, including cache(s)
• Memory (DRAM)
• I/O peripherals: disks, input devices, displays, network cards, ...
• With built-in or separate I/O (or DMA) controllers
• All connected by a system bus
Standard Bus Examples
This Unit: I/O
Operating System (OS) Plays a Big Role
I/O Device Characteristics
• Primary characteristic
• Data rate (aka bandwidth)
• Contributing factors
• Partner: humans have slower output data rates than machines
• Input or output or both (input/output)
I/O Device: Disk
Use                   Desktop     Laptop     Ancient iPod
Platters              ~6          2          1
Average Seek          4.16 ms     4.5 ms     7 ms
Sustained Data Rate   216 MB/s    94 MB/s    16 MB/s
Interface             SAS/SATA    SCSI       ATA
• Seek time: not really improving!
Disk Read/Write Latency
Understanding disk performance
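A rough way to see why seeks hurt is to add up the three parts of one random disk access: seek, rotational delay, and transfer. The sketch below takes the desktop drive's average seek (4.16 ms) and sustained data rate (216 MB/s) from the disk comparison. The 7200 RPM spindle speed, the 4 KB request size, and the function name are assumptions made for illustration; they are not figures from the slides.

```python
def disk_access_time_s(seek_s, rpm, nbytes, rate_bytes_per_s):
    """Estimate one random access: seek + average rotational delay + transfer."""
    rotational_s = 0.5 * 60.0 / rpm          # on average, wait half a revolution
    transfer_s = nbytes / rate_bytes_per_s   # time to stream the requested bytes
    return seek_s + rotational_s + transfer_s

SEEK_S = 4.16e-3     # average seek, desktop drive
RATE = 216e6         # sustained data rate in bytes/s (MB taken as 10**6 bytes)
RPM = 7200           # ASSUMPTION: spindle speed is not given in the slides
REQUEST = 4096       # ASSUMPTION: one 4 KB random read

t = disk_access_time_s(SEEK_S, RPM, REQUEST, RATE)
random_rate = REQUEST / t                    # effective bytes/s for random reads

print(f"one random 4 KB read: {t * 1e3:.2f} ms")
print(f"random: {random_rate / 1e6:.2f} MB/s, sequential: {RATE / 1e6:.0f} MB/s")
```

Under these assumptions nearly all of the access time is mechanical positioning, so small random reads achieve only a tiny fraction of the sustained rate.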
Error Correction: RAID
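One common way a RAID array can survive the loss of a single drive is XOR parity. The sketch below is an illustration under that assumption, not necessarily the scheme shown in lecture: three hypothetical data drives each hold one small block, a fourth drive holds their XOR, and the contents of a failed drive are rebuilt from the survivors.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents of three data drives (one block each).
data_drives = [b"ECE2", b"50IO", b"RAID"]
parity_drive = xor_blocks(data_drives)

# Suppose drive 1 fails. XOR of the surviving data and parity rebuilds it.
survivors = [data_drives[0], data_drives[2], parity_drive]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # b'50IO'
```

This works because XOR-ing a value with itself cancels it out, so everything except the missing block drops away.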
SSD HDD
Adapted from “Solid State Drives” by Andrew Bondi
SSDs
Typical read and write rates: SSD vs HDD
• Figure labels: HDD, SSD
This Unit: I/O
I/O Control and Interfaces
Sending Commands to I/O Devices
Memory mapped IO example (1)
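With memory-mapped I/O, ordinary loads and stores to certain addresses reach a device instead of RAM. The toy simulation below sketches that address decoding and a polling loop on a status register. The classes, the base address 0xFFFF0000, and the register offsets are all hypothetical and exist only for illustration.

```python
class FakeDevice:
    """Hypothetical input device with a status register and a data register."""
    STATUS, DATA = 0x0, 0x4          # register offsets (assumed)

    def __init__(self, value, ready_after):
        self.value = value
        self.countdown = ready_after  # status reads that return "not ready"

    def load(self, offset):
        if offset == self.STATUS:
            if self.countdown > 0:
                self.countdown -= 1
                return 0              # not ready yet
            return 1                  # data is ready
        if offset == self.DATA:
            return self.value
        raise ValueError(f"no register at offset {offset:#x}")


class Bus:
    """Routes each address to RAM or to the device, based on the address."""
    def __init__(self, ram_words, device, device_base):
        self.ram = [0] * ram_words
        self.device = device
        self.device_base = device_base

    def load(self, addr):
        if addr >= self.device_base:              # device address range
            return self.device.load(addr - self.device_base)
        return self.ram[addr // 4]                # ordinary RAM access

    def store(self, addr, value):
        if addr >= self.device_base:
            raise ValueError("this toy device has no writable registers")
        self.ram[addr // 4] = value


DEVICE_BASE = 0xFFFF0000                          # hypothetical mapping
bus = Bus(1024, FakeDevice(value=42, ready_after=3), DEVICE_BASE)

bus.store(0x10, 7)                                # goes to RAM
wasted_polls = 0
while bus.load(DEVICE_BASE + FakeDevice.STATUS) == 0:   # polling loop
    wasted_polls += 1
value = bus.load(DEVICE_BASE + FakeDevice.DATA)   # goes to the device

print(bus.load(0x10), value, wasted_polls)        # 7 42 3
```

The same load operation is used for both RAM and the device; only the address decides where it ends up. The wasted polls are the CPU time that polling burns while the device is not ready.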
Polling Overhead: Example #1
• Parameters
• 500 MHz CPU
• Polling event takes 400 cycles
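Given these parameters, polling overhead is the fraction of CPU cycles spent on polling events each second. The sketch below works that out for a slow device. The rate of 30 polls per second is an assumed value chosen for illustration, and the function name is hypothetical.

```python
def polling_overhead(cpu_hz, cycles_per_poll, polls_per_s):
    """Fraction of all CPU cycles that go to polling."""
    return polls_per_s * cycles_per_poll / cpu_hz

CPU_HZ = 500e6           # 500 MHz CPU
CYCLES_PER_POLL = 400    # one polling event
POLLS_PER_S = 30         # ASSUMPTION: a slow device polled 30 times per second

overhead = polling_overhead(CPU_HZ, CYCLES_PER_POLL, POLLS_PER_S)
print(f"{overhead:.4%}")  # 0.0024%
```

For a device this slow, polling costs almost nothing.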
Polling Overhead: Example #2
• Same parameters
• 500 MHz CPU, polling event takes 400 cycles
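For a fast device, the polling rate is set by the device: the CPU must poll often enough that no data is missed. The sketch below assumes the device is a 4 MB/s disk with a 16 B interface (the disk that appears in the DMA Overhead parameters), that MB means 10**6 bytes, and that one poll picks up one 16 B transfer. The names are hypothetical.

```python
CPU_HZ = 500e6                 # 500 MHz CPU
CYCLES_PER_POLL = 400          # one polling event
DISK_BYTES_PER_S = 4e6         # ASSUMPTION: 4 MB/s disk, MB = 10**6 bytes
BYTES_PER_POLL = 16            # ASSUMPTION: one poll per 16 B transfer

# Poll at least once per transfer so no data is missed.
polls_per_s = DISK_BYTES_PER_S / BYTES_PER_POLL
disk_overhead = polls_per_s * CYCLES_PER_POLL / CPU_HZ

print(f"{polls_per_s:.0f} polls/s, overhead {disk_overhead:.0%}")
# 250000 polls/s, overhead 20%
```

Under these assumptions, a fifth of the CPU is spent polling, whether or not the disk has anything to transfer.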
Interrupt-Driven I/O
Interrupt Overhead
Direct Memory Access (DMA)
DMA Controllers
• Figure labels: bus, main memory, disk, display, NIC
DMA Overhead
• Parameters
• 500 MHz CPU
• Interrupt handler takes 400 cycles
• Data transfer takes 100 cycles
• 4 MB/s, 16 B interface, disk transfers data 50% of time
• DMA setup takes 1600 cycles, transfer 1 16KB page at a time
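These parameters allow a comparison of CPU overhead for interrupt-driven I/O against DMA. The sketch below makes several assumptions that should be checked against the lecture: MB and KB are decimal (10**6 and 10**3 bytes), interrupt-driven I/O costs one handler plus one data transfer per 16 B, and DMA costs one setup plus one completion interrupt per 16 KB page. The names are hypothetical.

```python
MB, KB = 10**6, 10**3          # ASSUMPTION: decimal units

CPU_HZ = 500 * 10**6           # 500 MHz CPU
HANDLER_CYCLES = 400           # interrupt handler
TRANSFER_CYCLES = 100          # CPU moves one 16 B chunk itself
DISK_BYTES_PER_S = 4 * MB      # 4 MB/s disk
INTERFACE_BYTES = 16           # 16 B interface
BUSY_FRACTION = 0.5            # disk transfers data 50% of the time
DMA_SETUP_CYCLES = 1600        # program the DMA controller
PAGE_BYTES = 16 * KB           # one page per DMA transfer

def cpu_overhead(events_per_s, cycles_per_event):
    """Fraction of CPU cycles consumed by the given event stream."""
    return events_per_s * cycles_per_event / CPU_HZ

# Interrupt-driven: the CPU is involved once per 16 B transfer.
transfers_per_s = BUSY_FRACTION * DISK_BYTES_PER_S / INTERFACE_BYTES
interrupt_overhead = cpu_overhead(transfers_per_s,
                                  HANDLER_CYCLES + TRANSFER_CYCLES)

# DMA: the CPU is involved once per page (setup + completion interrupt).
pages_per_s = BUSY_FRACTION * DISK_BYTES_PER_S / PAGE_BYTES
dma_overhead = cpu_overhead(pages_per_s, DMA_SETUP_CYCLES + HANDLER_CYCLES)

print(f"interrupt-driven: {interrupt_overhead:.2%}")  # 12.50%
print(f"DMA:              {dma_overhead:.2%}")        # 0.05%
```

The gap comes from how often the CPU is involved: once per 16 B without DMA, once per 16 KB page with it.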
DMA and Memory Hierarchy
DMA and Caching
Solutions to Coherence Problem
H/W Cache Coherence (more later on this)
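The coherence problem arises when a DMA transfer changes memory while a cache still holds the old copy. The toy simulation below sketches one direction of the problem (a device writing into memory) and how a snooping cache avoids it by invalidating its copy when it observes the bus write. The classes and the address 0x100 are hypothetical, and real hardware protocols are more involved.

```python
class Cache:
    """Tiny cache model: one entry per address, filled on first read."""
    def __init__(self, memory, snoops):
        self.memory = memory
        self.snoops = snoops
        self.lines = {}

    def read(self, addr):
        if addr not in self.lines:            # miss: fetch from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]               # hit: return the cached copy

    def observe_bus_write(self, addr):
        if self.snoops:                       # snooping cache invalidates
            self.lines.pop(addr, None)


def dma_write(memory, caches, addr, value):
    """Device writes memory directly; the write is visible on the bus."""
    memory[addr] = value
    for cache in caches:
        cache.observe_bus_write(addr)


def cpu_read_after_dma(snoops):
    memory = {0x100: "old"}
    cache = Cache(memory, snoops)
    cache.read(0x100)                         # CPU caches the old value
    dma_write(memory, [cache], 0x100, "new")  # device overwrites memory
    return cache.read(0x100)                  # what does the CPU see now?


print(cpu_read_after_dma(snoops=False))  # old  (stale data)
print(cpu_read_after_dma(snoops=True))   # new
```

Without snooping the CPU keeps reading the stale cached value; with snooping the invalidation forces the next read to miss and fetch the new data from memory.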
Summary
• Storage devices
• HDD: Mechanical disk. Seeks are bad. Cheaper per GB.
• SSD: Flash storage. Cheaper per performance.
• Can combine drives with RAID to get aggregate performance/capacity
plus fault tolerance (can survive individual drive failures).
• Connectivity
• A bus is shared among the CPU, memory, and multiple I/O devices
• How does CPU talk to IO devices?
• Special instructions or memory-mapped IO
(certain addresses don’t lead to RAM, they lead to IO devices)
• Either approach requires OS privilege to use
• Methods of interaction:
• Polling (simple but wastes CPU)
• Interrupts (saves CPU but transfers only a little data at a time)
• DMA+interrupts (saves CPU and is fast, but requires caches to snoop
bus traffic so they do not hold stale data)