Common Framework For Memory Hierarchy
The Big Picture: Where are We Now?
°Today’s Topic: I/O Systems

I/O System Design Issues (§8.1)
• Performance
• Expandability
• Resilience in the face of failure
[Figure: block diagram with labels Network, interrupts, Processor, Input, Control, Memory, Cache]
°Throughput:
• The number of tasks completed by the server in unit time
• In order to get the highest possible throughput:
- The server should never be idle
- The queue should never be empty
°Response time:
• Begins when a task is placed in the queue
• Ends when it is completed by the server
• In order to minimize the response time:
- The queue should be empty
- The server will be idle

[Figure: plot against percentage of maximum throughput (x-axis, 20% to 100%); y-axis ticks at 100 and 200]
°Tradeoff between response time and throughput
°Example: grouping access requests that are close may increase
throughput but also increase the response time for some requests.
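
The tradeoff between response time and throughput can be illustrated with a simple queueing formula. The sketch below assumes an M/M/1 queue (one server, random arrivals and service times), which the slides do not specify; the function name `mean_response_time` and the 10 ms service time are illustrative assumptions.

```python
def mean_response_time(service_time_ms, utilization):
    """Mean time in system for an M/M/1 queue: S / (1 - rho).

    utilization (rho) is the fraction of maximum throughput in use.
    The M/M/1 model is an assumption, not something the slides state.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)


SERVICE_TIME_MS = 10.0  # assumed service time, for illustration only

for pct in (20, 40, 60, 80, 90):
    t = mean_response_time(SERVICE_TIME_MS, pct / 100)
    print(f"{pct:3d}% of maximum throughput -> {t:6.1f} ms mean response time")
```

Under this model the response time grows slowly at low load and steeply as the server approaches full utilization, which is the shape of tradeoff the slide describes.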
ECE4680 io.7 April 1, 2003 ECE4680 io.8 April 1, 2003
[Figure: a Producer feeds a Queue, which feeds a Server]
°Supercomputer application:
• Large-scale scientific problems
°Transaction processing:
• Examples: Airline reservations systems and banks
°File system:
• Example: UNIX file system
°Measurements of UNIX file systems in an engineering environment:
• 80% of accesses are to files less than 10 KB
• 90% of all file accesses are to data with
sequential addresses on the disk
• 67% of the accesses are reads
• 27% of the accesses are writes
• 6% of the accesses are read-write accesses

°Behavior: how does an I/O device behave?
• Input: read only
• Output: write only, cannot read
• Storage: can be reread and usually rewritten
°Partner:
• Either a human or a machine is at the other end of the I/O device
• Either feeding data on input or reading data on output
°Data rate:
• The peak rate at which data can be transferred:
- Between the I/O device and the main memory
- Or between the I/O device and the CPU
Device            Behavior         Partner  Data Rate (KB/sec)
Keyboard          Input            Human         0.01
Mouse             Input            Human         0.02
Line Printer      Output           Human         1.00
Laser Printer     Output           Human       100.00
Graphics Display  Output           Human    30,000.00
Network-LAN       Input or Output  Machine     200.00
Floppy disk       Storage          Machine      50.00
Optical Disk      Storage          Machine     500.00
Magnetic Disk     Storage          Machine   2,000.00

[Figure: memory hierarchy, from top to bottom: Registers, Cache, Memory, Disk]
°Purpose:
• Long term, nonvolatile storage
• Large, inexpensive, and slow
• Lowest level in the memory hierarchy
°Two major types:
• Floppy disk
• Hard disk
°Both types of disks:
• Rely on a rotating platter coated with a magnetic surface
• Use a moveable read/write head to access the disk
°Advantages of hard disks over floppy disks:
• Platters are more rigid (metal or glass) so they can be larger
• Higher density because it can be controlled more precisely
• Higher data rate because it spins faster
• Can incorporate more than one platter
°A stack of platters, a surface with a magnetic coating
°Typical numbers (depending on the disk size):
• 500 to 2,000 tracks per surface
• 32 to 128 sectors per track
- A sector is the smallest unit that can be read or written

°Average seek time as reported by the industry:
• Typically in the range of 8 ms to 15 ms
• (Sum of the time for all possible seeks) / (total # of possible seeks)
°Due to locality of disk reference, actual average seek time may:
• Only be 25% to 33% of the advertised number
Disk diameter (inches)         10.88     3.50     1.80
Formatted data capacity (MB)  22,700    1,000       21
MTTF (hours)                  50,000  400,000  100,000
Number of arms/box                12        1        1
Rotation speed (RPM)           3,600    4,318    3,800
Transfer rate (MB/sec)           4.2        4      1.9
Power/box (watts)              2,900       12        2
MB/watt                            8      102     10.5
Volume (cubic feet)               97     0.13     0.02
These disks represent the newest products of 1993. Compare with the newest disks
of 1997 on page 650 to see how quickly disks have developed.

°Setup parameters: 16383 cylinders, 63 sectors per track
°3 platters, 6 heads
°Bytes per sector: 512
°RPM: 7200
°Transfer mode: 66.6 MB/s
°Average seek time: 9.0 ms (read), 9.5 ms (write)
°Average latency: 4.17 ms
°Physical dimension: 1" x 4" x 5.75"
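
The 4.17 ms average latency quoted for the 7200 RPM drive can be checked from the rotation speed alone, assuming (as is conventional) that average rotational latency is the time for half a rotation. The function name below is illustrative, not from the slides.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency: time for half of one rotation, in ms."""
    seconds_per_rotation = 60.0 / rpm
    return 0.5 * seconds_per_rotation * 1000.0


# 7200 RPM is the rotation speed listed for the example drive.
print(round(average_rotational_latency_ms(7200), 2))  # 4.17
```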
°512 byte sector, rotating at 5400 RPM, advertised seek time is 12 ms, transfer
rate is 4 MB/sec, controller overhead is 1 ms, queue is idle so there is no
queuing delay
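
The access time for these parameters can be worked out as the sum of seek time, average rotational latency, transfer time, and controller overhead. The sketch below is one way to do the arithmetic; it assumes 1 MB = 10^6 bytes and an average rotational latency of half a rotation, and the function name and the 25% locality factor (taken from the seek-time discussion earlier) are illustrative choices.

```python
def disk_access_time_ms(seek_ms, rpm, sector_bytes, transfer_mb_per_s,
                        controller_ms, queue_ms=0.0):
    """Sum the components of one sector access, in milliseconds."""
    rotation_ms = 0.5 * (60.0 / rpm) * 1000.0           # half a rotation
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1e6) * 1000.0
    return queue_ms + seek_ms + rotation_ms + transfer_ms + controller_ms


# Using the advertised 12 ms seek time.
advertised = disk_access_time_ms(12.0, 5400, 512, 4.0, 1.0)

# Assuming locality cuts the real seek time to 25% of the advertised figure.
with_locality = disk_access_time_ms(0.25 * 12.0, 5400, 512, 4.0, 1.0)

print(f"advertised seek: {advertised:.2f} ms")
print(f"25% of advertised seek: {with_locality:.2f} ms")
```

With the advertised seek time, seek and rotation dominate and the 512-byte transfer itself contributes only about 0.13 ms.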
°Processor-Memory Bus (design specific or proprietary)
• Short and high speed
• Only need to match the memory system
- Maximize memory-to-processor bandwidth
• Connects directly to the processor
°I/O Bus (industry standard)
• Usually is lengthy and slower
• Need to match a wide range of I/O devices
• Connects to the processor-memory bus or backplane bus
°Backplane Bus (industry standard)
• Backplane: an interconnection structure within the chassis
• Allow processors, memory, and I/O devices to coexist
• Cost advantage: one single bus for all components

[Figure: Processor, Memory, and I/O Devices attached to a single Backplane Bus]
°A single bus (the backplane bus) is used for:
• Processor to memory communication
• Communication between I/O devices and memory
°Advantages: Simple and low cost
°Disadvantages: slow and the bus can become a major bottleneck
°Example: IBM PC
°I/O buses tap into the processor-memory bus via bus adaptors:
• Processor-memory bus: mainly for processor-memory traffic
• I/O buses: provide expansion slots for I/O devices
°Apple Macintosh-II
• NuBus: Processor, memory, and a few selected I/O devices
• SCSI Bus: the rest of the I/O devices

°A small number of backplane buses tap into the processor-memory bus
• Processor-memory bus is used for processor memory traffic
• I/O buses are connected to the backplane bus
°Advantage:
• loading on the processor bus is greatly reduced
• I/O system can be easily expanded
Synchronous and Asynchronous Bus
°Synchronous Bus:
• Includes a clock in the control lines
• A fixed protocol for communication that is relative to the clock
• Advantage: involves very little logic and can run very fast
• Disadvantages:
- Every device on the bus must run at the same clock rate
- To avoid clock skew, they cannot be long if they are fast

Simplest bus paradigm
°All agents operate synchronously
[Figure: two timing diagrams with signals BReq, BG, Address, Read, Req, Ack over t0 to t5; in each, the master asserts the address, followed by the next address]

Transfer in which the slave receives the data:
° t0: Master has obtained control and asserts address, direction, data
° Waits a specified amount of time for slaves to decode target
° t1: Master asserts request line
° t2: Slave asserts ack, indicating data received
° t3: Master releases req
° t4: Slave releases ack

Transfer in which the slave transmits the data:
° t0: Master has obtained control and asserts address, direction, data
° Waits a specified amount of time for slaves to decode target
° t1: Master asserts request line
° t2: Slave asserts ack, indicating ready to transmit data
° t3: Master releases req, data received
° t4: Slave releases ack
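
The t0 to t4 steps of the handshake in which the slave receives data can be traced in a few lines of code. This is a minimal sketch, not a model of any particular bus: the `bus` dictionary, the function name `handshake_write`, and the example address and data values are all hypothetical.

```python
def handshake_write(address, data):
    """Step through t0..t4; return (value latched by the slave, trace).

    Each trace entry is (step, req, ack) after that step completes.
    """
    bus = {"address": None, "data": None, "req": 0, "ack": 0}
    trace = []

    def snapshot(step):
        trace.append((step, bus["req"], bus["ack"]))

    # t0: master drives address and data, then waits for slaves to decode
    bus["address"], bus["data"] = address, data
    snapshot("t0")
    # t1: master asserts the request line
    bus["req"] = 1
    snapshot("t1")
    # t2: slave latches the data and asserts ack (data received)
    latched = bus["data"]
    bus["ack"] = 1
    snapshot("t2")
    # t3: master sees ack and releases req
    bus["req"] = 0
    snapshot("t3")
    # t4: slave sees req released and releases ack
    bus["ack"] = 0
    snapshot("t4")
    return latched, trace


latched, trace = handshake_write(0x40, 0xAB)  # illustrative values
for step, req, ack in trace:
    print(step, "req =", req, "ack =", ack)
```

Each signal change is a response to the previous one, so the two sides need no shared clock; the req/ack pair returns to its idle state at t4, ready for the next transfer.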
Split Bus Transaction (page 666: elaboration)

Multiple Potential Bus Masters: the Need for Arbitration
°Advantage: simple
°Disadvantages:
• Cannot assure fairness:
A low-priority device may be locked out indefinitely
• The use of the daisy chain grant signal also limits the bus speed
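
The fairness problem can be seen in a small sketch. It assumes the usual daisy-chain arrangement, in which the grant signal enters at the highest-priority device and is passed on only by devices that are not requesting the bus; the function name and the request pattern are hypothetical.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that takes the grant, or None.

    requests is ordered from highest priority (index 0) to lowest.
    The grant ripples down the chain and stops at the first requester.
    """
    for position, requesting in enumerate(requests):
        if requesting:
            return position
    return None


# Device 0 requests on every arbitration round; device 2 also requests.
grants = [daisy_chain_grant([True, False, True]) for _ in range(5)]
print(grants)  # device 2 never receives the grant
```

Because the grant must ripple through every device in turn, the time to settle also grows with the length of the chain, which is the speed limitation noted above.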
°Used in essentially all processor-memory busses and in high-speed
I/O busses
[Figure: arbiter with inputs ReqA, ReqB, ReqC and outputs GrantA, GrantB, GrantC, built from a 3-bit D register, priority logic (P0, P1, P2, G1, G2, SetGrB, SetGrC, EN), and clocked JK flip-flops; waveforms for ReqA, ReqB, GrA, GrB against Clk]
Highest priority: ReqA
Lowest Priority: ReqB

J  K  Q(t-1)  Q(t)
0  0    0      0
0  0    1      1
0  1    x      0
1  0    x      1
1  1    0      1
1  1    1      0

[Figure: flip-flop with inputs J, K, D, clk, EN and output Q]
Slots 16 9 10
Busses/system 1 1 1 2
°Examples
• graphics
• fast networks