CH 12
Database System Concepts - 7th Edition 12.2 ©Silberschatz, Korth and Sudarshan
Storage Hierarchy
Storage Terminology
Disks
Magnetic Disks
Magnetic Hard Disk Mechanism
Magnetic Disk Drives
Magnetic Disk Performance
The First Commercial Disk Drive
1956
IBM RAMAC computer included the IBM Model 350 disk storage system
5M (7-bit) characters
50 × 24-inch platters
Access time < 1 second
Storage Interfaces
Storage Attachment
Network-Attached Storage
Cloud Storage
Disk Controller
Interfaces between the computer system and the disk drive hardware
Accepts high-level commands to read or write a sector
Initiates actions such as moving the disk arm to the right track
and reading or writing the data
Computes and attaches checksums to each sector to verify that
data is read back correctly
• If data is corrupted, with very high probability stored
checksum won’t match recomputed checksum
Ensures successful writing by reading back sector after writing it
Performs remapping of bad sectors
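The checksum step above can be sketched in a few lines. This is only an illustration: it assumes CRC-32 (via Python's `zlib.crc32`) as the checksum, whereas real controllers use their own error-detecting and error-correcting codes in firmware. The names `write_sector` and `read_sector` are hypothetical.

```python
import zlib

def write_sector(data: bytes) -> tuple[bytes, int]:
    """Return the sector payload together with the checksum stored alongside it."""
    return data, zlib.crc32(data)

def read_sector(stored: tuple[bytes, int]) -> bytes:
    """Recompute the checksum on read and compare it with the stored one."""
    data, checksum = stored
    if zlib.crc32(data) != checksum:
        raise IOError("checksum mismatch: sector is corrupted")
    return data

# A clean sector reads back unchanged.
sector = write_sector(b"A" * 512)
assert read_sector(sector) == b"A" * 512

# Flip the first byte of the payload but keep the old checksum:
# the recomputed checksum no longer matches the stored one.
corrupted = (b"B" + sector[0][1:], sector[1])
try:
    read_sector(corrupted)
    detected = False
except IOError:
    detected = True
```

On a mismatch a controller would typically retry the read; here the sketch simply raises an error.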
Performance Measures of Disks
Access time – the time it takes from when a read or write request is issued
to when data transfer begins. Consists of:
Seek time – time it takes to reposition the arm over the correct track.
Average seek time is 1/2 the worst case seek time.
– Would be 1/3 if all tracks had the same number of sectors, and
we ignore the time to start and stop arm movement
4 to 10 milliseconds on typical disks
Rotational latency – time it takes for the sector to be accessed to
appear under the head.
One full rotation takes 4 to 11 milliseconds on typical disks (5400 to 15000 r.p.m.)
Average latency is 1/2 of the full rotation time.
Overall latency is 5 to 20 msec depending on disk model
Data-transfer rate – the rate at which data can be retrieved from or stored
to the disk.
25 to 200 MB per second max rate, lower for inner tracks
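The rotational-latency figures above follow directly from the rotation speed. A minimal sketch, assuming the average access time is simply average seek time plus average rotational latency; the function names are illustrative only.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full rotation, in milliseconds."""
    return 60_000.0 / rpm

def average_rotational_latency_ms(rpm: float) -> float:
    """On average the head waits half a rotation for the target sector."""
    return rotation_time_ms(rpm) / 2

def average_access_time_ms(avg_seek_ms: float, rpm: float) -> float:
    """Average seek time plus average rotational latency."""
    return avg_seek_ms + average_rotational_latency_ms(rpm)

for rpm in (5400, 7200, 15000):
    print(rpm, "rpm:",
          round(rotation_time_ms(rpm), 1), "ms per rotation,",
          round(average_rotational_latency_ms(rpm), 1), "ms average latency")
```

At 15000 r.p.m. one rotation takes 4 ms, and at 5400 r.p.m. about 11.1 ms, which matches the 4 to 11 millisecond range quoted above.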
Mean Time to Failure (MTTF)
Disk Access
Solid State Disks (SSD)
SSD Performance Metrics
Improvement of Reliability via Redundancy
Redundancy – store extra information that can be used to rebuild
information lost in a disk failure
Mirroring (or shadowing)
Duplicate every disk. Logical disk consists of two physical disks.
Every write is carried out on both disks
Reads can take place from either disk
If one disk in a pair fails, data still available in the other
Data loss will occur only if a disk fails, and its mirror disk also fails
before the system is repaired
Probability of combined event is very small
– Except for dependent failure modes such as fire or building
collapse or electrical power surges
Mean time to data loss depends on mean time to failure,
and mean time to repair
E.g., MTTF of 100,000 hours, mean time to repair of 10 hours gives
mean time to data loss of 500 × 10^6 hours (or 57,000 years) for a
mirrored pair of disks (ignoring dependent failure modes)
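The figure in this example can be reproduced with a short calculation. The sketch below assumes independent failures and uses the approximation MTTF² / (2 × MTTR) for a mirrored pair, which is consistent with the numbers quoted above; the function name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365

def mean_time_to_data_loss(mttf_hours: float, mttr_hours: float) -> float:
    """Approximate mean time to data loss for a mirrored pair of disks.

    Assumes independent failures: data is lost only if the second disk
    fails during the repair window of the first.
    """
    return mttf_hours ** 2 / (2 * mttr_hours)

mttdl_hours = mean_time_to_data_loss(100_000, 10)
mttdl_years = mttdl_hours / HOURS_PER_YEAR

print(mttdl_hours)          # 500 million hours
print(round(mttdl_years))   # roughly 57,000 years
```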
Improvement in Performance via Parallelism
Bit-level striping – split the bits of each byte across multiple disks
In an array of eight disks, write bit i of each byte to disk i.
Each access can read data at eight times the rate of a single disk.
But seek/access time worse than for a single disk
Bit-level striping is not used much any more
Block-level striping – with n disks, block i of a file goes to disk
(i mod n) + 1
Requests for different blocks can run in parallel if the blocks reside
on different disks
A request for a long sequence of blocks can utilize all disks in
parallel
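The block-to-disk mapping for block-level striping can be sketched as follows. Disks are numbered from 1 as in the formula above; the offset within a disk (`i // n`) is an assumption added for illustration, and the function names are hypothetical.

```python
def disk_for_block(i: int, n: int) -> int:
    """Disk number (1-based) holding logical block i in an n-disk array."""
    return (i % n) + 1

def offset_on_disk(i: int, n: int) -> int:
    """Assumed position of block i on its disk: one stripe per n logical blocks."""
    return i // n

n = 4
layout = [(i, disk_for_block(i, n), offset_on_disk(i, n)) for i in range(8)]
for block, disk, offset in layout:
    print(f"block {block} -> disk {disk}, offset {offset}")
```

With four disks, blocks 0 to 3 land on disks 1 to 4, and blocks 4 to 7 wrap around to disks 1 to 4 again, so a sequential read of n consecutive blocks touches every disk once.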
Redundant Arrays of Independent Disks (RAID)
RAID Levels
RAID Level 0
For a file with 20 blocks, block 1 goes to the first disk, block 2 goes to
the second disk, and so on.
RAID Level 1
RAID Level 5
RAID Level 5 (Cont.)
Block writes occur in parallel if the blocks and their parity blocks are on
different disks
RAID Level 5 (Cont.)
RAID Level 6
RAID Levels Not Used in Practice
Factors in Choosing RAID level
Monetary cost
Performance: Number of I/O operations per second, and
bandwidth during normal operation
Performance during failure
Performance during rebuild of failed disk
Including time taken to rebuild failed disk
Choice of RAID Level
Optimization of Disk-Block Access
End of Chapter 12
RAID Level 5
Parity blocks: Parity block i stores XOR of bits from block i of each disk
When writing data to a block i, parity block i must also be computed
and written to disk
Can be done by using old parity block, old value of current block
and new value of current block (2 block reads + 2 block writes)
Or by recomputing the parity value using the new values of blocks
corresponding to the parity block
– More efficient for writing large amounts of data sequentially
To recover data for a block, compute XOR of bits from all other blocks
in the set including the parity block
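The parity rules above can be checked with a small sketch. It treats each block as a short byte string and keeps all blocks in memory; the helper `xor_blocks` and the sample data are illustrative only, not an actual RAID implementation.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Block i from each of three data disks, and the matching parity block.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(*data)

# Small write: update one block using the old parity, the old value of the
# block, and its new value (2 block reads + 2 block writes).
new_block = b"\xff\x00"
new_parity = xor_blocks(parity, data[1], new_block)
data[1] = new_block

# The incremental update agrees with recomputing parity from all blocks.
assert new_parity == xor_blocks(*data)

# Recovery: if the disk holding data[0] fails, XOR the surviving blocks
# in the set, including the parity block.
recovered = xor_blocks(data[1], data[2], new_parity)
assert recovered == data[0]
```

The incremental update works because XOR-ing the old block value into the parity cancels its contribution, after which the new value is XOR-ed in.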