Cs 5226 Week 8
Hardware Tuning
Application Programmer
(e.g., business analyst, Data architect)
Application
Sophisticated Application Programmer
(e.g., SAP admin)
Concurrency Control
DBA, Tuner
Recovery
Outline
Magnetic Disks
tracks spindle
platter
read/write head
1980: SEAGATE first 5.25" disk drive: 5 MB, 1.96 Mb/in², 625 KB/sec
1999: IBM MICRODRIVE first 1" disk drive: 340 MB, 6.1 MB/sec
actuator
disk arm
Controller
disk interface
Magnetic Disks
Controller overhead (0.2 ms)
Seek Time (4 to 9 ms)
Rotational Delay (2 to 6 ms)
Read/Write Time (10 to 500 KB/ms)
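The four components above add up to the time needed to service one request. A minimal sketch of that sum, assuming mid-range values picked from the ranges on the slide (the function name and the chosen values are illustrative, not measurements):

```python
def access_time_ms(size_kb, seek_ms, rotational_ms, transfer_kb_per_ms,
                   controller_ms=0.2):
    """Estimated time to service one disk request, in milliseconds."""
    transfer_ms = size_kb / transfer_kb_per_ms
    return controller_ms + seek_ms + rotational_ms + transfer_ms


# Hypothetical mid-range disk: 6 ms seek, 4 ms rotational delay, 100 KB/ms.
random_page = access_time_ms(8, seek_ms=6.0, rotational_ms=4.0,
                             transfer_kb_per_ms=100.0)
sequential_run = access_time_ms(1000, seek_ms=6.0, rotational_ms=4.0,
                                transfer_kb_per_ms=100.0)

print(f"8 KB random page:      {random_page:.2f} ms")
print(f"1000 KB sequential run: {sequential_run:.2f} ms")
```

Under these assumptions the 8 KB read is dominated by seek and rotational delay, which is why sequential IO moves far more data per millisecond than random IO.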
Disk Interface
IDE (16 bits, Ultra DMA 25 MHz)
SCSI: width (narrow 8 bits vs. wide 16 bits), frequency (Ultra3: 80 MHz)
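Width and frequency together bound the peak bandwidth of an interface. A rough sketch, assuming one transfer per clock cycle (a simplification; real interfaces have protocol overhead, and the function name is illustrative):

```python
def peak_bandwidth_mb_per_s(width_bits, frequency_mhz):
    """Upper bound on bus bandwidth, assuming one transfer per clock cycle."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * frequency_mhz


wide_ultra3 = peak_bandwidth_mb_per_s(16, 80)   # wide SCSI at 80 MHz
narrow_same_clock = peak_bandwidth_mb_per_s(8, 80)  # narrow SCSI, same clock

print(wide_ultra3, narrow_same_clock)
```

Doubling the width at the same frequency doubles the peak bandwidth.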
Storage Metrics
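[Table: storage metrics per device; only the column label "DRAM", the row label "Unit Capacity", and the values 40 (up to 160), 470, 23, 450 survive]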
Hardware Bandwidth
System Bandwidth Yesterday
[Figure: system bandwidth in megabytes per second (not to scale!); surviving values: 40, 15 per disk, 422]
Hardware Bandwidth
System Bandwidth Today
[Figure: system bandwidth along the path Hard Disk | SCSI | PCI | Memory | Processor, in megabytes per second (not to scale!); surviving values: 26, 26, 26, 160, 133, 1,600]
In practice, 3 disks can reach saturation using sequential IO
RAID
Combines multiple small, inexpensive disk drives into a group to yield performance exceeding that of one large, more expensive drive
Appears to the computer as a single virtual drive
Supports fault tolerance by redundantly storing information in various ways
RAID 0 - Striping
No redundancy
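Striping spreads consecutive stripe units round-robin across the disks. A minimal sketch of the address mapping, assuming a simple round-robin layout (the function and parameter names are illustrative; real controllers may use other layouts):

```python
def locate_block(logical_block, num_disks, stripe_unit_blocks=1):
    """Map a logical block number to (disk index, block offset on that disk)."""
    stripe_unit = logical_block // stripe_unit_blocks
    offset_in_unit = logical_block % stripe_unit_blocks
    disk = stripe_unit % num_disks            # round-robin across disks
    unit_on_disk = stripe_unit // num_disks   # how many units precede it there
    return disk, unit_on_disk * stripe_unit_blocks + offset_in_unit


# Eight consecutive logical blocks over four disks, one block per stripe unit.
layout = [locate_block(b, num_disks=4) for b in range(8)]
print(layout)
```

Consecutive blocks land on different disks, so a large sequential request can keep all arms busy; since there is no redundancy, losing any one disk loses part of every large file.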
RAID 1 Mirroring
One write = a physical write on each disk
One read = either read both or read the less busy one
Fast read/write
All disk arms are synchronized
Speed is limited by the slowest disk
Parity bit: an extra bit added to a byte to detect errors in storage or transmission
Even (odd) parity means that the parity bit is set so that there is an even (odd) number of one bits in the word, including the parity bit
A single parity bit reliably detects only single-bit errors (more generally, an odd number of flipped bits): if an even number of bits are wrong, the parity bit does not change
It is not possible to tell which bit is wrong
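A small sketch of the parity rule described above, showing that a single flipped bit is detected while two flipped bits go unnoticed (the helper names are illustrative):

```python
def parity_bit(data, even=True):
    """Parity bit that makes the total number of one bits even (or odd)."""
    ones = bin(data).count("1")
    bit = ones % 2              # 1 only if the data has an odd number of ones
    return bit if even else bit ^ 1


def passes_check(data, stored_bit, even=True):
    """True if the data still agrees with the stored parity bit."""
    return parity_bit(data, even) == stored_bit


word = 0b1011001                     # four one bits
stored = parity_bit(word)            # even parity

single_error = word ^ 0b0000100      # one bit flipped
double_error = word ^ 0b0000110      # two bits flipped

print(passes_check(word, stored))          # intact word
print(passes_check(single_error, stored))  # single-bit error
print(passes_check(double_error, stored))  # double-bit error
```

The check only says whether the word is consistent with its parity bit; it gives no hint about which bit changed.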
For error detection, rather than full redundancy
Each stripe unit has an extra parity stripe
RAID 5 Read/Write
Read: good performance
Write:
Read old data stripe; read parity stripe (2 reads)
XOR old data stripe with new data stripe
XOR result into parity stripe
Write new data stripe and new parity stripe (2 writes)
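The write sequence above can be sketched with byte strings standing in for stripes. This is an illustrative sketch, not a controller implementation; the helper names and the sample bytes are made up:

```python
def xor_blocks(a, b):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data, old_parity, new_data):
    """Update one data stripe and its parity using 2 reads and 2 writes."""
    delta = xor_blocks(old_data, new_data)     # XOR old data with new data
    new_parity = xor_blocks(old_parity, delta) # XOR result into parity
    return new_data, new_parity                # the two stripes to write back


# Three data stripes and their parity stripe (sample values).
d0, d1, d2 = b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"
parity = xor_blocks(xor_blocks(d0, d1), d2)

# Overwrite d1 without touching d0 or d2.
new_d1 = b"\xff\x00"
_, new_parity = small_write(d1, parity, new_d1)

# If the disk holding d0 fails, its content is the XOR of everything else.
recovered_d0 = xor_blocks(xor_blocks(new_d1, d2), new_parity)
print(recovered_d0 == d0)
```

The update never reads the other data stripes, which is why a small write costs two reads and two writes regardless of how many disks are in the array.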
RAID 10: the high performance of RAID 0 and the high fault tolerance of RAID 1 (at the cost of doubling the disks)
Software RAID
Software RAID: runs on the server's CPU
Directly dependent on server CPU performance and load
Occupies host system memory and CPU cycles, degrading server performance
Hardware RAID: runs on the RAID controller's CPU
Does not occupy any host system memory
Is not operating system dependent
Host CPU can execute applications while the array adapter's processor simultaneously executes array functions: true hardware multi-tasking
Hardware RAID
100000 rows, cold buffer
Dual Xeon (550MHz, 512Kb), 1Gb RAM, internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000
RAID Levels
[Figure: throughput (tuples/sec, axis from 0 to 80000) for SoftRAID5, RAID5, RAID0, RAID10, RAID1, and Single Disk; two panels, Read-Intensive and Write-Intensive]
Read-Intensive: Using multiple disks (RAID0, RAID 10, RAID5) increases throughput significantly.
Write-Intensive: Without cache, RAID 5 suffers. With cache, it is ok.
RAID 1 | 2X | 1X | Yes | Low | Use double the disk space | Very high I/O performance
RAID 5 | High | Medium | Yes | High | Lower throughput with disk failure | A good overall balance
RAID 10 | High | High | Yes | Low | Very expensive, not scalable | High reliability with good performance
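The disk-space cost noted in the comparison can be made concrete. A minimal sketch of usable capacity per level, assuming two-way mirroring for RAID 1 and RAID 10 and one disk's worth of parity for RAID 5 (the function name is illustrative; the 4 x 18 GB figures reuse the drive sizes from the experiment setup):

```python
def usable_capacity_gb(level, num_disks, disk_gb):
    """Usable capacity of an array, under the assumptions stated above."""
    total = num_disks * disk_gb
    if level == "RAID0":
        return total                 # no redundancy
    if level in ("RAID1", "RAID10"):
        return total // 2            # every block stored twice (assumed)
    if level == "RAID5":
        return total - disk_gb       # one disk's worth of parity
    raise ValueError(f"unknown RAID level: {level}")


for level in ("RAID0", "RAID5", "RAID10"):
    print(level, usable_capacity_gb(level, num_disks=4, disk_gb=18), "GB")
print("RAID1", usable_capacity_gb("RAID1", num_disks=2, disk_gb=18), "GB")
```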
Read-ahead: prefetching at the disk controller level. No information on access pattern. Better to let the database management system do it.
Write-back: transfer terminated as soon as data is written to cache.
Cache friendly: update of 20,000 rows (~90Mb)
Cache unfriendly: update of 200,000 rows (~900Mb)
[Figure: update throughput with and without controller cache, for cache-friendly (90Mb) and cache-unfriendly (900Mb) updates; axis ticks at 500, 1000, 1500]
Controller cache increases throughput whether the operation is cache friendly or not.
Log File
RAID 1 is appropriate. Fault tolerance with high write throughput. Writes are synchronous and sequential. No benefits in striping.
Temporary Files
RAID 0 is appropriate.
Data and Index Files
RAID 5 is best suited for read-intensive apps or if the RAID controller cache is effective enough. RAID 10 is best suited for write-intensive apps.
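These recommendations can be read as a small decision rule. A hedged sketch, assuming the RAID 5 versus RAID 10 guidance applies to data and index files (the function name, the file-kind labels, and the flags are illustrative, not part of any tool):

```python
def recommend_raid(file_kind, write_intensive=False,
                   effective_controller_cache=False):
    """Suggest a RAID level following the slide's rules of thumb."""
    if file_kind == "log":
        return "RAID 1"      # sequential synchronous writes, needs fault tolerance
    if file_kind == "temp":
        return "RAID 0"
    if file_kind == "data":  # assumption: data and index files
        if write_intensive and not effective_controller_cache:
            return "RAID 10"
        return "RAID 5"
    raise ValueError(f"unknown file kind: {file_kind}")


print(recommend_raid("log"))
print(recommend_raid("temp"))
print(recommend_raid("data", write_intensive=True))
print(recommend_raid("data", write_intensive=True,
                     effective_controller_cache=True))
```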
Fault tolerance
It does not prevent disk drive failures
It enables real-time data recovery
High I/O performance
Mass data capacity
Configuration flexibility
Lower protected storage costs
Easy maintenance
Add memory
Cheapest option to get better performance
Can be used to enlarge the DB buffer pool
Better hit ratio
If used to enlarge the OS buffer (as disk cache), it benefits the DBMS but other apps as well
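Why a better hit ratio matters can be seen from the average cost of a page access. A rough sketch, assuming a hypothetical 0.0001 ms memory access and a 10 ms disk access (both values and the function name are illustrative):

```python
def avg_access_ms(hit_ratio, memory_ms, disk_ms):
    """Average page access time for a given buffer hit ratio."""
    return hit_ratio * memory_ms + (1 - hit_ratio) * disk_ms


before = avg_access_ms(0.90, memory_ms=0.0001, disk_ms=10.0)
after = avg_access_ms(0.99, memory_ms=0.0001, disk_ms=10.0)

print(f"hit ratio 0.90: {before:.4f} ms per access")
print(f"hit ratio 0.99: {after:.4f} ms per access")
```

Because a miss is so much more expensive than a hit, the average is driven almost entirely by the miss rate.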
Add Disks
A dedicated disk for the log
Switch RAID5 to RAID10 for update-intensive apps
Move secondary indexes to another disk for write-intensive apps
Partition read-intensive tables across many disks
Automatic replication and load balancing
Add Processors
Function parallelism
GUI, Query Optimisation, TT&CC, different types of apps, different users
Operation pipelines: e.g., scan, join, sum, min
Parallelism
E.g., join phase of GRACE hash join
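The GRACE hash join example can be sketched as follows: both inputs are partitioned on the join key, after which each pair of partitions can be joined independently. This is a simplified in-memory sketch in which a thread pool stands in for multiple processors (in CPython, threads illustrate the structure rather than deliver real CPU parallelism); all names are illustrative:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_partitions):
    """Partition phase: route each row by a hash of its join key (column 0)."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[hash(row[0]) % num_partitions].append(row)
    return parts


def join_partition(r_part, s_part):
    """Join phase for one pair of partitions: build on R, probe with S."""
    table = defaultdict(list)
    for r in r_part:
        table[r[0]].append(r)
    return [r + s[1:] for s in s_part for r in table.get(s[0], [])]


def grace_hash_join(r_rows, s_rows, num_partitions=4):
    r_parts = partition(r_rows, num_partitions)
    s_parts = partition(s_rows, num_partitions)
    # Matching keys always fall in the same partition pair, so the pairs
    # can be joined independently of one another.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = pool.map(join_partition, r_parts, s_parts)
    return [row for part in results for row in part]


r = [(1, "a"), (2, "b"), (3, "c")]
s = [(1, "x"), (3, "y"), (3, "z"), (4, "w")]
joined = sorted(grace_hash_join(r, s))
print(joined)
```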
Summary
We have covered:
RAID: what are they and which one to use?
When to add what?
Database Tuning
Database Tuning is the activity of making a database application run more quickly. More quickly usually means higher throughput, though it may mean lower response time for time-critical applications.
Tuning Principles
Think globally, fix locally
Partitioning breaks bottlenecks (temporal and spatial)
Start-up costs are high; running costs are low
Render unto server what is due unto server
Be prepared for trade-offs (indexes and inserts)
1. Set reasonable performance tuning goals
2. Measure and document current performance
3. Identify current system performance bottleneck
4. Identify current OS bottleneck
5. Tune the required components, e.g., application, DB, I/O, contention, OS
6. Track and exercise change-control procedures
7. Measure and document current performance
8. Repeat steps 3 through 7 until the goal is met
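The procedure above is a measure, fix, re-measure loop. A toy sketch of its control flow, in which the callbacks, the simulated system, and all numbers are hypothetical stand-ins for real measurement and tuning work:

```python
def tune_until_goal(measure, find_bottleneck, apply_fix, goal, max_rounds=10):
    """Repeat identify/tune/measure until the goal is met or rounds run out."""
    history = [measure()]                  # measure and document the baseline
    rounds = 0
    while history[-1] < goal and rounds < max_rounds:
        bottleneck = find_bottleneck()     # identify the current bottleneck
        apply_fix(bottleneck)              # tune, under change control
        history.append(measure())          # measure and document again
        rounds += 1
    return history


# Simulated system: each fix doubles throughput (purely illustrative).
system = {"throughput": 100,
          "pending": ["log on shared disk", "missing index"]}


def measure():
    return system["throughput"]


def find_bottleneck():
    return system["pending"][0]


def apply_fix(bottleneck):
    system["pending"].remove(bottleneck)
    system["throughput"] *= 2


history = tune_until_goal(measure, find_bottleneck, apply_fix, goal=350)
print(history)
```

Keeping the full measurement history, rather than only the latest value, is what makes it possible to tell which change actually helped.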
Tuning Mindset
Goals Met?
Appreciation of DBMS architecture
Study the effect of various components on the performance of the systems
Tuning principles
Troubleshooting techniques for chasing down performance problems
Hands-on experience in tuning