The document discusses techniques for optimizing disk performance including caching and buffering, redundancy using RAID structures, different methods of disk attachment, implementing stable storage, and tertiary storage devices. It also covers operating system and performance issues related to disk optimization.
ch14 Part2
Chapter 14: Disk Performance Optimization Part 2
Caching and Buffering
Redundancy - RAID Structure
Disk Attachment
Stable-Storage Implementation
Tertiary Storage Devices
Operating System Issues
Performance Issues

Caching and Buffering
Many systems maintain a disk cache buffer, which is a region of main memory that the operating system reserves for disk data.
In one context, the reserved memory acts as a cache, allowing processes quick access to data that would otherwise need to be fetched from disk.
The reserved memory also acts as a buffer, allowing the operating system to delay writing modified data until the disk experiences a light load or until the disk head is in a favorable position to improve I/O performance.
For example, an operating system may delay writing modified data to disk to allow time for multiple requests to contiguous locations to enqueue, so that they can be serviced with one I/O request.
Operating System Concepts 13.2 Silberschatz, Galvin and Gagne 2002

Caching and Buffering
The disk cache buffer presents several challenges to operating system designers.
Because the size of the disk cache must be limited to allow enough memory for active processes, the designer must implement some replacement strategy.
The cache-replacement question is similar to the page-replacement question, and designers use many of the same heuristics. Most commonly, designers choose a strategy that replaces the least-recently used item in the disk cache buffer.
A second concern arises because disk caching can lead to inconsistencies. Disk cache buffers are maintained in volatile memory, so if the system fails or loses power while modified data is in the cache buffer, those changes are lost.

Caching and Buffering
To safeguard data against such problems, the contents of the disk cache buffer are periodically flushed to the hard disk; this reduces the probability of data loss if the system crashes.
A system that employs write-back caching does not write modified data to disk immediately. Instead, the cache is written to disk periodically, enabling the operating system to batch multiple I/Os that are serviced using a single request, which can improve system performance.
A system that employs write-through caching writes data both to the disk cache buffer and to disk each time cached data is modified. This technique prevents the system from batching requests, but reduces the possibility of inconsistent data in the event of a system crash.

Defragmentation
Defragmentation places file data in contiguous blocks on disk, which improves access times by reducing seek activity when accessing sequential data.
Disk reorganization places frequently or heavily used data in favorable locations on disk (e.g., midrange tracks for noncircular scheduling strategies) to reduce average seek times.
Because there is a growing gap between processor speed and disk speed, the reduction in access time gained from data compression (less data has to be transferred) might outweigh the overhead incurred by compressing and decompressing data.
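The write-back and write-through policies and the least-recently-used replacement described under Caching and Buffering can be illustrated with a small sketch. It is not taken from the slides: the class name `DiskCache`, the dict standing in for the disk, and the rule that one run of contiguous block numbers counts as one disk request are all assumptions made for illustration.

```python
from collections import OrderedDict


class DiskCache:
    """Toy LRU disk cache; `disk` is a plain dict standing in for the device."""

    def __init__(self, disk, capacity, write_through=False):
        self.disk = disk
        self.capacity = capacity
        self.write_through = write_through
        self.blocks = OrderedDict()   # block number -> data, least recent first
        self.dirty = set()            # modified blocks not yet written to disk
        self.disk_writes = 0          # write requests issued to the disk

    def _write_to_disk(self, block_numbers):
        # Assumption: a run of contiguous block numbers costs one request.
        previous = None
        for number in sorted(block_numbers):
            self.disk[number] = self.blocks[number]
            if previous is None or number != previous + 1:
                self.disk_writes += 1
            previous = number
        self.dirty -= set(block_numbers)

    def _touch(self, number, data):
        self.blocks[number] = data
        self.blocks.move_to_end(number)            # mark as most recently used
        while len(self.blocks) > self.capacity:
            victim = next(iter(self.blocks))       # least recently used block
            if victim in self.dirty:
                self._write_to_disk([victim])      # never drop unsaved data
            del self.blocks[victim]

    def read(self, number):
        data = self.blocks[number] if number in self.blocks else self.disk[number]
        self._touch(number, data)
        return data

    def write(self, number, data):
        self.dirty.add(number)
        self._touch(number, data)
        if self.write_through:
            self._write_to_disk([number])          # disk updated on every write

    def flush(self):
        self._write_to_disk(list(self.dirty))      # periodic write-back


disk = {}
cache = DiskCache(disk, capacity=8)                # write-back by default
for block in (4, 5, 6, 7):
    cache.write(block, b"data")
cache.flush()
print(cache.disk_writes)                           # 1: four contiguous blocks, one request
```

With `write_through=True` the same four writes would issue four separate requests, which is the batching cost the write-through policy accepts in exchange for a smaller window of inconsistency.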
Defragmentation
If programs generally exhibit spatial locality, and the area of locality is not at the center of the disk, then moving the head to the center of the disk during each idle period requires a wasteful seek to return to the disk's hot spot when requests resume.
A multiprogramming system can service requests from multiple concurrent processes, which may lead to several hot spots on the disk. In this case, it is difficult to determine which, if any, hot spot the read/write head should move to when it is idle.
Redundancy
There are many examples of redundancy being employed in operating systems for a variety of reasons.
A common use of redundancy is creating backups to ensure that if one copy of information is lost, it can be restored.
A multiprocessing system can have a pool of identical processors available to assign to processes and threads as needed. Such redundancy has several advantages. Although the system could still function with only a single processor, having the extra processors yields better performance because the processors can all work in parallel.
Redundancy
It is also effective for fault tolerance: if one processor fails, the system can continue operating.
RAID (Redundant Array of Independent Disks) reduces access times to data on disks by placing redundant copies of that data on separate disks that may function in parallel.
Redundant copies of the data can also be placed on different regions of the same disk, so that both the movement of the read/write head and the amount of rotation needed before the data becomes accessible are minimized, thus increasing performance.
Redundancy, of course, has its price. The resources cost money, and the hardware and software to support them can become more complex. This is yet another example of ...

RAID Structure
RAID – multiple disk drives provide reliability via redundancy.
RAID is arranged into six different levels.
RAID (Cont.)
Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.
Disk striping uses a group of disks as one storage unit.
RAID schemes improve performance and improve the reliability of the storage system by storing redundant data.
Mirroring or shadowing keeps a duplicate of each disk.
Block interleaved parity uses much less redundancy.
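A small sketch may help show why block interleaved parity needs far less redundancy than mirroring: one parity block protects a whole stripe. The helper names `locate` and `xor_blocks`, the three-data-disk layout, and the use of a single dedicated parity block per stripe are assumptions for illustration, not details given on the slide.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def locate(logical_block, data_disks):
    """Map a logical block to (disk index, stripe number) under simple striping."""
    return logical_block % data_disks, logical_block // data_disks


stripe = [b"OPER", b"ATIN", b"G SY"]        # one block on each of three data disks
parity = xor_blocks(stripe)                  # kept on an additional disk

# Suppose disk 1 fails: XOR the surviving blocks with the parity block.
rebuilt = xor_blocks([stripe[0], stripe[2], parity])
print(rebuilt)                               # b'ATIN'
print(locate(7, data_disks=3))               # (1, 2)
```

Mirroring would need three extra disks to protect this stripe; parity needs one, at the cost of extra reads and an XOR on every rebuild.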
RAID Levels
RAID (0 + 1) and (1 + 0)
Disk Attachment
Disks may be attached in one of two ways:
1. Host attached via an I/O port
2. Network attached via a network connection
Network-Attached Storage
Storage-Area Network
Stable-Storage Implementation
A write-ahead log scheme requires stable storage.
To implement stable storage:
Replicate information on more than one nonvolatile storage medium with independent failure modes.
Update information in a controlled manner to ensure that we can recover the stable data after any failure during data transfer or recovery.
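The two requirements above can be sketched as a pair of replicated copies that are written one after the other and reconciled after a crash. This is only an illustration under stated assumptions: the `StableBlock` class, the CRC32 checksum used to detect a damaged copy, and the `crash_before` parameter that simulates a failure are all hypothetical, and the rule for two valid but different copies (copy the second over the first) is one common textbook convention rather than something the slide specifies.

```python
import zlib


def make_record(data):
    """Store data with a CRC32 checksum so a damaged copy is detectable."""
    return {"data": data, "crc": zlib.crc32(data)}


def is_valid(record):
    return zlib.crc32(record["data"]) == record["crc"]


class StableBlock:
    def __init__(self, data=b""):
        # Two copies, assumed to live on media with independent failure modes.
        self.copies = [make_record(data), make_record(data)]

    def write(self, data, crash_before=None):
        """Write copy 0, then copy 1; `crash_before` simulates a crash."""
        for index in (0, 1):
            if crash_before == index:
                return                      # crash: this copy is never written
            self.copies[index] = make_record(data)

    def recover(self):
        first, second = self.copies
        ok_first, ok_second = is_valid(first), is_valid(second)
        if ok_first and not ok_second:
            self.copies[1] = dict(first)    # repair the damaged second copy
        elif ok_second and not ok_first:
            self.copies[0] = dict(second)   # repair the damaged first copy
        elif ok_first and ok_second and first != second:
            self.copies[0] = dict(second)   # interrupted update: undo it
        # Both copies damaged: unrecoverable in this sketch.

    def read(self):
        return self.copies[0]["data"]


block = StableBlock(b"old")
block.write(b"new", crash_before=1)         # crash between the two writes
block.recover()
print(block.read())                         # b'old'
```

After recovery the two copies always agree, so an interrupted write looks to the reader as if it either completed fully or never happened.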
Tertiary Storage Devices
Low cost is the defining characteristic of tertiary storage.
Generally, tertiary storage is built using removable media.
Common examples of removable media are floppy disks and CD-ROMs; other types are available.
Removable Disks
Floppy disk – thin flexible disk coated with magnetic material, enclosed in a protective plastic case.
Most floppies hold about 1 MB; similar technology is used for removable disks that hold more than 1 GB.
Removable magnetic disks can be nearly as fast as hard disks, but they are at a greater risk of damage from exposure.
Removable Disks (Cont.)
A magneto-optic disk records data on a rigid platter coated with magnetic material.
Laser heat is used to amplify a large, weak magnetic field to record a bit.
Laser light is also used to read data (Kerr effect).
The magneto-optic head flies much farther from the disk surface than a magnetic disk head, and the magnetic material is covered with a protective layer of plastic or glass, which makes the disk resistant to head crashes.
Optical disks do not use magnetism; they employ special materials that are altered by laser light.
WORM Disks
The data on read-write disks can be modified over and over.
WORM (“Write Once, Read Many Times”) disks can be written only once.
A WORM disk is a thin aluminum film sandwiched between two glass or plastic platters.
To write a bit, the drive uses laser light to burn a small hole through the aluminum; information can be destroyed but not altered.
WORM disks are very durable and reliable.
Read-only disks, such as CD-ROM and DVD, come from the factory with the data prerecorded.
Tapes
Compared to a disk, a tape is less expensive and holds more data, but random access is much slower.
Tape is an economical medium for purposes that do not require fast random access, e.g., backup copies of disk data, holding huge volumes of data.
Large tape installations typically use robotic tape changers that move tapes between tape drives and storage slots in a tape library.
stacker – library that holds a few tapes
silo – library that holds thousands of tapes
A disk-resident file can be archived to tape for low cost storage; the computer can stage it back into disk storage for active use.
Operating System Issues
Major OS jobs are to manage physical devices and to present a virtual machine abstraction to applications.
For hard disks, the OS provides two abstractions:
Raw device – an array of data blocks.
File system – the OS queues and schedules the interleaved requests from several applications.
Application Interface
Most OSs handle removable disks almost exactly like fixed disks: a new cartridge is formatted and an empty file system is generated on the disk.
Tapes are presented as a raw storage medium, i.e., an application does not open a file on the tape; it opens the whole tape drive as a raw device.
Usually the tape drive is reserved for the exclusive use of that application.
Since the OS does not provide file system services, the application must decide how to use the array of blocks.
Since every application makes up its own rules for how to organize a tape, a tape full of data can generally only be used by the program that created it.

Tape Drives
The basic operations for a tape drive differ from those of a disk drive.
locate positions the tape to a specific logical block, not an entire track (corresponds to seek).
The read position operation returns the logical block number where the tape head is.
The space operation enables relative motion.
Tape drives are “append-only” devices; updating a block in the middle of the tape also effectively erases everything beyond that block.
An EOT mark is placed after a block that is written.
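The operations listed above can be modelled with a toy class. This is a sketch only: `TapeDrive` and its method names mirror the slide's vocabulary but do not correspond to any real driver API, and the EOT mark is represented implicitly as the end of a Python list.

```python
class TapeDrive:
    """Toy model of a tape as an append-only sequence of logical blocks."""

    def __init__(self):
        self.blocks = []      # data written so far; EOT is just past the last block
        self.position = 0     # logical block number under the head

    def locate(self, block_number):
        """Position the tape at an absolute logical block (like a disk seek)."""
        if not 0 <= block_number <= len(self.blocks):
            raise ValueError("cannot locate past the EOT mark")
        self.position = block_number

    def read_position(self):
        """Return the logical block number where the head currently is."""
        return self.position

    def space(self, count):
        """Move relative to the current position; negative counts move backward."""
        self.locate(self.position + count)

    def read(self):
        if self.position >= len(self.blocks):
            raise EOFError("reached the EOT mark")
        data = self.blocks[self.position]
        self.position += 1
        return data

    def write(self, data):
        # Writing mid-tape discards everything beyond the written block,
        # because the EOT mark is placed right after it.
        del self.blocks[self.position:]
        self.blocks.append(data)
        self.position += 1


tape = TapeDrive()
for item in (b"a", b"b", b"c", b"d"):
    tape.write(item)
tape.locate(1)
tape.write(b"X")              # overwrites block 1 and erases blocks 2 and 3
print(tape.blocks)            # [b'a', b'X']
print(tape.read_position())   # 2
```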
File Naming
The issue of naming files on removable media is especially difficult when we want to write data on a removable cartridge on one computer, and then use the cartridge in another computer.
Contemporary OSs generally leave the name space problem unsolved for removable media, and depend on applications and users to figure out how to access and interpret the data.
Some kinds of removable media (e.g., CDs) are so well standardized that all computers use them the same way.
Hierarchical Storage Management (HSM)
A hierarchical storage system extends the storage hierarchy beyond primary memory and secondary storage to incorporate tertiary storage, usually implemented as a jukebox of tapes or removable disks.
HSM usually incorporates tertiary storage by extending the file system.
Small and frequently used files remain on disk.
Large, old, inactive files are archived to the jukebox.
HSM is usually found in supercomputing centers and other large installations that have enormous volumes of data.
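The placement rule above (small or active files stay on disk; large, old, inactive files move to the jukebox) might be expressed as a simple policy function. The `FileInfo` fields and the thresholds of 100 MiB and 90 idle days are hypothetical values chosen for illustration; the slides give no concrete numbers.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    days_since_access: int


def choose_tier(info, min_size=100 * 2**20, min_idle_days=90):
    """Return 'jukebox' for large, inactive files and 'disk' otherwise."""
    if info.size_bytes >= min_size and info.days_since_access >= min_idle_days:
        return "jukebox"
    return "disk"


files = [
    FileInfo("notes.txt", 4_096, 400),            # old but small: stays on disk
    FileInfo("results.dat", 5 * 2**30, 2),        # large but active: stays on disk
    FileInfo("archive-run.dat", 5 * 2**30, 400),  # large and inactive: archived
]
for info in files:
    print(info.name, choose_tier(info))
```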
Speed
Two aspects of speed in tertiary storage are bandwidth and latency.
Bandwidth is measured in bytes per second.
Sustained bandwidth – average data rate during a large transfer, i.e., the number of bytes divided by the transfer time; the data rate when the data stream is actually flowing.
Effective bandwidth – average over the entire I/O time, including seek or locate, and cartridge switching; the drive's overall data rate.
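The difference between the two definitions is easiest to see with numbers. The figures below (a 1 GB transfer taking 100 seconds, plus 60 seconds to locate and 40 seconds to switch cartridges) are made up for illustration and do not describe any particular drive.

```python
def sustained_bandwidth(num_bytes, transfer_seconds):
    """Bytes per second while the data stream is actually flowing."""
    return num_bytes / transfer_seconds


def effective_bandwidth(num_bytes, transfer_seconds,
                        locate_seconds=0.0, switch_seconds=0.0):
    """Bytes per second averaged over the entire I/O time."""
    total_seconds = transfer_seconds + locate_seconds + switch_seconds
    return num_bytes / total_seconds


size = 1_000_000_000                          # hypothetical 1 GB transfer
sustained = sustained_bandwidth(size, 100)
effective = effective_bandwidth(size, 100, locate_seconds=60, switch_seconds=40)
print(sustained / 1e6)                        # 10.0 (MB/s)
print(effective / 1e6)                        # 5.0 (MB/s)
```

In this made-up case, positioning overhead halves the drive's overall data rate even though the streaming rate is unchanged.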
Speed (Cont.)
Access latency – amount of time needed to locate data.
Access time for a disk – move the arm to the selected cylinder and wait for the rotational latency; less than 35 milliseconds.
Access on tape requires winding the tape reels until the selected block reaches the tape head; tens or hundreds of seconds.
As a rule of thumb, random access within a tape cartridge is about a thousand times slower than random access on disk.
The low cost of tertiary storage is a result of having many cheap cartridges share a few expensive drives.
A removable library is best devoted to the storage of infrequently used data, because the library can only satisfy a relatively small number of I/O requests per hour.

Reliability
A fixed disk drive is likely to be more reliable than a removable disk or tape drive.
An optical cartridge is likely to be more reliable than a magnetic disk or tape.
A head crash in a fixed hard disk generally destroys the data, whereas the failure of a tape drive or optical disk drive often leaves the data cartridge unharmed.
Cost
Main memory is much more expensive than disk storage.
The cost per megabyte of hard disk storage is competitive with magnetic tape if only one tape is used per drive.
The cheapest tape drives and the cheapest disk drives have had about the same storage capacity over the years.
Tertiary storage gives a cost savings only when the number of cartridges is considerably larger than the number of drives.
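The last point follows from simple arithmetic: the drive is a fixed cost that is spread over however many cartridges share it. The sketch below uses invented prices (a drive costing 1000, cartridges costing 20 each and holding 10,000 MB) purely to show the trend; none of these numbers come from the slides.

```python
def cost_per_mb(drives, cartridges, drive_cost, cartridge_cost, cartridge_mb):
    """Total hardware cost divided by total capacity, in currency units per MB."""
    total_cost = drives * drive_cost + cartridges * cartridge_cost
    total_mb = cartridges * cartridge_mb
    return total_cost / total_mb


# Hypothetical prices: one drive shared by a growing number of cartridges.
for count in (1, 10, 100):
    print(count, cost_per_mb(1, count, drive_cost=1000,
                             cartridge_cost=20, cartridge_mb=10_000))
```

With a single cartridge the drive dominates the cost per megabyte; with a hundred cartridges the figure approaches the cost of the media alone.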