Secondary Storage Devices (1) :: Magnetic Disks
Magnetic Disks
Lecture 2
Secondary Storage Devices
Storage and Files
Storage has major implications for DBMS design!
READ: transfer data from disk to main memory (RAM).
WRITE: transfer data from RAM to disk.
Both are high-cost operations relative to in-memory operations,
so the DB must be planned carefully!
Why Not Store Everything in Main Memory?
Costs too much: RAM costs about 100 times as much as the
same amount of disk space, so main memory is relatively small.
Main memory is volatile.
Typical storage hierarchy:
Main memory (RAM) (primary storage) for currently
used data.
Disk for the main database (secondary storage).
Tapes for archiving older versions of the data (tertiary
storage).
Storage Hierarchy
Magnetic Disks
A Disk Drive
[Figure: a disk drive, with labels for surfaces, spindle, boom, read/write heads, tracks, and a sector]
Organization of Disks
Components of a Disk
[Figure: components of a disk, with labels for spindle, tracks, disk head, sector, platters, and arm movement]
The platters spin (say, 90 rps).
The arm assembly is moved in or out
to position a head on a desired track.
Tracks under the heads make a cylinder
(imaginary!).
Only one head reads/writes
at any one time.
Disk Controller
Disk controller: typically embedded
in the disk drive, it acts as an
interface between the CPU and the
disk hardware.
Cylinders
A cylinder is the set of tracks at a given radius of a
disk pack.
i.e. a cylinder is the set of tracks that can be accessed without
moving the disk arm.
All the information on a cylinder can be accessed
without moving the read/write arm.
Cylinders
Estimating Capacities
Exercise
Organizing Tracks by sector
[Figure: two tracks of 11 sectors each, with the sectors numbered 1 to 11 in different physical orders]
Exercise
Suppose we want to read the sectors of a track
consecutively, in order: sectors 1, 2, …, 11.
How many revolutions are needed to read the track?
a) Without interleaving
b) With 3:1 interleaving
Note: nowadays most disk controllers are
fast enough so interleaving is not common...
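The exercise above can be explored with a small simulation. This is only a sketch under stated assumptions: the function names are hypothetical, and `busy_slots` models a controller that needs the time of two sector passes after each transfer before it can accept the next sector (the slides do not state this delay).

```python
def interleaved_layout(n_sectors, factor):
    """Return the physical order of logical sectors 1..n for a factor:1 interleave.

    Assumes factor and n_sectors share no common divisor, so every
    physical slot receives exactly one logical sector.
    """
    layout = [0] * n_sectors
    for logical in range(n_sectors):
        layout[(logical * factor) % n_sectors] = logical + 1
    return layout


def slots_to_read_track(layout, busy_slots):
    """Count the sector-times needed to read sectors 1..n in logical order.

    busy_slots is an assumed controller delay: after each transfer, that
    many sectors pass under the head before the next read can start.
    """
    n = len(layout)
    position = {sector: slot for slot, sector in enumerate(layout)}
    start = position[1]
    t = start
    finished = start
    for sector in range(1, n + 1):
        t += (position[sector] - t) % n  # wait for the sector to arrive
        t += 1                           # transfer the sector
        finished = t
        t += busy_slots                  # controller is busy
    return finished - start


plain = interleaved_layout(11, 1)
three_to_one = interleaved_layout(11, 3)
revs_plain = slots_to_read_track(plain, busy_slots=2) / 11
revs_interleaved = slots_to_read_track(three_to_one, busy_slots=2) / 11
print(plain, revs_plain)
print(three_to_one, revs_interleaved)
```

Under these assumptions the track takes 11 revolutions without interleaving and just under 3 revolutions with 3:1 interleaving. With `busy_slots=0` (a fast controller) the plain layout needs a single revolution, which is why interleaving is no longer common.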
Cluster, Extent, and Fragmentation
The file manager is the part of the operating system responsible
for managing files. It maps the logical parts of a file to their
physical locations.
►A cluster is a fixed number of contiguous sectors
►The file manager allocates an integer number of clusters to a file.
An example: Sector size: 512 bytes,
Cluster size: 2 sectors
– If a file contains 10 bytes, a cluster is allocated (1024 bytes).
– There may be unused space in the last cluster of a file.
This unused space contributes to internal fragmentation
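The cluster example above can be checked with a short calculation. This is a minimal sketch; the function name and its default parameters are illustrative, taken from the example's 512-byte sectors and 2-sector clusters.

```python
def cluster_allocation(file_bytes, sector_bytes=512, sectors_per_cluster=2):
    """Return (clusters, allocated_bytes, unused_bytes) for a file.

    The file manager allocates a whole number of clusters, so the
    allocation is rounded up to the next multiple of the cluster size.
    """
    cluster_bytes = sector_bytes * sectors_per_cluster
    clusters = (file_bytes + cluster_bytes - 1) // cluster_bytes  # ceiling division
    allocated = clusters * cluster_bytes
    return clusters, allocated, allocated - file_bytes


for size in (10, 2048, 2049):
    print(size, cluster_allocation(size))
```

For the 10-byte file, one 1024-byte cluster is allocated and 1014 bytes are unused; a 2049-byte file needs three clusters and leaves 1023 bytes unused.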
Extents
If there is a lot of room on a disk, it may be
possible to make a file consist entirely of
contiguous clusters.
Then we say that the file is one extent. (very
good for sequential processing)
If there isn’t enough contiguous space available
to contain an entire file, the file is divided into
two or more noncontiguous parts. Each part is a
separate extent.
Fragmentation
►Due to records not fitting exactly in a sector
– Example: record size = 200 bytes, sector size = 512 bytes
– to avoid having a record span 2 sectors, we can store only 2
records in each sector (112 bytes go unused per sector)
– the alternative is to let a record span two sectors, but in
this case two sectors must be read when we need to
access this record
►Due to the use of clusters
– If the file size is not a multiple of the cluster size, then the
last cluster will be only partially used.
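The trade-off in the record example can be put into numbers. The sketch below uses hypothetical function names, and the file of 1000 records is an assumed size chosen only for illustration.

```python
def unspanned_layout(record_bytes, sector_bytes, n_records):
    """Records never cross a sector boundary; leftover bytes are wasted."""
    per_sector = sector_bytes // record_bytes
    unused_per_sector = sector_bytes - per_sector * record_bytes
    sectors = (n_records + per_sector - 1) // per_sector  # ceiling division
    return per_sector, unused_per_sector, sectors


def spanned_sectors(record_bytes, sector_bytes, n_records):
    """Records are packed back to back and may cross sector boundaries."""
    total_bytes = record_bytes * n_records
    return (total_bytes + sector_bytes - 1) // sector_bytes


print(unspanned_layout(200, 512, 1000))
print(spanned_sectors(200, 512, 1000))
```

With 200-byte records and 512-byte sectors, the unspanned layout stores 2 records per sector and wastes 112 bytes in each; for the assumed 1000 records it needs 500 sectors, against 391 when records are allowed to span sectors.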
Organizing Tracks by Block
Disk tracks may be divided into user-defined blocks
rather than into sectors.
Blocks can be fixed or variable length.
A block is usually organized to hold an integral number
of logical records.
Blocking Factor = number of records stored in a
block.
No internal fragmentation, no record spanning over two
blocks.
In a block-addressing scheme, each block of data may be
accompanied by one or more subblocks containing extra
information about the block: record count, last record
key in the block…
How to Choose Cluster Size
Non-data Overhead
Both blocks and sectors require non-data overhead
(written during formatting)
On sector addressable disks, this information
involves sector address, track address, and
condition (usable/defective). Also pre-formatting
involves placing gaps and synchronization marks
between the sectors.
On block-addressable disks, where a block may be of any
size, more information is needed, and the programmer should
be aware of some of it in order to use it for
better efficiency…
Example
► Disk characteristics
– Block-addressable disk drive
– Size of track = 20,000 bytes
– Nondata overhead per block = 300 bytes
► File characteristics
– Record size = 100 bytes
► How many records can be stored per track for the
following blocking factors?
1. Blocking factor = 10
2. Blocking factor = 60
Solution
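Blocking factor is 10
►Size of data block = 10 * 100 = 1000 bytes
►Size of data block + subblock = 1000 + 300 = 1300 bytes
►Number of blocks that can fit in a track = floor(20000 / 1300) = 15
►Number of records per track = 15 * 10 = 150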
Solution
Blocking factor is 60
►Size of data block + subblock = 60 * 100 + 300 = 6300 bytes
►Number of blocks that can fit in a track = floor(20000 / 6300) = 3
►Number of records per track = 3 * 60 = 180
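Both cases of this example can be checked with a few lines of code. This is a minimal sketch with a hypothetical function name; it assumes that only whole blocks fit on a track and that each block carries the 300-byte nondata overhead given in the example.

```python
def records_per_track(track_bytes, record_bytes, blocking_factor, overhead_bytes):
    """Return (blocks_per_track, records_per_track) on a block-addressable disk."""
    # Each block holds blocking_factor records plus its nondata overhead.
    block_bytes = blocking_factor * record_bytes + overhead_bytes
    blocks = track_bytes // block_bytes  # only whole blocks fit
    return blocks, blocks * blocking_factor


for blocking_factor in (10, 60):
    print(blocking_factor, records_per_track(20_000, 100, blocking_factor, 300))
```

A blocking factor of 10 gives 15 blocks and 150 records per track; a blocking factor of 60 gives 3 blocks and 180 records. Larger blocks spend less of the track on overhead, but more space can be lost at the end of the track.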
The Cost of a Disk Access
The time to access a sector in a track on a surface is divided into 3
components: seek time (s), rotational latency (r), and transfer time.
Average Seek Time (s)-1
Average Seek Time (s)-2
Seek time depends only on the speed with
which the head rack moves, and the number of
tracks that the head must move across to
reach its target.
Given the following (which are constant for a
particular disk):
Hs = the time for the I/O head to start moving
Ht = the time for the I/O head to move from one
track to the next
Then the time for the head to move n tracks is:
Seek(n) = Hs + Ht * n
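The seek formula translates directly into code. In the sketch below, the values Hs = 3 ms and Ht = 0.05 ms are assumptions chosen only for illustration; real values come from the drive's specification.

```python
def seek_time_ms(n_tracks, hs_ms, ht_ms):
    """Seek(n) = Hs + Ht * n, with all times in milliseconds."""
    return hs_ms + ht_ms * n_tracks


# Assumed, illustrative drive constants (not taken from the slides).
HS_MS = 3.0
HT_MS = 0.05

for n in (1, 100, 1000):
    print(n, seek_time_ms(n, HS_MS, HT_MS))
```

The start-up cost Hs dominates short seeks, while the per-track cost Ht * n dominates long ones.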
Rotational Latency(latency)-1
Latency is the time needed for the disk to rotate so
the sector we want is under the read/write head.
Hard disks usually rotate at about 5000-7000
rpm,
i.e. roughly 12 to 8.6 msec per revolution.
E.g. for 7200 rpm, max latency is 8.33 msec.
Note:
Min latency = 0
Max latency = time for one disk revolution
Average latency (r) = (min + max) / 2
= max / 2
= time for ½ disk revolution
Typically 4 to 6 ms on average
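The latency figures follow from the spindle speed alone. A minimal sketch, with hypothetical function names:

```python
def revolution_ms(rpm):
    """Time for one full revolution, which is also the maximum latency."""
    return 60_000 / rpm  # 60,000 ms per minute


def average_latency_ms(rpm):
    """Average latency is the time for half a revolution."""
    return revolution_ms(rpm) / 2


for rpm in (5000, 7200, 10000):
    print(rpm, round(revolution_ms(rpm), 2), round(average_latency_ms(rpm), 2))
```

For 7200 rpm this gives a maximum latency of about 8.33 ms and an average of about 4.17 ms.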
Rotational Latency computation-2
Transfer Time-1
E.g. if there are St sectors per track, the time to transfer one
sector would be 1/St of a revolution.
Transfer Time-2
Exercise
Parameters of Disks (The Cost of a Disk Access)
1- Seek time (s)
This is the time needed to mechanically position the read/write
head on the correct track for movable-head disks.
For fixed-head disks, it is the time needed to electronically
switch to the appropriate read/write head.
For movable-head disks, this time varies, depending on the distance
between the track currently under the read/write head and the track
specified in the block address. Usually, the disk manufacturer
provides an average seek time in milliseconds.
The typical range of average seek time is 4 to 10 msec.
Seek time is the main culprit for the delay involved in transferring
blocks between disk and memory.
2-Rotational delay (rd)
Disk characteristics:
Average seek time = 8 msec
Average rotational delay = 3 msec
Maximum rotational delay = 6 msec
Spindle speed = 10000 rpm
Sectors per track = 170 sectors
Sector size = 512 bytes
What is the average time to read one sector?
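One way to approach this question is to add the three cost components, taking the transfer time of one sector as 1/St of a revolution. The sketch below is one possible working rather than an official solution; the function name is hypothetical.

```python
def average_sector_read_ms(seek_ms, avg_rot_delay_ms, rpm, sectors_per_track):
    """Average seek + average rotational delay + transfer time of one sector."""
    revolution_ms = 60_000 / rpm                    # 6 ms at 10000 rpm
    transfer_ms = revolution_ms / sectors_per_track  # 1/St of a revolution
    return seek_ms + avg_rot_delay_ms + transfer_ms


total = average_sector_read_ms(seek_ms=8, avg_rot_delay_ms=3,
                               rpm=10_000, sectors_per_track=170)
print(round(total, 3))
```

The result is about 11.04 ms, of which the transfer itself is only about 0.035 ms. The revolution time of 6 ms derived from the spindle speed agrees with the stated maximum rotational delay.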
Disk as Bottleneck
Processes are often Disk-Bound, i.e., the network
and the CPU often have to wait inordinate lengths of
time for the disk to transmit data.
With the exponential growth in the performance and
capacity of semiconductor devices and memories,
faster microprocessors with larger and larger
primary memories are continually becoming
available.
To match this growth, it is natural to expect that
secondary storage technology must also take steps to
keep up with processor technology in performance
and reliability.
Assignment 2
Sequential Reading
Given the following disk:
Avg. Seek time, s = 16 ms
Avg. Rot. Latency, r = 8.3 ms
Block ( one sector) transfer time, btt = 8.4 ms
a) Calculate the time to read 10 sequential blocks, on
the same track.
b) Calculate the time to read 10 sequential cylinders,
if there are 200 cylinders, and 20 surfaces.
Random Reading
Fast Sequential Reading-FSR
FSR assumes that blocks are arranged so that
there is no rotational delay in transferring
from one track to another within the same
cylinder. This is possible if consecutive track
beginnings are staggered (like the staggered starts
of races on circular running tracks).
FSR also assumes that consecutive blocks are
arranged so that when the next block is on an
adjacent cylinder, there is no rotational delay
after the arm is moved to the new cylinder.
So, in FSR, assume no rotational delay after
finding the first block.
Assuming Fast Seq. Reading
Formulation of reading b blocks:
i. Sequentially:
s + r + b * btt
s + r is insignificant for large files, where b is very large; thus the time is approximately
b * btt
ii. Randomly:
b * (s + r + btt)
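The two formulas can be compared numerically. This sketch uses the figures from the Sequential Reading exercise (s = 16 ms, r = 8.3 ms, btt = 8.4 ms); the function names are hypothetical.

```python
def sequential_read_ms(b, s, r, btt):
    """One seek and one rotational delay, then b back-to-back transfers."""
    return s + r + b * btt


def random_read_ms(b, s, r, btt):
    """Every block pays its own seek, rotational delay, and transfer."""
    return b * (s + r + btt)


S, R, BTT = 16.0, 8.3, 8.4  # milliseconds, from the Sequential Reading exercise

for b in (10, 1000):
    print(b, round(sequential_read_ms(b, S, R, BTT), 1),
          round(random_read_ms(b, S, R, BTT), 1))
```

For 10 blocks, sequential reading takes about 108.3 ms and random reading about 327 ms. As b grows, s + r becomes a negligible part of the sequential time, so the ratio between the two approaches (s + r + btt) / btt.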
Exercise
Given a file of 30000 records, 1600 bytes each,
and block size 2400 bytes, how does record
placement affect sequential reading time, in the
following cases? Discuss.
i) Empty space in blocks-internal fragmentation.
ii) Records overlap block boundaries.
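As a starting point for the discussion, the number of blocks to read in each case can be counted. This is a sketch under the assumption that sequential reading time is roughly proportional to the number of blocks read (b * btt for large files); the function names are hypothetical.

```python
def unspanned_blocks(n_records, record_bytes, block_bytes):
    """Case i: whole records only, leftover space in each block is empty."""
    per_block = block_bytes // record_bytes
    return (n_records + per_block - 1) // per_block  # ceiling division


def spanned_blocks(n_records, record_bytes, block_bytes):
    """Case ii: records packed back to back across block boundaries."""
    total_bytes = n_records * record_bytes
    return (total_bytes + block_bytes - 1) // block_bytes


def records_crossing_boundary(n_records, record_bytes, block_bytes):
    """Count the records whose first and last bytes lie in different blocks."""
    crossing = 0
    for k in range(n_records):
        first_block = (k * record_bytes) // block_bytes
        last_block = ((k + 1) * record_bytes - 1) // block_bytes
        if first_block != last_block:
            crossing += 1
    return crossing


print(unspanned_blocks(30_000, 1600, 2400))
print(spanned_blocks(30_000, 1600, 2400))
print(records_crossing_boundary(30_000, 1600, 2400))
```

Case i needs 30000 blocks, with 800 bytes empty in each; case ii needs 20000 blocks, so a full sequential scan reads one third fewer blocks. The price is that 10000 of the records straddle two blocks, which costs an extra block read whenever such a record is fetched on its own.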
Exercise
Questions