Hard Disk Drive: Cassette Tape
Introduction
We know that the data in RAM is volatile. Hence it is necessary to store data on
some non-volatile medium for later use. Floppy disks provide one option, but their
capacity is limited. Hard disks were therefore developed to store large amounts of
data reliably and to access that data quickly. Hard disks are classed as
permanent, secondary storage devices. They are also referred to as fixed disks or
Winchester disk drives (WDD). A hard disk requires a +5 V supply for the drive electronics and
a +12 V supply for the motors.
Hard disks were invented in the 1950s. They started as large disks up to 20 inches in
diameter holding just a few megabytes. They were originally called "fixed disks" or
"Winchesters" (a code name used for a popular IBM product). They later became known as
"hard disks" to distinguish them from "floppy disks." Hard disks have a hard platter that holds
the magnetic medium, as opposed to the flexible plastic film found in tapes and floppies.
At the simplest level, a hard disk is not that different from a cassette tape. Both hard disks
and cassette tapes use the same magnetic recording techniques. Hard disks and cassette
tapes also share the major benefits of magnetic storage - the magnetic medium can be
easily erased and rewritten, and it will "remember" the magnetic flux patterns stored onto the
medium for many years.
Let's look at the big differences between cassette tapes and hard disks:
• The magnetic recording material on a cassette tape is coated onto a thin plastic strip.
In a hard disk, the magnetic recording material is layered onto a high-precision
aluminum or glass disk. The hard disk platter is then polished to mirror smoothness.
• With a tape, you have to fast-forward or reverse through the tape to get to any
particular point on the tape. This can take several minutes with a long tape. On a
hard disk you can move to any point on the surface of the disk almost instantly.
• In a cassette tape deck, the read/write head touches the tape directly. In a hard disk
the read/write head "flies" over the disk, never actually touching it.
• The tape in a cassette tape deck moves over the head at about 2 inches (about 5.08
cm) per second. A hard disk platter can spin underneath its head at speeds up to
3,000 inches per second (about 170 mph or 274 km/h)!
• The information on a hard disk is stored in extremely small magnetic domains
compared to a cassette tape's. The size of these domains is made possible by the
precision of the platter and the speed of the media.
Because of these differences, a modern hard disk is able to store an amazing amount of
information in a small space. A hard disk can also access any of its information in a fraction
of a second.
A typical desktop machine will have a hard disk with a capacity of between 10 and 40
gigabytes. Data is stored onto the disk in the form of files. A file is simply a named collection
of bytes. The bytes might be the ASCII codes for the characters of a text file, or they could
be the instructions of a software application for the computer to execute, or they could be the
records of a database, or they could be the pixel colors for a GIF image. No matter what it
Page 12 -1
HCL Infosystems Ltd
contains, however, a file is simply a string of bytes. When a program running on the
computer requests a file, the hard disk retrieves its bytes and sends them to the CPU one at
a time.
There are two ways to measure the performance of a hard disk:
• The data rate - the number of bytes per second that the drive can deliver to the CPU.
Rates between 5 and 40 megabytes per second are common.
• The seek time - the amount of time between when the CPU requests a
file and when the first byte of the file is sent to the CPU. Times between 10 and
20 milliseconds are common.
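The two figures above can be combined into a rough estimate of how long a drive takes to deliver a file. The sketch below is only an approximation under stated assumptions: the file is stored contiguously, a single seek is needed, and 1 MB is taken as 1,000,000 bytes. The function name and the sample values are illustrative and do not come from any drive specification.

```python
def estimated_read_time_ms(file_size_bytes, seek_time_ms, data_rate_mb_per_s):
    """Rough delivery time for one file: one seek plus a sequential transfer."""
    # Assumption: 1 MB = 1,000,000 bytes, so MB/s * 1000 gives bytes per millisecond.
    bytes_per_ms = data_rate_mb_per_s * 1_000_000 / 1000
    transfer_ms = file_size_bytes / bytes_per_ms
    return seek_time_ms + transfer_ms


# A 1 MB file on a drive with a 15 ms seek time and a 20 MB/s data rate.
print(estimated_read_time_ms(1_000_000, 15, 20))  # 65.0
```

For small files the seek time dominates the total, while for large files the data rate matters more.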
Drive Mechanism
A hard disk is made of one or more circular platters. A platter is commonly
made of aluminium. The platters are precisely machined to an extremely fine tolerance to
make them flat and smooth. Each platter is coated with a magnetic medium on which
data is recorded. The magnetic medium is present on both surfaces of a platter. The diameter
of the platter determines the physical size of the drive mechanism. Nowadays 3.5-inch platters are used.
The platters are mounted on a shaft called the spindle. The spindle connects to the spindle
motor. The spindle motor is a servo-controlled DC motor. A servo-controlled motor uses
feedback to maintain a constant and accurate rotation rate. A sensor in the disk drive
constantly monitors how fast the drive spins and adjusts the spin rate.
Unlike floppy disks, hard disk platters are kept spinning constantly. This is
necessary as the hard disk platters spin at high speed and have large inertia. If the spindle is
stopped, it will take some time for the platters to reach the required speed due to inertia,
which will increase access time. Constant spinning allows immediate access of data.
Read/Write Heads
Data is written to and read from the disk using read/write heads. Since
data is recorded on both surfaces of a platter, each platter has two read/write heads, one for
each surface. Each head is
flexibly connected to a rigid arm, which supports the assembly. All the arms are linked to
form a single moving unit. Since the platters spin at such high speeds (3600 rpm and above),
the head does not touch the media. In fact, it floats about 6 microinches above the platter
surface. The head generates a magnetic field corresponding to the current fed to it. This field
in turn orients the magnetic domains in the media when data is stored on the disk.
Since all the heads are connected to a common actuator assembly, all heads move in
unison. The head actuator is an electromechanical
system that controls head movement. There are 2 types of actuators:
1. Open loop system: These use band-stepper technology, in which a stepper
motor positions the head as required. The head moves one step at a time,
and each step corresponds to a track. However, since there is no feedback and the steps
have a discrete length, there is a limit to the capacity of the hard disk. Tracks
cannot be packed closer together, as the data would be difficult to read.
2. Closed loop system: These use a ‘voice coil’ that operates like the voice coil in a
loudspeaker. A magnetic field is generated in the coil of wire by the controlling
electronics, and this field moves the head in the required direction. Since there is
constant feedback about the head position, the head can be precisely positioned
over the required track. The feedback allows for tighter track spacing and therefore
greater capacity.
When the system is switched off, the platters stop spinning and the heads
would crash onto the media. This can destroy the data in that area. To prevent this, the heads
are moved to an area where no data is recorded. This area is called the ‘Landing zone’.
Usually the head is moved to the landing zone, using software, before the system is
switched off. This process is called ‘head parking’.
Disk Geometry
We’ll now figure out how data is stored on a disk.
Track: Each platter in the hard disk has a number of concentric circles on it, extending from
the outer edge to the center. These concentric circles are called tracks. Tracks are present
on both the surfaces of the platter. The head moves from track to track while accessing data.
Track numbering starts from 0.
Cylinder: All the corresponding tracks on different platters form a cylinder. That is the track 0
of all platters taken together would form cylinder 0 and so on and so forth. Since all heads
are connected together each head will be placed at the same cylinder.
Sectors: Each track on the platter is further divided into sectors. Each sector holds 512
bytes of data. Sector numbering starts from one. The number of sectors per track depends
on the type of encoding used to store data. If MFM (Modified Frequency Modulation) is used,
the number of sectors is smaller. RLL (Run Length Limited) encoding allows more sectors to be put
on a track. Hence the number of sectors per track varies from hard disk to hard disk.
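The geometry terms above translate directly into arithmetic. The following sketch computes the capacity of a drive from its cylinder, head and sector counts, and maps a (cylinder, head, sector) address to a single running sector number. The geometry of 615 cylinders, 4 heads and 17 sectors per track is only an example chosen for illustration, and the function names are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in the text


def capacity_bytes(cylinders, heads, sectors_per_track):
    """Total capacity: every cylinder has one track per head."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE


def chs_to_linear(cylinder, head, sector, heads, sectors_per_track):
    """Map a (cylinder, head, sector) address to a zero-based sector index.

    Cylinders and heads are numbered from 0, sectors from 1.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector numbers start at 1")
    if not 0 <= head < heads:
        raise ValueError("head number out of range")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


# Example geometry (assumed values, for illustration only).
print(capacity_bytes(615, 4, 17))        # 21411840 bytes, about 20 MB
print(chs_to_linear(0, 0, 1, 4, 17))     # 0: the very first sector
print(chs_to_linear(1, 0, 1, 4, 17))     # 68: first sector of cylinder 1
```

The ordering used here fills all tracks of one cylinder before moving to the next cylinder, which keeps head movement to a minimum.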
The drive itself is a sealed aluminum box with controller electronics attached to one side. The electronics
control the read/write mechanism and the motor that spins the platters. The electronics also
assemble the magnetic domains on the drive into bytes (reading) and turn bytes into
magnetic domains (writing). The electronics are all contained on a small board that detaches
from the rest of the drive.
Underneath the board are the connections for the motor that spins the platters, as well as a
highly-filtered vent hole that lets internal and external air pressures equalize.
Removing the cover from the drive reveals an extremely simple but very precise interior:
• The platters, which typically spin at 3,600 or 7,200 RPM when the drive is operating.
These platters are manufactured to amazing tolerances and are mirror smooth.
• The arm that holds the read/write heads. This arm is controlled by the mechanism in
the upper-left corner, and is able to move the heads from the hub to the edge of the
drive. The arm and its movement mechanism are extremely light and fast. The arm
on a typical hard disk drive can move from hub to edge and back up to 50 times per
second - it is an amazing thing to watch!
In order to increase the amount of information the drive can store, most hard disks have
multiple platters. A drive with three platters, for example, has six read/write heads.
The mechanism that moves the arms on a hard disk has to be incredibly fast and precise. It
can be constructed using a high-speed linear motor.
Many drives use a "voice coil" approach - the same technique used to move the cone of a
speaker on your stereo moves the arm.
A typical track is shown in yellow; a typical sector is shown in blue. A sector contains a fixed
number of bytes -- for example, 256 or 512. Either at the drive or the operating system level,
sectors are often grouped together into clusters.
The process of low-level formatting a drive establishes the tracks and sectors on the
platter. The starting and ending points of each sector are written onto the platter. This
process prepares the drive to hold blocks of bytes. High-level formatting then writes the
file-storage structures, like the file allocation table, into the sectors. This process prepares
the drive to hold files.
Sector Interleave: In DOS, data is read one sector at a time. If more than one sector is to
be read, consecutive accesses have to be made. If the sectors on the track are numbered
consecutively, the access time of the disk increases. This happens because the disk spins
continuously: by the time the data read from the first sector has been transferred to memory,
one or two sectors will have passed under the head.
This means that when the signal to read sector 2 comes, the head is already over the 4th
sector. Hence, to read sector 2, the head has to wait for almost one complete revolution of the disk.
This is repeated for every sector if the sector numbering is consecutive. It can be
overcome by numbering the 4th physical sector as sector 2, the 7th physical sector as sector 3, and so forth.
With this numbering, when the signal to read sector 2 comes, the head is just arriving at sector 2, resulting in
almost immediate access. This reduces disk access time. This technique is called ‘Sector
Interleave’. Since two physical sectors are left between logical sectors 1 and 2, this
scheme is said to have an interleave factor of 3:1. The primary format procedure establishes
the interleave factor by writing the logical sector numbers in the ID fields of each sector.
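The numbering scheme can be generated mechanically. The sketch below lays out the logical sector numbers around one track for a given interleave factor. The track length of 17 sectors is an assumed example value, and the rule for skipping to the next free slot when a position is already taken is one common way of handling tracks whose length is a multiple of the interleave; the text itself does not specify it.

```python
def interleave_layout(sectors_per_track, interleave):
    """Return the logical sector number stored in each physical position."""
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(1, sectors_per_track + 1):
        # If this position was already numbered, move on to the next free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        # Skip ahead so that (interleave - 1) physical sectors lie in between.
        slot = (slot + interleave) % sectors_per_track
    return layout


# 3:1 interleave on a 17-sector track (assumed track length).
print(interleave_layout(17, 3))
# [1, 7, 13, 2, 8, 14, 3, 9, 15, 4, 10, 16, 5, 11, 17, 6, 12]
```

In the printed layout, logical sector 2 sits in the 4th physical position and logical sector 3 in the 7th, matching the description above. With an interleave of 1 the function simply returns consecutive numbering.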
Boot Sequence: The first sector on the disk is Cyl 0, Head 0, Sector 1. This is called the
‘Master Boot Sector’ or the ‘Disk Boot Sector’. This sector also contains the partition table,
which is 64 bytes in length.
When the system is first powered on, the master boot record is loaded into
the memory. The program in the Master Boot Sector looks at the partition table and finds out
the active partition. Each partition has its own partition boot sector. The control from the Disk
Boot Sector is passed on to the boot sector of the active partition. This active partition’s boot
sector in turn loads the Operating System in the active partition and passes the control to the
OS.
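The search for the active partition can be sketched in a few lines. The example below assumes the conventional layout of the master boot sector, which the text does not spell out: the 64-byte table starts at byte offset 446 and holds four 16-byte entries, an entry whose first byte is 0x80 is the active one, and the sector ends with the signature bytes 0x55 0xAA. The function names and the demonstration sector are made up for illustration.

```python
import struct

TABLE_OFFSET = 446   # assumed: the table sits just before the 2-byte signature
ENTRY_SIZE = 16      # 4 entries x 16 bytes = the 64-byte partition table
ACTIVE = 0x80        # assumed status value that marks the active partition


def parse_partition_table(sector):
    """Return the non-empty partition entries of a 512-byte master boot sector."""
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid master boot sector")
    entries = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        # Fields: status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # then first sector and sector count as little-endian 32-bit integers.
        status, ptype, first, count = struct.unpack(
            "<B3xB3xII", sector[start:start + ENTRY_SIZE])
        if ptype != 0:  # type 0 marks an unused entry
            entries.append({"index": index, "active": status == ACTIVE,
                            "type": ptype, "first_sector": first,
                            "sector_count": count})
    return entries


def active_partition(entries):
    """Return the entry the boot code would hand control to, or None."""
    for entry in entries:
        if entry["active"]:
            return entry
    return None


# A made-up boot sector holding one active partition, for demonstration only.
demo = bytearray(512)
demo[446:462] = struct.pack("<B3xB3xII", ACTIVE, 0x06, 63, 204_800)
demo[510:512] = b"\x55\xaa"
print(active_partition(parse_partition_table(bytes(demo))))
```

The real boot code performs the same lookup and then loads the first sector of the chosen partition, which is that partition's own boot sector.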
On a DOS partition, we have four areas: Boot Sector, FAT, Root Directory
Area and Data Area.
The boot sector in a DOS partition is responsible for loading DOS
(IO.SYS, MSDOS.SYS, and COMMAND.COM). The next few sectors are used to store
the FAT (File Allocation Table). The size of the FAT depends on the capacity of the hard disk
and the version of DOS used. There are 2 copies of the FAT, which are identical to each
other. The hard disk FAT is normally 16-bit. The next few sectors are used to store the root
directory structure, which holds information about the various files on the disk.
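To make the order of the four areas concrete, the sketch below works out where each area would start inside a partition with a 16-bit FAT. Several values are assumptions that are typical for such partitions but are not given in the text: one reserved sector for the boot sector, 512 root directory entries of 32 bytes each, and two reserved entries at the start of the FAT. The function name is hypothetical.

```python
import math

SECTOR_SIZE = 512      # bytes per sector
FAT16_ENTRY_SIZE = 2   # bytes per 16-bit FAT entry
DIR_ENTRY_SIZE = 32    # assumed bytes per root directory entry


def dos_partition_layout(cluster_count, reserved_sectors=1,
                         fat_copies=2, root_entries=512):
    """Return the starting sector and length of each area, relative to the
    start of the partition. Default arguments are assumed typical values."""
    # One FAT entry per cluster, plus two reserved entries at the start.
    fat_sectors = math.ceil((cluster_count + 2) * FAT16_ENTRY_SIZE / SECTOR_SIZE)
    root_sectors = math.ceil(root_entries * DIR_ENTRY_SIZE / SECTOR_SIZE)
    fat_start = reserved_sectors
    root_start = fat_start + fat_copies * fat_sectors
    data_start = root_start + root_sectors
    return {"boot_sector": 0, "fat_start": fat_start, "fat_sectors": fat_sectors,
            "root_start": root_start, "root_sectors": root_sectors,
            "data_start": data_start}


print(dos_partition_layout(65_526))
```

For a volume with 65,526 clusters this gives 256 sectors per FAT copy, a root directory starting at sector 513 and a data area starting at sector 545.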
When you partition a hard disk, you demarcate different portions of the disk
so that they can be accessed as separate drives. A hard disk with only one partition is
accessed as drive ‘C’. When you partition a hard disk into 2 DOS partitions, the first partition
becomes the primary partition and the second becomes the extended DOS partition. The
extended DOS partition can in turn contain logical drives. The primary partition is accessed as
Drive C and the logical drives are accessed as Drive D, Drive E and so on. This means
that physically there is only one drive, because there is only one hard disk, but DOS has
marked a certain portion of the hard disk as Drive C and the remaining portion as Drive D,
Drive E etc.
The computer can boot only if the primary partition is made active. Note that
DOS allows only the primary partition to be made ‘active’.
The primary DOS partition and the extended DOS partition can each have their own
separate boot sector, FAT and directory structure.
DOS provides a utility called ‘FDISK’, which can be used to create partitions &
logical drives. An important point to note is that if you alter the partition table of an existing
drive all data on the drive will be lost. Therefore be very careful when you play around with
‘FDISK’. You might inadvertently destroy the data on the disk.
The nature of the logical structures on the hard disk has an important influence on the
performance, reliability, expandability and compatibility of your storage subsystem. This
section takes a look at the logical structures on the hard disk and how they are set up and
used for a typical PC installation. I begin with a discussion of different PC operating systems,
and an overview of different file system types. I then go into significant detail describing the
major structures and key operating details of the most common PC file system, FAT
(FAT12/FAT16/VFAT/FAT32). I talk about utilities used for partitioning and formatting hard
disks, and also talk a bit about disk compression (even though it is no longer nearly as
important as it once was.) I place special emphasis on how to organize the disk for
maximum performance--while not getting bogged down in the minutiae of optimization where
it will buy you little.
Most of the focus in this section is on the FAT family of file systems, because these are by
far the most commonly used, and also the ones with which I am most familiar. I do mention
alternative file systems, but do not go into extensive detail on them, with one exception.
Recognizing the growing role of Windows NT and Windows 2000 systems, a separate,
comprehensive section has been added that describes the NTFS family of file systems. If
you are mostly interested in reading about NTFS, you may want to skip some of the earlier
subsections that describe FAT, and skip directly to the NTFS material. Bear in mind,
however, that some of the NTFS discussions build upon the descriptions of FAT, since in
some ways the file systems are related. So I recommend reading the section in order, if
possible.
Throughout my discussion of file systems, I have referred to the FAT family of file systems.
This includes several different FAT-related file systems, as described here. The file allocation
table or FAT stores information about the clusters on the disk in a table. There are three
different varieties of this file allocation table, which vary based on the maximum size of the
table. The system utility that you use to partition the disk will normally choose the correct
type of FAT for the volume you are using, but sometimes you will be given a choice of which
you want to use.
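A toy model may help picture what the table holds. In the FAT scheme each cluster has one entry, and that entry holds the number of the next cluster used by the same file, so a file can be read by following the chain from its first cluster. In the sketch below the table is a small Python dictionary and the end-of-chain marker is an illustrative value; neither is the on-disk format.

```python
END_OF_CHAIN = 0xFFFF  # illustrative marker for "no further cluster"


def cluster_chain(fat, first_cluster):
    """Follow a file's chain of clusters through the table."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("loop in FAT chain")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # the entry names the next cluster of the file
    return chain


# Two files sharing one toy table: one starts at cluster 2, one at cluster 4.
toy_fat = {2: 3, 3: 7, 7: 8, 8: END_OF_CHAIN, 4: 5, 5: END_OF_CHAIN}
print(cluster_chain(toy_fat, 2))  # [2, 3, 7, 8]
print(cluster_chain(toy_fat, 4))  # [4, 5]
```

Note that the clusters of a file need not be adjacent on the disk; the first file above jumps from cluster 3 to cluster 7.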
Since each cluster has one entry in the FAT, and these entries are used to hold the cluster
number of the next cluster used by the file, the size of the FAT is the limiting factor on how
many clusters any disk volume can contain. The following are the three different FAT
versions now in use:
• FAT12: The oldest type of FAT uses a 12-bit binary number to hold the cluster
number. A volume formatted using FAT12 can hold a maximum of 4,086 clusters,
which is 2^12 minus a few values (to allow for reserved values to be used in the
FAT). FAT12 is therefore most suitable for very small volumes, and is used on floppy
disks and hard disk partitions smaller than about 16 MB (the latter being rare today.)
• FAT16: The FAT used for most older systems, and for small partitions on modern
systems, uses a 16-bit binary number to hold cluster numbers. When you see
someone refer to a "FAT" volume generically, they are usually referring to FAT16,
because it is the de facto standard for hard disks, even with FAT32 now more popular
than FAT16. A volume using FAT16 can hold a maximum of 65,526 clusters, which is
2^16 less a few values (again for reserved values in the FAT). FAT16 is used for hard
disk volumes ranging in size from 16 MB to 2,048 MB. VFAT is a variant of FAT16.
• FAT32: The newest FAT type, FAT32 is supported by newer versions of Windows,
including Windows 95's OEM SR2 release, as well as Windows 98, Windows ME and
Windows 2000. FAT32 uses a 28-bit binary cluster number--not 32, because 4 of the
32 bits are "reserved". 28 bits is still enough to permit ridiculously huge volumes--
FAT32 can theoretically handle volumes with over 268 million clusters, and will
support (theoretically) drives up to 2 TB in size. However, to do this the
FAT itself must grow very large.
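The volume size limits quoted above follow from multiplying the maximum number of clusters by the cluster size. The short calculation below is a sketch of that relationship. The cluster counts are the ones given in the list; the cluster sizes of 4 KiB for FAT12 and 32 KiB for FAT16 are assumed maximum values used for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def max_volume_bytes(max_clusters, cluster_size_bytes):
    """Largest volume a FAT type can address with a given cluster size."""
    return max_clusters * cluster_size_bytes


# FAT12 with assumed 4 KiB clusters: just under 16 MiB.
print(max_volume_bytes(4_086, 4 * KIB) / MIB)     # 15.9609375
# FAT16 with assumed 32 KiB clusters: just under 2,048 MiB.
print(max_volume_bytes(65_526, 32 * KIB) / MIB)   # 2047.6875
# A 28-bit cluster number allows 2**28 different values.
print(2 ** 28)                                    # 268435456
```

The results line up with the figures in the list: about 16 MB for FAT12, 2,048 MB for FAT16, and over 268 million clusters for FAT32.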
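Here's a summary table showing how the three types of FAT compare:
FAT type   Cluster number size   Maximum clusters    Used for
FAT12      12 bits               4,086               Floppy disks; hard disk partitions under about 16 MB
FAT16      16 bits               65,526              Hard disk volumes from 16 MB to 2,048 MB
FAT32      28 bits               over 268 million    Larger volumes, theoretically up to 2 TB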
One issue related to the FAT file system that has gained a lot more attention over the years
is the concept of slack, which is the colloquial term used to refer to wasted space due to the
use of clusters for storing files. This began in the mid-1990s when larger and larger hard
disks began shipping with most systems. Typically, the disks in retail systems were not divided
into multiple partitions, and users began noticing that large quantities of their hard disk space seemed
to "disappear". In many cases this amounted to hundreds of megabytes on a disk of only 1
to 2 GB in size. When the use of FAT32 became more common this problem was less of an
issue for a while. Today, with hard disks sized at 40 GB or more commonplace, even FAT32
has problems with slack.
Of course the space doesn't really "disappear", assuming we are not talking about lost
clusters, which can make space really unusable on a disk unless you use a scanning utility
to recover it. The space is simply wasted as a result of the cluster system that FAT uses. A
cluster is the minimum amount of space that can be assigned to any file. No file can use part
of a cluster under the FAT file system. This means, essentially, that the amount of space a
file uses on the disk is "rounded up" to an integer multiple of the cluster size. If you create a
file containing exactly one byte, it will still use an entire cluster's worth of space. Then, you
can expand that file in size until it reaches the maximum size of a cluster, and it will take up
no additional space during that expansion. As soon as you make the file larger than what a
single cluster can hold, a second cluster will be allocated, and the file's disk usage will
double, even though the file only increased in size by one byte.
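The rounding-up behaviour described above amounts to a ceiling division. The sketch below shows it for a volume with 32 KiB clusters; the function names are illustrative.

```python
import math

CLUSTER_SIZE = 32 * 1024  # 32 KiB clusters, used here as an example


def allocated_bytes(file_size, cluster_size=CLUSTER_SIZE):
    """Disk space used by a file: its size rounded up to whole clusters."""
    return math.ceil(file_size / cluster_size) * cluster_size


def slack_bytes(file_size, cluster_size=CLUSTER_SIZE):
    """Space allocated to the file but not occupied by its data."""
    return allocated_bytes(file_size, cluster_size) - file_size


print(allocated_bytes(1))        # 32768: one byte still takes a whole cluster
print(allocated_bytes(32_768))   # 32768: exactly fills one cluster
print(allocated_bytes(32_769))   # 65536: one more byte doubles the usage
print(slack_bytes(32_769))       # 32767
```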
Think of this in terms of collecting rain water in quart-sized glass bottles. Even if you collect
just one ounce of water, you have to use a whole bottle. Once the bottle is in use, however,
you can fill it with 31 more ounces, until it is full. Then you'll need another whole bottle to
hold the 33rd ounce.
Since files are always allocated whole clusters, this means that on average, the larger the
cluster size of the volume, the more space that will be wasted. (When collecting rain water,
it's more efficient to use smaller, cup-sized bottles instead of quart-sized ones, if minimizing
the amount of storage space is a concern). If we take a disk that has a truly random
distribution of file sizes, then on average each file wastes half a cluster. (They use any
number of whole clusters and then a random amount of the last cluster, so on average half a
cluster is wasted). This means that if you double the cluster size of the disk, you double the
amount of storage that is wasted. Storage space that is wasted in this manner, due to space
left at the end of the last cluster allocated to the file, is commonly called slack.
The situation is in reality usually worse than this theoretical average. The files on most hard
disks don't follow a random size pattern; in fact, most files tend to be small. (Take a
look in your web browser's cache directory sometime!) A hard disk that holds many small files
will waste far more space. There are utilities that you can use to analyze the
amount of wasted space on your disk volumes, such as the fantastic Partition Magic. It is not
uncommon for very large disks that are in single partitions to waste up to 40% of their space
due to slack, although 25-30% is more common.
Let's take an example to illustrate the situation. Let's consider a hard disk volume that is
using 32 kiB clusters. There are 17,000 files in the partition. If we assume that each file has
half a cluster of slack, then this means that we are wasting 16 kiB of space per file. Multiply
that by 17,000 files, and we get a total of 265 MB of slack space. If we assume that most of
the files are smaller, and so therefore on average each file has slack space of around two-
thirds of a cluster instead of one-half, this jumps to 354 MB!
If we were able to use a smaller cluster size for this disk, the amount of space wasted would
reduce dramatically. The table below shows a comparison of the slack for various cluster
sizes for this example. The more files on the disk, the worse the slack gets. To consider the
percentage of disk space wasted in this example, divide the slack figure by the size of the
disk. So if this were a (full) 1.2 GB disk using 32 kiB clusters, a full 30% of that space is
slack. If the disk is 2.1 GB in size, the slack percentage is 17%:
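The figures in this example can be reproduced with a short calculation. The sketch below recomputes the total slack for 17,000 files at several cluster sizes, assuming either one-half or two-thirds of a cluster wasted per file, as in the text. The percentage helper treats 1.2 GB and 2.1 GB as 1,200 and 2,100 megabytes, which is an assumption about how the quoted percentages were rounded.

```python
KIB = 1024
MIB = 1024 * KIB


def total_slack_mib(file_count, cluster_kib, wasted_fraction):
    """Total slack in MiB when each file wastes a fraction of one cluster."""
    return file_count * cluster_kib * KIB * wasted_fraction / MIB


def slack_percent(slack_mb, disk_mb):
    """Slack as a percentage of the disk size (units treated loosely)."""
    return slack_mb / disk_mb * 100


for cluster_kib in (2, 4, 8, 16, 32):
    half = total_slack_mib(17_000, cluster_kib, 1 / 2)
    two_thirds = total_slack_mib(17_000, cluster_kib, 2 / 3)
    print(f"{cluster_kib:>2} KiB clusters: "
          f"{half:6.1f} MB at 1/2, {two_thirds:6.1f} MB at 2/3")

worst = total_slack_mib(17_000, 32, 2 / 3)
print(round(slack_percent(worst, 1_200)))  # about 30 (% of a 1.2 GB disk)
print(round(slack_percent(worst, 2_100)))  # about 17 (% of a 2.1 GB disk)
```

Because slack grows in direct proportion to the cluster size, each halving of the cluster size halves the wasted space for the same set of files.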
As you can see, the larger the cluster size used, the more of the disk's space is wasted due
to slack. Therefore, it is better to use smaller cluster sizes whenever possible. This is,
unfortunately, sometimes easier said than done. The number of clusters we can use is
limited by the nature of the FAT file system, and there are also performance tradeoffs in
using smaller cluster sizes. Therefore, it isn't always possible to use the absolute smallest
cluster size in order to maximize free space. One way that cluster sizes can be reduced is to
use FAT32 instead of FAT16, as described in other pages in this section. However, on very
large modern hard disks, big partitions even in FAT32 use rather hefty cluster sizes!