L2.1 File Organization

The document discusses file organization and disk storage. It covers:
- Files are organized on disk to support fast access to subsets of records.
- Data is stored in a memory hierarchy from cache to primary storage to secondary storage disks.
- Disks use magnetic platters divided into tracks, then sectors, to store data in blocks that can be quickly accessed.
- A disk controller handles mechanical positioning of read/write heads and transferring data between disks and memory.

Uploaded by

Edmunds Larry
Copyright
© All Rights Reserved
Disk Storage and File Organization

Introduction
File organization is a method of arranging the records in a file when the file is stored on disk.

• It organizes data carefully to support fast access to desired subsets of records.
• A relation is typically stored as a file of records.

Topics to be covered
• Memory Hierarchy
• Secondary Storage Devices
• Accelerating Access to Secondary Storage
• Disk Failures
• File Allocation
• File Organization techniques

Memory Hierarchy
• The data that makes up a database is stored physically on a computer storage medium.
• The storage media form a storage hierarchy, as shown below.

Primary Storage
- Directly accessed by the CPU
- Limited storage capacity
- Volatile memory
Secondary and tertiary storage
- Non-volatile memory

(No change in the database is considered final until it has been migrated to non-volatile secondary storage.)

Data in secondary and tertiary storage cannot be processed directly by the CPU; it must first be copied into primary storage and then processed by the CPU.
Memory Hierarchy
Cache
- Static RAM
- Speeds up execution of program
instructions
Main memory
- Dynamic RAM (DRAM)
- Main working area of CPU
- Keeps program instructions and data
- Data is moved from main memory to
cache to the processor
Magnetic disk
- Hard disks (optical discs such as CD and DVD are a separate, non-magnetic form of secondary or tertiary storage)
- Stores permanent data
Tapes
- Used for archiving and backup storage of data

Most databases permanently store data on magnetic disk secondary storage since:
• It is non-volatile
• Most databases are too large to fit in main memory
Memory Hierarchy
Transfer of Data between Levels

- Data moves between adjacent levels of the hierarchy.
- A disk is organized into disk blocks.
- Entire disk blocks are moved to and from a contiguous section of main memory called a buffer.
- Movement between main memory and cache is in units of cache lines.

A key technique for speeding up data operations is to arrange data so that when a piece of a disk block is needed, it is likely that other data on the same block will also be needed at about the same time.
Hardware Description of a Disk
Magnetic disks are used for storing large amounts of data.

- A disk drive has two principal moving parts:

a) Disk assembly (consists of platters that rotate around a central spindle)

b) Head assembly (holds the disk read/write heads)

[Figure: head assembly and disk assembly]


Hardware description of a disk
Magnetic disks are used for storing large amounts of data.

- Made of several platters, rotated by a spindle.
- Each platter has two surfaces.
- The surfaces are covered with a magnetic material, which allows data to be stored on both sides of the platter (data stored as bytes).
- The arm holds the read/write heads.
- The actuator moves the read/write heads.
Hardware Description
• Each surface consists of tracks, which are divided into sectors.

- A platter stores information on concentric circles known as tracks.
- Each track is divided into smaller blocks or sectors. (The division of a track into equal-sized disk blocks is set by the O/S during disk formatting or initialization.)
- Sectors are separated by gaps that are not magnetized to represent either 0 or 1.

[Figure: sector layout. Header (sector identifier), data, trailer (error-corrector information)]
Hardware description of a disk
Magnetic disks are used for storing large amounts of data.

- Cylinder: the set of tracks with the same diameter on all surfaces.

(Number of cylinders in a disk = number of tracks per surface)
Activity
Disk Capacity
The capacity of a disk is the number of bytes it can store.

Example:

The Megatron 747 disk has the following characteristics:

• There are eight platters providing 16 surfaces.
• There are 2^16, or 65,536, tracks per surface.
• There are (on average) 2^8 = 256 sectors per track.
• There are 2^12 = 4096 bytes per sector.

a. What is the capacity of the disk?

b. If blocks are 2^14 = 16,384 bytes each:

i) How many sectors does a block occupy?

ii) How many blocks are on each track?
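
One way to work through this activity is sketched below in Python. The constant and function names are illustrative choices, not part of the original exercise; the numbers are the ones listed for the Megatron 747 above.

```python
# Megatron 747 parameters as listed in the activity.
SURFACES = 16
TRACKS_PER_SURFACE = 2 ** 16
SECTORS_PER_TRACK = 2 ** 8      # average value
BYTES_PER_SECTOR = 2 ** 12
BLOCK_SIZE = 2 ** 14


def disk_capacity_bytes():
    """Capacity = surfaces x tracks/surface x sectors/track x bytes/sector."""
    return SURFACES * TRACKS_PER_SURFACE * SECTORS_PER_TRACK * BYTES_PER_SECTOR


def sectors_per_block():
    """Number of whole sectors that one block occupies."""
    return BLOCK_SIZE // BYTES_PER_SECTOR


def blocks_per_track():
    """Number of blocks that fit on one (average) track."""
    return SECTORS_PER_TRACK // sectors_per_block()


print(disk_capacity_bytes())   # 2**40 bytes
print(sectors_per_block())     # 4
print(blocks_per_track())      # 64
```

Because every parameter is a power of two, the capacity can also be checked by adding exponents: 2^4 x 2^16 x 2^8 x 2^12 = 2^40 bytes.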


Disk Access
• Transfer of data between main memory and disk takes place in units of disk blocks.
• The hardware address of a block is a combination of a cylinder number, a track number and a block number.
• The address of a buffer (a contiguous reserved area in main storage that holds a disk block) is also provided.

Disk Controller:
• Located inside the disk unit
• Responsible for all mechanical operations of the disk, and interfaces with the CPU

Read Command

A disk block is copied to the buffer

Write Command

Contents of a buffer are copied into a disk block.


Disk Access
Disk Controller:
• One or more disk drives are controlled by a disk controller, which is a small processor capable of:

1. Controlling the mechanical actuator that moves the head assembly, to position the heads at a particular radius (so that any track of one particular cylinder can be read or written).

2. Selecting a sector from among all those in the cylinder at which the heads are positioned. (It knows when the rotating spindle has reached the point where the desired sector is beginning to move under the head.)

3. Transferring bits between the desired sector and the computer's main memory.

4. Buffering an entire track or more in the local memory of the disk controller, in the hope that sectors of this track will be read soon, to avoid additional accesses to the disk.
Disk Access
• The figure below shows a simple, single-processor computer. The processor communicates via a data bus with the main memory and the disk controller. A disk controller can control several disks.

In the figure, the disk controller is controlling three disks.
Disk Access
• Accessing (reading or writing) a block requires three steps:

1. The disk controller positions the head assembly at the cylinder containing the track on which the block is located. The time to do so is the seek time.

2. The disk controller waits while the first sector of the block moves under the head. This time is called rotational latency, or simply latency. It depends on the rpm of the disk. Example: at 15,000 rpm the time per rotation is 4 msec, and the average rotational delay is 2 msec (4/2).

3. The data is transferred. The time needed to do so is the block transfer time.

The total time to locate and transfer a disk block, given its address, is the sum of the seek time, the rotational delay and the block transfer time.
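
The arithmetic behind the 15,000 rpm example can be sketched as follows. The function names are illustrative, and the seek and transfer times in the final call are hypothetical values chosen only to show the sum.

```python
def rotation_time_ms(rpm):
    """Time for one full rotation in milliseconds (60,000 ms per minute)."""
    return 60_000 / rpm


def average_rotational_delay_ms(rpm):
    """On average the desired sector is half a rotation away."""
    return rotation_time_ms(rpm) / 2


def block_access_time_ms(seek_ms, rpm, transfer_ms):
    """Total access time = seek time + rotational delay + block transfer time."""
    return seek_ms + average_rotational_delay_ms(rpm) + transfer_ms


print(rotation_time_ms(15_000))             # 4.0
print(average_rotational_delay_ms(15_000))  # 2.0

# Hypothetical seek time (5 ms) and transfer time (0.5 ms), for illustration only.
print(block_access_time_ms(5, 15_000, 0.5))  # 7.5
```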
Activity
Suppose that a disk unit has the following parameters:
Seek time s=20 msec; Rotational delay rd=10 msec; Block transfer time btt=1 msec; Block size B=2400 bytes.

An EMPLOYEE file has the following fields: Ssn, 9 bytes; Last_name, 20 bytes; First_name, 20 bytes;
Middle_init, 1 byte; Birth_date, 10 bytes; Address, 35 bytes; Phone, 12 bytes; Supervisor_ssn, 9 bytes;
Department, 4 bytes; Job_code, 4 bytes; deletion marker 1 byte

The EMPLOYEE file has r=30,000 records, fixed-length format, and unspanned blocking.
Calculate

a. The record size (including deletion marker)

b. The number of records per block

c. The total number of disk blocks

d. Calculate in msec the average time needed to search for an arbitrary record in the file, using linear search, if
the file blocks are not stored on consecutive disk blocks.
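
A possible way to work through parts (a) to (d) is sketched below. It assumes that a linear search reads half of the file's blocks on average, and that each block costs a full seek, rotational delay and transfer because the blocks are not consecutive. The variable names are illustrative.

```python
import math

# Disk parameters from the activity (times in msec, sizes in bytes).
SEEK_MS = 20
ROTATIONAL_DELAY_MS = 10
BLOCK_TRANSFER_MS = 1
BLOCK_SIZE = 2400
NUM_RECORDS = 30_000

# Field sizes of the EMPLOYEE record, including the deletion marker.
FIELD_SIZES = {
    "Ssn": 9, "Last_name": 20, "First_name": 20, "Middle_init": 1,
    "Birth_date": 10, "Address": 35, "Phone": 12, "Supervisor_ssn": 9,
    "Department": 4, "Job_code": 4, "deletion_marker": 1,
}

# (a) Record size is the sum of the field sizes.
record_size = sum(FIELD_SIZES.values())

# (b) Unspanned blocking: only whole records fit in a block.
records_per_block = BLOCK_SIZE // record_size

# (c) Number of blocks needed to hold all records.
num_blocks = math.ceil(NUM_RECORDS / records_per_block)

# (d) Assumed: on average half the blocks are read, each at full access cost.
time_per_block_ms = SEEK_MS + ROTATIONAL_DELAY_MS + BLOCK_TRANSFER_MS
average_search_ms = (num_blocks / 2) * time_per_block_ms

print(record_size)        # 125
print(records_per_block)  # 19
print(num_blocks)         # 1579
print(average_search_ms)  # 24474.5
```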
Accelerating Access to Secondary Storage
Accelerating disk access
• A disk designed to take an average of, say, 10 milliseconds to access a block will not necessarily deliver data to an application within 10 milliseconds of the request being sent to the disk controller, since:

• If there is only one disk, the disk may be busy with another access from the same process or another process.
• In the worst case, requests for disk access arrive more than once every 10 milliseconds, and these requests back up indefinitely.
• In that case, the scheduling latency grows without bound.


Accelerating disk access
• There are several techniques we can use to decrease the average time a disk access takes and thus improve throughput (the number of disk accesses per second that a disk can accommodate).
• They include:

1. Place blocks that are accessed together in the same cylinder (to avoid seek time and, often, rotational latency).

2. Divide the data among several smaller disks rather than one large one. This is known as striping.

3. Mirror a disk: make two or more copies of the data on different disks.

4. Use a disk scheduling algorithm to select the order in which requested blocks will be read or written.

5. Pre-fetch blocks into main memory in anticipation of their later use (double buffering).

Seek time: the time taken to move the head assembly to the required cylinder.

Rotational latency: the time spent waiting until the first sector of the block moves under the head.
Disk Failures
1. The most common type of failure is an intermittent failure, where an attempt to read or write a sector is unsuccessful, but with repeated tries we are able to read or write successfully.

2. A more serious failure is one in which a bit or bits are corrupted, and it is impossible to read a sector correctly no matter how many times we try. This form of error is called media decay.

3. A related type of error is a write failure, where we attempt to write a sector but can neither write successfully nor retrieve the previously written sector. A possible cause is a power outage during the writing of the sector.

4. The most serious form of disk failure is a disk crash, where the entire disk becomes unreadable, suddenly and permanently.
Disk Failures
1. Intermittent Failures

• This occurs when we try to read a sector, but the correct content of that sector is not delivered to the disk controller.
• If the controller has a way to tell whether the sector is good or bad, it will reissue the read request when bad data is read, until the sector is returned correctly or some preset limit is reached.
• An intermittent failure can also occur when a controller attempts to write a sector, but the contents of the sector are not what was intended.
• The only way to check whether the write was correct is to let the disk go round again and read the sector. If a good sector is read, then the write was correct.
Disk Failures
2. Checksums

• To determine whether a sector is good or bad, additional bits called a checksum are added to each sector.
• If, on reading, we find that the checksum is not proper for the data bits, we know there is an error in reading.
• A simple form of checksum is based on parity bits.
• If there is an odd number of 1's among the collection of bits, we say the bits have odd parity, and add a parity bit that is 1.
• Similarly, if there is an even number of 1's among the bits, we say the bits have even parity, and add a parity bit that is 0.

NB: The number of 1's among a collection of bits and their parity bit is always even.

The disk controller counts the number of 1's in a sector, parity bit included; if the count is odd, an error is present.
Disk Failures
2. Checksums

Example:

1. If the sequence of bits in a sector were 01101000, then there is an odd number of 1's, so the parity bit is 1. If we follow this sequence by its parity bit, we have 011010001.

2. If the given sequence of bits were 11101110, we have an even number of 1's and the parity bit is 0. The sequence followed by its parity bit is 111011100.

NB: The number of 1's among a collection of bits and their parity bit is always even.
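
The two examples can be reproduced with a short sketch. The helper names below are illustrative; the bit strings are the ones from the slide.

```python
def parity_bit(bits):
    """Return '1' if the bit string has an odd number of 1's, else '0'."""
    return "1" if bits.count("1") % 2 == 1 else "0"


def with_parity(bits):
    """Append the parity bit so that the total number of 1's is even."""
    return bits + parity_bit(bits)


def passes_parity_check(stored):
    """A stored sequence (data plus parity bit) is accepted if its count of 1's is even."""
    return stored.count("1") % 2 == 0


print(with_parity("01101000"))           # 011010001
print(with_parity("11101110"))           # 111011100
print(passes_parity_check("011010001"))  # True
print(passes_parity_check("011010101"))  # False: one bit was flipped
```

A single parity bit only catches an odd number of flipped bits; if two bits change, the count of 1's stays even and the error goes unnoticed.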
Disk Failures
3. Stable Storage

• Checksums help to detect the existence of a media failure (failure to read or write), but do not help to correct the error.
• During writing, the previous contents of a sector may be overwritten, yet the new contents cannot be read correctly. This leads to the loss of both the old and the new content.

We would like the following to hold:

• When we write a value B to a disk sector s currently containing the value A, after the write operation the sector contains either A or B, independently of failures.
• Persistent storage where this property holds is called Stable Storage.
• It is highly desirable because it means that the write operation is atomic.

Unfortunately, disks satisfy a weaker property instead, the Weakly-Atomic Property: after the write operation the sector will contain a value that is either A, B, or such that reading it gives a read error.
Disk Failures
3. Stable Storage

• A policy known as stable storage deals with the problems above by pairing sectors so that each pair holds copies of the same information.
• Consider two disks D1 and D2 that are mirror images (RAID-1) of each other, i.e. they have the same number of equally numbered sectors. Corresponding sectors are intended to have the same content. Here is how we write a value B to a sector that currently has value A:

REPEAT
write B to sector s on D1;
read sector s from D1;
UNTIL value read is without error;

REPEAT
write B to sector s on D2;
read sector s from D2;
UNTIL value read is without error;
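
The two REPEAT loops can be tried out with a small simulation. Everything here is a sketch under stated assumptions: `FlakyDisk` is a hypothetical in-memory stand-in for a disk whose first few writes are corrupted, and the `max_tries` limit is an added safeguard that the pseudocode above does not have.

```python
class FlakyDisk:
    """Hypothetical in-memory disk; its first `bad_writes` writes are corrupted."""

    def __init__(self, bad_writes=0):
        self.sectors = {}
        self.bad_writes = bad_writes

    def write(self, sector, value):
        if self.bad_writes > 0:
            self.bad_writes -= 1
            self.sectors[sector] = None  # None models a sector that fails its checksum
        else:
            self.sectors[sector] = value

    def read(self, sector):
        value = self.sectors.get(sector)
        if value is None:
            raise IOError("read error on sector %d" % sector)
        return value


def write_until_readable(disk, sector, value, max_tries=10):
    """REPEAT write; read UNTIL the read succeeds (bounded by max_tries)."""
    for _ in range(max_tries):
        disk.write(sector, value)
        try:
            disk.read(sector)
            return
        except IOError:
            continue
    raise IOError("sector %d could not be written" % sector)


def stable_write(d1, d2, sector, value):
    """Write to D1 first, and only then to D2, as in the pseudocode."""
    write_until_readable(d1, sector, value)
    write_until_readable(d2, sector, value)


d1, d2 = FlakyDisk(bad_writes=2), FlakyDisk()
stable_write(d1, d2, 7, "B")
print(d1.read(7), d2.read(7))  # B B
```

Writing the two copies one after the other, never at the same time, means that a failure during either write leaves at least one readable copy holding A or B.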
Disk Failures
Recovery from Disk Crashes

• The most serious mode of failure is the disk crash, or head crash, where the data is permanently destroyed. Schemes have been developed to reduce the loss of data caused by disk crashes.

a) Mirroring as a redundancy technique

• The simplest scheme is to mirror each disk. The disk that holds the main data is called a data disk; its copies, held on other disks whose contents are completely determined by the data disks, are called redundant disks.
• Mirroring as a protection against failure is referred to as RAID level 1 (Redundant Array of Independent Disks, level 1).
Disk Failures
Recovery from Disk Crashes

b) Parity Blocks

• Mirroring, as a technique to reduce the probability that a disk crash causes data loss, uses as many redundant disks as there are data disks.
• Another approach, RAID level 4, uses only one redundant disk, no matter how many data disks there are.
• We assume the disks are identical, so that we can number the blocks on each disk from 1 to some number n. All blocks on all disks have the same number of bits.
• In the redundant disk, the i-th block consists of parity checks for the i-th blocks of all the data disks. That is, the j-th bits of all the i-th blocks, including both the data disks and the redundant disk, must have an even number of 1's among them, and we always choose the bit of the redundant disk to make this condition true.
Disk Failures
Recovery from Disk Crashes

b) Parity Blocks

• Example: RAID level 4

Suppose blocks consist of only one byte (eight bits). Let there be three data disks, called 1, 2 and 3, and one redundant disk, called disk 4. Suppose the three data disks have the following bit sequences in their first blocks:

Disk 1: 11110000

Disk 2: 10101010

Disk 3: 00111000

Then the redundant disk will have in block 1 the parity check bits:

Disk 4: 01100010

Each bit of disk 4 is the modulo-2 sum of the corresponding bits of the data disks. The modulo-2 sum of bits is 0 if there is an even number of 1's among those bits and 1 if there is an odd number of 1's.
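
The modulo-2 sum of corresponding bits is the bitwise XOR, so the example can be checked with a short sketch. The function name is illustrative; the same function also rebuilds a lost block, because XOR-ing the surviving blocks with the parity block gives back the missing one.

```python
from functools import reduce


def parity_block(blocks):
    """Bitwise modulo-2 sum (XOR) of equal-length bit strings."""
    width = len(blocks[0])
    total = reduce(lambda a, b: a ^ b, (int(block, 2) for block in blocks))
    return format(total, "0{}b".format(width))


data = ["11110000", "10101010", "00111000"]  # first blocks of disks 1, 2 and 3
redundant = parity_block(data)
print(redundant)  # 01100010

# If disk 2 is lost, its block is the XOR of the other data blocks and the parity block.
recovered = parity_block([data[0], data[2], redundant])
print(recovered)  # 10101010
```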
Disk Failures
Recovery from Disk Crashes

b) Parity Blocks

• RAID level 4

Reading: Blocks are read from the data disks.

Writing: When we write a new block of a data disk, we also need to change the corresponding block of the redundant disk.

Failure recovery: If the redundant disk crashes, we swap in a new disk and recompute the redundant blocks. If a data disk fails, we swap in a new disk and recompute its data from the other disks.
Disk Failures
Recovery from Disk Crashes

b) Parity Blocks

• RAID level 5
• Whatever scheme we use for updating the disks, we need to read and write the redundant disk's blocks.
• RAID level 4 suffers from a bottleneck effect: if there are n data disks, the number of writes to the redundant disk will be n times the average number of writes to any one data disk.
• In RAID level 5, each disk is treated as the redundant disk for some of the blocks.
• E.g., if there are n+1 disks numbered 0 through n, we could treat the i-th cylinder of disk j as redundant if j is the remainder when i is divided by n+1.
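
The remainder rule can be sketched in a few lines. The function name and the choice of four disks are illustrative assumptions.

```python
def redundant_disk(cylinder, num_disks):
    """Disk that holds the redundant blocks for a cylinder: j = i mod (n + 1)."""
    return cylinder % num_disks


NUM_DISKS = 4  # n + 1 disks, numbered 0 through 3 (illustrative value)
for cylinder in range(8):
    print(cylinder, redundant_disk(cylinder, NUM_DISKS))
# Cylinders 0..7 map to disks 0, 1, 2, 3, 0, 1, 2, 3.
```

Because the redundant role rotates across the disks, the extra parity writes are spread evenly instead of all landing on one disk.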


Disk Failures
Recovery from Disk Crashes

b) Parity Blocks

 RAID level 4 and RAID level 5 compared.


Placing file records on disk
File Organization
File Records on disk

• Record: a collection of related data values or items, where each value is formed of one or more bytes and corresponds to a particular field of the record.
• Record type: a collection of field names and their corresponding data types. A data type, associated with each field, specifies the type of values a field can take.

Example:

struct employee {
    char name[30];
    char SSN[9];
    int salary;
    int jobcode;
    char department[20];
};
Arranging data on disk
• A data element such as a tuple or object is represented by a record, which consists of consecutive bytes in some disk block.
• Collections such as relations are usually represented by placing the records that represent their data elements in one or more blocks.


File Organization
Fixed and Variable length records

struct employee {
    char name[30];
    char SSN[9];
    int salary;
    int jobcode;
    char department[20];
};
Arranging data on disk
Fixed-length records

• The simplest sort of record consists of fixed-length fields, one for each attribute of the represented tuple.
• The record has a header and a fixed-length region for the record itself.
• The record header may contain:

1. A pointer to the schema for the data stored in the record. E.g., it could point to the schema for the relation to which the tuple belongs, helping us find the fields of the record.

2. The length of the record.

3. Timestamps indicating the time the record was last modified or last read.

4. Pointers to the fields of the record.


Arranging data on disk
Fixed-length records
• Example:

CREATE TABLE MovieStar (
    name CHAR(30) PRIMARY KEY,
    address VARCHAR(255),
    gender CHAR(1),
    birthdate DATE
);

• The first field is for name, and this field requires 30 bytes. If we assume that all fields begin at a multiple of 4, then we allocate 32 bytes for the name.
• The next attribute is address. A VARCHAR attribute requires a fixed-length segment of bytes, with one more byte than the maximum length (for the string's end marker). Thus, we need 256 bytes for address.
• Attribute gender is a single byte, holding either the character 'M' or 'F'. We allocate 4 bytes, so the next field can start at a multiple of 4.
• Attribute birthdate is a SQL DATE value, which is a 10-byte string. We shall allocate 12 bytes to its field, to keep subsequent records in the block aligned at multiples of 4.
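
The field offsets that follow from this padding rule can be computed with a short sketch. The function names are illustrative, and the record header is left out because the slides do not give its size.

```python
def align_up(size, boundary=4):
    """Round size up to the next multiple of boundary."""
    return (size + boundary - 1) // boundary * boundary


# (field name, bytes needed before padding), as described for MovieStar.
FIELDS = [("name", 30), ("address", 256), ("gender", 1), ("birthdate", 10)]


def layout(fields, boundary=4):
    """Return each field's starting offset and the total padded length."""
    offsets = {}
    offset = 0
    for name, size in fields:
        offsets[name] = offset
        offset += align_up(size, boundary)
    return offsets, offset


offsets, total = layout(FIELDS)
print(offsets)  # {'name': 0, 'address': 32, 'gender': 288, 'birthdate': 292}
print(total)    # 304
```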


Arranging data on disk
Fixed-length records
• Example (the MovieStar relation declared above):

The header of the record will hold:

a) A pointer to the record schema.
b) The record length.
c) A timestamp indicating when the record was created.
Arranging data on disk
Packing Fixed-length records into blocks

• Records representing tuples of a relation are stored in blocks of the disk and moved into main memory when we need to access or update them.
• The layout of the block is suggested as follows:

[Figure: block layout]
Arranging data on disk
Packing Fixed-length records into blocks

• In addition to the records, there is a block header holding information such as:

1. Links to one or more other blocks that are part of a network of blocks, used for creating indexes to the tuples of a relation.

2. Information about the role played by this block in such a network.

3. Information about which relation the tuples of this block belong to.

4. A "directory" giving the offset of each record in the block.

5. Timestamp(s) indicating the time of the block's last modification and/or access.
File Organization
Spanned records
• Suppose that the block size is B bytes and a file has fixed-length records of size R bytes, with B > R. R may not divide B exactly, leaving some unused space in each block.
• To utilize the unused space, we can store part of a record in one block and the rest in another block, with a pointer to the block that holds the remainder. Records then span more than one block, which gives a spanned organization.


File Organization
Unspanned records
• If records are not allowed to cross block boundaries, the organization is termed unspanned.
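
The difference between the two organizations can be illustrated with a small calculation. This is a sketch under simplifying assumptions: the spanned case ignores the space taken by the pointers between blocks, and the block size, record size and record count are hypothetical values.

```python
import math


def unspanned_blocks(num_records, record_size, block_size):
    """Blocks needed when each block holds only whole records."""
    records_per_block = block_size // record_size
    return math.ceil(num_records / records_per_block)


def spanned_blocks(num_records, record_size, block_size):
    """Blocks needed when records may cross block boundaries (pointer overhead ignored)."""
    return math.ceil(num_records * record_size / block_size)


def unused_bytes_per_block(record_size, block_size):
    """Space left over in each block under unspanned organization."""
    return block_size % record_size


# Hypothetical values: B = 1000 bytes, R = 300 bytes, 10 records.
print(unused_bytes_per_block(300, 1000))  # 100
print(unspanned_blocks(10, 300, 1000))    # 4
print(spanned_blocks(10, 300, 1000))      # 3
```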
Allocating file blocks on disk
Contiguous allocation
• File blocks are allocated on consecutive disk blocks
• Reading the whole file is fast using double buffering
• Expanding the file is difficult, as it depends on the availability of contiguous free disk blocks
Allocating file blocks on disk
Linked allocation
• Each file block contains a pointer to the next file block
• Slow to read the whole file
• Easy to expand the file
Allocating file blocks on disk
Indexed allocation
• A special block known as the index block contains the pointers to all the blocks occupied by a file.
Operations on files
Open
• Prepares the file for reading or writing.
• Allocates appropriate buffers (typically at least two) to hold file blocks from disk, and retrieves the file header.
• Sets the file pointer to the beginning of the file.
Operations on files
Reset:

• Sets the file pointer of an open file to the beginning of the file.

Close:

• Completes the file access by releasing the buffers and performing any other needed cleanup operation (e.g., cleaning up the file header information that is maintained in main memory).

Find (or Locate):

• Searches for the first record that satisfies a search condition.
• Transfers the block containing that record into a main memory buffer (if it is not already there).
• The file pointer points to the record in the buffer, and it becomes the current record.
Operations on files
Read (or get):

• Copies the current record from the buffer to a program variable in the user program.
• This command may also advance the current record pointer to the next record in the file, which may necessitate reading the next file block from disk.

Find Next:

• Searches for the next record in the file that satisfies the search condition.
• Transfers the block containing that record into a main memory buffer (if it is not already there).
• The record is located in the buffer and becomes the current record.
Operations on files
Delete

• Deletes the current record and (eventually) updates the file on disk to reflect the deletion.

Modify

• Modifies some field values for the current record and (eventually) updates the file on disk to reflect the modification.

Insert

• Inserts a new record in the file by locating the block where the record is to be inserted, transferring that block into a main memory buffer (if it is not already there), writing the record into the buffer, and (eventually) writing the buffer to disk to reflect the insertion.

Scan (a combination of Find, FindNext, and Read):

• If the file has just been opened or reset, Scan returns the first record that satisfies the search condition; otherwise, it returns the next record.
