
File Organization

The document discusses file organization in databases, detailing how data is stored on magnetic disks and the types of storage (primary and secondary). It explains various primary file organizations, such as heap, sorted, and hashed files, as well as the mechanics of disk storage devices, including track and block organization. Additionally, it covers record types, blocking, and allocation techniques for efficient data retrieval and storage management.

Uploaded by

yash1215singh
Copyright
© All Rights Reserved

File Organization

Introduction
• Databases are stored physically as files of
records, which are typically stored on
magnetic disk.
• The DBMS software can then retrieve, update
and process this data as needed. Computer
storage media form a storage hierarchy that
includes two main categories.
Types of storage
• Primary storage

• Secondary storage
Introduction
• Database applications need only a small
portion of the database at a time for
processing.
• Whenever a certain portion of the data is
needed, it must be located on disk, copied to
main memory for processing, and then
rewritten to the disk if the data is changed.
• The data stored on disk is organized as files of
records.
Types of primary file organization
• There are several primary file organizations,
which determine how the records of a file are
physically placed on the disk, and how the
records are accessed.
• Heap file – A heap file (or unordered file)
places the records on disk in no particular
order by appending new records at the end of
the file.
Types of primary file organization
• Sorted file – A sorted file (or sequential file)
keeps the records ordered by the value of a
particular field (called the sort key).
• Hashed file – A hashed file uses a hash
function applied to a particular field (called
the hash key) to determine a record’s
placement on disk.
• Other primary file organizations, such as
B-trees, use tree structures.
• A secondary organization or auxiliary access
structure allows efficient access to the
records of a file based on alternate fields,
other than those that have been used for the
primary file organization.
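As a rough illustration of the hashed file idea above, the sketch below maps a hash key to a block number with a simple modulo hash. The slides do not name a hash function; `hash_block`, the integer key, and the block count `num_blocks` are assumptions made for this example only.

```c
#include <assert.h>

/* Hypothetical hash function: maps an integer hash key to one of
 * num_blocks disk blocks using key mod num_blocks. */
unsigned long hash_block(unsigned long key, unsigned long num_blocks)
{
    assert(num_blocks > 0);   /* a file needs at least one block */
    return key % num_blocks;
}
```

With 10 blocks, a record whose hash key is 23 would be placed in block 3, and a later lookup for key 23 would go straight to that same block.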
Disk Storage Devices
• Preferred secondary storage device for high
storage capacity and low cost.
• Data stored as magnetized areas on magnetic
disk surfaces.
• A disk pack contains several magnetic disks
connected to a rotating spindle.
• Disks are divided into concentric circular
tracks on each disk surface.
– Track capacities vary typically from 4 to 50 Kbytes
or more
Figure: Hard disk
Disk Storage Devices (contd.)
• A track is divided into smaller blocks or sectors
– because it usually contains a large amount of information
• The division of a track into sectors is hard-coded on the disk
surface and cannot be changed.
– One type of sector organization calls a portion of a track that
subtends a fixed angle at the center as a sector.
• A track is divided into blocks.
– The block size B is fixed for each system.
• Typical block sizes range from B=512 bytes to B=4096 bytes.
– Whole blocks are transferred between disk and main memory
for processing.

Disk Storage Devices (contd.)
• A read-write head moves to the track that contains the block
to be transferred.
– Disk rotation moves the block under the read-write head for
reading or writing.
• A physical disk block (hardware) address consists of:
– a cylinder number (imaginary collection of tracks of same radius
from all recorded surfaces)
– the track number or surface number (within the cylinder)
– and block number (within track).
• Reading or writing a disk block is time consuming because of
the seek time s and rotational delay (latency) rd.
• Double buffering can be used to speed up the transfer of
contiguous disk blocks.
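The three-part block address listed above can be modeled as a small structure. The sketch below is only an illustration: the geometry fields, the function names, and the zero-based numbering of cylinders, tracks and blocks are assumptions, not something the slides specify.

```c
#include <assert.h>

/* Hypothetical disk geometry; the field names are illustrative. */
struct disk_geometry {
    unsigned tracks_per_cylinder;  /* number of recorded surfaces */
    unsigned blocks_per_track;
};

/* Physical block address: cylinder, track (surface), block. */
struct block_address {
    unsigned cylinder;
    unsigned track;   /* track (surface) number within the cylinder */
    unsigned block;   /* block number within the track */
};

/* Number the blocks 0, 1, 2, ... cylinder by cylinder, track by track. */
unsigned long to_linear(struct disk_geometry g, struct block_address a)
{
    assert(a.track < g.tracks_per_cylinder);
    assert(a.block < g.blocks_per_track);
    return ((unsigned long)a.cylinder * g.tracks_per_cylinder + a.track)
           * g.blocks_per_track + a.block;
}

/* Inverse of to_linear: recover the three-part address. */
struct block_address from_linear(struct disk_geometry g, unsigned long n)
{
    struct block_address a;
    assert(g.tracks_per_cylinder > 0 && g.blocks_per_track > 0);
    a.block = (unsigned)(n % g.blocks_per_track);
    n /= g.blocks_per_track;
    a.track = (unsigned)(n % g.tracks_per_cylinder);
    a.cylinder = (unsigned)(n / g.tracks_per_cylinder);
    return a;
}
```

Numbering blocks cylinder by cylinder keeps consecutive block numbers on the same cylinder where possible, so reading them in order needs few head movements.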

• To transfer a disk block given its address, the
disk controller must first mechanically position
the read/write head on the correct track.
• The time required to do this is called the seek
time.
• Typical seek times are 7 to 10 msec on
desktops and 3 to 8 msec on servers.
• Rotational delay or latency – the time spent
waiting while the beginning of the desired
block rotates into position under the
read/write head.
• Block transfer time – the additional time
needed to transfer the data.
• The total time needed to locate and transfer an
arbitrary block, given its address, is the sum
of the seek time, rotational delay and block
transfer time.
• The seek time and rotational delay are usually
much larger than the block transfer time.
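The sum described above can be sketched as a small calculation. The function names and parameters are hypothetical. The average rotational delay is taken as half a revolution, which is the usual estimate but is an assumption the slides do not state, and the transfer rate is an input the slides do not give.

```c
#include <assert.h>

/* Average rotational delay in msec: on average the disk must turn
 * half a revolution before the block arrives under the head. */
double avg_rotational_delay_ms(double rpm)
{
    assert(rpm > 0.0);
    return 0.5 * (60000.0 / rpm);   /* 60000 msec per minute */
}

/* Time in msec to transfer one block at a given transfer rate. */
double block_transfer_ms(double block_bytes, double bytes_per_ms)
{
    assert(bytes_per_ms > 0.0);
    return block_bytes / bytes_per_ms;
}

/* Total = seek time + rotational delay + block transfer time. */
double block_access_ms(double seek_ms, double rpm,
                       double block_bytes, double bytes_per_ms)
{
    return seek_ms + avg_rotational_delay_ms(rpm)
         + block_transfer_ms(block_bytes, bytes_per_ms);
}
```

For example, with assumed values of an 8 msec seek, a 7200 rpm disk, a 4096-byte block and a transfer rate of 40960 bytes per msec, the total is about 8 + 4.17 + 0.1 = 12.27 msec, almost all of it seek time and rotational delay.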
Placing file records on disk
• Data is usually stored in the form of records.
• Each record consists of a collection of related
data values or items.
• Each value is formed of one or more bytes
and corresponds to a particular field of the
record.
• Records usually describe entities and their
attributes.
• A collection of field names and their
corresponding data types constitutes a record
type or record format definition.
• A data type, associated with each field,
specifies the types of values a field can take.
struct Employee
{
    char name[30];        /* employee name */
    char eno[10];         /* employee number; the length 10 is an assumed value */
    int salary;
    int jobcode;
    char department[20];
};
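A record type like the one above fixes the size of every record at compile time. The sketch below shows one way to inspect that size in C. The helper function names are hypothetical, the length of `eno` is an assumed value, and the exact byte counts depend on the compiler's padding and the size of `int`.

```c
#include <assert.h>
#include <stddef.h>

struct Employee {
    char name[30];
    char eno[10];          /* length 10 is an assumed value */
    int  salary;
    int  jobcode;
    char department[20];
};

/* Size in bytes of one stored Employee record, including any
 * padding the compiler inserts between or after fields. */
size_t employee_record_size(void)
{
    return sizeof(struct Employee);
}

/* Byte offset of the salary field from the start of a record. */
size_t employee_salary_offset(void)
{
    return offsetof(struct Employee, salary);
}

/* Byte position of record number i (0-based) in a file that stores
 * Employee records back to back. */
size_t employee_record_position(size_t i)
{
    size_t r = sizeof(struct Employee);
    assert(i <= (size_t)-1 / r);   /* guard against overflow */
    return i * r;
}
```

Because every record has the same size, the position of any record and of any field inside it can be computed directly, without scanning the records before it.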
Files, Fixed-length Records, and Variable-length Records
• A file is a sequence of records.
• In many cases all records in a file are of the
same record type.
• If every record in the file has exactly the
same size (in bytes), then the file is said to be
made up of fixed-length records.
• If different records in the file have different
sizes, the file is said to be made up of
variable-length records. This can happen for
several reasons:
• The file records are of the same record type,
but one or more of the fields are of varying
size (variable-length fields). Example –
Employee name.
• The file records are of the same record type,
but one or more of the fields may have
multiple values for individual records; such a
field is called a repeating field and a group of
values for the field is often called a repeating
group.
• The file records are of the same record type,
but one or more of the fields are optional;
that is, they may have values for some but not
all of the file records (optional fields).
• The file contains records of different record
types and hence of varying size (mixed file).
• Example – Grade report.
• This would occur if related records of different
types were clustered together on disk blocks.
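A variable-length field such as an employee name needs some way to mark where it ends. The slides do not say how this is done. The sketch below assumes one common option, a 1-byte length stored in front of the field's bytes, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Store a variable-length field as a 1-byte length followed by the
 * field's bytes. Returns the number of bytes written, or 0 if the
 * value is longer than 255 bytes or does not fit in buf. */
size_t put_var_field(unsigned char *buf, size_t cap, const char *value)
{
    size_t len;
    assert(buf != NULL && value != NULL);
    len = strlen(value);
    if (len > 255 || len + 1 > cap)
        return 0;
    buf[0] = (unsigned char)len;
    memcpy(buf + 1, value, len);
    return len + 1;
}

/* Read a field written by put_var_field into out (NUL-terminated).
 * Returns the number of bytes consumed from buf, or 0 on error. */
size_t get_var_field(const unsigned char *buf, size_t avail,
                     char *out, size_t out_cap)
{
    size_t len;
    assert(buf != NULL && out != NULL);
    if (avail == 0)
        return 0;
    len = buf[0];
    if (len + 1 > avail || len + 1 > out_cap)
        return 0;
    memcpy(out, buf + 1, len);
    out[len] = '\0';
    return len + 1;
}
```

With this layout the name "Ann" takes 4 bytes and a longer name takes more, so records of the same type end up with different sizes.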
Record Blocking and Spanned Versus Un-spanned Records
• The records of a file must be allocated to disk
blocks because a block is the unit of data
transfer between disk and memory.
• When the block size is larger than the record
size, each block will contain numerous records,
although some files may have unusually large
records that cannot fit in one block.
• Suppose that the block size is B bytes.
• For a file of fixed-length records of size R
bytes, with B >= R, we can fit bfr = floor(B/R)
records per block.
• The value of bfr is called the blocking factor
for the file. In general, R may not divide B
exactly, so we have some unused space in each
block equal to B - (bfr * R) bytes.
• To utilize this unused space, we can store part
of a record on one block, and the rest on another.
• A pointer at the end of the first block
points to the block containing the remainder
of the record in case it is not the next
consecutive block on the disk. This
organization is called spanned.
Figure of Spanned and Un-spanned
• If records are not allowed to cross block
boundaries, the organization is called
un-spanned. This is used with fixed-length
records having B > R because it makes each
record start at a known location in the block.
• For variable-length records either a spanned or
an un-spanned organization can be used.
• For variable-length records using spanned
organization, each block may store a different
number of records.
• In this case the blocking factor bfr represents
the average number of records per block for the
file.
• The number of blocks b needed for a file of r
records is b = ceil(r/bfr) blocks.
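The formulas above for bfr, the unused space per block and the number of blocks b can be sketched with integer arithmetic. The function names are hypothetical, and the sketch assumes fixed-length, un-spanned records with B >= R.

```c
#include <assert.h>

/* Blocking factor bfr = floor(B / R) for un-spanned fixed-length records. */
unsigned long blocking_factor(unsigned long B, unsigned long R)
{
    assert(R > 0 && B >= R);
    return B / R;               /* integer division rounds down */
}

/* Unused bytes per block: B - (bfr * R). */
unsigned long unused_per_block(unsigned long B, unsigned long R)
{
    return B - blocking_factor(B, R) * R;
}

/* Number of blocks b = ceil(r / bfr) for a file of r records. */
unsigned long blocks_needed(unsigned long r, unsigned long bfr)
{
    assert(bfr > 0);
    return r / bfr + (r % bfr != 0);   /* round up if there is a remainder */
}
```

For B = 512 and R = 100 this gives bfr = 5 with 12 bytes unused per block, so a file of 1000 records needs 200 blocks and a file of 1001 records needs 201.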
Allocating File blocks on disk

• There are several standard techniques for
allocating the blocks of a file on disk.
• Contiguous allocation – the file blocks are
allocated to consecutive disk blocks.
• This makes reading the whole file very fast
using double buffering, but it makes
expanding the file difficult.
• Linked allocation – each file block contains a
pointer to the next file block.
• This makes it easy to expand the file but
makes it slow to read the whole file.
• A combination of the two allocates clusters of
consecutive disk blocks, and the clusters are
linked. Clusters are sometimes called file
segments or extents.
• Indexed allocation – one or more index
blocks contain pointers to the actual file
blocks.
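Linked allocation can be pictured with a small in-memory model in which each block stores the number of the next block of the file. Everything in this sketch is an assumption for illustration: the block layout, the `NO_BLOCK` end marker and the function name.

```c
#include <assert.h>
#include <stddef.h>

#define NO_BLOCK (-1)   /* marks the last block of a file */

/* Simulated disk block: a pointer to the next file block plus data. */
struct disk_block {
    int  next;          /* number of the next file block, or NO_BLOCK */
    char data[508];
};

/* Follow the chain starting at block `first`, recording the block
 * numbers in file order. Returns how many blocks were visited
 * (at most cap). */
size_t linked_file_blocks(const struct disk_block *disk, size_t disk_len,
                          int first, int *order, size_t cap)
{
    size_t n = 0;
    int cur = first;
    while (cur != NO_BLOCK && n < cap) {
        assert(cur >= 0 && (size_t)cur < disk_len);
        order[n++] = cur;
        cur = disk[cur].next;   /* one block read per step */
    }
    return n;
}
```

Each step has to read a block before the location of the next one is known, which is why reading the whole file is slow, while adding a block only requires updating one pointer.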
Unordered Files or HEAP files

• The simplest method of storing a DB table is to
store all the records of the table in the order
in which they are created, on contiguous
blocks, in a large file.
• Such a file is called a HEAP file, or a PILE.
• New records are inserted at the end of the
file.
• A linear search through the file records is
necessary to search for a record.
– This requires reading and searching half the file
blocks on the average, and is hence quite
expensive.
• Record insertion is quite efficient.
• Reading the records in order of a particular
field requires sorting the file records.
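The heap file operations described above can be sketched with an in-memory array standing in for the file. The record layout, the fixed capacity and the function names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CAPACITY 100   /* arbitrary size for this sketch */

struct record {
    int key;
    int value;
};

struct heap_file {
    struct record records[HEAP_CAPACITY];
    size_t count;           /* number of records stored so far */
};

/* Insert: append the new record at the end of the file.
 * Returns 1 on success, 0 if the file is full. */
int heap_insert(struct heap_file *f, struct record r)
{
    assert(f != NULL);
    if (f->count == HEAP_CAPACITY)
        return 0;
    f->records[f->count++] = r;
    return 1;
}

/* Search: scan records from the start until the key is found.
 * Returns the record's position, or -1 if it is absent. */
long heap_search(const struct heap_file *f, int key)
{
    size_t i;
    assert(f != NULL);
    for (i = 0; i < f->count; i++)
        if (f->records[i].key == key)
            return (long)i;
    return -1;
}
```

Insertion touches only the end of the file, while a search may have to examine every record, which matches the costs listed above.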
