DBMS Unit-3
(BCSPC4020)
SYLLABUS
• Primary Storage − The memory storage that is directly accessible to the CPU comes under this
category. CPU's internal memory (registers), fast memory (cache), and main memory (RAM) are directly
accessible to the CPU, as they are all placed on the motherboard or CPU chipset. This storage is typically
very small, ultra-fast, and volatile. Primary storage requires a continuous power supply to maintain its
state. In case of a power failure, all its data is lost.
• Secondary Storage − Secondary storage devices are used to store data for future use or as
backup. Secondary storage includes memory devices that are not a part of the CPU chipset or
motherboard, for example, magnetic disks, optical disks (DVD, CD, etc.), hard disks, flash drives, and
magnetic tapes.
• Tertiary Storage − Tertiary storage is used to store huge volumes of data. Because such storage
devices are external to the computer system, they are the slowest. These devices are
mostly used to back up an entire system. Optical disks and magnetic tapes are widely used
as tertiary storage.
Note: The memory with the fastest access is the costliest. Larger
storage devices are slower and less expensive per unit of storage,
but they can store huge volumes of data compared to CPU
registers or cache memory.
11/23/2022 DEPARTMENT OF CSE, GIET UNIVERSITY, GUNUPUR 6
DBMS Storage Structure
Magnetic Disks
Hard disk drives are the most common secondary storage devices in present computer systems. These are
called magnetic disks because they use the concept of magnetization to store information. Hard disks
consist of metal disks coated with magnetizable material. These disks are placed vertically on a spindle. A
read/write head moves in between the disks and is used to magnetize or de-magnetize the spot under it. A
magnetized spot can be recognized as 0 (zero) or 1 (one).
Magnetic Tape
DBMS RAID STRUCTURE
RAID 0
In this level, a striped array of disks is implemented. The data is broken down into blocks and the blocks
are distributed among disks. Each disk receives a block of data to write/read in parallel. It enhances the
speed and performance of the storage device. There is no parity or backup in Level 0, and
there is no duplication of data. Hence, a block once lost cannot be recovered.
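The striping described above can be sketched as follows. This is a minimal illustration, assuming round-robin placement of blocks; the function names `stripe_blocks` and `read_striped` are hypothetical, and the disks are modelled as plain Python lists.

```python
def stripe_blocks(blocks, num_disks):
    """Distribute blocks round-robin across num_disks disks (RAID 0 style)."""
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks


def read_striped(disks):
    """Reassemble the original block order from the striped disks."""
    total = sum(len(d) for d in disks)
    n = len(disks)
    return [disks[i % n][i // n] for i in range(total)]


blocks = ["B0", "B1", "B2", "B3", "B4", "B5"]
disks = stripe_blocks(blocks, 3)
print(disks)                # [['B0', 'B3'], ['B1', 'B4'], ['B2', 'B5']]
print(read_striped(disks))  # original order restored
```

Because no disk holds a copy or parity of another disk's blocks, losing one list in `disks` loses those blocks for good.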
RAID 5
RAID 5 uses block-level striping with distributed parity.
It writes whole data blocks onto different disks, but the parity bits generated for each stripe of data blocks are
distributed among all the disks rather than stored on a separate dedicated parity disk.
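A small sketch of distributed parity, assuming XOR parity and a simple rotation of the parity position from stripe to stripe (the exact rotation used by a real controller may differ). The helper names `xor_blocks`, `build_stripe` and `rebuild` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_stripe(data_blocks, stripe_no):
    """Lay out one stripe over len(data_blocks) + 1 disks with rotating parity."""
    num_disks = len(data_blocks) + 1
    parity_disk = (num_disks - 1 - stripe_no) % num_disks  # assumed rotation scheme
    stripe = list(data_blocks)
    stripe.insert(parity_disk, xor_blocks(data_blocks))
    return stripe, parity_disk


def rebuild(stripe, failed_disk):
    """Recover the block on a single failed disk by XOR-ing the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_disk]
    return xor_blocks(survivors)


stripe0, p0 = build_stripe([b"AAAA", b"BBBB", b"CCCC"], 0)
stripe1, p1 = build_stripe([b"DDDD", b"EEEE", b"FFFF"], 1)
print(p0, p1)               # parity sits on a different disk in each stripe
print(rebuild(stripe0, 1))  # b'BBBB'
```

Unlike RAID 0, any single lost block in a stripe can be recomputed from the remaining blocks of that stripe.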
File Organization
A file is a collection of records. Records can be accessed using the primary key. The type of file organization
used for a given set of records determines the type and frequency of access it can support.
File organization is a logical relationship among the various records. It defines how file records are mapped
onto disk blocks.
In other words, file organization describes how records are stored in blocks and how the blocks are
placed on the storage medium.
The first approach to mapping the database to files is to use several files and store records of only one fixed length
in any given file. An alternative approach is to structure the files so that they can hold records of multiple lengths.
Files of fixed-length records are easier to implement than files of variable-length records.
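One reason fixed-length records are easier to implement is that the position of record i is simply i × record size. A minimal sketch, assuming a hypothetical record layout of a 4-byte integer id followed by a 20-byte name, with an in-memory buffer standing in for the file:

```python
import io
import struct

# Hypothetical layout: 4-byte little-endian integer id + 20-byte name field.
RECORD = struct.Struct("<i20s")


def write_record(f, slot, rec_id, name):
    """Write a record into slot number `slot`; the offset is slot * record size."""
    f.seek(slot * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_record(f, slot):
    """Read the record stored in slot number `slot` without scanning the file."""
    f.seek(slot * RECORD.size)
    rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\x00").decode("ascii")


f = io.BytesIO()  # stands in for a file on disk
write_record(f, 0, 101, "Asha")
write_record(f, 1, 102, "Ravi")
write_record(f, 2, 103, "Meena")
print(RECORD.size)        # 24 bytes per record
print(read_record(f, 1))  # (102, 'Ravi')
```

With variable-length records this direct offset calculation is no longer possible, so extra bookkeeping (for example, length fields or slot directories) is needed.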
Objective of file organization
• Records should be selectable as quickly as possible, i.e., the selection of records should be optimal.
• Insert, delete and update transactions on the records should be quick and easy.
• Duplicate records should not be introduced as a result of an insert, update or delete.
• Records should be stored efficiently to keep the cost of storage minimal.
B+ File Organization
B+ tree file organization is an advanced form of the indexed sequential access method (ISAM). It uses a tree-like
structure to store records in a file.
It uses the same key-index concept, in which the primary key is used to sort the records. For each primary key, an
index value is generated and mapped to the record.
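A lookup in such a tree can be sketched as below. This is a hand-built, read-only illustration that assumes the usual B+ tree layout (records only in the leaves, leaves linked left to right); insertion and node splitting are omitted, and the class and function names are hypothetical.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys, records):
        self.keys = keys          # sorted primary keys
        self.records = records    # records[i] belongs to keys[i]
        self.next = None          # link to the next leaf, used for range scans


class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # separator keys
        self.children = children  # len(children) == len(keys) + 1


def find_leaf(node, key):
    """Descend from the root to the leaf that could contain `key`."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node


def search(root, key):
    """Return the record for `key`, or None if the key is absent."""
    leaf = find_leaf(root, key)
    if key in leaf.keys:
        return leaf.records[leaf.keys.index(key)]
    return None


def range_scan(root, low, high):
    """Return all records with low <= key <= high by walking the leaf chain."""
    leaf, out = find_leaf(root, low), []
    while leaf is not None:
        for k, r in zip(leaf.keys, leaf.records):
            if k > high:
                return out
            if k >= low:
                out.append(r)
        leaf = leaf.next
    return out


l1 = Leaf([5, 10], ["r5", "r10"])
l2 = Leaf([20, 30], ["r20", "r30"])
l3 = Leaf([40, 50], ["r40", "r50"])
l1.next, l2.next = l2, l3
root = Internal([20, 40], [l1, l2, l3])
print(search(root, 30))          # r30
print(range_scan(root, 10, 40))  # ['r10', 'r20', 'r30', 'r40']
```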
Indexing in Databases
Indexing is a way to optimize the performance of a database by
minimizing the number of disk accesses required when a query is
processed.
It is a data structure technique which is used to quickly locate and
access the data in a database.
Indexes are created using a few database columns.
The first column is the Search key that contains a copy of the primary key or candidate key of the table.
These values are stored in sorted order so that the corresponding data can be accessed quickly.
Note: The data may or may not be stored in sorted order.
The second column is the Data Reference or Pointer which contains a set of pointers holding the address of
the disk block where that particular key value can be found.
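The two-column structure described above can be sketched as a sorted list of (search key, block pointer) pairs. The data below is hypothetical, and a Python dictionary of lists stands in for disk blocks.

```python
from bisect import bisect_left

# Hypothetical data file: block id -> records (key, payload); not sorted by key.
data_blocks = {
    "B1": [(30, "Carol"), (10, "Alice")],
    "B2": [(20, "Bob"), (40, "Dave")],
}


def build_index(blocks):
    """Return index entries (search key, block pointer) sorted by search key."""
    return sorted(
        (key, block_id) for block_id, records in blocks.items() for key, _ in records
    )


def lookup(index, blocks, key):
    """Binary-search the index, then read only the block the pointer names."""
    keys = [k for k, _ in index]
    pos = bisect_left(keys, key)
    if pos == len(keys) or keys[pos] != key:
        return None
    block_id = index[pos][1]
    for k, payload in blocks[block_id]:  # a single block access
        if k == key:
            return payload
    return None


index = build_index(data_blocks)
print(index)                           # [(10, 'B1'), (20, 'B2'), (30, 'B1'), (40, 'B2')]
print(lookup(index, data_blocks, 20))  # Bob
```

The index is sorted even though the data blocks are not, which is what allows the binary search.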
Ordered Indices
Primary Index:
Primary indexing refers to the process of creating an index based on the table's primary
key.
Each primary key value is unique to one record, which establishes a 1:1 relationship between
key values and records.
The searching operation is fairly efficient because the primary keys are stored in sorted
order.
There are two types of primary indexes: dense indexes and sparse indexes.
Dense Index:
For every search key value in the data file, there is an index record.
This record contains the search key and also a reference to the first data record with that
search key value.
Sparse Index:
An index record appears only for some of the items in the data file. Each index record points to a block.
To locate a record, we find the index record with the largest search key value less than or equal to the
search key value we are looking for.
We start at the record pointed to by that index record and follow the pointers in the
file (that is, sequentially) until we find the desired record.
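The lookup procedure for a sparse index can be sketched as follows, assuming one index entry per block (the first key of each block) and a data file sorted on the search key. The data and the name `sparse_lookup` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical sorted data file, split into blocks of (key, payload) records.
blocks = [
    [(1, "a"), (2, "b"), (5, "c")],
    [(11, "d"), (14, "e"), (17, "f")],
    [(21, "g"), (25, "h"), (29, "i")],
]

# One index entry per block: (first key in the block, block number).
sparse_index = [(block[0][0], i) for i, block in enumerate(blocks)]


def sparse_lookup(key):
    """Find the largest index key <= key, then scan that block sequentially."""
    anchors = [k for k, _ in sparse_index]
    pos = bisect_right(anchors, key) - 1
    if pos < 0:                      # key is smaller than every indexed key
        return None
    for k, payload in blocks[sparse_index[pos][1]]:
        if k == key:
            return payload
        if k > key:                  # passed the spot where the key would be
            break
    return None


print(sparse_index)       # [(1, 0), (11, 1), (21, 2)]
print(sparse_lookup(14))  # e
```

A dense index would instead hold an entry for every search key value, so no sequential scan inside the block would be needed to decide whether a key exists.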
Primary Indexing
• The main file is sorted.
• The primary key is used as the anchor attribute (search key).
• It is an example of sparse indexing.
• No. of entries in the index file = no. of blocks occupied by the main file.
• No. of block accesses required = log₂ n + 1, where n is the number of blocks occupied by the index file.
Figure: the index file holds (P.K, B.P) entries 1, 11, 21, …, 91. Each block pointer (B.P) leads to the main-file block (B-1 to B-4 are shown) whose first primary key is that value.
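As a worked example of the access count, the sketch below evaluates log₂ n + 1 for a few index sizes. Rounding the logarithm up to a whole number of accesses is an assumption made here, and `block_accesses` is a hypothetical helper name.

```python
import math


def block_accesses(index_blocks):
    """Binary search over the index blocks, plus one access to the data block."""
    return math.ceil(math.log2(index_blocks)) + 1


for n in (1, 8, 100):
    print(n, block_accesses(n))
# 1 index block    -> 1 access
# 8 index blocks   -> 4 accesses
# 100 index blocks -> 8 accesses
```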
Clustering index
A clustering index is defined on an ordered data file in which the records are ordered on a non-key field.
In some cases, the index is created on non-primary-key columns, which may not be unique for each
record. To identify records faster in such cases, two or more columns are grouped
together to obtain unique values, and the index is created on them.
This method is known as the clustering index. Basically, records with similar characteristics are
grouped together and indexes are created for these groups.
For example, students are grouped by semester: 1st semester students,
2nd semester students, 3rd semester students, and so on.
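The semester example can be sketched as below, assuming the student file is already sorted on the non-key field and that the index keeps one entry per distinct semester, pointing at the first block containing it. The data, block size and names are hypothetical.

```python
# Hypothetical student file, already sorted on the non-key field `semester`.
students = [
    (1, "Asha"), (1, "Ravi"), (2, "Meena"), (2, "Kiran"), (2, "Lata"), (3, "Om"),
]
BLOCK_SIZE = 2
blocks = [students[i:i + BLOCK_SIZE] for i in range(0, len(students), BLOCK_SIZE)]

# One index entry per distinct semester: semester -> first block containing it.
cluster_index = {}
for block_no, block in enumerate(blocks):
    for semester, _ in block:
        cluster_index.setdefault(semester, block_no)


def students_in(semester):
    """Jump to the first block of the group, then read until the group ends."""
    start = cluster_index.get(semester)
    if start is None:
        return []
    names = []
    for block in blocks[start:]:
        for sem, name in block:
            if sem == semester:
                names.append(name)
            elif sem > semester:     # file is sorted, so the group has ended
                return names
    return names


print(cluster_index)   # {1: 0, 2: 1, 3: 2}
print(students_in(2))  # ['Meena', 'Kiran', 'Lata']
```

Because a group can spill over into following blocks, a lookup may need more than one data block access.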
Clustered Indexing
• The main file is sorted (on a non-key attribute).
• There will be one entry for each unique value of the non-key attribute.
• If the no. of blocks occupied by the index file is n, then the block accesses required will be ≥ log₂ n + 1.
Figure: the index file holds (N.K, B.P) entries for the values 1 to 6; the main file, sorted on the non-key attribute (1, 1, 1, 2, 3, 4, 4, 5, 5, 6, 6, …), is stored in blocks B-1, B-2, B-3.
Secondary Indexing
• The main file is unsorted.
• It can be done on a key as well as a non-key attribute.
• It is called secondary because normally one indexing is already done.
• It is an example of dense indexing.
• No. of entries in the index file = no. of entries in the main file.
Figure: the index file holds sorted (S.K, B.P) entries 1, 2, 3, 4, 5, 6, …, 91; the main file stores the same anchor-attribute values in unsorted order across blocks B-1, B-2, B-3.
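A dense secondary index on a non-key attribute can be sketched as follows. The main file, the attribute `city` and the function `rolls_in` are hypothetical; the point is that the index has one sorted entry per record even though the main file itself is unsorted.

```python
from bisect import bisect_left, bisect_right

# Hypothetical unsorted main file: block id -> records (roll_no, city).
main_file = {
    "B-1": [(1, "Pune"), (27, "Delhi")],
    "B-2": [(3, "Pune"), (91, "Agra")],
    "B-3": [(2, "Delhi"), (5, "Pune")],
}

# Dense secondary index on the non-key field `city`: one entry per record,
# kept sorted on the search key.
secondary = sorted(
    (city, block) for block, records in main_file.items() for _, city in records
)
cities = [c for c, _ in secondary]


def rolls_in(city):
    """Binary-search the index for all entries of `city`, then read those blocks."""
    lo, hi = bisect_left(cities, city), bisect_right(cities, city)
    result = []
    for block in sorted({b for _, b in secondary[lo:hi]}):
        result += [roll for roll, c in main_file[block] if c == city]
    return result


print(len(secondary))     # 6 entries, one per record in the main file
print(rolls_in("Delhi"))  # [27, 2]
```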