DBMS 5

The document discusses multiple granularity in database locking, explaining its hierarchical structure and benefits for concurrency. It also covers various file organization methods, including sequential, heap, hash, B+, and cluster file organizations, detailing their characteristics and use cases. Additionally, it describes storage devices, RAID configurations, and bitmap indexing, emphasizing their roles in data management and retrieval.

Uploaded by

vengalamadhav
Copyright © All Rights Reserved

Multiple Granularity

Let's start by understanding the meaning of granularity.

Granularity: It is the size of the data item that can be locked.

Multiple Granularity:

o It can be defined as hierarchically breaking up the database into blocks
that can be locked.
o The multiple granularity protocol enhances concurrency and reduces lock
overhead.
o It keeps track of what to lock and how to lock it.
o It makes it easy to decide whether to lock or unlock a data item.
This hierarchy can be represented graphically as a tree.
For example, consider a tree that has four levels of nodes.

o The first (highest) level represents the entire database.
o The second level consists of nodes of type area. The database consists of
exactly these areas.
o Each area has child nodes known as files. No file can be present in more
than one area.
o Finally, each file has child nodes known as records. The file has exactly
those records that are its child nodes. No record is present in more than one
file.
o Hence, the levels of the tree, starting from the top level, are as follows:

o Database
o Area
o File
o Record
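
The four-level tree can be sketched in Python as follows. This is a minimal illustration and not a full locking protocol: it assumes a single exclusive lock mode, and the names `Node` and `try_lock` and the transaction ids `T1` and `T2` are hypothetical. A lock on a node is treated as implicitly covering its whole subtree, so a request is refused when another transaction holds a lock on the node itself, on one of its ancestors, or on one of its descendants.

```python
class Node:
    """One lockable item in the granularity tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.locked_by = None  # id of the transaction holding an explicit lock
        if parent is not None:
            parent.children.append(self)

    def ancestors(self):
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def descendants(self):
        for child in self.children:
            yield child
            yield from child.descendants()


def try_lock(node, txn):
    """Lock `node` for `txn` unless another transaction holds a conflicting lock."""
    related = [node, *node.ancestors(), *node.descendants()]
    if any(n.locked_by not in (None, txn) for n in related):
        return False
    node.locked_by = txn
    return True


# Database -> Area -> File -> Record
database = Node("database")
area = Node("area1", database)
file_a = Node("fileA", area)
file_b = Node("fileB", area)
rec_a1 = Node("recordA1", file_a)
rec_b1 = Node("recordB1", file_b)

print(try_lock(file_a, "T1"))    # True: T1 locks fileA and, implicitly, its records
print(try_lock(rec_a1, "T2"))    # False: the parent file is locked by T1
print(try_lock(rec_b1, "T2"))    # True: a record in a different file
print(try_lock(database, "T2"))  # False: T1 holds a lock inside the database
```

Locking one file node instead of each of its records is where the reduced lock overhead comes from, while locking a single record leaves the rest of the file available to other transactions.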

File Organization

o A file is a collection of records. Using the primary key, we can access the
records. The type and frequency of access are determined by the type of
file organization used for a given set of records.
o File organization is a logical relationship among the various records. It
defines how file records are mapped onto disk blocks.
o File organization describes the way in which the records are stored
in terms of blocks, and how the blocks are placed on the storage medium.
o The first approach to mapping the database to files is to use several files
and store records of only one fixed length in any given file. An alternative
approach is to structure the files so that they can hold records of multiple
lengths.
o Files of fixed-length records are easier to implement than files of
variable-length records.
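
One reason fixed-length records are easier to implement is that the position of the n-th record can be computed directly as n times the record size. The sketch below illustrates this with Python's standard `struct` module. The record layout (a 4-byte integer id followed by a 20-byte name) and the helper names are assumptions made for the example.

```python
import struct

# Hypothetical layout: 4-byte integer id + 20-byte name = 24 bytes per record.
RECORD = struct.Struct("<i20s")


def pack_record(record_id, name):
    """Encode one record; struct pads the name with zero bytes to 20 bytes."""
    return RECORD.pack(record_id, name.encode("utf-8"))


def read_record(data, n):
    """Read the n-th record (0-based) straight from its computed offset."""
    record_id, raw_name = RECORD.unpack_from(data, n * RECORD.size)
    return record_id, raw_name.rstrip(b"\x00").decode("utf-8")


rows = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
file_bytes = b"".join(pack_record(rid, name) for rid, name in rows)

print(RECORD.size)                 # 24
print(read_record(file_bytes, 2))  # (3, 'Carol')
```

With variable-length records no such offset formula exists, so the file needs extra bookkeeping, such as length prefixes or an offset table, to locate a record.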

Types of file organization:
File organization can be done using various methods. Each method has pros
and cons depending on how the records are accessed or selected. The
programmer chooses the file organization method best suited to the
requirements.

Types of file organization are as follows:

Sequential File Organization

This is the simplest method of file organization. In this method, records are
stored sequentially in the file.

o Records are stored in a sequence, i.e., one after another, in the order in
which they are inserted into the table.
o When a record is updated or deleted, it is first searched for in the memory
blocks. When it is found, it is marked for deletion and the new record is
inserted.
Insertion of a new record:
Suppose we have records R1, R3, and so on up to R9 and R8 in a sequence.
Each record is simply a row in the table. If we want to insert a new
record R2 into the sequence, it is placed at the end of the file.

Heap file organization

o It is the simplest and most basic type of organization. It works with data
blocks. In heap file organization, records are inserted at the end of the
file. Inserting records does not require any sorting or ordering of
records.
o When a data block is full, the new record is stored in some other block.
This block need not be the very next data block; the DBMS can select
any data block in memory to store new records. The heap file is also
known as an unordered file.
o In the file, every record has a unique id, and every page in the file is of
the same size. It is the DBMS's responsibility to store and manage the new
records.
Suppose we have five records R1, R3, R6, R4 and R5 in a heap, and we
want to insert a new record R2. If data block 3 is full, R2 will be
inserted in any data block selected by the DBMS, let's say data block 1.
If we want to search, update or delete data in heap file organization,
we need to traverse the data from the start of the file until we reach the
requested record.

If the database is very large, searching, updating or deleting a record will be
time-consuming because there is no sorting or ordering of records. In heap file
organization, we need to check all the data until we get the requested record.
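
The R2 example and the cost of searching can be sketched as follows. The block capacity of two records and the policy of choosing the first block with free space are assumptions for illustration; the text only says that the DBMS selects some block.

```python
BLOCK_CAPACITY = 2  # assumed number of records per data block


def insert(blocks, record):
    """Place the record in the first block with free space; return the block number."""
    for number, block in enumerate(blocks, start=1):
        if len(block) < BLOCK_CAPACITY:
            block.append(record)
            return number
    blocks.append([record])  # every block is full: start a new one
    return len(blocks)


def search(blocks, key):
    """Scan from the start of the file; return how many records were examined."""
    examined = 0
    for block in blocks:
        for record in block:
            examined += 1
            if record == key:
                return examined
    return None  # not found after checking every record


# Data blocks 1 to 3; block 3 is full, block 1 still has room.
blocks = [["R1"], ["R3", "R6"], ["R4", "R5"]]

print(insert(blocks, "R2"))  # 1
print(search(blocks, "R5"))  # 6
print(search(blocks, "R9"))  # None
```

Finding R5 means examining all six records, which is the linear cost the paragraph above describes.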

Hash File Organization

Hash file organization computes a hash function on some fields of the
records. The output of the hash function determines the disk block where the
records are to be placed.
When a record has to be retrieved using the hash key columns, the address is
generated and the whole record is retrieved using that address. In the same way,
when a new record has to be inserted, the address is generated using the hash
key and the record is inserted directly. The same process is applied in the case
of delete and update.

In this method, there is no need to search or sort the entire file. Each
record is stored at the address given by the hash function, so records appear
scattered across memory rather than in key order.
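
A minimal sketch of this address computation follows. It assumes integer keys, four buckets standing in for disk blocks, and a simple modulo hash function; real systems use other hash functions and also need a policy for overflowing buckets, which is not shown here.

```python
NUM_BUCKETS = 4  # assumed number of disk blocks (buckets)


def bucket_address(key):
    """Hash function: map a key to the bucket that holds its record."""
    return key % NUM_BUCKETS


def insert(buckets, key, record):
    buckets[bucket_address(key)][key] = record


def fetch(buckets, key):
    return buckets[bucket_address(key)].get(key)


def delete(buckets, key):
    return buckets[bucket_address(key)].pop(key, None)


buckets = [dict() for _ in range(NUM_BUCKETS)]
insert(buckets, 103, "Alice")
insert(buckets, 104, "Bob")
insert(buckets, 107, "Carol")  # 107 % 4 == 3, the same bucket as 103

print(bucket_address(103))  # 3
print(fetch(buckets, 107))  # Carol
print(delete(buckets, 104)) # Bob
print(fetch(buckets, 104))  # None
```

Every operation touches only the bucket computed from the key, which is why no scan of the whole file is needed.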

B+ File Organization

o B+ tree file organization is an advanced form of the indexed sequential
access method. It uses a tree-like structure to store records in a file.
o It uses the same key-index concept, where the primary key is used to sort
the records. For each primary key, an index value is generated and
mapped to the record.
o The B+ tree is similar to a binary search tree (BST), but a node can have
more than two children. In this method, all records are stored only at the
leaf nodes. Intermediate nodes act as pointers to the leaf nodes. They do not
contain any records.
In the example B+ tree:

o There is one root node, i.e., 25.
o There is an intermediary layer of nodes. They do not store the actual
records; they hold only pointers toward the leaf nodes.
o The node to the left of the root holds a value smaller than the root and the
node to the right holds a larger value, i.e., 15 and 30 respectively.
o There is a single leaf level, which holds the values 10, 12, 17, 20, 24, 27
and 29.
o Searching for any record is easy because the tree is balanced: all leaf
nodes are at the same depth.
o Any record can be found by traversing a single path from the root to a
leaf.
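
The single-path search can be sketched with the values from the example (root 25, intermediate nodes 15 and 30, leaf values 10 to 29). How the leaf values are grouped into nodes is an assumption, as are the `Node` class and the placeholder record strings. Insertion and node splitting are not shown.

```python
from bisect import bisect_right


class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys
        self.children = children      # None for a leaf node
        self.records = records or {}  # only leaf nodes hold records


def leaf(*keys):
    return Node(list(keys), records={k: f"record-{k}" for k in keys})


def search(node, key):
    """Follow one root-to-leaf path; return the record and the nodes visited."""
    path = []
    while node.children is not None:
        path.append(node.keys)
        # Keys smaller than a separator go left, equal or larger go right.
        node = node.children[bisect_right(node.keys, key)]
    path.append(node.keys)
    return node.records.get(key), path


# Assumed grouping of the leaf values under the intermediate nodes.
left = Node([15], children=[leaf(10, 12), leaf(17, 20, 24)])
right = Node([30], children=[leaf(27, 29), leaf()])
root = Node([25], children=[left, right])

print(search(root, 20))  # ('record-20', [[25], [15], [17, 20, 24]])
print(search(root, 26))  # (None, [[25], [30], [27, 29]])
```

Each lookup visits exactly one node per level, so the cost depends on the height of the tree rather than on the number of records.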

Indexed sequential access method (ISAM)

o ISAM is an advanced sequential file organization. In this method,
records are stored in the file using the primary key. An index value is
generated for each primary key and mapped to the record. This index
contains the address of the record in the file.
o If a record has to be retrieved based on its index value, the address of
the data block is fetched and the record is retrieved from memory.
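
The key-to-address mapping can be illustrated with a small sketch. Here the "address" is simply the position of the record in a Python list standing in for the file, and the index is a sorted list of (key, address) pairs searched with binary search. Both choices, and the function names, are assumptions made for the example.

```python
from bisect import bisect_left


def build_index(records):
    """Return sorted (primary key, address) pairs for the stored records."""
    return sorted((key, address) for address, (key, _) in enumerate(records))


def lookup(index, records, key):
    """Binary-search the index, then fetch the record at the stored address."""
    i = bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return records[index[i][1]]
    return None


# Records as (primary key, name); the list position plays the role of an address.
records = [(30, "Carol"), (10, "Alice"), (20, "Bob")]
index = build_index(records)

print(index)                       # [(10, 1), (20, 2), (30, 0)]
print(lookup(index, records, 20))  # (20, 'Bob')
print(lookup(index, records, 25))  # None
```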

Cluster file organization

o When records from two or more tables are stored in the same file, the file
is known as a cluster. Such files hold two or more tables in the same data
block, and the key attributes used to map these tables together are stored
only once.
o This method reduces the cost of searching for related records in different
files.
o The cluster file organization is used when there is a frequent need to join
the tables on the same condition. These joins return only a few records
from both tables, for example the records for particular departments only.
This method is not suited to retrieving the records of an entire department.
In this method, we can directly insert, update or delete any record. Data is
sorted based on the key used for searching. The cluster key is the key on
which the tables are joined.

STORAGE DEVICES

Databases are stored in file formats, which contain records. At the physical
level, the actual data is stored in electromagnetic format on some device. These
storage devices can be broadly categorized into three types:

o Primary Storage: The memory storage that is directly accessible to the CPU
comes under this category. The CPU's internal memory (registers), fast memory
(cache), and main memory (RAM) are directly accessible to the CPU, as they are
all placed on the motherboard or CPU chipset. This storage is typically very
small, ultra-fast, and volatile. Primary storage requires a continuous power
supply in order to maintain its state. In case of a power failure, all its data
is lost.
o Secondary Storage: Secondary storage devices are used to store data for
future use or as backup. Secondary storage includes memory devices that are not
a part of the CPU chipset or motherboard, for example, magnetic disks, optical
disks (DVD, CD, etc.), hard disks, flash drives, and magnetic tapes.
o Tertiary Storage: Tertiary storage is used to store huge volumes of data.
Since such storage devices are external to the computer system, they are the
slowest in speed. These storage devices are mostly used to take a backup of an
entire system. Optical disks and magnetic tapes are widely used as tertiary
storage.

Memory Hierarchy

A computer system has a well-defined hierarchy of memory. A CPU has direct
access to its main memory as well as its built-in registers. Main memory is
much slower than the CPU. To reduce this speed mismatch, cache memory is
introduced. Cache memory offers much faster access than main memory and
holds the data that is most frequently accessed by the CPU.

The memory with the fastest access is the costliest one. Larger storage devices
are slower and less expensive, but they can store huge volumes
of data compared to CPU registers or cache memory.

Magnetic Disks

Hard disk drives are the most common secondary storage devices in present
computer systems. These are called magnetic disks because they use
magnetization to store information. Hard disks consist of metal disks coated
with magnetizable material. These disks are stacked on a spindle. A read/write
head moves between the disks and is used to magnetize or de-magnetize the spot
under it. A magnetized spot can be recognized as 0 (zero) or 1 (one).

Hard disks are formatted in a well-defined order to store data efficiently. A hard
disk plate has many concentric circles on it, called tracks. Every track is further
divided into sectors. A sector on a hard disk typically stores 512 bytes of data.

Redundant Array of Independent Disks

RAID, or Redundant Array of Independent Disks, is a technology to connect
multiple secondary storage devices and use them as a single storage medium.

RAID consists of an array of disks in which multiple disks are connected together
to achieve different goals. RAID levels define the use of disk arrays.

RAID 0

In this level, a striped array of disks is implemented. The data is broken down
into blocks and the blocks are distributed among the disks. Each disk receives a
block of data to write or read in parallel. This enhances the speed and
performance of the storage device. There is no parity and no backup in level 0.
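
Block distribution in level 0 can be sketched as a round-robin assignment of blocks to disks. The round-robin order and the function names are assumptions for illustration; the disks are modelled as plain Python lists.

```python
def stripe(blocks, num_disks):
    """Distribute blocks round-robin: block i goes to disk i % num_disks."""
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks


def read_back(disks):
    """Reassemble the original block order from the striped disks."""
    total = sum(len(disk) for disk in disks)
    return [disks[i % len(disks)][i // len(disks)] for i in range(total)]


blocks = ["B0", "B1", "B2", "B3", "B4", "B5"]
disks = stripe(blocks, 3)

print(disks)             # [['B0', 'B3'], ['B1', 'B4'], ['B2', 'B5']]
print(read_back(disks))  # ['B0', 'B1', 'B2', 'B3', 'B4', 'B5']
```

Because no block is stored twice and no parity is kept, losing any one disk loses part of the data.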

RAID 1

RAID 1 uses mirroring techniques. When data is sent to a RAID controller, it
sends a copy of the data to all the disks in the array. RAID level 1 is also
called mirroring and provides 100% redundancy in case of a failure.

RAID 2

RAID 2 records an error-correcting code (ECC), a Hamming code, for its data,
which is striped across different disks. Each data bit in a word is recorded on
a separate disk, and the ECC codes of the data words are stored on a different
set of disks. Due to its complex structure and high cost, RAID 2 is not
commercially available.

RAID 3

RAID 3 stripes the data onto multiple disks. The parity bit generated for each
data word is stored on a different disk. This technique allows the array to
survive a single disk failure.

RAID 4

In this level, an entire block of data is written onto data disks and then the parity is
generated and stored on a different disk. Note that level 3 uses byte-level striping,
whereas level 4 uses block-level striping. Both level 3 and level 4 require at least
three disks to implement RAID.

RAID 5

RAID 5 writes whole data blocks onto different disks, but the parity bits
generated for each data block stripe are distributed among all the disks rather
than stored on a separate dedicated disk.
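
The text does not say how the parity is computed. Parity in these levels is commonly a bytewise XOR of the data blocks in a stripe, and the sketch below assumes that. It also assumes a simple rotation rule for choosing the parity disk of each stripe; actual layouts vary, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe_number, num_disks):
    """Assumed rotation: the parity block moves to the next disk on each stripe."""
    return stripe_number % num_disks


# One stripe holding three data blocks.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# Simulate losing the second data block and rebuilding it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])

print(parity.hex())        # 6969
print(rebuilt == data[1])  # True
print([parity_disk(s, 4) for s in range(6)])  # [0, 1, 2, 3, 0, 1]
```

XOR of all surviving blocks and the parity block reproduces the missing block, which is how a single failed disk can be tolerated.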

RAID 6

RAID 6 is an extension of level 5. In this level, two independent parities are
generated and stored in a distributed fashion among multiple disks. The two
parities provide additional fault tolerance. This level requires at least four
disk drives to implement RAID.

Bitmap Indexing
It is a special type of indexing built on a single key, but it is designed to
answer queries on multiple keys quickly. The records need to be arranged in
sequential order before bitmap indexing is applied. This makes it simple to
fetch a particular record from the block. It also makes it easy to allocate
records in the blocks of a file.

Bitmap Index Structure

The word 'bitmap' is made up of 'bit' and 'map'. A bit is the smallest unit of
data in a computer system. A map means organizing things. Thus, a bitmap
is simply a mapping of bits in the form of an array. In a relation, each value
of an indexed attribute has its own bitmap. A bitmap has one bit for
each record in the block.

For example, consider a relation Student_record where we wish to find
the female and male students whose score in English is greater than 40. The
example uses one bitmap for each gender value.

If the value for record i is Mr, the i-th bit of the Mr bitmap is set to 1.
Every record whose value is not Mr has its bit set to 0 in that bitmap.
The same applies to female students: if the value for record j is Ms, the
j-th bit of the Ms bitmap is set to 1, and all other bits in that bitmap are
set to 0. Now, if a user wishes to retrieve one or all of the records for
female or male students, i.e., those with the value Mr or Ms, the records of
the relation are read and the required ones are selected according to the
Mr or Ms bitmap.

However, for a single condition like this, bitmap indexing does not make
selecting the records much faster. What it does enable is reading and choosing
only the required records, as in the example above, where the user selected
only the records for female students or only those for male students.
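
The Student_record query can be sketched as follows. The rows are invented sample data, and the bitmaps are plain Python lists of 0 and 1 rather than packed bits. The point of the sketch is that a condition on two attributes (gender and English score) becomes a bitwise AND of two bitmaps.

```python
# Hypothetical rows of Student_record: (title, English score).
students = [("Mr", 55), ("Ms", 38), ("Ms", 72), ("Mr", 40), ("Ms", 41)]


def bitmap(rows, predicate):
    """One bit per record: 1 if the record satisfies the predicate, else 0."""
    return [1 if predicate(row) else 0 for row in rows]


def bitmap_and(a, b):
    return [x & y for x, y in zip(a, b)]


def matching_rows(bits):
    """Positions of the records whose bit is set."""
    return [i for i, bit in enumerate(bits) if bit]


mr = bitmap(students, lambda row: row[0] == "Mr")
ms = bitmap(students, lambda row: row[0] == "Ms")
above_40 = bitmap(students, lambda row: row[1] > 40)

print(mr)                                       # [1, 0, 0, 1, 0]
print(ms)                                       # [0, 1, 1, 0, 1]
print(matching_rows(bitmap_and(ms, above_40)))  # [2, 4]
print(matching_rows(bitmap_and(mr, above_40)))  # [0]
```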
