Storage System - RAID Levels
(BTIT502-18)
Basics-Storage
RAID levels
In this topic…
• Examples …
– Flash Memory: plugs into USB slots; used in server
systems for caching frequently accessed data.
– Magnetic Disk Storage (HDD): stores data for a long time, ensures
availability of the data; persistent storage.
[Storage hierarchy diagram: moving up from tertiary storage, speed and cost increase; moving down, capacity increases.]
Redundant Array of Independent Disks
(RAID)
• Technology to connect multiple secondary storage devices (disks)
and use them as a single storage medium.
• The multiple disks appear to the system as one logical drive.
RAID levels
• RAID 0
• RAID 1
• RAID 2
• RAID 3
• RAID 4
• RAID 5
• RAID 6
RAID 0
• A striped array of disks is implemented.
• The data is broken down into blocks (multiple bytes) and the
blocks are distributed among the disks.
• Each disk receives a block of data to write/read in parallel.
• This enhances the speed and performance of the storage device.
• There is no parity (data check) and no backup in Level 0 (a short striping sketch follows).
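• A minimal sketch of RAID 0 striping (Python; the disk count, block size and data are
illustrative assumptions, not a real driver):

    # RAID 0 sketch: blocks are distributed round-robin across the disks,
    # so reads/writes of a large file can proceed in parallel.
    NUM_DISKS = 4
    BLOCK_SIZE = 4  # bytes per block (tiny, for illustration)

    def stripe(data: bytes, num_disks: int = NUM_DISKS):
        """Split data into blocks and send block i to disk i % num_disks."""
        disks = [[] for _ in range(num_disks)]
        blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        for i, block in enumerate(blocks):
            disks[i % num_disks].append(block)
        return disks

    for d, blocks in enumerate(stripe(b"ABCDEFGHIJKLMNOP")):
        print(f"Disk {d}: {blocks}")
    # No parity and no copy: losing any one disk loses part of every striped file.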
RAID 1
• Uses mirroring techniques.
• When data is sent to the RAID controller, it sends a copy of the data
to every disk in the array (a short sketch follows).
• RAID level 1 is also called mirroring and provides 100%
redundancy in case of a failure.
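• A minimal sketch of RAID 1 mirroring (Python; the names and the simulated failure are
illustrative):

    # RAID 1 sketch: the controller writes every block to all disks,
    # so any surviving disk can serve a read (100% redundancy).
    NUM_DISKS = 2
    disks = [[] for _ in range(NUM_DISKS)]

    def mirrored_write(block):
        """Append the same block to every disk in the array."""
        for disk in disks:
            disk.append(block)

    def mirrored_read(index):
        """Read from the first disk that still holds the block."""
        for disk in disks:
            if index < len(disk):
                return disk[index]
        raise IOError("block lost on all mirrors")

    mirrored_write(b"block-0")
    disks[0] = []                    # simulate a failure of disk 0
    print(mirrored_read(0))          # still readable from the mirror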
RAID 2
• RAID 2 records ECC (error correction code) using Hamming distance for
its data, striped across different disks.
• Like level 0, each data bit of a word is recorded on a separate disk, and the
ECC codes of the data words are stored on a different set of disks.
• Provides backup as well as data error correction.
• Due to its complex structure and high cost, RAID 2 is not commercially
available.
RAID 3
• RAID 3 stripes the data onto multiple disks. The parity bit
(used to detect errors) generated for each data word is stored on a
dedicated parity disk.
• This technique makes it possible to recover from single-disk failures.
RAID 4
• In this level, an entire block (multiple words) of data is written onto the
data disks, and then the parity is generated and stored on a different
disk.
• Level 3 uses byte-level striping, whereas level 4 uses block-level
striping. Both level 3 and level 4 require at least three disks to
implement RAID.
RAID 5
• RAID 5 writes whole data blocks onto different disks,
• but the parity bits generated for each block stripe are
distributed among all the data disks rather than stored
on a dedicated parity disk (a short parity sketch follows).
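• A minimal sketch of the XOR parity behind RAID 3/4/5 (Python; the block contents are
illustrative). It shows that a single lost block can be rebuilt by XOR-ing the surviving
blocks with the parity block:

    def xor_blocks(blocks):
        """Byte-wise XOR of equally sized blocks."""
        result = bytearray(len(blocks[0]))
        for block in blocks:
            for i, b in enumerate(block):
                result[i] ^= b
        return bytes(result)

    stripe = [b"AAAA", b"BBBB", b"CCCC"]        # data blocks of one stripe
    parity = xor_blocks(stripe)                 # stored on another disk
    lost = stripe[1]                            # pretend the disk holding block 1 failed
    rebuilt = xor_blocks([stripe[0], stripe[2], parity])
    assert rebuilt == lost                      # single-disk failure recovered
    print("recovered:", rebuilt)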
RAID 6
• RAID 6 is an extension of level 5.
• In this level, two independent parities are generated and
stored in a distributed fashion among multiple disks.
• Two parities provide additional fault tolerance. This
level requires at least four disk drives to implement RAID.
Indexing
• An index is a data structure used to locate and access (retrieve) data in
a database table quickly (a small sketch follows below).
• Search key - contains a copy of the primary key or a candidate key of the
table.
• The values of the search key are stored in sorted order so that the
corresponding data can be accessed easily.
• Data reference - contains a set of pointers holding the address of the disk
block where the value of the particular key can be found.
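• A minimal sketch of an index entry (Python; the keys and block names are illustrative):
each search-key value maps to a data reference, so a lookup avoids scanning the whole table.

    index = {
        101: "block-7",     # search key (e.g. primary key) -> disk block address
        102: "block-7",
        205: "block-12",
        310: "block-20",
    }

    def lookup(key):
        """Return the disk block that holds the record with this key."""
        return index.get(key, "not found")

    print(lookup(205))      # -> block-12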
Example
• Multilevel Index
1. Primary/Ordered Index
• Based on an ordered data file, ordered on a key field
• The key field is generally the primary key of the relation
• The indices are usually sorted to make searching faster (Ordered
Indices)
• As the size of the table grows, the size of the mapping also grows. These mappings are
usually kept in primary memory so that the address fetch is fast; the actual data is then
read from secondary memory using the address obtained from the mapping. If the mapping
itself grows too large, fetching the address becomes slow, and in that case a sparse index
is no longer efficient. To overcome this problem, secondary indexing is introduced.
• In secondary indexing, another level of indexing is introduced to reduce the size of the
mapping. Wide key ranges are chosen initially so that the first-level mapping stays small;
each range is then further divided into smaller ranges. The first-level mapping is stored in
primary memory, so the address fetch is fast. The second-level mapping and the actual data
are stored in secondary memory (the hard disk). A small two-level sketch follows.
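• A minimal sketch of the two-level idea described above (Python; the ranges, block names
and keys are illustrative): the small first level stays in primary memory, the second level
and the data live on disk.

    first_level = {             # key-range start -> second-level index block
        0:    "idx-block-A",
        1000: "idx-block-B",
    }
    second_level = {            # second-level index: key -> data block on disk
        "idx-block-A": {105: "data-block-3", 740: "data-block-9"},
        "idx-block-B": {1350: "data-block-17"},
    }

    def find_data_block(key):
        # pick the largest first-level range start that is <= key (in memory)
        start = max(s for s in first_level if s <= key)
        inner = second_level[first_level[start]]   # one disk read: second level
        return inner.get(key)                      # one more disk read: the data

    print(find_data_block(740))     # -> data-block-9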
Secondary Index
B+ Tree
• B+ trees are filled from bottom and each entry is done at the leaf node.
• If a leaf node overflows −
– Split leaf node into two parts.
– Partition at i = ⌊(m+1)/2⌋.
– First i entries are stored in one node.
– Rest of the entries (i+1 onwards) are moved to a new node.
– ith key is duplicated at the parent of the leaf.
• If a non-leaf node overflows −
– Split node into two parts.
– Partition the node at i = ⌈(m+1)/2⌉.
– Entries up to i are kept in one node.
– Rest of the entries are moved to a new node.
Example - B+ Tree Insertion
• Values = 4 (i.e. m = 4), so i = ⌊(4+1)/2⌋ = ⌊2.5⌋ = 2
• 60 will go to the 3rd leaf node, after 55, but that leaf node of this tree is already
full, so we cannot insert 60 there.
• So we have to split the leaf node.
• The 3rd leaf node now holds the values (50, 55, 60, 65, 70) and its current parent key
is 50. We split the leaf node in the middle so that the tree's balance
is not altered: group (50, 55) and (60, 65, 70) into 2 leaf nodes.
• If these two are to be leaf nodes, the intermediate node cannot branch
from 50 alone. It must have 60 added to it, and then we can have a pointer to the
new leaf node (a short split sketch follows).
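• A minimal sketch of that leaf split (Python; m = 4 and the keys are taken from the
example above):

    import math

    def split_leaf(keys, m):
        """Split an overflowing B+ tree leaf at i = floor((m+1)/2)."""
        i = math.floor((m + 1) / 2)
        left, right = keys[:i], keys[i:]
        copied_up = right[0]        # first key of the new leaf is duplicated in the parent
        return left, right, copied_up

    left, right, up = split_leaf([50, 55, 60, 65, 70], m=4)
    print(left, right, up)          # [50, 55] [60, 65, 70] 60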
B+ Tree Deletion
• B+ tree entries are deleted at the leaf nodes.
• The target entry is searched and deleted.
– If it is an internal node, delete and replace with the entry from the left
position.
• After deletion, underflow is tested:
– If underflow occurs, redistribute entries from the node to its left.
• If distribution from the left is not possible, then
– distribute from the node to its right.
• If distribution is possible from neither the left nor the right, then
– merge the node with the node to its left or right.
• To delete 60 from the above example:
• In this case, we have to remove 60 from the intermediate node as
well as from the 4th leaf node. If we simply remove it from the
intermediate node, the tree will no longer satisfy the rules of the B+
tree, so we need to modify it to keep the tree balanced.
Searching a record in B+ Tree
• To search for 55 in the B+ tree structure below:
• First fetch the intermediate node, which will direct us to the leaf node that can
contain the record for 55.
• In the intermediate node, we find the branch between the keys 50 and 75.
We are then redirected to the third leaf node, where the DBMS
performs a sequential search to find 55 (a short search sketch follows).
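• A minimal sketch of that search (Python; the node contents are illustrative and only
roughly match the figure):

    internal_keys = [25, 50, 75]                            # keys in the intermediate node
    leaves = [[10, 20], [25, 40], [50, 55, 65], [75, 80]]   # leaf nodes, left to right

    def search(key):
        # choose the child pointer: number of internal keys <= key
        child = sum(1 for k in internal_keys if k <= key)
        return key in leaves[child]        # sequential search inside the leaf

    print(search(55))    # True - found in the third leaf, between the 50 and 75 branches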
Hashing
• Indexing is inefficient for huge databases. Why? Searching the index is still sequential
and slow.
• The hashing technique is used to calculate the direct location of a data
record on the disk without using an index structure.
• Here, data is stored in data blocks / buckets (multiple bytes),
• and the address of a data block is generated by using a hashing
function.
• The memory location where these records are stored is known as a
data bucket or data block.
Hashing Function
• A hash function can be anything from a simple mathematical function to a
complex mathematical function (e.g. mod, sin, cos).
• The hash function mostly uses the primary key to generate the
address of the data block.
Other Hash Functions..
• The hash function can also be a simple mathematical function like exponential, mod, cos, sin,
etc.
• Suppose we use a mod(5) hash function to determine the address of the data block.
• In this case, it applies mod(5) to the primary keys and generates 3, 3, 1, 4 and
2 respectively, and the records are stored at those data block addresses.
• E.g. 98 mod 5 = 3 and 103 mod 5 = 3, so the records with key values 98 and 103 will both be
stored at address 3 in memory, chained as a linked list; other keys such as 104 and 106 map
to addresses 4 and 1 (a short sketch follows).
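• A minimal sketch of that mod(5) hashing (Python; keys 98, 103, 104 and 106 come from
the slide, while the fifth key 107 is an assumed example that maps to address 2). Colliding
keys are chained at the same bucket address:

    BUCKETS = 5
    table = {address: [] for address in range(BUCKETS)}   # address -> chain of keys

    def insert(key):
        address = key % BUCKETS          # the hash function: mod(5)
        table[address].append(key)       # collisions join the same chain (linked list)

    for key in (98, 103, 106, 104, 107):
        insert(key)

    print(table[3])     # [98, 103] - both records stored at address 3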
Types of Hashing
1. Static Hashing
• In static hashing, the resultant data bucket address always remains the same.
That means if we generate the address for EMP_ID = 103 using the hash
function mod(5), it will always result in the same bucket address, 3. There
is no change in the bucket address.
• Hence, in static hashing, the number of data buckets in memory remains
constant throughout.
Disadvantage of Static Hashing
• If a new record needs to be added and we generate the address of its
data bucket, but data already exists at that address (the bucket is full),
the situation is called BUCKET OVERFLOW.
2. Dynamic Hashing
• The dynamic hashing method is used to overcome the problems
of static hashing like bucket overflow.
Directory (last 2 bits of the hash address → bucket):
00 → B0
01 → B1
10 → B2
11 → B3
Insert key 9 with hash address 10001 into the above structure.
The maximum number of records in each bucket is 2.
Insert key 9 with hash address 10001 into the previous
structure
• Since key 9 has hash address 10001, its last two bits (01) send it to bucket B1.
But bucket B1 is already full, so it will get split.
• After the split, take the last 3 bits of the hash address to decide the bucket.
Key → Hash address:
1 → 11010
2 → 00000
3 → 11110
4 → 00000
5 → 01001
6 → 10101
7 → 10111
9 → 10001

Rule (last 3 bits → bucket):
000 → B0, 001 → B1, 010 → B2, 011 → B3,
100 → B4, 101 → B5, 110 → B6, 111 → B7
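• A minimal sketch of that bucket split (Python; the bucket capacity of 2 and the hash
addresses are taken from the table above): buckets are chosen by the last i bits of the
hash address, and i grows from 2 to 3 when a bucket overflows.

    CAPACITY = 2
    hashes = {                           # key -> hash address, from the table above
        1: "11010", 2: "00000", 3: "11110", 4: "00000",
        5: "01001", 6: "10101", 7: "10111", 9: "10001",
    }

    def distribute(bits):
        """Assign each key to the bucket named by the last `bits` bits of its hash."""
        buckets = {}
        for key, h in hashes.items():
            buckets.setdefault(h[-bits:], []).append(key)
        return buckets

    layout = distribute(2)                                  # 2-bit directory: B0..B3
    if any(len(keys) > CAPACITY for keys in layout.values()):
        layout = distribute(3)                              # overflow -> 3-bit directory
    for suffix in sorted(layout):
        print(suffix, layout[suffix])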
Advantages of dynamic hashing
• The performance does not decrease as the data grows in the
system. It simply increases the size of memory to accommodate the
data.
• Memory is well utilized as it grows and shrinks with the data. There
will not be any unused memory lying idle.
• Good for the dynamic database where data grows and shrinks
frequently.
Disadvantages of dynamic hashing
• In this method, if the data size increases then the bucket size also
increases. If there is a huge increase in data, maintaining the
bucket address table becomes tedious.
• In this case, a bucket overflow situation can also occur, but it
takes longer to reach this situation than with static hashing.
• Thank You
• Any Queries?