
File Organization and Indexing

The data of an RDB is ultimately stored in disk files.

Disks – non-volatile, inexpensive storage for data
      – random-access addressable devices

Disk space management:
  Should Operating System services be used?
  Should the RDBMS manage the disk space by itself?

The 2nd option is preferred, as the RDBMS requires complete
control over when a block or page in the main-memory buffer
is written to the disk.

This is important for recovering data when a system
crash occurs.

Structure of Disks

[Figure: disk with stacked platters on a spindle, one read/write head per surface, tracks and sectors]

§  Speed: 7000 to 10000 rpm
§  several platters stacked on a rotating spindle
§  one read/write head per surface for fast access
§  platter has several tracks
   •  ~10,000 per inch
§  each track – several sectors
§  each sector/track – blocks
§  unit of data transfer – block
§  cylinder i – track i on all platters
§  sectoring is optional
§  block – ½ KB to 8 KB
   •  fixed; set at initialization time


Data Transfer from Disk

Address of a block: Surface No, Cylinder No, Block No

Data transfer:
  Move the r/w head to the appropriate track
  •  time needed – seek time – ~12 to 14 ms
  Wait for the appropriate block to come under the r/w head
  •  time needed – rotational delay – ~3 to 4 ms (avg)

Access time: seek time + rotational delay

Blocks on the same cylinder – roughly close to each other
  – access time-wise
  – cylinder i, cylinder (i + 1), cylinder (i + 2), etc.
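
As a rough sanity check on these numbers: random block reads pay the seek and rotational delay every time, while sequential reads pay them once. A back-of-the-envelope sketch (the timing constants are assumed, illustrative values, not measurements of any particular drive):

```python
# Illustrative access-time arithmetic; all constants are assumed values.
SEEK_MS, ROT_MS, XFER_MS = 13.0, 3.5, 0.1  # seek, avg rotational delay, per-block transfer

def random_reads(k):
    # each block pays a full seek + rotational delay
    return k * (SEEK_MS + ROT_MS + XFER_MS)

def sequential_reads(k):
    # one seek + one rotational delay, then blocks stream past the head
    return SEEK_MS + ROT_MS + k * XFER_MS

print(random_reads(100))      # 1660.0 ms
print(sequential_reads(100))  # 26.5 ms
```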

Data Records and Files
Fixed-length record type: each field is of fixed length
•  in a file of this type of record, the record number can be
   used to locate a specific record
•  the number of records and the length of each field are available
   in the file header

Variable-length record type:
•  arises due to missing fields, repeating fields, variable-length
   fields
•  special separator symbols are used to indicate the field
   boundaries and record boundaries
•  the number of records and the separator symbols used are
   recorded in the file header
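
Since every record occupies the same number of bytes, record number i lives at byte offset i × record-length. A minimal sketch (the record layout – a 4-byte integer id plus a 16-byte name field – is hypothetical):

```python
import struct

FMT = "<i16s"                  # hypothetical layout: 4-byte id + 16-byte name
RECLEN = struct.calcsize(FMT)  # 20 bytes per record

def read_record(f, recno):
    # fixed-length records: the record number directly gives the offset
    f.seek(recno * RECLEN)
    rid, name = struct.unpack(FMT, f.read(RECLEN))
    return rid, name.rstrip(b"\x00").decode()
```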


Packing Records into Blocks

Record length much less than block size
•  The usual case
•  Blocking factor bfr = ⌊B/r⌋   (B – block size in bytes,
                                  r – record length in bytes)
   – maximum no. of records that can be stored in a block

Record length greater than block size
•  spanned organization is used
   (a record is split across blocks, with a pointer to the block
   holding its remaining part)

File blocks:
sequence of blocks containing all the records of the file
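
For instance, with illustrative numbers – 4 KB blocks and 100-byte records:

```python
B, r = 4096, 100       # block size, record length (illustrative values)
bfr = B // r           # blocking factor: 40 records per block
unused = B - bfr * r   # 96 bytes left unused per block (unspanned organization)
```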


Mapping File Blocks onto the Disk Blocks

Contiguous allocation
•  Consecutive file blocks are stored in consecutive disk blocks
•  Pros: file scanning can be done fast using double buffering
   Cons: expanding the file by including a new block in the middle
   of the sequence is difficult

Linked allocation
•  each file block is assigned to some disk block
•  each disk block has a pointer to the next block of the sequence
•  file expansion is easy, but scanning is slow

Mixed allocation
•  combines the two: linked clusters of consecutive blocks

Operations on Files
Insertion of a new record: may involve searching for the appropriate
location for the new record
Deletion of a record: locating the record – may involve search;
deleting the record – may involve movement of other records
Update of a record field/fields: equivalent to a delete and an insert
Search for a record: given a value of a key field / non-key field
Range search: given range values for a key / non-key field

How efficiently we can carry out these operations
depends on the organization of the file and the availability
of indexes.


Primary File Organization

The logical policy / method used for placing records into file blocks

Example: Student file – organized to have student records sorted
in increasing order of the “rollNo” values

Goal: to ensure that operations performed frequently on the file
execute fast
•  conflicting demands may be there
•  example: on the student file, access based on rollNo and also
   access based on name may both be frequent
•  we choose to make rollNo access fast
•  for making name access fast, additional access structures
   are needed
   – more details later

Different File Organization Methods

We will discuss heap files, sorted files and hashed files

Heap file:
Records are appended to the file as they are inserted
•  Simplest organization
•  Insertion – read the last file block, append the record and
   write back the block – easy
•  Locating a record given values for any attribute
   requires scanning the entire file – very costly

Heap files are often used only along with other access structures;
a minimal sketch follows.
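
A toy in-memory heap file (block size and the list-of-lists representation are assumptions for illustration) makes the trade-off concrete: appending touches one block, but any search must scan them all:

```python
class HeapFile:
    def __init__(self, blocking_factor=40):
        self.bf = blocking_factor
        self.blocks = [[]]              # each block is a list of records

    def insert(self, rec):
        # read the last block, append, write back: one block access
        if len(self.blocks[-1]) == self.bf:
            self.blocks.append([])      # last block full: start a new one
        self.blocks[-1].append(rec)

    def find(self, pred):
        # no ordering, no index: scan every block of the file
        for block in self.blocks:
            for rec in block:
                if pred(rec):
                    return rec
        return None
```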

Sorted files / Sequential files (1/2)
Ordering field: the field whose values are used for sorting the
records in the data file
Ordering key field: an ordering field that is also a key

Sorted file / sequential file:
Data file whose records are arranged such that the values of the
ordering field are in ascending order

Locating a record given the value X of the ordering field:
Binary search can be performed on the file blocks
(the address of the nth file block can be obtained from
the file header)
O(log N) disk accesses to get the required block – efficient
(a sketch follows)
Range search is also efficient
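
A sketch of the block-level binary search (assumed helpers: read_block(n) fetches the nth file block as a list of records sorted on the ordering field, each record exposing a key attribute):

```python
def find(read_block, num_blocks, x):
    # binary search over file blocks: O(log num_blocks) block accesses
    lo, hi = 0, num_blocks - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        block = read_block(mid)            # one disk access
        if x < block[0].key:
            hi = mid - 1
        elif x > block[-1].key:
            lo = mid + 1
        else:
            # x falls in this block's key range: scan it in memory
            return next((r for r in block if r.key == x), None)
    return None
```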

Sorted files / Sequential files (2/2)

Inserting a new record:
§  Ordering gets affected
   •  costly, as all blocks following the block in which insertion is
      performed may have to be modified
§  Hence not done directly in the file
   •  all inserted records are kept in an auxiliary file
   •  periodically the file is reorganized – the auxiliary file and the
      main file are merged
   •  locating a record
      •  carried out first on the auxiliary file and then on the main file

Deleting a record
•  deletion markers are used

Hashed Files
Very useful file organization if quick access to the data record is
needed given the value of a single attribute.

Hashing field: the attribute on which quick access is needed and
on which hashing is performed

Data file: organized as buckets with numbers 0, 1, …, (M − 1)
(bucket – a block or a few consecutive blocks)

Hash function h: maps the values from the domain of the hashing
attribute to bucket numbers

Inserting Records into a Hashed File

[Figure: main buckets 0 … M-1, each with an overflow chain kept in
separate overflow buckets]

Insertion: for the given record R, apply h on the value of the
hashing attribute to get the bucket number r.

If there is space in bucket r, place R there; else place R in the
overflow chain of bucket r.

The overflow chains of all the buckets are maintained in the
overflow buckets.
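
A minimal in-memory sketch of static hashing with overflow chains (the mod hash function, the bucket capacity and the list-based chains are assumptions for illustration):

```python
M, CAP = 8, 2                        # no. of main buckets, records per bucket

main = [[] for _ in range(M)]        # main buckets
overflow = [[] for _ in range(M)]    # one overflow chain per bucket

def h(key):
    return key % M                   # maps hashing-field values to bucket numbers

def insert(key, rec):
    r = h(key)
    if len(main[r]) < CAP:
        main[r].append((key, rec))
    else:
        overflow[r].append((key, rec))    # bucket r full: go to its chain

def search(key):
    r = h(key)
    for k, rec in main[r] + overflow[r]:  # bucket first, then its chain
        if k == key:
            return rec
    return None
```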


Deleting Records from a Hashed File

[Figure: same layout – main buckets 0 … M-1 with overflow chains]

Deletion: locate the record R to be deleted by applying h.
Remove R from its bucket/overflow chain. If possible, bring a record
from the overflow chain into the bucket.

Search: given the hash field value k, compute r = h(k). Get the
bucket r and search for the record. If not found, search the
overflow chain of bucket r.


Performance of Static Hashing

Static hashing:
§  The hashing method discussed so far
§  The number of main buckets is fixed

Locating a record given the value of the hashing attribute:
most often – one block access

Capacity of the hash file: C = r * M records
(r – no. of records per bucket, M – no. of main buckets)

Disadvantage with static hashing:
If the actual number of records in the file is much less than C
•  wastage of disk space
If the actual number of records in the file is much more than C
•  long overflow chains – degraded performance
Hashing for Dynamic File Organization
Dynamic files
§  files where record insertions and deletions take place frequently
§  the file keeps growing and also shrinking

Hashing for dynamic file organization
§  Bucket numbers are integers
§  The binary representation of bucket numbers is
   exploited cleverly to devise dynamic hashing schemes
§  Two schemes
   •  Extendible hashing
   •  Linear hashing


Extendible Hashing (1/2)

The k-bit sequence corresponding to a record R:

  Apply the hashing function to the value of the hashing field of R
  to get the bucket number r
  Convert r into its binary representation to get the bit sequence
  Take the trailing k bits

For instance, say record R hashes to bucket # 46
  46 = (101110)2
  So, the 3-bit sequence corresponding to R is “110”
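
The trailing k bits can be extracted with a mask:

```python
def trailing_bits(r, k):
    # trailing k bits of bucket number r's binary representation
    return r & ((1 << k) - 1)

assert trailing_bits(46, 3) == 0b110   # 46 = (101110)2 -> "110"
```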


Extendible Hashing (2/2)

[Figure: directory of 2^3 = 8 entries (000 … 111) pointing to buckets]

Global depth d: the number of trailing bits used in the directory;
the directory has 2^d entries (here d = 3)

Local depth of a bucket: the number of bits in the common suffix of
the bit sequences corresponding to the records in the bucket
  e.g., one bucket (local depth 2) holds all records with 2-bit
  sequence ‘01’; another (local depth 3) holds all records with
  3-bit sequence ‘111’

Locating a record
Match the d-bit sequence with an entry in the directory and go to
the corresponding bucket to find the record
Insertion in Extendible Hashing Scheme (1/2)

2-bit sequence for the record to be inserted: 00
(global depth d = 2; bucket b0, with local depth 1, holds all
records whose bit sequence ends in ‘0’)

b0 not full: the new record is placed in b0. No changes in the
directory.

b0 full: bucket b0 is split.
All records whose 2-bit sequence is ‘10’ are sent to a new bucket
b3; the others are retained in b0. The directory is modified:
entries ‘00’ and ‘10’ now point to b0 and b3 respectively, and all
local depths become 2.


Insertion in Extendible Hashing Scheme (2/2)

2-bit sequence for the record to be inserted: 10
(global depth d = 2; all local depths = 2)

b3 not full: the new record is placed in b3. No changes.

b3 full: b3 is split and the directory is doubled (d = 3, entries
000 … 111). All records with 3-bit sequence ‘110’ are sent to a new
bucket b4; the others stay in b3 (both now have local depth 3). The
other buckets keep local depth 2, with two directory entries each.

In general, if the local depth of the bucket to be split is equal to
the global depth, the directory is doubled.
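
Both cases fit in a compact in-memory sketch (the bucket capacity, Python's built-in hash, and the trailing-bit directory representation are assumptions for illustration; inserting 45, 22, 12, 11, 15, 10 reproduces the worked example below):

```python
CAP = 2                                        # bucket capacity (assumed)

class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.records = []

class ExtendibleHash:
    def __init__(self):
        self.global_depth = 0
        self.directory = [Bucket(0)]           # 2^0 = 1 entry

    def _index(self, key):
        # trailing global_depth bits of the hashed key
        return hash(key) & ((1 << self.global_depth) - 1)

    def insert(self, key):
        b = self.directory[self._index(key)]
        if len(b.records) < CAP:
            b.records.append(key)
            return
        if b.local_depth == self.global_depth:
            self.directory += self.directory   # double the directory
            self.global_depth += 1
        # split b: directory entries whose extra suffix bit is 1
        # are redirected to the split image
        b.local_depth += 1
        image = Bucket(b.local_depth)
        bit = 1 << (b.local_depth - 1)
        for i, bk in enumerate(self.directory):
            if bk is b and (i & bit):
                self.directory[i] = image
        old, b.records = b.records, []
        for k in old + [key]:                  # redistribute and retry
            self.insert(k)
```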

Deletion in Extendible Hashing Scheme

[Figure: directory with d = 3; buckets b0, b1, b2 (local depth 2)
and b3, b4 (local depth 3)]

Matching pair of data buckets:
their k-bit sequences have a common (k-1)-bit suffix, e.g., b3 & b4

Due to deletions, if a pair of matching data buckets
-- becomes less than half full – try to merge them into one bucket
If the local depth of all buckets is one less than the global depth
-- reduce the directory to half its size
Extendible Hashing Example
Bucket capacity – 2     Initial buckets – 1

Hash values (binary): 45 = 101101, 22 = 10110, 12 = 1100, 11 = 1011

Insert 45, 22: both go to the single bucket
(global depth 0, local depth 0).

Insert 12: the bucket overflows; local depth = global depth
⇒ the directory doubles (d = 1) and a split image is created:
bucket ‘0’ holds 22, 12; bucket ‘1’ holds 45.

Insert 11: placed in bucket ‘1’, which now holds 45, 11.

Insert 15: bucket ‘1’ (45, 11) overflows; global depth = local depth
⇒ the directory doubles (d = 2) and a split occurs:
  ‘00’: 22, 12   ‘01’: 45   ‘11’: 11, 15
(hash values: 15 = 1111, 10 = 1010)

Insert 10: the bucket holding 22, 12 (local depth 1) overflows;
since local depth < global depth, a split image is created and the
directory is not doubled:
  ‘00’: 12   ‘01’: 45   ‘10’: 10, 22   ‘11’: 11, 15

Linear Hashing
Does not require a separate directory structure

Uses a family of hash functions h0, h1, h2, …
•  the range of hi is double the range of hi-1
•  hi(x) = x mod (2^i · M)
   M – the initial no. of buckets
   (Assume that the hashing field is an integer)

Initial hash functions:
  h0(x) = x mod M
  h1(x) = x mod 2M

Insertion (1/3)

[Figure: main buckets 0 … M-1 with overflow buckets; bucket M is
created as the split image of bucket 0]

Initially the structure has M main buckets (0, …, M-1) and a few
overflow buckets.

To insert a record with hash field value x,
place the record in bucket h0(x).

When the first overflow in any bucket occurs:
  Say the overflow occurred in bucket s.
  Insert the record in the overflow chain of bucket s.
  Create a new bucket M.
  Split bucket 0 by using h1:
  some records stay in bucket 0 and some go to bucket M.


Insertion (2/3)

[Figure: buckets 0 … M-1 plus split images M, M+1, …]

On the first overflow, irrespective of where it occurs, bucket 0 is
split.
On subsequent overflows, buckets 1, 2, 3, … are split in that order.
(This is why the scheme is called linear hashing.)
N: the next bucket to be split

After M overflows, all the original M buckets are split.
We then switch from the pair (h0, h1) to (h1, h2) and set N = 0;
in general, from (hi, hi+1) to (hi+1, hi+2).


Nature of Hash Functions

hi(x) = x mod (2^i · M). Let M' = 2^i · M.

§  Note that if hi(x) = k, then x = M'r + k for some r, with k < M',
   and hi+1(x) = (M'r + k) mod 2M' = k or (M' + k), since
     r even, r = 2s:   (2sM' + k) mod 2M' = k
     r odd, r = 2s+1:  ((2s+1)M' + k) mod 2M' = M' + k

M' – the current number of original buckets.

Insertion (3/3)
Say the hash functions in use are hi, hi+1.
To insert a record with hash field value x:
  compute hi(x)
  if hi(x) < N, the original bucket is already split:
      place the record in bucket hi+1(x)
  else place the record in bucket hi(x)
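
The whole scheme is short enough to sketch (in-memory; M, the bucket capacity, and plain lists standing in for buckets together with their overflow chains are assumptions for illustration; inserting 12, 11, 14, 13, 9, 10, 18 reproduces the worked example that follows):

```python
CAP, M = 2, 1                        # bucket capacity, initial buckets (assumed)

class LinearHash:
    def __init__(self):
        self.i = 0                   # current level: h_i(x) = x mod (2**i * M)
        self.N = 0                   # next bucket to be split
        self.buckets = [[] for _ in range(M)]

    def _addr(self, x):
        r = x % (2 ** self.i * M)             # h_i
        if r < self.N:                        # that bucket is already split
            r = x % (2 ** (self.i + 1) * M)   # use h_{i+1}
        return r

    def insert(self, x):
        b = self.buckets[self._addr(x)]
        overflowed = len(b) >= CAP
        b.append(x)                  # overflow records just extend the list
        if overflowed:
            self._split()

    def _split(self):
        # split bucket N with h_{i+1}; its split image is bucket N + 2^i * M
        self.buckets.append([])
        old, self.buckets[self.N] = self.buckets[self.N], []
        for x in old:
            self.buckets[x % (2 ** (self.i + 1) * M)].append(x)
        self.N += 1
        if self.N == 2 ** self.i * M:   # all original buckets are split:
            self.i += 1                 # switch to the next pair of hash
            self.N = 0                  # functions and reset N
```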


Linear Hashing Example

Initial buckets = 1     Bucket capacity = 2 records
Hash functions: h0 = x mod 1, h1 = x mod 2
Split pointer N = 0

Insert 12, 11: both go to bucket 0.

Insert 14: B0 overflows. The bucket pointed to by N is split:
bucket 0 holds 12, 14 and bucket 1 holds 11.
All original buckets are now split, so the hash functions are
changed: h0 = x mod 2, h1 = x mod 4 (N = 0).

Insert 13: placed in bucket 1 (11, 13).

Insert 9: B1 overflows (9 goes to B1's overflow chain).
B0 is split using h1 and a split image (bucket 2) is created:
  bucket 0: 12;  bucket 1: 11, 13, 9;  bucket 2: 14.  N = 1.

Insert 10: h0(10) = 0 < N, so h1 applies: placed in bucket 2 (14, 10).

Insert 18: h1(18) = 2 – overflow at B2; B1 is split:
  bucket 0: 12;  bucket 1: 9, 13;  bucket 2: 14, 10, 18;  bucket 3: 11.
All original buckets are split again, so the hash functions become
h0 = x mod 4, h1 = x mod 8 (N = 0).

Index Structures
Index: a disk data structure
– enables efficient retrieval of a record
  given the value(s) of certain attributes
  – the indexing attributes

Primary index:
index built on the ordering key field of a file

Clustering index:
index built on an ordering non-key field of a file

Secondary index:
index built on any non-ordering field of a file


Primary Index

[Figure: index entries (101, 129, 240, …) pointing to data file
blocks 0, 1, 2, …, b; ordering key = RollNo]

Can be built on ordered / sorted files
Index attribute – the ordering key field (OKF)

Index entry: ( value of the OKF for the first record of block Bj,
               disk address of Bj )

Index file: an ordered file (sorted on the OKF)
  size: one entry per block of the data file

Index file blocking factor BFi = ⌊B/(V + P)⌋
  (B – block size, V – OKF size, P – block pointer size)
  – generally more than the data file blocking factor

No. of index file blocks bi = ⌈b/BFi⌉
  (b – no. of data file blocks)


Record Access Using Primary Index

Given ordering key field (OKF) value x:
Carry out binary search on the index file
m – value of the OKF for the first record in the middle block k of
the index file
x < m: do binary search on blocks 0, …, (k − 1) of the index file
x ≥ m: if there are index entries (vj, Pj), (vj+1, Pj+1) in block k
       such that vj ≤ x < vj+1,
       use the block pointer Pj, get the data file block and
       search for the data record with OKF value x
       else
       do binary search on blocks (k + 1), …, bi of the index file

Maximum block accesses required: ⌈log2 bi⌉ + 1
(binary search on the index plus one data block access)

An Example
Data file:
  No. of blocks b = 9500          Block size B = 4 KB
  OKF length V = 15 bytes         Block pointer length P = 6 bytes
Index file:
  No. of entries ri = 9500        Size of an entry V + P = 21 bytes
  Blocking factor BFi = ⌊4096/21⌋ = 195
  No. of blocks bi = ⌈ri/BFi⌉ = 49

Max no. of block accesses for getting a record
  using the primary index: 1 + ⌈log2 bi⌉ = 7
Max no. of block accesses for getting a record
  without using the primary index: ⌈log2 b⌉ = 14
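
The figures are easy to re-derive (same numbers as in the slide):

```python
from math import ceil, log2

B, V, P = 4096, 15, 6         # block size, OKF length, block pointer length
b = 9500                      # data file blocks = no. of index entries
BFi = B // (V + P)            # 195 index entries per block
bi = ceil(b / BFi)            # 49 index blocks

print(1 + ceil(log2(bi)))     # with the primary index: 7 block accesses
print(ceil(log2(b)))          # plain binary search on the data file: 14
```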

Making the Index Multi-level

Index file – itself an ordered file
           – another level of index can be built
Multilevel index –
successive levels of indices are built till the last level has one
block

height – no. of levels
block accesses: height + 1
(no binary search required)

[Figure: data file (9500 blocks) ← first-level index (9500 entries,
49 blocks) ← second-level index (49 entries, 1 block)]

For the example data file:
no. of block accesses required with the
multi-level primary index: 3; without any index: 14


Range Search, Insertion and Deletion

Range search on the ordering key field:
Get records with OKF value between x1 and x2 (inclusive)
Use the index to locate the record with OKF value x1 and read
succeeding records till the OKF value exceeds x2.
Very efficient

Insertion: Data file – keep 25% of the space in each block free
  -- to take care of future insertions, so the index doesn't change
  -- or use overflow chains for blocks that overflow

Deletion: handle using deletion markers so that the index doesn't
get affected

Basically, avoid changes to the index
Clustering Index
Built on ordered files where the ordering field is not a key
Index attribute: the ordering field (OF)

Index entry: ( distinct value Vi of the OF,
               address of the first block that has a record
               with OF value Vi )

Index file: ordered file (sorted on the OF)
  size – one entry per distinct value of the OF


Secondary Index
Built on any non-ordering field (NOF) of a data file.

Case I: NOF is also a key (secondary key)
  entry: ( value Vi of the NOF,
           pointer to the record with Vi as the NOF value )

Case II: NOF is not a key: two options
  (1) entry: ( value Vi of the NOF,
               pointer(s) to the record(s) with Vi as the NOF value )
  (2) entry: ( value Vi of the NOF,
               pointer to a block that has pointer(s) to the
               record(s) with Vi as the NOF value )

Remarks:
(1)  index entry – variable-length record
(2)  index entry – fixed length – one more level of indirection

Secondary Index (key)

Can be built on ordered and also other types of files
Index attribute: a non-ordering key field
Index entry: ( value Vi of the NOF,
               pointer to the record with Vi as the NOF value )

Index file: ordered file (sorted on NOF values)
  No. of entries – same as the no. of records in the data file

Index file blocking factor BFi = ⌊B/(V + Pr)⌋
  (B: block size, V: length of the NOF,
   Pr: length of a record pointer)

Index file blocks bi = ⌈r/BFi⌉
  (r – no. of records in the data file)

An Example
Data file:
  No. of records r = 90,000       Block size B = 4 KB
  Record length R = 100 bytes     BF = ⌊4096/100⌋ = 40
  b = 90000/40 = 2250 blocks
  NOF length V = 15 bytes         Record pointer length Pr = 7 bytes
Index file:
  No. of entries ri = 90,000      Entry length V + Pr = 22 bytes
  BFi = ⌊4096/22⌋ = 186           No. of blocks bi = ⌈90000/186⌉ = 484

Max no. of block accesses to get a record
  using the secondary index: 1 + ⌈log2 bi⌉ = 10
Avg no. of block accesses to get a record
  without using the secondary index: b/2 = 1125
A very significant improvement

Multi-level Secondary Indexes

Secondary indexes can also be converted to multi-level indexes

First-level index
– as many entries as there are records in the data file

The first-level index is an ordered file,
so in the second-level index the number of entries will be
equal to the number of blocks in the first-level index
rather than the number of records

Similarly at other, higher levels


Making the Secondary Index Multi-level

Multilevel index –
successive levels of indices are built
till the last level has one block

height – no. of levels
block accesses: height + 1

[Figure: data file (90000 records, 2250 blocks) ← first-level index
(90000 entries, 484 blocks) ← second-level index (484 entries,
3 blocks) ← third-level index (3 entries, 1 block)]

For the example data file:
no. of block accesses required:
  multi-level index: 4;  single-level index: 10
Index Sequential Access Method (ISAM) Files
ISAM files –
ordered files with a multilevel primary/clustering index

Insertions:
handled using overflow chains at data file blocks

Deletions:
handled using deletion markers

Most suitable for files that are relatively static

If the files are dynamic, we need to go for dynamic multi-level
index structures based on B+-trees


B+-trees                        (Bayer & McCreight, Acta Informatica, 1972)

§  Balanced search trees (self-balancing)
   •  Internal nodes have a variable number of children
   •  All leaves are at the same level
   •  Nodes – internal or leaf – are disk blocks
§  Leaf node entries point to the actual data records
   •  All leaf nodes are linked up as a list
§  Internal node entries carry only index information
   §  In B-trees, internal nodes carry data record pointers also
   §  The fan-out in B-trees is hence less
§  Make sure that blocks are always at least half filled
§  Support both random and sequential access of records


Order
Order (m) of an internal node
•  The maximum number of tree pointers held in it
•  A maximum of (m − 1) keys can be present in an internal node

Order (mleaf) of a leaf node
•  The maximum number of record pointers held in it; it is equal to
   the maximum number of keys in a leaf node

Internal Node Structure
(Pi: tree pointer (block pointer); Ki: key value;
 m: order (internal); ⌈m/2⌉ ≤ j ≤ m)

  P1 K1 P2 K2 … Ki-1 Pi Ki … Kj-1 Pj

Sub-trees:
  P1: x ≤ K1;   Pi: Ki-1 < x ≤ Ki;   Pj: Kj-1 < x

Example: an internal node with keys 2, 5, 12 has four sub-trees,
holding keys x ≤ 2, 2 < x ≤ 5, 5 < x ≤ 12, and x > 12 respectively.


Internal Nodes
An internal node of a B+-tree of order m:
§  It contains at least ⌈m/2⌉ pointers, except when it is the root
   node (for the root node, a minimum of 2 pointers is ok)
§  It contains at most m pointers.
§  If it has pointers P1, P2, …, Pj with keys
   K1 < K2 < K3 < … < Kj-1, where ⌈m/2⌉ ≤ j ≤ m, then
   •  P1 points to the sub-tree with records having key value x ≤ K1
   •  Pi (1 < i < j) points to the sub-tree with records having
      key value x such that Ki-1 < x ≤ Ki
   •  Pj points to records with key value x > Kj-1


Leaf Node Structure

Structure of a leaf node of a B+-tree of order mleaf:
§  It contains one block pointer P to point to the next leaf node
§  At least ⌈mleaf/2⌉ record pointers and ⌈mleaf/2⌉ key values
§  At most mleaf record pointers and key values
§  If a node has keys K1 < K2 < … < Kj with Pr1, Pr2, …, Prj as
   record pointers and P as block pointer, then
   Pri points to the record with Ki as the search field value,
   1 ≤ i ≤ j
   P points to the next leaf block

  K1 Pr1 K2 Pr2 … Kj Prj P → (next leaf)

Order Calculation
Block size: B, size of index field: V
Size of block pointer: P, size of record pointer: Pr

Order of an internal node (m):
As there can be at most m block pointers and (m − 1) keys,
  (m · P) + ((m − 1) · V) ≤ B
m can be calculated using the above inequality (choose the max)

Order of a leaf node:
As there can be at most mleaf record pointers and keys
with one block pointer in a leaf node,
mleaf can be calculated using the inequality (choose the max):
  (mleaf · (Pr + V)) + P ≤ B


Example Order Calculation

Given B = 512 bytes, V = 8 bytes,
P = 6 bytes, Pr = 7 bytes. Then

Internal node order m = ?
  m · P + ((m − 1) · V) ≤ B
  6m + 8(m − 1) ≤ 512
  14m ≤ 520
  m ≤ 37.1, so m = 37

Leaf order mleaf = ?
  mleaf (Pr + V) + P ≤ 512
  mleaf (7 + 8) + 6 ≤ 512
  15 mleaf ≤ 506
  mleaf ≤ 33.7, so mleaf = 33
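
The same calculation as a small helper (same numbers as above):

```python
def orders(B, V, P, Pr):
    m = (B + V) // (P + V)         # largest m with m*P + (m-1)*V <= B
    m_leaf = (B - P) // (Pr + V)   # largest m_leaf with m_leaf*(Pr+V) + P <= B
    return m, m_leaf

print(orders(512, 8, 6, 7))        # (37, 33)
```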

Example B+-tree (m = 3, mleaf = 2)

  root (internal):        [3 | 7]
  internal level:    [2]    [4]    [9]
  leaves:    [1 2][3]   [4][6 7]   [8 9][12 15]
  (leaves are linked left to right)

Insertion into B+-trees
Every (key, record pointer) pair is inserted into an appropriate
leaf (search for it)
§  If a leaf node overflows:
   •  The node is split at j = ⌈(mleaf + 1)/2⌉
   •  The first j entries are kept in the original node
   •  Entries from j+1 onwards are moved to a new node
   •  The jth key value Kj is replicated in the parent of the leaf
§  If an internal node overflows:
   •  The node is split at j = ⌈(m + 1)/2⌉
   •  Values and pointers up to Pj are kept in the original node
   •  The jth key value Kj is moved to the parent of the internal node
   •  Pj+1 and the rest of the entries are moved to a new node

Example of Insertions (m = 3, mleaf = 2)

(1) Insert 20, 11: single leaf [11 20]

(2) Insert 14: overflow; the leaf is split at j = ⌈(mleaf + 1)/2⌉ = 2
    and 14 is copied to the parent level:
      root [14];  leaves [11 14] → [20]

(3) Insert 25: inserted at leaf level:
      root [14];  leaves [11 14] → [20 25]

(4) Insert 30: overflow; the leaf splits at 25, and 25 is copied to
    the upper level:
      root [14 25];  leaves [11 14] → [20 25] → [30]

(5) Insert 12: overflow at leaf level.
    - Split at leaf level: leaves [11 12] and [14]; 12 is copied up
    - This triggers an overflow at the internal node [12 14 25]
    - The internal node is split at j = ⌈(m + 1)/2⌉ = 2:
      14 is moved up to become the new root:
        root [14];  internal [12], [25]
        leaves [11 12] → [14] → [20 25] → [30]

(6) Insert 22: leaf [20 25] overflows; it splits into [20 22] and
    [25], and 22 is copied up:
      root [14];  internal [12], [22 25]
      leaves [11 12] → [14] → [20 22] → [25] → [30]

(7) Insert 23, 24: 23 goes into leaf [25]; 24 then overflows that
    leaf, which splits, and 24 is copied up, overflowing the
    internal node [22 24 25]; it splits and 24 moves up into the
    root:
      root [14 24];  internal [12], [22], [25]
      leaves [11 12] → [14] → [20 22] → [23 24] → [25] → [30]


Deletion in B+-trees

§  Delete the entry from the leaf node

§  Delete the entry if it is also present in an internal node, and
   replace it there with the entry to its right / from the right
   sibling

§  If underflow occurs after deletion
   •  Redistribute the entries from the left sibling;
      if not possible – redistribute the entries from the right
      sibling;
      if not possible – merge the node with the left and right
      siblings


Example

Starting tree:
  root [14 24];  internal [12], [22], [25]
  leaves [11 12] → [14] → [20 22] → [23 24] → [25] → [30]

Delete 20: the entry is removed from its leaf; nothing else changes:
  leaves [11 12] → [14] → [22] → [23 24] → [25] → [30]

Delete 22: 22 is removed from its leaf and from the internal node;
entries from the right sibling are redistributed to the left:
  root [14 24];  internal [12], [23], [25]
  leaves [11 12] → [14] → [23] → [24] → [25] → [30]

Delete 24:
  root [14];  internal [12], [23 25]
  leaves [11 12] → [14] → [23] → [25] → [30]


Delete 14:
  root [12];  internal [11], [23 25]
  leaves [11] → [12] → [23] → [25] → [30]

Delete 12: a level drop occurs; the tree height decreases:
  root [23 25];  leaves [11] → [23] → [25] → [30]


Advantages of B+-trees:

1) Any record can be fetched in the same number of disk accesses

2) Range queries can be performed easily, as the leaves are linked up

3) The height of the tree is less, as only keys are used for
   indexing in internal nodes

4) Support both random and sequential access

Disadvantages of B+-trees:

Insert and delete operations are complicated
The root node becomes a hotspot
Parallel Access of Multiple Disks
Single disk: high block access time (~6 ms – 50 ms)
Why not use parallel access to improve performance?
RAID – Redundant Arrays of Independent Disks (current usage)
       Redundant Arrays of Inexpensive Disks (early usage)
RAID techniques aim to improve performance and reliability.

Two ideas are employed:
1)  Data striping – distribute data onto multiple disks
    Parallel reading of disks – faster data access
2)  Add redundant data to help recover from disk crashes
    Take the help of error-recovery codes
Details follow …

Data Striping
Data striping – distribute data on multiple disks
Bit-level striping: the ith bit of each byte is stored on the ith disk
  Use 8 disks for the 8 bits of a byte  // higher granularity is also possible
  One (parallel) block read – 8 blocks of the data file
  Transfer rate – eight times that of a single disk
  Read/write of a block – involves use of all the disks

Block-level striping: the ith block of data goes to the ith disk
  Using n disks –
  Single-block accesses: n simultaneous block reads can happen
  Multi-block access: n-fold increase in transfer rate (parallel reads)

Downside: the reliability of the set of disks comes down


Reliability of Multiple Disks

Reliability is modeled using Mean Time To Failure (MTTF)

An example scenario:
MTTF of a disk: 2,40,000 hrs
That is, the probability of failure of a single disk in an hour:
1/2,40,000

Probability of failure of some disk in a 100-disk set, per hour:
1/2,400
MTTF of the 100-disk system is 2,400 hrs = 100 days ~ 3.3 months!

This is unacceptable.

Mirroring disks to increase reliability
Mirroring – each disk has a mirror disk – same data on both
If a disk fails – use the mirror of that disk till the original is
replaced

One can improve reliability greatly:
•  A disk with MTTF = 2,40,000 hrs – mirrored with the same kind of disk
•  Probability of a disk failure in a particular hour: 2/2,40,000
•  Time to repair/copy a disk is, say, 24 hrs
•  Probability of the other disk failing while copying/repair: 24/2,40,000
•  Probability of data loss: (2/2,40,000) * (24/2,40,000) = 1/(12*10^8)
•  Or MTTF of the combination = 12*10^8 hrs

Performance: reading – same as a single disk or better
Writing – same as a single disk; both disks are updated in parallel

Reliability and performance with parity disks

Mirroring – high reliability, but uses 50% more disks!
Can we get good reliability & performance with fewer additional disks?
Idea: store additional information to recover the data of the failed disk
Error-correcting codes – parity bit (1 if # of 1's is odd, 0 otherwise)
  Data: 1 0 1 1 0 0 1 0 – parity bit: 0 (# of 1's in data & parity is even)
  Data: 1 0 0 1 1 0 1 1 – parity bit: 1 (# of 1's in data & parity is even)
Parity block (assuming block-level data striping with N disks):
the ith bit of parity block j = parity of the ith bits of block j on all disks
Parity disk – has parity blocks for all data blocks

If a disk k fails: reconstruct the ith bit of its block j as the
parity (XOR) of the ith bits of block j on the surviving disks and
the parity disk. Do this for all blocks to recover the data of disk k!
N data disks + one extra disk – good performance and reliability!
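
Recovery is just bitwise XOR. A small sketch (byte strings stand in for disk blocks; three data disks are assumed for illustration):

```python
def parity_block(blocks):
    # XOR corresponding bytes of all blocks to get the parity block
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

d = [b"\x0f\xf0", b"\x33\x55", b"\xa5\x5a"]   # blocks on 3 data disks
p = parity_block(d)                           # block on the parity disk

# disk 1 crashes: XOR the surviving blocks with the parity block
recovered = parity_block([d[0], d[2], p])
assert recovered == d[1]
```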

Distributed Parity
N data disks and 1 redundant (parity) disk:
•  Very good performance and protection against a single-disk crash
•  But updating any data block requires updating the parity disk
•  Usage of the parity disk is high, and it ages faster!

Can we distribute the parity information?
Use each disk as the redundant (parity) disk for some part of the data!
Say we have D0, D1, D2, …, D5 – 6 disks with, say, 60 cylinders each.
Use each as the redundant disk for 1/6 of the data:
  Cyl# 0, 6, 12, … of D0 – parity blocks for the other disks' cyl# 0, 6, 12, …
  Cyl# 1, 7, 13, … of D1 – parity blocks for the other disks' cyl# 1, 7, 13, …
  Etc.
This is called distributed parity – disk usage is uniform!

Standard RAID Levels
RAID-0 – Block-level striping; no parity data; no mirroring
RAID-1 – Mirrored disks; no parity; no data striping
RAID-2 – Bit-level striping; redundancy using Hamming codes
         (not in much use currently)
RAID-3 – Byte-level striping; dedicated parity disk
         (not in common use currently)
RAID-4 – Block-level striping; dedicated parity disk
RAID-5 – Block-level striping; distributed parity
RAID-6 – Block-level striping; double distributed parity;
         up to 2 disk crashes can be tolerated


Storage Area Networks (SAN)

Specialized computing systems for providing large-scale storage
-- Dedicated hardware and software
-- Shared across several servers
-- Connected to servers through a dedicated high-speed network
   using special optical cables – Fibre Channel
-- Block-level data storage
-- Internally use a large number of disks under a suitable RAID scheme
-- Offer a SCSI (Small Computer System Interface) interface to servers
-- Details are beyond the scope of this course
