File Organization and Indexing: Structure of Disks

The document discusses file organization and indexing in a relational database management system (RDBMS). It describes how data is ultimately stored in disk files and that the RDBMS prefers to manage disk space itself rather than using operating system services. It then discusses the structure of disks including platters, tracks, sectors, and cylinders. The document also covers data transfer from disks including seek time and rotational delay that determine access time. It describes different types of data records and files, including fixed and variable length records, and how records are packed into blocks. Finally, it discusses primary file organization methods like heap files, sorted files, and hashed files.

Database Design Prof. P. Sreenivasa Kumar

File Organization and Indexing

The data of an RDB is ultimately stored in disk files

Disk space management:

Should operating system services be used?

Should the RDBMS manage the disk space by itself?

The second option is preferred, as the RDBMS requires complete control over when a block in the main memory buffer is written to the disk

This is important for recovering data when a system crash occurs

Structure of Disks

[Figure: a disk with its platters, a read/write head, a track and a sector marked; speed: 7000 to 10000 rpm]
Disk
several platters stacked on a rotating spindle

one read / write head per surface for fast access

platter has several tracks

• xxx per inch

each track - several sectors

Indian Institute of Technology Madras

each sector - blocks

unit of data transfer - block

cylinder i - track i on all platters

Data transfer from Disk

Address of a block: Surface No, Cylinder No, Block No

Data transfer:

Move the r/w head to the appropriate track

• time needed - seek time - X ms

Wait for the appropriate block to come under r/w head

• time needed - rotational delay - Y ms

Access time: Seek time + rotational delay

Blocks on the same cylinder are roughly close to each other in terms of access time, followed by blocks on adjacent cylinders: cylinder i, cylinder (i + 1), etc.
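As a rough illustration of the access time formula, the sketch below estimates the rotational delay as half a rotation, which is the usual average-case assumption. The 8 ms seek time and the function names are illustrative assumptions, not values from the slides; only the 7000 to 10000 rpm range comes from the text.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full rotation, in milliseconds."""
    return 60_000 / rpm


def avg_rotational_delay_ms(rpm: float) -> float:
    """Average-case assumption: the wanted block is half a rotation away."""
    return rotation_time_ms(rpm) / 2


def access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time = seek time + rotational delay."""
    return seek_ms + avg_rotational_delay_ms(rpm)


if __name__ == "__main__":
    # Assumed seek time of 8 ms at both ends of the speed range in the slides.
    for rpm in (7000, 10000):
        print(rpm, "rpm:", round(access_time_ms(8.0, rpm), 2), "ms")
```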

Data records and Files

Fixed length record type: each field of fixed length

• in a file of this type of records, the record number can be used to locate a specific record

• the number of records and the length of each field are available in the file header
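A minimal sketch of how a record number locates a fixed length record: the byte offset is the header size plus the record number times the record length. The two-field layout (a 4-byte rollNo and a 20-byte name), the header layout and the in-memory file are assumptions made for illustration only.

```python
import io
import struct

# Assumed layouts (hypothetical): header = (number of records, record length),
# record = (rollNo as 4-byte int, name as 20 bytes).
HEADER = struct.Struct("<ii")
RECORD = struct.Struct("<i20s")


def write_file(records):
    """Write the header followed by the fixed length records."""
    buf = io.BytesIO()
    buf.write(HEADER.pack(len(records), RECORD.size))
    for roll_no, name in records:
        buf.write(RECORD.pack(roll_no, name.encode()))
    return buf


def read_record(buf, n):
    """Fetch record number n (0-based) by computing its byte offset."""
    count, length = HEADER.unpack(buf.getvalue()[:HEADER.size])
    if not 0 <= n < count:
        raise IndexError("record number out of range")
    buf.seek(HEADER.size + n * length)
    roll_no, name = RECORD.unpack(buf.read(length))
    return roll_no, name.rstrip(b"\0").decode()
```

For example, `read_record(write_file([(101, "asha"), (104, "ravi")]), 1)` seeks directly to the second record without reading the first.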

Variable length record type:

• arise due to missing fields, repeating fields, variable length fields

• special separator symbols are used to indicate the field boundaries and record boundaries

• the number of records and the separator symbols used are recorded in the file header

Packing records into blocks

Record length much less than block size

• Usual case

• Blocking factor b = ⌊B/r⌋: the maximum no. of records that can be stored in a block

(B - block size in bytes, r - record length in bytes)

Record length greater than block size

• spanned organization is used

[Figure: spanned organization; a record that does not fit in the current block continues in the next block]

File blocks

Sequence of blocks containing all the records of the file

Mapping file blocks onto the disk blocks

Contiguous allocation

• Consecutive file blocks are stored in consecutive disk blocks

• Adv: File scanning can be done fast using double buffering

• Disadv: Expanding the file by including a new block in the middle of the sequence is difficult

Linked allocation

• each file block is assigned to some disk block

• each disk block has a pointer to next block of the sequence

• file expansion is easy; but scanning is slow

Mixed allocation

Primary File Organization

The logical policy / method used for placing records into file blocks.

Example: Student file - organized to have students' records sorted in increasing order of the "rollNo" values

Goal: To ensure that operations performed frequently on the file execute fast

• conflicting demands may be there

• example: on the student file, access based on rollNo and also access based on name may both be frequent

• we choose to make rollNo access fast

• for making name access fast, additional access structures are needed.

Different file organization methods

We will discuss Heap files, Sorted files and Hashed files

Heap file:

Records are appended to the file as they are inserted

Simple organization

Insertion - read the last file block, append the record, write back the block - easy

Locating a record given values for any attribute

• requires scanning the entire file - costly

Heap files are often used only along with other access structures.

Sorted files / Sequential files (1/2)

Ordering field:

The field whose values are used for sorting the records in the data file

Ordering key field: An ordering field that is also a key

Sorted file / Sequential file:

Data file whose records are arranged such that the values of the ordering field

are in ascending order

Locating a record given the value X of the ordering field:

Binary search can be performed - efficient

(Recall that the address of the nth file block can be obtained from the file header)
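The binary search over file blocks can be sketched as follows. Each block is modelled as a non-empty sorted list of ordering field values, and a block read is counted whenever a block is inspected; the function name and the in-memory model are assumptions for illustration.

```python
def find_record(blocks, x):
    """Binary search over the blocks of a sorted file.

    blocks: list of non-empty, sorted lists of ordering field values.
    Returns (block number or None, number of blocks read).
    """
    lo, hi = 0, len(blocks) - 1
    accesses = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]          # one block access
        accesses += 1
        if x < block[0]:
            hi = mid - 1             # x can only be in an earlier block
        elif x > block[-1]:
            lo = mid + 1             # x can only be in a later block
        else:
            # x falls within this block's range; search inside the block
            return (mid if x in block else None), accesses
    return None, accesses
```

With four blocks, at most three blocks are read, compared with up to four for a linear scan; the gap widens quickly as the file grows.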

Sorted files / Sequential files (2/2)

Inserting a new record:

Ordering gets affected

• costly as all blocks following the block in which insertion is performed

may have to be modified

Hence not done directly in the file

• all inserted records are kept in an auxiliary file

• periodically file is reorganized - auxiliary file and main file are merged

• locating a record is carried out first on the auxiliary file and then on the main file.

Deleting a record:

Deletion markers are used.

Hashed files
Very useful file organization, if quick access to the data record is needed given

the value of a single attribute

Hashing field: The attribute on which quick access is needed and on which

hashing is performed

Data file: organized as buckets numbered 0, 1, …, M-1 (bucket - a block or a few consecutive blocks)

Hash function h: maps the values from the domain of the hashing attribute to

bucket numbers

Inserting records into a hashed file

[Figure: main buckets 0 to M-1, each with an overflow chain into the overflow buckets]

Insertion: for the given record R, apply h on the value of the hashing attribute to get the bucket number r. If there is space in bucket r, place R there; else place R in the overflow chain of bucket r

Deleting records from a hashed file

Deletion: locate the record R to be deleted by applying h. Remove R from its bucket or overflow chain. If possible, bring a record from the overflow chain into the bucket.
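The insertion and deletion rules for a static hashed file can be sketched as below. The class name, the use of `key mod M` as the hash function h, and the modelling of each overflow chain as a plain list are assumptions for illustration; records are reduced to their hashing attribute values.

```python
class HashedFile:
    """Static hashing sketch: M main buckets of capacity r plus overflow chains."""

    def __init__(self, m, r):
        self.m, self.r = m, r
        self.main = [[] for _ in range(m)]       # main buckets
        self.overflow = [[] for _ in range(m)]   # one overflow chain per bucket

    def h(self, key):
        return key % self.m                      # assumed hash function

    def insert(self, key):
        b = self.h(key)
        if len(self.main[b]) < self.r:
            self.main[b].append(key)             # space in the main bucket
        else:
            self.overflow[b].append(key)         # goes to the overflow chain

    def delete(self, key):
        b = self.h(key)
        if key in self.main[b]:
            self.main[b].remove(key)
            if self.overflow[b]:
                # bring a record from the overflow chain into the bucket
                self.main[b].append(self.overflow[b].pop(0))
            return True
        if key in self.overflow[b]:
            self.overflow[b].remove(key)
            return True
        return False
```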

Hashed file performance

Locating a record given the value of the hashing attribute: most often, one block access

Capacity of the file C = r * M records (r – no. of records per bucket, M – no. of buckets)

Disadvantage with static hashing:

• actual records in the file much less than C: wastage of disk space

• actual records in the file much more than C: long overflow chains, degraded performance.

Hashing for dynamic file organization

The binary representation of bucket numbers is exploited cleverly to devise dynamic hashing schemes

two schemes

• Extendible hashing

• Linear hashing

k-bit sequence corresponding to a record R:

Apply hashing function to the value of the hashing field of R to get the bucket

number r

Convert r into its binary representation to get the bit sequence. Take the leading k bits.

Extendible Hashing

Global depth d: the number of higher order bits used in the directory (d = 3 in the figure)

Local depth: the number of bits in the common prefix of the bit sequences corresponding to the records in the bucket

[Figure: a directory with entries 000 to 111 (d = 3) pointing to buckets of local depth 2 or 3; for example, a bucket of local depth 2 holds all records with 2-bit sequence '10', and a bucket of local depth 3 holds all records with 3-bit sequence '111']

Locating a record

Match the d-bit sequence with an entry in the directory and go to the

corresponding bucket to find the record.

Insertion in Extendible Hashing Scheme (1/2)

2 - bit sequence for the record to be inserted: 01

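[Figure: before the insertion, d = 2 and the directory entries 00 and 01 both point to b0 (full, local depth 1), while 10 points to b1 and 11 to b2 (local depth 2); after the insertion, d = 2, entry 00 points to b0, 01 to the new bucket b3, 10 to b1 and 11 to b2, and all local depths are 2]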

Bucket b0 is split

b0 full: all records whose 2-bit sequence is '01' are sent to a new bucket b3. Others are retained in b0. Directory is modified.

b0 not full: new record is placed in b0. No changes in the directory.

Insertion in Extendible Hashing Scheme (2/2)

2 - bit sequence for the record to be inserted: 10

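[Figure: before the insertion, d = 2, the directory entries 00, 01, 10 and 11 point to b0, b3, b1 (full) and b2, and all local depths are 2; after the insertion, d = 3, the directory has entries 000 to 111, b1 and the new bucket b4 have local depth 3, and b0, b3 and b2 keep local depth 2]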

b1 not full: new record placed in b1. No changes.

b1 full : b1 is split, directory is doubled, all records with 3-bit sequence 101 sent to b4.

Others in b1.

In general, if the local depth of the bucket to be split is equal to the global depth,

directory is doubled.
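The locate and insert rules above can be sketched in a few lines. The sketch assumes integer keys, an 8-bit bit sequence obtained as `key mod 256` (WIDTH is a hypothetical parameter), and a bucket capacity of 2; the directory is indexed by the leading d bits, so doubling the directory duplicates each entry in place.

```python
WIDTH = 8       # assumed length of the bit sequence (hypothetical)
CAPACITY = 2    # assumed bucket capacity


class Bucket:
    def __init__(self, depth):
        self.depth = depth   # local depth
        self.keys = []


class ExtendibleHash:
    def __init__(self):
        self.d = 0                       # global depth
        self.directory = [Bucket(0)]     # 2**d entries

    def _index(self, key):
        bits = key % (1 << WIDTH)        # WIDTH-bit sequence for the key
        return bits >> (WIDTH - self.d)  # its leading d bits

    def find(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        while True:
            b = self.directory[self._index(key)]
            if len(b.keys) < CAPACITY:
                b.keys.append(key)
                return
            if b.depth == WIDTH:
                raise OverflowError("bucket cannot be split further")
            if b.depth == self.d:
                # local depth equals global depth: double the directory
                self.directory = [x for x in self.directory for _ in (0, 1)]
                self.d += 1
            self._split(b)

    def _split(self, b):
        b.depth += 1
        image = Bucket(b.depth)          # split image of b
        shift = self.d - b.depth
        for i, x in enumerate(self.directory):
            # entries of b whose next prefix bit is 1 now point to the image
            if x is b and (i >> shift) & 1:
                self.directory[i] = image
        old, b.keys = b.keys, []
        for k in old:                    # redistribute on the longer prefix
            self.directory[self._index(k)].keys.append(k)
```

A split may leave all records in one of the two buckets, which is why `insert` retries in a loop until the target bucket has room.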

Linear Hashing
Does not require a separate directory structure

Uses a family of hash functions h_0, h_1, h_2, …

• the range of h_i is double the range of h_(i-1)

• h_i(x) = x mod (2^i · M)

M – the initial no. of buckets (assume that the hashing field is an integer)

• Note that if h_i(x) = k, then x = M'·r + k for some integer r, and h_(i+1)(x) = (M'·r + k) mod 2M' = k or M' + k

r even (r = 2s): (M'·2s + k) mod 2M' = k

r odd (r = 2s + 1): (M'·(2s + 1) + k) mod 2M' = M' + k

M' = 2^i · M – the current number of original buckets.

Insertion (1/3)

[Figure: main buckets 0 to M-1 with overflow buckets, and a new bucket M that is the split image of bucket 0]

Initially the structure has M main buckets and a few overflow buckets

To insert a record with hash field value k, place the record in bucket h_0(k)

When the first overflow occurs in any bucket, say in bucket s:

• Insert the record in the overflow chain of bucket s

• Create a new bucket M

• Split the records of bucket 0 by using h_1

• Some records stay in bucket 0 and some go to bucket M.

Insertion (2/3)
On the first overflow, irrespective of where it occurs, bucket 0 is split. On subsequent overflows, buckets 1, 2, 3, … are split in that order

(This is why the scheme is called linear hashing)

N: the next bucket to be split

After M overflows, all the original M buckets are split.

We switch to hash functions h_1, h_2 and set N = 0.

The pair of hash functions in use thus goes through (h_0, h_1), (h_1, h_2), …, (h_i, h_(i+1)).

[Figure: original buckets 0 to M-1 followed by their split images M, M+1, M+2, …]

Insertion (3/3)
Say the hash functions in use are h_i and h_(i+1)

To insert a record with hash field value k:

Compute h_i(k)

if h_i(k) < N, the original bucket is already split; place the record in bucket h_(i+1)(k)

else place the record in bucket h_i(k)
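The three insertion slides can be put together in a short sketch. It assumes integer keys and models each bucket together with its overflow chain as a single list, so a bucket holding more than `capacity` keys stands for a bucket with an overflow chain; the class and attribute names are illustrative.

```python
class LinearHash:
    """Linear hashing sketch with h_i(x) = x mod (2**i * M)."""

    def __init__(self, m=1, capacity=2):
        self.m = m                # M: initial number of buckets
        self.capacity = capacity  # records per main bucket
        self.level = 0            # i: functions in use are h_i and h_(i+1)
        self.next = 0             # N: next bucket to be split
        self.buckets = [[] for _ in range(m)]

    def _h(self, i, key):
        return key % (self.m * 2 ** i)

    def bucket_no(self, key):
        b = self._h(self.level, key)
        if b < self.next:         # bucket b is already split: use h_(i+1)
            b = self._h(self.level + 1, key)
        return b

    def insert(self, key):
        bucket = self.buckets[self.bucket_no(key)]
        bucket.append(key)        # beyond capacity = overflow chain
        if len(bucket) > self.capacity:
            self._split()         # every overflow splits bucket N

    def _split(self):
        old = self.buckets[self.next]
        self.buckets[self.next] = []
        self.buckets.append([])   # split image, number N + 2**i * M
        for k in old:
            self.buckets[self._h(self.level + 1, k)].append(k)
        self.next += 1
        if self.next == self.m * 2 ** self.level:
            # all original buckets are split: switch functions, reset N
            self.level += 1
            self.next = 0
```

Note that the bucket that overflows is not necessarily the one that is split; the overflowing record stays in its chain until N reaches that bucket.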

Index Structures
Index: A disk data structure that enables efficient retrieval of a record given the value(s) of certain attributes, called the indexing attributes

Primary Index:

Index built on ordering key field of a file

Clustering Index:

Index built on ordering non-key field of a file

Secondary Index:

Index built on any non-ordering field of a file

Primary Index

[Figure: index entries on the ordering key (RollNo) 101, 121, 129, …, 240, each pointing to the data file block (0, 1, 2, …, b) whose first record has that RollNo; for example, block 0 starts with records 101 and 104, and block 1 with 121 and 123]

Can be built on ordered / sorted files

Index attribute – ordering key field (OKF)

Index entry:

(value of OKF for the first record of a block Bj, disk address of Bj)

Index file: ordered file (sorted on OKF); size – no. of blocks in the data file

Index file blocking factor BFi = ⌊B/(V + P)⌋

(B – block size, V – OKF size, P – block pointer size)

- generally more than data file blocking factor

No. of index file blocks bi = ⌈b/BFi⌉

(b – no. of data file blocks)

Record Access using Primary Index

Given ordering key field (OKF) value: x

Carry out binary search on the index file

m – value of OKF for the first record in the middle block k of the index file

x < m: do binary search on blocks 0 – (k-1) of the index file

x ≥ m: if block k has an index entry that covers x (the entry with the largest OKF value ≤ x, where x is also less than the first OKF value of the next index block), use the corresponding block pointer to get the data file block and search it for the data record with OKF value x; else do binary search on blocks k+1 … bi of the index file

Max. block accesses required for the index search: ⌈log2 bi⌉ (plus one access for the data file block)
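The lookup can be sketched with the index held as a sorted list of (first OKF value of the block, block number) entries. This simplification searches the index entries directly with Python's `bisect` module rather than block by block, so it illustrates which entry covers x, not the block access count; the function name and data are illustrative.

```python
import bisect


def lookup(index, data_blocks, x):
    """Return the number of the data block holding OKF value x, or None.

    index: sorted list of (first OKF value of block, block number).
    data_blocks: list of blocks, each a sorted list of OKF values.
    """
    first_keys = [key for key, _ in index]
    # entry with the largest first-key value <= x
    pos = bisect.bisect_right(first_keys, x) - 1
    if pos < 0:
        return None                    # x is smaller than every key
    block_no = index[pos][1]
    return block_no if x in data_blocks[block_no] else None
```

An OKF value such as 123 need not appear in the index itself: it is found through the entry for 121, the first record of its block.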

An Example
Data file:

No. of blocks b = 9500

Block size B = 4KB

OKF length V = 15 bytes

Block pointer length P = 6 bytes

Index file

No. of records ri = 9500

Size of entry V + P = 21 bytes

Blocking factor BFi = ⌊4096/21⌋ = 195

No. of blocks bi = ⌈ ri/BFi⌉ = 49

Max No. of block accesses for getting record

using the primary index 1 + ⌈log2 bi⌉ = 7

Max No. of block accesses for getting record

without using primary index ⌈log2b⌉ = 14
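The arithmetic of this example can be checked with a short script. The function name is illustrative; the symbols b, B, V and P are those of the slides.

```python
import math


def primary_index_cost(b, B, V, P):
    """Return (BFi, bi, max accesses with index, max accesses without)."""
    bfi = B // (V + P)                       # index blocking factor
    bi = -(-b // bfi)                        # ceiling of b / BFi
    with_index = 1 + math.ceil(math.log2(bi))  # index search + data block
    without_index = math.ceil(math.log2(b))    # binary search on data file
    return bfi, bi, with_index, without_index


if __name__ == "__main__":
    print(primary_index_cost(b=9500, B=4096, V=15, P=6))
```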

Making the index multi-level

[Figure: second level index (1 block, 49 entries) pointing to the first level index (49 blocks, 9500 entries), which points to the data file (9500 blocks)]

Index file – itself an ordered file

– another level of index can be built

Multilevel Index –

Successive levels of indices are built till the last level has one block

height – no. of levels

block accesses: height + 1 (no binary search required)

For the example data file:

No. of block accesses required with multi-level primary index: 3; without any index: 14
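A minimal sketch of how the levels are built: each level has one entry per block of the level below, and levels are added until a level fits in one block. The function name is illustrative, and the index blocking factor is assumed to be at least 2 so that the loop terminates.

```python
def index_levels(data_blocks: int, bfi: int) -> list[int]:
    """Number of blocks at each index level, first level first (bfi >= 2)."""
    levels = []
    entries = data_blocks              # first level: one entry per data block
    while True:
        blocks = -(-entries // bfi)    # ceiling division
        levels.append(blocks)
        if blocks == 1:
            return levels              # last level has one block
        entries = blocks               # next level: one entry per index block


if __name__ == "__main__":
    levels = index_levels(9500, 195)
    print(levels, "block accesses:", len(levels) + 1)
```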

Insertion, Deletion and Range search

Range search on the ordering key field:

Get records with OKF value between x1 and x2 (inclusive)

Use the index to locate the record with OKF value x1 and read succeeding records till the OKF value exceeds x2: very efficient

Insertion: Data file – keep 25% of the space in each block free to take care of future insertions, so that the index doesn't get changed; or use overflow chains for blocks that overflow

Deletion: Handle using deletion markers so that the index doesn't get affected

Basically, avoid changes to the index

Clustering Index
Built on ordered files where the ordering field is not a key

Index attribute: ordering field (OF)

Index entry:

(distinct value Vi of the OF, address of the first block that has a record with OF value Vi)

Index file: Ordered file (sorted on OF)

size – no. of distinct values of OF

Secondary Index
Built on any non-ordering field (NOF) of a data file

NOF is also a key (secondary key):

(value of the NOF Vi, pointer to the record with Vi as the NOF value)

NOF is not a key: two options

(1) (value of the NOF Vi, pointer(s) to the record(s) with Vi as the NOF value)

(2) (value of the NOF Vi, pointer to a block that has pointer(s) to the record(s) with Vi as the NOF value)

With option (1), the index entry is a variable length record

With option (2), there is one more level of indirection

Secondary Index (key)

Can be built on ordered and also other type of files

Index attribute: non-ordering key field

Index entry:

(value of the NOF Vi, pointer to the record with Vi as the NOF value)

Index file: ordered file (sorted on NOF values)

No. of entries – same as the no. of records in the data file

Index file blocking factor BFi = ⌊B/(V + Pr)⌋ (B – block size, V – length of the NOF, Pr – length of a record pointer)

No. of index file blocks = ⌈r/BFi⌉ (r – no. of records in the data file)

An Example
Data file:

No. of records r = 90,000 Block size B = 4KB

Record length R = 100 bytes BF = ⌊4096/100⌋ = 40,

b = ⌈90000/40⌉ = 2250

NOF length V = 15 bytes length of a record pointer Pr = 7 bytes

Index file:

No. of records ri = 90,000 record length = V + Pr = 22 bytes

BFi = ⌊4096/22⌋ = 186 No. of blocks bi = ⌈90000/186⌉ = 484

Max no. of block accesses to get a record

using the secondary index 1 + ⌈log2bi⌉ = 10

Avg no. of block accesses to get a record

without using the secondary index b/2 = 1125

A very significant improvement.

Index sequential Access Method (ISAM) files

ISAM files –

Ordered files with a multilevel primary/clustering index

Insertions:

Handled using overflow chains at data file blocks

Deletions:

Handled using deletion markers

Most suitable for files that are relatively static

B+- trees
Balanced search trees

• all leaves at the same level

Leaf node entries point to the actual data records

• all leaf nodes are linked up as a list

Internal node entries carry index information

Ensures that blocks are between half full and completely full

Supports both random and sequential access of records

Order
Order (m) of an Internal Node

• order of an internal node is the maximum number of tree pointers held in it.

• maximum of (m-1) keys can be present in an internal node

Order (mleaf) of a Leaf Node

• order of a leaf node is the maximum number of record pointers a leaf can

hold. It is equal to the number of keys in a leaf node.

Internal Node
An internal node structure of a B+- tree of order m:

It contains at least ⌈m/2⌉ pointers, except when it is the root node

It contains at most m pointers.

If it has pointers p1, p2, …, pj with keys k1 < k2 < k3 < … < kj-1, where ⌈m/2⌉ ≤ j ≤ m, then

• p1 points to the subtree with records having key value x ≤ k1

• pi (1 < i < j) points to the subtree with records having key value x such that ki-1 < x ≤ ki

• pj points to the subtree with records having key value x > kj-1

Internal Node Structure

⌈m/2⌉ ≤ j ≤ m

P1 K1 P2 K2 … Ki-1 Pi Ki … Kj-1 Pj

P1: x ≤ K1; Pi: Ki-1 < x ≤ Ki; Pj: Kj-1 < x

Example

Keys: 2 5 12 - (the last key slot is empty)

Subtrees, left to right: x ≤ 2; 2 < x ≤ 5; 5 < x ≤ 12; x > 12

Leaf Node Structure

Structure of a leaf node of a B+- tree of order mleaf:

It contains one block pointer P to point to the next leaf node

At least ⌈mleaf/2⌉ record pointers and ⌈mleaf/2⌉ key values

At most mleaf record pointers and key values

A node has keys K1 < K2 < … < Kj with Pr1, Pr2, …, Prj as record pointers and P as block pointer

Pri → points to the record with key Ki, 1 ≤ i ≤ j

P → points to the next leaf block

K1 Pr1 K2 Pr2 … Kj Prj P

Order calculation
Block size B, Size of Indexing field V

Size of block pointer – P,

Size of record pointer – Pr then

Order of Internal node (m):

As there can be at most m block pointers and (m-1) keys

(m * P) + ((m-1) * V) ≤ B

m is the largest integer satisfying the above inequality.

Order of leaf node:

As there can be mleaf record pointers and keys with one block pointer, mleaf is the largest integer satisfying

(mleaf * (Pr + V)) + P ≤ B

Example Order calculation

Given B = 512 bytes V = 8 bytes P = 6 bytes Pr = 7 bytes. Then

Internal node order m =?

m * P + ((m-1) *V) ≤ B

m * 6 + ((m-1) *8) ≤ 512

14m ≤ 520

m ≤ 37

Leaf order mleaf = ?

mleaf (Pr + V) + P ≤ 512

mleaf (7 + 8) + 6 ≤ 512

15mleaf ≤ 506

mleaf ≤ 33
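Both inequalities can be solved directly with integer division, as in the sketch below. Rearranging m·P + (m-1)·V ≤ B gives m ≤ (B + V)/(P + V), and mleaf·(Pr + V) + P ≤ B gives mleaf ≤ (B - P)/(Pr + V). The function names are illustrative.

```python
def internal_order(B: int, V: int, P: int) -> int:
    """Largest m with m*P + (m-1)*V <= B."""
    return (B + V) // (P + V)


def leaf_order(B: int, V: int, P: int, Pr: int) -> int:
    """Largest mleaf with mleaf*(Pr + V) + P <= B."""
    return (B - P) // (Pr + V)


if __name__ == "__main__":
    print(internal_order(512, 8, 6), leaf_order(512, 8, 6, 7))
```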

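Example B+- tree

m = 3, mleaf = 2

Root: [3 7]

Internal nodes: [2 -] [4 -] [9 -]

Leaf nodes: [1 2] [3 -] [4 -] [6 7] [8 9] [12 15] ^

(- marks an empty slot; ^ marks a null pointer at the end of the leaf list)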

Insertion into B+- trees

Every key is inserted at leaf level

If leaf node overflows, then

• Node is split at j = ⌈(mleaf + 1)/2⌉

• First j entries are kept in original node

• Entries from j+1 onwards are moved to new node

• jth key value is replicated in the parent of the leaf.

If internal node overflows

• Node is split at j = ⌊(m + 1)/2⌋

• Values and pointers up to Pj are kept in original node

• jth key value is moved to parent of the internal node

• Pj+1 and the rest of the entries are moved to new node.
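The split rules can be sketched as a recursive insert on keys only (record pointers are omitted). The sketch assumes m = 3 and mleaf = 2, a leaf split position of ⌈(mleaf + 1)/2⌉ and an internal split position of ⌊(m + 1)/2⌋; the class and function names are illustrative.

```python
import bisect

M, MLEAF = 3, 2   # assumed orders of internal and leaf nodes


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []
        self.children = []   # tree pointers (internal nodes only)
        self.next = None     # next-leaf pointer (leaf nodes only)


def _insert(node, key):
    """Insert key below node; return (key for parent, new node) on a split."""
    if node.leaf:
        bisect.insort(node.keys, key)
        if len(node.keys) <= MLEAF:
            return None
        j = (MLEAF + 2) // 2                 # ceil((MLEAF + 1) / 2)
        right = Node(leaf=True)
        right.keys, node.keys = node.keys[j:], node.keys[:j]
        right.next, node.next = node.next, right
        return node.keys[-1], right          # j-th key is replicated upward
    i = bisect.bisect_left(node.keys, key)   # child i holds k(i-1) < x <= k(i)
    split = _insert(node.children[i], key)
    if split is None:
        return None
    up_key, right = split
    node.keys.insert(i, up_key)
    node.children.insert(i + 1, right)
    if len(node.children) <= M:
        return None
    j = (M + 1) // 2                         # floor((M + 1) / 2)
    new = Node(leaf=False)
    moved_up = node.keys[j - 1]              # j-th key moves to the parent
    new.keys, new.children = node.keys[j:], node.children[j:]
    node.keys, node.children = node.keys[:j - 1], node.children[:j]
    return moved_up, new


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    new_root = Node(leaf=False)
    new_root.keys, new_root.children = [split[0]], [root, split[1]]
    return new_root


def build(keys):
    root = Node(leaf=True)
    for k in keys:
        root = insert(root, k)
    return root


def leaf_keys(root):
    """Walk down to the leftmost leaf, then follow the leaf list."""
    node = root
    while not node.leaf:
        node = node.children[0]
    out = []
    while node is not None:
        out.extend(node.keys)
        node = node.next
    return out
```

Calling `build((20, 11, 14, 25, 30, 12, 22, 23, 24))` gives a root with keys 14 and 24 over internal nodes with keys 12, 22 and 25.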

Example of Insertions
m = 3, mleaf = 2

Step 1: Insert 20, 11

Leaf: [11 20] ^

Insert 14: overflow, leaf is split at j = ⌈(mleaf + 1)/2⌉ = 2; 14 is replicated to upper level

Step 2:

Root: [14 -]
Leaves: [11 14] [20 -] ^

Insert 25: inserted at leaf level

Step 3:

Root: [14 -]
Leaves: [11 14] [20 25] ^

Insert 30: overflow, split at 25; 25 is replicated to upper level

Step 4:

Root: [14 25]
Leaves: [11 14] [20 25] [30 -]

Insert 12: overflows at leaf level

→ Split at leaf level,

→ Triggers overflow at internal node

→ Split occurs at internal node

Step 5:

Root: [14 -]
Internal nodes: [12 -] [25 -]
Leaves: [11 12] [14 -] [20 25] [30 -]

Internal node is split at j = ⌊(m + 1)/2⌋ = 2, i.e. at 14, and 14 is moved up

Insert 22

Step 6:

Root: [14 -]
Internal nodes: [12 -] [22 25]
Leaves: [11 12] [14 -] [20 22] [25 -] [30 -]

Insert 23, 24

Step 7:

Root: [14 24]
Internal nodes: [12 -] [22 -] [25 -]
Leaves: [11 12] [14 -] [20 22] [23 24] [25 -] [30 -]

Deletion in B+- trees

Delete the entry from the leaf node

If the entry is also present in an internal node, delete it there and replace it with the entry to its left in that position.

If underflow occurs after deletion

• Distribute the entries from left sibling

if not possible – Distribute the entries from right sibling

if not possible – Merge the node with left or right sibling

Example

Root: 14 24
Internal nodes: 12 22 25
Leaves: 11 12 14 20 22 23 24 25 30

Delete 20: entry is removed from the leaf

Root: 14 24
Internal nodes: 12 22 25
Leaves: 11 12 14 22 23 24 25 30

Delete 22: entry 22 is removed from leaf and internal node; entries from right sibling are distributed to left

Root: 14 24
Internal nodes: 12 23 25
Leaves: 11 12 14 23 24 25 30

Delete 24

Index keys: 12 23 25
Leaves: 11 12 14 23 25 30

Delete 14

Index keys: 11 23 25
Leaves: 11 12 23 25 30

Delete 12

Index keys: 23 25
Leaves: 11 23 25 30

Level drop has occurred

Advantage of B+- trees:

1) Any record can be fetched in an equal number of disk accesses.

2) Range queries can be performed easily as leaves are linked up

3) Height of the tree is less as only keys are used for indexing

4) Supports both random and sequential access.

Disadvantage of B+- trees:

Insert and delete operations are complicated

Extendible Hashing Example

Bucket capacity – 2, initial buckets = 1

(As the bucket contents show, the directory in this example is indexed by the low order bits of the key value.)

Insert 45, 22

Global depth 0
Bucket (local depth 0): 45, 22

Insert 12

Bucket overflows, local depth = global depth

⇒ Directory doubles and split image is created

Global depth 1
0 → bucket (local depth 1): 22, 12
1 → bucket (local depth 1): 45

Insert 11

Global depth 1
0 → bucket (local depth 1): 22, 12
1 → bucket (local depth 1): 45, 11

Insert 15

Overflow occurs.

Global depth = local depth

Directory doubles and split occurs

Global depth 2
00, 10 → bucket (local depth 1): 22, 12
01 → bucket (local depth 2): 45
11 → bucket (local depth 2): 11, 15

Insert 10

Overflows

Since local depth < global depth

Split image is created

Directory is not doubled.

Global depth 2
00 → bucket (local depth 2): 12
01 → bucket (local depth 2): 45
10 → bucket (local depth 2): 22, 10
11 → bucket (local depth 2): 11, 15

Linear hashing example

Initial buckets = 1, bucket capacity = 2

Mod functions: h0 = x mod 1, h1 = x mod 2; split pointer N = 0

Insert 12, 11

Bucket 0: 12, 11 (N = 0)

Insert 14

B0 overflows; the bucket pointed to by N is split; the mod functions are changed to h0 = x mod 2, h1 = x mod 4

Bucket 0: 12, 14 (N = 0)
Bucket 1: 11

Insert 13

Bucket 0: 12, 14 (N = 0)
Bucket 1: 11, 13

Insert 9

B1 overflows; B0 is split using h1 and its split image is created

Bucket 0: 12
Bucket 1: 11, 13, 9 (N = 1; 9 is in the overflow chain)
Bucket 2: 14

Insert 10

h0(10) = 0 < N, so x mod 4 is applied here and 10 goes to bucket 2

Bucket 0: 12
Bucket 1: 11, 13, 9 (N = 1)
Bucket 2: 14, 10

Insert 18

Overflow at B2; B1 is split; the mod functions are changed to h0 = x mod 4, h1 = x mod 8

Bucket 0: 12 (N = 0)
Bucket 1: 9, 13
Bucket 2: 14, 10, 18 (18 is in the overflow chain)
Bucket 3: 11
