
Democratic and Popular Republic of Algeria

Ministry of Higher Education and Scientific Research

Ecole supérieure en sciences et technologies de l’informatique et du numérique

Indexed sequential structures

Presented by : Dr. Daoudi Meroua

Academic year: 2024/2025


Files with Indexes

➔ Searching for a record in a sequential file structure is generally costly:
→ sequential search
→ binary search in a (very) large file

➔ Indexing is a data structure technique that allows efficient retrieval of file records based on the attributes on which the index has been built.

2024/2025
2nd year CP, Pr Hidouci W.K. (http://hidouci.esi.dz) / SFSD / ESI
Files with Indexes
The attribute (or group of attributes) used to search for records is called a "search key."
For example, in a file of meteorological measurements with records of the form < city, date, temperature >:
Search examples:
→ Find the record(s) where city = 'DJELFA'
Result:
‘DJELFA’, ‘2015-06-23’, 21
‘DJELFA’, ‘2013-10-04’, 15
‘DJELFA’, ‘2015-06-22’, 20
‘DJELFA’, ‘2020-07-16’, 29
Files with Indexes
An index is an ordered table in main memory (MC) containing, among other things, pairs < key, address >.

[Figure: index table with columns Key and adr, pointing into the data file]

Files with Indexes
Example: Search for the record with attribute value A1 = 54

→ Perform a binary search for 54 in the index table in main memory (MC): result adr = <4,2> (block 4, position 2)

→ LireDir(F, 4, buf) and retrieve the record buf.tab[2]
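
The two steps of this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the index contents, the `data_file` dictionary, and the helper `read_block` (a stand-in for LireDir) are hypothetical and do not come from the slides.

```python
from bisect import bisect_left

# Hypothetical dense index: sorted list of (key, (block_number, offset)).
index = [(12, (1, 1)), (31, (3, 2)), (54, (4, 2)),
         (60, (4, 1)), (77, (2, 1)), (99, (3, 1))]

# Hypothetical data file: block number -> list of records (offsets are 1-based).
data_file = {
    1: [(12, "a")],
    2: [(77, "d")],
    3: [(99, "x"), (31, "b")],
    4: [(60, "y"), (54, "c")],
}

def read_block(block_number):
    """Stand-in for LireDir(F, i, buf): simulates one disk read."""
    return data_file[block_number]

def search(key):
    """Binary search in the index, then a single block read."""
    keys = [k for k, _ in index]
    pos = bisect_left(keys, key)
    if pos == len(index) or index[pos][0] != key:
        return None                      # key is not in the index
    block_number, offset = index[pos][1]
    buf = read_block(block_number)
    return buf[offset - 1]               # buf.tab[offset], 1-based in the slides
```

With these sample values, `search(54)` finds the address (4, 2) in the index and reads only block 4.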

Files with Indexes

[Figure: index table in main memory (MC), data file in secondary memory (MS), index file in secondary memory (MS)]

Files with Indexes
The key can have unique values or not (multiple values).

Example of an index on a key attribute with multiple values:

[Figure: index table with columns Key and adr]

Files with Indexes
Different representations of index tables with multiple values:

1) One entry per key value.

2) Multiple entries per key value.
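
The two representations can be compared with a small Python sketch. The city names and addresses are made-up sample data, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Representation 1: one entry per key value, each holding a list of addresses.
index_one_entry = [
    ("ADRAR", [(1, 1)]),
    ("DJELFA", [(1, 2), (2, 1), (3, 2)]),
    ("ORAN", [(2, 2)]),
]

# Representation 2: one entry per (key, address) pair; equal keys are adjacent.
index_multi_entry = [
    ("ADRAR", (1, 1)),
    ("DJELFA", (1, 2)),
    ("DJELFA", (2, 1)),
    ("DJELFA", (3, 2)),
    ("ORAN", (2, 2)),
]

def lookup_one_entry(key):
    """One binary search, then the whole address list is returned."""
    keys = [k for k, _ in index_one_entry]
    pos = bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return index_one_entry[pos][1]
    return []

def lookup_multi_entry(key):
    """Binary search for the first and last entries carrying the key."""
    keys = [k for k, _ in index_multi_entry]
    lo, hi = bisect_left(keys, key), bisect_right(keys, key)
    return [addr for _, addr in index_multi_entry[lo:hi]]
```

Both lookups return the same addresses; the first representation stores each key once, while the second keeps every entry the same size.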

Files with Indexes
The data file can be ordered by the key or not.

1) If the data file is ordered (by the key attribute)
⇒ Non-dense index (Clustered Index): it does not contain all the values of the key attribute.

In this example, each entry in the index table contains the largest key of a group of two consecutive blocks.
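
A minimal Python sketch of such a non-dense index, assuming an ordered file held in memory as a list of blocks. The block contents, the 0-based block numbering, and the names `sparse_index` and `search` are assumptions for illustration.

```python
from bisect import bisect_left

# Hypothetical ordered data file: keys increase within and across blocks.
blocks = [[3, 8], [11, 15], [20, 27], [31, 40], [42, 54], [60, 71]]
GROUP = 2  # blocks per index entry, as in the example above

# Non-dense index: (largest key of the group, number of the group's first block).
sparse_index = [
    (blocks[min(g + GROUP, len(blocks)) - 1][-1], g)
    for g in range(0, len(blocks), GROUP)
]

def search(key):
    """Return (block_number, offset) of key, or None. Numbering is 0-based here."""
    maxima = [m for m, _ in sparse_index]
    pos = bisect_left(maxima, key)       # first group whose largest key >= key
    if pos == len(sparse_index):
        return None                      # key is larger than every key in the file
    first = sparse_index[pos][1]
    for b in range(first, min(first + GROUP, len(blocks))):  # at most GROUP reads
        if key in blocks[b]:
            return (b, blocks[b].index(key))
    return None
```

The index has only three entries for six blocks, which is why it is called non-dense; the price is that a lookup may need to read up to `GROUP` blocks.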

Files with Indexes
The data file can be ordered by the key or not.
2) If the data file is not ordered (by the key attribute)
⇒ Dense index (Non-Clustered Index): it contains all the values of the key attribute.

Files with Indexes : basic operations

Record Search

Search in the index in main memory (MC), then access the data file.

● Exact query (key = value) → binary search for the exact value.
● Interval query (key ∈ [a, b]) → binary search for ‘a’ + sequential search for the following values up to ‘b’.
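
The interval query can be sketched as follows in Python, assuming a dense index sorted by key. The sample index and the function name `range_query` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical dense index sorted by key: (key, (block_number, offset)).
index = [(5, (2, 1)), (12, (1, 1)), (20, (3, 2)), (31, (1, 2)), (47, (2, 2))]

def range_query(a, b):
    """Index entries whose key lies in [a, b]."""
    keys = [k for k, _ in index]
    pos = bisect_left(keys, a)           # binary search for the first key >= a
    result = []
    while pos < len(index) and index[pos][0] <= b:   # sequential scan up to b
        result.append(index[pos])
        pos += 1
    return result
```

The binary search does not require `a` itself to be present: it positions the scan on the first key greater than or equal to `a`.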

Insertion / Deletion of Records

Insert or delete the record in the data file and, if necessary, update the index in MC.

Case of Ordered File:
● More efficient interval query.
● Deletion is more costly.
Files with Indexes : basic operations
Example: Insertion in TŌF (unordered file) with a dense index and unique key values.

Type Tbloc = Struct
        tab : tableau[ b ] de typeEnreg
        NB : entier
     Fin
     Tcouple = Struct
        cle : typeqlq ;
        numBlc , depl : entier
     Fin
Var F : FICHIER de Tbloc BUFFER buf ENTETE ( entier )
    Index : tableau [ MaxIndex ] de Tcouple
    NbE : entier // number of elements in the index table (== number of records in the file F)

Ins( e:TypeEnreg )
   Rech( e.cle , trouv , k ) // binary search in the index table
   SI ( Non trouv )
      // Insertion at the end of the data file ...
      OUVRIR( F, « donnees.dat » , ‘A’ )
      i ← Entete( F , 1 )
      LireDir( F , i , buf )
      SI ( buf.NB < b )
         buf.NB++ ; j ← buf.NB ; buf.tab[ j ] ← e
         EcrireDir( F , i , buf )
      SINON
         i++ ; j ← 1 ;
         buf.NB ← 1 ;
         buf.tab[ j ] ← e
         Aff_entete( F, 1, i ) ; EcrireDir( F , i , buf )
      FSI
      FERMER( F )
      // Insertion in the index table: shift entries k..NbE-1 one position to the right ...
      NbE++ ; m ← NbE
      TQ ( m > k )
         Index[ m ] ← Index[ m-1 ] ;
         m--
      FTQ
      Index[ k ] ← < e.cle , i , j > // key, numBlc, depl
   FSI
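
For readers who prefer an executable version, here is a rough Python equivalent of the insertion above. It is a sketch under assumptions: the data file is simulated by an in-memory list of blocks, the block capacity `B` is set to 2, and all names are hypothetical.

```python
from bisect import bisect_left

B = 2                 # block capacity (b in the pseudocode)
blocks = [[]]         # in-memory stand-in for the data file; blocks[0] is block 1
index = []            # dense index: sorted list of (key, block_number, offset)

def insert(record):
    """record is a tuple whose first field is the key. Returns False on a duplicate key."""
    key = record[0]
    keys = [entry[0] for entry in index]
    k = bisect_left(keys, key)                 # Rech(e.cle, trouv, k)
    if k < len(index) and index[k][0] == key:
        return False                           # key already present: nothing to do
    if len(blocks[-1]) == B:                   # last block is full: start a new one
        blocks.append([])
    blocks[-1].append(record)                  # insertion at the end of the file
    i, j = len(blocks), len(blocks[-1])        # 1-based block number and offset
    index.insert(k, (key, i, j))               # shifts the following entries right
    return True
```

`list.insert` performs the same right shift as the TQ loop, so the index stays sorted after every insertion.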
Files with Indexes
Same example but with non-unique key values.

Type Tcouple = Struct
        cle : typeqlq ;
        tete : ptr(maillon)
     Fin
     maillon = Struct
        val : struct ( numblc , depl : entier ) ;
        adr : ptr(maillon)
     Fin
Var Index : tableau [ MaxIndex ] de Tcouple

Ins( e:TypeEnreg )
   // Insertion at the end of the data file ...
   OUVRIR( F, « donnees.dat » , ‘A’ )
   i ← Entete( F , 1 )
   LireDir( F , i , buf )
   SI ( buf.NB < b )
      buf.NB++ ; j ← buf.NB ; buf.tab[ j ] ← e
      EcrireDir( F , i , buf )
   SINON
      i++ ; j ← 1 ; buf.NB ← 1 ; buf.tab[ j ] ← e
      Aff_entete( F, 1, i ) ; EcrireDir( F , i , buf )
   FSI
   FERMER( F )
   // Insertion in the index table ...
   Rech( e.cle , trouv , k )
   SI ( trouv ) // Add a link <i, j> at the head of the list Index[ k ].tete
      Allouer( p ) ;
      Affval( p , < i , j > ) ;
      Affadr( p , Index[ k ].tete ) ;
      Index[ k ].tete ← p
   SINON // Insert a new entry <key, <i, j>> in the index at position k
      NbE++ ;
      m ← NbE ;
      Allouer( p ) ;
      Affval( p , < i , j > ) ;
      Affadr( p , nil )
      TQ ( m > k ) Index[ m ] ← Index[ m-1 ] ; m-- FTQ
      Index[ k ] ← < e.cle , p > // key = e.cle, head = p
   FSI
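
The same insertion can be sketched in Python. This is an illustrative variant, not the slides' implementation: a Python list of addresses replaces the linked list of maillon cells, the data file is an in-memory list of blocks, and all names are hypothetical.

```python
from bisect import bisect_left

B = 2                 # block capacity (b in the pseudocode)
blocks = [[]]         # in-memory stand-in for the data file; blocks[0] is block 1
index = []            # sorted list of [key, addresses]; addresses replaces the linked list

def insert(record):
    """record is a tuple whose first field is the (possibly repeated) key."""
    key = record[0]
    if len(blocks[-1]) == B:               # last block is full: start a new one
        blocks.append([])
    blocks[-1].append(record)              # always insert at the end of the file
    addr = (len(blocks), len(blocks[-1]))  # <i, j>, both 1-based
    keys = [entry[0] for entry in index]
    k = bisect_left(keys, key)
    if k < len(index) and index[k][0] == key:
        index[k][1].insert(0, addr)        # add at the head, like Affadr(p, Index[k].tete)
    else:
        index.insert(k, [key, [addr]])     # new index entry with a one-element list
```

Unlike the unique-key case, the record is always written to the file; only the index update depends on whether the key already exists.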
Files with Indexes
Management of an Overflow Area

[Figure: non-dense index table; data file with a primary area and an overflow area]

Files with Indexes
Example: Index for LOF File
(no inter-block shifts and no overflow area)

Files with Indexes
Example: LOF File / Insertion
The insertion of c5 causes the overflow of block i:
● Add a new block → i’
● Split the content of i into two halves
● Update the index by inserting a new entry for block i’
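
The split can be sketched in Python as follows. Several details are assumptions rather than facts from the slides: each index entry stores the largest key of one block, the upper half of the keys moves to the new block, and the capacity `B`, the sample blocks, and all names are hypothetical.

```python
from bisect import bisect_left, insort

B = 4   # hypothetical block capacity

# Hypothetical file: block id -> sorted keys. Blocks are linked, not contiguous.
blocks = {0: [10, 20, 30, 40], 1: [50, 60]}
# Non-dense index: sorted list of (largest key in block, block id).
index = [(40, 0), (60, 1)]

def insert(key):
    maxima = [m for m, _ in index]
    pos = min(bisect_left(maxima, key), len(index) - 1)  # block that should hold key
    _, i = index[pos]
    insort(blocks[i], key)
    if len(blocks[i]) > B:                     # overflow of block i
        new_id = max(blocks) + 1               # add a new block i'
        half = len(blocks[i]) // 2
        blocks[new_id] = blocks[i][half:]      # upper half moves to i'
        blocks[i] = blocks[i][:half]
        index[pos] = (blocks[i][-1], i)        # block i now has a smaller largest key
        index.insert(pos + 1, (blocks[new_id][-1], new_id))  # new entry for i'
    else:
        index[pos] = (blocks[i][-1], i)        # largest key may have grown
```

Only block i, the new block, and the index are touched, so no records are shifted between the other blocks.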

Files with Indexes
Index in Main Memory in the form of a BST (binary search tree)

Type Tnoeud = struct
        cle : typeqlq
        numBlc , depl : entier
        fg , fd : ptr(Tnoeud)
     Fin
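
A minimal Python counterpart of the Tnoeud structure, with insertion and search. The field names are English renderings of cle, numBlc, depl, fg and fd; the functions are a sketch and make no attempt to keep the tree balanced.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:                 # mirrors Tnoeud: cle, numBlc, depl, fg, fd
    key: int
    block: int
    offset: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(root: Optional[Node], key: int, block: int, offset: int) -> Node:
    """Insert an index entry and return the (possibly new) root."""
    if root is None:
        return Node(key, block, offset)
    if key < root.key:
        root.left = insert(root.left, key, block, offset)
    elif key > root.key:
        root.right = insert(root.right, key, block, offset)
    return root             # duplicate keys are ignored in this sketch

def search(root: Optional[Node], key: int) -> Optional[Tuple[int, int]]:
    """Return the address (block, offset) stored for key, or None."""
    while root is not None:
        if key == root.key:
            return (root.block, root.offset)
        root = root.left if key < root.key else root.right
    return None
```

Compared with the ordered table, inserting into the tree needs no shifting of entries, at the cost of two pointers per node.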

Files with Indexes : Large Index

[Figure: index in the form of an ordered file with contiguous blocks; main memory (MC)]

Files with Indexes : Large Index
Multilevel index

Files with Indexes : Multi-Key Query

“Find all records where the value of X = vx AND the value of Y = vy AND …”, with X, Y, ... as ‘secondary keys’ (each secondary key has a corresponding secondary index):

● Using the secondary index X, find the list Lx of primary keys associated with the value vx.

● (Repeat the same action for each secondary key mentioned in the query…)

● Perform the intersection of the primary key lists Lx, Ly, ... to find the primary keys that match every secondary key value mentioned in the query.

● Use the primary index to retrieve the records from the data file (sorting the sequence of block numbers first, before performing the physical transfers).
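
These steps can be sketched in Python. The sketch makes assumptions that are not in the slides: the indexes are plain dictionaries rather than ordered tables, the addresses are made-up sample values, and the names `ind_a1`, `ind_a2`, `ind_a3` and `multi_key_query` are hypothetical.

```python
# Hypothetical secondary indexes: attribute value -> list of primary keys.
ind_a2 = {"eee": [32, 65, 70], "bbb": [14]}
ind_a3 = {870: [32], 120: [65, 14]}
# Hypothetical primary index: primary key -> (block_number, offset).
ind_a1 = {14: (3, 1), 32: (2, 1), 65: (1, 2), 70: (3, 2)}

def multi_key_query(conditions):
    """conditions: list of (secondary_index, value) pairs combined with AND.
    Returns the addresses to read, sorted by block number."""
    key_lists = [set(idx.get(value, [])) for idx, value in conditions]
    matching = set.intersection(*key_lists) if key_lists else set()
    # Sorting the addresses groups the reads block by block.
    return sorted(ind_a1[pk] for pk in matching)
```

Sorting the addresses before reading means each block is fetched once and in increasing order, which is the purpose of the last step.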
Files with Indexes : Multi-Key Query
If we are searching for all records where A2 = ‘eee’ and A3 = 870, the multi-key query algorithm will proceed as follows:
a. Search for ‘eee’ in the index IndA2 → result: LA2 = [32, 65, 70]
b. Search for 870 in the index IndA3 → result: LA3 = [32]
c. Intersection of LA2 and LA3 → result: final L = [32]
d. Search for 32 in IndA1 → result: block number <2>
e. LireDir(F, 2, buf) and retrieve the record “<32, eee, 870, …>”

Files with Indexes : Multi-Key Query
Insertion of a record < c, vx, vy, ... >
1) Search for c in the primary index → ip: the position where this key should be inserted (binary search).
2) Insert the record into the data file → adr: the address where the record has been inserted.
3) Insert in the primary index, at position ip, the entry < c, adr > if it is a dense index, or update the entry at position ip if it is a non-dense index.
4) Search for the value vx in the secondary index X.
   ● If vx exists, add c to the list pointed to by vx.
   ● If vx does not exist, insert vx in the secondary index X. In this case, the new entry vx will point to a list formed by a single primary key (c).
5) Repeat step 4) for each remaining secondary key (vy, ...).
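
A compact Python sketch of this insertion for a dense primary index and two secondary keys. The assumptions are: the data file is a Python list whose positions serve as addresses, the secondary indexes are dictionaries rather than ordered tables, and every name is hypothetical.

```python
from bisect import bisect_left

primary = []          # dense primary index: sorted list of (key, address)
secondary_x = {}      # secondary index on X: value -> list of primary keys
secondary_y = {}      # secondary index on Y: value -> list of primary keys
data_file = []        # stand-in for the data file; address = position in the list

def insert(c, vx, vy):
    keys = [k for k, _ in primary]
    ip = bisect_left(keys, c)                  # position of c in the primary index
    if ip < len(primary) and primary[ip][0] == c:
        raise KeyError(f"primary key {c} already exists")
    data_file.append((c, vx, vy))              # insert the record into the data file
    adr = len(data_file) - 1
    primary.insert(ip, (c, adr))               # dense index: new entry <c, adr>
    # Same treatment for every secondary key: extend the list, or create it.
    for sec, value in ((secondary_x, vx), (secondary_y, vy)):
        sec.setdefault(value, []).append(c)
```

`setdefault` covers both sub-cases of the secondary index update: it returns the existing list when the value is already indexed and creates a one-element list otherwise.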

Files with Indexes : Multi-Key Query
Deletion of a record < c, vx, vy, ... >

● To logically delete a record with primary key c, it is sufficient to set a deletion bit (or character) in the data file or in the primary index table for the entry c.

● To physically delete a record with primary key c, you must first physically remove the record from the data file, and then update the primary index table either by deleting the entry related to c (in the case of a dense index) or by modifying the key and/or address of the representative of the group to which the deleted record belongs (in the case of a non-dense index).

In both types of deletion (logical or physical), it is not necessary to update the secondary indexes: they store primary keys rather than addresses, so a stale key is filtered out when the record is fetched through the primary index.

Files with Indexes : Index Bitmap
A bitmap index on an attribute A (which takes m different values: v1, v2, … vm) consists of m binary strings, each with N bits (IndA_v1, IndA_v2, ... IndA_vm), where N is the number of records in the data file.
Each string IndA_vj is associated with the value vj of attribute A.
● If (IndA_vj[k] = 1), then in record number k, attribute A equals vj.
● If (IndA_vj[k] = 0), then in record number k, attribute A is different from vj.

[Figure: the bit strings associated with v1, v2, …, vm, with one position per record number]

Examples:
A = v2 in record number 2 and record number i of the data file.
A = v1 in records number 1, 5, 6, 8, … N-2 and N-1.
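
Building such an index can be sketched in a few lines of Python. The column contents and the name `build_bitmap_index` are assumptions, and records are numbered from 0 here for convenience.

```python
def build_bitmap_index(values):
    """values[k] is the value of attribute A in record k (0-based here).
    Returns a dictionary {v: list of N bits}, one bit string per distinct value."""
    n = len(values)
    index = {v: [0] * n for v in set(values)}
    for k, v in enumerate(values):
        index[v][k] = 1          # record k has A = v
    return index

column_a = ["v1", "v2", "v1", "v3", "v2"]   # hypothetical values of A, record by record
ind_a = build_bitmap_index(column_a)
```

Because each record has exactly one value for A, every position holds a 1 in exactly one of the m bit strings.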
Files with Indexes : Index Bitmap
Bitmap indexes can be useful for attributes with low cardinality (e.g., < 20 distinct values).
The different bit strings can be loaded into main memory (MC) independently of each other.
They are primarily used for multi-key queries on attributes with low cardinality.
Example: “Find records where A = v2 and B = w4.”

[Figure: bitmap index on A (cardinality of A = 3) and bitmap index on B (cardinality of B = 4)]

The result of the query is given by the bitwise operation: (IndA_v2 AND IndB_w4)
→ Records number 7 and number i.
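
The AND operation can be illustrated in Python by packing each bit string into an integer. The two sample columns are hypothetical and are not the ones shown in the slide's figure; records are numbered from 0.

```python
def bitmap(values, target):
    """Pack one bit string into a Python int: bit k is 1 when values[k] == target."""
    bits = 0
    for k, v in enumerate(values):
        if v == target:
            bits |= 1 << k
    return bits

def matching_records(bits):
    """Record numbers (0-based) whose bit is set."""
    return [k for k in range(bits.bit_length()) if (bits >> k) & 1]

col_a = ["v1", "v2", "v3", "v2", "v2", "v1"]   # hypothetical values of A
col_b = ["w4", "w1", "w4", "w4", "w2", "w3"]   # hypothetical values of B

ind_a_v2 = bitmap(col_a, "v2")
ind_b_w4 = bitmap(col_b, "w4")
result = matching_records(ind_a_v2 & ind_b_w4)  # records where A = v2 and B = w4
```

Only the two bit strings involved in the query are needed in memory, and the intersection is a single bitwise AND.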