Index Method2

The document presents various indexing methods to improve file organization and access efficiency, particularly for large data files. It discusses the concepts of dense and non-dense indexes, single and multi-level primary indexes, and secondary indexes for multi-criteria access. The conclusion emphasizes that these indexing methods enhance performance by utilizing auxiliary structures to expedite search operations.

Uploaded by

aymen Beskri
FILE ORGANIZATION

Index Methods
Presented by Pr. Nabil KESKES

Year 2023-2024.

PLAN
Introduction.
Reminder
Secondary index and multi-criteria access
Conclusion

1. Reminder

With a simple file structure, access operations (search, insertion, etc.) become inefficient once the data file grows too large.

Indexing methods improve performance by maintaining an auxiliary structure (an index table) that speeds up searches.

1.1 Index

An index is (usually) an ordered table of <key, adr> pairs used to speed up searching for records in a file. The 'key' field holds the key of a record in the file, and the 'adr' field holds that record's address in the file (e.g. as a <numBloc, deplacement> pair: block number and offset within the block).

Description
TYPE T = STRUCTURE
    Key : Typekey
    Adr : Typeadress
END

TYPE Typeadress = STRUCTURE
    Numbloc : Integer
    Depl : Integer
END

VAR TINDEX : Array [1..M] Of T
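
As an illustration only, the declarations above could be mirrored in Python as follows. The integer key type and the sample entries are assumptions made for this sketch; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Counterpart of Typeadress: where a record lives in the data file."""
    num_bloc: int  # block number in the data file
    depl: int      # offset of the record inside that block


@dataclass
class IndexEntry:
    """Counterpart of T: one <key, adr> pair of the index table."""
    key: int       # assumption: Typekey is an integer in this sketch
    adr: Address


# Counterpart of TINDEX: a table of entries kept sorted by key.
# The three entries below are hypothetical sample data.
tindex: list[IndexEntry] = [
    IndexEntry(10, Address(1, 0)),
    IndexEntry(25, Address(0, 2)),
    IndexEntry(40, Address(1, 1)),
]
```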

Note

Other information useful for managing the file can also be stored in each entry, such as a Boolean flag marking logical deletion.

Remarks
• An index is "dense" if it contains every key in the file. In this case, the file does not need to be kept in order.

• An index is "non-dense" if it does not contain every key in the file (for example, only one key per block). In this case, the file must be ordered. The advantage of a non-dense index is its size: it is smaller than a dense index on the same file.
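
The size difference can be seen on a small example. The sketch below builds both kinds of index over a hypothetical ordered file of three blocks. Keeping the largest key of each block in the non-dense index is an assumption of this sketch; the slides only say "one key per block".

```python
# Hypothetical ordered data file: a list of blocks, each block a sorted list of keys.
blocks = [[3, 8, 12], [15, 21, 27], [30, 42, 50]]


def build_dense_index(blocks):
    """One (key, (block, offset)) entry per record, sorted by key."""
    entries = [
        (key, (b, d))
        for b, block in enumerate(blocks)
        for d, key in enumerate(block)
    ]
    return sorted(entries)


def build_non_dense_index(blocks):
    """One (key, block) entry per block; here the largest key of each block."""
    return [(block[-1], b) for b, block in enumerate(blocks)]


dense = build_dense_index(blocks)
non_dense = build_non_dense_index(blocks)
```

Here the dense index holds nine entries and the non-dense index only three, one per block.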

1.2 Single-level primary index

1.3 Basic Operations

Case: dense index (unordered data file)

• Finding a record involves a binary (dichotomous) search for its key in the index table. If the key is found, the record is retrieved from the file at the address stored with it.
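
A minimal sketch of this search, assuming the index is held as two parallel sorted lists and the data file as a list of blocks. The names `data_file`, `index_keys`, `index_adrs` and the sample records are hypothetical.

```python
from bisect import bisect_left

# Hypothetical unordered data file: blocks of (key, payload) records.
data_file = [[(40, "d"), (10, "a")], [(25, "c"), (17, "b")]]

# Dense index: keys in ascending order, with the (block, offset) of each record.
index_keys = [10, 17, 25, 40]
index_adrs = [(0, 1), (1, 1), (1, 0), (0, 0)]


def search(key):
    """Binary search in the index, then one access to the data file."""
    i = bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        blk, depl = index_adrs[i]
        return data_file[blk][depl]
    return None  # key not in the index, so not in the file
```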

Case: dense index

• Insertion: a new record is appended at the end of the file. Its key is inserted into the index table (held in main memory, MC), shifting entries to keep the keys in order.

• Deletion is physical: the deleted record is replaced by the last record in the file, and the index table is updated accordingly. With variable-length records, deletion is generally logical unless the holes left by deleted records are managed.
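
The two operations can be sketched as follows for fixed-size records. To keep the example short, the data file is modelled as a flat list whose positions serve as addresses; this is a simplification of the <block, offset> addresses used in the slides, and all names are hypothetical.

```python
from bisect import bisect_left

data_file = []  # unordered records (key, payload); address = position in the list
index = []      # dense index: list of (key, address) pairs sorted by key


def insert(key, payload):
    """Append the record to the file and insert its key into the index."""
    i = bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return False                            # key already present
    data_file.append((key, payload))            # new record goes at the end
    index.insert(i, (key, len(data_file) - 1))  # list.insert shifts later entries
    return True


def delete(key):
    """Physical deletion: the last record of the file fills the hole."""
    i = bisect_left(index, (key,))
    if i == len(index) or index[i][0] != key:
        return False
    _, adr = index.pop(i)
    last = data_file.pop()             # take the last record out of the file
    if adr < len(data_file):           # the deleted record was not the last one
        data_file[adr] = last          # move the last record into the hole
        j = bisect_left(index, (last[0],))
        index[j] = (last[0], adr)      # the moved record has a new address
    return True
```

Note that deleting one record changes two index entries: the entry of the deleted key disappears and the entry of the moved record gets a new address.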

Case: dense index

An interval query retrieves all records whose key belongs to a given interval of values [a, b].
1- Search the index for the smallest key >= 'a' (binary search in main memory, MC).
2- Continue sequentially through the index table until a key > 'b' is found.
3- For each key visited in the interval, access the data file to retrieve the record.
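
The three steps above can be sketched as follows, assuming a dense index stored as two parallel sorted lists. The sample file and the names `data_file`, `index_keys` and `index_adrs` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical unordered data file: blocks of (key, payload) records.
data_file = [[(40, "d"), (10, "a")], [(25, "c"), (17, "b")]]

# Dense index: keys in ascending order, with the (block, offset) of each record.
index_keys = [10, 17, 25, 40]
index_adrs = [(0, 1), (1, 1), (1, 0), (0, 0)]


def range_query(a, b):
    """Return the records whose key lies in [a, b], in key order."""
    results = []
    i = bisect_left(index_keys, a)                     # step 1: smallest key >= a
    while i < len(index_keys) and index_keys[i] <= b:  # step 2: stop at first key > b
        blk, depl = index_adrs[i]
        results.append(data_file[blk][depl])           # step 3: fetch the record
        i += 1
    return results
```

Because the data file is unordered, consecutive keys of the interval may live in different blocks, so each retrieved record can cost one block access.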

Case: non-dense index (ordered data file)

• Searching for a record involves a binary search for its key in the index table, which identifies the block that may contain it; the search then continues inside that block of the file.

• Deletion is logical: the deletion flag (kept either in the index table or in the data file) is set.
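
A minimal sketch of the search with a non-dense index, assuming the index keeps the largest key of each block (the slides do not say which key is kept) and ignoring deletion flags. The sample blocks are hypothetical.

```python
from bisect import bisect_left

# Hypothetical ordered data file: each block is a sorted list of keys.
blocks = [[3, 8, 12], [15, 21, 27], [30, 42, 50]]

# Non-dense index: entry b holds the largest key of block b (assumption).
index_keys = [12, 27, 50]


def search(key):
    """Return (block, offset) of the key, or None if it is absent."""
    b = bisect_left(index_keys, key)  # first block whose largest key is >= key
    if b == len(index_keys):
        return None                   # key is larger than every key in the file
    d = bisect_left(blocks[b], key)   # continue the search inside that block
    if d < len(blocks[b]) and blocks[b][d] == key:
        return (b, d)
    return None
```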

Case: non-dense index

• Inserting a new record requires shifting records in the data file to keep it ordered. The index table (in main memory, MC) is then updated to reflect the shifts caused by the insertion. This operation is very time-consuming!

• One solution, in the case of an ordered file, is to maintain an overflow zone dedicated to the records pushed out of their block by the inter-block shifts that insertions cause.

Case: non-dense index

If the overflow blocks become too numerous (the overflow lists grow long), the file must be reorganised by creating a new, larger main file.
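
One possible way to picture the overflow zone is sketched below. It assumes one overflow list per main block, a fixed block capacity, and an index that keeps an upper bound on the keys of each block; none of these details are fixed by the slides. Duplicate keys are not checked in this sketch.

```python
from bisect import bisect_left, insort

CAPACITY = 3                          # hypothetical number of records per block
blocks = [[3, 8, 12], [15, 21, 27]]   # ordered main file (full blocks)
overflow = [[], []]                   # one overflow list per main block
index_keys = [12, 27]                 # upper bound of the keys of each block


def target_block(key):
    """Block that should hold the key according to the non-dense index."""
    return min(bisect_left(index_keys, key), len(blocks) - 1)


def insert(key):
    b = target_block(key)
    insort(blocks[b], key)                   # keep the block ordered
    if len(blocks[b]) > CAPACITY:
        overflow[b].append(blocks[b].pop())  # largest record moves to overflow
    index_keys[b] = max(index_keys[b], key)  # the bound still covers every key


def search(key):
    b = target_block(key)
    d = bisect_left(blocks[b], key)
    if d < len(blocks[b]) and blocks[b][d] == key:
        return ("main", b, d)
    if key in overflow[b]:                   # sequential scan of the overflow list
        return ("overflow", b, overflow[b].index(key))
    return None
```

The sequential scan of the overflow list is why long lists hurt performance and eventually call for a reorganisation of the file.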

1.4 Multi-level primary index

• If the index is too large to reside in main memory (MC), it is stored on disk as an ordered index file and a second index is built on it. A single key is chosen for each block of the index file, so the second index is non-dense.

• If the second index is still too large to reside in main memory, it is in turn stored on disk (second index file) and a third index is built by selecting one key per block from the second index file. This process can be repeated as many times as required.
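
The construction can be sketched as follows. The sketch assumes each level is a flat sorted list cut into blocks of `block_size` keys, that the key kept per block is its largest one, and that `max_in_memory` is the number of keys main memory can hold; these names and choices are assumptions for illustration.

```python
from bisect import bisect_left


def build_levels(keys, block_size, max_in_memory):
    """Stack non-dense indexes until the top level fits in main memory.

    levels[0] is the first-level index (sorted keys); levels[-1] is the top level.
    """
    if block_size < 2 or max_in_memory < 1:
        raise ValueError("block_size must be >= 2 and max_in_memory >= 1")
    levels = [list(keys)]
    while len(levels[-1]) > max_in_memory:
        below = levels[-1]
        # Keep one key per block of the level below: the last key of each block.
        levels.append([
            below[min(i + block_size, len(below)) - 1]
            for i in range(0, len(below), block_size)
        ])
    return levels


def search(levels, block_size, key):
    """Return the position of key in levels[0], or None if it is absent."""
    top = levels[-1]
    pos = bisect_left(top, key)        # search the in-memory top level
    if pos == len(top):
        return None                    # key is larger than every indexed key
    for level in reversed(levels[:-1]):
        start = pos * block_size       # entry pos above designates block pos here
        block = level[start:start + block_size]  # one simulated block read
        pos = start + bisect_left(block, key)
    return pos if levels[0][pos] == key else None
```

With 20 keys, blocks of 4 keys and room for 3 keys in memory, this builds three levels of 20, 5 and 2 keys, and a search reads one block per level below the top.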
2. Secondary index and multi-criteria access

To speed up searches on non-key fields (also known as secondary keys), secondary indexes can be built on these fields.

The difficulty with secondary keys is that several records can share the same value of the indexed field. This multiplicity is usually handled by associating each secondary-key value with a list of primary keys.

When we search for records by a secondary key (for example, X = a), we use the secondary index on that field to retrieve the primary key(s) associated with the value we are looking for (a). For each primary key found, we use the primary index to locate the record in the file (block number and offset). This is the inverted list method.
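
A minimal sketch of this two-step lookup. For brevity the secondary index is a Python dict from value to list of primary keys, whereas the slides describe an ordered table; the records and the `city` field are hypothetical sample data.

```python
from bisect import bisect_left

# Hypothetical data file: blocks of (primary key, name, city) records.
data_file = [[(1, "Ali", "Oran"), (2, "Sara", "Alger")], [(3, "Omar", "Oran")]]

# Primary index: (key, (block, offset)) pairs sorted by key.
primary_index = [(1, (0, 0)), (2, (0, 1)), (3, (1, 0))]

# Secondary index on the city field: value -> list of primary keys.
city_index = {"Alger": [2], "Oran": [1, 3]}


def locate(primary_key):
    """Binary search in the primary index; returns (block, offset) or None."""
    i = bisect_left(primary_index, (primary_key,))
    if i < len(primary_index) and primary_index[i][0] == primary_key:
        return primary_index[i][1]
    return None


def find_by_city(city):
    """Secondary index gives primary keys; primary index gives addresses."""
    records = []
    for pk in city_index.get(city, []):
        adr = locate(pk)
        if adr is not None:
            blk, depl = adr
            records.append(data_file[blk][depl])
    return records
```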

2.1 Multi-criteria searches take the form:
"Find all records for which the value of X = vx AND the value of Y = vy AND ...", where X, Y, ... are secondary keys.
To answer such a query, proceed as follows:
1. Using the secondary index on X, find the list Lx of primary keys associated with the value vx.
2. Repeat the same operation for each other secondary key mentioned in the query (giving Ly, ...).
3. Intersect the lists of primary keys Lx, Ly, ... to obtain the primary keys that satisfy every condition in the query.
4. Use the primary index to find the corresponding records in the file.
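
The four steps above can be sketched as follows. Both indexes are plain dicts here for brevity (the slides describe ordered tables), and the field names, values and addresses are hypothetical.

```python
# Secondary indexes: field -> value -> list of primary keys.
secondary = {
    "X": {"a": [1, 4, 7], "b": [2, 5]},
    "Y": {"u": [4, 5, 7], "v": [1]},
}

# Primary index: primary key -> (block, offset).
primary_index = {1: (0, 0), 2: (0, 1), 4: (1, 0), 5: (1, 1), 7: (2, 0)}


def multi_criteria(criteria):
    """criteria maps each secondary key to the required value (ANDed together)."""
    result = None
    for field, value in criteria.items():
        keys = set(secondary[field].get(value, []))            # steps 1 and 2
        result = keys if result is None else result & keys     # step 3: intersect
    if result is None:
        return []                                              # no criterion given
    return [(pk, primary_index[pk]) for pk in sorted(result)]  # step 4
```

The intersection is done on primary keys only, so the data file is accessed just for the records that satisfy every condition.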

2.2 Deletion in this case:

To delete the record with primary key c, simply set a delete bit in the primary index table for entry c. This avoids having to update all the secondary indexes. At the data file level, the record can be physically deleted if the file structure allows it (such as TOF).
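
A minimal sketch of deletion by delete bit, assuming the primary index is stored as parallel lists and the secondary index as a dict; all names and sample values are hypothetical. The secondary index is never modified: stale primary keys are filtered out when they are looked up.

```python
from bisect import bisect_left

# Primary index as parallel lists: sorted keys, addresses, delete bits.
keys = [1, 2, 3]
adrs = [(0, 0), (0, 1), (1, 0)]
deleted = [False, False, False]

# Secondary index on a field X: value -> list of primary keys.
secondary_x = {"a": [1, 3], "b": [2]}


def delete(c):
    """Set the delete bit of entry c in the primary index."""
    i = bisect_left(keys, c)
    if i == len(keys) or keys[i] != c or deleted[i]:
        return False
    deleted[i] = True
    return True


def lookup(c):
    """Address of the record with primary key c, ignoring deleted entries."""
    i = bisect_left(keys, c)
    if i < len(keys) and keys[i] == c and not deleted[i]:
        return adrs[i]
    return None


def find_by_x(vx):
    """Primary keys with X = vx whose primary index entry is still live."""
    return [c for c in secondary_x.get(vx, []) if lookup(c) is not None]
```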

2.3 Insertion in this case:
To insert a record <c, vx, vy, ...>, with c its primary key and vx, vy, ... its secondary keys, proceed as follows:
1. Search for c in the primary index (binary search) to check that it does not already exist and to find the position ip where this key should be inserted.
2. Insert the record at the end of the data file. Let (i, j) be its address.
3. Insert the pair <c, (i, j)> into the primary index table at position ip (by shifting).
4. Search for the value vx in the secondary index on X. If vx exists, add c to the list pointed to by vx. If vx does not exist, insert vx (by shifting) into the X table; the new vx entry points to a list containing the single primary key c.
5. Repeat step 4 for each remaining secondary key (vy, ...).
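
The insertion steps above can be sketched as follows. The data file is modelled as a flat list with a hypothetical `BLOCK_CAPACITY` used to derive the address (i, j), and each secondary index is an ordered table of [value, list of primary keys] entries. These representations are assumptions made for this sketch.

```python
from bisect import bisect_left

BLOCK_CAPACITY = 2               # hypothetical number of records per block
data_file = []                   # records in arrival order
primary_index = []               # sorted list of (c, (i, j)) pairs
secondary = {"X": [], "Y": []}   # per field: sorted list of [value, [primary keys]]


def insert(c, values):
    """values maps each secondary key name to its value, e.g. {"X": vx, "Y": vy}."""
    # Step 1: binary search for c; ip is where it belongs in the primary index.
    ip = bisect_left(primary_index, (c,))
    if ip < len(primary_index) and primary_index[ip][0] == c:
        return False                            # c already exists
    # Step 2: append the record; (i, j) is its block number and offset.
    data_file.append((c, values))
    adr = divmod(len(data_file) - 1, BLOCK_CAPACITY)
    # Step 3: insert <c, (i, j)> at position ip (list.insert shifts the rest).
    primary_index.insert(ip, (c, adr))
    # Steps 4 and 5: update every secondary index.
    for field, v in values.items():
        table = secondary[field]
        k = bisect_left(table, [v])
        if k < len(table) and table[k][0] == v:
            table[k][1].append(c)               # value exists: extend its list
        else:
            table.insert(k, [v, [c]])           # new value: one-element list
    return True
```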
3. Conclusion

• Indexing methods such as those presented in this chapter are designed for static files (i.e. files where the number of insertions and deletions is relatively small).

• Indexing methods improve performance to some extent by maintaining an auxiliary structure (an index table) that speeds up searches.

[Figure: mind map of the presentation. File organization, index methods: single-level primary index (dense index, non-dense index); multi-level primary index; secondary index. Operations covered: search, interval query, insertion, deletion.]