0% found this document useful (0 votes)
7 views21 pages

Chapter 3 File Organization Trees Structures

The document discusses file organization using tree structures, specifically m-ary search trees and B-Trees, which are more efficient for large and dynamic files compared to traditional index table methods. It explains the structure, search, and insertion algorithms for both m-ary search trees and B-Trees, highlighting their balanced nature and advantages. The conclusion notes the benefits and drawbacks of tree methods in file management.

Uploaded by

miloga4905
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views21 pages

Chapter 3 File Organization Trees Structures

The document discusses file organization using tree structures, specifically m-ary search trees and B-Trees, which are more efficient for large and dynamic files compared to traditional index table methods. It explains the structure, search, and insertion algorithms for both m-ary search trees and B-Trees, highlighting their balanced nature and advantages. The conclusion notes the benefits and drawbacks of tree methods in file management.

Uploaded by

miloga4905
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

FILES ORGANIZATION

The tree structures


Outlines

• Introduction

• M-ary search trees

• B-Trees

• Conclusion
Introduction

• Index table methods are limited to certain types of files (small files or

static files). Methods based on tree structures are better suited for

large and/or dynamic files. To better utilize block space, m-ary trees

are employed.
M-ary Search Trees

• An m-ary search tree of order n is a tree where each node can have at

most n children (Children[1..n]) and n-1 values (Val[1..n-1]). The

values within a node are ordered in ascending order.


M-ary Search Trees
• The children are organized based on the node values, following the rules:
• i) Child[1] points to a subtree containing values < Val[1]
• ii) Child[i] points to a subtree containing values > Val[i-1] and < Val[i], for
i=2..n-1
• iii) Child[n] points to a subtree containing values > Val[n-1]
• The degree of a node (the number of children) is the number of stored
values plus one.
• The figure depicts an m-ary search tree of order 5 with the root node a.

• Node b is the first child of a, and f is the fifth child of a. An internal node
may have some children set to nil while others are not. For instance, all
children of b except the third one are set to nil. Similarly, the first, third,
and fourth children of node d are set to nil, while its second child (pointing
to h) and fifth child (pointing to i) are not nil.
Note
• The general structure of a block is as follows:

• Type BlockType = Structure

• Val: array[1..N-1] of SomeType; // records or (keys-address)

• Children: array[1..N] of integer; // block numbers

• Degree: integer; // number of elements in Children

• End
• Among the characteristics of a file viewed as a tree, there is the root.
It is the block number containing the root of the tree. For example, if
the tree in the previous figure were a file, the content of block d
would be as follows:

Val 42 50 55 60 5 Val 10 2

Child -1 h -1 -1 i Degree Child -1 -1 Degree


1 2 3 4 5 1 2 3 4 5
Search
• The search for a value C begins in the root node P and continues
along a branch:
• 1- If C exists in P, then stop successfully.
• 2- If C does not exist in P, let k be the position in P where C should be
inserted (to keep the values ordered).
• P := Children[k]
• If P is not Nil, then go to 1; otherwise, stop with failure.
• For example, the search for 48 in the tree from the previous figure proceeds as
follows:
• Start the search in node a. The value 48 does not exist and is between 40 and 67.
So, the next node to visit is Children[3] (node d).
• Continue the search in node d, and the next node to explore is Children[2]
because 48 is between 42 and 50.
• Continue the search in node h. The value 48 exists (Val[2]), so stop successfully.
• If we were searching for the value 15, we would have first visited node a, then
node b (Children[1] of a, because 15 < 24). There, we would stop with failure
because 15 is between 12 and 20, and Children[4] of b is nil.
Insert
• To insert a new value V into an m-ary search tree, we first search for the value. The
search also returns the index k where V should be inserted if there is space in node P.

• If P is not full,

➢insert V into P, shifting elements to keep the array of values ordered. Else (P is
full):

➢allocate a new node Q containing a single value (Val[1] = V) and two children set
to nil (Children[1] = nil and Children[2] = nil).

➢make the Children[k] of P point to the new node Q (which was, before the
insertion, necessarily nil). The index k is the one returned by the search.
• For example, if we insert the value 64 into the tree from the previous
figure, we proceed as follows:
1. Search for 64 → failure (the last visited node is i, and the position
where 64 should be inserted is k=2).
2. Since the node i is not full, we can insert 64 at position 2 by shifting
to the right the values > 64.
• If we insert the value 62, we will allocate a new node (j) that contains
62, and it will be pointed to by the Children[1] of node i (where i is the
last visited node in the search for 62, and the position where 62 should
be inserted in i, if there were space, is k = 1).
B-Trees

• These are m-ary search trees that always remain balanced and are
therefore very useful for managing large and dynamic files. For
simplicity, an odd order is chosen (N = 2d+1). In a B-Tree of order N:

• All nodes except the root are filled at least 50% (i.e., d values).

• The root can contain a minimum of 1 value.

• All branches have the same length (completely balanced tree).


• Search in a B-Tree is similar to searching in an m-ary search tree. The

difference lies in the insertion and deletion algorithms.


Insertion Algorithm
• To insert a value V into a B-Tree with root R, proceed as follows:
1. Insert V and its right child (initially set to -1), i.e., if V is inserted at position j in a node, then the
right child RC should also be inserted at position j+1 in the array of children.
2. Search for V to verify that it does not exist and find the last visited leaf node (P).
3. If P is not full, insert (V, RC) into P by internal shifting and stop.
4. If P is already full, it will "split" into two nodes (P and Q, a new allocated node):
a) Form the "ordered sequence" of values in P along with the value V.
b) Assign the first d values with the first d+1 children to node P.
c) Assign the last d values with the last d+1 children to the new node Q.
d) Insert the middle value (the one at position d+1 in the ordered sequence) into the parent
node of P. The right child of this value will be node Q.
• Go to 2.
• For example, the insertion of 76 into the B-Tree from the previous
figure proceeds as follows:
1. Search for 76 → found = FALSE, the last visited node = e (it's a
leaf), and the position where 76 should be inserted is 2.
2. Since the node e is not full, insert 76 with its right child -1 at
positions 2 in the Val array and 3 in the children array of node e.
• If we now insert the value 57, the last visited node in the search will be d. The position of 57 in d should then
be 4 (and its RC at position 5).
• Since node d is already full, there will be a "split" (meaning, allocation of a new node, let's say f):
a) Form the "ordered sequence":
• Val: 42, 50, 55, 57, 60
• Fils: -1, -1, -1, -1, -1, -1
b) Assign the first d values (42, 50) and the first d+1 children (-1, -1, -1) to node d.
• Assign the last d values (57, 60) with the last d+1 children (-1, -1, -1) to the new node f.
• Insert the middle value (55) with its right child (f) into the parent node (a).
• Since node a is not full, the insertion of (55, f) can be done by internal shifting.
• If we continue the insertion of 7, node b will "split" (resulting in the allocation of a new node,
let's call it g). The ordered sequence of values in b plus the value 7 is formed:
• Val: 2, 5, 7, 12, 20
• Fils: -1, -1, -1, -1, -1, -1
• The values (2 and 7) with the first 3 children remain in b. The values (12 and 20) with the last 3
children will be assigned to the new node g. The middle value (7) with its right child node g is
inserted into the parent node (a).
• As node a is also full, it will "split" in turn. A new node h is then allocated, and the ordered
sequence is formed:
• Val: 7, 24, 40, 55, 70
• Fils: b, g, c, d, f, e
• The first d values (7 and 24) with the first 3 children (b, g, and c) remain in a. The last d values
(55 and 70) with the last 3 children (d, f, and e) are assigned to the new node h. The middle
value (40) with its right child node h is inserted into the parent of node a.
• As the node a was the root of the tree (thus having no parent), a new root (i)
is then allocated containing only the value 40. The left child (Children[1]) is
the old root (a), and the right child (Children[2]) is the node h. The
following figure shows the resulting tree:

• When the root of a B-Tree splits (due to an insertion), the depth of the tree
increases by one level.
Conclusion
In conclusion, tree methods

• Have advantages such as: • Disadvantages include:

• The size problem is no longer an • Intended for static files.


issue. • Imbalance of the B-Tree, with no
• Ordered file. solution except periodic
reorganization.

You might also like