Introduction To Trees: Linear Data Structures

Here are the key steps to delete a node with one child in a binary search tree: 1. Locate the node to be deleted (with value 6 in this example) 2. If the node has only one child (right child 8 in this case), connect the parent of the node (with value 5) to the child node (value 8) 3. Free the memory allocated to the node with value 6. This essentially removes the node from the tree by bypassing it and connecting its parent directly to its single child.

Uploaded by

nayanaa shindee

ADVANCED DATA STRUCTURES

Introduction to Trees

Linear Data Structures

List as an Array
Disadvantages:
• Fixed size
• Expansion and shrinking are difficult
• Random insertion and deletion are time consuming

List as a Linked List
Disadvantage:
• Random access is time consuming
ADVANCED DATA STRUCTURES
Introduction to Trees

Linear organization of data doesn’t help in quick retrieval of
elements randomly.
Go for Non Linear Organization !!!
ADVANCED DATA STRUCTURES
Binary Trees

• Non Linear Data Structure
• Finite set of elements that is either empty or is partitioned
into three disjoint subsets
• First subset: is a single element, called the root
• Second subset: is a binary tree, called the left binary tree
• Third subset: is a binary tree, called the right binary tree

[Figure: example binary tree; the levels below the root hold B C, then D E F, then G H I]
ADVANCED DATA STRUCTURES
Binary Trees: Terminologies

• Each element of a binary tree is called a node of the tree
• Left node Y of X is called the left child of X
• Right node Z of X is called the right child of X
• X is called the parent of Y and Z
• Y and Z are called siblings
• A node which has no children is called a leaf node/external node
• A node which has at least one child is called a non-leaf node/internal node

[Figure: node X with left child Y and right child Z]
ADVANCED DATA STRUCTURES
Binary Trees: Terminologies

• A node N1 is called the ancestor of a node N2 if
• N1 is either the parent of N2 or
• N1 is the parent of some ancestor of N2
• A node N2 is then a descendant of node N1
• A descendant can be either a left descendant or a right descendant

[Figure: two copies of a tree with root A, children B and C, and leaves D, E, F, G. In the first, N1 = A is an ancestor of N2 = B, and N2 is a left descendant of N1. In the second, N1 = A and N2 = G, and N2 is a right descendant of N1.]
ADVANCED DATA STRUCTURES
Binary Trees: Terminologies

• Level of a node
Root has level 0; level of any other node is one more
than its parent
• Depth of a tree
Maximum level of any leaf in the tree (path length from
the deepest leaf to the root)
• Depth of a node
Path length from the node to the root

Example (root A with children B and C; D is at level 2):
Level of node A – 0     Depth of node A: 0
Level of node B – 1     Depth of node B: 1
Level of node C – 1     Depth of node C: 1
Level of node D – 2     Depth of node D: 2
Depth of tree: 2
ADVANCED DATA STRUCTURES
Binary Trees: Terminologies

• Height of a tree: Path length from the root node to the deepest leaf
• Height of a node: Path length from the node to the deepest leaf

Example (root A with children B and C; D is a child of C):
Height of Tree: 2
Height of Node A : 2
Height of Node B : 0
Height of Node C : 1
Height of Node D : 0
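
The height and depth definitions can be checked with a short C sketch. It reuses the linked node layout shown later in the deck; the function names `height` and `depth_of` are illustrative choices and do not come from the slides. Height is counted in edges, so an empty tree is given the conventional value -1.

```c
#include <assert.h>
#include <stddef.h>

typedef struct tree_linked {
    int info;
    struct tree_linked *left, *right;
} NODE;

/* Height in edges: -1 for an empty tree, 0 for a single node. */
int height(const NODE *n)
{
    if (n == NULL)
        return -1;
    int hl = height(n->left);
    int hr = height(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Depth (level) of target inside the tree rooted at n.
   The root has depth 0; returns -1 if target is not in the tree. */
int depth_of(const NODE *n, const NODE *target)
{
    assert(target != NULL);
    if (n == NULL)
        return -1;
    if (n == target)
        return 0;
    int d = depth_of(n->left, target);
    if (d < 0)
        d = depth_of(n->right, target);
    return d < 0 ? -1 : d + 1;
}
```

For the four-node example (A with children B and C, D under C), `height` gives 2 for A, 0 for B, 1 for C and 0 for D, and `depth_of` gives 2 for D.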
ADVANCED DATA STRUCTURES
Binary Trees: Terminologies

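Strictly Binary Tree

A binary tree where every node has either zero or two children

[Figure: two example trees. The first (A; B C; D E; F) is labelled "Not a Strictly Binary Tree" because F is an only child. The second (A; B C; D E; F G) is labelled "Strictly Binary Tree".]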
ADVANCED DATA STRUCTURES
Binary Trees: Terminologies

Full Binary Tree

• A binary tree in which every internal node has two children and all the leaves are at the same level
• If the binary tree has depth d, then its levels are numbered 0 to d
• Total no. of nodes = 2^0 + 2^1 + … + 2^d = 2^(d+1) – 1

[Figure: full binary tree of depth 3. Level 0: A; Level 1: B C; Level 2: D E F G; Level 3: H I J K L M N O]
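
The node count is a geometric series, since level i of a full binary tree holds 2^i nodes. A short derivation, written in LaTeX for clarity:

```latex
\[
\sum_{i=0}^{d} 2^{i} \;=\; \frac{2^{d+1}-1}{2-1} \;=\; 2^{d+1}-1,
\qquad\text{e.g. } d = 3:\; 1+2+4+8 = 15 = 2^{4}-1 .
\]
```

The d = 3 case matches the 15 nodes A to O in the depth-3 tree.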
ADVANCED DATA STRUCTURES
Binary Trees: Terminologies

Complete Binary Tree

A complete binary tree is a binary tree in which all the levels
are completely filled except possibly the lowest one, which
is filled from the left.

[Figure: three example trees; the first two are labelled "Not Complete Binary Trees" and the third "Complete Binary Tree"]
ADVANCED DATA STRUCTURES
Binary Tree Properties

• Every node except the root has exactly one parent
• A tree with n nodes has n-1 edges (every node except the
root has an edge to its parent)
• A tree consisting of only a root node has a height of zero
• The total number of nodes in a full binary tree of
depth d is 2^(d+1) – 1, d ≥ 0
• For any non-empty binary tree, if n0 is the number of
leaf nodes and n2 the number of nodes of degree 2, then n0 = n2 + 1
ADVANCED DATA STRUCTURES
Binary Search Tree: Definition

A Binary Search Tree is a binary tree which has the following
properties:
• all the elements in the left subtree of a node n are less
than the contents of node n
• all the elements in the right subtree of a node n are
greater than or equal to the contents of node n
ADVANCED DATA STRUCTURES
Binary Search Tree: Implementation

Linked implementation
Here every node will have its own info along with the links
to left child and right child

typedef struct tree_linked
{
    int info;                           // data stored in the node
    struct tree_linked *left, *right;   // links to the left and right children
} NODE;

NODE *root = NULL;   // root points to the root of the tree; initially it is NULL
ADVANCED DATA STRUCTURES
Binary Search Tree – An Application of Binary Tree

A Binary Search Tree with the nodes inserted in the order:
5, 3, 6, 4, 2, 8, 1, 7, 9

          5
        /   \
       3     6
      / \     \
     2   4     8
    /         / \
   1         7   9
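
The tree above can be reproduced with a small insertion routine. This is a minimal sketch in C built on the NODE structure from the implementation slide; the function names `bst_insert` and `bst_search` are illustrative and not from the slides. Equal keys are sent to the right, following the "greater than or equal to" rule in the definition.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tree_linked {
    int info;
    struct tree_linked *left, *right;
} NODE;

/* Insert key and return the (possibly new) root of this subtree. */
NODE *bst_insert(NODE *root, int key)
{
    if (root == NULL) {
        NODE *n = malloc(sizeof *n);
        assert(n != NULL);      /* sketch only: real code should handle failure */
        n->info = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->info)
        root->left = bst_insert(root->left, key);
    else                        /* key >= root->info goes right */
        root->right = bst_insert(root->right, key);
    return root;
}

/* Iterative lookup: returns the node holding key, or NULL if absent. */
NODE *bst_search(NODE *root, int key)
{
    while (root != NULL && root->info != key)
        root = (key < root->info) ? root->left : root->right;
    return root;
}
```

Calling `root = bst_insert(root, k)` for k = 5, 3, 6, 4, 2, 8, 1, 7, 9 in that order, starting from `root = NULL`, yields the tree drawn above. Both routines take time proportional to the height of the tree.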
ADVANCED DATA STRUCTURES
Binary Tree Traversals

Important operation: Traversal

Traversal: Moving through all the nodes in a binary tree and
visiting each one in turn
Trees: Many visiting orders are possible since a tree is a nonlinear data structure
Tasks: 1. Visiting a node, denoted by V
2. Traversing the left subtree, denoted by L
3. Traversing the right subtree, denoted by R
Six ways to arrange them: VLR, LVR, LRV, VRL, RVL, RLV
Standard Traversals include: VLR-Preorder, LVR-Inorder,
LRV-Postorder
ADVANCED DATA STRUCTURES
Binary Tree Traversal: Preorder

Steps:
• Root Node is visited before the subtrees
• Left subtree is traversed in preorder
• Right subtree is traversed in preorder

Example tree:
        F
      /   \
     B     G
    / \     \
   A   D     I
      / \   /
     C   E H

Preorder: F B A D C E G I H
ADVANCED DATA STRUCTURES
Binary Tree Traversal: Inorder

Steps:
• Left subtree is traversed in Inorder
• Root Node is visited
• Right subtree is traversed in Inorder

Example tree:
        F
      /   \
     B     G
    / \     \
   A   D     I
      / \   /
     C   E H

Inorder: A B C D E F G H I
ADVANCED DATA STRUCTURES
Binary Tree Traversal: Postorder

Steps:
• Left subtree is traversed in postorder
• Right subtree is traversed in postorder
• Root Node is visited

Example tree:
        F
      /   \
     B     G
    / \     \
   A   D     I
      / \   /
     C   E H

Postorder: A C E D B H I G F
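
The three standard traversals differ only in where the visit step sits relative to the two recursive calls. The sketch below, in C, is an illustration under a few assumptions: nodes store a `char` label, "visiting" means appending that label to a caller-supplied string, and the helper `insert` (plain BST insertion) exists only to build the example tree. None of these names come from the slides.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct tree_linked {
    char info;
    struct tree_linked *left, *right;
} NODE;

/* Visit = append the label to out (caller provides enough space). */
static void visit(const NODE *n, char *out)
{
    size_t len = strlen(out);
    out[len] = n->info;
    out[len + 1] = '\0';
}

void preorder(const NODE *n, char *out)     /* V L R */
{
    if (n == NULL) return;
    visit(n, out);
    preorder(n->left, out);
    preorder(n->right, out);
}

void inorder(const NODE *n, char *out)      /* L V R */
{
    if (n == NULL) return;
    inorder(n->left, out);
    visit(n, out);
    inorder(n->right, out);
}

void postorder(const NODE *n, char *out)    /* L R V */
{
    if (n == NULL) return;
    postorder(n->left, out);
    postorder(n->right, out);
    visit(n, out);
}

/* Helper for the example only: ordinary BST insertion. */
NODE *insert(NODE *root, char key)
{
    if (root == NULL) {
        NODE *n = malloc(sizeof *n);
        assert(n != NULL);
        n->info = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->info)
        root->left = insert(root->left, key);
    else
        root->right = insert(root->right, key);
    return root;
}
```

Inserting the letters F, B, G, A, D, I, C, E, H in that order builds the example tree from the traversal slides, and the three functions then produce FBADCEGIH, ABCDEFGHI and ACEDBHIGF respectively. Inorder traversal of a binary search tree always lists the keys in sorted order.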
ADVANCED DATA STRUCTURES
Binary Search Tree - Deletion

Deletion of a Node in Binary Search Tree


case1: Node with no child (leaf node)
case2: Node with 1 child
case3: Node with 2 children
ADVANCED DATA STRUCTURES
Binary Search Tree - Deletion

case1: Node with no child (leaf node)


Nodes listed level by level:
Before      After
5           5
3 6         3 6
2 4 8       2 4 8
1 7 9       1 9

To delete the node with info 7:
• Set its parent’s left child field to point to NULL
• Free memory allocated to node with info 7
ADVANCED DATA STRUCTURES
Binary Search Tree - Deletion

case1: Node with no child (leaf node)


Nodes listed level by level:
Before      After
5           5
3 6         3 6
2 4 8       2 8
1 7 9       1 7 9

To delete the node with info 4:
• Set its parent’s right child field to point to NULL
• Free memory allocated to node with info 4
ADVANCED DATA STRUCTURES
Binary Search Tree - Deletion

case2: Node with 1 child


Nodes listed level by level:
Before      After
5           5
3 6         3 8
2 4 8       2 4 7 9
1 7 9       1

To delete the node with info 6:
• Set its parent’s right child field to point to its only child
• Free memory allocated to node with info 6
ADVANCED DATA STRUCTURES
Binary Search Tree - Deletion

case2: Node with 1 child


Nodes listed level by level:
Before      After
5           5
3 6         3 6
2 4 8       1 4 8
1 7 9       7 9

To delete the node with info 2:
• Set its parent’s left child field to point to its only child
• Free memory allocated to node with info 2
ADVANCED DATA STRUCTURES
Binary Search Tree - Deletion

case3: Node with 2 children (Replace with inorder successor)

(Way1) Nodes listed level by level:
Before      After replacing 5 with 6      After deleting the old node 6
5           6                             6
3 6         3 6                           3 8
2 4 8       2 4 8                         2 4 7 9
1 7 9       1 7 9                         1

To delete the node with info 5:
• Replace 5 with its inorder successor (6) and delete that inorder successor
• Now case3 has got changed to case2 (in general it may change to case2 or case1)
ADVANCED DATA STRUCTURES
Binary Search Tree - Deletion

case3: Node with 2 children (Replace with inorder predecessor)

(Way2) Nodes listed level by level:
Before      After replacing 5 with 4      After deleting the old node 4
5           4                             4
3 6         3 6                           3 6
2 4 8       2 4 8                         2 8
1 7 9       1 7 9                         1 7 9

To delete the node with info 5:
• Replace 5 with its inorder predecessor (4) and delete that inorder predecessor
• Here case3 has got changed to case1 (in general it may change to case2 or case1)
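
The three deletion cases can be combined into one recursive routine. The following is a minimal sketch in C, assuming the NODE structure from the implementation slide and using the inorder-successor variant (Way1) for case3. The names `bst_insert` and `bst_delete` are illustrative; `bst_insert` is included only so the example tree can be built.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tree_linked {
    int info;
    struct tree_linked *left, *right;
} NODE;

/* Helper for building example trees: ordinary BST insertion. */
NODE *bst_insert(NODE *root, int key)
{
    if (root == NULL) {
        NODE *n = malloc(sizeof *n);
        assert(n != NULL);
        n->info = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->info)
        root->left = bst_insert(root->left, key);
    else
        root->right = bst_insert(root->right, key);
    return root;
}

/* Delete one node holding key; returns the new root of this subtree. */
NODE *bst_delete(NODE *root, int key)
{
    if (root == NULL)
        return NULL;                        /* key not found */
    if (key < root->info) {
        root->left = bst_delete(root->left, key);
    } else if (key > root->info) {
        root->right = bst_delete(root->right, key);
    } else if (root->left != NULL && root->right != NULL) {
        /* case3: copy the inorder successor (leftmost node of the right
           subtree) into this node, then delete the successor, which
           falls under case1 or case2 */
        NODE *succ = root->right;
        while (succ->left != NULL)
            succ = succ->left;
        root->info = succ->info;
        root->right = bst_delete(root->right, succ->info);
    } else {
        /* case1 (child is NULL) and case2 (child is the only child):
           the parent is re-linked to child through the return value */
        NODE *child = root->left ? root->left : root->right;
        free(root);
        return child;
    }
    return root;
}
```

Because every call returns the new subtree root, the caller writes `root = bst_delete(root, key)` and the parent's left or right field is updated automatically. On the example tree, deleting 5 leaves 6 at the root with 8 as its right child, as in the Way1 slide.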
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees

• A self-balancing binary search tree or height-balanced binary search tree is a binary search tree (BST) that attempts to keep its height, or the number of levels of nodes beneath the root, as small as possible at all times, automatically
• The disadvantage of a plain binary search tree is that its height can be as large as N-1
• Most operations on a BST take time proportional to the height of the tree, so it is desirable to keep the height small
• With a height of up to N-1, the time needed to perform insertion, deletion and many other operations can be O(N) in the worst case
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees

• We want a tree with small height
• A binary tree with N nodes has height at least ⌊log2 N⌋, i.e. Ω(log N)
• Thus, our goal is to keep the height of a binary search tree O(log N)
• Such trees are called balanced binary search trees. Examples are AVL tree, red-black tree
• A typical operation done by trees to maintain balance is rotation
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

• An AVL tree (named after inventors Adelson-Velsky and Landis) is a self-balancing binary search tree
• It was the first such data structure to be invented
• In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property
• Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation
• Insertions and deletions may require the tree to be rebalanced by one or more tree rotations
https://en.wikipedia.org/wiki/AVL_tree
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

Balance Factor
• In a binary tree the balance factor of a node X is defined to be the height difference of its two child sub-trees:
BF(X) := Height(RightSubtree(X)) - Height(LeftSubtree(X))
• A binary tree is defined to be an AVL tree if the invariant BF(X) ∈ {-1, 0, 1} holds for every node X in the tree
• A node X with BF(X) < 0 is called "left-heavy", one with BF(X) > 0 is called "right-heavy", and one with BF(X) = 0 is sometimes simply called "balanced"

https://en.wikipedia.org/wiki/AVL_tree
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

[Figure: two example trees, one labelled "AVL tree" and one labelled "Not an AVL tree"]

https://en.wikipedia.org/wiki/AVL_tree
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

Types of imbalance and Rotations involved


• LL imbalance : Right rotation (Single rotation)
• RR imbalance: Left rotation (Single rotation)
• LR imbalance: LR rotation (Double rotation)
• RL imbalance: RL rotation (Double rotation)
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

LL imbalance : Right rotation (Single rotation)


ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

RR imbalance : Left rotation (Single rotation)


ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

LR imbalance : LR rotation (Double rotation)


ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

RL imbalance : RL rotation (Double rotation)


ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

Sequentially insert 5, 6, 8, 3, 2, 4, 7 to an AVL Tree
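
The insertion exercise can be traced with a compact AVL sketch in C. It is one possible implementation, not the one used in the course: the `AVL` struct, the stored `height` field and all function names are assumptions made for illustration. The balance factor follows the definition used earlier, Height(right) minus Height(left).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct avl_node {
    int info;
    int height;                     /* in edges; a leaf has height 0 */
    struct avl_node *left, *right;
} AVL;

static int height(const AVL *n) { return n ? n->height : -1; }

static void update(AVL *n)
{
    int hl = height(n->left), hr = height(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

static int balance_factor(const AVL *n)
{
    return height(n->right) - height(n->left);
}

static AVL *rotate_right(AVL *y)    /* fixes an LL imbalance at y */
{
    AVL *x = y->left;
    y->left = x->right;
    x->right = y;
    update(y);
    update(x);
    return x;
}

static AVL *rotate_left(AVL *x)     /* fixes an RR imbalance at x */
{
    AVL *y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

AVL *avl_insert(AVL *n, int key)
{
    if (n == NULL) {
        n = malloc(sizeof *n);
        assert(n != NULL);          /* sketch only */
        n->info = key;
        n->height = 0;
        n->left = n->right = NULL;
        return n;
    }
    if (key < n->info)
        n->left = avl_insert(n->left, key);
    else
        n->right = avl_insert(n->right, key);
    update(n);

    int bf = balance_factor(n);
    if (bf < -1) {                              /* left-heavy */
        if (balance_factor(n->left) > 0)        /* LR: rotate child left first */
            n->left = rotate_left(n->left);
        return rotate_right(n);                 /* LL, or second half of LR */
    }
    if (bf > 1) {                               /* right-heavy */
        if (balance_factor(n->right) < 0)       /* RL: rotate child right first */
            n->right = rotate_right(n->right);
        return rotate_left(n);                  /* RR, or second half of RL */
    }
    return n;
}
```

Inserting 5, 6, 8, 3, 2, 4, 7 in order triggers all four cases in turn: RR at 5 (after 8), LL at 5 (after 2), LR at 6 (after 4) and RL at 6 (after 7). The final tree has 5 at the root, 3 and 7 as its children, and leaves 2, 4, 6, 8.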


ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

Sequentially insert A, Z, B, Y, C, X to an AVL Tree


ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

• AVL trees are often compared with red–black trees because both support the same set of operations and take O(log n) time for the basic operations
• For lookup-intensive applications, AVL trees are faster than red–black trees because they are more strictly balanced

https://en.wikipedia.org/wiki/AVL_tree
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: AVL Tree

Applications
• AVL trees are used extensively in database applications in
which insertions and deletions are fewer but there are
frequent lookups for data required
• It is used in applications that require improved searching
apart from the database applications
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

• A red–black tree is a kind of self-balancing binary search tree
• Each node stores an extra bit representing "color" ("red" or "black"), used to ensure that the tree remains approximately balanced during insertions and deletions

https://en.wikipedia.org/wiki/Red-black_tree
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

• When the tree is modified, the new tree is rearranged and "repainted" to restore the coloring properties that constrain how unbalanced the tree can become in the worst case. The properties are designed such that this rearranging and recoloring can be performed efficiently
• The re-balancing is not perfect, but guarantees searching in O(log n) time, where n is the number of nodes of the tree. The insertion and deletion operations, along with the tree rearrangement and recoloring, are also performed in O(log n) time

https://en.wikipedia.org/wiki/Red-black_tree
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

The red-black properties:
1. Every node is either red or black
2. The root is always black
3. Every leaf (NULL node) is black
4. If a node is red, both children are black
Note: can’t have 2 consecutive reds on a path
5. For each node, all simple paths from the node to descendant leaves contain the same number of black nodes
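
The properties above can be verified mechanically. The sketch below, in C, checks properties 2, 4 and 5 on a given tree (property 1 holds by construction of the enum and property 3 is handled by treating NULL as black). The type `RBNODE` and the function names are assumptions for illustration; the check does not cover the BST ordering of the keys.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { RED, BLACK } COLOR;

typedef struct rb_node {
    int info;
    COLOR color;
    struct rb_node *left, *right;
} RBNODE;

/* Returns the black-height of the subtree rooted at n (counting the
   NULL leaf as one black node), or -1 if property 4 or 5 is violated
   somewhere in the subtree. */
static int black_height(const RBNODE *n)
{
    if (n == NULL)
        return 1;                       /* property 3: NULL leaves are black */
    if (n->color == RED) {
        if ((n->left && n->left->color == RED) ||
            (n->right && n->right->color == RED))
            return -1;                  /* property 4: red node with red child */
    }
    int lh = black_height(n->left);
    int rh = black_height(n->right);
    if (lh < 0 || rh < 0 || lh != rh)
        return -1;                      /* property 5: unequal black counts */
    return lh + (n->color == BLACK ? 1 : 0);
}

/* 1 if the tree satisfies the red-black coloring properties, else 0. */
int is_red_black(const RBNODE *root)
{
    if (root != NULL && root->color != BLACK)
        return 0;                       /* property 2: root must be black */
    return black_height(root) > 0;
}
```

A checker like this is handy when working the construction exercises by hand: after each insertion, the recoloring and rotation steps should leave a tree for which `is_red_black` returns 1.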
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

In the Red-Black tree, we use two tools to do the balancing
• Recoloring
• Rotation
Recoloring is the change in colour of the node, i.e. if it is red then change it to black and vice versa. It must be noted that the colour of the NULL node is always black
The insertion algorithm has mainly two cases depending upon the colour of the uncle
• If the uncle is red, we recolor
• If the uncle is black, we do rotations and/or recoloring

https://www.geeksforgeeks.org/red-black-tree-set-2-insert/
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

Algorithm:
Let x be the newly inserted node.
1. Perform standard BST insertion and make the colour of the newly inserted node RED

2. If x is the root, change the colour of x to BLACK

3. Do the following if the colour of x’s parent is not BLACK and x is not the root:

https://www.geeksforgeeks.org/red-black-tree-set-2-insert/
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

Algorithm contd:

a) If x’s uncle is RED (Grandparent must have been black from property 4)
(i) Change the colour of parent and uncle to BLACK
(ii) Change the colour of the grandparent to RED
(iii) Change x = x’s grandparent, repeat steps 2 and 3 for new x

https://www.geeksforgeeks.org/red-black-tree-set-2-insert/
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

Algorithm contd:

b) If x’s uncle is BLACK, then there can be four configurations for x, x’s parent (p) and x’s grandparent (g)
(This is similar to AVL tree)
(i) Left Left Case -> Right Rotation & swap color of p and g
(ii) Left Right Case -> LR Rotation & swap color of x and g
(iii) Right Right Case -> Left Rotation & swap color of p and g
(iv) Right Left Case -> RL Rotation & swap color of x and g

https://www.geeksforgeeks.org/red-black-tree-set-2-insert/
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

Construct a Red Black Tree with the following information:


3, 7, 8, 9, 2, 5, 6, 1
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

Construct a Red Black Tree with the following information:


8, 18, 5, 15, 17, 25, 40, 80
ADVANCED DATA STRUCTURES
Self-Balancing Binary Search Trees: Red Black Tree

Red Black Tree Applications

• Red black trees are used in TreeSet, TreeMap, and HashMap (for buckets with many collisions, since Java 8) in the Java Collections Library
• Also, the Completely Fair Scheduler in the Linux kernel uses this data structure
• Linux also uses red-black trees in the mmap and munmap operations for file/memory mapping
• Furthermore, red-black trees are used for geometric range searches, k-means clustering, and text-mining
ADVANCED DATA STRUCTURES
Treap

• Balanced trees are great things: data access and insertion are both O(log n) fast
• The downside is that implementation can be hard. This is a legitimate reason not to use a data structure!
• To overcome this, we're going to look into a randomized data structure, where instead of careful implementation which guarantees a short tree, we're going to easily implement something that is almost certainly short
• Our new data structure is a binary search tree called a Treap, which is a cute combination of the words "Tree" and "Heap"
ADVANCED DATA STRUCTURES
Treap

• Treap = Tree + Heap
• In a Treap, each node has its data and a priority
• The data must be organized like all binary search trees
• The priority is assigned randomly when the node is created, and must be organized with larger priorities to the top (the "heap property")
ADVANCED DATA STRUCTURES
Treap - Insertion
Treaps support the following basic operations:
• Search
• Insert
• Delete

• To search for a given key value, apply a standard binary search algorithm in a binary search tree, ignoring the priorities
• To insert a new key x into the treap, generate a random priority y for x. Binary search for x in the tree, and create a new node at the leaf position where the binary search determines a node for x should exist. Then, as long as x is not the root of the tree and has a larger priority number than its parent z, perform a tree rotation that reverses the parent-child relation between x and z.
https://en.wikipedia.org/wiki/Treap
ADVANCED DATA STRUCTURES
Treap - Insertion

• Data should satisfy BST property, Priority should satisfy Max heap property
• If new node is inserted on Left Sub Tree and new node priority is greater than
parent node priority then Right Rotate parent
• If new node is inserted on Right Sub Tree and new node priority is greater than
parent node priority then Left Rotate parent
Note: In the below example, alphabet represents data and digit represents
priority. Insert based on data, rotate (when max heap property fails) based on
priority. Insert A5, B1, C4, H3, P7, T2
ADVANCED DATA STRUCTURES
Treap - Insertion

// Assumes K extends Comparable<K> and that the Node constructor
// assigns a random priority to the new node.
private Node insert(Node n, K key) {
    if (n == null)
        return new Node(key);
    if (key.compareTo(n.data) < 0) {
        n.left = insert(n.left, key);
        if (n.left.priority > n.priority)
            return rightRotate(n);      // restore max heap property
    } else if (key.compareTo(n.data) > 0) {
        n.right = insert(n.right, key);
        if (n.right.priority > n.priority)
            return leftRotate(n);       // restore max heap property
    }
    return n;                           // key already present: no change
}
https://www.usna.edu/Users/cs/crabbe/IC312/current/units/treap/treap.html#:~:text=Treaps,like%20all%20binary%20search%20trees.
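
For comparison with the C code used elsewhere in this deck, here is a hedged C sketch of the same insertion. Two deliberate departures are assumptions made for illustration: the priority is passed in by the caller rather than generated randomly, so the worked example (A5, B1, C4, H3, P7, T2) can be reproduced exactly, and keys are single characters. The names `TREAP` and `treap_insert` are not from the slides.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct treap_node {
    char data;                      /* BST key */
    int priority;                   /* max heap key */
    struct treap_node *left, *right;
} TREAP;

static TREAP *right_rotate(TREAP *n)
{
    TREAP *l = n->left;
    n->left = l->right;
    l->right = n;
    return l;
}

static TREAP *left_rotate(TREAP *n)
{
    TREAP *r = n->right;
    n->right = r->left;
    r->left = n;
    return r;
}

/* Insert by key, then rotate the new node up while its priority
   exceeds its parent's. In normal use priority would come from a
   random source such as rand(). */
TREAP *treap_insert(TREAP *n, char key, int priority)
{
    if (n == NULL) {
        TREAP *t = malloc(sizeof *t);
        assert(t != NULL);          /* sketch only */
        t->data = key;
        t->priority = priority;
        t->left = t->right = NULL;
        return t;
    }
    if (key < n->data) {
        n->left = treap_insert(n->left, key, priority);
        if (n->left->priority > n->priority)
            return right_rotate(n);
    } else if (key > n->data) {
        n->right = treap_insert(n->right, key, priority);
        if (n->right->priority > n->priority)
            return left_rotate(n);
    }
    return n;                       /* duplicate key: tree unchanged */
}
```

Inserting A5, B1, C4, H3, P7, T2 in that order ends with P7 at the root, A5 as its left child, T2 as its right child, C4 as the right child of A5, and B1 and H3 as the children of C4. The inorder sequence is still A B C H P T, and priorities decrease along every root-to-leaf path.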
ADVANCED DATA STRUCTURES
Treap - Deletion

• To delete a node x from the treap:
• If x is a leaf of the tree, simply remove it
• If x has a single child z, remove x from the tree and make z be the child of the parent of x (or make z the root of the tree if x had no parent)
• If x has two children, swap its position in the tree with the position of its immediate successor z in the sorted order, resulting in one of the previous cases
• In this final case, the swap may violate the heap-ordering property for z, so additional rotations may need to be performed to restore this property

https://en.wikipedia.org/wiki/Treap
ADVANCED DATA STRUCTURES
Treap - Deletion
Alternatively, a node can be deleted as follows:
1) If node is a leaf, delete it

2) If node has one child NULL and the other non-NULL:
If left child exists, perform right rotation at node
If right child exists, perform left rotation at node

3) If node has both children non-NULL, compare the priorities of the left and right children
a) If priority of right child is greater, perform left rotation at node
b) If priority of left child is greater, perform right rotation at node

The idea of steps 2 and 3 is to move the node down so that we end up with the leaf node case (step 1).
An example for this method is given in the next slide.
ADVANCED DATA STRUCTURES
Trie

• In computer science, a trie, also called digital tree or prefix tree, is a type of search tree, a tree data structure used for locating specific keys from within a set
• These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters
• Common applications of tries include storing a predictive text or autocomplete dictionary and implementing approximate matching algorithms, such as those used in spell checking
• Such applications take advantage of a trie's ability to quickly search for, insert, and delete entries

https://en.wikipedia.org/wiki/Trie
ADVANCED DATA STRUCTURES
Trie - Worksheet

Construct a Trie for the words: algorithm, data, datum, all, self, sea, search
ADVANCED DATA STRUCTURES
Trie

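Considering only lower case English alphabets to be stored in the
trie data structure
#include <stdlib.h>   // malloc, NULL

typedef struct trie
{
    int isLeaf;               // 1 if a stored word ends at this node
    struct trie *child[26];   // one link per letter 'a'..'z'
} TRIE;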
ADVANCED DATA STRUCTURES
Trie

TRIE* getnode()
{
TRIE *temp;
temp=(TRIE*)malloc(sizeof(TRIE));
temp->isLeaf=0;
for(int i=0;i<26;i++)
temp->child[i]=NULL;
return temp;
}
ADVANCED DATA STRUCTURES
Trie
void insert_pattern(TRIE *root,char *pattern) {
TRIE *cur=root;
while(*pattern) //pattern[i]!='\0'
{
if(cur->child[*pattern-'a']==NULL)
cur->child[*pattern-'a']=getnode();
cur=cur->child[*pattern-'a'];
pattern++;
}
cur->isLeaf=1;
}
ADVANCED DATA STRUCTURES
Trie
int search(TRIE *root, char *pattern)
{
    TRIE *cur = root;
    while (*pattern)
    {
        if (cur->child[*pattern - 'a'] == NULL)
            return 0;          // path breaks: pattern is not in the trie
        cur = cur->child[*pattern - 'a'];
        pattern++;
    }
    return cur->isLeaf;        // 1 only if a stored word ends exactly here
}
ADVANCED DATA STRUCTURES
Suffix Tree - Worksheet

Construct a suffix trie for the word banana and show its suffix tree

Suffixes are: banana, anana, nana, ana, na, a
ADVANCED DATA STRUCTURES
Suffix Tree
• In computer science, a suffix tree (also called PAT tree or, in an
earlier form, position tree) is a compressed trie containing all
the suffixes of the given text as their keys and positions in the text
as their values
• Suffix trees allow particularly fast implementations of many
important string operations
• The construction of such a tree for the string S takes time and
space linear in the length of S. Once constructed, several
operations can be performed quickly, for instance locating
a substring in S, locating a substring if a certain number of mistakes
are allowed, locating matches for a regular expression pattern etc
• Suffix trees also provide one of the first linear-time solutions for
the longest common substring problem. These speedups come at a
cost: storing a string's suffix tree typically requires significantly
more space than storing the string itself
ADVANCED DATA STRUCTURES
Suffix Tree - Applications
Suffix trees can be used to solve a large number of string problems that occur in
text-editing, computational biology and other application areas. Primary
applications include:
• String search, in O(m) complexity, where m is the length of the sub-string (but
with initial O(n) time required to build the suffix tree for the string)
• Finding the longest repeated substring
• Finding the longest common substring
• Finding the longest palindrome in a string
• Suffix trees are often used in bioinformatics applications, searching for patterns
in DNA or protein sequences (which can be viewed as long strings of characters)
• Suffix trees are also used in data compression; they can be used to find repeated
data, and can be used for the sorting stage of the Burrows–Wheeler transform
• A suffix tree is also used in suffix tree clustering, a data clustering algorithm used
in some search engines

https://en.wikipedia.org/wiki/Suffix_tree#Applications
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Van Emde Boas Tree is a tree data structure formulated by a team led by Dutch computer scientist Peter van Emde Boas which supports each of the dynamic set operations: search, insert, delete, minimum, maximum, successor and predecessor in O(lg lg u) time, where u is the size of the universe.

We want a recurrence relation of the form: T(u) = T(√u) + O(1)
which on solving yields the time complexity O(lg lg u).
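
One way to see why this recurrence solves to O(lg lg u) is a change of variable. The derivation below is a standard sketch; the symbols m and S are introduced here for illustration and do not appear in the slides.

```latex
\begin{align*}
T(u) &= T(\sqrt{u}) + O(1) \\
\text{Let } m = \lg u \text{ and } S(m) = T(2^{m}): \qquad
S(m) &= S(m/2) + O(1) \\
S(m) &= O(\lg m) \\
T(u) &= S(\lg u) = O(\lg \lg u)
\end{align*}
```

Taking a square root halves the number of bits lg u, so the recursion bottoms out after about lg lg u steps.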
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Preliminary Approaches
Approach 1 : Direct Addressing
Approach 2 : Superimposing a binary tree structure
Approach 3 : Superimposing a tree of constant height
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 1 : Direct Addressing


Bit Vector
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 2 : Superimposing a binary tree structure


ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 2 : Superimposing a binary tree structure


• To find the minimum value in the set, start at the root and
head down toward the leaves, always taking the leftmost node
containing a 1.
• To find the maximum value in the set, start at the root and
head down toward the leaves, always taking the rightmost
node containing a 1.
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 2 : Superimposing a binary tree structure


• To find the successor of x, start at the leaf indexed by x, and
head up toward the root until we enter a node from the left
and this node has a 1 in its right child z. Then head down
through node z, always taking the leftmost node containing a 1
(i.e., find the minimum value in the subtree rooted at the right
child z).
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 2 : Superimposing a binary tree structure


• To find the predecessor of x, start at the leaf indexed by x, and
head up toward the root until we enter a node from the right
and this node has a 1 in its left child z. Then head down
through node z, always taking the rightmost node containing a
1 (i.e., find the maximum value in the subtree rooted at the
left child z).
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 2 : Superimposing a binary tree structure


• We also augment the INSERT and DELETE operations
appropriately.
• When inserting a value, we store a 1 in each node on the
simple path from the appropriate leaf up to the root.
• When deleting a value, we go from the appropriate leaf up to
the root, recomputing the bit in each internal node on the path
as the logical-or of its two children.
• Since the height of the tree is lg u and each of the above
operations makes at most one pass up the tree and at most
one pass down, each operation takes O(lg u) time in the worst
case.
• But our goal is to perform operations in O(lg lg u) time.
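
Approach 2 can be sketched in C with the binary tree stored implicitly in an array. The layout is an assumption made for illustration: index 1 is the root, the children of index i are 2i and 2i+1, and the leaf for value x sits at index U + x. The universe size U = 16 and all names are hypothetical, and predecessor is omitted since it mirrors successor.

```c
#include <assert.h>
#include <string.h>

#define U 16                        /* universe size, assumed a power of two */

/* tree[U + x] is the bit for value x; each internal node holds the
   logical OR of its two children. Index 0 is unused. */
typedef struct { unsigned char tree[2 * U]; } BITTREE;

void bt_init(BITTREE *t) { memset(t->tree, 0, sizeof t->tree); }

/* Store a 1 in every node on the path from the leaf up to the root. */
void bt_insert(BITTREE *t, int x)
{
    assert(x >= 0 && x < U);
    for (int i = U + x; i >= 1; i /= 2)
        t->tree[i] = 1;
}

/* Clear the leaf, then recompute each ancestor as the OR of its children. */
void bt_delete(BITTREE *t, int x)
{
    assert(x >= 0 && x < U);
    int i = U + x;
    t->tree[i] = 0;
    for (i /= 2; i >= 1; i /= 2)
        t->tree[i] = t->tree[2 * i] | t->tree[2 * i + 1];
}

/* Smallest value under node i, or -1 if that subtree is empty:
   always take the leftmost child containing a 1. */
static int min_from(const BITTREE *t, int i)
{
    if (!t->tree[i])
        return -1;
    while (i < U)
        i = t->tree[2 * i] ? 2 * i : 2 * i + 1;
    return i - U;
}

int bt_minimum(const BITTREE *t) { return min_from(t, 1); }

/* Smallest stored value strictly greater than x, or -1 if none. */
int bt_successor(const BITTREE *t, int x)
{
    assert(x >= 0 && x < U);
    for (int i = U + x; i > 1; i /= 2) {
        /* i is a left child (even index) whose right sibling holds a 1 */
        if (i % 2 == 0 && t->tree[i + 1])
            return min_from(t, i + 1);
    }
    return -1;
}
```

With the set {2, 3, 4, 5, 7, 14, 15} (an example set chosen here, not taken from the slides), `bt_minimum` returns 2 and `bt_successor(t, 7)` returns 14. Each operation walks at most one path up and one path down, so it costs O(lg u), matching the bound stated above.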
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 3 : Superimposing a tree of constant height


Assume size of the universe is u=2^(2k) for some integer k, so
that √u is an integer. On imposing a tree of degree √u, the
height of the resulting tree is always 2.
ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 3 : Superimposing a tree of constant height


ADVANCED DATA STRUCTURES
Van Emde Boas Tree

Approach 3 : Superimposing a tree of constant height


ADVANCED DATA STRUCTURES
Van Emde Boas Tree - Worksheet

Van Emde Boas Tree contains:


cluster information
summary information
min information
max information
ADVANCED DATA STRUCTURES
x-fast trie
• In computer science, an x-fast trie is a data structure for
storing integers from a bounded domain
• It supports exact and predecessor or successor queries in
time O(log log M), using O(n log M) space, where n is the
number of stored values and M is the maximum value in the
domain
• The structure was proposed by Dan Willard in 1982, along with
the more complicated y-fast trie, as a way to improve the space
usage of van Emde Boas trees, while retaining the O(log log M)
query time

https://en.wikipedia.org/wiki/X-fast_trie
ADVANCED DATA STRUCTURES
x-fast trie
• An x-fast trie is a bitwise trie: a binary tree where each subtree
stores values whose binary representations start with a common
prefix
• Each internal node is labeled with the common prefix of the
values in its subtree and typically, the left child adds a 0 to the
end of the prefix, while the right child adds a 1

https://en.wikipedia.org/wiki/X-fast_trie
ADVANCED DATA STRUCTURES
x-fast trie
• All values in the x-fast trie are stored at the leaves
• Internal nodes are stored only if they have leaves in their subtree
• If an internal node would have no left child, it stores a pointer to
the smallest leaf in its right subtree instead, called
a descendant pointer
• Likewise, if it would have no right child, it stores a pointer to the
largest leaf in its left subtree
• Each leaf stores a pointer to its predecessor and successor,
thereby forming a doubly linked list
• Finally, there is a hash table for each level that contains all the nodes on that level. Together, these hash tables form the level-search structure (LSS)
• To guarantee the worst-case query times, these hash tables should use dynamic perfect hashing or cuckoo hashing

Fig: An x-fast trie containing the integers 1 (001₂), 4 (100₂) and 5 (101₂). Blue edges indicate descendant pointers
https://en.wikipedia.org/wiki/X-fast_trie
ADVANCED DATA STRUCTURES
x-fast trie
x-fast tries support the following operations
• find(k): Find the element k in trie
• successor(k) and predecessor(k): Successor is the node with
smallest element greater than or equal to k, while predecessor is
largest element smaller than or equal to k in trie
• insert(k): insert the element k
• delete(k): delete the element k
ADVANCED DATA STRUCTURES
y-fast trie

• A y-fast trie is a data structure for storing integers from a bounded domain
• It supports exact and predecessor or successor queries in time O(log log M), using O(n) space, where n is the number of stored values and M is the maximum value in the domain
• The structure was proposed by Dan Willard in 1982 to decrease the O(n log M) space used by an x-fast trie
ADVANCED DATA STRUCTURES
y-fast trie

• A y-fast trie consists of two data structures: the top half is an x-fast trie and the lower half consists of a number of balanced binary trees

Fig: An example of a y-fast trie

https://en.wikipedia.org/wiki/Y-fast_trie
ADVANCED DATA STRUCTURES
y-fast trie

y-fast tries support the following operations
• find(k): Find the element k in trie
• successor(k) and predecessor(k): Successor is the node with smallest element greater than or equal to k, while predecessor is largest element smaller than or equal to k in trie
• insert(k): insert the element k
• delete(k): delete the element k

https://en.wikipedia.org/wiki/Y-fast_trie
