
Data Structures Unit 1 and 2

Uploaded by

pec library
ADS Notes Units 1 & 2

Deep Learning (Panimalar Engineering College)


Downloaded by Vijayakumar murali ([email protected])

CP 4151 - ADVANCED DATA STRUCTURES AND ALGORITHMS

SYLLABUS

UNIT I ROLE OF ALGORITHMS IN COMPUTING & COMPLEXITY ANALYSIS

Algorithms – Algorithms as a Technology – Time and Space Complexity of Algorithms – Asymptotic Analysis – Average and Worst-Case Analysis – Asymptotic Notation – Importance of Efficient Algorithms – Program Performance Measurement – Recurrences: The Substitution Method – The Recursion-Tree Method – Data Structures and Algorithms.

UNIT II HIERARCHICAL DATA STRUCTURES

Binary Search Trees: Basics – Querying a Binary Search Tree – Insertion and Deletion – Red-Black Trees: Properties of Red-Black Trees – Rotations – Insertion – Deletion – B-Trees: Definition of B-Trees – Basic Operations on B-Trees – Deleting a Key from a B-Tree – Heap – Heap Implementation – Disjoint Sets – Fibonacci Heaps: Structure – Mergeable-Heap Operations – Decreasing a Key and Deleting a Node – Bounding the Maximum Degree.

UNIT III GRAPHS


Elementary Graph Algorithms: Representations of Graphs – Breadth-First Search – Depth-First Search – Topological Sort – Strongly Connected Components – Minimum Spanning Trees: Growing a Minimum Spanning Tree – Kruskal and Prim – Single-Source Shortest Paths: The Bellman-Ford Algorithm – Single-Source Shortest Paths in Directed Acyclic Graphs – Dijkstra's Algorithm; Dynamic Programming – All-Pairs Shortest Paths: Shortest Paths and Matrix Multiplication – The Floyd-Warshall Algorithm.

UNIT IV ALGORITHM DESIGN TECHNIQUES


Dynamic Programming: Matrix-Chain Multiplication – Elements of Dynamic Programming – Longest Common Subsequence – Greedy Algorithms: Elements of the Greedy Strategy – An Activity-Selection Problem – Huffman Coding.

UNIT V NP COMPLETE AND NP HARD


NP-Completeness: Polynomial Time – Polynomial-Time Verification – NP-Completeness and Reducibility – NP-Completeness Proofs – NP-Complete Problems.

UNIT I ROLE OF ALGORITHMS IN COMPUTING & COMPLEXITY ANALYSIS


ALGORITHMS:
An algorithm is a step-by-step procedure that defines a set of instructions to be executed in a certain order to produce the desired output.

Algorithms are generally created independently of the underlying language, i.e. an algorithm can be implemented in more than one programming language.

From the data structure point of view, the following are some important categories of algorithms −

1. Search − Algorithm to search for an item in a data structure.
2. Sort − Algorithm to sort items in a certain order.
3. Insert − Algorithm to insert an item into a data structure.
4. Update − Algorithm to update an existing item in a data structure.
5. Delete − Algorithm to delete an existing item from a data structure.

Characteristics of an Algorithm:

Not all procedures can be called an algorithm. An algorithm should have the following characteristics:

i. Unambiguous − An algorithm should be clear and unambiguous. Each of its steps (or phases), and their inputs and outputs, should be clear and must lead to only one meaning.
ii. Input − An algorithm should have 0 or more well-defined inputs.
iii. Output − An algorithm should have 1 or more well-defined outputs, and these should match the desired output.
iv. Finiteness − An algorithm must terminate after a finite number of steps.
v. Feasibility − An algorithm should be feasible with the available resources.
vi. Independent − An algorithm should have step-by-step directions that are independent of any programming code.

How to Write an Algorithm?


There are no well-defined standards for writing algorithms. Rather, it is problem and resource dependent. Algorithms are never written to support a particular programming language.
All programming languages share basic code constructs such as loops (do, for, while) and flow control (if-else). These common constructs can be used to write an algorithm.
We usually write algorithms in a step-by-step manner, but this is not always the case. Algorithm writing is a process that is carried out after the problem domain is well defined. That is, we should know the problem domain for which we are designing a solution.

Example:
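As a hedged illustration (the problem and the function name find_largest are assumptions chosen here, not taken from the notes), an algorithm can first be stated as language-independent steps and then implemented. The sketch below does this in Python for finding the largest value in a list.

```python
def find_largest(values):
    """Return the largest element of a non-empty list.

    Step 1: Start with the first element as the current largest.
    Step 2: Compare every remaining element with the current largest.
    Step 3: Whenever an element is bigger, make it the current largest.
    Step 4: After the last element, report the current largest.
    """
    if not values:
        raise ValueError("values must not be empty")
    largest = values[0]
    for value in values[1:]:
        if value > largest:
            largest = value
    return largest
```

The same four steps could be implemented in any other language, which is the sense in which an algorithm is independent of programming code.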

ALGORITHM AS TECHNOLOGY

There can be different solutions or algorithms for the same problem, and these solutions may differ in terms of efficiency.

These differences can be much more significant than differences due to hardware and software. So, system performance depends on choosing efficient algorithms as much as on choosing fast hardware.

Even applications that do not require algorithms directly at the application level rely heavily upon algorithms.

For example:
1. Does the application require fast hardware? The hardware design uses algorithms.
2. Does the application depend upon the user interface? The design of the user interface relies on algorithms.
3. Does the application rely on fast networking? Networking relies heavily on routing algorithms.

Overall, algorithms are at the core of almost all computer applications. Just as rapid innovations are being made in other computer technologies, they are also being made in algorithms.

TIME & SPACE COMPLEXITY OF AN ALGORITHM

ALGORITHM ANALYSIS


The efficiency of an algorithm can be analyzed at two different stages, before implementation and after implementation. They are the following −
1. A Priori Analysis − This is a theoretical analysis of an algorithm. The efficiency of an algorithm is measured by assuming that all other factors, for example processor speed, are constant and have no effect on the implementation.

2. A Posteriori Analysis − This is an empirical analysis of an algorithm. The selected algorithm is implemented in a programming language and then executed on the target machine. In this analysis, actual statistics such as the running time and the space required are collected.

Algorithm analysis deals with the execution or running time of the various operations involved.

The running time of an operation can be defined as the number of computer instructions executed per operation.

Algorithm Complexity

Suppose X is an algorithm and n is the size of the input data. The time and space used by the algorithm X are the two main factors that decide the efficiency of X.

I. Time Factor − Time is measured by counting the number of key operations, such as comparisons in a sorting algorithm.

II. Space Factor − Space is measured by counting the maximum memory space required by the algorithm.

The complexity of an algorithm, f(n), gives the running time and/or the storage space required by the algorithm in terms of n, the size of the input data.

Space Complexity

The space complexity of an algorithm represents the amount of memory space required by the algorithm in its life cycle.
The space required by an algorithm is equal to the sum of the following two components −
A) A fixed part, which is the space required to store certain data and variables that are independent of the size of the problem. For example, simple variables and constants used, program size, etc.

B) A variable part, which is the space required by variables whose size depends on the size of the problem. For example, dynamic memory allocation, recursion stack space, etc.
The space complexity S(P) of any algorithm P is S(P) = C + S_P(I), where C is the fixed part and S_P(I) is the variable part of the algorithm, which depends on the instance characteristic I.

Example:
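A minimal sketch of the fixed and variable parts, assuming two hypothetical functions (sum_iterative and sum_recursive are names introduced here for illustration). Both add up the same list, but the first needs only a constant number of extra variables, while the second also needs one stack frame per element.

```python
def sum_iterative(values):
    """Extra space: a fixed part only (the variables total and value)."""
    total = 0
    for value in values:
        total += value
    return total


def sum_recursive(values, n=None):
    """Extra space: fixed part plus a variable part that grows with n,
    because each pending call keeps a frame on the recursion stack."""
    if n is None:
        n = len(values)
    if n == 0:
        return 0
    return sum_recursive(values, n - 1) + values[n - 1]
```

Under these assumptions the iterative version has S_P(I) = 0 beyond its constants, whereas the recursive version has S_P(I) proportional to n.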


Time Complexity

Time complexity of an algorithm represents the amount of time required by the algorithm to run
to completion.
Time requirements can be defined as a numerical function T(n), where T(n) can be measured as
the number of steps, provided each step consumes constant time.
For example, addition of two n-bit integers takes n steps.
Consequently, the total computational time is T(n) = c ∗ n, where c is the time taken for the
addition of two bits.
Here, we observe that T(n) grows linearly as the input size increases.
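The n-bit addition above can be sketched as follows. This is an illustrative sketch rather than the notes' own procedure: the function name add_bits and the choice to store bits least-significant first are assumptions. The step counter shows that exactly one constant-time step is performed per bit position.

```python
def add_bits(a, b):
    """Add two equal-length bit lists (least significant bit first).

    Returns the sum as a bit list and the number of single-bit steps used.
    """
    if len(a) != len(b):
        raise ValueError("both numbers must have the same number of bits")
    result, carry, steps = [], 0, 0
    for x, y in zip(a, b):
        total = x + y + carry      # one constant-time bit addition
        result.append(total % 2)
        carry = total // 2
        steps += 1
    result.append(carry)           # final carry-out bit
    return result, steps
```

For two 3-bit inputs the loop runs 3 times, and doubling the number of bits doubles the step count, which is the linear growth T(n) = c ∗ n described above.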

ASYMPTOTIC ANALYSIS

Asymptotic analysis of an algorithm refers to defining mathematical bounds on its run-time performance. Using asymptotic analysis, we can characterize the best-case, average-case, and worst-case behaviour of an algorithm.
Asymptotic analysis is input bound, i.e. if there is no input to the algorithm, it is concluded to work in constant time. Other than the input, all other factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in mathematical units of computation.
For example, the running time of one operation may be computed as f(n), and that of another operation as g(n²).
This means that the running time of the first operation will increase linearly as n increases, while the running time of the second operation will increase quadratically as n increases. Similarly, the running times of both operations will be nearly the same if n is small. Usually, the time required by an algorithm falls under three types −

1. Best Case − Minimum time required for program execution.

2. Average Case − Average time required for program execution.

3. Worst Case − Maximum time required for program execution.
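The three cases can be made concrete with linear search, used here only as an assumed example (the notes do not name a specific algorithm at this point). The sketch counts key comparisons so that the best and worst cases can be observed directly.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 when target is absent."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons
```

The best case is a match at the first position (1 comparison), the worst case is a match at the last position or no match at all (n comparisons), and the average case lies in between.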

ASYMPTOTIC NOTATIONS


The following asymptotic notations are commonly used to express the running-time complexity of an algorithm.
• Ο Notation
• Ω Notation
• θ Notation

Big Oh Notation, Ο
The notation Ο(f(n)) is the formal way to express an upper bound on an algorithm's running time. It is most often used to bound the worst-case time complexity, i.e. the longest amount of time an algorithm can possibly take to complete.

For example, for a function f(n):

Ο(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≤ c·f(n) for all n > n0 }
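As a small worked example of an upper bound (the function 3n + 2 is chosen here purely for illustration and does not come from the notes), one way to show that 3n + 2 belongs to Ο(n) is to exhibit the constants c and n0 explicitly:

```latex
\[
3n + 2 \;\le\; 3n + n \;=\; 4n \qquad \text{for all } n \ge 2,
\]
\[
\text{so } 3n + 2 \in O(n) \text{ with } c = 4 \text{ and } n_0 = 2 .
\]
```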

Omega Notation, Ω
The notation Ω(f(n)) is the formal way to express a lower bound on an algorithm's running time. It is most often used to bound the best-case time complexity, i.e. the least amount of time an algorithm can possibly take to complete.

For example, for a function f(n):

Ω(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≥ c·f(n) for all n > n0 }

Theta Notation, θ

The notation θ(f(n)) is the formal way to express both a lower bound and an upper bound on an algorithm's running time. It is represented as follows −

For example, for a function f(n):

θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }

Common Asymptotic Notations
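A hedged sketch of how some frequently encountered growth rates compare. The selection of functions (constant, logarithmic, linear, n log n, quadratic, exponential) and the helper name growth_row are assumptions made here for illustration.

```python
import math


def growth_row(n):
    """Evaluate several common growth functions at input size n."""
    return {
        "1": 1,                       # constant
        "log n": math.log2(n),        # logarithmic
        "n": n,                       # linear
        "n log n": n * math.log2(n),  # linearithmic
        "n^2": n ** 2,                # quadratic
        "2^n": 2 ** n,                # exponential
    }


for size in (8, 16, 32):
    print(size, growth_row(size))
```

Even for these small sizes the exponential column dominates all the others, which is why the growth rate matters more than constant factors for large inputs.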

IMPORTANCE OF ALGORITHM EFFICIENCY


The efficiency of algorithms and data structures is becoming increasingly important in the area of big data, where complicated analysis is performed on very large datasets.

Often algorithm efficiency is the deciding factor in the quality of an analysis (or even in whether it is possible at all).

Modelling modern computational infrastructure (such as complicated memory hierarchies, GPUs and modern client-server architectures), and developing algorithms and data structures for these models and devices, is also increasingly important.

The main objectives are to extend the basic understanding of efficient algorithms and data structures for fundamental (big data) problems, as well as to further increase the Danish strength and capacity within algorithms and data structures.

Since efficient algorithms and data structures are important, and often even essential, in other computer science research areas (as also explicitly indicated e.g. in the descriptions of the artificial intelligence and data management disciplines), as well as in applications, there are significant opportunities for synergies between algorithms researchers and other researchers in the project.

Thus, the use of algorithmic advances in interdisciplinary and real-life application settings is another important objective.

PROGRAM PERFORMANCE MEASUREMENT

What is Performance Analysis of an algorithm?


If we want to go from city "A" to city "B", there can be many ways of doing this. We can go by flight, by bus, by train or by bicycle. Depending on availability and convenience, we choose the one that suits us.
Similarly, in computer science, there are often multiple algorithms that solve the same problem. When we have more than one algorithm to solve a problem, we need to select the best one. Performance analysis helps us to do this: we analyze the alternatives and pick the one that best suits our requirements.
The formal definition is: "Performance analysis of an algorithm is the process of making evaluative judgements about algorithms".
We compare algorithms that solve the same problem with each other in order to select the best one. To compare algorithms, we use a set of parameters such as the memory required by the algorithm, its execution speed, how easy it is to understand, how easy it is to implement, etc.
Generally, the performance of an algorithm depends on the following elements:

1. Does the algorithm provide the exact solution to the problem?
2. Is it easy to understand?
3. Is it easy to implement?
4. How much space (memory) does it require to solve the problem?
5. How much time does it take to solve the problem?

When we want to analyse an algorithm, we consider only the space and time required by that particular algorithm, and we ignore all the remaining elements.
Performance analysis of an algorithm is performed by using the following measures:

1. Space required to complete the task of that algorithm (Space Complexity). It includes program space and data space.
2. Time required to complete the task of that algorithm (Time Complexity).
RECURRENCE RELATIONS


A recurrence is an equation or inequality that describes a function in terms of its values on smaller inputs. To solve a recurrence relation means to obtain a function defined on the natural numbers that satisfies the recurrence.

For example, the worst-case running time T(n) of the MERGE SORT procedure is described by the recurrence $T(n) = 2T(n/2) + \Theta(n)$ for $n > 1$, with $T(1) = \Theta(1)$.

There are four methods for solving recurrences:

1. Substitution Method
2. Iteration Method
3. Recursion Tree Method
4. Master Method

1. SUBSTITUTION METHOD:

The Substitution Method consists of two main steps:

1. Guess the solution.
2. Use mathematical induction to find the boundary condition and show that the guess is correct.

Example 1: Solve the recurrence by the Substitution Method. We have to show that it is asymptotically bounded by O(log n).

Solution:

We have to show that, for some constant c,

T(n) ≤ c log n

Substitute this into the given recurrence.
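The recurrence for this example is not legible in these notes. As a hedged illustration of the two steps, assume the recurrence T(n) = T(n/2) + 1 (an assumption made here, chosen because its solution is O(log n)). Guessing T(n) ≤ c log n and substituting the guess for the smaller input gives:

```latex
\begin{align*}
T(n) &= T(n/2) + 1 \\
     &\le c \log_2 (n/2) + 1 && \text{(inductive hypothesis for } n/2\text{)} \\
     &= c \log_2 n - c + 1 \\
     &\le c \log_2 n && \text{whenever } c \ge 1 .
\end{align*}
```

Under this assumption the guess holds for every c ≥ 1, so T(n) = O(log n).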


2. ITERATION METHOD:

In this method we expand the recurrence repeatedly and express it as a summation of terms that depend only on n and the initial condition.
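As an illustrative expansion, assume the recurrence T(n) = T(n − 1) + n with T(1) = 1 (this recurrence is an assumption chosen here, not one taken from the notes). Expanding it step by step until the initial condition is reached gives:

```latex
\begin{align*}
T(n) &= T(n-1) + n \\
     &= T(n-2) + (n-1) + n \\
     &= T(n-3) + (n-2) + (n-1) + n \\
     &\;\;\vdots \\
     &= T(1) + 2 + 3 + \cdots + n \\
     &= \frac{n(n+1)}{2} \;=\; \Theta(n^2).
\end{align*}
```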


3. RECURSION TREE METHOD

1. The Recursion Tree Method is a pictorial representation of the iteration method, in the form of a tree in which nodes are expanded at each level.

2. In general, we consider the second term in the recurrence as the root.

3. It is useful when a divide-and-conquer algorithm is used.

4. It is sometimes difficult to come up with a good guess. In a recursion tree, each root and child represents the cost of a single subproblem.

5. We sum the costs within each level of the tree to obtain a set of per-level costs, and then sum all the per-level costs to determine the total cost of all levels of the recursion.

6. A recursion tree is best used to generate a good guess, which can then be verified by the Substitution Method.
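The per-level summation in steps 4 and 5 can be sketched numerically. The helper below is an assumption introduced for illustration: it handles recurrences of the form T(n) = a·T(n/b) + f(n), assumes that n is a power of b with b ≥ 2, and charges f(1) to each leaf.

```python
def level_costs(n, a, b, f):
    """Return the total cost of each level of the recursion tree.

    Level i has a**i subproblems, each of size n // b**i and cost f(size).
    Assumes n is a power of b and b >= 2.
    """
    costs = []
    nodes, size = 1, n
    while size >= 1:
        costs.append(nodes * f(size))  # cost of one whole level
        nodes *= a                     # each node spawns a children
        size //= b                     # each child is b times smaller
    return costs


# Assumed example: T(n) = 2T(n/2) + n with n = 8.
costs = level_costs(8, 2, 2, lambda m: m)
print(costs, sum(costs))
```

For this assumed recurrence every level costs n and there are log₂ n + 1 levels, which suggests the guess T(n) = O(n log n) to verify by substitution.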


4. MASTER METHOD

The Master Method is used for solving recurrences of the following type:

$T(n) = a\,T(n/b) + f(n)$, where $a \ge 1$ and $b > 1$ are constants and $f(n)$ is a function.

Let T(n) be defined on the non-negative integers by this recurrence.

In the analysis of a recursive algorithm, the constants and the function take on the following significance:

o n is the size of the problem.
o a is the number of subproblems in the recursion.
o n/b is the size of each subproblem. (Here it is assumed that all subproblems are essentially the same size.)
o f(n) is the cost of the work done outside the recursive calls, which includes the cost of dividing the problem and the cost of combining the solutions to the subproblems.
o It is not always possible to bound the function as required, so we distinguish three cases that tell us what kind of bound we can apply to the function.

MASTER THEOREM:
It is possible to obtain an asymptotically tight bound in these three cases:

Case 1: If $f(n) = O(n^{\log_b a - \varepsilon})$ for some constant $\varepsilon > 0$, then:

$T(n) = \Theta(n^{\log_b a})$

EXAMPLE: (The recurrence and its worked solution are not legible in the source. The solution concludes that, since the Case 1 condition holds, the first case of the master theorem applies to the given recurrence relation.)

Case 2: If $f(n) = \Theta(n^{\log_b a})$, then:

$T(n) = \Theta(n^{\log_b a} \log n)$

Case 3: If $f(n) = \Omega(n^{\log_b a + \varepsilon})$ for some constant $\varepsilon > 0$, and it is also true that $a\,f(n/b) \le c\,f(n)$ for some constant $c < 1$ and all sufficiently large n, then:

$T(n) = \Theta(f(n))$
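A minimal sketch of how the three cases are applied, restricted to the assumption that f(n) = Θ(n^d) for a constant d ≥ 0 (the function name master_theorem and this restriction are introduced here for illustration). For such polynomial f(n), comparing d with log_b a decides the case, and the extra regularity condition of Case 3 holds automatically.

```python
import math


def master_theorem(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n**d); return (case number, bound)."""
    if a < 1 or b <= 1 or d < 0:
        raise ValueError("requires a >= 1, b > 1 and d >= 0")
    critical = math.log(a, b)  # the exponent log_b(a)
    if math.isclose(d, critical):
        return 2, f"Theta(n^{d:g} log n)"   # f(n) matches n^(log_b a)
    if d < critical:
        return 1, f"Theta(n^{critical:g})"  # the recursion dominates
    return 3, f"Theta(n^{d:g})"             # f(n) dominates


print(master_theorem(8, 2, 2))  # assumed example: T(n) = 8T(n/2) + n^2
print(master_theorem(2, 2, 1))  # merge-sort style: T(n) = 2T(n/2) + n
print(master_theorem(1, 2, 1))  # assumed example: T(n) = T(n/2) + n
```

Recurrences whose f(n) is not a plain polynomial, such as n log n, fall outside this simplified sketch and need the full statement of the theorem.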


UNIT - 2 HIERARCHICAL DATA STRUCTURES


A Binary Search Tree is a node-based binary tree data structure which has the following properties:
• The left subtree of a node contains only nodes with keys less than the node's key.
• The right subtree of a node contains only nodes with keys greater than the node's key.
• The left and right subtrees must each also be a binary search tree.

QUERYING A BINARY SEARCH TREE

1. Searching: The TREE-SEARCH(x, k) algorithm searches the subtree rooted at x for a node whose key is equal to k. It returns a pointer to the node if it exists, and NIL otherwise.


Clearly, this algorithm runs in O(h) time, where h is the height of the tree. The iterative version of the above algorithm is also very easy to implement.
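The search described above can be sketched in Python as follows. The Node class and the function names are assumptions made for this illustration, with None standing in for NIL; both the recursive and the iterative version follow a single root-to-leaf path, which is why they take O(h) time.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_search(x, k):
    """Recursive search in the subtree rooted at x; returns a node or None."""
    if x is None or x.key == k:
        return x
    if k < x.key:
        return tree_search(x.left, k)
    return tree_search(x.right, k)


def iterative_tree_search(x, k):
    """Iterative version: walk down, going left or right at each node."""
    while x is not None and x.key != k:
        x = x.left if k < x.key else x.right
    return x
```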

2. Minimum and Maximum: An item in a binary search tree whose key is a minimum can always
be found by following left child pointers from the root until a NIL is encountered. The following
procedure returns a pointer to the minimum element in the subtree rooted at a given node x.
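A minimal Python sketch of the minimum query, together with the symmetric maximum query that follows right child pointers instead. The Node class and function names are assumptions for illustration, and both functions expect a non-empty subtree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_minimum(x):
    """Follow left pointers from x until there is no left child."""
    while x.left is not None:
        x = x.left
    return x


def tree_maximum(x):
    """Follow right pointers from x until there is no right child."""
    while x.right is not None:
        x = x.right
    return x
```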

3. Successor and predecessor: Given a node in a binary search tree, we sometimes need to find its successor in the sorted order determined by an inorder tree walk. If all keys are distinct, the successor of a node x is the node with the smallest key greater than key[x]. The structure of a binary search tree allows us to determine the successor of a node without ever comparing keys. The procedure TREE-SUCCESSOR returns the successor of a node x in a binary search tree if it exists, and NIL if x has the greatest key in the tree.

The code for TREE-SUCCESSOR is broken into two cases. If the right subtree of node x is nonempty, then the successor of x is just the leftmost node in the right subtree, which we find by calling TREE-MINIMUM(right[x]). On the other hand, if the right subtree of node x is empty and x has a successor y, then y is the lowest ancestor of x whose left child is also an ancestor of x. To find y, we simply go up the tree from x until we encounter a node that is the left child of its parent.

The running time of TREE-SUCCESSOR on a tree of height h is O(h), since we either follow a simple path up the tree or follow a simple path down the tree. The procedure TREE-PREDECESSOR, which is symmetric to TREE-SUCCESSOR, also runs in O(h) time.
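The two cases of the successor query can be sketched as below. This is an illustrative sketch, not the notes' pseudocode: the Node class (which sets each child's parent pointer on construction) and the function names are assumptions, and None plays the role of NIL.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self  # keep parent pointers consistent


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def tree_successor(x):
    # Case 1: non-empty right subtree, so take its leftmost node.
    if x.right is not None:
        return tree_minimum(x.right)
    # Case 2: climb until we arrive at a parent from its left child.
    y = x.parent
    while y is not None and x is y.right:
        x = y
        y = y.parent
    return y
```

Note that no keys are compared anywhere in tree_successor; only the shape of the tree is used.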

Insertion and Deletion in Binary Search tree


To insert a new value v into a binary search tree T, we use the procedure TREE-INSERT. The procedure takes a node z for which key[z] = v, left[z] = NIL, and right[z] = NIL. It modifies T and some of the attributes of z in such a way that z is inserted into an appropriate position in the tree.


Once the insertion position has been found, the new node z becomes either the left or the right child of its parent y.

In the illustrated example, the node is inserted as the left child of the node at index 6.
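A minimal Python sketch of the insertion procedure described above, under the assumption of a Node class with key, left, right and parent fields and a small BinarySearchTree wrapper holding the root (all names are introduced here for illustration). The trailing pointer y remembers the parent of the position where z will be attached.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, z):
        y, x = None, self.root
        while x is not None:          # walk down to a free position
            y = x
            x = x.left if z.key < x.key else x.right
        z.parent = y
        if y is None:
            self.root = z             # the tree was empty
        elif z.key < y.key:
            y.left = z
        else:
            y.right = z


def inorder(node):
    """Return the keys of the subtree in sorted order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

Like the queries, insertion follows one downward path and therefore runs in O(h) time on a tree of height h.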

5. Deletion in Binary Search Tree: When deleting a node from a tree, it is essential that any relationships implicit in the tree are maintained. The deletion of a node z from a binary search tree is considered below.

There are three distinct cases:

1. Nodes with no children: This case is trivial. Simply set the parent's pointer to the node to be deleted to NIL and delete the node.
2. Nodes with one child: When z has no left child, we replace z by its right child, which may or may not be NIL. When z has no right child, we replace z by its left child.
3. Nodes with both children: When z has both a left and a right child, we find z's successor y, which lies in z's right subtree and has no left child (the successor of z is the node with the minimum key in its right subtree, and so it has no left child).
o If y is z's right child, then we replace z by y.
o Otherwise, y lies within z's right subtree but is not z's right child. In this case, we first replace y by its own right child, and then replace z by y.

The procedure runs in O(h) time on a tree of height h.
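The three deletion cases can be sketched as follows. This is a hedged illustration: the class and method names, including the helper _transplant that replaces one subtree by another, are assumptions introduced here, and None stands in for NIL.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class BinarySearchTree:
    def __init__(self, keys=()):
        self.root = None
        for key in keys:
            self.insert(Node(key))

    def insert(self, z):
        y, x = None, self.root
        while x is not None:
            y, x = x, (x.left if z.key < x.key else x.right)
        z.parent = y
        if y is None:
            self.root = z
        elif z.key < y.key:
            y.left = z
        else:
            y.right = z

    def find(self, key):
        x = self.root
        while x is not None and x.key != key:
            x = x.left if key < x.key else x.right
        return x

    def _transplant(self, u, v):
        """Replace the subtree rooted at u by the subtree rooted at v."""
        if u.parent is None:
            self.root = v
        elif u is u.parent.left:
            u.parent.left = v
        else:
            u.parent.right = v
        if v is not None:
            v.parent = u.parent

    def delete(self, z):
        if z.left is None:                 # no child, or only a right child
            self._transplant(z, z.right)
        elif z.right is None:              # only a left child
            self._transplant(z, z.left)
        else:                              # two children: use the successor y
            y = z.right
            while y.left is not None:
                y = y.left
            if y.parent is not z:          # y is deeper in z's right subtree
                self._transplant(y, y.right)
                y.right = z.right
                y.right.parent = y
            self._transplant(z, y)
            y.left = z.left
            y.left.parent = y


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

In every case the inorder sequence of the remaining keys stays sorted, which is the relationship the deletion has to maintain.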

For Example: Deleting a node z from a binary search tree. Node z may be the root, a
left child of node q, or a right child of q.


RED BLACK TREE

A Red Black Tree is a type of self-balancing binary search tree. It was created in 1972 by Rudolf Bayer, who termed them "symmetric binary B-trees."

A red-black tree is a binary search tree in which every node has a color as an extra attribute, either red or black. By constraining the node colors on any simple path from the root to a leaf, red-black trees ensure that no such path is more than twice as long as any other, so that the tree is approximately balanced.

Properties of Red-Black Trees

A red-black tree must satisfy these properties:

1. The root is always black.
2. Every NIL leaf is considered black. This means that every non-NIL node has two children.
3. Black Children Rule: The children of any red node are black.
4. Black Height Rule: For every node v, there exists an integer bh(v) such that every downward path from v to a NIL has exactly bh(v) black real (i.e. non-NIL) nodes. This quantity is called the black height of v. The black height of an RB tree is defined as the black height of its root.

A tree T is an almost red-black tree (ARB tree) if the root is red but all the other conditions above hold.
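The properties above can be checked mechanically. The sketch below is an illustration under stated assumptions: the Node class, the use of None for NIL leaves, and the function names are introduced here, the black height counts the node itself when it is black (one possible convention), and the binary-search-tree ordering of the keys is not checked.

```python
class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color  # "red" or "black"
        self.left = left
        self.right = right


def black_height(node):
    """Return the black height of the subtree, or raise ValueError
    if a red node has a red child or two paths disagree."""
    if node is None:
        return 0  # NIL leaves are black but are not counted as real nodes
    if node.color == "red":
        for child in (node.left, node.right):
            if child is not None and child.color == "red":
                raise ValueError("red node with a red child")
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError("paths with different numbers of black nodes")
    return left + (1 if node.color == "black" else 0)


def is_red_black(root):
    if root is not None and root.color != "black":
        return False  # property 1: the root must be black
    try:
        black_height(root)
    except ValueError:
        return False
    return True
```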


Operations on RB Trees:

The search-tree operations TREE-INSERT and TREE-DELETE, when run on a red-black tree with n keys, take O(log n) time. Because they modify the tree, the result may violate the red-black properties. To restore these properties, we must change the colors of some of the nodes in the tree and also change the pointer structure.

1. Rotation:

Restructuring operations on red-black trees can generally be expressed more clearly in terms of the rotation operation.

Clearly, the order (A x B y C) is preserved by the rotation operation. Therefore, if we start with a BST and only restructure using rotations, then we will still have a BST, i.e. rotations do not break the BST property.
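A minimal sketch of the two rotations on a plain binary tree without parent pointers (the Node class and function names are assumptions for illustration). Each function returns the new root of the rotated subtree, and the subtrees A, B and C keep their left-to-right order.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(x):
    """x with right child y becomes the left child of y."""
    y = x.right
    x.right = y.left   # subtree B moves from y to x
    y.left = x
    return y


def rotate_right(y):
    """y with left child x becomes the right child of x."""
    x = y.left
    y.left = x.right   # subtree B moves from x to y
    x.right = y
    return x


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

Each rotation changes only a constant number of pointers, so it takes O(1) time.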


Example: Draw the complete binary tree of height 3 on the keys {1, 2, 3... 15}. Add the NIL leaves
and color the nodes in three different ways such that the black heights of the resulting trees are: 2,
3 and 4.


2. Insertion:

o Insert the new node the way it is done in binary search trees.
o Color the node red.
o If an inconsistency arises for the red-black tree, fix the tree according to the type of discrepancy.

A discrepancy can result from a parent and a child both being red. The type of discrepancy is determined by the location of the node relative to its grandparent and by the color of the sibling of the parent.

After inserting the new node, coloring it black may violate the black-height condition, while coloring it red may violate the coloring conditions, i.e. that the root is black and that a red node has no red children. Black-height violations are hard to repair, so we color the new node red. If this causes a color violation, we correct it with the RB-INSERT-FIXUP procedure.
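The insertion and fix-up steps can be sketched as below. This is a hedged sketch in the style of the standard textbook procedure rather than the notes' own pseudocode: the class names, the shared black sentinel self.nil used in place of NIL, and the helper names are assumptions made here.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color=RED, nil=None):
        self.key, self.color = key, color
        self.left = self.right = self.parent = nil


class RedBlackTree:
    def __init__(self):
        self.nil = Node(None, BLACK)  # shared sentinel for every NIL leaf
        self.root = self.nil

    def _rotate(self, x, left):
        """Rotate x down to the left (left=True) or to the right."""
        y = x.right if left else x.left
        inner = y.left if left else y.right
        if left:
            x.right = inner
        else:
            x.left = inner
        if inner is not self.nil:
            inner.parent = x
        y.parent = x.parent
        if x.parent is self.nil:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        if left:
            y.left = x
        else:
            y.right = x
        x.parent = y

    def insert(self, key):
        z = Node(key, RED, self.nil)  # new nodes are colored red
        parent, current = self.nil, self.root
        while current is not self.nil:
            parent = current
            current = current.left if key < current.key else current.right
        z.parent = parent
        if parent is self.nil:
            self.root = z
        elif key < parent.key:
            parent.left = z
        else:
            parent.right = z
        self._fixup(z)

    def _fixup(self, z):
        while z.parent.color == RED:  # red parent with red child
            grand = z.parent.parent
            parent_is_left = z.parent is grand.left
            uncle = grand.right if parent_is_left else grand.left
            if uncle.color == RED:    # recolor and move the problem up
                z.parent.color = uncle.color = BLACK
                grand.color = RED
                z = grand
            else:
                inner = z.parent.right if parent_is_left else z.parent.left
                if z is inner:        # turn the zig-zag into a straight line
                    z = z.parent
                    self._rotate(z, left=parent_is_left)
                z.parent.color = BLACK
                grand.color = RED
                self._rotate(grand, left=not parent_is_left)
        self.root.color = BLACK


def inorder(tree, node):
    if node is tree.nil:
        return []
    return inorder(tree, node.left) + [node.key] + inorder(tree, node.right)
```

The discrepancy type mentioned above corresponds to the branches of _fixup: a red uncle leads to recoloring only, while a black uncle leads to one or two rotations.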


Example: Show the red-black trees that result after successively inserting the keys
41,38,31,12,19,8 into an initially empty red-black tree.

Solution: (The figures showing the tree after each insertion, labelled "Insert 41" through "Insert 19" and beyond, are not reproduced here.)

DELETION IN RED BLACK TREE

First, search for the element to be deleted.

o If the element to be deleted is in a node with only a left child, swap this node with the one containing the largest element in the left subtree. (This node has no right child.)
o If the element to be deleted is in a node with only a right child, swap this node with the one containing the smallest element in the right subtree. (This node has no left child.)
o If the element to be deleted is in a node with both a left child and a right child, then swap in either of the above two ways. While swapping, swap only the keys but not the colors.
o The item to be deleted is now in a node with only a left child or only a right child. Replace this node with its sole child. This may violate the red constraint or the black constraint. Violations of the red constraint can be fixed easily.
o If the deleted node is black, the black constraint is violated. The elimination of a black node y causes any path that contained y to have one fewer black node.
o Two cases arise:
   o The replacing node is red, in which case we merely color it black to make up for the loss of one black node.
   o The replacing node is black.

The procedure RB-DELETE is a minor modification of the TREE-DELETE procedure. After splicing out a node, it calls an auxiliary procedure RB-DELETE-FIXUP that changes colors and performs rotations to restore the red-black properties.


B - TREES

B-Trees maintain balance by ensuring that each node has a minimum number of keys, so the
tree is always balanced. This balance guarantees that the time complexity for operations such as
insertion, deletion, and searching is always O(log n), regardless of the initial shape of the tree.


Properties of B-Tree:
• All leaves are at the same level.
• A B-Tree is defined by the term minimum degree 't'. The value of 't' depends upon the disk block size.
• Every node except the root must contain at least t − 1 keys. The root may contain a minimum of 1 key.
• All nodes (including the root) may contain at most 2t − 1 keys.
• The number of children of a node is equal to the number of keys in it plus 1.
• All keys of a node are sorted in increasing order. The child between two keys k1 and k2 contains all keys in the range from k1 to k2.
• A B-Tree grows and shrinks from the root, unlike a Binary Search Tree. Binary Search Trees grow downward and also shrink from downward.
• As with other balanced Binary Search Trees, the time complexity to search, insert and delete is O(log n).
• Insertion of a key in a B-Tree happens only at a leaf node.
Following is an example of a B-Tree of minimum order 5.
Note that in practical B-Trees, the value of the minimum order is much more than 5.

We can see in the above diagram that all the leaf nodes are at the same level, and that all non-leaf nodes have no empty subtree and have one key fewer than the number of their children.


Traversal in B-Tree:
Traversal is similar to the inorder traversal of a binary tree. We start from the leftmost child, recursively print the leftmost child, then repeat the same process for the remaining children and keys. In the end, we recursively print the rightmost child.
Search Operation in B-Tree:
Search is similar to the search in a Binary Search Tree. Let the key to be searched be k.
Start from the root and recursively traverse down.
For every visited non-leaf node:
o If the node has the key, we simply return the node.
o Otherwise, we recur down to the appropriate child of the node (the child which is just before the first greater key).
If we reach a leaf node and do not find k in the leaf node, then we return NULL.

Searching a B-Tree is similar to searching a binary search tree, and the algorithm is naturally recursive. At each level, the keys of the current node narrow the search: if the key is not present in the node, it can only lie in the one branch whose range contains it. Because these keys limit the search, they are also known as limiting values or separation values. If we reach a leaf node and do not find the desired key, the search returns NULL.
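The traversal and search described above can be sketched as follows. The BTreeNode class, the function names, and the use of the standard-library function bisect_left to locate the first key that is not smaller than k are assumptions made for this illustration; insertion and deletion are not covered by this sketch.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted list of keys
        self.children = children or []    # empty for a leaf

    @property
    def is_leaf(self):
        return not self.children


def btree_search(node, k):
    """Return (node, index) where k is stored, or None if k is absent."""
    i = bisect_left(node.keys, k)         # first key >= k
    if i < len(node.keys) and node.keys[i] == k:
        return node, i
    if node.is_leaf:
        return None
    return btree_search(node.children[i], k)  # child just before that key


def btree_traverse(node):
    """Return all keys in increasing order (inorder-style traversal)."""
    result = []
    for i, key in enumerate(node.keys):
        if not node.is_leaf:
            result.extend(btree_traverse(node.children[i]))
        result.append(key)
    if not node.is_leaf:
        result.extend(btree_traverse(node.children[-1]))
    return result
```

A node with m keys has m + 1 children, so children[i] holds exactly the keys that fall between keys[i − 1] and keys[i].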


Applications of B-Trees:
• They are used in large databases to access data stored on disk.
• Searching for data in a data set can be achieved in significantly less time using a B-Tree.
• With the indexing feature, multilevel indexing can be achieved.
• Most servers also use the B-Tree approach.
• B-Trees are used in CAD systems to organize and search geometric data.
• B-Trees are also used in other areas such as natural language processing, computer networks, and cryptography.


HEAP DATA STRUCTURES

A heap is a complete binary tree that satisfies the heap property, and a binary tree is a
tree in which each node can have at most two children. Before learning more about the
heap data structure, we should understand the complete binary tree.

What is a complete binary tree?

A complete binary tree is a binary tree in which every level except possibly the last
is completely filled, and all the nodes in the last level are left-justified, i.e.,
placed as far to the left as possible.

In the above figure, we can observe that every level except the last (leaf) level is
completely filled and the leaves are filled from the left; therefore, we can say that
the above tree is a complete binary tree.

How can we arrange the nodes in the Tree?

There are two types of the heap:

o Min Heap
o Max heap


Min Heap: The value of the parent node should be less than or equal to the value of
each of its children.

Or

In other words, the min-heap can be defined as, for every node i, the value of node i
is greater than or equal to its parent value except the root node. Mathematically, it
can be defined as:

A[Parent(i)] <= A[i]

Example:

11 is the root node, and the value of the root node is less than the value of all the
other nodes (left child or a right child).

Max Heap: The value of the parent node is greater than or equal to its children.

Or

In other words, the max heap can be defined as for every node i; the value of node i
is less than or equal to its parent value except the root node. Mathematically, it can
be defined as:

A[Parent(i)] >= A[i]

The above tree is a max heap tree as it satisfies the property of the max heap. Now, let's see the
array representation of the max heap.
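
One common way to store a heap in an array, sketched here under the assumption of 0-based indexing, places the root at index 0 and the children of the node at index i at indices 2i+1 and 2i+2. The helper names below are illustrative, and the min-heap values in the usage lines are made up for the example.

```python
def parent(i):
    """Index of the parent of node i (0-based array layout)."""
    return (i - 1) // 2


def left(i):
    """Index of the left child of node i."""
    return 2 * i + 1


def right(i):
    """Index of the right child of node i."""
    return 2 * i + 2


def is_max_heap(a):
    """Check A[Parent(i)] >= A[i] for every node except the root."""
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))


def is_min_heap(a):
    """Check A[Parent(i)] <= A[i] for every node except the root."""
    return all(a[parent(i)] <= a[i] for i in range(1, len(a)))


print(is_max_heap([88, 55, 77, 11, 33, 44, 66]))  # True
print(is_min_heap([11, 20, 15, 30, 25]))          # True
print(is_max_heap([44, 33, 77]))                  # False: 77 is larger than its parent 44
```

Because the tree is complete, the array has no gaps, so no explicit child pointers are needed.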


Time complexity in Max Heap

The number of comparisons required by an insertion or a deletion in the max heap is
proportional to the height of the tree. The height of a complete binary tree with n
nodes is about log n (base 2); therefore, the time complexity of each operation is
O(log n).

Insertion in the Heap tree

Suppose we want to create the max heap tree from the following elements:

44, 33, 77, 11, 55, 88, 66

To create the max heap tree, we need to maintain the following two conditions on
every insertion:

o First, we have to insert the element in such a way that the property of the
complete binary tree is maintained.
o Secondly, the value of the parent node should be greater than or equal to the
value of each of its children.

Step 1: First, we add the element 44 to the tree as the root, as shown below:

Step 2: The next element is 33. As we know, insertion in the binary tree always starts from the
left side, so 33 will be added to the left of 44

Step 3: The next element is 77, and it will be added to the right of 44


As we can observe, the above tree does not satisfy the max heap property, i.e., the parent
node 44 is less than its child 77. So, we will swap these two values

Step 4: The next element is 11. The node 11 is added to the left of 33

Step 5: The next element is 55. To keep it a complete binary tree, we will add the node 55 to the
right of 33

As we can observe, the above figure does not satisfy the property of the max heap
because 33<55, so we will swap these two values as shown below:


Step 6: The next element is 88. The left subtree is complete, so we will add 88 to the left of 44

As we can observe, the above figure does not satisfy the property of the
max heap because 44<88, so we will swap these two values as shown below:

Again, the max heap property is violated because 88>77, so we will swap these two
values as shown below:

Step 7: The next element is 66. To keep a complete binary tree, we will add the
element 66 to the right of 77 as shown below:

In the above figure, we can observe that the tree satisfies the property of the max heap,
since 66<77 and no swap is needed; therefore, it is a heap tree.
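
The seven insertion steps above can be reproduced with a short sketch. It assumes the heap is stored in a 0-based array, so the parent of index i is at (i - 1) // 2; the function name insert is illustrative.

```python
def insert(heap, value):
    """Insert value into a max heap stored as a list, then sift it up."""
    heap.append(value)                 # keeps the tree complete (next free slot)
    i = len(heap) - 1
    while i > 0:
        p = (i - 1) // 2               # index of the parent
        if heap[p] >= heap[i]:
            break                      # max heap property holds, stop
        heap[p], heap[i] = heap[i], heap[p]   # swap child with smaller parent
        i = p


heap = []
for x in [44, 33, 77, 11, 55, 88, 66]:
    insert(heap, x)
print(heap)  # [88, 55, 77, 11, 33, 44, 66]
```

Read level by level, the final array matches the tree reached in Step 7: 88 at the root, 55 and 77 below it, and 11, 33, 44, 66 as leaves. Each insertion performs at most one swap per level, which is where the O(log n) bound comes from.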

Deletion in Heap Tree

In deletion from the heap tree, the root node is always the one deleted, and it is
replaced with the last element; the tree is then heapified to restore the heap property.

Let's understand the deletion through an example.

Step 1: In the above tree, the root node 30 is deleted from the tree and replaced
with the last element, 15, as shown below:

Now we will heapify the tree. We check whether 15 is greater than or equal to each of
its children. 15 is less than 20, so we will swap these two values as shown below:

Again, we will compare 15 with its child. Since 15 is greater than 10, no swapping
will occur.
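
A minimal sketch of deletion with heapify (sift-down) follows. The original figure is not available, so the starting array [30, 20, 18, 10, 15] is an assumed heap chosen to be consistent with the steps described above (root 30, last element 15, children 20 and 10 met on the way down); the value 18 and the function name delete_max are illustrative.

```python
def delete_max(heap):
    """Remove and return the root of a max heap stored as a 0-based list."""
    if not heap:
        raise IndexError("delete from an empty heap")
    top = heap[0]
    last = heap.pop()                  # remove the last element
    if heap:
        heap[0] = last                 # move it to the root
        i, n = 0, len(heap)
        while True:                    # sift down until the heap property holds
            largest = i
            l, r = 2 * i + 1, 2 * i + 2
            if l < n and heap[l] > heap[largest]:
                largest = l
            if r < n and heap[r] > heap[largest]:
                largest = r
            if largest == i:
                break                  # node is >= both children, stop
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return top


heap = [30, 20, 18, 10, 15]            # assumed example heap
removed = delete_max(heap)
print(removed)  # 30
print(heap)     # [20, 15, 18, 10]
```

Swapping with the larger of the two children is what keeps the max heap property intact: the value that moves up is guaranteed to be at least as large as its new sibling.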
