ANALYSIS
Algorithms – Algorithms as a Technology – Time and Space complexity of
algorithms – Asymptotic analysis – Average and worst-case analysis – Asymptotic notation
– Importance of efficient algorithms – Program performance measurement – Recurrences:
The Substitution Method – The Recursion-Tree Method – Data structures and algorithms.
Time Complexity
The time complexity of an algorithm quantifies the amount of time taken by an
algorithm to run as a function of the length of the input. The running time is expressed as
a function of the input length, not as the actual execution time on the machine on which
the algorithm runs. To calculate the time complexity of an algorithm, it is assumed
that each operation takes a constant time c, and the total number of operations for
an input of length N is then counted. Consider an example to understand the process of
calculation: Suppose the problem is to find whether a pair (X, Y) exists in an array A of N
elements whose sum is z. The simplest idea is to consider every pair and check whether it
satisfies the given condition.
The pseudo-code is as follows:
int a[n];
for (int i = 0; i < n; i++)
    cin >> a[i];
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        if (i != j && a[i] + a[j] == z)
            return true;
return false;
Assume that each operation in the computer takes approximately constant
time c. The number of operations executed actually depends on the value of z. During
analysis of the algorithm, the worst-case scenario is usually considered, i.e., when there
is no pair of elements with sum equal to z. In the worst case,
● N*c operations are required for input.
● The outer loop on i runs N times.
● For each i, the inner loop on j runs N times.
So the total execution time is N*c + N*N*c + c. Now ignore the lower-order terms,
since they are relatively insignificant for large input; only the
highest-order term is taken (without its constant), which is N*N in this case. Different
notations are used to describe the limiting behavior of a function, but since the worst
case is taken, big-O notation is used to represent the time complexity.
Hence, the time complexity is O(N^2) for the above algorithm. Note that the time
complexity is based on the number of elements in array A, i.e., the input length, so if the
length of the array increases, the time of execution also increases.
Order of growth is how the time of execution depends on the length of the input.
In the above example, it is clearly evident that the time of execution quadratically
depends on the length of the array. Order of growth will help to compute the running
time with ease.
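The quadratic order of growth can be seen directly by counting basic operations. A minimal Java sketch (the class and method names here are ours, for illustration only):

```java
// Illustrative sketch: count the inner-loop comparisons of the quadratic
// pair check to observe the order of growth directly.
public class GrowthDemo {
    // Returns the number of inner-loop comparisons performed for array a.
    static long pairCheckOps(int[] a, int z) {
        long ops = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a.length; j++) {
                ops++;                       // one comparison per (i, j) pair
                if (i != j && a[i] + a[j] == z)
                    return ops;              // pair found early
            }
        return ops;                          // worst case: N * N comparisons
    }
}
```

In the worst case (no pair sums to z), doubling the input length quadruples the operation count, which is exactly the quadratic order of growth described above.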
Another Example: Let’s calculate the time complexity of the below algorithm:
count = 0;
for (int i = N; i > 0; i /= 2)
    for (int j = 0; j < i; j++)
        count++;
It seems like the complexity is O(N * log N): N for the j loop and log(N) for the i loop.
But that is wrong. Let's see why.
Think about how many times count++ will run.
● When i = N, it will run N times.
● When i = N / 2, it will run N / 2 times.
● When i = N / 4, it will run N / 4 times.
● And so on.
The total number of times count++ runs is N + N/2 + N/4 + … + 1 ≈ 2N (a geometric
series), so the time complexity is O(N). Some general time complexities are listed below
with the input range for which they are usually accepted in competitive programming:
Input Length | Worst Accepted Time Complexity | Usual Type of Solutions
100          | O(N^4)                         | Dynamic programming, Constructive
400          | O(N^3)                         | Dynamic programming, Constructive
100M         | O(N), O(log N), O(1)           | Constructive, Mathematical, Greedy Algorithms
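Returning to the halving-loop example, the ≈ 2N total can be checked by counting iterations directly. A small sketch (names are illustrative, not from the text):

```java
// Sketch verifying that the halving outer loop does about 2N total work.
public class HalvingLoop {
    static long countOps(int n) {
        long count = 0;
        for (int i = n; i > 0; i /= 2)      // i = N, N/2, N/4, ..., 1
            for (int j = 0; j < i; j++)     // inner loop runs i times
                count++;
        return count;                        // N + N/2 + ... + 1 < 2N
    }
}
```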
Space Complexity
The space complexity of an algorithm quantifies the amount of space taken by an
algorithm to run as a function of the length of the input. Consider an example: Suppose
the problem is to find the frequency of array elements.
The pseudo-code is as follows:
int freq[n];
int a[n];
for (int i = 0; i < n; i++)
{
    cin >> a[i];
    freq[a[i]]++;   // assumes 0 <= a[i] < n
}
Here two arrays of length N and a variable i are used in the algorithm, so the total
space used is N * c + N * c + 1 * c = 2N * c + c, where c is the space taken by one unit. For
large inputs, the constant c is insignificant, and it can be said that the space complexity is
O(N). There is also auxiliary space, which is different from space complexity. The main
difference is that space complexity quantifies the total space used by the algorithm, whereas
auxiliary space quantifies only the extra space used by the algorithm apart from the
given input. In the above example, the auxiliary space is the space used by the freq[]
array, because that is not part of the given input. So the total auxiliary space is N * c + c,
which is O(N) only.
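The frequency-count example can be written out in Java to make the auxiliary-space distinction concrete. This sketch assumes, as the pseudo-code does, that every element satisfies 0 <= a[i] < n (the class name is ours):

```java
// Sketch of the frequency-count example: freq[] is auxiliary space
// because it is allocated by the algorithm and is not part of the input.
public class FreqDemo {
    static int[] frequencies(int[] a) {
        int[] freq = new int[a.length];   // auxiliary: O(N) extra space
        for (int x : a)
            freq[x]++;                    // one increment per element
        return freq;
    }
}
```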
AVL TREES
• Binary Search Trees
• AVL Trees
AVL Trees 1
Binary Search Trees
• A binary search tree is a binary tree T such that
- each internal node v stores an item (k, e) of a dictionary.
- keys stored at nodes in the left subtree of v are less than or equal to k.
- keys stored at nodes in the right subtree of v are greater than or equal to k.
- external nodes do not store elements but serve as placeholders.
[Figure: an example binary search tree with root 44, children 17 and 88, and further internal nodes 32, 65, 97, 28, 54, 82, 29, 76, 80.]
Search
• The binary search tree T is a decision tree, where the
question asked at an internal node v is whether the
search key k is less than, equal to, or greater than the
key stored at v.
• Pseudocode:
Algorithm TreeSearch(k, v):
  Input: A search key k and a node v of a binary search tree T.
  Output: A node w of the subtree T(v) of T rooted at v, such that either w is an
    internal node storing key k, or w is the external node encountered in the
    inorder traversal of T(v) after all the internal nodes with keys smaller
    than k and before all the internal nodes with keys greater than k.
  if v is an external node then
    return v
  if k = key(v) then
    return v
  else if k < key(v) then
    return TreeSearch(k, T.leftChild(v))
  else { k > key(v) }
    return TreeSearch(k, T.rightChild(v))
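The pseudocode above can be sketched in Java over a plain node class; the class and field names here are our own, not the slides' tree interface, and null children play the role of the external placeholder nodes:

```java
// Minimal BST search sketch following the TreeSearch pseudocode.
public class BstSearch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }
    // Returns the node storing k, or null (an "external node") if absent.
    static Node treeSearch(int k, Node v) {
        if (v == null) return v;             // external node reached
        if (k == v.key) return v;            // key found at v
        if (k < v.key) return treeSearch(k, v.left);
        return treeSearch(k, v.right);       // k > v.key
    }
    // Small tree from the slides' example: 44 with children 17 and 88.
    static Node demoTree() {
        Node r = new Node(44);
        r.left = new Node(17);
        r.right = new Node(88);
        r.left.right = new Node(32);
        return r;
    }
}
```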
Search (cont.)
• A picture:
[Figure: the example tree, showing the paths traversed by find(25), which ends at an external node, and find(76), which ends at the internal node storing 76.]
Insertion in a Binary Search Tree
• Start by calling TreeSearch(k, T.root()) on T. Let w be the node returned by
TreeSearch.
• If w is external, we know no item with key k is stored in T. We call
expandExternal(w) on T and have w store the item (k, e).
• If w is internal, we know another item with key k is stored at w. We call
TreeSearch(k, rightChild(w)) and recursively apply this algorithm to the node
returned by TreeSearch.
Insertion in a Binary Search Tree (cont.)
• Insertion of an element with key 78:
[Figure: (a) the example tree before the insertion; (b) the same tree after 78 has been inserted as the left child of 80.]
Removal from a Binary Search Tree
• Removal where the key to remove is stored at a node w with an external child:
[Figure: (a) the example tree, with w marking the node storing 32, one of whose children is external.]
Removal from a Binary Search
Tree (cont.)
44
17 88
28 65 97
29 54 82
76
80
78
(b)
AVL Trees 8
Removal from a Binary Search
Tree (cont.)
• Removal where the key to remove is stroed at a node
whose children are both internal:
44
17 88
w 65
28 97
29 54 82
76
80
78
(a)
AVL Trees 9
Removal from a Binary Search Tree (cont.)
[Figure: (b) the tree after removing 65: it has been replaced by 76, the first key after it in an inorder traversal, and 76's old node has been spliced out.]
Time Complexity
• Searching, insertion, and removal in a binary search tree take O(h) time, where
h is the height of the tree.
• However, in the worst case, search, insertion, and removal take O(n) time, if
the height of the tree is equal to n. Thus, in some cases, searching, insertion,
and removal are no better than in a sequence.
• Thus, to prevent the worst case, we need a rebalancing scheme that bounds the
height of the tree by O(log n).
AVL Tree
• An AVL Tree is a binary search tree such that for
every internal node v of T, the heights of the children
of v can differ by at most 1.
• An example of an AVL tree where the heights are shown next to the nodes:
[Figure: an AVL tree with root 44 (height 4); children 17 (height 2) and 78 (height 3); 17 has child 32 (height 1); 78 has children 50 (height 2) and 88 (height 1); 50 has children 48 and 62 (height 1 each).]
Height of an AVL Tree
• Proposition: The height of an AVL tree T storing n
keys is O(log n).
• Justification: The easiest way to approach this problem is to find the minimum
number of internal nodes of an AVL tree of height h: n(h).
• We see that n(1) = 1 and n(2) = 2.
• For h ≥ 3, an AVL tree of height h with n(h) minimal contains the root node,
one AVL subtree of height h-1, and another AVL subtree of height h-2.
• i.e., n(h) = 1 + n(h-1) + n(h-2)
• Knowing n(h-1) > n(h-2), we get n(h) > 2n(h-2). So
- n(h) > 2n(h-2)
- n(h) > 4n(h-4)
...
- n(h) > 2^i n(h-2i)
• Solving the base case we get: n(h) ≥ 2^(h/2 - 1)
• Taking logarithms: h < 2 log n(h) + 2
• Thus the height of an AVL tree is O(log n)
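The recurrence from the justification can be checked numerically with a small sketch (the method names are ours):

```java
// Sketch checking the recurrence n(h) = 1 + n(h-1) + n(h-2) for the
// minimum number of internal nodes of an AVL tree of height h, and the
// lower bound n(h) >= 2^(h/2 - 1) derived above.
public class AvlHeight {
    static long minNodes(int h) {
        if (h == 1) return 1;                 // base case n(1) = 1
        if (h == 2) return 2;                 // base case n(2) = 2
        return 1 + minNodes(h - 1) + minNodes(h - 2);
    }
    static boolean boundHolds(int h) {
        return minNodes(h) >= Math.pow(2, h / 2.0 - 1);
    }
}
```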
Insertion
• A binary search tree T is called balanced if, for every node v, the heights of
v's children differ by at most one.
• Inserting a node into an AVL tree involves
performing an expandExternal(w) on T, which
changes the heights of some of the nodes in T.
• If an insertion causes T to become unbalanced, we travel up the tree from the
newly created node until we find the first node x such that its grandparent z is
an unbalanced node.
• Since z became unbalanced by an insertion in the subtree rooted at its child y,
height(y) = height(sibling(y)) + 2
• To rebalance the subtree rooted at z, we must perform a restructuring:
- we rename x, y, and z to a, b, and c based on the order of the nodes in an
inorder traversal.
- z is replaced by b, whose children are now a and c, whose children, in turn,
consist of the four other subtrees formerly children of x, y, and z.
Insertion (contd.)
• Example of insertion into an AVL tree.
[Figure: inserting 54 (below 62) unbalances the node z = 78; with y = 50 and x = 62, a restructuring replaces z by b = 62, whose children become 50 and 78, and the subtrees T0–T3 are reattached in order, restoring balance.]
Restructuring
• The four ways to rotate nodes in an AVL tree,
graphically represented:
- Single Rotations:
[Figure: the two single-rotation cases; in each, three nodes and their four subtrees T0–T3 are rearranged so the middle key becomes the subtree root, preserving the inorder sequence.]
Restructuring (contd.)
- Double Rotations:
[Figure: the two double-rotation cases; in each, the bottom node is rotated up twice to become the subtree root, with the subtrees T0–T3 reattached in inorder sequence.]
Restructuring (contd.)
• In Pseudo-Code:
Algorithm restructure(x):
  Input: A node x of a binary search tree T that has both a parent y and a
    grandparent z
  Output: Tree T restructured by a rotation (either single or double) involving
    nodes x, y, and z.
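Since the slide gives only the signature, here is one way the trinode restructuring could be sketched over a plain node class. The types and case analysis are our own, not the RestructurableNodeBinaryTree interface used in the implementation later:

```java
// Trinode restructuring sketch: given x, its parent y, and grandparent z,
// rename them a, b, c in inorder order and relink so b becomes the root
// of the subtree, with children a and c; subtrees T0..T3 keep their order.
public class Restructure {
    static class Node {
        int key;
        Node left, right;
        Node(int k) { key = k; }
    }
    // Returns the new subtree root b; the caller reattaches it where z hung.
    static Node restructure(Node x, Node y, Node z) {
        Node a, b, c, t0, t1, t2, t3;
        if (y == z.right && x == y.right) {          // right-right (single)
            a = z; b = y; c = x;
            t0 = z.left; t1 = y.left; t2 = x.left; t3 = x.right;
        } else if (y == z.left && x == y.left) {     // left-left (single)
            a = x; b = y; c = z;
            t0 = x.left; t1 = x.right; t2 = y.right; t3 = z.right;
        } else if (y == z.right && x == y.left) {    // right-left (double)
            a = z; b = x; c = y;
            t0 = z.left; t1 = x.left; t2 = x.right; t3 = y.right;
        } else {                                     // left-right (double)
            a = y; b = x; c = z;
            t0 = y.left; t1 = x.left; t2 = x.right; t3 = z.right;
        }
        a.left = t0; a.right = t1;                   // reattach in inorder
        c.left = t2; c.right = t3;
        b.left = a;  b.right = c;
        return b;
    }
    // Right-right chain 10 -> 20 -> 30; restructuring makes 20 the root.
    static Node demo() {
        Node z = new Node(10), y = new Node(20), x = new Node(30);
        z.right = y; y.right = x;
        return restructure(x, y, z);
    }
}
```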
Removal
• We can easily see that performing a
removeAboveExternal(w) can cause T to become
unbalanced.
• Let z be the first unbalanced node encountered while
travelling up the tree from w. Also, let y be the child
of z with the larger height, and let x be the child of y
with the larger height.
• We can perform operation restructure(x) to restore
balance at the subtree rooted at z.
• As this restructuring may upset the balance of
another node higher in the tree, we must continue
checking for balance until the root of T is reached.
Removal (contd.)
• Example of deletion from an AVL tree:
[Figure: removing 32 leaves z = 44 unbalanced; choosing y = 62 (the taller child of z) and x = 78, a single rotation makes 62 the root of the subtree, with children 44 and 78.]
Removal (contd.)
• Example of deletion from an AVL tree:
[Figure: the same removal of 32, but choosing x = 50 (both children of y = 62 have the same height); a double rotation makes 50 the root of the subtree, with children 44 and 62.]
Implementation
• A Java-based implementation of an AVL tree requires the following node class:
[Figure: the AVLItem class, which stores a key, an element, and a height field, used by the code below.]
Implementation (contd.)
public class SimpleAVLTree
    extends SimpleBinarySearchTree implements Dictionary {

  public SimpleAVLTree(Comparator c) {
    super(c);
    T = new RestructurableNodeBinaryTree();
  }

  private int height(Position p) {
    if (T.isExternal(p))
      return 0;
    else
      return ((AVLItem) p.element()).height();
  }

  private void setHeight(Position p) { // called only if p is internal
    ((AVLItem) p.element()).setHeight(
        1 + Math.max(height(T.leftChild(p)), height(T.rightChild(p))));
  }
Implementation (contd.)
  private void rebalance(Position zPos) {
    // traverse the path of T from zPos to the root; for each node
    // encountered, recompute its height and perform a rotation if it
    // is unbalanced
    while (!T.isRoot(zPos)) {
      zPos = T.parent(zPos);
      setHeight(zPos);
      if (!isBalanced(zPos)) { // perform a rotation
        Position xPos = tallerChild(tallerChild(zPos));
        zPos = ((RestructurableNodeBinaryTree) T).restructure(xPos);
        setHeight(T.leftChild(zPos));
        setHeight(T.rightChild(zPos));
        setHeight(zPos);
      }
    }
  }
Implementation (contd.)
  public void insertItem(Object key, Object element)
      throws InvalidKeyException {
    super.insertItem(key, element); // may throw an InvalidKeyException
    Position zPos = actionPos;      // start at the insertion position
    T.replace(zPos, new AVLItem(key, element, 1));
    rebalance(zPos);
  }
B-Trees
COL 106
Shweta Agrawal, Amit Kumar
[Figure: the same keys 0, 5, 10, 25, 35, 45, 55 stored in a B-tree and in a 4-way tree, side by side.]
B-tree
1. It is perfectly balanced: every leaf node is at the same depth.
2. Every node, except maybe the root, is at least half-full: t-1 ≤ #keys ≤ 2t-1.
3. Every internal node with k keys has k+1 non-null children.
B-tree Height
Claim: any B-tree with n keys, height h, and minimum degree t satisfies:
  h ≤ log_t((n + 1) / 2)
Proof:
• The minimum number of keys for a tree with height h is obtained when:
– The root contains one key
– All other nodes contain t-1 keys
• Counting nodes level by level, this gives n ≥ 2t^h - 1, and rearranging yields
h ≤ log_t((n + 1) / 2).
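The counting argument can be checked numerically with a small sketch (a hypothetical helper of our own, not from the slides):

```java
// Sketch: minimum number of keys in a B-tree of height h and minimum
// degree t, when the root holds 1 key and every other node holds t-1
// keys. The total works out to 2*t^h - 1.
public class BTreeHeight {
    static long minKeys(int t, int h) {
        long keys = 1;                       // the root's single key
        long nodes = 2;                      // 2 nodes at depth 1
        for (int d = 1; d <= h; d++) {
            keys += nodes * (t - 1);         // each non-root node: t-1 keys
            nodes *= t;                      // each node has t children
        }
        return keys;
    }
}
```

For example, with t = 3 and h = 1 the minimum is 2*3 - 1 = 5 keys, matching the claim's bound when rearranged.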
B-Tree: Insert X
[Figure: inserting into a full node: a node x with keys (… 56 98 …) has a full child y = (60 65 68 83 86 90); y is split around its median 68, which moves up into x, giving x = (… 56 68 98 …) with children y = (60 65) and z = (83 86 90).]
Insert example (M = 6; t = 3)
[Figures summarized: the starting tree has root (20 40 60 80) and leaves (0 5 10 15), (25 35 45 55), (62 66 70 74 78), (87 98).]
Insert 3: it fits in the first leaf, which becomes (0 3 5 10 15).
Insert 61: the leaf becomes (61 62 66 70 74 78), an OVERFLOW, so we SPLIT IT: the median 70 moves up, giving root (20 40 60 70 80) and leaves (61 62 66) and (74 78).
Insert 38: it fits in the second leaf, which becomes (25 35 38 45 55).
Insert 4: the first leaf becomes (0 3 4 5 10 15), an OVERFLOW, so we SPLIT IT around its median 5, which moves up; the root (5 20 40 60 70 80) now OVERFLOWS in turn and is SPLIT around its median 60, which becomes the new root with children (5 20 40) and (70 80).
Complexity Insert
• Inserting a key into a B-tree of height h is done in a
single pass down the tree and a single pass up the
tree
Recall: The root should have at least 1 value in it, and all other nodes should
have at least t-1 values in them
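The split step used during insertion can be sketched with a simplified in-memory node. Note that this sketch follows the convention of splitting a node that is already full at 2t-1 keys (as in CLRS), rather than the overflow-then-split at 2t keys shown in the example above; the types are our own:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of splitting a full B-tree child during insertion (t = minimum
// degree). A full node has 2t-1 keys; the median moves up to the parent.
public class BTreeSplit {
    static class BNode {
        List<Integer> keys = new ArrayList<>();
        List<BNode> children = new ArrayList<>(); // empty for leaves
    }
    // Splits full child y (at index i of parent x) into y and a new z,
    // promoting y's median key into x.
    static void splitChild(BNode x, int i, int t) {
        BNode y = x.children.get(i);
        BNode z = new BNode();
        int median = y.keys.get(t - 1);
        z.keys.addAll(y.keys.subList(t, 2 * t - 1));  // z: upper t-1 keys
        y.keys.subList(t - 1, 2 * t - 1).clear();     // y keeps lower t-1
        if (!y.children.isEmpty()) {
            z.children.addAll(y.children.subList(t, 2 * t));
            y.children.subList(t, 2 * t).clear();
        }
        x.keys.add(i, median);                        // median moves up
        x.children.add(i + 1, z);                     // z becomes new child
    }
    // Demo: split a full leaf (10 20 30 40 50) with t = 3 under a parent.
    static BNode demo() {
        BNode x = new BNode(), y = new BNode();
        for (int k : new int[]{10, 20, 30, 40, 50}) y.keys.add(k);
        x.children.add(y);
        splitChild(x, 0, 3);
        return x;   // x = (30) with children (10 20) and (40 50)
    }
}
```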
Underflow Example (M = 6; t = 3)
Delete 87: the leaf (87 98) becomes (98), which has fewer than t-1 = 2 keys: UNDERFLOW.
[Figure: the B-tree with root 60 and internal nodes (5 20 40) and (70 80), before and after the deletion.]
B-Tree: Delete x, k
• Delete as in an M-way tree
• A problem:
– it might cause underflow: the number of keys remaining in a node drops
below t-1
• Solution:
– make sure every node visited on the way down has at least t (instead of
t-1) keys.
– If it doesn't have k
• (1) either take a key from a sibling via a rotate, or
• (2) merge the child with a sibling through the parent
– If it does have k
• See next slides
Recall: The root should have at least 1 value in it, and all other nodes should
have at least t-1 (at most 2t-1) values in them
B-Tree-Delete(x, k)
1st case: k is in x and x is a leaf: delete k.
[Figure: example with t = 3 and k = 66: the leaf x = (62 66 70 74) becomes (62 70 74); a second node y = (35 40 45) with subtrees 5, 6, 7 is also shown, unchanged.]
2nd case cont.:
c. Both a and b are not satisfied: y and z have t-1 keys
– Merge the two children, y and z
– Recursively delete k from the merged node
[Figure: example with t = 3: the key k = 50 in x = (30 50 70 90) is pushed down, merging its children y = (35 40) and z = (55 60) into one node (35 40 50 55 60); x becomes (30 70 90), and the deletion of k continues in the merged node.]
Questions
• When does the height of the tree shrink?
• Why do we need the number of keys to be at least t
and not t-1 when we proceed down in the tree?
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Delete Complexity
• Basically downward pass:
– Most of the keys are in the leaves – one
downward pass
– When deleting a key in internal node – may
have to go one step up to replace the key with
its predecessor or successor
Why B-Tree?