The document discusses the role of algorithms in computing, focusing on time and space complexity analysis, including asymptotic analysis and performance measurement. It explains the concepts of time complexity with examples, detailing how to calculate it using big-O notation, and introduces AVL trees as a type of binary search tree with specific balancing properties. The document also covers insertion and removal operations in AVL trees, highlighting the importance of maintaining balance to ensure efficient performance.

UNIT I ROLE OF ALGORITHMS IN COMPUTING & COMPLEXITY ANALYSIS

Algorithms – Algorithms as a Technology – Time and Space complexity of algorithms – Asymptotic analysis – Average and worst-case analysis – Asymptotic notation – Importance of efficient algorithms – Program performance measurement – Recurrences: The Substitution Method – The Recursion-Tree Method – Data structures and algorithms.

TIME AND SPACE COMPLEXITY OF ALGORITHMS


Generally, there is more than one way to solve a problem in computer science, each using a different algorithm. Therefore, a method is needed to compare the solutions and judge which one is better. The method must:
● be independent of the machine, and its configuration, on which the algorithm runs;
● show a direct correlation with the number of inputs;
● distinguish two algorithms clearly, without ambiguity.

Time Complexity
The time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the input. The time to run is a function of the length of the input, not the actual execution time of the machine on which the algorithm is running. To calculate the time complexity of an algorithm, it is assumed that a constant time c is taken to execute one operation; the total number of operations for an input of length N is then calculated. Consider an example to understand the process of calculation: suppose the problem is to find whether a pair (X, Y) exists in an array A of N elements whose sum is z. The simplest idea is to consider every pair and check whether it satisfies the given condition.
The pseudo-code is as follows:
int a[n];
for (int i = 0; i < n; i++)
    cin >> a[i];
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        if (i != j && a[i] + a[j] == z)
            return true;
return false;
Assume that each operation in the computer takes approximately constant time c. The number of lines of code executed actually depends on the value of z. During analysis of the algorithm, mostly the worst-case scenario is considered, i.e., when there is no pair of elements with sum equal to z. In the worst case:
● N*c operations are required for input.
● The outer loop on i runs N times.
● For each i, the inner loop on j runs N times.

So the total execution time is N*c + N*N*c + c. Now ignore the lower-order terms, since they are relatively insignificant for large input; therefore only the highest-order term is taken (without its constant), which is N*N in this case. Different notations are used to describe the limiting behavior of a function, but since the worst case is taken, big-O notation will be used to represent the time complexity.
Hence, the time complexity is O(N^2) for the above algorithm. Note that the time complexity is based on the number of elements in array A, i.e., the input length, so if the length of the array increases, the time of execution will also increase.
Order of growth is how the time of execution depends on the length of the input.
In the above example, it is clearly evident that the time of execution quadratically
depends on the length of the array. Order of growth will help to compute the running
time with ease.
Another example: let's calculate the time complexity of the algorithm below:
count = 0;
for (int i = N; i > 0; i /= 2)
    for (int j = 0; j < i; j++)
        count++;
It may seem like the complexity is O(N * log N): N for the j loop and log N for the i loop. But that is wrong. Let's see why.
Think about how many times count++ will run.
● When i = N, it will run N times.
● When i = N / 2, it will run N / 2 times.
● When i = N / 4, it will run N / 4 times.
● And so on.
The total number of times count++ runs is N + N/2 + N/4 + … + 1 ≤ 2 * N. So the time complexity is O(N). Some general time complexities are listed below, with the input range for which they are usually accepted in competitive programming:
Input Length   Worst Accepted Time Complexity   Usual Type of Solutions
10-12          O(N!)                            Recursion and backtracking
15-18          O(2^N * N^2)                     Recursion, backtracking, and bit manipulation
18-22          O(2^N * N)                       Recursion, backtracking, and bit manipulation
30-40          O(2^(N/2) * N)                   Meet in the middle, divide and conquer
100            O(N^4)                           Dynamic programming, constructive
400            O(N^3)                           Dynamic programming, constructive
2K             O(N^2 * log N)                   Dynamic programming, binary search, sorting, divide and conquer
10K            O(N^2)                           Dynamic programming, graphs, trees
1M             O(N * log N)                     Sorting, binary search, divide and conquer
100M           O(N), O(log N), O(1)             Constructive, mathematical, greedy algorithms

Space Complexity
The space complexity of an algorithm quantifies the amount of space taken by an algorithm to run as a function of the length of the input. Consider an example: suppose the problem is to find the frequency of array elements.
The pseudo-code is as follows:
int freq[n] = {0};   // assumes all elements of a[] lie in the range [0, n)
int a[n];
for (int i = 0; i < n; i++)
{
    cin >> a[i];
    freq[a[i]]++;
}
Here two arrays of length N and a variable i are used in the algorithm, so the total space used is N * c + N * c + 1 * c = 2N * c + c, where c is a unit of space. For large inputs, the constant c is insignificant, and it can be said that the space complexity is O(N).
There is also auxiliary space, which is different from space complexity. The main difference is that space complexity quantifies the total space used by the algorithm, while auxiliary space quantifies only the extra space used apart from the given input. In the above example, the auxiliary space is the space used by the freq[] array, because that is not part of the given input. So the total auxiliary space is N * c + c, which is O(N) only.
AVL TREES
• Binary Search Trees

• AVL Trees

Binary Search Trees
• A binary search tree is a binary tree T such that
- each internal node v stores an item (k, e) of a dictionary;
- keys stored at nodes in the left subtree of v are less than or equal to k;
- keys stored at nodes in the right subtree of v are greater than or equal to k;
- external nodes do not hold elements but serve as placeholders.

[Figure: an example binary search tree with root 44; 44's children are 17 and 88; 17's right child is 32, whose left child is 28, whose right child is 29; 88's children are 65 and 97; 65's children are 54 and 82; 82's left child is 76, whose right child is 80.]
Search
• The binary search tree T is a decision tree, where the
question asked at an internal node v is whether the
search key k is less than, equal to, or greater than the
key stored at v.
• Pseudocode:
Algorithm TreeSearch(k, v):
Input: A search key k and a node v of a binary search tree T.
Output: A node w of the subtree T(v) of T rooted at v, such that either w is an internal node storing key k, or w is the external node encountered in the inorder traversal of T(v) after all the internal nodes with keys smaller than k and before all the internal nodes with keys greater than k.
if v is an external node then
    return v
if k = key(v) then
    return v
else if k < key(v) then
    return TreeSearch(k, T.leftChild(v))
else { k > key(v) }
    return TreeSearch(k, T.rightChild(v))

Search (cont.)
• A picture:
[Figure: the example tree rooted at 44, with the search paths of find(25) (unsuccessful, ending at an external node) and find(76) (successful) highlighted.]
Insertion in a Binary Search Tree
• Start by calling TreeSearch(k, T.root()) on T. Let w be the node returned by TreeSearch.
• If w is external, we know no item with key k is stored in T. We call expandExternal(w) on T and have w store the item (k, e).
• If w is internal, we know another item with key k is stored at w. We call TreeSearch(k, rightChild(w)) and recursively apply this algorithm to the node returned by TreeSearch.
Insertion in a Binary Search Tree (cont.)
• Insertion of an element with key 78:
[Figure (a): the example tree rooted at 44 before the insertion.]
[Figure (b): the same tree after the insertion; the new node 78 is attached below 80.]
Removal from a Binary Search Tree
• Removal where the key to remove is stored at a node w with an external child:
[Figure (a): the example tree rooted at 44, with w marking the node 32, which has an external child.]
Removal from a Binary Search Tree (cont.)
[Figure (b): the tree after removing 32; its subtree rooted at 28 (with child 29) has been spliced into 32's place.]
Removal from a Binary Search Tree (cont.)
• Removal where the key to remove is stored at a node whose children are both internal:
[Figure (a): the tree with w marking the node 65, both of whose children are internal.]
Removal from a Binary Search Tree (cont.)
[Figure (b): 65 has been replaced by its inorder successor 76, and 76's old position has been spliced out.]
Time Complexity
• Searching, insertion, and removal in a binary search tree are O(h), where h is the height of the tree.
• However, in the worst case, search, insertion, and removal take O(n) time, if the height of the tree is equal to n. Thus, in some cases, searching, insertion, and removal are no better than in a sequence.
• Thus, to prevent the worst case, we need to develop a rebalancing scheme that bounds the height of the tree to O(log n).
AVL Tree
• An AVL tree is a binary search tree T such that for every internal node v of T, the heights of the children of v differ by at most 1.
• An example of an AVL tree where the heights are shown next to the nodes:
[Figure: an AVL tree with root 44 (height 4); its children are 17 (height 2) and 78 (height 3); 17's right child is 32 (height 1); 78's children are 50 (height 2) and 88 (height 1); 50's children are 48 and 62 (height 1 each).]
Height of an AVL Tree
• Proposition: The height of an AVL tree T storing n keys is O(log n).
• Justification: The easiest way to approach this problem is to find the minimum number of internal nodes of an AVL tree of height h: n(h).
• We see that n(1) = 1 and n(2) = 2.
• For h ≥ 3, an AVL tree of height h with n(h) minimal contains the root node, one AVL subtree of height h-1, and another AVL subtree of height h-2.
• i.e. n(h) = 1 + n(h-1) + n(h-2)
• Knowing n(h-1) > n(h-2), we get n(h) > 2n(h-2):
- n(h) > 2n(h-2)
- n(h) > 4n(h-4)
...
- n(h) > 2^i n(h-2i)
• Solving with the base case we get: n(h) ≥ 2^(h/2 - 1)
• Taking logarithms: h < 2 log n(h) + 2
• Thus the height of an AVL tree is O(log n)

Insertion
• A binary search tree T is called balanced if for every node v, the heights of v's children differ by at most one.
• Inserting a node into an AVL tree involves performing expandExternal(w) on T, which changes the heights of some of the nodes in T.
• If an insertion causes T to become unbalanced, we travel up the tree from the newly created node until we find the first node x such that x's grandparent z is an unbalanced node.
• Since z became unbalanced by an insertion in the subtree rooted at its child y,
height(y) = height(sibling(y)) + 2
• To rebalance the subtree rooted at z, we must perform a restructuring:
- we rename x, y, and z to a, b, and c based on the order of the nodes in an inorder traversal;
- z is replaced by b, whose children are now a and c, whose children, in turn, consist of the four other subtrees formerly children of x, y, and z.
Insertion (contd.)
• Example of insertion into an AVL tree:
[Figure: inserting 54 into the AVL tree rooted at 44 unbalances z = 78; on the path from z to the new node, y = 50 and x = 62. After restructuring, x = 62 replaces z as the root of the subtree, with children y = 50 and z = 78; the subtrees T0–T3 are reattached in inorder.]
Restructuring
• The four ways to rotate nodes in an AVL tree, graphically represented:

- Single rotations:
[Figure: a = z, b = y, c = x with subtrees T0–T3; a single rotation makes b = y the root of the subtree, with a = z (subtrees T0, T1) as its left child and c = x (subtrees T2, T3) as its right child.]
[Figure: the mirror case c = z, b = y, a = x; a single rotation again makes b = y the root, with a = x and c = z as its children.]
Restructuring (contd.)

- Double rotations:
[Figure: a = z, c = y, b = x; a double rotation makes b = x the root of the subtree, with a = z (subtrees T0, T1) and c = y (subtrees T2, T3) as its children.]
[Figure: the mirror case c = z, a = y, b = x; a double rotation again makes b = x the root, with a = y and c = z as its children.]
Restructuring (contd.)
• In pseudo-code:

Algorithm restructure(x):
Input: A node x of a binary search tree T that has both a parent y and a grandparent z.
Output: Tree T restructured by a rotation (either single or double) involving nodes x, y, and z.
1. Let (a, b, c) be an inorder listing of the nodes x, y, and z, and let (T0, T1, T2, T3) be an inorder listing of the four subtrees of x, y, and z not rooted at x, y, or z.
2. Replace the subtree rooted at z with a new subtree rooted at b.
3. Let a be the left child of b and let T0, T1 be the left and right subtrees of a, respectively.
4. Let c be the right child of b and let T2, T3 be the left and right subtrees of c, respectively.
Removal
• We can easily see that performing a
removeAboveExternal(w) can cause T to become
unbalanced.
• Let z be the first unbalanced node encountered while
travelling up the tree from w. Also, let y be the child
of z with the larger height, and let x be the child of y
with the larger height.
• We can perform operation restructure(x) to restore
balance at the subtree rooted at z.
• As this restructuring may upset the balance of
another node higher in the tree, we must continue
checking for balance until the root of T is reached.

Removal (contd.)
• Example of deletion from an AVL tree:
[Figure: deleting 32 from the AVL tree rooted at 44 unbalances z = 44; y = 62 is z's taller child and x = 78 is y's taller child. A single rotation makes y = 62 the new root, with children z = 44 (children 17 and 50, where 50's children are 48 and 54) and x = 78 (child 88).]
Removal (contd.)
• Example of deletion from an AVL tree, with the other choice of x:
[Figure: for the same deletion, choosing x = 50 (a child of y = 62 of equal height) leads to a double rotation: x = 50 becomes the new root, with children z = 44 (children 17 and 48) and y = 62 (children 54 and 78, where 78's child is 88).]
Implementation
• A Java-based implementation of an AVL tree requires the following node class:

public class AVLItem extends Item {
    int height;

    AVLItem(Object k, Object e, int h) {
        super(k, e);
        height = h;
    }

    public int height() {
        return height;
    }

    public int setHeight(int h) {
        int oldHeight = height;
        height = h;
        return oldHeight;
    }
}
Implementation (contd.)
public class SimpleAVLTree
        extends SimpleBinarySearchTree
        implements Dictionary {

    public SimpleAVLTree(Comparator c) {
        super(c);
        T = new RestructurableNodeBinaryTree();
    }

    private int height(Position p) {
        if (T.isExternal(p))
            return 0;
        else
            return ((AVLItem) p.element()).height();
    }

    private void setHeight(Position p) { // called only if p is internal
        ((AVLItem) p.element()).setHeight(
            1 + Math.max(height(T.leftChild(p)),
                         height(T.rightChild(p))));
    }
Implementation (contd.)
    private boolean isBalanced(Position p) {
        // test whether node p has balance factor between -1 and 1
        int bf = height(T.leftChild(p)) - height(T.rightChild(p));
        return ((-1 <= bf) && (bf <= 1));
    }

    private Position tallerChild(Position p) {
        // return a child of p with height no smaller
        // than that of the other child
        if (height(T.leftChild(p)) >= height(T.rightChild(p)))
            return T.leftChild(p);
        else
            return T.rightChild(p);
    }
Implementation (contd.)
    private void rebalance(Position zPos) {
        // traverse the path of T from zPos to the root; for each node
        // encountered, recompute its height and perform a rotation
        // if it is unbalanced
        while (!T.isRoot(zPos)) {
            zPos = T.parent(zPos);
            setHeight(zPos);
            if (!isBalanced(zPos)) { // perform a rotation
                Position xPos = tallerChild(tallerChild(zPos));
                zPos = ((RestructurableNodeBinaryTree) T).restructure(xPos);
                setHeight(T.leftChild(zPos));
                setHeight(T.rightChild(zPos));
                setHeight(zPos);
            }
        }
    }
Implementation (contd.)
    public void insertItem(Object key, Object element)
            throws InvalidKeyException {
        super.insertItem(key, element); // may throw an InvalidKeyException
        Position zPos = actionPos;      // start at the insertion position
        T.replace(zPos, new AVLItem(key, element, 1));
        rebalance(zPos);
    }

    public Object remove(Object key)
            throws InvalidKeyException {
        Object toReturn = super.remove(key); // may throw an InvalidKeyException
        if (toReturn != NO_SUCH_KEY) {
            Position zPos = actionPos; // start at the removal position
            rebalance(zPos);
        }
        return toReturn;
    }
}
B-Trees

COL 106
Shweta Agrawal, Amit Kumar

Slide Credit: Yael Moses, IDC Herzliya

Animated demo: http://ats.oka.nu/b-tree/b-tree.html
https://www.youtube.com/watch?v=coRJrcIYbF4
Motivation
• Large differences between access times to disk, cache memory, and core memory
• Minimize expensive accesses (e.g., disk accesses)
• B-tree: a dynamic-set data structure that is optimized for disks
B-Trees
A B-tree is an M-way search tree with the following properties:
1. It is perfectly balanced: every leaf node is at the same depth.
2. Every internal node other than the root is at least half-full, i.e. M/2 - 1 ≤ #keys ≤ M - 1.
3. Every internal node with k keys has k+1 non-null children.

For simplicity we consider M even and we use t = M/2:

2*. Every internal node other than the root is at least half-full, i.e. t-1 ≤ #keys ≤ 2t-1 and t ≤ #children ≤ 2t.
Example: a 4-way B-tree
[Figure: two 4-way trees over the keys 0, 5, 10, 20, 25, 35, 40, 45, 55. In the B-tree, the root holds {20, 40} and its three leaves hold {0, 5, 10}, {25, 35}, and {45, 55}, all at the same depth. In the general 4-way tree, 10 sits in a node of its own one level deeper, so the leaves are not all at the same depth.]
B-tree
1. It is perfectly balanced: every leaf node is at the same depth.
2. Every node, except maybe the root, is at least half-full: t-1 ≤ #keys ≤ 2t-1.
3. Every internal node with k keys has k+1 non-null children.
B-tree Height
Claim: any B-tree with n keys, height h, and minimum degree t satisfies:
h ≤ log_t((n + 1) / 2)

Proof:
• The minimum number of KEYS for a tree with height h is obtained when:
- the root contains one key
- all other nodes contain t-1 keys
B-Tree: Insert X

1. As in an M-way tree, find the leaf node to which X should be added.
2. Add X to this node in the appropriate place among the values already there (there are no subtrees to worry about).
3. Number of values in the node after adding the key:
– At most 2t-1: done
– Equal to 2t: overflowed
4. Fix the overflowed node.
Fix an Overflowed Node
1. Split the node into three parts (M = 2t):
– Left: the first t values become a left child node
– Middle: the next value (at position t, 0-indexed) goes up to the parent
– Right: the last t-1 values become a right child node
2. Continue with the parent:
1. until no overflow occurs in the parent
2. if the root overflows, split it too, and create a new root node

[Figure: splitting an overflowed node y = {60, 65, 68, 83, 86, 90} whose parent x contains … 56, 98 …; in the figure, 68 is promoted, giving x = … 56, 68, 98 … with children y = {60, 65} and z = {83, 86, 90}.]
Insert example (M = 6, t = 3)
[Figure: root {20, 40, 60, 80} with leaves {0, 5, 10, 15}, {25, 35}, {45, 55}, {62, 66, 70, 74, 78}, {87, 98}.]

Insert 3: it goes into the first leaf, giving {0, 3, 5, 10, 15}; no overflow.

Insert 61: it goes into the fourth leaf, giving {61, 62, 66, 70, 74, 78} — six keys, OVERFLOW. SPLIT IT: 70 moves up, giving root {20, 40, 60, 70, 80} with leaves {61, 62, 66} and {74, 78} in its place.
(M = 6, t = 3)
Insert 38: it goes into the second leaf, giving {25, 35, 38}; no overflow.
Insert 4 (M = 6, t = 3): it goes into the first leaf, giving {0, 3, 4, 5, 10, 15} — six keys, OVERFLOW. SPLIT IT: 5 moves up, giving root {5, 20, 40, 60, 70, 80} with leaves {0, 3, 4} and {10, 15} in its place. Now the root itself has six keys — OVERFLOW. SPLIT IT: 60 moves up into a new root, giving root {60} with children {5, 20, 40} and {70, 80}, and the tree grows one level.
Complexity of Insert
• Inserting a key into a B-tree of height h is done in a single pass down the tree and a single pass up the tree.

Complexity: O(h) = O(log_t n)
B-Tree: Delete X
• Delete as in an M-way tree
• A problem:
– it might cause underflow: the number of keys remaining in a node drops below t-1

Recall: the root should have at least 1 value in it, and all other nodes should have at least t-1 values in them.
Underflow Example (M = 6, t = 3)
Delete 87: the last leaf {87, 98} becomes {98} — only one key, UNDERFLOW.
[Figure: the B-tree with root {60}, internal nodes {5, 20, 40} and {70, 80}, and leaves {0, 3, 4}, {10, 15}, {25, 35, 38}, {45, 55}, {61, 62, 66}, {74, 78}, {98}.]
B-Tree: Delete X, k
• Delete as in an M-way tree
• A problem:
– it might cause underflow: the number of keys remaining in a node drops below t-1
• Solution:
– make sure every node that is visited has at least t (instead of t-1) keys
– if it doesn't have t keys,
• (1) either take a key from a sibling via a rotation, or
• (2) merge it with a sibling, pulling a key down from the parent
– if it does have t keys,
• see next slides

Recall: the root should have at least 1 value in it, and all other nodes should have at least t-1 (at most 2t-1) values in them.
B-Tree-Delete(x, k)
1st case: k is in x and x is a leaf → delete k

Example (t = 3), k = 66: the leaf x = {62, 66, 70, 74} becomes {62, 70, 74}. How many keys are left? Three, which is at least t-1, so no underflow.

2nd case: k is in an internal node x. Example (t = 3), k = 50: in x = {30, 50, 70, 90}, the key 50 is replaced by its predecessor 45, which is deleted from the child y = {35, 40, 45}, giving x = {30, 45, 70, 90} and y = {35, 40}.
2nd case cont.:
c. Both (a) and (b) fail: y and z each have only t-1 keys
– merge the two children y and z, pulling k down between them
– recursively delete k from the merged node

Example (t = 3): deleting k = 50 from x = {30, 50, 70, 90} when the children y = {35, 40} and z = {55, 60} around it each have t-1 keys: merge them into {35, 40, 50, 55, 60}, giving x = {30, 70, 90}, and then delete 50 from the merged node.
Questions
• When does the height of the tree shrink?
• Why do we need the number of keys to be at least t
and not t-1 when we proceed down in the tree?
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Delete Complexity
• Basically a downward pass:
– most of the keys are in the leaves: one downward pass
– when deleting a key in an internal node, we may have to go one step up to replace the key with its predecessor or successor

Complexity: O(h) = O(log_t n)

Run Time Analysis of B-Tree Operations
• For a B-tree of order M = 2t:
– #keys in an internal node: at most M-1
– #children of an internal node: between M/2 and M
– ⇒ the depth of a B-tree storing n items is O(log_{M/2} n)
• Find run time:
– O(log M) to binary search which branch to take at each node; since M is constant, this is O(1)
– total time to find an item is O(h * log M) = O(log n)
• Insert & delete:
– similar to find, but updating a node may take O(M) = O(1)

Note: if M > 32, it is worth using binary search at each node.


A typical B-Tree
[Figure: a typical B-tree. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.]
Why B-Trees?

• B-trees are an implementation of dynamic sets that is optimized for disks:
– the memory has a hierarchy, and there is a tradeoff between the size of units/blocks and the access time
– the goal is to minimize the number of accesses to an "expensive-access-time" memory
– the size of a node is determined by characteristics of the disk: block size, page size
– the number of accesses is proportional to the tree depth
