
Lecture Notes for Design and Analysis of Algorithms

What is a problem?

A problem is defined as a relation between a set I of problem instances and a set S of problem solutions.

Example 1: Decision Problem Q: Given a positive integer n, is n divisible by 2?

Instances of Q are I1: "Is 2 divisible by 2?", I2: "Is 3 divisible by 2?", and so forth, and the solution set is S = {yes, no}.
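As a concrete illustration, here is a minimal Python sketch of a decider for Q (the function name is our own, not part of the notes):

def is_divisible_by_two(n: int) -> str:
    """Decide an instance of Q: is the positive integer n divisible by 2?"""
    return "yes" if n % 2 == 0 else "no"

print(is_divisible_by_two(2))  # yes  (instance I1)
print(is_divisible_by_two(3))  # no   (instance I2)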

What is a Data Structure?


A data structure is a way to store and organize data in order to facilitate access and modifications.

Note: No single data structure works well for all purposes, so it is important to know the strengths and limitations of several of them.

Examples: array, stack, queue, and so forth.

What is an Algorithm?
An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. That is, an algorithm is a sequence of computational steps that transform the input into the output.

Examples: sorting algorithms, searching algorithms, and so forth.

What is the Running time of an Algorithm?

The running time of an algorithm on a particular input is the number of steps executed.

Example: The running time of bubble sort, selection sort, and insertion sort is O(n²).
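For reference, here is a minimal bubble sort sketch in Python showing where the O(n²) bound comes from: two nested passes over the input (the function is our own illustration):

def bubble_sort(a):
    """Sort list a in place; the nested loops give the O(n^2) running time."""
    n = len(a)
    for i in range(n - 1):          # n-1 passes over the list
        for j in range(n - 1 - i):  # each pass compares adjacent pairs
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

print(bubble_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]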

Asymptotic Notation
We are studying the asymptotic efficiency of algorithms. That is, we are concerned with how the
running time of an algorithm increases with the size of the input in the limit, as the size of the
input increases without bound.

The asymptotic notations are:

1. Big-Oh notation (O)
2. Big-Omega notation (Ω)
3. Big-Theta notation (Θ)

Big-Oh notation (O): The function f(n) = O(g(n)) iff there exist positive constants n₀ and c such that f(n) ≤ c·g(n) for all n ≥ n₀.

Examples
 3n² + 4n + 1 = O(n²)
 7n³ + 5n² + 6n + 2 = O(n³)
 7n³ + 5n³ + 6n + 2 = O(n³)
Big-Omega Notation (Ω)
The function f(n) = Ω(g(n)) iff there exist positive constants n₀ and c such that f(n) ≥ c·g(n) for all n ≥ n₀.

Examples
 3n² + 4n + 1 = Ω(n²)
 7n³ + 5n² + 6n + 2 = Ω(n³)
 7n³ + 5n³ + 6n + 2 = Ω(n³)
Big-Theta notation (Θ): The function f(n) = Θ(g(n)) iff there exist positive constants n₀, c₁, and c₂ such that c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀.

Examples
 3n² + 4n + 1 = Θ(n²)
 7n³ + 5n² + 6n + 2 = Θ(n³)
 7n³ + 5n³ + 6n + 2 = Θ(n³)
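As a worked check of these definitions (our own illustration): for f(n) = 3n² + 4n + 1, we have 3n² + 4n + 1 ≤ 3n² + 4n² + n² = 8n² for all n ≥ 1, so c = 8 and n₀ = 1 witness f(n) = O(n²); and 3n² + 4n + 1 ≥ 3n² for all n ≥ 1, so c = 3 and n₀ = 1 witness f(n) = Ω(n²). Together these give f(n) = Θ(n²) with c₁ = 3, c₂ = 8, and n₀ = 1.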
Red-Black Trees
Motivation for Red-Black Tree
A set manipulated by algorithms can grow, shrink, or otherwise change over time; we call such sets dynamic.

A binary search tree of height h can implement any of the basic dynamic-set operations (such as Insert, Delete, Minimum, and Maximum) in O(h) time, which is O(n) in the worst case.

Stored keys must satisfy the binary-search-tree property:

 If y is in the left subtree of x, then y.key <= x.key.
 If y is in the right subtree of x, then y.key >= x.key.
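As a small illustration of how this property drives search, here is a minimal Python sketch (our own; the Node class and tree_search name are ours, not from the notes):

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def tree_search(x, k):
    """Return the node with key k in the subtree rooted at x, or None."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right  # the BST property guides the descent
    return x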

[Figures omitted. Figure a: a binary search tree on 6 nodes with height 2. Figure b: a less efficient binary search tree with height 4 that contains the same keys.]

 The set operations are fast if the height of the search tree is small, but if its height is large, their performance may be no better than with a linked list.

Red-black trees (RBTs) are one of many search-tree schemes that are "balanced" in order to guarantee that basic dynamic-set operations take O(lg n) time in the worst case.

A binary search tree is a red-black tree if it satisfies the following properties:

1. Every node is either red or black.
2. Every leaf (NULL pointer) is black. Note: this means every "real" node has two children.
3. If a node is red, both its children are black. Note: there cannot be two consecutive red nodes on a path.
4. Every path from a node to a descendant leaf contains the same number of black nodes.
5. The root is always black.

Definition 1: The height of a node x is the number of edges in a longest path from x down to a leaf.

Definition 2: The black-height bh(x) of a node x is the number of black nodes on the path from x down to a leaf, not counting x itself.

Lemma: A red-black tree with n internal nodes has height O(lg n).

Corollary: These operations take O(lg n) time:

o Minimum(), Maximum()
o Successor(), Predecessor()
o Search()
o Insert() and Delete(): these also take O(lg n) time, but they need special care
since they modify the tree

Red-Black Trees: An Example

Insert 8: Where does it go? What color should it be?

The Problem With Inserting 11

The modified tree may violate the properties of a red-black tree, and we have to
do the following things to restore them:
(a) change the colors of some of the nodes in the tree, and
(b) change the pointer structure through rotation.
Rotation
We have two kinds of rotations: left rotations and right rotations. When we do a left rotation on a node
x, we assume that its right child y is not nil[T]. The left rotation "pivots" around the link from x to y: it
makes y the new root of the subtree, with x as y's left child and y's former left child as x's right
child. The right rotation is symmetric. Both rotations are summarized below.

Left-Rotation (about x, with right child y):
  x keeps its left child
  y keeps its right child
  y's left child becomes x's right child
  x's and y's parents change

Right-Rotation (about y, with left child x):
  x keeps its left child
  y keeps its right child
  x's right child becomes y's left child
  x's and y's parents change

Insertion in Red-Black Tree

Basic steps:

(1) Use Tree-Insert from BST (slightly modified) to insert a node z into T.
 Procedure RB-Insert(T, z).
 Color the node z red.

(2) Fix up the modified tree by re-coloring nodes and performing rotations to preserve the
red-black properties.
 Procedure RB-Insert-Fixup.
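For concreteness, here is a Python sketch of these two steps following the standard CLRS case analysis (our own transcription; it assumes the left_rotate above plus a mirror-image right_rotate, nodes carrying color/parent/left/right fields, and a plain BST insertion helper bst_insert, none of which are given in the notes):

RED, BLACK = "red", "black"

def rb_insert(T, z):
    bst_insert(T, z)             # step (1): ordinary BST insertion (assumed helper)
    z.color = RED                # new nodes start out red
    rb_insert_fixup(T, z)        # step (2): restore the red-black properties

def rb_insert_fixup(T, z):
    # Only property 3 (a red node has black children) can be violated, at z.
    while z.parent is not None and z.parent.color == RED:
        gp = z.parent.parent     # exists: z.parent is red, so it is not the root
        if z.parent is gp.left:
            uncle = gp.right     # may be None; a missing leaf counts as black
            if uncle is not None and uncle.color == RED:
                z.parent.color = BLACK      # case 1: recolor and push the
                uncle.color = BLACK         # violation two levels up
                gp.color = RED
                z = gp
            else:
                if z is z.parent.right:     # case 2: turn an "inner" child
                    z = z.parent            # into case 3 by rotating
                    left_rotate(T, z)
                z.parent.color = BLACK      # case 3: recolor, then rotate
                z.parent.parent.color = RED
                right_rotate(T, z.parent.parent)
        else:                    # mirror image: z.parent is a right child
            uncle = gp.left
            if uncle is not None and uncle.color == RED:
                z.parent.color = BLACK
                uncle.color = BLACK
                gp.color = RED
                z = gp
            else:
                if z is z.parent.left:
                    z = z.parent
                    right_rotate(T, z)
                z.parent.color = BLACK
                z.parent.parent.color = RED
                left_rotate(T, z.parent.parent)
    T.root.color = BLACK         # property 5: the root is always black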
B-Trees
Motivation for B-Tree

(1) Basic operations such as Search, Minimum, Maximum, and Insert take O(n) time in the
worst case on an unbalanced binary search tree, but O(log n) time on a red-black tree.
(2) Data access and modification operations are therefore faster on a red-black tree than
on a plain binary search tree.
(3) B-trees are balanced search trees designed to work well on magnetic disks or other
direct-access secondary storage devices.
(4) B-trees are similar to red-black trees, but they are designed to minimize disk I/O operations.
Many database systems use B-trees to store information.

B-Trees are a variation on binary search trees that allow quick searching in files on disk. Instead of
storing one key and having two children, B-tree nodes have n keys and n+1 children, where n can be
large.

A B-tree is a tree with root root[T] with the following properties:

1. Every node has the following fields:

 n[x], the number of keys currently in node x. For example, n[|40|50|] in the above
example B-tree is 2. n[|70|80|90|] is 3.
 The n[x] keys themselves, stored in nondecreasing order: key1[x] <= key2[x] <= ... <=
keyn[x][x]. For example, the keys in |70|80|90| are ordered.
 leaf[x], a boolean value that is True if x is a leaf and False if x is an internal node.

2. If x is an internal node, it contains n[x]+1 pointers c1, c2, ... , cn[x], cn[x]+1 to its
children. For example, in the above B-tree, the root node has two keys, thus three
children. Leaf nodes have no children so their ci fields are undefined.
3. The keys keyi[x] separate the ranges of keys stored in each subtree: if ki is any key stored
in the subtree with root ci[x], then

k1 <= key1[x] <= k2 <= key2[x] <= ... <= keyn[x][x] <= kn[x]+1.

For example, everything in the far left subtree of the root is numbered less than 30.
Everything in the middle subtree is between 30 and 60, while everything in the far right
subtree is greater than 60. The same property can be seen at each level for all keys in
non-leaf nodes.

4. Every leaf has the same depth, which is the tree's height h. In the above example, h=2.
5. There are lower and upper bounds on the number of keys a node can contain. These
bounds can be expressed in terms of a fixed integer t >= 2 called the minimum degree of
the B-tree:
o Every node other than the root must have at least t-1 keys; thus every internal node
other than the root has at least t children. If the tree is nonempty, the root
must have at least one key.
o Every node can contain at most 2t-1 keys. Therefore, an internal node can have at
most 2t children. We say that a node is full if it contains exactly 2t-1 keys.
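To make the field notation concrete, here is a minimal Python sketch of a node layout matching the fields above (our own; keys and c are 0-indexed lists standing in for key1[x]..keyn[x][x] and c1[x]..cn[x]+1[x]):

class BTreeNode:
    def __init__(self, t, leaf=True):
        self.t = t          # minimum degree of the B-tree
        self.leaf = leaf    # leaf[x]
        self.keys = []      # the keys, kept in nondecreasing order; n[x] == len(self.keys)
        self.c = []         # children pointers (empty for leaves); len(self.c) == len(self.keys) + 1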
Searching a B-tree

Searching a B-tree is much like searching a binary search tree, only the decision whether to go "left" or "right" is replaced by the decision whether to go to child 1, child 2, ..., or child n[x]+1. The following procedure, B-Tree-Search, should be called with the root node as its first parameter. It returns the block where the key k was found along with the index of the key in the block, or null if the key was not found:

B-Tree-Search (x, k)    // search starting at node x for key k
    i = 1
    // search for the correct child
    while i <= n[x] and k > keyi[x] do
        i++
    end while
    // now i is the least index in the key array such that k <= keyi[x],
    // so k will be found here or in the i'th child
    if i <= n[x] and k = keyi[x] then
        return (x, i)    // we found k at this node
    if leaf[x] then
        return null      // nowhere left to look
    Disk-Read (ci[x])    // fetch the child from disk before descending (nodes live on disk)
    return B-Tree-Search (ci[x], k)

Creating an empty B-tree

B-Tree-Create (T)
    x = allocate-node ()
    leaf[x] = True
    n[x] = 0
    Disk-Write (x)
    root[T] = x
This assumes there is an allocate-node function that returns a node with key, c, leaf fields, etc.,
and that each node has a unique "address" on the disk.

Clearly, the running time of B-Tree-Create is O(1), dominated by the time it takes to write the
node to disk.

Inserting a key into a B-tree

Inserting into a B-tree is a bit more complicated than inserting into an ordinary binary search
tree. We have to find a place to put the new key. We would prefer to put it in the root, since that
is kept in RAM and so we don't have to do any disk accesses. If that node is not full (i.e., n[x] for
that node is not 2t-1), then we can just stick the new key in, shift around some pointers and keys,
write the results back to disk, and we're done. Otherwise, we will have to split the root and do
something with the resulting pair of nodes, maintaining the properties of the definition of a B-
tree.
Here is the general algorithm for inserting a key k into a B-tree T. It calls two other
procedures, B-Tree-Split-Child, that splits a node, and B-Tree-Insert-Nonfull, that
handles inserting into a node that isn't full.

B-Tree-Insert (T, k)
r = root[T]
if n[r] = 2t - 1 then
// uh-oh, the root is full, we have to split it
s = allocate-node ()
root[T] = s // new root node
leaf[s] = False // will have some children
n[s] = 0 // for now
c1[s] = r // child is the old root node
B-Tree-Split-Child (s, 1, r) // r is split
B-Tree-Insert-Nonfull (s, k) // s is clearly not full
else
B-Tree-Insert-Nonfull (r, k)
endif
Let's look at the non-full case first: this procedure is called by B-Tree-Insert to insert a key
into a node that isn't full. In a B-tree with a large minimum degree, this is the common case.
Before looking at the pseudocode, let's look at a more English explanation of what's going to
happen:

To insert the key k into the node x, there are two cases:

 x is a leaf node. Then we find where k belongs in the array of keys, shift everything over
to the left, and stick k in there.
 x is not a leaf node. We can't just stick k in because it doesn't have any children; children
are really only created when we split a node, so we don't get an unbalanced tree. We find
a child of x where we can (recursively) insert k. We read that child in from disk. If that
child is full, we split it and figure out which one k belongs in. Then we recursively insert
k into this child (which we know is non-full, because if it were, we would have split it).

Here's the algorithm:


B-Tree-Insert-Nonfull (x, k)
    i = n[x]
    if leaf[x] then
        // shift everything over to the "right" up to the
        // point where the new key k should go
        while i >= 1 and k < keyi[x] do
            keyi+1[x] = keyi[x]
            i--
        end while
        // stick k in its right place and bump up n[x]
        keyi+1[x] = k
        n[x]++
        Disk-Write (x)    // write the modified node back to disk
    else
        // find the child where the new key belongs:
        while i >= 1 and k < keyi[x] do
            i--
        end while
        // if k is in ci[x], then k <= keyi[x] (from the definition);
        // we go back to the last key (least i) where we found this
        // to be true, then read in that child node
        i++
        Disk-Read (ci[x])
        if n[ci[x]] = 2t - 1 then
            // uh-oh, this child node is full, we'll have to split it
            B-Tree-Split-Child (x, i, ci[x])
            // now ci[x] and ci+1[x] are the new children,
            // and keyi[x] may have been changed;
            // see whether k belongs in the first or the second
            if k > keyi[x] then i++
            end if
        end if
        // call ourselves recursively to do the insertion
        B-Tree-Insert-Nonfull (ci[x], k)
    end if
Now let's see how to split a node. When we split a node, we always do it with respect to its
parent; two new nodes appear and the parent has one more child than it did before. Again, let's
see some English before we have to look at the pseudocode:

We will split a node y that is the ith child of its parent x. Node x will end up having one more
child we'll call z, and we'll make room for it in the ci[x] array right next to y.

We know y is full, so it has 2t-1 keys. We'll "cut" y in half, copying keyt+1[y] through key2t-1[y]
into the first t-1 keys of this new node z.

If the node isn't a leaf, we'll also have to copy over the child pointers ct+1[y] through c2t[y] (one
more child than keys) into the first t children of z.

Then we have to shift the keys and children of x over by one, starting at index i+1, to accommodate the
new node z, and then update the n[] counts on x, y, and z, finally writing them to disk.

Here's the pseudocode:

B-Tree-Split-Child (x, i, y)
    z = allocate-node ()
    // the new node is a leaf if the old node was
    leaf[z] = leaf[y]
    // since y is full, the new node must have t-1 keys
    n[z] = t - 1
    // copy over the "right half" of y into z
    for j in 1..t-1 do
        keyj[z] = keyj+t[y]
    end for
    // copy over the child pointers if y isn't a leaf
    if not leaf[y] then
        for j in 1..t do
            cj[z] = cj+t[y]
        end for
    end if
    // having "chopped off" the right half of y, it now has t-1 keys
    n[y] = t - 1
    // shift everything in x over from i+1, then stick the new child in x;
    // y will be half its former self as ci[x] and z will
    // be the other half as ci+1[x]
    for j in n[x]+1 downto i+1 do
        cj+1[x] = cj[x]
    end for
    ci+1[x] = z
    // the keys have to be shifted over as well...
    for j in n[x] downto i do
        keyj+1[x] = keyj[x]
    end for
    // ...to accommodate the new key we're bringing in from the middle of y
    // (if you're wondering where the other key went, since (t-1) + (t-1) = 2t-2,
    // it's coming into x)
    keyi[x] = keyt[y]
    n[x]++
    // finally, write the modified nodes back to disk
    Disk-Write (y)
    Disk-Write (z)
    Disk-Write (x)

Example of Insertion

Let's look at an example of inserting into a B-tree. For preservation of sanity, let t = 2. So a node
is full if it has 2(2)-1 = 3 keys in it, and each node can have up to 4 children. We'll insert the
sequence 5 9 3 7 1 2 8 6 0 4 into the tree:
Step 1: Insert 5
___
|_5_|

Step 2: Insert 9
B-Tree-Insert simply calls B-Tree-Insert-Nonfull, putting 9 to the
right of 5:
_______
|_5_|_9_|

Step 3: Insert 3
Again, B-Tree-Insert-Nonfull is called
___ _______
|_3_|_5_|_9_|

Step 4: Insert 7
The root is full. We allocate a new (empty) node, make it the root, split
the former root, then pull 5 up into the new root:
___
|_5_|
__ / \__
|_3_| |_9_|

Then we insert 7; it goes in with 9:


___
|_5_|
__ / \______
|_3_| |_7_|_9_|

Step 5: Insert 1
It goes in with 3
___
|_5_|
___ __ / \______
|_1_|_3_| |_7_|_9_|

Step 6: Insert 2
It goes in with 3
___
|_5_|
/ \
___ __ /___ \______
|_1_|_2_|_3_| |_7_|_9_|

Step 7: Insert 8
It goes in with 9

___
|_5_|
/ \
___ __ /___ \__________
|_1_|_2_|_3_| |_7_|_8_|_9_|

Step 8: Insert 6
It would go in with |7|8|9|, but that node is full. So we split it,
bringing its middle key up into the root:
_______
|_5_|_8_|
/ | \
___ ____/__ _|_ \__
|_1_|_2_|_3_||_7_| |_9_|

Then insert 6, which goes in with 7:


_______
___|_5_|_8_|__
/ | \
___ ____/__ __|____ \__
|_1_|_2_|_3_| |_6_|_7_| |_9_|

Step 9: Insert 0

0 would go in with |1|2|3|, which is full, so we split it, sending the middle
key up to the root:
___________
|_2_|_5_|_8_|
_/ | | \_
_/ | | \_
_/_ __| |______ \___
|_1_| |_3_| |_6_|_7_| |_9_|

Now we can put 0 in with 1


___________
|_2_|_5_|_8_|
_/ | | \_
_/ | | \_
___ _/_ __| |______ \___
|_0_|_1_| |_3_| |_6_|_7_| |_9_|

Step 10: Insert 4


It would be nice to just stick 4 in with 3, but the B-Tree algorithm
requires us to split the full root. Note that if we didn't do this and
one of the leaves became full, there would be nowhere to put the middle
key of that split, since the root would be full; thus this split of the
root is necessary:
___
|_5_|
___/ \___
|_2_| |_8_|
_/ | | \_
_/ | | \_
___ _/_ __| |______ \___
|_0_|_1_| |_3_| |_6_|_7_| |_9_|

Now we can insert 4, assured that future insertions will work:

___
|_5_|
___/ \___
|_2_| |_8_|
_/ | | \_
_/ | | \_
___ _/_ ___|___ |_______ \____
|_0_|_1_| |_3_|_4_| |_6_|_7_| |_9_|

String Matching Algorithms


String Matching Problem: Given a text T of length n and a pattern P of length m, find
all occurrences of P within T.
Example: T = abcxyxabxyabcxyabcx and P = abc. Then P appears 3 times in T.
A pattern P occurs with shift s in T if P[1..m] = T[s+1..s+m]. The String Matching
Problem is to find all values of s with 0 ≤ s ≤ n - m.

String Matching Problem and Terminology

A string A is a prefix of X if X=A y, for some string y.

Similarly, a string A is a suffix of X, if X=yA, for some string y.


Brute Force Algorithm
Step 1: P is aligned with T at the first index position.
Step 2: P is then compared with T from left to right.
Step 3: If a mismatch occurs, "slide" P to the right by 1 position, and start the
comparison again.
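Here is a direct Python sketch of the brute-force matcher (our own; it returns all 0-based shifts s):

def brute_force_match(T, P):
    """Return every shift s (0-based) with T[s:s+m] == P."""
    n, m = len(T), len(P)
    shifts = []
    for s in range(n - m + 1):      # slide P one position at a time
        if T[s:s+m] == P:           # left-to-right comparison
            shifts.append(s)
    return shifts

print(brute_force_match("abcxyxabxyabcxyabcx", "abc"))  # [0, 10, 15]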
The Rabin-Karp algorithm
The Rabin-Karp string-searching algorithm calculates a numerical (hash) value for the pattern P and
for each m-character substring of the text T. It then compares the numerical values instead of
comparing the actual symbols. If a hash match is found, it compares the pattern with the
substring by the naive approach; otherwise it shifts to the next substring of T to compare with P.
The numerical (hash) values can be computed using Horner's rule. The worst-case time complexity
of the algorithm is O((n - m + 1)m).
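Here is a compact Python sketch of the idea (our own; it uses the usual rolling hash modulo a prime q with base d = 256, and falls back to a naive comparison on every hash hit):

def rabin_karp(T, P, d=256, q=101):
    """Return all 0-based shifts where P occurs in T, using a rolling hash."""
    n, m = len(T), len(P)
    if m > n:
        return []
    h = pow(d, m - 1, q)            # d^(m-1) mod q, used to drop the leading symbol
    p_hash = t_hash = 0
    for i in range(m):              # Horner's rule for the initial hash values
        p_hash = (d * p_hash + ord(P[i])) % q
        t_hash = (d * t_hash + ord(T[i])) % q
    shifts = []
    for s in range(n - m + 1):
        if p_hash == t_hash and T[s:s+m] == P:   # verify on a hash hit (naive check)
            shifts.append(s)
        if s < n - m:               # roll the hash to the next window
            t_hash = (d * (t_hash - ord(T[s]) * h) + ord(T[s + m])) % q
    return shifts

print(rabin_karp("abcxyxabxyabcxyabcx", "abc"))  # [0, 10, 15]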

The Knuth-Morris-Pratt (KMP) Algorithm
Motivation for KMP algorithm

In the Brute-Force algorithm, if a mismatch occurs at P[j] (j > 1), P only slides to the
right by 1 step. This throws away information that the earlier comparisons have already
given us.

When we slide P to the right, shouldn't it land at a place where P could possibly occur in T?
Answer: the next function.

The next function
Given P[1..m], the next function (also written π, the prefix function) is defined as

    π(q) = max{ k : k < q and P[1..k] is a suffix of P[1..q] }

The KMP algorithm is designed around this function: computing π(q) for all q is the
preprocessing step, and it takes O(m) time. The computation is presented as a procedure
called COMPUTE-PREFIX-FUNCTION(P).
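Here is a Python sketch of COMPUTE-PREFIX-FUNCTION (our own transcription of the standard O(m) computation; pi is 0-indexed over P):

def compute_prefix_function(P):
    """pi[q] = length of the longest proper prefix of P that is a suffix of P[:q+1]."""
    m = len(P)
    pi = [0] * m
    k = 0                       # length of the current longest prefix-suffix
    for q in range(1, m):
        while k > 0 and P[k] != P[q]:
            k = pi[k - 1]       # fall back to the next-shorter candidate
        if P[k] == P[q]:
            k += 1
        pi[q] = k
    return pi

print(compute_prefix_function("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]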

The Knuth-Morris-Pratt algorithm is given in pseudocode as the procedure KMP-MATCHER.
KMP-MATCHER(T, P) calls the auxiliary procedure COMPUTE-PREFIX-FUNCTION to compute π,
then scans the text once, for a total running time of O(n + m).
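A Python sketch of the matcher itself (our own transcription of the standard procedure; it reuses compute_prefix_function from above):

def kmp_matcher(T, P):
    """Return all 0-based shifts where P occurs in T, in O(n + m) time."""
    n, m = len(T), len(P)
    pi = compute_prefix_function(P)
    q = 0                        # number of pattern characters matched so far
    shifts = []
    for i in range(n):
        while q > 0 and P[q] != T[i]:
            q = pi[q - 1]        # the prefix function says how far to fall back
        if P[q] == T[i]:
            q += 1
        if q == m:               # all of P matched, ending at position i
            shifts.append(i - m + 1)
            q = pi[q - 1]        # continue, allowing overlapping matches
    return shifts

print(kmp_matcher("abcxyxabxyabcxyabcx", "abc"))  # [0, 10, 15]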

The Floyd-Warshall Algorithm


The Floyd-Warshall algorithm solves the All-Pairs Shortest Paths problem: find the
shortest distances between every pair of vertices in a given edge-weighted directed graph.

Rough Idea:

(1) As a first step, initialize the solution matrix to the input graph's adjacency matrix.
(2) Then update the solution matrix by considering each vertex in turn as an intermediate
vertex: pick the vertices one by one and update all shortest paths that include the
picked vertex as an intermediate vertex.
(3) When vertex k is picked as the intermediate vertex, vertices {1, 2, ..., k-1} have
already been considered as intermediate vertices. For every pair (i, j) of source and
destination vertices, there are two possible cases:

(a) k is not an intermediate vertex on the shortest path from i to j. We keep the value
of d[i][j] as it is.
(b) k is an intermediate vertex on the shortest path from i to j. We update d[i][j]
to d[i][k] + d[k][j] if d[i][j] > d[i][k] + d[k][j].

[Figure omitted: it illustrates this optimal substructure property in the all-pairs shortest-paths problem.]
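Here is a direct Python sketch of the procedure just described (our own; INF marks a missing edge, and the input graph is an n x n weight matrix):

INF = float('inf')

def floyd_warshall(graph):
    """All-pairs shortest paths; graph is an n x n matrix of edge weights (INF = no edge)."""
    n = len(graph)
    dist = [row[:] for row in graph]        # step (1): initialize from the input matrix
    for k in range(n):                      # steps (2)-(3): try each intermediate vertex k
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:   # case (b): through k is shorter
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

g = [[0, 5, INF, 10],
     [INF, 0, 3, INF],
     [INF, INF, 0, 1],
     [INF, INF, INF, 0]]
print(floyd_warshall(g)[0][3])  # 9, via the path 0 -> 1 -> 2 -> 3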
Topological sort
 Depth-first search can be used to perform a topological sort of a Directed Acyclic
Graph (or DAG). A topological sort of a DAG G=(V, E) is a linear ordering of all its
vertices such that if G contains edge (u, v), then u appears before v in the ordering.
 If the graph is cyclic, then no linear ordering is possible.
 A topological sort of a graph can be viewed as an ordering of its vertices along a
horizontal line so that all directed edges go from left to right.
Depth-first Search (DFS)
 Explore edges out of the most recently discovered vertex v.
 When all edges of v have been explored, backtrack to explore other edges leaving the
vertex from which v was discovered (its predecessor).
 “Search as deep as possible first.”
 Continue until all vertices reachable from the original source are discovered.
 If any undiscovered vertices remain, then one of them is chosen as a new source and
search is repeated from that source.

 Input: G = (V, E), directed or undirected. No source vertex is given!

 Output:
» two timestamps on each vertex, each an integer between 1 and 2|V|:
• d[v] = discovery time (v turns from white to gray)
• f[v] = finishing time (v turns from gray to black)
» π[v]: the predecessor of v, i.e. the vertex u such that v was discovered during the scan of u's
adjacency list.
Analysis of DFS
 The loops on lines 1-2 and 5-7 take Θ(V) time, excluding the time to execute DFS-Visit.
 DFS-Visit is called once for each white vertex v ∈ V, when it is first painted gray.
Lines 3-6 of DFS-Visit are executed |Adj[v]| times, so the total cost of executing
DFS-Visit is Σ_{v∈V} |Adj[v]| = Θ(E).
 The total running time of DFS is Θ(V + E).

The following simple algorithm topologically sorts a dag:

TOPOLOGICAL-SORT(G)
1. call DFS(G) to compute finishing times f[v] for each vertex v.
2. as each vertex is finished, insert it onto the front of a linked list
3. return the linked list of vertices

For the example graph (figure omitted), the vertices are arranged from left to right in order of
decreasing finishing time; the topologically sorted order is w, z, u, v, y, x.
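Here is a Python sketch of the DFS-based topological sort just described (our own; the graph is an adjacency-list dict, and each vertex is inserted at the front of the list as it finishes):

def topological_sort(G):
    """G: dict mapping each vertex to a list of its successors (a DAG). Returns a topological order."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in G}
    order = []                          # finished vertices, front of the "linked list"

    def dfs_visit(u):
        color[u] = GRAY                 # u is discovered
        for v in G[u]:
            if color[v] == WHITE:
                dfs_visit(v)
        color[u] = BLACK                # u is finished:
        order.insert(0, u)              # insert it at the front of the list

    for u in G:                         # repeat from every undiscovered vertex
        if color[u] == WHITE:
            dfs_visit(u)
    return order

G = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
print(topological_sort(G))  # ['a', 'c', 'b', 'd'], a valid topological order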
