DAA (Design and Analysis of Algorithms)
Asymptotic Notations,
Dynamic Programming (LCS,
Floyd-Warshall Algorithm,
Matrix Chain Multiplication),
MODULE-IV:
Polynomial Time,
Polynomial-Time Verification,
NP Completeness & reducibility,
NP Completeness proofs,
Cook’s theorem
Asymptotic Notations
Why performance analysis?
There are many important things that should be taken care of, like user friendliness, modularity, security,
maintainability, etc. Why worry about performance?
The answer is simple: we can have all of the above only if we have performance. So performance is like a
currency with which we can buy all of the above things. Another reason for studying performance is that speed is fun!
To summarize, performance == scale. Imagine a text editor that can load 1000 pages but can spell check only 1 page per
minute, or an image editor that takes 1 hour to rotate your image 90 degrees left, or ... you get it. If a software
feature cannot cope with the scale of tasks users need to perform, it is as good as dead.
Given two algorithms for a task, how do we find out which one is better?
One naive way of doing this is to implement both algorithms and run the two programs on your computer for
different inputs and see which one takes less time. There are many problems with this approach to the analysis of
algorithms.
1) It might be possible that for some inputs the first algorithm performs better than the second, and for some inputs
the second performs better.
2) It might also be possible that for some inputs the first algorithm performs better on one machine while the second
works better on another machine for some other inputs.
Asymptotic Analysis is the big idea that handles the above issues in analyzing algorithms. In Asymptotic Analysis, we
evaluate the performance of an algorithm in terms of input size (we don't measure the actual running time). We
calculate how the time (or space) taken by an algorithm increases with the input size.
For example, let us consider the search problem (searching a given item) in a sorted array. One way to search is
Linear Search (order of growth is linear) and other way is Binary Search (order of growth is logarithmic). To
understand how Asymptotic Analysis solves the above mentioned problems in analyzing algorithms, let us say we
run the Linear Search on a fast computer and Binary Search on a slow computer. For small values of input array size
n, the fast computer may take less time. But, after certain value of input array size, the Binary Search will definitely
start taking less time compared to the Linear Search even though the Binary Search is being run on a slow machine.
The reason is that the order of growth of Binary Search with respect to input size is logarithmic while the order of growth of
Linear Search is linear. So the machine-dependent constants can always be ignored after certain values of input size.
Also, in Asymptotic analysis, we always talk about input sizes larger than a constant value. It might be possible that
those large inputs are never given to your software and an algorithm which is asymptotically slower, always
performs better for your particular situation. So, you may end up choosing an algorithm that is Asymptotically slower
but faster for your software.
Analysis of an algorithm is the process of analyzing the problem-solving capability of the algorithm in terms of the time
and space required (the size of memory needed for storage during execution). However, the main concern of analysis of
algorithms is the required time or performance. Generally, we perform the following types of analysis −
• Worst-case − The maximum number of steps taken on any instance of size n.
• Best-case − The minimum number of steps taken on any instance of size n.
• Average case − An average number of steps taken on any instance of size n.
• Amortized − A sequence of operations applied to the input of size n, averaged over time.
We will take an example of Linear Search and analyze it using Asymptotic analysis. Let us consider the following
implementation of Linear Search.
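The body of the search function is not included in the source; a minimal linear search it presumably refers to could be:
def search(arr, n, x):
    # Scan the array from left to right until x is found
    for i in range(n):
        if arr[i] == x:
            return i
    return -1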
# Driver Code
arr = [1, 10, 30, 15]
x = 30
n = len(arr)
print(x, "is present at index", search(arr, n, x))
Output:
30 is present at index 2
The worst case time complexity of the above Linear Search is Θ(n).
Most of the time, we do worst case analysis to analyze algorithms. In the worst case analysis, we guarantee an upper
bound on the running time of an algorithm, which is good information.
The average case analysis is not easy to do in most practical cases and it is rarely done. In the average case
analysis, we must know (or predict) the mathematical distribution of all possible inputs.
The Best Case analysis is bogus. Guaranteeing a lower bound on an algorithm doesn't provide any information, as in
the worst case an algorithm may take years to run.
For some algorithms, all the cases are asymptotically the same, i.e., there are no distinct worst and best cases. For
example, Merge Sort does Θ(nLogn) operations in all cases. Most of the other sorting algorithms have
worst and best cases. For example, in the typical implementation of Quick Sort (where the pivot is chosen as a corner
element), the worst case occurs when the input array is already sorted and the best case occurs when the pivot
always divides the array into two halves. For Insertion Sort, the worst case occurs when the array is reverse sorted and the
best case occurs when the array is sorted in the same order as the output.
The complexity of an algorithm describes the efficiency of the algorithm in terms of the amount of the memory
required to process the data and the processing time.
Complexity of an algorithm is analyzed in two perspectives: Time and Space.
Time Complexity
It’s a function describing the amount of time required to run an algorithm in terms of the size of the input. "Time"
can mean the number of memory accesses performed, the number of comparisons between integers, the number of
times some inner loop is executed, or some other natural unit related to the amount of real time the algorithm will
take.
Space Complexity
It’s a function describing the amount of memory an algorithm takes in terms of the size of input to the algorithm. We
often speak of "extra" memory needed, not counting the memory needed to store the input itself. Again, we use
natural (but fixed-length) units to measure this.
Space complexity is sometimes ignored because the space used is minimal and/or obvious, however sometimes it
becomes as important an issue as time.
Asymptotic Notations
Execution time of an algorithm depends on the instruction set, processor speed, disk I/O speed, etc. Hence, we
estimate the efficiency of an algorithm asymptotically.
If we use Θ notation to represent time complexity of Insertion sort, we have to use two statements for best and
worst cases:
1. The worst case time complexity of Insertion Sort is Θ(n^2).
2. The best case time complexity of Insertion Sort is Θ(n).
Ο Notation
The Big O notation is useful when we only have an upper bound on the time complexity of an algorithm. Many times we
easily find an upper bound simply by looking at the algorithm.
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0 }
Ω Notation
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It measures the best
case time complexity, i.e. the minimum amount of time an algorithm can possibly take to complete.
For example, for a function f(n)
Ω(f(n)) = { g(n) : there exist positive constants c and n0 such that 0 ≤ c*f(n) ≤ g(n) for all n ≥ n0 }
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an algorithm's running
time. It is represented as follows −
θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) for all n ≥ n0 }
A simple way to get the Theta notation of an expression is to drop low order terms and ignore leading constants. For
example, consider the following expression.
3n^3 + 6n^2 + 6000 = Θ(n^3)
Dropping lower order terms is always fine because there will always be an n0 after which Θ(n^3) has higher values than
Θ(n^2), irrespective of the constants involved.
Little ο
Big-Ο is used as a tight upper-bound on the growth of an algorithm's effort (this effort is described by the function
f(n)), even though, as written, it can also be a loose upper-bound. "Little-ο" (ο()) notation is used to describe an
upper-bound that cannot be tight.
Definition: Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is
ο(g(n)) (or f(n) ∈ ο(g(n))) if for any real constant c > 0, there exists an integer constant n0 ≥ 1 such that 0 ≤ f(n) <
c*g(n) for every integer n ≥ n0.
In mathematical relation,
f(n) = o(g(n)) means
lim f(n)/g(n) = 0
n→∞
Examples:
Is 7n + 8 ∈ o(n^2)?
In order for that to be true, for any c, we have to be able to find an n0 that makes
f(n) < c * g(n) asymptotically true.
Let us take some examples. If c = 100, the inequality is clearly true.
If c = 1/100, we'll have to use a little more imagination, but we'll be able to find an n0. (Try n0 = 1000.)
From these examples, the conjecture appears to be correct. Then check the limit:
lim (n→∞) f(n)/g(n) = lim (n→∞) (7n + 8)/n^2 = lim (n→∞) 7/(2n) = 0 (by L'Hôpital's rule)
Hence 7n + 8 ∈ o(n^2).
Little ω
Definition : Let f(n) and g(n) be functions that map positive integers to positive real numbers. We say that f(n) is
ω(g(n)) (or f(n) ∈ ω(g(n))) if for any real constant c > 0, there exists an integer constant n0 ≥ 1 such that f(n) > c * g(n)
≥ 0 for every integer n ≥ n0.
f(n) has a higher growth rate than g(n), so the main difference between Big Omega (Ω) and little omega (ω) lies in their
definitions. In the case of Big Omega, f(n) = Ω(g(n)) and the bound is 0 ≤ c*g(n) ≤ f(n), but in the case of little omega, it is
true that 0 ≤ c*g(n) < f(n).
We use ω notation to denote a lower bound that is not asymptotically tight; and f(n) ∈ ω(g(n)) if and only if g(n) ∈
ο(f(n)).
O(1): The time complexity of a loop is considered O(1) if it runs a constant number of times, independent of the input size. For example:
// Here c is a constant
for (int i = 1; i <= c; i++) {
// some O(1) expressions
}
O(n): Time Complexity of a loop is considered as O(n) if the loop variables is incremented / decremented by a
constant amount. For example following functions have O(n) time complexity.
// Here c is a positive integer constant
for (int i = 1; i <= n; i += c) {
// some O(1) expressions
}
for (int i = n; i > 0; i -= c) {
// some O(1) expressions
}
O(n^c): The time complexity of nested loops is equal to the number of times the innermost statement is executed. For
example, the following sample loops have O(n^2) time complexity (see the sketch below).
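The sample loops themselves are not reproduced in the source; a small Python sketch of such nested loops would be:
n = 100  # any input size
# Two nested loops, each running n times, execute the innermost
# statement n * n times, giving O(n^2)
for i in range(1, n + 1):
    for j in range(1, n + 1):
        # some O(1) expressions
        pass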
O(Logn): The time complexity of a loop is considered O(Logn) if the loop variable is divided / multiplied by a constant
amount, as in the sketch below.
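The corresponding example loop is missing in the source; a sketch of such a loop is:
n, c = 100, 2  # any input size and any constant c > 1
# i is multiplied by a constant factor every iteration, so the loop
# runs about log_c(n) times, giving O(Logn)
i = 1
while i <= n:
    # some O(1) expressions
    i *= c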
O(LogLogn): The time complexity of a loop is considered O(LogLogn) if the loop variable is reduced / increased
exponentially by a constant amount.
// Here c is a constant greater than 1
for (int i = 2; i <=n; i = pow(i, c)) {
// some O(1) expressions
}
//Here fun is sqrt or cuberoot or any other constant root
for (int i = n; i > 1; i = fun(i)) {
// some O(1) expressions
}
O(1) < O(log n) < O (n) < O(n log n) < O(n^2) < O (n^3)< O(2^n) < O(n!)
Aposteriori analysis of an algorithm means we perform the analysis of an algorithm only after running it on a system. It
directly depends on the system and changes from system to system.
In the industry, we cannot perform aposteriori analysis, as the software is generally made for an anonymous user who
runs it on a system different from those present in the industry.
In apriori analysis, we analyze the algorithm before running it on a specific system. This is the reason we use asymptotic
notations to determine time and space complexity: actual running times change from computer to computer, but asymptotically they are the same.
Amortized Analysis
Amortized analysis is generally used for certain algorithms where a sequence of similar operations are performed.
• Amortized analysis provides a bound on the actual cost of the entire sequence, instead of bounding the cost
of sequence of operations separately.
• Amortized analysis differs from average-case analysis; probability is not involved in amortized analysis.
Amortized analysis guarantees the average performance of each operation in the worst case.
It is not just a tool for analysis, it’s a way of thinking about the design, since designing and analysis are closely
related.
Aggregate Method
The aggregate method gives a global view of a problem. In this method, if n operations take worst-case time T(n) in
total, then the amortized cost of each operation is T(n)/n. Though different operations may take different times, in
this method the varying cost is neglected.
Accounting Method
In this method, different charges are assigned to different operations according to their actual cost. If the amortized
cost of an operation exceeds its actual cost, the difference is assigned to the object as credit. This credit helps to pay
for later operations for which the amortized cost is less than the actual cost.
Potential Method
This method represents the prepaid work as potential energy, instead of considering the prepaid work as credit. This
energy can be released to pay for future operations.
Suppose we perform n operations starting with an initial data structure D0. Let ci be the actual cost of the ith operation and Di the
data structure after the ith operation. The potential function Ф maps each data structure Di to a real number Ф(Di), the
associated potential of Di.
Dynamic Table
If the allocated space for the table is not enough, we must copy the table into a larger table. Similarly, if a large
number of members are erased from the table, it is a good idea to reallocate the table with a smaller size.
Using amortized analysis, we can show that the amortized cost of insertion and deletion is constant and unused
space in a dynamic table never exceeds a constant fraction of the total space.
So using Amortized Analysis, we could prove that the Dynamic Table scheme has O(1) insertion time which is a great
result used in hashing.
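As a small illustration of the aggregate method on a dynamic table, the following sketch counts the total element-copy cost of n insertions (the doubling growth factor and the cost counting are assumptions for illustration, not part of the original notes):
def total_append_cost(n):
    # Table doubles its capacity whenever it is full
    capacity, size, cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            cost += size      # copy every existing element into the new table
            capacity *= 2
        cost += 1             # write the new element itself
        size += 1
    return cost

print(total_append_cost(1000) / 1000.0)  # about 2.0, so amortized O(1) per insertion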
Following are few important notes.
1) Amortized cost of a sequence of operations can be seen as expenses of a salaried person. The average monthly
expense of the person is less than or equal to the salary, but the person can spend more money in a particular
month by buying a car or something. In other months, he or she saves money for the expensive month.
2) The above Amortized Analysis done for Dynamic Array example is called Aggregate Method. There are two more
powerful ways to do Amortized analysis called Accounting Method and Potential Method. We will be discussing the
other two methods in separate posts.
3) The amortized analysis doesn’t involve probability. There is also another different notion of average case running
time where algorithms use randomization to make them faster and expected running time is faster than the worst
case running time. These algorithms are analyzed using Randomized Analysis. Examples of these algorithms are
Randomized Quick Sort, Quick Select and Hashing.
Definition
Let M be a deterministic Turing machine (TM) that halts on all inputs. The space complexity of M is the function $f
\colon N \rightarrow N$, where f(n) is the maximum number of tape cells that M scans on any input of length n. If
the space complexity of M is f(n), we can say that M runs in space f(n).
We estimate the space complexity of Turing machine by using asymptotic notation.
Let $f \colon N \rightarrow R^+$ be a function. The space complexity classes can be defined as follows −
SPACE(f(n)) = {L | L is a language decided by an O(f(n)) space deterministic TM}
NSPACE(f(n)) = {L | L is a language decided by an O(f(n)) space non-deterministic TM}
PSPACE is the class of languages that are decidable in polynomial space on a deterministic Turing machine.
In other words, PSPACE = ∪k SPACE(n^k)
Dynamic Programming
Dynamic Programming is the most powerful design technique for solving optimization problems.
Divide & Conquer algorithms partition the problem into disjoint subproblems, solve the subproblems recursively and
then combine their solutions to solve the original problem.
Dynamic Programming is used when the subproblems are not independent, e.g. when they share the same
subsubproblems. In this case, divide and conquer may do more work than necessary, because it solves the same
subproblem multiple times.
Dynamic Programming solves each subproblem just once and stores the result in a table so that it can be
repeatedly retrieved if needed again.
Dynamic Programming is a Bottom-up approach- we solve all possible small problems and then combine to obtain
solutions for bigger problems.
If a problem has optimal substructure, then we can recursively define an optimal solution. If a problem has
overlapping sub problems, then we can improve on a recursive implementation by computing each sub problem only
once.
If a problem doesn't have optimal substructure, there is no basis for defining a recursive algorithm to find the
optimal solutions. If a problem doesn't have overlapping sub problems, we don't have anything to gain by using
dynamic programming.
If the space of subproblems is small enough (i.e. polynomial in the size of the input), dynamic programming can be much
more efficient than plain recursion.
Step 3: Computing the length of an LCS: the algorithm takes two sequences X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn) as inputs. It stores the
c[i, j] values in a table c[0..m, 0..n]. A table b[1..m, 1..n] is maintained which helps us to construct an
optimal solution. c[m, n] contains the length of an LCS of X and Y.
LCS Problem Statement: Given two sequences, find the length of longest subsequence present in both of them. A
subsequence is a sequence that appears in the same relative order, but not necessarily contiguous. For example,
“abc”, “abg”, “bdf”, “aeg”, ‘”acefg”, .. etc are subsequences of “abcdefg”.
In order to find out the complexity of brute force approach, we need to first know the number of possible different
subsequences of a string with length n, i.e., find the number of subsequences with lengths ranging from 1,2,..n-1.
Recall from the theory of permutations and combinations that the number of combinations with 1 element is nC1, the number of
combinations with 2 elements is nC2, and so forth. We know that nC0 + nC1 + nC2 + … + nCn = 2^n. So a string of
length n has 2^n - 1 different possible subsequences, since we do not consider the subsequence with length 0. This
implies that the time complexity of the brute force approach will be O(n * 2^n). Note that it takes O(n) time to check if
a subsequence is common to both the strings. This time complexity can be improved using dynamic programming.
It is a classic computer science problem, the basis of diff (a file comparison program that outputs the differences
between two files), and has applications in bioinformatics.
Examples:
LCS for input Sequences “ABCDGH” and “AEDFHR” is “ADH” of length 3.
LCS for input Sequences “AGGTAB” and “GXTXAYB” is “GTAB” of length 4.
The naive solution for this problem is to generate all subsequences of both given sequences and find the longest
matching subsequence. This solution is exponential in terms of time complexity. Let us see how this problem
possesses both important properties of a Dynamic Programming (DP) problem.
1) Optimal Substructure:
Let the input sequences be X[0..m-1] and Y[0..n-1] of lengths m and n respectively, and let L(X[0..m-1], Y[0..n-1]) be
the length of the LCS of the two sequences X and Y. Following is the recursive definition of L(X[0..m-1], Y[0..n-1]).
If the last characters of both sequences match (or X[m-1] == Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2])
If the last characters of both sequences do not match (or X[m-1] != Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = MAX ( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2]) )
Examples:
1) Consider the input strings "AGGTAB" and "GXTXAYB". The last characters match for the strings, so the length of the LCS can
be written as: L("AGGTAB", "GXTXAYB") = 1 + L("AGGTA", "GXTXAY")
2) Consider the input strings "ABCDGH" and "AEDFHR". The last characters do not match for the strings, so the length of the LCS
can be written as:
L("ABCDGH", "AEDFHR") = MAX ( L("ABCDG", "AEDFHR"), L("ABCDGH", "AEDFH") )
So the LCS problem has optimal substructure property as the main problem can be solved using solutions to
subproblems.
2) Overlapping Subproblems:
Following is simple recursive implementation of the LCS problem. The implementation simply follows the recursive
structure mentioned above.
def lcs(X, Y, m, n):
    if m == 0 or n == 0:
        return 0
    elif X[m-1] == Y[n-1]:
        return 1 + lcs(X, Y, m-1, n-1)
    else:
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n))
Output:
Length of LCS is 4
Time complexity of the above naive recursive approach is O(2^n) in worst case and worst case happens when all
characters of X and Y mismatch i.e., length of LCS is 0.
Considering the above implementation, following is a partial recursion tree for input strings “AXYT” and “AYZX”
lcs("AXYT", "AYZX")
/
lcs("AXY", "AYZX") lcs("AXYT", "AYZ")
/ /
lcs("AX", "AYZX") lcs("AXY", "AYZ") lcs("AXY", "AYZ") lcs("AXYT", "AY")
In the above partial recursion tree, lcs(“AXY”, “AYZ”) is being solved twice. If we draw the complete recursion tree,
then we can see that there are many subproblems which are solved again and again. So this problem has the
Overlapping Subproblems property and recomputation of the same subproblems can be avoided by either using
Memoization or Tabulation. Following is a tabulated implementation for the LCS problem.
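The tabulated implementation itself is not reproduced in the source; a standard bottom-up sketch (the function name lcs and table name L are illustrative) could be:
def lcs(X, Y):
    m, n = len(X), len(Y)
    # L[i][j] holds the length of the LCS of X[0..i-1] and Y[0..j-1]
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[m][n]

print("Length of LCS is", lcs("AGGTAB", "GXTXAYB"))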
Output:
Length of LCS is 4
Time Complexity of the above implementation is O(mn) which is much better than the worst-case time complexity of
Naive Recursive implementation.
Following is detailed algorithm to print the LCS. It uses the same 2D table L[][].
1) Construct L[m+1][n+1] using the steps discussed above.
2) The value L[m][n] contains length of LCS. Create a character array lcs[] of length equal to the length of lcs plus 1
(one extra to store \0).
3) Traverse the 2D array starting from L[m][n]. Do the following for every cell L[i][j]:
…..a) If characters (in X and Y) corresponding to L[i][j] are same (Or X[i-1] == Y[j-1]), then include this character as
part of LCS.
…..b) Else compare values of L[i-1][j] and L[i][j-1] and go in direction of greater value.
The following table (taken from Wiki) shows steps (highlighted) followed by the above algorithm.
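The function used by the driver below is not shown in the source; a sketch that follows the steps above and prints the LCS string could be:
def lcs(X, Y, m, n):
    # Step 1: build the length table L[0..m][0..n]
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # Step 2: walk back from L[m][n], collecting matching characters
    index = L[m][n]
    result = [""] * index
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            result[index - 1] = X[i - 1]
            i -= 1
            j -= 1
            index -= 1
        elif L[i - 1][j] > L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    print("LCS of " + X + " and " + Y + " is " + "".join(result))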
# Driver program
X = "AGGTAB"
Y = "GXTXAYB"
m = len(X)
n = len(Y)
lcs(X, Y, m, n)
Floyd-Warshall Algorithm
• Floyd-Warshall Algorithm is an algorithm for solving All Pairs Shortest path problem which gives the shortest
path between every pair of vertices of the given graph.
• Floyd-Warshall Algorithm is an example of dynamic programming.
• The main advantage of Floyd-Warshall Algorithm is that it is extremely simple and easy to implement.
Algorithm-
Create a |V| x |V| matrix
For each cell (i,j) in M do-
if i = = j
M[ i ][ j ] = 0 // For all diagonal elements, value = 0
if (i , j) is an edge in E
M[ i ][ j ] = weight(i,j) // If there exists a direct edge between the vertices, value = weight of edge
else
M[ i ][ j ] = infinity // If there is no direct edge between the vertices, value = ∞
for k from 1 to |V|
for i from 1 to |V|
for j from 1 to |V|
if M[ i ][ j ] > M[ i ][ k ] + M[ k ][ j ]
M[ i ][ j ] = M[ i ][ k ] + M[ k ][ j ]
Time Complexity-
• Floyd-Warshall Algorithm is best suited for dense graphs since its complexity depends only on the number of
vertices in the graph.
• For sparse graphs, Johnson’s Algorithm is more suitable.
Example: Input:
graph[][] = { {0, 5, INF, 10},
{INF, 0, 3, INF},
{INF, INF, 0, 1},
{INF, INF, INF, 0} }
Output:
Shortest distance matrix
0 5 8 9
INF 0 3 4
INF INF 0 1
INF INF INF 0
The following figure shows the above optimal substructure property in the all-pairs shortest path problem.
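The program that produces the output below is not included in the source; a minimal Floyd-Warshall implementation consistent with the input above (with INF as a large sentinel value) could be:
INF = 99999

def floydWarshall(graph):
    V = len(graph)
    # dist starts as a copy of the input adjacency matrix
    dist = [row[:] for row in graph]
    # Try every vertex k as an intermediate vertex in turn
    for k in range(V):
        for i in range(V):
            for j in range(V):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    print("Following matrix shows the shortest distances between every pair of vertices")
    for row in dist:
        print(" ".join("INF" if x == INF else str(x) for x in row))

graph = [[0, 5, INF, 10],
         [INF, 0, 3, INF],
         [INF, INF, 0, 1],
         [INF, INF, INF, 0]]
floydWarshall(graph)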
Output:
Following matrix shows the shortest distances between every pair of vertices
0 5 8 9
INF 0 3 4
INF INF 0 1
INF INF INF 0
Time Complexity: O(V^3)
The above program only prints the shortest distances. We can modify the solution to print the shortest paths also by
storing the predecessor information in a separate 2D matrix.
Also, the value of INF can be taken as INT_MAX from limits.h to make sure that we handle maximum possible value.
When we take INF as INT_MAX, we need to change the if condition in the above program to avoid arithmetic
overflow.
Problem-
Consider the following directed weighted graph-
graph
Using Floyd-Warshall Algorithm, find the shortest path distance between every pair of vertices.
Solution-
Step-01:
• Remove all the self loops and parallel edges (keeping the edge with lowest weight) from the graph if any.
• In our case, we don’t have any self edge and parallel edge.
Step-02:
Now, write the initial distance matrix representing the distance between every pair of vertices as mentioned in the
given graph in the form of weights.
• For diagonal elements (representing self-loops), value = 0
• For vertices having a direct edge between them, value = weight of that edge
• For vertices having no direct edges between them, value = ∞
Step-03:
From step-03, we will start our actual solution.
NOTE
• Since, we have total 4 vertices in our given graph, so we will have total 4 matrices of order 4 x 4 in
our solution. (excluding initial distance matrix)
• Diagonal elements of each matrix will always be 0.
The last matrix D4 represents the shortest path distance between every pair of vertices.
Matrix Chain Multiplication
It is a Method under Dynamic Programming in which previous output is taken as input for next.
Here, Chain means one matrix's column is equal to the second matrix's row [always].
In general:
If A = [aij] is a p x q matrix
and B = [bij] is a q x r matrix,
then C = A x B = [cij] is a p x r matrix,
where cij = ai1*b1j + ai2*b2j + ... + aiq*bqj, i.e. each entry of C is computed with q scalar multiplications.
Given following matrices {A1,A2,A3,...An} and we have to perform the matrix multiplication, which can be
accomplished by a series of matrix multiplications
A1 x A2 x A3 x ... x An
Matrix multiplication is associative, but not commutative. By this, we mean that we have to follow
the above matrix order for multiplication, but we are free to parenthesize the above multiplication depending upon
our need.
It can be observed that the total number of entries in matrix 'C' is pr, as the matrix is of dimension p x r. Also, each entry takes O
(q) time to compute; thus the total time to compute all entries of the matrix 'C', which is the product of
'A' and 'B', is proportional to the product of the dimensions p q r.
It is also noticed that we can save the number of operations by reordering the parenthesis.
Given a sequence of matrices, find the most efficient way to multiply these matrices together. The problem is not
actually to perform the multiplications, but merely to decide in which order to perform the multiplications.
We have many options to multiply a chain of matrices because matrix multiplication is associative. In other words,
no matter how we parenthesize the product, the result will be the same. For example, if we had four matrices A, B,
C, and D, we would have: (ABC)D = (AB)(CD) = A(BCD) = ....
However, the order in which we parenthesize the product affects the number of simple arithmetic operations
needed to compute the product, or the efficiency. For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix,
and C is a 5 × 60 matrix. Then,
(AB)C needs (10 × 30 × 5) + (10 × 5 × 60) = 1500 + 3000 = 4500 scalar multiplications, while
A(BC) needs (30 × 5 × 60) + (10 × 30 × 60) = 9000 + 18000 = 27000 scalar multiplications.
Clearly, the first parenthesization requires far fewer operations.
Given an array p[] which represents the chain of matrices such that the ith matrix Ai is of dimension p[i-1] x p[i]. We
need to write a function MatrixChainOrder() that should return the minimum number of multiplications needed to
multiply the chain.
Optimal Substructure:
A simple solution is to place parenthesis at all possible places, calculate the cost for each placement and return the
minimum value. In a chain of matrices of size n, we can place the first set of parenthesis in n-1 ways. For example, if
the given chain is of 4 matrices. let the chain be ABCD, then there are 3 ways to place first set of parenthesis outer
side: (A)(BCD), (AB)(CD) and (ABC)(D). So when we place a set of parenthesis, we divide the problem into
subproblems of smaller size. Therefore, the problem has optimal substructure property and can be easily solved
using recursion.
Minimum number of multiplication needed to multiply a chain of size n = Minimum of all n-1 placements (these
placements create subproblems of smaller size)
2) Overlapping Subproblems
Following is a recursive implementation that simply follows the above optimal substructure property.
# A naive recursive implementation that
# simply follows the above optimal
# substructure property
import sys

def MatrixChainOrder(p, i, j):
    if i == j:
        return 0
    _min = sys.maxsize
    for k in range(i, j):  # try each split position
        count = (MatrixChainOrder(p, i, k) + MatrixChainOrder(p, k + 1, j)
                 + p[i - 1] * p[k] * p[j])
        _min = min(_min, count)
    return _min
The time complexity of the above naive recursive approach is exponential. It should be noted that the above function
computes the same subproblems again and again. See the following recursion tree for a matrix chain of size 4. The
function MatrixChainOrder(p, 3, 4) is called two times. We can see that there are many subproblems being called
more than once.
Following is the implementation for Matrix Chain Multiplication problem using Dynamic Programming.
# Dynamic Programming Python implementation of Matrix
# Chain Multiplication. See the Cormen book for details
# of the following algorithm
import sys

def MatrixChainOrder(p, n):
    # m[i][j] = minimum number of scalar multiplications needed
    # to compute the product of matrices A[i]..A[j]
    m = [[0 for _ in range(n)] for _ in range(n)]

    # L is chain length.
    for L in range(2, n):
        for i in range(1, n-L+1):
            j = i+L-1
            m[i][j] = sys.maxsize
            for k in range(i, j):
                # q = cost/scalar multiplications
                q = m[i][k] + m[k+1][j] + p[i-1]*p[k]*p[j]
                if q < m[i][j]:
                    m[i][j] = q
    return m[1][n-1]
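The driver is not shown in the source; given the result 18, it presumably used the chain dimensions [1, 2, 3, 4] (matrices of sizes 1x2, 2x3 and 3x4):
arr = [1, 2, 3, 4]
size = len(arr)
print("Minimum number of multiplications is", MatrixChainOrder(arr, size))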
Output:
Minimum number of multiplications is 18
Time Complexity: O(n^3)
Auxiliary Space: O(n^2)
Greedy Algorithm
"Greedy Method finds out of many options, but you have to choose the best option."
In this method, we have to find out the best method/option out of many present ways.
In this approach/method we focus on the first stage and decide the output, don't think about the future.
This method may or may not give the best output. A greedy algorithm solves problems by making the choice that
seems best at the particular moment. Many optimization problems can be solved using a greedy algorithm.
Some issues have no efficient solution, but a greedy algorithm may provide a solution that is close to optimal. A
greedy algorithm works if a problem exhibits the following two properties:
1. Greedy Choice Property: A globally optimal solution can be reached by making locally optimal choices.
In other words, an optimal solution can be obtained by making "greedy" choices.
2. Optimal substructure: Optimal solutions contain optimal subsolutions. In other words, answers to
subproblems of an optimal solution are optimal.
Areas of Application
Greedy approach is used to solve many problems, such as
• Finding the shortest path between two vertices using Dijkstra’s algorithm.
• Finding the minimal spanning tree in a graph using Prim’s /Kruskal’s algorithm, etc.
Where Greedy Approach Fails
In many problems, a greedy algorithm fails to find an optimal solution; moreover, it may produce a very poor solution.
Problems like Travelling Salesman and 0-1 Knapsack cannot be solved optimally using this approach.
We define the shortest-path weight from u to v by δ(u, v) = min { w(p) : p is a path from u to v } if there is a path from u to v, and
δ(u, v) = ∞ otherwise.
The shortest path from vertex s to vertex t is then defined as any path p with weight w(p) = δ(s, t).
The breadth-first- search algorithm is the shortest path algorithm that works on unweighted graphs, that is, graphs
in which each edge can be considered to have unit weight.
In a Single Source Shortest Paths Problem, we are given a Graph G = (V, E), we want to find the shortest path from a
given source vertex s ∈ V to every vertex v ∈ V.
Variants: the single-source problem has single-destination, single-pair and all-pairs counterparts.
Some useful properties of shortest paths:
1. Optimal substructure: Let P1 be the x - y subpath of a shortest s - v path, and let P2 be any x - y path. Then cost of P1 ≤ cost of P2;
otherwise P would not be a shortest s - v path.
2. Triangle inequality: Let d (v, w) be the length of shortest path from v to w. Then,
d (v, w) ≤ d (v, x) + d (x, w)
3. Upper-bound property: We always have d[v] ≥ δ(s, v) for all vertices v ∈ V, and once d[v] achieves the value δ(s,
v), it never changes.
4. No-path property: If there is no path from s to v, then we always have d[v] = δ(s, v) = ∞.
5. Convergence property: If s → u → v is a shortest path in G for some u, v ∈ V, and if d[u] = δ(s, u) at any time prior to
relaxing edge (u, v), then d[v] = δ(s, v) at all times thereafter.
Relaxation
The single-source shortest paths algorithms are based on a technique known as relaxation, a method that repeatedly
decreases an upper bound on the actual shortest-path weight of each vertex until the upper bound equals the
shortest-path weight. For each vertex v ∈ V, we maintain an attribute d [v], which is an upper bound on the weight
of the shortest path from source s to v. We call d [v] the shortest-path estimate.
INITIALIZE - SINGLE - SOURCE (G, s)
1. for each vertex v ∈ V [G]
2. do d [v] ← ∞
3. π [v] ← NIL
4. d [s] ← 0
After initialization, π [v] = NIL for all v ∈ V, d [v] = 0 for v = s, and d [v] = ∞ for v ∈ V - {s}.
The process of relaxing an edge (u, v) consists of testing whether we can improve the shortest path to v found
so far by going through u and, if so, updating d [v] and π [v]. A relaxation step may decrease the value of the shortest-
path estimate d [v] and update v's predecessor field π [v].
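In code, the relaxation step described above can be sketched as follows (a minimal illustration; the containers d, pi and the weight map w are assumed to have been set up by INITIALIZE-SINGLE-SOURCE):
def relax(u, v, w, d, pi):
    # If the path to v through u improves the current estimate,
    # update v's shortest-path estimate and predecessor
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
        pi[v] = u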
Dijkstra's Algorithm
It is a greedy algorithm that solves the single-source shortest path problem for a directed graph G = (V, E) with
nonnegative edge weights, i.e., w (u, v) ≥ 0 for each edge (u, v) ∈ E.
Dijkstra's Algorithm maintains a set S of vertices whose final shortest-path weights from the source s have already
been determined. That is, for all vertices v ∈ S, we have d [v] = δ (s, v). The algorithm repeatedly selects the vertex u ∈
V - S with the minimum shortest-path estimate, inserts u into S and relaxes all edges leaving u.
Because it always chooses the "lightest" or "closest" vertex in V - S to insert into set S, it is called the greedy
strategy.
Algorithm Steps:
• Set all vertices' distances = infinity except for the source vertex; set the source distance = 0.
• Push the source vertex into a min-priority queue in the form (distance, vertex), as the comparison in the min-
priority queue will be according to vertices' distances.
• Pop the vertex with the minimum distance from the priority queue (at first the popped vertex = source).
• Update the distances of the vertices connected to the popped vertex if "current vertex distance +
edge weight < next vertex distance", and then push the vertex with the new distance into the priority queue.
• If the popped vertex has been visited before, just continue without using it.
• Apply the same algorithm again until the priority queue is empty.
The set sptSet is initially empty and distances assigned to vertices are
{0, INF, INF, INF, INF, INF, INF, INF} where INF indicates infinite. Now pick the vertex
with minimum distance value. The vertex 0 is picked, include it in sptSet. So sptSet
becomes {0}. After including 0 to sptSet, update distance values of its adjacent vertices.
Adjacent vertices of 0 are 1 and 7. The distance values of 1 and 7 are updated as 4 and 8.
Following subgraph shows vertices and their distance values, only the vertices with finite
distance values are shown. The vertices included in SPT are shown in green colour.
Pick the vertex with minimum distance value and not already included in
SPT (not in sptSET). The vertex 1 is picked and added to sptSet. So sptSet
now becomes {0, 1}. Update the distance values of adjacent vertices of 1.
The distance value of vertex 2 becomes 12.
Pick the vertex with minimum distance value and not already included in
SPT (not in sptSET). Vertex 7 is picked. So sptSet now becomes {0, 1, 7}.
Update the distance values of adjacent vertices of 7. The distance value
of vertex 6 and 8 becomes finite (15 and 9 respectively).
Pick the vertex with minimum distance value and not already included in
SPT (not in sptSET). Vertex 6 is picked. So sptSet now becomes {0, 1, 7, 6}.
Update the distance values of adjacent vertices of 6. The distance value
of vertex 5 and 8 are updated.
We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we get the following Shortest
Path Tree (SPT).
We use a boolean array sptSet[] to represent the set of vertices included in SPT. If a value sptSet[v] is true, then
vertex v is included in SPT, otherwise not. Array dist[] is used to store shortest distance values of all vertices.
# Python program for Dijkstra's single
# source shortest path algorithm. The program is
# for adjacency matrix representation of the graph
# Library for INT_MAX
import sys
class Graph():

    def __init__(self, vertices):
        self.V = vertices
        self.graph = [[0 for column in range(vertices)]
                      for row in range(vertices)]
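    # The source omits the remaining methods of this class; the
    # minDistance(), printSolution() and dijkstra() methods below are a
    # sketch consistent with the adjacency-matrix representation above
    # and the output listed after the driver.
    def minDistance(self, dist, sptSet):
        # Find the vertex with minimum distance value among the
        # vertices not yet included in the shortest path tree
        min_val = sys.maxsize
        min_index = -1
        for v in range(self.V):
            if dist[v] < min_val and not sptSet[v]:
                min_val = dist[v]
                min_index = v
        return min_index

    def printSolution(self, dist):
        print("Vertex Distance from Source")
        for node in range(self.V):
            print(node, "\t", dist[node])

    def dijkstra(self, src):
        dist = [sys.maxsize] * self.V
        dist[src] = 0
        sptSet = [False] * self.V
        for _ in range(self.V):
            # Pick the minimum distance vertex not yet processed
            u = self.minDistance(dist, sptSet)
            sptSet[u] = True
            # Relax the edges leaving u
            for v in range(self.V):
                if (self.graph[u][v] > 0 and not sptSet[v]
                        and dist[v] > dist[u] + self.graph[u][v]):
                    dist[v] = dist[u] + self.graph[u][v]
        self.printSolution(dist)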
# Driver program
g = Graph(9)
g.graph = [[0, 4, 0, 0, 0, 0, 0, 8, 0],
[4, 0, 8, 0, 0, 0, 0, 11, 0],
[0, 8, 0, 7, 0, 4, 0, 0, 2],
[0, 0, 7, 0, 9, 14, 0, 0, 0],
[0, 0, 0, 9, 0, 10, 0, 0, 0],
[0, 0, 4, 14, 10, 0, 2, 0, 0],
[0, 0, 0, 0, 0, 2, 0, 1, 6],
[8, 11, 0, 0, 0, 0, 1, 0, 7],
[0, 0, 2, 0, 0, 0, 6, 7, 0]
];
g.dijkstra(0);
Output:
Vertex Distance from Source
0 0
1 4
2 12
3 19
4 21
5 11
6 9
7 8
8 14
Notes:
1) The code calculates the shortest distances, but doesn't calculate the path information. We can create a parent array,
update the parent array when a distance is updated (as in Prim's implementation) and use it to show the shortest path
from the source to different vertices.
2) The code is for an undirected graph; the same dijkstra function can be used for directed graphs also.
3) The code finds the shortest distances from the source to all vertices. If we are interested only in the shortest distance from
the source to a single target, we can break the for loop when the picked minimum distance vertex is equal to the
target (Step 3.a of the algorithm).
4) Time Complexity of the implementation is O(V^2). If the input graph is represented using an adjacency list, it can be
reduced to O(E log V) with the help of a binary heap.
The running time of this algorithm is determined by line 1 and by the for loop of lines 3 - 5. The topological sort can be
implemented in Θ(V + E) time. In the for loop of lines 3 - 5, as in Dijkstra's algorithm, there is one repetition per
vertex. For each vertex, the edges that leave the vertex are each examined exactly once. Unlike Dijkstra's algorithm,
we use only O (1) time per edge. The running time is thus Θ(V + E), which is linear in the size of an adjacency-list
representation of the graph.
Example:
Step1: To topologically sort vertices apply DFS (Depth First Search) and then arrange vertices in linear order by
decreasing order of finish time.
Now, take each vertex in topologically sorted order and relax each edge.
1. adj [s] →t, x
2. 0 + 3 < ∞
3. d [t] ← 3
4. 0 + 2 < ∞
5. d [x] ← 2
1. adj [t] → r, x
2. 3+1<∞
3. d [r] ← 4
4. 3 + 5 > 2, so d [x] is unchanged
1. adj [x] → y
2. 2 - 3 < ∞
3. d [y] ← -1
1. adj [y] → r
2. -1 + 4 < 4
3. 3 <4
4. d [r] ← 3
Thus the Shortest Path is:
1. s to x is 2
2. s to y is -1
3. s to t is 3
4. s to r is 3
Bellman-Ford Algorithm
Solves the single-source shortest path problem in which edge weights may be negative but no negative cycle exists.
This algorithm works correctly when some of the edges of the directed graph G may have negative weight. When
there are no cycles of negative weight, then we can find out the shortest path between source and destination.
It is slower than Dijkstra's Algorithm but more versatile, as it is capable of handling negative weight edges.
This algorithm detects negative cycles in a graph and reports their existence.
It is based on the "Principle of Relaxation", in which more accurate values gradually approximate the
proper distance until eventually reaching the optimum solution.
Given a weighted directed graph G = (V, E) with source s and weight function w: E → R, the Bellman-Ford algorithm
returns a Boolean value indicating whether or not there is a negative-weight cycle that is reachable from the source.
If there is such a cycle, the algorithm indicates that no solution exists; if there is no such cycle, the algorithm produces the
shortest paths and their weights. The algorithm returns TRUE if and only if the graph contains no negative-weight cycles
that are reachable from the source.
Recurrence Relation
dist_k[u] = min( dist_(k-1)[u], min over i ( dist_(k-1)[i] + cost[i, u] ) ), taken over all vertices i ≠ u, where
k → the maximum number of edges allowed on a path in the current pass
u → the destination vertex
i → a vertex with an edge (i, u) into u that is scanned during relaxation
Example
The following example shows how Bellman-Ford algorithm works step by step. This graph has a negative edge but
does not have any negative cycle, hence the problem can be solved using this technique.
At the time of initialization, all the vertices except the source are marked by ∞ and the source is marked by 0.
In the first step, all the vertices which are reachable from the source are updated by minimum cost. Hence,
vertices a and h are updated.
Following the same logic, in this step vertices b, f, c and g are updated.
Here, vertices c and d are updated.
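The Graph class and the Bellman-Ford routine used by the driver below are not included in the source; a minimal sketch matching the addEdge calls and the printed distances (the driver would additionally call something like g.bellmanFord(0)) could be:
class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.edges = []                 # list of (u, v, w) tuples

    def addEdge(self, u, v, w):
        self.edges.append((u, v, w))

    def bellmanFord(self, src):
        dist = [float("inf")] * self.V
        dist[src] = 0
        # Relax all edges |V| - 1 times
        for _ in range(self.V - 1):
            for u, v, w in self.edges:
                if dist[u] != float("inf") and dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
        # One more pass to detect a negative-weight cycle
        for u, v, w in self.edges:
            if dist[u] != float("inf") and dist[u] + w < dist[v]:
                print("Graph contains negative weight cycle")
                return
        print("Vertex Distance from Source")
        for v in range(self.V):
            print(v, "\t", dist[v])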
g = Graph(5)
g.addEdge(0, 1, -1)
g.addEdge(0, 2, 4)
g.addEdge(1, 2, 3)
g.addEdge(1, 3, 2)
g.addEdge(1, 4, 2)
g.addEdge(3, 2, 5)
g.addEdge(3, 1, 1)
g.addEdge(4, 3, -3)
Output:
Vertex Distance from Source
0 0
1 -1
2 2
3 -2
4 1
Notes
1) Negative weights are found in various applications of graphs. For example, instead of paying cost for a path, we
may get some advantage if we follow the path.
2) Bellman-Ford works better (better than Dijkstra's) for distributed systems. Unlike Dijkstra's, where we need to find the
minimum value over all vertices, in Bellman-Ford edges are considered one by one.
Knapsack problem
Given a set of items, each with a weight and a value, determine a subset of items to include in a collection so that
the total weight is less than or equal to a given limit and the total value is as large as possible.
The knapsack problem is a combinatorial optimization problem. It appears as a subproblem in many, more complex
mathematical models of real-world problems. One general approach to difficult problems is to identify the most
restrictive constraint, ignore the others, solve a knapsack problem, and somehow adjust the solution to satisfy the
ignored constraints.
Applications
In many cases of resource allocation with some constraint, the problem can be formulated in a way similar to the
Knapsack problem. Following is a set of examples.
• Finding the least wasteful way to cut raw materials
• portfolio optimization
• Cutting stock problems
Problem Scenario
A thief is robbing a store and can carry a maximal weight of W into his knapsack. There are n items available in the
store and weight of ith item is wi and its profit is pi. What items should the thief take?
In this context, the items should be selected in such a way that the thief will carry those items for which he will gain
maximum profit. Hence, the objective of the thief is to maximize the profit.
Based on the nature of the items, Knapsack problems are categorized as
• Fractional Knapsack
• 0-1 Knapsack
Fractional Knapsack
Fractions of items can be taken rather than having to make binary (0-1) choices for each item. The Fractional Knapsack
Problem can be solved by a greedy strategy, whereas the 0-1 problem cannot.
ITEM wi vi
I1 5 30
I2 10 20
I3 20 100
I4 30 90
I5 40 160
ITEM wi vi pi = vi / wi
I1 5 30 6
I2 10 20 2
I3 20 100 5
I4 30 90 3
I5 40 160 4
ITEM wi vi pi = vi / wi
I1 5 30 6
I3 20 100 5
I5 40 160 4
I4 30 90 3
I2 10 20 2
# Greedy Approach
class FractionalKnapSack:
"""Time Complexity O(n log n)"""
@staticmethod
def getMaxValue(wt, val, capacity):
"""function to get maximum value """
iVal = []
for i in range(len(wt)):
iVal.append(ItemValue(wt[i], val[i], i))
# Driver Code
if __name__ == "__main__":
wt = [10, 40, 20, 30]
val = [60, 40, 100, 120]
capacity = 50
Output :
Maximum value in Knapsack = 240
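The ItemValue helper class and the rest of getMaxValue() are not included above; a complete, runnable sketch of the same greedy approach (which yields 240 for the driver data) could be:
class ItemValue:
    # Stores weight, value and the value/weight ratio of one item
    def __init__(self, wt, val, ind):
        self.wt = wt
        self.val = val
        self.ind = ind
        self.cost = val / wt

class FractionalKnapSack:
    @staticmethod
    def getMaxValue(wt, val, capacity):
        iVal = [ItemValue(wt[i], val[i], i) for i in range(len(wt))]
        # Greedy choice: consider items in decreasing value/weight ratio
        iVal.sort(key=lambda item: item.cost, reverse=True)
        totalValue = 0.0
        for item in iVal:
            if capacity >= item.wt:
                capacity -= item.wt
                totalValue += item.val
            else:
                # Take only the fraction of the item that still fits
                totalValue += item.cost * capacity
                break
        return totalValue

if __name__ == "__main__":
    wt = [10, 40, 20, 30]
    val = [60, 40, 100, 120]
    capacity = 50
    print("Maximum value in Knapsack =", FractionalKnapSack.getMaxValue(wt, val, capacity))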
0-1 Knapsack
1. W ≤ capacity
2. Value ← Max
Input:
o Knapsack of capacity
o List (Array) of weight and their corresponding value.
Output: To maximize profit and minimize weight in capacity.
The knapsack problem where we have to pack the knapsack with maximum value in such a manner that the total
weight of the items should not be greater than the capacity of the knapsack.
KNAPSACK (n, W)
1. for w = 0 to W
2. do V [0, w] ← 0
3. for i = 1 to n
4. do V [i, 0] ← 0
5. for w = 1 to W
6. do if (wi ≤ w and vi + V [i - 1, w - wi] > V [i - 1, w])
7. then V [i, w] ← vi + V [i - 1, w - wi]
8. else V [i, w] ← V [i - 1, w]
The [i, j] entry here will be V [i, j], the best value obtainable using the first i items if the maximum capacity
were j. We begin by initializing the first row and the first column.
Output:
220
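The program that produces this output is not shown in the source; a bottom-up implementation of the recurrence above, with the classic example data (values 60, 100, 120, weights 10, 20, 30, capacity 50, which indeed gives 220), could be:
def knapSack(W, wt, val, n):
    # V[i][w] = best value using the first i items with capacity w
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(1, W + 1):
            if wt[i - 1] <= w:
                V[i][w] = max(val[i - 1] + V[i - 1][w - wt[i - 1]], V[i - 1][w])
            else:
                V[i][w] = V[i - 1][w]
    return V[n][W]

val = [60, 100, 120]
wt = [10, 20, 30]
W = 50
print(knapSack(W, wt, val, len(val)))  # prints 220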
Minimum Spanning Tree
As we have discussed, one graph may have more than one spanning tree. If there are n number of vertices, the
spanning tree should have n - 1 number of edges. In this context, if each edge of the graph is associated with a
weight and there exists more than one spanning tree, we need to find the minimum spanning tree of the graph.
Moreover, if there exist any duplicate weighted edges, the graph may have multiple minimum spanning trees.
In the above graph, we have shown a spanning tree though it’s not the minimum spanning tree. The cost of this
spanning tree is (5 + 7 + 3 + 3 + 5 + 8 + 3 + 4) = 38.
Methods of Minimum Spanning Tree
There are two methods to find Minimum Spanning Tree
1. Kruskal's Algorithm
2. Prim's Algorithm
Kruskal's Algorithm:
An algorithm to construct a Minimum Spanning Tree for a connected weighted graph. It is a Greedy Algorithm. The
greedy choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed so far.
If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).
Analysis: Where E is the number of edges in the graph and V is the number of vertices, Kruskal's Algorithm can be
shown to run in O (E log E) time, or simply, O (E log V) time, all with simple data structures. These running times are
equivalent because:
o E is at most V^2, and log V^2 = 2 × log V is O (log V).
o If we ignore isolated vertices, which will each form their own component of the minimum spanning tree, V ≤ 2 E, so
log V is O (log E).
Thus the total time is
1. O (E log E) = O (E log V).
For Example: Find the Minimum Spanning Tree of the following graph using Kruskal's algorithm.
Solution: First we initialize the set A to the empty set and create |V| trees, one containing each vertex, with the MAKE-SET
procedure. Then sort the edges in E into order by non-decreasing weight.
There are 9 vertices and 12 edges. So the MST formed will have (9 - 1) = 8 edges.
Now, check for each edge (u, v) whether the endpoints u and v belong to the same tree. If they do, then the edge (u,
v) cannot be added. Otherwise, the two vertices belong to different trees, the edge (u, v) is added to A,
and the vertices in the two trees are merged by the UNION procedure.
Step1: So, first take (h, g) edge
Step 3: then (a, b) and (i, g) edges are considered, and the forest becomes
Step 7: This step will be required Minimum Spanning Tree because it contains all the 9 vertices and (9 - 1) = 8 edges
1. e → f, b → h, d → f [cycle will be formed]
# A utility function to find the set of an element i
# (uses path compression technique)
def find(self, parent, i):
    if parent[i] != i:
        # Path compression: point i directly at the root of its set
        parent[i] = self.find(parent, parent[i])
    return parent[i]
# Driver code
g = Graph(4)
g.addEdge(0, 1, 10)
g.addEdge(0, 2, 6)
g.addEdge(0, 3, 5)
g.addEdge(1, 3, 15)
g.addEdge(2, 3, 4)
g.KruskalMST()
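The rest of the Graph class used by this driver (addEdge, union and KruskalMST) is not shown in the source; a sketch consistent with the find() method above could be as follows. For the driver graph it would report the MST edges 2-3 (4), 0-3 (5) and 0-1 (10):
class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.graph = []                  # list of [u, v, w] edges

    def addEdge(self, u, v, w):
        self.graph.append([u, v, w])

    def find(self, parent, i):
        # Same find() as above, with path compression
        if parent[i] != i:
            parent[i] = self.find(parent, parent[i])
        return parent[i]

    def union(self, parent, rank, x, y):
        # Attach the smaller-rank tree under the larger-rank tree
        if rank[x] < rank[y]:
            parent[x] = y
        elif rank[x] > rank[y]:
            parent[y] = x
        else:
            parent[y] = x
            rank[x] += 1

    def KruskalMST(self):
        result = []
        # Sort all edges in non-decreasing order of weight
        self.graph.sort(key=lambda edge: edge[2])
        parent = list(range(self.V))
        rank = [0] * self.V
        for u, v, w in self.graph:
            x, y = self.find(parent, u), self.find(parent, v)
            # Include the edge only if it does not form a cycle
            if x != y:
                result.append((u, v, w))
                self.union(parent, rank, x, y)
        for u, v, w in result:
            print(u, "-", v, ":", w)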
Prim's Algorithm
It is a greedy algorithm. It starts with an empty spanning tree. The idea is to maintain two sets of vertices:
o Contain vertices already included in MST.
o Contain vertices not yet included.
At every step, it considers all the edges that connect the two sets and picks the minimum weight edge among them. After picking the edge, it moves the
other endpoint of the edge to the set containing the MST.
MST-PRIM (G, w, r)
1. for each u ∈ V [G]
2. do key [u] ← ∞
3. π [u] ← NIL
4. key [r] ← 0
5. Q ← V [G]
6. while Q ≠ ∅
7. do u ← EXTRACT - MIN (Q)
8. for each v ∈ Adj [u]
9. do if v ∈ Q and w (u, v) < key [v]
10. then π [v] ← u
11. key [v] ← w (u, v)
Example: Generate the minimum cost spanning tree for the following graph using Prim's algorithm.
Solution: In Prim's algorithm, first we initialize the priority queue Q to contain all the vertices and the key of each
vertex to ∞, except for the root, whose key is set to 0. Suppose vertex 0 is the root, i.e., r. By the EXTRACT-MIN (Q)
procedure, now u = r and Adj [u] = {5, 1}.
Remove u from the set Q and add it to the set V - Q of vertices in the tree. Now, update the key and π fields of every
vertex v adjacent to u but not in the tree.
1. u = EXTRACT_MIN (2, 6)
2. u=2 [key [2] < key [6]]
3. 12 < 18
4. Now the root is 2
5. Adj [2] = {3, 1}
6. 3 is already in a heap
7. Taking 1, key [1] = 28
8. w (2,1) = 16
9. w (2,1) < key [1]
So update key value of key [1] as 16 and its parent as 2.
π[1]= 2
Total Cost = 10 + 25 + 22 + 12 + 16 + 14 = 99
import sys

class Graph():

    def __init__(self, vertices):
        self.V = vertices
        self.graph = [[0 for column in range(vertices)]
                      for row in range(vertices)]

    # A utility function to find the vertex with minimum key value,
    # from the set of vertices not yet included in the MST
    def minKey(self, key, mstSet):
        min_val = sys.maxsize
        min_index = -1
        for v in range(self.V):
            if key[v] < min_val and mstSet[v] == False:
                min_val = key[v]
                min_index = v
        return min_index

    # Function to construct and print MST for a graph
    # represented using adjacency matrix representation
    def primMST(self):
        # Key values used to pick minimum weight edge in cut
        key = [sys.maxsize] * self.V
        parent = [None] * self.V  # Array to store constructed MST
        # Make key 0 so that this vertex is picked as first vertex
        key[0] = 0
        mstSet = [False] * self.V
        parent[0] = -1  # First node is always the root of the MST
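        # The remainder of primMST() and the printMST() helper are not
        # shown in the source; the following sketch completes them in a
        # way consistent with the output listed below.
        for _ in range(self.V):
            # Pick the minimum key vertex not yet included in the MST
            u = self.minKey(key, mstSet)
            mstSet[u] = True
            # Update key and parent of the vertices adjacent to u
            for v in range(self.V):
                if (self.graph[u][v] > 0 and mstSet[v] == False
                        and key[v] > self.graph[u][v]):
                    key[v] = self.graph[u][v]
                    parent[v] = u
        self.printMST(parent)

    def printMST(self, parent):
        print("Edge \tWeight")
        for v in range(1, self.V):
            print(parent[v], "-", v, "\t", self.graph[v][parent[v]])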
g = Graph(5)
g.graph = [ [0, 2, 0, 6, 0],
[2, 0, 3, 8, 5],
[0, 3, 0, 0, 7],
[6, 8, 0, 0, 9],
[0, 5, 7, 9, 0]]
g.primMST();
Output:
Edge Weight
0-1 2
1-2 3
0-3 6
1-4 5
Time Complexity of the above program is O(V^2).
Differentiate between Dynamic Programming and Greedy Method
Geometric Algorithm
How to check if two given line segments intersect?
Given two line segments (p1, q1) and (p2, q2), find if the given line segments intersect with each other.
Before we discuss solution, let us define notion of orientation. Orientation of an ordered triplet of points in the plane
can be
–counterclockwise
–clockwise
–colinear
The following diagram shows different possible orientations of (a, b, c)
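The diagram itself is not reproduced here. As a sketch, the orientation of (p, q, r) can be computed from the sign of a cross-product expression (the function name and return codes below are illustrative):
def orientation(p, q, r):
    # Compare the slope of segment (p, q) with the slope of (q, r):
    # val > 0 -> clockwise, val < 0 -> counterclockwise, val == 0 -> collinear
    val = (q[1] - p[1]) * (r[0] - q[0]) - (q[0] - p[0]) * (r[1] - q[1])
    if val > 0:
        return 1   # clockwise
    elif val < 0:
        return 2   # counterclockwise
    return 0       # collinear

print(orientation((0, 0), (4, 4), (1, 2)))  # 2, i.e. counterclockwise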
Sweep Line Algorithm: We can solve this problem in O(nLogn) time using Sweep Line Algorithm. The algorithm first
sorts the end points along the x axis from left to right, then it passes a vertical line through all points from left to
right and checks for intersections. Following are detailed steps.
1) Let there be n given lines. There must be 2n end points to represent the n lines. Sort all points according to
x coordinates. While sorting, maintain a flag to indicate whether this point is the left point of its line or the right
point.
2) Start from the leftmost point. Do the following for every point
a. If the current point is a left point of its line segment, check for intersection of its line segment
with the segments just above and below it. And add its line to active line segments (line segments
for which left end point is seen, but right end point is not seen yet). Note that we consider only
those neighbors which are still active.
b. If the current point is a right point, remove its line segment from active list and check whether its
two active neighbors (points just above and below) intersect with each other.
The step 2 is like passing a vertical line from all points starting from the leftmost point to the rightmost point. That is
why this algorithm is called Sweep Line Algorithm. The Sweep Line technique is useful in many other geometric
algorithms like calculating the 2D Voronoi diagram
Also, in step 1, instead of sorting, we can use min heap data structure. Building a min heap takes O(n) time and every
extract min operation takes O(Logn) time
Example:
Let us consider the following example. There are 5 line segments 1, 2, 3, 4 and 5. The dotted green
lines show the sweep lines.
Following are steps followed by the algorithm. All points from left to right are processed one by one. We maintain a
self-balancing binary search tree.
Left end point of line segment 1 is processed: 1 is inserted into the Tree. The tree contains 1. No intersection.
Left end point of line segment 2 is processed: Intersection of 1 and 2 is checked. 2 is inserted into the Tree. No
intersection. The tree contains 1, 2.
Left end point of line segment 3 is processed: Intersection of 3 with 1 is checked. No intersection. 3 is inserted into
the Tree. The tree contains 2, 1, 3.
Right end point of line segment 1 is processed: 1 is deleted from the Tree. Intersection of 2 and 3 is checked.
Intersection of 2 and 3 is reported. The tree contains 2, 3.
Left end point of line segment 4 is processed: Intersections of line 4 with lines 2 and 3 are checked. No
intersection. 4 is inserted into the Tree. The tree contains 2, 4, 3.
Left end point of line segment 5 is processed: Intersection of 5 with 3 is checked. No intersection. 5 is inserted into
the Tree. The tree contains 2, 4, 3, 5.
Right end point of line segment 5 is processed: 5 is deleted from the Tree. The tree contains 2, 4, 3.
Right end point of line segment 4 is processed: 4 is deleted from the Tree. The tree contains 2, 3. Intersection
of 2 with 3 is checked. Intersection of 2 with 3 is reported. Note that the intersection
of 2 and 3 is reported again. We can add some logic to check for duplicates.
Right end points of line segments 2 and 3 are processed: Both are deleted from the tree and the tree becomes empty.
Time Complexity: The first step is sorting, which takes O(nLogn) time. The second step processes 2n points and, for
processing every point, it takes O(Logn) time. Therefore, the overall time complexity is O(nLogn).
The idea of Jarvis's Algorithm is simple: we start from the leftmost point (or point with minimum x coordinate value)
and we keep wrapping points in counterclockwise direction. The big question is, given a point p as the current point,
how do we find the next point in the output? The idea is to use orientation() here. The next point is selected as the point that
beats all other points at counterclockwise orientation, i.e., the next point is q if for any other point r we have
"orientation(p, q, r) = counterclockwise". Following is the detailed algorithm.
1) Initialize p as the leftmost point.
2) Do the following while we don't come back to the first (or leftmost) point.
…..a) The next point q is the point such that the triplet (p, q, r) is counterclockwise for any other point r.
…..b) next[p] = q (Store q as next of p in the output convex hull).
…..c) p = q (Set p as q for the next iteration).
Time Complexity: For every point on the hull we examine all the other points to determine the next point. The time
complexity is Θ(m * n) where n is the number of input points and m is the number of output or hull points (m <= n). In the worst
case, the time complexity is O(n^2). The worst case occurs when all the points are on the hull (m = n).
The worst case time complexity of Jarvis’s Algorithm is O(n^2). Using Graham’s scan algorithm, we can find Convex
Hull in O(nLogn) time. Following is Graham’s algorithm
Let points[0..n-1] be the input array.
1) Find the bottom-most point by comparing y coordinate of all points. If there are two points with the same y value,
then the point with smaller x coordinate value is considered. Let the bottom-most point be P0. Put P0 at first
position in output hull.
2) Consider the remaining n-1 points and sort them by polar angle in counterclockwise order around points[0]. If the
polar angle of two points is the same, then put the nearest point first.
3) After sorting, check if two or more points have the same angle. If two or more points have the same angle, then
remove all same-angle points except the point farthest from P0. Let the size of the new array be m.
4) If m is less than 3, return (Convex Hull not possible)
5) Create an empty stack ‘S’ and push points[0], points[1] and points[2] to S.
6) Process remaining m-3 points one by one. Do following for every point ‘points[i]’
…..6.1) Keep removing points from the stack while the orientation of the following 3 points is not counterclockwise (or they
don't make a left turn).
a) Point next to top in stack
b) Point at the top of stack
c) points[i]
…..6.2) Push points[i] to S
7) Print contents of S
The above algorithm can be divided into two phases.
Phase 1 (Sort points): We first find the bottom-most point. The idea is to pre-process the points by sorting them with
respect to the bottom-most point. Once the points are sorted, they form a simple closed path (See the following
diagram).
What should be the sorting criteria? computation of actual angles would be inefficient since trigonometric functions
are not simple to evaluate. The idea is to use the orientation to compare angles without actually computing them
(See the compare() function below)
Phase 2 (Accept or Reject Points): Once we have the closed path, the next step is to traverse the path and remove
concave points on this path. How to decide which point to remove and which to keep? Again, orientation helps here.
The first two points in sorted array are always part of Convex Hull. For remaining points, we keep track of recent
three points, and find the angle formed by them. Let the three points be prev(p), curr(c) and next(n). If orientation of
these points (considering them in same order) is not counterclockwise, we discard c, otherwise we keep it. Following
diagram shows step by step process of this phase
Time Complexity: Let n be the number of input points. The algorithm takes O(nLogn) time if we use a O(nLogn)
sorting algorithm.
The first step (finding the bottom-most point) takes O(n) time. The second step (sorting points) takes O(nLogn) time.
The third step takes O(n) time. In the third step, every element is pushed and popped at most one time. So the sixth
step to process points one by one takes O(n) time, assuming that the stack operations take O(1) time. Overall
complexity is O(n) + O(nLogn) + O(n) + O(n) which is O(nLogn)
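As a hedged illustration of Phase 2 only, the sketch below assumes the input list is already sorted counterclockwise around the bottom-most point (with same-angle duplicates already removed); hull_from_sorted is an assumed name, not a standard function.
# Sketch of Phase 2 (accept or reject points) of Graham's scan.
def orientation(p, q, r):
    # 0 -> collinear, 1 -> clockwise, 2 -> counterclockwise
    val = (q[1] - p[1]) * (r[0] - q[0]) - (q[0] - p[0]) * (r[1] - q[1])
    if val == 0:
        return 0
    return 1 if val > 0 else 2

def hull_from_sorted(sorted_pts):
    if len(sorted_pts) < 3:
        return []                       # convex hull not possible
    stack = sorted_pts[:3]              # the first three points start the hull
    for p in sorted_pts[3:]:
        # pop while (next-to-top, top, p) does not make a counterclockwise (left) turn
        while len(stack) > 1 and orientation(stack[-2], stack[-1], p) != 2:
            stack.pop()
        stack.append(p)
    return stack

# Points already sorted counterclockwise around the bottom-most point (0, 0)
print(hull_from_sorted([(0, 0), (3, 1), (4, 4), (1, 2), (0, 3)]))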
Input is an array of points specified by their x and y coordinates. Output is a convex hull of this set of points in
ascending order of x coordinates.
Example :
Input : points[] = {{0, 3}, {1, 1}, {2, 2}, {4, 4},{0, 0}, {1, 2}, {3, 1}, {3, 3}};
Output : The points in convex hull are: (0, 0) (0, 3) (3, 1) (4, 4)
Input : points[] = {(0, 0), (0, 4), (-4, 0), (5, 0), (0, -6), (1, 0)};
Output : (-4, 0), (5, 0), (0, -6), (0, 4)
The QuickHull algorithm is a Divide and Conquer algorithm similar to Quick Sort. Let a[0…n-1] be the input array of
points. Following are the steps for finding the convex hull of these points.
1. Find the point with minimum x-coordinate, let's say min_x, and similarly the point with maximum x-
coordinate, max_x.
2. Make a line joining these two points, say L. This line will divide the whole set into two parts. Take both the
parts one by one and proceed further.
3. For a part, find the point P with maximum distance from the line L. P forms a triangle with the points min_x,
max_x. It is clear that the points residing inside this triangle can never be the part of convex hull.
4. The above step divides the problem into two sub-problems (solved recursively). Now the line joining the
points P and min_x and the line joining the points P and max_x are new lines and the points residing outside
the triangle form the new sets of points for these two lines. Repeat step 3 till there is no point left outside a line;
then add the end points of that line to the convex hull.
Time Complexity: The analysis is similar to Quick Sort. On average, we get time complexity as O(n Log n), but in the
worst case, it can become O(n^2).
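A compact Python sketch of these steps follows (the helper names side, line_dist and quick_hull are illustrative; the distance helper returns a value only proportional to the true distance, which is enough for comparisons):
# QuickHull sketch; helper names are illustrative, not a library API.
def side(p1, p2, p):
    # +1 / -1 / 0 depending on which side of the line p1->p2 the point p lies
    val = (p[1] - p1[1]) * (p2[0] - p1[0]) - (p2[1] - p1[1]) * (p[0] - p1[0])
    return (val > 0) - (val < 0)

def line_dist(p1, p2, p):
    # proportional to the distance of p from the line p1->p2
    return abs((p[1] - p1[1]) * (p2[0] - p1[0]) - (p2[1] - p1[1]) * (p[0] - p1[0]))

def quick_hull_rec(points, p1, p2, s, hull):
    ind, max_dist = -1, 0
    # find the point with maximum distance from line p1-p2 on side s
    for i, p in enumerate(points):
        d = line_dist(p1, p2, p)
        if side(p1, p2, p) == s and d > max_dist:
            ind, max_dist = i, d
    if ind == -1:
        hull.add(p1)          # no point outside: p1 and p2 are hull points
        hull.add(p2)
        return
    # recurse on the two sub-problems outside the triangle (points[ind], p1, p2)
    quick_hull_rec(points, points[ind], p1, -side(points[ind], p1, p2), hull)
    quick_hull_rec(points, points[ind], p2, -side(points[ind], p2, p1), hull)

def quick_hull(points):
    if len(points) < 3:
        return set(points)    # convex hull not possible
    hull = set()
    min_x = min(points, key=lambda p: p[0])
    max_x = max(points, key=lambda p: p[0])
    quick_hull_rec(points, min_x, max_x, 1, hull)
    quick_hull_rec(points, min_x, max_x, -1, hull)
    return hull

# Same input as the first example above; prints [(0, 0), (0, 3), (3, 1), (4, 4)]
print(sorted(quick_hull([(0, 3), (1, 1), (2, 2), (4, 4), (0, 0), (1, 2), (3, 1), (3, 3)])))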
We first check whether the point is inside the given convex hull or not. If it is, then nothing has to be done and we
directly return the given convex hull. If the point is outside the convex hull, we find the lower and upper tangents,
and then merge the point with the given convex hull to find the new convex hull, as shown in the figure.
The red outline shows the new convex hull after merging the point and the given convex hull.
To find the upper tangent, we first choose a point on the hull that is nearest to the given point. Then while the line
joining the point on the convex hull and the given point crosses the convex hull, we move anti-clockwise till we get
the tangent line.
The figure shows the moving of the point on the convex hull for finding the upper tangent.
Note: It is assumed here that the input of the initial convex hull is in the anti-clockwise order; otherwise we have to
first sort the points in anti-clockwise order and then apply the above approach.
The Brute force solution is O(n^2): compute the distance between each pair and return the smallest. We can
calculate the smallest distance in O(nLogn) time using Divide and Conquer strategy. In this post, a O(n x (Logn)^2)
approach is discussed.
Algorithm
Following are the detailed steps of a O(n (Logn)^2) algorithm.
Input: An array of n points P[]
Output: The smallest distance between two points in the given array.
As a pre-processing step, the input array is sorted according to x coordinates.
1) Find the middle point in the sorted array, we can take P[n/2] as middle point.
2) Divide the given array in two halves. The first subarray contains points from P[0] to P[n/2]. The second subarray
contains points from P[n/2+1] to P[n-1].
3) Recursively find the smallest distances in both subarrays. Let the distances be dl and dr. Find the minimum of dl
and dr. Let the minimum be d.
4) From the above 3 steps, we have an upper bound d of minimum distance. Now we need to consider the pairs such
that one point in pair is from the left half and the other is from the right half. Consider the vertical line passing
through P[n/2] and find all points whose x coordinate is closer than d to the middle vertical line. Build an array strip[]
of all such points.
5) Sort the array strip[] according to y coordinates. This step is O(nLogn). It can be optimized to O(n) by recursively
sorting and merging.
6) Find the smallest distance in strip[]. This is tricky. At first look, it seems to be a O(n^2) step, but it is actually
O(n). It can be proved geometrically that for every point in the strip, we only need to check at most 7 points after it
(note that strip is sorted according to Y coordinate). A short sketch of this step is given after the list.
7) Finally return the minimum of d and distance calculated in the above step (step 6)
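As referenced in step 6, here is a small Python sketch of the strip scan (strip_closest and the sample values are assumptions for illustration); strip is taken as already sorted by y and d is the minimum found in step 3:
# Sketch of step 6: smallest distance inside strip[] (names are illustrative).
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def strip_closest(strip, d):
    best = d
    # For each point, only the next few points (at most 7) can be closer than d,
    # so the inner loop runs a constant number of times and the whole step is O(n).
    for i in range(len(strip)):
        j = i + 1
        while j < len(strip) and (strip[j][1] - strip[i][1]) < best:
            best = min(best, dist(strip[i], strip[j]))
            j += 1
    return best

# Example: strip sorted by y coordinate, with d = 3.0 coming from the two halves (assumed values)
print(strip_closest([(1, 1), (2, 3), (1, 6), (2, 8)], 3.0))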
Time Complexity: Let the time complexity of the above algorithm be T(n). Let us assume that we use a O(nLogn) sorting
algorithm. The above algorithm divides all points in two sets and recursively calls itself for the two sets. After dividing, it finds
the strip in O(n) time, sorts the strip in O(nLogn) time and finally finds the closest points in strip in O(n) time. So T(n)
can be expressed as follows
T(n) = 2T(n/2) + O(n) + O(nLogn) + O(n)
T(n) = 2T(n/2) + O(nLogn)
T(n) = O(n x Logn x Logn) = O(n (Logn)^2)
Notes
1) Time complexity can be improved to O(nLogn) by optimizing step 5 of the above algorithm. We will soon be
discussing the optimized solution in a separate post.
2) The code finds smallest distance. It can be easily modified to find the points with the smallest distance.
3) The code uses quick sort which can be O(n^2) in the worst case. To have the upper bound as O(n (Logn)^2), a
O(nLogn) sorting algorithm like merge sort or heap sort can be used
Internet Algorithm
Trie
Trie is an efficient information reTrieval data structure. Using Trie, search complexities can be brought to an optimal
limit (key length). If we store keys in a binary search tree, a well balanced BST will need time proportional to M * log
N, where M is maximum string length and N is number of keys in tree. Using Trie, we can search the key in O(M)
time. However the penalty is on Trie storage requirements.
Every node of Trie consists of multiple branches. Each branch represents a possible character of keys. We need to
mark the last node of every key as end of word node. A Trie node field isEndOfWord is used to distinguish the node
as end of word node. A simple structure to represent nodes of the English alphabet can be as following,
// Number of possible characters (lower case 'a' to 'z')
#define ALPHABET_SIZE 26

// Trie node
struct TrieNode
{
    struct TrieNode *children[ALPHABET_SIZE];

    // isEndOfWord is true if the node
    // represents end of a word
    bool isEndOfWord;
};
Inserting a key into Trie is a simple approach. Every character of the input key is inserted as an individual Trie node.
Note that the children is an array of pointers (or references) to next level trie nodes. The key character acts as an
index into the array children. If the input key is new or an extension of the existing key, we need to construct non-
existing nodes of the key, and mark end of the word for the last node. If the input key is a prefix of the existing key in
Trie, we simply mark the last node of the key as the end of a word. The key length determines Trie depth.
Searching for a key is similar to insert operation, however, we only compare the characters and move down. The
search can terminate due to the end of a string or lack of key in the trie. In the former case, if the isEndofWord field
of the last node is true, then the key exists in the trie. In the second case, the search terminates without examining
all the characters of the key, since the key is not present in the trie.
The following picture explains construction of trie using keys given in the example below,
                   root
                /    |    \
               t     a     b
               |     |     |
               h     n     y
               |     | \   |
               e     s  y  e
             / |     |
            i  r     w
            |  |     |
            r  e     e
                     |
                     r
In the picture, every character is of type trie_node_t. For example, the root is of type trie_node_t, and its
children a, b and t are filled; all other children of the root will be NULL. Similarly, “a” at the next level has only one
child (“n”), all other children are NULL. The leaf nodes are in blue.
Insert and search cost O(key_length); however, the memory requirement of Trie is O(ALPHABET_SIZE * key_length
* N) where N is number of keys in Trie. There are efficient representations of trie nodes (e.g. compressed trie, ternary
search tree, etc.) to minimize memory requirements of trie.
Trie | (Delete)
During delete operation we delete the key in bottom up manner using recursion. The following are possible
conditions when deleting key from trie,
1. Key may not be there in trie. Delete operation should not modify trie.
2. Key present as unique key (no part of key contains another key (prefix), nor the key itself is prefix of another
key in trie). Delete all the nodes.
3. Key is prefix key of another long key in trie. Unmark the leaf node.
4. Key present in trie, having at least one other key as prefix key. Delete nodes from the end of the key until the first
node of the longest prefix key.
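The following is a minimal, self-contained Python sketch of deletion based on the four conditions above; the node layout mirrors the trie node used in this section, but the helper names (delete_key, is_empty) are illustrative only. The full Trie class with insert and search follows after this sketch.
# Recursive, bottom-up deletion sketch (illustrative names, not a fixed API).
ALPHABET_SIZE = 26

class TrieNode:
    def __init__(self):
        self.children = [None] * ALPHABET_SIZE
        self.isEndOfWord = False

def is_empty(node):
    # True if the node has no children at all
    return all(child is None for child in node.children)

def delete_key(node, key, depth=0):
    # Returns the (possibly None) node that should replace `node`.
    if node is None:
        return None                       # condition 1: key not present, trie unchanged
    if depth == len(key):
        node.isEndOfWord = False          # condition 3: unmark the end-of-word node
        if is_empty(node):
            node = None                   # condition 2: nothing hangs below, remove node
        return node
    index = ord(key[depth]) - ord('a')
    node.children[index] = delete_key(node.children[index], key, depth + 1)
    # condition 4: remove this node too if it became childless and ends no other key
    if is_empty(node) and not node.isEndOfWord:
        node = None
    return node

# Usage (assumed): root = delete_key(root, "the")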
class TrieNode:
    # Trie node: children initialized to None; isEndOfWord marks end of a word
    def __init__(self):
        self.children = [None] * 26
        self.isEndOfWord = False

class Trie:
    # Trie data structure class
    def __init__(self):
        self.root = self.getNode()

    def getNode(self):
        # Returns new trie node (initialized to NULLs)
        return TrieNode()

    def _charToIndex(self, ch):
        # private helper function
        # Converts key current character into index
        # use only 'a' through 'z' and lower case
        return ord(ch) - ord('a')

    def insert(self, key):
        # If not present, inserts key into trie
        # If the key is prefix of a trie node, just marks the leaf node
        pCrawl = self.root
        for level in range(len(key)):
            index = self._charToIndex(key[level])
            # if current character is not present, create a new node
            if not pCrawl.children[index]:
                pCrawl.children[index] = self.getNode()
            pCrawl = pCrawl.children[index]
        # mark last node as end of word
        pCrawl.isEndOfWord = True

    def search(self, key):
        # Searches key in the trie; returns True if key is present
        pCrawl = self.root
        for level in range(len(key)):
            index = self._charToIndex(key[level])
            if not pCrawl.children[index]:
                return False
            pCrawl = pCrawl.children[index]
        return pCrawl.isEndOfWord
# driver function
def main():
    # Input keys (use only 'a' through 'z' and lower case)
    keys = ["the", "a", "there", "anaswe", "any",
            "by", "their"]
    output = ["Not present in trie",
              "Present in trie"]

    # Trie object
    t = Trie()

    # Construct trie
    for key in keys:
        t.insert(key)

    # Search for different keys
    print("{} ---- {}".format("the", output[t.search("the")]))
    print("{} ---- {}".format("these", output[t.search("these")]))
    print("{} ---- {}".format("their", output[t.search("their")]))
    print("{} ---- {}".format("thaw", output[t.search("thaw")]))

if __name__ == '__main__':
    main()
Output :
the ---- Present in trie
these ---- Not present in trie
their ---- Present in trie
thaw ---- Not present in trie
Why Trie? :-
1. With Trie, we can insert and find strings in O(L) time, where L represents the length of a single word. This is
obviously faster than BST. This is also faster than Hashing because of the ways it is implemented. We do not
need to compute any hash function. No collision handling is required (like we do in open
addressing and separate chaining)
2. Another advantage of Trie is, we can easily print all words in alphabetical order which is not easily possible
with hashing.
3. We can efficiently do prefix search (or auto-complete) with Trie.
Concatenation of the edge-labels on the path from the root to leaf i gives the suffix of S that starts at position i, i.e.
S[i…m].
Note: Position starts with 1 (it's not zero indexed; later, in the code implementation, we will use zero indexed
positions).
For string S = xabxac with m = 6, suffix tree will look like following:
It has one root node and two internal nodes and 6 leaf nodes.
String Depth of red path is 1 and it represents suffix c starting at position 6
String Depth of blue path is 4 and it represents suffix bxac starting at position 3
String Depth of green path is 2 and it represents suffix ac starting at position 5
String Depth of orange path is 6 and it represents suffix xabxac starting at position 1
Edges with labels a (green) and xa (orange) are non-leaf edges (they end at an internal node). All other edges are
leaf edges (they end at a leaf).
If one suffix of S matches a prefix of another suffix of S (when the last character is not unique in the string), then the path for the
first suffix would not end at a leaf.
For String S = xabxa, with m = 5, following is the suffix tree:
This takes O(m^2) to build the suffix tree for the string S of length m.
Following are few steps to build suffix tree based for string “xabxa$” based on above algorithm:
Ukkonen’s algorithm is divided into m phases (one phase for each character in the string with length m)
In phase i+1, tree Ti+1 is built from tree Ti.
Each phase i+1 is further divided into i+1 extensions, one for each of the i+1 suffixes of S[1..i+1]
In extension j of phase i+1, the algorithm first finds the end of the path from the root labelled with substring S[j..i].
It then extends the substring by adding the character S(i+1) to its end (if it is not there already).
In extension 1 of phase i+1, we put string S[1..i+1] in the tree. Here S[1..i] will already be present in tree due to
previous phase i. We just need to add S[i+1]th character in tree (if not there already).
In extension 2 of phase i+1, we put string S[2..i+1] in the tree. Here S[2..i] will already be present in tree due to
previous phase i. We just need to add S[i+1]th character in tree (if not there already)
In extension 3 of phase i+1, we put string S[3..i+1] in the tree. Here S[3..i] will already be present in tree due to
previous phase i. We just need to add S[i+1]th character in tree (if not there already)
.
.
In extension i+1 of phase i+1, we put string S[i+1..i+1] in the tree. This is just one character which may not be in tree
(if character is seen first time so far). If so, we just add a new leaf edge with label S[i+1].
Suffix extension is all about adding the next character into the suffix tree built so far.
In extension j of phase i+1, algorithm finds the end of S[j..i] (which is already in the tree due to previous phase i) and
then it extends S[j..i] to be sure the suffix S[j..i+1] is in the tree.
Rule 1: If the path from the root labelled S[j..i] ends at a leaf edge (i.e. S[i] is the last character on a leaf edge), then
character S[i+1] is just added to the end of the label on that leaf edge.
Rule 2: If the path from the root labelled S[j..i] ends at a non-leaf edge (i.e. there are more characters after S[i] on the
path) and the next character is not S[i+1], then a new leaf edge with label S[i+1] and number j is created starting from
character S[i+1]. A new internal node will also be created if S[j..i] ends inside (in-between) a non-leaf edge.
Rule 3: If the path from the root labelled S[j..i] ends at a non-leaf edge (i.e. there are more characters after S[i] on the
path) and the next character is S[i+1] (already in tree), do nothing.
One important point to note here is that from a given node (root or internal), there will be one and only one edge
starting from one character. There will not be more than one edges going out of any node, starting with same
character.
Following is a step by step suffix tree construction of string xabxac using Ukkonen’s algorithm:
In Suffix Tree Construction of a string S of length m, there are m phases, and in phase i (1 <= i <= m), we add the
ith character to the tree built so far; this is done through i extensions. All extensions follow one of the three
extension rules.
To do the jth extension of phase i+1 (adding character S[i+1]), we first need to find the end of the path from the root
labelled S[j..i] in the current tree. One way is to start from the root and traverse the edges matching the S[j..i] string. This will
take O(m^3) time to build the suffix tree. Using a few observations and implementation tricks, it can be done in O(m),
which we will see now.
Suffix links
For an internal node v with path-label xA, where x denotes a single character and A denotes a (possibly empty)
substring, if there is another node s(v) with path-label A, then a pointer from v to s(v) is called a suffix link.
If A is empty string, suffix link from internal node will go to root node.
There will not be any suffix link from root node (As it’s not considered as internal node).
In extension j+1 of same phase i, we will create a suffix link from the internal node created in jth extension to the
node with path labelled A.
So in a given phase, any newly created internal node (with path-label xA) will have a suffix link from it (pointing to
another node with path-label A) by the end of the next extension.
In any implicit suffix tree Ti after phase i, if internal node v has path-label xA, then there is a node s(v) in Ti with path-
label A, and node v will point to node s(v) using a suffix link.
At any time, all internal nodes in the changing tree will have suffix links from them to another internal node (or root),
except for the most recently added internal node, which will receive its suffix link by the end of the next extension.
How suffix links are used to speed up the implementation?
In extension j of phase i+1, we need to find the end of the path from the root labelled S[j..i] in the current tree. One
way is to start from the root and traverse the edges matching the S[j..i] string. Suffix links provide a short cut to find the end of the
path.
So we can see that, to find the end of path S[j..i], we need not traverse from the root. We can start from the end of path
S[j-1..i], walk up one edge to node v (i.e. go to the parent node), follow the suffix link to s(v), then walk down the path y
(which is abcd here in Figure 17).
This shows that the use of a suffix link is an improvement over the process.
Note: In the next part 3, we will introduce activePoint which will help to avoid “walk up”. We can directly go to node
s(v) from node v.
When there is a suffix link from node v to node s(v), then if there is a path labelled with string y from node v to a
leaf, then there must be a path labelled with string y from node s(v) to a leaf. In Figure 17, there is a path labelled
“abcd” from node v to a leaf, so there is a path with the same label “abcd” from node s(v) to a leaf.
This fact can be used to improve the walk from s(v) to leaf along the path y. This is called “skip/count” trick.
Skip/Count Trick
When walking down from node s(v) to a leaf, instead of matching the path character by character as we travel, we can
directly skip to the next node if the number of characters on the edge is less than the number of characters we need to
travel. If the number of characters on the edge is more than the number of characters we need to travel, we directly skip
to the last character on that edge.
If the implementation is done in such a way that the number of characters on any edge, and the character at a given
position in string S, can be obtained in constant time, then the skip/count trick will do the walk down in time
proportional to the number of nodes on the path rather than the number of characters on it.
Using suffix links along with the skip/count trick, the suffix tree can be built in O(m^2), as there are m phases and each phase
takes O(m).
Edge-label compression
So far, path labels are represented as characters of the string. Such a suffix tree will take O(m^2) space to store the path
labels. To avoid this, we can use a pair of indices (start, end) on each edge for the path label, instead of the substring
itself. The indices start and end tell the start and end positions of the path label in string S. With this, the suffix tree needs O(m)
space.
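A tiny illustration of this idea in Python (the Edge class name and sample indices are assumptions for illustration; this is not Ukkonen's full implementation):
# Edge-label compression: an edge stores only (start, end) indices into S.
S = "xabxac"

class Edge:
    def __init__(self, start, end):
        self.start = start      # index in S where this edge's label begins
        self.end = end          # index in S where this edge's label ends (inclusive)

    def label(self):
        # The label is recovered from S; only O(1) space is stored per edge.
        return S[self.start:self.end + 1]

    def length(self):
        return self.end - self.start + 1

e = Edge(1, 4)                  # represents the label S[1..4]
print(e.label(), e.length())    # prints: abxa 4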
There are two observations about the way extension rules interact in successive extensions and phases. These two
observations lead to two more implementation tricks (first trick “skip/count” is seen already while walk down).
In Figure 11, “xab” is added in tree and in Figure 12 (Phase 4), we add next character “x”. In this, 3 extensions are
done (which adds 3 suffixes). Last suffix “x” is already present in tree.
In Figure 13, we add character “a” in tree (Phase 5). First 3 suffixes are added in tree and last two suffixes “xa” and
“a” are already present in tree. This shows that if suffix S[j..i] present in tree, then ALL the remaining suffixes S[j+1..i],
S[j+2..i], S[j+3..i],…, S[i..i] will also be there in tree and no work needed to add those remaining suffixes.
So no more work needs to be done in any phase as soon as rule 3 applies in any extension of that phase. If a new
internal node v gets created in extension j and rule 3 applies in the next extension j+1, then we need to add a suffix link
from node v to current node (if we are on internal node) or root node. ActiveNode, which will be discussed in part 3,
will help while setting suffix links.
Trick 2
Stop the processing of any phase as soon as rule 3 applies. All further extensions are already present in tree
implicitly.
Trick 3
In any phase i, leaf edges may look like (p, i), (q, i), (r, i), …. where p, q, r are starting position of different edges and i
is end position of all. Then in phase i+1, these leaf edges will look like (p, i+1), (q, i+1), (r, i+1),…. This way, in each
phase, end position has to be incremented in all leaf edges. For this, we need to traverse through all leaf edges and
increment end position for them. To do same thing in constant time, maintain a global index e and e will be equal to
phase number. So now leaf edges will look like (p, e), (q, e), (r, e).. In any phase, just increment e and extension on all
leaf edges will be done. Figure 19 shows this.
So using suffix links and tricks 1, 2 and 3, a suffix tree can be built in linear time.
Tree Tm could be an implicit tree if a suffix is a prefix of another. So we can add a $ terminal symbol first and then run the
algorithm to get a true suffix tree (a true suffix tree contains all suffixes explicitly). To label each leaf with the
corresponding suffix starting position (all leaves are labelled with the global index e), a linear time traversal can be done on the
tree.
At this point, we have gone through most of the things we needed to know to create suffix tree using Ukkonen’s
algorithm. In the next Part 3, we will take string S = “abcabxabcd” as an example and go through all the things step by
step and create the tree. While building the tree, we will discuss few more implementation issues which will be
addressed by ActivePoints.
• Phase 1 completes with the completion of extension 1 (As a phase i has at most i extensions)
For any string, Phase 1 will have only one extension and it will always follow Rule 2.
• Phase 2 will read second character, will go through at least 1 and at most 2 extensions.
In our example, phase 2 will read the second character ‘b’. Suffixes to be added are “ab” and “b”.
Extension 1 adds suffix “ab” in tree.
Path for label ‘a’ ends at leaf edge, so add ‘b’ at the end of this edge.
Extension 1 just increments the end index by 1 (from 1 to 2) on first edge (Rule 1).
In the next phase i+1, trick 3 (Rule 1) will take care of first j-1 suffixes (the j-1 leaf edges), then extension j will start
where we will add jth suffix in tree. For this, we need to find the best possible matching edge and then add new
character at the end of that edge. How to find the end of best matching edge? Do we need to traverse from root
node and match tree edges against the jth suffix being added character by character? This will take time and overall
algorithm will not be linear. activePoint comes to the rescue here.
In previous phase i, while jth extension, path traversal ended at a point (which could be an internal node or some
point in the middle of an edge) where ith character being added was found in tree already and Rule 3 applied,
jth extension of phase i+1 will start exactly from the same point and we start matching path against (i+1)th character.
activePoint helps to avoid unnecessary path traversal from the root in any extension, based on the knowledge gained in
traversals done in previous extensions. There is no traversal needed in the first p extensions where Rule 1 is applied.
Traversal is done where Rule 2 or Rule 3 gets applied and that’s where activePoint tells the starting point for
traversal where we match the path against the current character being added in tree. Implementation is done in
such a way that, in any extension where we need a traversal, activePoint is set to right location already (with one
exception case APCFALZ discussed below) and at the end of the current extension, we reset activePoint as appropriate
so that next extension (of same phase or next phase) where a traversal is required, activePoint points to the right
place already.
activePoint: This could be the root node, any internal node, or any point in the middle of an edge. This is the point where
traversal starts in any extension. For the 1st extension of phase 1, activePoint is set to root. Other extensions will get
activePoint set correctly by previous extension (with one exception case APCFALZ discussed below) and it is the
responsibility of current extension to reset activePoint appropriately at the end, to be used in next extension where
Rule 2 or Rule 3 is applied (of same or next phase).
To accomplish this, we need a way to store activePoint. We will store this using three
variables: activeNode, activeEdge, activeLength.
After phase i, if there are j leaf edges then in phase i+1, the first j extensions will be done by trick 3. activePoint will be
needed for the extensions from j+1 to i+1 and activePoint may or may not change between two extensions
depending on the point where previous extension ends.
activePoint change for walk down (APCFWD): activePoint may change at the end of an extension based on
extension rule applied. activePoint may also change during the extension when we do walk down. Let’s consider an
activePoint is (A, s, 11) in the above activePoint example figure. If this is the activePoint at the start of some
extension, then while walk down from activeNode A, other internal nodes will be seen. Anytime if we encounter an
internal node while walking down, that node will become activeNode (it will change activeEdge and activeLength as
appropriate so that new activePoint represents the same point as earlier). In this walk down, below is the sequence
of changes in activePoint:
(A, s, 11) --> (B, w, 7) --> (C, a, 3)
All above three activePoints refer to same point ‘c’
Let’s take another example.
If activePoint is (D, a, 11) at the start of an extension, then while walking down, below is the sequence of changes in
activePoint:
(D, a, 11) --> (E, d, 7) --> (F, f, 5) --> (G, j, 1)
All above activePoints refer to same point ‘k’.
If activePoints are (A, s, 3), (A, t, 5), (B, w, 1), (D, a, 2) etc when no internal node comes in the way while walk down,
then there will be no change in activePoint for APCFWD.
The idea is that, at any time, the closest internal node from the point, where we want to reach, should be the
activePoint. Why? This will minimize the length of traversal in the next extension.
activePoint change for Active Length ZERO (APCFALZ): Let’s consider an activePoint (A, s, 0) in the above activePoint
example figure. And let’s say current character being processed from string S is ‘x’ (or any other character). At the
start of extension, when activeLength is ZERO, activeEdge is set to the current character being processed, i.e. ‘x’,
because there is no walk down needed here (as activeLength is ZERO) and so next character we look for is current
character being processed.
While code implementation, we will loop through all the characters of string S one by one. Each loop for ith character
will do processing for phase i. Loop will run one or more time depending on how many extensions are left to be
performed (Please note that in a phase i+1, we don’t really have to perform all i+1 extensions explicitly, as trick 3 will
take care of j extensions for all j leaf edges coming from previous phase i). We will use a
variable remainingSuffixCount, to track how many extensions are yet to be performed explicitly in any phase (after
trick 3 is performed). Also, at the end of any phase, if remainingSuffixCount is ZERO, this tells that all suffixes
supposed to be added in tree, are added explicitly and present in tree. If remainingSuffixCount is non-zero at the end
of any phase, that tells that suffixes of that many count are not added in tree explicitly (because of rule 3, we
stopped early), but they are in tree implicitly though (Such trees are called implicit suffix tree). These implicit suffixes
will be added explicitly in subsequent phases when a unique character comes in the way.
Using these descriptions, we can say that given any string T[1......n], the substrings are
1. T[i.....j] = T[i] T[i+1] T[i+2]......T[j] for some 1 ≤ i ≤ j ≤ n.
And proper substrings are
1. T[i.....j] = T[i] T[i+1] T[i+2]......T[j] for some 1 ≤ i ≤ j ≤ n with j - i + 1 < n (i.e. not the whole string).
Note: If i>j, then T [i.....j] is equal to the empty string or null, which has length zero.
Algorithms used for String Matching:
Different methods are used for finding the string:
1. The Naive String Matching Algorithm
2. The Rabin-Karp-Algorithm
3. Finite Automata
4. The Knuth-Morris-Pratt Algorithm
5. The Boyer-Moore Algorithm
NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n -m
4. do if P [1.....m] = T [s + 1....s + m]
5. then print "Pattern occurs with shift" s
Analysis: The for loop from lines 3 to 5 executes n-m+1 times (we need at least m characters at the end) and in each
iteration we do m comparisons. So the total complexity is O((n-m+1) m).
Example:
1. Suppose T = 1011101110
2. P = 111
3. Find all the Valid Shift
Solution:
# Python3 program for Naive Pattern
# Searching algorithm
def search(pat, txt):
    M = len(pat)
    N = len(txt)

    # A loop to slide pat[] one by one
    for i in range(N - M + 1):
        j = 0

        # For current shift i, check pattern characters one by one
        while j < M and txt[i + j] == pat[j]:
            j += 1

        # if pat[0..M-1] matched txt[i..i+M-1]
        if j == M:
            print("Pattern found at index ", i)
# Driver Code
if __name__ == '__main__':
    txt = "AABAACAADAABAAABAA"
    pat = "AABA"
    search(pat, txt)
Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13
The Rabin-Karp-Algorithm
The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each M-character
subsequences of text to be compared. If the hash values are unequal, the algorithm will determine the hash value
for next M-character sequence. If the hash values are equal, the algorithm will analyze the pattern and the M-
character sequence. In this way, there is only one comparison per text subsequence, and character matching is only
required when the hash values match.
RABIN-KARP-MATCHER (T, P, d, q)
1. n ← length [T]
2. m ← length [P]
3. h ← d^(m-1) mod q
4. p ← 0
5. t0 ← 0
6. for i ← 1 to m
7. do p ← (dp + P[i]) mod q
8. t0 ← (dt0+T [i]) mod q
9. for s ← 0 to n-m
10. do if p = ts
11. then if P [1.....m] = T [s+1.....s + m]
12. then "Pattern occurs with shift" s
13. If s < n-m
14. then t(s+1) ← (d (ts - T[s+1]·h) + T[s+m+1]) mod q
Example: For string matching, working modulo q = 11, how many spurious hits does the Rabin-Karp matcher
encounter in the text T = 31415926535.......?
1. T = 31415926535.......
2. P = 26
3. Here T.Length =11 so Q = 11
4. And P mod Q = 26 mod 11 = 4
5. Now find the exact match of P mod Q...
Solution:
Complexity:
The running time of RABIN-KARP-MATCHER in the worst case is O((n-m+1) m), but it has a good average case
running time. If the expected number of valid shifts is small (O(1)) and the prime q is chosen to be quite large, then the
Rabin-Karp algorithm can be expected to run in time O(n+m) plus the time required to process spurious hits.
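A runnable Python sketch of the matcher described above is given below (d is the alphabet size and q a prime modulus; the default values and the function name rabin_karp are assumptions for illustration):
# Python sketch of the Rabin-Karp matcher.
def rabin_karp(pat, txt, d=256, q=101):
    M, N = len(pat), len(txt)
    h = pow(d, M - 1, q)        # h = d^(M-1) mod q, used when dropping the leading character
    p = t = 0

    # Initial hash values for the pattern and the first window of the text
    for i in range(M):
        p = (d * p + ord(pat[i])) % q
        t = (d * t + ord(txt[i])) % q

    for s in range(N - M + 1):
        # If hashes match, confirm character by character (guards against spurious hits)
        if p == t and txt[s:s + M] == pat:
            print("Pattern occurs with shift", s)
        # Roll the hash: remove txt[s], add txt[s + M]
        if s < N - M:
            t = (d * (t - ord(txt[s]) * h) + ord(txt[s + M])) % q

rabin_karp("26", "31415926535")     # reports shift 6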
2. The KMP Matcher: With string 'S', pattern 'p' and prefix function 'Π' as inputs, it finds the occurrences of 'p' in 'S' and
returns the number of shifts of 'p' after which occurrences are found.
Solution:
Initially: m = length [p] = 7
Π [1] = 0
k=0
After iteration 6 times, the prefix function computation is complete:
Let us execute the KMP Algorithm to find whether 'P' occurs in 'T.'
For 'p', the prefix function Π was computed previously and is as follows:
Solution:
Initially: n = size of T = 15
m = size of P = 7
Pattern 'P' has been found to occur in the string 'T'. The total number of shifts that took place for the match
to be found is i - m = 13 - 7 = 6 shifts.
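Since only the numeric results of the prefix-function and matcher runs are shown above, here is a compact, 0-indexed Python sketch of both (the names compute_prefix and kmp_search are illustrative):
# Python sketch of the KMP matcher: compute the prefix function, then scan the text.
def compute_prefix(pat):
    m = len(pat)
    pi = [0] * m
    k = 0
    for i in range(1, m):
        while k > 0 and pat[k] != pat[i]:
            k = pi[k - 1]               # fall back to the next-longest border
        if pat[k] == pat[i]:
            k += 1
        pi[i] = k
    return pi

def kmp_search(pat, txt):
    pi = compute_prefix(pat)
    k = 0                               # number of pattern characters matched so far
    for i in range(len(txt)):
        while k > 0 and pat[k] != txt[i]:
            k = pi[k - 1]
        if pat[k] == txt[i]:
            k += 1
        if k == len(pat):
            print("Pattern occurs with shift", i - len(pat) + 1)
            k = pi[k - 1]               # continue searching for further occurrences

kmp_search("AABA", "AABAACAADAABAAABAA")
On this input the sketch reports shifts 0, 9 and 13, matching the naive matcher output earlier.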
MODULE-IV
Definition of NP class Problem: - The set of all decision-based problems whose solutions cannot necessarily be produced
within polynomial time, but can be verified in polynomial time, forms the class NP. The NP class contains the
P class as a subset. NP problems are, in general, hard to solve.
Definition of Polynomial time: - If we produce an output for the given input within an amount of time bounded by a
polynomial in the input size, this is known as Polynomial time.
Definition of Non-Polynomial time: - If we produce an output for the given input but the running time is not bounded
by any polynomial in the input size, this is known as Non-Polynomial time. The output will still be produced, but the
time taken is not polynomially bounded.
Definition of Decision Based Problem: - A problem is called a decision problem if its output is a simple "yes" or "no"
(you may also think of this as true/false, 0/1, accept/reject). We will phrase many optimization problems as
decision problems. For example: given a graph G = (V, E), does there exist a Hamiltonian cycle?
Definition of NP-hard class: - A problem has to satisfy the following points to come into the division of NP-hard:
1. If we can solve this problem in polynomial time, then we can solve all NP problems in polynomial time.
2. Every problem in NP can be converted (reduced) to this problem within polynomial time.
Definition of NP-complete class: - A problem is NP-complete if
1. It is in NP, and
2. It is NP-hard.
Pictorial representation of all NP classes, which includes NP, NP-hard and NP-complete:
We could then inspect the graph and check that this is indeed a legal cycle and that it visits all of the vertices of the
graph exactly once. Thus, even though we know of no efficient way to solve the Hamiltonian cycle problem, there is
an efficient way to verify that a given cycle is indeed a Hamiltonian cycle.
Definition of Certificate: - A piece of information (for example, the list of vertices on a proposed path) that allows a proposed solution to be verified is known as a certificate.
Relation of P and NP classes
1. P is contained in NP
2. Whether P = NP (this is unknown)
1. Observe that P is contained in NP. In other words, if we can solve a problem in polynomial time, we can indeed
verify the solution in polynomial time. More formally, we do not need to see a certificate (there is no need
to specify the vertices/intermediate steps of the specific path) to solve the problem; we can solve it in polynomial
time anyway.
2. However, it is not known whether P = NP. The definition of NP only guarantees that a solution to a
decision-based problem in NP can be verified in polynomial time; it does not guarantee that a solution can also be
produced in polynomial time. Whether every polynomially verifiable problem is also polynomially solvable
remains an open question.
Reductions:
The class of NP-complete (NPC) problems consists of a set of decision problems (a subset of class NP) that no one knows
how to solve efficiently. But if there were a polynomial-time solution for even a single NP-complete problem, then every
problem in NPC would be solvable in polynomial time. For this, we need the concept of reductions.
Suppose there are two problems, A and B. You know that it is impossible to solve problem A in polynomial time. You
want to prove that B cannot be solved in polynomial time. We want to show that (A ∉ P) => (B ∉ P).
Consider an example to illustrate reduction. The following problem is well-known to be NPC:
3-color: Given a graph G, can each of its vertices be labeled with one of 3 different colors such that no two adjacent
vertices have the same label (color)?
Coloring arises in various partitioning issues where there is a constraint that two objects cannot be assigned to the
same set of partitions. The phrase "coloring" comes from the original application, which was in map drawing. Two
countries that share a common border should be colored with different colors.
It is well known that planar graphs can be colored (maps) with four colors. There exists a polynomial time algorithm
for this. But deciding whether this can be done with 3 colors is hard, and there is no polynomial time algorithm for it.
Fig: Example of 3-colorable and non-3-colorable graphs.
Polynomial Time Reduction:
We say that decision problem L1 is polynomial-time reducible to decision problem L2 (written L1 ≤p L2) if there is a
polynomial-time computable function f such that, for all x, x ϵ L1 if and only if f(x) ϵ L2.
NP-Completeness
A decision problem L is NP-Hard if
L' ≤p L for all L' ϵ NP.
Definition: L is NP-complete if
1. L ϵ NP and
2. L' ≤p L for some known NP-complete problem L'.
Given this formal definition, the complexity classes are:
P: is the set of decision problems that are solvable in polynomial time.
NP: is the set of decision problems that can be verified in polynomial time.
NP-Hard: L is NP-hard if for all L' ϵ NP, L' ≤p L. Thus if we can solve L in polynomial time, we can solve all NP problems
in polynomial time.
NP-Complete: L is NP-complete if
1. L ϵ NP and
2. L is NP-hard.
If any NP-complete problem is solvable in polynomial time, then every NP-complete problem is also solvable in
polynomial time. Conversely, if we can prove that any NP-complete problem cannot be solved in polynomial time,
then no NP-complete problem can be solved in polynomial time.
Reductions
Example: - Suppose there are two problems, A and B. You know that it is impossible to solve problem A in
polynomial time. You want to prove that B cannot be solved in polynomial time. So you convert
problem A into problem B in polynomial time.
Example of an NP-Complete problem: - Suppose a DECISION-BASED problem is provided in which, for a certain set of
inputs, you can get a high (true) output.
CIRCUIT SAT
According to the given decision-based NP problem, you can design the CIRCUIT and also verify a given output
within polynomial time. The CIRCUIT is provided below:
SAT (Satisfiability):-
A Boolean function is said to be SAT if the output for the given value of the input is true/high/1
F=X+YZ (Created a Boolean function by CIRCUIT SAT)
These points have to be proved for NPC:
1. CONCEPTS OF SAT
2. CIRCUIT SAT≤ρ SAT
3. SAT≤ρ CIRCUIT SAT
4. SAT ϵ NPC
1. CONCEPT: - A Boolean function is said to be SAT if the output for the given value of the input is true/high/1.
2. CIRCUIT SAT ≤ρ SAT: - In this conversion, you have to convert CIRCUIT SAT into SAT within polynomial
time, as we did above.
3. SAT ≤ρ CIRCUIT SAT: - For the sake of verification of an output, you have to convert SAT into CIRCUIT SAT
within polynomial time, and through the CIRCUIT SAT you can get the verification of an output
successfully.
4. SAT ϵ NPC: - As you know very well, you can get the SAT through CIRCUIT SAT that comes from NP.
Proof of NPC: - Reduction has been successfully made within polynomial time from CIRCUIT SAT to SAT. The output
has also been verified within polynomial time, as shown in the above conversion.
So concluded that SAT ϵ NPC.
3CNF SAT
Concept: - In 3CNF SAT, you have at least 3 clauses, and each clause has at most 3 literals or constants,
Such as (X+Y+Z) (X+Y+Z) (X+Y+Z)
You can define as (XvYvZ) ᶺ (XvYvZ) ᶺ (XvYvZ)
V=OR operator
^ =AND operator
All the following points need to be considered in 3CNF SAT.
To prove: -
1. Concept of 3CNF SAT
2. SAT≤ρ 3CNF SAT
3. 3CNF≤ρ SAT
4. 3CNF ϵ NPC
1. CONCEPT: - In 3CNF SAT, you have at least 3 clauses, and each clause has at most 3 literals or
constants.
2. SAT ≤ρ 3CNF SAT: - First, you need to convert a Boolean function created in SAT into 3CNF (product-of-sums)
form within polynomial time (a small truth-table check of this conversion is sketched after this list):
F=X+YZ
= (X+Y) (X+Z)
= (X+Y+ZZ') (X+YY'+Z)
= (X+Y+Z) (X+Y+Z') (X+Y+Z) (X+Y'+Z)
= (X+Y+Z) (X+Y+Z') (X+Y'+Z)
3. 3CNF ≤p SAT: - From the Boolean Function having three literals we can reduce the whole function into a
shorter one.
F= (X+Y+Z) (X+Y+Z') (X+Y'+Z)
= (X+Y+Z) (X+Y+Z') (X+Y+Z) (X+Y'+Z)
= (X+Y+ZZ') (X+YY'+Z)
= (X+Y) (X+Z)
= X+YZ
4. 3CNF ϵ NPC: - As you know very well, you can get the 3CNF through SAT and SAT through CIRCUIT SAT that
comes from NP.
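To sanity-check the SAT-to-3CNF algebra in step 2, the small (illustrative) truth-table comparison below confirms that F = X + YZ and the 3CNF form (X+Y+Z)(X+Y+Z')(X+Y'+Z) agree on all 8 assignments:
# Truth-table check (illustrative) that F = X + YZ equals its 3CNF form.
from itertools import product

def f_original(x, y, z):
    return x or (y and z)

def f_3cnf(x, y, z):
    return (x or y or z) and (x or y or (not z)) and (x or (not y) or z)

# Compare the two functions on all 8 assignments
print(all(f_original(x, y, z) == f_3cnf(x, y, z)
          for x, y, z in product([False, True], repeat=3)))   # prints: True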
Proof of NPC:-
1. It shows that you can easily convert a Boolean function of SAT into 3CNF SAT and satisfy the concept of
3CNF SAT within polynomial time, through the reduction concept.
2. If you want to verify the output in 3CNF SAT, then perform the reduction and convert it into SAT and CIRCUIT
SAT to check the output.
If you can achieve these two points, that means 3CNF SAT is also in NPC.
Clique
To Prove: - Clique is an NPC or not?
For this you have to satisfy the following below-mentioned points: -
1. Clique
2. 3CNF ≤ρ Clique
3. Clique ≤ρ 3CNF
4. Clique ϵ NP
1) Clique
Definition: - In a Clique, every vertex is directly connected to every other vertex, and the number of vertices in the Clique
represents the size of the Clique.
CLIQUE COVER: - Given a graph G and an integer k, can we find k subsets of vertices V1, V2, ..., Vk, such that ∪i Vi = V, and
such that each Vi is a clique of G?
The following figure shows a graph that has a clique cover of size 3.
2) 3CNF ≤ρ Clique
Proof: - For the successful conversion from 3CNF to Clique, you have to follow two steps:
Draw each clause as a group of vertices, where each vertex represents a literal of the clause. Connect two vertices by an
edge if:
1. They do not complement each other, and
2. They don't belong to the same clause.
In the conversion, the size of the Clique and the number of clauses of the 3CNF formula must be the same, and then you
have successfully converted 3CNF into Clique within polynomial time.
3) Clique ≤ρ 3CNF
Proof: - As you know, for a function of k clauses, there must exist a Clique of size k. It means that the variables which
come from different clauses can be assigned the same value (say 1). By using these values for all the variables of the
Clique, you can make the value of each clause in the function equal to 1.
4) Clique ϵ NP
Proof: - As you know very well, you can get the Clique through 3CNF, and to convert the decision-based NP problem
into 3CNF you have to first convert it into SAT, and SAT comes from NP.
So, concluded that CLIQUE belongs to NP.
Proof of NPC:-
1. Reduction achieved within the polynomial time from 3CNF to Clique
2. And verified the output after Reduction from Clique To 3CNF above
So, concluded that, if both reduction and verification can be done within polynomial time, that
means Clique is also in NPC.
Vertex Cover
1. Vertex Cover Definition
2. Vertex Cover ≤ρ Clique
3. Clique ≤ρ Vertex Cover
4. Vertex Cover ϵ NP
1) Vertex Cover:
Definition: - It represents a set of vertices (nodes) in a graph G(V, E) such that every edge of the graph has at least one endpoint in the set.
According to the graph G of vertex cover which you have created, the size of Vertex Cover =2
2) Vertex Cover ≤ρ Clique
In a graph G with N vertices, if you have a Vertex Cover of size K, then there must exist a Clique of
size N-K in the complement graph (see the sketch below).
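A small illustrative check of this complement relationship in Python (the example graph and helper names are assumed, not taken from the figure in these notes):
# If C is a vertex cover of size K in G, the remaining N-K vertices form a clique in G's complement.
from itertools import combinations

def is_vertex_cover(edges, cover):
    # every edge must have at least one endpoint in the cover
    return all(u in cover or v in cover for (u, v) in edges)

def is_clique(edges, vertices):
    # every pair of the given vertices must be adjacent in the graph described by `edges`
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in edge_set for u, v in combinations(vertices, 2))

# Example graph G on vertices 1..4 (assumed for illustration)
V = {1, 2, 3, 4}
E = [(1, 2), (2, 3), (3, 4), (2, 4)]

# Complement graph of G
E_complement = [(u, v) for u, v in combinations(sorted(V), 2)
                if (u, v) not in E and (v, u) not in E]

cover = {2, 3}                        # a vertex cover of size K = 2
rest = V - cover                      # the remaining N - K = 2 vertices
print(is_vertex_cover(E, cover))      # True
print(is_clique(E_complement, rest))  # True: {1, 4} is a clique in the complement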
Subset Cover
To Prove:-
1. Subset Cover
2. Vertex Cover ≤ρ Subset Cover
3. Subset Cover≤ρ Vertex Cover
4. Subset Cover ϵ NP
1) Subset Cover
Definition: - A collection of subsets of edges (one subset per chosen vertex) whose union gives all the edges of the graph G
is called a Subset Cover.
According to the graph G, which you have created the size of Subset Cover=2
1. v1{e1,e6} v2{e5,e2} v3{e2,e4,e6} v4{e1,e3,e5} v5{e4} v6{e3}
2. v3 ∪ v4 = {e1, e2, e3, e4, e5, e6}: the complete set of edges after the union of the vertices' edge sets.
Independent Set:
An independent set of a graph G = (V, E) is a subset V' ⊆ V of vertices such that every edge in E is incident on at most
one vertex in V'. The independent-set problem is to find a largest-size independent set in G. It is not hard to find
small independent sets, e.g., a small independent set is an individual node, but it is hard to find large independent
sets.