Lecture No. 1
Key Questions
Given a problem:
1st Question
Does a solution/algorithm exist?
Do we know of any problem for which no algorithm exists?
2nd Question
If a solution exists, is there an alternate, better solution?
3rd Question
What is the least time required to solve the problem?
- lower bound results
4th Question
Does there exist an algorithm that solves the problem in this least time?
Key Questions
5th Question
Is the known solution polynomial time?
What about primality?
6th Question
If the known solution is not polynomial time, does/will there exist a polynomial time solution?
7th Question
Can we prove that no polynomial time solution will ever exist?
8th Question
If we don't know a polynomial time solution and the answer to the 7th Question is no, then what?
Lecture No. 2
Algorithms
• Seen many algorithms
Sorting
- Insertion Sort
- Bubble Sort
- Merge Sort
- Quick Sort
- Heap Sort
- Radix Sort
- Counting Sort
Graph Searching
- Breadth First Search
- Depth First Search
- Tree Traversal
Graph Algorithms
- Shortest Path
- Minimum Spanning Tree
Searching
- Linear Search
- Binary Search
And Many More
Course Objective 1:
Algorithm Design Techniques
We already know one popular strategy: Divide & Conquer
Consider the Coin Change Problem with coins of denomination 1, 5, 10 & 25
The solution is easy
What is the guarantee that the solution works?
Introduce two more popular & widely applicable problem solving strategies:
• Dynamic Programming
• Greedy Algorithms
Course Objective 2:
One of the objectives of this course is to look at Question 5 to Question 8 in detail for a class of problems
Understand the famous P vs NP problem
We strongly believe that a certain important class of problems will not have polynomial time solutions.
Course Objective - 3
• How to deal with the class of problems for which we strongly believe that no polynomial time algorithm will exist?
• This class consists of important practical problems
Example: Traveling Salesman Problem, 0-1 Knapsack Problem, Bin Packing Problem
And many more
Course Objective - 3
1st Alternative
Try to get a polynomial time solution for an important particular instance of the problem
2nd Alternative
Backtracking Algorithms
With good heuristics this works well for some important particular instances of the problem
3rd Alternative
Approximation Algorithms
As the name indicates, these algorithms give approximate solutions but run in polynomial time.
Many other alternatives
Course Objective - 3
For a certain class of problems: to study, develop, and analyze 'good' approximation algorithms.
Text Book
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest & Clifford Stein,
Introduction to Algorithms,
Third Edition, PHI Learning Private Limited, 2015
Lecture No. 3
Course Objective 2:
One of the objectives of this course is to look at problems for which the known solution is non-polynomial and we can't prove that no polynomial time solution will ever exist.
Understand the famous P vs NP problem
We strongly believe that a certain important class of problems will not have polynomial time solutions.
Course Objective - 3
• How to deal with the class of problems for which we strongly believe that no polynomial time algorithm will exist?
• Approximation Algorithms
• This class consists of important practical problems
Example: Traveling Salesman Problem, 0-1 Knapsack Problem, Bin Packing Problem
And many more
Binary Search Tree
• Binary Search Tree: a well known, important data structure for efficient search
If T is a binary tree with n nodes, then
the min height of the tree is ⌊log₂ n⌋
the max height of the tree is n - 1
Disadvantage
The tree can be skewed, making search inefficient
Binary Search Tree
• How to fix the skewness?
Balanced trees
- AVL Trees
- Red-Black Trees
- Multi-way Search Trees, (2, 4) Trees
- Few more
Technique used for balancing: Rotations (Left Rotation or Right Rotation)
Binary Search Tree
• Disadvantage with balanced trees
The number of rotations needed to maintain the balanced structure is O(log n)
Question
Are there types of binary search trees for which the number of rotations needed for balancing is constant (independent of n)?
Course Objective - 4
Study a type of binary search tree called a Treap, for which the expected number of rotations needed for balancing is constant
We will prove that the expected number of rotations needed for balancing in a Treap is 2 (something really strong)
A Treap is a randomized data structure
Review - Algorithm Design Strategy
• Divide & Conquer
- Binary Search
- Merge Sort
- Quick Sort
- Matrix Multiplication (Strassen's Algorithm)
and many more
Many standard iterative algorithms can be written as divide & conquer algorithms.
Example: Sum/Max of n numbers (see the sketch below)
Question: Any advantage in doing so?
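A minimal sketch (ours, not from the slides) of the max of n numbers written as a divide & conquer algorithm:

    def dc_max(a, lo=0, hi=None):
        """Max of a[lo:hi] by divide & conquer: split, recurse, combine."""
        if hi is None:
            hi = len(a)
        if hi - lo == 1:          # base case: a single element
            return a[lo]
        mid = (lo + hi) // 2      # divide
        return max(dc_max(a, lo, mid), dc_max(a, mid, hi))  # conquer + combine

    print(dc_max([3, 1, 4, 1, 5, 9, 2, 6]))  # 9

Its recurrence is T(n) = 2T(n/2) + O(1) = O(n), the same asymptotic cost as the usual loop.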
Divide & Conquer
Divide-and-conquer algorithms:
1. Dividing the problem into smaller sub-
problems
(independent sub-problems)
2. Solving those sub-problems
3. Combining the solutions for those smaller
sub-problems to solve the original problem
Divide & Conquer
How to analyze Divide & Conquer Algorithms
Generally, using Recurrence Relations
• Let T(n) be the number of operations required to
solve the problem of size n.
Then T(n) = a T(n/b) + c(n) where
- each sub-problem is of size n/b
- There are a such sub-problems
- c(n) extra operations are required to combine the
solutions of sub-problems into a solution of the
original problem
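For example, a standard instance of this template: merge sort has a = 2 sub-problems of size n/2 and c(n) = O(n) merge work, so T(n) = 2T(n/2) + O(n), which solves to O(n log n).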
Divide & Conquer
• How to solve recurrence relations
- Substitution Method: works for simple relations
- Master Theorem: see the text book for details
Coin Change Problem
Given a value N, if we want to make change for N paisa and we have an infinite supply of each of C = {1, 5, 10, 25} valued coins, what is the minimum number of coins needed to make the change?
Solution - easy (we all know this): the greedy solution
Coin Change Problem
Key Observations
• Optimization problem
• Making change is possible, i.e., a solution exists
• Multiple solutions exist
For example, for 60: {25, 25, 10} or {10, 10, 10, 10, 10, 10}
• The question has two parts:
1. The minimum number of coins required
2. Which coins are part of the solution
• What is the guarantee that the solution works?
Coin Change Problem
Key Observations
Suppose we add a coin of denomination 20 to the set, i.e., C = {1, 5, 10, 20, 25}.
If N = 40, then the greedy solution fails:
the greedy solution gives the answer 3, namely {25, 10, 5},
whereas the correct answer is {20, 20} (see the sketch below).
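A minimal sketch (ours, not the slides' code) that reproduces this failure:

    def greedy_change(coins, n):
        """Repeatedly take the largest coin that still fits."""
        picked = []
        for c in sorted(coins, reverse=True):
            while n >= c:
                n -= c
                picked.append(c)
        return picked

    def dp_min_coins(coins, n):
        """best[j] = min coins to pay j (the DP developed later in the lecture)."""
        INF = float("inf")
        best = [0] + [INF] * n
        for j in range(1, n + 1):
            best[j] = min((best[j - c] + 1 for c in coins if c <= j), default=INF)
        return best[n]

    print(greedy_change([1, 5, 10, 20, 25], 40))  # [25, 10, 5] -> 3 coins
    print(dp_min_coins([1, 5, 10, 20, 25], 40))   # 2, i.e. {20, 20}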
Course Objective
Main Question:
Given a problem (mostly an optimization problem):
1. How to decide whether to use the Greedy strategy or the Dynamic Programming strategy?
2. How to show that the solution works?
We will explore these ideas in the initial few lectures.
Coin Change Problem
Suppose we want to compute the minimum number of coins with values d[1], d[2], …, d[n], where each d[i] > 0 and where the coin of denomination i has value d[i].
Let c[i][j] be the minimum number of coins required to pay an amount of j units, 0 ≤ j ≤ N, using only coins of denominations 1 to i, 1 ≤ i ≤ n.
c[n][N] is the solution to the problem.
Coin Change Problem
In calculating c[i][j], notice that:
• Suppose we do not use the coin with value d[i] in the solution of the (i, j)-problem; then c[i][j] = c[i-1][j]
• Suppose we use the coin with value d[i] in the solution of the (i, j)-problem; then c[i][j] = 1 + c[i][j - d[i]]
Since we want to minimize the number of coins, we choose whichever is the better alternative.
Coin Change Problem - Recurrence
Therefore
c[i][j] = min{ c[i-1][j], 1 + c[i][j - d[i]] }
and
c[i][0] = 0 for every i
Alternative 1
Recursive algorithm
Overlapping Subproblems
When a recursive algorithm revisits the same subproblem over and over again, we say that the optimization problem has overlapping subproblems.
How to observe/prove that a problem has overlapping subproblems?
Answer - draw the computation tree and observe.
Overlapping Subproblems
Computation Tree (figure)
Lecture No. 4
Coin Change Problem
Suppose we have an infinite supply of n coin denominations d[1], d[2], …, d[n], where each d[i] > 0 and where the coin of denomination i has value d[i].
Given an amount N, we want to compute the minimum number of coins needed to make change for N.
Let c[i][j] be the minimum number of coins required to pay an amount of j units, 0 ≤ j ≤ N, using only coins of denominations 1 to i, 1 ≤ i ≤ n.
c[n][N] is the solution to the problem.
Overlapping Subproblems
Computation tree example: computing the Fibonacci recurrence F(n) = F(n-1) + F(n-2) revisits the same subproblems over and over (figure).
The table gives the solution to our problem for all instances involving a payment of 8 units or less (table figure omitted).
Analysis
Time Complexity
We have to compute n(N+1) entries.
Each entry takes constant time to compute.
Running time: O(nN)
Question
• How can you modify the algorithm to actually compute the change (i.e., the multiplicities of the coins)?
• Modify the algorithm to handle the exceptional cases:
if i = 1 and j < d[1], c[i][j] = +∞
if i = 1 and j ≥ d[1], c[1][j] = 1 + c[1][j - d[1]]
if i > 1 and j < d[i], c[i][j] = c[i-1][j]
• In case there is no solution the algorithm will return +∞ (see the sketch below).
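A minimal bottom-up sketch of this table (ours, not the slides' code; d is 0-indexed in Python while the slides use 1-indexing), including the +∞ convention and coin reconstruction:

    def coin_change_table(d, N):
        """c[i][j]: min coins to pay j using denominations d[1..i] of the slides."""
        INF = float("inf")
        n = len(d)
        c = [[0] * (N + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, N + 1):
                skip = c[i - 1][j] if i > 1 else INF                      # don't use coin i
                take = 1 + c[i][j - d[i - 1]] if j >= d[i - 1] else INF   # use coin i once more
                c[i][j] = min(skip, take)
        return c

    def reconstruct(c, d, N):
        """Walk back through the table to recover which coins were used."""
        coins, i, j = [], len(d), N
        while j > 0:
            if i > 1 and c[i][j] == c[i - 1][j]:
                i -= 1                    # coin i unused
            else:
                coins.append(d[i - 1])    # coin i used once
                j -= d[i - 1]
        return coins

    c = coin_change_table([1, 5, 10, 20, 25], 40)
    print(c[5][40], reconstruct(c, [1, 5, 10, 20, 25], 40))  # 2 [20, 20]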
Optimal Substructure Property
• Why does the solution work?
Optimal Substructure Property / Principle of Optimality
• The optimal solution to the original problem incorporates optimal solutions to the subproblems. This is a hallmark of problems amenable to dynamic programming.
• Not all problems have this property.
Optimal Substructure Property
• In our example, though we are interested only in c[n][N], we took it for granted that all the other entries in the table must also represent optimal choices.
• If c[i][j] is the optimal way of making change for j units using coins of denominations 1 to i, then c[i-1][j] & c[i][j-d[i]] must also give the optimal solutions to the instances they represent.
Optimal Substructure Property
How to prove the Optimal Substructure Property?
Generally by a cut-and-paste argument or by contradiction.
Note
The Optimal Substructure Property looks obvious, but it does not apply to every problem.
Exercise:
Give a problem which does not exhibit the Optimal Substructure Property.
Dynamic Programming Algorithm
A dynamic-programming algorithm can be broken into a sequence of four steps:
1. Characterize the structure of an optimal solution.
   (Optimal Substructure Property)
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up fashion.
   (Overlapping subproblems)
4. Construct an optimal solution from computed information.
   (not always necessary)
Lecture No. 5
The 0-1 Knapsack Problem
Given: a set S of n items, with each item i having
w_i - a positive "weight"
v_i - a positive "benefit"
Goal:
Choose items with maximum total benefit but with weight at most W.
We are not allowed to take fractional amounts.
Let T denote the set of items we take:
Objective: maximize Σ_{i∈T} v_i
Constraint: Σ_{i∈T} w_i ≤ W
Greedy Approach
Possible greedy approaches:
Approach 1: pick the item with the largest value first
Approach 2: pick the item with the least weight first
Approach 3: pick the item with the largest value per weight first
None of the above approaches works.
Exercise:
Prove by giving counterexamples.
Brute Force Approach
• Brute Force
The naive way to solve this problem is to go through all 2ⁿ subsets of the n items and pick the subset with a legal weight that maximizes the value of the knapsack.
Recursive Formulation
Very similar to the Coin Change Problem
S_k: the set of items numbered 1 to k.
Define B[k, w] to be the best selection from S_k with weight at most w. Then
B[k, w] = B[k-1, w], if w_k > w
B[k, w] = max{ B[k-1, w], B[k-1, w-w_k] + v_k }, otherwise
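A minimal bottom-up sketch of this recurrence (ours; assumes integer weights):

    def knapsack_01(w, v, W):
        """B[k][cap]: best benefit using items 1..k with weight limit cap."""
        n = len(w)
        B = [[0] * (W + 1) for _ in range(n + 1)]
        for k in range(1, n + 1):
            for cap in range(W + 1):
                B[k][cap] = B[k - 1][cap]                  # skip item k
                if w[k - 1] <= cap:                        # take item k if it fits
                    B[k][cap] = max(B[k][cap], B[k - 1][cap - w[k - 1]] + v[k - 1])
        return B[n][W]

    print(knapsack_01([2, 3, 4], [3, 4, 6], 5))  # 7: items of weight 2 and 3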
Lecture No. 6
Re Cap - Recursive Formulation
Coin Change Problem
c[i][j] = min{ c[i-1][j], 1 + c[i][j - d[i]] }
0-1 Knapsack Problem
B[k, w] = B[k-1, w], if w_k > w
B[k, w] = max{ B[k-1, w], B[k-1, w-w_k] + v_k }, otherwise
Subset Sum Problem
OPT(j, t) = max{ OPT(j-1, t), x_j + OPT(j-1, t - x_j) }
Matrix Multiplication
If A is a p × q matrix and B is a q × r matrix, their product C = AB is the p × r matrix with
c[i, j] = Σ_{k=1}^{q} a[i, k] · b[k, j], where 1 ≤ i ≤ p and 1 ≤ j ≤ r
OR: c[i, j] is the dot product of the ith row of A with the jth column of B.
Properties
• Matrix multiplication is associative, i.e., A1(A2A3) = (A1A2)A3
So parenthesization does not change the result.
• It may appear that the amount of work done won't change if you change the parenthesization of the expression.
• But that is not the case!
Example
• Let us use the following example:
- Let A be a 2 × 10 matrix
- Let B be a 10 × 50 matrix
- Let C be a 50 × 20 matrix
• Computing A(BC): total multiplications = 10·50·20 + 2·10·20 = 10000 + 400 = 10400
• Computing (AB)C: total multiplications = 2·10·50 + 2·50·20 = 1000 + 2000 = 3000
A substantial difference in the cost of computing the product.
Matrix Chain Multiplication
• Thus, our goal today is:
• Given a chain of matrices to multiply, determine the fewest number of multiplications necessary to compute the product.
• Let d_i × d_{i+1} denote the dimensions of matrix A_i.
• Let A = A_0 A_1 … A_{n-1}
• Let N_{i,j} denote the minimal number of multiplications necessary to find the product A_i A_{i+1} … A_j.
• We want to determine N_{0,n-1}, the minimal number of multiplications necessary to find A.
• That is, determine how to parenthesize the multiplications.
Matrix Chain Multiplication
1st Approach - Brute Force
• Given the matrices A1, A2, A3, A4, assume the dimensions of A1 = d0 × d1, etc.
• Five possible parenthesizations of these arrays, along with the number of multiplications:
(A1A2)(A3A4): d0d1d2 + d2d3d4 + d0d2d4
((A1A2)A3)A4: d0d1d2 + d0d2d3 + d0d3d4
(A1(A2A3))A4: d1d2d3 + d0d1d3 + d0d3d4
A1((A2A3)A4): d1d2d3 + d1d3d4 + d0d1d4
A1(A2(A3A4)): d2d3d4 + d1d2d4 + d0d1d4
Matrix Chain Multiplication
Questions:
• How many possible parenthesizations are there?
• At least, a lower bound?
The number of parenthesizations is at least Ω(2ⁿ). Exercise: prove it.
The exact number is given by the recurrence relation
T(n) = Σ_{k=1}^{n-1} T(k) T(n-k)
because the original product can be split into two parts in (n-1) places, and each split is to be parenthesized optimally.
Matrix Chain Multiplication
The solution to the recurrence is the famous Catalan numbers:
T(n) = Ω(4ⁿ / n^{3/2})
Lecture No. 7
Optimal Substructure Property
If a particular parenthesization of the whole product is optimal, then any sub-parenthesization in that product is optimal as well.
Matrix Chain Multiplication
Step 2: Recursive Formulation
Let M[i, j] represent the minimum number of multiplications required for the matrix product A_i × ⋯ × A_j, for 1 ≤ i ≤ j ≤ n.
High-level parenthesization for A_{i..j}
Notation: A_{i..j} = A_i × ⋯ × A_j
For any optimal multiplication sequence, at the last step we are multiplying two matrices A_{i..k} and A_{k+1..j} for some k, i.e.,
A_{i..j} = (A_i × ⋯ × A_k)(A_{k+1} × ⋯ × A_j) = A_{i..k} A_{k+1..j}
Matrix Chain Multiplication
Thus,
M[i, j] = M[i, k] + M[k+1, j] + d_{i-1} d_k d_j
Thus the problem of determining the optimal sequence of multiplications is broken down to the following question:
How do we decide where to split the chain? (What is k?)
Answer:
Search all possible values of k and take the minimum over them.
Matrix Chain Multiplication
Therefore,
M[i, j] = 0, if i = j
M[i, j] = min_{i ≤ k < j} { M[i, k] + M[k+1, j] + d_{i-1} d_k d_j }, if i < j
Step 3:
Compute the value of an optimal solution in a bottom-up fashion.
Overlapping Subproblems
Matrix Chain Multiplication
Which subproblems is it necessary to solve first?
By definition M[i, i] = 0.
Clearly it is necessary to solve the smaller problems before the larger ones.
• In particular, we need to know M[i, i+1], the number of multiplications to multiply any adjacent pair of matrices, before we move on to larger tasks.
(Chains of length 1)
• The next task is finding all the values of the form M[i, i+2], then M[i, i+3], etc.
(Chains of length 2, then chains of length 3, and so on)
Matrix Chain Multiplication
• This tells us the order in which to build the table: by diagonals.
Diagonal indices:
• On diagonal 0, j = i
• On diagonal 1, j = i + 1
• On diagonal q, j = i + q
• On diagonal n-1, j = i + n - 1
Matrix Chain Multiplication
Example
• Array dimensions:
A1: 2 × 3, A2: 3 × 5, A3: 5 × 2, A4: 2 × 4, A5: 4 × 3
(so d0, …, d5 = 2, 3, 5, 2, 4, 3)
M[2, 5] = min{ M[2,2] + M[3,5] + d1 d2 d5,
               M[2,3] + M[4,5] + d1 d3 d5,
               M[2,4] + M[5,5] + d1 d4 d5 }
Matrix Chain Multiplication
Table for M[i, j] (figure)
Matrix Chain Multiplication
Optimal locations for parentheses: table for s[i, j] (figure)
s[i, j] records the split point k achieving the minimum for each (i, j); the multiplication sequence is recovered from it as sketched below.
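A minimal sketch (ours) of the bottom-up computation and the recovery of the parenthesization from the split table s (the tables themselves were figures in the slides):

    def matrix_chain(d):
        """d[i-1] x d[i] are the dimensions of A_i, i = 1..n (n = len(d) - 1)."""
        n = len(d) - 1
        M = [[0] * (n + 1) for _ in range(n + 1)]
        s = [[0] * (n + 1) for _ in range(n + 1)]
        for length in range(2, n + 1):          # chain length: diagonals of the table
            for i in range(1, n - length + 2):
                j = i + length - 1
                M[i][j] = float("inf")
                for k in range(i, j):           # try every split point
                    cost = M[i][k] + M[k + 1][j] + d[i - 1] * d[k] * d[j]
                    if cost < M[i][j]:
                        M[i][j], s[i][j] = cost, k
        return M, s

    def parens(s, i, j):
        """Rebuild the optimal parenthesization from the split table s."""
        if i == j:
            return f"A{i}"
        k = s[i][j]
        return f"({parens(s, i, k)}{parens(s, k + 1, j)})"

    M, s = matrix_chain([2, 3, 5, 2, 4, 3])     # A1..A5 from the example above
    print(M[1][5], parens(s, 1, 5))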
Lecture No. 8
All-Pairs Shortest Path
Let G = (V, E) be a graph with no negative weight cycles.
Vertices are labeled 1 to n.
If (i, j) is an edge, its weight is denoted by w_ij.
Optimal Substructure Property: easy to prove.
Recursive formulation - 1
Let l_ij^(m) be the minimum weight of any path from vertex i to vertex j that contains at most m edges.
Then
l_ij^(0) = 0, if i = j
l_ij^(0) = ∞, if i ≠ j
and, for m ≥ 1,
l_ij^(m) = min( l_ij^(m-1), min_{1 ≤ k ≤ n} { l_ik^(m-1) + w_kj } )
with the answer given by l_ij^(n-1).
Optimal Binary Search Tree
Given keys k_1, …, k_n with search probabilities p_1, …, p_n, and dummy keys d_0, …, d_n (the gaps between keys) with probabilities q_0, …, q_n:
Σ_{i=1}^{n} p_i + Σ_{i=0}^{n} q_i = 1
Because we have probabilities of searches for each key and each dummy key, we can determine the expected cost of a search in a given binary search tree T.
Let us assume that the actual cost of a search is the number of nodes examined, i.e., the depth of the node found by the search in T, plus 1.
Optimal Binary Search Tree
Then the expected cost of a search in T is
E[search cost in T] = Σ_{i=1}^{n} (depth_T(k_i) + 1) · p_i + Σ_{i=0}^{n} (depth_T(d_i) + 1) · q_i
(Figure: two binary search trees on keys k1, …, k5 with dummy leaves d0, …, d5; tree (a) costs 2.80 and tree (b) costs 2.75, with dummy-key probabilities q_0..q_5 = 0.05, 0.10, 0.05, 0.05, 0.05, 0.10.)
Optimal Binary Search Tree
We start with a problem regarding binary search trees in an environment in which the probabilities of accessing elements and the gaps between elements are known.
Goal:
We want to find the binary search tree that minimizes the expected number of nodes probed in a search.
Optimal Binary Search Tree
Brute Force Approach:
Exhaustive checking of all possibilities.
Question
What is the number of binary search trees on n keys?
Answer
The Catalan numbers again: Ω(4ⁿ / n^{3/2}).
w(i, j) = Σ_{l=i}^{j} p_l + Σ_{l=i-1}^{j} q_l
Optimal Binary Search Tree
Thus, if k_r is the root of an optimal subtree containing keys k_i, …, k_j, we have
e[i, j] = p_r + (e[i, r-1] + w(i, r-1)) + (e[r+1, j] + w(r+1, j))
Noting that w(i, j) = w(i, r-1) + p_r + w(r+1, j),
we rewrite e[i, j] as
e[i, j] = e[i, r-1] + e[r+1, j] + w(i, j)
The recursive equation above assumes that we know which node k_r to use as the root.
We choose the root that gives the lowest expected search cost.
Final recursive formulation:
𝑞𝑖−1 , 𝑖𝑓 𝑗 = 𝑖 − 1
𝑒 𝑖, 𝑗 = ቐ min { 𝑒 𝑖, 𝑟 − 1 + 𝑒 𝑟 + 1, 𝑗 + 𝑤(𝑖, 𝑗)}, 𝑖𝑓 𝑖 ≤ 𝑗
𝑖≤𝑟≤𝑗
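A minimal bottom-up sketch of this recurrence (ours; p is 1-indexed so p[i] matches k_i, q is 0-indexed). The p values below are assumed for illustration (the figure's p row was lost); the q values are the ones from the figure above:

    def optimal_bst(p, q):
        """e[i][j]: expected cost of an optimal BST on keys k_i..k_j."""
        n = len(p) - 1                    # p[0] is an unused placeholder
        e = [[0.0] * (n + 1) for _ in range(n + 2)]
        w = [[0.0] * (n + 1) for _ in range(n + 2)]
        for i in range(1, n + 2):
            e[i][i - 1] = w[i][i - 1] = q[i - 1]   # empty subtree: only dummy d_{i-1}
        for length in range(1, n + 1):
            for i in range(1, n - length + 2):
                j = i + length - 1
                w[i][j] = w[i][j - 1] + p[j] + q[j]
                e[i][j] = min(e[i][r - 1] + e[r + 1][j] + w[i][j]
                              for r in range(i, j + 1))
        return e[1][n]

    p = [0, 0.15, 0.10, 0.05, 0.10, 0.20]      # assumed demo values; sum(p)+sum(q) = 1
    q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]
    print(optimal_bst(p, q))                   # 2.75, the cost of tree (b) above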
Longest Common Subsequence (LCS)
Computing the table: for sequences X = x_1 … x_m and Y = y_1 … y_n, let c[i, j] be the length of an LCS of x_1..x_i and y_1..y_j; c[m, n] is the answer (table layout figure omitted).
c[i, j] = 0, if i = 0 or j = 0
c[i, j] = c[i-1, j-1] + 1, if x_i = y_j (and b[i, j] = "↖")
c[i, j] = max(c[i, j-1], c[i-1, j]), if x_i ≠ y_j (and b[i, j] = "↑" or "←" accordingly)
Pseudo Code for LCS
1.  for i ← 1 to m
2.      do c[i, 0] ← 0
3.  for j ← 0 to n
4.      do c[0, j] ← 0
5.  for i ← 1 to m
6.      do for j ← 1 to n
7.          do if x_i = y_j
8.              then c[i, j] ← c[i-1, j-1] + 1
9.                   b[i, j] ← "↖"
10.             else if c[i-1, j] ≥ c[i, j-1]
11.                 then c[i, j] ← c[i-1, j]
12.                      b[i, j] ← "↑"
13.                 else c[i, j] ← c[i, j-1]
14.                      b[i, j] ← "←"
15. return c and b
Running time: O(mn)
Example
X = ⟨A, B, C, B, D, A, B⟩, Y = ⟨B, D, C, A, B, A⟩
The c[i, j] table (row i: prefix of X; column j: prefix of Y):

           j: 0  1(B) 2(D) 3(C) 4(A) 5(B) 6(A)
    i=0       0  0    0    0    0    0    0
    i=1 (A)   0  0    0    0    1    1    1
    i=2 (B)   0  1    1    1    1    2    2
    i=3 (C)   0  1    1    2    2    2    2
    i=4 (B)   0  1    1    2    2    3    3
    i=5 (D)   0  1    2    2    2    3    3
    i=6 (A)   0  1    2    2    3    3    4
    i=7 (B)   0  1    2    2    3    4    4
Constructing an LCS
Start at b[m, n] and follow the arrows. When we encounter a "↖" in b[i, j], x_i = y_j is an element of the LCS.
For the table above, the LCS is ⟨B, C, B, A⟩.
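A compact sketch (ours, not the slides') that builds c and recovers the LCS directly from it, without storing the arrow table b:

    def lcs(X, Y):
        """Fill the length table per the recurrence, then walk backwards."""
        m, n = len(X), len(Y)
        c = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if X[i - 1] == Y[j - 1]:
                    c[i][j] = c[i - 1][j - 1] + 1
                else:
                    c[i][j] = max(c[i][j - 1], c[i - 1][j])
        out, i, j = [], m, n
        while i > 0 and j > 0:
            if X[i - 1] == Y[j - 1]:          # the "diagonal arrow" case
                out.append(X[i - 1]); i -= 1; j -= 1
            elif c[i - 1][j] >= c[i][j - 1]:  # the "up arrow" case
                i -= 1
            else:                             # the "left arrow" case
                j -= 1
        return "".join(reversed(out))

    print(lcs("ABCBDAB", "BDCABA"))  # BCBA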
Lecture No. 9
DP - Exercise
Exercise:
1. Partition Problem
You have a set of n integers. The problem is to partition these integers into two subsets such that you minimize |S1 - S2|, where S1 and S2 denote the sums of the elements in each of the two subsets.
2. Edit Distance
Find the minimum number of edits - insertions, deletions, and substitutions of characters - needed to transform the first string into the second.
3. Independent Set
An independent set is a set I ⊆ V of vertices such that for all u, v ∈ I, (u, v) ∉ E. The independent set problem is to find a maximum size independent set in G.
DP - Exercise
4. Palindromes
Given a string s, we are interested in computing the minimum number of palindromes from which one can construct s (that is, the minimum k such that s can be written as w1 w2 … wk where w1, w2, …, wk are all palindromes).
5. Longest Palindromic Subsequence
Devise an algorithm that takes a sequence x[1..n] and returns the length of the longest palindromic subsequence.
6. Longest Increasing Subsequence
Given a sequence of numbers, find the longest increasing subsequence.
Lecture No. 10
Activity Selection Problem
• Input: a set of activities S = {a1, …, an}
• Each activity ai has a start time si and a finish time fi such that 0 ≤ si < fi < ∞
• An activity ai takes place in the half-open interval [si, fi)
• Two activities are compatible if and only if their intervals do not overlap, i.e., ai and aj are compatible if [si, fi) and [sj, fj) do not overlap (i.e., si ≥ fj or sj ≥ fi)
• Output: a maximum-size subset of mutually compatible activities
• Also called the Interval Scheduling Problem
Activity Selection Problem
Example (figure)
Lecture No. 11
Activity Selection Problem
• Brute force solution: try all possible subsets
Running time: O(2ⁿ)
• "Greedy" strategies:
Greedy 1: pick the shortest activity, eliminate all activities that conflict with it, and recurse.
Greedy 2: pick the activity that starts first, eliminate all the activities that conflict with it, and recurse.
Greedy 3: pick the activity that ends first, eliminate all the activities that conflict with it, and recurse.
Observe
• Greedy 1 and Greedy 2 do not work
• Greedy 3 seems to work (why?) - a sketch follows.
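A minimal sketch of Greedy 3 (earliest finish time first; ours, not the slides'):

    def select_activities(acts):
        """acts: list of (start, finish). Sort by finish time; keep every
        activity that starts at or after the last chosen finish."""
        chosen, last_finish = [], float("-inf")
        for s, f in sorted(acts, key=lambda a: a[1]):
            if s >= last_finish:         # compatible with everything chosen so far
                chosen.append((s, f))
                last_finish = f
        return chosen

    print(select_activities([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (6, 10)]))
    # [(1, 4), (5, 7)]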
Notation
Weighted interval scheduling (each activity j has a value v_j; p(j) denotes the last activity before j that is compatible with it):
S(j) = 0, if j = 0
S(j) = max{ v_j + S(p(j)), S(j-1) }, otherwise
Interval Partitioning
Interval Partitioning
Lecture i starts at si and finishes at fi.
Goal:
Find the minimum number of classrooms to schedule all lectures so that no two occur at the same time in the same room.
Answer
• Depth - given a set of intervals, the depth of this set is the maximum number of open intervals that contain any single time t.
Interval Partitioning
Lemma
In any instance of interval partitioning we need at least depth many classrooms to schedule the courses.
Proof
This is simply because, by the definition of depth, there is a time t and depth many courses that are all running at time t. These courses are mutually incompatible, i.e., no two of them can be scheduled in the same classroom. So in any schedule we need at least depth many classrooms.
Greedy Strategies
Greedy Strategies:
Greedy 1: pick the shortest activity - ascending order of fj - sj
Greedy 2: pick the activity that starts first - ascending order of sj
Greedy 3: pick the activity that ends first - ascending order of fj
Greedy Algorithm
Greedy algorithm
Consider lectures in increasing order of start time: assign each lecture to any compatible classroom (sketch below).
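A minimal sketch of this greedy (ours), using a min-heap keyed on each classroom's last finish time; it opens a new room only when no existing room is free:

    import heapq

    def partition_intervals(lectures):
        """lectures: list of (start, finish). Returns the number of rooms used."""
        finish_heap = []                           # earliest-finishing room on top
        for s, f in sorted(lectures):              # increasing start time
            if finish_heap and finish_heap[0] <= s:
                heapq.heapreplace(finish_heap, f)  # reuse the freed room
            else:
                heapq.heappush(finish_heap, f)     # open a new room
        return len(finish_heap)

    print(partition_intervals([(0, 3), (1, 4), (3, 7), (4, 8)]))  # 2 (= depth)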
Lecture No. 12
Greedy - Exercise
Suppose you were to drive from Mumbai to Pune along the express highway. Your car's petrol tank holds enough petrol to travel M miles. You have a map that gives the distances between petrol stations along the route. Let d1 < d2 < … < dn be the locations of all the petrol stations along the route, where di is the distance of the petrol station from Mumbai. Assume that the distance between neighboring petrol stations is less than M miles. Your goal is to complete the journey making as few petrol stops as possible along the way.
Design an efficient algorithm to determine at which petrol stations to halt. Prove that your algorithm gives an optimal solution, and analyze the time complexity of the algorithm.
The Fractional Knapsack Problem
Given: a set S of n items, with each item i having
w_i - a positive "weight"
v_i - a positive "benefit"
Goal:
Choose items with maximum total benefit but with weight at most W.
Here we are allowed to take fractional amounts.
The Fractional Knapsack Problem
Possible greedy strategies:
• Pick the items in increasing order of weight
• Pick the items in decreasing order of benefit
• Pick the items in decreasing order of value per pound
Note:
The first two strategies do not give an optimal solution.
Counterexamples - exercise.
Greedy Algorithm
We can solve the fractional knapsack problem with a greedy algorithm:
• Compute the value per pound (vi/wi) for each item
• Sort the items by value per pound (decreasing)
• Always take as much as possible of the remaining item with the highest value per pound (sketch below)
Time Complexity:
With n items, this greedy algorithm takes O(n log n) time.
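A minimal sketch of this greedy (ours, not the slides' code):

    def fractional_knapsack(items, W):
        """items: list of (value, weight). Take greedily by value density."""
        total = 0.0
        for v, w in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
            if W <= 0:
                break
            take = min(w, W)              # as much of this item as still fits
            total += v * take / w
            W -= take
        return total

    print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))  # 240.0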
Greedy Choice Property
• Theorem
Consider a knapsack instance P, and let item 1 be the item of highest value density.
Then there exists an optimal solution to P that uses as much of item 1 as possible (that is, min(w1, W)).
Proof:
Suppose we have an optimal solution Q that uses weight w < min(w1, W) of item 1.
Let w′ = min(w1, W) - w.
Greedy Choice Property
Q must contain at least weight w′ of some other item(s), since it never pays to leave the knapsack partly empty.
Construct Q* from Q by removing w′ worth of other items and replacing it with w′ worth of item 1.
Because item 1 has the maximum value per weight, Q* has total value at least as big as Q.
Alternate Proof - 1
Alternate Proof:
Assume the objects are sorted in order of value per pound.
Let vi be the value of item i and let wi be its weight.
Let xi be the fraction of object i selected by greedy, and let V(X) be the total value obtained by greedy.
Alternate Proof (2) - Proof by Contradiction
We can also prove this by contradiction.
We start by assuming that there is an optimal solution where we did not take as much of item i as possible, and we also assume that our knapsack is full (if it is not full, just add more of item i).
Since item i has the highest value to weight ratio, there must exist an item j in our knapsack such that v_j / w_j < v_i / w_i.
We can take out weight x of item j from our knapsack and add weight x of item i (since we take out x weight and put in x weight, we are still within capacity).
The change in value of our knapsack is x·(v_i / w_i) - x·(v_j / w_j) > 0.
Therefore we arrive at a contradiction: the "so-called" optimal solution in our starting assumption can in fact be improved by taking out some of item j and adding more of item i. Hence it is not optimal.
Lecture No. 13
Minimum Spanning Tree (MST)
Optimal substructure for MST
Prim's Algorithm
Choose some v ∈ V and let S = {v}
Let T = Ø
While S ≠ V:
    Choose a least-cost edge e with one endpoint in S and one endpoint in V - S
    Add e to T
    Add both endpoints of e to S
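A minimal heap-based sketch of Prim's algorithm (ours; it uses lazy deletion of stale heap entries instead of a decrease-key operation):

    import heapq

    def prim(adj, start=0):
        """adj: {u: [(cost, v), ...]}. Returns a list of MST edges (cost, u, v)."""
        in_S, tree = {start}, []
        frontier = [(c, start, v) for c, v in adj[start]]
        heapq.heapify(frontier)
        while frontier and len(in_S) < len(adj):
            c, u, v = heapq.heappop(frontier)   # least-cost edge leaving S (maybe stale)
            if v in in_S:
                continue                        # stale: both endpoints already in S
            in_S.add(v)
            tree.append((c, u, v))
            for c2, w in adj[v]:
                if w not in in_S:
                    heapq.heappush(frontier, (c2, v, w))
        return tree

    adj = {0: [(1, 1), (4, 2)], 1: [(1, 0), (2, 2)], 2: [(4, 0), (2, 1)]}
    print(prim(adj))  # [(1, 0, 1), (2, 1, 2)]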
Minimum Spanning Tree (MST)
Greedy-choice property:
Exercise:
Shortest Path Problem
Minimize Time in the System
Problem Statement:
• A single server with N customers to serve
• Customer i will take time t_i, 1 ≤ i ≤ N, to be served.
• Goal: minimize the average time that a customer spends in the system,
where time in system for customer i = total waiting time + t_i
• Since N is fixed, we try to minimize the total time spent by all customers:
Minimize T = Σ_{i=1}^{N} (time in system for customer i)
Minimize Time in the System
Example:
Assume that we have 3 jobs with t1 = 5, t2 = 3, t3 = 7 (see the sketch below).
From the exchange argument: if s_k is the service time of the customer served in position k, and order O swaps the customers in positions a and b, then
T(O) = (n - a + 1) s_b + (n - b + 1) s_a + Σ_{k=1, k≠a,b}^{n} (n - k + 1) s_k
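A tiny check (ours) that serving jobs in increasing service-time order minimizes the total time in the system, using T = Σ_k (n-k+1)·s_k over all serving orders of the example:

    from itertools import permutations

    def total_time(order):
        """Sum of time-in-system: the k-th served job is counted (n-k+1) times."""
        n = len(order)
        return sum((n - k) * t for k, t in enumerate(order))

    best = min(permutations([5, 3, 7]), key=total_time)
    print(best, total_time(best))  # (3, 5, 7) 26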
Scheduling with Deadlines
Example (figure): two feasible sequences S_I and S_J, e.g. S_J = ⟨u, s, t, p, r, v, q, w⟩, before and after reorganization; after the reorganization, the jobs common to both sequences are scheduled at the same times, with gaps where a sequence has no job.
Proof of the Claim
Proof of the claim:
Suppose that some job a occurs in both feasible sequences S_I and S_J, scheduled at times t_I and t_J respectively.
Case 1: If t_I = t_J, there is nothing to prove.
Case 2: If t_I < t_J:
Note that since sequence S_J is feasible, it follows that the deadline of job a is no earlier than t_J.
Modify sequence S_I as follows:
Case i: if there is a gap in S_I at time t_J, move job a from time t_I into the gap at t_J.
Case ii: if there is some job b scheduled in S_I at time t_J, exchange jobs a and b in S_I.
The resulting sequence is still feasible, since in either case job a will be executed by its deadline, and in the second case job b is moved to an earlier time and so can still be executed.
So job a is executed at the same time t_J in both modified sequences.
Case 3: If t_I > t_J, a similar argument works, except that in this case S_J is modified.
Greedy Proof
Once job a has been treated in this way, we never need to move it again.
Therefore, if S_I and S_J have m jobs in common, after at most m modifications of either S_I or S_J we can ensure that all the jobs common to I and J are scheduled at the same time in both sequences.
Let S′_I and S′_J be the resulting sequences.
Suppose there is a time when the job scheduled in S′_I is different from the one scheduled in S′_J.
Greedy Proof
Case 1:
Some job a is scheduled in S′_I opposite a gap in S′_J.
Then a does not belong to J, and the set J ∪ {a} is feasible.
So we have a feasible solution more profitable than J.
This is impossible, since J is optimal by assumption.
Case 2:
Some job b is scheduled in S′_J opposite a gap in S′_I.
Then b does not belong to I, and the set I ∪ {b} is feasible.
So the greedy algorithm would have included b in I.
This is impossible, since it did not do so.
Greedy Proof
Case 3:
Some job a is scheduled in S′_I opposite a different job b in S′_J.
In this case a does not appear in J and b does not appear in I.
Case i: if g_a > g_b, then we could replace b by a in J and get a better solution. This is impossible because J is optimal.
Case ii: if g_a < g_b, the greedy algorithm would have chosen b before considering a, since (I \ {a}) ∪ {b} would be feasible. This is impossible because the algorithm did not include b in I.
Case iii: the only remaining possibility is g_a = g_b.
The total profit from I is therefore equal to the profit from the optimal set J, and so I is optimal too.
Lecture No. 14
Scheduling Problems
1. Activity Selection Problem/Interval Scheduling
Problem
2. Interval Partitioning
3. Minimize time in the system
4. Scheduling with deadlines
5. Weighted interval scheduling problem
Note:
• Greedy Algorithm for Problems 1 – 4
• DP Solution for Problem 5
Scheduling to Minimize Lateness
Scheduling to minimize lateness:
• A single resource processes one job at a time.
• Job j requires t_j units of processing time and is due at time d_j.
• If j starts at time s_j, it finishes at time f_j = s_j + t_j.
Lateness: l_j = max{0, f_j - d_j}
Goal:
Schedule all jobs to minimize the maximum lateness L = max_j l_j
Example (figure)
Greedy Algorithm
EARLIEST-DEADLINE-FIRST
Sort jobs by due time and renumber so that d1 ≤ d2 ≤ … ≤ dn (sketch below).
Theorem: the earliest-deadline-first schedule is optimal.
Claims:
1. There exists an optimal schedule with no idle time.
2. The earliest-deadline-first schedule has no idle time.
Def. Given a schedule S, an inversion is a pair of jobs i and j such that i < j but j is scheduled before i.
3. The earliest-deadline-first schedule is the unique idle-free schedule with no inversions.
4. If an idle-free schedule has an inversion, then it has an adjacent inversion.
5. Exchanging two adjacent inverted jobs i and j reduces the number of inversions by 1 and does not increase the maximum lateness.
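A minimal sketch of earliest-deadline-first (ours, not the slides' code):

    def edf_max_lateness(jobs):
        """jobs: list of (t_j, d_j). Schedule by deadline; return max lateness."""
        time, worst = 0, 0
        for t, d in sorted(jobs, key=lambda jd: jd[1]):   # earliest deadline first
            time += t                                     # job finishes at f_j = time
            worst = max(worst, time - d)                  # lateness max(0, f_j - d_j)
        return worst

    print(edf_max_lateness([(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)]))  # 1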
Greedy - Exercise
Maximize the payoff
Suppose you are given two sets A and B, each containing n positive integers. You can choose to reorder each set however you like. After reordering, let a_i be the ith element of set A, and let b_i be the ith element of set B. You then receive a payoff of Π_{i=1}^{n} a_i^{b_i}. Give an algorithm that will maximize your payoff. Prove that your algorithm maximizes the payoff, and state its running time.
Coding
Suppose that we have a 100,000 character data file that we wish to store. The file contains only 6 characters, appearing with the following frequencies:

    Character:                 a   b   c   d   e   f
    Frequency (in thousands):  45  13  12  16  9   5

B(T) = Σ_{c∈C} c.freq · d_T(c)
is defined as the cost of the tree T (where d_T(c) is the depth of c's leaf in T).
Full Binary Tree
Key Idea:
An optimal code for a file is always represented by a full binary tree.
A full binary tree is a binary tree in which every non-leaf node has two children.
Proof: if some internal node had only one child, then we could simply get rid of this node and replace it with its unique child. This would decrease the total cost of the encoding.
Greedy Choice Property
Lemma:
Consider the two letters x and y with the smallest frequencies. Then there exists an optimal code tree in which these two letters are sibling leaves at the lowest level of the tree.
Proof (idea):
Take a tree T representing an arbitrary optimal prefix code and modify it to make a tree representing another prefix code such that the resulting tree has the required greedy property.
Greedy Choice Property
Let T be an optimal prefix code tree, and let b and c be two siblings at the maximum depth of the tree (they must exist because T is full).
Assume without loss of generality that f(b) ≤ f(c) and f(x) ≤ f(y).
Since x and y have the two smallest frequencies, it follows that f(x) ≤ f(b) and f(y) ≤ f(c).
Since b and c are at the deepest level of the tree, d(b) ≥ d(x) and d(c) ≥ d(y).
Now switch the positions of x and b in the tree, resulting in a new tree T′; this swap cannot increase the cost B(T), and repeating it for y and c yields the required tree.
Huffman Code - Example (worked example figures omitted; a sketch follows)
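A minimal heap-based sketch of Huffman's algorithm on the frequency table above (ours, not the slides' figures):

    import heapq
    from itertools import count

    def huffman(freqs):
        """freqs: {symbol: frequency}. Returns {symbol: codeword}."""
        tick = count()                    # tie-breaker so heap tuples always compare
        heap = [(f, next(tick), sym) for sym, f in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)    # merge the two least-frequent trees
            f2, _, right = heapq.heappop(heap)
            heapq.heappush(heap, (f1 + f2, next(tick), (left, right)))
        codes = {}
        def walk(node, prefix):
            if isinstance(node, tuple):
                walk(node[0], prefix + "0")
                walk(node[1], prefix + "1")
            else:
                codes[node] = prefix or "0"      # single-symbol edge case
        walk(heap[0][2], "")
        return codes

    print(huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))
    # e.g. a: 0, c: 100, b: 101, f: 1100, e: 1101, d: 111
    # (exact codewords depend on tie-breaking; codeword lengths are optimal)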
Lecture No. 15
NP Completeness
Until now we have been designing algorithms for specific problems.
We have seen running times O(log n), O(n), O(n log n), O(n²), O(n³), ….
We often think of problems we can solve in polynomial time O(n^k) as being easy / practically solvable / tractable.
We have seen a lot of these.
Similarly, we think of problems that need exponential time, like O(2ⁿ), as being hard / practically unsolvable / intractable.
We have seen a few of these.
NP Completeness
Showing that a problem has an efficient algorithm is relatively easy: "all" that is needed is to demonstrate an algorithm.
Proving that no efficient algorithm exists for a particular problem is difficult. How can we prove the non-existence of something?
NP Completeness
Goal:
To study an interesting class of problems (called NP-Complete) whose status is unknown:
no polynomial time algorithm has yet been discovered, nor has anyone yet been able to prove that no polynomial time algorithm can exist for any one of them.
This is one of the deepest research problems in theoretical computer science.
NP Complete
• The problem of deciding whether NP-Complete problems have efficient solutions is known as the "P = NP?" question.
• There is currently a US$1,000,000 award offered by the Clay Institute (http://www.claymath.org/) for its solution.
• In the remainder of this module we will introduce the notation, terminology and tools needed to discuss NP-Complete problems and to prove that problems are NP-complete.
Proving that a problem is NP-Complete does not prove that the problem is hard,
but it does indicate that the problem is very likely to be hard.
NP Complete
Why so difficult?
Several NP-Complete problems are very similar to problems that we already know how to solve in polynomial time.
For example:
• Shortest simple path vs. longest simple path
• Fractional knapsack vs. 0-1 knapsack
• Euler tour vs. Hamiltonian cycle
• 2-CNF satisfiability vs. 3-CNF satisfiability
In Boolean logic, a formula is in conjunctive normal form (CNF) if it is a conjunction of one or more clauses, where a clause is a disjunction of literals; otherwise put, it is a product of sums, or an AND of ORs.
Class P
Informal discussion
The class P consists of those problems that are solvable in polynomial time.
More specifically, they are problems that can be solved in time O(n^k) for some constant k, where n is the size of the input to the problem.
Most of the problems we have examined are in class P.
Class NP
Consider the Hamiltonian cycle problem.
Given a directed graph G = (V, E), a "certificate" is a sequence v = (v1, v2, v3, …, v|V|) of |V| vertices.
We can easily check/verify in polynomial time whether v is a Hamiltonian cycle or not.
Consider the problem of 3-CNF satisfiability.
A certificate here is an assignment of values to the variables.
We can easily check/verify in polynomial time whether the assignment satisfies the boolean formula.
Class NP
The class NP consists of those problems that are verifiable in polynomial time.
We already observed that the Hamiltonian cycle problem & the 3-CNF satisfiability problem are in class NP.
Note
If a problem is in P then we can solve it in polynomial time, and so, given a certificate, it is verifiable in polynomial time.
Therefore P ⊆ NP.
The open question is whether P = NP, i.e., whether the containment is proper.
Class NP-Complete
A problem is in the class NPC (referred to as NP-Complete) if:
1. It is in NP
2. It is as "hard" as any problem in NP
Note
Most computer scientists believe that NP-complete problems are intractable,
because if any NP-complete problem can be solved in polynomial time, then every problem in NP has a polynomial time algorithm.
- We will prove this later.
NPC
Technique to show that a problem is NP-complete:
Different from what we have been doing: we are not trying to prove the existence of a polynomial time algorithm; instead we are trying to show that no efficient or polynomial time algorithm is likely to exist.
Three key ideas are needed to show a problem is NPC:
1. Decision problem vs. optimization problem
2. Reductions
3. A first NP-complete problem
Decision Problem vs. Optimization Problem
Many problems of interest are optimization problems.
Decision SPP (shortest path)
Instance: a weighted graph G, two nodes s and t of G, and a bound b.
Question: is there a simple path from s to t of length at most b?
Decision Problem vs. Optimization Problem
Observe: if one can solve an optimization problem (in polynomial time), then one can answer the decision version (in polynomial time).
Example: if we know how to solve MST we can solve DST, which asks whether there is a spanning tree of weight at most k.
How?
First solve the MST problem and then check whether the MST has cost at most k.
If it does, answer yes; if it doesn't, answer no.
Lecture No. 16
Decision Problem vs. Optimization Problem
Remark: an optimization problem usually has a corresponding decision problem, usually easily defined with the help of a bound on the value of feasible solutions.
Example
Optimization problem: Minimum Spanning Tree
Given a weighted graph G, find a minimum spanning tree (MST) of G.
Decision problem: Decision Spanning Tree (DST)
Given a weighted graph G and an integer k, does G have a spanning tree of weight at most k?
Lecture No. 17
Re Cap
Abstract problem
An abstract problem Q is a binary relation on a set I of problem instances and a set S of problem solutions.
An abstract decision problem is a function that maps the instance set to the solution set {0, 1}.
An encoding of a set I of abstract objects is a mapping e from I to a set of strings over an alphabet Σ.
We will use binary encoding as the standard one.
Formal Language Framework
Consider the concrete problem corresponding to the problem of deciding whether a natural number is prime.
Using binary encoding, Σ = {0, 1}, this concrete problem is a function
Prime: {0,1}* → {0,1} with
Prime(10) = Prime(11) = Prime(101) = Prime(111) = 1
Prime(0) = Prime(1) = Prime(100) = Prime(110) = 0
We can associate with Prime a language L_PRIME corresponding to all strings s over {0,1} with Prime(s) = 1:
L_PRIME = {10, 11, 101, 111, 1011, 1101, ...}
Sometimes it is convenient to use the same name for the concrete problem and its associated language:
Prime = {10, 11, 101, 111, 1011, 1101, ...}
Formal Language Framework
The formal-language framework allows us to define algorithms for concrete decision problems as "machines" that operate on languages.
An alphabet Σ is any finite set of symbols.
A language L over Σ is any set of strings made up of symbols from Σ.
We denote the empty string by ε, and the empty language by ∅.
The set/language of all strings over Σ is denoted by Σ*.
So, if Σ = {0, 1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, ...} is the set of all binary strings.
Every language L over Σ is a subset of Σ*.
Operations on Languages
Union of L1 and L2: L1 ∪ L2
Intersection of L1 and L2: L1 ∩ L2
Complement of L: L̄ = Σ* - L
Concatenation of L1 and L2: L1L2 = {x1x2 : x1 ∈ L1 and x2 ∈ L2}
Kleene star of L: L* = {ε} ∪ L ∪ L² ∪ L³ ∪ …, where L^k = LL⋯L, k times.
Formal Language Framework
Note:
Even if a language L is accepted by an algorithm A, the algorithm will not necessarily reject a string x ∉ L provided as input to it.
For example, the algorithm may loop forever.
A language L is decided by an algorithm A if every binary string in L is accepted by A and every binary string not in L is rejected by A.
Formal Language Framework
A language L is accepted in polynomial time by an algorithm A if for any length-n string x ∈ L, the algorithm accepts x in time O(n^k) for some constant k.
Four Possibilities
(Here co-NP is the class of languages whose complements are in NP; it is defined formally in the next lecture.)
1. P = NP = co-NP
2. P = NP ∩ co-NP, but NP ≠ co-NP
3. NP = co-NP, but P ≠ NP
4. P ⊊ NP ∩ co-NP, and NP ≠ co-NP
Most researchers regard the last possibility as the most likely one.
Reduction Algorithm
A language L1 is polynomial-time reducible to a language L2, written L1 ≤_P L2, if there exists a polynomial-time computable function f: {0,1}* → {0,1}* such that for all x, x ∈ L1 if and only if f(x) ∈ L2.
We call the function f the reduction function, and a polynomial-time algorithm F that computes f is called a reduction algorithm.
Reduction
Lemma: if L1, L2 ⊆ {0,1}* are languages such that L1 ≤_P L2, then L2 ∈ P implies L1 ∈ P.
Proof (figure): the algorithm A1 for L1 feeds its input x to the reduction algorithm F, runs the polynomial-time algorithm A2 for L2 on f(x), and answers "yes, x ∈ L1" exactly when A2 answers "yes, f(x) ∈ L2".
NP-Completeness
A language L ⊆ {0,1}* is NP-complete if
1. L ∈ NP, and
2. L′ ≤_P L for every L′ ∈ NP.
Lecture No. 18
Re Cap
We define the complexity class P as:
P = { L ⊆ {0,1}* : there exists an algorithm A that decides L in polynomial time }.
The complexity class NP is the class of languages that can be verified by a polynomial-time algorithm.
More precisely, a language L belongs to NP if and only if there exist a two-input polynomial-time algorithm A and a constant c such that
L = { x ∈ {0,1}* : there exists a certificate y with |y| = O(|x|^c) such that A(x, y) = 1 }.
• P ⊆ NP
Class co-NP
Complexity class co-NP:
L ∈ co-NP if L̄ ∈ NP (the complement of L is in NP).
Easy to prove: P ⊆ NP ∩ co-NP.
Circuit Satisfiability
Theorem:
The circuit-satisfiability problem is NP-Complete.
Proof:
For the time being we will assume this theorem.
SAT
Formula satisfiability (SAT)
An instance of SAT is a boolean formula φ composed of:
• n boolean variables: x1, x2, …, xn
• m boolean connectives: any boolean function with one or two inputs and one output
• parentheses
Wlog, we assume there are no redundant parentheses.
SAT = { ⟨φ⟩ : φ is a satisfiable boolean formula }
(Example figure omitted.)
SAT is NP-Complete
Theorem
Satisfiability of boolean formulas is NP-complete.
Proof
• SAT ∈ NP: easy.
• CIRCUIT-SAT ≤_P SAT:
We show how to reduce any instance of circuit satisfiability to an instance of formula satisfiability in polynomial time.
SAT is NP-Complete
What do we have to do?
• Given an instance C of CIRCUIT-SAT, define a poly-time function f that converts C to an instance φ of SAT.
• Argue that f is poly-time.
• Argue that f is correct (i.e., C is satisfiable iff φ is satisfiable).
Construction
Let C be an instance of circuit satisfiability.
Naive construction/algorithm: look at the gate that produces the circuit output and inductively express each of the gate's inputs as formulas.
Note: this approach cannot lead to a polynomial time reduction, since subformulas get duplicated at shared wires and the formula can blow up exponentially.
Exercise: give an example.
Useful Identity
The reduction relies on the following identity: "if and only if" (denoted by ↔) is a boolean operator with the following truth table. Let C = A ↔ B:

    A  B  C
    1  1  1
    1  0  0
    0  1  0
    0  0  1

An AND/OR gate with any number of input wires can similarly be reduced to a boolean formula.
SAT is NP-Complete
Let C be an instance of circuit satisfiability. Introduce a variable x_w for every wire w of C; for each gate, write a small formula x_w ↔ (gate function of its input variables). φ is the AND of the circuit-output variable with all of these gate formulas; then C is satisfiable iff φ is.
Lecture No. 19
NPC
How to show that L ∈ NPC?
• Direct proof:
Show that L ∈ NP & that L′ ≤_P L for every L′ ∈ NP.
• Indirect proof:
We use the following lemma.
Lemma
If L is a language such that L′ ≤_P L for some L′ ∈ NPC, then L is NP-hard.
Moreover, if L ∈ NP then L ∈ NPC.
1st NP-Complete Problem
• Circuit-satisfiability problem: given a boolean combinational circuit composed of AND, OR, and NOT gates, is it satisfiable?
Theorem:
The circuit-satisfiability problem is NP-Complete.
Theorem
Satisfiability of boolean formulas is NP-complete.
Proof
• SAT ∈ NP
• CIRCUIT-SAT ≤_P SAT
3-CNF
3-SAT / 3-CNF problem
Given a set of clauses C1, C2, …, Cm in 3-CNF form over variables x1, x2, …, xn, decide whether the formula is satisfiable.
Example: (x1 ∨ ¬x1 ∨ ¬x2) ∧ (x3 ∨ x2 ∨ x4) ∧ (¬x1 ∨ ¬x3 ∨ ¬x4)
Lecture No. 20
Re Cap
Theorem:
Satisfiability of boolean formulas in 3-CNF is NP-complete.
Exercise
1. 4-CNF is NP-Complete.
2. k-CNF is NP-Complete for k ≥ 3.
3. MAX-3CNF
Given a boolean formula in conjunctive normal form, such that each clause contains 3 literals, the task is to find an assignment to the variables of the formula such that the maximum number of clauses is satisfied.
4. What about DNF?
5. What about 2-CNF? MAX-2CNF?
We will show that 2-CNF is in P but MAX-2CNF is NPC.
2-CNF
Instance: a 2-CNF formula φ
Problem: decide whether φ is satisfiable
Example: a 2-CNF formula and its implication graph, which has a vertex for each literal (x, ¬x, y, ¬y, z, ¬z) and, for each clause (α ∨ β), the edges ¬α → β and ¬β → α (figure; the formula's negation signs were lost in extraction).
Lemma 1
Lemma 1:
If the graph contains a path from α to β, it also contains a path from ¬β to ¬α.
Observe
If there is an edge (α, β), then there is also an edge (¬β, ¬α).
Proof:
Extend the observation along the path.
Lemma 2
Lemma 2:
A 2-CNF formula φ is unsatisfiable iff there exists a variable x such that:
1. there is a path from x to ¬x in the graph, and
2. there is a path from ¬x to x in the graph.
Proof
By contradiction.
Suppose there are paths from x to ¬x and from ¬x to x for some variable x, but there is also a satisfying assignment σ.
Case 1: If σ(x) = T, follow the path from x to ¬x: its first vertex is assigned T and its last is assigned F, so somewhere along it there is an edge (α, β) with σ(α) = T and σ(β) = F. The corresponding clause (¬α ∨ β) is false: a contradiction.
Case 2: If σ(x) = F, a similar analysis applies to the path from ¬x to x.
Proof
• Now suppose there are no such paths.
Construct an assignment as follows:
1. Pick an unassigned literal α with no path from α to ¬α, and assign it T.
2. Assign T to all vertices reachable from α.
3. Assign F to their negations.
4. Repeat until all vertices are assigned.
Proof
Claim: the algorithm is well defined.
Proof:
If there were paths from x to both y and ¬y, then by Lemma 1 the path from x to ¬y gives a path from y to ¬x, and hence a path from x through y to ¬x, contradicting the choice of x.
2-SAT ∈ P
Algorithm for 2-SAT:
- For each variable x, check whether there is a path from x to ¬x and vice versa.
- Reject if any of these tests succeeds.
- Accept otherwise.
2-SAT ∈ P
Theorem:
Given a graph G = (V, E) and two vertices s, t ∈ V, deciding whether there is a path from s to t in G can be done in polynomial time.
So the algorithm above runs in polynomial time (sketch below).
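A minimal sketch of this 2-SAT test (ours): build the implication graph and use BFS for the two path checks. Literal encoding is an assumption of the sketch: variable v appears as 2v, its negation as 2v+1.

    from collections import deque

    def neg(l):                  # flip a literal's sign bit
        return l ^ 1

    def two_sat(n, clauses):
        """n variables 0..n-1; clauses: pairs of literals (a OR b)."""
        adj = [[] for _ in range(2 * n)]
        for a, b in clauses:
            adj[neg(a)].append(b)     # ~a -> b
            adj[neg(b)].append(a)     # ~b -> a
        def reachable(s, t):          # plain BFS
            seen, q = {s}, deque([s])
            while q:
                u = q.popleft()
                if u == t:
                    return True
                for v in adj[u]:
                    if v not in seen:
                        seen.add(v); q.append(v)
            return False
        return not any(reachable(2 * v, 2 * v + 1) and reachable(2 * v + 1, 2 * v)
                       for v in range(n))

    # (x OR y)(~x OR y)(x OR ~y)(~x OR ~y) is unsatisfiable:
    print(two_sat(2, [(0, 2), (1, 2), (0, 3), (1, 3)]))  # False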
MAX-2SAT
Theorem: MAX-2SAT is NP-complete.
Proof:
• MAX-2SAT is in NP:
Guess a truth assignment and verify the count; the verification takes polynomial time.
• We now reduce 3-CNF to MAX-2SAT:
We show that given an instance of 3-CNF, we can construct an instance of MAX-2SAT such that a satisfying truth assignment of the 3-CNF instance can be extended to a truth assignment of the MAX-2SAT instance satisfying a target number of clauses.
MAX-2SAT
Let S be the instance of 3-CNF with clauses C1, C2, …, Cm, where Ci = (x ∨ y ∨ z).
From S we build an instance S′ of MAX-2SAT as follows:
Each Ci in S corresponds to a clause group C′_i in S′, where C′_i has the following 10 clauses:
(x) ∧ (y) ∧ (z) ∧ (w)
(¬x ∨ ¬y) ∧ (¬y ∨ ¬z) ∧ (¬z ∨ ¬x)
(x ∨ ¬w) ∧ (y ∨ ¬w) ∧ (z ∨ ¬w)
where w is a new variable (one per clause group) and 1 ≤ i ≤ m.
MAX-2SAT
Assume that S is satisfiable. Then in a typical clause Ci = (x ∨ y ∨ z), either one, two, or all three literals are true.
• All of x, y, z are true: by setting w to true, we satisfy 4 + 0 + 3 = 7 clauses of C′_i.
• Two of x, y, z are true: by setting w to true, we satisfy 3 + 2 + 2 = 7 clauses of C′_i.
• One of x, y, z is true: by setting w to false, we satisfy 1 + 3 + 3 = 7 clauses of C′_i.
Observe: a satisfying truth assignment of S can be extended to a truth assignment of S′ where exactly seven clauses in each clause group are satisfied.
MAX-2SAT
Now assume that S is not satisfiable. Then in at least one clause Ci = (x ∨ y ∨ z) of S we have x = F, y = F, z = F.
By setting w to false, we satisfy 0 + 3 + 3 = 6 clauses of C′_i.
By setting w to true, we satisfy only 1 + 3 + 0 = 4 clauses of C′_i.
That is, if S is not satisfiable, no assignment can make at least seven clauses true in every clause group.
MAX-2SAT
Observe:
Each clause in S′ as constructed above has at most two literals.
The clauses in S′ can be generated from the clauses in S in polynomial time.
Next we prove:
S has a satisfying truth assignment if and only if some truth assignment for S′ satisfies at least k clauses, where k = 7m.
MAX-2SAT
Let φ be an instance of 3-CNF
and R(φ) the corresponding instance of MAX-2SAT.
If φ has m clauses, then R(φ) has 10m clauses.
Claim:
k = 7m clauses of R(φ) can be satisfied if and only if φ is satisfiable.
Therefore 3-CNF ≤P MAX-2SAT,
and so MAX-2SAT ∈ NPC.
Max-Clique
Max-Clique:
Given a graph G, find the largest clique
Clique: a set of nodes such that every pair in the set are neighbors; equivalently, a clique in G of size k is a complete subgraph of G on k vertices.
Decision problem:
Given G and integer k, does G contain a clique of size ≥ k?
Max-Clique is clearly in NP.
(Easy: the certificate is a set of k vertices.)
Theorem :
Max-Clique is NP-Complete
BITS, PILANI – K. K. BIRLA GOA CAMPUS
Lecture No. 21
NP Complete Problems
We have proved the following reductions:
CIRCUIT-SAT ≤P SAT ≤P 3-CNF
Example: Suppose φ = C1 ∧ C2 ∧ C3
where C1 = (x1 ∨ ¬x2 ∨ ¬x3), C2 = (¬x1 ∨ x2 ∨ x3)
and C3 = (x1 ∨ x2 ∨ x3)
G: [Figure: the graph built from φ — one vertex per literal occurrence, edges between consistent literals of different clauses]
Max-Clique
Proof Contd.
• The reduction takes polynomial time
(easy to see).
Observe:
Since the 3 vertices corresponding to a clause are not connected to each other, at most one of them can belong to a clique.
This means the maximum clique size is at most m.
Claim:
φ is satisfiable if and only if a clique of size m exists in G.
Max-Clique
Suppose φ is satisfiable.
Then each clause has at least one literal that is assigned value T.
Pick one such literal per clause and let V′ be the set of corresponding vertices.
Then |V′| = m.
Claim: V′ is a clique.
Consider any two vertices of V′:
1. They correspond to literals from different clauses.
2. Since both literals are assigned value T, they are consistent (neither is the negation of the other).
Therefore, there is an edge between the two vertices.
Hence the claim.
Max-Clique
Conversely, suppose G has a clique V′ of size m.
No edges in G connect vertices corresponding to literals from the same clause,
so V′ contains exactly one vertex from each of the m clauses.
Assign value T to each of the corresponding m literals.
Observe: this assignment is consistent,
because G contains no edges between inconsistent literals.
Since each clause has a literal assigned value T,
each clause is satisfied.
Hence φ is satisfied.
Vertex Cover Problem
Vertex Cover Problem
Let G = (V, E) be an undirected graph.
A subset V′ ⊆ V is called a vertex cover of G if
for every edge (u, v) ∈ E, u ∈ V′ or v ∈ V′ (or both).
The size of a vertex cover is the number of vertices in it.
The Vertex Cover Problem is to find a vertex cover of minimum size in a given graph.
The Vertex Cover Decision Problem is:
Given a graph G and an integer k,
does G have a vertex cover of size ≤ k?
Vertex Cover Problem
Example: For the following graph G, find a vertex cover.
[Figure omitted]
Theorem:
Vertex Cover ∈ NPC
Proof:
• Vertex Cover is in NP
Easy to prove
Vertex Cover Problem
Claim: Vertex Cover is NP-hard
For proving this we need the notion of a complement graph.
The complement graph of G = (V, E) is Ḡ = (V, Ē), where
Ē = {(u, v) : u, v ∈ V, u ≠ v, (u, v) ∉ E}
(it is the graph with exactly those edges that are not in E).
Vertex Cover Problem
Claim: Clique ≤P Vertex Cover
Reduction Algorithm:
The input is an instance (G, k) of the Clique Problem.
The reduction algorithm computes the complement Ḡ, which can be done in polynomial time, and outputs the Vertex Cover instance (Ḡ, |V| − k).
Claim: G has a clique of size k if and only if the graph Ḡ has a vertex cover of size |V| − k.
(Observe: V′ ⊆ V is a vertex cover of Ḡ exactly when V − V′ is a clique in G.)
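As a sanity check of this claim, a small sketch (ours; the toy graph and helper names are illustrative) verifying the clique/vertex-cover correspondence through the complement:

from itertools import combinations

def complement_edges(vertices, edges):
    # All unordered pairs of distinct vertices that are NOT edges of G.
    e = {frozenset(p) for p in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - e

def is_clique(s, edges):
    e = {frozenset(p) for p in edges}
    return all(frozenset(p) in e for p in combinations(s, 2))

def is_vertex_cover(s, edges):
    return all(set(e) & set(s) for e in edges)

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (1, 3), (3, 4)]     # {1, 2, 3} is a clique of size k = 3
Ebar = complement_edges(V, E)            # complement graph: edges {1,4}, {2,4}
assert is_clique({1, 2, 3}, E)
assert is_vertex_cover({4}, Ebar)        # V minus the clique, size |V| - k = 1
print("claim holds on the toy graph")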
Vertex Cover Problem
Theorem: Vertex Cover is NP-Complete.
BITS, PILANI – K. K. BIRLA GOA CAMPUS
Lecture No. 22
NP Complete Problems
We have proved the following reductions:
CIRCUIT-SAT ≤P SAT ≤P 3-CNF ≤P Max-Clique
Max-Clique ≤P Vertex Cover
Max-Clique ≤P Independent Set
Max-Clique
Theorem :
Max-Clique is NP-Complete
Proof:
We will prove 3-CNF ≤P Max-Clique.
Given a 3-CNF formula φ of m clauses C1, C2, …, Cm
over n variables x1, x2, …, xn,
we construct a graph G as follows:
1. For each clause Cr = (l₁ʳ ∨ l₂ʳ ∨ l₃ʳ), create one vertex for each of l₁ʳ, l₂ʳ, l₃ʳ.
2. Place an edge between two vertices lᵢʳ and lⱼˢ if and only if
• r ≠ s, i.e., the corresponding literals are from different clauses, and
• lᵢʳ ≠ ¬lⱼˢ, i.e., they are consistent.
Max-Clique
Example: Suppose φ = C1 ∧ C2 ∧ C3
where C1 = (x1 ∨ ¬x2 ∨ ¬x3), C2 = (¬x1 ∨ x2 ∨ x3)
and C3 = (x1 ∨ x2 ∨ x3)
G: [Figure: the corresponding graph — one vertex per literal occurrence, edges between consistent literals of different clauses]
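The construction is straightforward to implement. The sketch below (ours; clauses are encoded as tuples of signed integers, -k meaning ¬xk) builds G for this φ and confirms by brute force that the maximum clique has size m = 3:

from itertools import combinations

def clique_graph(clauses):
    # One vertex per literal occurrence: (clause index, literal).
    verts = [(r, lit) for r, cl in enumerate(clauses) for lit in cl]
    # Edge iff different clauses AND consistent literals.
    edges = {frozenset((u, v)) for u, v in combinations(verts, 2)
             if u[0] != v[0] and u[1] != -v[1]}
    return verts, edges

def max_clique_size(verts, edges):
    # Exponential brute force; fine for this 9-vertex example.
    for k in range(len(verts), 0, -1):
        for s in combinations(verts, k):
            if all(frozenset(p) in edges for p in combinations(s, 2)):
                return k
    return 0

phi = [(1, -2, -3), (-1, 2, 3), (1, 2, 3)]    # the example above, m = 3
verts, edges = clique_graph(phi)
print(max_clique_size(verts, edges))           # 3, so phi is satisfiable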
Vertex Cover Problem
Vertex Cover ∈ NPC
Claim: Clique ≤P Vertex Cover
Claim: G has a clique of size k if and only if the complement graph Ḡ has a vertex cover of size |V| − k.
Theorem:
Independent Set is NP-Complete.
Claim: Max-Clique ≤P Independent Set
Claim: G has a clique of size k if and only if the complement graph Ḡ has an independent set of size k.
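Here the correspondence is direct: a clique of G is, vertex for vertex, an independent set of Ḡ. A tiny sketch (ours, reusing the toy graph from the vertex-cover check):

def is_independent_set(s, edges):
    # No edge of the graph has both endpoints inside s.
    return all(not set(e).issubset(s) for e in edges)

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (1, 3), (3, 4)]
Ebar = [(1, 4), (2, 4)]                      # complement of E over V
assert is_independent_set({1, 2, 3}, Ebar)   # the clique of G, size k = 3
print("clique of G is an independent set of G-bar")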
Set Covering Problem
Example: [Figure omitted]
Travelling Salesman Problem
Theorem :
The traveling salesman problem is NP-complete.
Proof :
First, we prove that TSP belongs to NP.
Given a candidate tour, we check that it contains each vertex exactly once,
compute the total cost of the edges of the tour,
and verify that this cost is at most k.
This can be done in polynomial time, thus TSP belongs to NP.
Travelling Salesman Problem
Secondly, we prove that TSP is NP-hard.
Claim : Hamiltonian cycle ≤p TSP
(Assume the Hamiltonian Cycle Problem is NP-complete.)
Reduction Algorithm:
Let G = (V, E) be an instance of the Hamiltonian Cycle Problem.
An instance of TSP is constructed as follows:
• We create the complete graph G’ = (V, E’), where
E′ = {(i, j) : i, j ∈ V and i ≠ j}
• We define the cost function c by
c(i, j) = 0 if (i, j) ∈ E, and c(i, j) = 1 if (i, j) ∉ E.
Travelling Salesman Problem
Claim:
G has a Hamiltonian cycle if and only if G’ has a tour of
cost at most 0.
Suppose that a Hamiltonian cycle h exists in G.
It is clear that the cost of each edge in h is 0 in G’ as
each edge belongs to E.
Therefore, h has a cost of 0 in G′.
Thus, if graph G has a Hamiltonian cycle then graph G’
has a tour of cost 0.
Travelling Salesman Problem
Conversely,
We assume that G’ has a tour h of cost at most 0.
By definition, the cost of each edge in E′ is either 0 or 1.
Since cost of h is 0, each edge of h must have a cost 0.
Thus h contains only edges in E.
Therefore h is a Hamiltonian cycle of G.
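A short sketch of this reduction (ours; the brute-force tour check is exponential and only illustrates the claim on a small instance):

from itertools import combinations, permutations

def ham_to_tsp(vertices, edges):
    # Complete graph on V: cost 0 on edges of E, cost 1 otherwise.
    e = {frozenset(p) for p in edges}
    return {frozenset(p): 0 if frozenset(p) in e else 1
            for p in combinations(vertices, 2)}

def has_tour_of_cost_zero(vertices, cost):
    # Brute-force tour check (exponential; illustration only).
    first, rest = vertices[0], list(vertices[1:])
    for perm in permutations(rest):
        tour = [first, *perm, first]
        if sum(cost[frozenset((a, b))] for a, b in zip(tour, tour[1:])) == 0:
            return True
    return False

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4), (4, 1)]     # a 4-cycle: Hamiltonian
print(has_tour_of_cost_zero(V, ham_to_tsp(V, E)))   # True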
TSP
Suppose in the reduction algorithm we define
c(i, j) = 1 if (i, j) ∈ E, and c(i, j) = 2 if (i, j) ∉ E.
Then the corresponding claim is:
G has a Hamiltonian cycle if and only if G′ has a tour of cost at most |V|.
Hamiltonian Cycle
Exercise:
1. Show that Longest Simple Cycle is NP-Complete.
BITS, PILANI – K. K. BIRLA GOA CAMPUS
Lecture No. 24
NP Complete Problems
We have proved the following reductions:
CIRCUIT-SAT ≤P SAT ≤P 3-CNF ≤P Vertex Cover
Hamiltonian Cycle ≤P TSP
Σ_{a∈A} a = Σ_{a∈X′−A} a = s − t
Therefore, there exists a partition of X′ into two sets such that each set sums to s − t.
Set-Partition
Conversely, suppose there exists a partition of X′ into
two sets such that the sum over each set is s−t.
Then, one of these sets contains the number s−2t.
Removing this number, we get a set of numbers whose
sum is t, and all of these numbers are in X
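The whole argument can be exercised on small instances. The sketch below (ours) implements X′ = X ∪ {s − 2t} as in the proof above, with brute-force checks for illustration:

from itertools import combinations

def has_subset_with_sum(nums, target):
    # Brute-force Subset-Sum (exponential; illustration only).
    return any(sum(c) == target
               for r in range(len(nums) + 1)
               for c in combinations(nums, r))

def to_partition_instance(X, t):
    # Subset-Sum (X, t)  ->  Set-Partition X' = X plus the number s - 2t.
    return X + [sum(X) - 2 * t]

def has_partition(nums):
    s = sum(nums)
    return s % 2 == 0 and has_subset_with_sum(nums, s // 2)

X, t = [3, 1, 5, 9, 12], 9
assert has_subset_with_sum(X, t) == has_partition(to_partition_instance(X, t))
print("Subset-Sum and Set-Partition answers agree")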
0-1 Knapsack problem
0-1 Knapsack Decision Problem:
Given n items with weights w1, w2, …, wn
and values v1, v2, …, vn, a capacity W and a value V,
is there a subset S ⊆ {1, 2, …, n} such that
Σ_{i∈S} wi ≤ W and Σ_{i∈S} vi ≥ V?
Theorem:
0-1 Knapsack Problem is NP-complete
0-1 Knapsack problem
Proof:
• 0-1 Knapsack is in NP.
The certificate is the set S of chosen items; verification computes Σ_{i∈S} wi and Σ_{i∈S} vi and compares them with W and V.
This can be done in time polynomial in the size of the input.
• Claim:
Set-Partition ≤P 0-1 Knapsack
0-1 Knapsack problem
Reduction Algorithm Q:
Suppose we are given a set X = {a1, a2, …, an} as an instance of the Set-Partition Problem.
Consider the following Knapsack instance:
wi = ai and vi = ai for i = 1, 2, …, n, and
W = V = (1/2) Σ_{i=1}^{n} ai
This conversion of a Set-Partition instance into a 0-1 Knapsack instance is polynomial in the input size.
0-1 Knapsack problem
Suppose X is a ‘Yes’ instance of the Set-Partition problem.
Then there exists a subset S of X such that
Σ_{i∈S} ai = Σ_{i∈X−S} ai = (1/2) Σ_{i=1}^{n} ai
Let our knapsack contain the items in S.
Then Σ_{i∈S} wi = Σ_{i∈S} ai = (1/2) Σ_{i=1}^{n} ai = W
and Σ_{i∈S} vi = Σ_{i∈S} ai = (1/2) Σ_{i=1}^{n} ai = V
0-1 Knapsack problem
Conversely, suppose Q(X) is a ‘Yes’ instance of the 0-1 Knapsack problem.
Let S be the set of items chosen for the 0-1 Knapsack problem.
We have Σ_{i∈S} wi = Σ_{i∈S} ai ≤ W = (1/2) Σ_{i=1}^{n} ai
and Σ_{i∈S} vi = Σ_{i∈S} ai ≥ V = (1/2) Σ_{i=1}^{n} ai
Therefore Σ_{i∈S} ai = Σ_{i∈X−S} ai = (1/2) Σ_{i=1}^{n} ai
Thus S is the required subset of X.
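Finally, the Set-Partition-to-Knapsack reduction itself is only a few lines. A sketch (ours; the brute-force knapsack decision is for illustration only):

from itertools import combinations

def to_knapsack_instance(X):
    # w_i = v_i = a_i and W = V = (1/2) * sum of the a_i.
    half = sum(X) / 2
    return list(X), list(X), half, half

def knapsack_yes(w, v, W, V):
    # Brute-force 0-1 Knapsack decision (exponential; illustration only).
    n = len(w)
    return any(sum(w[i] for i in S) <= W and sum(v[i] for i in S) >= V
               for r in range(n + 1)
               for S in combinations(range(n), r))

X = [3, 1, 5, 9, 12, 12]                 # partitions as 3 + 1 + 5 + 12 = 9 + 12
print(knapsack_yes(*to_knapsack_instance(X)))   # True iff X is partitionable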