Unit 5

INTRODUCTION TO RANDOMIZATION & APPROXIMATION ALGORITHMS
Agenda
Part I:
• Introduction to Randomized Algorithms and Approximation Algorithms
• Random Hiring Problem
• String Matching Algorithm
• Rabin-Karp Algorithm for String Matching
• Approximation Algorithms
• Vertex Cover

Part II:
• Introduction to Complexity Classes
• P-Type Problems
• Introduction to NP-Type Problems
• Hamiltonian Cycle Problem
• NP-Complete Problems
• Satisfiability Problem
• NP-Hard Problems
• Examples
Introduction
• Algorithm running time
– Polynomial
– Exponential
• Algorithms are of two types
  – Deterministic
    • For a given input, the same output is always produced
    • Solved in polynomial time
    • Examples: DFS, BFS
  – Non-deterministic
    • May produce different outputs on different runs
    • Example: randomized algorithms
• Terms
  – Choice(x)
  – Failure()
  – Success()
Randomized Algorithm

[Diagram: the INPUT and a random number feed the ALGORITHM, which produces the OUTPUT.]

An algorithm that uses random numbers to decide what to do next anywhere in
its logic is called a randomized algorithm.
Example of Randomized Algorithm and
calculation of running time
Two Types of Randomized Algorithms
and Some Complexity Classes

• A Las Vegas algorithm fails with some probability, but we can tell when it
fails. In particular, we can run it again until it succeeds, which means that
we can eventually succeed with probability 1 (but with a potentially
unbounded running time). Alternatively, we can think of a Las Vegas
algorithm as an algorithm that runs for an unpredictable amount of time
but always succeeds.
QuickSort is an example of a Las Vegas algorithm.
• A Monte Carlo algorithm fails with some probability, but we can’t tell
when it fails. If the algorithm produces a yes/no answer and the failure
probability is significantly less than 1/2, we can reduce the probability of
failure by running it many times and taking a majority of the answers.
The polynomial equality-testing algorithm is an example of a Monte Carlo
algorithm.
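A minimal Python sketch of this majority-vote amplification, under the assumption that a single run errs with probability below 1/2; monte_carlo_decide is a hypothetical placeholder standing in for any such yes/no algorithm:

import random

def monte_carlo_decide(x, error_prob=0.3):
    # Hypothetical placeholder: answers "is x even?" but errs with probability error_prob.
    truth = (x % 2 == 0)
    return truth if random.random() > error_prob else not truth

def amplified_decide(x, runs=31):
    # Run the Monte Carlo algorithm independently many times and take the majority vote;
    # the failure probability drops exponentially in the number of runs.
    votes = sum(1 for _ in range(runs) if monte_carlo_decide(x))
    return votes > runs // 2

print(amplified_decide(10))   # True with overwhelming probability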
Las Vegas vs. Monte Carlo

• Las Vegas algorithms
  – Always produce a (correct/optimal) solution.
  – Like RandQS.

• Monte Carlo algorithms
  – Allow a small probability of outputting an incorrect/non-optimal solution.
  – Like polynomial equality testing.
Las Vegas Algorithms
• For example, RandQS is a Las Vegas algorithm.

• A Las Vegas algorithm always gives the correct solution.

• The only variation from one run to another is its running time, whose
distribution we study.
Randomized quicksort
partition(arr[], lo, hi)
    pivot = arr[hi]
    i = lo                          // boundary of the region of elements <= pivot
    for j := lo to hi - 1 do
        if arr[j] <= pivot then
            swap arr[i] with arr[j]
            i = i + 1
    swap arr[i] with arr[hi]
    return i

partition_r(arr[], lo, hi)
    r = random number from lo to hi
    swap arr[r] and arr[hi]         // move the randomly chosen pivot to the end
    return partition(arr, lo, hi)

quicksort(arr[], lo, hi)
    if lo < hi
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p - 1)
        quicksort(arr, p + 1, hi)
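A runnable Python version of the pseudocode above, as a minimal sketch (Lomuto partition with a uniformly random pivot):

import random

def partition(arr, lo, hi):
    pivot = arr[hi]
    i = lo                                # boundary of the "<= pivot" region
    for j in range(lo, hi):
        if arr[j] <= pivot:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[hi] = arr[hi], arr[i]     # put the pivot in its final place
    return i

def partition_r(arr, lo, hi):
    r = random.randint(lo, hi)            # random pivot index
    arr[r], arr[hi] = arr[hi], arr[r]
    return partition(arr, lo, hi)

def quicksort(arr, lo, hi):
    if lo < hi:
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p - 1)
        quicksort(arr, p + 1, hi)

data = [4, 2, 7, 8, 1, 9, 3, 6, 5]
quicksort(data, 0, len(data) - 1)
print(data)                               # [1, 2, 3, 4, 5, 6, 7, 8, 9]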
Randomized QS - An illustration

[Figure: the input array 4 2 7 8 1 9 3 6 5, followed by the arrays produced by
successive levels of random partitioning:
4 2 1 3 5 7 8 9 6
1 2 4 3 6 7 8 9
1 3 4 6 8 9
3 9]
Randomized QS - An illustration

[Figure: recursion tree of the chosen pivots - 2 and 7 below the root, then
1, 4, 6, 8, then 3 and 9.]

Algorithm Inorder(tree)
1. Traverse the left subtree, i.e., call Inorder(left-subtree)
2. Visit the root.
3. Traverse the right subtree, i.e., call Inorder(right-subtree)
2 Questions for RandQS

• Is RandQS correct?
– That is, does RandQS “always” output a sorted
list of X?
• What is the time complexity of RandQS?
– Due to the randomization for selecting x, the
running time for RandQS becomes a random
variable.
– We are interested in the expected time
complexity for RandQS.
Randomized QuickSort always sorts the input array correctly; its expected
running time, taken over the random pivot choices, is O(n log n).
RANDOM HIRING PROBLEM
Hiring Problem - Pseudocode
Hiring Problem (Contd)..
Hiring Problem (Contd)..
Analysis of Hiring Problem
• In the hiring example we can use a random number generator to create a random
permutation of the input and then run the hiring algorithm on that permutation.

• If m candidates are hired, the hiring cost is O(m * Ch), where Ch is the cost
of hiring one candidate.

• On a random permutation the expected number of hires is the harmonic series:
1 + 1/2 + 1/3 + ... + 1/n ≈ ln n, so the expected hiring cost is Hc = O(Ch * ln n).
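A small simulation sketch of this analysis, under the assumption that candidates arrive in uniformly random order; the average number of hires should be close to the harmonic number H_n ≈ ln n:

import math
import random

def hires_on_random_order(n):
    ranks = list(range(n))
    random.shuffle(ranks)                 # random permutation of candidate quality
    best, hires = -1, 0
    for r in ranks:
        if r > best:                      # candidate is better than all seen so far: hire
            best, hires = r, hires + 1
    return hires

n, trials = 1000, 2000
avg = sum(hires_on_random_order(n) for _ in range(trials)) / trials
print(avg, math.log(n))                   # avg ≈ H_n = ln n + 0.577... ≈ 7.5 for n = 1000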
String Matching:
Naive Approach &
Rabin Karp Algorithm
Matching Algorithm - Naive Approach
• The naïve approach tests all possible placements of the pattern P[1..m]
relative to the text T[1..n].

NAIVE-STRING-MATCHER(T, P)
    n ← length[T]
    m ← length[P]
    for s ← 0 to n - m                    // n - m + 1 shifts; e.g. n = 5, m = 3 gives s = 0, 1, 2
        do if P[1..m] = T[s + 1..s + m]   // each comparison takes up to O(m)
            then print "Pattern occurs with shift" s

Worst-case running time: O((n - m + 1) * m).
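A direct Python rendering of the matcher above, as an illustrative sketch (0-based indexing replaces the 1-based pseudocode):

def naive_string_matcher(T, P):
    n, m = len(T), len(P)
    shifts = []
    for s in range(n - m + 1):            # every possible placement of P in T
        if T[s:s + m] == P:               # O(m) character comparison
            shifts.append(s)
    return shifts

print(naive_string_matcher("STRING", "RING"))   # [2]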
RABIN KARP ALGORITHM - Preliminaries
Characters and Hash code table
A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10

K L M N O P Q R S T
11 12 13 14 15 16 17 18 19 20

U V W X Y Z
21 22 23 24 25 26

Text T = STRING, Here n = 6


Pattern P = RING, Here m = 4.

Compute Hash function


Pattern P, h(P) = R (18) + I (9) + N(14) + G (7) = 48 (i.e. Hash code = 48)
RABIN KARP ALGORITHM - Preliminaries
TEXT: S T R I N G        PATTERN: R I N G        H(P) = 48

S = 0: window codes 19 20 18 9     H(T) = 66     H(T) ≠ H(P)
S = 1: window codes 20 18 9 14     H(T) = 61     H(T) ≠ H(P)
S = 2: window codes 18 9 14 7      H(T) = 48     H(T) = H(P)

Character & Hash code Table


A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10
K L M N O P Q R S T
11 12 13 14 15 16 17 18 19 20
U V W X Y Z
21 22 23 24 25 26
RABIN KARP ALGORITHM – Preliminaries & Limitation : Spurious Hits
TEXT: A B C D A A        PATTERN: A B C (codes 1 2 3)        H(P) = 6

S = 0: window codes 1 2 3      H(T) = 6     H(T) = H(P)   (true match)
S = 1: window codes 2 3 4      H(T) = 9     H(T) ≠ H(P)
S = 2: window codes 3 4 1      H(T) = 8     H(T) ≠ H(P)
S = 3: window codes 4 1 1      H(T) = 6     H(T) = H(P)   (spurious hit!)

Character & hash code table:
A=1  B=2  C=3  D=4  E=5  F=6  G=7  H=8  I=9  J=10
K=11 L=12 M=13 N=14 O=15 P=16 Q=17 R=18 S=19 T=20
U=21 V=22 W=23 X=24 Y=25 Z=26
RABIN KARP ALGORITHM – How to
overcome Spurious Hits?
TEXT: S T R I N G        PATTERN: R I N G        H(P) = 322823

S = 0: window codes 19 20 18 9     H(T) = 347941     H(T) ≠ H(P)
S = 1: window codes 20 18 9 14     H(T) = 363936     H(T) ≠ H(P)
S = 2: window codes 18 9 14 7      H(T) = 322823     H(T) = H(P)

Character & Hash code Table


A B C D E F G H I J
1 2 3 4 5 6 7 8 9 10
K L M N O P Q R S T
11 12 13 14 15 16 17 18 19 20
U V W X Y Z
21 22 23 24 25 26
RABIN KARP ALGORITHM – Preliminaries & Avoiding Spurious Hits
TEXT: A B C D A A        PATTERN: A B C (codes 1 2 3)        H(P) = 731

S = 0: window codes 1 2 3     H(T) = 731      H(T) = H(P)   (true match)
S = 1: window codes 2 3 4     rolling update: remove 1*26^2; multiply
                              (2*26^1 + 3*26^0) by 26; add 4*26^0;
                              H(T) = 1434     H(T) ≠ H(P)
S = 2: window codes 3 4 1     H(T) = 2133     H(T) ≠ H(P)
S = 3: window codes 4 1 1     H(T) = 2731     H(T) ≠ H(P)
Spurious hit is avoided.

T = A B C D A A,  n = 6
P = A B C,  m = 3
H(P) = 1*26^2 + 2*26^1 + 3*26^0 = 731

Character & hash code table:
A=1  B=2  C=3  D=4  E=5  F=6  G=7  H=8  I=9  J=10
K=11 L=12 M=13 N=14 O=15 P=16 Q=17 R=18 S=19 T=20
U=21 V=22 W=23 X=24 Y=25 Z=26
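A minimal Python sketch of this base-26 rolling update, using the A=1..Z=26 codes from the table; the helper names are illustrative, not part of the original slides:

def code(c):
    return ord(c) - ord('A') + 1

def rolling_hashes(T, m, d=26):
    h = 0
    for c in T[:m]:
        h = h * d + code(c)               # e.g. H("ABC") = 1*26^2 + 2*26 + 3 = 731
    hashes = [h]
    high = d ** (m - 1)                   # weight of the leading character
    for i in range(m, len(T)):
        # remove the leading character, shift by d, add the new trailing character
        h = (h - code(T[i - m]) * high) * d + code(T[i])
        hashes.append(h)
    return hashes

print(rolling_hashes("ABCDAA", 3))        # [731, 1434, 2133, 2731]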


Yet another limitation in hash computation!

• Assume T = ABCATCBJ and P = CAT, using radix d = 10 instead of 26.
• H(CAT) = 3*10^2 + 1*10^1 + 20*10^0 = 330
• H(CBJ) = 3*10^2 + 2*10^1 + 10*10^0 = 330  → spurious hit!
• When the radix d is smaller than the alphabet size, different windows can
collide, so a matching hash value must still be verified character by character.
Rabin Karp - Example
Algorithm:
1. Let the characters in both T and P be digits in radix-d notation (0, 1, 2, ..., 9).
2. Let p be the numeric value of the pattern P.
3. Choose a prime number q that fits within a computer word, to speed up the
   computations.
4. Compute p mod q. This value is used to find all matches of the pattern P in T.
5. Loop over the shifts s = 0 to n - m (here n = 11 and m = 2, so s = 0, ..., 9).
6. Compute T[s+1..s+m] mod q for each shift:
   for s = 0, use T[1..2]; for s = 1, use T[2..3]; ...; for s = 9, use T[10..11].
Rabin Karp – Example (Contd..)
7. Test against P only those windows of T having the same (mod q) value.
8. T[s+1..s+m] mod q can be computed incrementally: subtract the contribution
   of the high-order digit, shift (multiply by 10), and add the new low-order
   digit, all in modulo-q arithmetic.
   Example: at s = 3 the window value is 15 = 1*10^1 + 5*10^0;
   at s = 4 it becomes 59 = (15 - 1*10^1)*10 + 9.
Rabin Karp – Example (Contd..)

(The remaining shifts s = 8 and s = 9 are checked in the same way.)

• Matching time: T(n) = O(n - m + 1) window checks in the expected case; the
worst case is O((n - m + 1) * m) when many hash values match.
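Putting the steps together, a hedged Python sketch of Rabin-Karp for digit strings (radix d = 10 and prime q = 13 chosen only for illustration), with every hash hit re-checked character by character to discard spurious hits:

def rabin_karp(T, P, d=10, q=13):
    n, m = len(T), len(P)
    if m > n:
        return []
    h = pow(d, m - 1, q)                  # weight of the high-order digit, mod q
    p_hash = t_hash = 0
    for i in range(m):                    # hash of P and of the first window of T
        p_hash = (d * p_hash + int(P[i])) % q
        t_hash = (d * t_hash + int(T[i])) % q
    matches = []
    for s in range(n - m + 1):
        if p_hash == t_hash and T[s:s + m] == P:   # verify to rule out spurious hits
            matches.append(s)
        if s < n - m:                     # roll the window one position to the right
            t_hash = (d * (t_hash - int(T[s]) * h) + int(T[s + m])) % q
    return matches

print(rabin_karp("3141592653589793", "26"))   # [6]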
P, NP, NP-Complete Problems
Complexity Classes
• Class P
• NP Class
– NP- Hard
– NP-Complete
Polynomial Problems (P Family)
• The set of problems that can be solved in polynomial time

• These problems form the P family

• All problems we covered so far are in P

Nondeterministic Polynomial (NP Family)

• The set of decision problems that can be


verified in polynomial time

• Not necessarily solvable in polynomial time

What does it mean:
• "decision problem"?
• "verifiable"?
NP Class
• In computational complexity theory, NP ("Nondeterministic Polynomial time") is the set of
decision problems solvable in polynomial time on a nondeterministic Turing machine.

• It is the set of problems that can be "verified" by a deterministic Turing machine in polynomial
time.

• All the problems in this class have the property that their solutions can be checked effectively.

• This class contains many problems that people would like to be able to solve effectively,
including
– the Boolean satisfiability problem (SAT)
– the Hamiltonian path problem (special case of TSP)
– the Vertex cover problem.
Nondeterministic Polynomial (NP Family)
(Cont’d)
• Decision Problem
– Problem where its outcome is either Yes or No

Examples: Is there a way to color the graph with 3 colors such that no two
adjacent nodes have the same color? Is there a clique of size 5?

• Verifiable in Polynomial Time


– If I give you a candidate answer, you can verify whether it is correct or wrong in polynomial time
– That is different from finding the solution in polynomial time
Verifiable in Polynomial Time

If I give you a color assignment:
>> Check that at most 3 colors are used
>> Check that no two adjacent vertices have the same color: O(E)

• But finding a solution from scratch can be hard. (A verification sketch in
Python follows.)
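A minimal Python sketch of that verification step, assuming the candidate assignment covers every vertex; checking takes O(V + E) even though finding a 3-coloring is believed to be hard:

def verify_3_coloring(edges, color):
    if len(set(color.values())) > 3:      # at most 3 colors may be used
        return False
    for u, v in edges:                    # no edge may join two equally colored vertices
        if color[u] == color[v]:
            return False
    return True

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
print(verify_3_coloring(edges, {0: 'r', 1: 'g', 2: 'b', 3: 'r'}))   # True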


P vs. NP
• P is definitely a subset of NP
– Every problem with a poly-time solution is verifiable in poly-time

• Is it proper subset or equal?


– No one knows the answer

Most guesses are


leaning towards P ≠ NP

[Diagram: either P is a proper subset of NP, or P = NP.]

• NP family has set of problems known as “NP-Complete”


– Hardest problems in NP
– No poly-time solution for NP-Complete problems yet
NP-Complete (NPC)
• A set of problems in NP
– So, they are decision problems
– Can be verified quickly (poly-time)
[Diagram: NPC shown inside NP, alongside P.]
• They are hardest to solve
– The existing solutions are all exponential
– Known for 30 or 40 years, and no one managed to find poly-time solution for them
– Still, no one proved that no poly-time solution exist for NPC problems

• Property in NPC problem


– Problem X is NPC if X is itself in NP and every other problem in NP can be
mapped (transformed) to X in polynomial time
NP-Complete
• In complexity theory, the NP-complete problems are the most difficult problems in NP
("nondeterministic polynomial time"), in the sense that they are the ones most likely not to be in P.

• If one could find a way to solve any NP-complete problem quickly (in polynomial time), then that
algorithm could be used to solve all NP problems quickly.

• At present, all known algorithms for NP-complete problems require time that is superpolynomial in
the input size.

• To solve an NP complete problem for any nontrivial problem size, generally one of the following
approaches is used:
– Approximation
– Probabilistic
– Special cases
– Heuristic
NP-Complete
• Boolean satisfiability problem (SAT)
• N puzzle
• Knapsack problem
• Hamiltonian cycle problem
• Traveling salesman problem
• Subgraph isomorphism problem
• Subset sum problem
• Clique problem
• Vertex cover problem
• Independent set problem
• Graph coloring problem
• Minesweeper
NP-Complete (NPC) Cont’d
• Property in NPC problems
– Problem X is NPC if any other problem in NP can be mapped (transformed) to X in
polynomial time
• So, Any two problems in NPC must transform to each other in poly-time
• X ----PolyTime-------> Y
• Y -----PolyTime ------> X
[Diagram: problems X and Y inside NPC, inside NP.]

• This means that if any problem in NPC is solved in poly-time, then all
NPC problems are solved in poly-time
– This would lead to P = NP
NPC Example 2: Clique Problem

• Given a graph G(V, E)

• Clique: subset of vertices, where each pair of them is connected by an edge

• Is there a clique in G?
• Is there a clique of size m in G? (A verification sketch in Python follows.)
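A hedged sketch of why the clique problem is in NP: a candidate set of m vertices can be verified as a clique with O(m^2) edge lookups. The adjacency-set representation is an assumption for illustration:

def is_clique(adj, vertices):
    vs = list(vertices)
    # every pair of chosen vertices must be joined by an edge
    return all(vs[j] in adj[vs[i]]
               for i in range(len(vs))
               for j in range(i + 1, len(vs)))

adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(is_clique(adj, {1, 2, 3}))   # True
print(is_clique(adj, {1, 2, 4}))   # False (1 and 4 are not adjacent)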
NPC Example 3: Graph Coloring

• Given a graph G (V, E)

• Is there a coloring of the vertices in G using 3 colors such that no two
adjacent vertices have the same color?

• BTW: 2-way graph coloring (i.e. testing bipartiteness) is solvable in
poly-time; a sketch follows.
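For contrast, a minimal sketch of the poly-time 2-coloring (bipartiteness) test via BFS, assuming every vertex appears as a key of the adjacency dict:

from collections import deque

def two_colorable(adj):
    color = {}
    for start in adj:                     # handle every connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # give the neighbor the opposite color
                    queue.append(v)
                elif color[v] == color[u]:
                    return False              # an edge joins two equally colored vertices
    return True

print(two_colorable({0: [1], 1: [0, 2], 2: [1]}))        # True (a path)
print(two_colorable({0: [1, 2], 1: [0, 2], 2: [0, 1]}))  # False (a triangle)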


NP-Hard Family
• It is a family of problems at least as hard as NPC problems

• But they need not be decision problems
– They can be of any type (e.g. optimization problems)

• All known algorithms for NP-Hard problems take superpolynomial (typically
exponential) time
NP-Hard Example: Travelling Salesman
Problem
• Given a set of n cities and a pairwise distance function d(u, v)

• What is the shortest possible route that visits each city once and goes back
to the starting point?
Full Diagram

[Diagram: the most probable relationship among P, NP, NP-complete, and NP-hard.]
Complexity Classes: NP, NP-hard, and
NP-complete
Examples of NP-complete and NP-hard problems

[Figure: a weighted example graph illustrating the Hamiltonian circuit problem
(NP-complete) and the travelling salesman problem (NP-hard).]
Monte Carlo algorithms
• For example, RandEC (the randomized minimum-cut
algorithm) is a Monte Carlo algorithm.
• A Monte Carlo algorithm may sometimes produce a
solution that is incorrect.
• For decision problems, there are two kinds of Monte
Carlo algorithms:
– those with one-sided error
– those with two-sided error

Which is better?
• The answer depends on the application.
• A Las Vegas algorithm is by definition a Monte
Carlo algorithm with error probability 0.
• Actually, we can derive a Las Vegas algorithm A from a Monte Carlo algorithm
B by repeatedly running B until we get a correct answer.
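A minimal sketch of that reduction; monte_carlo_B and is_correct are hypothetical placeholders standing in for the Monte Carlo algorithm and an efficient answer checker:

import random

def monte_carlo_B(x):
    # Hypothetical placeholder: should return x + 1, but is wrong 30% of the time.
    return x + 1 if random.random() > 0.3 else x

def is_correct(x, answer):
    return answer == x + 1                # the checker itself must be cheap and reliable

def las_vegas_A(x):
    while True:                           # expected number of repetitions is constant,
        answer = monte_carlo_B(x)         # but the running time is unbounded in the worst case
        if is_correct(x, answer):
            return answer

print(las_vegas_A(41))                    # always prints 42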

What is Approximation algorithm
Introduction:
• An approximation algorithm is a way of attacking NP-complete optimization
problems. This technique does not guarantee the best solution; the goal of an
approximation algorithm is to come as close as possible to the optimum value
in a reasonable amount of time, which is at most polynomial time. Such
algorithms are called approximation algorithms or heuristic algorithms.
• Example: for the vertex cover problem, the optimization problem is to find a
vertex cover with the fewest vertices, and the approximation problem is to
find a vertex cover with few vertices.
What is Approximation algorithm?
[Diagram: NP = non-deterministic polynomial; a solution may be found in
polynomial time, but this is not guaranteed (e.g. TSP, Knapsack).
P = deterministic polynomial-time complexity (e.g. multiplication, sorting,
Huffman coding); P = polynomial, steps = f(size).
Verifying a proposed answer, e.g. "5 * 3 = 15?", is easy.]
Vertex Cover Problem using
Approximation algorithm
Vertex Cover Problem
• A vertex cover of an undirected graph is a subset of its vertices such that
for every edge (u, v) of the graph, either u or v is in the subset.
Approximate Algorithm for Vertex Cover:
1) Initialize the result C as {}.
2) Consider the set of all edges in the given graph; call it E.
3) Do the following while E is not empty:
   a) Pick an arbitrary edge (u, v) from E and add u and v to the result.
   b) Remove from E all edges incident on u or v.
4) Return the result. (A Python sketch of these steps follows.)
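A Python sketch of these four steps (the edge-set representation and names are illustrative, not the only possible implementation):

def approx_vertex_cover(edges):
    cover = set()
    remaining = set(edges)
    while remaining:
        u, v = remaining.pop()            # step 3a: an arbitrary remaining edge (u, v)
        cover.update([u, v])
        remaining = {(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)}   # step 3b
    return cover

edges = [('a','b'), ('b','c'), ('b','d'), ('d','e'), ('c','e'), ('c','f')]
print(approx_vertex_cover(edges))         # e.g. {'a','b','c','e'} or {'b','c','d','e'}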
Vertex Cover Problem
E’={ab, bc, bd, de, ce, cf}
Select an arbitrary edge {u,v} from E'
C=C U {u,v}
Return C;
--------------------------------------------
Case 1: Choose {ab}; A = {ab}
C = {a,b}
Remove edges: ab, bd, bc from E’
Choose {ce}; A = {ab, ce}
C = {a,b,c,e}
Remove edges: ab, bd, bc, de, ce, cf
-----------------------------------------
Case 2: Choose {bc}; A = {bc}
C = {b,c}
Remove Edges: ab, bc, bd, ce, cf
Choose {de}, A = {bc , de}
C = {b,c, d,e}
Remove edges: ab, bc, bd, ce, cf, de.
--------------------------------------------
2-approximation algorithm for Vertex
Cover
[Figure: an example graph on vertices 1-6, shown before and after the edge
(1, 2) is chosen.]

• Let A be the set of edges chosen by the algorithm; here A = {(1,2)}, so |A| = 1.
• C = {1, 2}
• C* = {1, 2}, so |C*| = 2

• |A| ≤ |C*|   (any cover needs at least one endpoint of every chosen edge, and
the chosen edges share no endpoints)
• |C| = 2|A|
Therefore |C| ≤ 2|C*|.
That is, |C| cannot be larger than twice the optimal, so this is a
2-approximation algorithm for Vertex Cover.
2-approximation algorithm for Vertex
Cover
Let A be the set of edges chosen by the algorithm; here A = {bc, de}, so |A| = 2.
The optimal solution C* must have at least as many vertices as there are edges
in A, since the chosen edges share no endpoints.
C = {b, c, d, e}, |C| = 4

An optimal cover is C* = {b, c, e}, |C*| = 3

• |A| ≤ |C*|
• |C| = 2|A|

Therefore |C| ≤ 2|C*| (here 4 ≤ 6).
That is, |C| cannot be larger than twice the optimal, so this is a
2-approximation algorithm for Vertex Cover.
Example of Approx algorithm – Vertex cover

[Figure: the APPROX-VERTEX-COVER pseudocode, with the edge loop annotated O(E)
and the vertex work annotated O(V).]

Line 1 initializes C to the empty set.
Line 2 sets E' to be a copy of the edge set E[G] of the graph.
The loop on lines 3-6 repeatedly picks an edge (u, v) from E', adds its
endpoints u and v to C, and deletes all edges in E' that are covered by either
u or v.
The running time of this algorithm is O(V + E).