
CSE 317: Design and Analysis of Algorithms

Shahid Hussain
Weeks 11, 12, and 13: October 28 – November 13, 2024: Fall 2024

1
Probability Theory
Probability Theory

• We say the set S is the sample space (the certain event)


• The elements of S are called elementary events
• An event is a subset of S and ∅ is called the null event
• Two events A and B are disjoint (mutually exclusive) if A ∩ B = ∅.

2
Probability Theory: Distribution

• We say Pr : 2^S → ℝ is a probability distribution on S if it satisfies the following axioms:
1. Pr{A} ≥ 0 for any event A
2. Pr{S} = 1
3. Pr{A ∪ B} = Pr{A} + Pr{B} for any two disjoint events A and B
More generally, for any (finite or countably infinite) sequence of events A₁, A₂, . . . that are pairwise disjoint,

  Pr{∪ᵢ Aᵢ} = Σᵢ Pr{Aᵢ}

• We have Pr{∅} = 0 and Pr{Ā} = 1 − Pr{A}, where Ā = S \ A is the complement of A. If A ⊆ B then Pr{A} ≤ Pr{B}. For any two events A and B,

  Pr{A ∪ B} = Pr{A} + Pr{B} − Pr{A ∩ B}.
3
Probability Theory: Distribution

• A probability distribution is discrete if S is finite or countably infinite
• In this case, for any event A, Pr{A} = Σ_{s∈A} Pr{s}

• If S is finite and Pr{s} = 1/|S| for any s ∈ S then we have the uniform
probability distribution on S
• Two events A and B are independent if Pr{A ∩ B} = Pr{A} Pr{B}
• A collection of events A₁, . . . , Aₙ is independent if, for every set of indices I ⊆ {1, . . . , n}, we have

  Pr{∩_{i∈I} Aᵢ} = ∏_{i∈I} Pr{Aᵢ}

4
Probability Theory: Random Variables

• A discrete random variable X is a function from a finite or countably infinite sample space S to the real numbers
• For a random variable X and a real number x, we define the event X = x to
be {s ∈ S : X(s) = x}. Thus,
  Pr{X = x} = Σ_{s∈S : X(s)=x} Pr{s}

• The function f (x) = Pr{X = x} is the probability mass function (PMF) of the
random variable X
• From the probability axioms, f (x) ≥ 0 and Σx f (x) = 1

5
Probability Theory: Expected Values

• The expected value (expectation or mean) of a discrete random variable X is

  E[X] = Σₓ x · Pr{X = x},

  which is well defined if the sum is finite or converges absolutely


• For any two random variables X and Y , E[X + Y ] = E[X] + E[Y ] or more
generally for random variables X1 , X2 , . . . , Xn and for α1 , α2 , . . . , αn where
αi ∈ R for each 1 ≤ i ≤ n

E[α1 X1 + α2 X2 + · · · + αn Xn ] = α1 E[X1 ] + α2 E[X2 ] + · · · + αn E[Xn ]

• This property is called the linearity of expectation. It also extends to absolutely convergent infinite sums of random variables.
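
As an illustration (not part of the original slides), the following Python sketch estimates E[X], E[Y], and E[X + Y] for two fair dice by simulation; the dice and the number of trials are arbitrary choices.

import random

# Sample two fair dice many times and compare the sample averages; they should
# agree with linearity of expectation: E[X + Y] = E[X] + E[Y] = 7.
trials = 100_000
xs = [random.randint(1, 6) for _ in range(trials)]
ys = [random.randint(1, 6) for _ in range(trials)]

print(sum(xs) / trials)                              # ~3.5
print(sum(ys) / trials)                              # ~3.5
print(sum(x + y for x, y in zip(xs, ys)) / trials)   # ~7.0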
6
Probability Theory: Bernoulli/Binomial Trial

Lemma
If we repeatedly perform independent trials of an experiment, each of which
succeeds with probability p > 0, then the expected number of trials we need to
perform until the first success is 1/p.

7
Probability Theory: Bernoulli/Binomial Trial

Proof of the Lemma.


Let X be the random variable equal to the number of trials. For j > 0, we have Pr{X = j} = (1 − p)^(j−1) p. Then

  E[X] = Σ_{j=1}^{∞} j · Pr{X = j} = Σ_{j=1}^{∞} j · (1 − p)^(j−1) p
       = p Σ_{j=1}^{∞} j · (1 − p)^(j−1) = p · (1/p²) = 1/p.
[We know that Σ_{j=0}^{∞} q^j = 1/(1 − q) for |q| < 1. Differentiating both sides with respect to q we get

  Σ_{j=1}^{∞} j q^(j−1) = 1/(1 − q)²,

and substituting q = 1 − p gives Σ_{j=1}^{∞} j (1 − p)^(j−1) = 1/p².]

8
Randomized Algorithms
Randomized Algorithms

• We will consider examples of randomized algorithms, which can make random decisions during their execution
• Randomized algorithms are often conceptually much simpler than the
deterministic ones
• Randomized algorithms can be generally categorized in two ways: Las Vegas
algorithms and Monte Carlo algorithms
• Las Vegas algorithms always give the correct answer (whenever they produce
an answer), but they may take a long time to do so
• On the other hand, Monte Carlo algorithms may give a wrong answer, but they give the correct answer with high probability

9
Randomized Algorithms

• Let us consider one problem and two randomized algorithms for it: one Las Vegas and one Monte Carlo
• Consider an array A of n elements
• Half of the elements of A are 0's and the other half are 1's
• We need to find an index j such that A[j] = 1

10
Randomized Algorithms: Las Vegas Algorithm

Algorithm: find-one-LasVegas
Input: An array A of n elements, s.t. half of A is 1’s and half is 0’s
Output: The index j such that A[j] = 1

1. while true
2. j = random(1, n)
3. if A[j] = 1 return j

• The above algorithm will eventually find an index j such that A[j] = 1, provided the random number generator used in Line 2 does not keep selecting the same elements over and over again.
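
A direct Python sketch of find-one-LasVegas (0-based indices, unlike the pseudocode above): keep probing random positions until a 1 is found. The answer is always correct; only the running time is random.

import random

def find_one_las_vegas(A):
    n = len(A)
    while True:
        j = random.randrange(n)   # random index in 0..n-1
        if A[j] == 1:
            return j

Since half of the entries are 1's, each probe succeeds with probability 1/2, so by the lemma on Bernoulli trials the expected number of probes is 2.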

11
Randomized Algorithms: Monte Carlo Algorithm

Algorithm: find-one-MonteCarlo
Input: An array A of n elements, s.t. half of A is 1’s and half is 0’s and k > 0
Output: An index j s.t. A[j] = 1 (success), otherwise nil with p = (1/2)k (failure)
1. i = 0
2. while i < k
3. j = random(1, n)
4. i=i+1
5. if A[j] = 1 return j
6. return nil
• This algorithm is not guaranteed to give the correct answer every time
• When successful it returns an index j s.t. A[j] = 1; otherwise it fails with probability (1/2)^k
• If k is sufficiently large then the probability of failure is very small
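
A Python sketch of find-one-MonteCarlo (0-based indices): probe at most k random positions and return None (the "nil" of the pseudocode) if every probe hits a 0, which happens with probability (1/2)^k.

import random

def find_one_monte_carlo(A, k):
    n = len(A)
    for _ in range(k):
        j = random.randrange(n)
        if A[j] == 1:
            return j
    return None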
12
Selection Problem

• Let S = ⟨a1 , a2 , . . . , an ⟩ be a sequence of n distinct numbers


• It is convenient to assume S is a set
• For a given k, 1 ≤ k ≤ n, the selection problem is to find the k-th smallest
element of S
• This is a generalization of the median finding problem where k = (n + 1)/2 if
n is odd and k = n/2 if n is even

13
Randomized Selection Algorithm

• We will consider the following recursive algorithm select(S, k)


• We choose an element aᵢ ∈ S at random and call it the splitter
• We partition S into two sets S⁻ = {a ∈ S : a < aᵢ} and S⁺ = {a ∈ S : a > aᵢ}
• If |S⁻| = k − 1, then the k-th smallest element is aᵢ
• If |S⁻| > k − 1, then we find the k-th smallest element in S⁻: select(S⁻, k)
• If |S⁻| < k − 1, then we find the k-th smallest element in S⁺: select(S⁺, k − |S⁻| − 1)

Independent of the choice of the splitter this algorithm finds the k-th smallest
element of S
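
A Python sketch of select(S, k) as described above (k is 1-based and the elements are assumed distinct):

import random

def select(S, k):
    splitter = random.choice(list(S))
    smaller = [a for a in S if a < splitter]   # S-
    larger = [a for a in S if a > splitter]    # S+
    if len(smaller) == k - 1:
        return splitter
    if len(smaller) > k - 1:
        return select(smaller, k)
    return select(larger, k - len(smaller) - 1)

print(select([4, 8, 3, 9, 15, 11, 2], 5))   # 9, as in the example on the next slide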

14
Example

• Let n = 7, k = 5, and S = {4, 8, 3, 9, 15, 11, 2}
• We choose aᵢ = 4 as the splitter and get:
• S⁻ = {3, 2}, S⁺ = {8, 9, 15, 11}
• Since |S⁻| = 2 < k − 1 = 4, we need to find the 2-nd smallest element in S⁺, so the new k = k − |S⁻| − 1 = 2
• We choose aᵢ = 11 as the splitter and get:
• S⁻ = {8, 9}, S⁺ = {15}
• Since |S⁻| = 2 > k − 1 = 1, we need to find the 2-nd smallest element in S⁻
• We choose aᵢ = 9 as the splitter and get:
• S⁻ = {8}, S⁺ = ∅
• Since |S⁻| = 1 = k − 1, the 5-th smallest element of the original set is 9

15
Analysis of Selection Algorithm: Worst-Case

• During the construction of S⁻ and S⁺, the select algorithm compares each of the other n − 1 elements of S with aᵢ (the splitter), i.e., it makes n − 1 comparisons
• In the worst case, the algorithm repeatedly chooses the maximum element of the current set as the splitter
• Therefore, the number of comparisons in the worst case will be:

  (n − 1) + (n − 2) + · · · + 1 = n(n − 1)/2 = Ω(n²)

16
Analysis of Selection Algorithm: Average-Case

• We will analyze the expected number of comparisons made by the select algorithm
• We will assume that the splitter is chosen uniformly at random from S
• Let X be the random variable equal to the number of comparisons made by
the algorithm
• Let Xⱼ be the random variable equal to the number of comparisons made by the algorithm during phase j (phases are defined on the next slide)
• We have X = X₀ + X₁ + X₂ + · · ·
• We will analyze the expected value of Xⱼ and then use the linearity of expectation to find the expected value of X
• We will show that the expected number of comparisons is O(n)

17
Analysis of Selection Algorithm: Average-Case

• We say that the algorithm is in phase j when the size of the set under consideration (denoted m) satisfies the following inequality:

  n(3/4)^(j+1) < m ≤ n(3/4)^j
4 4
• In a given iteration we say that an element is central if there are at least
⌊m/4⌋ elements which are smaller than it and at least ⌊m/4⌋ elements which
are larger than it
• If a central element is chosen as the splitter, then the number of elements the algorithm has to work with next is at most m − ⌊m/4⌋ − 1, and clearly

  m − ⌊m/4⌋ − 1 ≤ (3/4)m ≤ n(3/4)^(j+1)
• Now the algorithm is in phase j + 1
18
Analysis of Selection Algorithm: Average-Case

• It is clear that the number of central elements is at least m − 2⌊m/4⌋ ≥ m/2


• Therefore, the probability that the algorithm chooses a central element is at
least 1/2
• Therefore, the expected number of iterations before a central element is found
is at most 2
• Therefore, the expected number of comparisons made by the algorithm in phase j is at most 2n(3/4)^j. Now

  E[X] = E[Σⱼ Xⱼ] = Σⱼ E[Xⱼ] ≤ Σⱼ 2n(3/4)^j = 8n = O(n),

  since Σ_{j≥0} (3/4)^j = 4

19
QuickSort

• QuickSort is a sorting algorithm based on the divide-and-conquer paradigm
• The algorithm works as follows:
1. Choose an element ai from the array A at random, call it the pivot
2. Partition the array into two sets A− = {a ∈ A : a < ai } and
A+ = {a ∈ A : a > ai }
3. Recursively sort A− and A+ and concatenate the sorted arrays
• The algorithm can be implemented in place, using O(log n) extra space for the recursion
• The worst-case time complexity of the algorithm is O(n2 )
• The average-case time complexity of the algorithm is O(n log n)
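
A Python sketch of randomized QuickSort following the three steps above. This version builds new lists (so it is not the in-place variant) and assumes distinct elements, as in the partition into A⁻ and A⁺.

import random

def quicksort(A):
    if len(A) <= 1:
        return A
    pivot = random.choice(A)
    smaller = [a for a in A if a < pivot]   # A-
    larger = [a for a in A if a > pivot]    # A+
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([4, 8, 3, 9, 15, 11, 2]))   # [2, 3, 4, 8, 9, 11, 15]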

20
Analysis of QuickSort

• Worst-case: if the chosen pivot is always the largest (or smallest) element, each recursive call reduces the array size by only one, so the total number of comparisons is

  (n − 1) + (n − 2) + · · · + 1 = n(n − 1)/2 = Ω(n²)

21
Average Case Analysis of QuickSort

• Let zi be the i-th smallest element of A


• Let X be the random variable equal to the number of comparisons made by
QuickSort
• For 1 ≤ i < j ≤ n, let Xᵢⱼ be the indicator random variable of the event that zᵢ and zⱼ are compared
• We can conclude that:

  X = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Xᵢⱼ
• By linearity of expectation (and since Xᵢⱼ is an indicator variable, E[Xᵢⱼ] equals the probability of the corresponding event), we have:

  E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[Xᵢⱼ] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} Pr{zᵢ and zⱼ are compared}

22
Average Case Analysis of QuickSort (cont.)

• Let Zij = {zi , zi+1 , . . . , zj } be the set of elements between zi and zj in the
sorted array
• The elements zᵢ and zⱼ are compared if and only if the first pivot chosen from Zᵢⱼ is either zᵢ or zⱼ
• Since pivots are chosen uniformly at random, this happens with probability 2/|Zᵢⱼ| = 2/(j − i + 1)
• We get:

  E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1) = 2 Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 1/(j − i + 1) ≤ 2 Σ_{i=1}^{n−1} Σ_{k=2}^{n} 1/k ≤ 2n ln n

• Here we use Σ_{k=2}^{n} 1/k = H(n) − 1 ≤ ln n, where H(n) = 1 + 1/2 + · · · + 1/n is the n-th harmonic number and ln n ≤ H(n) ≤ ln n + 1


• Therefore, the average-case time complexity of QuickSort is O(n log n)

23
String Equality Testing with Randomized Algorithms

• Problem: Alice and Bob want to check if their strings x and y are equal.
• Solution: Use a fingerprint (hash) to represent each string.

Algorithm
1. Alice selects a prime p from the set of primes less than M
2. Alice computes fp (x).
3. Alice sends p and fp (x) to Bob.
4. Bob compares fp (x) with fp (y) to check equality.
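
A minimal Python sketch of the protocol (not from the slides). It assumes the fingerprint is fp(x) = I(x) mod p, where I(x) is the integer whose binary representation is x, consistent with the I(·) notation used on the later slides; the trial-division prime sampler is for illustration only.

import random

def is_prime(p):
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True

def random_prime_below(M):
    # rejection sampling; fine for the small M used here
    while True:
        p = random.randint(2, M - 1)
        if is_prime(p):
            return p

def fingerprint(bits, p):
    # fp(x) = I(x) mod p, where I(x) is x read as a binary number
    return int(bits, 2) % p

x = "1011010011"            # Alice's string
y = "1011010011"            # Bob's string
n = len(x)
M = 2 * n * n               # M = 2n^2, as in the analysis on the next slides
p = random_prime_below(M)
print(fingerprint(x, p) == fingerprint(y, p))   # equal fingerprints: "probably equal"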

24
Probability of False Positives in String Equality

• Let n be the no. of bits in the string x
• If y requires more or fewer than n bits then clearly x ≠ y
• Let π(n) be the no. of primes less than n; then π(n) ≈ n / ln n
• For k < 2^n, the no. of distinct primes that divide k is less than π(n) (except when k is very small)

25
Probability of False Positives in String Equality

• False positives occur if x ≠ y but fp(x) = fp(y)
• This is only possible if p divides f(x) − f(y)
• Let N(p, n) denote the no. of primes less than 2^n that divide f(x) − f(y)
• We know that N(p, n) ≤ π(n), therefore

  Pr{failure} ≤ N(p, n) / π(M) ≤ π(n) / π(M)
• Now letting M = 2n² we obtain:

  Pr{failure} ≤ π(n) / π(M) ≈ (n / ln n) / (2n² / ln 2n²) = (ln 2n² / ln n) · (1 / 2n) ≈ 1/n
• If we repeat the protocol k = ⌈log log n⌉ times with independently chosen primes, then we have:

  Pr{failure} ≤ 1/n^k

26
Example

• Let n = 10^6, then M = 2n² = 2 × 10^12 ≈ 2^40.8631
• The no. of bits required to transmit p is at most ⌊log M⌋ + 1 = 40 + 1 = 41
• Similarly the no. of bits required to transmit fp(x) is ⌊log(p − 1)⌋ + 1 ≤ ⌊log M⌋ + 1 = 41
• Thus, the total no. of bits required to transmit p and fp(x) is at most 82
• The probability of failure in one transmission is at most 1/n = 1/10^6
• Since ⌈log log n⌉ = 5, repeating the algorithm five times reduces the probability of a false positive to

  n^(−⌈log log n⌉) = (10^6)^(−5) = 10^(−30),

  which is negligible
27
Pattern Matching

• Given a text T (of length n) and a pattern P (of length m), where m ≤ n and T, P ∈ {0, 1}*
• We need to determine if P appears in T
• A simple solution is to move the pattern across the text
• This solution has a time complexity of O(mn)
• We can design a Monte Carlo algorithm to solve it in O(m + n) time
• This Monte Carlo algorithm can easily be turned into a Las Vegas algorithm
with the same complexity bound of O(m + n)

28
Pattern Matching

• Let T = t1 t2 · · · tn be the text and P = p1 p2 · · · pm be the pattern


• The Monte Carlo algorithm works similarly to the brute-force algorithm by
sliding the pattern across the text
• However, rather than comparing the pattern with the text at each position,
we compare the fingerprint of the pattern with the fingerprint of the text at
each position (block of text)
• Let T (j) = tj tj+1 · · · tj+m−1 be the block of text of length m starting at
position j in the text T
• We will compare the fingerprint Iq (P ) of the pattern modulo q with the
fingerprint Iq (T (j)) of the block of text T (j) modulo q
• These O(n) fingerprints can be easily computed

29
Pattern Matching

• Let Iq(T(j)) represent the fingerprint of the block of text T(j) modulo q
• Then the fingerprint of the block of text T(j + 1) can be computed as:

  Iq(T(j + 1)) = (2 Iq(T(j)) − 2^m tⱼ + t_{j+m}) mod q

• If we let W = 2^m mod q then we have:

  Iq(T(j + 1)) = (2 Iq(T(j)) − W tⱼ + t_{j+m}) mod q

30
Pattern Matching

Algorithm: pattern-matching
Input: A text T of length n and a pattern P of length m
Output: The 1-st index j such that tj = p1 , . . . , tj+m−1 = pm or 0 otherwise
1. q = a random prime less than M
2. j=1
3. Wq = 2^m mod q
4. Compute Iq (P ) and Iq (T (j))
5. while j ≤ n − m + 1
6. if Iq (P ) = Iq (T (j)) return j
7. Iq (T (j + 1)) = (2Iq (T (j)) − Wq tj + tj+m ) mod q
8. j =j+1
9. return 0
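
A Python sketch of the algorithm above, with 0-based indices and a prime modulus q supplied by the caller (the analysis on the following slides assumes q is a random prime below M = 2mn²):

def pattern_match(T, P, q):
    # T, P are bit strings; returns the first index j (0-based) where the
    # fingerprints match, or -1 (the "0" of the pseudocode) if none does.
    n, m = len(T), len(P)
    W = pow(2, m, q)               # Wq = 2^m mod q
    Iq_P = int(P, 2) % q
    Iq_T = int(T[:m], 2) % q       # fingerprint of the first block T(1)
    for j in range(n - m + 1):
        if Iq_T == Iq_P:
            return j               # fingerprints match (Monte Carlo answer)
        if j + m < n:
            # Iq(T(j+1)) = (2 Iq(T(j)) - Wq*t_j + t_{j+m}) mod q
            Iq_T = (2 * Iq_T - W * int(T[j]) + int(T[j + m])) % q
    return -1

print(pattern_match("00101101", "1011", 97))   # 2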
31
Analysis of Pattern Matching Algorithm

• The computation of each of Iq (P ) and Iq (T (1)) requires O(m) operations


• Iq (T (j + 1)) requires O(1) operations for 2 ≤ j ≤ n − m + 1, total = O(n)
• Therefore, the running time is O(n + m)
• A false match occurs if Iq (P ) = Iq (T (j)) but P ̸= T (j)
• This is possible only if the chosen prime q divides

  ζ = ∏_{j : P ≠ T(j)} |I(P) − I(T(j))|

• ζ ≤ (2^m)^n = 2^(mn)
• Therefore, the no. of primes that divide it cannot exceed π(mn)

32
Analysis of Pattern Matching Algorithm

• If we let M = 2mn² then the probability of a false match cannot exceed:

  π(mn) / π(M) ≈ (mn / ln(mn)) / (2mn² / ln(2mn²)) < 1/n

• This probability is independent of m (size of the pattern)


• When m = n, the problem reduces to string equality testing

33
Converting Monte Carlo to Las Vegas

• Whenever the two fingerprints Iq(P) and Iq(T(j)) are equal, we can explicitly compare the corresponding block of text with the pattern
• If they are equal then we have found a match
• If not then we have a false match and we repeat the algorithm (for example, with a new randomly chosen prime)
• The expected running time of this Las Vegas algorithm becomes

  O(m + n) · (1 − 1/n) + mn · (1/n) = O(m + n)

• That is, the Las Vegas algorithm has the same expected running time as the
Monte Carlo algorithm
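
A hedged Python sketch of the conversion (reusing the hypothetical random_prime_below helper from the earlier string-equality sketch): every fingerprint match is verified by a direct comparison, and a false match triggers a restart with a fresh prime.

def pattern_match_las_vegas(T, P, M, random_prime_below):
    n, m = len(T), len(P)
    while True:                                # restart after a false match
        q = random_prime_below(M)
        W = pow(2, m, q)
        Iq_P = int(P, 2) % q
        Iq_T = int(T[:m], 2) % q
        false_match = False
        for j in range(n - m + 1):
            if Iq_T == Iq_P:
                if T[j:j + m] == P:            # verification: the answer is always correct
                    return j
                false_match = True             # fingerprints collided; try another prime
                break
            if j + m < n:
                Iq_T = (2 * Iq_T - W * int(T[j]) + int(T[j + m])) % q
        if not false_match:
            return -1                          # no fingerprint match at all: P does not occur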

34
Random Sampling

• Given a set S of n elements


• We want to select a random sample of k elements from S, k < n
• Assume without loss of generality that S = {1, 2, . . . , n}
• We can use the following Las Vegas algorithm with expected running time Θ(n)

35
Random Sampling

Algorithm: random-sampling
Input: Two positive integers n and k such that k < n
Output: An array A[1..k] of k distinct elements from {1, 2, . . . , n}
1. B = ⟨0⟩^n // B is an n-bit vector of 0's
2. j=0
3. while j < k
4. r = random(1, n)
5. if Br = 0
6. j =j+1
7. A[j] = r
8. Br = 1
9. return A
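
A Python sketch of random-sampling: repeatedly draw from {1, . . . , n} and use a boolean vector B to skip values that were already selected.

import random

def random_sampling(n, k):
    B = [False] * (n + 1)         # B[r] is True once r has been selected
    A = []
    while len(A) < k:
        r = random.randint(1, n)
        if not B[r]:
            A.append(r)
            B[r] = True
    return A

print(random_sampling(10, 4))     # e.g. [7, 2, 9, 4]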
36
Analysis of Random Sampling Algorithm

• If k is much larger than n/2 (e.g., k ≈ n), we can instead sample the n − k elements to discard and return the rest; so assume k ≤ n/2
• Let pⱼ be the probability that a newly drawn element has not already been selected, given that j − 1 elements have already been selected, 1 ≤ j ≤ k; clearly

  pⱼ = (n − j + 1) / n
• Let Xⱼ be the random variable equal to the number of iterations performed while selecting the j-th element
• By the lemma on Bernoulli trials, the expected value of Xⱼ is

  E[Xⱼ] = 1/pⱼ = n / (n − j + 1)
• Let X = X₁ + X₂ + · · · + Xₖ, then the expected value of X is

  E[X] = Σ_{j=1}^{k} E[Xⱼ] = n Σ_{j=1}^{k} 1/(n − j + 1) = n Σ_{j=1}^{n} 1/j − n Σ_{j=1}^{n−k} 1/j
37
Analysis of Random Sampling Algorithm

• We know that ln(m + 1) ≤ Σ_{j=1}^{m} 1/j ≤ ln m + 1, therefore

  E[X] ≤ n(ln n + 1) − n ln(n − k + 1)
       = n(ln n + 1 − ln(n − k + 1))
       ≤ n(ln n + 1 − ln(n/2))    (since k ≤ n/2)
       = n(ln 2 + 1)
       = n ln(2e)
       ≈ 1.69n

• The expected running time of this algorithm is Θ(n)

38
