SoICT-Eng - ProbComp - Lec 2

Models and Algorithms for Internet Computing

Basic probability: axioms, conditional probability, random variables, distributions

Application: Verifying Polynomial Identities
• Computers can make mistakes:
• Incorrect programming
• Hardware failures
→ Sometimes we can use randomness to check the output
• Example: we want to check a program that multiplies
together monomials
E.g.: $(x+1)(x-2)(x+3)(x-4)(x+5)(x-6) \stackrel{?}{=} x^6 - 7x^3 + 25$
• In general: check whether F(x) = G(x)
• One way is:
• Write another program to re-compute the coefficients
• That is not good: it may follow the same logic and reproduce the same bug
as the first program
How to use randomness
• Assume the maximum degree of F and G is d. Use this algorithm:
RANDOM_TEST
1. Pick a number r uniformly at random from
{1, 2, 3, …, 100d}
2. If F(r) = G(r) then output “equivalent”, otherwise output
“non-equivalent”
• Note: this is much faster than the previous way: $O(d)$ vs. $O(d^2)$
• One-sided error:
• The answer “non-equivalent” is always correct
• The answer “equivalent” can be wrong
• How it can be wrong:
• If r happens to be a root of F(x) - G(x) = 0
• This occurs with probability at most 1/100
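
The RANDOM_TEST procedure above can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: the function names (`eval_product`, `eval_coeffs`, `random_test`) and the choice to represent F by the list of its roots and G by its coefficient list are assumptions made for illustration.

```python
import random

def eval_product(roots, x):
    """Evaluate F(x) = product of (x - a) over all a in roots, in O(d) steps."""
    result = 1
    for a in roots:
        result *= (x - a)
    return result

def eval_coeffs(coeffs, x):
    """Evaluate G(x) from coefficients [c_d, ..., c_1, c_0] with Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def random_test(roots, coeffs, rng=random):
    """One round of RANDOM_TEST; returns True for "equivalent"."""
    d = max(1, len(roots), len(coeffs) - 1)  # upper bound on the degree
    r = rng.randint(1, 100 * d)              # uniform over {1, ..., 100d}
    return eval_product(roots, r) == eval_coeffs(coeffs, r)
```

For example, `random_test([-1, 2], [1, -1, -2])` compares (x+1)(x-2) with x^2 - x - 2 and always answers "equivalent", because the two sides agree at every point.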

Axioms of probability
• We need a formal mathematical setting for analyzing
the randomized algorithms like RANDOM_TEST
• Any probabilistic statement must refer to the underlying
probability space
• Definition 1: A probability space has three components:
• A sample space $\Omega$, which is the set of all possible outcomes of
the random process modeled by the probability space
• A family of sets $\mathcal{F}$ representing the allowable events, where
each set in $\mathcal{F}$ is a subset of the sample space $\Omega$
• A probability function $\Pr: \mathcal{F} \to \mathbb{R}$ satisfying Definition 2 below
An element of $\Omega$ is called a simple or elementary event
• Example: In RANDOM_TEST, the sample space is the
set of integers {1,…100d}.
• Each choice of an integer r in this range is a simple event

Axioms
• Def 2: A probability function is any function $\Pr: \mathcal{F} \to \mathbb{R}$ that
satisfies the following conditions:
1. For any event E, $0 \le \Pr(E) \le 1$;
2. $\Pr(\Omega) = 1$; and
3. For any sequence of pairwise mutually disjoint events $E_1, E_2, E_3, \dots$,
$\Pr\left(\bigcup_{i \ge 1} E_i\right) = \sum_{i \ge 1} \Pr(E_i)$
• Events are sets → use set notation to express event combinations
• In RANDOM_TEST:
• Each choice of an integer r is a simple event
• All the simple events have equal probability
• The sample space has 100d simple events, and the sum of the
probabilities of all simple events must be 1 → each simple event
has probability 1/(100d)
Lemmas
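• Lem 1: For any two events $E_1$, $E_2$:
$\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2) - \Pr(E_1 \cap E_2)$
• Lem 2 (Union bound): For any finite or countably
infinite sequence of events $E_1, E_2, E_3, \dots$,
$\Pr\left(\bigcup_{i \ge 1} E_i\right) \le \sum_{i \ge 1} \Pr(E_i)$
• Lem 3 (Inclusion-exclusion principle): Let $E_1, E_2, \dots, E_n$ be any
n events. Then
$\Pr\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} \Pr(E_i) - \sum_{i<j} \Pr(E_i \cap E_j) + \sum_{i<j<k} \Pr(E_i \cap E_j \cap E_k) - \cdots + (-1)^{l+1} \sum_{i_1 < i_2 < \cdots < i_l} \Pr\left(\bigcap_{r=1}^{l} E_{i_r}\right) + \cdots$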
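
The three lemmas can be checked by brute-force enumeration on a small probability space. The sketch below is only an illustration: the sample space (two fair dice) and the events `e1`, `e2`, `e3` are assumptions chosen for the example, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice, each with probability 1/36.
omega = set(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event (a subset of omega) under the uniform distribution."""
    return Fraction(len(event), len(omega))

e1 = {s for s in omega if s[0] == 6}         # first die shows 6
e2 = {s for s in omega if s[1] == 6}         # second die shows 6
e3 = {s for s in omega if s[0] + s[1] == 7}  # the sum is 7

# Lemma 1: Pr(E1 u E2) = Pr(E1) + Pr(E2) - Pr(E1 n E2)
lemma1_holds = pr(e1 | e2) == pr(e1) + pr(e2) - pr(e1 & e2)

# Lemma 2 (union bound): Pr(E1 u E2 u E3) <= Pr(E1) + Pr(E2) + Pr(E3)
union_bound_holds = pr(e1 | e2 | e3) <= pr(e1) + pr(e2) + pr(e3)

# Lemma 3 (inclusion-exclusion) written out for n = 3
incl_excl = (pr(e1) + pr(e2) + pr(e3)
             - pr(e1 & e2) - pr(e1 & e3) - pr(e2 & e3)
             + pr(e1 & e2 & e3))
```

Using `Fraction` keeps every probability exact, so the equalities can be compared without floating-point tolerance.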
Analysis of RANDOM_TEST
• The algorithm gives an incorrect answer only if F ≠ G and the random
number it chooses is a root of the polynomial F - G
• Let E represent the event that RANDOM_TEST fails to
give the correct answer
• The elements of the set corresponding to E are the roots of
the polynomial F - G that are in the set of integers {1, …, 100d}
• Since F - G is a nonzero polynomial of degree at most d, it has no more than d
roots → E has at most d simple events
• Thus, $\Pr(\text{RANDOM\_TEST fails}) = \Pr(E) \le d/(100d) = 1/100$
How to improve the algorithm for a
smaller failure probability?
• Can increase the sample space
• E.g. {1, …, 1000d}
• But this is not so good
• Repeat the algorithm multiple times, using
different random values to test
• If F(r) ≠ G(r) in just one of these rounds
then output “non-equivalent”
• Can sample from {1, …, 100d} many times with
or without replacement
• With replacement: like having no memory of
previous selections
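
Repeating the test with replacement can be sketched as below. This is a hedged illustration, assuming F and G are available as Python callables; the names `repeated_test`, `f`, `g_good` and `g_bad` are hypothetical. A single round with F(r) ≠ G(r) is a witness of non-equivalence, so the procedure can stop early.

```python
import random

def repeated_test(f, g, d, k, rng=random):
    """Run RANDOM_TEST k times, sampling with replacement from {1, ..., 100d}.

    Returns False ("non-equivalent") as soon as one round finds F(r) != G(r),
    and True ("equivalent") only if all k rounds agree.
    """
    for _ in range(k):
        r = rng.randint(1, 100 * d)
        if f(r) != g(r):
            return False  # r is a witness, so this answer is always correct
    return True  # can be wrong only if every sampled r was a root of F - G

def f(x):
    return (x + 1) * (x - 2)

def g_good(x):
    return x * x - x - 2  # the correct expansion of f

def g_bad(x):
    return x * x - x - 3  # differs from f by the constant 1
```

Here `repeated_test(f, g_bad, 2, 5)` always reports non-equivalence, because F - G is the constant 1 and has no roots at all.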

Notion of independence
• Def 3: Two events E and F are independent iff (if and only if)
$\Pr(E \cap F) = \Pr(E) \cdot \Pr(F)$
More generally, events $E_1, E_2, \dots, E_k$ are mutually independent iff for any
subset $I \subseteq [1, k]$: $\Pr\left(\bigcap_{i \in I} E_i\right) = \prod_{i \in I} \Pr(E_i)$
• Now suppose our algorithm samples with replacement
• The choice in one iteration is independent of the choices in previous
iterations
• Let $E_i$ be the event that the i-th run of the algorithm picks a root $r_i$ s.t. $F(r_i) - G(r_i) = 0$
• The probability that the algorithm returns a wrong answer is
$\Pr(E_1 \cap E_2 \cap \dots \cap E_k) = \prod_{i=1}^{k} \Pr(E_i) \le \prod_{i=1}^{k} \frac{d}{100d} = \left(\frac{1}{100}\right)^k$
• Sampling without replacement:
• The probability of choosing a given number is conditioned on the events
of the previous iterations

Notion of conditional probability

• Def 4: The conditional probability that event E
occurs given that event F occurs is
$\Pr(E \mid F) = \frac{\Pr(E \cap F)}{\Pr(F)}$
• Note: this conditional probability is only defined if $\Pr(F) > 0$
• When E and F are independent and $\Pr(F) > 0$, then
$\Pr(E \mid F) = \frac{\Pr(E \cap F)}{\Pr(F)} = \frac{\Pr(E) \cdot \Pr(F)}{\Pr(F)} = \Pr(E)$
• Intuitively, if two events are independent then
information about one event should not affect the
probability of the other event.
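
A small enumeration makes the definition concrete. The sketch below assumes a uniform space of two fair dice; the helper `pr_given` and the event names are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))  # two fair dice, uniform

def pr(event):
    """Probability of an event (a subset of omega)."""
    return Fraction(len(event), len(omega))

def pr_given(e, f):
    """Pr(E | F) = Pr(E n F) / Pr(F); only defined when Pr(F) > 0."""
    if pr(f) == 0:
        raise ValueError("Pr(F) must be positive")
    return pr(e & f) / pr(f)

first_is_6 = {s for s in omega if s[0] == 6}
second_is_6 = {s for s in omega if s[1] == 6}
sum_is_12 = {s for s in omega if sum(s) == 12}
```

The two dice are independent, so `pr_given(first_is_6, second_is_6)` equals `pr(first_is_6)`. In contrast, `sum_is_12` depends on the first die: conditioning on `first_is_6` raises its probability from 1/36 to 1/6.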
Sampling without replacement
• Again assume F ≠ G
• We repeat the algorithm k times: perform k iterations of
random sampling from {1, …, 100d}
• What is the probability that all k iterations yield roots of F - G,
resulting in a wrong output by our algorithm?
• Need to bound $\Pr(E_1 \cap E_2 \cap \dots \cap E_k)$:
$\Pr(E_1 \cap E_2 \cap \dots \cap E_k) = \Pr(E_k \mid E_1 \cap \dots \cap E_{k-1}) \cdot \Pr(E_1 \cap E_2 \cap \dots \cap E_{k-1})$
$= \Pr(E_1) \cdot \Pr(E_2 \mid E_1) \cdot \Pr(E_3 \mid E_1 \cap E_2) \cdots \Pr(E_k \mid E_1 \cap \dots \cap E_{k-1})$
• Need to bound $\Pr(E_j \mid E_1 \cap \dots \cap E_{j-1}) \le \frac{d-(j-1)}{100d-(j-1)}$. Why?
So $\Pr(E_1 \cap E_2 \cap \dots \cap E_k) \le \prod_{j=1}^{k} \frac{d-(j-1)}{100d-(j-1)} \le \left(\frac{1}{100}\right)^k$, slightly
better than sampling with replacement
• Use d+1 iterations: always gives the correct answer. Why?
Efficient?
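
The two error bounds can be compared numerically. The sketch below simply evaluates the product bound from this slide with exact fractions; the function names are hypothetical. It also shows the point behind the d+1 question: once d distinct roots have been drawn, no root is left, so the factor for j = d+1 is zero.

```python
from fractions import Fraction

def bound_without_replacement(d, k):
    """Product over j = 1..k of (d - (j-1)) / (100d - (j-1))."""
    bound = Fraction(1)
    for j in range(1, k + 1):
        bound *= Fraction(d - (j - 1), 100 * d - (j - 1))
    return bound

def bound_with_replacement(k):
    """The bound (1/100)^k for k independent rounds."""
    return Fraction(1, 100) ** k
```

For d = 5 and k = 3 the bound without replacement is strictly smaller than (1/100)^3, and for k = d + 1 = 6 it is exactly 0.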
Random variables

• Def 5: A random variable X on a sample space
$\Omega$ is a real-valued function on $\Omega$; that is, $X: \Omega \to \mathbb{R}$.
A discrete random variable is a random
variable that takes on only a finite or countably
infinite number of values
• So, “X = a” represents the set $\{s \in \Omega \mid X(s) = a\}$
• $\Pr(X = a) = \sum_{s \in \Omega:\, X(s) = a} \Pr(s)$
E.g. let X be the random variable representing the
sum of two dice. What is the probability that X = 4?
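
The dice question can be answered by applying the definition directly: sum Pr(s) over all outcomes s with X(s) = 4. The sketch below assumes two fair, independent dice; `distribution` is a hypothetical helper name.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def X(s):
    """Random variable: the sum of the two dice."""
    return s[0] + s[1]

def distribution(rv, space):
    """Map each value a to Pr(rv = a) on a uniform sample space."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for s in space:
        dist[rv(s)] += Fraction(1, len(space))
    return dict(dist)

dist_x = distribution(X, omega)
```

The outcomes with X = 4 are (1,3), (2,2) and (3,1), so `dist_x[4]` is 3/36 = 1/12.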
Random variables

• Def 6: Two random variables X and Y are independent iff
for all values x and y:
$\Pr((X = x) \cap (Y = y)) = \Pr(X = x) \cdot \Pr(Y = y)$
Expectation
• Def 7: The expectation of a discrete random variable
X, denoted by E[X], is given by $E[X] = \sum_i i \cdot \Pr(X = i)$
• where the summation is over all values in range of X
• Eg. compute the expectation of the random variable X
representing the sum of two dice
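
The expectation in this example can be computed straight from Definition 7. This is a minimal sketch assuming two fair dice; the names `expectation` and `dist_x` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def expectation(dist):
    """E[X] = sum over values i of i * Pr(X = i), for dist = {value: probability}."""
    return sum(value * p for value, p in dist.items())

# Distribution of the sum of two fair dice.
dist_x = {}
for a, b in product(range(1, 7), repeat=2):
    dist_x[a + b] = dist_x.get(a + b, Fraction(0)) + Fraction(1, 36)

e_x = expectation(dist_x)
```

With exact fractions the result is `e_x == 7`.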

Linearity of expectation
• Theorem:
• $E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$
• $E[cX] = c\,E[X]$ for any constant c
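
As a worked example (an illustration, with $X_1$ and $X_2$ assumed to denote the values of the first and second fair die), linearity gives the expected sum of two dice without enumerating all 36 outcomes. Note that the theorem does not require the $X_i$ to be independent.

```latex
E[X_1] = \sum_{i=1}^{6} i \cdot \frac{1}{6} = \frac{21}{6} = \frac{7}{2},
\qquad
E[X] = E[X_1 + X_2] = E[X_1] + E[X_2] = \frac{7}{2} + \frac{7}{2} = 7 .
```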

Bernoulli and Binomial random
variables
• Consider an experiment that succeeds with probability p
and fails with probability 1-p
• Let Y be a random variable that takes the value 1 if the experiment
succeeds and 0 otherwise. Such a r.v. is called a Bernoulli or
an indicator random variable
• E[Y] = p
• Now we want to count X, the number of successes in n trials
• A binomial random variable X with parameters n and p,
denoted by B(n, p), is defined by the following probability
distribution on j = 0, 1, 2, …, n:
$\Pr(X = j) = \binom{n}{j} p^j (1-p)^{n-j}$
• E.g. used a lot in sampling (book: Mitzenmacher and Upfal)
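
The binomial distribution above can be evaluated exactly as a sanity check. This is a minimal sketch; the function name `binomial_pmf` and the parameters n = 10, p = 1/4 are assumptions chosen for the example.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p, j):
    """Pr(X = j) for X ~ B(n, p): (n choose j) * p^j * (1 - p)^(n - j)."""
    return comb(n, j) * p**j * (1 - p)**(n - j)

n, p = 10, Fraction(1, 4)
pmf = [binomial_pmf(n, p, j) for j in range(n + 1)]
mean = sum(j * pmf[j] for j in range(n + 1))
```

The probabilities sum to 1 and the mean equals n·p = 5/2. This agrees with linearity of expectation, since X is the sum of n Bernoulli indicators, each with expectation p.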
The Hiring Problem Revisited

HIRE-ASSISTANT(n)
best ← 0    // candidate 0 is a least-qualified dummy candidate
for i ← 1 to n
    do interview candidate i
       if candidate i is better than candidate best
          then best ← i
               hire candidate i
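
A runnable version of HIRE-ASSISTANT might look as follows. This is a sketch under stated assumptions: candidates are represented by a list of numeric quality scores (higher is better), the dummy candidate 0 is modelled by negative infinity, and the function returns the 1-based indices of the hired candidates. None of these representation choices come from the slides.

```python
def hire_assistant(quality):
    """Return the 1-based indices of the candidates that get hired.

    quality[i - 1] is the (assumed numeric) quality of candidate i.
    """
    best_quality = float("-inf")  # dummy candidate 0: worse than everyone
    hired = []
    for i, q in enumerate(quality, start=1):
        # "Interview" candidate i, then hire if better than the current best.
        if q > best_quality:
            best_quality = q
            hired.append(i)
    return hired
```

For instance, `hire_assistant([2, 5, 1, 7, 3])` hires candidates 1, 2 and 4, while a list in increasing order of quality hires every candidate.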

Cost Analysis
• We are not concerned with the running time of
HIRE-ASSISTANT, but instead with the cost
incurred by interviewing and hiring.
• Interviewing has a low cost, say $c_i$, whereas hiring
is expensive, costing $c_h$. Let m be the number of
people hired. Then the cost associated with this
algorithm is $O(n c_i + m c_h)$. No matter how many
people we hire, we always interview n
candidates and thus always incur the cost $n c_i$
associated with interviewing.
Worst-case analysis

• In the worst case, we actually hire every candidate
that we interview. This situation occurs if the
candidates come in increasing order of quality, in
which case we hire n times, for a total hiring cost of
$O(n c_h)$.
Probabilistic analysis

• Probabilistic analysis is the use of
probability in the analysis of problems. In
order to perform a probabilistic analysis, we
must use knowledge of the distribution of the
inputs.
• For the hiring problem, we can assume that
the applicants come in a random order.
Randomized algorithm

• We call an algorithm randomized if its behavior is
determined not only by its input but also by values
produced by a random-number generator.
Indicator random variables
The indicator random variable I{A}
associated with event A is defined as
$I\{A\} = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{if } A \text{ does not occur} \end{cases}$
• Lemma: Given a sample space $\Omega$ and an
event A in the sample space $\Omega$, let $X_A = I\{A\}$.
Then $E[X_A] = \Pr(A)$.
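
The lemma follows in one line from the definition of expectation. In the derivation below, $\bar{A}$ is notation introduced here for the complement $\Omega \setminus A$.

```latex
E[X_A] = E[I\{A\}]
       = 1 \cdot \Pr(A) + 0 \cdot \Pr(\bar{A})
       = \Pr(A)
```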

Analysis of the hiring problem using
indicator random variables
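• Let X be the random variable whose value equals
the number of times we hire a new office
assistant, and let $X_i$ be the indicator random variable
associated with the event in which the i-th
candidate is hired. Thus,
$X = X_1 + X_2 + \dots + X_n$
By the lemma above, we have
$E[X_i] = \Pr\{\text{candidate } i \text{ is hired}\} = 1/i$, because candidate i is hired exactly when it is the best of the first i candidates, and under a random arrival order each of these i candidates is equally likely to be the best. Thus, by linearity of expectation,
$E[X] = 1 + 1/2 + 1/3 + \dots + 1/n = \ln n + O(1)$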
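
For small n the result E[X] = 1 + 1/2 + … + 1/n can be verified exactly by averaging the number of hires over all n! arrival orders. The sketch below is a brute-force check, not an efficient method; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def count_hires(order):
    """Number of hires when candidates arrive with the given quality ranks."""
    best = float("-inf")
    hires = 0
    for q in order:
        if q > best:
            best = q
            hires += 1
    return hires

def expected_hires(n):
    """Exact average number of hires over all n! equally likely orders."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(count_hires(p) for p in perms), len(perms))

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))
```

For n = 4 both `expected_hires(4)` and `harmonic(4)` equal 25/12.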

Randomized algorithms
RANDOMIZED-HIRE-ASSISTANT(n)
randomly permute the list of candidates
best ← 0
for i ← 1 to n
    do interview candidate i
       if candidate i is better than candidate best
          then best ← i
               hire candidate i
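
The randomized variant can be sketched in Python as below. The representation is an assumption: candidates are numeric quality scores, the random permutation is delegated to the standard library's `shuffle`, and the function returns only the number of hires.

```python
import random

def randomized_hire_assistant(quality, rng=random):
    """Shuffle the candidates, then run the hiring loop; return the number of hires."""
    candidates = list(quality)  # copy, so the caller's list is not modified
    rng.shuffle(candidates)     # randomly permute the list of candidates
    best = float("-inf")        # dummy candidate 0
    hires = 0
    for q in candidates:
        if q > best:
            best = q
            hires += 1
    return hires
```

Because the algorithm itself randomizes the order, the expected number of hires no longer depends on the order in which the input happens to arrive. Passing a seeded `random.Random` instance as `rng` makes a run reproducible.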

Food for thought
PERMUTE-BY-SORTING(A)
n ← length[A]
for i ← 1 to n
    do P[i] ← RANDOM(1, n^3)
sort A, using P as sort keys
return A
Lemma: Procedure PERMUTE-BY-SORTING
produces a uniform random permutation of the input,
assuming that all priorities are distinct.
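
PERMUTE-BY-SORTING translates to Python roughly as follows. This is a sketch that returns a new list rather than sorting A itself; `rng.randint(1, n ** 3)` plays the role of RANDOM(1, n^3).

```python
import random

def permute_by_sorting(a, rng=random):
    """Return a permutation of a, ordered by random priorities in {1, ..., n^3}."""
    n = len(a)
    priorities = [rng.randint(1, n ** 3) for _ in range(n)]
    # Sort positions by priority. If two priorities collide, the stable sort
    # keeps their original order, so the lemma's distinctness assumption matters.
    order = sorted(range(n), key=lambda i: priorities[i])
    return [a[i] for i in order]
```

The large range n^3 is what makes collisions between priorities unlikely, which is the case the lemma covers.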
Food for thought
RANDOMIZE-IN-PLACE(A)
n ← length[A]
for i ← 1 to n
    do swap A[i] ↔ A[RANDOM(i, n)]

Lemma: Procedure RANDOMIZE-IN-PLACE
computes a uniform random permutation.
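
RANDOMIZE-IN-PLACE can be sketched in Python as below, with the indices shifted to 0-based: RANDOM(i, n) becomes `rng.randint(i, n - 1)`, which includes both endpoints.

```python
import random

def randomize_in_place(a, rng=random):
    """Permute the list a in place and return it."""
    n = len(a)
    for i in range(n):
        j = rng.randint(i, n - 1)  # RANDOM(i, n) in 0-based indexing
        a[i], a[j] = a[j], a[i]    # swap A[i] with A[j]
    return a
```

Each position i receives an element chosen uniformly from those not yet fixed. A common mistake is to draw j from the whole range 0..n-1 at every step, which does not give a uniform permutation.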

Thank you for
your attention!
