
CS210: Data Structures and Algorithms 23 Aug 2013

Notes 10 Probabilistic Analysis


1 The hiring problem and basic probability
A perfectionist office supervisor has a vacant position for an office assistant. He interviews
one candidate every day. His hiring policy is as follows: if, after the interview, he evaluates
the candidate to have better skills than the current office assistant, then he hires this
candidate as the new office assistant and lets the current office assistant go. If he has
interviewed n candidates, what is the average (expected) number of candidates that he
will have hired in total (of whom all but one are let go)?
Problems such as these are often addressed in the context of probabilistic analysis and
the analysis of randomized algorithms. In the hiring problem, the number of people hired
as office assistants depends on the order in which the candidates appear for interview. If
the candidates appear in order of decreasing skill, then only the first candidate is hired.
If the candidates appear in order of increasing skill, then all n candidates are hired.
The interesting question, however, is: if the candidates appear in random order, how many
of them are hired on average?
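Before analyzing this formally, it is instructive to simulate the process. The following is a
minimal sketch, not part of the original notes: the function name count_hires and the use
of Python's random.shuffle to model a random arrival order are our own choices.

    import random

    def count_hires(n):
        """Simulate the hiring policy on a uniformly random ordering of n candidates."""
        skills = list(range(1, n + 1))   # candidates identified by skill rank 1..n
        random.shuffle(skills)           # candidates arrive in random order
        hires = 0
        best_so_far = 0                  # skill of the current office assistant
        for skill in skills:
            if skill > best_so_far:      # strictly better than the current assistant
                hires += 1
                best_so_far = skill
        return hires

    # Average over many random orderings to estimate the expected number of hires.
    trials, n = 10000, 100
    avg = sum(count_hires(n) for _ in range(trials)) / trials
    print(f"average hires for n={n}: {avg:.2f}")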
Let us revisit some basic probability with which you are familiar. A sample space S is a
finite set or a countably infinite set (e.g., N). For example, the sample space S_1 for a single
coin toss is S_1 = {h, t}, and the sample space for three coin tosses is S_3 = {hhh, hht, hth,
htt, thh, tht, tth, ttt}. An event E is a subset of the sample space. For example, an
event E from the sample space of 3 coin tosses could be the set of possibilities with 2 heads
and 1 tail. This corresponds to E = {hht, hth, thh}. We say that events E and F are
mutually exclusive if E ∩ F = ∅.
Axioms of probability. A probability distribution Pr{·} on a sample space S is a mapping
from events of S (i.e., subsets of S) to real numbers satisfying the following probability
axioms.

1. Pr{A} ≥ 0 for any event A.
2. Pr{S} = 1.
3. Pr{A ∪ B} = Pr{A} + Pr{B} for any two mutually exclusive events A and B. More
   generally, for any finite or countably infinite sequence of events A_1, A_2, ... that are
   pairwise mutually exclusive,

   \Pr\Big\{\bigcup_i A_i\Big\} = \sum_i \Pr\{A_i\} .
We call Pr{A} the probability of the event A. All standard results on probability follow
from the above definitions. For example, it follows that (i) Pr{∅} = 0; (ii) if A ⊆ B, then
Pr{A} ≤ Pr{B}; (iii) if we write Ā to denote the event S − A (the complement of A),
then Pr{Ā} = 1 − Pr{A}. For any two events A and B,

\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{A \cap B\} \le \Pr\{A\} + \Pr\{B\} .
Often the distributions that we encounter are uniform, that is, each element s ∈ S is
equally likely. Then, for each s ∈ S, Pr{s} = 1/|S|.
Example 1. Suppose we flip a fair coin n times. The sample space is the n-fold Cartesian
product {h, t} × {h, t} × ... × {h, t}, or equivalently, the set of all n-character sequences
over the letters {h, t}. Each elementary event in S is an n-character sequence over the
letters {h, t}, and each such sequence has probability 1/2^n. Let A be the following event:

A = { exactly k heads and n − k tails occur } .

The number of n-character sequences that are members of A is \binom{n}{k}. Thus
|A| = \binom{n}{k}, and so \Pr\{A\} = \binom{n}{k}/2^n.
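As a sanity check, one can enumerate the sample space for a small n and compare the count
of favorable outcomes against the binomial formula. A minimal sketch of ours, not part of
the original notes (math.comb requires Python 3.8+):

    from itertools import product
    from math import comb

    n, k = 5, 2
    outcomes = list(product("ht", repeat=n))        # the sample space {h, t}^n
    A = [s for s in outcomes if s.count("h") == k]  # event: exactly k heads
    print(len(A), comb(n, k))                       # both print 10 = C(5, 2)
    print(len(A) / 2**n, comb(n, k) / 2**n)         # Pr{A} both ways: 0.3125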
Discrete random variables

A discrete random variable X is a function from a sample space to the real numbers. For
example, let S be the sample space of the outcomes of n tosses of a fair coin, S = {h, t}^n.
Define X to be the number of heads in an outcome. Then X : {h, t}^n → {0, 1, ..., n} and

\Pr\{X = k\} = \binom{n}{k}/2^n .
For a random variable X and a real number x, we define the event X = x to be
{s ∈ S : X(s) = x}. Hence,

\Pr\{X = x\} = \sum_{s \in S : X(s) = x} \Pr\{s\} .

The function

f(x) = \Pr\{X = x\}

is called the probability density function of the random variable X.
We often define several random variables on the same sample space. If X and Y are random
variables, the function

f(x, y) = \Pr\{X = x \text{ and } Y = y\}

is the joint probability density function of X and Y.
Expected value of a random variable

The simplest and most useful summary (statistic) of the distribution of a random
variable is the average of the values it takes. The expected value of a discrete random
variable X is

E[X] = \sum_x x \Pr\{X = x\} .
Example 2. Suppose we flip a fair coin n times and let X be the random variable that
counts the number of heads. Then

E[X] = \sum_{k=0}^{n} k \binom{n}{k}/2^n .

We know by the binomial theorem that

(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k .

Differentiating both sides with respect to x, we obtain the identity

n(1 + x)^{n-1} = \sum_{k=0}^{n} k \binom{n}{k} x^{k-1} .

Setting x = 1, we obtain

n \cdot 2^{n-1} = \sum_{k=0}^{n} k \binom{n}{k} .

Substituting into the expression for E[X], we have

E[X] = \frac{n \cdot 2^{n-1}}{2^n} = n/2 .
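One can check this identity numerically by evaluating the sum directly; a small sketch of
ours (math.comb requires Python 3.8+):

    from math import comb

    n = 10
    # E[X] = sum over k of k * C(n, k) / 2^n; should equal n/2 = 5.0
    expectation = sum(k * comb(n, k) for k in range(n + 1)) / 2**n
    print(expectation)  # 5.0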
Example 3. Consider a game in which you flip two fair coins. You earn Rs. 3 for each
head and lose Rs. 2 for each tail. What is the expected value of your gain/loss? There
are four possibilities {hh, ht, th, tt}. For the first possibility, we gain 3 + 3 = 6 Rs.; for
the second possibility, we gain 3 − 2 = 1 Re.; for the third possibility, we gain −2 + 3 = 1
Re.; and for the final possibility, we lose 2 + 2 = 4 Rs. Each of the possibilities occurs with
probability 1/4. So, by the definition of expectation, we have

E[X] = \sum_x x \Pr\{X = x\}
     = 6 \Pr\{2h\} + 1 \Pr\{1h, 1t\} - 4 \Pr\{2t\}
     = 6(1/4) + 1(2/4) - 4(1/4) = 1 .

We note that another, equivalent way of writing the expression for expectation is
E[X] = \sum_{s \in S} X(s) \Pr\{s\}, which in this example equals \sum_{s \in S} X(s)/4.
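The same answer falls out of a brute-force enumeration of the four outcomes; a quick
sketch of ours:

    from itertools import product

    # Gain is +3 per head and -2 per tail; each of the 4 outcomes has probability 1/4.
    gains = [sum(3 if c == "h" else -2 for c in outcome)
             for outcome in product("ht", repeat=2)]
    print(sum(gains) / 4)  # expected gain: 1.0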
We now give a proof of linearity of expectation.
Theorem 1. Let X and Y be two random variables. Then E[X + Y] = E[X] + E[Y].

Proof. Let Z = X + Y. Let X take values in the set D_1 ⊆ R and Y take values in the set
D_2 ⊆ R. Writing D_1 + D_2 for the set {a + b : a ∈ D_1, b ∈ D_2}, we have

E[Z] = \sum_{c \in D_1 + D_2} c \Pr\{X + Y = c\} .

We can write this sum in another way as follows. X + Y = c if and only if X takes some
value a and Y takes some value b such that a + b = c. There may be multiple decompositions
of c into a + b, a' + b', etc. So \Pr\{X + Y = c\} = \sum_{a, b : a + b = c} \Pr\{X = a \text{ and } Y = b\}.
Then the above sum is

\sum_{a \in D_1, \, b \in D_2} (a + b) \Pr\{X = a \text{ and } Y = b\} .
This may be simplified as follows.

E[X + Y] = \sum_{a \in D_1, \, b \in D_2} (a + b) \Pr\{X = a \text{ and } Y = b\}
         = \sum_{a \in D_1, \, b \in D_2} a \Pr\{X = a \text{ and } Y = b\}
           + \sum_{a \in D_1, \, b \in D_2} b \Pr\{X = a \text{ and } Y = b\} .
We can rewrite the first summation \sum_{a \in D_1, \, b \in D_2} a \Pr\{X = a \text{ and } Y = b\}
as follows.

\sum_{a \in D_1, \, b \in D_2} a \Pr\{X = a \text{ and } Y = b\}
  = \sum_{a \in D_1} a \sum_{b \in D_2} \Pr\{X = a \text{ and } Y = b\}
  = \sum_{a \in D_1} a \Pr\{X = a\} .

Note that \sum_{b \in D_2} \Pr\{X = a \text{ and } Y = b\} is simply \Pr\{X = a\}, since the
events {Y = b} for b ∈ D_2 partition the sample space. Further, by the definition of E[X],

\sum_{a \in D_1} a \Pr\{X = a\} = E[X] .
Likewise, we can show that

\sum_{a \in D_1, \, b \in D_2} b \Pr\{X = a \text{ and } Y = b\}
  = \sum_{b \in D_2} b \sum_{a \in D_1} \Pr\{X = a \text{ and } Y = b\}
  = \sum_{b \in D_2} b \Pr\{Y = b\}
  = E[Y] .

Therefore E[X + Y] = E[X] + E[Y].
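Linearity gives a shorter route to Example 3: write the total gain as X_1 + X_2, where X_i
is the gain from coin i, so E[X_i] = 3(1/2) − 2(1/2) = 1/2 and the expected total gain is 1.
Below is a small numeric check of E[X + Y] = E[X] + E[Y], a sketch of ours using this
per-coin decomposition:

    from itertools import product

    gain = lambda c: 3 if c == "h" else -2

    # Joint computation: E[X1 + X2] over all four equally likely outcomes.
    e_joint = sum(gain(a) + gain(b) for a, b in product("ht", repeat=2)) / 4

    # Separate computation: E[X1] + E[X2], each over a single fair coin.
    e_single = sum(gain(c) for c in "ht") / 2
    print(e_joint, 2 * e_single)  # 1.0 1.0 -- they agree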
Example 4. Let us solve the hiring problem. For any candidate j, let rank(j) denote the
rank of this candidate, that is, the position of this candidate in a non-decreasing sorted
order of the candidates in terms of their skill. Further, we will assume that the candidates
arrive in a random order, that is to say, all permutations of the rank order are equally
likely.
For 1 ≤ i ≤ n, define the indicator variable

X_i = \begin{cases} 1 & \text{if the ith candidate is hired,} \\ 0 & \text{if the ith candidate is not hired.} \end{cases}

Let X = \sum_{i=1}^{n} X_i. Then X is the number of candidates hired. What is the
probability that X_i = 1?
X_i = 1 if and only if the ith candidate has the best rank among the first i candidates.
There are two ways to calculate this probability: one uses the symmetry of the problem,
and the other works from first principles. In the first argument, we note that for any fixed
set of i candidates, the best-ranked candidate among them may be the first, or the second,
..., or the ith candidate, each with equal probability. Hence, the probability that the ith
candidate has the best rank among the first i candidates is 1/i.
Another way to view this is as follows. Let us first count the number of ways in which the
ith candidate has the best rank among the first i candidates. Choose a set of i candidates
from the n candidates; this can be done in \binom{n}{i} ways. Now we have to place these
i candidates in the first i positions in such a way that the best candidate is in position i.
This may be done in (i − 1)! ways. Thus, the number of ways in which the best candidate
is in position i is

\binom{n}{i} (i - 1)! .

The total (unrestricted) number of ways in which the first i candidates may be chosen and
placed is

\binom{n}{i} \, i! .

Thus the probability is

\frac{\binom{n}{i} (i - 1)!}{\binom{n}{i} \, i!} = 1/i .
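One can confirm Pr{X_i = 1} = 1/i from first principles for a small n by enumerating all
permutations of the ranks; a sketch of ours:

    from itertools import permutations
    from fractions import Fraction

    n = 5
    perms = list(permutations(range(1, n + 1)))  # all n! equally likely arrival orders
    for i in range(1, n + 1):
        # X_i = 1 iff position i holds the best rank among the first i positions.
        favorable = sum(1 for p in perms if p[i - 1] == max(p[:i]))
        print(i, Fraction(favorable, len(perms)))  # prints 1/i for each i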
We note that for any indicator variable X_i,

E[X_i] = 0 \cdot \Pr\{X_i = 0\} + 1 \cdot \Pr\{X_i = 1\} = \Pr\{X_i = 1\} .
Hence, by linearity of expectation, we have

E[X] = E[X_1] + E[X_2] + ... + E[X_n]
     = \Pr\{X_1 = 1\} + \Pr\{X_2 = 1\} + ... + \Pr\{X_n = 1\}
     = 1/1 + 1/2 + 1/3 + ... + 1/n
     = H_n   (the nth harmonic number)
     ≈ ln n + γ ,

where γ = 0.577... is Euler's constant.
Thus, in expectation, H_n candidates are hired (under the assumption that the candidates
arrive in random order).
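The simulation sketch from the beginning of these notes can be checked against this
formula: for n = 100 the expected number of hires is H_100 ≈ 5.19. A quick comparison of
the exact harmonic number with the ln n + γ approximation (again a sketch of ours):

    import math

    n = 100
    h_n = sum(1 / i for i in range(1, n + 1))  # the exact harmonic number H_n
    approx = math.log(n) + 0.5772156649        # ln n + Euler's constant
    print(f"H_{n} = {h_n:.4f}, ln n + gamma = {approx:.4f}")
    # H_100 = 5.1874, ln n + gamma = 5.1824 -- the simulated average from the
    # sketch at the start of these notes should land close to these values.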