Assignment4 Solution
Date Assigned:01/21/2015
Date Due: 5pm on 01/28/2015
Homework box in 2131 Kemper Hall
Question 1
Suppose X ∼ Binom(n, p) and Y ∼ Binom(m, p), with Y independent of X.
1. Show that X + Y ∼ Binom(n + m, p)
2. Show that X − Y is not Binomial
Answer
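Write X as a sum of n independent Bernoulli(p) indicators, $X = I_1 + I_2 + \cdots + I_n$, and similarly, $Y = I_{n+1} + I_{n+2} + \cdots + I_{n+m}$. Since X and Y are independent, all n + m indicators are independent, so
$$
T = X + Y = I_1 + I_2 + \cdots + I_{m+n} \sim \text{Binom}(n + m, p).
$$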
It is also possible to show this using the PMF, and it is good practice to be able to do so. We want to find $P(X + Y = k)$. To do that, we condition on X:
$$
\begin{aligned}
P(X + Y = k) &= \sum_{j=0}^{k} P(X + Y = k \mid X = j)\, P(X = j) \\
&= \sum_{j=0}^{k} P(X + Y = k \mid X = j) \binom{n}{j} p^{j} (1 - p)^{n-j} \\
&= \sum_{j=0}^{k} P(Y = k - j) \binom{n}{j} p^{j} (1 - p)^{n-j} \qquad \text{(since X and Y are independent)} \\
&= \sum_{j=0}^{k} \binom{m}{k-j} p^{k-j} (1 - p)^{m-k+j} \binom{n}{j} p^{j} (1 - p)^{n-j} \\
&= p^{k} (1 - p)^{m+n-k} \sum_{j=0}^{k} \binom{m}{k-j} \binom{n}{j} \\
&= \binom{m+n}{k} p^{k} (1 - p)^{m+n-k},
\end{aligned}
$$
where the last step uses Vandermonde's identity, $\sum_{j=0}^{k} \binom{m}{k-j} \binom{n}{j} = \binom{m+n}{k}$. This is the PMF of Binom(m + n, p).
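The conditioning argument above can be checked numerically. The following is a minimal sketch in R, assuming illustrative values n = 5, m = 7 and p = 0.3 (these are not part of the assignment). It evaluates the convolution sum for every k and compares it with the Binom(n + m, p) PMF.

```r
# Illustrative parameters (assumed for this check, not from the assignment)
n <- 5
m <- 7
p <- 0.3

# P(X + Y = k) computed by conditioning on X = j, for k = 0, ..., n + m
k <- 0:(n + m)
conv <- sapply(k, function(kk) {
  j <- 0:kk
  # dbinom returns 0 outside the support, so terms with j > n or kk - j > m vanish
  sum(dbinom(j, n, p) * dbinom(kk - j, m, p))
})

# Largest absolute difference from the Binom(n + m, p) PMF
max_diff <- max(abs(conv - dbinom(k, n + m, p)))
print(max_diff)
```

The difference should be zero up to floating-point rounding, in agreement with the derivation.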
2. Show that X − Y is not Binomial
T = X − Y can take negative values, but by definition a Binomial RV only takes values in {0, 1, . . . , N} for some N. Hence, T is not Binomial.
Question 2
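The CDF of X is given by
$$
F(b) =
\begin{cases}
0 & b < 0 \\
\dfrac{b}{4} & 0 \le b < 1 \\
\dfrac{1}{2} + \dfrac{b - 1}{4} & 1 \le b < 2 \\
\dfrac{11}{12} & 2 \le b < 3 \\
1 & 3 \le b
\end{cases}
$$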
1. Find P (X = i), i = 1, 2, 3
Answer
First draw the CDF. It is a mixture of continuous and discrete distributions: it has jumps (discrete mass) at X = 1, 2, 3 and is continuous on the intervals (0, 1) and (1, 2). The PMF values at X = 1, 2, 3 are the sizes of the jumps at those points, P(X = i) = F(i) − F(i−).
P (X = 1) = 1/4
P (X = 2) = 1/6
P (X = 3) = 1/12
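For the continuous regions, compute the probability as the area under the density F′(b) = 1/4 on (0, 1) and (1, 2), which gives 1/4 + 1/4 = 1/2. Adding the jumps, 1/4 + 1/6 + 1/12 = 1/2, shows that the total probability is 1.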
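The jump sizes can also be read off numerically. The sketch below, in R, encodes the CDF from the question and approximates the left limit F(i−) by evaluating F slightly to the left of i. The function names cdf and jump and the step eps are choices made for this illustration.

```r
# Piecewise CDF F(b) from Question 2
cdf <- function(b) {
  if (b < 0) {
    0
  } else if (b < 1) {
    b / 4
  } else if (b < 2) {
    1 / 2 + (b - 1) / 4
  } else if (b < 3) {
    11 / 12
  } else {
    1
  }
}

# Jump of F at x, approximating the left limit F(x-) by F(x - eps)
jump <- function(x, eps = 1e-9) cdf(x) - cdf(x - eps)

jumps <- sapply(1:3, jump)
print(jumps)        # approximately 1/4, 1/6, 1/12

# Mass of the continuous part: density 1/4 on (0, 1) and on (1, 2)
continuous_mass <- 1 / 4 + 1 / 4
print(sum(jumps) + continuous_mass)
```

The last line should print a value equal to 1 up to the small error introduced by eps.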
Question 3
Suppose a coin is tossed repeatedly until a head is obtained for the first time. Let p be the probability of a head. Let
the random variable X denote the number of tosses that are required (including the toss that landed heads). Find the
CDF of X. Plot the CDF of X for p = 1/2 and p = 3/4 using R.
Answer
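Since the random variable X denotes the number of tosses that are required, we know that X is discrete, with $P(X = n) = (1 - p)^{n-1} p$ for n = 1, 2, . . . (the first n − 1 tosses are tails and the nth is a head). Then
$$
\begin{aligned}
F(1) &= P(X = 1) = p \\
F(2) &= F(1) + P(X = 2) = p + (1 - p)p \\
&\;\;\vdots \\
F(n) &= F(n - 1) + P(X = n) \\
&= p + (1 - p)p + \cdots + (1 - p)^{n-1} p \\
&= p \left( 1 + (1 - p) + \cdots + (1 - p)^{n-1} \right) \\
&= p \, \frac{1 - (1 - p)^{n}}{1 - (1 - p)} \\
&= p \, \frac{1 - (1 - p)^{n}}{p} \\
&= 1 - (1 - p)^{n}
\end{aligned}
$$
and
$$
\lim_{n \to \infty} F(n) = 1.
$$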
This is the geometric distribution. Use the following program to plot the graphs for p = 1/2 and p = 3/4.
n <- 20
k <- seq(1, n)         # vector of numbers of tosses
p <- 1/2               # replace with 3/4 for the other plot
pr <- pgeom(k - 1, p)  # pgeom counts failures before the first success, so pass k - 1
barplot(pr, names.arg = k, ylab = "CDF", xlab = "# of tosses")
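As a sanity check on the derivation, the closed form F(n) = 1 − (1 − p)^n can be compared with R's built-in pgeom. This is a small sketch; the helper name geom_cdf and the column names are introduced here for illustration only.

```r
# Closed-form CDF derived in the answer: F(n) = 1 - (1 - p)^n, n = 1, 2, ...
geom_cdf <- function(n, p) 1 - (1 - p)^n

n <- 1:10
comparison <- data.frame(
  n           = n,
  closed_half = geom_cdf(n, 1/2),
  pgeom_half  = pgeom(n - 1, 1/2),   # n - 1 failures before the first success
  closed_3q   = geom_cdf(n, 3/4),
  pgeom_3q    = pgeom(n - 1, 3/4)
)
print(comparison)
```

The closed-form and pgeom columns should agree for both values of p.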
Figure 1: CDF of X for p = 1/2
Question 4
A hash table is a very important data structure in computer science. It is used for fast information retrieval. It stores
data as a <key, value> pair, where the value is indexed by the key. Note that keys must be unique. Consider the
example of storing a person's name using the social security number (ssn) as the key. For each ssn x, a hash function
h is used, where h(x) is the location to store the name of x. Once we have created a table, to look up the name for
ssn x, we can recompute h(x) and then look up what is stored in that location. In Python, dictionaries are based
on hash tables. Typically, the hash function h is deterministic; we do not want to get different results every time we
compute h(x). But h is often chosen to be pseudo-random. For this problem, we will assume that h is truly random.
Suppose there are k people, with each person's name stored in a random location (independently), represented by
an integer between 1 and n, k < n. It may happen that one location has more than one name stored there, if two
different people's ssns x and y end up with the same random location for their names to be stored.
Answer
Let $I_j$ be an indicator random variable equal to 1 if the jth location is empty, and 0 otherwise, for 1 ≤ j ≤ n. Then
$P(I_j = 1) = (1 - 1/n)^{k}$, since the names are stored in independent random locations. Then $I_1 + \cdots + I_n$ is the
number of empty locations. By linearity of expectation, we have
$$
E\left( \sum_{j=1}^{n} I_j \right) = \sum_{j=1}^{n} E(I_j) = n (1 - 1/n)^{k}.
$$
Similarly, the probability of a specific location having exactly 1 name stored is $\frac{k}{n} (1 - 1/n)^{k-1}$. Hence, the expected
number of such locations is $k (1 - 1/n)^{k-1}$. Finally, the expected number of locations with more than 1 name is $n - n (1 - 1/n)^{k} - k (1 - 1/n)^{k-1}$.
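The three expectations can be compared against a simulation of the random hashing. The following R sketch assumes illustrative values n = 20 and k = 10 and a fixed seed. The function names expected_counts and simulate_counts are hypothetical helpers written for this check.

```r
# Expected numbers of locations holding 0, exactly 1, or more than 1 name,
# for k names placed uniformly and independently into n locations
expected_counts <- function(n, k) {
  empty  <- n * (1 - 1 / n)^k
  single <- k * (1 - 1 / n)^(k - 1)
  c(empty = empty, single = single, multiple = n - empty - single)
}

# Monte Carlo estimate of the same three quantities
simulate_counts <- function(n, k, reps = 10000) {
  counts <- replicate(reps, {
    # number of names landing in each of the n locations
    load <- tabulate(sample.int(n, k, replace = TRUE), nbins = n)
    c(empty = sum(load == 0), single = sum(load == 1), multiple = sum(load > 1))
  })
  rowMeans(counts)
}

set.seed(1)                      # assumed seed, for reproducibility only
exact     <- expected_counts(n = 20, k = 10)
simulated <- simulate_counts(n = 20, k = 10)
print(rbind(exact, simulated))
```

The simulated row should be close to the exact row, and the three exact values add up to n since every location falls in exactly one of the three categories.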
Question 5
This problem is based on the Poisson distribution. We will cover this on Friday. Suppose the number of accidents
occurring on the highway each day is a Poisson random variable with parameter λ = 3.
1. Find the probability that 3 or more accidents occur today.
2. Repeat the above under the assumption that at least 1 accident occurs today.
Answer
1. Find the probability that 3 or more accidents occur today.
Using the R command pr <- 1 - ppois(2, 3), we get the value 0.5768099.
2. Repeat the above under the assumption that at least 1 accident occurs today.
Let A denote the event that 3 or more accidents occur today.
Let B denote the event that at least 1 accident occurs today.
Then A ∩ B = A, so what we want is
$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)}.
$$
P(A) = 1 − ppois(2, 3) = 0.5768099
P(B) = 1 − ppois(0, 3) = 0.9502129
The answer is: $P(A \mid B) = \frac{P(A)}{P(B)} = 0.6070323$
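The two parts can be reproduced in a few lines of R. This is a minimal sketch; the variable names lambda, p_a, p_b and p_a_given_b are chosen here for readability.

```r
lambda <- 3

# P(A): 3 or more accidents, i.e. 1 - P(X <= 2)
p_a <- 1 - ppois(2, lambda)

# P(B): at least 1 accident, i.e. 1 - P(X = 0)
p_b <- 1 - ppois(0, lambda)

# A is a subset of B, so P(A | B) = P(A) / P(B)
p_a_given_b <- p_a / p_b

print(c(p_a = p_a, p_b = p_b, p_a_given_b = p_a_given_b))
```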
Question 6
Consider a random permutation of 1, 2, 3, . . . , n where n ≥ 2. A number is a local maximum if it is greater than the
two adjacent numbers. In the case of the two endpoints, the comparison is with just the one adjacent number. Consider
the following examples using random permutations of the numbers from 1 to 7.
Example 1:
3 2 1 4 7 5 6
7 is a local maximum
3 is a local maximum
6 is a local maximum
Example 2:
5 6 3 1 2 7 4
6 is a local maximum
7 is a local maximum
Find the expected number of local maxima. (Hint: use indicator random variables).
Answer
Let $I_i$ be the indicator RV of position i holding a local maximum, 1 ≤ i ≤ n. Then T, which denotes the number of
local maxima, is given by $T = I_1 + I_2 + \cdots + I_n$, from which we get
$$
E(T) = E(I_1 + I_2 + \cdots + I_n) = \frac{n - 2}{3} + 2 \cdot \frac{1}{2} = \frac{n + 1}{3}.
$$
For the n − 2 interior positions, given the 3 numbers in positions i − 1, i, i + 1, there are 3! permutations, of which 2 correspond
to a local maximum at position i, which gives the probability 1/3. Similarly, for the two endpoints the probability is 1/2.
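The result E(T) = (n + 1)/3 can be checked by simulation. The R sketch below defines a hypothetical helper count_local_maxima, applies it to the two examples from the question, and then averages over random permutations of 1, . . . , 7 with an assumed seed and replication count.

```r
# Count local maxima in a permutation; endpoints are compared with their
# single neighbour by padding with -Inf on both sides
count_local_maxima <- function(perm) {
  n <- length(perm)
  left  <- c(-Inf, perm[-n])   # left neighbour of each position
  right <- c(perm[-1], -Inf)   # right neighbour of each position
  sum(perm > left & perm > right)
}

ex1 <- count_local_maxima(c(3, 2, 1, 4, 7, 5, 6))   # Example 1: 3, 7 and 6
ex2 <- count_local_maxima(c(5, 6, 3, 1, 2, 7, 4))   # Example 2: 6 and 7
print(c(ex1, ex2))

set.seed(1)                                          # assumed seed
n <- 7
estimate <- mean(replicate(20000, count_local_maxima(sample(n))))
print(c(simulated = estimate, formula = (n + 1) / 3))
```

The simulated mean should be close to (7 + 1)/3 ≈ 2.667.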
Question 7
You start a business on cloud services. You buy processing capacity at 10 cents per CPU (per minute) and sell it at
15 cents per CPU (per minute). Any CPU capacity that is not used is lost. Suppose you have 10 clients and for any
given minute they independently and randomly request a CPU with probability p = 1/3. Approximately how many
CPUs should you purchase to maximize the expected profit?
Answer
Let's define the following variables:
s = # CPUs purchased
b = net profit for each CPU sold = 15 − 10 = 5
ℓ = net loss for each CPU not sold = 10
X = # CPUs requested ∼ Binomial(10, 1/3)
The profit, as a function of the number of CPUs requested, is then
$$
P(X) =
\begin{cases}
bX - (s - X)\ell & X \le s \\
bs & X > s
\end{cases}
$$
The expected profit is just the expected value of this function, i.e.
$$
\begin{aligned}
E[P(X)] &= \sum_{x=0}^{10} P(x) \times \text{Binomial}(x; 10, \tfrac{1}{3}) \\
&= \sum_{x=0}^{s} \left( bx - (s - x)\ell \right) \times \text{Binomial}(x; 10, \tfrac{1}{3}) + \sum_{x=s+1}^{10} bs \times \text{Binomial}(x; 10, \tfrac{1}{3})
\end{aligned}
$$
There are several ways to maximize this function. The first is simply to plug in different values of
s. Example 4.4b in the book gives a comprehensive treatment of various ways to make this calculation easier.
From there, we have the following useful criterion: purchasing s + 1 units is better than purchasing s units whenever
$$
\sum_{x=0}^{s} p(x) < \frac{b}{b + \ell}
$$
where $p(x) = \text{Binomial}(x; n = 10, p = \tfrac{1}{3})$. Here b/(b + ℓ) = 5/15 = 1/3, and the binomial CDF is about 0.299 at s = 2 and about 0.559 at s = 3, so moving from 2 to 3 CPUs helps but moving from 3 to 4 does not. Using any of these ways should show that the optimal number of
CPUs to purchase is 3.
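Both approaches can be carried out directly in R. The sketch below evaluates the expected profit for every s from 0 to 10 and also applies the CDF criterion. The names expected_profit, best_s and crit_s are introduced for this illustration, and l stands for the per-unit loss ℓ.

```r
b <- 5            # net profit per CPU sold (cents)
l <- 10           # net loss per CPU not sold (cents)
n_clients <- 10
p <- 1 / 3

# Expected profit when s CPUs are purchased and demand is Binomial(10, 1/3)
expected_profit <- function(s) {
  x <- 0:n_clients
  profit <- ifelse(x <= s, b * x - (s - x) * l, b * s)
  sum(profit * dbinom(x, n_clients, p))
}

s_values <- 0:n_clients
profits  <- sapply(s_values, expected_profit)
best_s   <- s_values[which.max(profits)]

# Criterion: keep increasing s while the CDF at s is below b / (b + l);
# the optimum is the smallest s at which the CDF reaches that threshold
crit_s <- min(s_values[pbinom(s_values, n_clients, p) >= b / (b + l)])

print(data.frame(s = s_values, expected_profit = round(profits, 3)))
print(c(best_s = best_s, crit_s = crit_s))
```

Both the direct search and the criterion should agree on the same value of s.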
https://www.coursehero.com/file/13316921/assignment4-solution/