Assn10 Sol
Massachusetts Institute of Technology
Department of Electrical Engineering & Computer Science
6.041/6.431: Probabilistic Systems Analysis
(Fall 2010)
Thus, by taking into account the positive correlation between the assets’ gains, we are no
longer as comfortable with the probability of insolvency as we thought we were in part (b).
2. Let M and N be the number of males and females, respectively, that cast a vote. We need to
find P(M > N), i.e., P(M − N > 0). The central limit theorem does not apply directly to
the random variable M − N . However, the central limit theorem implies that M and N are
well approximated by normal random variables. So, M − N is the difference of two independent
approximately normal random variables. Since the difference of two normal random variables is
itself normal, it follows that M − N is approximately normal. The mean and variance of M − N
are found to be E[M − N] = 22 and var(M − N) = 121. Thus, the standard deviation of M − N is 11. Let Z be a standard normal random variable.
Using the central limit theorem approximation, we obtain
P(M − N > 0) = P( (M − N − 22)/11 > −22/11 )
≈ P(Z ≥ −2)
= 0.9772.
A slightly more refined estimate is obtained by expressing the event of interest as P(M − N ≥
1/2). We then have
P(M − N ≥ 1/2) = P( (M − N − 22)/11 ≥ −21.5/11 )
≈ P(Z ≥ −1.95)
= 0.974.
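As a quick numerical sanity check (a Python sketch using only the standard library; it is not part of the original solution), both approximations can be evaluated directly, with the standard normal CDF written via the error function:

    # Evaluate the two normal-approximation probabilities numerically.
    from math import erf, sqrt

    def phi(x):
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))

    print(1.0 - phi(-22.0 / 11.0))   # P(Z >= -2)      ~0.9772
    print(1.0 - phi(-21.5 / 11.0))   # P(Z >= -1.95)   ~0.9746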
4. (a) Let C denote the coin that Bob received, so that C = 1 if Bob received the first coin, and
C = 2 if Bob received the second coin. Then P(C = 1) = p and P(C = 2) = 1 − p. Given
C, the number of heads Y in 3 independent tosses is a binomial random variable.
We can find the probability that Bob received the first coin given that he observed k heads
using Bayes’ rule.
P(C = 1 | Y = k) = P(Y = k | C = 1) · P(C = 1) / [ P(Y = k | C = 1) · P(C = 1) + P(Y = k | C = 2) · P(C = 2) ]
= [ (3 choose k) (1/3)^k (2/3)^(3−k) · p ] / [ (3 choose k) (1/3)^k (2/3)^(3−k) · p + (3 choose k) (2/3)^k (1/3)^(3−k) · (1 − p) ]
= 2^(3−k) p / ( 2^(3−k) p + 2^k (1 − p) )
= 1 / ( 1 + ((1 − p)/p) · 2^(2k−3) ).
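As a sanity check of the closed form (a small Python sketch, assuming Python 3.8+ for math.comb; not part of the original solution), we can compare it with a direct Bayes' rule computation using the two binomial likelihoods:

    from math import comb

    def posterior_direct(k, p):
        # Direct Bayes' rule with Binomial(3, 1/3) and Binomial(3, 2/3) likelihoods.
        like1 = comb(3, k) * (1/3)**k * (2/3)**(3 - k)
        like2 = comb(3, k) * (2/3)**k * (1/3)**(3 - k)
        return like1 * p / (like1 * p + like2 * (1 - p))

    def posterior_closed(k, p):
        # Closed form derived above.
        return 1.0 / (1.0 + ((1 - p) / p) * 2.0 ** (2 * k - 3))

    for p in (0.2, 0.5, 0.9):
        for k in range(4):
            assert abs(posterior_direct(k, p) - posterior_closed(k, p)) < 1e-12
    print("closed form matches the direct Bayes computation")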
(b) We determine for which values of k the posterior probability of the first coin exceeds the prior, i.e., for which
P(C = 1 | Y = k) > p,
that is,
2^(3−k) p / ( 2^(3−k) p + 2^k (1 − p) ) > p.
Note that if p = 0 or p = 1, there is no value of k that satisfies the inequality. We now solve
it for 0 < p < 1:
2^(3−k) / ( 2^(3−k) p + 2^k (1 − p) ) > 1
2^(3−k) > 2^(3−k) p + 2^k (1 − p)
2^(3−k) (1 − p) > 2^k (1 − p)
2^(3−k) > 2^k
3 − k > k
k < 3/2.
Thus, for 0 < p < 1, the probability that Alice sent the first coin increases exactly when k = 0 or k = 1. The set of k satisfying the inequality does not depend on p, and so does not change when p increases. Intuitively, this makes sense: smaller values of k strengthen Bob's belief that he received the coin with the lower probability of heads.
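A brief numerical confirmation of this conclusion (a sketch using the closed-form posterior from part (a); not part of the original solution):

    # The posterior should exceed the prior p exactly for k = 0 and k = 1.
    for p in (0.1, 0.5, 0.99):
        for k in range(4):
            post = 1.0 / (1.0 + ((1 - p) / p) * 2.0 ** (2 * k - 3))
            assert (post > p) == (k < 1.5)
    print("posterior > prior exactly when k is 0 or 1")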
(c) Given that Bob observes k heads, he must decide whether the first or the second coin was used. To minimize the probability of error, he should decide in favor of the first coin when P(C = 1 | Y = k) ≥ P(C = 2 | Y = k). Thus, we have the decision rule
P(C = 1 | Y = k) ≥ P(C = 2 | Y = k)
2^(3−k) p / ( 2^(3−k) p + 2^k (1 − p) ) ≥ 2^k (1 − p) / ( 2^(3−k) p + 2^k (1 − p) )
2^(3−k) p ≥ 2^k (1 − p)
2^(2k−3) ≤ p/(1 − p)
k ≤ 3/2 + (1/2) log2( p/(1 − p) ).
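The threshold rule can be checked against a direct comparison of the posteriors (a Python sketch, not part of the original solution; the common denominator cancels, so only 2^(3−k) p and 2^k (1 − p) need to be compared):

    from math import log2

    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        for k in range(4):
            direct = 2.0 ** (3 - k) * p >= 2.0 ** k * (1 - p)   # compare unnormalized posteriors
            threshold = k <= 1.5 + 0.5 * log2(p / (1 - p))      # rule derived above
            assert direct == threshold
    print("threshold rule agrees with the direct posterior comparison")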
If p ≥ 8/9, the prior probability of receiving the first coin is so high that no amount of evidence from 3 tosses of the coin will make Bob decide he received the second coin: even k = 3 satisfies the decision rule, since 3 ≤ 3/2 + (1/2) log2( p/(1 − p) ) holds exactly when p/(1 − p) ≥ 8, i.e., when p ≥ 8/9.
5. (a) Using the total probability theorem, we have
p_T1(t) = ∫_0^1 p_T1|Q(t | q) f_Q(q) dq = ∫_0^1 (1 − q)^(t−1) q dq = 1/( t(t + 1) ), for t = 1, 2, . . .
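The integral above is the Beta function B(2, t) = 1!(t − 1)!/(t + 1)! = 1/(t(t + 1)); a quick Python check of this identity, and of the fact that the resulting PMF sums to 1 (the series telescopes), is sketched below (not part of the original solution):

    from math import factorial

    for t in range(1, 10):
        beta_2_t = factorial(1) * factorial(t - 1) / factorial(t + 1)   # B(2, t)
        assert abs(beta_2_t - 1.0 / (t * (t + 1))) < 1e-12

    # Partial sums of 1/(t(t+1)) approach 1.
    print(sum(1.0 / (t * (t + 1)) for t in range(1, 10**6)))   # ~0.999999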
(b) The least squares estimate coincides with the conditional expectation of Q given T1, which is derived as
E[Q | T1 = t] = ∫_0^1 q f_Q|T1(q | t) dq
= ∫_0^1 q · p_T1|Q(t | q) f_Q(q) / p_T1(t) dq
= ∫_0^1 t(t + 1) q (1 − q)^(t−1) · q dq
= t(t + 1) ∫_0^1 q^2 (1 − q)^(t−1) dq
= t(t + 1) · 2(t − 1)!/(t + 2)!
= 2/(t + 2).
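A numerical check of the final formula (a Python sketch using simple midpoint-rule quadrature; not part of the original solution):

    # t(t+1) * integral_0^1 q^2 (1-q)^(t-1) dq should equal 2/(t+2).
    N = 100_000
    for t in range(1, 7):
        integral = sum(
            ((i + 0.5) / N) ** 2 * (1.0 - (i + 0.5) / N) ** (t - 1)
            for i in range(N)
        ) / N
        print(t, t * (t + 1) * integral, 2.0 / (t + 2))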
(c) The posterior distribution of Q given T1 = t1, . . . , Tk = tk is proportional to the (uniform) prior times the likelihood:
f_Q|T1,...,Tk(q | t1, . . . , tk) ∝ q^k (1 − q)^(Σ_i ti − k), for 0 ≤ q ≤ 1.
The MAP estimate maximizes this expression; setting the derivative of its logarithm to zero gives
k/q − ( Σ_i ti − k )/(1 − q) = 0,
or equivalently
k(1 − q) − ( Σ_i ti − k ) q = 0,
which yields the MAP estimate
q̂ = k / Σ_{i=1}^k ti.
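To see the formula in action, one can maximize the posterior numerically for some hypothetical observations (the values in ts below are illustrative only; this sketch is not part of the original solution):

    import math

    ts = [3, 1, 5, 2]            # hypothetical observed values t_1, ..., t_k
    k, s = len(ts), sum(ts)

    # The posterior is proportional to q^k (1-q)^(s-k); maximize its log on a grid.
    grid = (i / 100_000 for i in range(1, 100_000))
    q_map = max(grid, key=lambda q: k * math.log(q) + (s - k) * math.log(1.0 - q))
    print(q_map, k / s)          # both ~0.3636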
For this part only, assume that q is sampled from the random variable Q, which is now uniformly distributed over [0.5, 1].
With this prior, E[T1] = E[T2] = 2 ln 2, var(T1) = 4 − 2 ln 2 − (2 ln 2)^2, and
cov(T1, T2) = E[T1 T2] − E[T1] E[T2] = 2 − 4(ln 2)^2.
Therefore we have derived the linear least squares estimator
T̂2 = E[T2] + ( cov(T1, T2)/var(T1) )(T1 − E[T1]) = 2 ln 2 + [ (2 − 4(ln 2)^2) / (4 − 2 ln 2 − (2 ln 2)^2) ] (T1 − 2 ln 2) ≈ 1.230 + 0.113 T1.
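The moments used above can be verified by simulation (a Python sketch, not part of the original solution; Monte Carlo output will match only to a couple of decimal places):

    import math, random

    def geometric(q):
        # Number of tosses up to and including the first success.
        t = 1
        while random.random() >= q:
            t += 1
        return t

    random.seed(0)
    n = 400_000
    t1s, t2s = [], []
    for _ in range(n):
        q = random.uniform(0.5, 1.0)   # Q uniform on [0.5, 1] for this part
        t1s.append(geometric(q))
        t2s.append(geometric(q))

    m1, m2 = sum(t1s) / n, sum(t2s) / n
    var1 = sum((a - m1) ** 2 for a in t1s) / n
    cov12 = sum((a - m1) * (b - m2) for a, b in zip(t1s, t2s)) / n
    print(m1, 2 * math.log(2))                                   # E[T1] = 2 ln 2
    print(cov12, 2 - 4 * math.log(2) ** 2)                       # cov(T1, T2)
    print(var1, 4 - 2 * math.log(2) - (2 * math.log(2)) ** 2)    # var(T1)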
6. (a) To find the normalization constant c we integrate the joint PDF:
∫_0^1 ∫_0^1 f_X,Y(x, y) dy dx = c ∫_0^1 ∫_0^1 xy dy dx = c ∫_0^1 (x/2) dx = c/4.
Since the joint PDF must integrate to 1, c = 4.
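A quick numerical check of the normalization (a Python sketch using a midpoint-rule double integral; not part of the original solution):

    # With c = 4, the joint density 4xy should integrate to 1 over the unit square.
    N = 1000
    total = sum(
        4.0 * ((i + 0.5) / N) * ((j + 0.5) / N)
        for i in range(N) for j in range(N)
    ) / (N * N)
    print(total)   # ~1.0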
(b) To construct the conditional expectation estimator, we need to find the conditional probability density. Since f_Y(y) = ∫_0^1 4xy dx = 2y, we have f_X|Y(x | y) = f_X,Y(x, y)/f_Y(y) = 4xy/(2y) = 2x for 0 ≤ x ≤ 1. Therefore,
x̂_CE(y) = E[X | Y = y] = ∫_0^1 x · 2x dx = 2/3.
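A sampling check of this value (a Python sketch, not part of the original solution): the conditional density 2x on [0, 1] has CDF x^2, so X can be simulated as the square root of a uniform random variable, and its sample mean should be close to 2/3:

    import random

    random.seed(0)
    n = 1_000_000
    print(sum(random.random() ** 0.5 for _ in range(n)) / n)   # ~0.6667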
(c) We first note that the conditional density does not depend on y. Therefore, X and Y are independent, and whether or not we observe Y = y does not affect the estimate in part (b). Another way to see this is to note that if we do not observe y, we can compute the marginal f_X(x) = ∫_0^1 4xy dy = 2x, which is equal to the conditional density and therefore produces the same estimate.
(d) Since X and Y are independent, no estimator can make use of the observed value of Y to estimate X. The MAP estimate of X is equal to 1 regardless of the observed value y, since the conditional (and the marginal) density 2x is maximized at x = 1.
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.