Solutions 5
g(x)h(y).”
Please note that, although the values of the joint density $f_{X,Y}(x, y)$ break down as products $g(x)h(y)$, $g$ and $h$ need not be the marginal densities of $X$ and $Y$. For example, if one were to multiply $g$ by some constant factor and divide $h$ by the same factor, the equation would still hold. There is no reason to believe that $\sum_x g(x) = 1$ or $\sum_y h(y) = 1$, and thus no reason to believe that $g$ and $h$ are probability densities at all (of any random variable).
(a) “Express $P(X = x)$ in terms of $g$ and $h$.” $P(X = x) = \sum_y P(X = x \text{ and } Y = y) = \sum_y g(x)h(y) = g(x)\sum_y h(y)$.
(b) “Express $P(Y = y)$ in terms of $g$ and $h$.” Similarly, $P(Y = y) = h(y)\sum_x g(x)$.
(c) “Show that $\left(\sum_x g(x)\right)\left(\sum_y h(y)\right) = 1$.” Since $\sum_x P(X = x) = 1$, we have $1 = \sum_x g(x)\sum_y h(y) = \left(\sum_x g(x)\right)\left(\sum_y h(y)\right)$. (A similar calculation could have been done using the result for $P(Y = y)$ or, alternatively, by going back to the original joint density and using the fact that $\sum_{x,y} P((x, y)) = 1$.)
(d) “Show that $X$ and $Y$ are independent.” Use (a), (b), (c), and the definition of independence! $P(X = x, Y = y) = g(x)h(y) = g(x)\left(\sum_{x'} g(x')\right)\left(\sum_{y'} h(y')\right)h(y)$ [by (c)] $= \cdots$.
[Recall that $X$ and $Y$ are independent if the joint density of $X$ and $Y$ is equal to the product of the densities of $X$ and $Y$: $f(x, y) = f_X(x)f_Y(y)$.]
(Source: Introduction to Probability Theory by Hoel, Port, and Stone, 1971, p. 79.)
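If you would like to see (a)–(d) concretely, here is a quick numerical illustration in Python; the particular $g$ and $h$ are made up, and deliberately do not sum to 1:

```python
# g and h are made-up weight functions that do NOT sum to 1,
# yet the joint density g(x)h(y) still factors into its marginals.
g = {0: 0.5, 1: 1.0, 2: 0.5}        # sums to 2: not itself a density
h = {0: 0.125, 1: 0.25, 2: 0.125}   # sums to 1/2: not itself a density

joint = {(x, y): g[x] * h[y] for x in g for y in h}
assert abs(sum(joint.values()) - 1.0) < 1e-12        # part (c)

pX = {x: g[x] * sum(h.values()) for x in g}          # part (a)
pY = {y: h[y] * sum(g.values()) for y in h}          # part (b)

for (x, y), pxy in joint.items():                    # part (d)
    assert abs(pxy - pX[x] * pY[y]) < 1e-12
print("joint = product of marginals: verified")
```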
The “bull’s eye” within the circle of radius 1 is assigned 4 points. The annular region between the circle of radius 1 and the circle of radius 2 is assigned 3 points. The annular region between the circle of radius 2 and the circle of radius 3 is assigned 2 points. The outermost annular region, between the circle of radius 3 and the circle of radius 4 (the edge of the target), is assigned 1 point. Five darts are thrown independently, each landing at a point uniformly distributed over the target.
(a) Let $Y_i$ be the score obtained from the $i$th dart, $i = 1, 2, 3, 4, 5$. Compute the density of $Y_i$. The total area of the target is $16\pi$ square meters, the area of the next concentric circle in is $9\pi$ square meters, etc. Thus, for each dart, the probability of hitting the bull’s eye is $\frac{1}{16}$, the probability of hitting the next region is $\frac{4-1}{16} = \frac{3}{16}$, etc.; so the density is $P(Y_i = 4) = \frac{1}{16}$, $P(Y_i = 3) = \frac{3}{16}$, $P(Y_i = 2) = \frac{5}{16}$, $P(Y_i = 1) = \frac{7}{16}$.
(b) Compute $EY_i$. $EY_i = 4\left(\frac{1}{16}\right) + 3\left(\frac{3}{16}\right) + 2\left(\frac{5}{16}\right) + 1\left(\frac{7}{16}\right) = \frac{30}{16} = \frac{15}{8}$.
(c) Let $\vec{Y} = (Y_1, Y_2, Y_3, Y_4, Y_5)$. Compute the density of $\vec{Y}$. Since the throws are independent, the probability of any sequence of throws is the product of their probabilities. (This is an acceptable answer; you don’t need to write down all the formulas as long as you know how to find them.)
(d) Let $X = \sum_{i=1}^{5} Y_i$. That is, $X$ represents the total score from the five darts. Compute $EX$. (Hint: There is an easy way!) The expectation of a sum is the sum of the expectations! So $EX = 5\left(\frac{15}{8}\right) = \frac{75}{8}$.
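If you want to check (a), (b), and (d) concretely, here is a short exact computation in Python (using fractions, so there is no rounding):

```python
from fractions import Fraction as F
from itertools import product

# Density of Y_i from part (a): score -> probability.
density = {4: F(1, 16), 3: F(3, 16), 2: F(5, 16), 1: F(7, 16)}
assert sum(density.values()) == 1

EY = sum(s * p for s, p in density.items())          # part (b)
assert EY == F(15, 8)

# Part (d): brute-force EX over all 4^5 outcomes of five
# independent throws agrees with linearity of expectation.
EX = F(0)
for throws in product(density.items(), repeat=5):
    prob = F(1)
    for _, p in throws:
        prob *= p
    EX += sum(s for s, _ in throws) * prob
assert EX == F(75, 8)
print("EY_i =", EY, "  EX =", EX)
```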
2. We proved in class for any random variables X and Y and any real constant c that E(X + Y ) = EX + EY and that EcX = cEX.
(a) Combine these results to conclude that, for any linear combination aX + bY of random variables (where a and b are real constants), $E(aX + bY) = aEX + bEY$.
(b) Prove by induction that, for any positive integer $n$, real constants $c_1, \dots, c_n$, and random variables $X_1, \dots, X_n$, $E\left(\sum_{i=1}^{n} c_i X_i\right) = \sum_{i=1}^{n} c_i EX_i$.
Proof. Initial claim (Case n = 1): Both the case n = 1 and the case n = 2 have previously been proven directly.
Inductive claim: Assume as inductive hypothesis that
$$E\left(\sum_{i=1}^{n} c_i X_i\right) = \sum_{i=1}^{n} c_i EX_i.$$
Then $E\sum_{i=1}^{n+1} c_i X_i = E\left(\sum_{i=1}^{n} c_i X_i + c_{n+1} X_{n+1}\right)$, by the recursive definition of a sum with arbitrarily many terms. By the case of two terms (previously proven), we obtain that $E\sum_{i=1}^{n+1} c_i X_i = E\left(\sum_{i=1}^{n} c_i X_i\right) + E(c_{n+1} X_{n+1})$. Finally, by the inductive hypothesis, again using the recursive definition of a sum (in the opposite direction), we obtain the desired conclusion that
$$E\left(\sum_{i=1}^{n+1} c_i X_i\right) = \sum_{i=1}^{n+1} c_i EX_i.$$
3. Suppose X, Y , and Z are random variables with finite expectation, and EX = 2, EY = 7, and EZ = 3. What is E(2X + Y + 4Z)? Explain.
E(2X + Y + 4Z) = 2EX + EY + 4EZ = 23. This calculation is a direct application of the result
of the previous exercise.
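For the skeptical, a quick simulation; the two-point distributions below are my own hypothetical choices, picked only so that EX = 2, EY = 7, and EZ = 3 (any distributions with these means would do):

```python
import random

# Each random.choice below is a fair two-point random variable
# with the stated mean: EX = 2, EY = 7, EZ = 3.
random.seed(0)
n = 200_000
total = sum(2 * random.choice([1, 3])      # 2X, EX = 2
            + random.choice([6, 8])        # Y,  EY = 7
            + 4 * random.choice([2, 4])    # 4Z, EZ = 3
            for _ in range(n))
print(total / n)   # ~ 23 = 2*2 + 7 + 4*3
```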
4. “Let X be a geometrically distributed random variable [as defined by Hoel, Port, and Stone, slightly different from our text: $f(x) = (1-p)^x p$] and let M > 0 be an integer. Set Z = min(X, M). Compute the mean [expectation] of Z.” (Source: Introduction to Probability Theory by Hoel, Port, and Stone, 1971, p. 104.)
Remark: The interpretation of this version of the geometric distribution is that X tells you the
number of failures that occur before the first success. If Y is a random variable with geometric
distribution as we have defined it, then X = Y − 1, as you can see by substitution into the formula.
I forgot to mention it, but implicit in this interpretation and distribution is that the possible values
of X include 0.
This computation is not only good practice with the definition of expectation, but also with computing the sums of finite and infinite series! Since Z takes the minimum of the two values, Z = X in the case that X ≤ M, and Z = M in the case that M < X. Thus, using our general result on the expectation of a function of a random vector (here, Z is a function of X),
$$EZ = \sum_{x=0}^{M} x(1-p)^x p + \sum_{x=M+1}^{\infty} M(1-p)^x p.$$
The first of these two sums is probably the harder one, but we know from experience that our computation must be based on the following function: let $g(y) = \sum_{x=0}^{M} y^x$. Then $g'(y) = \sum_{x=0}^{M} x y^{x-1}$. We can find closed formulas for both of these sums, and of course we will evaluate $g'(1-p)$. So here goes! Using the usual trick for geometric sums, or just remembering the formula, we obtain $g(y) = \frac{1 - y^{M+1}}{1 - y}$. Computing $g'(y)$ from this formula is not pretty, but you all know how to do it. After some algebraic simplification, we obtain $g'(y) = \frac{M y^{M+1} - (M+1) y^M + 1}{(1-y)^2}$; hence (again after some careful algebraic simplification, and check my work, because I don’t guarantee there are no algebra errors!) $g'(1-p) = \frac{(1-p)^M(-Mp - 1) + 1}{p^2}$. Now, equipped with this systematic prior preparation, we obtain
$$\sum_{x=0}^{M} x(1-p)^x p = p(1-p)\sum_{x=0}^{M} x(1-p)^{x-1} = p(1-p)\,g'(1-p) = \frac{(1-p)^{M+1}(-Mp-1) + 1 - p}{p}.$$
For the second sum, substituting y = x − M − 1 and using geometric series techniques, we obtain
$$\sum_{x=M+1}^{\infty} M(1-p)^x p = Mp(1-p)^{M+1}\sum_{y=0}^{\infty}(1-p)^y = \frac{Mp(1-p)^{M+1}}{p}.$$
I leave it to you to add these two sums together and simplify as appropriate.
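Since I asked you to check my algebra, here is a numerical check of both closed forms against brute-force summation, for the sample values p = 0.3 and M = 5 (the infinite tail is truncated far out, where it is negligible):

```python
# Check both partial sums, and EZ itself, for sample values of p and M.
p, M = 0.3, 5
q = 1 - p

first_bf = sum(x * q**x * p for x in range(M + 1))
first_cf = (q**(M + 1) * (-M * p - 1) + 1 - p) / p
assert abs(first_bf - first_cf) < 1e-12

second_bf = sum(M * q**x * p for x in range(M + 1, 2000))
second_cf = M * p * q**(M + 1) / p
assert abs(second_bf - second_cf) < 1e-12

# EZ from the two closed forms vs. the definition of expectation.
EZ_bf = sum(min(x, M) * q**x * p for x in range(2000))
assert abs((first_cf + second_cf) - EZ_bf) < 1e-12
print("EZ =", first_cf + second_cf)
```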
5. “Let X be a geometrically distributed random variable [as defined by Hoel, Port, and Stone: $f(x) = (1-p)^x p$] and let M > 0 be an integer. Set Z = max(X, M). Compute the mean of Z.” (Source: Introduction to Probability Theory by Hoel, Port, and Stone, 1971, p. 104.) As this computation uses similar methods to the one above, I leave it to you for practice if you did not solve it the first time.
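If you want to check your answer numerically, one convenient cross-check (not the method sketched above) uses the pointwise identity min(x, M) + max(x, M) = x + M, so that E max(X, M) = EX + M − E min(X, M), where EX = (1 − p)/p for this version of the geometric distribution:

```python
# Cross-check E max(X, M) via min(x, M) + max(x, M) = x + M,
# for sample values p = 0.3 and M = 5 (tail truncated far out).
p, M = 0.3, 5
q = 1 - p
Emin = sum(min(x, M) * q**x * p for x in range(2000))
Emax = sum(max(x, M) * q**x * p for x in range(2000))
assert abs((Emin + Emax) - (q / p + M)) < 1e-12   # EX = q/p
print("E max(X, M) =", Emax)
```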
6. Let X have hypergeometric density. Compute EX. Hint: The easiest way to do this is as follows, and is very similar to the method we used to compute the expectation of a binomial random variable. Let $X_i = 1$ if the $i$th object picked is marked, and 0 if it is not marked. Thus, each $X_i$ is Bernoulli. What is $P(X_i = 1)$? We then have $X = \sum_{i=1}^{n} X_i$, just as for a binomial random variable, except that the $X_i$ are not independent in this case. Discussed in class.
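As a concrete check (the parameters below are made up): if there are N objects, r of them marked, and n are picked without replacement, then $P(X_i = 1) = r/N$ for every i, so linearity gives $EX = nr/N$. A small exact enumeration:

```python
from fractions import Fraction as F
from itertools import combinations

# N objects, r of them marked; pick n without replacement.
# All C(N, n) samples are equally likely.
N, r, n = 8, 3, 4
marked = [1] * r + [0] * (N - r)
samples = list(combinations(range(N), n))
EX = F(sum(sum(marked[i] for i in s) for s in samples), len(samples))
assert EX == F(n * r, N)       # EX = n * P(X_i = 1) = n * r/N
print("EX =", EX)              # -> 3/2
```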
7. Let X be a discrete random variable with probability distribution f. Let µ = EX, and let Y = X − µ. Prove that EY = 0. (Hint: Remember that $\sum_x f(x) = 1$.) There are two easy ways to do this:
Method 1: use the result on sums of random variables. EY = EX − µ = 0. Method 2: direct computation. $EY = \sum_x (x - \mu)f(x) = \sum_x x f(x) - \mu\sum_x f(x) = EX - \mu(1) = 0$. Incidentally, in order to rearrange this series we have used the fact that if two series converge absolutely, the series of the sum of their terms converges absolutely. This is an elementary fact but not a trivial one; its proof requires some analysis.
8. Let X be a continuous random variable with probability density f. Let µ = EX, and let Y = X − µ. Prove that EY = 0. Similar to the
previous exercise.
9. Let X be a continuous random variable with probability density f. Prove that if f is an even function, then EX = 0. (Recall that by definition a function f is even if, for all $x \in \mathbb{R}$, f(−x) = f(x), and a function f is odd if, for all $x \in \mathbb{R}$, f(−x) = −f(x). Hint: If f is an even function, what kind of function is the function with outputs xf(x)?)
Remark: We implicitly assume that the expectation of X is finite; otherwise the integrals in the
computation below might not converge.
If f is an even function, then (−x)f(−x) = −xf(x), so the function with outputs xf(x) is an odd function. Thus
$$\int_{-\infty}^{\infty} xf(x)\,dx = \int_{-\infty}^{0} xf(x)\,dx + \int_{0}^{\infty} xf(x)\,dx = -\int_{0}^{\infty} uf(u)\,du + \int_{0}^{\infty} xf(x)\,dx = 0.$$
(Make the substitution u = −x in the first integral.)
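A quick numerical sanity check of this result; the even density below, $f(x) = \frac{3}{4}(1 - x^2)$ on [−1, 1], is just one concrete choice of mine:

```python
from scipy.integrate import quad

# f is even and supported on [-1, 1]; its mass is 1 and its mean is 0.
f = lambda x: 0.75 * (1 - x**2)
mass, _ = quad(f, -1, 1)
mean, _ = quad(lambda x: x * f(x), -1, 1)
print(round(mass, 10), round(mean, 10))   # -> 1.0 and 0.0,
                                          # up to quadrature error
```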
10. Does the converse to the previous problem hold? If so, prove it; if not, give a counterexample. The converse does not hold.
First, here is a simple discrete example, which I told the students who came to office hours. Let X take the value −2 with probability $\frac{1}{3}$ and the value 1 with probability $\frac{2}{3}$. Obviously, the probability distribution of X is not an even function (since, for example, $P(X = -1) = 0$, whereas $P(X = 1) = \frac{2}{3}$). Nonetheless, $EX = (-2)\left(\frac{1}{3}\right) + (1)\left(\frac{2}{3}\right) = 0$.
Strictly speaking, we are confining our attention to continuous random variables. Using similar reasoning, we can readily find continuous examples. For example, suppose the probability density of X is given by
$$f(x) = \begin{cases} -\frac{1}{12}\sin\frac{x}{2}, & -2\pi \le x \le 0 \\ \frac{1}{3}\sin x, & 0 \le x \le \pi \\ 0, & \text{otherwise.} \end{cases}$$
Each piece is nonnegative; the first piece has mass $\frac{1}{3}$ and is symmetric about $-\pi$, and the second has mass $\frac{2}{3}$ and is symmetric about $\frac{\pi}{2}$, so $EX = \frac{1}{3}(-\pi) + \frac{2}{3}\left(\frac{\pi}{2}\right) = 0$, exactly mirroring the discrete example, even though f is clearly not even.
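You can verify the bookkeeping numerically:

```python
import numpy as np
from scipy.integrate import quad

# The continuous counterexample: total mass 1, mean 0,
# yet f is not an even function.
def f(x):
    if -2 * np.pi <= x <= 0:
        return -np.sin(x / 2) / 12
    if 0 < x <= np.pi:
        return np.sin(x) / 3
    return 0.0

mass, _ = quad(f, -2 * np.pi, np.pi, points=[0.0])
mean, _ = quad(lambda x: x * f(x), -2 * np.pi, np.pi, points=[0.0])
print(round(mass, 8), round(mean, 8), f(-1.0) == f(1.0))
# -> 1.0  0.0  False
```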
11. Let X be a continuous random variable such that EX = 0. Let f be the probability density of X, and define g by g(y) = f (y − µ), where µ ∈ R
is a constant. Prove that g is a probability density and that if Y is a random variable with density g, then EY = µ. Both integrals
may be computed by the substitution x = y − µ.
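Here is an illustration with one concrete (hypothetical) choice, taking f to be the standard normal density (so EX = 0) and µ = 5:

```python
from scipy.integrate import quad
from scipy.stats import norm

# g(y) = f(y - mu) should integrate to 1 and have mean mu.
mu = 5.0
g = lambda y: norm.pdf(y - mu)
mass, _ = quad(g, mu - 12, mu + 12)           # tails beyond are negligible
EY, _ = quad(lambda y: y * g(y), mu - 12, mu + 12)
print(round(mass, 6), round(EY, 6))           # -> 1.0 and 5.0
```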