Chap 3.1
In the simplest case, suppose there are only two discrete random variables X and Y, each taking values in a discrete set.
Definition
The joint probability mass function (joint pmf) of the discrete random variables X and Y is defined by

p(x, y) = P(X = x, Y = y) ,   x ∈ X(Ω) , y ∈ Y(Ω) .
Sometimes the joint pmf can be conveniently presented in the form of a two-way
table as
                              Values of Y
Values of X      y1            y2            ...      yc
    x1        p(x1, y1)     p(x1, y2)        ...   p(x1, yc)
    x2        p(x2, y1)     p(x2, y2)        ...   p(x2, yc)
    ...          ...           ...           ...      ...
    xr        p(xr, y1)     p(xr, y2)        ...   p(xr, yc)
Example 3.1
Suppose that 3 balls are randomly selected from an urn containing 3 red, 4 white,
and 5 blue balls. If we let X and Y denote, respectively, the number of red and
white balls in the sample, then both X and Y take values 0, 1, 2, 3 only. The
joint pmf of (X, Y) can be calculated as
p(0, 0) = P(X = 0, Y = 0) = P(3 blue balls) = \binom{5}{3} / \binom{12}{3} = 10/220

p(0, 1) = P(X = 0, Y = 1) = P(1 white, 2 blue) = \binom{4}{1}\binom{5}{2} / \binom{12}{3} = 40/220

p(2, 1) = P(X = 2, Y = 1) = P(2 red, 1 white) = \binom{3}{2}\binom{4}{1} / \binom{12}{3} = 12/220
                      Values of Y
Values of X      0        1        2        3     Total
     0        0.0454   0.1818   0.1364   0.0182   0.3818
     1        0.1364   0.2727   0.0818   0        0.4909
     2        0.0682   0.0545   0        0        0.1227
     3        0.0045   0        0        0        0.0045
   Total      0.2545   0.5091   0.2182   0.0182   1.0000
In general, the joint pmf can be written as

p(x, y) = \binom{3}{x} \binom{4}{y} \binom{5}{3 − x − y} / \binom{12}{3} ,   x = 0, 1, 2, 3 ,  y = 0, 1, 2, 3 ,  x + y ≤ 3 .
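As a quick computational check (an illustrative sketch, not a required part of the example), the joint pmf can be tabulated with Python's math.comb; the entries agree with the table above up to rounding.

```python
# Tabulate the joint pmf of Example 3.1 and check that it sums to 1.
from math import comb

def p(x, y):
    """Joint pmf of (X, Y) = (# red, # white) when 3 balls are drawn
    from an urn with 3 red, 4 white and 5 blue balls."""
    if x < 0 or y < 0 or x + y > 3:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, 3 - x - y) / comb(12, 3)

table = {(x, y): p(x, y) for x in range(4) for y in range(4)}
print(round(table[(0, 0)], 4), round(table[(0, 1)], 4), round(table[(2, 1)], 4))
# 0.0455 0.1818 0.0545  (i.e. 10/220, 40/220, 12/220)
print(round(sum(table.values()), 4))  # 1.0
```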
Properties
1. 0 ≤ p(x, y) ≤ 1 for all x ∈ X(Ω), y ∈ Y(Ω).
2. ∑_{x ∈ X(Ω)} ∑_{y ∈ Y(Ω)} p(x, y) = 1
3. P((X, Y) ∈ A) = ∑_{(x, y) ∈ A} p(x, y) , where A ⊂ X(Ω) × Y(Ω).
Example 3.2
For the joint pmf in the above example, p obviously satisfies properties 1 and 2.

The probability that there are equal numbers of red and white balls is
P(X = Y) = p(0, 0) + p(1, 1) + p(2, 2) + p(3, 3) = 0.0454 + 0.2727 + 0 + 0 = 0.3181 .

The probability that there are fewer red balls than white balls is
P(X < Y) = p(0, 1) + p(0, 2) + p(0, 3) + p(1, 2) + p(1, 3) + p(2, 3) = 0.1818 + 0.1364 + 0.0182 + 0.0818 + 0 + 0 = 0.4182 .

We may also compute probabilities concerning X only. For example,
P(X = 0) = p(0, 0) + p(0, 1) + p(0, 2) + p(0, 3) = 0.3818 .
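These event probabilities are just sums of p(x, y) over the corresponding regions (property 3); the short sketch below reproduces them (the helper p(x, y) simply re-implements the formula above).

```python
# Event probabilities for Example 3.2, obtained by summing the joint pmf
# over the corresponding regions (property 3).
from math import comb

def p(x, y):
    # joint pmf from Example 3.1
    if x < 0 or y < 0 or x + y > 3:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, 3 - x - y) / comb(12, 3)

pairs = [(x, y) for x in range(4) for y in range(4)]
p_equal = sum(p(x, y) for x, y in pairs if x == y)   # P(X = Y)
p_less  = sum(p(x, y) for x, y in pairs if x < y)    # P(X < Y)
p_x0    = sum(p(x, y) for x, y in pairs if x == 0)   # P(X = 0)
print(round(p_equal, 4), round(p_less, 4), round(p_x0, 4))
# 0.3182 0.4182 0.3818  (exact values; sums of the rounded table entries
# above may differ in the last digit)
```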
Definition
Let X and Y be discrete random variables with joint pmf p(x, y). The marginal
pmfs of X and Y are respectively defined as

p_X(x) = P(X = x) = ∑_{y ∈ Y(Ω)} p(x, y)

and

p_Y(y) = P(Y = y) = ∑_{x ∈ X(Ω)} p(x, y) .
Example 3.3
In Example 3.1, the marginal pmf of X is

p_X(x) = P(X = x) =  0.3818 ,  x = 0
                     0.4909 ,  x = 1
                     0.1227 ,  x = 2
                     0.0045 ,  x = 3

The marginal pmf of Y is

p_Y(y) = P(Y = y) =  0.2545 ,  y = 0
                     0.5091 ,  y = 1
                     0.2182 ,  y = 2
                     0.0182 ,  y = 3
Remark
The joint pmf uniquely determines the marginal pmfs, but the converse is not true.
Example 3.4
The following table shows a joint pmf different from that of the previous example
which yields the same marginal pmfs.

                      Values of Y
Values of X      0        1        2        3     Total
     0        0.0972   0.1944   0.0833   0.0069   0.3818
     1        0.1249   0.2499   0.1071   0.0089   0.4909
     2        0.0312   0.0625   0.0268   0.0022   0.1227
     3        0.0011   0.0023   0.0010   0.0001   0.0045
   Total      0.2545   0.5091   0.2182   0.0182   1.0000
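The entries of this table appear to be just the products p_X(x) p_Y(y) of the marginals from Example 3.3 (up to rounding); an illustrative sketch that rebuilds it:

```python
# Rebuild the Example 3.4 table as the product of the marginal pmfs of
# Example 3.3; the entries agree with the table up to rounding.
px = {0: 0.3818, 1: 0.4909, 2: 0.1227, 3: 0.0045}
py = {0: 0.2545, 1: 0.5091, 2: 0.2182, 3: 0.0182}

for x in range(4):
    row = [round(px[x] * py[y], 4) for y in range(4)]
    print(x, row, round(sum(row), 4))
# Both this joint pmf and the one in Example 3.1 have the same marginals,
# so the marginals alone do not determine the joint pmf.
```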
Definition
Let X and Y be continuous random variables. Suppose there exists a function f(x, y) such that, for any region C of the xy-plane,

P((X, Y) ∈ C) = ∫∫_{(x, y) ∈ C} f(x, y) dx dy .

This function, if it exists, is called the joint probability density function (joint pdf) of X and Y.

Properties
1. f(x, y) ≥ 0 for all x, y.
2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1
3. P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy
4. P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_c^d ∫_a^b f(x, y) dx dy .
   In particular, the joint cdf is
   F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f(s, t) ds dt
5. Marginal pdfs:
   f_X(x) = ∫_{−∞}^{∞} f(x, y) dy ,   −∞ < x < ∞
   f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx ,   −∞ < y < ∞
Example 3.5
Suppose X and Y have joint pdf

f(x, y) = 4x(1 − y) ,   0 ≤ x ≤ 1, 0 ≤ y ≤ 1 .

This is a genuine joint pdf since f(x, y) ≥ 0 and

∫_0^1 ∫_0^1 4x(1 − y) dy dx = ∫_0^1 4x [ y − y²/2 ]_0^1 dx = ∫_0^1 2x dx = [ x² ]_0^1 = 1 .
P(0 ≤ X ≤ 1/2 , 1/2 ≤ Y ≤ 1) = ∫_0^{1/2} ∫_{1/2}^{1} 4x(1 − y) dy dx = 1/16 .

Probabilities of other rectangular regions can be computed similarly.
Suppose we want to determine P(X < 3Y). The event X < 3Y corresponds to the part of the unit square lying above the line x = 3y.

[Figure: the unit square with the line x = 3y drawn from (0, 0) to (1, 1/3); the event X < 3Y is the region above this line.]

The points in the region can be specified by y > x/3 , 0 < x < 1 , therefore we have

P(X < 3Y) = ∫_0^1 ∫_{x/3}^{1} 4x(1 − y) dy dx = ∫_0^1 4x [ y − y²/2 ]_{x/3}^{1} dx
          = ∫_0^1 ( 2x − 4x²/3 + 2x³/9 ) dx = [ x² − 4x³/9 + x⁴/18 ]_0^1 = 11/18
For 0 ≤ x ≤ 1 , 0 ≤ y ≤ 1 ,

F(x, y) = ∫_0^x ∫_0^y 4s(1 − t) dt ds = x²(2y − y²) .

For 0 ≤ x ≤ 1 , y > 1 ,   F(x, y) = F(x, 1) = x² .
For x > 1 , 0 ≤ y ≤ 1 ,   F(x, y) = F(1, y) = 2y − y² .

Therefore

F(x, y) =  0 ,             x < 0 or y < 0
           x²(2y − y²) ,   0 ≤ x ≤ 1, 0 ≤ y ≤ 1
           x² ,            0 ≤ x ≤ 1, y > 1
           2y − y² ,       x > 1, 0 ≤ y ≤ 1
           1 ,             x > 1, y > 1
Marginal pdfs :
f_X(x) = ∫_0^1 4x(1 − y) dy = 4x [ y − y²/2 ]_0^1 = 2x ,   0 ≤ x ≤ 1

f_Y(y) = ∫_0^1 4x(1 − y) dx = (1 − y) [ 2x² ]_0^1 = 2(1 − y) ,   0 ≤ y ≤ 1
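The integrals in this example are easy to verify numerically; an illustrative sketch using scipy's quadrature routines (the evaluation points 0.7 and 0.25 are arbitrary choices):

```python
# Numerical checks for Example 3.5: f(x, y) = 4x(1 - y) on the unit square.
from scipy.integrate import dblquad, quad

f = lambda y, x: 4 * x * (1 - y)       # dblquad integrates over the first argument (y) first

total, _ = dblquad(f, 0, 1, 0, 1)                 # should be 1
p_rect, _ = dblquad(f, 0, 0.5, 0.5, 1)            # P(0 <= X <= 1/2, 1/2 <= Y <= 1) = 1/16
p_less, _ = dblquad(f, 0, 1, lambda x: x / 3, 1)  # P(X < 3Y): y runs from x/3 to 1
print(round(total, 6), round(p_rect, 6), round(p_less, 6))   # 1.0 0.0625 0.611111

fX = lambda x: quad(lambda y: 4 * x * (1 - y), 0, 1)[0]   # marginal of X, equals 2x
fY = lambda y: quad(lambda x: 4 * x * (1 - y), 0, 1)[0]   # marginal of Y, equals 2(1 - y)
print(round(fX(0.7), 6), round(fY(0.25), 6))              # 1.4 1.5
```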
Definition
Two random variables X and Y are said to be independent if and only if their joint
pmf (pdf) is equal to the product of their marginal pmfs (pdfs) for all values of x and y, i.e.

p(x, y) = p_X(x) p_Y(y)   (or f(x, y) = f_X(x) f_Y(y)) .
Example 3.6
In example 3.1, X is the number of red balls, Y is the number of white balls in a
sample of 3 randomly drawn from an urn containing 3 red balls, 4 white balls, and
5 blue balls. The following table shows the joint and marginal pmfs.
                      Values of Y
Values of X      0        1        2        3     Total
     0        0.0454   0.1818   0.1364   0.0182   0.3818
     1        0.1364   0.2727   0.0818   0        0.4909
     2        0.0682   0.0545   0        0        0.1227
     3        0.0045   0        0        0        0.0045
   Total      0.2545   0.5091   0.2182   0.0182   1.0000

Since, for example, p(1, 3) = 0 while p_X(1) p_Y(3) = (0.4909)(0.0182) > 0, the joint pmf is not
equal to the product of the marginal pmfs. Therefore X and Y are dependent, i.e. knowing the value
of X will affect our uncertainty about Y, and vice versa.
Example 3.7
In Example 3.5,

f_X(x) = 2x , 0 ≤ x ≤ 1 ,    f_Y(y) = 2(1 − y) , 0 ≤ y ≤ 1 .

Since f_X(x) f_Y(y) = 4x(1 − y) = f(x, y) for all 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, X and Y are independent.
Example 3.8
                              Y
    X         10            20            40            80        p_X(x)
   20        0.04          0.06          0.06          0.04         0.2
          =(0.2)(0.2)   =(0.2)(0.3)   =(0.2)(0.3)   =(0.2)(0.2)

                              Y
    X         10      20      40      80     p_X(x)
   20        0.04    0.06    0.06    0.04     0.2
   40        0.10    0.15    0.15    0.05     0.45
   60        0.06    0.09    0.09    0.11     0.35
 p_Y(y)      0.2     0.3     0.3     0.2      1.00
Proposition
Let X and Y be random variables with joint pdf (or pmf) f(x, y). Then X and Y are
independent if and only if
(i) the supports of X and Y do not depend on each other (i.e. the region of
    possible values is a rectangle); and
(ii) f(x, y) can be factorized as f(x, y) = g(x) h(y) for some functions g and h
    (not necessarily the marginal pdfs/pmfs themselves).
Example 3.9
f(x, y) =  (1/4)(x + 1)(y + 1) e^{−x−y} ,   x, y > 0
           0 ,                              otherwise

X and Y are independent since the supports do not depend on each other and

f(x, y) = [ (1/4)(x + 1) e^{−x} ] [ (y + 1) e^{−y} ] .
Example 3.10
f(x, y) =  (1/2)(x + y) e^{−x−y} ,   x, y > 0
           0 ,                       otherwise

Although the support is a rectangle, the factor (x + y) cannot be written as a product of a
function of x alone and a function of y alone. Hence f(x, y) cannot be factorized and X and Y
are not independent.
Example 3.11
f(x, y) =  1/π ,   x² + y² ≤ 1
           0 ,     x² + y² > 1

The support {(x, y) : x² + y² ≤ 1} is not a rectangle: the range of possible values of Y depends
on the value of X. Hence X and Y are not independent.
Remarks
1. The definitions of joint pdf (pmf) and marginal pdf (pmf) can be generalized to
   the multivariate case directly.
   • Marginal pmf/pdf of X1 :
     p_{X1}(x1) = ∑_{x2} ⋯ ∑_{xn} p(x1, x2, ..., xn)   or   f_{X1}(x1) = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} f(x1, x2, ..., xn) dx2 ⋯ dxn
2. The random variables X1, X2, ..., Xn are mutually independent if and only if
   P(X1 ∈ A1, X2 ∈ A2, ..., Xn ∈ An) = P(X1 ∈ A1) P(X2 ∈ A2) ⋯ P(Xn ∈ An)
   for all sets A1, A2, ..., An.
Definition
For random variables X 1 , X 2 ,..., X n (not necessarily independent) with joint pmf
p( x1 , x2 ,..., xn ) or joint pdf f ( x1 , x2 ,..., xn ) ; if u ( X 1 , X 2 ,..., X n ) is a function of
these random variables, then the expectation of u ( X 1 , X 2 ,..., X n ) is defined as
Discrete:
E(u(X1, X2, ..., Xn)) = ∑_{x1} ∑_{x2} ⋯ ∑_{xn} u(x1, x2, ..., xn) p(x1, x2, ..., xn)

Continuous:
E(u(X1, X2, ..., Xn)) = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} u(x1, x2, ..., xn) f(x1, x2, ..., xn) dx1 dx2 ⋯ dxn
Example 3.12
Suppose two resistors with resistances R1 and R2 are connected in parallel, so that the combined
resistance R satisfies

1/R = 1/R1 + 1/R2 ,

where R1 and R2 are the two resistances. Suppose that experience tells us that the two
resistances have joint pdf

f(x, y) =  xy²/18 ,   0 < x < 2, 0 < y < 3
           0 ,        otherwise

Then

E(1/R) = E(1/R1 + 1/R2) = ∫_0^3 ∫_0^2 (1/x + 1/y) f(x, y) dx dy
       = (1/18) ∫_0^3 ∫_0^2 (y² + xy) dx dy
       = (1/18) ∫_0^3 [ y²x + x²y/2 ]_0^2 dy
       = (1/9) ∫_0^3 (y² + y) dy
       = (1/9) [ y³/3 + y²/2 ]_0^3
       = 1.5
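An illustrative numerical check of this double integral using scipy:

```python
# Verify E(1/R1 + 1/R2) = 1.5 for f(x, y) = x*y**2/18 on 0 < x < 2, 0 < y < 3.
from scipy.integrate import dblquad

# dblquad integrates g(y, x) dy dx: outer variable x in (0, 2), inner y in (0, 3).
value, _ = dblquad(lambda y, x: (1 / x + 1 / y) * x * y**2 / 18, 0, 2, 0, 3)
print(round(value, 6))  # 1.5
```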
If the two resistors are to be connected in series, then the combined resistance
would be
R = R1 + R2 .
Properties
1. E(aX + bY + c) = aE(X) + bE(Y) + c for any constants a, b, c (no independence is required).
2. If X and Y are independent, then E(g(X)h(Y)) = E(g(X)) E(h(Y)); in particular, E(XY) = E(X)E(Y).
3. If X and Y are independent, then
   M_{X+Y}(t) = M_X(t) M_Y(t) .

Note also that the expectation of a function of X1 alone can be computed from either the joint pmf
or the marginal pmf of X1, since

E(g(X1)) = ∑_{x1} ∑_{x2} ⋯ ∑_{xn} g(x1) p(x1, x2, ..., xn)
         = ∑_{x1} g(x1) ∑_{x2} ∑_{x3} ⋯ ∑_{xn} p(x1, x2, ..., xn)
         = ∑_{x1} g(x1) p_{X1}(x1)
Example 3.13
The marginal pdfs of R1 and R2 in Example 3.12 are

f_{R1}(x) =  x/2 ,   0 < x < 2         f_{R2}(y) =  y²/9 ,   0 < y < 3
             0 ,     otherwise                      0 ,      otherwise

Hence

E(R1) = ∫_0^2 (1/2) x² dx = 4/3 ,     E(R2) = ∫_0^3 (1/9) y³ dy = 9/4 ,

and the expected combined resistance of the series connection is

E(R) = E(R1) + E(R2) = 4/3 + 9/4 = 43/12 .

Since f_{R1}(x) f_{R2}(y) = xy²/18 = f(x, y) for all 0 < x < 2 and 0 < y < 3, R1 and R2 are
independent. Thus

E(R1 R2) = E(R1) E(R2) = 3 .
Example 3.14
Suppose X ~ χ²_{r1} and Y ~ χ²_{r2} are two independent random variables. Then the mgf of X + Y is

M_{X+Y}(t) = M_X(t) M_Y(t) = 1/(1 − 2t)^{r1/2} · 1/(1 − 2t)^{r2/2} = 1/(1 − 2t)^{(r1 + r2)/2} ,   t < 1/2 ,

which is the mgf of the χ²_{r1 + r2} distribution. By the uniqueness of mgfs,

X + Y ~ χ²_{r1 + r2} .
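A small simulation consistent with this additivity property (illustrative; the degrees of freedom, seed and sample size are arbitrary choices):

```python
# The sum of independent chi-square variables is chi-square with added
# degrees of freedom: compare simulated sums with the chi2(r1 + r2) law.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
r1, r2, n = 3, 5, 200_000
s = rng.chisquare(r1, n) + rng.chisquare(r2, n)

print(round(s.mean(), 2), round(s.var(), 2))  # close to r1 + r2 = 8 and 2(r1 + r2) = 16
print(stats.kstest(s, stats.chi2(r1 + r2).cdf).pvalue > 0.01)  # usually True
```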
Definition
Let X and Y be random variables with means μ_x = E(X) and μ_y = E(Y). The covariance of X and Y
is defined as

σ_xy = Cov(X, Y) = E[(X − μ_x)(Y − μ_y)] = E(XY) − μ_x μ_y .
Example 3.15
In example 3.1, X is the number of red balls, Y is the number of white balls in a
sample of 3 randomly drawn from an urn containing 3 red balls, 4 white balls, and
5 blue balls. The following table shows the joint and marginal pmfs.
                      Values of Y
Values of X      0        1        2        3     Total
     0        0.0454   0.1818   0.1364   0.0182   0.3818
     1        0.1364   0.2727   0.0818   0        0.4909
     2        0.0682   0.0545   0        0        0.1227
     3        0.0045   0        0        0        0.0045
   Total      0.2545   0.5091   0.2182   0.0182   1.0000

E(XY) = ∑_{x=0}^{3} ∑_{y=0}^{3} x y p(x, y)
      = (1)(1)(0.2727) + (1)(2)(0.0818) + (2)(1)(0.0545)     (all other terms vanish)
      = 0.5455

From the marginal pmfs, E(X) = 0.75 and E(Y) = 1. Hence

σ_xy = Cov(X, Y) = E(XY) − E(X)E(Y) = 0.5455 − (0.75)(1) = −0.2045 .
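The same covariance can be computed exactly from the joint pmf formula of Example 3.1; an illustrative sketch:

```python
# Exact covariance of (X, Y) in Example 3.15, from the joint pmf of Example 3.1.
from math import comb

def p(x, y):
    if x < 0 or y < 0 or x + y > 3:
        return 0.0
    return comb(3, x) * comb(4, y) * comb(5, 3 - x - y) / comb(12, 3)

pairs = [(x, y) for x in range(4) for y in range(4)]
EXY = sum(x * y * p(x, y) for x, y in pairs)
EX = sum(x * p(x, y) for x, y in pairs)
EY = sum(y * p(x, y) for x, y in pairs)
print(round(EXY, 4), round(EX, 4), round(EY, 4), round(EXY - EX * EY, 4))
# 0.5455 0.75 1.0 -0.2045
```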
Properties
1. The sign and the magnitude of σ_xy reveal the direction and the strength of the
   linear relationship between X and Y: if Y tends to increase with X, then σ_xy > 0;
   if Y tends to decrease as X increases, then σ_xy < 0.
2. Cov(X, Y) = Cov(Y, X)
3. Cov(X, X) = Var(X)
4. Cov( ∑_{i=1}^{m} a_i Xi , ∑_{j=1}^{n} b_j Yj ) = ∑_{i=1}^{m} ∑_{j=1}^{n} a_i b_j Cov(Xi, Yj)
5. Var(X + Y) = Cov(X + Y, X + Y)
              = Cov(X, X) + Cov(X, Y) + Cov(Y, X) + Cov(Y, Y)
              = Var(X) + Var(Y) + 2Cov(X, Y)
Example 3.16
(Sampling without replacement from a finite population)
Suppose we randomly draw n balls from an urn with m red balls and N – m white
balls. Let X be the number of red balls in our sample. Then X has a hypergeometric
distribution with pmf
p(x) = P(X = x) = \binom{m}{x} \binom{N − m}{n − x} / \binom{N}{n} ,   max(0, n − (N − m)) ≤ x ≤ min(n, m) .
Let

Yi =  1 ,   if the i-th ball drawn is red
      0 ,   otherwise ,

then X = ∑_{i=1}^{n} Yi . Consider

E(Yi) = E(Yi²) = P(Yi = 1) = m/N ,    Var(Yi) = m/N − m²/N² = m(N − m)/N² ,

E(Yi Yj) = P(Yi = 1, Yj = 1) = (m/N) · ((m − 1)/(N − 1)) ,

Cov(Yi, Yj) = m(m − 1)/(N(N − 1)) − m²/N² = − m(N − m)/(N²(N − 1))   for i ≠ j .

Hence

E(X) = E( ∑_{i=1}^{n} Yi ) = ∑_{i=1}^{n} E(Yi) = ∑_{i=1}^{n} m/N = nm/N

Var(X) = ∑_{i=1}^{n} Var(Yi) + 2 ∑_{i<j} Cov(Yi, Yj)
       = ∑_{i=1}^{n} m(N − m)/N² + 2 ∑_{i<j} ( − m(N − m)/(N²(N − 1)) )
       = nm(N − m)/N² − n(n − 1) m(N − m)/(N²(N − 1))
       = ( (N − n)/(N − 1) ) n (m/N) (1 − m/N)
Coefficient of Correlation
Definition
Let X and Y be random variables with covariance σ_xy and standard deviations
σ_x = √Var(X) , σ_y = √Var(Y) . The population correlation coefficient between X
and Y is defined as

ρ = Corr(X, Y) = σ_xy / (σ_x σ_y) = Cov(X, Y) / √( Var(X) Var(Y) ) .
Example 3.17
From Example 3.15, σ_xy = −0.2045, and from the marginal pmfs (or the hypergeometric variance
formula of Example 3.16), σ_x = 0.6784 and σ_y = 0.7385. Hence

ρ = σ_xy / (σ_x σ_y) = −0.2045 / (0.6784 × 0.7385) = −0.4082 .

The number of red balls and the number of white balls in the sample are negatively correlated.
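The value of ρ can also be approximated by simulating the sampling experiment directly (illustrative sketch; the seed and number of replications are arbitrary):

```python
# Monte Carlo estimate of Corr(X, Y): repeatedly draw 3 balls without
# replacement from an urn of 3 red, 4 white and 5 blue balls.
import numpy as np

rng = np.random.default_rng(1)
urn = np.array(["red"] * 3 + ["white"] * 4 + ["blue"] * 5)

n_rep = 100_000
x = np.empty(n_rep)
y = np.empty(n_rep)
for i in range(n_rep):
    draw = rng.choice(urn, size=3, replace=False)
    x[i] = np.sum(draw == "red")
    y[i] = np.sum(draw == "white")

print(round(np.corrcoef(x, y)[0, 1], 3))   # close to -0.408
```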
Cauchy-Schwarz Inequality
For any random variables X and Y,

(E(XY))² ≤ E(X²) E(Y²) .

The equality holds if and only if either P(Y = 0) = 1 or P(X = aY) = 1 for some
constant a, i.e. X and Y have a perfect linear relationship.

Properties
1. −1 ≤ ρ ≤ 1
   By the Cauchy-Schwarz inequality,

   σ_xy² = [ E((X − μ_X)(Y − μ_Y)) ]² ≤ E((X − μ_X)²) E((Y − μ_Y)²) = σ_x² σ_y² ,
   i.e.   ρ² = σ_xy² / (σ_x² σ_y²) ≤ 1 .
The equality holds ( ρ = ±1) when X and Y are perfectly linearly related, i.e.
when
P ( X − μ X = a (Y − μY )) = 1 .
2. For any constants a, b, c, d with ac ≠ 0,

   Corr(aX + b, cY + d) = Cov(aX + b, cY + d) / √( Var(aX + b) Var(cY + d) )
                        = ac Cov(X, Y) / ( |ac| √( Var(X) Var(Y) ) )
                        = sign(ac) Corr(X, Y) ,

   i.e. the correlation coefficient is unchanged by changes of location and scale, apart
   from a possible change of sign.

3. If X and Y are independent, then

   Cov(X, Y) = E(XY) − E(X)E(Y) = E(X)E(Y) − E(X)E(Y) = 0 ,

   and hence ρ = 0.
Remarks
1. The converse of property 3 need not be true. That is, ρ = 0 does not imply X
and Y are independent.
Example 3.18
Let Θ be uniformly distributed on (0, 2π) and define X = sin Θ, Y = cos Θ. Obviously X and Y are
not independent because P(X² + Y² = 1) = 1. However,

E(X) = ∫_0^{2π} (sin θ)/(2π) dθ = [ −(cos θ)/(2π) ]_0^{2π} = 0 ,

E(Y) = ∫_0^{2π} (cos θ)/(2π) dθ = [ (sin θ)/(2π) ]_0^{2π} = 0 ,

E(XY) = ∫_0^{2π} (sin θ cos θ)/(2π) dθ = ∫_0^{2π} (sin 2θ)/(4π) dθ = [ −(cos 2θ)/(8π) ]_0^{2π} = 0 .

Hence Cov(X, Y) = E(XY) − E(X)E(Y) = 0 and ρ = 0, even though X and Y are dependent.
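An illustrative simulation of this example: the sample correlation is essentially zero even though Y is completely determined by X up to sign.

```python
# Uncorrelated but dependent: X = sin(Theta), Y = cos(Theta), Theta ~ U(0, 2*pi).
import numpy as np

rng = np.random.default_rng(2)
theta = rng.uniform(0.0, 2.0 * np.pi, size=200_000)
x, y = np.sin(theta), np.cos(theta)

print(round(np.corrcoef(x, y)[0, 1], 3))             # close to 0
print(round(np.abs(x**2 + y**2 - 1).max(), 12))      # 0.0, i.e. X^2 + Y^2 = 1 always
```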
Example 3.19
Let X and Y denote the incomes of a randomly selected married man and his wife respectively, and
let S = X + Y be the couple's total income. Suppose their pension contribution is 10% of the man's
income and 20% of the woman's income. Then the couple's total pension contribution is a weighted sum:

W = 0.1X + 0.2Y .

Suppose we know that the average man's income is E(X) = 20 and the average woman's is E(Y) = 16.
Then the average total income is

E(S) = E(X + Y) = E(X) + E(Y) = 20 + 16 = 36 .

Suppose further that Var(X) = 60, Var(Y) = 70 and Cov(X, Y) = 49. Then

Var(S) = Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y) = 60 + 70 + 2(49) = 228 ,
σ_S = √228 = 15.1 ,

and

Var(W) = Var(0.1X + 0.2Y)
       = (0.1)² Var(X) + (0.2)² Var(Y) + 2(0.1)(0.2) Cov(X, Y)
       = (0.01)(60) + (0.04)(70) + (0.04)(49) = 5.36 ,
σ_W = √5.36 = 2.32 .
More generally, let X1, X2, ..., Xn be random variables with

E(Xi) = μ_i ,   Var(Xi) = σ_i² ,

and let Y = ∑_{i=1}^{n} a_i Xi , where the a_i's are constants. Then

E(Y) = ∑_{i=1}^{n} a_i E(Xi) = ∑_{i=1}^{n} a_i μ_i ,

Var(Y) = ∑_{i=1}^{n} a_i² σ_i² + 2 ∑_{i<j} a_i a_j Cov(Xi, Xj) .

In particular, if X1, X2, ..., Xn are mutually independent (or merely uncorrelated), then
Var(Y) = ∑_{i=1}^{n} a_i² σ_i².
Based on a random sample X1, X2, ..., Xn, i.e. iid random variables from a population with mean μ
and variance σ², we usually compute summary statistics such as the sample mean and the sample
variance. The probability distributions of these summary statistics are called their sampling
distributions.
Sample Mean      X̄ = (1/n) ∑_{i=1}^{n} Xi

E(X̄) = (1/n) ∑_{i=1}^{n} μ = (1/n)(nμ) = μ

Var(X̄) = (1/n²) ∑_{i=1}^{n} σ² = (1/n²)(nσ²) = σ²/n
Sample Variance      S² = (1/(n − 1)) ∑_{i=1}^{n} (Xi − X̄)²

E{ ∑_{i=1}^{n} (Xi − X̄)² } = ∑_{i=1}^{n} E(Xi − X̄)² = ∑_{i=1}^{n} Var(Xi − X̄)      ( since E(Xi − X̄) = μ − μ = 0 )
                           = ∑_{i=1}^{n} { σ² + σ²/n − 2σ²/n }      ( since Cov(Xi, X̄) = σ²/n )
                           = (n − 1)σ²

Hence E(S²) = σ² .
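A small simulation consistent with the unbiasedness of S² (illustrative; it also shows that dividing by n instead of n − 1 underestimates σ²):

```python
# Average of many sample variances: the (n - 1) divisor is unbiased for sigma^2.
import numpy as np

rng = np.random.default_rng(3)
n, sigma2, n_rep = 5, 4.0, 200_000
samples = rng.normal(loc=1.0, scale=np.sqrt(sigma2), size=(n_rep, n))

s2_unbiased = samples.var(axis=1, ddof=1)   # divide by n - 1
s2_biased = samples.var(axis=1, ddof=0)     # divide by n
print(round(s2_unbiased.mean(), 2))         # close to sigma^2 = 4.0
print(round(s2_biased.mean(), 2))           # close to (n - 1)/n * sigma^2 = 3.2
```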
Suppose X1, X2, ..., Xn are independent and Y = ∑_{i=1}^{n} a_i Xi . Then

M_Y(t) = E(e^{tY}) = E( e^{t(a1X1 + a2X2 + ⋯ + anXn)} )
       = E( e^{ta1X1} e^{ta2X2} ⋯ e^{tanXn} )
       = E(e^{ta1X1}) E(e^{ta2X2}) ⋯ E(e^{tanXn})      (independence)
       = M_{X1}(a1t) M_{X2}(a2t) ⋯ M_{Xn}(ant)
       = ∏_{i=1}^{n} M_{Xi}(a_i t) .
Example 3.20
Example 3.21
Suppose X1, X2, ..., Xn are a random sample from N(μ, σ²). The mgf of a N(μ, σ²)
random variable is

M_X(t) = exp{ μt + σ²t²/2 } .

Since X̄ = ∑_{i=1}^{n} (1/n) Xi , we have

M_X̄(t) = ∏_{i=1}^{n} M_{Xi}(t/n) = { exp[ μ(t/n) + σ²(t/n)²/2 ] }^n = exp{ μt + (σ²/n)t²/2 } ,

which is the mgf of the N(μ, σ²/n) distribution. Hence

X̄ ~ N( μ, σ²/n ) .
Example 3.22
Suppose a measurement X of a quantity with true value μ is normally distributed with mean μ and
standard deviation 0.1. The probability that a single measurement deviates from the true value by
more than 0.05 is

P(|X − μ| > 0.05) = P( |X − μ|/0.1 > 0.5 ) = 2(1 − Φ(0.5)) = 2(0.309) = 0.618 .

Suppose instead that n = 10 independent measurements are taken and their average X̄ is used as the
final measurement. Then

X̄ ~ N( μ, 0.01/10 ) ,   i.e.   (X̄ − μ)/(0.1/√10) ~ N(0, 1) .

Hence the probability that this final measurement deviates from the true value by more than 0.05 is

P(|X̄ − μ| > 0.05) = P( |X̄ − μ|/(0.1/√10) > 0.05/(0.1/√10) ) = 2(1 − Φ(1.58)) = 2(0.057) = 0.114 .
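These two probabilities can be reproduced with scipy's normal cdf (the small differences from the values above are due to rounding in the normal table):

```python
# Two-sided tail probabilities of Example 3.22.
from math import sqrt
from scipy.stats import norm

sigma = 0.1
p_single = 2 * (1 - norm.cdf(0.05 / sigma))                 # one measurement
p_mean = 2 * (1 - norm.cdf(0.05 / (sigma / sqrt(10))))      # average of 10 measurements
print(round(p_single, 3), round(p_mean, 3))                 # 0.617 0.114
```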
Example 3.23
Suppose X1, X2, ..., Xn are iid Exp(λ). The moment generating function of each Xi is

M_X(t) = λ/(λ − t) ,   t < λ .

Hence the mgf of the sample mean X̄ = (1/n) ∑_{i=1}^{n} Xi is

M_X̄(t) = ∏_{i=1}^{n} M_{Xi}(t/n) = ( λ/(λ − t/n) )^n = ( nλ/(nλ − t) )^n ,   t < nλ ,

which is the mgf of the Γ(n, nλ) distribution. Therefore

X̄ ~ Γ(n, nλ) .
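An illustrative simulation consistent with this result, comparing sample means of exponential data with the Γ(n, nλ) distribution (λ, n, the seed and the number of replications are arbitrary choices):

```python
# Sample means of n iid Exp(lambda) variables follow Gamma(n, rate = n*lambda).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
lam, n, n_rep = 2.0, 5, 100_000
xbar = rng.exponential(scale=1 / lam, size=(n_rep, n)).mean(axis=1)

gamma_dist = stats.gamma(a=n, scale=1 / (n * lam))          # shape n, rate n*lambda
print(round(xbar.mean(), 3), round(gamma_dist.mean(), 3))   # both close to 1/lam = 0.5
print(stats.kstest(xbar, gamma_dist.cdf).pvalue > 0.01)     # usually True
```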
Example 3.24
Suppose X1, X2, ..., Xn are iid N(μ, σ²) and let S² = (1/(n − 1)) ∑_{i=1}^{n} (Xi − X̄)² be the
sample variance. Consider

∑_{i=1}^{n} (Xi − μ)² = ∑_{i=1}^{n} (Xi − X̄ + X̄ − μ)²
                      = ∑_{i=1}^{n} (Xi − X̄)² + ∑_{i=1}^{n} (X̄ − μ)² + 2(X̄ − μ) ∑_{i=1}^{n} (Xi − X̄)
                      = ∑_{i=1}^{n} (Xi − X̄)² + n(X̄ − μ)²

Hence

∑_{i=1}^{n} ( (Xi − μ)/σ )² = (n − 1)S²/σ² + ( (X̄ − μ)/(σ/√n) )²
          W                 =      Y       +          Z              (say)

Now,

Xi ~ N(μ, σ²)  ⇒  (Xi − μ)/σ ~ N(0, 1)  ⇒  ( (Xi − μ)/σ )² ~ χ²_1 ,

so W, being the sum of n such independent terms, satisfies W ~ χ²_n (Example 3.14). Also

X̄ ~ N(μ, σ²/n)  ⇒  (X̄ − μ)/(σ/√n) ~ N(0, 1)  ⇒  ( (X̄ − μ)/(σ/√n) )² ~ χ²_1 ,   i.e.  Z ~ χ²_1 .

It can be shown (in a later section) that X̄ and S² are independent. Therefore Y and Z are also
independent. Now consider the moment generating functions:

M_W(t) = M_Y(t) M_Z(t)
⇒  1/(1 − 2t)^{n/2} = M_Y(t) · 1/(1 − 2t)^{1/2}
⇒  M_Y(t) = 1/(1 − 2t)^{(n−1)/2}
Hence

Y = (n − 1)S²/σ² ~ χ²_{n−1} ,   or equivalently,   S² ~ Γ( (n − 1)/2 , (n − 1)/(2σ²) ) ,

so that

E(S²) = [ (n − 1)/2 ] / [ (n − 1)/(2σ²) ] = σ² ,     Var(S²) = [ (n − 1)/2 ] / [ (n − 1)/(2σ²) ]² = 2σ⁴/(n − 1) .
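An illustrative simulation consistent with these results for normal samples (the choices n = 6, σ = 2 and the seed are arbitrary):

```python
# (n - 1)S^2/sigma^2 behaves like chi-square with n - 1 degrees of freedom.
import numpy as np

rng = np.random.default_rng(5)
n, sigma = 6, 2.0
s2 = rng.normal(0.0, sigma, size=(200_000, n)).var(axis=1, ddof=1)
y = (n - 1) * s2 / sigma**2

print(round(y.mean(), 2), round(y.var(), 2))   # close to n - 1 = 5 and 2(n - 1) = 10
print(round(s2.var(), 2), round(2 * sigma**4 / (n - 1), 2))   # both close to 6.4
```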
However, in some real-life applications the assumptions of a random sample are not satisfied: the
population is finite and the sampling is done without replacement, so the Xi are not independent.
If the population size N is very large, so that the ratio n/N is negligible, then X1, X2, ..., Xn
can still be treated as approximately mutually independent. On the other hand, if n/N is not so
small, then X1, X2, ..., Xn no longer form a random sample. In this case they are called a simple
random sample (SRS), and the results for the sample mean should be adjusted to

E(X̄) = μ ,     Var(X̄) = ( (N − n)/(N − 1) ) σ²/n .

The factor (N − n)/(N − 1) is called the finite population correction factor.
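An illustrative simulation of the finite population correction, using a small artificial population (the population values, seed and number of replications are arbitrary choices):

```python
# Variance of the sample mean under sampling without replacement from a
# small finite population, compared with the corrected and uncorrected formulas.
import numpy as np

rng = np.random.default_rng(6)
population = np.array([2.0, 3.0, 5.0, 7.0, 11.0, 13.0, 17.0, 19.0])
N, n = len(population), 3
sigma2 = population.var()            # population variance (divisor N)

xbar = np.array([rng.choice(population, size=n, replace=False).mean()
                 for _ in range(100_000)])

print(round(xbar.var(), 2))                         # empirical Var(X-bar), about 8.5
print(round((N - n) / (N - 1) * sigma2 / n, 2))     # with the correction factor, 8.51
print(round(sigma2 / n, 2))                         # without the correction, 11.91 (too large)
```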