
Lecture 18: Sampling distributions (UW-Madison Stat 609, 2015)

In many applications, the population is (exactly or approximately) one or several normal distributions.
We now study properties of some important statistics based on a random sample from a normal distribution.
If X1, ..., Xn is a random sample from N(µ, σ²), then the joint pdf is
$$\frac{1}{(2\pi)^{n/2}\sigma^{n}} \exp\left( -\frac{1}{2\sigma^{2}} \sum_{i=1}^{n} (x_i - \mu)^2 \right), \qquad x_i \in \mathbb{R},\ i = 1, \dots, n.$$

Theorem 5.3.1.
Let X1, ..., Xn be a random sample from N(µ, σ²) and let X̄ and S² be the sample mean and sample variance. Then
a. X̄ and S² are independent random variables;
b. X̄ ∼ N(µ, σ²/n);
c. (n − 1)S²/σ² has the chi-square distribution with n − 1 degrees of freedom.
Proof.
We have already established property b (Chapter 4).
To prove property a, it is enough to show the independence of Z̄ and S_Z², the sample mean and variance based on Zi = (Xi − µ)/σ ∼ N(0, 1), i = 1, ..., n, because we can apply Theorem 4.6.12 and
$$\bar X = \sigma \bar Z + \mu \quad\text{and}\quad S^2 = \frac{\sigma^2}{n-1} \sum_{i=1}^{n} (Z_i - \bar Z)^2 = \sigma^2 S_Z^2.$$
Consider the transformation
$$Y_1 = \bar Z, \qquad Y_i = Z_i - \bar Z, \quad i = 2, \dots, n.$$
Then
$$Z_1 = Y_1 - (Y_2 + \cdots + Y_n), \qquad Z_i = Y_i + Y_1, \quad i = 2, \dots, n,$$
and
$$\frac{\partial(z_1, \dots, z_n)}{\partial(y_1, \dots, y_n)} = n.$$
Since the joint pdf of Z1, ..., Zn is
$$\frac{1}{(2\pi)^{n/2}} \exp\left( -\frac{1}{2} \sum_{i=1}^{n} z_i^2 \right), \qquad z_i \in \mathbb{R},\ i = 1, \dots, n,$$
the joint pdf of (Y1, ..., Yn) is
$$\frac{n}{(2\pi)^{n/2}} \exp\left[ -\frac{1}{2} \left( y_1 - \sum_{i=2}^{n} y_i \right)^{2} \right] \exp\left[ -\frac{1}{2} \sum_{i=2}^{n} (y_i + y_1)^2 \right]$$
$$= \frac{n}{(2\pi)^{n/2}} \exp\left( -\frac{n}{2} y_1^2 \right) \exp\left[ -\frac{1}{2} \left( \sum_{i=2}^{n} y_i^2 + \Big( \sum_{i=2}^{n} y_i \Big)^{2} \right) \right], \qquad y_i \in \mathbb{R},\ i = 1, \dots, n.$$

Since the first exp factor involves y1 only and the second exp factor
involves y2 , ..., yn , we conclude (Theorem 4.6.11) that Y1 is
independent of (Y2 , ..., Yn ).
Since
$$Z_1 - \bar Z = -\sum_{i=2}^{n} (Z_i - \bar Z) = -\sum_{i=2}^{n} Y_i \quad\text{and}\quad Z_i - \bar Z = Y_i, \quad i = 2, \dots, n,$$
we have
$$S_Z^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Z_i - \bar Z)^2 = \frac{1}{n-1} \left[ \Big( \sum_{i=2}^{n} Y_i \Big)^{2} + \sum_{i=2}^{n} Y_i^2 \right],$$
which is a function of (Y2 , ..., Yn ).
Hence, Z̄ and S_Z² are independent by Theorem 4.6.12.
This proves a.
Finally, we prove c (the proof in the textbook can be simplified).
Note that
$$(n-1)S^2 = \sum_{i=1}^{n} (X_i - \bar X)^2 = \sum_{i=1}^{n} (X_i - \mu + \mu - \bar X)^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\mu - \bar X)^2.$$
Then
$$\frac{(n-1)S^2}{\sigma^2} + n \left( \frac{\bar X - \mu}{\sigma} \right)^{2} = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^{2} = \sum_{i=1}^{n} Z_i^2.$$
Since Zi ∼ N(0, 1) and Z1, ..., Zn are independent, we have previously shown that
- each Zi² ∼ chi-square with 1 degree of freedom;
- the sum Z1² + ··· + Zn² ∼ chi-square with n degrees of freedom, and its mgf is (1 − 2t)^{−n/2}, t < 1/2;
- √n(X̄ − µ)/σ ∼ N(0, 1) and hence n[(X̄ − µ)/σ]² ∼ chi-square with 1 degree of freedom.
The left hand side of the previous expression is a sum of two independent random variables and, hence, if f(t) is the mgf of (n − 1)S²/σ², then the mgf of the sum on the left hand side is
$$(1 - 2t)^{-1/2} f(t).$$
Since the right hand side of the previous expression has mgf (1 − 2t)^{−n/2}, we must have
$$f(t) = \frac{(1-2t)^{-n/2}}{(1-2t)^{-1/2}} = (1-2t)^{-(n-1)/2}, \qquad t < 1/2.$$
This is the mgf of the chi-square distribution with n − 1 degrees of freedom, and the result follows.

The independence of X̄ and S² can be established in other ways.
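As a numerical illustration of Theorem 5.3.1, here is a minimal Monte Carlo sketch (assuming NumPy and SciPy are available; the parameter values are illustrative, not from the lecture):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 3.0, 10, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)               # sample means
s2 = x.var(axis=1, ddof=1)          # sample variances

# b. X-bar should be N(mu, sigma^2/n)
print(xbar.mean(), mu)              # close to each other
print(xbar.var(), sigma**2 / n)     # close to each other

# a. independence implies (in particular) zero correlation
print(np.corrcoef(xbar, s2)[0, 1])  # close to 0

# c. (n-1)S^2/sigma^2 should be chi-square with n-1 degrees of freedom
q = (n - 1) * s2 / sigma**2
print(stats.kstest(q, stats.chi2(df=n - 1).cdf).statistic)  # close to 0
```

Note that zero correlation only checks a consequence of independence; the theorem itself gives the much stronger exact statement.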


t-distribution
Let X1, ..., Xn be a random sample from N(µ, σ²).
Using the result in Chapter 4 about a ratio of independent normal and chi-square random variables, the ratio
$$T = \frac{\bar X - \mu}{S/\sqrt{n}} = \frac{(\bar X - \mu)/(\sigma/\sqrt{n})}{\sqrt{[(n-1)S^2/\sigma^2]/(n-1)}}$$
has the central t-distribution with n − 1 degrees of freedom.
What is the distribution of T0 = (X̄ − µ0)/(S/√n) for a fixed known constant µ0 ∈ ℝ which is not necessarily equal to µ?
Note that T is not a statistic (it depends on the unknown µ), while T0 is a statistic.
Since X̄ − µ0 ∼ N(µ − µ0, σ²/n), from the discussion in Chapter 4 we know that the distribution of T0 is the noncentral t-distribution with n − 1 degrees of freedom and noncentrality parameter δ = √n(µ − µ0)/σ.
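Both claims can be checked by simulation; a sketch (the parameter values are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, mu0, sigma, n, reps = 1.0, 0.5, 2.0, 8, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar, s = x.mean(axis=1), x.std(axis=1, ddof=1)

t = (xbar - mu) / (s / np.sqrt(n))    # central t with n-1 degrees of freedom
t0 = (xbar - mu0) / (s / np.sqrt(n))  # noncentral t

delta = np.sqrt(n) * (mu - mu0) / sigma
print(stats.kstest(t, stats.t(df=n - 1).cdf).statistic)               # close to 0
print(stats.kstest(t0, stats.nct(df=n - 1, nc=delta).cdf).statistic)  # close to 0
```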

F-distribution
Let X1, ..., Xn be a random sample from N(µx, σx²), Y1, ..., Ym be a random sample from N(µy, σy²), Xi's and Yi's be independent, and Sx² and Sy² be the sample variances based on the Xi's and Yi's, respectively.
From the previous discussion, (n − 1)Sx²/σx² and (m − 1)Sy²/σy² are both chi-square distributed, and the ratio
$$\frac{S_x^2/\sigma_x^2}{S_y^2/\sigma_y^2}$$
has the F-distribution with degrees of freedom n − 1 and m − 1 (denoted by F_{n−1,m−1}).
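A simulation sketch of the variance-ratio claim (the two normal populations below are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, m, reps = 12, 9, 100_000
sigx, sigy = 2.0, 0.5

sx2 = rng.normal(0.0, sigx, size=(reps, n)).var(axis=1, ddof=1)
sy2 = rng.normal(5.0, sigy, size=(reps, m)).var(axis=1, ddof=1)

ratio = (sx2 / sigx**2) / (sy2 / sigy**2)
print(stats.kstest(ratio, stats.f(dfn=n - 1, dfd=m - 1).cdf).statistic)  # close to 0
```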



Theorem 5.3.8.
Let Fp,q denote the F-distribution with degrees of freedom p and q.
a. If X ∼ Fp,q, then 1/X ∼ Fq,p.
b. If X has the t-distribution with q degrees of freedom, then X² ∼ F1,q.
c. If X ∼ Fp,q, then (p/q)X/[1 + (p/q)X] ∼ beta(p/2, q/2).

Proof.
We only need to prove c, since properties a and b follow directly from the definitions of the F- and t-distributions.
Note that Z = (p/q)X has pdf
$$f_Z(z) = \frac{\Gamma[(p+q)/2]}{\Gamma(p/2)\Gamma(q/2)} \frac{z^{p/2-1}}{(1+z)^{(p+q)/2}}, \qquad z > 0.$$
If u = z/(1 + z), then z = u/(1 − u), dz = (1 − u)^{−2} du, and the pdf of U = Z/(1 + Z) is
$$f_U(u) = \frac{\Gamma[(p+q)/2]}{\Gamma(p/2)\Gamma(q/2)} \left( \frac{u}{1-u} \right)^{p/2-1} (1-u)^{(p+q)/2} \frac{1}{(1-u)^2} = \frac{\Gamma[(p+q)/2]}{\Gamma(p/2)\Gamma(q/2)} u^{p/2-1} (1-u)^{q/2-1}, \qquad 0 < u < 1.$$
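All three parts of Theorem 5.3.8 are easy to check numerically; a sketch (p, q, and the sample sizes are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
p, q, reps = 4, 7, 100_000
x = stats.f(dfn=p, dfd=q).rvs(size=reps, random_state=rng)

# a. 1/X ~ F_{q,p}
print(stats.kstest(1 / x, stats.f(dfn=q, dfd=p).cdf).statistic)        # close to 0

# b. if T ~ t_q, then T^2 ~ F_{1,q}
t = stats.t(df=q).rvs(size=reps, random_state=rng)
print(stats.kstest(t**2, stats.f(dfn=1, dfd=q).cdf).statistic)         # close to 0

# c. (p/q)X / [1 + (p/q)X] ~ beta(p/2, q/2)
z = (p / q) * x
print(stats.kstest(z / (1 + z), stats.beta(a=p / 2, b=q / 2).cdf).statistic)  # close to 0
```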
Definition 5.4.1 (Order statistics).
The order statistics of a random sample of univariate X1 , ..., Xn are the
sample values placed in a non-decreasing order, and they are denoted
by X(1) , ..., X(n) .

Once X(1), ..., X(n) are given, the only information left in the sample is which positions the X(i)'s were observed from, i = 1, ..., n.

Functions of order statistics
Many useful statistics are functions of order statistics.
Both the sample mean and the sample variance are functions of order statistics, because
$$\sum_{i=1}^{n} X_i = \sum_{i=1}^{n} X_{(i)} \quad\text{and}\quad \sum_{i=1}^{n} X_i^2 = \sum_{i=1}^{n} X_{(i)}^2.$$

The sample range R = X(n) − X(1), the distance between the smallest and largest observations, is a measure of the dispersion in the sample and should reflect the dispersion in the population.



For any fixed p ∈ (0, 1), the (100p)th sample percentile is the observation such that about np of the observations are less than this observation and about n(1 − p) of the observations are greater:
- X(1) if p ≤ (2n)⁻¹
- X({np}) if (2n)⁻¹ < p < 0.5
- X((n+1)/2) if p = 0.5 and n is odd
- (X(n/2) + X(n/2+1))/2 if p = 0.5 and n is even
- X(n+1−{n(1−p)}) if 0.5 < p < 1 − (2n)⁻¹
- X(n) if p ≥ 1 − (2n)⁻¹

where {b} is the number b rounded to the nearest integer, i.e., if k is an integer and k − 0.5 ≤ b < k + 0.5, then {b} = k.
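The case-by-case definition translates directly into code; a sketch (the names curly and sample_percentile are mine, not from the lecture):

```python
import math

def curly(b: float) -> int:
    """{b}: b rounded to the nearest integer, so that
    k - 0.5 <= b < k + 0.5 gives {b} = k."""
    return math.floor(b + 0.5)

def sample_percentile(x: list[float], p: float) -> float:
    """The (100p)th sample percentile per the definition above."""
    n = len(x)
    xs = sorted(x)                        # xs[i-1] is X_(i)
    if p <= 1 / (2 * n):
        return xs[0]
    if p < 0.5:
        return xs[curly(n * p) - 1]
    if p == 0.5:
        if n % 2 == 1:
            return xs[(n + 1) // 2 - 1]
        return (xs[n // 2 - 1] + xs[n // 2]) / 2
    if p < 1 - 1 / (2 * n):
        return xs[n + 1 - curly(n * (1 - p)) - 1]
    return xs[-1]
```

For example, sample_percentile([3.0, 1.0, 2.0], 0.5) returns 2.0, the middle order statistic of an odd-sized sample.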
Other textbooks may define sample percentiles differently.
The sample median is the 50th sample percentile.
It is a measure of location, alternative to the sample mean.
The sample lower quartile is the 25th sample percentile and the
upper quartile is the 75th sample percentile.

The sample mid-range is defined as V = (X(1) + X(n) )/2.


If X1 , ..., Xn is a random sample of discrete random variables, then the
calculation of probabilities for the order statistics is mainly a counting
task.
Theorem 5.4.3.
Let X1, ..., Xn be a random sample from a discrete distribution with pmf f(xi) = pi, where x1 < x2 < ··· are the possible values of X1. Define
$$P_0 = 0, \quad P_1 = p_1, \quad \dots, \quad P_i = p_1 + \cdots + p_i, \quad \dots$$
Then, for the jth order statistic X(j),
$$P(X_{(j)} \le x_i) = \sum_{k=j}^{n} \binom{n}{k} P_i^{k} (1-P_i)^{n-k}$$
$$P(X_{(j)} = x_i) = \sum_{k=j}^{n} \binom{n}{k} \left[ P_i^{k} (1-P_i)^{n-k} - P_{i-1}^{k} (1-P_{i-1})^{n-k} \right]$$

Proof.
For any fixed i, let Y be the number of X1, ..., Xn that are less than or equal to xi.
If the event {Xj ≤ xi} is a "success", then Y is the number of successes in n trials and is distributed as binomial(n, Pi).
Then, the result follows from {X(j) ≤ xi} = {Y ≥ j},
$$P(X_{(j)} \le x_i) = P(Y \ge j) = \sum_{k=j}^{n} \binom{n}{k} P_i^{k} (1-P_i)^{n-k},$$
and P(X(j) = xi) = P(X(j) ≤ xi) − P(X(j) ≤ xi−1).
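For a small discrete distribution, the formula can be checked against brute-force enumeration of all samples; a sketch (the pmf and sizes are illustrative):

```python
from itertools import product
from math import comb

vals, pmf = [1, 2, 3], [0.2, 0.5, 0.3]   # possible values x_1 < x_2 < x_3
n, j = 3, 2                               # sample size, order-statistic index

P = [0.0]
for p in pmf:
    P.append(P[-1] + p)                   # cumulative P_0, P_1, P_2, P_3

def cdf_formula(i: int) -> float:
    """P(X_(j) <= x_i) from Theorem 5.4.3."""
    return sum(comb(n, k) * P[i]**k * (1 - P[i])**(n - k)
               for k in range(j, n + 1))

def cdf_bruteforce(i: int) -> float:
    """Sum the probabilities of all n-tuples whose jth smallest value is <= x_i."""
    total = 0.0
    for sample in product(range(len(vals)), repeat=n):
        prob = 1.0
        for idx in sample:
            prob *= pmf[idx]
        if sorted(vals[idx] for idx in sample)[j - 1] <= vals[i - 1]:
            total += prob
    return total

for i in (1, 2, 3):
    print(cdf_formula(i), cdf_bruteforce(i))   # the two columns should agree
```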

If X1, ..., Xn is a random sample from a continuous population with pdf f(x), then
$$P(X_{(1)} < X_{(2)} < \cdots < X_{(n)}) = 1,$$
i.e., we do not need to worry about ties, and the joint pdf of (X(1), ..., X(n)) is
$$h(x_1, \dots, x_n) = \begin{cases} n! \, f(x_1) \cdots f(x_n) & x_1 < x_2 < \cdots < x_n \\ 0 & \text{otherwise} \end{cases}$$
The n! naturally comes into this formula because, for any set of values x1, ..., xn, there are n! equally likely assignments of these values to X1, ..., Xn that all yield the same values for the order statistics.
Theorem 5.4.4.
Let X(1), ..., X(n) be the order statistics of a random sample X1, ..., Xn from a continuous population with cdf F and pdf f.
Then the pdf of X(j) is
$$f_{X_{(j)}}(x) = \frac{n!}{(j-1)!(n-j)!} [F(x)]^{j-1} [1-F(x)]^{n-j} f(x), \qquad x \in \mathbb{R}.$$
Proof.
Let Y be the number of X1, ..., Xn less than or equal to x.
Then, similar to the proof of Theorem 5.4.3, Y ∼ binomial(n, F(x)), {X(j) ≤ x} = {Y ≥ j}, and
$$F_{X_{(j)}}(x) = P(X_{(j)} \le x) = P(Y \ge j) = \sum_{k=j}^{n} \binom{n}{k} [F(x)]^{k} [1-F(x)]^{n-k}.$$
We now obtain the pdf of X(j) by differentiating the cdf:
$$f_{X_{(j)}}(x) = \frac{d}{dx} F_{X_{(j)}}(x) = \sum_{k=j}^{n} \binom{n}{k} \frac{d}{dx} \left\{ [F(x)]^{k} [1-F(x)]^{n-k} \right\}$$
$$= \sum_{k=j}^{n} \binom{n}{k} \left\{ k [F(x)]^{k-1} [1-F(x)]^{n-k} - (n-k) [F(x)]^{k} [1-F(x)]^{n-k-1} \right\} f(x)$$
$$= \binom{n}{j} j [F(x)]^{j-1} [1-F(x)]^{n-j} f(x) + \sum_{l=j+1}^{n} \binom{n}{l} l [F(x)]^{l-1} [1-F(x)]^{n-l} f(x) - \sum_{k=j}^{n-1} \binom{n}{k} (n-k) [F(x)]^{k} [1-F(x)]^{n-k-1} f(x)$$
$$= \frac{n!}{(j-1)!(n-j)!} [F(x)]^{j-1} [1-F(x)]^{n-j} f(x) + \sum_{k=j}^{n-1} \binom{n}{k+1} (k+1) [F(x)]^{k} [1-F(x)]^{n-k-1} f(x) - \sum_{k=j}^{n-1} \binom{n}{k} (n-k) [F(x)]^{k} [1-F(x)]^{n-k-1} f(x)$$
The result follows from the fact that the last two terms cancel, because
$$(k+1) \binom{n}{k+1} = \frac{n!}{k!(n-k-1)!} = (n-k) \binom{n}{k}.$$
Example 5.4.5.
Let X1, ..., Xn be a random sample from uniform(0, 1), so that f(x) = 1 and F(x) = x for x ∈ [0, 1].
By Theorem 5.4.4, the pdf of X(j) is
$$\frac{n!}{(j-1)!(n-j)!} x^{j-1} (1-x)^{n-j} = \frac{\Gamma(n+1)}{\Gamma(j)\Gamma(n-j+1)} x^{j-1} (1-x)^{(n-j+1)-1}, \qquad 0 < x < 1,$$
which is the pdf of beta(j, n − j + 1).
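A simulation sketch of this beta result (n and j are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, j, reps = 10, 3, 100_000

u = rng.uniform(0, 1, size=(reps, n))
xj = np.sort(u, axis=1)[:, j - 1]        # jth order statistic

print(stats.kstest(xj, stats.beta(a=j, b=n - j + 1).cdf).statistic)  # close to 0
```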

Theorem 5.4.6.
Let X(1), ..., X(n) be the order statistics of a random sample X1, ..., Xn from a continuous population with cdf F and pdf f.
Then the joint pdf of X(i) and X(j), 1 ≤ i < j ≤ n, is
$$f_{X_{(i)},X_{(j)}}(x, y) = \frac{n!}{(i-1)!(j-i-1)!(n-j)!} [F(x)]^{i-1} [F(y)-F(x)]^{j-i-1} [1-F(y)]^{n-j} f(x) f(y), \qquad x < y,\ (x, y) \in \mathbb{R}^2.$$
The proof is left to Exercise 5.26.
Example 5.4.7.
Let X1, ..., Xn be a random sample from uniform(0, a), R = X(n) − X(1) be the range, and V = (X(1) + X(n))/2 be the midrange.
We want to obtain the joint pdf of R and V as well as the marginal distributions of R and V.
By Theorem 5.4.6, the joint pdf of Z = X(1) and Y = X(n) is
$$f_{Z,Y}(z, y) = \frac{n(n-1)}{a^2} \left( \frac{y}{a} - \frac{z}{a} \right)^{n-2} = \frac{n(n-1)(y-z)^{n-2}}{a^n}, \qquad 0 < z < y < a.$$
Since R = Y − Z and V = (Y + Z)/2, we obtain Z = V − R/2 and Y = V + R/2,
$$\frac{\partial(z, y)}{\partial(r, v)} = \begin{vmatrix} -\tfrac{1}{2} & 1 \\ \tfrac{1}{2} & 1 \end{vmatrix} = -1.$$
The transformation from (Z, Y) to (R, V) maps the set
$$\{(z, y) : 0 < z < y < a\} \to \{(r, v) : 0 < r < a,\ r/2 < v < a - r/2\}.$$
Obviously 0 < r < a, and for a fixed r, the smallest value of v is r/2 (when z = 0 and y = r) and the largest value of v is a − r/2 (when z = a − r and y = a).
Thus, the joint pdf of R and V is
$$f_{R,V}(r, v) = \frac{n(n-1) r^{n-2}}{a^n}, \qquad 0 < r < a,\ r/2 < v < a - r/2.$$
The marginal pdf of R is
$$f_R(r) = \int_{r/2}^{a-r/2} \frac{n(n-1) r^{n-2}}{a^n} \, dv = \frac{n(n-1) r^{n-2} (a-r)}{a^n}, \qquad 0 < r < a.$$
The marginal pdf of V is
$$f_V(v) = \int_{0}^{2v} \frac{n(n-1) r^{n-2}}{a^n} \, dr = \frac{n (2v)^{n-1}}{a^n}, \qquad 0 < v \le a/2,$$
$$f_V(v) = \int_{0}^{2(a-v)} \frac{n(n-1) r^{n-2}}{a^n} \, dr = \frac{n [2(a-v)]^{n-1}}{a^n}, \qquad a/2 < v < a,$$
because the set where f_{R,V}(r, v) > 0 is
$$\{(r, v) : 0 < r < a,\ r/2 < v < a - r/2\} = \{(r, v) : 0 < v \le a/2,\ 0 < r < 2v\} \cup \{(r, v) : a/2 < v < a,\ 0 < r < 2(a-v)\}.$$
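A simulation sketch comparing histograms of R and V with the derived marginal densities (a and n are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
a, n, reps = 2.0, 6, 200_000

x = rng.uniform(0, a, size=(reps, n))
lo, hi = x.min(axis=1), x.max(axis=1)
r, v = hi - lo, (hi + lo) / 2            # range and midrange

def f_R(t):
    return n * (n - 1) * t**(n - 2) * (a - t) / a**n

def f_V(t):
    return np.where(t <= a / 2,
                    n * (2 * t)**(n - 1) / a**n,
                    n * (2 * (a - t))**(n - 1) / a**n)

for sample, pdf in [(r, f_R), (v, f_V)]:
    hist, edges = np.histogram(sample, bins=20, density=True)
    centers = (edges[:-1] + edges[1:]) / 2
    print(np.max(np.abs(hist - pdf(centers))))   # small: histogram tracks the pdf
```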
Example.
Let X1, ..., Xn be a random sample from uniform(0, 1).
We want to find the distribution of X1/X(1).
For s > 1,
$$P\left( \frac{X_1}{X_{(1)}} > s \right) = \sum_{i=1}^{n} P\left( \frac{X_1}{X_{(1)}} > s,\ X_{(1)} = X_i \right) = \sum_{i=2}^{n} P\left( \frac{X_1}{X_{(1)}} > s,\ X_{(1)} = X_i \right),$$
since the i = 1 term vanishes (X1/X(1) = 1 < s when X(1) = X1). By symmetry the remaining terms are equal, so this is
$$= (n-1) P\left( \frac{X_1}{X_{(1)}} > s,\ X_{(1)} = X_n \right)$$
$$= (n-1) P(X_1 > sX_n,\ X_2 > X_n,\ \dots,\ X_{n-1} > X_n)$$
$$= (n-1) P(sX_n < 1,\ X_1 > sX_n,\ X_2 > X_n,\ \dots,\ X_{n-1} > X_n)$$
$$= (n-1) \int_{0}^{1/s} \left[ \int_{sx_n}^{1} \left( \prod_{i=2}^{n-1} \int_{x_n}^{1} dx_i \right) dx_1 \right] dx_n = (n-1) \int_{0}^{1/s} (1-x_n)^{n-2} (1-sx_n) \, dx_n.$$
Thus, for s > 1,
$$\frac{d}{ds} P\left( \frac{X_1}{X_{(1)}} \le s \right) = \frac{d}{ds} \left[ 1 - (n-1) \int_{0}^{1/s} (1-t)^{n-2} (1-st) \, dt \right] = (n-1) \int_{0}^{1/s} (1-t)^{n-2} \, t \, dt$$
(the boundary term from the upper limit vanishes because 1 − s(1/s) = 0)
$$= (n-1) \int_{0}^{1/s} (1-t)^{n-2} \, dt - (n-1) \int_{0}^{1/s} (1-t)^{n-1} \, dt$$
$$= 1 - \left( 1 - \frac{1}{s} \right)^{n-1} - \frac{n-1}{n} \left[ 1 - \left( 1 - \frac{1}{s} \right)^{n} \right].$$
For s ≤ 1, obviously
$$P\left( \frac{X_1}{X_{(1)}} \le s \right) = 0 \quad\text{and}\quad \frac{d}{ds} P\left( \frac{X_1}{X_{(1)}} \le s \right) = 0,$$
since X1/X(1) ≥ 1 always.
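The tail probability derived above can be compared with a Monte Carlo estimate; a sketch (n and the values of s are illustrative):

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(6)
n, reps = 5, 500_000

x = rng.uniform(0, 1, size=(reps, n))
ratio = x[:, 0] / x.min(axis=1)          # X_1 / X_(1)

for s in (1.5, 2.0, 4.0):
    # P(X_1/X_(1) > s) = (n-1) * integral_0^{1/s} (1-t)^(n-2) (1-st) dt
    tail, _ = quad(lambda t: (n - 1) * (1 - t)**(n - 2) * (1 - s * t), 0, 1 / s)
    print(tail, (ratio > s).mean())      # the two columns should agree
```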

