Chapter 10: Point Estimation
Usually, we use the value of a statistic to estimate a population parameter. This value is called a
point estimate of the parameter. The statistic is also referred to as an estimator.
Examples: If we take a random sample of size $n$, the sample mean $\bar{x}$ may be used as a point estimate of the population mean $\mu$; the statistic $\bar{X}$ is an estimator of $\mu$.
• $\bar{X}$ is an estimator of $\mu$
• $S^2$ is an estimator of $\sigma^2$
• $\hat{p} = X/n$ is an estimator of $p$
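As a quick illustration, the sketch below computes these three point estimates from simulated data; the sample size and the true parameter values are arbitrary choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random sample of size n from a normal population (true mu = 10, sigma = 2).
n = 50
sample = rng.normal(loc=10.0, scale=2.0, size=n)

x_bar = sample.mean()        # point estimate of mu
s_sq = sample.var(ddof=1)    # point estimate of sigma^2 (divisor n - 1)

# Binomial experiment: X successes in n trials, used to estimate p (true p = 0.3).
x = rng.binomial(n=n, p=0.3)
p_hat = x / n                # point estimate of p

print(f"x-bar = {x_bar:.3f}, s^2 = {s_sq:.3f}, p-hat = {p_hat:.3f}")
```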
Since there are many estimators, it is necessary to study some desirable properties of estimators.
Properties of Estimators
Section 10.2 Unbiased Estimators
A statistic $\hat{\theta}$ is an unbiased estimator of the parameter $\theta$ if and only if $E(\hat{\theta}) = \theta$.
Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$. Consider the sample mean $\bar{X}$. Then
$$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu.$$
Therefore, $\bar{X}$ is an unbiased estimator of $\mu$.
If $S^2$ is the variance of a random sample from an infinite population, then $E(S^2) = \sigma^2$.
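A simulation makes this concrete. The sketch below (with an arbitrary normal population and sample size) averages $\bar{X}$ and $S^2$ over many samples; the averages land near $\mu$ and $\sigma^2$, while the divide-by-$n$ variance comes out systematically low.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 5.0, 2.0, 10, 100_000  # arbitrary demo values

samples = rng.normal(mu, sigma, size=(reps, n))

x_bars = samples.mean(axis=1)
s_sq = samples.var(axis=1, ddof=1)   # divisor n - 1 (unbiased)
v_n = samples.var(axis=1, ddof=0)    # divisor n (biased)

print(f"average of X-bar: {x_bars.mean():.4f}  (mu = {mu})")
print(f"average of S^2:   {s_sq.mean():.4f}  (sigma^2 = {sigma**2})")
print(f"average of v_n:   {v_n.mean():.4f}  (low by the factor (n-1)/n)")
```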
If $\hat{d}$ is biased for $\theta$ and $\lim_{n \to \infty} E(\hat{d}) = \theta$, then we say that $\hat{d}$ is asymptotically unbiased for $\theta$.
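For example, the divide-by-$n$ sample variance $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2 = \frac{n-1}{n}S^2$ is biased, since
$$E(\hat{\sigma}^2) = \frac{n-1}{n}E(S^2) = \frac{n-1}{n}\sigma^2 \neq \sigma^2,$$
but $\frac{n-1}{n}\sigma^2 \to \sigma^2$ as $n \to \infty$, so $\hat{\sigma}^2$ is asymptotically unbiased for $\sigma^2$.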
Problems of Unbiasedness
• If $\hat{\theta}$ is unbiased for $\theta$, it does not follow that $\omega(\hat{\theta})$ is unbiased for $\omega(\theta)$ (see the example below).
• Unbiased estimators are not necessarily unique.
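To illustrate the first point with $\omega(\theta) = \theta^2$: although $\bar{X}$ is unbiased for $\mu$,
$$E(\bar{X}^2) = \operatorname{var}(\bar{X}) + [E(\bar{X})]^2 = \frac{\sigma^2}{n} + \mu^2 \neq \mu^2,$$
so $\bar{X}^2$ is a biased estimator of $\mu^2$.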
Efficiency
The Cramér-Rao inequality states that if $\hat{\theta}$ is an unbiased estimator of $\theta$, then
$$\operatorname{var}(\hat{\theta}) \geq \frac{1}{n \cdot E\left[\left(\dfrac{\partial \ln f(X)}{\partial \theta}\right)^2\right]}.$$
The quantity in the denominator is referred to as the information about $\theta$ supplied by the sample. An unbiased estimator whose variance attains this lower bound is a minimum variance unbiased estimator of $\theta$. Note that
$$E\left[\left(\frac{\partial \ln f(X)}{\partial \theta}\right)^2\right] = -E\left[\frac{\partial^2 \ln f(X)}{\partial \theta^2}\right].$$
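As a worked example, take a Bernoulli population, $f(x; \theta) = \theta^x(1-\theta)^{1-x}$ for $x = 0, 1$. Then $\ln f(x; \theta) = x\ln\theta + (1-x)\ln(1-\theta)$, so
$$\frac{\partial \ln f(X)}{\partial \theta} = \frac{X}{\theta} - \frac{1-X}{1-\theta} = \frac{X - \theta}{\theta(1-\theta)}, \qquad E\left[\left(\frac{\partial \ln f(X)}{\partial \theta}\right)^2\right] = \frac{\operatorname{var}(X)}{[\theta(1-\theta)]^2} = \frac{1}{\theta(1-\theta)}.$$
The Cramér-Rao lower bound is therefore $\theta(1-\theta)/n$, which is exactly $\operatorname{var}(\bar{X})$; hence $\hat{\theta} = \bar{X}$ is a minimum variance unbiased estimator of $\theta$.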
Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of $\theta$. If $\operatorname{var}(\hat{\theta}_1) < \operatorname{var}(\hat{\theta}_2)$, we say that $\hat{\theta}_1$ is relatively more efficient. The efficiency of $\hat{\theta}_2$ relative to $\hat{\theta}_1$ is defined as the ratio
$$\frac{\operatorname{var}(\hat{\theta}_1)}{\operatorname{var}(\hat{\theta}_2)}.$$
Example 10.7 on page 286: for random samples from a normal population, the efficiency of the sample median relative to the sample mean is about 64%. For large samples, the sample mean requires only 64% as many observations as the sample median to estimate the population mean $\mu$ with the same reliability.
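The 64% figure can be checked by simulation. The sketch below draws repeated normal samples (arbitrary $\mu$, $\sigma$, and $n$) and compares the sampling variances of the mean and the median; the ratio should come out near $2/\pi \approx 0.64$.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 0.0, 1.0, 101, 50_000  # arbitrary demo values

samples = rng.normal(mu, sigma, size=(reps, n))
var_mean = samples.mean(axis=1).var()
var_median = np.median(samples, axis=1).var()

# Efficiency of the median relative to the mean: var(mean) / var(median).
print(f"estimated relative efficiency  = {var_mean / var_median:.3f}")
print(f"theoretical large-sample value = {2 / np.pi:.3f}")
```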
Note that relative efficiency is based on unbiased estimators. For biased estimators, we compare them by using the mean square error instead of the variance. The mean square error is defined as
$$\operatorname{MSE} = E[(\hat{\theta} - \theta)^2] = \operatorname{var}(\hat{\theta}) + [\operatorname{bias}(\hat{\theta})]^2,$$
where $\operatorname{bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$; for an unbiased estimator the MSE reduces to the variance.
Examples
Problem 10.7 page 287
Sufficiency
Consider a random sample $X_1, X_2, \ldots, X_n$ from a Bernoulli population with parameter $\theta$, and let $Y = X_1 + X_2 + \cdots + X_n$, which has the binomial distribution with parameters $n$ and $\theta$. Then
$$P(X_1 = x_1, \ldots, X_n = x_n \mid Y = y) = \frac{\theta^{y}(1-\theta)^{n-y}}{\dbinom{n}{y}\theta^{y}(1-\theta)^{n-y}} = \frac{1}{\dbinom{n}{y}},$$
which is independent of $\theta$. Thus, once $Y$ is known, no other information from $X_1, X_2, \ldots, X_n$ will shed additional light on the possible value of $\theta$. So $Y$ contains all the information about $\theta$; therefore, $Y$ is sufficient for $\theta$. If $P(X_1 = x_1, \ldots, X_n = x_n \mid Y = y)$ depended on $\theta$, some values of $X_1, X_2, \ldots, X_n$ would be more probable for some values of $\theta$ than for others.
Definition: The statistic $\hat{\theta}$ is a sufficient estimator of $\theta$ if and only if for each value of $\hat{\theta}$ the conditional distribution of $X_1, X_2, \ldots, X_n$ given $\hat{\Theta} = \hat{\theta}$ is independent of $\theta$.
The above definition may not be easy to work with when verifying sufficiency. We now state the following factorization theorem.
Factorization Theorem: The statistic $\hat{\theta}$ is a sufficient estimator of the parameter $\theta$ if and only if the joint density or probability distribution of the random sample $X_1, X_2, \ldots, X_n$ can be factored so that
$$f(x_1, x_2, \ldots, x_n; \theta) = g(\hat{\theta}, \theta) \cdot h(x_1, x_2, \ldots, x_n),$$
where $g(\hat{\theta}, \theta)$ depends on $x_1, x_2, \ldots, x_n$ only through $\hat{\theta}$, and $h(x_1, x_2, \ldots, x_n)$ does not depend on $\theta$.
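To see the theorem in action on the Bernoulli example above, the joint probability distribution factors as
$$f(x_1, \ldots, x_n; \theta) = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} = \underbrace{\theta^{y}(1-\theta)^{n-y}}_{g(y,\,\theta)} \cdot \underbrace{1}_{h(x_1, \ldots, x_n)}, \qquad y = \sum_{i=1}^{n} x_i,$$
which depends on the observations only through $y$, confirming that $Y = \sum X_i$ is sufficient for $\theta$.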
Consistency
Definition: The statistic $\hat{\theta}$ is a consistent estimator of the parameter $\theta$ if and only if for each $\varepsilon > 0$,
$$\lim_{n \to \infty} P(|\hat{\theta} - \theta| < \varepsilon) = 1.$$
A useful sufficient condition (via Chebyshev's inequality): if $\hat{\theta}$ is unbiased for $\theta$ and $\operatorname{var}(\hat{\theta}) \to 0$ as $n \to \infty$, then $\hat{\theta}$ is a consistent estimator of $\theta$.
Example: Show that the estimator in Problem 10.23 (on page 288) is consistent.
Solution:
Example: Consider the density function $f(x) = 1/\theta$, $0 < x < \theta$. Suppose we use $Y_n$, the largest order statistic, to estimate $\theta$. Check whether this estimator is (a) unbiased and (b) consistent, and find the efficiency of the sample mean relative to $Y_n$.
Solution:
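Before working this analytically, a quick simulation (with an arbitrary true $\theta = 4$) suggests what to expect: $Y_n$ underestimates $\theta$ on average by the factor $n/(n+1)$, consistent with $E(Y_n) = n\theta/(n+1)$, and the bias vanishes as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, reps = 4.0, 10_000  # arbitrary true theta for the demo

for n in (5, 50, 500):
    samples = rng.uniform(0.0, theta, size=(reps, n))
    y_n = samples.max(axis=1)  # largest order statistic
    print(f"n = {n:4d}: mean(Y_n) = {y_n.mean():.4f}, "
          f"theory n*theta/(n+1) = {n * theta / (n + 1):.4f}")
```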
The Method of Moments
If $X_1, X_2, \ldots, X_n$ is a random sample, the $k$th sample moment is
$$m_k' = \frac{1}{n}\sum_{i=1}^{n} x_i^k.$$
The method of moments leads to the equations
$$m_k' = \mu_k', \quad k = 1, 2, \ldots, p$$
for the $p$ parameters of the population. Note that one may instead use
$$m_k = \mu_k, \quad k = 1, 2, \ldots, p,$$
where $m_k$ is the $k$th central sample moment and $\mu_k$ is the $k$th central population moment.
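For instance, for a Poisson population with parameter $\lambda$ we have $\mu_1' = \lambda$, so the single equation $m_1' = \mu_1'$ gives the moment estimate $\hat{\lambda} = \bar{x}$.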
Example: Find the moment estimates for the binomial parameters θ and n.
Solution:
Maximum Likelihood Estimation
If $x_1, x_2, \ldots, x_n$ are the values of a random sample from a population with parameter $\theta$, the likelihood function is
$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta),$$
and the maximum likelihood estimate (mle) of $\theta$ is the value of $\theta$ that maximizes $L(\theta)$.
Advantages: (i) the mle method yields sufficient estimators whenever they exist, and (ii) mles are asymptotically minimum variance unbiased estimators.
Note
a. The value of $\theta$ that maximizes $L(\theta)$ is the same value that maximizes $\ln L(\theta)$, and it may be easier to work with $\ln L(\theta)$.
b. It is not always the case that the method of differentiation can be used to obtain the mle. In particular, when the domain of the function depends on the parameter, we generally cannot use the method of differentiation.
The regular case is when the method of differentiation works; the non-regular case is when it does not (see the worked examples below).
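As illustrations of the two cases, consider the Bernoulli and uniform populations used earlier in these notes. For a random sample from a Bernoulli population, $L(\theta) = \theta^{y}(1-\theta)^{n-y}$ with $y = \sum x_i$, so $\ln L(\theta) = y\ln\theta + (n-y)\ln(1-\theta)$, and setting the derivative to zero,
$$\frac{d \ln L(\theta)}{d\theta} = \frac{y}{\theta} - \frac{n-y}{1-\theta} = 0 \quad \Rightarrow \quad \hat{\theta} = \frac{y}{n},$$
a regular case. For a random sample from $f(x) = 1/\theta$, $0 < x < \theta$, we have $L(\theta) = \theta^{-n}$ provided $\theta \geq \max_i x_i$; this is decreasing in $\theta$, so it is maximized at the boundary $\hat{\theta} = y_n$, the largest observation. This is a non-regular case, where setting the derivative to zero yields no solution.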
Examples
Problem 10.51 page 301
Bayesian Estimation
Bayesian point estimation amounts to finding a decision function $\delta(x)$ that predicts the value of $\Theta$ when the value of $x$ and the conditional (posterior) density $\varphi(\theta \mid x)$ are known. In general, the mean or the median is used to predict the value of a random variable. In Bayesian statistics, the choice of the decision function depends on a loss function $L(\Theta, \delta(x))$. One method is to select the decision function $\delta(x)$ for which the conditional expectation of the loss is minimized.
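For example, under squared-error loss $L(\Theta, \delta(x)) = [\Theta - \delta(x)]^2$, the conditional expected loss
$$E\{[\Theta - \delta(x)]^2 \mid x\} = \int [\theta - \delta(x)]^2 \, \varphi(\theta \mid x)\, d\theta$$
is minimized by taking $\delta(x) = E(\Theta \mid x)$, the mean of the posterior density; under absolute-error loss $L(\Theta, \delta(x)) = |\Theta - \delta(x)|$, it is minimized by the posterior median. This is why the mean or the median is the usual Bayesian point estimate.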