Chapter 2
Estimation
Objectives
After completing this chapter you will be able to:
• Compute point and confidence interval estimates of the population
mean and population proportion
• Explain the properties of the best estimator
• Determine the sample size necessary for estimating a population parameter
Basic Concepts
• Estimation is the process of approximating unknown population
parameters from sample statistics.
• Inference is the process of making interpretations or conclusions
from sample data for the totality of the population.
In statistics, inference can be made in two ways:
i. Statistical estimation
ii. Statistical hypothesis testing.
[Figure: diagram with the labels Population, Data, Analyzed, Inference]
Data analysis is the process of extracting relevant information from the summarized
data.
Statistical Estimation
• It is a way of making inferences about a population parameter when the investigator
has no prior notion about the value or characteristics of that parameter.
• There are two ways of estimation:
1. Point estimation
• It is a procedure that results in a single value as an estimate for a parameter.
2. Interval estimation
• It is a procedure that results in an interval of values as an estimate for a
parameter.
• It deals with identifying the upper and lower limits of a parameter.
Definitions
• Estimator: a sample statistic that is used to estimate a population parameter.
Properties of Estimators
i. Unbiased Estimator: an estimator whose expected value (mean) is equal to the
value of the population parameter being estimated.
ii. Consistent Estimator: an estimator whose value gets closer to the value of the
parameter as the sample size increases.
iii. Relatively Efficient Estimator: among the estimators of a parameter being
compared, the one with the smallest variance.
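The first two properties can be illustrated with a small simulation. This is only a sketch: the population mean and standard deviation (`MU`, `SIGMA`), the sample sizes, and the number of repetitions are hypothetical values chosen for illustration, and the sample mean is used as the estimator.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0  # hypothetical population mean and standard deviation


def sample_means(n, repetitions=500):
    """Draw `repetitions` samples of size n and return the mean of each sample."""
    return [
        statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(repetitions)
    ]


means_small = sample_means(n=10)
means_large = sample_means(n=400)

# Unbiasedness: the average of the sample means sits near MU for either n.
avg_small = statistics.fmean(means_small)
avg_large = statistics.fmean(means_large)

# Consistency: the spread of the sample means shrinks as n grows.
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print(f"n = 10 : average of means = {avg_small:.2f}, spread = {spread_small:.2f}")
print(f"n = 400: average of means = {avg_large:.2f}, spread = {spread_large:.2f}")
```

Both averages should land close to `MU`, while the spread for the larger sample size should be clearly smaller.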
• Estimate: a particular value that an estimator assumes for a given
sample.
• Point Estimate: single value used to estimate a parameter.
• Interval Estimate: range of values used to estimate a parameter
i.e. the parameter is specified as being between two values.
For example, an interval estimate for the average age of all students
might be 21.9 to 22.7 years.
A confidence interval is a specific interval estimate of a parameter, determined from
sample data for a specific level of confidence.
Confidence level is the probability that the interval estimate will contain the
parameter, that is, the long-run proportion of such intervals that contain it.
Degrees of Freedom: the number of data values that are free to vary after a
sample statistic has been computed. For a single sample of size n it is calculated as
df = n − 1
Point estimation of the population mean: μ
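As a minimal sketch, assuming the sample mean is used as the point estimator of μ (the usual choice), the estimate is one number computed from the sample. The ages below are hypothetical illustration values, not data from this chapter.

```python
import statistics

# Hypothetical sample of student ages (illustrative values only).
ages = [21, 23, 22, 24, 20, 22, 23, 21]

# The sample mean serves as the point estimate of the population mean.
x_bar = statistics.mean(ages)
print(f"point estimate of mu: {x_bar}")
```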
Confidence interval estimation of the population mean: µ
• The statistic is "close to" the parameter. That leads to the obvious question:
what is "close"? Or, how confident can we be that the value of the statistic
falls within a certain "distance" of the parameter?
A confidence interval answers this question: it states a range around the statistic,
together with the level of confidence that the parameter lies within that range.
There are different cases to be considered when constructing
confidence intervals:
Case 1: If the sample size is large, or if the population is normal with known variance
• As a result of the Central Limit Theorem, the z distribution can be used for sample
means when the sample size is large, regardless of the shape of the population
distribution, or for smaller sample sizes if the population is normally distributed.
The value of the population mean, μ, lies somewhere within this range. Rewriting this
expression yields the confidence interval for the population mean:
$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
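The expression that is rewritten is not shown in this copy. Assuming it is the usual probability statement for the standardized sample mean, the rearrangement would run roughly as follows, isolating μ in the middle of the inequality:

```latex
P\left(-Z_{\alpha/2} \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right) = 1-\alpha
\;\Longrightarrow\;
P\left(\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1-\alpha
```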
Where
α = the probability that the parameter lies outside the interval, or the proportion of incorrect
statements (α = 1 − C, where C is the confidence level)
$Z_{\alpha/2}$ = the standard normal value to the right of which a probability of α/2 lies, i.e. $P(Z > Z_{\alpha/2}) = \alpha/2$
$-Z_{\alpha/2}$ = the standard normal value to the left of which a probability of α/2 lies, i.e. $P(Z < -Z_{\alpha/2}) = \alpha/2$
If σ² is not known, we estimate it by its point estimator S², and the confidence interval for the
population mean becomes:
$\bar{X} - Z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{s}{\sqrt{n}}$
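From the above expression, $\mu = \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = \bar{X} \pm \varepsilon$, where ε is the measure of error:
$\varepsilon = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
- For the interval estimator to be good, the error should be small. The error can be
made small by making n large, by having small variability (σ), or by taking Z small
(that is, accepting a lower confidence level).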
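The effect of each factor on the error can be checked numerically. The sketch below assumes σ = 4.2 (borrowed from the worked example in this chapter) and a few illustrative sample sizes; `margin_of_error` is a helper name introduced here for illustration.

```python
import math


def margin_of_error(z, sigma, n):
    """Measure of error: e = z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)


SIGMA = 4.2  # assumed population standard deviation
Z_95 = 1.96  # critical value for 95% confidence
Z_90 = 1.645  # critical value for 90% confidence

e_25 = margin_of_error(Z_95, SIGMA, 25)
e_100 = margin_of_error(Z_95, SIGMA, 100)
e_25_at_90 = margin_of_error(Z_90, SIGMA, 25)

print(f"n = 25,  95%: e = {e_25:.4f}")
print(f"n = 100, 95%: e = {e_100:.4f}")       # four times the sample halves the error
print(f"n = 25,  90%: e = {e_25_at_90:.4f}")  # smaller Z, smaller error
```

Because n enters through its square root, the sample size has to be quadrupled to cut the error in half.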
Here are the Z values corresponding to the most commonly used confidence levels:
100(1 − α)%    α       α/2      Z_{α/2}
90%            0.10    0.05     1.645
95%            0.05    0.025    1.96
99%            0.01    0.005    2.58
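Critical values for common confidence levels can also be computed instead of read from a table. A minimal sketch using the standard library's `statistics.NormalDist` (Python 3.8 or later); `z_critical` is a helper name introduced here for illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return Z_{alpha/2}, the value with area alpha/2 to its right under N(0, 1)."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    print(f"{confidence:.0%}: alpha = {alpha:.2f}, alpha/2 = {alpha / 2:.3f}, "
          f"Z = {z_critical(confidence):.3f}")
```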
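When σ is unknown and is replaced by the sample standard deviation S, for a small sample
from a normal population, the statistic
$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t distribution with n − 1 degrees of freedom.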
Characteristics of t-distribution
1. It is symmetric about its mean (0) and ranges from −∞ to ∞.
2. It is bell-shaped (unimodal) and has approximately the same appearance as the
standard normal distribution (Z distribution).
3. It depends on a parameter called the degrees of freedom (ν) of the distribution, where ν = n − 1.
Examples
1. From a normal sample of size 25, a mean of 32 was found. Given that the
population standard deviation is 4.2, find
a. a 95% confidence interval for the population mean.
2. A drug company is testing a new drug which is supposed to reduce blood pressure.
From the six people who are used as subjects, it is found that the average drop in
blood pressure is 2.28 points, with a standard deviation of 0.95 points. What is the
95% confidence interval for the mean change in pressure?
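Solution
1. a. $\bar{X} = 32$, σ = 4.2, n = 25, 1 − α = 0.95 ⇒ α = 0.05, α/2 = 0.025
⇒ $Z_{\alpha/2} = 1.96$ from the table.
⇒ The required interval is $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 32 \pm 1.96\left(\frac{4.2}{\sqrt{25}}\right) = 32 \pm 1.65 = (30.35, 33.65)$
2. $\bar{X} = 2.28$, S = 0.95, n = 6, so ν = n − 1 = 5 and $t_{0.025,\,5} = 2.571$ from the t table.
⇒ The required interval is $\bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}} = 2.28 \pm 2.571\left(\frac{0.95}{\sqrt{6}}\right) = 2.28 \pm 1.00 = (1.28, 3.28)$
We are 95% confident that the mean decrease in blood pressure is between 1.28 and 3.28
points.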
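The interval calculations for the two examples can be sketched in Python. This assumes a z critical value for Example 1 (σ known) and a t critical value for Example 2 (small sample, σ unknown). The t value 2.571 for 5 degrees of freedom is taken from a standard t table, since the Python standard library has no t distribution; `mean_interval` is a helper name introduced here for illustration.

```python
import math


def mean_interval(x_bar, sd, n, critical):
    """Return (lower, upper) for x_bar ± critical * sd / sqrt(n)."""
    margin = critical * sd / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Example 1(a): sigma is known, so the critical value is Z_{0.025} = 1.96.
z_low, z_high = mean_interval(x_bar=32, sd=4.2, n=25, critical=1.96)

# Example 2: n = 6 and sigma unknown, so a t table value with 5 df is used.
T_0025_5DF = 2.571  # t_{0.025} with 5 degrees of freedom (table value)
t_low, t_high = mean_interval(x_bar=2.28, sd=0.95, n=6, critical=T_0025_5DF)

print(f"Example 1(a): ({z_low:.2f}, {z_high:.2f})")
print(f"Example 2:    ({t_low:.2f}, {t_high:.2f})")
```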
Interval Estimation of the Population Proportion
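The sample proportion, $\hat{p}$, is an unbiased estimator of the population proportion P. If the
sample size is large, the sampling distribution of $\hat{p}$ is approximately normal, with
$Z = \frac{\hat{p} - P}{\sigma_{\hat{p}}} = \frac{\hat{p} - P}{\sqrt{Pq/n}}$, where q = 1 − P.
• However, P is unknown, so it is estimated by $\hat{p}$, and $\sigma_{\hat{p}}$ is replaced by
$S_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$. Z then becomes
$Z = \frac{\hat{p} - P}{\sqrt{\hat{p}\hat{q}/n}}$. Solving for P,
$P = \hat{p} - Z\sqrt{\hat{p}\hat{q}/n}$, and since Z can assume both positive and negative values,
$P = \hat{p} \pm Z\sqrt{\hat{p}\hat{q}/n}$. The value of Z is fixed by the confidence level:
$P = \hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = \hat{p} \pm Z_{\alpha/2} S_{\hat{p}}$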
Solution:
n = 87, $\hat{p}$ = 0.39, $\hat{q}$ = 0.61, C = 0.95, α = 1 − C = 1 − 0.95 = 0.05, α/2 =
0.05/2 = 0.025, and $Z_{\alpha/2} = Z_{0.025} = 1.96$
$S_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}} = \sqrt{\frac{0.61 \times 0.39}{87}} = 0.0523$
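With the standard error in hand, the interval itself follows by adding and subtracting $Z_{\alpha/2} S_{\hat{p}}$. A minimal sketch using the values from this solution, assuming a 95% confidence interval is what the exercise asks for; `proportion_interval` is a helper name introduced here for illustration.

```python
import math


def proportion_interval(p_hat, n, z):
    """Return (standard error, lower, upper) for p_hat ± z * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    se = math.sqrt(p_hat * q_hat / n)
    return se, p_hat - z * se, p_hat + z * se


se, lower, upper = proportion_interval(p_hat=0.39, n=87, z=1.96)
print(f"S_p = {se:.4f}")
print(f"95% interval for P: ({lower:.4f}, {upper:.4f})")
```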
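Sample size for estimating population mean, μ
• The measure of error for the mean is $e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$. Squaring both sides,
$e^2 = Z_{\alpha/2}^2\frac{\sigma^2}{n}$. Solving for n,
$n = \frac{Z_{\alpha/2}^2\,\sigma^2}{e^2}$
$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2$ if σ is known
$n_\mu = \left(\frac{Z_{\alpha/2}\,s}{e}\right)^2$ if σ is not known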
Examples
1. A gasoline service station shows a standard deviation of Birr 6.25 for the charges
made by its credit card customers. Assume that the station's management would
like to estimate the population mean gasoline bill for its credit card customers to
within ± Birr 1.00. For a 95% confidence level, how large a sample would be
necessary?
2. The National Travel and Tour Organization (NTO) would like to estimate the
mean amount of money spent by a tourist to within Birr 100 with 95%
confidence. If the amount of money spent by a tourist is considered to be normally
distributed with a standard deviation of Birr 200, what sample size would be
necessary for the NTO to meet its objective in estimating this mean
amount?
Solution
1. e = Birr 1.00, σ = Birr 6.25, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$
$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2$
$n_\mu = \left(\frac{1.96 \times 6.25}{1}\right)^2 = 150.06 \approx 151$
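The same formula can be applied to both examples in a few lines. This is a sketch: `sample_size_mean` is a helper name introduced here, and the result is always rounded up because a fractional observation cannot be sampled. The second call applies the formula to Example 2 with σ = 200 and e = 100.

```python
import math


def sample_size_mean(z, sigma, e):
    """n = (z * sigma / e)^2, rounded up to the next whole observation."""
    n = (z * sigma / e) ** 2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(n, 6))


n_example_1 = sample_size_mean(z=1.96, sigma=6.25, e=1.00)
n_example_2 = sample_size_mean(z=1.96, sigma=200, e=100)

print(f"Example 1: n = {n_example_1}")
print(f"Example 2: n = {n_example_2}")
```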
Sample size for estimating population proportion, p
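• The confidence interval for P is $P = \hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$. The expression $Z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ is
the measure of error, e:
$e = Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$; squaring both sides,
$e^2 = Z_{\alpha/2}^2\frac{\hat{p}\hat{q}}{n}$; solving for n,
$n_p = \frac{Z_{\alpha/2}^2\,\hat{p}\hat{q}}{e^2}$
• If $\hat{p}$ and $\hat{q}$ are not given, we use p and q, and $n_p$ becomes $n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq$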
Examples
1. Suppose that a production facility purchases a particular component part in large
lots from a supplier. The production manager wants to estimate the proportion of
defective parts received from this supplier. She believes that the proportion of
defects is no more than 0.2 and wants to be within 0.02 of the true proportion of
defects with a 90% level of confidence. How large a sample should she take?
2. What is the largest sample size that would be needed in estimating a population
proportion to within ± 0.02, with a confidence coefficient of 0.95?
Solution
1. e = 0.02, p = 0.2, q = 0.8, C = 0.90 and $Z_{\alpha/2} = Z_{0.05} = 1.64$
$n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq$
$n_p = \left(\frac{1.64}{0.02}\right)^2 (0.2)(0.8) = 1075.84 \approx 1076$
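Both proportion examples can be worked with one small function. This is a sketch: `sample_size_proportion` is a helper name introduced here. For Example 2 it assumes p = 0.5, because the product pq is largest there and so gives the largest sample size that could be required.

```python
import math


def sample_size_proportion(z, p, e):
    """n = (z / e)^2 * p * q with q = 1 - p, rounded up to a whole observation."""
    q = 1 - p
    n = (z / e) ** 2 * p * q
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(n, 6))


# Example 1: planning value p = 0.2, 90% confidence (Z = 1.64 as used in the text).
n_example_1 = sample_size_proportion(z=1.64, p=0.2, e=0.02)

# Example 2: worst case p = 0.5 (assumption), 95% confidence.
n_example_2 = sample_size_proportion(z=1.96, p=0.5, e=0.02)

print(f"Example 1: n = {n_example_1}")
print(f"Example 2: n = {n_example_2}")
```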