Chapter Two

Estimation
Objectives
After completing this chapter you will be able to:
• Compute point and confidence interval estimates of the population mean and the population proportion
• Explain the properties of a good estimator
• Determine the sample size necessary for estimating a population parameter

Statistical Estimation
Estimation is the process of approximating (estimating) unknown population parameters from sample statistics.
Inference is the process of making interpretations or drawing conclusions about the whole population from sample data.
• In statistics, inference can be made in two ways:
i. Statistical estimation
ii. Statistical hypothesis testing
[Figure: inference diagram linking Population, Sample, Numerical data, Analyzed data, and Inference]

Data analysis is the process of extracting relevant information from the summarized
data.
Statistical Estimation
• It is a way of making inferences about a population parameter when the investigator has no prior notion about the value or characteristics of that parameter.
• There are two types of estimation.
1. Point estimation
• It is a procedure that results in a single value as an estimate for a parameter.
2. Interval estimation
• It is a procedure that results in an interval of values as an estimate for a parameter.
• It deals with identifying the upper and lower limits of a parameter. The limits themselves are random variables.
Definition of terms
• Estimator: a sample statistic which is used to estimate a population parameter.
A good estimator should be unbiased, consistent, and relatively efficient.
i. Unbiased estimator: an estimator whose expected value is the value of the parameter being estimated.
ii. Consistent estimator: an estimator whose value gets closer to the value of the parameter as the sample size increases.
iii. Relatively efficient estimator: the estimator with the smallest variance among the statistics that can be used to estimate the parameter.

• Estimate: a particular value that an estimator takes for a given sample.


• Point Estimate: A single value used to estimate a parameter.
• Interval Estimate: A range of values used to estimate a parameter i.e.
the parameter is specified as being between two values.
For example, an interval estimate for the average age of all students might be
21.9 to 22.7 years.

• Confidence level: the probability that the interval estimate contains the parameter, assuming the sampling and estimation process is repeated a large number of times.

• A confidence interval is a specific interval estimate of a parameter, determined from sample data at a specific level of confidence. It communicates how accurate our estimate is likely to be.

• Three commonly used confidence levels are 90%, 95%, and 99%. For example, a 95% confidence interval means we are 95% confident that the interval contains the true value of the parameter.

• Degrees of freedom: the number of data values that are free to vary after a sample statistic has been computed. It can be calculated as df = n − 1.
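The repeated-sampling reading of a confidence level can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the population mean 22.3, standard deviation 2.0, and the function name coverage_of_z_interval are hypothetical, and it uses the known-variance interval $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ that is developed later in this chapter.

```python
import random
from statistics import NormalDist, mean


def coverage_of_z_interval(mu, sigma, n, level=0.95, trials=2000, seed=0):
    """Return the share of simulated samples whose interval contains mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    margin = z * sigma / n ** 0.5                  # half-width of each interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials


# Hypothetical population: mean 22.3, standard deviation 2.0, samples of 25.
coverage = coverage_of_z_interval(mu=22.3, sigma=2.0, n=25)
print(f"share of intervals containing mu: {coverage:.3f}")
```

With many repetitions the printed share should sit close to the chosen level, which is what "95% confident" refers to.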
Point and Interval estimation of the population mean: µ
Point Estimation
• Point estimation is a statistical procedure in which a single value is used to estimate a population parameter.

• A point estimate is a single number that is used as an estimate of a population parameter, and is derived from a random sample taken from the population.
• A point estimator is the formula used to compute the point estimate.
• Some of the most important point estimators are given below.
Parameter (population value)      Estimator (sample statistic)
Population mean, $\mu$            $\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$
Population variance, $\sigma^2$   $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Population S.D., $\sigma$         $S = \sqrt{S^2}$
Population proportion, $P$        $\bar{p} = \frac{x}{n}$
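The estimators in the table can be computed directly with Python's standard statistics module. This is a minimal sketch: the sample of ages and the count of 12 successes out of 40 are made-up illustration values, and point_estimates is a hypothetical helper name.

```python
from statistics import mean, variance, stdev


def point_estimates(sample):
    """Return (x_bar, s2, s) for a list of numeric observations."""
    x_bar = mean(sample)    # sum of x_i divided by n
    s2 = variance(sample)   # sample variance, divides by n - 1
    s = stdev(sample)       # square root of the sample variance
    return x_bar, s2, s


ages = [21, 23, 22, 24, 20]            # hypothetical sample of student ages
x_bar, s2, s = point_estimates(ages)

successes, n = 12, 40                  # hypothetical count of "successes"
p_bar = successes / n                  # sample proportion x / n

print(x_bar, s2, round(s, 4), p_bar)
```

Note that statistics.variance and statistics.stdev use the n − 1 divisor, matching the table; statistics.pvariance would divide by n instead.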
Confidence Interval Estimation

• Although $\bar{X}$ possesses nearly all the qualities of a good estimator, because of sampling error a sample statistic will generally not be equal to the population parameter. Instead, the parameter is estimated by an interval of values around the statistic.

• The statistic is "close to" the parameter. That leads to the obvious question: what is "close"? Or, how confident can we be that the value of the statistic falls within a certain "distance" of the parameter?

A confidence interval answers this question: it gives a range around the statistic together with the confidence that the range contains the parameter.
There are different cases to be considered to construct confidence intervals.
Case 1: If sample size is large or if the population is normal with known variance

• Recall the Central Limit Theorem, which applies to the sampling distribution of the mean of a sample.

Consider samples of size n drawn, with replacement and with order important, from a population whose mean is $\mu$ and standard deviation is $\sigma$.

The sampling distribution of $\bar{X}$ will have mean $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, and it approaches a normal distribution as n gets large. This allows us to use the normal distribution curve for computing confidence intervals.
⇒ $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has a normal distribution with mean = 0 and variance = 1

⇒ $\mu = \bar{X} \pm Z\,\sigma/\sqrt{n} = \bar{X} \pm \varepsilon$, where $\varepsilon$ is a measure of error

⇒ $\varepsilon = Z\,\sigma/\sqrt{n}$

- For the interval estimator to be good, the error should be small. How can it be made small? The measure of error can be made small by:
• Making n large
• Having small variability
• Taking Z small

- To obtain the value of Z, we have to attach this to a theory of chance. That is, there is an area of size $1 - \alpha$ such that

$P(-Z_{\alpha/2} < Z < Z_{\alpha/2}) = 1 - \alpha$

where $\alpha$ is the probability that the parameter lies outside the interval, and $Z_{\alpha/2}$ stands for the standard normal value to the right of which $\alpha/2$ probability lies, i.e. $P(Z > Z_{\alpha/2}) = \alpha/2$.

⇒ $P\left(-Z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < Z_{\alpha/2}\right) = 1 - \alpha$

⇒ $P\left(\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$

⇒ $\left(\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}\right)$ is a $100(1 - \alpha)\%$ confidence interval for $\mu$.

If $\sigma^2$ is not known, we estimate it by its point estimator $S^2$:

⇒ $\left(\bar{X} - Z_{\alpha/2}\,S/\sqrt{n},\ \bar{X} + Z_{\alpha/2}\,S/\sqrt{n}\right)$ is a $100(1 - \alpha)\%$ confidence interval for $\mu$.
Here are the Z values corresponding to the most commonly used confidence levels:

$100(1 - \alpha)\%$    $\alpha$    $\alpha/2$    $Z_{\alpha/2}$
90                     0.10        0.05          1.645
95                     0.05        0.025         1.96
99                     0.01        0.005         2.58
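The tabulated Z values can be reproduced from the standard normal distribution. The sketch below uses statistics.NormalDist from Python's standard library; z_critical is a hypothetical helper name, not something defined in these slides.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return Z_{alpha/2}: the value with alpha/2 probability to its right."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    print(f"{confidence:.0%}  alpha={alpha:.2f}  "
          f"alpha/2={alpha / 2:.3f}  z={z_critical(confidence):.3f}")
```

To three decimals the 99% value is 2.576, which the table rounds to 2.58.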

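Case 2: If the sample size is small and the population variance, $\sigma^2$, is not known.

$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t distribution with $n - 1$ degrees of freedom.

⇒ $\left(\bar{X} - t_{\alpha/2}\,S/\sqrt{n},\ \bar{X} + t_{\alpha/2}\,S/\sqrt{n}\right)$ is a $100(1 - \alpha)\%$ confidence interval for $\mu$.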

The unit of measurement of the confidence interval is the standard error. This is just the
standard deviation of the sampling distribution of the statistic.
Examples

1. A sample of size 25 drawn from a normal population has a mean of 32. Given that the population standard deviation is 4.2, find:

a) A 95% confidence interval for the population mean.

b) A 99% confidence interval for the population mean.

2. A drug company is testing a new drug which is supposed to reduce blood pressure. From the six people who are used as subjects, it is found that the average drop in blood pressure is 2.28 points, with a standard deviation of 0.95 points. What is the 95% confidence interval for the mean change in pressure?
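Solution

1. a. $\bar{X} = 32$, $\sigma = 4.2$, $n = 25$, $1 - \alpha = 0.95 \Rightarrow \alpha = 0.05$, $\alpha/2 = 0.025$

⇒ $Z_{\alpha/2} = 1.96$ from the table.

⇒ The required interval will be $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$

$= 32 \pm 1.96 \times 4.2/\sqrt{25} = 32 \pm 1.65 = (30.35, 33.65)$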
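The arithmetic for Example 1 can be checked with a few lines of Python. This is a sketch, with z_interval as a hypothetical helper name; the critical values 1.96 and 2.58 are taken from the Z table in this chapter, and part (b) simply applies the same formula with the 99% value.

```python
from math import sqrt


def z_interval(x_bar, sigma, n, z):
    """Return (lower, upper) for x_bar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Example 1: n = 25, sample mean 32, population standard deviation 4.2.
low95, high95 = z_interval(32, 4.2, 25, z=1.96)   # part (a)
low99, high99 = z_interval(32, 4.2, 25, z=2.58)   # part (b)

print(round(low95, 2), round(high95, 2))   # 30.35 33.65
print(round(low99, 2), round(high99, 2))   # 29.83 34.17
```

The 99% interval is wider than the 95% interval because a higher confidence level needs a larger Z.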
2. Solution

$\bar{X} = 2.28$, $S = 0.95$, $n = 6$, $1 - \alpha = 0.95 \Rightarrow \alpha = 0.05$, $\alpha/2 = 0.025$

⇒ $t_{\alpha/2} = 2.571$ with df = 5 from the table.

⇒ The required interval will be $\bar{X} \pm t_{\alpha/2}\,S/\sqrt{n}$

$= 2.28 \pm 2.571 \times 0.95/\sqrt{6}$

$= 2.28 \pm 0.997 = (1.28, 3.28)$

We are 95% confident that the mean decrease in blood pressure is between 1.28 and 3.28 points.
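The small-sample case follows the same pattern with a t critical value. In this sketch t_interval is a hypothetical helper name, and the value 2.571 (df = 5) is passed in from the t table as in the worked solution, since Python's standard library does not provide t quantiles.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Return (margin, (lower, upper)) for x_bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return margin, (x_bar - margin, x_bar + margin)


# Example 2: six subjects, mean drop 2.28, sample standard deviation 0.95.
margin, (low, high) = t_interval(2.28, 0.95, 6, t_crit=2.571)

print(round(margin, 3))                 # 0.997
print(round(low, 2), round(high, 2))    # 1.28 3.28
```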
Interval Estimation of the Population Proportion
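The sample proportion, $\bar{p}$, is an unbiased estimator of the population proportion $P$, and if the sample size is large, the sampling distribution of $\bar{p}$ is approximately normal with

$Z = \frac{\bar{p} - P}{\sigma_{\bar{p}}} = \frac{\bar{p} - P}{\sqrt{PQ/n}}$, where $Q = 1 - P$.

• However, $P$ is unknown, so it is estimated by $\bar{p}$, and $\sigma_{\bar{p}}$ is replaced by $S_{\bar{p}} = \sqrt{\bar{p}\bar{q}/n}$ with $\bar{q} = 1 - \bar{p}$. Z becomes

$Z = \frac{\bar{p} - P}{\sqrt{\bar{p}\bar{q}/n}}$. Solving for $P$,

$P = \bar{p} - Z\sqrt{\bar{p}\bar{q}/n}$, and since Z can assume both positive and negative values,

$P = \bar{p} \pm Z\sqrt{\bar{p}\bar{q}/n}$. Z is chosen according to the confidence level, so

$P = \bar{p} \pm Z_{\alpha/2}\sqrt{\bar{p}\bar{q}/n} = \bar{p} \pm Z_{\alpha/2}\,S_{\bar{p}}$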
Example
1. Recently, a study of 87 randomly selected companies with telemarketing operations was completed. The study revealed that 39% of the sampled companies had used telemarketing to assist them in order processing. Estimate the population proportion of telemarketing companies that use their telemarketing operation to assist them in order processing, taking a 95% confidence level.
Solution:

$n = 87$, $\bar{p} = 0.39$, $\bar{q} = 0.61$, $C = 0.95$, $\alpha = 1 - C = 1 - 0.95 = 0.05$, $\alpha/2 = 0.05/2 = 0.025$ and $Z_{\alpha/2} = Z_{0.025} = 1.96$

$S_{\bar{p}} = \sqrt{\bar{p}\bar{q}/n} = \sqrt{0.39 \times 0.61/87} = 0.0523$

$P = \bar{p} \pm Z_{\alpha/2}\,S_{\bar{p}} = 0.39 \pm 1.96(0.0523)$

$= 0.39 \pm 0.1025$, or $0.2875 \le P \le 0.4925$

With 95% confidence, the proportion of companies which use telemarketing to assist order processing lies between 0.2875 and 0.4925.
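The telemarketing calculation can be reproduced as follows. This is a sketch of the large-sample interval $\bar{p} \pm Z_{\alpha/2}\sqrt{\bar{p}\bar{q}/n}$; proportion_interval is a hypothetical helper name, and 1.96 is the table value for 95% confidence.

```python
from math import sqrt


def proportion_interval(p_bar, n, z):
    """Return (standard_error, (lower, upper)) for a sample proportion."""
    q_bar = 1 - p_bar
    std_err = sqrt(p_bar * q_bar / n)   # S_p in the slides
    margin = z * std_err
    return std_err, (p_bar - margin, p_bar + margin)


# Telemarketing example: 87 companies, 39% used telemarketing.
std_err, (low, high) = proportion_interval(0.39, 87, z=1.96)

print(round(std_err, 4))                # 0.0523
print(round(low, 4), round(high, 4))    # 0.2875 0.4925
```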
Determination of Sample Size
• The reason for taking a sample from a population is that it would be too costly to gather data for the whole population. But collecting sample data also costs money, and the larger the sample, the higher the cost. To hold cost down, we want to use as small a sample as possible. On the other hand, we want the sample to be large enough to provide "good" estimates of population parameters. Consequently, the question is "How large should the sample be?"
• The answer depends on three factors:
1) How precise (narrow) do we want the confidence interval to be?
2) How confident do we want to be that the interval estimate is correct?
3) How variable is the population being sampled?
Sample size for estimating population mean, 𝝁
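• The confidence interval for $\mu$ is $\mu = \bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$.
• From the above expression, $Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$ is called the error of estimation (e). That is, the difference between $\bar{x}$ and $\mu$ which results from the sampling process. So

$e = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$. Squaring both sides,

$e^2 = Z_{\alpha/2}^2\,\frac{\sigma^2}{n}$. Solving for n,

$n = \frac{Z_{\alpha/2}^2\,\sigma^2}{e^2}$

$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2$ if $\sigma$ is known

$n_\mu = \left(\frac{Z_{\alpha/2}\,s}{e}\right)^2$ if $\sigma$ is not known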
Examples
1. A gasoline service station shows a standard deviation of Birr 6.25 for the charges made by its credit card customers. Assume that the station's management would like to estimate the population mean gasoline bill for its credit card customers to within ± Birr 1.00. For a 95% confidence level, how large a sample would be necessary?
2. The National Travel and Tour Organization (NTO) would like to estimate the mean amount of money spent by a tourist to within Birr 100 with 95% confidence. If the amount of money spent by a tourist is considered to be normally distributed with a standard deviation of Birr 200, what sample size would be necessary for the NTO to meet its objective in estimating this mean amount?
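Solution
1. e = Birr 1.00, σ = Birr 6.25, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$

$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2 = \left(\frac{1.96 \times 6.25}{1}\right)^2 = 150.06 \approx 151$

2. e = Birr 100, σ = Birr 200, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$

$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2 = \left(\frac{1.96 \times 200}{100}\right)^2 = 15.37 \approx 16$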
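The sample-size rule for the mean can be sketched in Python as below. The name sample_size_for_mean is hypothetical. The result is always rounded up, as in the solutions above, and exact fractions are used (an implementation choice, not something from the slides) so that floating-point noise cannot push a whole-number result up by one.

```python
from fractions import Fraction
from math import ceil


def sample_size_for_mean(z, sigma, e):
    """Return (exact_value, required_n) for n = (z * sigma / e) ** 2."""
    # Convert through str() so that 1.96 becomes exactly 196/100.
    z, sigma, e = (Fraction(str(v)) for v in (z, sigma, e))
    exact = (z * sigma / e) ** 2
    return float(exact), ceil(exact)    # always round up to a whole unit


exact1, n1 = sample_size_for_mean(z=1.96, sigma=6.25, e=1.00)   # gasoline bills
exact2, n2 = sample_size_for_mean(z=1.96, sigma=200, e=100)     # tourist spending

print(exact1, n1)             # 150.0625 151
print(round(exact2, 2), n2)   # 15.37 16
```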
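Sample size for estimating population proportion, p.

• The confidence interval for p is $P = \bar{p} \pm Z_{\alpha/2}\sqrt{\frac{\bar{p}\bar{q}}{n}}$. The expression $Z_{\alpha/2}\sqrt{\frac{\bar{p}\bar{q}}{n}}$ is called the error term (e). That is,

$e = Z_{\alpha/2}\sqrt{\frac{\bar{p}\bar{q}}{n}}$. Squaring both sides,

$e^2 = Z_{\alpha/2}^2\,\frac{\bar{p}\bar{q}}{n}$. Solving for n,

$n_p = \frac{Z_{\alpha/2}^2\,\bar{p}\bar{q}}{e^2}$

• If $\bar{p}$ and $\bar{q}$ are not given, we use assumed values p and q, and the formula becomes $n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq$.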
Examples
1. Suppose that a production facility purchases a particular component part in large lots from a supplier. The production manager wants to estimate the proportion of defective parts received from this supplier. She believes that the proportion of defects is no more than 0.2 and wants to be within 0.02 of the true proportion of defects with a 90% level of confidence. How large a sample should she take?

2. What is the largest sample size that would be needed in estimating a population proportion to within ± 0.02, with a confidence coefficient of 0.95?
Solution
1. e = 0.02, p = 0.2, q = 0.8, C = 0.90 and $Z_{\alpha/2} = Z_{0.05} = 1.64$

$n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq = \left(\frac{1.64}{0.02}\right)^2 \times 0.2 \times 0.8 = 1075.84 \approx 1076$

2. e = 0.02, C = 0.95 and $Z_{\alpha/2} = Z_{0.025} = 1.96$

The largest sample size would be obtained when p = 0.5. So,

$n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq = \left(\frac{1.96}{0.02}\right)^2 \times 0.5 \times 0.5 = 2401$

If p is unknown and there is no possibility of estimating it, use 0.5 as the value of p because it will generate the greatest possible sample size as compared with other values.
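Both proportion examples, and the claim that p = 0.5 gives the largest sample, can be checked with a short sketch. The name sample_size_for_proportion is hypothetical; exact fractions are an implementation choice to keep the round-up step free of floating-point noise; and the first call uses the rounded value 1.64 from the worked solution rather than the 1.645 listed in the Z table.

```python
from fractions import Fraction
from math import ceil


def sample_size_for_proportion(z, e, p=0.5):
    """Return (exact_value, required_n) for n = (z / e) ** 2 * p * (1 - p)."""
    z, e, p = (Fraction(str(v)) for v in (z, e, p))
    exact = (z / e) ** 2 * p * (1 - p)
    return float(exact), ceil(exact)    # always round up


exact1, n1 = sample_size_for_proportion(z=1.64, e=0.02, p=0.2)   # defects example
exact2, n2 = sample_size_for_proportion(z=1.96, e=0.02)          # worst case p = 0.5

print(exact1, n1)   # 1075.84 1076
print(exact2, n2)   # 2401.0 2401

# Required n for several assumed values of p at z = 1.96, e = 0.02.
sizes = {p: sample_size_for_proportion(1.96, 0.02, p)[1]
         for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
print(sizes)
```

The product p(1 − p) peaks at p = 0.5, so the dictionary printed last has its largest entry there.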
