Chapter 2

Chapter Two focuses on estimation in statistics, covering point and confidence interval estimates for population parameters. It explains the properties of estimators, the process of statistical estimation, and how to determine sample sizes for accurate estimates. Additionally, it discusses confidence intervals, their construction, and provides examples for practical understanding.

Uploaded by

rhama4790
Copyright
© © All Rights Reserved

Chapter Two

Estimation
Objectives
After completing this chapter, you will be able to:
• Compute point and confidence interval estimates of the population mean and population proportion
• Explain the properties of a good estimator
• Determine the sample size necessary for estimating a population parameter

Basic Concepts
• Estimation is the process of approximating unknown population parameters from sample statistics.
• Inference is the process of drawing conclusions about the whole population from sample data.
• In statistics, inference can be made in two ways:
i. Statistical estimation
ii. Statistical hypothesis testing

Cont…

[Figure: diagram relating Population, Sample, Numerical data, Analyzed data, and Inference]

Data analysis is the process of extracting relevant information from the summarized data.

Statistical Estimation
• It is a way of making inferences about a population parameter when the investigator has no prior notion about the value or characteristics of that parameter.
• There are two ways of estimation:
1. Point estimation
• A procedure that results in a single value as an estimate of a parameter.
2. Interval estimation
• A procedure that results in an interval of values as an estimate of a parameter.
• It deals with identifying the upper and lower limits of a parameter.

Definitions
• Estimator: a sample statistic that is used to estimate a population parameter.

Properties of Estimators
i. Unbiased estimator: an estimator whose expected value (mean) equals the value of the population parameter being estimated.
ii. Consistent estimator: an estimator whose value gets closer to the value of the parameter as the sample size increases.
iii. Relatively efficient estimator: among the estimators of the same parameter, the one with the smallest variance.

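The first two properties can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: the population (normal with mean 50 and standard deviation 10), the sample sizes, and the function name `simulate_sample_means` are all hypothetical choices, not part of the chapter.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]

# Hypothetical population: mean 50, standard deviation 10.
MU, SIGMA = 50.0, 10.0
for n in (5, 50, 500):
    means = simulate_sample_means(MU, SIGMA, n, trials=1000)
    # Unbiased: the average of the sample means stays near MU for every n.
    # Consistent: their spread shrinks as n grows (roughly SIGMA / sqrt(n)).
    print(n, round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))
```

In each row the first number is the sample size, the second is the average of the simulated sample means, and the third is their standard deviation.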
Cont…
• Estimate: a particular value that an estimator takes for a given sample.
• Point estimate: a single value used to estimate a parameter.
• Interval estimate: a range of values used to estimate a parameter, i.e. the parameter is specified as lying between two values.
For example, an interval estimate for the average age of all students might be 21.9 to 22.7 years.

Cont…
• A confidence interval is a specific interval estimate of a parameter, stated together with a specific level of confidence. It communicates how accurate our estimate is likely to be.

• Three commonly used confidence levels are 90%, 95%, and 99%. For example, a 95% confidence interval means that we are 95% confident that the interval contains the true value of the parameter.

• Confidence level: the probability that the interval-construction procedure produces an interval containing the parameter. That is, in repeated sampling, about 100(1 − α)% of the intervals built this way would contain the true value.
• Degrees of freedom: the number of data values that are free to vary after a sample statistic has been computed. For a single sample of size n it is calculated as
df = n − 1
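The repeated-sampling reading of the confidence level can be checked numerically. The following is a minimal sketch, assuming a hypothetical normal population (mean 100, standard deviation 15) and using the known-variance interval of the form x̄ ± z·σ/√n; the function name `coverage` and all numbers are illustrative assumptions.

```python
import random
import statistics
from math import sqrt

def coverage(mu, sigma, n, z, trials, seed=1):
    """Fraction of intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        x_bar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

# With z = 1.96 the fraction of intervals covering mu should be close to 0.95.
print(coverage(mu=100.0, sigma=15.0, n=30, z=1.96, trials=5000))
```

Any single interval either contains the parameter or does not; the 95% describes how often the procedure succeeds over many samples.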
Point estimation of the population mean: μ

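• Point estimator: the mathematical rule (formula) used to compute the point estimate.

For instance, the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the point estimator of the population mean $\mu$.
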
Confidence interval estimation of the population mean: µ

• Although $\bar{X}$ possesses nearly all the qualities of a good estimator, because of sampling error the sample statistic will generally not equal the population parameter; instead, it falls within some interval of values around it.

• The statistic is "close to" the parameter. That leads to the obvious question: what is "close"? In other words, how confident can we be that the value of the statistic falls within a certain "distance" of the parameter?

• A confidence interval answers this question: it gives a range around the statistic together with the level of confidence that this range contains the parameter.

There are different cases to consider when constructing confidence intervals:
Case 1: If the sample size is large, or if the population is normal with known variance
• As a result of the Central Limit Theorem, the z distribution can be used for sample means when the sample size is large, regardless of the shape of the population distribution, or for smaller sample sizes if the population is normally distributed.

• The sampling distribution of $\bar{X}$ has mean $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, and it approaches a normal distribution as n gets large. This allows us to use the normal distribution curve for computing confidence intervals.

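Cont…
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
has a normal distribution with mean 0 and variance 1. Solving for $\mu$:
$$\mu = \bar{X} - Z\,\frac{\sigma}{\sqrt{n}}$$
Because $\bar{X}$ can be greater than or less than $\mu$, Z can be positive or negative. Thus, the above expression takes the form of an interval estimate:
$$\mu = \bar{X} \pm Z\,\frac{\sigma}{\sqrt{n}}$$
$$\mu = \bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

The value of the population mean, $\mu$, lies somewhere within this range. Rewriting this expression yields the confidence interval for the population mean:
$$\bar{X} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$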
Cont…
Where
α = the probability that the parameter lies outside the interval, or the proportion of incorrect statements (α = 1 − C, where C is the confidence coefficient)
$Z_{\alpha/2}$ = the standard normal value to the right of which α/2 of the probability lies, i.e. $P(Z > Z_{\alpha/2}) = \alpha/2$
$-Z_{\alpha/2}$ = the standard normal value to the left of which α/2 of the probability lies, i.e. $P(Z < -Z_{\alpha/2}) = \alpha/2$

If $\sigma^2$ is not known, we estimate it by its point estimator $S^2$, and the confidence interval for the population mean becomes:

$$\bar{X} - Z_{\alpha/2}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\,\frac{S}{\sqrt{n}}$$

Cont…
From the above expression,
$$\mu = \bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = \bar{X} \pm \varepsilon, \qquad \varepsilon = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where ε is the measure of error.
- For the interval estimator to be good, the error should be small. The error can be made small by taking n large, by having small variability (σ), or by taking Z small (that is, accepting a lower confidence level).

• Here are the Z values corresponding to the most commonly used confidence levels:

100(1 − α)%    α       α/2      Z_{α/2}
90             0.10    0.05     1.645
95             0.05    0.025    1.96
99             0.01    0.005    2.58
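The tabulated Z values can also be obtained without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return Z_{alpha/2}: the standard normal value with alpha/2 probability to its right."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for c in (0.90, 0.95, 0.99):
    print(f"{c:.0%}  alpha/2 = {(1 - c) / 2:.3f}  z = {z_critical(c):.3f}")
```

To three decimals the 99% value is 2.576, which the table rounds to 2.58.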
Case 2: If the sample size is small and the population variance, $\sigma^2$, is not known

$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has a t distribution with n − 1 degrees of freedom.

⇒ $\left(\bar{X} - t_{\alpha/2}\,\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\,\frac{S}{\sqrt{n}}\right)$ is a 100(1 − α)% confidence interval for $\mu$.

Characteristics of the t-distribution
1. It is symmetric about its mean (0) and ranges from −∞ to ∞.
2. It is bell-shaped (unimodal) and has approximately the same appearance as the standard normal distribution (Z-distribution).
3. It depends on a parameter called the degrees of freedom (ν) of the distribution, where ν = n − 1.
4. Its variance is ν/(ν − 2) for ν > 2, which always exceeds 1.

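Characteristic 4 can be tabulated directly. The short sketch below (the function name `t_variance` is an illustrative assumption) evaluates ν/(ν − 2) for a few degrees of freedom and shows the variance falling toward 1, the variance of the standard normal distribution, as ν grows.

```python
def t_variance(df):
    """Variance of the t distribution, nu / (nu - 2); defined only for nu > 2."""
    if df <= 2:
        raise ValueError("variance is defined only for df > 2")
    return df / (df - 2)

for df in (3, 5, 10, 30, 100):
    print(df, round(t_variance(df), 3))
```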
Cont…
The confidence interval for the population mean is affected by:
1. Population distribution, i.e., whether the population is normally distributed or not.
2. Standard deviation, i.e., whether $\sigma$ is known or not.
3. Sample size, i.e., whether the sample size, n, is large or not.
• Steps to find the interval estimate of the population mean, μ:
1. Compute the standard error of the mean, $\sigma_{\bar{X}}$.
2. Compute α/2 from the confidence coefficient.
3. Find the Z value for α/2 from the table.
4. Construct the confidence interval.
5. Interpret the results.
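The five steps map naturally onto a small function. The following is a sketch, assuming the known-σ (Case 1) setting; the function name `z_confidence_interval` is hypothetical, and the table lookup of step 3 is replaced by `statistics.NormalDist().inv_cdf`. The inputs are those of the chapter's first worked example (n = 25, mean 32, σ = 4.2).

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence):
    """Interval estimate of the mean for known sigma, following the listed steps."""
    standard_error = sigma / sqrt(n)               # step 1: standard error of the mean
    alpha_half = (1.0 - confidence) / 2.0          # step 2: alpha / 2
    z = NormalDist().inv_cdf(1.0 - alpha_half)     # step 3: Z value (instead of a table)
    margin = z * standard_error
    return x_bar - margin, x_bar + margin          # step 4: construct the interval

low, high = z_confidence_interval(x_bar=32, sigma=4.2, n=25, confidence=0.95)
# step 5: interpret the result
print(f"We are 95% confident that the mean lies in ({low:.2f}, {high:.2f})")
```

For the 95% level this gives (30.35, 33.65). At the 99% level the result can differ from a table-based answer in the second decimal place, because tables round the Z value to 2.58.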
Examples

1. From a sample of size 25 drawn from a normal population, a mean of 32 was found. Given that the population standard deviation is 4.2, find:

a) A 95% confidence interval for the population mean.

b) A 99% confidence interval for the population mean.

2. A drug company is testing a new drug which is supposed to reduce blood pressure. From the six people who are used as subjects, it is found that the average drop in blood pressure is 2.28 points, with a standard deviation of 0.95 points. What is the 95% confidence interval for the mean change in pressure?
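Solution
1. a. $\bar{X} = 32$, $\sigma = 4.2$, $n = 25$, $1 - \alpha = 0.95 \Rightarrow \alpha = 0.05$, $\alpha/2 = 0.025$
⇒ $Z_{\alpha/2} = 1.96$ from the table.
⇒ The required interval is $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$

$= 32 \pm 1.96 \times 4.2/\sqrt{25} = 32 \pm 1.65 = (30.35, 33.65)$

b. $\bar{X} = 32$, $\sigma = 4.2$, $n = 25$, $1 - \alpha = 0.99 \Rightarrow \alpha = 0.01$, $\alpha/2 = 0.005$
⇒ $Z_{\alpha/2} = 2.58$ from the table.
⇒ The required interval is $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$

$= 32 \pm 2.58 \times 4.2/\sqrt{25} = 32 \pm 2.17 = (29.83, 34.17)$
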
2. Solution

$\bar{X} = 2.28$, $S = 0.95$, $n = 6$, $1 - \alpha = 0.95 \Rightarrow \alpha = 0.05$, $\alpha/2 = 0.025$
⇒ $t_{\alpha/2} = 2.571$ with df = 5 from the table.
⇒ The required interval is $\bar{X} \pm t_{\alpha/2}\,S/\sqrt{n}$
$= 2.28 \pm 2.571 \times 0.95/\sqrt{6}$

$= 2.28 \pm 0.997 = (1.28, 3.28)$

We are 95% confident that the mean decrease in blood pressure is between 1.28 and 3.28 points.

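The arithmetic of the small-sample interval can be checked with a few lines of code. This is a sketch only: the function name `t_confidence_interval` is hypothetical, and the critical value 2.571 (df = 5) is passed in as read from a t table, since the Python standard library has no t-distribution quantile function.

```python
from math import sqrt

def t_confidence_interval(x_bar, s, n, t_value):
    """Small-sample interval x_bar +/- t * s / sqrt(n); t_value comes from a t table."""
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

low, high, margin = t_confidence_interval(x_bar=2.28, s=0.95, n=6, t_value=2.571)
print(f"margin = {margin:.3f}, interval = ({low:.2f}, {high:.2f})")
```

With these inputs the margin of error works out to about 0.997, giving the interval (1.28, 3.28).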
Interval Estimation of the Population Proportion
The sample proportion, $p$, is an unbiased estimator of the population proportion $P$. If the sample size is large, the sampling distribution of $p$ is approximately normal, with
$$Z = \frac{p - P}{\sigma_p} = \frac{p - P}{\sqrt{P(1-P)/n}}$$
• However, $P$ is unknown, so it is estimated by $p$, and $\sigma_p$ is replaced by $S_p = \sqrt{pq/n}$. Z becomes
$$Z = \frac{p - P}{\sqrt{pq/n}}$$
Solving for $P$:
$$P = p - Z\sqrt{\frac{pq}{n}}$$
Because Z can assume both positive and negative values,
$$P = p \pm Z\sqrt{\frac{pq}{n}}$$
The value of Z is determined by the confidence level:
$$P = p \pm Z_{\alpha/2}\sqrt{\frac{pq}{n}} = p \pm Z_{\alpha/2}\,S_p$$

Where: $p$ = sample proportion, $q = 1 - p$, α = 1 − C, n = sample size, and $P$ = unknown population proportion
Example
1. Recently, a study of 87 randomly selected companies with telemarketing operations was completed. The study revealed that 39% of the sampled companies had used telemarketing to assist them in order processing. Estimate the population proportion of telemarketing companies that use their telemarketing operation to assist them in order processing, using a 95% confidence level.

Solution:
$n = 87$, $p = 0.39$, $q = 0.61$, $C = 0.95$, $\alpha = 1 - C = 1 - 0.95 = 0.05$, $\alpha/2 = 0.05/2 = 0.025$, and $Z_{\alpha/2} = Z_{0.025} = 1.96$

$$S_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.39 \times 0.61}{87}} = 0.0523$$

$$P = p \pm Z_{\alpha/2}\,S_p = 0.39 \pm 1.96(0.0523) = 0.39 \pm 0.1025$$

or $0.2875 \le P \le 0.4925$

We are 95% confident that the proportion of companies which use telemarketing to assist order processing lies between 0.2875 and 0.4925.
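The same calculation can be expressed as a small function. This is a sketch under the large-sample normal approximation described in this section; the name `proportion_confidence_interval` is an illustrative assumption, and the Z value is computed with `statistics.NormalDist` rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(p, n, confidence):
    """Large-sample interval p +/- Z_{alpha/2} * sqrt(p*q/n) for a proportion."""
    q = 1.0 - p
    s_p = sqrt(p * q / n)                                      # estimated standard error
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)   # Z_{alpha/2}
    margin = z * s_p
    return p - margin, p + margin

low, high = proportion_confidence_interval(p=0.39, n=87, confidence=0.95)
print(f"({low:.4f}, {high:.4f})")
```

For the telemarketing example this gives approximately 0.2875 to 0.4925.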
Determination of Sample Size
• The reason for taking a sample from a population is that it would be too costly to
gather data for the whole population. But collecting sample data also costs money;
and the larger the sample, the higher the cost. To hold cost down, we want to use as
small a sample as possible. On the other hand, we want a sample to be large enough
to provide “good” approximation/estimates of population parameters. Consequently,
the question is “How large should the sample be?”
• The answer depends on three factors:
1) How precise (narrow) do we want a confidence interval to be?
2) How confident do we want to be that the interval estimate is correct?
3) How variable is the population being sampled?
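Sample size for estimating the population mean, $\mu$
• The confidence interval for $\mu$ is $\mu = \bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$.
• In the above expression, $Z_{\alpha/2}\,\sigma/\sqrt{n}$ is called the error of estimation (e). That is, the difference between $\bar{X}$ and $\mu$ which results from the sampling process. So
$$e = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
Squaring both sides,
$$e^2 = Z_{\alpha/2}^2\,\frac{\sigma^2}{n}$$
Solving for n,
$$n = \frac{Z_{\alpha/2}^2\,\sigma^2}{e^2}$$
$$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2 \quad \text{if } \sigma \text{ is known}$$
$$n_\mu = \left(\frac{Z_{\alpha/2}\,s}{e}\right)^2 \quad \text{if } \sigma \text{ is not known}$$
Because n must be a whole number, the result is rounded up.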
Examples
1. A gasoline service station shows a standard deviation of Birr 6.25 for the charges made by its credit card customers. Assume that the station’s management would like to estimate the population mean gasoline bill for its credit card customers to within ± Birr 1.00. For a 95% confidence level, how large a sample would be necessary?
2. The National Travel and Tour Organization (NTO) would like to estimate the mean amount of money spent by a tourist to within Birr 100 with 95% confidence. If the amount of money spent by a tourist is considered to be normally distributed with a standard deviation of Birr 200, what sample size would be necessary for the NTO to meet its objective in estimating this mean amount?
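Solution
1. e = Birr 1.00, σ = Birr 6.25, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$
$$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2 = \left(\frac{1.96 \times 6.25}{1}\right)^2 = 150.06 \approx 151$$

2. e = Birr 100, σ = Birr 200, C = 0.95, $Z_{\alpha/2} = Z_{0.025} = 1.96$
$$n_\mu = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2 = \left(\frac{1.96 \times 200}{100}\right)^2 = 15.37 \approx 16$$
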
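Both results can be reproduced with a short helper. This is a minimal sketch; the function name `sample_size_for_mean` is hypothetical, and the Z value is supplied by the caller as in the worked solutions. The result is always rounded up, since rounding down would leave the error slightly larger than e.

```python
from math import ceil

def sample_size_for_mean(z, sigma, e):
    """Smallest whole n for which z * sigma / sqrt(n) does not exceed e."""
    n_raw = (z * sigma / e) ** 2
    # Round away floating-point noise before taking the ceiling.
    return ceil(round(n_raw, 8))

print(sample_size_for_mean(z=1.96, sigma=6.25, e=1.00))  # gasoline station example
print(sample_size_for_mean(z=1.96, sigma=200, e=100))    # tourist spending example
```

The two calls return 151 and 16 respectively.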
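Sample size for estimating the population proportion, P

• The confidence interval for P is $P = p \pm Z_{\alpha/2}\sqrt{pq/n}$. The expression $Z_{\alpha/2}\sqrt{pq/n}$ is called the error term (e). That is,
$$e = Z_{\alpha/2}\sqrt{\frac{pq}{n}}$$
Squaring both sides,
$$e^2 = Z_{\alpha/2}^2\,\frac{pq}{n}$$
Solving for n,
$$n_p = \frac{Z_{\alpha/2}^2\,pq}{e^2}$$

• If the sample values of $p$ and $q$ are not yet available, an assumed (believed) value of the proportion is used for $p$, with $q = 1 - p$, and $n_p$ becomes
$$n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq$$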
Examples
1. Suppose that a production facility purchases a particular component part in large lots from a supplier. The production manager wants to estimate the proportion of defective parts received from this supplier. She believes that the proportion of defects is no more than 0.2 and wants to be within 0.02 of the true proportion of defects with a 90% level of confidence. How large a sample should she take?

2. What is the largest sample size that would be needed in estimating a population proportion to within ± 0.02, with a confidence coefficient of 0.95?

Solution
1. e = 0.02, p = 0.2, q = 0.8, C = 0.90 and $Z_{\alpha/2} = Z_{0.05} = 1.64$
$$n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq = \left(\frac{1.64}{0.02}\right)^2 (0.2)(0.8) = 1075.84 \approx 1076$$
(Using the tabulated value 1.645 instead of 1.64 gives 1082.41 ≈ 1083.)

2. e = 0.02, C = 0.95 and $Z_{\alpha/2} = Z_{0.025} = 1.96$

The largest sample size is obtained when p = 0.5. So,
$$n_p = \left(\frac{Z_{\alpha/2}}{e}\right)^2 pq = \left(\frac{1.96}{0.02}\right)^2 (0.5)(0.5) = 2401$$
If p is unknown and there is no possibility of estimating it, use 0.5 as the value of p, because it generates the greatest possible sample size compared with other values.
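The proportion case can be sketched the same way. The function name `sample_size_for_proportion` is hypothetical; its default of p = 0.5 reflects the worst-case rule stated above, since the product p(1 − p) is largest at 0.5.

```python
from math import ceil

def sample_size_for_proportion(z, e, p=0.5):
    """Sample size (z / e)^2 * p * (1 - p), rounded up; p defaults to the worst case 0.5."""
    n_raw = (z / e) ** 2 * p * (1.0 - p)
    # Round away floating-point noise so an exact result such as 2401 is not pushed up.
    return ceil(round(n_raw, 8))

print(sample_size_for_proportion(z=1.64, e=0.02, p=0.2))  # defective parts example
print(sample_size_for_proportion(z=1.96, e=0.02))         # largest sample size needed
```

These calls return 1076 and 2401. Any other value of p passed to the second call yields a smaller sample size.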
