Population Mean Point Interval Population Proportion Population Variance (Standard Deviation Distribution "Chi-Square Distribution"
LECTURE 3
Introduction
In lecture 2, we studied the point and interval estimates for the population mean when the
population standard deviation is known and when it is unknown.
The point and interval estimates can be extended to other population parameters
that are as important as the population mean.
In this lecture, we will explain how to estimate the population proportion and
the population variance (standard deviation).
We will construct confidence intervals for the population proportion using the z distribution
and for the population variance using a new distribution called the "chi-square distribution".
Several problems and applications are presented in this lecture to cover these two cases.
3. Confidence Intervals for the population proportion
Many statistical studies involve finding a proportion of the population that has a certain
characteristic. It can be expressed as a fraction, decimal, or percentage.
The population proportion is denoted by $p$ and is given by the formula
$$p = \frac{X}{N},$$
where
$N$ = the population size (always very large or infinite),
$X$ = the number of elements in the population with a certain characteristic.
The best point estimate of the population proportion is the sample proportion taken from
it. It is denoted by $\hat{p}$ and is given by the formula
$$\hat{p} = \frac{X}{n},$$
where
$n$ = the sample size (always finite and known),
$X$ = the number of elements in the sample with a certain characteristic.
The complement of the sample proportion is denoted by $\hat{q}$ and is given by the formula
$$\hat{q} = \frac{n - X}{n}.$$
When $\hat{p}$ and $\hat{q}$ are given as decimals or fractions, $\hat{p} + \hat{q} = 1$. When they are
given as percentages, $\hat{p} + \hat{q} = 100\%$. That is, $\hat{q}$ can be written as $\hat{q} = 1 - \hat{p}$.
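The confidence interval for the population proportion ($p$) is given by the formula
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}},$$
where
$n$ = the sample size,
$\hat{p}$ = the sample proportion,
$\hat{q}$ = the complement of the sample proportion,
$z_{\alpha/2}$ = the critical value of the standard normal (z) distribution for the chosen confidence level.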
Remark
Two basic conditions must be satisfied to use the above formula:
n . pˆ 0 and n . qˆ 0.
The margin of error when estimating the population proportion is given by
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}.$$
The confidence interval for the population proportion ($p$) can also be written as
$$\text{C.I.} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}.$$
Example 1
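A random sample of 200 workers in a large factory found that 128 drive to work alone. Find $\hat{p}$ and
$\hat{q}$, where $\hat{p}$ is the proportion of workers who drive to work alone.
Solution
Since $n = 200$ and $X = 128$, we have
$$\hat{p} = \frac{X}{n} = \frac{128}{200} = 0.64, \qquad \hat{q} = 1 - \hat{p} = 1 - 0.64 = 0.36.$$
Example 2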
A survey conducted by Sallie Mae and Gallup of 1404 respondents found that 323 students paid
for their college education by student loans. Find the best point estimate and the 90% confidence
interval of the true proportion of students who paid for their college education by student loans.
Solution
Since we have
$$n = 1404, \quad X = 323, \quad z_{\alpha/2} = 1.65,$$
the best point estimate of the population proportion is
$$p \approx \hat{p} = \frac{X}{n} = \frac{323}{1404} \approx 0.23.$$
Also,
$$\hat{q} = 1 - \hat{p} = 1 - 0.23 = 0.77.$$
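The 90% confidence interval of the population proportion is given as
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$
$$0.23 - (1.65)\sqrt{\frac{(0.23)(0.77)}{1404}} < p < 0.23 + (1.65)\sqrt{\frac{(0.23)(0.77)}{1404}}$$
$$0.23 - (1.65)(0.01123119) < p < 0.23 + (1.65)(0.01123119)$$
$$0.23 - 0.019 < p < 0.23 + 0.019$$
$$21.1\% < p < 24.9\%.$$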
Hence, we can be 90% confident that the true proportion of students who pay for their college
education by student loans is between 21.1% and 24.9%.
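The calculation in this example can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function name `proportion_ci` is hypothetical, and the critical value is passed in as the table value used above (1.65 for 90% confidence).

```python
import math

def proportion_ci(x, n, z):
    """Return (p_hat, margin, lower, upper) for a population proportion.

    x: number of sample elements with the characteristic
    n: sample size
    z: critical value z_{alpha/2}, taken from the z table
    """
    p_hat = x / n                               # sample proportion
    q_hat = 1 - p_hat                           # its complement
    margin = z * math.sqrt(p_hat * q_hat / n)   # margin of error E
    return p_hat, margin, p_hat - margin, p_hat + margin

# Example 2: 323 of 1404 respondents, 90% confidence (z = 1.65)
p_hat, margin, lower, upper = proportion_ci(323, 1404, 1.65)
print(f"p_hat = {p_hat:.3f}, E = {margin:.3f}")
print(f"{lower:.3f} < p < {upper:.3f}")
```

Because the sketch keeps full precision in the sample proportion and the margin of error, its bounds can differ from the hand calculation in the third decimal place; the hand calculation rounds the sample proportion to 0.23 and the margin of error to 0.019.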
Example 3
In 2008, 17% of American homes were protected by a home security system. A marketing firm
wanted to estimate the proportion of protected homes today. It chose a random sample of 200
homes and discovered that 26.5% had home security systems. Estimate the true proportion of
homes with security systems with 99% confidence.
Solution
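Since we have
$$n = 200, \quad \hat{p} = 0.265, \quad \hat{q} = 0.735, \quad z_{\alpha/2} = 2.58,$$
the 99% confidence interval of the population proportion is given as
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$
$$0.265 - (2.58)\sqrt{\frac{(0.265)(0.735)}{200}} < p < 0.265 + (2.58)\sqrt{\frac{(0.265)(0.735)}{200}}$$
$$0.265 - (2.58)(0.03120697) < p < 0.265 + (2.58)(0.03120697)$$
$$0.265 - 0.081 < p < 0.265 + 0.081$$
$$18.4\% < p < 34.6\%.$$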
Hence, we can be 99% confident that the true proportion of homes with security systems is
between 18.4% and 34.6%.
Sample Size
There is an important question in estimation of the population proportion. How large should
the sample be in order to make an accurate estimate?
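To find the sample size that allows researchers to make an accurate estimate of the
population proportion, we solve the margin of error formula for $n$:
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} \;\Rightarrow\; E\sqrt{n} = z_{\alpha/2}\sqrt{\hat{p}\,\hat{q}} \;\Rightarrow\; \sqrt{n} = \frac{z_{\alpha/2}\sqrt{\hat{p}\,\hat{q}}}{E} \;\Rightarrow\; n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2.$$
That is, the minimum sample size needed for an interval estimate of the population
proportion is given by the formula
$$n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2.$$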
Example 4
A researcher wishes to estimate, with 95% confidence, the proportion of people who own a home
computer. A previous study shows that 40% of those interviewed had a computer at home. The
researcher wishes to be accurate within 2% of the true proportion. Find the minimum sample size
necessary.
Solution
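Since we have
$$z_{\alpha/2} = 1.96, \quad E = 0.02, \quad \hat{p} = 0.40, \quad \hat{q} = 0.60,$$
the minimum sample size needed to estimate the proportion of people who own a home
computer is given by
$$n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2 = (0.40)(0.60)\left(\frac{1.96}{0.02}\right)^2 = 2304.96 \approx 2305.$$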
Hence, to be 95% confident that the estimate is accurate within 2% of the true proportion, the
researcher needs a sample of at least 2305 people.
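The sample size formula is easy to evaluate in code. The sketch below is an illustration only: the function name `min_sample_size` is hypothetical, and it assumes the result is always rounded up to the next whole number, as in the example above.

```python
import math

def min_sample_size(p_hat, margin, z):
    """Minimum n for estimating a proportion within the given margin of error.

    p_hat:  prior estimate of the proportion
    margin: desired margin of error E (as a decimal)
    z:      critical value z_{alpha/2} from the z table
    """
    q_hat = 1 - p_hat
    n = p_hat * q_hat * (z / margin) ** 2
    return math.ceil(n)  # round up so the margin of error is not exceeded

# Example 4: p_hat = 0.40, E = 0.02, 95% confidence (z = 1.96)
n_required = min_sample_size(0.40, 0.02, 1.96)
print(n_required)
```

With the values of Example 4 this reproduces the sample size of 2305.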
4. Confidence Intervals for the population variance and standard deviation
In statistics, the variance and standard deviation of a variable are as important as the mean.
Both of them measure the variation in the data set of the variable under study.
In this section, we will explain how to find the confidence intervals for the population
variance ($\sigma^2$) and standard deviation ($\sigma$) of a variable.
To calculate confidence intervals for $\sigma^2$ and $\sigma$, a new statistical distribution will be studied. It is
called the "chi-square distribution".
Example 5
Find the values of $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ for a 90% confidence interval when the sample size is 25.
Solution
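Since we have
$$(1-\alpha)100\% = 90\% \;\Rightarrow\; 1-\alpha = 0.90 \;\Rightarrow\; \alpha = 1 - 0.90 = 0.10 \;\Rightarrow\; \alpha/2 = 0.05,$$
and the degrees of freedom are d.f. $= n - 1 = 24$, we get
$$\chi^2_{\text{right}} = \chi^2_{\alpha/2,\ \text{d.f.}} = \chi^2_{0.05,\ 24} = 36.415,$$
$$\chi^2_{\text{left}} = \chi^2_{1-\alpha/2,\ \text{d.f.}} = \chi^2_{0.95,\ 24} = 13.848.$$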
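The two critical values can also be computed rather than read from a table. The sketch below is one possible way to do it using only the Python standard library: it evaluates the chi-square cumulative distribution through the series for the regularized incomplete gamma function and inverts it by bisection. The function names `chi2_cdf` and `chi2_critical` are hypothetical; a statistics library (for example `scipy.stats.chi2.ppf`) would normally be used instead.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, t) = e^{-t} t^a * sum_{k>=0} t^k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= t / (a + k)
        total += term
    # Divide by Gamma(a) to get the regularized value (done in log space)
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))

def chi2_critical(right_tail_area, df):
    """Value with the given area to its right, found by bisection."""
    target = 1.0 - right_tail_area
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < target:   # widen the bracket if needed
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# 90% confidence interval with sample size 25
n = 25
alpha = 0.10
df = n - 1
chi2_right = chi2_critical(alpha / 2, df)      # area 0.05 to the right
chi2_left = chi2_critical(1 - alpha / 2, df)   # area 0.95 to the right
print(f"chi2_right = {chi2_right:.3f}, chi2_left = {chi2_left:.3f}")
```

The computed values should agree with the table entries 36.415 and 13.848 to three decimal places.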
Remark
The values of $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ are found from the chi-square distribution table.
Theorem
The confidence interval for the population variance ($\sigma^2$) is given by the formula
$$\frac{(n-1)\,s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)\,s^2}{\chi^2_{\text{left}}}.$$
The confidence interval for the population standard deviation ($\sigma$) is given by the formula
$$\sqrt{\frac{(n-1)\,s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n-1)\,s^2}{\chi^2_{\text{left}}}},$$
where $n$ = the sample size and $s^2$ = the sample variance (known or calculated).
Example 6
Find the 95% confidence interval of the variance and standard deviation for the nicotine content of
cigarettes that are manufactured if a random sample of 20 cigarettes has a standard deviation of
1.6 milligrams. Assume the variable is normally distributed.
Solution
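Since
$$(1-\alpha)100\% = 95\% \;\Rightarrow\; 1-\alpha = 0.95 \;\Rightarrow\; \alpha = 1 - 0.95 = 0.05 \;\Rightarrow\; \alpha/2 = 0.025,$$
and d.f. $= n - 1 = 19$, we have
$$\chi^2_{\text{right}} = \chi^2_{\alpha/2,\ \text{d.f.}} = \chi^2_{0.025,\ 19} = 32.852,$$
$$\chi^2_{\text{left}} = \chi^2_{1-\alpha/2,\ \text{d.f.}} = \chi^2_{0.975,\ 19} = 8.907.$$
The 95% confidence interval for the population variance is given by
$$\frac{(20-1)(1.6)^2}{32.852} < \sigma^2 < \frac{(20-1)(1.6)^2}{8.907}$$
$$1.48 < \sigma^2 < 5.46.$$
Hence, we can be 95% confident that the true variance for the nicotine content of all cigarettes
manufactured is between 1.48 and 5.46 (in squared milligrams).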
The 95% confidence interval for the population standard deviation is given by
$$\sqrt{1.48} < \sigma < \sqrt{5.46}$$
$$1.22 < \sigma < 2.34.$$
Hence, we can be 95% confident that the true standard deviation for the nicotine content of all
cigarettes manufactured is between 1.22 and 2.34 milligrams.
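The interval formulas for the variance and standard deviation can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `variance_ci` is hypothetical, and the critical values are supplied by hand from the chi-square table (32.852 for area 0.025 and 8.907 for area 0.975, both with 19 degrees of freedom).

```python
import math

def variance_ci(n, s, chi2_right, chi2_left):
    """Confidence intervals for the variance and the standard deviation.

    n:          sample size
    s:          sample standard deviation
    chi2_right: table value with area alpha/2 to its right
    chi2_left:  table value with area 1 - alpha/2 to its right
    """
    numerator = (n - 1) * s ** 2
    var_low = numerator / chi2_right    # larger divisor gives the lower bound
    var_high = numerator / chi2_left
    return (var_low, var_high), (math.sqrt(var_low), math.sqrt(var_high))

# Example 6: n = 20, s = 1.6 milligrams, 95% confidence
(var_low, var_high), (sd_low, sd_high) = variance_ci(20, 1.6, 32.852, 8.907)
print(f"{var_low:.2f} < variance < {var_high:.2f}")
print(f"{sd_low:.2f} < std. dev. < {sd_high:.2f}")
```

Note that the sketch takes the square root of the unrounded variance bounds, whereas the hand calculation takes the square root of the rounded values 1.48 and 5.46; here both approaches agree to two decimal places.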
Example 7
Find the 90% confidence interval of the standard deviation for the number of named storms per
year in the Atlantic basin. A random sample of 10 years has been used and we obtain the results:
10 5 12 11 13 15 19 18 14 16
Assume the variable is approximately normal.
Solution
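The sample mean and sample variance are
$$\bar{x} = \frac{\sum x_i}{n} = \frac{10 + 5 + \cdots + 16}{10} = \frac{133}{10} = 13.3,$$
$$s^2 = \frac{\sum x_i^2 - n\,\bar{x}^2}{n-1} = \frac{1921 - 10(13.3)^2}{10-1} = \frac{152.1}{9} = 16.9.$$
Also,
$$(1-\alpha)100\% = 90\% \;\Rightarrow\; 1-\alpha = 0.90 \;\Rightarrow\; \alpha = 1 - 0.90 = 0.10 \;\Rightarrow\; \alpha/2 = 0.05.$$
Then, with d.f. $= n - 1 = 9$, we have
$$\chi^2_{\text{right}} = \chi^2_{\alpha/2,\ \text{d.f.}} = \chi^2_{0.05,\ 9} = 16.919,$$
$$\chi^2_{\text{left}} = \chi^2_{1-\alpha/2,\ \text{d.f.}} = \chi^2_{0.95,\ 9} = 3.325.$$
Now, the 90% confidence interval for the population standard deviation is given by
$$\sqrt{\frac{(n-1)\,s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n-1)\,s^2}{\chi^2_{\text{left}}}}$$
$$\sqrt{\frac{(9)(16.9)}{16.919}} < \sigma < \sqrt{\frac{(9)(16.9)}{3.325}}$$
$$\sqrt{8.99} < \sigma < \sqrt{45.74}$$
$$3.00 < \sigma < 6.76.$$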
Hence, we can be 90% confident that the standard deviation for the number of named storms is
between 3.0 and 6.8 storms.
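When the raw data are given, as in this example, the sample mean and variance can be computed directly before building the interval. The sketch below is illustrative only: it uses Python's standard `statistics` module, and the critical values 16.919 and 3.325 are entered by hand from the chi-square table for 9 degrees of freedom (areas 0.05 and 0.95).

```python
import math
import statistics

# Named storms per year for the 10 sampled years (Example 7)
storms = [10, 5, 12, 11, 13, 15, 19, 18, 14, 16]

n = len(storms)
x_bar = statistics.mean(storms)     # sample mean
s2 = statistics.variance(storms)    # sample variance (divides by n - 1)

# Chi-square table values for d.f. = n - 1 = 9 and 90% confidence
chi2_right = 16.919
chi2_left = 3.325

sd_low = math.sqrt((n - 1) * s2 / chi2_right)
sd_high = math.sqrt((n - 1) * s2 / chi2_left)

print(f"mean = {x_bar:.1f}, variance = {s2:.1f}")
print(f"{sd_low:.1f} < sigma < {sd_high:.1f}")
```

Rounded to one decimal place, the bounds match the interval of 3.0 to 6.8 storms stated above.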
Completed, praise be to God.