
Population Mean Point Interval Population Proportion Population Variance (Standard Deviation Distribution "Chi-Square Distribution"

This document summarizes key points from a lecture on confidence intervals for population proportions and sample size determination. It provides formulas for calculating point estimates and confidence intervals for population proportions using sample data. Examples are given to demonstrate how to calculate sample proportions, confidence intervals, and determine the minimum sample size needed to estimate a population proportion within a given margin of error.


Mr. Mohamed El-Sayed El-Dawoody, Lecturer of Mathematical Statistics

LECTURE 3

Confidence Intervals and sample size

Introduction
• In Lecture 2, we studied the point and interval estimates for the population mean, both when the population standard deviation is known and when it is unknown.
• Point and interval estimates can be extended to other population parameters that are as important as the population mean.
• In this lecture, we explain how to estimate the population proportion and the population variance (standard deviation).
• We construct confidence intervals for the population proportion using the z distribution, and for the population variance using a new distribution called the "chi-square distribution".
• Several problems and applications are presented in this lecture to cover these two cases.
3. Confidence Intervals for the population proportion
• Many statistical studies involve finding the proportion of a population that has a certain characteristic. A proportion can be expressed as a fraction, a decimal, or a percentage.
• The population proportion is denoted by $p$ and is given by the formula
\[ p = \frac{X}{N}, \]
where
  - $N$ = the population size (always very large or infinite),
  - $X$ = the number of elements in the population with a certain characteristic.
• The best point estimate of the population proportion is the sample proportion taken from that population. It is denoted by $\hat{p}$ and is given by the formula
\[ \hat{p} = \frac{X}{n}, \]
where
  - $n$ = the sample size (always finite and known),
  - $X$ = the number of elements in the sample with a certain characteristic.
• The complement of the sample proportion is denoted by $\hat{q}$ and is given by the formula
\[ \hat{q} = \frac{n - X}{n}. \]

LECTURE 3 PAGE 1 STAT 2040



• When $\hat{p}$ and $\hat{q}$ are given as decimals or fractions, $\hat{p} + \hat{q} = 1$. When they are given as percentages, $\hat{p} + \hat{q} = 100\%$. That is, $\hat{q}$ can be written as $\hat{q} = 1 - \hat{p}$.
• The confidence interval for the population proportion $p$ is given by the formula
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}, \]
where
  - $n$ = the sample size,
  - $\hat{p}$ = the sample proportion,
  - $\hat{q}$ = the complement of the sample proportion,
  - $z_{\alpha/2}$ = the critical value of the standard normal variable $Z$ for the chosen confidence level.
Remark
• Two basic conditions must be satisfied to use the above formula:
\[ n\hat{p} \ge 5 \quad \text{and} \quad n\hat{q} \ge 5. \]
• The margin of error when estimating the population proportion is given by
\[ E = z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}. \]
• The confidence interval for the population proportion $p$ can therefore be written as
\[ \text{C.I.} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}. \]
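As a rough illustration of the formulas above, the following Python sketch computes the sample proportion, the margin of error and the resulting interval. The function name `proportion_ci` and the numbers in the usage lines are assumptions made for this illustration and are not part of the lecture. The critical value is passed in by the caller, as it would be read from a z table.

```python
import math

def proportion_ci(x: int, n: int, z: float) -> tuple[float, float, float]:
    """Return (p_hat, lower, upper) for a population proportion.

    x: number of sample elements with the characteristic
    n: sample size
    z: critical value z_{alpha/2}, e.g. read from a z table
    """
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n
    q_hat = 1 - p_hat
    # Margin of error: E = z * sqrt(p_hat * q_hat / n)
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - margin, p_hat + margin

# Hypothetical usage: 128 of 200 sampled elements, z = 1.96 (95% confidence).
p_hat, lower, upper = proportion_ci(128, 200, 1.96)
print(f"p_hat = {p_hat:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

Passing `z` explicitly keeps the sketch close to the table-based workflow of the lecture. A caller could equally compute the critical value numerically.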
Example 1

A random sample of 200 workers in a large factory found that 128 drive to work alone. Find $\hat{p}$ and $\hat{q}$, where $\hat{p}$ is the proportion of workers who drive to work alone.
Solution

Since $n = 200$ and $X = 128$, the sample proportions $\hat{p}$ and $\hat{q}$ are
\[ \hat{p} = \frac{X}{n} = \frac{128}{200} = 0.64 = 64\%, \]
\[ \hat{q} = \frac{n - X}{n} = \frac{200 - 128}{200} = \frac{72}{200} = 0.36 = 36\%. \]
Hence, 64% of the workers in the sample drive to work alone, and 36% do not drive alone.
Example 2

A survey conducted by Sallie Mae and Gallup of 1404 respondents found that 323 students paid for their college education by student loans. Find the best point estimate and the 90% confidence interval of the true proportion of students who paid for their college education by student loans.


Solution

• We have
\[ n = 1404, \quad X = 323, \quad z_{\alpha/2} = 1.65. \]
Then the best point estimate of the population proportion is
\[ p \approx \hat{p} = \frac{X}{n} = \frac{323}{1404} \approx 0.23, \]
and
\[ \hat{q} = 1 - \hat{p} = 1 - 0.23 = 0.77. \]
• The 90% confidence interval of the population proportion is
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} \]
\[ 0.23 - (1.65)\sqrt{\frac{(0.23)(0.77)}{1404}} < p < 0.23 + (1.65)\sqrt{\frac{(0.23)(0.77)}{1404}} \]
\[ 0.23 - (1.65)(0.01123119) < p < 0.23 + (1.65)(0.01123119) \]
\[ 0.23 - 0.019 < p < 0.23 + 0.019 \]
\[ 21.1\% < p < 24.9\%. \]
Hence, we can be 90% confident that the true proportion of students who pay for their college education by student loans is between 21.1% and 24.9%.
Example 3

In 2008, 17% of American homes were protected by a home security system. A marketing firm wanted to estimate the proportion of protected homes today. It chose a random sample of 200 homes and found that 26.5% had home security systems. Estimate the true proportion of homes with security systems with 99% confidence.
Solution

We have
\[ n = 200, \quad \hat{p} = 0.265, \quad \hat{q} = 0.735, \quad z_{\alpha/2} = 2.58. \]
The 99% confidence interval of the population proportion is
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} \]
\[ 0.265 - (2.58)\sqrt{\frac{(0.265)(0.735)}{200}} < p < 0.265 + (2.58)\sqrt{\frac{(0.265)(0.735)}{200}} \]
\[ 0.265 - (2.58)(0.03120697) < p < 0.265 + (2.58)(0.03120697) \]
\[ 0.265 - 0.081 < p < 0.265 + 0.081 \]
\[ 18.4\% < p < 34.6\%. \]
Hence, we can be 99% confident that the true proportion of homes with security systems is between 18.4% and 34.6%.
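The critical values 1.65 and 2.58 used in these examples are rounded table entries. As a small check, the sketch below computes the two-sided critical value directly from the standard normal distribution with Python's `statistics.NormalDist`. The helper name `z_critical` is an assumption made for this illustration.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} for a confidence level in (0, 1)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

The computed values (about 1.645, 1.960 and 2.576) are slightly smaller than the rounded table entries 1.65 and 2.58, so intervals built from the table values come out marginally wider.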


Sample Size
• An important question when estimating the population proportion is: how large should the sample be in order to make an accurate estimate?
• To find the sample size that lets the researcher make an accurate estimate of the population proportion, we solve the margin of error formula for $n$:
\[ E = z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}} \;\Rightarrow\; E\sqrt{n} = z_{\alpha/2}\sqrt{\hat{p}\,\hat{q}} \;\Rightarrow\; \sqrt{n} = \frac{z_{\alpha/2}\sqrt{\hat{p}\,\hat{q}}}{E} \;\Rightarrow\; n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^{2}. \]
• That is, the minimum sample size needed for an interval estimate of the population proportion is given by the formula
\[ n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^{2}. \]
If the result is not a whole number, it is rounded up to the next whole number.
Example 4

A researcher wishes to estimate, with 95% confidence, the proportion of people who own a home computer. A previous study shows that 40% of those interviewed had a computer at home. The researcher wishes to be accurate within 2% of the true proportion. Find the minimum sample size necessary.
Solution

We have
\[ z_{\alpha/2} = 1.96, \quad E = 0.02, \quad \hat{p} = 0.40, \quad \hat{q} = 0.60. \]
Then the minimum sample size needed to estimate the proportion of people who own a home computer is
\[ n = \hat{p}\,\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^{2} = (0.40)(0.60)\left(\frac{1.96}{0.02}\right)^{2} = 2304.96 \approx 2305. \]
Hence, to be 95% confident that the estimate is accurate within 2% of the true proportion, the researcher needs a sample of at least 2305 people.
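The sample size calculation can be sketched in a few lines of Python. The function name `min_sample_size` is an assumption made for this illustration. It applies the formula n = p̂ q̂ (z/E)² and rounds up, since a fractional result means the next whole observation is needed.

```python
import math

def min_sample_size(p_hat: float, z: float, margin: float) -> int:
    """Minimum n so that the margin of error z*sqrt(p_hat*q_hat/n) is at most `margin`."""
    if not 0 < p_hat < 1:
        raise ValueError("p_hat must be strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")
    q_hat = 1 - p_hat
    # n = p_hat * q_hat * (z / E)^2, rounded up to a whole number
    return math.ceil(p_hat * q_hat * (z / margin) ** 2)

# Values from the home-computer example: p_hat = 0.40, z = 1.96, E = 0.02.
print(min_sample_size(0.40, 1.96, 0.02))  # 2305
```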
4. Confidence Intervals for the population variance and standard deviation
• In statistics, the variance and standard deviation of a variable are as important as the mean. Both measure the variation in the data set of the variable under study.
• In this section, we explain how to find confidence intervals for the population variance ($\sigma^2$) and standard deviation ($\sigma$) of a variable.
• To calculate confidence intervals for $\sigma^2$ and $\sigma$, we need a new statistical distribution, called the "chi-square distribution".


• The chi-square distribution has the following properties:
  - The chi-square variable is denoted by $\chi^2$ ($\chi$ is pronounced "ki").
  - The chi-square variable cannot be negative (that is, $\chi^2 \ge 0$).
  - The chi-square distribution curve is skewed to the right (positively skewed).
  - The total area under the chi-square distribution curve is equal to 1 (or 100%).
  - The chi-square distribution is a family of curves based on the "degrees of freedom".
  - The chi-square distribution becomes somewhat symmetric at about 100 degrees of freedom.
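Remark
• Two different values are used to construct confidence intervals for the population variance and standard deviation, because the chi-square distribution is not symmetric.
• One value is located on the right side of the distribution, and the other is located on the left side of the distribution.
[Figure: chi-square distribution curve with the left and right values marked]
• These two values are denoted by $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ and are given by the formulas
\[ \chi^2_{\text{right}} = \chi^2_{\alpha/2,\ \text{d.f.}}, \qquad \chi^2_{\text{left}} = \chi^2_{1-\alpha/2,\ \text{d.f.}}, \]
where $(1 - \alpha)100\%$ = the confidence level and d.f. $= n - 1$ is the number of degrees of freedom.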
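Example 5

Find the values of $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ for a 90% confidence interval when the sample size is 25.
Solution

We have
\[ (1 - \alpha)100\% = 90\% \;\Rightarrow\; 1 - \alpha = 0.90 \;\Rightarrow\; \alpha = 1 - 0.90 = 0.10 \;\Rightarrow\; \alpha/2 = 0.05. \]
Then, with d.f. $= n - 1 = 24$, we get
\[ \chi^2_{\text{right}} = \chi^2_{\alpha/2,\ \text{d.f.}} = \chi^2_{0.05,\ 24} = 36.415, \]
\[ \chi^2_{\text{left}} = \chi^2_{1-\alpha/2,\ \text{d.f.}} = \chi^2_{0.95,\ 24} = 13.848. \]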

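Remark
The values of $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ are found from the chi-square distribution table: read the row for the degrees of freedom (here d.f. = 24) under the columns for $\alpha/2 = 0.05$ and $1 - \alpha/2 = 0.95$.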
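When a printed table is not at hand, the same critical values can be computed numerically. The sketch below is one possible approach using only Python's standard library. It evaluates the chi-square cumulative distribution through the series for the regularized incomplete gamma function and then finds the critical value by bisection. The function names `chi2_cdf` and `chi2_critical` are made up for this illustration, and the lecture itself reads the values from a table.

```python
import math

def chi2_cdf(x: float, df: int) -> float:
    """P(chi-square with df degrees of freedom <= x), via a power series."""
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    # P(a, t) = t^a e^(-t) / Gamma(a) * sum_{k>=0} t^k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= t / (a + k)
        total += term
    return total * math.exp(a * math.log(t) - t - math.lgamma(a))

def chi2_critical(right_tail_area: float, df: int) -> float:
    """Value c such that P(chi-square > c) = right_tail_area, by bisection."""
    if not 0 < right_tail_area < 1 or df < 1:
        raise ValueError("need 0 < right_tail_area < 1 and df >= 1")
    target = 1.0 - right_tail_area
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < target:  # widen the bracket until it contains c
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# 90% confidence, sample size 25, as in the example with d.f. = 24.
n, alpha = 25, 0.10
df = n - 1
print("right:", round(chi2_critical(alpha / 2, df), 3))
print("left: ", round(chi2_critical(1 - alpha / 2, df), 3))
```

The printed values should agree with the table entries quoted for d.f. = 24 to three decimal places. A statistics library with a chi-square quantile function would be the more usual choice in practice.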


Theorem
• The confidence interval for the population variance ($\sigma^2$) is given by the formula
\[ \frac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{\text{left}}}. \]
• The confidence interval for the population standard deviation ($\sigma$) is given by the formula
\[ \sqrt{\frac{(n - 1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_{\text{left}}}}, \]
where $n$ = the sample size and $s^2$ = the sample variance (known or calculated).
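A minimal Python sketch of these two formulas is given below. The function name `variance_ci` is an assumption made for this illustration, and the two chi-square values are supplied by the caller, as they would be read from a table. The usage lines take the sample size, standard deviation and table values from the nicotine example in this lecture.

```python
import math

def variance_ci(n: int, s2: float, chi2_right: float, chi2_left: float):
    """Return ((var_low, var_high), (sd_low, sd_high)).

    n: sample size, s2: sample variance,
    chi2_right / chi2_left: right and left chi-square values for d.f. = n - 1.
    """
    if n < 2 or s2 < 0:
        raise ValueError("need n >= 2 and s2 >= 0")
    if not 0 < chi2_left < chi2_right:
        raise ValueError("need 0 < chi2_left < chi2_right")
    var_low = (n - 1) * s2 / chi2_right   # the larger divisor gives the lower limit
    var_high = (n - 1) * s2 / chi2_left
    return (var_low, var_high), (math.sqrt(var_low), math.sqrt(var_high))

# n = 20, s = 1.6, table values 32.852 and 8.907 for d.f. = 19 (95% confidence).
var_interval, sd_interval = variance_ci(20, 1.6 ** 2, 32.852, 8.907)
print("variance:", tuple(round(v, 2) for v in var_interval))
print("std dev: ", tuple(round(v, 2) for v in sd_interval))
```

Note that the right-hand chi-square value produces the lower limit, because it appears in the denominator.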
Example 6

Find the 95% confidence interval of the variance and standard deviation for the nicotine content of manufactured cigarettes, given that a random sample of 20 cigarettes has a standard deviation of 1.6 milligrams. Assume the variable is normally distributed.
Solution

Since
\[ (1 - \alpha)100\% = 95\% \;\Rightarrow\; 1 - \alpha = 0.95 \;\Rightarrow\; \alpha = 1 - 0.95 = 0.05 \;\Rightarrow\; \alpha/2 = 0.025, \]
and d.f. $= n - 1 = 19$, we have
\[ \chi^2_{\text{right}} = \chi^2_{\alpha/2,\ \text{d.f.}} = \chi^2_{0.025,\ 19} = 32.852, \]
\[ \chi^2_{\text{left}} = \chi^2_{1-\alpha/2,\ \text{d.f.}} = \chi^2_{0.975,\ 19} = 8.907. \]
Now, we find that:

• The 95% confidence interval for the population variance is given by
\[ \frac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{\text{left}}} \]
\[ \frac{(20 - 1)(1.6)^2}{32.852} < \sigma^2 < \frac{(20 - 1)(1.6)^2}{8.907} \]
\[ 1.48 < \sigma^2 < 5.46. \]


Hence, we can be 95% confident that the true variance of the nicotine content of all cigarettes manufactured is between 1.48 and 5.46 (in squared milligrams).
• The 95% confidence interval for the population standard deviation is obtained by taking square roots:
\[ \sqrt{1.48} < \sigma < \sqrt{5.46} \]
\[ 1.22 < \sigma < 2.34. \]
Hence, we can be 95% confident that the true standard deviation of the nicotine content of all cigarettes manufactured is between 1.22 and 2.34 milligrams.
Example 7

Find the 90% confidence interval of the standard deviation for the number of named storms per year in the Atlantic basin. A random sample of 10 years gave the following results:
10 5 12 11 13 15 19 18 14 16
Assume the variable is approximately normal.
Solution

The sample mean and sample variance are
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{10 + 5 + \cdots + 16}{10} = \frac{133}{10} = 13.3, \]
\[ s^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n - 1} = \frac{1921 - 10(13.3)^2}{10 - 1} = \frac{152.1}{9} = 16.9. \]
Also,
\[ (1 - \alpha)100\% = 90\% \;\Rightarrow\; 1 - \alpha = 0.90 \;\Rightarrow\; \alpha = 1 - 0.90 = 0.10 \;\Rightarrow\; \alpha/2 = 0.05. \]
Then, with d.f. $= n - 1 = 9$, we have
\[ \chi^2_{\text{right}} = \chi^2_{\alpha/2,\ \text{d.f.}} = \chi^2_{0.05,\ 9} = 16.919, \]
\[ \chi^2_{\text{left}} = \chi^2_{1-\alpha/2,\ \text{d.f.}} = \chi^2_{0.95,\ 9} = 3.325. \]
Now, the 90% confidence interval for the population standard deviation is obtained from the interval for the variance:
\[ \frac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{\text{left}}} \]
\[ \frac{(10 - 1)(16.9)}{16.919} < \sigma^2 < \frac{(10 - 1)(16.9)}{3.325} \]
\[ 8.99 < \sigma^2 < 45.74 \]
\[ \sqrt{8.99} < \sigma < \sqrt{45.74} \]
\[ 3.00 < \sigma < 6.76. \]
Hence, we can be 90% confident that the standard deviation for the number of named storms is between 3.0 and 6.8 storms.
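The arithmetic of this example can be re-checked with a short Python sketch. It computes the sample variance both with `statistics.variance` and with the shortcut formula used in the solution, then applies the interval formula. The chi-square values are the table values quoted in the example, and the variable names are chosen only for this illustration.

```python
import math
import statistics

storms = [10, 5, 12, 11, 13, 15, 19, 18, 14, 16]
n = len(storms)

mean = statistics.mean(storms)
s2 = statistics.variance(storms)  # sample variance (divisor n - 1)

# Shortcut formula: (sum of squares - n * mean^2) / (n - 1)
s2_shortcut = (sum(x * x for x in storms) - n * mean ** 2) / (n - 1)

# Table values for d.f. = 9 and 90% confidence, as quoted in the example.
chi2_right, chi2_left = 16.919, 3.325
sd_low = math.sqrt((n - 1) * s2 / chi2_right)
sd_high = math.sqrt((n - 1) * s2 / chi2_left)

print(f"mean = {mean:.1f}, s^2 = {s2:.1f} (shortcut: {s2_shortcut:.1f})")
print(f"90% interval for sigma: ({sd_low:.2f}, {sd_high:.2f})")
```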
Completed, praise be to God.
