Confidence Interval
“CONFIDENCE COMES NOT FROM
ALWAYS BEING RIGHT BUT FROM
NOT FEARING TO BE WRONG.”
— PETER MCINTYRE
Introduction to Confidence Intervals
An interval estimate of a population parameter, such as the mean or the standard deviation, is an interval
(or range) of values within which the true parameter value is likely to lie with a certain probability.
◦ For instance, a confidence interval for the population mean may be stated as: the population mean lies between 30
and 50, i.e. 30 ≤ μ ≤ 50.
◦ The confidence level, (1-α)100%, may increase or decrease with the significance α, the remaining probability
outside the interval; α may be 0.05, 0.06, 0.03, …
e.g. if 1-α = 0.95, then α = 0.05.
◦ A confidence interval is the interval estimate of a population parameter, estimated from a sample using a specified
confidence level.
Calculating Confidence Interval
◦ Confidence intervals can be calculated using the error bound. The error bound gets its name from the
fact that it provides the boundary of the interval; it is derived from the standard error of the sampling
distribution.
◦ To construct a confidence interval for a single unknown population mean μ, where the population standard
deviation is known, we need the sample mean x̄ as an estimate for μ, and we need the margin of error. Here, the
margin of error is called the error bound for a population mean (abbreviated EBM). The sample mean x̄ is the point
estimate of the unknown population mean μ.
◦ The confidence interval estimate will have the form:
(point estimate - error bound, point estimate + error bound) or, in symbols, $(\bar{x} - EBM,\ \bar{x} + EBM)$.
The mathematical formula for this confidence interval is:
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
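The form (point estimate - error bound, point estimate + error bound) can be sketched in Python as below. This is a minimal sketch, assuming the population standard deviation is known and the sampling distribution of the mean is normal; the function name `z_confidence_interval` and the numbers in the sample call are illustrative and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) for the population mean when sigma is known."""
    alpha = 1.0 - confidence
    # Critical value z_{alpha/2}: leaves an upper-tail area of alpha/2
    # under the standard normal curve.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Error bound for the mean: EBM = z_{alpha/2} * sigma / sqrt(n).
    ebm = z_crit * sigma / sqrt(n)
    return sample_mean - ebm, sample_mean + ebm


# Illustrative call with made-up numbers (not taken from the text).
lower, upper = z_confidence_interval(sample_mean=40.0, sigma=10.0, n=25)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Raising `confidence` widens the interval, because a smaller α pushes the critical value further into the tail.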
Confidence Interval for Population Mean
[Graph]
Example 1: A sample of 100 patients was chosen to estimate the length of stay (LoS) at a hospital.
The sample mean was 4.5 days and the population standard deviation was known to
be 1.2 days.
(a) Calculate the 95% confidence interval for the population mean.
(b) What is the probability that the population mean is greater than 4.73 days?
Solution:
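(a) 95% confidence interval for population mean: We know that $\bar{x} = 4.5$ and
$\sigma = 1.2$, and thus
$$\frac{\sigma}{\sqrt{n}} = \frac{1.2}{\sqrt{100}} = 0.12$$
The 95% confidence interval is given by
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 4.5 \pm 1.96 \times 0.12 = (4.2648,\ 4.7352)$$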
The Excel function CONFIDENCE(α, σ, n) [or CONFIDENCE.NORM(α, σ, n)], where α is the significance, σ is the population standard
deviation, and n is the sample size, returns the value $z_{\alpha/2}\,\sigma/\sqrt{n}$. For the current problem, CONFIDENCE(0.05, 1.2, 100) = 0.235196. The
corresponding confidence interval is
(4.5 - 0.235196, 4.5 + 0.235196) = (4.2648, 4.7352)
(b) Note that 4.73 is approximately the upper limit of the 95% confidence interval from part (a); thus
the probability that the population mean is greater than 4.73 is approximately 0.025.
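The figures in Example 1 can be checked with a short Python sketch using only the standard library. It assumes, as the worked solution does, that the uncertainty about the population mean can be described by a normal distribution centred on the sample mean with standard deviation σ/√n; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 4.5, 1.2, 100   # values given in Example 1
std_error = sigma / sqrt(n)       # 1.2 / 10 = 0.12

# Same quantity the Excel CONFIDENCE function returns:
# z_{alpha/2} * sigma / sqrt(n), with alpha = 0.05.
margin = NormalDist().inv_cdf(1 - 0.05 / 2) * std_error
ci = (x_bar - margin, x_bar + margin)

# Part (b): upper-tail area beyond 4.73 under N(x_bar, std_error^2).
# ASSUMPTION: this treats the mean as normally distributed around x_bar.
p_greater = 1 - NormalDist(mu=x_bar, sigma=std_error).cdf(4.73)

print(f"margin = {margin:.6f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"P(mean > 4.73) = {p_greater:.4f}")
```

The exact tail area comes out slightly above 0.025 because 4.73 lies just below the upper limit 4.7352.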
Example 2: The amount of time (measured in hours) spent by 20 students on an online course is
given in Table 5.2. Assuming that the population of time spent follows a normal distribution
and the standard deviation is 3.1 hours, calculate the 90% confidence interval
for the mean time spent by the students.
TABLE 5.2 Sample time spent by students on an online course
4.7 9.3 8 7.4 9.2 1.7 7.2 8.6 9 6.9
9.2 11.2 7.6 4.9 5.3 2.8 12.3 10.6 5.7 3.8
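Solution: The estimated mean from the sample is $\bar{x} = 7.27$ and the sampling distribution's standard
deviation is $\sigma/\sqrt{n} = 3.1/\sqrt{20} = 0.6932$.
The 90% confidence interval is given by
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 7.27 \pm 1.645 \times 0.6932 = (6.13,\ 8.41)$$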
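The calculation for Example 2 can be reproduced from the raw data in Table 5.2. The following is a minimal sketch using Python's standard library; it assumes the population standard deviation of 3.1 hours stated in the example, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Time spent (hours) by the 20 students, from Table 5.2.
hours = [4.7, 9.3, 8, 7.4, 9.2, 1.7, 7.2, 8.6, 9, 6.9,
         9.2, 11.2, 7.6, 4.9, 5.3, 2.8, 12.3, 10.6, 5.7, 3.8]

sigma = 3.1                 # population standard deviation (given)
n = len(hours)
x_bar = fmean(hours)        # point estimate of the population mean
std_error = sigma / sqrt(n)

alpha = 0.10                # 90% confidence level
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
margin = z_crit * std_error

lower, upper = x_bar - margin, x_bar + margin
print(f"mean = {x_bar:.2f}, standard error = {std_error:.4f}")
print(f"90% CI = ({lower:.2f}, {upper:.2f})")
```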
Confidence Interval for Population Proportions
◦ Requirements for constructing meaningful confidence intervals about the population proportion:
i) The size of our sample is no more than 5% of the size of the population it was drawn from.
ii) If x1, x2, x3, ..., xn are from Bernoulli trials with probability of success p, that is, E(Xi) = p and Var(Xi) = pq (where q = 1-p),
then for a large sample the sampling distribution of the sample proportion $\hat{p}$ has mean p and standard error
$\sqrt{pq/n}$, where n is the sample size.
◦ np(1-p) ≥ 10
◦ If the sample meets the above requirement, $\hat{p}$ has an approximately normal distribution, so
$Z = (\hat{p} - p)/\sqrt{pq/n}$ approximately follows the standard normal distribution. As a rule of thumb, we choose n so that npq ≥ 10.
The (1-α)100% confidence interval for the population proportion is given by
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
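A confidence interval for a proportion, together with the np(1-p) ≥ 10 rule of thumb, can be sketched as follows. This is a sketch under stated assumptions: it uses the normal approximation, with the sample proportion standing in for p when checking the rule of thumb and computing the standard error. The function name and the sample counts (40 successes out of 100) are hypothetical and not from the text.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a population proportion (normal approximation)."""
    p_hat = successes / n
    # Rule of thumb from the text, checked with p_hat in place of the unknown p.
    if n * p_hat * (1 - p_hat) < 10:
        raise ValueError("n*p*(1-p) < 10: normal approximation not advisable")
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    std_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_crit * std_error
    return p_hat - margin, p_hat + margin


# Hypothetical data: 40 successes in a sample of 100.
lower, upper = proportion_confidence_interval(40, 100)
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")
```

The 5% sample-size requirement is not checked here, since it needs the population size, which the function does not take.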
Confidence Interval for Population Mean
When Standard Deviation Is Unknown
If the population follows a normal distribution and the standard deviation is calculated from the sample, then the statistic
$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$$
will follow a t-distribution with (n-1) degrees of freedom.
Here S is the standard deviation estimated from the sample, and $S/\sqrt{n}$ is the standard error. The t-distribution is very
similar to the standard normal distribution: it has a bell shape, and its mean, median and mode are equal to
zero, as in the case of the standard normal distribution.
The (1-α)100% confidence interval for the mean of a population that follows a normal distribution, when the population
standard deviation is unknown, is given by
$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$$
Here $t_{\alpha/2,\,n-1}$ is the value of t under the t-distribution with (n-1) degrees of freedom for which the upper-tail
probability is α/2 (0.025 for a 95% confidence interval). The degrees of freedom are (n-1) because the standard deviation is
estimated from the sample. The absolute value of $t_{\alpha/2,\,n-1}$ can be compared, for different values of α, with the
corresponding $z_{\alpha/2}$ value.
Example: An online grocery store is interested in estimating the basket size (number of items ordered by the
customer) of its customer orders so that it can optimize the size of the crates used for delivering the grocery items. From a
sample of 70 customers, the average basket size was estimated as 24 and the standard deviation estimated from the
sample was 3.8. Calculate the 95% confidence interval for the basket size of the customer order.
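Solution: We know that n = 70, $\bar{x} = 24$ and S = 3.8, so $S/\sqrt{n} = 3.8/\sqrt{70} = 0.4542$, and for 69 degrees of
freedom $t_{0.025,\,69} = 1.9949$. The 95% confidence interval is
$$24 \pm 1.9949 \times 0.4542 = 24 \pm 0.9061$$
Thus the 95% confidence interval for the size of the basket is (23.09, 24.91).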
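The basket-size interval can be checked with a small Python sketch. Python's standard library has no t-distribution quantile function, so the critical value is passed in as a parameter. The value 1.9949 for 69 degrees of freedom is an assumption taken from a standard t table, not from the example text, and the function name is illustrative. If SciPy is available, `scipy.stats.t.ppf(0.975, 69)` should give the same critical value.

```python
from math import sqrt


def t_confidence_interval(sample_mean, sample_sd, n, t_crit):
    """Return (lower, upper) for the mean when sigma is estimated by sample_sd.

    t_crit is t_{alpha/2, n-1}; it must be looked up for n - 1 degrees of freedom.
    """
    margin = t_crit * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Basket-size example: n = 70, sample mean = 24, S = 3.8.
# ASSUMPTION: t_{0.025, 69} ~= 1.9949, read from a standard t table.
lower, upper = t_confidence_interval(24, 3.8, 70, t_crit=1.9949)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # 95% CI: (23.09, 24.91)
```

With 69 degrees of freedom the t critical value is only slightly larger than the normal value 1.96, so the interval is only marginally wider than a z-based one.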
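Deriving a confidence interval for the population variance
Suppose we are about to sample n independent observations from a normally distributed population.
We intend to use the sample variance $s^2$ to estimate the population variance $\sigma^2$.
Then:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{\sum (x_i - \bar{x})^2}{\sigma^2}$$
has a $\chi^2$ distribution with n-1 degrees of freedom, where $\bar{x}$ is the sample mean.
Deriving the confidence interval of the variance:
$$P\left(\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) = 1-\alpha$$
Rearranging for $\sigma^2$ gives the (1-α)100% confidence interval:
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$$
where $\chi^2_{\alpha/2,\,n-1}$ is the value of the chi-square distribution with n-1 degrees of freedom for which α/2 is the right
side area, and $\chi^2_{1-\alpha/2,\,n-1}$ is the value of the chi-square distribution with n-1 degrees of freedom for which 1-α/2 is the right side area.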
Example: Time taken to manufacture an aircraft door is a random variable due to several manual processes and assembly
of more than 1000 parts to make the aircraft door. The sources of variability in door assembly include factors such as the
non-availability of parts, manpower, and machine tools. It is known that the time to assemble a door follows a normal
distribution. The variance of the time taken to manufacture the door was estimated to be 324 (in squared hours) based on a sample of
50 doors. Calculate a 95% confidence interval for the variance in manufacturing aircraft doors.
Solution: We know that n = 50, $S^2 = 324$, $\chi^2_{0.025,\,49} = 70.22$ and $\chi^2_{0.975,\,49} = 31.55$.
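With these values, the interval for the variance follows from dividing (n-1)S² by each chi-square critical value. A minimal Python sketch, using the two critical values quoted in the solution; the function name and parameter names are illustrative:

```python
def variance_confidence_interval(sample_var, n, chi2_upper, chi2_lower):
    """Return (lower, upper) for the population variance of a normal population.

    chi2_upper: chi-square value with right-side area alpha/2     (n - 1 d.f.)
    chi2_lower: chi-square value with right-side area 1 - alpha/2 (n - 1 d.f.)
    """
    df = n - 1
    # The larger critical value gives the lower limit, and vice versa.
    return df * sample_var / chi2_upper, df * sample_var / chi2_lower


# Aircraft door example: n = 50, S^2 = 324, critical values as quoted above.
lower, upper = variance_confidence_interval(324, 50, chi2_upper=70.22, chi2_lower=31.55)
print(f"95% CI for the variance: ({lower:.2f}, {upper:.2f})")
```

The interval is not symmetric around 324 because the chi-square distribution is skewed to the right.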