Unit 5 Estimation: Structure

This document discusses estimation methods, including point estimation and interval estimation. It introduces key concepts such as estimators, point estimates, and criteria for good estimators. Specifically, it defines an estimator as a function of sample observations used to estimate an unknown parameter. A point estimate is a single value of an estimator. It also defines an unbiased estimator as one where the mean of all possible estimates equals the unknown population parameter. The document goes on to describe point estimation, using the sample mean as an example estimator. It then discusses desirable properties of estimators, focusing on unbiasedness as a key criterion for a good estimator.

Uploaded by

Manoj BE

UNIT 5 ESTIMATION

Structure

5.1 Introduction
Objectives
5.2 Point Estimation
5.3 Criteria For a Good Estimator
5.4 Interval Estimation
Confidence Interval for Mean with Known Variance
Confidence Interval for Mean with Unknown Variance
Confidence Interval for Proportion
5.5 Summary

5.1 INTRODUCTION

In Units 2 and 3 you have seen that populations can be described by distributions which
are fully determined with the help of their parameters. For example, in the case of a
binomial distribution, you need to know $n$ and $p$; in a Poisson distribution, you need to
know $\lambda$; and a normal distribution is determined by $\mu$ and $\sigma$. These quantities are called
parameters. The problem with these parameters is that in real-life situations they are
usually unknown. We have seen in Unit 4 that in such situations we take a random
sample from the population and compute a function of the sample values, called a
statistic. More precisely, we try to estimate the population parameters by functions of
sample values. In this unit we shall discuss certain methods by which we can estimate
the population parameters. These processes are called estimation. As we have already
stated, in estimation we expect that the sample value is 'reasonably close' to the
population value. How do you judge this? Here we discuss some criteria which tell us
how best the parameters can be estimated by sample values.
In this unit we discuss two methods of estimation - point estimation and interval
estimation. In Sec. 5.2 we discuss point estimation. Point estimation concerns choosing
a statistic, that is, a single number calculated from the sample data. In contrast to this,
we sometimes obtain an interval in which we can expect the parameter to lie with some
degree of confidence. The method of constructing such intervals is called 'interval
estimation'. In Sec. 5.4, we illustrate the construction of such an interval. There we first
illustrate how such an interval is constructed for the population mean. We do this in
different cases. First we consider the case when the population standard deviation is
known and the sample size is large (n > 30). Then we take up the case where the
standard deviation is unknown, both when the sample size is small and when it is large.
After that we shall illustrate how interval estimates are constructed for the population
proportion.
Objectives
After reading this unit, you should be able to
• choose an estimator corresponding to a particular situation under study,
• check whether an estimator is unbiased or efficient,
• construct confidence intervals for the population mean and proportion, using the
  appropriate sampling distribution,
• distinguish between point estimation and interval estimation.

5.2 POINT ESTIMATION

Imagine that you need to find the mean life-time of the bulbs produced by a company.
Assume that the life of a bulb is normally distributed with mean $\theta$. Now to find the
life-time of a bulb, you have to keep it on till it burns out, and note the time. So, it is a
destructive process. If you do this for every bulb produced by the company, it will soon
have to close down! The way out in this situation is to take a sample of the bulbs, and
try to estimate the average life-time of the population on the basis of the life-time
observations obtained from the sample. Of course, we cannot hope to get the exact
value of the mean life-time. What we get from the sample is only an estimate. If
$x_1, x_2, \ldots, x_n$ are the life-times of the bulbs which were chosen in a sample of size $n$,
then we could take the sample mean, $\frac{x_1 + x_2 + \cdots + x_n}{n}$, as an estimate of the population
mean. Of course, this estimate will vary from sample to sample. You have already
come across this concept in the previous unit.
But apart from the sample mean, there could be other ways of estimating the population
mean from the sample. For example, we could take a single observation $x_i$ as an estimate,
or we could take $\frac{x_{\min} + x_{\max}}{2}$ as an estimate, where $x_{\min}$ is the minimum value and
$x_{\max}$ is the maximum value.
[$\frac{x_1 + x_2 + \cdots + x_n}{n}$ can be written as $\frac{\sum_{i=1}^{n} x_i}{n}$.]
In any case, the estimate is always based on some or all of the sample values. That is to
say that we calculate some sample statistic and take it as an estimate of the population
parameter. This sample statistic is called an estimator. The value of this estimator for
our sample is the estimate.
Definition 1: An estimator is a function of the sample observations that is used to
estimate an unknown parameter. A point estimate is a single value of an estimator.
The process by which we choose an estimator and find the point estimate for estimating
an unknown parameter is called point estimation.
For example, if a sample mean is used to estimate a population mean, and if the sample
mean for a particular sample equals 10, then the estimator used is the sample mean,
whereas the point estimate is 10.
To cite another example, suppose we are interested in finding the proportion of
individuals in India preferring a given soft drink over another. Here the population
parameter is proportion. If the sample proportion is used to estimate the population
proportion and if the sample proportion for a particular sample equals 0.6, then the
estimator used is the sample proportion and the point estimate is 0.6.
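The distinction between an estimator and an estimate can also be seen in a few lines of Python. The following is only an illustrative sketch: the function names and the five bulb life-times are made up for this illustration and do not come from any real data. Each function plays the role of an estimator (a rule), and the number it returns for one sample is the estimate.

```python
# An estimator is a rule (a function of the sample observations);
# an estimate is the single number that rule returns for one sample.

def sample_mean(xs):
    """Estimator: arithmetic mean of the observations."""
    return sum(xs) / len(xs)

def midrange(xs):
    """Estimator: average of the smallest and the largest observation."""
    return (min(xs) + max(xs)) / 2

# Hypothetical life-times (in hours) of five bulbs, assumed for illustration.
lifetimes = [1010, 980, 1025, 995, 990]

mean_estimate = sample_mean(lifetimes)      # point estimate from the sample mean
midrange_estimate = midrange(lifetimes)     # point estimate from the midrange

print("estimate from sample mean:", mean_estimate)
print("estimate from midrange:   ", midrange_estimate)
```

Both functions estimate the same population mean, yet they give different numbers for the same sample. This is why we need criteria for choosing between estimators.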
Why don't you try an exercise now?

E1) Write the estimator and estimate used in the following two situations.
i) Suppose an organisation wants to have some information about the mileage
for a whole fleet of used taxis, and for that they calculate the mean odometer
reading (mileage) from a sample of used taxis and find it to be 98,000 miles.
ii) Suppose we want to find the proportion of teenagers who have a criminal
record, and for that we take a sample of 50 teenagers and find that 2% (or
0.02) have a criminal record.

We can, in fact, have a number of estimators for a given parameter. Apart from the
sample mean, the sample median or the average of the smallest and the largest
observations in the sample could also be considered as estimators for the population
mean. Since we have a variety of estimators for a parameter $\theta$, we should choose the
best of the lot to get a really good estimate. But what do we mean by the best? We'll see
that in the next section.

5.3 CRITERIA FOR A GOOD ESTIMATOR

In this section we shall discuss some desirable properties of an estimator. You have
already learnt in Unit 4 that an estimator takes different values for different samples.
But estimators such as the sample mean and the sample proportion have some nice
properties. For example, the sample mean has the property that the means of repeated
random samples taken from a given population will centre on the population mean. You
may recall that in Sec. 4.2 of Unit 4 we stated the result that the mean of the sampling
distribution of the means is equal to the population mean. This means that 'on the
average' the estimator values (or estimates) will equal the parameter value. This property
is considered to be one of the criteria for a good estimator. We have a definition here.

Definition 2: Suppose $\hat{\theta}$ (read theta hat) is an estimator of the population parameter $\theta$.
The estimator $\hat{\theta}$ takes different values for different samples. If the mean of all these
different estimates is the unknown parameter $\theta$, then we say that $\hat{\theta}$ is an unbiased
estimator of $\theta$. Otherwise, it is called a biased one. It follows from the definition of
expectation of a r.v. that $\hat{\theta}$ is unbiased if and only if $E(\hat{\theta}) = \theta$. [Please see Sec.3.2, Unit
3, where we have discussed the expectation of a r.v.]
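The idea that the estimates equal the parameter 'on the average' can be checked by a small simulation. The sketch below is only an illustration under assumed values: the population is taken to be normal with mean 50 and standard deviation 10, and the sample size and number of repetitions are arbitrary choices. It draws many samples, computes the sample mean of each, and then averages these estimates.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run can be repeated

# Assumed population, chosen only for illustration.
POP_MEAN, POP_SD = 50.0, 10.0
SAMPLE_SIZE, REPETITIONS = 10, 20000

# Draw many samples and record the sample mean (the estimate) of each one.
estimates = [
    statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE))
    for _ in range(REPETITIONS)
]

# For an unbiased estimator the average of the estimates is close to the parameter.
average_of_estimates = statistics.fmean(estimates)
print("population mean:        ", POP_MEAN)
print("average of the estimates:", round(average_of_estimates, 3))
```

Individual estimates differ from 50, sometimes by several units, but their average settles very close to 50. This is what unbiasedness means; it says nothing about how far a single estimate may be from the parameter.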
Let us now consider some problems in which unbiased estimates are obtained.
Problem 1: A psychologist measures the reaction times of a sample of 6 individuals to a
certain stimulus. The measurements are 0.53, 0.46, 0.50, 0.49, 0.52, 0.53 seconds.
Determine an unbiased estimate of the population mean.
Solution: An unbiased estimate of the population mean is given by the sample mean,
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$
Here $n = 6$ and $x_1 = 0.53$, $x_2 = 0.46$, $x_3 = 0.50$, $x_4 = 0.49$, $x_5 = 0.52$, $x_6 = 0.53$. Then
$$\bar{x} = \frac{3.03}{6} = 0.505 \approx 0.51 \text{ seconds}.$$
Therefore an unbiased estimate is 0.51 seconds, and 0.51 seconds is a point estimate
for the mean reaction time of individuals to the stimulus.

Problem 2: In a sample of 400 textile workers, 184 expressed dissatisfaction regarding
a prospective plan to modify working conditions. The management felt that this is a
strong negative reaction. So they want to know the proportion of total workers who
have this feeling of dissatisfaction. Obtain an unbiased estimate of the population
proportion.
Solution: A point estimate of the population proportion is given by the sample
proportion $p$, given as
$$p = \frac{s}{n}$$
where $s$ denotes the number of observations in the sample which meet the particular
characteristic under study, and $n$ is the sample size.
Here $s = 184$ and $n = 400$.
$$\therefore\ p = \frac{184}{400} = \frac{46}{100}$$
Therefore an unbiased estimate of the population proportion is $\frac{46}{100}$, i.e. 0.46.

Here are some exercises for you.

E2) A law firm selects a random sample of 60 electronics stores in a particular area,
and asks each of them to repair a compact disc player. In each case the law firm
determines whether the store makes unnecessary repairs in order to inflate its bill.
The law firm finds that 8 of the stores are guilty of this practice. Obtain a point
estimate of the proportion of all such stores in the area that inflate bills in this way.

E3) A washing machine company chooses a random sample of 25 motors from those
it receives from one of its suppliers. It determines the length of life of each of the
motors. The results (expressed in thousands of hours) are as follows:
4.1 4.6 4.6 4.6 5.1
4.3 4.7 4.6 4.8 4.8
4.5 4.2 5.0 4.4 4.7
4.7 4.1 3.8 4.2 4.6
3.9 4.0 4.4 4.0 4.5
The firm's management is interested in estimating the mean length of life of the
motors received from the supplier. Provide a point estimate of this population
parameter.

We have seen that the sample mean and sample proportion are unbiased estimates for
population mean and population proportion respectively. Does this indicate that the
statistic or estimator corresponding to the population parameter is always unbiased? To
find an answer to this, let us consider the following example.
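Suppose we consider the parameter 'standard deviation'. Then the sample statistic $S$
given by the formula
$$S = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2} \qquad (1)$$
where $(x_1, x_2, \ldots, x_n)$ denote the sample observations, can be taken to be an estimator
of the population standard deviation $\sigma$. It has been proved that $S^2$ has an
expected value equal to $\left(\frac{n-1}{n}\right)\sigma^2$ and not $\sigma^2$. This means that $S^2$ is not an unbiased
estimator of $\sigma^2$, and $S$ tends to underestimate $\sigma$. The bias is corrected by dividing by
$n - 1$ instead of $n$, that is, by using the expression
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} \qquad (2)$$
instead of the expression in (1); $s^2$ is then an unbiased estimator of $\sigma^2$. For example,
the estimate of the population standard deviation given by (2) for the situation in
Problem 1 is
$$s = \sqrt{\tfrac{1}{5}\left[(0.53 - 0.51)^2 + (0.46 - 0.51)^2 + (0.50 - 0.51)^2 + (0.49 - 0.51)^2 + (0.52 - 0.51)^2 + (0.53 - 0.51)^2\right]}$$
$$\therefore\ s = \sqrt{\frac{0.0039}{5}} = \sqrt{0.00078} \approx 0.028 \text{ seconds}.$$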
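The effect of dividing by $n$ or by $n - 1$ can be seen numerically for the reaction times of Problem 1. The sketch below is only an illustration; the variable names are our own. It uses the unrounded sample mean 0.505 rather than the rounded value 0.51, so the last digits can differ slightly from a hand calculation.

```python
import math
import statistics

# Reaction times (in seconds) from Problem 1.
times = [0.53, 0.46, 0.50, 0.49, 0.52, 0.53]
n = len(times)

mean = statistics.fmean(times)
sum_of_squares = sum((x - mean) ** 2 for x in times)

sd_divide_n = math.sqrt(sum_of_squares / n)                # divisor n
sd_divide_n_minus_1 = math.sqrt(sum_of_squares / (n - 1))  # divisor n - 1

print("divisor n:    ", round(sd_divide_n, 4))
print("divisor n - 1:", round(sd_divide_n_minus_1, 4))

# The standard library provides both versions directly.
print("statistics.pstdev:", round(statistics.pstdev(times), 4))  # divisor n
print("statistics.stdev: ", round(statistics.stdev(times), 4))   # divisor n - 1
```

The value obtained with divisor $n - 1$ is always the larger of the two, and the difference becomes negligible as the sample size grows.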
As we have seen in E1, in certain situations one can find more than one unbiased
estimator for an unknown parameter $\theta$. If we have to choose between two unbiased
estimators for a fixed sample size, then we find the standard deviation (or variance) of
the sampling distributions of these two estimators and choose the one with the smaller
standard deviation (or variance). An unbiased estimator $T_1$ of a parameter $\theta$ is said to
be more efficient than another unbiased estimator $T_2$ of $\theta$ if $\mathrm{Var}(T_1) < \mathrm{Var}(T_2)$, and in
such a case, the sampling distribution of $T_1$ has a smaller dispersion (spread) about $\theta$
than that of $T_2$ (See Fig.1).

Fig.1: Sampling distributions of a more-efficient and a less-efficient estimator (horizontal axis: value of estimator).
As an example, let us take a random sample of size $n$ from a normal population with
mean $\mu$ and standard deviation $\sigma$ and consider the sample mean and sample median as
two estimators of $\mu$. If we compare the sampling distributions of the mean and median
for random samples of size $n$, we find that these two sampling distributions have the
same mean but their variances differ. We have seen in Unit 4 that the variance of the
sampling distribution of the mean is $\frac{\sigma^2}{n}$, and it can be shown that for random samples
of the same size from a normal population, the variance of the sampling distribution of
the median is approximately $1.5708\,\frac{\sigma^2}{n}$.
Hence we get that both the mean and the median are unbiased estimators, but for a
given sample size, the standard error of the mean is less than that of the median.
From what we have observed, we get that for random samples from normal
populations the mean is more efficient than the median as an estimator of $\mu$. This fact
will be clearer to you when you try E4. In fact it can be shown that in most practical
situations where we estimate a population mean $\mu$, the variance of the sampling
distribution of no other statistic is less than that of the sampling distribution of the
mean. In other words, in most practical situations the sample mean is the 'most
acceptable' statistic for estimating a population mean $\mu$.
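The comparison between the mean and the median can be tried out by simulation. The following sketch is only an illustration: the standard normal population, the sample size 25 and the number of repetitions are our own assumptions. It estimates the variances of the two sampling distributions and their ratio.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the run can be repeated

# Assumed normal population and simulation settings (illustrative choices).
MU, SIGMA = 0.0, 1.0
N, REPS = 25, 4000

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Variances of the two simulated sampling distributions.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)
ratio = var_median / var_mean

print("variance of sample means:  ", round(var_mean, 4))    # theory: sigma^2 / n = 0.04
print("variance of sample medians:", round(var_median, 4))
print("ratio median / mean:       ", round(ratio, 2))
```

The ratio comes out well above 1, so the medians are more spread out than the means. The factor 1.5708 is an approximation for large samples, and for a moderate sample size the simulated ratio is usually a little smaller.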
There exist several other criteria for assessing the "goodness" of estimators, but we
shall not discuss them in this course.
Why don't you try this exercise now?

E4) To verify the claim that the mean is generally more efficient than the median, a
student conducted an experiment consisting of 12 tosses of three dice. The
following are his results: 2, 4 and 6; 5, 3 and 5; 4, 5 and 3; 5, 2 and 3; 6, 1 and 5;
2, 3 and 1; 3, 1 and 4; 5, 5 and 2; 3, 3 and 4; 1, 6 and 2; 3, 3 and 3; and 4, 5 and 3.
a) Calculate the 12 medians and the 12 means.
b) Group the medians and the means obtained in part (a) into separate
distributions having the classes 1.5-2.5, 2.5-3.5, 3.5-4.5 and 4.5-5.5.
c) Draw histograms of the two distributions obtained in part (b) and explain
how they illustrate the claim that the mean is generally more efficient than
the median.

We can summarise our discussion up to now as follows:

• Population parameters are usually unknown and need to be estimated from a
  sample.
• There could be a variety of estimators for the same parameter.
• "Unbiasedness" and "efficiency" are some of the desirable properties of a good
  estimator.

5.4 INTERVAL ESTIMATION

In the last section we have seen what a point estimate is. Sometimes it is difficult to
evaluate the precision of a point estimator (as measured by its variance, say).
Alternatively, we can think of giving an interval, computed on the basis of the sample
values, which will contain the true parameter with a certain degree of confidence. This
interval is called an interval estimator of the parameter. These intervals are also called
confidence intervals. We shall first discuss confidence intervals for the population mean
$\mu$. We first consider the case when the variance $\sigma^2$ is known.

5.4.1 Confidence Interval for the Mean with Known Variance

Suppose you have been suspecting that the 1 litre pack of milk that is delivered to your
house every morning is not exactly 1 litre, but less. You feel that the filling machine
which is supposed to fill each polypack with 1 litre of milk is not working properly. Of
course, you are ready to admit that even though the machine is set for 1 litre, it has a
certain variability, and so some packs could contain less than 1 litre while others
contain more.
To end your doubts, you need to find the average volume of milk filled by the machine.
Obviously, it would be impossible to do this except by taking a sample. Suppose you
measure the milk pack you get over a period of sixty days. That is, your sample size is
60. Suppose you find that the mean of your observations, which is the sample mean, is
950 ml. This is an estimate of the population mean. But you cannot immediately
conclude that the machine is set for 950 ml. You must account for the variability of the
sample means. For this you must also know the standard deviation, $\sigma$, or calculate it
from the sample. Suppose we assume that $\sigma = 50$ ml.
Now we shall construct an interval for the parameter $\mu$, the average amount of milk that
the machine gives. For that we make use of the central limit theorem discussed in Unit
4. According to this theorem, for sufficiently large sample size $n$ the sample mean $\bar{x}$ is
approximately normally distributed with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$. Then we
make use of the normal distribution table given in Appendix 2 at the end of this block
and note that
$$P[-1.96 < Z < 1.96] = 0.95 \qquad (3)$$
where $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, i.e. $Z\left(\frac{\sigma}{\sqrt{n}}\right) = \bar{x} - \mu$.
Now we rewrite Eqn.(3) using simple algebra as
$$P\left[-1.96\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < 1.96\frac{\sigma}{\sqrt{n}}\right] = 0.95$$
Now we subtract $\bar{x}$ from all the three terms inside the bracket. Then we get
$$P\left[-\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right] = 0.95$$
Now we multiply all the terms inside the bracket by $-1$ (and therefore the inequalities
get reversed) and we get
$$P\left[\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right] = 0.95 \qquad (4)$$
[If we multiply the terms in the inequality $y \geq 1$ by $(-1)$, then the inequality gets
reversed and we get $-y \leq -1$.]
Thus corresponding to each sample mean $\bar{x}$, we get an interval given by
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right) \qquad (5)$$
which satisfies Equation (4). Let us now see what Equation (4) implies. Let us, for
example, consider the sample value $\bar{x} = 950$ ml obtained for the problem regarding the
average volume of milk filled by the machine. With $\sigma = 50$ and $n = 60$ we have
$1.96\frac{\sigma}{\sqrt{n}} = 1.96 \times \frac{50}{\sqrt{60}} \approx 12.65$, so Equation (4) corresponding to
$\bar{x} = 950$ ml is
$$P[937.35 < \mu < 962.65] = 0.95$$
We interpret it in the way that we are 95% confident that the interval (937.35, 962.65)
contains the true value $\mu$. This does not mean that "there is 95% probability that $\mu$ lies
in the interval (937.35, 962.65)". This is a very common misinterpretation of
Equation (4) and it is incorrect. This is because the population mean $\mu$ is a fixed
quantity and therefore $\mu$ either lies in the interval (937.35, 962.65) or it does not.
Therefore the probability that $\mu$ lies in this particular interval is either 0 or 1. The 95
percent is attached to our level of confidence that the interval contains $\mu$, that is, to the
procedure used to construct the interval. It is not the probability that $\mu$ lies in the interval.
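The arithmetic of the milk example can be reproduced with a short Python sketch. The function name `z_interval` is our own choice and is used here only for illustration; the multiplier 1.96 is the table value used in Equation (3).

```python
import math

def z_interval(sample_mean, sigma, n, z=1.96):
    """Return (sample_mean - z*sigma/sqrt(n), sample_mean + z*sigma/sqrt(n))."""
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Milk example: sample mean 950 ml, sigma assumed to be 50 ml, n = 60 days.
lower, upper = z_interval(sample_mean=950, sigma=50, n=60)
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The same function can be used with any other sample mean, standard deviation and sample size, as long as the standard deviation is known and the sample is large enough for the normal approximation to hold.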
Another interpretation of Equation (4) is based on the fact that we can construct a
confidence interval for each sample mean $\bar{x}$. We will get different intervals for different
values of sample means. So, in this case Equation (4) says that if all possible samples
of size $n$ are drawn, and the intervals are calculated
for each sample, then 95% of all such intervals are expected to contain the population
parameter $\mu$. This does not mean that for a particular sample value $\bar{x}$, we can be sure
that the interval $\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$ will contain $\mu$.
[The confidence interval $\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$ is also denoted as
$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$.]
That means if you select 100 samples and calculate the intervals about their
sample means, then about 95 of these can be expected to contain the population mean $\mu$.
Note that here we assume that $\sigma$ is known. In the following figure we have illustrated
this graphically, showing five such intervals.
Fig.2: A number of intervals constructed around the population mean.
Only one of the intervals, constructed around one of the sample means, does not
contain the population mean.
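The long-run meaning of the 95% figure can be checked by simulation. The sketch below is only an illustration under assumed values: we pretend that the true mean is 1000 ml and the standard deviation is 50 ml (in practice the true mean is unknown), draw many samples of size 60, and count how often the interval around the sample mean contains the true mean.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so that the run can be repeated

# Hypothetical "true" population values, assumed only for the simulation.
MU, SIGMA = 1000.0, 50.0
N, REPS = 60, 10000
margin = 1.96 * SIGMA / math.sqrt(N)

covered = 0
for _ in range(REPS):
    xbar = statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    # Does the interval around this sample mean contain the true mean?
    if xbar - margin < MU < xbar + margin:
        covered += 1

coverage = covered / REPS
print("proportion of intervals containing the true mean:", coverage)
```

The proportion comes out close to 0.95. Any single interval either contains the true mean or does not; the 95% describes how the procedure behaves over many samples.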
The interval given by (5) is called a confidence interval (C.I.).
The value 0.95 (or 95%) attached with the confidence interval is called the confidence
coefficient. The left end point of the confidence interval is called the lower confidence
limit (LCL) and the right end point of the confidence interval is called the upper
confidence limit (UCL). The difference between the UCL and LCL is the width of the
confidence interval. The width of the 95% confidence interval in the above example is
$$962.65 - 937.35 = 25.30 \text{ ml}.$$
Although 0.95 is frequently used as a confidence coefficient, we can have other values
such as 0.90 or 0.99 as confidence coefficients. Using the normal distribution table, we
can obtain the confidence interval for 0.90 (or 90%) as
$$\left(\bar{x} - 1.645\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.645\frac{\sigma}{\sqrt{n}}\right)$$
and for 0.99 (or 99%) as
$$\left(\bar{x} - 2.58\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2.58\frac{\sigma}{\sqrt{n}}\right)$$
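The multipliers read from the normal distribution table can also be obtained from Python's standard library. The sketch below is only an illustration; the helper name `z_value` is our own. It uses `statistics.NormalDist` (available from Python 3.8) to find the value $z$ for which $P[-z < Z < z]$ equals the chosen confidence coefficient.

```python
from statistics import NormalDist

def z_value(confidence):
    """Return z such that P[-z < Z < z] = confidence for a standard normal Z."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.2f} -> z = {z_value(confidence):.3f}")
```

A higher confidence coefficient needs a larger multiplier, and therefore gives a wider interval for the same sample.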

In the following problem we illustrate the use of confidence intervals.

Example 1: The Director of a marketing division wants to analyse the market value of
business firms of a similar size. [Market value is defined as the number of common
shares outstanding, multiplied by the share price as listed on an organised exchange.] A
sample of 600 firms revealed a mean market value of Rs.850 million. Earlier results
reveal that the population is normally distributed with population standard deviation
Rs.200 million. It is desired to set up a confidence interval for the (unknown) mean
market value.
We are given that the sample mean is $\bar{x} = 850$ million, the sample size is $n = 600$ and
$\sigma = 200$ million. Therefore we can construct the 95% confidence interval, which is given by
$$\left(850 - 1.96 \times \frac{200}{\sqrt{600}},\ 850 + 1.96 \times \frac{200}{\sqrt{600}}\right).$$
Since $\frac{200}{\sqrt{600}} \approx 8.16$, this is $(850 - 1.96 \times 8.16,\ 850 + 1.96 \times 8.16) \approx (834.01, 865.99)$.
This shows that the director can be 95% confident that the interval (834.01, 865.99)
contains the mean market value.

It is time to do some exercises.

E5) For each of the values given below, calculate the 95% confidence interval for the
mean.
i) $\bar{x} = 0$, $\sigma = 10$, $n = 8$
ii) $\bar{x} = 550$, $\sigma = 40$, $n = 16$.
E6) If the mean length of hospitalisation of 140 patients was 11.4 days and the
standard deviation of patient days is assumed to be 2.5 days, what is the 99%
confidence interval for the average length of stay? Assume normality.

E7) Estimate the number of days between germination and the first pickable
cucumbers using the following sample.
Date of germination    First fruit
May 1                  June 17
May 4                  June 18
May 8                  June 21
May 5                  June 16
May 12                 June 28
May 18                 July 3
May 11                 June 25
May 9                  June 26
What is the 95% confidence interval assuming $\sigma = 2$ days?

Next we shall consider the case when $\sigma$ is unknown.

5.4.2 Confidence Interval for Mean with Unknown Variance

In all the computations of the confidence interval for $\mu$ so far we have assumed that the
population variance is known. Each time, the normal distribution was the appropriate
sampling distribution used to determine the confidence intervals. However, the normal
distribution is not appropriate when the population variance is unknown and the sample
size is less than 30. In such situations we use the t-distribution. As indicated in the previous
unit, the sample standard deviation 's' is generally used as an estimator of the
population standard deviation.
If the sample size is 30 or less and the population is normal (and large relative to the
sample), a confidence interval for the population mean can be constructed by using the
t-distribution in place of the standard normal distribution.
You are already familiar with the t-distribution from Unit 4. We now have to use the
t-distribution table given in the Appendix to construct the confidence intervals corresponding
to different levels of confidence, say 95% or 99%. Let us suppose that we want to find
the confidence interval at the 90% confidence level, i.e. $\alpha = 0.1$, with a sample size of
14, similar to the one we have given in Equation (5). Note that we don't know $\sigma$ in this
case. Therefore, as indicated in Unit 4, the sample standard deviation $s$ is used as an
estimator of the population standard deviation. Thus if $s$ is known, then the 90% confidence
interval is given as
$$\left(\bar{x} - t_{0.05}\frac{s}{\sqrt{n}},\ \bar{x} + t_{0.05}\frac{s}{\sqrt{n}}\right) \qquad (6)$$
where $t_{0.05}$ is the t-value corresponding to the value $\frac{\alpha}{2} = \frac{0.1}{2} = 0.05$ and for the
parameter $\nu = n - 1$, where $n$ is the sample size. Now to find the t-value we make use
of Table 1 in the Appendix. For example, suppose that $n = 14$; then $\nu = 13$, and
from Table 1 we get that the t-value is $t_{\alpha/2} = 1.771$ (See Fig. 3).

Fig.3: Confidence interval using the t-distribution (n = 14, 13 degrees of freedom; an area of 0.05 under the curve lies in each tail).

Like a z-value, the t-value 1.771 shows that if we mark off plus and minus $1.771\frac{s}{\sqrt{n}}$ on
either side of the mean $\bar{x}$, the area under the curve between these two limits will be
90%, and the area outside these limits will be 10%.
Therefore the 90% confidence interval for 13 degrees of freedom is
$$\left(\bar{x} - 1.771\frac{s}{\sqrt{n}},\ \bar{x} + 1.771\frac{s}{\sqrt{n}}\right).$$
Similarly the 95% confidence interval for 20 degrees of freedom is
$$\left(\bar{x} - 2.086\frac{s}{\sqrt{n}},\ \bar{x} + 2.086\frac{s}{\sqrt{n}}\right) \qquad (7)$$
In a similar way we can find confidence intervals for different degrees of freedom
(see E8).
Let us consider some examples.
Example 2: A sample of 10 measurements of the diameter of a sphere has a mean
$\bar{x} = 43.5$ mm and $s^2 = 4$ mm$^2$. Let us find the i) 95% and ii) 99% confidence intervals for
the actual diameter.
i) Here $n = 10$ and $\alpha = 0.05$. Therefore, we use the t-distribution with 9 d.f. From the
table, we get $t_{\alpha/2} = t_{0.025} = 2.26$. Since $s = \sqrt{4} = 2$, the 95% confidence interval for $\mu$ is
$$\left(43.5 - 2.26\left(\frac{2}{\sqrt{10}}\right),\ 43.5 + 2.26\left(\frac{2}{\sqrt{10}}\right)\right) = (42.07, 44.93).$$
So, we can be 95% confident that the true mean lies between 42.07 and 44.93.

ii) Working similarly, the 99% confidence interval is
$$\left(43.5 - 3.25\left(\frac{2}{\sqrt{10}}\right),\ 43.5 + 3.25\left(\frac{2}{\sqrt{10}}\right)\right) = (41.44, 45.56).$$
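The calculation in Example 2 can be repeated with a short Python sketch. This is only an illustration: the function name `t_interval` is our own, and the t-values 2.26 and 3.25 are typed in as read from the t-table for 9 degrees of freedom, because the Python standard library has no t-distribution.

```python
import math

def t_interval(sample_mean, s, n, t_value):
    """Return (sample_mean - t*s/sqrt(n), sample_mean + t*s/sqrt(n))."""
    margin = t_value * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# t-values for 9 degrees of freedom, taken from the t-table (not computed here).
T_95_DF9 = 2.26
T_99_DF9 = 3.25

s = math.sqrt(4)  # sample standard deviation, since s^2 = 4
ci_95 = t_interval(43.5, s, 10, T_95_DF9)
ci_99 = t_interval(43.5, s, 10, T_99_DF9)

print("95%% interval: (%.2f, %.2f)" % ci_95)
print("99%% interval: (%.2f, %.2f)" % ci_99)
```

The 99% interval is wider than the 95% interval, because a larger t-value is needed to reach the higher level of confidence.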

The idea will become clearer when you go through the following problem.
Problem 3: A manufacturer of light bulbs wants to estimate the mean length of life of
a new type of bulb which is designed to be extremely durable. The firm's engineer tests
nine of these bulbs and finds that the length of life (in hours) of each is as follows:

Previous experience indicates that the lengths of life of individual bulbs of a particular
type are normally distributed. Construct a 90 percent confidence interval for the mean
length of life of all bulbs of this new type.
Solution: If $x_i$ is the length of life of the $i$th light bulb in the sample, we compute the
sample mean $\bar{x} = \frac{1}{9}\sum_{i=1}^{9} x_i$ and the sample standard deviation $s$.
Since $n = 9$, we make use of the t-distribution. Because a 90 percent confidence interval is
wanted, $t_{\alpha/2} = t_{0.05}$; and the number of degrees of freedom is $(n - 1) = 8$. Therefore,
the t-distribution table given in the Appendix of Unit 4 shows that if there are 8 degrees
of freedom, $t_{0.05} = 1.86$. Thus, the desired confidence interval is
$$\left(\bar{x} - 1.86\frac{s}{\sqrt{9}},\ \bar{x} + 1.86\frac{s}{\sqrt{9}}\right)$$
By simplifying, we get that the confidence interval is (5107, 5293).

Why don't you try these exercises now?

E8) Given the following sample sizes and confidence levels, find the appropriate $t_{\alpha/2}$
values for constructing confidence intervals.
i) n = 10; 99%
ii) n = 28; 95%
iii) n = 13; 90%
iv) n = 25; 99%

E9) A sample of 12 measurements of breaking strengths of cotton threads gave a mean
of 0.738 N and a standard deviation of 0.124 N. Find the 95% and 99% confidence
intervals for the actual breaking strength.

E10) Five measurements of the reaction time of an individual to certain stimuli were
recorded as: 0.28, 0.30, 0.27, 0.33 and 0.31 seconds. Find the 95% confidence
interval for the actual reaction time.

E11) If you are given a sample of 20 candles from a large shipment of candles, and are
asked to give an interval estimate of their average burning life, how would you
proceed? What information would you need?

The above examples and exercises illustrate how we can use the t-distribution to find the
confidence intervals. As we mentioned earlier, the t-distribution is used when the
population variance is unknown and the sample size is small. Next we shall see how
to construct the confidence intervals for large samples when the population variance is
unknown.
Mathematicians have shown that if the sample size is large, we can simply substitute
the sample standard deviation for the population standard deviation in the results
obtained in the previous part of this section, i.e. in Subsection 5.4.1. Thus, if we want to
construct a 95% confidence interval - that is, a confidence interval with a confidence
coefficient of 95 percent - we can substitute $s$ for $\sigma$ in Equation (5), the result being
$$\bar{x} \pm 1.96\frac{s}{\sqrt{n}}$$
Consequently, the confidence interval is
$$\left(\bar{x} - 1.96\frac{s}{\sqrt{n}},\ \bar{x} + 1.96\frac{s}{\sqrt{n}}\right) \qquad (8)$$
Equation (8) is applicable only if the population is large relative to the sample.
The following example should make the above discussion more clear.
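Problem 4: A random sample of 100 ball bearings made by a machine in 1 week was
taken. The mean diameter was found to be 8.24 mm with a standard deviation of 0.42
mm. Find the 95% and 99% confidence intervals for the mean diameter of ball bearings
produced by that machine.
Solution: Since the sample is large, from Equation (8), we get that the 95% confidence
interval for $\mu$ is
$$\left(8.24 - 1.96 \times \frac{0.42}{\sqrt{100}},\ 8.24 + 1.96 \times \frac{0.42}{\sqrt{100}}\right) = (8.24 - 0.082,\ 8.24 + 0.082) = (8.158, 8.322).$$
Similarly, the 99% confidence interval is
$$\left(8.24 - 2.58 \times \frac{0.42}{\sqrt{100}},\ 8.24 + 2.58 \times \frac{0.42}{\sqrt{100}}\right) = (8.24 - 0.108,\ 8.24 + 0.108) = (8.132, 8.348).$$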
See if you can solve these exercises:

E12) A random sample of marks obtained by 50 students in Mathematics showed a
mean of 75 and a standard deviation of 10.
a) What are the 95% confidence limits for the mean marks in Mathematics?
b) With what degree of confidence can we say that the mean marks are between
74 and 76?
E13) A washing machine company's statistician says that the 90% confidence interval for
the mean length of life of motors received from Supplier II is 4,500 to 4,800 hours,
based on a sample of 36 motors. The statistician also says that the standard
deviation of the lengths of life of motors received from Supplier II is 500 hours. Is
there any contradiction between the statements? If so, what is the contradiction?

Next we shall illustrate how confidence intervals are calculated for population
proportions. We have talked at length about the estimation of the population mean $\mu$.
Another important population parameter that we need to estimate is the population
proportion, $\pi$. Let's see how to go about it.

5.4.3 Confidence Interval for Population Proportion

Let us start with a situation. In a random sample of 25 men from a city, 8 were found to
be smokers. Can we estimate the proportion of smokers in the city?
Suppose the proportion of smokers in the city is $\pi$. Then $p = \frac{8}{25}$ is a point estimate of $\pi$,
obtained from this sample. Now we shall construct a confidence interval to estimate $\pi$. To do
this we proceed in the same way as we did for the population mean. We recall the result
from Unit 4 that the sampling distribution of the sample proportion $p$ has mean $\pi$ and
standard deviation $\sqrt{\frac{\pi(1 - \pi)}{n}}$. We also know from Unit 4 that if the sample size is
sufficiently large and if $\pi$ is not very close to 0 or 1, the sampling distribution is
approximately a normal distribution. Then using the standard normal distribution
table, we can find confidence intervals. If we want to construct a 95% confidence
interval, then that will be given by
$$P\left[-1.96 < \frac{p - \pi}{\sqrt{\pi(1 - \pi)/n}} < 1.96\right] = 0.95$$
so that
$$P\left[p - 1.96\sqrt{\frac{\pi(1 - \pi)}{n}} < \pi < p + 1.96\sqrt{\frac{\pi(1 - \pi)}{n}}\right] = 0.95 \qquad (9)$$
The interval given in (9) is called the 95% confidence interval for $\pi$. Similarly, we can
have 90% or 99% confidence intervals. The interval given in Equation (9)
cannot be used as it stands, because it involves the unknown $\pi$.
However, if $n$ is large, then $\pi$ can be replaced by $p$ without compromising accuracy. So,
for large samples, the 95% confidence interval for $\pi$ will be
$$\left(p - 1.96\sqrt{\frac{p(1 - p)}{n}},\ p + 1.96\sqrt{\frac{p(1 - p)}{n}}\right)$$
If we want to get a 99% confidence interval, we will have to replace 1.96 by 2.58, since
$$P[-2.58 < Z < 2.58] = 0.99.$$

We shall illustrate this with the problem given below.

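Problem 5: In a random sample of 75 parts produced by a machine, 12 have a surface
finish which is rougher than the specification will allow. Find a i) 95% ii) 99%
confidence interval for the proportion of rough parts produced by the machine.
Solution: Here p = 12/75 = 0.16 and n = 75. Then
a) The 95% confidence interval for $\pi$ is

$$\left(0.16 - 1.96\sqrt{\frac{(0.16)(0.84)}{75}},\; 0.16 + 1.96\sqrt{\frac{(0.16)(0.84)}{75}}\right) = (0.0770, 0.2430)$$

b) The 99% confidence interval for $\pi$ is

$$\left(0.16 - 2.58\sqrt{\frac{(0.16)(0.84)}{75}},\; 0.16 + 2.58\sqrt{\frac{(0.16)(0.84)}{75}}\right) = (0.0508, 0.2692)$$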
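The arithmetic of Problem 5 can be checked with a few lines of Python. This is a minimal sketch: the function name `proportion_ci` is our own choice, and the values 1.96 and 2.58 are the normal-table values used in this section.

```python
import math


def proportion_ci(successes, n, z):
    """Large-sample confidence interval for a proportion: p +/- z*sqrt(p(1-p)/n)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Problem 5: 12 rough parts in a sample of 75.
ci_95 = proportion_ci(12, 75, 1.96)
ci_99 = proportion_ci(12, 75, 2.58)
print("95%:", tuple(round(v, 4) for v in ci_95))
print("99%:", tuple(round(v, 4) for v in ci_99))
```

The 99% interval is wider than the 95% interval, because a higher degree of confidence is bought with a larger multiple of the same standard error.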

Here are some exercises for you.

E14) A random sample of 800 calculators contains 24 defective items. Compute a 99%
confidence interval for the proportion of defective calculators.
E15) Of 1000 randomly selected lung cancer cases, 699 resulted in death. Construct a
95% confidence interval for the death rate from lung cancer.
E16) A student in a university wanted to decide whether or not to contest the election for
the presidency of the students' union. Out of 50 students, 11 showed their
willingness to vote for her. Find a 99% confidence interval for the true proportion
of students voting for her.

We now summarise our discussion about interval estimation in the following table:
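Table 1
Parameter  Point Estimator  Condition                                   Confidence Interval
$\mu$      $\bar{x}$        $\sigma$ known                              $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$
$\mu$      $\bar{x}$        $\sigma$ unknown, large n                   $\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$
$\mu$      $\bar{x}$        $\sigma$ unknown, small n                   $\left(\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right)$
$\pi$      p                p is not too close to 0 or 1, large n       $\left(p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}},\ p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right)$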
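The three rows for the mean in Table 1 differ only in which table the critical value comes from and which measure of spread is used. The sketch below expresses that choice as a small decision rule. The function names are hypothetical, and the cut-off `large_n=30` for a "large" sample is an assumption made for illustration, not a value stated in this unit.

```python
import math


def table1_row(sigma_known, n, large_n=30):
    """Return (critical-value table, spread measure) for an interval for the mean.

    large_n is an assumed cut-off for a "large" sample.
    """
    if sigma_known:
        return ("z", "sigma")
    if n >= large_n:
        return ("z", "s")
    return ("t", "s")


def mean_ci(xbar, spread, n, critical_value):
    """Interval xbar +/- critical_value * spread / sqrt(n)."""
    half_width = critical_value * spread / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Known sigma = 10, n = 8, sample mean 0, 95% level (z = 1.96).
print(table1_row(sigma_known=True, n=8))
low, high = mean_ci(0, 10, 8, 1.96)
print(round(low, 4), round(high, 4))

# Unknown sigma with a small sample calls for the t table.
print(table1_row(sigma_known=False, n=10))
```

Keeping the choice of table separate from the arithmetic makes it clear that every interval for the mean has the same form: estimate plus or minus a critical value times a standard error.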

Now we shall present a case study which shows how the techniques of estimation
discussed in this unit help in tackling real-life problems.
Statistical Estimation in the Chemical Industry: A Case Study: A chemical firm
called Imperial Chemical Industries (ICI) carried out the following experiment to
estimate the effect of a chlorinating agent on the abrasion resistance of a certain type of
rubber. (Abrasion resistance of rubber is the extent to which rubber can withstand
pressure against rubbing off or frictional action. For example, rubber of high abrasion
resistance will have high road life.) Ten pieces of this type of rubber were cut in half,
and one half-piece was treated with the chlorinating agent, while the other half-piece
was untreated. Then the abrasion resistance of each half was evaluated on a machine,
and the difference between the abrasion resistance of the treated half-piece and the
untreated half-piece was computed.
The table below shows the 10 differences (one corresponding to each of the pieces of rubber
in the sample). Based on this experiment, ICI was interested in estimating the mean
difference between the abrasion resistance of a treated and untreated half-piece of this
type of rubber. In other words, if this experiment were performed again and again, an
infinite population of such differences would result. ICI was interested in estimating the
mean of this population, since the mean is a good measure to find the effect of the
chlorinating agent on this type of rubber's abrasion resistance.
If you were a statistical consultant for ICI, how would you analyse these data? You
would recognise that a good point estimate of the mean of this population is the sample
mean, which is 1.27 as shown in the table below. Thus, your first step would be to advise
ICI that if they want a single number as an estimate, 1.27 is a good number to use. Next
you would point out that such a point estimate contains no indication of how much
error it may contain, whereas a confidence interval does contain such information.
Since the population standard deviation is unknown and the sample is small, expression
( ) should be used in this case to calculate a confidence interval. Assuming that the firm
wants a confidence coefficient of 95 percent, the confidence interval is (0.464, 2.076),
because $t_{0.025} = 2.262$, s = 1.1265, and n = 10, so that the limits are
$1.27 \pm 2.262 \times 1.1265/\sqrt{10}$. The chances are 95 out of 100 that such a
confidence interval would include the population mean. (Note that this analysis
assumes that the population is approximately normally distributed.)
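The 95 percent interval of the case study can be reproduced from the summary figures quoted above. This is a minimal sketch using only those figures (sample mean 1.27, s = 1.1265, n = 10 and the table value 2.262). The variable names are our own.

```python
import math

xbar = 1.27      # sample mean of the 10 differences
s = 1.1265       # sample standard deviation
n = 10           # number of pieces of rubber
t_crit = 2.262   # t-table value for 9 d.f. with 0.025 in each tail

standard_error = s / math.sqrt(n)
margin = t_crit * standard_error
lower, upper = xbar - margin, xbar + margin
print(round(lower, 3), round(upper, 3))  # 0.464 2.076
```

For a different confidence coefficient only `t_crit` changes. The standard error s/√n stays the same.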

Table: Piece, Difference (the 10 observed differences; sample mean 1.27)

The above analysis is, in fact, exactly how ICI's statisticians proceeded. Despite the
fact that the sample consisted of only 10 observations, the evidence was very strong that
the chlorinating agent had a positive effect on abrasion resistance. After all, the 95
percent confidence interval was that the mean difference between abrasion resistance of
rubber with and without treatment was an increase of between 0.464 and 2.076, an
interval that lies entirely above zero. (For that matter, the statisticians found that the
98 percent confidence interval was that the mean difference was an increase of between
0.265 and 2.275.) The best estimate was that the chlorinating agent resulted in an
increase of about 1.27 in abrasion resistance.
With this detailed example you have seen how several aspects covered in this unit
come together. As you reflect on this case study, you should check against the summary
below how many of its points are actually covered in the case study.
With that we come to the end of this unit.

5.5 SUMMARY

In this unit we have seen that

1) When the population is large, its parameters, like mean, variance and proportion, need
to be estimated from a sample.
2) There can be many different estimates of a parameter.
3) An estimator is unbiased if the mean of the estimates is the population parameter.
4) Between two unbiased estimators we prefer the one with the smaller variance.
5) Interval estimates are more informative than point estimates since they specify the
precision of our estimate.
6) The computation of confidence intervals is done by using the sampling
distributions of the estimators.
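Points 3) and 4) can be seen in a small simulation. The sketch below is only an illustration under assumed settings: a normal population with mean 0 and standard deviation 1, samples of size 25 and 2000 repetitions. None of these numbers, nor the function name `sampling_summary`, comes from this unit. It compares the sample mean and the sample median as estimators of the population mean.

```python
import random
import statistics


def sampling_summary(estimator, n=25, trials=2000, mu=0.0, sigma=1.0, seed=1):
    """Mean and variance of an estimator over repeated random samples."""
    rng = random.Random(seed)  # same seed, so both estimators see the same samples
    estimates = [
        estimator([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.mean(estimates), statistics.variance(estimates)


mean_avg, mean_var = sampling_summary(statistics.mean)
median_avg, median_var = sampling_summary(statistics.median)

print("sample mean:   average", round(mean_avg, 3), "variance", round(mean_var, 4))
print("sample median: average", round(median_avg, 3), "variance", round(median_var, 4))
```

For this symmetric population both averages should lie close to the true mean 0, in line with unbiasedness, while the variance of the sample mean should be the smaller of the two. That is why the mean is preferred here.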
5.6 SOLUTIONS/ANSWERS

E1) For (i) the estimator is the mean mileage of the sample of used taxis. The value
98,000 miles is an estimate.
For (ii) the estimator is the proportion and the value 0.02 is an estimate.
E2) An unbiased estimate of the population proportion is obtained by
$p = \frac{8}{60} = \frac{2}{15}$
E3) A point estimator of the population mean is obtained by calculating the sample
mean.
The sample mean of 25 motors is 4.448 thousands of hours.
E4) a) The medians are 4, 5, 4, 3, 25, 2, 3, 5, 3, 2, 3 and 4; the means are 4, 4.3, 4, 3.3, 2,
2.7, 4, 3.3, 3 and 4.
b) The frequencies are 2, 4, 3 and 3 for the medians and 1, 5, 6 and 0 for the
means. Then obtain the frequency distribution.
c) The histograms of the two distributions show that the variance for the median is
more than that for the mean, which illustrates the claim that the mean is generally
more efficient than the median.
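E5) The 95% confidence interval for the mean is as follows.
i) Here $\bar{x} = 0$, $\sigma = 10$ and n = 8. Therefore the interval is

$$\left(0 - 1.96\frac{10}{\sqrt{8}},\; 0 + 1.96\frac{10}{\sqrt{8}}\right) = (-6.9296, 6.9296)$$

ii) Here $\bar{x} = 550$, $\sigma = 40$ and n = 16. Therefore the interval is

$$\left(550 - 1.96\frac{40}{\sqrt{16}},\; 550 + 1.96\frac{40}{\sqrt{16}}\right) = (530.4, 569.6)$$

E6) Here n = 140, $\bar{x} = 11.4$, $\sigma = 2.5$.

The 99% C.I. is given by

$$\left(11.4 - 2.58\frac{2.5}{\sqrt{140}},\; 11.4 + 2.58\frac{2.5}{\sqrt{140}}\right) = (10.8549, 11.9451)$$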

E7) Number of days are:

47,45,44,42,47,46,45,48
Therefore $\bar{x} = 5.5$, n = 8, $\sigma = 2$.
The C.I. is given by

$$\left(5.5 - 1.96\frac{2}{\sqrt{8}},\; 5.5 + 1.96\frac{2}{\sqrt{8}}\right) = (4.1141, 6.8859)$$
E8) i) Note that here d.f. = n - 1 = 10 - 1 = 9 and $\alpha$ = 0.005. Therefore we look
under the column for 0.005 till we reach the row for 9. Then we get the value
3.250.
ii) Here $\alpha$ = 0.5; the $t_{\alpha/2}$-value is 2.052.
iii) Here $\alpha$ = 0.1; the $t_{\alpha/2}$-value is 1.782.
iv) Here $\alpha$ = 0.01; the $t_{\alpha/2}$-value is 3.797.
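E9) n = 12, therefore d.f. = 11.
The t value for the 95% C.I. is 2.20 and that for the 99% C.I. is 3.11.
Writing s for the sample standard deviation given in the exercise,
95% C.I.: $0.738 \pm 2.20\,\frac{s}{\sqrt{12}}$ = (0.6592, 0.8167)
99% C.I.: $0.738 \pm 3.11\,\frac{s}{\sqrt{12}}$ = (0.6267, 0.8493)

From the sample, $\bar{x} = 0.298$, s = 0.0213 and d.f. = 4.
Therefore the required value of t is 2.78.
Therefore the 95% C.I. = $0.298 \pm 2.78\,\frac{0.0213}{\sqrt{5}}$ = (0.2715, 0.3245)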

Light up the candles and measure the amount of time (life time) for which each
candle burns. This data will have 20 observations. Find the mean ($\bar{x}$) and the
standard deviation (s) of this data. The value of $\bar{x}$ is a point estimate. If we want
a 95% C.I., we find t = 2.09 for 19 d.f., since the sample size is 20. Then the C.I. is
$\left(\bar{x} - 2.09\,\frac{s}{\sqrt{20}},\ \bar{x} + 2.09\,\frac{s}{\sqrt{20}}\right)$

$= 2P(0 \le Z \le 0.7071) = 0.5224$

Therefore the degree of confidence is 52%.

E14) $p = \frac{24}{800} = 0.03$, n = 800
Therefore the C.I. = (0.0144, 0.0456)

E16) $p = \frac{11}{50} = 0.22$. Therefore the C.I. is (0.0688, 0.3711)
