
8 Statistical Estimation

1. This document discusses statistical estimation and hypothesis testing. Statistical estimation involves using sample statistics to estimate population parameters, while hypothesis testing involves setting up and testing hypotheses about population parameters.
2. There are two main types of statistical estimation: point estimation and confidence interval estimation. Point estimation provides a single value for a population parameter based on sample data. Confidence interval estimation provides a range of values that is likely to contain the true population parameter.
3. The method used to construct confidence intervals depends on whether the population standard deviation is known or unknown. When the standard deviation is known, confidence intervals can be constructed based on the normal distribution. When the standard deviation is unknown, it must be estimated from the sample data.

Statistical estimation and hypothesis testing

Introduction
There are two types of statistical inference:
(1) Statistical estimation (2) Hypothesis testing

Statistical estimation is concerned with estimating the population parameters
using sample statistics.

Hypothesis testing involves the setting up of a hypothesis (or theory) about the
population and then sampling in order to see if the hypothesis is supported or
rejected.

Statistical estimation
Because of time and cost factors, the population parameters ($\mu$, $\sigma$, $p$) are
frequently estimated by using sample statistics ($\bar{x}$, $s$, $\hat{p}$).

Point estimate
An estimate of a population parameter given by a single value and
calculated from sample data is called a point estimate of the population
parameter.
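(a) $\bar{x}$ is a point estimate for $\mu$.
(b) $s$ is a point estimate for $\sigma$, where

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}} \quad\text{or}\quad s = \sqrt{\frac{n}{n - 1}\left[\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2\right]} \quad\text{(for ungrouped data)}$$

$$s = \sqrt{\frac{\sum f (X - \bar{x})^2}{\sum f - 1}} \quad\text{or}\quad s = \sqrt{\frac{\sum f}{\sum f - 1}\left[\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2\right]} \quad\text{(for grouped data)}$$
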
Note: In a question on statistical inference, the standard deviation given is
taken to be s.

(c) Given two samples from the same population:

Sample 1 of size $n_1$, with sample mean $\bar{X}_1$ and sample standard deviation $s_1$.
Sample 2 of size $n_2$, with sample mean $\bar{X}_2$ and sample standard deviation $s_2$.

Then the point estimate for the population mean $\mu$ is

$$\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$$

and the point estimate for the population standard deviation $\sigma$ is

$$S_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

E.g. A sample of 5 measurements of the diameter of a sphere recorded by a
scientist is as follows:

6.36 mm, 6.32 mm, 6.37 mm, 6.33 mm, 6.37 mm

Determine a point estimate for (i) the population mean, $\mu$; (ii) the
population standard deviation, $\sigma$.

Solution:

    Diameter $X$    $X^2$
    6.36            40.4496
    6.32            39.9424
    6.37            40.5769
    6.33            40.0689
    6.37            40.5769
    -----           --------
    31.75           201.6147

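(i) The point estimate for $\mu$ is $\bar{x} = \dfrac{\sum X}{n} = \dfrac{31.75}{5} = 6.35$ mm.

(ii) The point estimate for $\sigma$ is $s = \sqrt{\dfrac{n}{n-1}\left[\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2\right]} = \sqrt{\dfrac{5}{4}\left[40.32294 - 6.35^2\right]} = \sqrt{0.00055} \approx 0.0235$ mm.
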
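
The arithmetic in the sphere-diameter example can be checked with a short Python sketch. It uses only the standard library; the variable names are illustrative and are not part of the original notes.

```python
import math
import statistics

# Diameters (mm) from the sphere example.
diameters = [6.36, 6.32, 6.37, 6.33, 6.37]
n = len(diameters)

# Point estimate of the population mean: the sample mean.
sum_x = sum(diameters)
x_bar = sum_x / n

# Point estimate of the population standard deviation, using the
# computational form n/(n-1) * [sum(X^2)/n - (sum(X)/n)^2].
sum_x2 = sum(x * x for x in diameters)
s = math.sqrt(n / (n - 1) * (sum_x2 / n - (sum_x / n) ** 2))

# Cross-check with the library routine, which also divides by n - 1.
s_check = statistics.stdev(diameters)

print(f"mean = {x_bar:.4f} mm, s = {s:.4f} mm")
```

Both routes to $s$ should agree up to rounding error, which is a quick way to catch a slip in the hand calculation.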
Confidence interval estimates or confidence limits

An estimate of a population parameter given by two numbers between which the
parameter may be considered to lie, together with an assessment of the
probability that it does so, is called a confidence interval estimate of the
population parameter.

Confidence interval estimate of the population mean, $\mu$

Model I :

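Assumption: The population standard deviation $\sigma$ is known.

Either the population is normal, $X \sim N[\mu, \sigma^2]$, or the distribution of $X$ is unknown
but $n \ge 30$. Then the sampling distribution of sample means is normal
(approximately so in the second case), $\bar{X} \sim N[\mu_{\bar{X}}, \sigma_{\bar{X}}^2]$, where $\mu_{\bar{X}} = \mu$ and

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \quad\text{or}\quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$

The second form applies when the sample is drawn from a finite population of size $N$.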

Based on the characteristics of the normal curve, a symmetric interval about $\mu_{\bar{X}} = \mu$
which contains 99% or 95% of all sample means can be obtained, i.e. a C.I.
estimate of $\mu$ can be obtained.

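E.g. To construct a 99% C.I. for $\mu$:

The 99% C.I. estimate for $\mu$ is $\bar{X} \pm 2.5758\,\sigma_{\bar{X}}$, where $P(Z > 2.5758) = \frac{1\%}{2}$.

Similarly, the 95% C.I. for $\mu$ is $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$, where $P(Z > 1.96) = \frac{5\%}{2}$.

In general, the $100(1-\alpha)\%$ C.I. for $\mu$ is $\bar{X} \pm Z_{\alpha/2}\,\sigma_{\bar{X}}$, where $P(Z > Z_{\alpha/2}) = \frac{\alpha}{2}$.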

E.g. Suppose that a random sample of 5 observations was taken from a normal
population whose variance is 25. The results are 8, 15, 12, 6, 7. Find the 99%
C.I. estimate of the population mean.
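
A minimal Python sketch of this calculation follows, assuming the Model I formula $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ with no finite population correction. The function name `z_interval` is hypothetical; `NormalDist().inv_cdf` from the standard library supplies the normal quantile in place of a printed table.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """C.I. for the mean when the population sigma is known (Model I)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

data = [8, 15, 12, 6, 7]
sigma = math.sqrt(25)             # the population variance is given as 25
x_bar = sum(data) / len(data)     # sample mean of the five observations
low, high = z_interval(x_bar, sigma, len(data), 0.99)
print(f"99% C.I.: ({low:.2f}, {high:.2f})")
```

The normal interval is usable here despite $n = 5$ because the population itself is stated to be normal and its variance is known.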

E.g. A normal population has unknown mean $\mu$ and standard deviation 15. A
random sample of size 25 drawn from this population was found to have a mean
of 950.

(i) Construct (a) a 90% C.I. for $\mu$; (b) a 95% C.I. for $\mu$; (c) a 99% C.I. for $\mu$.

(ii) Interpret the results obtained in (i).

Solution: Given:
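
One way to compare the three intervals side by side is the sketch below. It assumes the given values $\sigma = 15$, $n = 25$, $\bar{X} = 950$ and the Model I formula; the dictionary `intervals` is an illustrative name only.

```python
import math
from statistics import NormalDist

sigma, n, x_bar = 15, 25, 950
standard_error = sigma / math.sqrt(n)   # 15 / 5 = 3

intervals = {}
for confidence in (0.90, 0.95, 0.99):
    # Upper alpha/2 point of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    intervals[confidence] = (x_bar - margin, x_bar + margin)
    print(f"{confidence:.0%} C.I.: ({x_bar - margin:.2f}, {x_bar + margin:.2f})")
```

The intervals widen as the confidence level rises, which is the usual starting point for the interpretation asked for in part (ii): greater confidence is paid for with a less precise estimate.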

E.g. A manufacturer wishes to estimate the mean dimension of a certain
component. He would be satisfied if he obtains an estimate within 0.01 cm of
the true mean. The standard deviation of the dimension of the component is 0.2
cm. What must be the size of the sample that he should examine if he wants to
be 95% certain?
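
The required sample size follows from rearranging the margin of error: requiring $Z_{\alpha/2}\,\sigma/\sqrt{n} \le E$ gives $n \ge (Z_{\alpha/2}\,\sigma/E)^2$. The sketch below applies that rearrangement; the function name `sample_size_for_mean` is hypothetical, and the result is rounded up because $n$ must be a whole number.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n for which z * sigma / sqrt(n) does not exceed max_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / max_error) ** 2)

n_required = sample_size_for_mean(sigma=0.2, max_error=0.01, confidence=0.95)
print(n_required)
```

Using the table value 1.96 instead of the computed quantile gives $(1.96 \times 0.2 / 0.01)^2 = 1536.64$, which rounds up to the same sample size of 1537.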

Model II :

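Assumption: The population standard deviation $\sigma$ is unknown and is estimated
by $s$, and the sample size $n$ is large ($n \ge 30$).

$\sigma_{\bar{X}}$ is estimated by

$$S_{\bar{X}} = \frac{s}{\sqrt{n}} \quad\text{or}\quad S_{\bar{X}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$

The $100(1-\alpha)\%$ C.I. for $\mu$ is $\bar{X} \pm Z_{\alpha/2}\,S_{\bar{X}}$.

In particular, the 99% C.I. for $\mu$ is $\bar{X} \pm 2.5758\,S_{\bar{X}}$;

and the 95% C.I. for $\mu$ is $\bar{X} \pm 1.9600\,S_{\bar{X}}$.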

E.g. Fifty bags of sugar were randomly selected and carefully weighed. The
mean weight was 1.04 kg. and the standard deviation 0.002 kg. Construct a 99%
confidence interval for the mean weight of all bags.
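
The two forms of $S_{\bar{X}}$ in Model II can be sketched as one helper with an optional population size. The names `standard_error`, `large_sample_interval` and `population_size` are hypothetical. No population size is given for the sugar bags, so the plain $s/\sqrt{n}$ form is used for them.

```python
import math
from statistics import NormalDist

def standard_error(s, n, population_size=None):
    """Estimated standard error of the sample mean, S_xbar.

    With population_size=None the population is treated as very large and
    s / sqrt(n) is returned; otherwise the finite population correction
    sqrt((N - n) / (N - 1)) is applied.
    """
    se = s / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

def large_sample_interval(x_bar, s, n, confidence, population_size=None):
    """Model II interval: x_bar +/- z * S_xbar, for n >= 30."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error(s, n, population_size)
    return x_bar - margin, x_bar + margin

# Sugar bags: n = 50, mean 1.04 kg, s = 0.002 kg.
low, high = large_sample_interval(1.04, 0.002, 50, 0.99)
print(f"99% C.I.: ({low:.4f}, {high:.4f}) kg")
```

Passing a `population_size` shrinks the standard error, and the correction reaches zero when the sample is the whole population.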

E.g. The management of a company making a certain type of car component
wishes to ascertain the average number of components per hour produced by the
workers. The company employs a very large number of workers and it is decided
to use a sample of the output of 400 workers. After checking the output of this
sample, it was found that the average output produced by each worker every hour
is 100 with a standard deviation of 20.

(a) Calculate a 95% confidence interval for the average output produced by each
worker per hour for the whole factory.

(b) How large a sample is needed if the management wishes to be 95% confident
that the sample mean will be within one unit of the true mean?

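Model III : If $\sigma$ is unknown then it is estimated by $s$, and the sample size $n$ is
small ($n < 30$) and the population is normal or approximately normal.

The $100(1-\alpha)\%$ C.I. for $\mu$ is $\bar{X} \pm t_{\alpha/2,\,n-1}\,S_{\bar{X}}$, where

$$S_{\bar{X}} = \frac{s}{\sqrt{n}} \quad\text{or}\quad S_{\bar{X}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$$

and $t_{\alpha/2,\,n-1}$ is obtained from the t distribution table.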

Student's t distribution, or simply the t distribution, is a family of probability
distributions distinguished by their individual degrees of freedom (v). It is similar in
form to the normal distribution and is used when the population standard
deviation is unknown and the sample size is small (n < 30). Areas under the
t distribution are tabulated; the layout of the table is outlined below.

    $\alpha$ =    0.10     0.05     0.025    …    0.001    0.0005

    v = 1         3.078    6.314
    ...
    30
    40
    60
    120
    $\infty$      1.282    1.645    1.960    …    3.090    3.291

E.g. A sample of 10 packets of sugar packed by a machine has the following
weights (in kg):
1, 1.05, 1.10, 0.95, 0.96, 1.10, 1.02, 0.97, 0.99, 1
(a) Calculate the sample mean and standard deviation.
(b) Obtain a 95% C.I. for $\mu$.
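
A sketch of the Model III calculation for the sugar packets follows. The Python standard library has no t quantile function, so the critical value $t_{0.025,\,9} = 2.262$ is hard-coded from a standard t table; that lookup, like the variable names, is an assumption of this sketch rather than something printed in the notes.

```python
import math
import statistics

# Packet weights in kg from the example.
weights = [1.00, 1.05, 1.10, 0.95, 0.96, 1.10, 1.02, 0.97, 0.99, 1.00]
n = len(weights)

# (a) Sample mean and sample standard deviation (divisor n - 1).
x_bar = statistics.mean(weights)
s = statistics.stdev(weights)

# (b) t value for alpha/2 = 0.025 with n - 1 = 9 degrees of freedom,
# read from a standard t table (hard-coded, not computed here).
t_crit = 2.262

margin = t_crit * s / math.sqrt(n)
low, high = x_bar - margin, x_bar + margin
print(f"mean = {x_bar:.3f} kg, s = {s:.4f} kg")
print(f"95% C.I.: ({low:.3f}, {high:.3f}) kg")
```

Replacing 2.262 with the normal value 1.96 would give a narrower interval; the t value is larger because $s$ is itself an estimate from only 10 observations.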

Confidence interval estimate of difference of two means

                       Population I    Population II

    Pop. mean          $\mu_1$         $\mu_2$
    Pop. std dev.      $\sigma_1$      $\sigma_2$
    Sample size        $n_1$           $n_2$
    Sample mean        $\bar{X}_1$     $\bar{X}_2$
    Sample std dev.    $s_1$           $s_2$

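For large sample sizes $n_1$ and $n_2$ ($\ge 30$):

$\bar{X}_1 \sim N[\mu_{\bar{X}_1}, \sigma_{\bar{X}_1}^2]$, where $\mu_{\bar{X}_1} = \mu_1$ and

$$\sigma_{\bar{X}_1} = \frac{\sigma_1}{\sqrt{n_1}} \quad\text{or}\quad \frac{\sigma_1}{\sqrt{n_1}}\sqrt{\frac{N_1 - n_1}{N_1 - 1}}$$

$\bar{X}_2 \sim N[\mu_{\bar{X}_2}, \sigma_{\bar{X}_2}^2]$, where $\mu_{\bar{X}_2} = \mu_2$ and

$$\sigma_{\bar{X}_2} = \frac{\sigma_2}{\sqrt{n_2}} \quad\text{or}\quad \frac{\sigma_2}{\sqrt{n_2}}\sqrt{\frac{N_2 - n_2}{N_2 - 1}}$$

Then $\bar{X}_1 - \bar{X}_2 \sim N[\mu_{\bar{X}_1 - \bar{X}_2}, \sigma_{\bar{X}_1 - \bar{X}_2}^2]$, where $\mu_{\bar{X}_1 - \bar{X}_2} = \mu_{\bar{X}_1} - \mu_{\bar{X}_2} = \mu_1 - \mu_2$ and

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \quad\text{or}\quad \sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1 - n_1}{N_1 - 1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2 - n_2}{N_2 - 1}}$$

If $\sigma_1$ and $\sigma_2$ are unknown, $\sigma_{\bar{X}_1 - \bar{X}_2}$ is estimated by

$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad\text{or}\quad \sqrt{\frac{s_1^2}{n_1}\cdot\frac{N_1 - n_1}{N_1 - 1} + \frac{s_2^2}{n_2}\cdot\frac{N_2 - n_2}{N_2 - 1}}$$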

Model IV : For large sample sizes $n_1, n_2 \ge 30$, $\sigma_1$ and $\sigma_2$ are unknown and
estimated by $s_1$ and $s_2$ respectively.

The 99% C.I. for $\mu_1 - \mu_2$ is $(\bar{X}_1 - \bar{X}_2) \pm 2.58\,S_{\bar{X}_1 - \bar{X}_2}$.

The 95% C.I. for $\mu_1 - \mu_2$ is $(\bar{X}_1 - \bar{X}_2) \pm 1.96\,S_{\bar{X}_1 - \bar{X}_2}$.

In general, the $100(1-\alpha)\%$ C.I. for $\mu_1 - \mu_2$ is $(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\,S_{\bar{X}_1 - \bar{X}_2}$,
where $P(Z > Z_{\alpha/2}) = \frac{\alpha}{2}$.

E.g. The management of a large company wishes to determine whether there is
any difference in performance between the day shift workers and the night shift
workers. A sample of 120 day shift workers and another 100 night shift workers
are selected. The results (in number of parts produced per hour) are given
below:

                        Day shift    Night shift
    Sample size         120          100
    Sample mean         75.5         70.4
    Sample std. dev.    4.13         4.27

Construct a 95% C.I. for the difference of the average output of the day shift
workers and that of night shift workers.
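
The Model IV interval for the shift data can be sketched as below, assuming the large-population form $S_{\bar{X}_1 - \bar{X}_2} = \sqrt{s_1^2/n_1 + s_2^2/n_2}$ with no finite population correction. The function name `two_mean_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def two_mean_interval(x1, s1, n1, x2, s2, n2, confidence):
    """Large-sample C.I. for mu1 - mu2 (Model IV, no finite population correction)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x1 - x2
    return diff - z * se, diff + z * se

# Day shift first, night shift second.
low, high = two_mean_interval(75.5, 4.13, 120, 70.4, 4.27, 100, 0.95)
print(f"95% C.I. for day minus night: ({low:.2f}, {high:.2f}) parts per hour")
```

An interval that lies entirely above zero suggests that the day shift's mean output is higher, which is the kind of conclusion the management is after.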

Confidence interval estimate of population proportion, p


Model V
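Assumption: For large sample size, $n \ge 30$, the $100(1-\alpha)\%$ C.I. estimate for
$p$ is $\hat{p} \pm Z_{\alpha/2}\,s_{\hat{p}}$, where

$$s_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \quad\text{or}\quad \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\sqrt{\frac{N - n}{N - 1}}$$

In particular, the 95% C.I. estimate for $p$ is $\hat{p} \pm 1.9600\,s_{\hat{p}}$;

and the 99% C.I. estimate for $p$ is $\hat{p} \pm 2.5758\,s_{\hat{p}}$.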

E.g. In a random sample of 300 employees, 55% were found to be in favour
of strike action. Find the 90% confidence interval for the proportion of all
employees in the company who are in favour of such action.
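
A sketch of the Model V interval for the strike-action survey follows, assuming the large-population form $s_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})/n}$. The function name `proportion_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence):
    """Large-sample C.I. for a population proportion (Model V)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

low, high = proportion_interval(0.55, 300, 0.90)
print(f"90% C.I.: ({low:.3f}, {high:.3f})")
```

For a 90% interval the relevant normal point is about 1.645, the value in the $\alpha = 0.05$ column of the last row of the t table.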

E.g. A producer of steel pipes selected a simple random sample of 300 pipes
from the production process to estimate the proportion of defective pipes.
There were 15 defective pipes in the sample.
(a) What is the point estimate of the proportion of defective pipes in the
population?
(b) Construct a 95% confidence interval estimate of the proportion of the
defective pipes in the population.
(c) How large a sample would be needed if the probability is to be 0.95 that
the error of estimate will not exceed 0.02 unit?
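
The three parts of the steel pipe question can be sketched together. Part (c) rearranges $Z_{\alpha/2}\sqrt{p(1-p)/n} \le E$ into $n \ge Z_{\alpha/2}^2\,p(1-p)/E^2$ and, as an assumption of this sketch, uses the sample value $\hat{p}$ as the planning value for $p$; taking $p = 0.5$ instead would give a larger, more conservative sample size.

```python
import math
from statistics import NormalDist

defective, n = 15, 300

# (a) Point estimate of the proportion of defective pipes.
p_hat = defective / n

# (b) 95% confidence interval (large-sample form, no finite population correction).
z = NormalDist().inv_cdf(0.975)
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - z * se, p_hat + z * se

# (c) Sample size so that the error of estimate does not exceed 0.02,
# with p_hat standing in for the unknown p (an assumption, see above).
max_error = 0.02
n_required = math.ceil(z ** 2 * p_hat * (1 - p_hat) / max_error ** 2)

print(f"p_hat = {p_hat:.2f}")
print(f"95% C.I.: ({low:.4f}, {high:.4f})")
print(f"required sample size: {n_required}")
```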

Confidence interval estimate of difference of two proportions


    Population proportion    $p_1$          $p_2$
    Sample size              $n_1$          $n_2$
    Sample proportion        $\hat{p}_1$    $\hat{p}_2$

Point estimate for the difference of 2 pop. proportions ($p_1 - p_2$) is $\hat{p}_1 - \hat{p}_2$.

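For large sample sizes $n_1, n_2 \ge 30$:

$\hat{p}_1 \sim N[\mu_{\hat{p}_1}, \sigma_{\hat{p}_1}^2]$, where $\mu_{\hat{p}_1} = p_1$ and $\sigma_{\hat{p}_1} = \sqrt{\dfrac{p_1(1 - p_1)}{n_1}}$

$\hat{p}_2 \sim N[\mu_{\hat{p}_2}, \sigma_{\hat{p}_2}^2]$, where $\mu_{\hat{p}_2} = p_2$ and $\sigma_{\hat{p}_2} = \sqrt{\dfrac{p_2(1 - p_2)}{n_2}}$

$\hat{p}_1 - \hat{p}_2 \sim N[\mu_{\hat{p}_1 - \hat{p}_2}, \sigma_{\hat{p}_1 - \hat{p}_2}^2]$, where $\mu_{\hat{p}_1 - \hat{p}_2} = \mu_{\hat{p}_1} - \mu_{\hat{p}_2} = p_1 - p_2$ and

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\sigma_{\hat{p}_1}^2 + \sigma_{\hat{p}_2}^2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

which is estimated by

$$S_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$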

Model VI : For large sample sizes $n_1, n_2 \ge 30$, $p_1$ and $p_2$ are estimated by
$\hat{p}_1$ and $\hat{p}_2$ respectively.

The $100(1-\alpha)\%$ C.I. for $p_1 - p_2$ is $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\,S_{\hat{p}_1 - \hat{p}_2}$, where
$P(Z > Z_{\alpha/2}) = \frac{\alpha}{2}$.

E.g. Superplasticized concrete is formed by adding chemicals to conventional
concrete to make it more fluid so that it can be placed more easily. Suppose that
a sample of 50 new construction projects in Area A yields 15 that are using this
type of concrete. A sample of 60 new projects in Area B also yields 15 using
superplasticized concrete. Construct a 99% C.I. for the difference in the
proportions of new construction projects in Areas A and B that are using
superplasticized concrete.
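
A sketch of the Model VI interval for the two areas follows, using the estimated standard error $S_{\hat{p}_1 - \hat{p}_2}$ built from the two sample proportions. The function name `two_proportion_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Large-sample C.I. for p1 - p2 (Model VI) from counts x1 of n1 and x2 of n2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Area A: 15 of 50 projects; Area B: 15 of 60 projects.
low, high = two_proportion_interval(15, 50, 15, 60, 0.99)
print(f"99% C.I. for p_A - p_B: ({low:.3f}, {high:.3f})")
```

When the interval contains zero, as it does for these counts, the samples do not rule out equal proportions in the two areas at this confidence level.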
