
Estimation and Hypothesis Testing

Introduction to statistics
Chapter 8
Statistical Estimation and Hypothesis Testing

- Inference is the process of drawing conclusions about the whole population from sample data.
- Inferential statistics uses the sample results to make decisions and draw
conclusions about the population from which the sample is drawn.
- In statistics there are two ways through which inference can be made:
  - Statistical estimation
  - Statistical hypothesis testing

Both involve using sample statistics to make inferences about the population parameter.

[Figure: inference cycle with the labels Population, Sample, Numerical data, Analyzed data, Inference]
Statistical Estimation:
This is one way of making inference about the population parameter, used when the investigator
does not have any prior notion about the values or characteristics of the population parameter.
There are two types of estimation:

i. Point estimation: A single value, computed from sample information, that is used
to estimate a parameter. The best point estimate of the population mean $\mu$ is the
sample mean $\bar{X}$.

ii. Interval estimation: A procedure that yields an interval of values as an
estimate of a parameter, that is, an interval that contains the likely values of the
parameter. It deals with identifying the lower and upper limits of a parameter.
8.2. Estimator and Estimate
An estimator is a rule or random variable that is used to approximate a population
parameter. An estimate is a particular value that the estimator takes for a given sample. For
example, the sample mean

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

is an estimator of the population mean, and $\bar{X} = 10$
is an estimate, which is one of the possible values of $\bar{X}$.


8.3 Properties of best estimator
The following are some qualities of a good estimator:
o It should be unbiased.
o It should be consistent.
o It should be relatively efficient.

To explain these properties, let $\hat{\theta}$ be an estimator of $\theta$.

1. Unbiased estimator: An estimator whose expected value is the value of the parameter
being estimated, i.e. $E(\hat{\theta}) = \theta$.
2. Consistent estimator: An estimator that gets closer to the value of the parameter as
the sample size increases, i.e. $\hat{\theta}$ gets closer to $\theta$ as the sample size increases.
3. Relatively efficient estimator: Among the estimators being compared for a parameter, the one with the smallest
variance. This property compares two or more estimators of the same parameter.
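
The first two properties can be illustrated with a small simulation. The sketch below is not part of the original notes: the population values (mean 50, standard deviation 10), the seed, and the helper name `sample_means` are assumptions chosen only for illustration. It repeatedly draws samples from a normal population and looks at how the sample mean behaves as an estimator.

```python
import random
import statistics

def sample_means(mu, sigma, n, reps, rng):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return their means."""
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(8)      # fixed seed so the run is repeatable
mu, sigma = 50.0, 10.0      # hypothetical population parameters

for n in (5, 50, 500):
    means = sample_means(mu, sigma, n, reps=1000, rng=rng)
    # The average of the estimates stays near mu (unbiasedness),
    # and their spread shrinks as n grows (consistency).
    print(n, round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))
```

In each row the average of the simulated sample means should sit close to 50, while the spread of those means falls as the sample size grows.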
8.4. Point and Interval Estimation of the population mean: μ
8.4.1. Point estimation of the population mean: μ
Another term for a statistic is point estimate, since it is used to estimate the parameter value. A
point estimator is the formula used to compute the point estimate. For instance, the sum
of the $X_i$ divided by $n$ is the point estimator used to compute the estimate of the population mean
$\mu$. That is,

$$\bar{X} = \frac{\sum X_i}{n}$$

is a point estimator of the population mean.

8.4.2. Confidence interval estimation of the population mean
Although $\bar{X}$ possesses nearly all the qualities of a good estimator, because of sampling
error it is unlikely that the sample statistic will be exactly equal to the population
parameter. We will have to be satisfied
knowing that the statistic is "close to" the parameter. That leads to the obvious question:
what is "close"?
We can phrase the question differently: how confident can we be that the value of the
statistic falls within a certain "distance" of the parameter? The range of values around the
statistic that is used to answer this question is the
confidence interval. A confidence interval is a specific interval estimate of a parameter
determined by using data obtained from a sample and the specific confidence level of the
estimate.
The confidence level is the probability that the procedure produces an interval containing
the parameter. For example, in repeated sampling about 95% of the intervals built at the 95%
level contain the true value. There are different conditions
to be considered when constructing confidence intervals for the population mean $\mu$.

Condition-1: The population variance $\sigma^2$ is known and the population is normal, whatever
the sample size
Recall the Central Limit Theorem, which applies to the sampling distribution of the mean of a
sample. Consider samples of size n drawn, with replacement and with order taken into account, from a population whose mean is $\mu$ and standard
deviation is $\sigma$. The population can have any
frequency distribution. The sampling distribution of $\bar{X}$ will have a mean
$\mu_{\bar{X}} = \mu$ and a standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, and it approaches a normal distribution as n gets large.
This allows us to use the normal distribution curve for computing confidence intervals.

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$

$$\Rightarrow \mu = \bar{X} \pm Z\frac{\sigma}{\sqrt{n}} = \bar{X} \pm \varepsilon, \quad \text{where } \varepsilon = Z\frac{\sigma}{\sqrt{n}} \text{ is a measure of error.}$$

- For the interval estimator to be good the error should be small. How can it be made small?
• By making n large
• By having small variability ($\sigma$ small)
• By taking Z small (that is, accepting a lower confidence level)
- To obtain the value of Z, we have to attach a probability to the interval. That is, there is an
area of size $1-\alpha$ such that:

$$P(-Z_{\alpha/2} < Z < Z_{\alpha/2}) = 1 - \alpha$$

Where: $\alpha$ is the probability that the interval does not contain the parameter
$Z_{\alpha/2}$ is the value of the standard normal variable to
the right of which a probability of $\alpha/2$ lies, i.e. $P(Z > Z_{\alpha/2}) = \alpha/2$

$$\Rightarrow P\left(-Z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < Z_{\alpha/2}\right) = 1 - \alpha$$

$$\Rightarrow P\left(\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

If the population has a normal distribution and $\sigma$ is known, then a $(1-\alpha)100\%$ confidence
interval for $\mu$ is given by:

$$\left(\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

Note: When (as is often the case) we don't know the population standard deviation $\sigma$ and n
is large ($n \ge 30$), we can approximate it by the sample standard deviation S, and obtain the
following (good) approximation of the $(1-\alpha)100\%$ confidence interval for $\mu$:

$$\left(\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}\right)$$

where $Z_{\alpha/2}$ is the Z-value with an area of $\alpha/2$ to its right (obtained from a table).
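
As a minimal sketch of the interval above, the following Python function computes a Z-based confidence interval for the mean. The function name `z_interval`, its parameters, and the numbers in the usage line are illustrative assumptions, not part of the original notes. The critical value comes from the standard-library `statistics.NormalDist` rather than a printed table, so it is slightly more precise than the rounded table value.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sd, n, confidence=0.95):
    """Return (lower, upper) limits for the mean using the normal critical value.

    `sd` is the known population sigma, or the sample S when n is large.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # Z_{alpha/2}
    margin = z * sd / sqrt(n)                     # the error term
    return xbar - margin, xbar + margin

# Hypothetical numbers, chosen only to show the call.
low, high = z_interval(xbar=50.0, sd=8.0, n=64, confidence=0.95)
print(round(low, 2), round(high, 2))
```

Raising the confidence level widens the interval, while a larger n or a smaller standard deviation narrows it, which matches the three ways of reducing the error listed above.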

Condition-2: The population variance $\sigma^2$ is not known, n is small (n < 30), and the
population is normal

In most practical research, the standard deviation for the population of interest is not
known. In this case, the standard deviation $\sigma$ is replaced by the sample standard
deviation S, and the standard error of the mean is estimated by $S/\sqrt{n}$. Since S is only an estimate of the
true value of the standard deviation, the standardized sample mean no longer follows the
standard normal distribution. Instead, it follows the
t-distribution. The t-distribution is
described by its degrees of freedom. For a sample of size n, the t-distribution will have
n-1 degrees of freedom. The notation for a t-distribution with n-1 degrees of freedom is
$t_{(n-1)}$. As the sample size n increases, the t-distribution becomes closer to the normal
distribution, since the sample standard deviation approaches the true standard deviation for large n.

$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t-distribution with n-1 degrees of freedom.

- The value of $t_{\alpha/2}$ can be obtained from a table, as the value with an area of $\alpha/2$ to its right for
n-1 degrees of freedom.
Therefore, the $(1-\alpha)100\%$ confidence interval for $\mu$ when the population is normally
distributed and $\sigma$ is not known is given by:

$$\left(\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right)$$
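
The t-based interval can be sketched in the same way. Python's standard library has no t-distribution, so in this illustration the critical value is passed in after being read from a t-table, as the notes do. The function name `t_interval` and its parameters are assumptions made for the sketch, and the usage line borrows the numbers of Example 8.2.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown and n is small.

    t_crit is t_{alpha/2} with n-1 degrees of freedom, read from a t-table.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Numbers of Example 8.2: mean 2.28, S = 0.95, n = 6, t_{0.025} with 5 df = 2.571
low, high = t_interval(xbar=2.28, s=0.95, n=6, t_crit=2.571)
print(round(low, 2), round(high, 2))
```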
Example 8.1: A random sample of 900 workers showed an average height of 67 inches
with a standard deviation of 5 inches.
a) Find a 95% confidence interval for the mean height of all workers.
b) Find a 99% confidence interval for the mean height of all workers.
Solution:
a) $\bar{X} = 67$, S = 5, n = 900
$(1-\alpha)100\% = 95\% \Rightarrow 1-\alpha = 0.95 \Rightarrow \alpha = 0.05 \Rightarrow \alpha/2 = 0.025$
$\Rightarrow Z_{\alpha/2} = Z_{0.025} = 1.96$, from the table.

The required interval will be:

$$\left(\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}\right) = \left(67 - 1.96 \cdot \frac{5}{30},\ 67 + 1.96 \cdot \frac{5}{30}\right) = (66.673,\ 67.327)$$

b) $(1-\alpha)100\% = 99\% \Rightarrow 1-\alpha = 0.99 \Rightarrow \alpha = 0.01 \Rightarrow \alpha/2 = 0.005$
$\Rightarrow Z_{\alpha/2} = Z_{0.005} = 2.58$, from the table.

The required interval will be:

$$\left(\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}\right) = \left(67 - 2.58 \cdot \frac{5}{30},\ 67 + 2.58 \cdot \frac{5}{30}\right) = (66.57,\ 67.43)$$

Example 8.2: A drug company is testing a new drug which is supposed to reduce blood
pressure. For the six people who are used as subjects, it is found that the average drop in
blood pressure is 2.28 points, with a standard deviation of 0.95 points. What is the 95%
confidence interval for the mean change in pressure?
Solution:
$\bar{X} = 2.28$, S = 0.95, n = 6
$(1-\alpha)100\% = 95\% \Rightarrow 1-\alpha = 0.95 \Rightarrow \alpha = 0.05 \Rightarrow \alpha/2 = 0.025$
$\Rightarrow t_{\alpha/2} = t_{0.025} = 2.571$, from the table, with df = 5.

The required interval will be:

$$\left(\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}\right) = \left(2.28 - 2.571 \cdot \frac{0.95}{\sqrt{6}},\ 2.28 + 2.571 \cdot \frac{0.95}{\sqrt{6}}\right) = (1.28,\ 3.28)$$

Example 8.3: Suppose we want to estimate a 95% confidence interval for the average
quarterly returns of all fixed-income funds in Ethiopia. We draw a sample of 100
observations and calculate the sample mean to be 0.05 and the standard deviation 0.03. We
assume that those returns are normally distributed with known variance.
Solution:
$\bar{X} = 0.05$, $\sigma = 0.03$, n = 100
$(1-\alpha)100\% = 95\% \Rightarrow 1-\alpha = 0.95 \Rightarrow \alpha = 0.05 \Rightarrow \alpha/2 = 0.025$
$\Rightarrow Z_{\alpha/2} = Z_{0.025} = 1.96$, from the table.

The confidence interval is:

$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 0.05 \pm 1.96 \cdot \frac{0.03}{10} = (0.04412,\ 0.05588)$$
8.5. Point and Interval Estimation of the Population proportion: P
If P represents the population proportion, then the sample proportion $\hat{P} = X/n$, where X is the
number of sample units with the characteristic of interest, provides a
good estimate of P. Therefore, the sample proportion $\hat{P}$ is the point estimator of the
population proportion. To construct the confidence interval for the proportion we require
the following conditions:
Conditions: The population proportion is not too close to zero or one, and
the sample size is large (at least 30):
- Under these conditions, the sampling distribution of $\hat{P} = X/n$ can be approximated by a
normal distribution that has mean P and standard deviation $\sqrt{P(1-P)/n}$.
- To construct a confidence interval for P, we can now adopt the same argument that
was used in finding a confidence interval for $\mu$ and write:

$$P\left(\hat{P} - Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}} < P < \hat{P} + Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}\right) = 1 - \alpha$$

Hence a $(1-\alpha)100\%$ confidence interval for the population proportion P is given by:

$$\left(\hat{P} - Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}},\ \hat{P} + Z_{\alpha/2}\sqrt{\frac{P(1-P)}{n}}\right)$$

Since P is unknown, it is replaced by $\hat{P}$ inside the square root. An approximate $(1-\alpha)100\%$ confidence interval for the population proportion P is given
by:

$$\left(\hat{P} - Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}},\ \hat{P} + Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}}\right)$$

if the sample size is large (usually n > 30).

Example 8.4: In a sample of 400 people who were questioned regarding their participation
in sports, 160 said that they did participate. Construct a 98% confidence interval for P, the
proportion of people in the population who participate in sports.
Solution:
Let X be the number of people in the sample who participate in sports.
X = 160, n = 400, $\alpha = 0.02$. Hence $Z_{\alpha/2} = Z_{0.01} = 2.33$

$$\hat{P} = \frac{X}{n} = \frac{160}{400} = 0.4$$

$$\sigma_{\hat{P}} = \sqrt{\frac{\hat{P}(1-\hat{P})}{n}} = \sqrt{\frac{0.4(0.6)}{400}} = 0.0245$$

As a result, an approximate 98% confidence interval for P is given by:

$$\left(\hat{P} - Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}},\ \hat{P} + Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}}\right) = (0.4 - 2.33 \cdot 0.0245,\ 0.4 + 2.33 \cdot 0.0245) = (0.343,\ 0.457)$$

Hence, we can be about 98% confident that the true proportion of people in
the population who participate in sports is between 34.3% and 45.7%.
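
The approximate interval for a proportion can be sketched as a short Python function. The name `proportion_interval` and its parameters are illustrative assumptions; the critical value is taken from the standard-library `statistics.NormalDist` instead of a rounded table value, so the limits may differ from hand calculations in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Approximate large-sample confidence interval for a population proportion."""
    p_hat = x / n                                   # sample proportion
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    se = sqrt(p_hat * (1.0 - p_hat) / n)            # estimated standard deviation
    return p_hat - z * se, p_hat + z * se

# Data of Example 8.4: 160 participants out of 400, 98% confidence.
low, high = proportion_interval(x=160, n=400, confidence=0.98)
print(round(low, 3), round(high, 3))
```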
8.6 STATISTICAL HYPOTHESIS TESTING
A statistical hypothesis test is a method of making statistical decisions using
experimental data.
Hypothesis testing: A common method of drawing inferences about a population based
on statistical evidence from a sample.
Definitions:
Statistical hypothesis: An assertion, statement, or claim about the population whose
plausibility is to be evaluated on the basis of the sample data.
Test statistic: A statistic whose value serves to determine whether or not to reject the
hypothesis being tested. It is a random variable.
Statistical test: A test or procedure used to evaluate a statistical hypothesis; its outcome
depends on the sample data.
There are two types of hypothesis:

Null hypothesis: A claim or statement about a population parameter that is
assumed to be true from the very beginning until it is declared false. It is a statistical
hypothesis that states equality, or no difference, between a
parameter and a specific value. It is usually denoted by $H_0$.

Alternative hypothesis: A claim or statement about a population parameter that will be
true if the null hypothesis is false. It is a statistical hypothesis that states a
difference between a parameter and a specific value. It is usually denoted by $H_1$ or $H_A$.

Types and size of errors:

- Hypothesis testing is based on sample data, which may involve sampling and non-sampling
errors.
- Type I error: Rejecting the null hypothesis when it is actually true. The significance level
($\alpha$) is the probability of rejecting the null hypothesis when it is
actually true. The distribution of the test statistic under the null hypothesis
determines the probability $\alpha$ of a type I error.
$\alpha$ = P(type I error) = level of significance
- Type II error: Occurs when a false null hypothesis is not rejected. The null
hypothesis is actually false, but we wrongly fail to reject it. $\beta$
represents the probability that $H_0$ is not rejected when $H_0$ is actually false. The
distribution of the test statistic under the alternative hypothesis determines the
probability $\beta$ of a type II error.
$\beta$ = P(type II error)
- The power of a test ($1-\beta$) is the probability of correctly rejecting a false null
hypothesis.
$1-\beta$ = power of the test
Note: The two types of errors that occur in tests of hypothesis depend on each other. We
cannot lower the values of $\alpha$ and $\beta$ simultaneously for a test of hypothesis with a fixed
sample size. Lowering the value of $\alpha$ will raise the value of $\beta$, and lowering the value
of $\beta$ will raise the value of $\alpha$. However, we can decrease both $\alpha$ and $\beta$
simultaneously by increasing the sample size.
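
The trade-off described in the note can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions, not part of the original notes: it considers an upper-tailed Z test with known $\sigma$, and the function name `type2_error` and all numbers (hypothesized mean 100, true mean 105, $\sigma$ = 20) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def type2_error(mu0, mu1, sigma, n, alpha):
    """beta for the upper-tailed Z test of H0: mu = mu0 when the true mean is mu1 > mu0."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # H0 is not rejected when Z_cal <= z_alpha; under mu1, Z_cal ~ N(shift, 1).
    return STD_NORMAL.cdf(z_alpha - shift)

# Hypothetical setting: mu0 = 100, true mean 105, sigma = 20.
for n, alpha in [(25, 0.05), (25, 0.01), (100, 0.05), (100, 0.01)]:
    beta = type2_error(100, 105, 20, n, alpha)
    print(n, alpha, round(beta, 3), round(1 - beta, 3))   # n, alpha, beta, power
```

For a fixed n, the rows with the smaller $\alpha$ show the larger $\beta$; moving from n = 25 to n = 100 lowers $\beta$ at both significance levels.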

- The following table gives a summary of the possible results of any hypothesis
test:

                      Actual situation (condition)
Decision              H0 is true (H1 is false)    H0 is false (H1 is true)
Do not reject H0      Correct decision            Type II error
Reject H0             Type I error                Correct decision
General steps in hypothesis testing:
1. State the appropriate hypotheses.
2. Select the level of significance, $\alpha$.
3. Select an appropriate test statistic.
4. Identify the critical region.
5. Compute the test value.
6. Make the decision.
7. Summarize the results.
8.6.1 Hypothesis tests about a population mean: $\mu$
Suppose the assumed or hypothesized value of $\mu$ is denoted by $\mu_0$; then one can formulate
two-sided (1) and one-sided (2 and 3) hypotheses as follows:
1. $H_0: \mu = \mu_0$ VS $H_1: \mu \neq \mu_0$
2. $H_0: \mu = \mu_0$ VS $H_1: \mu > \mu_0$
3. $H_0: \mu = \mu_0$ VS $H_1: \mu < \mu_0$
Condition-1: The population standard deviation $\sigma$ is known, whatever the
sample size, and sampling is from a normal distribution:
- The formula for the test statistic is:

$$Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

After specifying α we have the following test criteria corresponding to the above three
hypotheses.

Null hypothesis      Alternative hypothesis     Reject H0 if
$\mu = \mu_0$        $\mu \neq \mu_0$           $|Z_{cal}| > Z_{\alpha/2}$
$\mu = \mu_0$        $\mu > \mu_0$              $Z_{cal} > Z_{\alpha}$
$\mu = \mu_0$        $\mu < \mu_0$              $Z_{cal} < -Z_{\alpha}$

Note: When we don't know the population standard deviation $\sigma$ and n is large ($n \ge 30$), we
can approximate it by the sample standard deviation S, and obtain the following test
statistic:

$$Z_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0,1)$$

- The decision rule is the same as in Condition-1.
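
The Z statistic and the three decision rules can be collected into one small function. This is a sketch only: the name `z_test`, the `alternative` labels, and the numbers in the usage line are assumptions introduced for illustration, and the critical values come from `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sd, n, alpha, alternative):
    """One-sample Z test. alternative is 'two-sided', 'greater' or 'less'.

    Returns (z_cal, reject_h0).
    """
    z_cal = (xbar - mu0) / (sd / sqrt(n))
    dist = NormalDist()
    if alternative == "two-sided":
        reject = abs(z_cal) > dist.inv_cdf(1.0 - alpha / 2.0)
    elif alternative == "greater":
        reject = z_cal > dist.inv_cdf(1.0 - alpha)
    elif alternative == "less":
        reject = z_cal < -dist.inv_cdf(1.0 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z_cal, reject

# Hypothetical data: sample mean 52 from n = 36, sd 6, testing mu0 = 50.
print(z_test(52, 50, 6, 36, 0.05, "two-sided"))
```

Passing `sd` as the known $\sigma$ gives the Condition-1 test; passing the sample S with a large n gives the approximation in the note.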

Condition-2: When the population standard deviation $\sigma$ is unknown, the population
is normally or approximately normally distributed, and the sample size is small (n < 30):
- The formula for the test statistic is:

$$t_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{(n-1)}$$

After specifying α we have the following test criteria corresponding to the above three
hypotheses, where the tabulated t-values have n-1 degrees of freedom.

Null hypothesis      Alternative hypothesis     Reject H0 if
$\mu = \mu_0$        $\mu \neq \mu_0$           $|t_{cal}| > t_{\alpha/2}$
$\mu = \mu_0$        $\mu > \mu_0$              $t_{cal} > t_{\alpha}$
$\mu = \mu_0$        $\mu < \mu_0$              $t_{cal} < -t_{\alpha}$
Example 8.5: The Tele Co. provides telephone service in an area. According to the
company's records, the average length of all calls placed was 12.5 minutes. A sample of 150
such calls placed through this Co. produced a mean length of 13 minutes with a standard
deviation of 2.6 minutes. Can you conclude that the mean length of all current calls is
different from 12.5 minutes? Use the 0.05 level of significance and assume that the
distribution of all calls is normal.
Solution:
Let $\mu$ = the population mean
1. State the null and alternative hypothesis:
$H_0: \mu = 12.5$ (The mean length of all current calls is 12.5 minutes) VS $H_1: \mu \neq 12.5$
(The mean length of all current calls is different from 12.5 minutes).
2. Select the level of significance, $\alpha$ = 0.05 (given)
3. Select an appropriate test statistic:
Z-statistic is appropriate because the sample size is large
4. Identify the critical region:
Here we have two critical regions since we have a two-tailed hypothesis. The
critical region is $|Z_{cal}| > Z_{0.025} = 1.96$
$\Rightarrow (-1.96, 1.96)$ is the acceptance region
5. Compute the test value
$\bar{X} = 13$, S = 2.6, n = 150

$$Z_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{13 - 12.5}{2.6/\sqrt{150}} = \frac{0.5}{0.2123} = 2.36$$

6. Decision:
Reject H0, since $Z_{cal}$ is not in the acceptance region
7. Conclusion: At the 5% level of significance, we have evidence to say that the
average length of all such calls is not equal to 12.5 minutes.
Example 8.6: Ten individuals are chosen at random from a population and their heights are
found to be, in inches, 63, 63, 66, 67, 68, 69, 70, 70, 71 and 71. The
average height of the population is assumed to be 66 inches. In light of the data, can we conclude that the average
height is increasing? (Use $\alpha = 0.05$ and assume the normality of the population)
Solution:
Let $\mu$ = the population mean
1. State the null and alternative hypothesis:
$H_0: \mu = 66$ VS $H_1: \mu > 66$
2. Select the level of significance, $\alpha$ = 0.05 (given)
3. Select an appropriate test statistic:
t-statistic is appropriate because the population standard deviation is
unknown and the sample size is small.
4. Critical region:
$t_{cal} > t_{\alpha, n-1} = t_{0.05, 9} = 1.8331$
$\Rightarrow (-\infty, 1.8331)$ is the acceptance region.
5. Compute the test value

$$\bar{X} = \frac{\sum_{i=1}^{10} X_i}{10} = 67.8, \quad S = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}} = 3.01, \quad n = 10$$

$$\Rightarrow t_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{67.8 - 66}{3.01/\sqrt{10}} = 1.891$$

6. Decision:
Reject H0, since $t_{cal}$ is not in the acceptance region
7. Conclusion: At the 5% level of significance, we have evidence to say that the
average height of an individual is greater than 66 inches.
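
The arithmetic of Example 8.6 can be checked with the standard-library `statistics` module. This sketch rests on an assumption that should be stated plainly: the ten heights are taken to be 63, 63, 66, 67, 68, 69, 70, 70, 71, 71, with 70 appearing twice, because that is the set that reproduces the stated mean of 67.8 and S = 3.01. The helper name `t_statistic` is illustrative, and the table value 1.8331 is the one quoted in the example.

```python
from math import sqrt
from statistics import mean, stdev

# Assumed data: ten heights, with 70 listed twice (see the note above).
heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]

def t_statistic(sample, mu0):
    """t_cal = (sample mean - mu0) / (S / sqrt(n)), with S using the n-1 divisor."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

t_cal = t_statistic(heights, mu0=66)
t_table = 1.8331      # t_{0.05} with 9 degrees of freedom, from a t-table
print(round(mean(heights), 1), round(stdev(heights), 2), round(t_cal, 2))
print("reject H0" if t_cal > t_table else "do not reject H0")
```

Because the code keeps S unrounded, the computed statistic differs slightly in the third decimal place from the hand calculation that uses S = 3.01.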
Example 8.7: A national magazine claims that the average college student watches less
television than the national average. The national average is 29.4 hours per week with a standard
deviation of 2 hours. A sample of 25 college students has a mean of 27 hours. Test the claim
at $\alpha = 0.01$ and assume normality of the population.
Solution:
1. State the null and alternative hypothesis:
$H_0: \mu = 29.4$ VS $H_1: \mu < 29.4$
2. Select the level of significance, $\alpha$ = 0.01 (given)
3. Select an appropriate test statistic:
Z-statistic is appropriate because the population standard deviation is
known.
4. Critical region:
$Z_{cal} < -Z_{\alpha} = -Z_{0.01} = -2.33$
$\Rightarrow (-2.33, \infty)$ is the acceptance region for the null hypothesis
5. Compute the test value
$\bar{X} = 27$, $\sigma = 2$, n = 25

$$Z_{cal} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{27 - 29.4}{2/\sqrt{25}} = -6$$

6. Decision:
Reject H0, since $Z_{cal}$ is not in the acceptance region
7. Conclusion: At the 1% level of significance, there is evidence that the average college
student watches less television.
Example 8.8: An authority from a district power station of the town told reporters
recently that the average monthly electric bill of households in AA is not more than Birr
100. A random sample of 400 households from the city produces a mean bill of Birr 105
with a standard deviation of Birr 40. Test the claim of the authority at the 5% level of
significance.
Solution:
1. State the null and alternative hypothesis:
$H_0: \mu \le 100$ (claim) VS $H_1: \mu > 100$
2. Select the level of significance, $\alpha$ = 0.05 (given)
3. Select an appropriate test statistic:
Z-statistic is appropriate because the sample size is large, so normality of the
population need not be assumed
4. Critical region:
$Z_{cal} > Z_{\alpha} = Z_{0.05} = 1.645$
$\Rightarrow (-\infty, 1.645)$ is the acceptance region for the null hypothesis
5. Compute the test value

$$Z_{cal} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{105 - 100}{40/\sqrt{400}} = 2.5$$

6. Decision:
Reject H0, since $Z_{cal}$ is not in the acceptance region
7. Conclusion: At the 5% level of significance the claim of the authority is not correct.

8.6.2 Tests about a population proportion: P
The procedure for testing hypotheses about the population proportion P for large
samples is similar in many respects to that for the population mean. The procedure includes the same
seven steps. Similarly, the test can be two-tailed or one-tailed. When the sample size is large,
the sample proportion $\hat{P}$ is approximately normally distributed with its mean equal to P
and standard deviation equal to $\sqrt{P(1-P)/n}$. Hence we use the normal distribution to
perform a test of hypothesis about the population proportion P for a large sample. The
sample size is considered to be large when $n\hat{P}$ and $n(1-\hat{P})$ are both greater than 5.
Suppose the assumed or hypothesized value of P (parameter of the binomial distribution) is
denoted by $P_0$; then one can formulate two-sided (1) and one-sided (2 and 3) hypotheses as
follows:
1. $H_0: P = P_0$ VS $H_1: P \neq P_0$
2. $H_0: P = P_0$ VS $H_1: P > P_0$
3. $H_0: P = P_0$ VS $H_1: P < P_0$

The choice of $H_1$ depends on the prior information we have about the value of P.

Decision Rule:

Null hypothesis      Alternative hypothesis     Reject H0 if
$P = P_0$            $P \neq P_0$               $|Z_{cal}| > Z_{\alpha/2}$
$P = P_0$            $P > P_0$                  $Z_{cal} > Z_{\alpha}$
$P = P_0$            $P < P_0$                  $Z_{cal} < -Z_{\alpha}$

The test statistic is:

$$Z_{cal} = \frac{\hat{P} - P_0}{\sqrt{\dfrac{P_0(1-P_0)}{n}}} \sim N(0,1)$$

Example 8.9: A manufacturing company has submitted a
claim that 90% of items produced by a certain process are non-defective. An improvement
in the process is being considered that, it is felt, will lower the proportion of defectives below
the current 10%. In an experiment 100 items are produced with the new process and 5 are
defective. Is this evidence sufficient to conclude that the method has been improved? Use a
0.05 level of significance.

Solution: As usual, we follow the steps, taking P to be the proportion of non-defective items:
1. $H_0: P = 0.9$ (actually $P \le 0.9$) VS $H_1: P > 0.9$
2. $\alpha = 0.05$
3. Critical region: $Z_{cal} > Z_{0.05} = 1.645$
4. Computation

$$\hat{P} = \frac{X}{n} = \frac{95}{100} = 0.95$$

$$Z_{cal} = \frac{\hat{P} - P_0}{\sqrt{\dfrac{P_0(1-P_0)}{n}}} = \frac{0.95 - 0.90}{\sqrt{\dfrac{0.9 \cdot 0.1}{100}}} = 1.67$$

5. Decision: Reject H0
6. Conclusion: At the 0.05 level of significance we have evidence to say that the improvement has
reduced the proportion of defectives.
Example 8.10: The unemployment rate in a given country at a given period is
believed to be 10%. The government embarked on a series of projects to reduce
unemployment. It was of interest to determine whether unemployment decreased as
a result of the projects. A random sample of 500 people was chosen, and 48 of them
were found to be unemployed. Test at the 1% level of significance whether the government
projects reduced the unemployment rate.
Solution: As usual, we follow the steps:
1. $H_0: P = 0.1$ VS $H_1: P < 0.1$
2. $\alpha = 0.01$
3. Critical region: $Z_{cal} < -Z_{\alpha}$, where $Z_{tab} = -Z_{\alpha} = -Z_{0.01} = -2.33$
4. Computation

$$\hat{P} = \frac{X}{n} = \frac{48}{500} = 0.096$$

$$Z_{cal} = \frac{\hat{P} - P_0}{\sqrt{\dfrac{P_0(1-P_0)}{n}}} = \frac{0.096 - 0.1}{\sqrt{\dfrac{0.1 \cdot 0.9}{500}}} = -0.30$$

5. Decision: Do not reject H0, since $Z_{cal} > Z_{tab}$
6. Conclusion: At the 1% level of significance, there is not enough evidence that the
government projects reduced unemployment.
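
The test statistic for a proportion is simple enough to check in a few lines. The sketch below is illustrative only; the function name `proportion_z` is an assumption, and the comparisons use the table values quoted in Examples 8.9 and 8.10.

```python
from math import sqrt

def proportion_z(x, n, p0):
    """Z statistic for H0: P = p0, using p0 in the standard deviation."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1.0 - p0) / n)

z_89 = proportion_z(95, 100, 0.90)    # Example 8.9: upper-tailed, table value 1.645
z_810 = proportion_z(48, 500, 0.10)   # Example 8.10: lower-tailed, table value -2.33
print(round(z_89, 2), "reject H0" if z_89 > 1.645 else "do not reject H0")
print(round(z_810, 2), "reject H0" if z_810 < -2.33 else "do not reject H0")
```

Note that the hypothesized value $P_0$, not the sample proportion, is used in the denominator, because the statistic is evaluated under the assumption that the null hypothesis is true.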
Example 8.11: A sample of 200 students from a certain high
school is interviewed and 85 of them are found to use the city bus. Can you conclude
that at least 40% of the students use the city bus? Use a 0.05 level of significance.
(Exercise)
8.7 Test of Association
In the previous section we saw how to test hypotheses for numeric data given in
the form of a mean or a proportion. It is also possible to apply hypothesis testing to categorical
data.
- Suppose we have a population consisting of observations having two attributes or
qualitative characteristics, say A and B.
- If the attributes are independent, then the probability of possessing both A and B is $P_A \cdot P_B$,
where $P_A$ is the probability that a member has attribute A and
$P_B$ is the probability that a member has attribute B.
- Suppose A has r mutually exclusive and exhaustive classes and
B has c mutually exclusive and exhaustive classes.
- The entire set of data can be represented using an r × c contingency table, as shown below.

                          B
A        B1     B2     ...    Bj     ...    Bc     Total
A1       O11    O12    ...    O1j    ...    O1c    R1
A2       O21    O22    ...    O2j    ...    O2c    R2
...
Ai       Oi1    Oi2    ...    Oij    ...    Oic    Ri
...
Ar       Or1    Or2    ...    Orj    ...    Orc    Rr
Total    C1     C2     ...    Cj     ...    Cc     n

- The chi-square test is used to test the hypothesis of independence of two
attributes.
- The statistic is given by:

$$\chi^2_{cal} = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij} - e_{ij})^2}{e_{ij}} \sim \chi^2 \text{ with } (r-1)(c-1) \text{ degrees of freedom}$$

where $O_{ij}$ = the number of units that belong to category i of A and j of B,
$e_{ij}$ = the expected frequency for category i of A and j of B, given
by

$$e_{ij} = \frac{R_i \times C_j}{n}$$

where $R_i$ = the i-th row total,
$C_j$ = the j-th column total,
n = the total number of observations.
Remarks:

$$\sum_{i=1}^{r}\sum_{j=1}^{c} O_{ij} = \sum_{i=1}^{r}\sum_{j=1}^{c} e_{ij}$$

- The null and alternative hypotheses may be stated as:

H0: There is no association between A and B.
H1: not H0 (There is an association between A and B).
Decision Rule:
- Reject $H_0$ (independence) at the α level of significance if the calculated value of $\chi^2$ exceeds
the tabulated value with degrees of freedom equal to (r-1)(c-1).
Example 8.12: A researcher is interested in assessing the effect of literacy on family planning
use. Accordingly, he collected data and tabulated the findings in the following manner. Can
we say there is an association between educational status and family planning use?

                Educational status
FP use      Illiterate    Literate    Total
Yes         a = 63        b = 49      112
No          c = 15        d = 33      48
Total       78            82          160

Example 8.13: A geneticist took a random sample of 300 men to study whether there is
an association between father and son regarding boldness. He obtained the following
results.

                Son
Father      Bold    Not
Bold        85      59
Not         65      91

Test whether there is an association between father and son regarding boldness, using α = 5%.
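
The chi-square computation described in this section can be sketched in plain Python. The function name `chi_square_independence` is an assumption made for illustration; the table passed in is the one from Example 8.13, and 3.841 is the standard chi-square table value for α = 0.05 with 1 degree of freedom.

```python
def chi_square_independence(table):
    """Return (chi_square, degrees_of_freedom) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # e_ij = R_i * C_j / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Example 8.13: rows are the father's categories, columns are the son's.
chi2, df = chi_square_independence([[85, 59], [65, 91]])
critical = 3.841   # chi-square table value for alpha = 0.05 and 1 df
print(round(chi2, 3), df, "reject H0" if chi2 > critical else "do not reject H0")
```

The same function handles larger tables, such as a 2 × 3 layout, where the degrees of freedom become (2-1)(3-1) = 2 and the table value must be looked up for that df.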

Example 8.14: A random sample of 200 men, all retired, was classified according to
education and number of children, as shown below.

                          Number of children
Education level           0-1     2-3     Over 3
Elementary                14      37      32
Secondary and above       31      59      27

Test whether there is an association between education and number of children, using α = 5%.
