
Chapter II

This document defines key concepts in hypothesis testing including the null hypothesis, alternative hypothesis, type I and type II errors, power of a test, and p-value. It then outlines the steps to conduct hypothesis testing which include formulating hypotheses, choosing a significance level, collecting or using data, calculating a test statistic, determining if it falls in the acceptance region, and deciding whether to reject or fail to reject the null hypothesis. Finally, it provides an example of a chi-square test to examine the independence of two variables.

Uploaded by

Danida Gisele

2 HYPOTHESIS TESTING

2.1 Definitions

2.1.1 Hypothesis

A hypothesis is a claim or an opinion about an item or issue. It therefore has to be tested
statistically in order to establish whether it is correct or not. Whenever testing a hypothesis,
one must fully understand the two basic hypotheses to be tested, namely the null hypothesis
(H0) and the alternative hypothesis (H1).
Whenever we have a decision to make about a population characteristic, we make a hypothesis.

2.1.2 Null hypothesis

This is the hypothesis being tested: the belief about a certain characteristic. For example, the
Kenya Bureau of Standards (KBS) may visit a sugar-making company with the intention of
confirming that the 2 kg bags of sugar produced actually weigh 2 kg and not less. They conduct
a hypothesis test with the null hypothesis H0: each bag weighs 2 kg. The test will set out
to confirm this or to refute it.

2.1.3 Alternative hypothesis

While formulating a null hypothesis we also consider that the belief might be found to be
untrue, in which case we reject it. We therefore formulate an alternative hypothesis that
contradicts the null hypothesis; when we reject the null hypothesis we accept the
alternative hypothesis.
In our example the alternative hypothesis would be H1: each bag does not weigh 2 kg.
Note: The null hypothesis contains the equality (=, or ≤ / ≥ in a one-sided test), since we
are comparing µ with a previously determined mean. For the alternative hypothesis, the
choices are <, >, or ≠. The signs < and > are for one-sided hypothesis testing, while ≠ is
for two-sided hypothesis testing.

Errors in Hypothesis Tests

2.1.4 Type I error

We define a "type I error" as the event of rejecting the null hypothesis when the null
hypothesis is true (concluding that there is a true difference when no true difference
exists). The probability of a type I error (α) is called the significance level.

2.1.5 Type II error

We define a "type II error" as the event of failing to reject the null hypothesis when the
null hypothesis is false (failing to detect a true difference that does exist). The
probability of a type II error is denoted β.
In general, the two types of error are summarized in the following table:

Table 1: Types of errors

                            Null hypothesis
Action                      True                False
Fail to reject (accept)     Correct             Type II error (β)
Reject                      Type I error (α)    Correct

2.1.6 Power of a test

Power is the probability of making a correct decision (to reject the null hypothesis when
the null hypothesis is false). The power of a test is given by 1 − β

2.1.7 Probability value (P-Value)

The p-value is the probability, computed assuming the null hypothesis is true, of obtaining
a test statistic at least as extreme as the one actually observed. The p-value is used as an
alternative to rejection points: it is the smallest level of significance at which the null
hypothesis would be rejected. A smaller p-value means that there is stronger evidence
against the null hypothesis and in favor of the alternative hypothesis.
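
As an illustration, the p-value of a z statistic can be sketched with Python's standard library. The observed value z = 2.10 and the function name z_p_value are hypothetical, chosen only for this example; they do not come from the text.

```python
from statistics import NormalDist


def z_p_value(z, two_sided=True):
    """Return the p-value of an observed z statistic under the standard normal.

    For a one-sided test this returns the tail area beyond |z|, which assumes
    the statistic falls on the side named by the alternative hypothesis.
    """
    tail = 1.0 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return 2.0 * tail if two_sided else tail


z_observed = 2.10   # hypothetical test statistic, for illustration only
alpha = 0.05        # chosen level of significance

p = z_p_value(z_observed)
reject_h0 = p < alpha   # reject H0 when the p-value is below alpha
print(f"p-value = {p:.4f}, reject H0 at alpha = {alpha}: {reject_h0}")
```

Comparing the p-value with α gives the same decision as comparing the test statistic with the tabulated value.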

Procedures in Hypothesis Testing

When we test a hypothesis we proceed as follows:

1. Formulate the null and alternative hypothesis.

2. Choose a level of significance.

3. Determine the sample size.

4. Collect data or use given data

5. Calculate the test statistic.

6. Use the appropriate statistical table to determine whether the test statistic falls
within the acceptance region.

7. Decide either to reject the null hypothesis and therefore accept the alternative
hypothesis, or to fail to reject the null hypothesis and therefore state that there is not
enough evidence to suggest the truth of the alternative hypothesis.
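
The final comparison in this procedure can be sketched in a few lines of Python. The helper name decide is hypothetical; the two pairs of numbers are the calculated and tabulated values from the chi-square and independent samples t-test examples worked later in this chapter.

```python
def decide(test_statistic, tabulated_value):
    """Compare the absolute test statistic with the tabulated (critical) value."""
    if abs(test_statistic) > tabulated_value:
        return "reject H0"
    return "fail to reject H0"


# Chi-square example: calculated 33.33 against tabulated 3.84
print(decide(33.33, 3.84))
# Independent samples t-test example: calculated -0.6 against tabulated 2.120
print(decide(-0.6, 2.120))
```

Using the absolute value suits two-sided tests; for a one-sided test the sign of the statistic must also agree with the direction of H1.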

2.2 Standard Parametric Tests

2.2.1 Chi-Square test

Review of Chi-square distribution

In probability theory and statistics, the chi-square distribution (also chi-squared or
χ2-distribution) with k degrees of freedom is the distribution of a sum of the squares of k
independent standard normal random variables (standard normal deviates). The degrees of
freedom of the distribution equal the number of standard normal variables being summed,
and these variables must be independent.

The key characteristics of the chi-square distribution depend directly on the degrees of
freedom. As the degrees of freedom increase, the chi-square distribution approaches a
normal distribution.
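
The definition can be checked with a small simulation, sketched below. The choice of k = 4, the seed, the number of draws and the function name chi_square_draw are arbitrary assumptions for illustration. The comparison relies on a standard fact not stated above: a chi-square variable with k degrees of freedom has mean k.

```python
import random
from statistics import mean


def chi_square_draw(k, rng):
    """One draw: the sum of squares of k independent standard normal variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)   # fixed seed so the run is repeatable
k = 4                    # degrees of freedom (illustrative choice)
draws = [chi_square_draw(k, rng) for _ in range(20000)]

sample_mean = mean(draws)   # should be close to k
print(f"k = {k}, mean of simulated draws = {sample_mean:.3f}")
```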

Chi-square test

The chi-square test of independence is used to compare frequencies or proportions in two
or more groups, that is, to test whether two categorical variables are independent.
The logic of the chi-square test is as follows:

• The total number of observations in each column and the total number of observations
in each row are considered to be given or fixed. (marginal frequencies)

• After assuming that the columns and rows are independent, we can calculate the
number of observations expected to occur by chance (expected frequencies).

• The expected frequency can be found by multiplying the row total by the column total and
dividing by the grand total, i.e.

\[ \text{Expected frequency} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}} \]

• Chi-square test compares the observed frequency in each cell with the expected fre-
quency.

• If no relationship exists between the column and row variables, then the observed
frequencies will be very close to the expected frequencies; they will differ only by
small amounts.

In this instance, the value of the chi-square statistic will be small. On the other hand, if
a relationship (dependency) does exist, then the observed frequencies will vary quite a bit
from the expected frequencies, and the value of the chi-square statistic will be large.
So chi-square is given as

\[ \chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \tag{1} \]

Where:
d.f. = degrees of freedom = (r − 1)(c − 1), where r = I and c = J are the number of rows and
columns, respectively
O_ij = observed frequency (the observed cell count in the ij-th cell)
E_ij = expected frequency (the estimated expected count under the null hypothesis in
the ij-th cell of an I × J table)
Here the hypothesis to be tested is:
Here the hypothesis to be tested is:

H0 : The two variables are independent/not associated

H1 : The two variables are dependent/associated

Reject H0 when the calculated value of chi-square is greater than the tabulated chi-square
(obtained from the chi-square table at a given significance level such as 5%).
Construction of decision rule, if:

χ2 ≤ χ2(α, (r − 1)(c − 1)), conclude H0

χ2 > χ2(α, (r − 1)(c − 1)), conclude H1

Example
Table 2 summarizes students' knowledge of statistical computer software and their
Statistics II exam performance. Test whether or not there is an association between
students' knowledge of statistical computer software and Statistics II exam performance
(i.e. calculate chi-square and draw a conclusion about the independence of the two
variables). Control the significance level at 5%.

Table 2

                      Course performance
Software knowledge    Good    Bad    Total
Yes                   70      5      75
No                    10      15     25
Total                 80      20     100

Solution

H0: There is no association between students' knowledge of statistical computer software
and Statistics II exam performance (the two variables are independent)

H1: There is an association between students' knowledge of statistical computer software
and Statistics II exam performance (the two variables are dependent)

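The expected frequencies are:

\[ E_{11} = \frac{75 \times 80}{100} = 60, \qquad E_{12} = \frac{75 \times 20}{100} = 15 \]

\[ E_{21} = \frac{25 \times 80}{100} = 20, \qquad E_{22} = \frac{25 \times 20}{100} = 5 \]

\[ \chi^2 = \frac{(70 - 60)^2}{60} + \frac{(5 - 15)^2}{15} + \frac{(10 - 20)^2}{20} + \frac{(15 - 5)^2}{5} = 33.33 \]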

The tabulated value is χ2(0.05, (2 − 1)(2 − 1)) = χ2(0.05, 1) = 3.84.

Since the calculated χ2 (33.33) is greater than the tabulated χ2 (3.84), we reject H0 and
conclude that there is an association between students' knowledge of statistical computer
software and Statistics II exam performance.
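
The calculation above can be reproduced with a short Python sketch using only built-in functions. The variable names are illustrative, and the tabulated value 3.84 is copied from the text rather than computed.

```python
# Observed counts from Table 2: rows = software knowledge (Yes, No),
# columns = course performance (Good, Bad).
observed = [[70, 5],
            [10, 15]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total x column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O - E)^2 / E
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1)
tabulated = 3.84                                    # chi-square(0.05, 1), from the text
reject_h0 = chi_square > tabulated

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```

The same code applies to larger tables, such as the one in the exercise that follows, provided the tabulated value is replaced by the one for the new degrees of freedom.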

Exercise
Visa Card USA studied how frequently consumers of various age groups use plastic cards
(debit and credit cards) when making purchases (Associated Press, January 16, 2006 ).
Sample data for 300 customers shows the use of plastic cards by four age groups.

Table 3

                              Age group
Payment          18-24    25-34    35-44    45 and over    Total
Plastic          21       27       27       36             111
Cash or check    21       36       42       90             189
Total            42       63       69       126            300

(a.) Test for the independence between method of payment and age group. Use α = 5%,
what is your conclusion?

(b.) If method of payment and age group are not independent, what observation can you
make about how different age groups use plastic to make purchases?

(c.) What implications does this study have for companies such as Visa, MasterCard, and
Discover?

2.2.2 Normal test

The normal test is used to test a sample mean (X̄) against a population mean (µ) (where the
sample size n > 30 and the population standard deviation σ is known) and to test a sample
proportion p (where np > 5 and nq > 5, with q = 1 − p, since in this case the normal
distribution can be used to approximate the binomial distribution).
Here the test statistic is

\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \tag{2} \]

Construction of decision rule, if:

|z| calculated ≤ z tabulated, conclude H0

|z| calculated > z tabulated, conclude H1

The tabulated z value is z(1 − α/2) for two-sided hypothesis testing and z(1 − α) for
one-sided hypothesis testing. For a one-sided test, the sign of z must also agree with the
direction of H1.

When testing a population proportion, the process is completely analogous to that for the
mean, although we need to use the standard deviation formula for a proportion,
$\sqrt{pq/n}$, instead of the standard deviation formula for a mean, $s/\sqrt{n}$. If the
sample is large, we can use the central limit theorem to say that the distribution of
sample proportions is approximately normal.
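
By analogy with equation (2), the test statistic for a proportion can be written as below. This is a sketch under assumed notation that does not appear in the text: $\hat{p}$ denotes the sample proportion and $p_0$ the proportion stated in H0.

```latex
\[
  z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}, \qquad q_0 = 1 - p_0
\]
```

The decision rule is the same as for the mean: compare the calculated z with the tabulated z value.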
Example
An NGO carried out a survey in a certain community in order to establish the average age
at which girls are married. The results of a previous study indicated that the mean
marriage age for girls is 19 years. In order to establish the validity of this mean
marriage age, a sample of 50 women was interviewed, and their average age at marriage was
16 years, with a standard deviation of 2.1 years. The sample data suggest that the mean
marriage age is less than 19 years. Is this conclusion true or not?
Solution

• The hypothesis to be tested is:

H0 : µ (mean marital age) = 19 years

H1 : µ (mean marital age) < 19years

• The level of significance is 5%

• The sample mean age, X̄ = 16 years and the sample standard deviation s = 2.1 years

• The population mean age, µ = 19 years

• The sample size n = 50 > 30

• The test statistic is

\[ z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{16 - 19}{2.1 / \sqrt{50}} = -10.1, \qquad |z| = 10.1 \]

• The tabulated z value corresponding to α = 5% for a one-sided hypothesis is the z value
that leaves an area of 1 − α under the normal curve to its left. The area 0.95 lies midway
between the table entries 0.9495 (z = 1.64) and 0.9505 (z = 1.65), so

\[ z_{1-\alpha} = z_{0.95} = \frac{1.64 + 1.65}{2} = 1.645 \approx 1.65 \]

• Since |z| = 10.1 > tabulated z = 1.65 (equivalently, z = −10.1 < −1.65), we reject H0 and
conclude H1: the mean marriage age in this community is significantly lower
than 19 years.
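
The same test can be sketched in Python with the standard library, where NormalDist().inv_cdf takes the place of the printed z table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 19.0    # mean marriage age under H0 (years)
x_bar = 16.0   # sample mean (years)
s = 2.1        # sample standard deviation (years)
n = 50         # sample size
alpha = 0.05   # level of significance

# Test statistic z = (x_bar - mu_0) / (s / sqrt(n))
z = (x_bar - mu_0) / (s / sqrt(n))

# One-sided tabulated value z(1 - alpha)
z_tabulated = NormalDist().inv_cdf(1 - alpha)

# H1 is mu < 19, so reject H0 when z falls below -z_tabulated
reject_h0 = z < -z_tabulated

print(f"z = {z:.2f}, tabulated z = {z_tabulated:.3f}, reject H0: {reject_h0}")
```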

Exercise 1
A foreign company which manufactures electric bulbs has assured its customers that the
lifespan of the bulbs is 28 months with a standard deviation of 4 months. Recently the
company embarked on quality improvement research for its product. After the research,
using new technology, a sample of 70 bulbs was tested and gave a mean lifespan of 30.2
months. Does this justify the research undertaken? Use a 1% level of significance to
conduct a statistical test in order to answer this question.

Exercise 2
A construction firm placed an order for a consignment of wires with a mean length of
10.5 m and a standard deviation of 1.7 m. The company which produces the wires delivered
90 wires, which had a mean length of 9.2 m. The construction company rejected the
consignment on the grounds that the wires were different from the order placed.

2.2.3 Student’s t- tests

Review of t-distribution

The t distribution is similar in shape to the z distribution, and one of its major uses is
to answer research questions about means. The t distribution is symmetric and has a mean
of 0, but its standard deviation is larger than 1. The precise size of the standard
deviation depends on the degrees of freedom (d.f.), which are determined by the sample
size. Because the t distribution has a larger standard deviation, it is wider and its tails
are higher than those of the z distribution. As the sample size increases, the degrees of
freedom also increase, and the t distribution becomes almost the same as the standard
normal distribution. When the sample size is 30 or more, the t and z distribution curves
are so close that either distribution can be used.

t- test

A t-test is an inferential statistical test used to determine whether there is a
significant difference between the means of two groups (which may be related in certain
features) or between the mean of one group and a pre-specified value. The t-test is one of
many tests used for the purpose of hypothesis testing in statistics.
Calculating a two-group t-test requires three key data values: the difference between
the mean values of the two data sets (called the mean difference), the standard deviation
of each group, and the number of data values in each group.
There are three different types of t-test that can be performed, depending on the data and
objectives.

1. One sample t-test: tests the mean of a single group against a known mean.

2. Independent samples t-test: compares the means of two independent groups (samples).

3. Paired (matched) sample t-test: compares means from the same group at different
times (say, one year apart), so the samples are dependent and essentially connected:
they are tests on the same person or thing.

Generally, the hypothesis to be tested is:

H0 : Population means among groups are equal

H1 : Population means among groups are not equal

Reject H0 when the absolute value of the calculated t-value is greater than the tabulated
t-value (obtained from the t-distribution table).

Construction of decision rule, if:

|t| calculated ≤ t tabulated, conclude H0

|t| calculated > t tabulated, conclude H1

One sample t-test

Exercise
The following table shows the weights of nine female students. Using a t-test at the 5%
level of significance, draw a conclusion about whether the mean weight of the selected
female students equals the female population mean weight of 55 kg.

Table 4

Female Student Wt. in Kg


55
50
50
50
55
50
60
55
50

The hypothesis to be tested is the following:

H0: µ = 55

H1: µ ≠ 55

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \tag{3} \]

Where:

• The sample mean:

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{475}{9} = 52.778 \]

• The sample variance:

\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right) = \frac{105.556}{8} = 13.194 \]

\[ s = \sqrt{13.194} = 3.632 \]

• The sample size: n = 9

Hence, the test statistic is given by:

\[ t = \frac{52.778 - 55}{3.632 / \sqrt{9}} = -1.835, \qquad |t| = 1.835 \]

Therefore, we check the table: t(α/2, n − 1) = t(0.025, 8) = 2.306. The absolute value of
the test statistic is less than the tabulated value (1.835 < 2.306): do not reject H0.
There is not enough evidence to conclude that the female population mean weight differs
from 55 kg.
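
The one sample t-test on the Table 4 data can be sketched in Python as follows. The tabulated value 2.306 is copied from the text; statistics.stdev uses the n − 1 denominator, matching the sample variance formula. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weights = [55, 50, 50, 50, 55, 50, 60, 55, 50]   # Table 4, weights in kg
mu_0 = 55                                        # population mean under H0

n = len(weights)
x_bar = mean(weights)   # sample mean
s = stdev(weights)      # sample standard deviation (n - 1 denominator)

# Test statistic t = (x_bar - mu_0) / (s / sqrt(n))
t = (x_bar - mu_0) / (s / sqrt(n))

t_tabulated = 2.306     # t(0.025, 8), from the text
reject_h0 = abs(t) > t_tabulated   # two-sided decision rule

print(f"mean = {x_bar:.3f}, s = {s:.3f}, t = {t:.3f}, reject H0: {reject_h0}")
```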

Independent samples t-test

Exercise
The following table shows the ages of nine male and nine female individuals. Using a
t-test at the 5% level of significance, draw a conclusion about the equality of the mean
ages of males and females.

Table 5

Age of Male Age of Female


26 40
22 17
18 15
38 44
18 16
15 20
27 28
17 48
52 37

The hypothesis to be tested is the following:

H0: µ1 = µ2

H1: µ1 ≠ µ2

The test statistic is given by:

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \tag{4} \]

In our example:

X1: age of male

X2: age of female

\[ \bar{x}_1 = 25.889, \qquad \bar{x}_2 = 29.444 \]

\[ s_1^2 = 145.86, \qquad s_2^2 = 170.028 \]

\[ t = \frac{25.889 - 29.444}{\sqrt{\dfrac{145.86}{9} + \dfrac{170.028}{9}}} = -0.6, \qquad |t| = 0.6 \]

Therefore, we check the table: t(α/2, n1 + n2 − 2) = t(0.025, 16) = 2.120. The absolute
value of the test statistic is less than the tabulated value (0.6 < 2.120): do not reject
H0. There is not enough evidence to conclude that the population mean age of males differs
from the population mean age of females.
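
The independent samples calculation can be sketched in Python from the Table 5 data. The tabulated value 2.120 is copied from the text; statistics.variance uses the n − 1 denominator. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

male = [26, 22, 18, 38, 18, 15, 27, 17, 52]     # Table 5, age of male
female = [40, 17, 15, 44, 16, 20, 28, 48, 37]   # Table 5, age of female

n1, n2 = len(male), len(female)
x1_bar, x2_bar = mean(male), mean(female)
s1_sq, s2_sq = variance(male), variance(female)   # sample variances

# Test statistic from equation (4)
t = (x1_bar - x2_bar) / sqrt(s1_sq / n1 + s2_sq / n2)

t_tabulated = 2.120   # t(0.025, 16), from the text
reject_h0 = abs(t) > t_tabulated   # two-sided decision rule

print(f"means = {x1_bar:.3f}, {x2_bar:.3f}")
print(f"variances = {s1_sq:.3f}, {s2_sq:.3f}")
print(f"t = {t:.3f}, reject H0: {reject_h0}")
```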

Paired (Matched) samples t-test

Example
A market research firm used a sample of individuals to rate the purchase potential of a
particular product before and after the individuals saw a new television commercial about
the product. The purchase potential ratings were based on a 0 to 10 scale, with higher
values indicating a higher purchase potential. Test whether mean rating after is greater
than the mean rating before. Use α = 5%.

Note: Here you can use one sample t-test using difference (di ) and t with n − 1 d.f in
comparing to zero.

Table 6: Purchase Rating

Before After
5 6
4 6
7 7
3 4
5 3
8 9
5 7
6 6

Table 7

Before After After-Before


5 6 1
4 6 2
7 7 0
3 4 1
5 3 -2
8 9 1
5 7 2
6 6 0

The hypothesis to be tested is the following:

H0: µd ≤ 0

H1: µd > 0

where d_i = After − Before is the difference for individual i.

\[ t = \frac{\bar{x}_d - 0}{s_d / \sqrt{n}} \tag{5} \]

Where:

\[ \bar{x}_d = \frac{5}{8} = 0.625 \]

\[ s_d^2 = 1.696, \qquad s_d = 1.302 \]

Hence,

\[ t = \frac{0.625 - 0}{1.302 / \sqrt{8}} = 1.357 \]

Since we have a one-sided hypothesis, we check the table: t(α, n − 1) = t(0.05, 7) = 1.895.
The test statistic is less than the tabulated value (1.357 < 1.895): do not reject H0.
Hence we cannot conclude that the mean rating after is greater than the mean rating before
(i.e. there is no evidence that the commercial improved the mean purchase potential rating).
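
The paired test can be sketched in Python by reducing it to a one sample t-test on the differences, as the note before Table 6 suggests. The tabulated value 1.895 is copied from the text; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

before = [5, 4, 7, 3, 5, 8, 5, 6]   # Table 6, rating before the commercial
after = [6, 6, 7, 4, 3, 9, 7, 6]    # Table 6, rating after the commercial

# Differences d_i = After - Before, as in Table 7
d = [a - b for a, b in zip(after, before)]

n = len(d)
d_bar = mean(d)   # mean difference
s_d = stdev(d)    # standard deviation of the differences (n - 1 denominator)

# One sample t statistic on the differences, compared with zero
t = (d_bar - 0) / (s_d / sqrt(n))

t_tabulated = 1.895   # t(0.05, 7), from the text
reject_h0 = t > t_tabulated   # one-sided rule, since H1 is mu_d > 0

print(f"d = {d}")
print(f"mean difference = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.3f}")
print(f"reject H0: {reject_h0}")
```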

Exercise 1
In recent years, a growing array of entertainment options competes for consumer time.
By 2004, cable television and radio surpassed broadcast television, recorded music, and
the daily newspaper to become the two entertainment media with the greatest usage (The
Wall Street Journal, January 26, 2004). Researchers used a sample of 15 individuals and
collected data on the hours per week spent watching cable television and hours per week
spent listening to the radio.

Table 8: TV and Radio data

Individual Television Radio


1 22 25
2 8 10
3 25 29
4 22 19
5 12 13
6 26 28
7 22 23
8 19 21
9 21 21
10 23 23
11 14 15
12 14 18
13 14 17
14 16 15
15 24 23

a. What is the sample mean number of hours per week spent watching cable television?
What is the sample mean number of hours per week spent listening to radio? Which
medium has the greater usage?

b. Use a 5% level of significance and test for a difference between the population mean
usage for cable television and radio.

Exercise 2
Consider a new computer software package developed to help systems analysts reduce the
time required to design, develop, and implement an information system. To evaluate the
benefits of the new software package, a random sample of 24 systems analysts is selected.
Each analyst is given specifications for a hypothetical information system. Then 12 of the
analysts are instructed to produce the information system by using current technology. The
other 12 analysts are trained in the use of the new software package and then instructed
to use it to produce the information system. The researcher in charge of the new software
evaluation project hopes to show that the new software package will provide a shorter mean
project completion time. Use the appropriate test to determine whether this belief is
supported by the data.

Table 9: Completion time data

Current Technology New Software


300 274
280 220
344 308
385 336
372 198
360 300
288 315
321 258
376 318
290 310
301 332
283 263
