BUP-05-Hypothesis Testing
Dr. Md. Abdus Salam Akanda, Professor of Statistics, DU
Website: http://du.ac.bd
Test of Hypothesis-I
Examples:
(1) A rickshaw puller hypothesizes that more than fifty percent of our politicians are
corrupt.
(2) A popular band singer claims that copyright law will never be effective in our
country.
Null hypothesis: The word "null" indicates that the hypothesis is set up to be possibly
rejected. The null hypothesis is a statement that no difference exists between the parameter
and the statistic being compared to it. The null hypothesis is always denoted by $H_0$.
Alternative hypothesis: The alternative hypothesis is the logical opposite of the null
hypothesis. The alternative hypothesis is usually denoted by $H_1$ or $H_a$.
Example: In the population, there is a significant difference between male and female
children in the prevalence rate of malnutrition.
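One-tailed test: A test of any statistical hypothesis where the alternative is one-sided, such as
$$H_0: \theta = \theta_0, \qquad H_1: \theta > \theta_0$$
or perhaps
$$H_0: \theta = \theta_0, \qquad H_1: \theta < \theta_0$$
is called a one-tailed test.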
Two-tailed test: A test of any statistical hypothesis where the alternative is two-sided, such as
$$H_0: \theta = \theta_0, \qquad H_1: \theta \neq \theta_0$$
is called a two-tailed test.
Type I error: The error of rejecting $H_0$ (accepting $H_1$) when $H_0$ is true is called a type I
error. The probability of a type I error is denoted by $\alpha$ and is called the level of significance.
Type II error: The error of accepting $H_0$ when $H_0$ is false ($H_1$ is true) is called a type II
error. The probability of a type II error is denoted by $\beta$.
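In symbols, the two error probabilities defined above can be written as conditional probabilities. This is only a restatement of the definitions, not an additional result:

```latex
\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}), \qquad
\beta  = P(\text{accept } H_0 \mid H_0 \text{ is false})
```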
p-value: The p-value, or observed significance level, of a test is the smallest value of $\alpha$ for
which the null hypothesis can be rejected.
The decision rules that most researchers follow in stating their results are as follows:
If the p-value is less than 0.01, the results are regarded as highly significant.
If the p-value is between 0.01 and 0.05, the results are regarded as statistically
significant.
If the p-value is between 0.05 and 0.10, the results are regarded as only tending toward
statistical significance.
If the p-value is greater than 0.10, the results are regarded as not statistically significant.
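The decision rules above can be sketched as a small lookup function. This is only an illustration: the function name describe_p_value is hypothetical, and because the rules do not say which band the boundary values 0.01, 0.05 and 0.10 belong to, the sketch assumes each boundary falls into the band that starts at it (with 0.10 itself counted as "tending toward").

```python
def describe_p_value(p):
    """Map a p-value to the verbal label used in the decision rules.

    Assumption (not stated in the notes): a p-value exactly equal to 0.01
    or 0.05 is placed in the band that begins at that value, and 0.10 is
    placed in the "tending toward" band.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "statistically significant"
    if p <= 0.10:
        return "tending toward statistical significance"
    return "not statistically significant"


# Illustrative p-values only; they are not taken from any data set.
for p in (0.004, 0.03, 0.069, 0.25):
    print(p, "->", describe_p_value(p))
```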
Critical region or rejection region: A region of rejection is a set of possible values of the
sample statistic, which provides evidence to contradict the null hypothesis and leads to a
decision to reject the null hypothesis.
Acceptance region: A region of acceptance is a set of possible values of the sample statistic,
which provides evidence to support the null hypothesis and leads to a decision to accept it.
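The following sketch shows how the rejection region depends on whether the test is one-tailed or two-tailed. It assumes, purely for illustration, that the test statistic follows a standard normal distribution under the null hypothesis; the function name in_rejection_region and the labels "greater", "less" and "two-sided" are hypothetical and do not come from the notes.

```python
from statistics import NormalDist


def in_rejection_region(z, alpha=0.05, alternative="two-sided"):
    """Return True if z falls in the rejection region at level alpha.

    Assumes the test statistic is standard normal when H0 is true.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # H1: theta > theta_0 (upper tail)
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":         # H1: theta < theta_0 (lower tail)
        return z < std_normal.inv_cdf(alpha)
    if alternative == "two-sided":    # H1: theta != theta_0 (both tails)
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# The same observed value can lead to different decisions.
z_observed = 1.8  # made-up value for illustration
print(in_rejection_region(z_observed, 0.05, "greater"))
print(in_rejection_region(z_observed, 0.05, "two-sided"))
```

With a one-sided alternative the whole of alpha sits in one tail, so the critical value is smaller in absolute size than in the two-sided case, where alpha is split between the two tails.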
Test statistic: The statistic used to provide evidence about the null hypothesis is called the
test statistic.
One-sample mean test: The one sample mean test is used when you have data from a single
sample of participants and you wish to know whether the mean of the population from which
the sample is drawn is the same as the hypothesized mean.
Example: Suppose you are using the data set gssft.sav and you want to test whether
college graduates work 40 hours a week on average.
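$H_0$: In the population, the average number of hours worked by college graduates is 40 hours per week.
$H_1$: In the population, the average number of hours worked by college graduates is not 40 hours per week.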
Syntax:-
T-TEST
/TESTVAL=40
/MISSING=ANALYSIS
/VARIABLES=hrs1
/CRITERIA=CI(.95) .
One-Sample Test (Test Value = 40)
                                                     Sig.         Mean         95% Confidence Interval of the Difference
                                   t        df       (2-tailed)   Difference   Lower     Upper
Number of hours worked last week   20.478   1496     .000         5.627        5.09      6.17
The probability of observing a sample t value greater than +20.478 or less than -20.478 is
given by the entry labeled Sig. (2-tailed), which is reported as .000, that is, less than 0.0005.
Since the observed significance level is smaller than 0.05, we reject the null hypothesis: it is
quite unlikely that college graduates work a 40-hour week on average.
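To make the arithmetic behind the output concrete, here is a minimal sketch of the one-sample t statistic, $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$ with $n - 1$ degrees of freedom. The function names and the sample values are hypothetical (the values are not taken from gssft.sav). The p-value helper uses a normal approximation, which is an assumption that is reasonable only for large degrees of freedom such as the 1496 in the output above; SPSS itself uses the t distribution.

```python
import math
from statistics import NormalDist, mean, stdev


def one_sample_t(data, mu0):
    """Return (t, df) for testing H0: population mean equals mu0."""
    n = len(data)
    standard_error = stdev(data) / math.sqrt(n)  # s / sqrt(n)
    t = (mean(data) - mu0) / standard_error
    return t, n - 1


def approx_two_sided_p(t):
    """Two-sided p-value from the standard normal distribution.

    Approximation only: appropriate when df is large, not for small samples.
    """
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Made-up weekly working hours for ten people (illustration only).
hours = [38, 45, 50, 42, 47, 40, 55, 44, 46, 43]
t, df = one_sample_t(hours, 40)
print("t =", round(t, 3), "df =", df)

# For the t value reported in the output (df = 1496) the normal
# approximation gives a p-value far below 0.05.
print(approx_two_sided_p(20.478) < 0.05)
```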
Tests with two samples: One-sample mean test was used to determine whether a single
sample of scores was likely to have been drawn from a hypothesized population. Now we
will extend the understanding of sampling distributions to ask whether two sets of scores are
random samples from the same or different populations. If they are random samples from the
same population, then any differences across conditions or groups can be attributed to
random sampling variability. However, if the two sets of scores are random samples from
different populations, then we can attribute any difference between means across conditions
to the independent variable or the treatment effect.
Repeated measures t-test: The repeated measures t-test, also referred to as the dependent-
samples or paired t-test, is used when you have data from only one group of participants. In
other words, an individual obtains two scores under two different conditions. Studies which
employ a pretest-posttest design are commonly analyzed using repeated measures t-test.
Assumption: The differences between the two scores of each participant should be normally
distributed. Provided the sample size is not small ($n \ge 30$), violations of this assumption
are of little concern.
The null hypothesis for a paired design is that there is no difference between the average
values for the two members of a pair in the population. In other words, the average
population difference is zero. The alternative hypothesis is that there is a difference in the
average values.
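Test statistic:
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{\bar{d}}{s_d/\sqrt{n}} \sim t_{n-1}$$
where $d_i = x_i - y_i$, $\bar{d}$ and $s_d$ are the mean and standard deviation of the $n$
differences, and $\mu_d = 0$ under the null hypothesis.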
Example: Suppose you are using the data set endorph.sav and you want to test whether
average β-endorphin levels changed during a run. (β-endorphins are morphine-like
substances manufactured in the body.)
Syntax:
T-TEST
PAIRS= after WITH before (PAIRED)
/CRITERIA=CI(.95)
/MISSING=ANALYSIS.
Paired Samples Test
                          Paired Differences
                                                      Std. Error   95% Confidence Interval of the Difference
                          Mean      Std. Deviation    Mean         Lower      Upper      t       df   Sig. (2-tailed)
Pair 1  AFTER - BEFORE    18.7364   8.3297            2.5115       13.1404    24.3324    7.460   10   .000
From the table, the observed significance level is small (less than 0.05). So you can reject the
null hypothesis that β-endorphin levels do not change during a run.
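The paired test statistic can be sketched in a few lines. The function names and the raw scores below are hypothetical (the raw scores are not the endorph.sav data); the summary figures passed to the second function are the mean, standard deviation and pair count (df + 1 = 11) read from the Paired Samples Test table.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for H0: mean of the differences d_i = x_i - y_i is zero."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    return mean(d) / (stdev(d) / math.sqrt(n)), n - 1


def paired_t_from_summary(d_bar, s_d, n):
    """Same statistic computed from the summary values of the differences."""
    return d_bar / (s_d / math.sqrt(n))


# Made-up scores for five participants (illustration only).
after = [12, 15, 11, 18, 14]
before = [10, 11, 10, 13, 12]
print(paired_t(after, before))

# Summary values from the output table: mean 18.7364, SD 8.3297, 11 pairs.
print(round(paired_t_from_summary(18.7364, 8.3297, 11), 3))
```

The second call is a quick way to check that the t value in the table is simply the mean difference divided by its standard error.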
Problem: Use the electric.sav data file to answer the following questions:
1. Use the Select Cases facility to select only men with coronary heart disease (variable
chd equals 1). Test the hypothesis that they come from a population in which the
average serum cholesterol is 205 mg/dl (variable chol58). State the null and
alternative hypotheses.
2. Select only men without coronary heart disease (variable chd equals 0). Is it plausible
that they are a sample from a population in which the average weight is 175 pounds
(variable wt58)?
3. For patients who developed acute renal failure (variable type equals 1), determine if
there was a statistically significant change between average BUN and creatinine at
admission (variables admbun and admcreat) and average BUN and creatinine at
discharge (variables finbun and fincreat). For all patients, see if there was a
statistically significant change in creatinine between the time of admission (variable
admcreat) and the time of surgery (variable precreat).
Examples:
(1) We might compare the smoking habits of a sample of women with those of a sample of men
to see if there is any difference between the sexes with respect to smoking.
(2) A manager of an industrial firm may compare workers' productivity between two groups
or two different samples.
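Case 4: When the population variances are unknown (assumed unequal) and the samples are
small
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim t_v$$
where
$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$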
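The statistic and the degrees of freedom for the unequal-variance case can be computed as in the sketch below. The function name welch_t and the two small samples are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t, v): the unequal-variance t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    a = variance(sample1) / n1  # s1^2 / n1
    b = variance(sample2) / n2  # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / math.sqrt(a + b)
    v = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, v


# Made-up samples (illustration only).
group1 = [2, 4, 4, 6]
group2 = [1, 2, 3]
t, v = welch_t(group1, group2)
print("t =", round(t, 3), "v =", round(v, 3))
```

Note that v is generally not a whole number; statistical software reports it as computed, while printed t tables are usually read at the nearest lower integer.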
Example: Suppose you use the data file gssnet.sav and you want to check whether
internet users and non-users watch television for the same number of hours per day.
Solution:
The null and alternative hypotheses for the problem at hand are:
$H_0$: The average hours of television viewing are the same in the population for internet
users and non-users.
$H_1$: The average hours of television viewing are not the same in the population for internet
users and non-users.
Syntax:
T-TEST
GROUPS=usenet (0 1)
/MISSING =ANALYSIS
/VARIABLES=tvhours
/CRITERIA=CI(.95).
Since Levene's test has a significance level less than 0.05, you can conclude that the population
variances are unequal. Therefore, use the t-value, degrees of freedom and two-tailed
significance from the unequal-variance estimates to determine whether a difference
exists between internet users and non-users. The observed two-tailed significance level is less
than 5%, so you may reject the null hypothesis and conclude that internet users and non-users
do not watch the same average number of hours of television.
Let us designate the two attributes as A and B, where attribute A is assumed to have r
categories and attribute B is assumed to have c categories. Furthermore, assume the total
number of observations in the problem is N. These observations can be arranged in a table
in which $O_{ij}$ represents the observed frequency in the ith row and jth column. Such a
table is called a contingency table and is shown below:
                          B
          B1     B2     ...    Bj     ...    Bc     Total
     A1   O11    O12    ...    O1j    ...    O1c    R1
     A2   O21    O22    ...    O2j    ...    O2c    R2
A    ...
     Ai   Oi1    Oi2    ...    Oij    ...    Oic    Ri
     ...
     Ar   Or1    Or2    ...    Orj    ...    Orc    Rr
  Total   C1     C2     ...    Cj     ...    Cc     N
In the table, $R_i$ is the total of the ith row and $C_j$ is the total of the jth column. The
frequencies in the cells are termed cell frequencies, and the totals of the frequencies in each
of the rows $R_i$ and columns $C_j$ are termed marginal frequencies.
In the analysis of a contingency table, the objective is to determine whether or not one method
of classification is 'contingent' or 'dependent' on the other method of classification. If not,
then the two methods of classification are said to be 'independent'. The null hypothesis to be
tested in the contingency table is that A and B are independent, that is, there is no association
or relationship between the two variables.
If two criteria of classification are independent, a joint probability is equal to the product of
the two corresponding marginal probabilities. Thus, the expected cell frequencies are given
by the formula:
$$E_{ij} = \frac{R_i}{N} \cdot \frac{C_j}{N} \cdot N = \frac{R_i C_j}{N}$$
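To conduct the test, we use the statistic
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{O_{ij}^2}{E_{ij}} - N \sim \chi^2_{(r-1)(c-1)}$$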
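The expected frequencies and the chi-square statistic can be computed directly from a table of observed counts, as in this minimal sketch. The function names and the 2 x 2 counts are hypothetical (the counts are not from manners.sav), and the sketch assumes every row and column total is positive so that no expected frequency is zero.

```python
def chi_square_independence(observed):
    """Return (chi2, df, expected) for an r x c table of observed counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in observed]          # R_i
    col_totals = [sum(col) for col in zip(*observed)]    # C_j
    n = sum(row_totals)                                  # N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


def chi_square_shortcut(observed, expected):
    """Equivalent form: sum of O_ij^2 / E_ij minus N."""
    n = sum(sum(row) for row in observed)
    total = sum(
        o ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return total - n


# Made-up 2 x 2 table of counts (illustration only).
table = [[10, 20],
         [20, 10]]
chi2, df, expected = chi_square_independence(table)
print(round(chi2, 4), df, expected)
print(round(chi_square_shortcut(table, expected), 4))
```

Both functions return the same value, which mirrors the two equivalent forms of the statistic given above.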
Example: Suppose you use the data file manners.sav and you want to check the association
between sex and the habit of opening doors for strangers. The null hypothesis is that the
percentage opening doors for strangers is the same for men and women. Another way of stating
the null hypothesis is that gender and response are independent.
Syntax:
CROSSTABS
/TABLES=sex BY opendoor
/FORMAT=AVALUE TABLES
/STATISTICS=CHISQ
/CELLS=COUNT EXPECTED.
Chi-Square Tests
Based on the observed significance level for the chi-square statistic, you cannot reject the null
hypothesis that men and women are equally likely to report that they open doors for
strangers. The observed significance level is 0.069, which is greater than the customary 0.05.
Problem
Use the gssnet.sav data file to answer the following questions
1. Examine the relationship between education (degree) and perception of life (life). Can
you reject the null hypothesis that education and perception of life are independent?
2. Can you reject the null hypothesis that amount of internet use (netcat) and perception
of life (life) are independent? Explain.
3. Test whether belief in life after death (postlife) and highest degree earned (degree) are
independent.
4. Perform the appropriate analyses to test whether the average number of hours of daily
television viewing (tvhours) is the same for men and women.
5. Consider people who use the Internet and those who don't (usenet). What is the
average income for people who use the Internet and those who don't? (Use rincdol.)
Do you have enough evidence to reject the null hypothesis that the average income is
the same for the two groups? Which group makes more money?