
BUP-05-Hypothesis Testing

Dr. Md. Abdus Salam Akanda Website: http://du.ac.bd
Professor of Statistics, DU E-mail: [email protected]

Test of Hypothesis-I

Hypothesis: A statistical hypothesis is an assertion or statement about a population or,
equivalently, about the probability distribution characterizing a population, which we want to
verify on the basis of information contained in a sample.

Examples:
(1) A rickshaw puller hypothesizes that more than fifty percent of our politicians are
corrupt.
(2) A popular band singer claims that copyright rules will never be effective in our
country.

Test of a statistical hypothesis: A test of a statistical hypothesis is a two-action decision
problem after the experimental sample values have been obtained, the two actions being the
acceptance or rejection of the hypothesis under consideration.

Null hypothesis: The null hypothesis is the hypothesis set up for possible rejection. It is a
statement that no difference exists between the parameter and the value being compared
to it. The null hypothesis is always denoted by $H_0$.

Example: There is no difference in the population between the rates of prevalence of
malnutrition among male and female children.

Alternative hypothesis: The alternative hypothesis is the logical opposite of the null
hypothesis. The alternative hypothesis is usually denoted by $H_1$ or $H_a$.

Example: There is a difference in the population between the rates of
prevalence of malnutrition among male and female children.

$H_0$: There is no association between level of education and knowledge of child nutrition
among women.
$H_1$: The level of education and knowledge of child nutrition among women are associated.

Simple hypothesis: If a hypothesis completely specifies the distribution of a population, it is
called a simple hypothesis. Suppose that a coin is tossed 30 times ($n = 30$) to determine
whether the coin is an ideal one. Then the hypothesis $H_0: p = 0.50$ is a simple hypothesis,
since it completely specifies the population distribution.

Composite hypothesis: If a hypothesis does not completely specify the population
distribution, it is called a composite hypothesis. In the above coin-tossing example, if we do
not specify that $n = 30$, then the hypothesis $H_0: p = 0.50$ would be a composite hypothesis.

One-tailed test: A test of any statistical hypothesis where the alternative is one-sided, such as
$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_1: \mu > \mu_0$$
or perhaps
$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_1: \mu < \mu_0,$$
is called a one-tailed test.

Two-tailed test: A test of any statistical hypothesis where the alternative is two-sided, such as
$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_1: \mu \ne \mu_0,$$
is called a two-tailed test.

Type I error: The error of rejecting $H_0$ (accepting $H_1$) when $H_0$ is true is called a type I
error. The probability of a type I error is denoted by $\alpha$ and is called the level of significance.

Type II error: The error of accepting $H_0$ when $H_0$ is false ($H_1$ is true) is called a type II
error. The probability of a type II error is denoted by $\beta$.


Power of the test: $1 - \beta$, that is, the probability of rejecting $H_0$ when $H_0$ is false ($H_1$ is
true), is called the power of the test of the hypothesis $H_0$ against the alternative hypothesis $H_1$. The
value of the power function at a parameter point is called the power of the test at that point.
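
As a rough illustration of how the level of significance, the type II error and the power relate, the sketch below computes the power of an upper-tailed z test at one alternative value of the mean. It assumes a normal population with known standard deviation; the function name `z_test_power` and all numbers are hypothetical and do not come from these notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the upper-tailed z test of H0: mu = mu0 when the true mean is mu1.

    Assumes a normal population with known standard deviation sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # reject H0 when Z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # mean of Z when the true mean is mu1
    beta = std_normal.cdf(z_crit - shift)       # P(accept H0 | true mean mu1): type II error
    return 1 - beta                             # power = 1 - beta


# Hypothetical numbers: mu0 = 40, true mean 42, sigma = 10, n = 100
power = z_test_power(mu0=40, mu1=42, sigma=10, n=100)
# At mu1 = mu0 the rejection probability equals the level of significance
size = z_test_power(mu0=40, mu1=40, sigma=10, n=100)
```

Evaluating the function over a range of `mu1` values traces out the power function mentioned above.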

p-value: The p-value or observed significance level of a test is the smallest value of $\alpha$ for
which $H_0$ can be rejected.

The decision rules, which most researchers follow in stating their results, are as follows:
• If the p-value is less than 0.01, the results are regarded as highly significant.
• If the p-value is between 0.01 and 0.05, the results are regarded as statistically
significant.
• If the p-value is between 0.05 and 0.10, the results are regarded as only tending toward
statistical significance.
• If the p-value is greater than 0.10, the results are regarded as not statistically significant.
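
The rules above can be sketched as a small lookup function. The name `describe_p_value` is hypothetical, and because the notes do not say which label applies exactly at 0.01, 0.05 or 0.10, the boundary handling below is an assumption.

```python
def describe_p_value(p):
    """Map a p-value to the verbal labels used in the decision rules."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "highly significant"
    if p < 0.05:            # assumption: exactly 0.01 falls in this band
        return "statistically significant"
    if p <= 0.10:           # assumption: exactly 0.05 and 0.10 fall in this band
        return "tending toward statistical significance"
    return "not statistically significant"


label = describe_p_value(0.03)
```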


Critical region or rejection region: A region of rejection is a set of possible values of the
sample statistic, which provides evidence to contradict the null hypothesis and leads to a
decision to reject the null hypothesis.

Acceptance region: A region of acceptance is a set of possible values of the sample statistic,
which provides evidence to support the null hypothesis and leads to a decision to accept it.


Test statistic: The statistic used to provide evidence about the null hypothesis is called the
test statistic.

One-sample mean test: The one sample mean test is used when you have data from a single
sample of participants and you wish to know whether the mean of the population from which
the sample is drawn is the same as the hypothesized mean.

Assumptions: The assumptions underlying all types of mean test are:


1. Scale of measurement: The data should be at the interval or ratio level of
measurement.
2. Random sampling: The scores should be randomly sampled from the population of
interest.
3. Normality: The scores should be normally distributed in the population.
Clearly, assumptions 1 and 2 are a matter of research design and not statistical analysis.
Assumption 3 can be tested in a number of different ways.

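Three cases in testing a single mean:

Case 1: when $\sigma$ is known
$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \sim N(0, 1).$$
Case 2: when $\sigma$ is unknown and the sample size is large ($n \ge 30$)
$$Z = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \sim N(0, 1).$$
Case 3: when $\sigma$ is unknown and the sample size is small ($n < 30$)
$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \sim t_{n-1}.$$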
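
The three cases can be sketched in a few lines of Python. This is only an illustration: the function name `one_sample_statistic` and the data are hypothetical, and the cut-off of 30 follows the rule of thumb stated in the cases above.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_statistic(sample, mu0, sigma=None):
    """Return (name, value, df) for the test of H0: mu = mu0.

    name is "Z" for Cases 1 and 2 and "t" for Case 3; df is None for Z.
    """
    n = len(sample)
    xbar = mean(sample)
    if sigma is not None:                       # Case 1: sigma known
        return "Z", (xbar - mu0) / (sigma / sqrt(n)), None
    s = stdev(sample)                           # sample SD with divisor n - 1
    value = (xbar - mu0) / (s / sqrt(n))
    if n >= 30:                                 # Case 2: sigma unknown, large sample
        return "Z", value, None
    return "t", value, n - 1                    # Case 3: sigma unknown, small sample


# Hypothetical weekly working hours for 8 people
hours = [38, 45, 50, 42, 40, 47, 44, 46]
name, value, df = one_sample_statistic(hours, mu0=40)
```

With only 8 observations and no known $\sigma$, the call falls under Case 3 and returns a t statistic with 7 degrees of freedom.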

Example: Suppose you are using the data set gssft.sav and you want to test whether
college graduates work 40 hours a week on average.

The null and alternative hypotheses are:

$H_0$: In the population, the average working time of college graduates is 40 hours per week.
$H_1$: In the population, the average working time of college graduates is not 40 hours per week.

To conduct a one sample t-test


1. Select the Analyze menu.
2. Click on Compare Means and then One-Sample T Test... to open the One-Sample
T Test dialogue box.
3. Select the variable you require (hrs1) and click on the button to move the variable
into the Test Variable(s): box.
4. In the Test Value: box type the mean score (i.e., 40).
5. Click on Ok.

Syntax:-
T-TEST
/TESTVAL=40
/MISSING=ANALYSIS
/VARIABLES=hrs1
/CRITERIA=CI(.95) .

One-Sample Test (Test Value = 40)

                                                                             95% Confidence Interval
                                                                                of the Difference
                                    t       df    Sig. (2-tailed)  Mean Difference   Lower   Upper
Number of hours worked last week  20.478   1496       .000             5.627         5.09    6.17

If the null hypothesis is true, the probability of observing a sample t value greater than
+20.478 or less than -20.478 is given by the entry labeled Sig. (2-tailed). Since the observed
significance level is smaller than 0.05, we can reject the null hypothesis; it is quite unlikely that
college graduates work a 40-hour week on average.
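
As a consistency check on the output above, the standard error and the confidence limits can be recovered from the reported t value and mean difference. The sketch uses the normal quantile in place of the t quantile, which is an approximation that is reasonable here only because the degrees of freedom are large.

```python
from statistics import NormalDist

t_value = 20.478        # t from the One-Sample Test table
mean_diff = 5.627       # sample mean minus the test value 40
df = 1496               # degrees of freedom reported in the table

std_error = mean_diff / t_value             # because t = mean_diff / std_error
# Approximation: with df this large, t(df) is very close to N(0, 1)
z = NormalDist().inv_cdf(0.975)
lower = mean_diff - z * std_error
upper = mean_diff + z * std_error
print(round(std_error, 3), round(lower, 2), round(upper, 2))
```

The limits computed this way agree, to two decimals, with the 95% confidence interval in the table.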

Tests with two samples: One-sample mean test was used to determine whether a single
sample of scores was likely to have been drawn from a hypothesized population. Now we
will extend the understanding of sampling distributions to ask whether two sets of scores are
random samples from the same or different populations. If they are random samples from the
same population, then any differences across conditions or groups can be attributed to
random sampling variability. However, if the two sets of scores are random samples from
different populations, then we can attribute any difference between means across conditions
to the independent variable or the treatment effect.

Repeated measures t-test: The repeated measures t-test, also referred to as the dependent-
samples or paired t-test, is used when you have data from only one group of participants. In
other words, each individual obtains two scores under two different conditions. Studies which
employ a pretest-posttest design are commonly analyzed using a repeated measures t-test.

Assumptions: The repeated measures t-test has one additional assumption:

Normality of population difference scores: the differences between the scores for each
participant should be normally distributed. Provided the sample size is not small (30 or more),
violations of this assumption are of little concern.

The null hypothesis for a paired design is that there is no difference between the average
values for the two members of a pair in the population. In other words, the average
population difference is zero. The alternative hypothesis is that there is a difference in the
average values.

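Test statistic:
$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{\bar{d}}{s_d / \sqrt{n}} \sim t_{n-1},$$
where $d_i = x_i - y_i$, and the second equality holds because $\mu_d = 0$ under $H_0$.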

Example: Suppose you are using the data set endorph.sav and you want to test whether
average β-endorphin levels changed during a run. (β-endorphins are morphine-like
substances manufactured in the body.)

To conduct a repeated measures t-test


1. Select the Analyze menu.
2. Click on Compare Means and then Paired-Samples T Test... to open the Paired-
Sample T Test dialogue box.
3. Select the variables you require (i.e., before and after) and press button to move the
variables into the Paired Variables: box.
4. Click on Ok.

Syntax:
T-TEST
PAIRS= after WITH before (PAIRED)
/CRITERIA=CI(.95)
/MISSING=ANALYSIS.
Paired Samples Test

                                      Paired Differences
                                                                 95% Confidence Interval
                                                                    of the Difference
                          Mean     Std. Deviation  Std. Error Mean   Lower     Upper      t     df  Sig. (2-tailed)
Pair 1  AFTER - BEFORE   18.7364      8.3297           2.5115       13.1404   24.3324   7.460   10       .000

From the table, the observed significance level is small (less than 0.05). So you can reject the
null hypothesis that the β-endorphin levels do not change during a run.
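
The entries of the Paired Samples Test table can be reproduced from the mean and standard deviation of the differences using the paired test statistic. In the sketch below the sample size is inferred from the degrees of freedom, and the critical value 2.228 is the usual tabulated $t_{0.975}$ for 10 degrees of freedom; both are stated as assumptions rather than taken from the data file.

```python
from math import sqrt

n = 11                   # assumption: df = 10 in the table, so n = df + 1
mean_d = 18.7364         # mean of AFTER - BEFORE
sd_d = 8.3297            # standard deviation of the differences
t_crit = 2.228           # t(0.975; 10) from a standard t table

std_error = sd_d / sqrt(n)                  # s_d / sqrt(n)
t_value = mean_d / std_error                # mu_d = 0 under H0
lower = mean_d - t_crit * std_error
upper = mean_d + t_crit * std_error
print(round(std_error, 4), round(t_value, 2), round(lower, 2), round(upper, 2))
```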

Problem: Use the electric.sav data file to answer the following questions:

1. Use the Select Cases facility to select only men with coronary heart disease (variable
chd equals 1). Test the hypothesis that they come from a population in which the
average serum cholesterol is 205 mg/dl (variable chol58). State the null and
alternative hypotheses.


2. Select only men without coronary heart disease (variable chd equals 0). Is it plausible
that they are a sample from a population in which the average weight is 175 pounds
(variable wt58)?

Use the renal.sav data file to answer the following questions:

3. For patients who developed acute renal failure (variable type equals 1), determine if
there was a statistically significant change in average BUN and creatinine between
admission (variables admbun and admcreat) and
discharge (variables finbun and fincreat). For all patients, see if there was a
statistically significant change in creatinine between the time of admission (variable
admcreat) and the time of surgery (variable precreat).

Independent Sample T Test: An independent-groups t-test is appropriate when different
participants have performed in each of the different conditions; in other words, when the
participants in one condition are different from the participants in the other condition. This is
commonly referred to as a between-subjects design.

Examples:
(1) We might compare the smoking habits of a sample of women with those of a sample of men
to see if there is any difference between the sexes with respect to smoking habits.
(2) A manager of an industrial firm may study the workers’ productivity in two groups
or different samples.

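The null and alternative hypotheses are:

$$H_0: \mu_1 = \mu_2 \quad \text{vs.} \quad H_1: \mu_1 \ne \mu_2$$
Case 1: when the population variances are known
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$
Case 2: when the population variances are unknown but the samples are large
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim N(0, 1)$$
Case 3: when the population variances are unknown (assumed equal) and the samples are small
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$$
where
$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}.$$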


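Case 4: when the population variances are unknown (assumed unequal) and the samples are
small
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim t_v$$
where
$$v = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}}.$$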
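
The small-sample cases can be sketched as two short functions, one using the pooled variance (equal variances assumed) and one using separate variances with the approximate degrees of freedom $v$ (equal variances not assumed). The function names and the data are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Equal-variance t statistic and its degrees of freedom n1 + n2 - 2."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def unequal_variance_t(x, y):
    """Unequal-variance t statistic with the approximate degrees of freedom v."""
    n1, n2 = len(x), len(y)
    a, b = variance(x) / n1, variance(y) / n2       # s1^2/n1 and s2^2/n2
    t = (mean(x) - mean(y)) / sqrt(a + b)
    v = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, v


# Hypothetical daily TV hours for two small groups
group1 = [2, 3, 4, 5, 6]
group2 = [1, 1, 2, 2]
t_pooled, df_pooled = pooled_t(group1, group2)
t_unequal, v = unequal_variance_t(group1, group2)
```

Note that $v$ is generally not a whole number, which is why statistical packages can report fractional degrees of freedom for the unequal-variance test.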

Example: Suppose you use the data file gssnet.sav and your interest is to check whether
internet users and non-users watch television for the same number of hours per day.
Solution:
The null and alternative hypotheses for the problem at hand are:
$H_0$: The average hours of television viewing are the same in the population for internet
users and non-users.
$H_1$: The average hours of television viewing are not the same in the population for internet
users and non-users.

To conduct an independent groups t-test


1. Select the Analyze menu.
2. Click on Compare Means and then Independent Samples T Test... to open the
Independent Samples T Test dialogue box.
3. Select the Test Variable(s) (i.e., tvhours) and then click on the button to move the
variable(s) into the Test Variable(s): box.
4. Select the grouping variable (i.e., usenet) and click on the button to move the
variable into the Grouping Variable: box.
5. Click on the Define Groups... command push button to open the Define Groups
Sub-dialogue: box.
6. In the Group1: box, type the lowest value for the variable (i.e., 0), then tab. Enter the
second value for the variable (i.e., 1) in the Group2: box.
7. Click on Continue and then Ok.

Syntax:
T-TEST
GROUPS=usenet (0 1)
/MISSING =ANALYSIS
/VARIABLES=tvhours
/CRITERIA=CI(.95).


Independent Samples Test: Hours per day watching TV

Levene's Test for Equality of Variances: F = 20.261, Sig. = .000

t-test for Equality of Means
                                                                                      95% Confidence Interval
                                                                                         of the Difference
                                 t       df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower  Upper
Equal variances assumed        6.455   884            .000             1.09                .17             .76    1.42
Equal variances not assumed    6.569   870.228        .000             1.09                .17             .77    1.42

Given that Levene’s test has a probability less than 0.05, you can conclude that the population
variances are unequal. Therefore, use the t-value, degrees of freedom and
two-tailed significance from the "Equal variances not assumed" row to determine whether a difference
exists between internet users and non-users. The observed two-tailed significance level is less
than 5%, so you may reject the null hypothesis and conclude that internet users and non-users
do not watch the same average number of hours of television.

Testing Independence of Variables: Often, we ask questions concerning the
interrelationships between variables. For example, we ask:
1. Is there a difference in the crime rate between children of poor families and those of
elite or rich families?
2. Is there a difference in the prevalence of malnutrition between rural children and
urban children?

All the questions raised above have some common characteristics:

• They deal with two or more nominal or ordinal categories
• The categories are non-overlapping
• The data consist of frequency counts
• The data can be cross-classified to fall into several categories of the row and column
variables

Let us designate the two attributes as A and B, where attribute A is assumed to have r
categories and attribute B is assumed to have c categories. Furthermore, assume the total
number of observations in the problem is N. These observations can be arranged in a table
in which $O_{ij}$ represents the number of observations in the $i$th row and $j$th column. Such a
table is called a contingency table and is shown below:

                               B
          B_1    B_2    ...    B_j    ...    B_c    Total
     A_1  O_11   O_12   ...    O_1j   ...    O_1c   R_1
     A_2  O_21   O_22   ...    O_2j   ...    O_2c   R_2
     ...
A    A_i  O_i1   O_i2   ...    O_ij   ...    O_ic   R_i
     ...
     A_r  O_r1   O_r2   ...    O_rj   ...    O_rc   R_r
   Total  C_1    C_2    ...    C_j    ...    C_c    N

In the table, $R_i$ is the total of the $i$th row and $C_j$ is the total of the $j$th column. The frequencies in
the cells are termed cell frequencies, and the totals of the frequencies in each of the rows
($R_i$) and columns ($C_j$) are termed marginal frequencies.

In the analysis of a contingency table, the objective is to determine whether or not one method
of classification is ‘contingent’ or ‘dependent’ on the other method of classification. If not,
then the two methods of classification are said to be ‘independent’. The null hypothesis to be
tested in the contingency table is that A and B are independent, that is, there is no association
or relationship between the two variables.

If two criteria of classification are independent, a joint probability is equal to the product of
the two corresponding marginal probabilities. Thus, the expected cell frequencies are given
by the formula:
$$E_{ij} = \frac{R_i}{N} \cdot \frac{C_j}{N} \cdot N = \frac{R_i \times C_j}{N}$$
To conduct the test, we use the $\chi^2$ statistic
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{O_{ij}^2}{E_{ij}} - N \sim \chi^2_{(r-1)(c-1)}$$
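
The computation of the expected frequencies and the test statistic can be sketched as follows. The function names and the 2 x 2 table of counts are hypothetical. The helper `p_value_df1` works only for 1 degree of freedom, where the chi-square variable is the square of a standard normal variable; other degrees of freedom would need a chi-square table or a statistical package.

```python
from math import sqrt
from statistics import NormalDist


def chi_square_independence(observed):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]             # R_i
    col_totals = [sum(col) for col in zip(*observed)]       # C_j
    n = sum(row_totals)                                     # N
    # E_ij = R_i * C_j / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


def p_value_df1(chi2):
    """Upper-tail p-value, valid for 1 degree of freedom only."""
    return 2 * (1 - NormalDist().cdf(sqrt(chi2)))


# Hypothetical 2 x 2 table of counts
table = [[10, 20], [20, 10]]
chi2, df, expected = chi_square_independence(table)
p = p_value_df1(chi2)
```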

Example: Suppose you use the data file manners.sav and you want to check the association
between sex and the attitude toward opening doors. The null hypothesis is that the percentage opening
doors for strangers is the same for men and women. Another way of stating the null
hypothesis is that gender and response are independent.

To conduct the Chi-square test


1. Select the Analyze menu.
2. Click on Descriptive Statistics and then Crosstabs... to open the Crosstabs dialogue
box.
3. Select the Row Variable(s) (i.e., sex) and then click on the button to move the
variable(s) into the Row(s): box. Also select the Column Variable(s) (i.e., opendoor)
and then click on the button to move the variable(s) into the Column(s): box.
4. Click on the Statistics... command push button to open the Crosstabs: Statistics
Sub-dialogue: box and then select Chi-square.
5. Click on Continue and then Ok.

Syntax:
CROSSTABS
/TABLES=sex BY opendoor
/FORMAT=AVALUE TABLES
/STATISTICS=CHISQ
/CELLS=COUNT EXPECTED.
Chi-Square Tests

                                 Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                 (2-sided)    (2-sided)    (1-sided)
Pearson Chi-Square               3.318(b)   1      .069
Continuity Correction(a)         2.870      1      .090
Likelihood Ratio                 3.354      1      .067
Fisher's Exact Test                                              .075         .045
Linear-by-Linear Association     3.314      1      .069
N of Valid Cases                 1010
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 32.15.

Based on the observed significance level for the chi-square statistic, you cannot reject the null
hypothesis that men and women are equally likely to report that they open doors for
strangers. The observed significance level is 0.069, which is greater than the customary 0.05.

Problem
Use the gssnet.sav data file to answer the following questions
1. Examine the relationship between education (degree) and perception of life (life). Can
you reject the null hypothesis that education and perception of life are independent?
2. Can you reject the null hypothesis that amount of internet use (netcat) and perception
of life (life) are independent? Explain.
3. Test whether belief in life after death (postlife) and highest degree earned (degree) are
independent.
4. Perform the appropriate analyses to test whether the average number of hours of daily
television viewing (tvhours) is the same for men and women.
5. Consider people who use the Internet and those who don’t (usenet). What is the
average income for people who use the Internet and those who don’t? (Use rincdol.)
Do you have enough evidence to reject the null hypothesis that the average income is
the same for the two groups? Which group makes more money?
