Module 8 - NonParametric Tests
Parametric tests tend to be more accurate and are more likely to detect a true significant
effect, so they should be used whenever their assumptions are met. Non-parametric tests are
therefore used only when necessary.
A. One-sample Sign Test
The one-sample sign test is a non-parametric test that was introduced by Dr.
Arbuthnot, a Scottish physician, in 1710. It is used to test the null hypothesis that
the median of a distribution is equal to some hypothesized value k. It is called a Sign
Test because the data are recorded as plus (positive) and minus (negative) signs
based on their direction relative to k rather than their numerical magnitude.
Null and Alternative Hypothesis
Ho: median = k
Ha: median < k (left tailed test)
median > k (right tailed test)
median ≠ k (two tailed test)
Test Statistic
If n ≤ 25, use y, where y is the smaller of the number of positive signs and the number
of negative signs. Under Ho, y has a binomial distribution with p = 0.5.
If n > 25, use the normal approximation to this binomial distribution.
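The large-sample statistic is not shown at this point. A commonly used form, given here as an assumption rather than as the formula originally intended, is the normal approximation to a binomial count with p = 0.5, including a continuity correction of 0.5:

```latex
z = \frac{(y + 0.5) - \dfrac{n}{2}}{\dfrac{\sqrt{n}}{2}}
```

Here y is the smaller count of signs and n is the number of observations that differ from k. The resulting z would be compared with the standard normal critical value for the chosen α.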
Rejection Region
Reject Ho if the test statistic is less than or equal to the critical value (y ≤ yα).
Equivalently, reject Ho if p-value < α.
Example
A bank manager indicates that the median number of customers per day is 64. Another
bank employee claims that it is more than 64. The employee then recorded the number of
customers per day for 10 days. The data are shown below.
Day                   1   2   3   4   5   6   7   8   9  10
Number of customers  60  66  65  70  68  72  46  76  77  75
Solution
Ho: Median = 64
Ha: Median > 64, α = 0.05
Test Statistic
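Assign (+) to values greater than the hypothesized median and (-) to values below it.
Day                   1   2   3   4   5   6   7   8   9  10
Number of customers  60  66  65  70  68  72  46  76  77  75
Sign                  -   +   +   +   +   +   -   +   +   +
There are 8 plus signs and 2 minus signs, so y = 2 with n = 10.
p-value = P(Y ≤ 2) = (1 + 10 + 45)/1024 = 56/1024 ≈ 0.055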
Rejection region
Reject Ho if p-value < α.
Since 0.055 > 0.05 we do not reject the null hypothesis.
Conclusion
At 5% level of significance, there is insufficient evidence to say that the median
number of customers per day is more than 64.
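The p-value of 0.055 in this example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `sign_test_p_value` is illustrative, and taking the smaller count of signs as the lower tail assumes the alternative points in the direction of the larger count, as it does here.

```python
from math import comb

def sign_test_p_value(data, k):
    """Return (y, n, p) where p = P(Y <= y) for Y ~ Binomial(n, 0.5)."""
    plus = sum(1 for x in data if x > k)   # values above the hypothesized median
    minus = sum(1 for x in data if x < k)  # values below the hypothesized median
    n = plus + minus                       # values equal to k are dropped
    y = min(plus, minus)                   # smaller count of signs
    p_value = sum(comb(n, i) for i in range(y + 1)) / 2 ** n
    return y, n, p_value

customers = [60, 66, 65, 70, 68, 72, 46, 76, 77, 75]
y, n, p_value = sign_test_p_value(customers, 64)
print(y, n, round(p_value, 3))  # 2 10 0.055
```

Because 0.055 is greater than α = 0.05, the script agrees with the decision not to reject Ho.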
B. One-sample Wilcoxon Signed Rank Test
The Wilcoxon signed-rank test is the non-parametric equivalent of the one-sample t-test
and may be used when the dependent variable is not normally distributed. It tests a
hypothesis about the median of a population distribution. The sign test and the
Wilcoxon signed-rank test look similar, but the Wilcoxon signed-rank test is more
powerful than the sign test because it uses the sizes of the differences as well as their signs.
Assumptions
1. Measurement scale is at least interval
2. Observations are mutually independent
3. Data follow a symmetric distribution
4. Differences between the data and hypothesized median are continuous.
Test Statistic
If n ≤ 20, use W, where W is the smaller of the sum of the positive ranks and the sum
of the negative ranks.
If n > 20, use the normal approximation to the distribution of W.
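The large-sample statistic and its definitions are not shown at this point. The usual normal approximation for the signed-rank statistic, offered here as an assumption rather than as the formula originally intended, is:

```latex
z = \frac{W - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}
```

Here W is the smaller of the two rank sums and n is the number of non-zero differences. The numerator subtracts the mean of W under Ho and the denominator is its standard deviation.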
Rejection Region
Reject Ho if W < Wα
Example:
The grades of 10 students in their law subject are given below. Test at 5%
significance level if the median grade is greater than 67.
Student      1    2    3    4    5    6    7    8    9   10
Grade       71   79   40   70   82   72   60   76   69   75
Solution
Ho: median = 67
Ha: median > 67, α = 0.05
Get the difference from the hypothesized median, ignore the sign and rank the
differences. Separate the positive and negative ranks. If the values of the difference are
tied, get the mean rank.
Student       1    2    3    4    5    6    7    8    9   10
Grade        71   79   40   70   82   72   60   76   69   75
Difference    4   12  -27    3   15    5   -7    9    2    8
+Rank         3    8         2    9    4         7    1    6
-Rank                  10                    5
Get the sum of the positive ranks and the sum of the negative ranks (+ = 40, - = 15).
Since n ≤ 20, our test statistic is W = minimum of (40, 15) = 15.
Critical value Wα = 11 from the table of Critical Values for the WSRT.
Decision point
Reject Ho if W < Wα. Since 15 > 11, we do not reject Ho.
Conclusion
At 5% level of significance, there is insufficient evidence that the median grade in the
law subject is greater than 67.
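The ranking step in this example is easy to get wrong by hand, so a short check may help. This is a minimal standard-library sketch; `average_ranks` and `signed_rank_sums` are illustrative helper names, and observations equal to the hypothesized median are dropped, which is a common convention assumed here.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks

def signed_rank_sums(data, k):
    """Return the sums of ranks of positive and of negative differences from k."""
    diffs = [x - k for x in data if x != k]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

grades = [71, 79, 40, 70, 82, 72, 60, 76, 69, 75]
w_plus, w_minus = signed_rank_sums(grades, 67)
w = min(w_plus, w_minus)
print(w_plus, w_minus, w)  # 40.0 15.0 15.0
```

The two sums add up to n(n+1)/2 = 55, which is a quick check that no rank was missed.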
C. Mann Whitney U Test
The Mann Whitney U test is a non-parametric test that is used to compare two
independent samples and to test whether their medians are equal, that is, whether the
two samples could have come from the same population. The Mann Whitney U test is used
when the data are ordinal or when the assumptions for the t-test are not met.
Assumptions
1. The samples are drawn randomly from the population.
2. Samples are mutually independent. Hence, each observation belongs to exactly one
group; it cannot be in both groups.
3. Data is at least of ordinal scale.
Null and Alternative Hypothesis
Ho: Median1 = Median2
Median1 – Median2 = 0
Ha: Median1 ≠ Median2
Test Statistic
\[ U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum R_1 \]
\[ U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - \sum R_2 \]
U = min(U1, U2)
Where:
n1 = sample size of group 1
n2 = sample size of group 2
∑Ri = sum of the ranks in group i
Rejection Region
Reject Ho if U < Uα
Example
A researcher wants to test whether a new drug for relieving pain differs in effectiveness
from a placebo. A random sample of 23 people is used, 12 of whom take the placebo
and 11 of whom take the new drug. Results are as follows:
Placebo   New Drug
11        34
15        31
9         35
4         29
34        28
17        12
18        18
14        30
12        14
13        22
26        10
31
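Rank all 23 observations together in ascending order, giving tied values their mean
rank, then add the ranks in each group: ∑R1 = 117.5 for the group of 12 and
∑R2 = 158.5 for the group of 11.
Compute U1 and U2.
U1 = (12)(11) + [(12)(13)/2] - 117.5
= 132 + 78 - 117.5
= 92.5
U2 = (12)(11) + [(11)(12)/2] - 158.5
= 132 + 66 - 158.5
= 39.5
U = min (92.5, 39.5)
U = 39.5
Uα = 33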
Decision Point:
Reject Ho if U < Uα. Since 39.5 > 33, we do not reject Ho.
Conclusion
At 5% level of significance, there is insufficient evidence to say that the median
response time differs between the new pain reliever and the placebo.
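The rank sums 117.5 and 158.5 used in the solution can be reproduced with a short script. This is a minimal standard-library sketch with illustrative helper names. It assumes that the unpaired final value 31 in the data table belongs to the placebo column, because that assignment is the one that gives the rank sums used in the solution.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Return (sum R1, sum R2, U1, U2) using ranks of the pooled sample."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(group1 + group2)
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2

# Assumption: the trailing 31 is a placebo observation (12 placebo, 11 new drug).
placebo = [11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31]
new_drug = [34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10]

r1, r2, u1, u2 = mann_whitney_u(placebo, new_drug)
u = min(u1, u2)
print(r1, r2, u1, u2, u)  # 117.5 158.5 92.5 39.5 39.5
```

As a check, U1 + U2 always equals n1 × n2, which is 132 here.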
D. Kruskal Wallis Test
The Kruskal Wallis test is the non-parametric equivalent of the one-way
ANOVA. It was proposed by Kruskal and Wallis in 1952, and extends the Mann
Whitney U Test to more than 2 groups. It determines whether a continuous or ordinal
dependent variable differs significantly across the levels of a categorical independent
variable with two or more groups. Although it is not as powerful as the ANOVA, it can
be used for both continuous and ordinal-level measurements.
Assumptions
1. Samples are randomly drawn from the population
2. Observations are independent of each other.
3. Measurement scale is at least ordinal.
Null and Alternative Hypothesis
Ho: population medians are equal
Ha: population medians are not equal
Test Statistic
\[ H = \frac{12}{N(N+1)} \sum_{i} \frac{R_i^2}{n_i} - 3(N+1) \]
Where:
N = total number of observations
ni = number of observations in the ith group
Ri = sum of the ranks in the ith group
Critical Region
Reject Ho if H > Hα.
Example
A shoe company wants to know if three groups of workers have different salaries. The
data, in thousand pesos, are shown below. Test at the 5% level of significance whether
the salaries of the three groups differ.
Solution
Sort the data of the combined set in ascending order and assign ranks. Give the average
rank to tied values.
Salary  18  23  30  34  40  41  44  45  54  55  60  66  70  72  90
Rank     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
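Add up the ranks within each group to get the rank sums Ri, then substitute them into
the formula for H. Here H = 6.72.
The critical value comes from the chi-square distribution with k - 1 = 3 - 1 = 2 degrees
of freedom, where k is the number of groups.
Hα = 5.991
Decision Point:
Reject Ho if H > Hα. Since 6.72 > 5.991, we reject the null hypothesis.
Conclusion:
At 5% level of significance, there is sufficient evidence to say that the median
salaries of the three worker groups are not all equal.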
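The computation of H from the rank sums can be sketched as follows. The table assigning salaries to worker groups is not shown in this example, so the split of the 15 listed salaries into three groups of five below is hypothetical and used only for illustration; with this particular split the formula gives the H = 6.72 used in the decision step. The helper names are also illustrative.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (rank sums per group, H) using ranks of the pooled sample."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    rank_sums, start = [], 0
    for g in groups:
        rank_sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    spread = sum(r ** 2 / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * spread - 3 * (n_total + 1)
    return rank_sums, h

# Hypothetical grouping of the 15 salaries (thousand pesos); not from the source.
groups = [
    [23, 41, 54, 66, 90],
    [45, 55, 60, 70, 72],
    [18, 30, 34, 40, 44],
]
rank_sums, h = kruskal_wallis_h(groups)
print(rank_sums, round(h, 2))  # [44.0, 56.0, 20.0] 6.72
```

The rank sums add up to N(N+1)/2 = 120, which confirms that every observation was ranked once.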
E. Spearman Rank Correlation
The Spearman rank correlation coefficient, rs, is the non-parametric version of the
Pearson correlation coefficient r.
Assumptions
1. Data must be ordinal, interval or ratio.
2. The relationship between the two variables is monotonic, i.e. as one variable
increases, the other consistently increases or consistently decreases.
Test Statistic
\[ r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \]
Where
d = difference in the ranks of each pair
n = sample size (number of pairs)
Test Results
Spearman's coefficient returns a value between -1 and 1, inclusive, where:
+1 = perfect positive correlation between ranks
-1 = perfect negative correlation between ranks
0 = no correlation between ranks
Example
Science Reading
35 30
23 33
47 45
17 23
10 8
43 49
9 12
6 4
28 31
Solution.
Rank the scores on each subject in descending order. Use average ranks for tied values.
Find the (1) difference in the ranks and (2) squared differences in the ranks.
Compute rs, using ∑d² = 12 and n = 9.
\[ r_s = 1 - \frac{6(12)}{9(81 - 1)} = 1 - \frac{72}{720} = 0.9 \]
There is a strong positive correlation between the ranks of the Science and Reading
scores.
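The ranking and the sum of squared rank differences for this example can be verified with a short script. This is a minimal standard-library sketch; the helper names are illustrative, and the shortcut formula based on ∑d² is exact only when there are no tied ranks, which is the case for these scores.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks

def descending_ranks(values):
    """Rank 1 goes to the largest value, as in the worked solution."""
    return average_ranks([-v for v in values])

science = [35, 23, 47, 17, 10, 43, 9, 6, 28]
reading = [30, 33, 45, 23, 8, 49, 12, 4, 31]

rank_s = descending_ranks(science)
rank_r = descending_ranks(reading)
d2 = sum((a - b) ** 2 for a, b in zip(rank_s, rank_r))  # sum of squared differences
n = len(science)
rs = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(d2, round(rs, 2))  # 12.0 0.9
```

Ranking in ascending order instead would give the same result, since only the differences between paired ranks enter the formula.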