
ADVANCED STATISTICS

Module 8: Non-Parametric Tests

NON-PARAMETRIC TESTS


Non-parametric tests, also called distribution-free tests, are used when the underlying
distribution of the population is unknown. Unlike parametric tests, they make no
assumptions about the population's parameters such as the mean or standard deviation.

Parametric tests tend to be more accurate and are more likely to detect a true significant
effect, so they should be used whenever possible. Non-parametric tests are therefore
used only when necessary.

I. When to Use Non-Parametric Tests


1. When data is not normally distributed – if no graph is available, check the skewness
and kurtosis of the data
2. When the data is of nominal or ordinal scale
3. When one or more assumptions of a parametric test are not fulfilled
4. When the sample size is too small
5. When the data has outliers (or extreme values that cannot be removed)
6. When testing the median, instead of the mean, is preferred, usually for very skewed
distributions

II. Advantages of Using Non-Parametric Tests


1. Fewer assumptions (assumption of “normality” does not apply)
2. Small sample sizes are acceptable
3. Can be used for all data types (nominal, ordinal, interval or ratio)

III. Types of Non-Parametric Tests


Parametric Test                   Alternative Non-Parametric Test
One-sample Z-test                 One-sample Sign Test
One-sample t-test                 One-sample Wilcoxon Signed Rank Test
Independent samples t-test        Mann-Whitney U Test
One-way ANOVA                     Kruskal-Wallis Test
Pearson Correlation Coefficient   Spearman Rank Correlation

A. One-sample Sign Test

The one-sample sign test is a non-parametric test introduced by John
Arbuthnot, a Scottish physician, in 1710. It is used to test the null hypothesis that
the median of a distribution is equal to some hypothesized value k. It is called a sign
test because the data are recorded as plus (positive) and minus (negative) signs
based on their direction relative to k rather than their numerical magnitude.

Assumptions

1. Data is not normally distributed


2. Measurements are independent and randomly sampled from a population
with an unknown median
3. The variable of interest is continuous
4. The data is not symmetric (skewed to the left or to the right)

Null and Alternative Hypothesis

For a hypothesized value of median k:

Ho: median = k
Ha: median < k (left tailed test)
median > k (right tailed test)
median ≠ k (two tailed test)

Test Statistic

If n < 25, use y, where y is the smaller of the numbers of positive and negative signs;
y has a binomial distribution with p = 0.5.

If n > 25, use the normal approximation:

z = (y + 0.5 − n/2) / (√n / 2)

Rejection Region

Reject Ho if the test statistic is less than or equal to the critical value (y ≤ yα).
Reject Ho if p-value < α.

Example

A bank manager states that the median number of customers per day is 64. A bank
employee claims that it is more than 64. The employee recorded the number of
customers per day for 10 days; the data are shown below.

Test the claim at the 5% level of significance.

Day                   1   2   3   4   5   6   7   8   9   10
Number of customers   60  66  65  70  68  72  46  76  77  75

Solution

Ho: Median = 64
Ha: Median > 64, α = 0.05

Test Statistic

Assign (+) to values greater than the hypothesized median (64) and (-) to values below it.

Day                   1   2   3   4   5   6   7   8   9   10
Number of customers   60  66  65  70  68  72  46  76  77  75
Sign                  -   +   +   +   +   +   -   +   +   +

Count the positive and negative values ( + = 8, - = 2)


Since n < 25, the test statistic is y.
y = min(8, 2) = 2
From the binomial table with n = 10 (10 days) and p = 0.5,
P(y ≤ 2) = P(0) + P(1) + P(2) = 0.055

Rejection region
Reject Ho if p-value < α.
Since 0.055 > 0.05 we do not reject the null hypothesis.

Conclusion
At 5% level of significance, there is no sufficient evidence to say that the median
number of customers per day is more than 64.
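The sign-test calculation above can be reproduced in a few lines of Python; this is a sketch (not part of the original module) that uses only the standard library and computes the exact binomial probability instead of looking it up in a table.

```python
from math import comb

data = [60, 66, 65, 70, 68, 72, 46, 76, 77, 75]
k = 64  # hypothesized median

pos = sum(x > k for x in data)   # values above the median: 8
neg = sum(x < k for x in data)   # values below the median: 2
n = pos + neg                    # values tied with k are discarded
y = min(pos, neg)                # test statistic

# One-tailed p-value: P(Y <= y) for Y ~ Binomial(n, 0.5)
p_value = sum(comb(n, i) for i in range(y + 1)) / 2 ** n
print(y, round(p_value, 3))  # prints: 2 0.055
```

Since 0.055 > 0.05, the null hypothesis is not rejected, matching the table-based result.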

B. One-sample Wilcoxon Signed Rank Test

The Wilcoxon signed-rank test was first developed in 1945 by the American chemist
Frank Wilcoxon and popularized in 1956 by Sidney Siegel. It is a non-parametric
test that can be used for one-sample, matched-pair, and unrelated data.

The Wilcoxon signed-rank test is the non-parametric equivalent of the t-test and
may be used when the dependent variable is not normally distributed. It tests
hypotheses about the median of a population distribution. The sign test and the
Wilcoxon signed-rank test look similar, but the Wilcoxon signed-rank test is more
powerful than the sign test.

Assumptions
1. Measurement scale is at least interval
2. Observations are mutually independent
3. Data follow a symmetric distribution
4. Differences between the data and hypothesized median are continuous.

Null and Alternative Hypothesis (k – hypothesized median)


For the left-tailed test:
Ho: median ≥ k
Ha: median < k

For the right-tailed test:
Ho: median ≤ k
Ha: median > k

For the two tailed test:


Ho: median = k
Ha: median ≠k

Test Statistic
If n ≤ 20, use W, where W is the smaller of the sums of the positive and negative ranks.

If n > 20, use the normal approximation:

z = (W − n(n + 1)/4) / √[n(n + 1)(2n + 1)/24]

where n is the number of non-zero differences.

Rejection Region
Reject Ho if W ≤ Wα

Example:

The grades of 10 students in their law subject are given below. Test at the 5%
significance level whether the median grade is greater than 67.

Student   1   2   3   4   5   6   7   8   9   10
Grade     71  79  40  70  82  72  60  76  69  75
Solution
Ho: median ≤ 67
Ha: median > 67 , α = 0.05

Get the difference of each value from the hypothesized median, ignore the signs, and
rank the absolute differences. Separate the positive and negative ranks. If the values
of the differences are tied, assign the mean rank.

Student      1   2   3    4   5   6   7   8   9   10
Grade        71  79  40   70  82  72  60  76  69  75
Difference   4   12  -27  3   15  5   -7  9   2   8
+Rank        3   8        2   9   4       6   1   7
-Rank                10           5

Get the sums of the positive and negative ranks (+ = 40, - = 15).
Since n < 20, the test statistic is W = min(40, 15) = 15.
Critical value Wα = 11 from the table of critical values for the Wilcoxon signed-rank test.
Decision point
Reject Ho if W ≤ Wα; since 15 > 11, we do not reject Ho.
Conclusion
At 5% level of significance, there is no sufficient evidence that the median grade in the
law subject is greater than 67.
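The ranking steps above can be sketched in standard-library Python (not part of the original module). This sample has no tied absolute differences, but the helper averages tied ranks anyway:

```python
data = [71, 79, 40, 70, 82, 72, 60, 76, 69, 75]
k = 67  # hypothesized median

diffs = [x - k for x in data if x != k]        # drop zero differences
abs_sorted = sorted(abs(d) for d in diffs)

def avg_rank(value):
    """Average rank of an absolute difference (ties share the mean rank)."""
    positions = [i + 1 for i, a in enumerate(abs_sorted) if a == value]
    return sum(positions) / len(positions)

w_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)   # sum of + ranks: 40
w_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)  # sum of - ranks: 15
W = min(w_plus, w_minus)                                 # test statistic: 15
```

Comparing W = 15 with the critical value 11 reproduces the decision not to reject Ho.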

C. Mann Whitney U Test

The Mann-Whitney U test is a non-parametric test used to compare the medians of
two independent samples and determine whether the two population medians are
equal. It is used when the data is ordinal or when the assumptions of the t-test are
not met.

Assumptions
1. The samples are drawn randomly from the population.
2. Samples are mutually independent: each observation belongs to exactly one
group; it cannot be in both groups.
3. Data is at least of ordinal scale.
Null and Alternative Hypothesis
Ho: Median1 = Median2
Median1 – Median2 = 0
Ha: Median1 ≠ Median 2

Test Statistic

U1 = n1n2 + n1(n1 + 1)/2 − ∑R1
U2 = n1n2 + n2(n2 + 1)/2 − ∑R2
U = min(U1, U2)

Where:
n1 = sample size of group 1
n2 = sample size of group 2
Ri = sum of the ranks of group i

Rejection Region
Reject Ho if U ≤ Uα

Example
A researcher wants to test whether a new pain-relieving drug differs in effectiveness
from a placebo. A random sample of 23 people was taken: 12 received the placebo
and 11 took the new drug. Results are as follows:

Placebo   New Drug
11        34
15        31
9         35
4         29
34        28
17        12
18        18
14        30
12        14
13        22
26        10
31

Test for the equality of median response time at 5% level of significance.


Solution
Combine the two data sets and assign ranks, using average ranks for tied values. Sum
the ranks for each group.

Placebo   Rank    New Drug   Rank
11        4       34         21.5
15        10      31         19.5
9         2       35         23
4         1       29         17
34        21.5    28         16
17        11      12         5.5
18        12.5    18         12.5
14        8.5     30         18
12        5.5     14         8.5
13        7       22         14
26        15      10         3
31        19.5
Sum       117.5              158.5

Compute U1 and U2.
U1 = (12)(11) + [(12)(13)/2] − 117.5
   = 132 + 78 − 117.5
   = 92.5
U2 = (12)(11) + [(11)(12)/2] − 158.5
   = 132 + 66 − 158.5
   = 39.5
U = min(92.5, 39.5) = 39.5
Uα = 33 (critical value for n1 = 12, n2 = 11, α = 0.05, two-tailed)
Decision Point:
Reject Ho if U ≤ Uα; since 39.5 > 33, we do not reject Ho.
Conclusion
At 5% level of significance, there is no sufficient evidence to say there is a difference in
median response time to the new pain reliever.
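The ranking and U computation can be sketched in standard-library Python (not part of the original module); group sizes follow the worked solution, with 12 placebo and 11 drug observations:

```python
placebo = [11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31]
drug = [34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10]

combined = sorted(placebo + drug)

def avg_rank(value):
    """Average rank in the combined sample (tied values share the mean rank)."""
    positions = [i + 1 for i, x in enumerate(combined) if x == value]
    return sum(positions) / len(positions)

r1 = sum(avg_rank(x) for x in placebo)   # sum of placebo ranks: 117.5
r2 = sum(avg_rank(x) for x in drug)      # sum of drug ranks: 158.5
n1, n2 = len(placebo), len(drug)

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1    # 92.5
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2    # 39.5
U = min(u1, u2)                          # 39.5
```

Comparing U = 39.5 with the critical value 33 reproduces the decision not to reject Ho.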

D. Kruskal Wallis Test

The Kruskal-Wallis test is the non-parametric equivalent of the one-way
ANOVA. It was proposed by Kruskal and Wallis in 1952 and extends the Mann-
Whitney U test to more than two groups. It tests for significant differences in a
continuous dependent variable across a categorical independent variable with two
or more groups. Although it is not as powerful as the ANOVA, it can be used for
both continuous and ordinal-level measurements.

Assumptions
1. Samples are randomly drawn from the population
2. Observations are independent of each other.
3. Measurement scale is at least ordinal.
Null and Alternative Hypothesis
Ho: the population medians are all equal
Ha: not all population medians are equal

Test Statistic

H = [12 / (N(N + 1))] ∑(Ri²/ni) − 3(N + 1)

Where:
N = total number of observations
ni = number of observations in the ith group
Ri = sum of the ranks in the ith group

The distribution of the Kruskal-Wallis test statistic approximates a chi-square
distribution with k − 1 degrees of freedom, where k is the number of groups,
provided each group has 5 or more observations.

Critical Region
Reject Ho if H > Hα.

Example

A shoe company wants to know if three groups of workers have different salaries; the
data, in thousand pesos, are shown below. Test at the 5% level of significance for a
difference in salaries.

Women   Men   Minorities
23      45    18
41      55    30
54      60    34
66      70    40
78      72    44

Solution

Ho: the median salaries are equal
Ha: the median salaries are not all equal, at α = 0.05

Sort the combined data in ascending order and assign ranks. Give tied values the
average rank.

Salary   18  23  30  34  40  41  44  45  54  55  60  66  70  72  78
Rank     1   2   3   4   5   6   7   8   9   10  11  12  13  14  15

Sum the ranks for each group.

Women   Rank   Men   Rank   Minorities   Rank
23      2      45    8      18           1
41      6      55    10     30           3
54      9      60    11     34           4
66      12     70    13     40           5
78      15     72    14     44           7
Sum     44           56                  20

Calculate the test statistic.

H = [12 / ((15)(15 + 1))] [(44²/5) + (56²/5) + (20²/5)] − 3(15 + 1)
  = (12/240)(387.2 + 627.2 + 80) − 48
  = (0.05)(1094.4) − 48
H = 6.72

Hα = χ²(α = 0.05, df = 2) = 5.991

Decision Point:
Reject Ho if H > Hα; since 6.72 > 5.991, we reject the null hypothesis.

Conclusion:
At the 5% level of significance, there is sufficient evidence to say that the median
salaries of the three worker groups are not all equal.
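The H computation above can be sketched in standard-library Python (not part of the original module). Because this data set has no tied salaries, a simple value-to-rank map suffices; tied data would need averaged ranks instead:

```python
women = [23, 41, 54, 66, 78]
men = [45, 55, 60, 70, 72]
minorities = [18, 30, 34, 40, 44]
groups = [women, men, minorities]

combined = sorted(x for g in groups for x in g)
rank = {v: i + 1 for i, v in enumerate(combined)}  # valid because no ties
N = len(combined)

# H = [12 / (N(N+1))] * sum(Ri^2 / ni) - 3(N+1)
H = 12 / (N * (N + 1)) * sum(
    sum(rank[x] for x in g) ** 2 / len(g) for g in groups
) - 3 * (N + 1)
# H ≈ 6.72, to be compared with the chi-square critical value for 2 df
```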

E. Spearman’s Rank Correlation

The Spearman rank correlation coefficient, rs, is the non-parametric version of the
Pearson correlation coefficient r.

Assumptions
1. Data must be ordinal, interval or ratio.
2. Data has to be monotonic, i.e., as one variable increases (or decreases), the
other also tends to increase (or decrease).

Test Statistic

rs = 1 − 6∑d² / [n(n² − 1)]

Where:
d = difference in the ranks of each pair
n = sample size (number of pairs)

Test Results
Spearman's rs returns a value between -1 and +1, inclusive, where:
+1 = perfect positive correlation between ranks
-1 = perfect negative correlation between ranks
0 = no correlation between ranks

Example

The scores of nine students in Science and Reading are as follows:

Science Reading
35 30
23 33
47 45
17 23
10 8
43 49
9 12
6 4
28 31

Compute the Spearman’s rank correlation.

Solution.
Rank the scores on each subject in descending order. Use average ranks for tied values.
Find the (1) difference in the ranks and (2) squared differences in the ranks.

Science   Rank   Reading   Rank   d    d²
35        3      30        5      -2   4
23        5      33        3      2    4
47        1      45        2      -1   1
17        6      23        6      0    0
10        7      8         8      -1   1
43        2      49        1      1    1
9         8      12        7      1    1
6         9      4         9      0    0
28        4      31        4      0    0
∑d² = 12

Compute rs.

rs = 1 − 6(12) / [9(9² − 1)]
   = 1 − 72/720
   = 0.9

There is a strong positive correlation between the ranks of the Science and Reading
scores.
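The whole calculation can be sketched in standard-library Python (not part of the original module). Both score lists have no ties, so a simple descending sort gives the ranks:

```python
science = [35, 23, 47, 17, 10, 43, 9, 6, 28]
reading = [30, 33, 45, 23, 8, 49, 12, 4, 31]

def ranks_desc(values):
    """Rank scores in descending order (rank 1 = highest); assumes no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

d_squared = sum((a - b) ** 2
                for a, b in zip(ranks_desc(science), ranks_desc(reading)))  # 12
n = len(science)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))  # 0.9
```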

