Wilcoxon Signed Rank Test
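Left tail rejection region: $T^+ \le a$
Notation:
- $d_i = x_i - m_0$, where $m_0$ is the hypothesised median
- observations with $d_i = 0$ are ignored, and the remaining $|d_i|$ values are ranked
- if $x_i - m_0 > 0$, the rank is listed in column $R(d_i^+)$
- if $x_i - m_0 < 0$, the rank is listed in column $R(d_i^-)$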
iv. Sum the ranks of the positive differences, $T^+ = \sum R(d_i^+)$, and sum
the ranks of the negative differences, $T^- = \sum R(d_i^-)$.
v. The test statistic, W, depends on the alternative hypothesis:
- For a two tailed test, the test statistic is $W = \min(T^+, T^-)$
- For a one tailed test where $H_1: \text{median} > m_0$, the test statistic is $W = T^-$
- For a one tailed test where $H_1: \text{median} < m_0$, the test statistic is $W = T^+$
Critical region:
Compare the test statistic, W, with the critical value, a, in the tables; the null
hypothesis is rejected if $W \le a$.
Make a decision
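The procedure above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names (`mean_rank`, `signed_rank_sums`, `choose_w`), the labels used for the alternative hypothesis, and the sample values are all made up for this sketch and do not come from the notes. The critical value still has to be read from the tables.

```python
def mean_rank(value, values):
    """Rank of `value` within `values` (smallest = 1); ties share the mean rank."""
    smaller = sum(v < value for v in values)
    tied = sum(v == value for v in values)
    return smaller + (tied + 1) / 2


def signed_rank_sums(xs, m0):
    """Return (T+, T-, n) for a sample `xs` and hypothesised median `m0`."""
    # d_i = x_i - m0; rounding guards against floating-point noise
    # (assumes the data have at most 6 decimal places).
    d = [round(x - m0, 6) for x in xs]
    d = [v for v in d if v != 0]  # ignore zero differences
    abs_d = [abs(v) for v in d]
    t_plus = sum(mean_rank(abs(v), abs_d) for v in d if v > 0)
    t_minus = sum(mean_rank(abs(v), abs_d) for v in d if v < 0)
    return t_plus, t_minus, len(d)


def choose_w(t_plus, t_minus, alternative):
    """Pick W according to the alternative hypothesis."""
    if alternative == "two-sided":
        return min(t_plus, t_minus)
    if alternative == "greater":  # H1: median > m0
        return t_minus
    if alternative == "less":     # H1: median < m0
        return t_plus
    raise ValueError(f"unknown alternative: {alternative!r}")


# Made-up sample, for illustration only.
sample = [12.0, 9.5, 10.0, 13.5, 8.0, 11.0]
t_plus, t_minus, n = signed_rank_sums(sample, m0=10.0)
w = choose_w(t_plus, t_minus, "two-sided")
print(f"n = {n}, T+ = {t_plus}, T- = {t_minus}, W = {w}")
```

The two values tied at $|d_i| = 2$ share the mean rank 3.5, and the observation equal to $m_0$ is dropped, so n counts only the non-zero differences.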
Example:
An environmental activist believes her community's drinking water contains
at least the 40.0 parts per million (ppm) limit recommended by health officials
for a certain metal. In response to her claim, the health department samples
and analyzes drinking water from a sample of 11 households in the
community. The results are shown in the table below. At the 0.05 level of
significance, can we conclude that the community's drinking water might equal
or exceed the 40.0 ppm recommended limit?
Household   Observed concentration, $x_i$
A           39
B           20.2
C           40
D           32.2
E           30.5
F           26.5
G           42.1
H           45.6
I           42.1
J           29.9
K           40.9
Solution:
$m_0 = 40$

Household   $x_i$   $d_i = x_i - m_0$   $|d_i|$   Rank, $R|d_i|$   $R(d_i^+)$   $R(d_i^-)$
A           39      -1                  1         2                _            2
B           20.2    -19.8               19.8      10               _            10
C           40      0                   0         _                _            _
D           32.2    -7.8                7.8       6                _            6
E           30.5    -9.5                9.5       7                _            7
F           26.5    -13.5               13.5      9                _            9
G           42.1    2.1                 2.1       3.5              3.5          _
H           45.6    5.6                 5.6       5                5            _
I           42.1    2.1                 2.1       3.5              3.5          _
J           29.9    -10.1               10.1      8                _            8
K           40.9    0.9                 0.9       1                1            _
                                                                   $T^+ = 13$   $T^- = 42$

1. $H_0$: median = 40
   $H_1$: median < 40
   (One tail test)
2. Based on the alternative hypothesis, the test statistic is
   $T^+ = \sum R(d_i^+) = 13$
3. $\alpha = 0.05$, $n = 10$ (household C, with $d_i = 0$, is ignored)
4. From the table of Wilcoxon signed rank for a one tail test, with
   $\alpha = 0.05$ and $n = 10$, the critical value is $a = 11$.
   We will reject $H_0$ if $T^+ \le a$.
5. Since $T^+ = 13 > a = 11$, we fail to reject $H_0$ and conclude
   that the city's water supply might have at least 40.0 ppm of the metal.
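As a cross-check of the worked example, the rank sums can be recomputed from the household data in plain Python. This is a sketch only: the helper `mean_rank` and the variable names are chosen here for illustration, and the critical value 11 is simply copied from the notes rather than computed.

```python
# Observed concentrations (ppm) for households A..K, from the example table.
concentrations = [39, 20.2, 40, 32.2, 30.5, 26.5, 42.1, 45.6, 42.1, 29.9, 40.9]
m0 = 40.0            # hypothesised median
critical_value = 11  # a, as quoted in the notes (alpha = 0.05, n = 10, one tail)


def mean_rank(value, values):
    """Rank of `value` within `values` (smallest = 1); ties share the mean rank."""
    smaller = sum(v < value for v in values)
    tied = sum(v == value for v in values)
    return smaller + (tied + 1) / 2


# d_i = x_i - m0, rounded to remove floating-point noise; zeros are dropped.
d = [round(x - m0, 6) for x in concentrations]
d = [v for v in d if v != 0]
abs_d = [abs(v) for v in d]

t_plus = sum(mean_rank(abs(v), abs_d) for v in d if v > 0)
t_minus = sum(mean_rank(abs(v), abs_d) for v in d if v < 0)
n = len(d)

# Sanity check: the ranks 1..n always add up to n(n + 1)/2.
assert t_plus + t_minus == n * (n + 1) / 2

w = t_plus  # left-tail alternative (H1: median < 40) uses W = T+
reject_h0 = w <= critical_value
print(f"n = {n}, T+ = {t_plus}, T- = {t_minus}, reject H0: {reject_h0}")
```

The check that $T^+ + T^- = n(n+1)/2$ is a quick way to catch ranking slips when the table is filled in by hand.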
Exercise 5.1:
Student satisfaction surveys ask students to rate a particular course, on a
scale of 1 (poor) to 10 (excellent). In previous years the replies have been
symmetrically distributed about a median of 4. This year there has been a
much greater on-line element to the course, and staff want to know how the
rating of this version of the course compares with the previous one.
14 students, randomly selected, were asked to rate the new version of the
course and their ratings were as follows:
1 3 6 4 8 2 3 6 5 2 3 4 1 2
Is there any evidence at the 5% level that students rate this version any
differently?
The Wilcoxon Signed Rank Test for Paired Samples
Null and alternative hypothesis:

Case         $H_0$          $H_1$            Rejection region
Two tail     median = 0     median $\ne$ 0   $\min(T^+, T^-) \le a$
Right tail   median = 0     median > 0       $T^- \le a$
Left tail    median = 0     median < 0       $T^+ \le a$

Here the median is the population median of the differences $d_i = x_i - y_i$.

Test procedure:
i. For each pair of observed values, calculate $d_i = x_i - y_i$.
ii. Ignoring observations where $d_i = 0$, rank the $|d_i|$ values so that the
smallest $|d_i|$ has a rank of 1. Where two or more differences
have the same value, find their mean rank and use this.
iii. For observations where $x_i - y_i > 0$, list the rank in column $R(d_i^+)$, and
for observations where $x_i - y_i < 0$, list the rank in column $R(d_i^-)$.
iv. Sum the ranks of the positive differences, $T^+ = \sum R(d_i^+)$, and sum
the ranks of the negative differences, $T^- = \sum R(d_i^-)$.
v. The test statistic, W, depends on the alternative hypothesis:
- For a two tailed test, the test statistic is $W = \min(T^+, T^-)$
- For a one tailed test where $H_1: \text{median} > 0$, the test statistic is $W = T^-$
- For a one tailed test where $H_1: \text{median} < 0$, the test statistic is $W = T^+$
Critical region:
Compare the test statistic, W, with the critical value, a, in the tables; the null
hypothesis is rejected if $W \le a$.
Make a decision
Example:
Two computer software packages are being considered for use in the
inventory control department of a small manufacturing firm. The firm has
selected 12 different computing tasks that are typical of the kinds of jobs. The
results are shown in the table below. At the 0.10 level, can we conclude that
the median difference for the population of such tasks might be zero?
                 Time required for software packages
Computing task   $x_i$    $y_i$
A                24       23.1
B                16.7     20.4
C                21.6     17.7
D                23.7     20.7
E                37.5     42.1
F                31.4     36.1
G                14.9     21.8
H                37.3     40.3
I                17.9     26
J                15.5     15.5
K                29       35.4
L                19.9     25.5
Solution:

Computing task   $x_i$   $y_i$   $d_i = x_i - y_i$   $|d_i|$   Rank, $R|d_i|$   $R(d_i^+)$     $R(d_i^-)$
A                24      23.1    0.9                 0.9       1                1              _
B                16.7    20.4    -3.7                3.7       4                _              4
C                21.6    17.7    3.9                 3.9       5                5              _
D                23.7    20.7    3                   3         2.5              2.5            _
E                37.5    42.1    -4.6                4.6       6                _              6
F                31.4    36.1    -4.7                4.7       7                _              7
G                14.9    21.8    -6.9                6.9       10               _              10
H                37.3    40.3    -3                  3         2.5              _              2.5
I                17.9    26      -8.1                8.1       11               _              11
J                15.5    15.5    0                   0         _                _              _
K                29      35.4    -6.4                6.4       9                _              9
L                19.9    25.5    -5.6                5.6       8                _              8
                                                                                $T^+ = 8.5$    $T^- = 57.5$

1. $H_0$: median of $d_i$ = 0
   $H_1$: median of $d_i$ $\ne$ 0
   (two tail test)
2. Based on the alternative hypothesis, the test statistic is
   $\min(T^+, T^-) = \min(8.5, 57.5) = 8.5$
3. $\alpha = 0.10$, $n = 12 - 1 = 11$
4. From the table of Wilcoxon signed rank for a two tail test, with
   $\alpha = 0.10$ and $n = 11$, then $a = 14$.
   We will reject $H_0$ if $\min(T^+, T^-) \le a$.
5. Since $\min(8.5, 57.5) = 8.5 \le 14$, we reject $H_0$ and conclude that the
software packages are not equally rapid in handling computing tasks like
those in the sample, or the population median for $d_i = x_i - y_i$ is not equal
to zero and that package x is faster than package y in handling computing
tasks like the ones sampled.
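The paired example can be cross-checked the same way in plain Python. This is a sketch under stated assumptions: the helper `mean_rank` and the variable names are illustrative, and the critical value 14 is taken from the notes, not computed.

```python
# Times for the 12 computing tasks A..L, from the example table.
x = [24, 16.7, 21.6, 23.7, 37.5, 31.4, 14.9, 37.3, 17.9, 15.5, 29, 19.9]
y = [23.1, 20.4, 17.7, 20.7, 42.1, 36.1, 21.8, 40.3, 26, 15.5, 35.4, 25.5]
critical_value = 14  # a, as quoted in the notes (alpha = 0.10, n = 11, two tail)


def mean_rank(value, values):
    """Rank of `value` within `values` (smallest = 1); ties share the mean rank."""
    smaller = sum(v < value for v in values)
    tied = sum(v == value for v in values)
    return smaller + (tied + 1) / 2


# d_i = x_i - y_i, rounded so that the tie between tasks D and H is detected
# despite floating-point noise; the zero difference (task J) is dropped.
d = [round(a - b, 6) for a, b in zip(x, y)]
d = [v for v in d if v != 0]
abs_d = [abs(v) for v in d]

t_plus = sum(mean_rank(abs(v), abs_d) for v in d if v > 0)
t_minus = sum(mean_rank(abs(v), abs_d) for v in d if v < 0)
n = len(d)

# Sanity check: the ranks 1..n always add up to n(n + 1)/2.
assert t_plus + t_minus == n * (n + 1) / 2

w = min(t_plus, t_minus)  # two-tailed test
reject_h0 = w <= critical_value
print(f"n = {n}, T+ = {t_plus}, T- = {t_minus}, W = {w}, reject H0: {reject_h0}")
```

Rounding the differences before ranking is a design choice of this sketch: without it, values such as 23.7 - 20.7 and 40.3 - 37.3 may differ in the last binary digit and fail to be treated as tied.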
Exercise 5.2:
The following data give the number of industrial accidents in ten
manufacturing plants for one month periods before and after an intensive
promotion on safety:

Plant    1  2  3  4  5  6  7  8  9  10
Before   3  4  3  6  8  4  5  6  7  8
After    2  3  1  3  4  1  4  5  6  4

Do the data support the claim that the campaign was successful in reducing
accidents with median less than 0? Use $\alpha = 0.05$.
Exercise 5.3
An experiment was conducted to compare the densities of cakes
prepared from two different cake mixes, A and B. Six cake pans received
batter A and six received batter B. Expecting a variation in oven
temperature, the experimenter placed an A and a B cake side by side at six
different locations in the oven. Test the hypothesis of no difference in the
median of the population distributions of cake densities for the two different
cake batters. Test at $\alpha = 0.10$.

Cake Mix A   Cake Mix B
0.135        0.129
0.102        0.120
0.098        0.112
0.141        0.152
0.131        0.135
0.144        0.163
Spearman's Rank Correlation Test
We have seen that the correlation coefficient r measures the linear relationship
between two continuous variables X and Y.
Spearman's Rank Correlation Test is used to measure the strength and the
direction of the relationship between two variables which are at least
ordinal data.
A measure of correlation for ranked data, based on the definition of the
Pearson correlation and used where there are no ties or few ties, is called the
Spearman rank correlation coefficient, denoted by $r_s$:
$$r_s = 1 - \frac{6T}{n(n^2 - 1)}$$
where
$$T = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} \left[ R(X_i) - R(Y_i) \right]^2$$
- $R(X_i)$ is the rank assigned to $x_i$
- $R(Y_i)$ is the rank assigned to $y_i$
- $d_i$ is the difference between the ranks assigned to $x_i$ and $y_i$
- $n$ is the number of pairs of data
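The formula for $r_s$ translates directly into a few lines of Python. The function name `spearman_from_ranks` and the two small rank lists are made up for this sketch; the function assumes the ranks have already been assigned and that there are no ties, as the formula in the notes does.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman rank correlation r_s = 1 - 6T / (n(n^2 - 1)) from two rank lists."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rank lists must have the same length")
    n = len(rank_x)
    if n < 2:
        raise ValueError("at least two pairs are needed")
    # T is the sum of squared rank differences d_i = R(X_i) - R(Y_i).
    t = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * t / (n * (n ** 2 - 1))


# Illustrative ranks only: identical rankings and exactly reversed rankings.
same_order = spearman_from_ranks([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
reverse_order = spearman_from_ranks([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(same_order, reverse_order)
```

Identical rankings give T = 0 and hence $r_s = 1$, while exactly reversed rankings give $r_s = -1$, the two cases of perfect association.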
A value of +1 or -1 indicates perfect association between X and Y.
The plus sign with value $r_s > 0.5$ indicates strong positive correlation
between x and y, and $r_s < 0.5$ indicates weak positive correlation
between x and y.
The minus sign with value $|r_s| > 0.5$ indicates strong negative correlation
between x and y, and $|r_s| < 0.5$ indicates weak negative correlation
between x and y.
When $r_s$ is zero or close to zero, we would conclude that the variables
are uncorrelated.
Example:
The data below show the effect of the mole ratio of sebacic acid on the
intrinsic viscosity of copolyesters.

Mole ratio   1.0    0.9    0.8    0.7    0.6    0.5    0.4    0.3
Viscosity    0.45   0.20   0.34   0.58   0.70   0.57   0.55   0.44

Find the Spearman rank correlation coefficient to measure the relationship of
mole ratio of sebacic acid and the viscosity of copolyesters.
Solutions:
X: mole ratio of sebacic acid
Y: viscosity of copolyesters
Mole ratio   Viscosity   $R(x_i)$   $R(y_i)$   $d_i = R(x_i) - R(y_i)$   $d_i^2$
1.0          0.45        8          4          4                         16
0.9          0.20        7          1          6                         36
0.8          0.34        6          2          4                         16
0.7          0.58        5          7          -2                        4
0.6          0.70        4          8          -4                        16
0.5          0.57        3          6          -3                        9
0.4          0.55        2          5          -3                        9
0.3          0.44        1          3          -2                        4
                                                                         T = 110
Thus
$$r_s = 1 - \frac{6T}{n(n^2 - 1)} = 1 - \frac{6(110)}{8(64 - 1)} = -0.3095$$
which shows a weak negative correlation between the mole ratio of sebacic
acid and the viscosity of copolyesters.
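The sebacic acid example can be reproduced in plain Python as a check on the arithmetic. This is a sketch only: the helper `ranks_no_ties` is a name chosen here, and it assumes all values in a list are distinct, which holds for this data set.

```python
mole_ratio = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
viscosity = [0.45, 0.20, 0.34, 0.58, 0.70, 0.57, 0.55, 0.44]


def ranks_no_ties(values):
    """Rank from smallest (1) to largest; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


rank_x = ranks_no_ties(mole_ratio)
rank_y = ranks_no_ties(viscosity)

# d_i = R(x_i) - R(y_i) and T = sum of d_i squared.
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]
t = sum(v * v for v in d)
n = len(d)

r_s = 1 - 6 * t / (n * (n ** 2 - 1))
print(f"T = {t}, r_s = {r_s:.4f}")
```

With tied observations the ranks would need the mean-rank treatment instead, so this helper should not be reused on data with repeated values.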
Exercise 5.4:
The following data were collected and ranked during an experiment to
determine the change in thrust efficiency, y, as the divergence angle of a rocket
nozzle, x, changes:

Rank X   1  2  3  4  5  6  7  8  9   10
Rank Y   2  3  1  5  7  9  4  6  10  8

Find the Spearman rank correlation coefficient to measure the relationship
between the divergence angle of a rocket nozzle and the change in thrust
efficiency.
Exercise 5.5
Suppose eight elementary school science teachers have been ranked by a
judge according to their teaching ability and all have taken a national
teachers' examination. Calculate the Spearman rank correlation
coefficient to measure the relationship between Judge Rank ($x_i$) and Test
Rank ($y_i$).

Judge Rank ($x_i$)   Test Rank ($y_i$)
7                    1
4                    5
2                    3
6                    4
1                    8
3                    7
8                    2
5                    6