Non-Parametric Method: Advantages
Non-Parametric method
Most of the statistical tests that we have discussed so far have the following features in
common.
(1) The form of the frequency function of the parent population from which the samples
have been drawn is assumed to be known.
(2) They are concerned with testing statistical hypotheses about the parameters of this
frequency function or with estimating its parameters.
For example, almost all small-sample tests of significance are based on the fundamental
assumption that the parent population is normal, and are concerned with testing or
estimating the means and variances of the population. Such tests, which deal with the
parameters of the population, are known as parametric tests.
A non-parametric (N.P.) test is a test that does not depend on the particular form of the
basic frequency function from which the samples are drawn. In other words, a
non-parametric test does not make any assumption regarding the form of the population.
• N.P. methods are simple and easy to apply and do not require complicated
sampling theory.
• No assumption is made about the form of the frequency function of the parent
population from which sampling is done.
• Non-parametric tests are available to deal with data which are measured on a nominal scale.
• Since psychometric data are often not normally distributed, non-parametric tests have
found applications in psychometry.
• Non-parametric tests are available to deal with data which are given in ranks.
Drawbacks
Sign Test
The sign test is of two types: the one-sample sign test and the paired-sample sign test.
Procedure 1
In a one-sample sign test we test the null hypothesis H0: θ = θ0 (a specified value) in a
population known to be symmetrical about the ordinate at x = θ0. Let the observed sample
be x1, x2, ..., xn. Give a plus (+) sign to all observations which exceed θ0 and a
minus (-) sign to all observations which are less than θ0, and discard the sample values
exactly equal to θ0 (zero difference). Then test the null hypothesis that these plus and
minus signs are values of a random variable having the binomial distribution with p = 1/2:
$$ P(X = x) = \binom{n}{x} \left(\frac{1}{2}\right)^{n}, \quad x = 0, 1, 2, \dots, n $$
Procedure 2
If the zero differences are neither plus nor minus, they are excluded from N (the total
number) as well as from the two categories of +ve and -ve signs. Let us have n
observations remaining.
Let X be the number of positive signs; X follows the binomial distribution with parameters n and p.
The distribution of X is
$$ f(x) = P(X = x) = \begin{cases} \binom{n}{x} p^{x} q^{n-x}, & x = 0, 1, \dots, n;\ 0 < p < 1;\ p + q = 1 \\ 0, & \text{otherwise} \end{cases} $$
i. For testing H0: p = 1/2 v/s H1: p > 1/2,
find p′ = P(X ≥ x).
If p′ > α, accept H0;
if p′ ≤ α, reject H0.
ii. For testing H0: p = 1/2 v/s H1: p < 1/2,
find p′ = P(X ≤ x).
If p′ > α, accept H0;
if p′ ≤ α, reject H0.
iii. For testing H0: p = 1/2 v/s H1: p ≠ 1/2,
find p′ = P(X ≥ x) when x > n/2 (or p′ = P(X ≤ x) when x < n/2).
If 2p′ > α, accept H0;
if 2p′ ≤ α, reject H0.
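The three decision rules above can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function names `binom_tail` and `sign_test_p_value` are hypothetical, case ii is assumed to use the lower tail P(X ≤ x), and the two-sided case is assumed to double the smaller tail.

```python
from math import comb


def binom_tail(x, n, upper):
    """P(X >= x) if upper, else P(X <= x), for X ~ Binomial(n, 1/2)."""
    ks = range(x, n + 1) if upper else range(0, x + 1)
    return sum(comb(n, k) for k in ks) / 2 ** n


def sign_test_p_value(x, n, alternative):
    """Exact p' for x plus signs among n non-zero differences.

    alternative: "greater" (H1: p > 1/2), "less" (H1: p < 1/2)
    or "two-sided" (H1: p != 1/2).
    """
    if alternative == "greater":
        return binom_tail(x, n, upper=True)
    if alternative == "less":
        return binom_tail(x, n, upper=False)
    if alternative == "two-sided":
        # Assumption: double the smaller tail and cap the result at 1.
        smaller = min(binom_tail(x, n, True), binom_tail(x, n, False))
        return min(1.0, 2 * smaller)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


alpha = 0.05
p_dash = sign_test_p_value(14, 31, "less")
print(round(p_dash, 4), "reject H0" if p_dash <= alpha else "accept H0")
```

With 14 plus signs among 31 non-zero differences, the lower-tail probability is well above α = 0.05, so this rule would accept H0.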
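For large n, the normal approximation to the binomial distribution (with continuity
correction) may be used:
$$ Z = \frac{(x - 0.5) - \frac{n}{2}}{\sqrt{\frac{n}{4}}} \sim N(0, 1), \quad x > \frac{n}{2} $$
(when x < n/2, x + 0.5 is used in place of x − 0.5).
If |Z| ≥ Zα/2, reject H0, where Zα/2 can be obtained from standard normal tables
such that P(Z > Zα/2) = α/2.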
➢ To test the claim that the median age of the mathematics faculty of a state community
college is at least 42 years, a random sample of 32 mathematics faculty members
gave the following ages.
56, 62, 61, 54, 52, 32, 24, 35, 50, 42, 52, 49, 26, 31, 31, 54, 38, 36, 45, 53, 37, 40, 38, 31,
29, 25, 45, 52, 48, 39, 30, 38
We have to test H0: θ = 42 v/s H1: θ < 42 (the claim is that the median is at least 42).
x x- θ0 (θ0=42) Sign
56 14 +
62 20 +
61 19 +
54 12 +
52 10 +
32 -10 -
24 -18 -
35 -7 -
50 8 +
42 0 0
52 10 +
49 7 +
26 -16 -
31 -11 -
31 -11 -
54 12 +
38 -4 -
36 -6 -
45 3 +
53 11 +
37 -5 -
40 -2 -
38 -4 -
31 -11 -
29 -13 -
25 -17 -
45 3 +
52 10 +
48 6 +
39 -3 -
30 -12 -
38 -4 -
Using Procedure 1
N = 32
n = N − r = 32 − 1 = 31 (r = 1 zero difference is discarded)
X = number of times the less frequent sign occurs = 14 (here the less frequent sign is +ve)
$$ K = \frac{n-1}{2} - 0.98\sqrt{n} = \frac{31-1}{2} - 0.98\sqrt{31} = 15 - 0.98\sqrt{31} = 9.54 $$
Here X = 14 and K = 9.54.
X > K, therefore there is no reason to reject H0 at α = 0.05,
i.e. the median age is 42.
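Using Procedure 2
$$ \text{Test statistic} = \frac{X + 0.5 - \frac{n}{2}}{\sqrt{\frac{n}{4}}} $$
X = number of positive signs = 14
n/2 = 31/2 = 15.5
Here X < n/2, so the test statistic is
$$ Z = \frac{(14 + 0.5) - \frac{n}{2}}{\sqrt{\frac{n}{4}}} = \frac{14.5 - 15.5}{\sqrt{\frac{31}{4}}} = -0.36 $$
α = 0.05
Zα = 1.65
Since Z = −0.36 > −Zα = −1.65, there is no reason to reject H0 at α = 0.05,
i.e. the median age is 42.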
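The arithmetic of this example can be checked with a short script. It is only a sketch: the variable names are illustrative, and the continuity correction is assumed to be +0.5 when the number of plus signs is below n/2 and −0.5 otherwise.

```python
from math import sqrt

ages = [56, 62, 61, 54, 52, 32, 24, 35, 50, 42, 52, 49, 26, 31, 31, 54,
        38, 36, 45, 53, 37, 40, 38, 31, 29, 25, 45, 52, 48, 39, 30, 38]
theta0 = 42

plus = sum(a > theta0 for a in ages)    # observations above theta0
minus = sum(a < theta0 for a in ages)   # observations below theta0
n = plus + minus                        # observations equal to theta0 are dropped
x = min(plus, minus)                    # count of the less frequent sign

# Procedure 1: critical value K, compared with x
k = (n - 1) / 2 - 0.98 * sqrt(n)

# Procedure 2: normal approximation with continuity correction
correction = 0.5 if plus < n / 2 else -0.5
z = (plus + correction - n / 2) / sqrt(n / 4)

print(plus, minus, n)             # 14 17 31
print(round(k, 2), round(z, 2))   # 9.54 -0.36
```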
The sign test is most often employed for observations that have been randomly
selected in pairs, using a paired difference experiment. The sign test has
important applications in problems involving paired data, such as data from a
pre-test/post-test situation or the responses of mother and daughter towards
ideal family life. In these problems each pair of sample values is replaced with a
plus sign if the first value is greater than the second, or a minus sign if the first value
is smaller than the second, and is discarded if the two values are equal. We then test
the null hypothesis that the plus and minus signs are values of a random variable having
the binomial distribution with p = 1/2.
➢ 10 women are randomly selected and their weights before and after a particular diet
are recorded.
Weight
Xi before diet: 180 178 165 200 160 145 170 210 185 155
Yi after diet: 174 181 157 198 152 152 160 205 178 160
Use the sign test at α= 0.05 to test the claim that the new weight loss diet is effective.
$$ K = \frac{n-1}{2} - 0.98\sqrt{n} = \frac{9}{2} - 0.98\sqrt{10} = 1.4 $$
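The following is a sketch of how the paired sign test could be carried through for this data. It assumes the same rule as in the faculty-age example (H0 is rejected only when the count of the less frequent sign is at most K); the variable names are illustrative.

```python
from math import sqrt

before = [180, 178, 165, 200, 160, 145, 170, 210, 185, 155]
after = [174, 181, 157, 198, 152, 152, 160, 205, 178, 160]

diffs = [x - y for x, y in zip(before, after)]
plus = sum(d > 0 for d in diffs)    # weight went down
minus = sum(d < 0 for d in diffs)   # weight went up
n = plus + minus                    # zero differences would be dropped
x = min(plus, minus)                # count of the less frequent sign

k = (n - 1) / 2 - 0.98 * sqrt(n)
reject_h0 = x <= k                  # assumed decision rule

print(diffs)
print(plus, minus, round(k, 1), reject_h0)
```

Under these assumptions there are 7 plus signs and 3 minus signs; since 3 > K ≈ 1.4, this rule gives no reason to reject H0 at α = 0.05.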
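Let (X1, Y1), (X2, Y2), ..., (Xn, Yn) be a sample of size n from a bivariate population.
Consider Zi = Xi − Yi (i = 1, 2, ..., n), a random sample from the population of Z,
which is absolutely continuous and symmetric about its median.
We want to test H0: the median of Z is zero
v/s H1: the median of Z is greater than zero,
or less than zero,
or not equal to zero.
Now we arrange |Zi| = |Xi − Yi| in increasing order of magnitude and rank these
absolute values as 1, 2, ..., n.
Let T+ be the sum of the ranks of the positive Zi values and T− be the sum of the ranks
of the negative Zi values. Clearly,
$$ 0 \le T^{+} \le \frac{n(n+1)}{2} $$
$$ T^{+} = \sum_{i=1}^{n} i \, Z_{(i)} $$
where Z(i) = 1 if the i-th smallest |Zi| corresponds to a positive Zi, and Z(i) = 0 otherwise.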
➢ Two measuring devices take readings on each of 10 test units. Let X and Y
be the readings on a test unit by the first and second devices respectively.
The data are as follows.
Test unit    1    2    3    4    5    6    7    8    9   10
X           71  108   72  140   61   97   90  127  101  114
Y           77  105   71  152   88  107   93  130  112  105
(Given C3 = 46 and C4 = 9)
$$ C_4 = \frac{n(n+1)}{2} - C_3 = \frac{10 \times 11}{2} - 46 = 55 - 46 = 9 $$
T+ = 3 + 1 + 6 = 10
C3 = 46, C4 = 9
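The rank sums for this data can be reproduced with a small script. This is a sketch under stated assumptions: the function name `signed_rank_sums` is hypothetical, zero differences are dropped, and tied absolute differences receive the average of the ranks they occupy.

```python
def signed_rank_sums(x, y):
    """Return (T_plus, T_minus) for paired samples x and y."""
    z = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    abs_sorted = sorted(abs(v) for v in z)

    # Average rank for each distinct absolute value (handles ties).
    rank_of = {}
    i = 0
    while i < len(abs_sorted):
        j = i
        while j < len(abs_sorted) and abs_sorted[j] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    t_plus = sum(rank_of[abs(v)] for v in z if v > 0)
    t_minus = sum(rank_of[abs(v)] for v in z if v < 0)
    return t_plus, t_minus


x = [71, 108, 72, 140, 61, 97, 90, 127, 101, 114]
y = [77, 105, 71, 152, 88, 107, 93, 130, 112, 105]
t_plus, t_minus = signed_rank_sums(x, y)
print(t_plus, t_minus)   # 10.0 45.0
```

The three positive differences (3, 1 and 9) carry ranks 3, 1 and 6, which gives T+ = 10. If C4 and C3 are read as the lower and upper critical values (an assumption; the decision rule is not stated here), T+ = 10 lies between them, which would give no reason to reject H0.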
Non-parametric tests like the sign test and the median test are meant to find a particular
kind of difference between two independent samples or sets of observations, while the runs
test may be used to find any difference existing in a single sample (with regard to the
occurrence of events) or between two independent samples drawn from the population.
Run
If both samples come from the same population, there will be thorough mingling
of the xi's and yi's, and consequently the number of runs in the combined sample will be large.
On the other hand, if the samples come from 2 different populations whose ranges
do not overlap, there will be only 2 runs, of the type x1, x2, ..., xn and y1, y2, ..., yn.
Generally, any difference in mean and variance tends to reduce the number of runs.
The hypotheses to be tested through the runs test in the single-sample and the
two-independent-sample cases may be stated as follows.
1) In a single sample, the events occur in a random order, for example:
a) when a coin is tossed a number of times, heads and tails occur in a random
(unpredictable) order;
b) in a class, the achievement scores of male and female students are in random order.
2) Two independent samples are drawn from identical populations, i.e. there is no
difference between the two samples in their central tendencies, variability and skewness.
Procedure
For testing H0: f1(x) = f2(x), i.e. that the samples have come from the same population, we
count the number of runs r. If the computed value of r falls between the
critical values (lower and upper) read from the tables, it cannot be regarded as significant,
and in this case we accept H0. But if r is equal to or more extreme than one of these
critical values, we reject H0.
In the case of a one-tailed test, where the direction of the departure from randomness is
predicted, only one of the tables needs to be referred to. If the prediction is that too few
runs will be observed, refer to table O1: if r is equal to or smaller than the value read
from the table, reject H0. If the prediction is that too many runs will be observed, consult
table O2. Similarly, when we have observations recorded from two independent samples drawn
from the populations, we make use of one table for finding the critical values of r at the
desired significance level.
In the case of large samples, suppose there are N1 x's and N2 y's and that, under H0, all
arrangements of the x's and y's are equally likely. Here the tables of critical values of r
used for small samples are not applicable. As the sample size increases, the sampling
distribution of r approaches the normal distribution, and therefore the value of Z is used
for rejecting or accepting the hypothesis:
$$ Z = \frac{r - \mu_r}{\sigma_r} $$
$$ \mu_r = \frac{2N_1N_2}{N_1 + N_2} + 1, \qquad \sigma_r = \sqrt{\frac{2N_1N_2(2N_1N_2 - N_1 - N_2)}{(N_1 + N_2)^2 (N_1 + N_2 - 1)}} $$
H0 is rejected if |Z| ≥ Zα/2 (two-tailed test) or if Z lies beyond Zα in the predicted
direction (one-tailed test), where Zα and Zα/2 are obtained from the standard normal table
for level of significance α.
We have to test H0: heads and tails occur randomly and the coin is not erratic
(i.e. it is unbiased).
H   T   HH  T   HHH T   HH  TTT H   T   H   T   HH
1   2   3   4   5   6   7   8   9   10  11  12  13
Number of runs r = 13
N1 = number of heads = 12
N2 = number of tails = 8
N = N1 + N2 = 12 + 8 = 20
α = 0.05
The total number of runs, 13, lies within the critical values 6 and 16. Hence there is no
reason to reject H0, and we conclude that the coin is not erratic.
10 boys and 10 girls of class XI, selected from a boys' higher secondary school and a
girls' higher secondary school respectively, were examined in terms of their attitude
towards population education. Their scores on the attitude scale are shown in the
following table. Test the hypothesis that boys and girls do not differ in terms of their
attitude towards population education.
Boys: 15 6 19 12 4 20 5 18 10
Girls: 17 9 15 13 3 8 14 11 2
2 3 4 5 6 7 8 9 10 11 12 13 14 15 15 16 17 18 19 20
G G B B B B G G B G B G G B G G G B B B
1 2 3 4 5 6 7 8 9 10
Number of runs r = 10
N1 = number of girls = 10
N2 = number of boys = 10
N = N1 + N2 = 10 + 10 = 20
There is no reason to reject H0 at α = 0.05 i.e., boys and girls do not differ in terms of
their attitude towards population education.
➢ At a railway reservation window there was a long queue of men (M) and women
(W) standing in the order in which they had arrived, as shown
below. Test whether or not they were standing in a random order.
MWMWMMMWWMWMWMWMMMMWMWMWMMWWWMW
MWMWMMWMMWMMMMWMWMM
We have to test H0: they are standing in a random order v/s H1: They are not standing in a
random order (i.e., two tailed test)
M    W    M    W    MMM  WW   M    W    M    W    M    W    MMMM W    M    W    M    W    MM
1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19
WWW  M    W    M    W    M    W    MM   W    MM   W    MMMM W    M    W    MM
20   21   22   23   24   25   26   27   28   29   30   31   32   33   34   35
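N1 = number of men = 30
N2 = number of women = 20
r = number of runs = 35
$$ \mu_r = \frac{2N_1N_2}{N_1 + N_2} + 1 = \frac{2 \times 30 \times 20}{30 + 20} + 1 = 24 + 1 = 25 $$
$$ \sigma_r = \sqrt{\frac{2N_1N_2(2N_1N_2 - N_1 - N_2)}{(N_1 + N_2)^2 (N_1 + N_2 - 1)}} = \sqrt{\frac{2 \times 30 \times 20\,(2 \times 30 \times 20 - 30 - 20)}{(30 + 20)^2 (30 + 20 - 1)}} = 3.36 $$
$$ Z = \frac{r - \mu_r}{\sigma_r} = \frac{35 - 25}{3.36} = 2.98 $$
α = 0.05
Zα/2 = 1.96
Since |Z| = 2.98 > Zα/2 = 1.96, we reject H0: they were not standing in a random order.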
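The large-sample calculation can be sketched as follows. The queue string is transcribed from the grouped runs shown in the solution (50 people in 35 runs); the function name `runs_test_z` is hypothetical and the sketch assumes a sequence containing exactly two symbols.

```python
from itertools import groupby
from math import sqrt


def runs_test_z(seq, a, b):
    """Return (n1, n2, r, mu_r, sigma_r, z) for a two-symbol sequence."""
    n1, n2 = seq.count(a), seq.count(b)
    r = sum(1 for _ in groupby(seq))   # number of runs
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    sigma = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                 / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return n1, n2, r, mu, sigma, (r - mu) / sigma


queue = ("MWMWMMMWWMWMWMWMMMMWMWMWMM"    # runs 1 to 19
         "WWWMWMWMWMMWMMWMMMMWMWMM")     # runs 20 to 35

n1, n2, r, mu, sigma, z = runs_test_z(queue, "M", "W")
print(n1, n2, r, mu)                 # 30 20 35 25.0
print(round(sigma, 2), round(z, 2))  # 3.36 2.98
print(abs(z) >= 1.96)                # True: reject H0 in a two-tailed test at alpha = 0.05
```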
Median Test
The median test is a statistical procedure for testing whether 2 independent ordered
samples (which may be of different sizes) differ in their central tendencies. In other
words, it indicates whether 2 independent samples are likely to have been drawn from
populations with the same median.
The median test is a non-parametric replacement for the parametric test for comparing the
means of 2 independent samples. The hypothesis to be tested can be stated either as
non-directional (two-tailed test) or directional (one-tailed test). The prerequisite of the
test is measurement on at least an ordinal scale.
Under H0 we would expect about half of each group's scores to be above the combined
median and about half to be below the combined median.
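3. Use of the χ² test
$$ \chi^2 = \frac{N(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)} \sim \chi^2_1 $$
or, with the correction for continuity,
$$ \chi^2 = \frac{N\left(|ad - bc| - N/2\right)^2}{(a+b)(a+c)(b+d)(c+d)} \sim \chi^2_1 $$
where N = a + b + c + d.
4. If the observed (calculated) value is greater than the table value of χ² for a given
level of significance, we reject H0,
i.e. if $\chi^2 > \chi^2_{1,\alpha}$, reject H0.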
➢ The data of 10 plots each under two treatments are given below.
Treatment 1 X: 46, 45, 32, 42, 39, 48, 49, 30, 51, 34
Treatment 2 Y: 44, 40, 59, 47, 55, 50, 47, 71, 43, 55
The hypothesis of equality of median response under the two treatments can be tested by the
median test.
H0: The treatments are equally effective with regard to their median effect.
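Combined ordered sample (1 = treatment 1, 2 = treatment 2):
30 32 34 39 40 42 43 44 45 46 47 47 48 49 50 51 55 55 59 71
 1  1  1  1  2  1  2  2  1  1  2  2  1  1  2  1  2  2  2  2
Combined median = (46 + 47)/2 = 46.5
Treatment 1 has 7 observations below and 3 above the combined median; treatment 2 has
3 below and 7 above, so that ad = 7 × 7 = 49 and bc = 3 × 3 = 9.
$$ \chi^2 = \frac{N\left(|ad - bc| - N/2\right)^2}{(a+b)(a+c)(b+d)(c+d)} = \frac{20\left(|49 - 9| - 20/2\right)^2}{10 \times 10 \times 10 \times 10} = \frac{20\,(40 - 10)^2}{10000} = \frac{20 \times 900}{10000} = 1.8 $$
Table value: $\chi^2_{1,\,0.05} = 3.841$
Since χ² = 1.8 < 3.841, there is no reason to reject H0,
i.e., the treatments are equally effective with regard to their median effect.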
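The counting and the χ² value in this example can be reproduced with a short script. It is a sketch only: the values and group labels are copied from the combined ordered sample shown in the solution, and the layout of a, b, c, d (rows = treatments, columns = below/above the combined median) is an assumption.

```python
from statistics import median

values = [30, 32, 34, 39, 40, 42, 43, 44, 45, 46,
          47, 47, 48, 49, 50, 51, 55, 55, 59, 71]
labels = [1, 1, 1, 1, 2, 1, 2, 2, 1, 1,
          2, 2, 1, 1, 2, 1, 2, 2, 2, 2]

m = median(values)   # combined median


def count(group, above):
    """Number of observations of a treatment above (or below) the median."""
    return sum(1 for v, g in zip(values, labels)
               if g == group and ((v > m) if above else (v < m)))


a, b = count(1, False), count(1, True)   # treatment 1: below, above
c, d = count(2, False), count(2, True)   # treatment 2: below, above
n = a + b + c + d

# Chi-square with the continuity correction, as in the formula above.
chi2 = n * (abs(a * d - b * c) - n / 2) ** 2 / (
    (a + b) * (a + c) * (b + d) * (c + d))

print(m, (a, b, c, d), round(chi2, 2))   # 46.5 (7, 3, 3, 7) 1.8
print(chi2 > 3.841)                      # False: no reason to reject H0
```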
The data consist of observations on n independent bivariate random variables (Xi, Yi),
i = 1, 2, ..., n. The measurement scale for Xi and Yi is nominal with two categories, which
we call 0 and 1; that is, the possible values of (Xi, Yi) are (0,0), (0,1), (1,0), (1,1).
In the McNemar test the data are usually summarized in a 2x2 contingency table as follows.
                                 Classification of Yi
                         Yi = 0                      Yi = 1
Classification  Xi = 0   a (the no. of pairs         b (the no. of pairs
of Xi                    where Xi = 0, Yi = 0)       where Xi = 0, Yi = 1)
                Xi = 1   c (the no. of pairs         d (the no. of pairs
                         where Xi = 1, Yi = 0)       where Xi = 1, Yi = 1)
Assumptions
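$$ T_1 = \frac{(b - c)^2}{b + c} \sim \chi^2_1 $$
(For small samples, the binomial distribution with p = 1/2 and n = b + c is used.)
The hypothesis is
H0: P(Xi = 0, Yi = 1) = P(Xi = 1, Yi = 0) v/s H1: P(Xi = 0, Yi = 1) ≠ P(Xi = 1, Yi = 0),
which is equivalent to
H0: P(Xi = 0) = P(Yi = 0)
v/s H1: P(Xi = 0) ≠ P(Yi = 0),
which is equivalent to
H0: P(Xi = 1) = P(Yi = 1)
v/s H1: P(Xi = 1) ≠ P(Yi = 1).
Let n = b + c. If n ≤ 20, use Table A3 (binomial distribution with p = 1/2) at the desired
level of significance α to decide whether to reject H0; otherwise accept it.
If n > 20, use T1 and Table A2: reject H0 at level of significance α if T1 exceeds the
(1 − α) quantile of a χ² random variable with 1 degree of freedom.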
➢ Prior to a nationally televised debate between the two presidential candidates, a random
sample of 100 persons stated their choice of candidate as follows. Eighty-four
persons favored the Democratic candidate and the remaining 16 favored the
Republican. After the debate the same 100 people expressed their preference again.
Of the persons who formerly favored the Democrat, exactly one fourth
changed their minds, and also one fourth of the people formerly favoring the
Republican switched to the Democratic side. The results are summarized in the
following 2x2 contingency table.
                       After: Democrat    After: Republican
Before: Democrat       63                 21
Before: Republican     4                  12
H0: There has been no change in the proportion of all voters who favor the Democrat
v/s
H1: There has been a change in the proportion of all voters who favor the Democrat.
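$$ T_1 = \frac{(b - c)^2}{b + c} = \frac{(21 - 4)^2}{21 + 4} = \frac{289}{25} = 11.56 $$
α = 0.05
$\chi^2_{1,\,0.05} = 3.841$
Since T1 = 11.56 > 3.841, H0 is rejected: there has been a change in the proportion of all
voters who favor the Democrat.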
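The McNemar statistic for this example depends only on the two off-diagonal counts, b = 21 and c = 4. The sketch below computes the statistic and, as an assumed small-sample alternative, a two-sided exact binomial probability with p = 1/2; both function names are hypothetical.

```python
from math import comb


def mcnemar_statistic(b, c):
    """Large-sample McNemar statistic (b - c)^2 / (b + c)."""
    return (b - c) ** 2 / (b + c)


def mcnemar_exact_p(b, c):
    """Two-sided exact p-value from Binomial(n = b + c, p = 1/2).

    Assumption: the smaller tail is doubled and the result capped at 1.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


b, c = 21, 4
t = mcnemar_statistic(b, c)
print(t)              # 11.56
print(t > 3.841)      # True: reject H0 at alpha = 0.05
print(mcnemar_exact_p(b, c) < 0.05)
```

Since 289/25 = 11.56 exceeds the tabulated value 3.841, H0 is rejected at α = 0.05; the exact binomial calculation points the same way.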