Non-Parametric Method: Advantages

Uploaded by

akshayamennuzz25
Non-Parametric Methods
Most of the statistical tests that we have discussed so far have the following features in common.

(1) The form of the frequency function of the parent population from which the samples have been drawn is assumed to be known.

(2) They are concerned with testing statistical hypotheses about the parameters of this frequency function or with estimating its parameters.

For example, almost all small-sample tests of significance are based on the fundamental assumption that the parent population is normal, and they are concerned with testing or estimating the means and variances of the population. Tests that deal with the parameters of the population are known as parametric tests.

A non-parametric (N.P.) test is a test that does not depend on the particular form of the basic frequency function from which the samples are drawn. In other words, a non-parametric test does not make any assumption regarding the form of the population.

Advantages and Drawbacks of Non-Parametric Tests

Advantages

• NP methods are simple and easy to apply and do not require complicated sampling theory.
• No assumption is made about the form of the frequency function of the parent population from which the sampling is done.
• NP methods can be applied to data which are measured on a nominal scale.
• Psychometric data are often not normally distributed, so non-parametric tests have found applications in psychometry.
• Non-parametric tests are available to deal with data which are given in ranks.

Drawbacks

• NP tests can be used only if the measurements are nominal or ordinal.
• NP tests are designed to test statistical hypotheses only and not for estimating the parameters.

Sign Test
Sign tests are of two types:

1) The one-sample sign test.

2) The paired-sample sign test.

The One-Sample Sign Test

Procedure 1

In a one-sample sign test we test the null hypothesis H0: θ = θ0 (a specified value) in a population known to be symmetrical about the ordinate at x = θ0. Let the observed sample be x1, x2, ..., xn. Give a plus (+) sign to every observation which exceeds θ0 and a minus (-) sign to every observation which is less than θ0, and discard the sample values exactly equal to θ0 (zero differences). Then test the null hypothesis that these plus and minus signs are values of a random variable having the binomial distribution with p = 1/2:

$P(X = x) = \binom{n}{x} \left(\tfrac{1}{2}\right)^{n}, \quad x = 0, 1, 2, \ldots, n$

Procedure 2

Since the zero differences are neither plus nor minus, they are excluded from N (the total number of observations) as well as from the two categories of +ve and -ve signs. Let n observations be left:

n = N if there are no zeroes

n = N - r if there are r zeroes

Let X be the number of positive signs. X follows the binomial distribution with parameters n and p, so the distribution of X is

$f(x) = P(X = x) = \begin{cases} \binom{n}{x} p^{x} (1-p)^{n-x}, & x = 0, 1, \ldots, n; \; 0 < p < 1; \; p + q = 1 \\ 0, & \text{otherwise} \end{cases}$

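Here x denotes the observed number of plus signs, and all probabilities are computed under H0.

i. For testing H0: p = 1/2 v/s H1: p > 1/2

find p′ = P(X ≥ x).

If p′ > α, accept H0.

If p′ ≤ α, reject H0.

ii. For testing H0: p = 1/2 v/s H1: p < 1/2

find p′ = P(X ≤ x).

If p′ > α, accept H0.

If p′ ≤ α, reject H0.

iii. For testing H0: p = 1/2 v/s H1: p ≠ 1/2

find p′ = P(X ≥ x) if x > n/2, or p′ = P(X ≤ x) if x < n/2 (the smaller tail probability).

If 2p′ > α, accept H0.

If 2p′ ≤ α, reject H0.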
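
The three decision rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `binom_tail_probs` and `sign_test_p` and the counts used (15 plus signs out of n = 20) are hypothetical and not part of the notes.

```python
from math import comb

def binom_tail_probs(x, n):
    """Return (P(X >= x), P(X <= x)) for X ~ Binomial(n, 1/2)."""
    total = 2 ** n
    upper = sum(comb(n, k) for k in range(x, n + 1)) / total
    lower = sum(comb(n, k) for k in range(0, x + 1)) / total
    return upper, lower

def sign_test_p(x, n, alternative):
    """p' for cases i ('greater'), ii ('less') and 2p' for case iii ('two-sided')."""
    upper, lower = binom_tail_probs(x, n)
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        # double the smaller tail, capped at 1
        return min(1.0, 2 * min(upper, lower))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Hypothetical illustration: 15 plus signs among n = 20 non-zero differences.
alpha = 0.05
p_prime = sign_test_p(15, 20, "greater")
decision = "reject H0" if p_prime <= alpha else "accept H0"
print(round(p_prime, 4), decision)
```

The exact tail sums use integer binomial coefficients, so no statistical library is needed for small n.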

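For moderately large n (n ≥ 20), the normal approximation with a continuity correction is used:

$Z = \dfrac{(x + 0.5) - \frac{n}{2}}{\sqrt{n/4}} \sim N(0, 1), \quad \text{if } x < \dfrac{n}{2}$

$Z = \dfrac{(x - 0.5) - \frac{n}{2}}{\sqrt{n/4}} \sim N(0, 1), \quad \text{if } x > \dfrac{n}{2}$

If |Z| ≥ Zα/2, reject H0, where Zα/2 is obtained from the standard normal tables such that

P(|Z| ≥ Zα/2) = α,

and if |Z| < Zα/2, accept H0.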



➢ To test the claim that the median age of the mathematics faculty of state community colleges is at least 42 years, the ages of a random sample of 32 mathematics faculty members were recorded:

56, 62, 61, 54, 52, 32, 24, 35, 50, 42, 52, 49, 26, 31, 31, 54, 38, 36, 45, 53, 37, 40, 38, 31,
29, 25, 45, 52, 48, 39, 30, 38

Use the sign test at the 0.05 level of significance.

We have to test H0: θ = 42 v/s H1: θ < 42 (the claim is that θ is at least 42).

x x- θ0 (θ0=42) Sign
56 14 +
62 20 +
61 19 +
54 12 +
52 10 +
32 -10 -
24 -18 -
35 -7 -
50 8 +
42 0 0
52 10 +
49 7 +
26 -16 -
31 -11 -
31 -11 -
54 12 +
38 -4 -
36 -6 -
45 3 +
53 11 +
37 -5 -
40 -2 -
38 -4 -
31 -11 -
29 -13 -
25 -17 -
45 3 +
52 10 +

48 6 +
39 -3 -
30 -12 -
38 -4 -

Using Procedure 1

Number of +ve signs = 14

Number of -ve signs = 17

Number of zeros = 1

N = 32
n = N - r = 32 - 1 = 31

X = number of times the less frequent sign occurs = 14 (here the less frequent sign is +ve)

$K = \dfrac{n-1}{2} - 0.98\sqrt{n} = \dfrac{31-1}{2} - 0.98\sqrt{31} = 15 - 0.98\sqrt{31} = 9.54$

Here X = 14 and K = 9.54.
X > K, therefore there is no reason to reject H0 at α = 0.05,
i.e. the data are consistent with a median age of 42.
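
As a check on the counting and on the value of K, the calculation can be reproduced with a short Python sketch. The variable names are illustrative, and the rule "reject H0 when X ≤ K" is assumed here as the complement of the rule used above.

```python
from math import sqrt

ages = [56, 62, 61, 54, 52, 32, 24, 35, 50, 42, 52, 49, 26, 31, 31, 54,
        38, 36, 45, 53, 37, 40, 38, 31, 29, 25, 45, 52, 48, 39, 30, 38]
theta0 = 42

plus = sum(1 for a in ages if a > theta0)    # observations above theta0
minus = sum(1 for a in ages if a < theta0)   # observations below theta0
zeros = sum(1 for a in ages if a == theta0)  # discarded observations

n = plus + minus                   # effective sample size, n = N - r
x = min(plus, minus)               # count of the less frequent sign
k = (n - 1) / 2 - 0.98 * sqrt(n)   # critical value K at alpha = 0.05

# Assumed decision rule: reject H0 only when X <= K.
reject = x <= k
print(plus, minus, zeros, n, round(k, 2), reject)
```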

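Using Procedure 2

(Applicable since n ≥ 20; here n = 31.)

Test statistic, used when X < n/2:

$Z = \dfrac{(X + 0.5) - \frac{n}{2}}{\sqrt{n/4}}$

X = number of positive signs = 14

n/2 = 31/2 = 15.5

Here X < n/2, so

$Z = \dfrac{(14 + 0.5) - 15.5}{\sqrt{31/4}} = \dfrac{14.5 - 15.5}{\sqrt{7.75}} = -0.36$

α = 0.05
Zα = 1.65
Z = -0.36 > -Zα = -1.65
No reason to reject H0 at α = 0.05,
i.e. the data are consistent with a median age of 42.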
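
The continuity-corrected statistic can be evaluated with a small helper. This is a sketch only: the name `sign_test_z` is hypothetical, and returning 0 when x equals n/2 exactly is an assumption, since the notes define Z only for x < n/2 and x > n/2.

```python
from math import sqrt

def sign_test_z(x, n):
    """Normal approximation to the sign test with continuity correction."""
    half = n / 2
    if x < half:
        numerator = (x + 0.5) - half
    elif x > half:
        numerator = (x - 0.5) - half
    else:
        numerator = 0.0  # assumption: case x == n/2 is not covered in the notes
    return numerator / sqrt(n / 4)

z = sign_test_z(14, 31)   # 14 plus signs among n = 31
z_alpha = 1.65            # one-tailed critical value at alpha = 0.05 (from the notes)
reject = z < -z_alpha     # lower-tailed alternative H1: theta < 42
print(round(z, 2), reject)
```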

The Paired-Sample Sign Test

The sign test is most often employed for observations that have been randomly selected in pairs, using a paired-difference experiment. It has important applications in problems involving paired data, such as pre-test and post-test scores, or the responses of mothers and daughters towards ideal family life. In these problems each pair of sample values is replaced by a plus sign if the first value is greater than the second and by a minus sign if the first value is smaller than the second, and the pair is discarded if the two values are equal. We then test the null hypothesis that the plus and minus signs are values of a random variable having the binomial distribution with p = 1/2.

Assumptions of Sign Test

The assumptions underlying the sign test are

• The differences are continuously distributed.


• The differences are independent of each other.

➢ 10 women are randomly selected and their weights before and after a particular diet are recorded.

Weight
Xi before diet: 180 178 165 200 160 145 170 210 185 155

Yi after diet: 174 181 157 198 152 152 160 205 178 160

Use the sign test at α = 0.05 to test the claim that the new weight-loss diet is effective.

H0: θ = θ0 v/s H1: θ > θ0, with θ0 = 0, where θ is the median of the differences Xi - Yi.


Xi Yi Sign of Xi - Yi
180 174 +
178 181 -
165 157 +
200 198 +
160 152 +
145 152 -
170 160 +
210 205 +
185 178 +
155 160 -
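n = 10 (n < 20, small sample)
Number of +ve signs = 7
Number of -ve signs = 3

X = number of times the less frequent sign occurs,

i.e. X = min{number of +ve signs, number of -ve signs}

= min{7, 3}
= 3

$K = \dfrac{n-1}{2} - 0.98\sqrt{n} = \dfrac{9}{2} - 0.98\sqrt{10} = 1.4$

X > K, therefore there is no reason to reject H0 at α = 0.05; the sample does not establish that the diet is effective.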
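
The same conclusion can be cross-checked with the exact binomial tail probability instead of the K rule. The sketch below is illustrative only; the variable names are hypothetical, and the sixth after-diet weight is taken as 152, as in the data listing.

```python
from math import comb, sqrt

before = [180, 178, 165, 200, 160, 145, 170, 210, 185, 155]
after = [174, 181, 157, 198, 152, 152, 160, 205, 178, 160]  # sixth value as listed

# Signs of Xi - Yi, with ties discarded.
diffs = [x - y for x, y in zip(before, after) if x != y]
plus = sum(d > 0 for d in diffs)
minus = sum(d < 0 for d in diffs)
n = plus + minus

x = min(plus, minus)              # less frequent sign count
k = (n - 1) / 2 - 0.98 * sqrt(n)  # K rule used in the notes

# Exact upper-tail probability P(X >= plus) under p = 1/2.
p_exact = sum(comb(n, j) for j in range(plus, n + 1)) / 2 ** n
print(plus, minus, round(k, 2), round(p_exact, 4))
```

Both routes agree: X exceeds K, and the exact tail probability is well above 0.05.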

Wilcoxon's Matched Pairs Signed Rank test

Let (X1, Y1), (X2, Y2), ..., (Xn, Yn) be a sample of size n from a bivariate population. Consider Zi = Xi - Yi (i = 1, 2, ..., n) as a random sample from the population of Z, whose distribution is absolutely continuous and symmetric about its median.

We want to test

H0: Median (z) =0

v/s

H1: Median (z) <0

or

H1: Median (z) >0

or

H1: Median (z) ≠0

Now we arrange |Zi| = |Xi - Yi| in increasing order of magnitude and rank these absolute values as 1, 2, ..., n.

Let T+ be the sum of the ranks of the positive Zi values and T- be the sum of the ranks of the negative Zi values. Clearly,

$0 \le T^{+} \le \dfrac{n(n+1)}{2}$

with

$T^{+} = \begin{cases} 0, & \text{when all } Z_i\text{'s are negative} \\ \dfrac{n(n+1)}{2}, & \text{when all } Z_i\text{'s are positive} \end{cases}$

The statistic T+ (or T-) is known as Wilcoxon's statistic and is given by

$T^{+} = \sum_{i=1}^{n} i \, Z_{(i)}$

where

$Z_{(i)} = \begin{cases} 1, & \text{if the observation whose absolute value has rank } i \text{ is positive} \\ 0, & \text{otherwise} \end{cases}$

The Z(i)'s are independent Bernoulli variables but not necessarily identically distributed.

Null hypothesis H0     Alternative H1       Rejection region
Median (Z) = 0         Median (Z) > 0       T+ > C1
Median (Z) = 0         Median (Z) < 0       T+ < C2
Median (Z) = 0         Median (Z) ≠ 0       T+ > C3 or T+ < C4

where the Ci's are determined from the table such that

P(Type I error) = α

➢ Two measuring devices take readings on each of 10 test units. Let X and Y be the readings on a test unit by the first and second devices respectively. The data are as follows.

Test unit   1    2    3    4    5    6    7    8    9    10
X           71   108  72   140  61   97   90   127  101  114
Y           77   105  71   152  88   107  93   130  112  105

Use Wilcoxon's test to test

H0: Median (Z) = 0 v/s H1: Median (Z) ≠ 0

(Given C3 = 46 and C4 = 9)

$C_4 = \dfrac{n(n+1)}{2} - C_3 = \dfrac{10 \times 11}{2} - 46 = 55 - 46 = 9$

Z i = Xi – Yi |Z| Rank |Z|


71-77=-6 6 5
108-105=3 3 3
72-71=1 1 1
140-152=-12 12 9
61-88=-27 27 10
97-107=-10 10 7
90-93=-3 3 3
127-130=-3 3 3
101-112=-11 11 8
114-105=9 9 6

T+ = Sum of rank of positive Zi ‘s

=3+1+6

= 10

C3 =46 C4 =9

C4 < T+ < C3 (9<10<46)

T+ lies between 9 and 46

Therefore, no reason to reject H0 at α = 5%
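
The ranking step, including the average rank given to the three tied values |Z| = 3, can be sketched in Python as follows. The helper names `average_ranks` and `wilcoxon_t_plus` are hypothetical, and dropping zero differences is an assumption (none occur in this data set).

```python
def average_ranks(values):
    """Rank values in increasing order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the block of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wilcoxon_t_plus(x, y):
    """Sum of the ranks of |Zi| over the positive differences Zi = Xi - Yi."""
    z = [a - b for a, b in zip(x, y) if a != b]  # assumption: zeros are dropped
    ranks = average_ranks([abs(v) for v in z])
    return sum(r for v, r in zip(z, ranks) if v > 0)

x = [71, 108, 72, 140, 61, 97, 90, 127, 101, 114]
y = [77, 105, 71, 152, 88, 107, 93, 130, 112, 105]

t_plus = wilcoxon_t_plus(x, y)
c3 = 46                           # upper critical value given in the example
c4 = len(x) * (len(x) + 1) // 2 - c3
reject = t_plus > c3 or t_plus < c4
print(t_plus, c4, reject)
```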

Wald-Wolfowitz Run Test

Non-parametric tests like the sign test and the median test are meant to detect a particular kind of difference between two samples or sets of observations, while the runs test may be used to detect any difference existing in a single sample (with regard to the occurrence of events) or between two independent samples drawn from the population.

Run

A run is defined as a sequence of letters of one kind surrounded by sequences of letters of the other kind, and the number of elements in a run is usually referred to as the length of the run.

If both samples come from the same population, then there would be a thorough mingling of the xi's and yi's, and consequently the number of runs in the combined sample would be large.

On the other hand, if the samples come from two different populations whose ranges do not overlap, then there would be only 2 runs, of the type x1, x2, ..., xn1 and y1, y2, ..., yn2. Generally, any difference in mean or variance would tend to reduce the number of runs.

The hypotheses to be tested through the runs test in the single-sample and the two-independent-samples cases may be stated as

1) The sequence of events in a single sample occurs in a random order. For example:

a) When a coin is tossed a number of times, the heads and tails occur in a random order, i.e. the outcome of each toss is unpredictable.

b) In a class, the achievement scores of male and female students occur in a random order.

2) Two independent samples are drawn from identical populations, i.e. there is no difference between the two samples in their central tendencies, variability and skewness.

Procedure

For testing H0: f1(x) = f2(x), i.e. that the samples have come from the same population, we count the number of runs r. If the computed value of r falls between the lower and upper critical values read from the tables, it cannot be regarded as significant, and in this case we accept H0. But if it is equal to or beyond one of these critical values, we reject H0.

In the case of a one-tailed test, where the direction of the departure from randomness is predicted, only one of the tables needs to be consulted. If the prediction is that too few runs will be observed, refer to table O1: if r is equal to or smaller than the value read from the table, reject H0. If the prediction is that too many runs will be observed, consult table O2. Similarly, when we have observations recorded from two independent samples drawn from the populations, we make use of one table for finding the critical value of r at the desired significance level.

In the case of large samples, suppose there are N1 x's and N2 y's; under H0 all arrangements of the x's and y's are equally likely. Here the tables employed for small samples are not applicable. As the sample size increases, the sampling distribution of r approaches the normal distribution, and therefore the value of Z is used for rejecting or accepting the hypothesis:

$Z = \dfrac{r - \mu_r}{\sigma_r}$

where

$\mu_r$ = mean of the distribution of r $= \dfrac{2 N_1 N_2}{N_1 + N_2} + 1$

$\sigma_r$ = standard deviation of the distribution of r $= \sqrt{\dfrac{2 N_1 N_2 (2 N_1 N_2 - N_1 - N_2)}{(N_1 + N_2)^2 (N_1 + N_2 - 1)}}$

r = total number of runs

The test procedure is summarized as

Null hypothesis        Alternative            Critical region (CR)
H0: f1(x) = f2(x)      H1: f1(x) < f2(x)      Z < -Zα
H0: f1(x) = f2(x)      H1: f1(x) > f2(x)      Z > Zα
H0: f1(x) = f2(x)      H1: f1(x) ≠ f2(x)      |Z| > Zα/2

where Zα and Zα/2 are obtained from the standard normal table for the α level of significance.

➢ A coin was tossed 20 times and the results obtained are

H T H H T H H H T H H T T T H T H T H H

Test the hypothesis that the coin was not erratic.

We have to test H0: heads and tails occur randomly and the coin is not erratic (unbiased).

Marking the runs:

H    T    HH   T    HHH  T    HH   TTT  H    T    H    T    HH
1    2    3    4    5    6    7    8    9    10   11   12   13

r = total number of runs = 13

N1 = number of occurrences of the first event = number of heads = 12

N2 = number of occurrences of the second event = number of tails = 8

N = N1 + N2 = 12 + 8 = 20

α = 0.05

With N1 = 12 and N2 = 8, from table O1 we get r = 6 (lower critical value).

With N1 = 12 and N2 = 8, from table O2 we get r = 16 (upper critical value).

The total number of runs, 13, lies within the critical values 6 and 16. Hence there is no reason to reject H0, and we conclude that the coin is not erratic.
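
Counting runs by hand is error-prone, so a minimal Python sketch of the count for this toss sequence may be useful. The function name `count_runs` is hypothetical; the critical values 6 and 16 are the ones quoted from the tables above.

```python
from itertools import groupby

def count_runs(seq):
    """Number of maximal blocks of identical consecutive symbols."""
    return sum(1 for _ in groupby(seq))

tosses = "HTHHTHHHTHHTTTHTHTHH"  # the 20 tosses, in order

r = count_runs(tosses)
n1 = tosses.count("H")  # number of heads
n2 = tosses.count("T")  # number of tails

lower, upper = 6, 16    # critical values quoted in the example
reject = r <= lower or r >= upper
print(r, n1, n2, reject)
```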

Two Independent Small Samples

10 boys and 10 girls of class XI, selected from a boys' higher secondary school and a girls' higher secondary school respectively, were examined in terms of their attitude towards population education. Their scores on the attitude scale are shown in the following table. Test the hypothesis that boys and girls do not differ in terms of their attitude towards population education.

Attitude scores of boys and girls

Boys: 15 6 19 12 4 20 5 18 10 7
Girls: 17 9 15 13 3 8 14 11 2 16

Arranging the combined scores in ascending order and marking each score B (boy) or G (girl):

2  3  4  5  6  7  8  9  10 11 12 13 14 15 15 16 17 18 19 20
G  G  B  B  B  B  G  G  B  G  B  G  G  B  G  G  G  B  B  B

Marking the runs:

GG   BBBB GG   B    G    B    GG   B    GGG  BBB
1    2    3    4    5    6    7    8    9    10

r = total number of runs = 10 (small sample)

N1 = number of occurrences of the first event = number of girls = 10

N2 = number of occurrences of the second event = number of boys = 10

N = N1 + N2 = 10 + 10 = 20

For N1 = 10 and N2 = 10, referring to table O1 we get r = 6 at the 5% level.

Computed value of r = 10 > 6

There is no reason to reject H0 at α = 0.05, i.e. boys and girls do not differ in terms of their attitude towards population education.
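
The pooling and labelling step can be sketched in Python. Two assumptions are made and should be read as such: the tenth score in each group (7 for the boys, 16 for the girls) is inferred from the ordered arrangement in the solution, and the tie at 15 is broken by placing the boy first, as that arrangement does.

```python
from itertools import groupby

# Assumption: the last value in each list is inferred from the ordered arrangement.
boys = [15, 6, 19, 12, 4, 20, 5, 18, 10, 7]
girls = [17, 9, 15, 13, 3, 8, 14, 11, 2, 16]

# Pool the scores with a group label; sorting tuples breaks the tie at 15
# alphabetically, which puts "B" before "G".
pooled = sorted([(v, "B") for v in boys] + [(v, "G") for v in girls])
labels = "".join(label for _, label in pooled)

r = sum(1 for _ in groupby(labels))  # number of runs in the label sequence
critical = 6                         # table value quoted for N1 = N2 = 10
reject = r <= critical               # too few runs would indicate a difference
print(labels, r, reject)
```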

➢ At a railway reservation window there was a long queue of men (M) and women (W) standing in the order in which they had come, as shown below. Test whether or not they were standing in a random order.

MWMWMMMWWMWMWMWMMMMWMWMWMMWWWMW
MWMWMMWMMWMMMMWMWMM

We have to test H0: they are standing in a random order v/s H1: they are not standing in a random order (i.e., a two-tailed test).

Marking the runs:

M    W    M    W    MMM  WW   M    W    M    W    M    W    MMMM W    M    W    M    W    MM
1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19

WWW  M    W    M    W    M    W    MM   W    MM   W    MMMM W    M    W    MM
20   21   22   23   24   25   26   27   28   29   30   31   32   33   34   35

Total number of runs r = 35 (large sample)

N1 = number of occurrences of the first event = number of men = 30

N2 = number of occurrences of the second event = number of women = 20

$\mu_r = \dfrac{2 N_1 N_2}{N_1 + N_2} + 1 = \dfrac{2 \times 30 \times 20}{30 + 20} + 1 = 24 + 1 = 25$

$\sigma_r = \sqrt{\dfrac{2 N_1 N_2 (2 N_1 N_2 - N_1 - N_2)}{(N_1 + N_2)^2 (N_1 + N_2 - 1)}} = \sqrt{\dfrac{2 \times 30 \times 20 \, (2 \times 30 \times 20 - 30 - 20)}{(30 + 20)^2 (30 + 20 - 1)}} = 3.36$

$Z = \dfrac{r - \mu_r}{\sigma_r} = \dfrac{35 - 25}{3.36} = 2.98$

α = 0.05

Zα/2 = 1.96

|Z| > Zα/2

Therefore, reject H0 at α = 0.05.

They are not standing in a random order.
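
The large-sample calculation can be checked numerically. The sketch below rebuilds the queue from the 35 marked runs of the solution (30 men and 20 women); the function name `runs_z` is hypothetical.

```python
from itertools import groupby
from math import sqrt

def runs_z(seq, first, second):
    """Return (r, N1, N2, mu_r, sigma_r, Z) for a two-symbol sequence."""
    n1, n2 = seq.count(first), seq.count(second)
    r = sum(1 for _ in groupby(seq))
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    sigma = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                 / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return r, n1, n2, mu, sigma, (r - mu) / sigma

# Queue written out from the marked runs: runs 1-19, then runs 20-35.
queue = "MWMWMMMWWMWMWMWMMMMWMWMWMM" + "WWWMWMWMWMMWMMWMMMMWMWMM"

r, n1, n2, mu, sigma, z = runs_z(queue, "M", "W")
reject = abs(z) > 1.96  # two-tailed test at alpha = 0.05
print(r, n1, n2, mu, round(sigma, 2), round(z, 2), reject)
```

Carrying more decimal places for the standard deviation gives Z close to 2.98, well beyond 1.96.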

Median Test

The median test is a statistical procedure for testing whether two independent ordered samples (possibly of different sizes) differ in their central tendencies. In other words, it indicates whether two independent samples are likely to have been drawn from populations with the same median.

The median test is a non-parametric replacement for the parametric test for comparing the means of two independent samples. The hypothesis to be tested can be stated in either non-directional (two-tailed) or directional (one-tailed) form. The prerequisite of the test is measurement on at least an ordinal scale.

The computation requires the following steps.

1. Determine a common median for both groups combined.

2. Dichotomize both sets of scores separately at the common median and put the data in a 2x2 contingency table as given below.

                               Group I    Group II    Total
No. of scores above median     a          b           a + b
No. of scores below median     c          d           c + d
Total                          a + c      b + d       N = a + b + c + d

Under H0 we would expect about half of each group's scores to be above the combined median and about half to be below it.

3. Use the χ² test:

$\chi^2 = \dfrac{N(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)} \sim \chi^2_1$

If any cell frequency is less than 5, we use the continuity-corrected form

$\chi^2 = \dfrac{N\left(|ad - bc| - N/2\right)^2}{(a+b)(a+c)(b+d)(c+d)} \sim \chi^2_1$

4. If the calculated value is greater than the table value of χ² for the α level of significance, we reject H0,

i.e., if $\chi^2 > \chi^2_{1,\alpha}$, reject H0.

5. If some scores fall at the combined median, then either

a) drop them from the analysis if they are only a few and n1 + n2 is large, or
b) dichotomize the groups into the scores which exceed the median and those which do not.

➢ The data of 10 plots each under two treatments are given below.

Treatment 1 X: 46, 45, 32, 42, 39, 48, 49, 30, 51, 34

Treatment 2 Y: 44, 40, 59, 47, 55, 50, 47, 71, 43, 55

The hypothesis of equality of the median responses under the two treatments can be tested by the median test.

H0: The treatments are equally effective with regard to their median effect.

Arrange the combined data in ascending order, marking each value with its treatment (1 or 2):

30 32 34 39 40 42 43 44 45 46 47 47 48 49 50 51 55 55 59 71
1  1  1  1  2  1  2  2  1  1  2  2  1  1  2  1  2  2  2  2

n = 20 (even)

Median = average of the (n/2)th and ((n/2) + 1)th items

= average of the 10th and 11th items

$= \dfrac{46 + 47}{2} = 46.5$

                               Group I    Group II    Total
No. of scores above median     a = 3      b = 7       10
No. of scores below median     c = 7      d = 3       10
Total                          10         10          20

Since some cell frequencies are less than 5, the continuity-corrected form is used:

$\chi^2 = \dfrac{N\left(|ad - bc| - N/2\right)^2}{(a+b)(a+c)(b+d)(c+d)} = \dfrac{20\left(|9 - 49| - 20/2\right)^2}{10 \times 10 \times 10 \times 10} = \dfrac{20 (40 - 10)^2}{10000} = \dfrac{20 \times 900}{10000} = 1.8$

Table value: $\chi^2_{1, 0.05} = 3.841$

Calculated χ² < table value of χ²

Therefore, there is no reason to reject H0 at α = 0.05,

i.e. the treatments are equally effective with regard to their median effect.
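
The χ² value can be verified with a short function that implements both forms of the 2x2 statistic. This is a sketch: the name `median_test_chi2` is hypothetical, and clamping the corrected difference at zero is an added safeguard that is not in the notes. The cell counts 3 and 7 are those of this example; the value does not depend on which group is listed first.

```python
def median_test_chi2(a, b, c, d, yates=True):
    """Chi-square statistic for a 2x2 table, optionally continuity-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # subtract N/2; clamp at zero (assumption, not stated in the notes)
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (a + c) * (b + d) * (c + d))

chi2 = median_test_chi2(3, 7, 7, 3)                       # corrected form
chi2_swapped = median_test_chi2(7, 3, 3, 7)               # groups interchanged
chi2_plain = median_test_chi2(3, 7, 7, 3, yates=False)    # uncorrected form

table_value = 3.841   # chi-square, 1 d.f., alpha = 0.05
reject = chi2 > table_value
print(round(chi2, 2), round(chi2_plain, 2), reject)
```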

McNemar Test for Significance of Changes

The data consist of observations on n independent bivariate random variables (Xi, Yi), i = 1, 2, ..., n. The measurement scale for Xi and Yi is nominal with two categories, which we call 0 and 1; that is, the possible values of (Xi, Yi) are (0, 0), (0, 1), (1, 0), (1, 1). In the McNemar test the data are usually summarized in a 2x2 contingency table as follows.

                               Classification of the Yi
                               Yi = 0                    Yi = 1
Classification    Xi = 0       a (the no. of pairs       b (the no. of pairs
of the Xi                      where Xi = 0, Yi = 0)     where Xi = 0, Yi = 1)
                  Xi = 1       c (the no. of pairs       d (the no. of pairs
                               where Xi = 1, Yi = 0)     where Xi = 1, Yi = 1)

Assumptions

1) The pairs (Xi, Yi) are mutually independent.
2) The measurement scale is nominal with two categories for all Xi and Yi.
3) The difference P(Xi = 0, Yi = 1) - P(Xi = 1, Yi = 0) is negative for all i, or zero for all i, or positive for all i.

The test statistic for the McNemar test is usually written as

$T_1 = \dfrac{(b - c)^2}{b + c} \sim \chi^2_1$

However, for b + c ≤ 20 the following test statistic is preferred:

$T_2 = b \sim B\left(b + c, \tfrac{1}{2}\right)$

(binomial distribution with p = 1/2, n = b + c)

The hypotheses are

H0: P(Xi = 0, Yi = 1) = P(Xi = 1, Yi = 0), for all i

which is equivalent to

H0: P(Xi = 0) = P(Yi = 0), for all i

v/s

H1: P(Xi = 0) ≠ P(Yi = 0), for all i

which is equivalent to

H0: P(Xi = 1) = P(Yi = 1), for all i

v/s

H1: P(Xi = 1) ≠ P(Yi = 1), for all i

Let n = b + c. If n ≤ 20, use T2 and Table A3 (binomial distribution with parameters n and p = 1/2): if α is the tail probability read from the table, reject H0 when T2 falls in either tail, at a level of significance of 2α; otherwise accept it.

If n > 20, use T1 and Table A2. Reject H0 at level of significance α if T1 exceeds the (1 - α) quantile of a χ² random variable with 1 degree of freedom.

➢ Prior to a nationally televised debate between the two presidential candidates, a random sample of 100 persons stated their choice of candidate as follows. Eighty-four persons favored the Democratic candidate and the remaining 16 favored the Republican. After the debate the same 100 people expressed their preference again. Of the persons who formerly favored the Democrat, exactly one fourth changed their minds, and also one fourth of the people formerly favoring the Republican switched to the Democratic side. The results are summarized in the following 2x2 contingency table.

                         After the debate
Before the debate   Democrat   Republican   Total
Democrat            63         21           84
Republican          4          12           16

H0: The population voting alignment was not altered by the debate.

v/s

H1: There has been a change in the proportion of all voters who favor the Democrat.

Since n = b + c = 25 > 20, we use T1:

$T_1 = \dfrac{(b - c)^2}{b + c} = \dfrac{(21 - 4)^2}{21 + 4} = \dfrac{289}{25} = 11.56$

α = 0.05

$\chi^2_{1, 0.05} = 3.841$

T1 > 3.841, so reject H0 at α = 0.05.
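
The statistic can be recomputed, and cross-checked against an exact binomial tail, with the following Python sketch. The function names are hypothetical, and doubling the smaller binomial tail is one common way to form a two-sided exact p-value rather than something prescribed in these notes.

```python
from math import comb

def mcnemar_t1(b, c):
    """Large-sample statistic T1 = (b - c)^2 / (b + c)."""
    return (b - c) ** 2 / (b + c)

def mcnemar_exact_p(b, c):
    """Two-sided exact p-value from B(b + c, 1/2): double the smaller tail."""
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

b, c = 21, 4                 # Democrat -> Republican, Republican -> Democrat
t1 = mcnemar_t1(b, c)
p_exact = mcnemar_exact_p(b, c)

table_value = 3.841          # chi-square, 1 d.f., alpha = 0.05
reject = t1 > table_value
print(round(t1, 2), reject, p_exact < 0.05)
```

Only the discordant cells b and c enter the calculation; the 63 and 12 voters who did not change sides carry no information about a shift.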
