NONPARAMETRIC TESTS IN R

B N Mandal
I.A.S.R.I., Library Avenue, New Delhi – 110 012
bnmandal@iasri.res.in

Introduction
Nonparametric or distribution-free tests are so called because the assumptions underlying their
use are “fewer and weaker than those associated with parametric tests” (Siegel & Castellan,
1988, p. 34). To put it another way, nonparametric tests require fewer assumptions about the
shapes of the underlying population distributions. For this reason, they are often used in place of
parametric tests when the assumptions of the parametric test appear to be too grossly
violated (e.g., if the distributions are too severely skewed). The purpose of this note is to
demonstrate how R can be used to perform nonparametric tests.

Sign Test
The sign test is one of the simplest nonparametric tests. It is used with two repeated (or
correlated) measures (see the example below), and measurement is assumed to be at least
ordinal. For each subject, subtract the 2nd score from the 1st, and write down the sign of the
difference. (That is, write “-” if the difference is negative, and “+” if it is positive.) The
usual null hypothesis for this test is that there is no difference between the two treatments. If this
is so, then the number of + signs (or - signs, for that matter) should have a binomial distribution
with p = .5 and N = the number of subjects. In other words, the sign test is just a binomial test
with + and - in place of Head and Tail (or Success and Failure), i.e., a sign test is used to decide
whether success and failure are equally likely in a binomial distribution.
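
The procedure just described can be sketched in a few lines of R. This is only an illustration: the paired scores below are hypothetical values invented for the sketch, and dropping zero differences is a common convention assumed here rather than something stated in the text above.

```r
# Hypothetical paired scores for 10 subjects (illustrative values only)
first  <- c(12, 15,  9, 14, 11, 13, 10, 16, 8, 12)
second <- c(10, 15, 11, 12,  9, 12, 11, 13, 7, 10)

d <- first - second      # 1st score minus 2nd score
d <- d[d != 0]           # assumption: zero differences carry no sign and are dropped

n_plus  <- sum(d > 0)    # number of "+" signs
n_minus <- sum(d < 0)    # number of "-" signs
n       <- length(d)     # number of subjects with a non-zero difference

# The sign test is an exact binomial test of H0: P(+) = 0.5
sign_test <- binom.test(n_plus, n, p = 0.5)
sign_test$p.value
```

Only the signs of the differences enter the test, so the result would be unchanged if every difference were replaced by +1 or -1.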

Example
A food product company has invented a new product, and would like to find out if it will be as
popular as the existing favorite product. For this purpose, its research department arranges 18
participants for taste testing. Each participant tries both products in random order before giving
his or her opinion. It turns out that 5 of the participants like the new product better, and the rest
prefer the old one. At .05 significance level, can we reject the notion that the two products are
equally popular?

The null hypothesis is that the products are equally popular. Here we apply the binom.test
function, which tests against a success probability of 0.5 by default. As the p-value turns out
to be 0.09625, which is greater than the .05 significance level, we do not reject the null hypothesis.

> binom.test(5, 18)

Exact binomial test

data: 5 and 18
number of successes = 5, number of trials = 18,
p-value = 0.09625
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.09695 0.53480
sample estimates:
probability of success
0.27778

At .05 significance level, we do not reject the notion that the two products are equally popular.
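
As a cross-check, the same p-value can be obtained directly from the binomial distribution. The sketch below assumes only the figures from the example (5 of 18 participants prefer the new product); the variable names are illustrative.

```r
n <- 18   # participants in the taste test
x <- 5    # participants who prefer the new product

# P(X <= 5) when X ~ Binomial(18, 0.5), i.e. under the null hypothesis
lower_tail <- pbinom(x, size = n, prob = 0.5)

# With p = 0.5 the distribution is symmetric, so the two-sided
# p-value is twice the smaller tail probability
p_manual <- 2 * lower_tail

# Compare with the value reported by binom.test
p_binom <- binom.test(x, n)$p.value
round(c(manual = p_manual, binom.test = p_binom), 5)
```

Both values round to 0.09625, matching the output above.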

Wilcoxon Signed-Rank Test

One drawback of the sign test is that it discards a lot of information about the data. It takes into
account the direction of the difference, but not the magnitude of the difference between each pair
of scores. The Wilcoxon signed-rank test is another nonparametric test that can be used for two
repeated (or correlated) measures when measurement is at least ordinal. But unlike the sign test,
it does take into account (to some degree, at least) the magnitude of the difference. Two data
samples are matched if they come from repeated observations of the same subject. Using the
Wilcoxon signed-rank test, we can decide whether the two underlying population
distributions are identical without assuming that they follow the normal distribution.

Example
Barley yields in 1931 (Y1) and 1932 (Y2) from the same fields are recorded for different varieties (Var) at several locations (Loc).
Loc Var Y1 Y2
UF M 81 80.7
UF S 105.4 82.3
UF V 119.7 80.4
UF T 109.7 87.2
UF P 98.3 84.2
W M 146.6 100.4
W S 142 115.5
W V 150.7 112.2
W T 191.5 147.7
W P 145.7 108.1
M M 82.3 103.1
M S 77.3 105.1
M V 78.4 116.5
M T 131.3 139.9
M P 89.6 129.6
C M 119.8 98.9
C S 121.4 61.9
C V 124 96.2
C T 140.8 125.5
C P 124.8 75.7
GR M 98.9 66.4
GR S 89 49.9
GR V 69.1 96.7
GR T 89.3 61.9
GR P 104.1 80.3
D M 86.9 67.7
D S 77.1 66.7
D V 78.9 67.4
D T 101.8 91.8
D P 96 94.1

Without assuming that the data are normally distributed, test at the .05 significance level whether
the barley yields of 1931 and 1932 have identical distributions.

The null hypothesis is that the barley yields of the two years come from identical populations. To
test the hypothesis, we apply the wilcox.test function to compare the matched samples. For the
paired test, we set the "paired" argument to TRUE. As the p-value turns out to be 0.005318, which
is less than the .05 significance level, we reject the null hypothesis.

> barley=read.csv(file.choose())

> attach(barley)

> wilcox.test(Y1,Y2,paired=TRUE)

Wilcoxon signed rank test with continuity correction

data: Y1 and Y2
V = 368.5, p-value = 0.005318
alternative hypothesis: true location shift is not equal to 0

Warning message:
In wilcox.test.default(Y1, Y2, paired = TRUE) :
cannot compute exact p-value with ties
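
To make the role of the ranks concrete, the sketch below enters the barley yields from the table directly (so no CSV file is needed) and recomputes the V statistic by hand: rank the absolute differences, then add up the ranks that belong to positive differences. The vector names are illustrative, and exact = FALSE is set only to request the normal approximation up front, since the data contain ties.

```r
# Barley yields from the table above, in row order
y1931 <- c(81, 105.4, 119.7, 109.7, 98.3, 146.6, 142, 150.7, 191.5, 145.7,
           82.3, 77.3, 78.4, 131.3, 89.6, 119.8, 121.4, 124, 140.8, 124.8,
           98.9, 89, 69.1, 89.3, 104.1, 86.9, 77.1, 78.9, 101.8, 96)
y1932 <- c(80.7, 82.3, 80.4, 87.2, 84.2, 100.4, 115.5, 112.2, 147.7, 108.1,
           103.1, 105.1, 116.5, 139.9, 129.6, 98.9, 61.9, 96.2, 125.5, 75.7,
           66.4, 49.9, 96.7, 61.9, 80.3, 67.7, 66.7, 67.4, 91.8, 94.1)

# Manual signed-rank statistic
d <- y1931 - y1932
d <- d[d != 0]               # zero differences are discarded
r <- rank(abs(d))            # tied magnitudes receive average ranks
v_manual <- sum(r[d > 0])    # sum of ranks of the positive differences

# Same test via wilcox.test, using the normal approximation
res <- wilcox.test(y1931, y1932, paired = TRUE, exact = FALSE)
c(manual = v_manual, wilcox.test = unname(res$statistic))
res$p.value
```

Unlike the sign test, a large difference here contributes a large rank, which is how the magnitude of each difference enters the statistic.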

Mann-Whitney-Wilcoxon Test
Two data samples are independent if they come from distinct populations and the samples do not
affect each other. Using the Mann-Whitney-Wilcoxon test, we can decide whether the
population distributions are identical without assuming that they follow the normal distribution.

Example
The seasonal rainfall at two stations is given below. Without assuming that the data are normally
distributed, test whether the distribution of rainfall is the same at the two stations.

Station A Station B
1011.07 496.44
1066.82 541.76
610.8 1562.01
1111.44 2515.12
955.68 1133.99
1203.84 300.33
1600.32 482.55
555.9 503.22
1302.95 2744.23
182.34 1232.22
1233.2
1402.09
> rainfall=read.csv(file.choose())

> attach(rainfall)

The null hypothesis is that the rainfall distributions at the two stations are identical. To test
the hypothesis, we apply the wilcox.test function to compare the independent samples. As
the p-value turns out to be 0.7713, which is greater than the .05 significance level, we do not
reject the null hypothesis.

> wilcox.test(Station_A,Station_B)

Wilcoxon rank sum test

data: Station_A and Station_B

W = 65, p-value = 0.7713

At the .05 significance level, we do not reject the hypothesis that the rainfall distributions at the two stations are the same.
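
The rank-sum statistic W can also be reproduced by hand. The sketch below enters the rainfall figures from the table as two vectors of unequal length (12 values for Station A, 10 for Station B) instead of reading a CSV file; the vector names are illustrative.

```r
# Seasonal rainfall from the table above
station_a <- c(1011.07, 1066.82, 610.8, 1111.44, 955.68, 1203.84,
               1600.32, 555.9, 1302.95, 182.34, 1233.2, 1402.09)
station_b <- c(496.44, 541.76, 1562.01, 2515.12, 1133.99,
               300.33, 482.55, 503.22, 2744.23, 1232.22)

# Rank all observations together, then sum the ranks of Station A
r  <- rank(c(station_a, station_b))
n1 <- length(station_a)

# W as reported by R: rank sum of the first sample minus its minimum possible value
w_manual <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

res <- wilcox.test(station_a, station_b)
c(manual = w_manual, wilcox.test = unname(res$statistic))
res$p.value
```

W equals the number of (Station A, Station B) pairs in which the Station A value is larger. Here it is 65 out of 120 possible pairs, close to the 60 expected when the two distributions are identical, which is why the p-value is large.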

Kruskal-Wallis Test
A collection of data samples is independent if the samples come from unrelated populations and
do not affect each other. Using the Kruskal-Wallis test, we can decide whether the
population distributions are identical without assuming that they follow the normal distribution.

The built-in data set airquality records daily air quality measurements in New York from May to
September 1973. The ozone density is stored in the data frame column Ozone.

> head(airquality)

Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6

Without assuming that the data are normally distributed, test at the .05 significance level whether
the monthly ozone density in New York has identical distributions from May to September
1973.

The null hypothesis is that the monthly ozone density has the same distribution from May to
September. To test the hypothesis, we apply the kruskal.test function to compare the independent
monthly data. The p-value turns out to be nearly zero (6.901e-06). Hence we reject the null hypothesis.

> kruskal.test(Ozone ~ Month, data = airquality)

Kruskal-Wallis rank sum test

data: Ozone by Month

Kruskal-Wallis chi-squared = 29.267, df = 4, p-value = 6.901e-06

At the .05 significance level, we conclude that the monthly ozone densities in New York from May to
September 1973 do not come from identical populations.
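
A quick look at the monthly medians helps to see where the difference comes from. The sketch below uses only the built-in airquality data; the object names are illustrative, and missing Ozone values are skipped explicitly when computing the medians.

```r
# Median ozone reading for each month (5 = May, ..., 9 = September)
ozone_medians <- tapply(airquality$Ozone, airquality$Month,
                        median, na.rm = TRUE)
ozone_medians

# Kruskal-Wallis test; the result is a list whose parts can be extracted
kw <- kruskal.test(Ozone ~ Month, data = airquality)
kw$parameter        # degrees of freedom: number of months minus 1
kw$p.value < 0.05   # TRUE means the null hypothesis is rejected at the .05 level
```

The test itself says only that at least one month differs from the others; it does not say which one.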

References

R Development Core Team (2011). R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL
http://www.R-project.org/.

Siegel, S., & Castellan, N.J. (1988). Nonparametric statistics for the behavioral sciences (2nd
Ed.). New York, NY: McGraw-Hill.
