Non Parametric Tests

The document provides an overview of non-parametric tests. It defines key terms like variables, hypotheses, and scales of measurement. It then describes several common non-parametric tests like the chi-square test, sign test, and Wilcoxon signed rank test. The chi-square test can be used to test for independence between categorical variables. The sign test analyzes paired data by counting positive and negative differences. The Wilcoxon signed rank test tests hypotheses about the median of a population distribution, often using matched pairs data.

Uploaded by

riyaz
Copyright
© All Rights Reserved

NON PARAMETRIC TESTS

Presenter –Dr M N Srinivas


OVERVIEW
• Basic definitions
• Introduction
• Types of non-parametric tests
• Advantages of non-parametric tests
• Disadvantages of non-parametric tests
• Parametric vs non-parametric tests
• Statistics : defined as science of collecting and
analyzing numerical data, especially for the purpose
of inferring proportions in a whole from those in a
representative sample.
• Descriptive statistics : provide an overview of the
attributes of a data set- by including mean, mode,
median, standard deviation etc.
• Inferential statistics : provide measures of how well
your data support your hypothesis and if your data
are generalizable beyond what was tested –
significance tests.
• Variable – a characteristic that is observed or
manipulated. It may be dependent or
independent.
Independent variable   (influences)   Dependent variable
Substance exposure          →         Asthma
Exercise                    →         Weight
Age                         →         Height
• Hypothesis – an assumption, assertion or idea
about a phenomenon, which the researcher
wants to verify. It has an important
role in suggesting the variables that have to be included
in the study design.
• NULL HYPOTHESIS: there is no significant difference
between specified populations, any observed
difference being due to sampling or experimental
error.
It is denoted by H0
• ALTERNATIVE HYPOTHESIS
It is defined as the prediction that there is a measurable
interaction between variables. It is also called the
"maintained hypothesis" or "research hypothesis".
It is denoted by Ha
The null hypothesis is opposed by the alternative hypothesis.
When the null hypothesis is rejected, the alternative
hypothesis is accepted; when the null hypothesis is not
rejected, the alternative hypothesis is not supported.
Scale of measurement
Data
• Numerical data (quantitative)
   • Discrete: number of children
   • Continuous (measured characteristics): weight, voltage
• Categorical data (qualitative)
   • Nominal: marital status, political party, eye color
   • Ordinal: satisfaction level, level of agreement
Parameter: a numerical measurement
describing some characteristic of a given
population or some aspect of it. The most
common statistical parameters are the mean,
median, mode and standard deviation.
PARAMETRIC TESTS
• A parametric test is a hypothesis testing procedure
based on the assumption that the observed data
follow a distribution of known form (for example,
normal) with one or more unknown parameters
about which we want to make inferences.
• Data should follow normal distribution
• Used for Quantitative Data
• Used for continuous variables
• Used when data are measured on approximate
interval or ratio scales of measurement.
Non parametric test
• Also known as distribution-free tests.
• Calculation of the mean and standard deviation
is not needed.
• Simple to understand.
• Can be used with simple ranking, hence useful
where data are not exact.
• Can be used with small samples.
Non parametric test
Non-parametric tests can be applied in situations
when:
• The data do not follow any known probability
distribution
• The data consist of ordinal values or ranks
• There are outliers in the data
• The data have a limit of detection
Non parametric test
• These tests are used to test hypotheses regarding
qualitative data.
• Scales: nominal or ordinal.
• Data: qualitative, e.g. pass/fail, male/female.
Commonly used Non Parametric Tests
• Chi- Square test
• The sign test
• Wilcoxon signed rank test.
• Mann–Whitney U test or Wilcoxon rank sum test
• The Spearman rank correlation test
• Kruskal Wallis test or H test
• Kolmogorov–Smirnov test
• Friedman test
• Median test
• Cochran’s Q test
• McNemar test
Chi Square Test (χ2)
• First used by Karl Pearson
• Simplest and most widely used non-parametric test
in statistical work.
• The chi-square test of independence determines
whether there is an association between
categorical variables (i.e., whether the variables
are independent or related).
Chi-square test (χ2)
• The chi-square test (χ2) is used to compare
observed and expected data.
• 1. Test of goodness of fit - whether an observed
frequency distribution differs from a theoretical
distribution
• 2. Test of independence (most commonly used) -
assesses whether unpaired observations on two
variables are independent of each other
• 3. Test of homogeneity - compares the distribution of
counts for two or more groups using the same
categorical variable
Alcohol consumption by gender (observed counts)

Gender    Low   Moderate   High   Total
Male      10    9          8      27
Female    13    16         12     41
Total     23    25         20     68

Degrees of freedom = (number of rows - 1) x (number of columns - 1)
                   = (2 - 1) x (3 - 1)
                   = 1 x 2 = 2
Step-1 Choose your test
Is there a relationship between gender and alcohol consumption?
• Gender: male, female
• Alcohol consumption: low, moderate, high
• Two qualitative variables
• Looking for:
   • A relationship
   • Independence
   • One variable affecting the other
Step-2 Statistical hypothesis
• H0: A person's alcohol consumption is independent of their
gender
• Ha: A person's alcohol consumption depends on their
gender
Step-3
Level of significance
α = 0.05
Step-4
Critical value
5.991 (chi-square distribution, 2 degrees of freedom, α = 0.05)
Step-5
Observed counts, with expected counts in parentheses
(expected = row total x column total / grand total)

ALCOHOL CONSUMPTION
Gender    Low          Moderate     High         Total
Male      10 (9.13)    9 (9.93)     8 (7.94)     27
Female    13 (13.87)   16 (15.07)   12 (12.06)   41
Total     23           25           20           68

Chi-square test
Each cell contributes (observed - expected)² / expected to the
statistic, for example (10 - 9.13)² / 9.13 = 0.08 for Male, Low.
Summing the contributions of all six cells gives the calculated
chi-square.

Calculated chi-square = 0.283
Critical chi-square value = 5.991

0.283 < critical value (5.991), so we do not reject the null
hypothesis.

There is not enough evidence to claim that an
individual's gender influences their alcohol
consumption.
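
The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `chi_square_independence` and the variable names are introduced here, and the closed-form critical value and p-value rely on the fact that a chi-square distribution with exactly 2 degrees of freedom has survival function exp(-x/2), so they would not carry over to other table sizes. With unrounded expected counts the statistic comes out near 0.281; the slide's 0.283 reflects expected counts rounded to two decimals.

```python
import math

# Observed counts from the gender x alcohol consumption table.
observed = [
    [10, 9, 8],    # Male:   low, moderate, high
    [13, 16, 12],  # Female: low, moderate, high
]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


statistic, df, expected = chi_square_independence(observed)

alpha = 0.05
# Assumption: these two closed forms are valid only because df == 2 here.
critical_value = -2 * math.log(alpha)
p_value = math.exp(-statistic / 2)

print(f"chi-square = {statistic:.3f}, df = {df}")
print(f"critical value = {critical_value:.3f}, p-value = {p_value:.3f}")
print("reject H0" if statistic > critical_value else "do not reject H0")
```

For tables of other sizes a statistics library would normally supply the critical value and p-value.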
Application
• Test of association (smoking and cancer, treatment
and outcome of the disease, vaccination and
immunity).
• The overall number of items must be reasonably
large, at least 50
Limitations:
• The data must come from a random sample.
• Interpretation of this test is difficult if the
sample size is less than 50.
• Cannot be used when samples are related or
matched.
• Tells only about the probability of occurrence and
does not indicate a cause and effect relationship.
Sign Test
• The sign test is probably the simplest of the non-parametric
tests.
• This test is used for paired data.
Steps of analysis:
• Evaluate the difference within each pair and
take its sign (positive or negative); pairs with zero
difference are dropped
• Count the number of positive signs and negative
signs
• Take the larger count and compare it with the
critical value in the corresponding table
• If the larger count is greater than the critical
value in the table, then the difference between the
two paired samples is significant
Ten consumers in a group have rated the attractiveness of
two package designs for a new product

Consumer   Rating      Rating      Difference   Sign of
           Package-1   Package-2                difference
1          5           8           -3           -
2          4           8           -4           -
3          4           4           0            0
4          6           5           +1           +
5          3           9           -6           -
6          5           9           -4           -
7          7           6           -1           -
8          5           9           -4           -
9          6           3           +3           +
10         7           9           -2           -
• Test the hypothesis that there is no overall package
preference using α = 0.10
• H0: P = 0.5 The proportion of consumers who
prefer package 1 is the same as the proportion
preferring package 2
• H1: P < 0.5 A majority prefer package 2
• The test statistic S for the sign test is
S = the number of pairs with a positive difference = 2
• S has a binomial distribution with P = 0.5 and n = 9
(there was one zero difference)
• The p-value for this sign test is found using the
binomial distribution with n = 9, S = 2, and P =
0.5:
• For a lower-tail test,
p-value = P(x ≤ 2 | n = 9, P = 0.5)
= 0.090
• Since 0.090 < α = 0.10, we reject the null
hypothesis and conclude that consumers
prefer package 2
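
The lower-tail p-value can be reproduced with an exact binomial sum. The sketch below is illustrative only; the variable names are assumptions introduced here, and it works from the Difference column as printed in the table (the row-7 ratings of 7 and 6 would give +1 rather than the -1 shown, so the sketch follows the tabulated difference, which matches S = 2).

```python
from math import comb

# Differences (Package-1 rating minus Package-2 rating) as tabulated.
differences = [-3, -4, 0, 1, -6, -4, -1, -4, 3, -2]

# Zero differences carry no sign and are dropped.
nonzero = [d for d in differences if d != 0]
n = len(nonzero)
s = sum(1 for d in nonzero if d > 0)  # number of positive differences

# Lower-tail p-value: P(X <= s) for X ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(s + 1)) / 2 ** n

alpha = 0.10
print(f"n = {n}, S = {s}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The sum is (1 + 9 + 36) / 512, which rounds to the 0.090 quoted above.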
Wilcoxon signed rank test
• The Wilcoxon signed rank test is designed to test a
hypothesis about the location (median) of a
population distribution
• It often involves the use of matched pairs, for
example pre-test and post-test
• Non-parametric equivalent of the paired t-test.
Why do we use the Wilcoxon signed rank test?
• Small sample size
• Test hypothesis about median change rather
than mean change
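Ordinal data are displayed in the table below. Is there a difference between before
and after treatment, using alpha = 0.05?

H0: there is no difference between the two treatments
H1: there is a difference between the two treatments

Differences are ranked by absolute value, ignoring sign; tied absolute
differences share the average of the ranks they occupy.

Before      After       Difference   Rank of          Ranks +   Ranks -
treatment   treatment                |difference|
28          12          16           4                4
17          31          -14          2.5                        2.5
36          19          17           5                5
35          14          21           6                6
32          20          12           1                1
33          19          14           2.5              2.5
Sum                                                   18.5      2.5

Decision rule
If Z is less than -1.96 or greater than 1.96, reject the null hypothesis.

With n = 6 pairs and W = 2.5 (the smaller rank sum):
Z = (W - n(n + 1)/4) / sqrt(n(n + 1)(2n + 1)/24)
  = (2.5 - 10.5) / 4.77
  = -1.68

Since -1.68 lies between -1.96 and 1.96, WE DO NOT REJECT THE NULL
HYPOTHESIS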
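
The signed-rank calculation for the six before/after pairs can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper `average_rank` and all variable names are assumptions introduced here. Ranks are assigned to the absolute differences, with tied values sharing the average rank, so the two differences of size 14 each receive 2.5. Z uses the usual normal approximation without a tie correction, which is only rough for so few pairs.

```python
import math

before = [28, 17, 36, 35, 32, 33]
after = [12, 31, 19, 14, 20, 19]


def average_rank(value, pool):
    """Rank of value within pool; tied values share the average rank."""
    less = sum(1 for x in pool if x < value)
    equal = sum(1 for x in pool if x == value)
    return less + (equal + 1) / 2


# Paired differences; zero differences would be dropped.
diffs = [b - a for b, a in zip(before, after) if b != a]
abs_diffs = [abs(d) for d in diffs]
ranks = [average_rank(a, abs_diffs) for a in abs_diffs]

w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

n = len(diffs)
w = min(w_plus, w_minus)
mean_w = n * (n + 1) / 4
sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (w - mean_w) / sd_w

print(f"W+ = {w_plus}, W- = {w_minus}, Z = {z:.2f}")
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```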
Mann Whitney U test
• This test is used to determine whether two independent
samples have been drawn from the same population.
• Tests without population parameters (means and standard
deviations)
• Non-parametric equivalent of a t test for 2 independent
samples
• Used when:
   • Data do not support means (ordinal)
   • Data are not normally distributed
• To perform this test, we first rank all the data jointly, taking
them as belonging to a single sample, in either increasing
or decreasing order of magnitude; tied values share the
average of their ranks.
After this we find the sum of the ranks assigned to the
values of the first sample (and call it R1) and also the sum
of the ranks assigned to the values of the second sample
(and call it R2).
Treatment A   Rank        Treatment B   Rank
24            3           28            4
18            2           42            6
45            7           63            10
57            8.5         57            8.5
12            1           90            12
30            5           68            11
Rank sum      R1 = 26.5   Rank sum      R2 = 51.5

U1 = (n1)(n2) + n1(n1+1)/2 - R1 = 36 + 21 - 26.5 = 30.5
U2 = (n1)(n2) + n2(n2+1)/2 - R2 = 36 + 21 - 51.5 = 5.5
U = the smaller of U1 and U2 = 5.5
• Critical value (Mann Whitney table) = 5
• When the computed value is less than or equal to the critical
value, the outcome is significant
• Here 5.5 > 5, so this is a non-significant outcome
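
The joint ranking and the two U values can be checked with a short sketch. The helper `average_rank` and the variable names are assumptions introduced here for illustration; the critical value 5 is simply taken from the table lookup quoted above rather than computed.

```python
treatment_a = [24, 18, 45, 57, 12, 30]
treatment_b = [28, 42, 63, 57, 90, 68]


def average_rank(value, pool):
    """Rank of value within pool; tied values share the average rank."""
    less = sum(1 for x in pool if x < value)
    equal = sum(1 for x in pool if x == value)
    return less + (equal + 1) / 2


# Rank all observations jointly, as if they were a single sample.
combined = treatment_a + treatment_b
n1, n2 = len(treatment_a), len(treatment_b)

r1 = sum(average_rank(v, combined) for v in treatment_a)
r2 = sum(average_rank(v, combined) for v in treatment_b)

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
u = min(u1, u2)

critical_u = 5  # table value quoted in the text for this example
significant = u <= critical_u

print(f"R1 = {r1}, R2 = {r2}, U1 = {u1}, U2 = {u2}, U = {u}")
print("significant" if significant else "not significant")
```

A quick consistency check is that U1 + U2 always equals n1 x n2, here 30.5 + 5.5 = 36.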
Kruskal Wallis test
• This is a rank-based non-parametric test that can
be used to determine if there are statistically
significant differences between two or more
groups of an independent variable on a
continuous or ordinal dependent variable
• It is the non-parametric equivalent of ANOVA
Ordinal data are displayed in the table below. Is there a difference between
group-1, group-2 and group-3 using alpha = 0.05?

Group-1   Group-2   Group-3
27        20        34
2         8         31
4         14        3
18        36        23
7         21        30
9         22        6

H0: there is no difference between treatments
H1: there is a difference between treatments
State alpha = 0.05
Degrees of freedom = K - 1 (3 - 1 = 2)
Critical value = 5.99
Original scores and their ranks (all 18 observations ranked together)

Original score   Rank
2                1
3                2
4                3
6                4
7                5
8                6
9                7
14               8
18               9
20               10
21               11
22               12
23               13
27               14
30               15
31               16
34               17
36               18

Ranks by group

     Group-1   Group-2   Group-3
     14        10        17
     1         6         16
     3         8         2
     9         18        13
     5         11        15
     7         12        4
T    39        65        67
N    6         6         6

IF H IS GREATER THAN 5.99 WE CAN REJECT THE NULL
HYPOTHESIS
H = 2.854

DO NOT REJECT THE NULL HYPOTHESIS

THERE IS NO SIGNIFICANT DIFFERENCE AMONG THE THREE
GROUPS
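
The value of H can be reproduced from the rank sums with the standard Kruskal Wallis formula, H = 12 / (N(N + 1)) x sum(T² / n) - 3(N + 1), where N is the total number of observations, T a group's rank sum and n its size. The sketch below is a minimal illustration; the helper `average_rank` and the variable names are assumptions introduced here, and no tie correction is applied because this data set contains no tied scores.

```python
groups = {
    "Group-1": [27, 2, 4, 18, 7, 9],
    "Group-2": [20, 8, 14, 36, 21, 22],
    "Group-3": [34, 31, 3, 23, 30, 6],
}


def average_rank(value, pool):
    """Rank of value within pool; tied values share the average rank."""
    less = sum(1 for x in pool if x < value)
    equal = sum(1 for x in pool if x == value)
    return less + (equal + 1) / 2


# Rank every observation within the pooled sample.
pooled = [v for values in groups.values() for v in values]
total_n = len(pooled)

rank_sums = {
    name: sum(average_rank(v, pooled) for v in values)
    for name, values in groups.items()
}

h = (
    12 / (total_n * (total_n + 1))
    * sum(rank_sums[name] ** 2 / len(values) for name, values in groups.items())
    - 3 * (total_n + 1)
)

critical_value = 5.99  # chi-square, 2 degrees of freedom, alpha = 0.05
print(f"rank sums = {rank_sums}")
print(f"H = {h:.3f}")
print("reject H0" if h > critical_value else "do not reject H0")
```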
Spearman rank-correlation
• Correlations look for a relationship between two
variables which may not be functionally related
• Used to assess the relationship between two
ordinal variables or two continuous variables
• Non-parametric equivalent of the Pearson correlation
• It is a relative measure which varies from -1 to +1
To calculate the Spearman correlation, we must first rank the
scores

x     y     Rank of x   Rank of y   Product of ranks
2     21    1           6           6
5     17    2           5           10
8     14    3           4           12
11    10    4           3           12
15    5     5           2           10
16    3     6           1           6
Sum         21          21          56

r = -1.00
x and y have a perfect negative relationship
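
As a check on r = -1.00, the ranks can be fed into the shortcut formula r = 1 - 6 x sum(d²) / (n(n² - 1)), where d is the difference between the two ranks of each pair. This is a sketch under stated assumptions: the helper and variable names are introduced here, and the shortcut formula is exact only when there are no tied ranks, which holds for this data set.

```python
x = [2, 5, 8, 11, 15, 16]
y = [21, 17, 14, 10, 5, 3]


def average_rank(value, pool):
    """Rank of value within pool; tied values share the average rank."""
    less = sum(1 for v in pool if v < value)
    equal = sum(1 for v in pool if v == value)
    return less + (equal + 1) / 2


rank_x = [average_rank(v, x) for v in x]
rank_y = [average_rank(v, y) for v in y]

n = len(x)
# Squared rank differences; valid shortcut only because no ranks are tied.
sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
r = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(f"ranks of x = {rank_x}")
print(f"ranks of y = {rank_y}")
print(f"r = {r:.2f}")
```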
Median test
• Used to test the hypothesis that two samples are
from populations with equal medians
• Calculate proportions in each group above/below
the common median of the two groups
• Uses a chi-square test to test the differences
between these frequencies
McNemar test
• The McNemar test is a statistical test for paired
nominal data.
• It was created by Quinn McNemar
• In medical research, if a researcher wants to
determine whether or not a particular drug has an
effect on a disease (e.g., yes vs. no), then a count of
the individuals is recorded (as + and - sign, or 0 and
1) in a table before and after being given the drug.
• Then, McNemar's test is applied to make statistical
decisions
Cochran's Q test
• Cochran's Q test is an extension of
the McNemar test for related samples that
provides a method for testing for differences
between three or more matched sets of
frequencies or proportions

Subject   Task-1   Task-2   Task-3
1         0        1        0
2         1        1        0
3         1        1        1
4         0        0        0
5         1        0        0
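
The slides do not report a result for this table, so the following sketch only illustrates how Q could be computed from it. It assumes the standard form of the statistic, Q = k(k - 1) x sum((Cj - N/k)²) / sum(Ri(k - Ri)), where k is the number of tasks, Cj a column total, Ri a row total and N the grand total; the variable names are introduced here. Q is compared with a chi-square distribution with k - 1 degrees of freedom, an approximation that is rough for only five subjects.

```python
# Rows are subjects, columns are Task-1, Task-2, Task-3 (1 = success).
data = [
    [0, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]

k = len(data[0])                           # number of related samples (tasks)
col_totals = [sum(col) for col in zip(*data)]
row_totals = [sum(row) for row in data]
grand_total = sum(row_totals)

numerator = k * (k - 1) * sum((c - grand_total / k) ** 2 for c in col_totals)
denominator = sum(r * (k - r) for r in row_totals)
q = numerator / denominator

df = k - 1
critical_value = 5.991  # chi-square, 2 degrees of freedom, alpha = 0.05
print(f"column totals = {col_totals}, row totals = {row_totals}")
print(f"Q = {q:.3f}, df = {df}")
print("reject H0" if q > critical_value else "do not reject H0")
```

Subjects whose responses are all 0 or all 1 (subjects 3 and 4 here) contribute nothing to the denominator, so they do not influence Q.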
Friedman test
• The Friedman test is a non-parametric alternative
to ANOVA with repeated measures.
• It is used to test for differences between groups
when the dependent variable being measured is
ordinal.
• The Friedman test tests the null hypothesis of
identical populations for dependent data.
• It uses only the rank information of the data
Parametric Vs Non parametric

                         Parametric                  Non-parametric
Assumed distribution     Normal                      Any
Typical data             Ratio or interval           Nominal or ordinal
Usual central measures   Mean                        Median
Benefits                 Can draw many conclusions   Simplicity; less affected by outliers
Advantages of non parametric over
parametric tests
• They can be used when the dependent variable is nominal
• And when the dependent variable or independent
variable is ordinal
• When the sample size is small
• They can be used when the observations are
described in terms of ranks or in a hierarchy
• They are applied in psychometry, sociology and
educational statistics
Advantages of non parametric over
parametric tests (Cont)
• They can be used to analyse samples for which
the mean or standard deviation is unavailable or
undeterminable
• Can be used under more general conditions and are often
easier to explain and understand than parametric
tests
• Computational burden is less as they are quick and
easy methods
Disadvantages of non-parametric tests
• They can be used only if the measurements are
nominal or ordinal
• These tests can test statistical hypotheses, but
cannot estimate the parameters
• Tables of critical values may not be easily
available

Thank you
