Mann Whitney U Test: ENS 185 - Data Analysis
Mann-Whitney U Test
Adelantar . Asturias . Arsulo . Ali. Borjel . Binuhat
Catinggan . Catulong . Cellona . Dandang
Contents
1 History and Definition
2 Description and Concept
3 Derivation of Formulas
4 Application and Examples
History and Definition
History
The Mann-Whitney U-Test, named after Henry Mann and
Donald R. Whitney, was developed in 1947.
It was initially introduced as a non-parametric alternative
to the parametric t-test for independent samples.
The motivation behind its development was to have a
statistical test that could handle data with non-normal
distributions or other violations of parametric assumptions.
The test gained popularity due to its versatility and
robustness, making it suitable for a wide range of research
fields.
The Mann-Whitney U-Test has been extensively used in
disciplines such as psychology, biology, sociology, and
economics, among others. Its application extends to both
experimental and observational studies, allowing
researchers to compare two independent groups without
requiring assumptions about the underlying distributions.
Over time, the test has also become known as the
Mann-Whitney-Wilcoxon test and the Wilcoxon rank-sum test,
because it is equivalent to the rank-sum test that Frank
Wilcoxon proposed in 1945. These names refer to the same
procedure, not to separate tests.
Definition
The Mann-Whitney U-Test is a statistical test used to compare the
distributions of two independent samples.
It focuses on the ranks of the data rather than the actual values, making it
suitable for ordinal or non-normally distributed data.
The test works by combining the two samples, assigning ranks to the
combined data, and calculating a U statistic.
To use the Mann-Whitney U test, you need two independent groups of data.
2 Rank all observations. Combine the two samples and sort the data in ascending order.
If there are ties (i.e., multiple observations with the same value), assign the average
rank to each tied observation.
3 Calculate the U statistic. The formula is $U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$,
where $n_1$ and $n_2$ are the sample sizes of the two groups, and $R_1$ is the sum of
ranks for Group 1.
4 Determine the critical value or p-value associated with the U statistic. For larger
samples, convert U to a Z-score,
$Z = \frac{U - n_1 n_2 / 2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1) / 12}}$, and look up the critical
value for the Z-score in the standard normal distribution table, or use a statistical
software package to obtain the p-value associated with the Z-score.
5 Compare the obtained p-value with the significance level (e.g., α = 0.05) to make a
decision. If the p-value is less than the significance level, you reject the null hypothesis and
conclude that there is a significant difference between the two groups.
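The steps above can be sketched in a few lines of Python. This is only an illustration: the exam scores are hypothetical, the helper names `average_ranks` and `mann_whitney_u` are made up for this sketch, and the p-value uses the normal approximation without a tie or continuity correction, which is rough for samples this small.

```python
from math import sqrt, erfc

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """U from the formula in step 3, plus a normal-approximation Z and p-value."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))  # step 2: rank combined data
    r1 = sum(ranks[:n1])                                # sum of ranks for Group 1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1                # step 3
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # step 4
    p = erfc(abs(z) / sqrt(2))                          # two-sided p-value
    return u, z, p

# Hypothetical exam scores under two teaching methods (not from the slides).
method_a = [72, 85, 90, 66, 78]
method_b = [88, 93, 95, 81, 97]

u, z, p = mann_whitney_u(method_a, method_b)
print(f"U = {u}, Z = {z:.3f}, p = {p:.4f}")
print("Reject H0" if p < 0.05 else "Fail to reject H0")  # step 5
```

With these made-up scores, Group 1 has rank sum 18, so U = 25 + 15 - 18 = 22. For samples this small, an exact table of U critical values is usually preferred over the Z approximation.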
It's important to note that the Mann-Whitney U test
only detects a difference between the groups but
does not provide information about the direction
or magnitude of the difference. Additionally, it
assumes that the observations in each group are
independent and come from continuous
distributions.
You can use the Mann-Whitney U test in the following scenarios:
1 Comparing two independent groups: When you have two independent groups, and you
want to determine if there is a significant difference between their distributions. For
example, you may want to compare the exam scores of students who received different
teaching methods to see if there is a significant difference in their performances.
2 Non-parametric data: When your data does not meet the assumptions required for
parametric tests, such as normally distributed data or equal variances. The Mann-Whitney U
test does not assume any specific distribution and can handle ordinal, interval, or skewed
data.
3 Small sample sizes: When your sample sizes are small, the Mann-Whitney U test can be a
more appropriate choice compared to the t-test, which assumes larger sample sizes for
accurate results.
4 Ordinal or ranked data: The Mann-Whitney U test is designed to analyze ranked or ordinal
data, where the values are ordered but the intervals between them may not be equal.
It is important to note that the Mann-Whitney U test is only
applicable for independent samples. If you have paired or
related samples, you should consider using the Wilcoxon
signed-rank test instead.
Example 2
A new approach to prenatal care is proposed for pregnant
women living in a rural community. The new program involves
in-home visits during the course of pregnancy in addition to
the usual or regularly scheduled visits. A pilot randomized trial
with 15 pregnant women is designed to evaluate whether
women who participate in the program deliver healthier
babies than women receiving usual care. The outcome is the
APGAR score measured 5 minutes
after birth. Recall that APGAR scores range from 0 to 10 with
scores of 7 or higher considered normal (healthy), 4-6 low and
0-3 critically low. The data are shown below. Is there
statistical evidence of a difference in APGAR scores in women
receiving the new and enhanced versus usual prenatal care?
CONTENTS
INTRODUCTION
The TEST
PROCEDURE and FORMULAS
Case 1: Paired data
1 State the null hypothesis (H0) and alternative hypothesis (H1). The null
hypothesis states that there is no significant difference between the medians of
the paired samples, while the alternative hypothesis states that there is a
significant difference.
2 Calculate each paired difference, $d_i = x_i - y_i$, where $x_i$, $y_i$ are the pairs of
observations.
3 Rank the $|d_i|$, ignoring the signs (i.e. assign rank 1 to the smallest $|d_i|$,
rank 2 to the next, etc.).
4 Label each rank with its sign, according to the sign of $d_i$.
5 Calculate $W^+$, the sum of the ranks of the positive $d_i$, and $W^-$, the sum of the
ranks of the negative $d_i$. (As a check, the total, $W^+ + W^-$, should be equal to
$\frac{n(n+1)}{2}$, where n is the number of pairs of observations in the sample.)
Case 2: Single set of
observations
1 State the null hypothesis - the median value is equal to some value M.
EXAMPLES
EXAMPLE 1
Worker 1 2 3 4 5 6 7 8 9
Before 6 8 10 9 5 12 9 5 7
After 10 12 9 12 8 13 8 5 10
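As a rough check of the paired-data procedure, the signed-rank sums for the table above can be computed as follows. This is a sketch under two assumptions that the slides do not state: pairs with a zero difference (Worker 8) are dropped before ranking, and the helper name `average_ranks` is illustrative only.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

before = [6, 8, 10, 9, 5, 12, 9, 5, 7]
after = [10, 12, 9, 12, 8, 13, 8, 5, 10]

# Step 2: paired differences d_i = x_i - y_i (zero differences dropped, an assumption).
diffs = [x - y for x, y in zip(before, after) if x != y]

# Step 3: rank the absolute differences, averaging ties.
ranks = average_ranks([abs(d) for d in diffs])

# Steps 4-5: attach signs and sum the ranks of each sign.
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

n = len(diffs)
print(w_plus, w_minus, n * (n + 1) / 2)  # 4.0 32.0 36.0
```

The check from step 5 holds: 4 + 32 = 36 = 8(9)/2, with n = 8 after dropping the zero difference.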
EXAMPLE 2
A clinical study is designed to assess differences in albumin levels in adults following diets
with different amounts of protein. Low protein diets are often prescribed for patients with
kidney failure. Albumin is the most abundant protein in blood, and its concentration in the
serum is measured in grams per deciliter (g/dL). Clinically, serum albumin concentrations
are also used to assess whether patients get sufficient protein in their diets. Three diets are
compared, ranging from 5% to 15% protein, and the 15% protein diet represents a typical
American diet. The albumin levels of participants following each diet are shown below.
EXAMPLE #1
EXAMPLE #2
Does physical exercise alleviate depression? We find some depressed people and check
that they are all equivalently depressed to begin with. Then we allocate each person
randomly to one of three groups: no exercise; 20 minutes of jogging per day; or 60 minutes
of jogging per day. At the end of a month, we ask each participant to rate how depressed
they now feel, on a Likert scale that runs from 1 ("totally miserable") through to 100
("ecstatically happy").
Rank all of the scores, ignoring which group they belong to. The procedure for ranking is as
follows: the lowest score gets the lowest rank. If two or more scores are the same then they
are "tied". "Tied" scores get the average of the ranks that they would have obtained, had
they not been tied. Here's the scores again, now with their ranks in brackets:
Solve for H:
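The scores for this example did not survive in these notes, so the sketch below uses hypothetical depression ratings for the three groups. It assumes the usual Kruskal-Wallis statistic, $H = \frac{12}{N(N+1)} \sum \frac{R_j^2}{n_j} - 3(N+1)$, where $R_j$ is the rank sum and $n_j$ the size of group j, and it omits the tie correction. The names `average_ranks` and `kruskal_wallis_h` are illustrative.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, rank sums) for a list of groups, without tie correction."""
    pooled = [score for group in groups for score in group]
    ranks = average_ranks(pooled)  # rank all scores, ignoring group membership
    total = len(pooled)
    rank_sums, start = [], 0
    for group in groups:
        rank_sums.append(sum(ranks[start:start + len(group)]))
        start += len(group)
    h = 12 / (total * (total + 1)) * sum(
        r ** 2 / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (total + 1)
    return h, rank_sums

# Hypothetical ratings (not the data from the slides).
no_exercise = [23, 26, 51, 49, 58]
jog_20_min = [22, 27, 39, 29, 46]
jog_60_min = [59, 66, 38, 62, 70]

h, rank_sums = kruskal_wallis_h([no_exercise, jog_20_min, jog_60_min])
print(rank_sums, round(h, 2))  # [35.0, 25.0, 60.0] 6.5
```

With three groups, H is compared with the chi-square critical value for df = 2 (5.99 at α = 0.05), so these made-up ratings would lead to rejecting the null hypothesis.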
The Kruskal-Wallis test is a powerful nonparametric test that can be used to
compare three or more groups on a continuous or ordinal dependent variable. It
is a useful alternative to the one-way ANOVA when the data is not normally
distributed.
THANK YOU!
$F_r = \frac{12}{bk(k+1)} \sum T_i^2 - 3b(k+1)$
where,
b = the number of rows (blocks)
k = the number of columns (treatments)
$T_i$ = the sum of the ranks in column i
• Measuring the mean scores of subjects during three or more time points.
• Measuring the mean scores of subjects under three different conditions.
Suppose you wish to compare the reaction times of people exposed to six
different stimuli. A reaction time measurement is obtained by subjecting a
person to a stimulus and then measuring the time until the person presents
some specified reaction. The objective of the experiment is to determine
whether differences exist in the reaction times for the stimuli used in the
experiment. To eliminate the person-to-person variation in reaction time,
four persons participated in the experiment and each person’s reaction time
(in seconds) was measured for each of the six stimuli.
EXAMPLE #1
Stimulus
Subject  1          2          3          4          5          6
1        0.6 (2.5)  0.9 (6)    0.8 (5)    0.7 (4)    0.5 (1)    0.6 (2.5)
2        0.7 (3.5)  1.1 (6)    0.7 (3.5)  0.8 (5)    0.5 (1.5)  0.5 (1.5)
3        0.9 (3)    1.3 (6)    1.0 (4.5)  1.0 (4.5)  0.7 (1)    0.8 (2)
4        0.5 (2)    0.7 (5)    0.8 (6)    0.6 (3.5)  0.4 (1)    0.6 (3.5)
Sum      T1 = 11    T2 = 23    T3 = 19    T4 = 17    T5 = 4.5   T6 = 9.5
1. State the HYPOTHESIS
$H_0$: There is no significant difference among the mean reaction times across the
populations
($\mu_1 = \mu_2 = \dots = \mu_6$)
For $\alpha = 0.05$ and $df = k - 1 = 5$: $\chi^2_{0.05} = 11.07$
3. Find the OBSERVED VALUE
$F_r = \frac{12}{bk(k+1)} \sum T_i^2 - 3b(k+1)$, for $b = 4$, $k = 6$
$F_r = \frac{12}{4(6)(6+1)} (11^2 + 23^2 + 19^2 + 17^2 + 4.5^2 + 9.5^2) - 3(4)(6+1)$
$F_r = 16.75$
4. CONCLUSION
Since $F_r = 16.75$ exceeds the critical value of 11.07, reject $H_0$. The results show
that the type of stimulus used leads to statistically significant differences in
reaction time.
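The whole of Example 1 can be reproduced from the raw reaction times with a short sketch. The helper names `average_ranks` and `friedman_statistic` are illustrative, and, like the slides, the sketch applies no tie correction.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def friedman_statistic(blocks):
    """Return (Fr, rank sums); each block is one subject's row of treatments."""
    b, k = len(blocks), len(blocks[0])
    ranked = [average_ranks(row) for row in blocks]  # rank within each subject
    rank_sums = [sum(row[col] for row in ranked) for col in range(k)]
    fr = 12 / (b * k * (k + 1)) * sum(t ** 2 for t in rank_sums) - 3 * b * (k + 1)
    return fr, rank_sums

# Reaction times (seconds): one row per subject, one column per stimulus.
reaction_times = [
    [0.6, 0.9, 0.8, 0.7, 0.5, 0.6],
    [0.7, 1.1, 0.7, 0.8, 0.5, 0.5],
    [0.9, 1.3, 1.0, 1.0, 0.7, 0.8],
    [0.5, 0.7, 0.8, 0.6, 0.4, 0.6],
]

fr, rank_sums = friedman_statistic(reaction_times)
print(rank_sums)     # [11.0, 23.0, 19.0, 17.0, 4.5, 9.5]
print(round(fr, 2))  # 16.75
print("Reject H0" if fr > 11.07 else "Fail to reject H0")
```

Ranking is done within each subject (row), which is what removes the person-to-person variation described in the problem statement.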
EXAMPLE #2:
We want to know if the mean reaction time of subjects is different on three different
drugs. The reaction times and their within-patient ranks (T1, T2, T3) are shown:
           Drug 1        Drug 2        Drug 3
Patient    Time   T1     Time   T2     Time   T3
1          4      2      5      3      2      1
2          6      2.5    6      2.5    4      1
3          3      1      8      3      4      2
4          4      2      7      3      3      1
5          3      2      7      3      2      1
6          2      1.5    8      3      2      1.5
7          2      2      4      3      1      1
8          7      3      6      2      4      1
9          6      3      4      2      3      1
10         5      2.5    5      2.5    2      1
T                 21.5          27            11.5
1. State the hypothesis
$H_0$: There is no significant difference among the mean reaction times across the
populations ($\mu_1 = \mu_2 = \mu_3$).
$H_a$: At least one population mean is different from the rest.
2. Perform the Friedman Test
For $\alpha = 0.05$ and $df = 2$, reject $H_0$ if $F_r > \chi^2_{0.05}$
$\chi^2_{0.05} = 5.99$
$F_r = \frac{12}{bk(k+1)} \sum T_i^2 - 3b(k+1)$, for $b = 10$, $k = 3$
$F_r = \frac{12}{10(3)(3+1)} (21.5^2 + 27^2 + 11.5^2) - 3(10)(3+1)$
$F_r = 12.35$
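3. Interpret the results
Since $F_r = 12.35 > \chi^2_{0.05} = 5.99$, reject $H_0$.
4. Report the results
There is a statistically significant difference in reaction time among the three drugs
($F_r = 12.35$, $df = 2$, $\alpha = 0.05$).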
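The decision for Example 2 can be checked numerically from the rank sums. This is a sketch: the p-value line relies on the fact that the chi-square survival function with 2 degrees of freedom is exactly exp(-x/2), so it does not generalise to other values of df, and the variable names are illustrative.

```python
from math import exp

rank_sums = [21.5, 27.0, 11.5]  # T1, T2, T3 from the table
b, k = 10, 3                    # 10 patients (blocks), 3 drugs (treatments)

# Friedman statistic, same formula as in step 2.
fr = 12 / (b * k * (k + 1)) * sum(t ** 2 for t in rank_sums) - 3 * b * (k + 1)

critical_value = 5.99           # chi-square critical value, alpha = 0.05, df = 2
p_value = exp(-fr / 2)          # valid only because df = k - 1 = 2

print(f"Fr = {fr:.2f}, p = {p_value:.4f}")
print("Reject H0" if fr > critical_value else "Fail to reject H0")
```

Comparing Fr with the critical value and comparing the p-value with α give the same decision here; the p-value simply shows how far beyond the threshold the statistic falls.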
SPEARMAN RANK
CORRELATION
Siarez, Melrose
Sigua, Havie Joy
Solatorio, Christian Jude
Tadem, Gisserie
Teburon, Sittie Ayesha
Tornea, Justine Nicole
Tumlad, Clark
Vencilao, Hannah Gwendoline
Yahya, Nor Hayna
Zorilla, Jhapet
SPEARMAN’S RANK CORRELATION (SPEARMAN
RANK CORRELATION OR SPEARMAN’S RHO)
Named after Charles Spearman
Typically denoted either with the Greek letter rho (ρ), or rs
and is primarily used for data analysis
Measures the strength and direction of association between
two ranked variables. It basically gives the measure of
monotonicity of the relation between two variables i.e. how
well the relationship between two variables could be
represented using a monotonic function.
Similar to the Pearson correlation coefficient.
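One way to see the link to the Pearson coefficient is that Spearman's rho equals the Pearson correlation computed on the ranks of the data. The sketch below assumes that definition; the data are hypothetical and the function names are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y increases with x, but not linearly (y = (x / 10) ** 3).
x = [10, 20, 30, 40, 50]
y = [1, 8, 27, 64, 125]
print(spearman_rho(x, y))  # 1.0, the relationship is perfectly monotonic
print(pearson(x, y) < 1)   # True, the relationship is not perfectly linear

# Hypothetical data with two swapped pairs.
print(round(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]), 2))  # 0.8
```

The first pair of lines illustrates the monotonicity point: rho reaches 1 for any strictly increasing relationship, while the Pearson coefficient reaches 1 only for a linear one.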
TO UNDERSTAND SPEARMAN’S RANK CORRELATION, IT IS
IMPORTANT TO UNDERSTAND MONOTONIC FUNCTION.
In the example, the value -0.73 (or +0.73) gives a significance level of slightly
less than 5%. That means that, if there were really no association between the
two variables, a correlation at least this strong would be expected by chance in
fewer than 5 out of 100 samples. The null hypothesis of no association can
therefore be rejected at the 5% level.
EXAMPLE:
THANK
YOU!