Numerical Analysis

This document discusses steps for statistical test selection and interpretation. It provides information on bivariate vs multivariable analysis, difference vs correlation, independent vs paired data, normality testing, and assumptions for t-tests, ANOVA, and other analyses. Examples are given to illustrate key points.

Uploaded by

5thyear md
Copyright
© © All Rights Reserved

Numerical data analysis

Dr. Omnia Elmahdy


Steps of statistical test selection

5 questions

Dr. Omnia Mohammed Elmahdy 2


Q1: Bivariate Vs Multivariable

Bivariate analysis:
studying the relationship between two variables.
For example:
• Age and height
• Gender and smoking
• Smoking and coffee consumption

Multivariable (regression modelling/analysis):
studying the effect of multiple variables on an outcome variable.
For example:
• Effect of smoking, fast food, and coffee consumption on blood pressure.
• Effect of smoking, fast food, and coffee consumption on having a heart attack.

Q2: Difference Vs Correlation

Difference:
to study the difference between two or more groups, or two or more conditions.
For example:
• The difference between males and females regarding coffee consumption
• The difference in body weight before and after being on a specific diet.

Correlation:
to study the association between two variables.
For example:
• The association between age and weight
• The association between coffee consumption and the number of sleeping hours.

Q3: Independent Vs Paired data

Independent data:
• The observations in each sample are not related
• There is no relationship between the subjects in each sample. Subjects in the first group cannot also be in the second group
• No subject in either group can influence subjects in the other group
• No group can influence the other group

Paired data:
• Pre-test/post-test samples (a variable is measured before and after an intervention)
• Cross-over trials
• Matched samples
• When a variable is measured twice or more on the same individual

Q4: Type of outcome and normality of distribution


Test for Normality
To test for normality of the quantitative
variables:
1- Kolmogorov-Smirnov Test
2- Shapiro-Wilk Test
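
As a rough illustration of the two normality tests named above, the sketch below runs both on one sample. It assumes Python with SciPy installed (the slides themselves use SPSS); the function name `normality_p_values` and the hemoglobin readings are hypothetical and invented for illustration only.

```python
from statistics import mean, stdev

from scipy import stats


def normality_p_values(values):
    """Return the Shapiro-Wilk and Kolmogorov-Smirnov p-values for one sample."""
    shapiro_p = stats.shapiro(values).pvalue
    # KS test against a normal curve with the sample's own mean and SD.
    # Estimating these from the data makes the KS p-value only approximate.
    ks_p = stats.kstest(values, "norm", args=(mean(values), stdev(values))).pvalue
    return {"shapiro": shapiro_p, "kolmogorov_smirnov": ks_p}


# Hypothetical hemoglobin readings (g/dL), invented for illustration only.
hemoglobin = [11.2, 12.5, 13.1, 12.0, 11.8, 12.9, 13.4, 12.2, 11.5, 12.7]
p_values = normality_p_values(hemoglobin)
for test_name, p in p_values.items():
    verdict = "no evidence against normality" if p > 0.05 else "not normally distributed"
    print(f"{test_name}: p = {p:.3f} ({verdict})")
```

A p-value above 0.05 in these tests is usually read as "no evidence against normality", which is the condition the later slides rely on before choosing a t-test or ANOVA.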



Q5: Number of groups /conditions

For example:
• Are we comparing two groups (diseased, not diseased), or three groups (normal, osteopenia, osteoporosis)?
• Are we comparing two conditions (pre-test, post-test), or three conditions (before the operation, during the operation, after the operation)?

Assumption of Homogeneity of variances

• Homogeneity of variances (similar standard deviations) means that the variable we are studying has the same variance across groups. We need to test for the equality of variances between groups when using some statistical tests, e.g. independent t-tests and one-way ANOVA.

• Homogeneity of variances is tested using Levene’s test.

• Interpretation of the test result: If the p-value is < 0.05, reject H0 and conclude that the assumption of equal variances has not been met.

• If the p-value is > 0.05, we fail to reject the null hypothesis and treat the variances as equal.

• If the homogeneity of variance assumption is not met, the standard tests should not be used; modified tests that do not assume equal variances can be used instead.
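
The decision rule above can be sketched in a few lines. This assumes Python with SciPy rather than SPSS, and the two groups of measurements are hypothetical values invented for illustration.

```python
from scipy import stats

# Hypothetical measurements in two groups, invented for illustration only.
treatment = [12.1, 13.4, 11.8, 14.2, 12.9, 13.0, 12.5, 13.8]
control = [11.0, 11.9, 10.8, 12.3, 11.5, 11.2, 12.0, 11.7]

# center="mean" gives the classic Levene statistic; SciPy's default is "median".
statistic, p_value = stats.levene(treatment, control, center="mean")

# p < 0.05: reject H0, so the equal-variance assumption is not met.
equal_variances = bool(p_value >= 0.05)
print(f"Levene W = {statistic:.3f}, p = {p_value:.3f}, equal variances: {equal_variances}")
```

The `center="mean"` choice is an assumption made here to stay close to the classic form of the test; the median-centred default is a more robust variant.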

Paired t-Test



Independent t test
Assumption: Normality (the dependent variable should be normally distributed within each group)
How to check: Tests of normality (Shapiro-Wilk, Kolmogorov-Smirnov)
What to do if the assumption is not met: Use the Mann-Whitney U test

Assumption: Homogeneity of variance (standard deviation)
How to check: Levene’s test (part of standard SPSS output)
What to do if the assumption is not met: Use the bottom row of the t test output in SPSS, “equal variances not assumed”



Example
• Comparing hemoglobin level of patients in the treatment and
control groups.
Steps:
• Step 1: We test if hemoglobin is normally distributed in both
groups using Shapiro-Wilk test, or Kolmogorov-Smirnov test.
• Step 2: After confirmation that hemoglobin is normally
distributed in both groups, we use the independent sample t
test.
• Step 3: We check the result of Levene’s test for the
homogeneity of variance which is part of the output in SPSS to
decide which row should be used for reporting the result (The
first row is for equal variance, and the second is for the non-
equal variance).
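
The three steps above can be sketched as one function. This is a minimal sketch assuming Python with SciPy instead of SPSS; `compare_two_groups` and the hemoglobin values are hypothetical. The "equal variances not assumed" row of the SPSS output corresponds to Welch's t test, which SciPy runs when `equal_var=False`.

```python
from scipy import stats


def compare_two_groups(group_a, group_b, alpha=0.05):
    """Follow the steps above and return (name of test used, p-value)."""
    # Step 1: normality within each group (Shapiro-Wilk).
    normal = all(stats.shapiro(g).pvalue > alpha for g in (group_a, group_b))
    if not normal:
        # Non-parametric alternative named in the assumptions table.
        return "Mann-Whitney U test", stats.mannwhitneyu(group_a, group_b).pvalue
    # Step 3: Levene's test decides which "row" of the t test to report.
    equal_var = stats.levene(group_a, group_b, center="mean").pvalue > alpha
    # Step 2: independent-samples t test (Welch's version if variances differ).
    p_value = stats.ttest_ind(group_a, group_b, equal_var=equal_var).pvalue
    name = "independent t test" if equal_var else "t test, equal variances not assumed"
    return name, p_value


# Hypothetical hemoglobin levels (g/dL), invented for illustration only.
treatment = [12.9, 14.1, 11.5, 13.6, 12.2, 15.0, 10.9, 13.3, 12.7, 14.4]
control = [11.4, 10.2, 12.6, 11.9, 10.8, 12.1, 9.9, 11.0, 12.9, 11.6]
test_name, p_value = compare_two_groups(treatment, control)
print(f"{test_name}: p = {p_value:.3f}")
```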
Interpretation of the result:

The hemoglobin level was higher in the treatment group (12.86 ± 1.69) than in the control group (11.37 ± 1.26); a statistically significant difference of 1.49 (95% CI: 0.53, 2.45) was found, p = .003.
Age did not differ significantly between the treatment group (32.55 ± 5.60) and the control group (30.15 ± 5.69), p = .187.

Paired t-test
Assumption: Normality (the paired differences should be normally distributed)
How to check: Tests of normality (Shapiro-Wilk, Kolmogorov-Smirnov)
What to do if the assumption is not met: Use the Wilcoxon signed rank test



Example:
• Comparing the weight of a group of individuals before and
after being on a specific diet to see if there is any difference.
Steps:
• Step 1: We calculate the difference between the two
readings.
• Step 2: We test if the difference is normally
distributed using Shapiro-Wilk test, or Kolmogorov-
Smirnov test.
• Step 3: After confirmation that the difference is
normally distributed, we use the paired t test.
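
The steps above can be sketched as follows, assuming Python with SciPy rather than SPSS. The before and after weights are hypothetical values invented for illustration; the fallback to the Wilcoxon signed rank test follows the assumptions table for the paired t-test.

```python
from scipy import stats

# Hypothetical body weights (kg) of the same eight people, invented for illustration.
before = [82.0, 75.5, 91.2, 68.4, 77.9, 85.3, 73.1, 79.6]
after = [78.5, 74.0, 86.1, 67.9, 73.2, 80.8, 72.5, 75.0]

# Step 1: difference between the two readings for each person.
differences = [a - b for a, b in zip(after, before)]

# Step 2: normality of the differences (Shapiro-Wilk).
differences_normal = stats.shapiro(differences).pvalue > 0.05

# Step 3: paired t test if normal, otherwise the Wilcoxon signed rank test.
if differences_normal:
    test_name = "paired t test"
    p_value = stats.ttest_rel(after, before).pvalue
else:
    test_name = "Wilcoxon signed rank test"
    p_value = stats.wilcoxon(after, before).pvalue

mean_change = sum(differences) / len(differences)
print(f"{test_name}: mean change = {mean_change:.2f} kg, p = {p_value:.3f}")
```

Note that normality is checked on the differences, not on the two sets of readings separately.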



Interpretation of the result:

The mean weight before the program was 71.61 kg (SD = 12.31), and the mean weight after the program was 63.79 kg (SD = 10.95). A statistically significant mean change of -7.82 kg (95% CI: -13.63, -2.01) was found, p = 0.011.

The mean hemoglobin before the program was 11.37 (SD = 1.26), and the mean hemoglobin after the program was 11.98 (SD = 1.53). No statistically significant difference was found, p = 0.168.

Analysis of Variance

Dr. Omnia Elmahdy


Analysis of Variance
• Data sets are often summarized by giving their
Mean (for a description of the center) and
Standard Deviation (to describe the variation)
• Variance is the square of the Standard
Deviation
• The F-test is used to determine if there is a
Statistically Significant difference among three
or more Means
Analysis of Variance
• One way ANOVA test is used to test the
difference of means among three or more
independent normally distributed samples

• Example
– There are three types of training given to our healthcare workers. Do they result in different effects on the workers' performance?



Analysis of Variance
Criteria for using the one way ANOVA test

• Normally distributed quantitative samples
• The groups are assumed to have equal variances
• Testing the difference among three or more means



Dependent variable: Continuous
Independent variable: Categorical (at least 3 categories)

Usage: Used to examine the difference in means of 3 or more independent groups.
ANOVA uses the ratio of the “between-group variance” to the “within-group variance” to decide whether there are statistically significant differences between the groups or not.

Assumption: Normality (the dependent variable should be normally distributed within each group)
How to check: Tests of normality (Shapiro-Wilk, Kolmogorov-Smirnov)
What to do if the assumption is not met: Use the Kruskal-Wallis (non-parametric) test

Assumption: Homogeneity of variance
How to check: Levene’s test
What to do if the assumption is not met: Use the Welch test instead of ANOVA (adjusted for the differences in variance) or the Kruskal-Wallis test

Example:
Comparing the birthweight of a group of infants of mothers with
different smoking status (never smoked, quit before pregnancy,
smoke during pregnancy).

Steps:
• Step 1: We test if birth weight is normally distributed in the
three groups using Shapiro-Wilk, or Kolmogorov-Smirnov test.
• Step 2: After confirmation that birth weight is normally
distributed in the three groups, we run the one-way ANOVA
test and the Levene’s test for homogeneity of variance.
• Step 3: We check the result of Levene’s test for the homogeneity of variance. If there is no homogeneity of variance, we need to run the Welch test.
Analysis of Variance
Dependent variable: This is the item being measured that is
theorized to be affected by the independent variables.
Independent variable/s: These are the items being measured
that may have an effect on the dependent variable.

• Null Hypothesis: There is no difference among the means
• Alternative Hypothesis: At least one mean differs from the others
• Step 4: If the result of the one-way ANOVA is statistically
significant (p<0.05), we need to do a post hoc test.

Interpretation of the result:


• If the p-value ≥ 0.05, we conclude that there is no significant
difference between the groups.
• If the p-value < 0.05, we conclude that there is a significant
difference between at least one pair of the groups. Post-hoc
tests are used to test where the pairwise differences are.
So, the outcome must be normally distributed within each treatment group, the variances of the treatment groups should be equal, and the samples should be independent of each other.
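
The selection steps for comparing three or more independent groups can be sketched as below. This assumes Python with SciPy rather than SPSS; `compare_three_or_more` and the birth weights are hypothetical. When variances are unequal, this sketch falls back to the Kruskal-Wallis test, one of the two alternatives listed in the assumptions table, and does not attempt the Welch test.

```python
from scipy import stats


def compare_three_or_more(groups, alpha=0.05):
    """Follow the steps above and return (name of test used, p-value)."""
    # Step 1: normality within every group (Shapiro-Wilk).
    normal = all(stats.shapiro(g).pvalue > alpha for g in groups)
    if not normal:
        return "Kruskal-Wallis test", stats.kruskal(*groups).pvalue
    # Step 3: homogeneity of variance (Levene's test).
    equal_var = stats.levene(*groups, center="mean").pvalue > alpha
    if not equal_var:
        # Assumption of this sketch: use Kruskal-Wallis instead of the Welch test.
        return "Kruskal-Wallis test", stats.kruskal(*groups).pvalue
    # Step 2: one-way ANOVA.
    return "one-way ANOVA", stats.f_oneway(*groups).pvalue


# Hypothetical birth weights (g) by maternal smoking status, invented for illustration.
never_smoked = [3100, 3450, 2980, 3300, 3620, 3050, 3210, 3390]
quit_before = [2950, 3200, 3080, 2760, 3340, 3010, 2890, 3150]
smoked_during = [2600, 2850, 2480, 2700, 2950, 2550, 2780, 2630]

test_name, p_value = compare_three_or_more([never_smoked, quit_before, smoked_during])
needs_post_hoc = bool(p_value < 0.05)  # Step 4: post hoc test only if significant
print(f"{test_name}: p = {p_value:.4f}, post hoc needed: {needs_post_hoc}")
```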



F = (variation between sample means) / (variation within the samples)
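
Written out in the usual textbook notation, the ratio above takes the following form. The symbols are assumptions introduced here and do not appear on the slides: k is the number of groups, N the total number of observations, n_i and \bar{x}_i the size and mean of group i, and \bar{x} the grand mean.

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)},
\qquad
SS_{\text{between}} = \sum_{i=1}^{k} n_i \, (\bar{x}_i - \bar{x})^2,
\qquad
SS_{\text{within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
```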

The Treatment Variation is the variability that can be
attributed to the difference between the different
treatment groups (difference in sample means).
The Error Variation is the variability that can be
attributed to the random error associated with the
response variable. This is the variability within the
different treatment groups.

Treatment Variation / Error Variation = F score

Analysis of Variance

• One-way: a single factor (one independent variable)
• Two-way: two factors (two independent variables)
If you have three independent variables rather than two, you need a three-way ANOVA.



Analysis of Variance
A researcher wishes to try three different
techniques to lower blood pressure of
individuals diagnosed with high blood pressure.
The subjects are randomly assigned to three
groups; the first group takes medication, the
second group exercises, and the third group
follows a special diet.
After four weeks, the reduction in each person’s
blood pressure is recorded.
Analysis of Variance

Medication Exercise Diet

10 6 5
12 8 9
9 3 12
15 0 8
13 2 4
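
To make the F score concrete, the sketch below computes it by hand for the table above, using only the Python standard library. The variable names are illustrative and not from the slides.

```python
from statistics import mean

# Reductions in blood pressure, copied from the table above.
groups = {
    "Medication": [10, 12, 9, 15, 13],
    "Exercise": [6, 8, 3, 0, 2],
    "Diet": [5, 9, 12, 8, 4],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)
k = len(groups)            # number of groups
n_total = len(all_values)  # total number of observations

# Between-group (treatment) variation: group means around the grand mean.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# Within-group (error) variation: observations around their own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_score = ms_between / ms_within

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"F({k - 1}, {n_total - k}) = {f_score:.2f}")  # F(2, 12) = 9.17
```

With 2 and 12 degrees of freedom, the tabulated critical value at the 0.05 level is about 3.89, so an F score of about 9.17 indicates that at least one group mean differs.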



Post-hoc Test
Post hoc tests are designed for situations in which the
researcher has already obtained a significant F-test
with a factor that consists of three or more means and
additional exploration of the differences among means
is needed to provide specific information on which
means are significantly different from each other
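
As one possible illustration, the sketch below applies Tukey's HSD, a common post hoc test, to the blood pressure data from the earlier table. It assumes Python with SciPy 1.8 or later, which provides `scipy.stats.tukey_hsd`; the slides do not say which post hoc test was used, so the choice of Tukey's HSD is an assumption.

```python
from scipy import stats

# Reductions in blood pressure, copied from the blood pressure example.
medication = [10, 12, 9, 15, 13]
exercise = [6, 8, 3, 0, 2]
diet = [5, 9, 12, 8, 4]
names = ["Medication", "Exercise", "Diet"]

# Tukey's HSD compares every pair of group means after a significant F-test.
result = stats.tukey_hsd(medication, exercise, diet)

significant_pairs = []
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        p = result.pvalue[i, j]
        if p < 0.05:
            significant_pairs.append((names[i], names[j]))
        print(f"{names[i]} vs {names[j]}: p = {p:.3f}")
```

In this data set only the Medication and Exercise groups differ significantly at the 0.05 level.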

How to report the result:
Report the mean and standard deviation of each group, the p-
value for one-way ANOVA, and the significant pairwise
differences.



A significant difference was found among the groups, p = 0.037. A post hoc test was used to determine the nature of the differences between those groups. This analysis revealed that the birth weight of infants of mothers who smoked during pregnancy was lower (M = 2606, SD = 334) than that of infants of mothers who never smoked (M = 3101, SD = 411). The birth weight of infants of mothers who quit smoking before pregnancy (M = 2959, SD = 490) was not significantly different from either of the other two groups.

Reporting non-significant results:


We conducted a one-way ANOVA test to compare the age of
mothers with different smoking behavior (never smoke, quit before
pregnancy, smoke during pregnancy). No statistically significant
difference was found among the three groups, p=0.782.



Homework
You are planning an experiment that will
involve 4 equally sized groups, including 3
experimental groups and a control. Each
group will contain (n) observations.

Your expectation is that each of the 3 experimental treatments will have approximately the same effect, and that this effect will be small relative to the control.



Observation   Control   Exp1   Exp2   Exp3
1             118       107    133    134
2             121       165    154    176
3             97        121    91     171
4             86        126    63     159
5             118       87     62     118
6             45        135    164    125
7             119       83     96     100
8             92        100    129    60
9             91        144    128    163
10            72        119    105    111

Full Factorial ANOVA (also called two-way ANOVA)

Full Factorial ANOVA is used when there are two independent variables (called factors). Its assumptions are:
• Continuous: As in a one-way ANOVA, the dependent variable should be continuous.
• Independence: Each sample is independent of the other samples, with no crossover.
• Variance: The variance in the data is the same across the different groups.
• Normality: The samples are drawn from a normally distributed population.
• Categories: The independent variables should be categorical, with separate categories or groups.



The primary purpose of a two-way ANOVA is to
understand if there is an interaction between the two
independent variables on the dependent variable.

For example, you could use a two-way ANOVA to understand whether there is an interaction between gender and educational level on test anxiety amongst university students, where gender (males/females) and education level (undergraduate/postgraduate) are your independent variables, and test anxiety is your dependent variable.



Repeated measures ANOVA

• The repeated measures ANOVA compares means across one or more variables that are based on repeated observations.
• The repeated measures ANOVA is similar to the dependent sample t-test, because it also compares the mean scores of the same group across different observations.



• Consider, for example, a drug trial where the participants have individual differences that might have an impact on the outcome of the trial.
• The typical drug trial splits all participants into a control group and a treatment group and measures the effect of the drug from month 1 to month 18. The repeated measures ANOVA can correct for the individual differences or baselines.
• The baseline differences that might have an effect on the outcome could be typical parameters such as blood pressure, age, or gender. Thus the repeated measures ANOVA analyzes the effect of the drug while excluding the influence of different baseline levels of health when the trial began.
A typical guideline to determine whether the repeated
measures ANOVA is the right test is to answer the
following three questions:

• Is there a direct relationship between each pair of observations, e.g., before vs. after scores on the same subject?
• Are the observations of the data points definitely not random (i.e., they must not be a randomly selected specimen of the same population)?
• Do all observations have to have the same number of data points?

