10.1 Data Analysis and Interpretation
DATA ANALYSIS
Associate professor
Fellow at Faculty of Medicine
University of Khartoum
Sudan
There are three kinds of people:
John Stull
RESEARCH PROPOSAL: DATA ANALYSIS SECTION
Researchers are expected to state the following:
Software for data entry, cleaning and analysis of data
Descriptive statistics (univariate analysis)
Inferential statistics
Bivariate (cross-tabs)
Multivariate analysis (regression analysis)
Clearly state your dependent variable/s
Statistical tests
level of significance (α)
DATA ANALYSIS AND INTERPRETATION
Think about analysis EARLY
Start with a plan
Code, enter, clean
Analyze
Interpret
Reflect
What did we learn?
What conclusions can we draw?
What are our recommendations?
What are the limitations of our analysis?
VARIABLE TYPES
o Qualitative/ Quantitative data
o Qualitative (categorical) variables: Nominal (gender)/ Ordinal (knowledge)
o Quantitative (numerical) variables: Interval (temperature)/ Ratio (length)
Numerical variables are either:
o Discrete variables, which can only take on a limited set of separate values (e.g. counts)
o Continuous variables (where you can find a value between any other two values)
Role in the analysis:
o Dependent variable/s
o Independent (explanatory) variables
TYPES OF DATA
Types of data
o Categorical (qualitative): Nominal, Ordinal
o Numerical (quantitative): Interval, Ratio
MEASUREMENT LEVELS
o Ratio variables have a non-arbitrary zero point that is the same for any scale you use,
e.g. length (zero length is the same whether you measure in inches or cm)
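A small sketch of why the zero point matters, assuming illustrative values that are not from the slides: a ratio between two lengths survives a change of units, while a ratio between two temperatures does not, because 0 °C and 0 °F are arbitrary points. The helper names are hypothetical.

```python
def cm_to_inch(cm):
    """Convert centimetres to inches (1 inch = 2.54 cm)."""
    return cm / 2.54


def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


# Ratio scale: "twice as long" holds in cm and in inches.
length_ratio_cm = 20.0 / 10.0
length_ratio_inch = cm_to_inch(20.0) / cm_to_inch(10.0)

# Interval scale: "twice as hot" in Celsius is not "twice as hot" in Fahrenheit.
temp_ratio_c = 20.0 / 10.0
temp_ratio_f = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

print("length ratio:", length_ratio_cm, "vs", round(length_ratio_inch, 3))
print("temperature ratio:", temp_ratio_c, "vs", round(temp_ratio_f, 3))
```

Zero length maps to zero in both units, whereas 0 °C maps to 32 °F, which is why only differences, not ratios, are meaningful for interval variables.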
DESCRIPTIVE VS INFERENTIAL
“One thousand households (HHs) and 3,628 individuals surveyed. The presence of any net
varies between 6.6% and 40% and those who reported sleeping under mosquito nets last night
varies between 35 to 80%. Prompt use of medications ranged between 14 to 48% with a delay of
more than 24 hours noticed in different areas….”
Descriptive statistics summarize your group
“Patients originated from the north were significantly older than patients from Khartoum and
Gazera, P<0.001…..”
Inferential statistics use the theory of probability to make inferences about larger
populations from a sample
WHY?
Descriptive Statistics
Identify patterns
Identify outliers
Guide choice of statistical test
Lead to hypothesis generation
Inferential Statistics
Determine the likelihood that a conclusion based on data
from a sample is true
Distinguish true differences from random variation
Allow hypothesis testing
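As an illustration of the descriptive side (summarizing a group and flagging outliers), here is a minimal sketch using Python's standard `statistics` module. The values are invented, and the 1.5 × IQR fence is one common convention for outliers, assumed here rather than taken from the slides.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical measurements; the last value is deliberately extreme.
values = [10, 12, 12, 13, 13, 14, 14, 15, 16, 40]

# Summary statistics describe the group itself.
summary = {
    "n": len(values),
    "mean": mean(values),
    "median": median(values),
    "sd": stdev(values),
}

# Quartiles and the 1.5 * IQR rule (an assumed convention) to flag outliers.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(summary)
print("outliers:", outliers)
```

The mean is pulled up by the extreme value while the median is not, which is the kind of pattern that guides the later choice between parametric and non-parametric tests.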
HOW TO DEAL WITH DATA?
1. State your hypotheses (both null and alternative)
2. Identify the dependent and independent variables, and the type of each
one (categorical (qualitative)/ numerical (quantitative))
3. If you have a quantitative variable, check normality (assume a normal
distribution if the P value of the Shapiro-Wilk test is >0.05)
4. Select the appropriate test based on steps 2 and 3 (next slides)
5. Do the calculation
6. Read the tabulated value at the appropriate degrees of freedom and level
of significance
7. Compare the calculated and tabulated values
8. Write the value of P and the significance of the result
9. Decide on the hypothesis
10. Write the interpretation of the result
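Steps 5 to 9 can be sketched for one case: an independent-samples t-test with pooled variance. This is a minimal illustration under stated assumptions, not a full workflow. The Hb values are invented, normality is assumed rather than tested (a Shapiro-Wilk test is available in SciPy as `scipy.stats.shapiro`, not used here), and the tabulated value 2.228 is the two-tailed critical t for 10 degrees of freedom at α = 0.05.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical Hb values (g/dl) in two unrelated groups.
group_a = [12.0, 13.0, 14.0, 15.0, 16.0, 14.0]
group_b = [10.0, 11.0, 12.0, 13.0, 14.0, 12.0]

ALPHA = 0.05
T_TABULATED = 2.228  # two-tailed critical value for df = 10, alpha = 0.05


def pooled_t(x, y):
    """Return the pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / standard_error, df


# Step 5: do the calculation.
t_calculated, df = pooled_t(group_a, group_b)

# Steps 6-7: compare the calculated value with the tabulated value.
reject_null = abs(t_calculated) > T_TABULATED

# Steps 8-9: report and decide.
print(f"t = {t_calculated:.3f}, df = {df}")
print("P < 0.05, reject the null hypothesis" if reject_null
      else "P >= 0.05, do not reject the null hypothesis")
```

With these invented numbers the calculated t is about 2.45, which exceeds the tabulated 2.228, so the null hypothesis of equal means is rejected at the 5% level.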
DATA ANALYSIS: STATISTICAL TESTS
Statistical tests
o Correlation (Spearman)
o Chi-squared
o One-sample t-test (observed/ normal value)
o Paired t-test (PR before and after taking a drug)
o Independent t-test (SBP among male and female)
o ANOVA (age and level of disease severity)
o Linear regression
GUIDE TO SELECTION OF APPROPRIATE STATISTICAL TEST FOR UNRELATED SAMPLES
Dependent: quantitative, normally distributed
o Independent qualitative with 2 groups: T-test
o Independent qualitative with >2 groups: One-way ANOVA
o Independent quantitative, or one or more qualitative with or without quantitative: Linear regression or correlation
o Examples, written as dependent/(independent): Hb/(sex), Hb/(residency), age/(disease severity), weight/(age)
Dependent: quantitative, not normally distributed
o Independent qualitative with 2 groups: Mann-Whitney
o Independent qualitative with >2 groups: Kruskal-Wallis
o Independent quantitative: Spearman rank correlation
o Examples: survival time/(sex), survival time/(residency), survival time/(age)
Dependent: qualitative
o Independent qualitative: Chi-square
o Independent one or more qualitative or quantitative: Logistic regression
o Examples: malaria/(sex), malaria/(sex, age, residency, Hb, ..)
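The selection guide for unrelated samples can be written as a simple lookup. This is only a sketch: the key strings and the function name `select_test` are hypothetical labels chosen for illustration, and the mapping follows the guide as read from the slide.

```python
# Hypothetical keys: (type of dependent variable, type of independent variable).
TEST_GUIDE = {
    ("quantitative_normal", "qualitative_2_groups"): "T-test",
    ("quantitative_normal", "qualitative_more_than_2_groups"): "One-way ANOVA",
    ("quantitative_normal", "quantitative"): "Linear regression or correlation",
    ("quantitative_not_normal", "qualitative_2_groups"): "Mann-Whitney",
    ("quantitative_not_normal", "qualitative_more_than_2_groups"): "Kruskal-Wallis",
    ("quantitative_not_normal", "quantitative"): "Spearman rank correlation",
    ("qualitative", "qualitative"): "Chi-square",
    ("qualitative", "qualitative_or_quantitative"): "Logistic regression",
}


def select_test(dependent, independent):
    """Look up the suggested test for unrelated samples."""
    try:
        return TEST_GUIDE[(dependent, independent)]
    except KeyError:
        raise ValueError(
            f"No entry in the guide for {dependent!r} / {independent!r}"
        ) from None


# Hb (normally distributed) compared between two sexes.
print(select_test("quantitative_normal", "qualitative_2_groups"))
# Survival time (not normally distributed) across several residencies.
print(select_test("quantitative_not_normal", "qualitative_more_than_2_groups"))
```

A lookup like this only encodes the guide; checking normality and the number of groups still has to be done on the data first.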
GUIDE TO SELECTION OF APPROPRIATE STATISTICAL TEST FOR RELATED SAMPLES
Related samples
o Quantitative, follows normal distribution
o Quantitative, but does not follow normal distribution
o Qualitative
Confidence intervals…
P values …
TESTING THE DIFFERENCES
…
Parametric tests
(t-test, …)
Non-parametric tests
(Mann-Whitney,…)
Chi-squared (χ²)
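A minimal sketch of the chi-squared calculation for a table of counts, using only plain Python. The counts are invented, no continuity correction is applied, and 3.841 is the tabulated critical value for 1 degree of freedom at α = 0.05.

```python
def chi_squared(table):
    """Return the chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical 2x2 table: rows = group, columns = outcome present / absent.
observed_counts = [[20, 30],
                   [30, 20]]

chi2, df = chi_squared(observed_counts)
CHI2_TABULATED = 3.841  # critical value for df = 1, alpha = 0.05
significant = chi2 > CHI2_TABULATED

print(f"chi-squared = {chi2:.2f}, df = {df}, significant: {significant}")
```

Here every expected count is 25, so the statistic is 4.0, which is just above the tabulated value.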
COMPARING THE RISK …
Risk difference
Risk ratio
Odds ratio
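The three measures can be computed from a 2x2 table. The sketch below uses invented counts and a hypothetical function name; it gives point estimates only, without confidence intervals.

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return risk difference, risk ratio and odds ratio from 2x2 counts."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total

    # Odds = cases / non-cases within each group.
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)

    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "odds_ratio": odds_exposed / odds_unexposed,
    }


# Hypothetical data: 30 of 100 exposed and 10 of 100 unexposed developed the disease.
measures = risk_measures(30, 100, 10, 100)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these counts the risk difference is 0.20, the risk ratio is 3.0 and the odds ratio is about 3.86. The odds ratio moves further from the risk ratio as the outcome becomes more common.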
Correlation
Regression
SURVIVAL ANALYSIS
life tables
Kaplan-Meier
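A minimal sketch of the Kaplan-Meier product-limit estimate in plain Python, assuming invented follow-up data. An event flag of 1 means the event (e.g. death) was observed and 0 means the subject was censored; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time where an event occurred."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []

    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        # Both events and censored subjects leave the risk set afterwards.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
follow_up = [2, 3, 3, 5, 8]
event_observed = [1, 1, 0, 1, 0]

km_curve = kaplan_meier(follow_up, event_observed)
for t, s in km_curve:
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored subjects contribute to the number at risk up to their last follow-up time but never lower the curve themselves, which is the main difference from a simple proportion surviving.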
ANALYZING CLINICAL INVESTIGATIONS AND SCREENING…
Sensitivity
Specificity
Predictive values…
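These measures all come from the same 2x2 table of test result against true disease status. The sketch below uses invented counts and a hypothetical function name.

```python
def screening_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity and predictive values from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased who test positive
        "specificity": tn / (tn + fp),  # non-diseased who test negative
        "positive_predictive_value": tp / (tp + fp),  # test positives who are diseased
        "negative_predictive_value": tn / (tn + fn),  # test negatives who are not diseased
    }


# Hypothetical screening results: 100 diseased and 200 non-diseased people.
results = screening_measures(tp=90, fp=20, fn=10, tn=180)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group being screened.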
ANALYZING QUALITATIVE DATA
“Content analysis” steps:
1. Transcribe data (if audio taped)
2. Read transcripts
3. Highlight quotes and note why important
4. Code quotes according to margin notes
5. Sort quotes into coded groups (themes)
6. Interpret patterns in quotes
7. Describe these patterns
ENSURING VALIDITY IN QUALITATIVE ANALYSIS
Be systematic
Use multiple raters
Attend to context (e.g. keep track of who said what)
Account for outlying and surprising statements
Triangulate