Quantitative Research Data Analysis Lecturers Notes
Understand statistical terms related to quantitative research (confidence intervals and p-values)
Learn the use of basic statistical tests
Define sampling, data distribution and randomization
Explain probability and non-probability sampling and describe the different types of each
UU-PhD-801 Quantitative and Qualitative Research Methods
If the cause is not legitimate, the researcher should eliminate the outlying score from the analysis to
avoid distorting the results.
One-tailed test: A test of a statistical hypothesis, where the region of rejection is on only one side of the sampling
distribution, is called a one-tailed test. For example, suppose the null hypothesis states that the mean is less than or
equal to 10. The alternative hypothesis would be that the mean is greater than 10.
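Two-tailed test: When using a two-tailed test, regardless of the direction of the relationship you
hypothesize, you are testing for the possibility of the relationship in both directions. For example,
we may wish to compare the mean of a sample to a given value x using a t-test. Our null hypothesis
is that the mean is equal to x. The alternative hypothesis is that the mean is not equal to x, so a
sample mean that is either significantly greater or significantly less than x leads to rejection.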
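The difference between the two kinds of test can be illustrated with a small sketch. To stay within the Python standard library it uses a z statistic (normal approximation, with the population standard deviation assumed known) rather than the t-test mentioned above; the function name and the sample figures are assumptions made purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_values(sample_mean, hypothesised_mean, sigma, n):
    """Return (z, one-tailed p for 'mean > x', two-tailed p)."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - hypothesised_mean) / standard_error
    std_normal = NormalDist()  # mean 0, standard deviation 1
    p_one_tailed = 1 - std_normal.cdf(z)             # upper tail only
    p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    return z, p_one_tailed, p_two_tailed

# Hypothetical data: sample mean 10.8, hypothesised mean 10, sigma 3, n = 36
z, p_one, p_two = z_test_p_values(10.8, 10, 3, 36)
print(f"z = {z:.2f}, one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

When z falls in the hypothesised direction, the two-tailed p value is twice the one-tailed value, which is why the same data can be significant in a one-tailed test but not in a two-tailed test at the same alpha level.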
Alpha level and p value: In statistical analysis the researcher examines whether the results are statistically
significant.
The acceptance or rejection of a hypothesis is based upon a level of significance, the alpha (α) level.
This is typically set at the 5% (0.05) α level, followed in popularity by the 1% (0.01) α level.
So, what do we mean by levels of significance, and what can the 'p' value tell us?
The alpha level is the threshold at which you are prepared to accept the possibility of a Type I Error, otherwise
known as a false positive: rejecting a null hypothesis that is actually true.
The p value is the probability of obtaining results at least as extreme as those observed if the null hypothesis
were true. A result is called statistically significant when the p value is less than or equal to the chosen alpha level.
The question that significance levels answer is 'How confident can the researcher be that the results have not arisen
by chance?'
Now, you may be thinking that if results this extreme would arise by chance only 10 times out of 100 (p = 0.1), then
that is pretty significant.
However, what we are determining with our levels of significance is 'statistical significance', so we are much
stricter, and we would usually not accept p values greater than 0.05.
So when looking at the statistics in a research paper, it is important to check the 'p' values to find
out whether the results are statistically significant or not.
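The link between the alpha level and the Type I error can be checked with a simulation. The sketch below is illustrative only: it repeatedly draws samples from a population in which the null hypothesis is true (mean 0, standard deviation 1), runs a two-tailed z-test on each, and counts how often the null hypothesis is wrongly rejected. The function name, number of trials, sample size and seed are all assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def type_one_error_rate(alpha=0.05, trials=2000, n=30, seed=1):
    """Estimate how often a true null hypothesis is rejected at a given alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: the population mean really is 0
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / sqrt(n))            # known sigma = 1
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-tailed p value
        if p_value <= alpha:
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(f"Proportion of false positives at alpha = 0.05: {rate:.3f}")
```

The proportion printed should be close to the chosen alpha, which is what it means to say that alpha is the accepted risk of a false positive.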
Table 1. Statistical Symbols
Accessed: http://www.statisticshowto.com/statistics-symbols/
Below are a few of the more common tests used to analyse quantitative data:
T-test
The t-test is used to assess whether the means of two groups differ statistically from each other.
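As a rough illustration, the independent-samples t statistic can be computed by hand. This is a minimal sketch assuming equal variances in the two groups (the pooled-variance form); the function name and the scores are hypothetical, and the p value would still need to be looked up in a t table or obtained from a statistical package.

```python
from math import sqrt
from statistics import mean, variance

def independent_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups, pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, n_a + n_b - 2

# Hypothetical test scores for two groups
group_a = [12, 14, 15, 11, 13]
group_b = [9, 10, 12, 8, 11]
t, df = independent_t_statistic(group_a, group_b)
print(f"t = {t:.2f}, df = {df}")
```

For these made-up scores the statistic is t = 3.00 with 8 degrees of freedom. Standard t tables give a two-tailed critical value of 2.306 for df = 8 at alpha = 0.05, so this difference would be statistically significant.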
Pearson Correlation
Pearson's correlation is used to test the strength and direction of the linear relationship between two continuous
variables. The value for Pearson's correlation lies between -1.00 (perfect negative correlation) and +1.00 (perfect
positive correlation), with 0.00 indicating no linear correlation.
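The calculation behind Pearson's correlation can be sketched as follows. The function name and the data (hours of study against two sets of made-up scores) are assumptions for illustration. Note that the coefficient can be negative as well as positive, depending on the direction of the relationship.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products of deviations from the means
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

hours = [1, 2, 3, 4, 5]
score_up = [52, 55, 61, 64, 70]    # rises with hours
score_down = [70, 64, 61, 55, 52]  # falls with hours

r_up = pearson_r(hours, score_up)
r_down = pearson_r(hours, score_down)
print(f"r (rising) = {r_up:.3f}, r (falling) = {r_down:.3f}")
```

The two coefficients have the same size but opposite signs, one close to +1 and the other close to -1.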
Mann-Whitney U-test
The Mann-Whitney U-test is used to test for differences between two independent groups on a continuous
measure, e.g. do males and females differ in terms of their levels of anxiety?
This test requires one categorical variable with two groups (e.g. male/female gender) and one continuous variable
(e.g. anxiety level).
Basically, the Mann-Whitney U-test converts the scores on the continuous variable to ranks across the two groups
combined, and then evaluates whether the ranks for the two groups differ significantly. Because it works on ranks
rather than raw scores, the result is usually described as a comparison of the medians of the two groups rather than
their means.
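The ranking step can be made concrete with a short sketch that computes the U statistic by hand. The helper names and the anxiety scores are hypothetical. Tied scores receive the average of the ranks they span, and the significance of U would still have to be read from a table or obtained from a statistical package.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    ordered = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(ordered) and values[ordered[j + 1]] == values[ordered[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return the U statistic (the smaller of U for group A and U for group B)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

# Hypothetical anxiety scores
males = [12, 15, 11, 18, 14]
females = [20, 17, 22, 19, 16]
u = mann_whitney_u(males, females)
print(f"U = {u}")
```

A small U means the two sets of ranks barely overlap; here nearly all of the lowest ranks belong to one group.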
Chi-square test
There are two different types of chi-square test, and both involve categorical data.
One type, the chi-square goodness-of-fit test, compares the frequency count of what is expected in theory against
what is actually observed.
The second type is known as a chi-square test with two variables or the chi-square test for independence; it
assesses whether two categorical variables are related.
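The test for independence can be sketched as a comparison of observed counts with the counts expected if the two variables were unrelated. The function name and the 2 x 2 table below are assumptions for illustration, and no continuity correction is applied.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df

# Hypothetical counts: rows = gender, columns = smoker / non-smoker
observed = [[30, 20],
            [20, 30]]
chi_square, df = chi_square_independence(observed)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

Every expected count in this table is 25, so the statistic is 4.00 with 1 degree of freedom. This exceeds the tabled critical value of 3.841 for alpha = 0.05, which would suggest that the two variables are related.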
It is argued that nonparametric tests should be used when the data do not meet the assumptions of the parametric
test, particularly the assumption about normally distributed data. However, there are additional considerations
when deciding whether a parametric or nonparametric test should be used.
Reason 1: Parametric tests can perform well with skewed and nonnormal distributions
Parametric tests can perform well with continuous data that are not normally distributed if the
sample size guidelines demonstrated in the table below are satisfied.
*Note: These guidelines are based on simulation studies conducted by statisticians at Minitab.
Reason 2: Parametric tests can perform well when the spread of each group is different
While nonparametric tests do not assume that your data are normally distributed, they do have other assumptions
that can be hard to satisfy. For example, when using nonparametric tests that compare groups, a common
assumption is that the data for all groups have the same spread (dispersion). If the groups have a different spread,
then the results from nonparametric tests might be invalid.
The fact that a parametric test can be performed with nonnormal data does not imply that the mean is the best
measure of the central tendency for your data.
For example, the center of a skewed distribution (e.g. income), can be better measured by the median where 50%
are above the median and 50% are below. However, if you add a few billionaires to a sample, the mathematical
mean increases greatly, although the income for the typical person does not change.
When the distribution is skewed enough, the mean is strongly influenced by changes far out in the distribution’s
tail, whereas the median continues to more closely represent the center of the distribution.
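The income example can be reproduced in a few lines. The figures are invented for illustration; the point is only to show how differently the mean and the median respond when two extreme values are added to the tail.

```python
from statistics import mean, median

# Hypothetical annual incomes for seven typical people
incomes = [28_000, 32_000, 35_000, 40_000, 45_000, 50_000, 60_000]
# The same sample with two billionaires added
with_billionaires = incomes + [1_000_000_000, 2_000_000_000]

print(f"mean before:   {mean(incomes):,.0f}")
print(f"median before: {median(incomes):,.0f}")
print(f"mean after:    {mean(with_billionaires):,.0f}")
print(f"median after:  {median(with_billionaires):,.0f}")
```

The median moves only from 40,000 to 45,000, while the mean jumps from about 41,000 to over 300 million, a value that describes nobody in the sample.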
Reason 3: You have ordinal data, ranked data, or outliers that you cannot remove
Typical parametric tests can only assess continuous data, and the results can be seriously affected by
outliers. Conversely, some nonparametric tests can handle ordinal data and ranked data without being
significantly affected by outliers.
For example, a study that assesses the effects of habitual physical activity on body fat in children might have a
medium effect size (e.g. see Rowlands et al., 1999): in that study, there was a moderate correlation between
habitual physical activity and body fat. A large effect size may be anticipated in a study that assesses the effects
of a very low energy diet on body fat in overweight women (e.g. see Eston et al., 1995). In Eston et al.'s study, a
significant reduction in total energy intake resulted in a substantial decrease in total body mass and the percentage
of body fat.
The effect size should be estimated during the design stage of a study, as this will allow the researcher to determine
the sample size required to give adequate power for a given alpha (i.e. significance level). Therefore, the study can
be designed to ensure that there is sufficient power to detect the effect of interest, that is, minimising the
possibility of a Type II error.
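How effect size, alpha and power combine to give a sample size can be sketched with the usual normal-approximation formula for comparing two group means, n per group = 2 × ((z for alpha + z for power) / d)². This is an approximation, not the exact t-based calculation performed by dedicated power-analysis software; the function name and the default power of 0.80 are assumptions.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-tailed, two-group comparison."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = std_normal.inv_cdf(power)          # z corresponding to the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)

for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

The smaller the expected effect, the larger the sample needed: roughly 393 per group for d = 0.2 against about 25 per group for d = 0.8 under this approximation. Exact calculations give slightly larger numbers.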
Table 3. Small, medium and large effect sizes as defined by Cohen
When empirical data are available, they can be used to assess the effect size for a study. However,
for some research questions it is difficult to find enough information (e.g. there is limited empirical
information on the topic or insufficient detail provided in the results of the relevant studies) to
estimate the expected effect size. In order to compare effect sizes of studies that differ in sample
size, it is recommended that, in addition to reporting the test statistic and p value, the appropriate
effect size index is also reported.
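One commonly reported effect size index for the difference between two group means is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below is illustrative; the function name and the scores are hypothetical, and d is only one of several possible indices.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = sqrt(((n_a - 1) * variance(group_a)
                      + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd

# Hypothetical test scores for two groups
group_a = [12, 14, 15, 11, 13]
group_b = [9, 10, 12, 8, 11]
d = cohens_d(group_a, group_b)
print(f"Cohen's d = {d:.2f}")
```

Against Cohen's widely cited benchmarks for d (0.2 small, 0.5 medium, 0.8 large), the value of about 1.90 for these made-up scores would count as a large effect. Unlike the p value, d does not grow simply because the sample is larger.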
SPSS – The Statistical Package for the Social Sciences (SPSS) is one of the most popular software packages in social
science research. SPSS is comprehensive and compatible with almost any type of data and can be used to run both
descriptive statistics and other more complicated analyses, as well as to generate reports, graphs, plots and trend
lines based on data analyses.
STATA – This is an interactive program that can be used for both simple and complex analyses. It can also
generate charts, graphs and plots of data and results. This program seems a bit more complicated than other
programs as it uses four different windows including the command window, the review window, the result window
and the variable window.
SAS – The Statistical Analysis System (SAS) is another very good statistical software package that can be useful
with very large data sets. It has additional capabilities that make it very popular in the business world because it
can address issues such as business forecasting, quality improvement, planning, and so forth. However, some
knowledge of programming is necessary to use the software, making it a less appealing option for some
researchers.
R programming – R is an open source programming language and software environment for statistical computing
and graphics that is supported by the R Foundation for Statistical Computing. The R language is commonly used
among statisticians and data miners for developing statistical software and data analysis.
distribution, and the height and width of the graph are determined by the standard deviation. When
the standard deviation is large, the curve is short and wide; when the standard deviation is small, the
curve is tall and narrow. Normal distribution graphs look like a symmetric, bell-shaped curve, as
shown above. When measuring things like people's height, weight, salary, opinions or votes, the
graph of the results is often approximately a normal curve, although some of these, such as salary,
can be noticeably skewed (as in the income example earlier).
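The effect of the standard deviation on the shape of the curve can be checked numerically. This is a small sketch using two normal distributions with the same mean and arbitrarily chosen standard deviations of 1 and 2.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)  # small standard deviation
wide = NormalDist(mu=0, sigma=2)    # large standard deviation

# Height of each curve at its centre (the mean)
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)

# Proportion of values within one standard deviation of the mean
within_one_sd = narrow.cdf(1) - narrow.cdf(-1)

print(f"peak height, sd = 1: {peak_narrow:.3f}")
print(f"peak height, sd = 2: {peak_wide:.3f}")
print(f"share within one sd of the mean: {within_one_sd:.3f}")
```

Doubling the standard deviation halves the height of the peak, so the curve becomes shorter and wider. In either case about 68% of values fall within one standard deviation of the mean.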
References
Abramson, J. H., Abramson, Z. H. (2008). Scales of Measurement. Research Methods in Community Medicine:
Surveys, Epidemiological Research, Programme Evaluation, Clinical Trials, Sixth Edition, 125-132.
Blaikie, N. (2003). Analyzing quantitative data: From description to explanation. Sage Publications.
Burns, N., Grove, S. K. (2005). The Practice of Nursing Research: Conduct, Critique, and Utilization (5th Ed.).
St. Louis: Elsevier Saunders.
Creswell, J. W. (2013). Research design: Qualitative, quantitative, and mixed methods approaches. Sage
Publications, Incorporated.
Eston, R. G., Fu, F., Fung, L. (1995). Validity of conventional anthropometric techniques for estimating
body composition in Chinese adults. Br J Sports Med, 29, 52–6.
Frost, J. (2015). Choosing Between a Nonparametric Test and a Parametric Test. Retrieved from
http://blog.minitab.com/blog/adventures-in-statistics-2/choosing-between-a-nonparametric-test-and-a-parametric-test
Langley, C., Perrie, Y. (2014). Maths Skills for Pharmacy: Unlocking Pharmaceutical Calculations. Oxford
University Press.
Rosenthal, R. (1991). Meta-analytic procedures for social research (revised edition). Newbury Park, CA: Sage.
Rowlands, A. V., Eston, R. G., Ingledew, D. K. (1999). The relationship between activity levels, body fat and
aerobic fitness in 8–10 year old children. J Appl Physiol, 86, 1428–35.