Quantitative Research Methods

From Theory to Publication

Nicholas Harkiolakis

Copyright © 2020 Nicholas Harkiolakis


All rights reserved.
ISBN: 1543148131
ISBN-13: 9781543148138

Preface
Attempting to describe quantitative research methods in one volume is quite a challenging endeavor, and it should be noted here that this book by no means attempts to exhaustively present everything under the sun on the subject. Interested readers will need to expand on what is presented here by searching the extant literature for what best suits their research needs. Having
said that, the book does cover the great majority of quantitative methods found in
social sciences research.
The motivation for developing this book came from years of delivering
quantitative methods courses for graduate programs in Europe and the USA. Through
exposure to such programs it became apparent that while most students had some prior exposure, mainly to statistics, by the time they entered graduate studies most of their understanding of and familiarity with quantitative techniques had been forgotten or was only vaguely remembered. In many cases, what remained was the impression of how much they
“hated” the subject. Overcoming this negative predisposition required a re-
introduction of basic concepts and a fast-track approach to higher and more advanced
methods of analysis.
These realities guided the development of this book, and so the assumption is made that the reader knows nothing about quantitative research or about research in general. All concepts presented in the book are defined and
introduced. Also, alternative and overlapping expressions and keywords used in
quantitative research are presented so the reader can identify them in their readings of
academic research. Whether this “zero-to-hero” approach succeeded is left for the
reader to judge.
Additional effort was made to include examples that are easily replicated in
spreadsheets like Excel so the users can manually repeat them at their convenience.
Regarding the use of software, commands for executing the various methods for SPSS
are given in footnotes to avoid diverting from the core narrative of the text. The
interested reader can easily retrieve a plethora of material from the Internet with step-
by-step instructions for most of the analysis techniques discussed here and for the
most popular statistical software packages. The book’s website at
www.harkiolakis.com/books/quan provides additional material for executing the
methods discussed here with SPSS, as well as all book images in higher resolution and
links to other sources online. For instructors who are interested in using the book as
a textbook, data sets and exercises on the methods included in the book are available
upon request.

For researchers in a hurry to get their hands dirty it is recommended to start
with Chapters 2 and 3, replace the Chapter 4 material with the cheat sheets in
Appendix C, and continue with Chapter 5. If needed, they can always delve into the more advanced material of Chapter 6. For someone who is interested in the details behind the most frequently used methods, Chapter 4 should be considered as well as Chapter 6.

Table of Contents

1 Philosophical Foundations.............................................................. 8
1.1 Ontology ....................................................................................................... 9
1.1.1 Realism ............................................................................................ 10
1.1.2 Relativism........................................................................................ 11
1.2 Epistemology .............................................................................................. 13
1.2.1 Positivism ........................................................................................ 16
1.2.2 Constructionism .............................................................................. 17
1.3 Methodology ............................................................................................... 18
1.3.1 Quantitative ..................................................................................... 21
1.3.2 Qualitative ....................................................................................... 23
1.3.3 Mixed methods ................................................................................ 25
2 The Quantitative Research Process ............................................. 28
2.1 Literature Review ....................................................................................... 30
2.2 Theoretical Framework ............................................................................... 32
2.3 Research Questions and Hypotheses .......................................................... 34
2.4 Research Design ......................................................................................... 36
2.4.1 Research Design Perspectives ......................................................... 38
2.4.2 Sampling.......................................................................................... 45
2.4.3 Variables and Factors ...................................................................... 49
2.4.4 Instruments ...................................................................................... 55
2.4.5 Data Collection, Processing, and Analysis...................................... 61
2.4.6 Evaluation of Findings .................................................................... 62
2.5 Conclusions and Recommendations ........................................................... 66
3 Populations ..................................................................................... 68
3.1 Profiling ...................................................................................................... 68
3.2 Probabilities ................................................................................................ 79
3.3 Distributions................................................................................................ 82
3.3.1 Normal Distribution......................................................................... 86
3.3.2 Chi-Square Distribution................................................................... 95
3.3.3 Binomial Distribution ...................................................................... 97
3.3.4 t Distribution .................................................................................... 99
3.3.5 F Distribution................................................................................. 101
3.3.6 Distribution-Free and Non-Parametric .......................................... 102
4 Samples ......................................................................................... 108
4.1 Statistics .................................................................................................... 115
4.2 One-sample Case ...................................................................................... 117
4.2.1 One Normal-like Scale Variable ................................................... 118
4.2.2 One Non-Parametric Scale Variable ............................................. 120
4.2.3 Two Normal-like Scale Variables ................................................. 122
4.2.4 Two Non-Parametric Scale Variables ........................................... 132
4.2.5 Many Normal-like Scale Variables ............................................... 135
4.2.6 Many Non-Parametric Scale Variables ......................................... 142
4.2.7 Nominal Variables ......................................................................... 146
4.3 Two and Many-Samples Case................................................................... 161
5 Hypothesis Testing ...................................................................... 166
5.1 Sample Size ............................................................................................... 172
5.2 Reliability .................................................................................................. 176
6 Advanced Methods of Analysis .................................................. 180
6.1 Exploratory Factor Analysis and Principal Component Analysis ............ 180
6.2 Cluster Analysis ........................................................................................ 188
6.3 Structural Equation Modeling ................................................................... 193
6.3.1 Path Analysis ................................................................................. 195
6.3.2 Confirmatory Factor Analysis ....................................................... 199
6.4 Time Series Analysis ................................................................................ 200
6.5 Bayesian Analysis ..................................................................................... 209
6.6 Decision Analysis ..................................................................................... 214
6.6.1 Payoff Table Analysis ................................................................... 215
6.6.2 Decision Trees ............................................................................... 220
6.6.3 Game Theory ................................................................................. 222
6.7 Simulations ............................................................................................... 225
6.8 Social Network Analysis .......................................................................... 230
7 Publishing Research .................................................................... 237
7.1 Journal Publication ................................................................................... 237
7.1.1 Title Page....................................................................................... 239
7.1.2 IMRaD Sections ............................................................................ 241
7.1.3 References ..................................................................................... 243
7.2 Dissertations.............................................................................................. 244
7.2.1 Chapter 1 Introduction .................................................................. 245
7.3 Chapter 2 Literature Review ..................................................................... 246
7.4 Chapter 3 Methods .................................................................................... 247
7.5 Chapter 4 Results ...................................................................................... 248
7.6 Chapter 5 Discussion ................................................................................ 249
Appendix A Questionnaire Structure ........................................... 250
Appendix B Research Example Snapshot .................................... 254
Appendix C Cheat Sheets ............................................................... 256
INDEX ........................................................................... 261

1 Philosophical Foundations

The origin of the word research can be traced to the French rechercher (to seek out) as a composite, suggesting a repeated activity (re-) of searching (chercher). Going even further, we arrive at its Latin root in the form of circare (to wander around) and
eventually to circle. This wandering around in some sort of cyclical fashion is quite
intuitive as we will see later on since it accurately reflects the process we follow in
modern research. The only exception is that the circles are getting deeper and deeper
into what the “real” world reveals to us. Noticing the quotes around the word ‘real’
might have revealed the direction we will follow here in challenging what “real” means
and accepting how it affects the process we follow when we investigate a research
topic.
Concerns about the nature of reality are vital in deciding about the nature of
truth and the ways to search for it. This brings us to the realm of philosophy or more
specifically to its branch of metaphysics where we deal with questions of existence.
Providing an explanation of the world and understanding its function (to the extent
possible) allow us to accept what is real and make sense of it. Our perception of the
metaphysical worldview is the cause of our behavior and the motivation for moving
on in life. Questions about the origin of the universe belong to the branch of
metaphysics that is called cosmology, while questions about the nature of reality and
the state of being are the concern of the branch of metaphysics called ontology. The
latter is essential to the way we approach and conduct research as it provides a
foundation for describing what exists and the truth about the objects of research.
In addition to the issues about reality that research needs to address based on
our ontological stance, there are epistemological assumptions that guide our approach
to learning. These address issues about what knowledge is — what we can know and
how we learn. These questions, of course, are based on the assumption that knowledge
about the world can be acquired in an “objective”/real way, connecting in this way
with ontology. The interplay between epistemology and ontology, as we will soon see, is
reflected in the different research traditions that have been adopted and guide past and
present research efforts as they determine the theory and methods that will be utilized
in conducting research.
The ontological and epistemological stances we adopt are considered
paradigms and reflect the researcher’s understanding of the nature of existence from
first principles that are beyond “logical” debate. As paradigms, they are accepted as
self-sufficient logical constructs (dogmas in a way) that are beyond the scrutiny of
proof or doubt. Selecting one is more of an intuitive leap of faith than an “objective”
process of logical and/or empirical conclusions. Both ontology and epistemology are
tightly related to what can be called “theory of truth”. This is an expression of the way
arguments connect the various concepts we adopt in research and the conditions that
make these arguments true or false depending on the ontological and epistemological
posture we adopt. In that respect, arguments can represent concepts, intentions,
conditions, correlations, and causations that we accept as true or false with a certain
degree of confidence.
Typical theories of truth include the instrumentalist, coherence, and
correspondence theories. The last reflects the classical representations that Plato and
Aristotle adopted where something is true when it “reflects” reality. This posture can
be heavily challenged in the world of social sciences since perceptions of individuals
vary and can suggest different views of reality, making it impossible to have a universal
agreement on social “facts”. Such perceptions can influence the way beliefs fit together
(cohere) in producing an explanation of the phenomenon we investigate. This is also
the basic posture of coherence theory, which postulates that truth is interpretation-
oriented as it is constructed in social settings.
Finally, the instrumentalist view of truth emphasizes the interrelation between
truth and action and connects the positive outcomes of an action to the truth behind
the intention that led to that action. This is more of a results-oriented perception of
truth and the basis of the pragmatist epistemology as we will see later on. In terms,
now, of supporting theories, research can be descriptive like when we make factual
claims of what leaders and organizations do, instrumental like when we study the
impact and influence of behavior, and normative when we try to provide evidence that
supports direction. For each one of these “categories” of research, ontology and
epistemology are there to provide philosophical grounds and guidance.

1.1 Ontology
Of particular interest to research is ontology, the branch of metaphysics that
deals with the nature of being and existence or in simplified terms what reality is.
Although it isn’t clear what really is and how it relates to other things, one can always
resolve to degrees of belief that ensure commitment to answers and by extension
acceptance of a particular theory of the world. In this way, an ontological stance will
provide an acceptable dogma of how the world is built and more specifically, with
respect to social sciences, which we are interested in here, the nature of the social
phenomenon under investigation.
The various ontological stances that will be presented here are not an
exhaustive account of what has been developed in the field, but they are meant to
serve as a brief introduction to the core trends, or mainstream in a way, that have been
developed and persist today. Adopting an ontological stance towards the things that
exist in society in terms of the nature and representations we form of its various entities
will help guide the choice of methodology that best suits our research aims. Believing
in an objective or subjective reality will also suggest, to an extent, the data collection
method (like survey, interviews, etc.) that will best reveal and/or prove, to a degree of
certainty, the relationships and dependencies of the entities we study.
Ontological stances are divided primarily according to their belief in the
existence of external entities. They can range from the extreme position that there are
no external entities, which is the domain and main position of nihilism, to the existence
of universal truths about entities, as realism posits. In between, we have, among others,
relativism with its position of subjective realities that depend on agreement between
observers, nominalism with the position that there is no truth and all facts are human
creations, and Platonism where abstract objects exist but in a non-physical realm. For
the purposes of this book, realism and relativism will be considered as the most
popular ontological stances in research and especially due to the support they provide
for quantitative and qualitative research.

1.1.1 Realism
Realism’s premise is that the world exists regardless of who observes it. In our
case this means that the universe existed before us and will continue to exist after we
are gone (not just as individuals but also as a species). The philosophical position of realism has been quite controversial, as it can be accepted or rejected in part
according to one’s focus. For example, an individual might be a realist regarding the
macroscopic nature of the natural world while they can be non-realist regarding human
concepts like ethics and aesthetics. This dualism of treating the “outside” world as
“objective” and the “inside” world of human thought as “subjective” is mainly based
on agreement about the existence of the world’s artifacts. The external perception of
the objective reality is mainly based on our everyday experience that the objects in the
natural world exist by the mere fact that their defining characteristics persist when
observed over time and that their properties seem to transcend human language,
culture, and beliefs. For example, the moon is recognizable by everyone in the world
as an object (nowadays) that orbits the Earth, suggesting it is real and independent of
human interpretation. Accepting its “realness” allows us to do research about it and
define its properties and relationship to other objects, spreading in this way its
“realness” to other objects (including us who make observations about it).
The whole truth will never be revealed to us as all the facts that support it
cannot be revealed. For this reason, we need theories that allow us to imagine
(correctly or incorrectly) what is hidden from our senses and cannot directly be
observed. In research, a realist stance is based on empiricism. Reality or the true nature
of what we are trying to find is out there and will be revealed to us in time. The
assumption is made here that the observable world is composed of elemental and
discrete entities that when combined produce the complex empirical patterns that we
observe. The researchers’ tools are their senses, their impressions and perceptions as
formulated by past experiences, and their rationality. By identifying and describing the
elemental constructs of a phenomenon, realists can study their interaction and form
theories that explain the phenomenon. A lot of these explanations will require
abstractions (like socialism, for example) as they cannot be observed as elemental
entities but rather as aggregates of simpler ones. Understanding social structures will
help transform them in ways that better support human growth.
Many variations of realism have been developed from various philosophical
schools to address deviations from the generic realist path. Among them, critical
realism, idealism, and internal realism hold prominent positions. The first two have
been going at each other for some time now as rivals to true representatives of realism.
Critical realists insist on the separation between social and physical objects based on
the belief that social structures cannot exist in isolation and are derivatives of the
activities they govern and their agent’s perceptions. In turn, these structures tend to
influence the perceptions of their agents, creating in this sense a closed system between
agents and their social construct. Idealists critique the positions of critical realists by
suggesting their interpretations and subject of inquiry fall into metaphysics as they
construct imaginary entities that further impose an ideology that, as most ideologies,
can be oppressive and exploitable.
The variation of realism known as internal realism is particularly interesting with respect to social sciences research. The position here is that the world
is real but it is beyond our capabilities and means to study it directly. A universal truth
exists, but we can only see glimpses of it. In business research, realism can be reflected
in the assumption that entities like sellers and buyers interact in a physical (not
imaginary) setting so decisions must reflect the outside reality and not our internal and
subjective representation of it. In that respect, researchers and practitioners need to
try and understand what really happens in the marketplace in terms of the properties
of the various entities and the way they exert forces and interact among themselves
and their environment. However, it is obvious that due to our limitations in receiving
and interpreting all the signals of reality we can only gain an imperfect and mainly
probabilistic understanding of the world. A divide between the real world and the
particular view we perceive at a certain time and place will always exist and as such our
investigations and the interpretations we provide will always be linked to our
experiences as researchers.

1.1.2 Relativism
Relativism takes the opposite stance to realism, holding that what we perceive as reality is nothing more than the product of conventions and frameworks of assessment that we have popularized and agreed upon as representing the truth. In that sense, truth is a human fabrication and no one assessment of human
conduct is more valid than another. Understanding is context and location specific,
and rationality, among other concepts, is itself a subject of research rather than an undisputed
conceptual resource for researchers. Relativists view reality as time dependent, and
something that was considered as true at some instance in time could easily be proven
false at a later time when experience and resources reveal another aspect of the
phenomenon or situation under investigation. Research provides revelations of reality
and discourses help develop practices and representations that help us experience and
make sense of the world.
A popular interpretation of relativism is that there is a single reality that exists
out there, but it is inaccessible directly through experimentation and observation. We
end up deducing a lot of the underlying nature of things (like the electrons in atoms)
by observing the way they influence macroscopic phenomena. The accuracy of our
observations can never go beyond a certain threshold because the mere act of
observing means we interact with the object of the observation and thus alter the
aspect of it (variable) that we intended to observe. This can be seen in social sciences,
for example, when we ask people how aware they are of a phenomenon. The question
itself brings the phenomenon in question into the subject’s awareness, affecting in this
way the underlying awareness they had of the phenomenon before the question was
asked.
Because reality is seen as subjective to individuals and social groups, relativism suggests approaching research with humility. As a result, research should not be seen as a revelation of a universal truth but more as a reflection of the period, the
researchers, and the context of the research. Making judgments about social
phenomena like culture should be done with caution and always avoid biases that could
privilege one interpretation or culture, for example, over others. The values held by
one culture might be totally different from another without making one better or more
just than the other.
While it might seem that no progress can be made since everything is
subjective and situation dependent, we must not consider relativism as opposing the
scientific method. To the contrary, it should be seen as a complement or an added
filter that can provide an alternative view based on the influence of the observer in
interpreting information and formulating theories. Critics of relativism believe that the
lack of underlying assumptions that can be considered true undermines the possibility
of the development of commonly accepted theories that will explain the parts of the
world we observe. Another criticism concerns the inherent contradiction of relativism
that results from the universality of the belief that everything is relative to a place, time,
and the context in which it is observed. The universality of such a statement is only
something a realist would claim. In other words, if everything is relative, then the claim that everything is relative must itself be relative; asserting it as an absolute truth amounts to a realist stance. Another criticism of relativism that needs to be considered in social sciences research is its susceptibility to the perception of
authority of the carriers of the truth. Protagonists with influence and power are bound
to influence an adoption of the truth along with their perception of it. Additional
influences could be exerted by social entities (government, business, etc.) with stakes
in the specific and general area of the truth.

1.2 Epistemology
The etymology of the word epistemology suggests the discourse about the
formal observation and acquisition of knowledge. This branch of philosophy is
concerned with the sources and structure of knowledge and the conditions under
which beliefs/knowledge become justified. As sources of knowledge, here we consider
“reliable” ones like testimonies, memory, reason, perception, and introspection, and
exclude volatile forms like desires, emotions, prejudice, and biases. Perception through
our five senses is the primary entry point of information into the mind where it can be
retained in memory for further processing through reason. Testimony is an indirect
form of knowledge whereby we rely on someone else to provide credible
information about someone or something else. Finally, introspection, as a unique
capacity of humans to inspect their own thinking, can supplement reason in making
decisions about the nature of evidence and truth.

Figure 1.1 The interrogatives diamond


Introspection and learning can be represented through the “interrogatives”
diamond of Figure 1.1. In any given situation, we make sense of our world by
inspecting our environment and coming up with answers to “When”, “Where”,
“Who”, and “What” happens. This formulates our direct reality (horizontal plane in
Figure 1.1). Combining these answers with what we know from related experience and
using rationality and insight, we can deduce “How” things happen and most
importantly “Why”. Of course, what we see and perceive might not be the same as
what is really happening out there and/or what other people perceive is happening
(Figure 1.2), so we end up having to deal with a disagreement. This will result in
retrospection and/or a negotiation about what really is.

Figure 1.2 Differences between observer (green) and reality (yellow)

The identification and creation of knowledge relates directly to justification or
proof. In one respect (favored by evidentialists), this proof and the degree of support
it provides for justification is in relation to evidence. The assumption is made here that
we are in control of our beliefs and subsequently of our actions. Although this position
might seem obvious, it is not exactly so if one considers many involuntary actions that
involve our body (digestion, blinking of the eye, etc.) and mind (fear, hunger, etc.). A
way out of the deontological issues that arise could be to assume a probabilistic nature
of knowledge/beliefs the way we see (and discuss later on) in statistics, where a degree
of certainty in the form of a probability is assumed. Another possibility, which also
directly relates to reliability, is to consider a succession of evidence. Repeated
observations or experiments that produce the same results will tend to solidify the
acquired knowledge and beliefs.
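To make the idea of a probabilistic degree of certainty a little more concrete, the short Python sketch below (the event and its underlying rate are invented purely for illustration) simulates increasingly large sets of repeated observations and reports how the margin of error around the estimated rate shrinks as evidence accumulates:

import math
import random

random.seed(1)
true_rate = 0.6                      # the unknown "truth" we are trying to learn
for n in (10, 100, 1000, 10000):     # increasingly many repeated observations
    hits = sum(random.random() < true_rate for _ in range(n))
    estimate = hits / n                                        # observed rate
    margin = 1.96 * math.sqrt(estimate * (1 - estimate) / n)   # approximate 95% margin of error
    print(f"n={n:>5}  estimate={estimate:.3f}  margin=+/-{margin:.3f}")

The exact numbers are not the point; what matters is that each additional round of consistent observations narrows the band of uncertainty, which is the statistical counterpart of beliefs becoming solidified.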

From the point of view of the academic research that we are interested in here, epistemology is vital in defining our approach to data collection and analysis and also in the interpretation of findings in search of the underlying truth of the phenomenon
we are studying. A more practical perspective that is also of interest to research
concerns the creation and dissemination of knowledge in a domain of study. Again, as
we did with the case of ontology, we will discuss here the main schools of thought that
guide social sciences research, like positivism and constructionism. In positivism, as
we will see next, the social world exists in reality and we try to accurately represent it
as knowledge (left image in Figure 1.3), while in constructionism learning takes place
in a social context as we confirm our knowledge with others (right image in Figure
1.3). As we will see in the next sections, the aforementioned epistemological stances
align to a great extent with particular ontological stances. While this alignment is
welcomed, it should not be considered as absolute and a one-to-one relationship.

Figure 1.3 Positivist (left) and constructionist (right) perspectives

A compromising position between positivism and constructionism is that of
pragmatism, where meaning comes from lived experience and cannot be attributed to
predetermined frameworks of truth/reality. In learning, reflection needs to be
balanced with observation and the concrete with the abstract. Events trigger
observations that, combined with past experience, lead to reflection by the observer.
As a result, individuals develop abstract representations of them (theories in a way)
that they verify through active experimentation. This accumulates as concrete
experience and the cycle repeats for as long as a phenomenon and its triggers persist.
Because of the reconciliation between the critics and followers of positivism that
pragmatism offers, it has been adopted by many in social sciences research and
especially in qualitative methods like grounded theory.

1.2.1 Positivism
Positivism is based on the idea that the social world exists externally and can
be studied and represented accurately by human knowledge only when it has been
validated empirically. This aligns well with the realist perspective and the
corresponding theory of truth, but it shouldn’t be seen as a one-to-one
correspondence. Social entities and their interactions are seen as observables that can
be expressed through the appropriate choice of parameters and variables. These can
be studied and empirically tested to reveal the true nature of social phenomena. For
example, organizational structure and culture exist and can be studied to provide proof
of their influence on organizational performance.
The dominance of observation of the external world over any other source of
knowledge that positivism seeks was initially a reaction to metaphysical speculation.
This resulted in the establishment of the word ‘paradigm’ in social sciences as a
description of scientific discoveries in practice rather than the way they are produced
in academic publications. While traditional theories provide explanations for
phenomena, new observations and experiments might challenge established theories.
Enlightened scientists might then come and provide radical insights that comply both
with the past theories and at the same time explain the new observations. These
breakthroughs are not through the traditional way of advancing in science by
incremental application of existing practices but are based on the creative realization
of revolutionary thinking of exceptional individuals who leap into new revelations of
seeing the world.
Positivism is a paradigm that dominated the physical and social sciences for
many centuries before being criticized in the latter half of the twentieth century for its
inappropriateness in describing complex social phenomena that are formed by the
volatile nature of human behavior. Because of this volatility, social phenomena are
difficult to replicate and so it becomes hard to consistently produce the same results.
One way of increasing the reliability of research findings, as we will see later, is by
repeating research procedures. If the same results are observed, then the empirical
evidence can provide strong support for the foundation of theories.
Under positivism, social sciences aim at identifying the fundamental laws
behind human expression and behavior that will establish a cause and effect
relationship among elementary propositions that are independent from each other.
This is something like the elements in chemistry where they combine under certain
laws to create more complex chemical and biological structures. Research follows a
hypothesis and deduction process where the former formulates assumptions about
fundamental laws and the latter deduces the observations that will support or reject
the truth of the hypotheses. For observations to be accurate representations/measures
of reality, a clear and precise definition (operationalization) of concepts and variables
is required to enable their quantitative measurement. For an accurate explanation of
the phenomenon under study, the sample that will be used in research should be as
representative and as random as possible. This will ensure that the results of the
research will be generalizable to the population (social group or entity) of the study.
During the research process the observer should act independently so as not to affect
the outcome of measurements and to avoid biases in the development of the
measuring instrument and in the interpretation of the results.
A challenge with positivism in social sciences is that while it is quite
appropriate for measuring variables that define phenomena and proving hypothetical
propositions, it does not leave room for discovery simply because you can’t prove or
measure something you do not suspect exists. It can, though, suggest a direction and
new dimensions of research when empirical evidence fails to prove the hypothesis. In
this way, the absence of something can clearly indicate the presence of something else.
For example, the absence of a competitive advantage can easily suggest the presence
of a disadvantage.

1.2.2 Constructionism
In response to the “absolute” nature of learning through observation and
experimentation of measurable entities, constructionism was developed during the last
half of the twentieth century. Its purpose was to address the subjective nature of social
experience and interaction. In the constructionist paradigm, it is argued that our
perception of the world is a social construct formed by commonly agreed beliefs
among people and that these constructs should be investigated by research. We assume
here that the internal beliefs of individuals shape their perceptions of their external
reality to the extent that they behave as if their constructed reality is the actual reality,
making the argument about an objective reality unimportant.
One way of converging the multitude of constructs of reality that various
people hold is through a negotiation process that will eventually end in a shared
understanding of a phenomenon/reality. The challenge this process poses is with the
degree of understanding of the negotiating parties, their power and skill asymmetries
in conducting negotiations, and the motivations each party has for reaching an
agreement. The job of the researcher becomes the collection of facts in the form of
subject statements and responses, the identification of patterns of social interaction
that persist over time, and the development of constructs that capture and uniquely
identify beliefs and meaning that people place on their experience.
The way people communicate and express their beliefs and positions, and an understanding of what drives them to interact the way they do, take here the place of the cause and effect relationships that in other approaches form the basis for understanding and explaining phenomena. Social interactions are not a direct response
to external stimuli but go through the process of developing an agreed upon meaning
before a reaction materializes. The grounds upon which constructionism is developed
align it almost perfectly with relativism.
A major challenge with constructionism regarding research is the fact that
when dealing with external events, like how the market behaves or how an
organization interacts with its stakeholders, an external perspective is required. The
issue is no longer how we perceive reality but rather what the reality of external
stakeholders is. Another challenge we face is the inability to compare views of
individuals as they are subjectively formed and do not represent accurate/realistic
reflections of their outside world.
To address many of the challenges constructionists face with respect to quality
(like validity in positivism), compliance with a set of criteria is sought in
constructionism-based research. Prominent among them is authenticity, whereby the
researchers need to display understanding of the issue under investigation.
Additionally, they need to demonstrate their impartiality in interpreting findings as
expert methodologists. In that direction, identification of correct operational measures
for the concepts being studied will support construct validity. Internal and external
validity are also of concern as they aim at establishing causal relationships and ensure
the generalization of findings, respectively.

1.3 Methodology
Armed with our beliefs about the nature of reality and learning we move on to
adopting the process we will follow in collecting information and data about the
phenomenon we will study. Methodology, as a field of study, concerns the systematic
study and analysis of the methods used for observations and drawing conclusions. In
short, methodology is the philosophy of methods and encompasses the rules for
reaching valid conclusions (epistemology) and the domain/‘objects’ (ontology) of
investigation that form an observable phenomenon. Its philosophical roots suggest
that it is expressed as an attitude towards inquiry and learning that is grounded in our
perception of reality and forms the guiding principles of our behavior.
While it has strong philosophical roots, it is in essence the connecting link
between philosophy and practice as it provides the general strategy for making and
processing observations. Because “practice” is a core ingredient of methodology, it
means that the subject of inquiry/research has been already identified and considered
in the choice of methodology selected. Here is where ontology and epistemology come
into play. Our beliefs about reality and learning can suggest ways of approaching
inquiry and the way we observe the phenomenon under investigation. For example,
suppose that we are studying work burnout. A positivist perspective would assume
that burnout exists in reality and proceed to formulate tests using large samples that
would measure it and confirm its existence while providing details of the various
variables and parameters that influence its expression. In this way, a cause and effect
relationship will be established between the individual influencers/variables. Among
them, one could find control variables like the environment (suppressive and
authoritarian) and independent variables (we will talk about them soon) like genetic
predisposition, etc. The focus here during data collection and analysis is more on
observations of the phenomenon “What”, “When”, “Where”, and “Who” (Figure 1.1)
and leaves the “How” and “Why” as generalizations during the interpretation and
conclusions phase.
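As a minimal, purely hypothetical sketch of this positivist path (the variable names, the sample size, and the strength of the relationship are assumptions made up for illustration, not findings about burnout), the Python snippet below generates a mock sample and measures the association between an operationalized environment score and a burnout score:

import numpy as np

rng = np.random.default_rng(42)
n = 200                                   # a reasonably large hypothetical sample
environment = rng.normal(50, 10, n)       # assumed "suppressive environment" score
burnout = 0.8 * environment + rng.normal(0, 8, n)   # assumed influence plus random noise

r = np.corrcoef(environment, burnout)[0, 1]         # Pearson correlation between the two variables
print(f"Correlation between environment and burnout scores: r = {r:.2f}")

In a real study the data would of course come from measurements of actual participants, and establishing cause and effect would require more than a correlation, but the logic of operationalizing concepts as variables and then testing their relationships is the same.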
In contrast to positivism, we can consider a constructionist perspective
that would focus on aspects of the environment that are considered by individuals as
contributing to burnout and how they manage themselves in such situations.
Researchers would arrange for interviews with those who have experienced burnout.
By recording the individuals' stories and probing appropriately about the phenomenon, the researcher develops themes that persist across individuals, and in this way an
explanation of the phenomenon surfaces. The focus here is more on the “How” and
“Why” (Figure 1.1) and leaves the identification of commonalities “What”, “When”,
“Where”, and “Who” for the interpretation and conclusions. In a way, it is like going
from effect to cause, while in positivism the perspective will move from cause to effect. A
point of interest here is that the researcher is not defining the phenomenon and its
characteristics but leaving it up to the research subjects to do so.
When considering the various methodological approaches to research we shouldn't forget that one common theme is that they all seek the 'truth' and aim at making sense of what we observe/sense. All methods also aim at providing transparent and convincing evidence for the conclusions they draw, and they are all judged on the validity and accuracy of their predictions. A "limitation" of proof in social sciences that is
important to mention here is that while the path we follow in proving something might
lead to success (or better yet acceptance by others), it will not necessarily lead to
positive change that improves the lives of others. Research can only bring awareness
and understanding, so we should be clear that change is something reserved for
policymakers, individuals, and groups.
It is worth pointing out here what a phenomenon is, as we frequently refer to
it as the core element or the essence of research. By its etymology, a phenomenon is
something that appears, meaning it is observed. In our case, we will also add the
element of repeated appearance as otherwise it might not deserve the effort one can
devote to its study. Value from research comes from the understanding we gain about
something, which we can later use to make predictions and optimally deal with similar
situations. Understanding comes from being able to represent the complexity of what
we observe with abstract representations (variables, constructs, parameters), their
classification according to their similarities and proximity to other abstractions, and
the way they interconnect and react to each other.
If we are able to identify the various elements of the phenomenon with
precision and detail enough to be able to measure them as specific quantities, then we
can say that our data are quantitative and thus the methodology we will follow is a
quantitative one (Figure 1.4). Such quantities include age, gender, education level, etc.
If on the other hand our observations concern constructs that cannot accurately be
represented as quantities, then we can say that our research requires a qualitative
methodology (Figure 1.4). Such constructs include feelings, beliefs, perceptions, etc.
If both quantitative and qualitative elements are required for the proper description
and explanation of a phenomenon, then a combination of methods (mixed methods)
would be the recommended path (Figure 1.4).

Figure 1.4 Research methodologies

The choice of methodology based on the data we plan to collect is one guiding
principle, but it is not the only one. As we mentioned before, our epistemological and
ontological stances play a crucial role in the type of methodology we choose, and one
might go for example for a qualitative methodology (like in a case study of a single
organization) that includes the collection and processing of quantitative data (like its
balance sheets for a period of time). Another element that might influence the choice
of methodology is the role of the researcher in data collection and interpretation.
Quantitative researchers attempt to maintain objectivity by distancing themselves from
their sample population to avoid interference from their personal beliefs and
perceptions that could potentially influence the sample’s responses. Biased
observations and interpretations are an anathema to scientific research and along with
ignorance can do more damage than good. This is considered impossible for
qualitative researchers who pose that one’s personal beliefs and feelings cannot be
eliminated and only by being aware of them can realistic interpretations and
conclusions be drawn. The following sections present the various methodologies in
more detail and provide more information about their underlying philosophies that
will (hopefully) guide someone to correctly choose and apply them in practice.

1.3.1 Quantitative
The quantitative method and analysis is based on meaning derived from data
in numerical form like scale scores, ratings, durations, counts, etc. This is in contrast
to qualitative methods where meaning is derived from narratives. The numbers of
quantitative research can come directly from measurements during observation or
indirectly by converting collected information into numerical form (like with a Likert
scale, which we will see later). While this definition of quantitative research covers the
basics of what it is, a more in-depth representation defines quantitative methodologies
as an attempt to measure (positivist stance mainly) an objective reality (realist stance
mainly). In other words, we assume that the phenomenon under study is real (not a
social construct) and can be represented (knowable) by estimating parameters and
measuring meaningful variables that can represent the state of entities that are involved
in the phenomenon under study.
Associations among such variables can further be used to establish
relationships and ultimately suggest cause and effect that will predict the value of one
variable (effect) based on the observations of another (cause). This process is
particularly suited when we form hypotheses in an effort to explain something.
Hypotheses are statements of truth about facts that can be further tested by
investigation. This usually follows when theories or models are formed in an attempt to provide "solid" (statistically, that is) evidence for their statements and assertions. Model testing is, in other words, the domain of the quantitative research methodology.
In order to produce reliable evaluations, quantitative research requires large
numbers of participants and the analysis is done through statistical tools. Large
samples ensure better representativeness and generalizability of findings as well as
proper application of the statistical tests. The investigator and the investigated are
independent entities and, therefore, the investigator can study a phenomenon without
influencing it or being influenced by it. This ensures an objective treatment of the
collected data, increasing in this way the reliability of the study. Facts are separated
from values and in this way the “truth” of what is observed is the external reality of
the observation. This is also supported by the rigid procedures that need to be followed
during data collection that in addition to ensuring reliable measurement eliminate
potential biases and personal values of the researcher.
Collecting data for quantitative studies is based on instruments and
procedures. The former concerns the development of written forms for collecting
information through observation and surveys (see Chapter 2 and Appendix A), while
the latter concerns the formal steps we follow in collecting information. Since data
come in numerical form, the use of mathematical methods (like statistics) is recruited
for their processing. At this point we should clarify that while some variables are by
nature numerical (like weight, age, revenue, income, etc.), others (like quality,
performance, beliefs, attitudes, etc.) might need the development of some scale for
their measurement. For the latter category, a questionnaire might be developed where
a measure of agreement with a statement (like ‘strongly disagree’, ‘disagree’, ‘neither
agree nor disagree', 'agree', 'strongly agree') can be selected by participants. This type of
scale will be “assumed” (more on this when we discuss surveys) equivalent to a
numerical scale (like ‘strongly disagree’ = 1, ‘disagree’ = 2, etc.).
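As a minimal sketch of such a coding step (the item wording and the responses below are hypothetical, and the same mapping could just as easily be done in a spreadsheet like Excel or in SPSS), the numerical conversion might look like this in Python:

likert_codes = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}

# Hypothetical responses to a single questionnaire item.
responses = ["agree", "strongly agree", "disagree", "agree",
             "neither agree nor disagree", "strongly agree"]

scores = [likert_codes[r] for r in responses]   # verbal answers coded as numbers
mean_score = sum(scores) / len(scores)          # a simple descriptive statistic
print(scores)                                   # [4, 5, 2, 4, 3, 5]
print(round(mean_score, 2))                     # 3.83

Once the responses are in numerical form, the descriptive and inferential techniques discussed in later chapters can be applied to them.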
This type of mapping between wordy representations of variables and
numerical ones allows for an analysis of the data with the use of statistical and other
analytical forms of processing. In essence, we aim at removing interpretation that
could be ambiguous during the analysis phase (contrary to the qualitative method that
relies heavily on interpretations, as we will see in the next section). For example,
‘strongly agree’ might mean different things for different people the same way as
experiences like ‘very cold’ might simply mean ‘cold’ or even ‘warm’ depending on
whether you ask a Polynesian or an Eskimo. A number like 5, though, means the same
everywhere.
The use of structured instruments in quantitative research requires
standardization of procedure to ensure replicability of the research to confirm
findings. This way, credibility is ensured and the universality of the proof (model of
the phenomenon under investigation) is enforced as long as contrary evidence is not
revealed.
While the great advantage of quantitative methods is the establishment of
proof about dependencies and the existence of relations among quantities that are easy
to replicate and generalize, the methodology is not free of criticism. A typically
mentioned disadvantage of quantitative methods is that it is oftentimes not clear what
the answers to questions (in polls, for example) mean in terms of the subjects’
behavior. In other words, the contextual details of a situation are not easily captured,
especially when attitudes, beliefs, and behavior in general are studied. This is
something that quantitative researchers usually defend by emphasizing that the focus
of quantitative methods is not on what behavior means but on what causes it and how
it is explained.
In addition to the aforementioned criticism we should not forget that
quantitative methods do not serve as direct observations of a phenomenon as the
researchers are usually in control of the environment, so the results are seen to an
extent as “laboratory results” instead of as “life observations”. Using instruments like
questionnaires is also considered an intervention, in addition to the instruments being seen as expressions of the designers/researchers rather than as independent and objective measurement instruments. The issue of objectivity is a constant debate amongst methodologies and is closely related to the internal and external validity that will be
discussed in follow-up chapters.

1.3.2 Qualitative
While qualitative methodology is not the focus of this book, a brief mention
is deemed necessary as it is vital for researchers to understand what is available to them
before committing to a methodology for their research. In general, a qualitative
methodology is defined as everything that is not quantitative. While this type of
definition by exclusion might reflect to a great extent the truth, it is best if we can
define qualitative research with respect to what it is instead of what it isn’t. In search
of a definition of what qualitative research is, the roots of the word quality itself can
serve as a point of departure.
Quality comes with many meanings, ranging from the essence of something to
the level of something with respect to another/similar something that is taken as a
standard. By “something”, in the context of research, we refer to theoretical constructs
that represent assemblies of conceptual elements that appear as independent entities
influencing or representing phenomena or aspects of them. In that respect the first
definition of quality can form the basis upon which qualitative methodology is
expressed. While the latter definition might seem irrelevant to a research methodology,
this is far from the truth. Qualitative research is grounded in comparing theoretical
constructs as it is in the act of comparison that new constructs are developed. Such
constructs are necessary in social sciences when human behavior needs to be studied.
Humans are moved by needs, and in response to environmental triggers (both physical
and social) they build an understanding of the world around them that helps them
make sense of it and respond. One definition that captures all this is to see qualitative
research as a methodology that aims to understand human behavior and more
specifically the beliefs, perceptions, and motivations that guide decision making and
behavior.

Conceptually, qualitative methodology assumes a dynamic and subjective
reality. The role of the researcher becomes critical as they interpret not only the results
but also the content of what is captured and the way it is captured. One can only think
of the interviewing process (a typical data collection technique for qualitative
methodology) and how the interviewer can influence (consciously and unconsciously)
the subjects of the study due to their preconceptions and biases towards the
phenomenon of investigation. Because it concerns interactions amongst individuals
(the researcher is included in some cases, like in an autobiography), qualitative research
is based on the constructionist epistemological stance and heavily reflects the relativist
ontological perspective. As such, it is heavily based on interpretation and induction.
The open-ended methods that are used in qualitative research are meant to
explore participants’ interpretations (usually collected in relatively close settings) and
include, among others, interviewing, on-site observations, case studies, histories,
biographies, ethnographies, and conversational and discourse analysis. Data and
information are usually collected from samples of actors and their accounts of their
perspectives and recollections of events and impressions they formed about specific
situations they experienced. Because the methodology is based on the traditions of
constructionist and relativist views, it is well suited for research that challenges
established preconceptions of truth like male dominance and white supremacy, for
example. The methodology itself is in a way a challenge to the traditional quantitative
approaches that dominated research for centuries.
Many see qualitative research primarily as a discovery process with explanatory potential that precedes quantitative research aimed at proving its findings. The researcher starts with an interpretation of the phenomenon under
investigation and proceeds to collect data (observations, interviews, etc.) to support
the interpretation (Figure 1.5). As information is processed, the researcher
reformulates the original interpretation, and after cycles of information collection and
interpretation eventually develops a theory of what is observed. This process can be
seen in interviews when a question might be followed with probes to clarify what the
interviewee says and provide more depth. This cyclic aspect of the qualitative
methodology makes it suitable for theory development and/or theory validation. We
will see something like this in the next section where we discuss mixed methods.
Figure 1.5 Qualitative inquiry process

The data analysis phase in qualitative methodology involves the identification of persistent themes, categories, and relationships, cross verification of the collected
information from multiple sources of evidence (triangulation), cause and effect
representations of the phenomenon under investigation, and testing of the construct
formulations with subsequent evidence. This process relies heavily on the rationality
and healthy skepticism of the researchers as they critically evaluate and verify their
findings. Because of its heavy reliance on interpretation and small sample populations,
qualitative methodology has difficulty in supporting the generalization of findings.
What the methodology does instead is ensure the validity of the results (usually
through triangulation, member checking, etc.) and the transferability of the results to
similar social settings and research studies.
While the value of the qualitative methodology in getting to the bottom of
issues and suggesting cause-and-effect relationships cannot be denied, there are
criticisms with respect to its rigor due to the “soft” nature of data (narratives usually)
that are usually seen as limited (small samples) and subjective. Major concerns are raised regarding the reliability, validity, and representativeness of the collected information, as oftentimes the methodology is judged in light of, and according to the rules of, quantitative research.

1.3.3 Mixed methods


Qualitative and quantitative methods can be combined in a variety of ways to
form what is called mixed research methods. This is an attempt to draw from multiple
epistemologies to frame and understand phenomena. As a result, researchers are expected to increase the validity of their studies (through triangulation), allowing them to reach generalizations that will support the formation of theories that accurately describe the phenomena under investigation. While combinations of methods seem to
move away from the methodology and get closer to practice as an approach to
examining a research problem, there are many researchers who view the methodology
as a separate and independent epistemological way of approaching research. This
brings mixed methods into the realm of pragmatism (somewhere in between
positivism and constructionism). Others, though, believe that paradigms are not to be
seen as distinct but rather as overlapping or with fluid boundaries where one gives rise
to and supports the other (Figure 1.6), so combining them is quite an acceptable way
of conducting research.

Figure 1.6 Mixed methods research philosophy

The main issue with mixed methods is the sequence and extent of applying the
quantitative and qualitative components of inquiry. For example, we might be
interested in understanding the context of a real-life phenomenon like leadership in
transient teams (teams that are formed for a specific task and then dissolved). If we
are not aware of the characteristics of leadership that lead to success or failure of
leaders in such teams (or there is no strong research behind it), we might decide to
begin with a qualitative methodology that through in-depth interviews with select
leaders of transient teams will reveal key characteristics. We will then follow this with a quantitative methodology in which a widely distributed questionnaire is used to test whether the identified characteristics (or some of them) hold universally in transient teams. Alternatively, we might already know (for example, from
past research or similar cases) the range of possible characteristics, so we might decide
to start with a quantitative methodology to identify the specific leadership
characteristics that affect transient team performance and then follow with a
qualitative methodology (for example, in-depth interviews) to identify the reasons
behind the influence of the specific characteristics.
The three possibilities of mixed methods designs are outlined in Figure 1.4.
We can start with a quantitative approach (large sample) that will provide findings like
certain characteristics (attitudes, traits, perceptions, etc.) of a population and follow up
with a qualitative approach (specific small sample) for explanations to surface, and
thus the method is called explanatory sequential mixed methods. On the other hand,
we have exploratory sequential mixed methods where we can start with a qualitative approach (small sample) to explore the range and depth of a phenomenon and continue with a quantitative approach (on a wider sample) to provide the “proof” and confirm what
we found. Finally, we can have both methods working in parallel (convergent parallel
mixed methods), and aim at calibrating each of them and comparing their results to
converge (or diverge) for an interpretation, enhancing in this way the validity of the
research findings. An example of the parallel application of methods is when they are
combined in one instrument like a questionnaire that has both closed-ended and open-
ended questions. Other possibilities of combinations have also been identified, like
conducting one before and after an intervention, or having quantitative data in a
qualitative research and vice-versa, but they are beyond the scope of this book.
Apart from the sequence in which the methodologies are executed, dominance is also of great importance. The simplest way of seeing dominance is in terms of the time devoted to each methodology, although this view can be misleading because the relative importance of each component in the research might differ. For example, one might dedicate plenty of time and resources to in-depth interviews that confirm past research findings while spending very little time posting a questionnaire online and processing the results; although the interviewing part might seem more demanding, the quantitative part can carry far greater weight in the research.
Along with the advantages of mixed methods research like the well-rounded
investigation of a phenomenon and the insight they could provide for guiding future
choice of methods, there are criticisms that need to be considered before adopting
such methodologies. Prominent among them are the contradictions of the underlying paradigms that need to coexist and support the research, as well as the competencies investigators need in order to cover the full spectrum of quantitative and qualitative methodologies. Additionally, mixed methods studies take significantly more time and resources to complete, making them unsuitable when time and resources are of the essence. In closing, mixed methods might
be ideal for comparing quantitative and qualitative perspectives for instrument
calibration, for discovering the parameters and variables involved in phenomena, for
understanding and providing support with raw/quantitative data for the “how” and
“why” of phenomena, and in support of interventions and change initiatives (like for
marginalized populations).
2 The Quantitative Research Process

Having decided that the phenomenon we intend to investigate is best served by a quantitative methodology, we describe here the design
process and the steps that we need to follow. We begin in this chapter with the design
aspects of the research and we continue in the follow-up chapters with the ways we
can represent populations and their characteristics, the selection of representative
samples, and the ways we can process and analyze the information we will collect.
Eventually, we will see the process of formalizing our proofs through hypothesis
testing and the ways and actions we need to take to ensure our predictions are reliable
and valid.
As we saw in the previous chapter, it all begins with the need to describe and
explain a phenomenon of importance and value to humanity. In research, this is
expressed in terms of a problem statement (Figure 2.1) where the investigator
describes the situation that needs addressing to improve the lives of individuals. This
could be in the form of clear-cut issues like preventing terrorist threats, or dealing with
social, racial, and gender inequalities (to name but a few), or it could be in the form of
lost opportunity to gain something of value like increasing the entrepreneurial
potential of individuals or strengthening their leadership skills. Value becomes of
essence either by preventing its loss or increasing its gain. Keep in mind here that a
proven gap in the literature is not sufficient reason for identifying a problem. For
example, the fact that there are no peer-reviewed academic research publications on
aliens living among us does not suggest that this “problem” would generally be
considered suitable for quantitative research (unless maybe if believing in such a thing
adversely affects society).
Having identified a study-worthy phenomenon, researchers formulate their
intention to conduct a study in the form of a purpose statement (Figure 2.1). This is
where the “what” will be studied is addressed and is often accompanied by the aim
and objectives of the research. With “aim” we express something we hope to achieve
with our research. It conveys our intentions more specifically than the purpose, which reflects more the “why” of engaging in research. The aim can also evolve as the research progresses, while the purpose remains the same throughout the research
process. Objectives, on the other hand (not to be confused with action steps), reflect
the goals we set and are similar to milestones in that they can be seen as identifiable
and time-bound landmarks tightly linked to specific, measurable outcomes. Typical
verbs that are used to express the intention of the purpose, aims, and objectives include
but are not limited to ‘understand’, ‘discover’, ’develop’, ‘generate’, ‘explore’, ‘identify’,
’describe’, etc.
Given that our focus in this book is the social world, we need to add another
important element that usually accompanies the purpose and that is the population of
the study. A brief profile of the entities (people, groups, organizations, etc.) is
necessary to provide the social context of the phenomenon we study and also suggest
later on the way we will sample that population to select those who will participate as
research subjects. A related concept that we will see later on, and that identifies our atomic elements of study, is the “unit of analysis”.

Figure 2.1 The initial phases of the quantitative research process

With the problem and purpose statements formulated, and armed with their philosophical beliefs about the nature of reality (ontology) and inquiry (epistemology), researchers choose the philosophy that will guide their methodology, which in our case here will be quantitative. From this point on the quantitative aspect will be the
predominant aspect of research and the focus of our discussion. The identification of
the entities that will be considered in the representation of the phenomenon will be
first done through the research questions that will be adopted. These will include the
entities that will be measured and/or the type of relationship among them that we
hope to establish. In developing the research questions that align with the problem
and purpose of the study, researchers need to know what other researchers have done
in the past and the types of theories that have been developed to explain similar
phenomena and closely related theoretical constructs (Figure 2.1). This is done
through a review of the extant literature to identify information relevant to the study
that can serve as a point of departure for developing research questions.
A point of discussion, debate actually, is whether the research questions should
guide the literature review. While the argument that the research questions, and not
the research topic, will dictate the parameters of the literature review has merit, the
truth is that by the time they come to developing the research questions researchers
have already gone through a literature review process (even if it is not an extensive and
exhaustive one). This is to ensure the phenomenon of their investigation is a significant
and valid one for scientific research (Figure 2.1) and that it hasn’t been researched
before (unless of course the purpose of the study is to repeat a research for reliability
purposes — something that is out of the scope of this book). Additionally, as we will
see in the next section, in formulating our research questions it is vital to know what
relevant information (constructs, variables, theories, etc.) has already been investigated
or recommended for the explanation of the phenomenon.
The stance we will follow in this book is that the purpose will result in research
questions that are informed by the literature review and the theoretical framework we
choose as our initial point of departure in applying, building, or extending theory. The
research questions in turn will provide further direction in reviewing the literature
and shaping the theoretical framework, whether this is in the form of existing theories,
additional constructs, or rearrangements of pieces from multiple theories and
constructs. In essence, a feedback loop is established that informs and influences its
elements until it settles down to a three-way supporting structure of research
questions, literature review, and theoretical framework (Figure 2.1). The purpose of
the research and the researcher’s preconceptions or understanding of the phenomenon
under investigation can point to where one enters the feedback loop.

2.1 Literature Review


Systematically seeking information about what has been done in the past on a
specific subject is the process of developing a literature review. The outcome of such
a process is an evaluative report of what exists in the extant academic literature on the
specific field of our study. This process will allow us to delimit the phenomenon we
investigate, support the validity of our study as a phenomenon worth investigating,
locate gaps in the lines of inquiry (usually in the form of recommendations for future
research), gain insights about the methodologies (if any) and practices used in the past
to study the phenomenon, and identify closely related phenomena that could
potentially serve as points of departure in our inquiry. At a deeper level, the literature
review can even suggest new theoretical constructs and even specific variables that we
need to consider in our study. This is critically important for the development of the
research questions as it can suggest what needs to be included or avoided.
In addition to revealing what has been done in a field of study, a literature
review also adds credibility to research as it showcases the researcher’s knowledge and
understanding of the state of the art in a topic. This includes familiarity with the
phenomenon under investigation, the history of its conceptualization, terminology,
research methods and practices, and assertions and findings. Additionally, when a
literature review is included in publications it serves to inform the reader and, in many
cases, it is considered a legitimate scholarly work that can be published by itself.
Before deciding on a literature review process, one needs to consider what type
of review they are going to conduct. The motivation and goal for conducting the
review will define its focus, the extent of its coverage in terms of subject matter depth and breadth, and its point of view (the critical evaluation of its findings). With
respect to justifying research, the motivation could be to exhaustively present what has
been done in a field with the goal of revealing persistent themes,
generalizations/theories, or gaps that haven’t been addressed yet. While the latter can
help develop the research questions and provide justification and legitimacy for
conducting the research, persistent themes and generalizations/theories can also help
develop the theoretical framework upon which our research will be based.
Conducting a literature review is like doing research. The only difference is
that instead of people, the subjects of investigation are research articles. While the
process of developing a literature review is beyond the scope of this book we can
briefly state that the process begins by establishing a search plan that involves
identifying keywords that will be used in search engines (like Google Scholar at
scholar.google.com) or library databases (like EBSCO, PROQUEST, etc.) where
academic research publications are collected and cataloged. When search results
become available, the researchers will work on an initial screening for relevance to the
topic of investigation, credibility, and publication date. While relevance will be decided
based on the breadth and depth in the field of study one wants to cover (reading the
abstract will give an indication of this), credibility can be decided on the widespread
popularity and acceptance of a scientific journal (impact factors can reveal this) and
the position (affiliation), expertise, and experience of the authors of a publication.
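As an illustration only, a hypothetical search plan for the leadership-in-crisis example used later in this chapter might combine keywords with Boolean operators, for instance leadership AND crisis AND (multicultural OR “cross-cultural”), and then narrow the results to peer-reviewed journals within a recent publication window; the specific keywords and operators are assumptions made for the example rather than a prescription.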
Regarding the publication date, for fast-advancing topics (in terms of
discovery) like technology one might consider only publications within the last three
years to match the life cycle of new developments in the field that could have either
addressed the problem of the study or made it obsolete because something new has
replaced it. In other fields like social sciences, for example, the change is usually not
that fast (excluding revolutions and radical social upheavals), so one might assume a
five-year window into the recent past for searches. Exceptions to the consideration of
recently published research could be seminal pieces of work that have withstood the passage of time and theories that could be considered in the theoretical framework of
our research.
Working on screened publications, researchers will move on to categorize the
published material according to themes that will form the core of the literature review.
Each theme will later be developed under its own heading and will include a critical
and comparative evaluation of the published research under its domain. The general
structure of a literature review starts with the goals and motivation for developing it,
presents the search strategy that was followed in retrieving and screening published
research, moves on to thematically present its findings, and concludes with a
summative and comparative presentation of the core themes that are directly related
to the phenomenon under investigation. A description of the search strategy is
necessary primarily for validity and reliability purposes. Future researchers who want
to validate the review should be able to follow the same process and (hopefully) come
up with the same or similar results.

2.2 Theoretical Framework


Research doesn’t spring out of nothing. There is usually some relevant theory
or ideas expressed in the form of theoretical constructs that have addressed situations
similar to the phenomenon of our study, or at least parts of it. This pre-existing
material is valuable for new research as it can guide the improvement and/or extension
of existing theories (like by applying them in a different context or situation) and the
development of new ones. In cases where we are interested in applied research, that
material will be the basis for developing our data collection methods and instruments.
Searching for a theoretical framework for a study can be done independently
or through the literature review process (Figure 2.1). The two in fact can inform each
other as the literature review might reveal past attempts to explain the phenomenon that we study, while the theoretical framework might guide the directions we should follow in conducting searches. The development of the framework usually
starts by identifying the characteristics of the phenomenon under investigation. These
will indicate the conceptual and theoretical areas under which the research falls, along
with potential overlaps and gaps among those areas that will need further investigation.
With that knowledge, we can search the literature for relevant theories and
constructs that have been developed in the general area of the phenomenon we study.
This could result in one or many theories as some might address specific aspects of
the phenomenon we study while others might deal with its more generic
characteristics. Multiple theories may exist (Figure 2.2) that could overlap in some
aspects (theoretical constructs) and deviate in some others. Overlaps (constructs XY1 and XY2 in Figure 2.2) might indicate areas with strong influence that have also been
validated and proven in providing accurate explanations, while gaps might suggest the
need for individual constructs (new or existing ones) that need to be considered for a
thorough description of the phenomenon we study (construct Z in Figure 2.2). The
latter is a case where the theoretical framework might inform the literature review and
suggest searches that will support new or existing constructs.

Figure 2.2 Theoretical framework constructs development

An example of how a theoretical framework is formed might be a study about entrepreneurial opportunity identification. Through our literature review we might
have identified that expectancy theory, with its constructs of effort, performance, rewards,
and satisfaction of personal goals, along with prospect theory, with its constructs of
framing and valuation, address to a great extent the phenomenon we want to study. In
such a case, we might decide to base our theoretical framework on the effort and
satisfaction of personal goals (a proxy to motivation) from expectancy theory and
supplement with the framing construct from prospect theory. We might have decided
(following our literature review of course) that these need to be complemented with
the construct of national culture (a proxy to influence) in order to account for the
environment where the entrepreneurial opportunity will be identified. The collection
of the elements we chose, and their hypothesized interconnections will constitute our
theoretical framework.
Figure 2.3 Proposed theoretical framework

When considering and selecting the various elements that will form our
theoretical framework (Figure 2.3), caution should be exercised in choosing the
number of theories we will consider. Too many might diffuse our focus on the
phenomenon and provide a level of granularity that adds unnecessary complexity. Too
few (usually one) might miss important theoretical constructs or parameters that might
have already been investigated and whose inclusion might be necessary for a complete
coverage and description of the various aspects of the phenomenon we study. A rule
of thumb might be to avoid considering more than three. If convenience is an issue or
the subject of our study has not been investigated in the past, one theory might very
well serve as the basis for developing our framework. In terms of theoretical
constructs, a balance between generic and specific is needed. National culture as a
proxy to influence, for example, might be too generic in the case of the identification
of an entrepreneurial opportunity that we showcased previously, and instead family
environment might serve as a better alternative.

2.3 Research Questions and Hypotheses


Since research aims at providing answers, our specific purpose will need to be
expressed in the form of questions that will provide such answers. This is the stage
where we establish a set of research questions and their associated hypotheses.
Quantitative research questions inquire about the characteristics or features of the
phenomenon we investigate, and they are normally expressed with ‘what’, or
‘is’/’does’. They either try to confirm the existence of the constructs we identified in our theoretical framework (Figure 2.3) or the relationships (type and strength) amongst
them (interconnecting arrows in Figure 2.3).
In the former category, we have questions of the form:
What theoretical constructs (or variables) describe (or influence) the phenomenon under investigation?
or
Is X a component of (or influencing) the phenomenon under investigation?

In the latter category, we have questions of the form:

What is the association (or relationship) between X and Y in the phenomenon under investigation?
or
Is there an association/relationship between X and Y in the phenomenon under investigation?

An example could help highlight the possibilities here. Let us assume that we
want to study the problem of leaders underperforming in crisis situations in
multicultural environments. The problem statement in one sentence can be expressed
in the form: ‘The specific problem this research will address is the lack of understanding of the
personality traits that help leaders in multicultural organizational environments perform better in crisis
situations.’
This sentence of course will have to be expanded on and supported with
adequate and recent (last 5 years at most) citations to ensure the problem is valid,
significant, and current. With a problem statement like the aforementioned, one would
expect a purpose statement of the form: ‘The purpose of this research is to provide an
understanding of the personality traits that help leaders in multicultural organizational environments
perform better in crisis situations.’
In a case where we don’t know what theoretical constructs and variables are
involved in the description of the phenomenon, a research question can be expressed
in the form: ‘What personality traits help leaders in multicultural organizational environments
perform better in crisis situations?’ If our literature review has hinted that certain traits,
extroversion for example, help leaders in general perform in crisis situations, we might
formulate a research question of the form: ‘Do leaders with the extrovert personality trait
perform better in multicultural organizational environments in crisis situations?’ This, alternatively,
can be expressed in the form: ‘To what extent does the extrovert personality trait help leaders in
multicultural organizational environments perform better in crisis situations?’, or in the more
familiar form: ‘Is there an association (or relationship optionally) between the extrovert personality
trait and a leader’s ability to perform better in crisis situations in multicultural organizational
environments?’ This variation covers a lot more ground than the previous version as it doesn’t only aim to establish existence (or non-existence, if the “extent” of influence is zero); it also hopes to identify the strength of the influence the particular trait exerts.
A point of interest here is that the “what” form of expressing a research
question can also appear in qualitative research, but the “relationship” form can only
exist in the quantitative methodology we study here. This latter form needs to be
accompanied by a set of hypotheses (null and alternative) that exhaustively confirm or
reject the existence of what is assumed and stated. A hypothesis is nothing more than
a presumed statement of the existence or non-existence of a population characteristic
or relationship between two variables. The validity of such statements is assessed by
the use of statistical methods applied to the data generated through empirical inquiry
of the phenomenon under investigation.
While we will discuss the hypotheses in a separate chapter later on, suffice it
to say here that for each research question we have two forms that are mutually
exclusive: the null and the alternative. For the example of leaders’ performance in crisis
situations showcased here, the research question we adopted will suggest the null form:
‘There is no association (or relationship optionally) between the extrovert personality trait and the
leader’s ability to perform better in crisis situations in multicultural organizational environments’,
with the alternative form being: ‘There is an association (or relationship optionally) between the
extrovert personality trait and the leader’s ability to perform better in crisis situations in multicultural
organizational environments.’
Hypotheses are a characteristic unique to quantitative research, and they are more of a research design element than anything else. They are directly tied to the data
analysis process and the statistical techniques used in developing support for the
existence (or not) of relationships among variables that quantitatively (in numerical
form) represent theoretical constructs as well as estimations of the strength of the
relationships. They are included here because of their direct relevancy to the research
questions and for their conceptual integration within the research process.
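To make the link between a hypothesis and its eventual statistical test concrete, the following is a minimal sketch in Python (used here purely for illustration; the concepts are tool-agnostic). It tests the null hypothesis of no association between an extroversion score and a crisis-performance rating using a Pearson correlation on hypothetical data; the variable names, values, and the 0.05 significance level are assumptions made for the example, and the test itself is only one of the techniques discussed in later chapters.

    # Illustrative sketch: testing H0 "no association between extroversion and
    # crisis performance" on hypothetical data with a Pearson correlation.
    from scipy import stats

    extroversion = [3.2, 4.1, 2.8, 4.5, 3.9, 2.5, 4.8, 3.1]  # hypothetical trait scores
    performance = [2.9, 4.3, 2.5, 4.0, 3.6, 2.7, 4.6, 3.0]   # hypothetical performance ratings

    r, p_value = stats.pearsonr(extroversion, performance)   # strength and significance of the association

    if p_value < 0.05:  # conventional (assumed) significance level
        print(f"Reject the null hypothesis: r = {r:.2f}, p = {p_value:.3f}")
    else:
        print(f"Fail to reject the null hypothesis: r = {r:.2f}, p = {p_value:.3f}")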

2.4 Research Design


With the methodology chosen as quantitative and the research questions
developed, we can move to formulate a plan that will outline the steps that need to be
followed. This is oftentimes addressed in published research in the section “nature of
the study”. The plan outline that provides the underlying structure of the research activities is referred to as the research design of the study. Since the focus
of this book is on quantitative research, the designs that will be presented here are
exclusively for that methodology. While an attempt will be made to cover various
designs, it should come as no surprise that the popularity of some, like the
correlational, is evident in quantitative research. Regardless of the popularity, equal
attention will be given here to all that have been considered, which includes the great
majority of what is used in practice. Overlaps and combinations of them do exist and
occasionally some might appear with multiple names, some more exotic than others.
When considering a research design, we look to ensure that certain features are included. Apart from setting a credible, coherent, and rigorous process for
collecting and analyzing data, the conclusions and evidence generated will reflect an
objective evaluation of what was observed. In essence, a good research design sets up
a clear research strategy for anyone (researchers and other stakeholders) to follow with
practical steps of what and how it will be performed. Like any plan, we need to
approach its development from various perspectives that fully cover the aspects of the
inquiry we want to perform. Characteristics of design need to align with the purpose
of the research and include the outlook/perspective we adopt in approaching the
design, the types of samples involved, a clear set of hypotheses (if needed), the types
of variables required (measurable entities or treatments in some cases) as
representatives of the quantities in the research questions and hypotheses, and the
methods for data collection and analysis we will follow (Figure 2.4).

Figure 2.4 Research design elements

Before venturing into the research design perspectives and the process of
collecting and analyzing data it is worth mentioning here the very important concept
of unit of analysis specific mainly to social sciences research (Figure 2.4). While we
are typically interested in the way individuals interact with each other and with the
environment, we are often interested in how identifiable entities like an industry, the
market, social interactions (like, say, negotiations), teams, groups, and even societies
as a whole behave at some time period in their life. In these cases, the unit of analysis
becomes the entity we are studying. This is not to be confused with the unit of
observation/our sample, as in most cases it will be individuals who due to their
expertise or position express their opinion about the state and behavior of the unit of
analysis. An example might help clarify the case. If we are studying employee
performance, then, obviously, our unit of analysis is the individual employee. If, on
the other hand, we are studying organizational performance, then our unit of analysis
is the organization. Regardless of the unit of analysis, the research data might be
provided in both cases by the same individuals. A point of importance is that the unit
of analysis does not have to be an entity. It can very well be a social interaction or
artifact like partnerships, beliefs, etc. For example, we might be interested in studying
how humor is used in negotiations. In this case, humor in the specific setting is our
unit of analysis. The data of course will be provided by individuals (units of
observation or sample). In this example, another unit of analysis (there can be more
than one) could be the negotiation as a form of social interaction. In practice, units of
analysis should be deduced or inferred by the purpose of the study or the research
questions.

2.4.1 Research Design Perspectives


Like any plan of action, the research design needs to have an underlying philosophy that guides the elements that will be included. These are decided based on the perspective/outlook we adopt and in alignment with the purpose and research questions (and hypotheses) that guide our research within the context of the methodology (quantitative) we chose. Three core perspectives (Figure 2.4) define a research design: the purpose, the time period over which data are collected, and the control the researchers exercise over the sample participants (Figure 2.5).
The purpose perspective is a defining aspect of a research design and it is the
connecting link with the purpose and research questions of the study. While one might
presume that all research is exploration (even when we confirm something we might
say we explore its validity or presence), the purpose perspective is meant to indicate
here the degree of knowledge we have about the phenomenon we investigate and can
be exploratory, descriptive, or explanatory (often called causal). Exploratory research
assumes little or no understanding of the phenomenon we study and aims at
discovering the elements/constructs that are involved in its expression. Eventually this
will lead to the generation or buildup of theory. This type of research is often seen as
the initial stage of inquiry, and while it can involve qualitative inquiry its inclusion here
aims to reflect exploration for discovery and investigation. Another aspect, often associated with exploratory research, is the confirmatory one, concerned with the verification of theory.
This mainly aims at testing a hypothesis and conclusively confirming in this way
findings that can later on be generalized. Oftentimes it is debated whether
confirmability can qualify as exploratory because the former can be representative,
reliable, and valid, while the latter is vague and inconclusive. As a general practice, in
the quantitative domain it is good to leave hypothesis testing out of exploratory
designs.

Figure 2.5 Research design perspectives

Presumably, when the stage of exploration has been completed one moves on
to describing what has been found. This phase is the domain of the descriptive
research design and it is usually launched as a separate investigation/research. It is used
to obtain refined information about variables identified through explorative research
in terms of definitions, limitations, range, and units of measurement that will describe
them. As such, descriptive designs require the development or adoption of proper
instrumentation (like scales) for measuring variables and constructs. In case
observations are utilized, such designs might not qualify as quantitative. Regardless,
descriptive research is ideal as a precursor to more quantitative designs like the
explanatory we will discuss next. In addition, and due to the large amounts of data
collected and processed, such designs can result in important recommendations for
practice. Caution needs to be exercised, as the results cannot be used for discovery or proof, especially when they cannot be replicated (as in the case of observations).
Good descriptions will provoke the “why” questions that fuel explanatory
research. Answering the “why” involves developing explanations that establish cause
and effect relations among factors (causes) and outcomes (effects) for the
phenomenon under investigation. For that reason, explanatory designs are also seen
as causal designs. Such designs identify the conditions/factors that need to exist for
the phenomenon to be observed. Hypothesis testing (discussed later on) is specifically
developed to address such conditional statements. A case in point in social sciences is
when we want to measure the impact a specific change can have (say a policy to enforce
entrepreneurial growth) on the behavior of members of the society (like potential
entrepreneurs). In such a case, causality is established by validating an association
between the change that will be expressed through a factor (also seen as independent
variable) and the behavior (also seen here as dependent variable) it affects. Variations
in the independent variable/factor need to cause variation of the dependent
variable/phenomenon, making sure also that a third variable is not influencing the
observed variations (non-spuriousness).
According to the nature of the explanation/causation that is sought,
explanatory designs can be further subdivided (Figures 2.5 and 2.6) into the general categories of comparative, when two groups are compared; causal-comparative, when we try to understand the reasons why two groups are different; and correlational, when no attempt is made to understand cause and effect. Comparative designs are the simplest
ones and focus on examining differences in one variable (the dependent/factor)
between two groups or subjects. In this case, the subjects and measurement
instruments need to be described in detail.

Figure 2.6 Purpose perspective and subdivisions

Causal-comparative designs attempt to establish cause and effect relationships in cases where the independent variable/factor cannot/should not be
examined using a controlled experiment. This usually involves two groups, one of
which expresses the factor we are studying while the other one doesn’t (often called
control group). An example of such a case is when we try to establish the relationship
between experience and job satisfaction. One would expect that professionals with
more experience (one group) are going to be more satisfied with their jobs compared
to professionals with less experience (second/control group), establishing in this way
a cause (experience) and effect (job satisfaction) relationship. Alternative names for causal-comparative designs exist, like ex-post-facto and correlational causal-comparative, when correlational models are used to investigate possible cause and effect
relationships.
Correlational designs, finally, work on establishing an association between
two variables in single groups when no manipulation of the variables can take place or
is allowed. In the previous example of experience and job satisfaction, a correlational
design will work on a group of experienced professionals and try to establish if job
satisfaction is high. Along with causal-comparative, correlational designs are usually
the precursors to experimental research. A further subdivision of correlational designs
can be observed with simple correlations focusing on the relationship between two
variables and predictive when the relationship is seen in a predictive nature. If one
variable (often called predictor) is used to predict the performance of a second variable
(often called outcome or criterion) we have the category of simple predictive studies,
while when multiple variables/predictors are involved, we have the category of
multiple regression.
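To illustrate the predictive flavor of correlational designs, the following minimal sketch fits a multiple regression with two predictors of one outcome; the variables, values, and the choice of the statsmodels library are assumptions made for the illustration rather than a prescribed procedure.

    # Illustrative sketch: multiple regression with two hypothetical predictors
    # (experience, autonomy) of one hypothetical outcome (job satisfaction).
    import numpy as np
    import statsmodels.api as sm

    experience = np.array([1, 3, 5, 7, 9, 11, 13, 15], dtype=float)    # years (hypothetical)
    autonomy = np.array([2, 3, 3, 4, 4, 5, 4, 5], dtype=float)         # 1-5 scale (hypothetical)
    satisfaction = np.array([2.1, 2.8, 3.0, 3.6, 3.9, 4.4, 4.2, 4.7])  # outcome/criterion

    X = sm.add_constant(np.column_stack([experience, autonomy]))  # predictors plus an intercept term
    model = sm.OLS(satisfaction, X).fit()                         # ordinary least squares fit

    print(model.params)    # intercept and one coefficient per predictor
    print(model.rsquared)  # share of outcome variance explained by the predictors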
Explanatory/causal designs help with the provision of proofs about the
workings of the world by accepting or rejecting the influence of variables that attempt
to express a phenomenon and its aspects. They have the great advantage of allowing
for control of the subject population (in terms of meeting selection criteria) and can
be replicated to enforce greater confidence in the research findings. Despite their
advantages, caution must be exercised in their application and in the interpretation of
the results as not all relationships are causal. There is for example almost perfect
association between the rooster’s wake-up call in the morning and the Sun rising, but
as we all know this is not the reason for the Sun rising. In social environments, the
issue of establishing causality becomes more complex due to the multitude of
extraneous and confounding influences of various variables. In such cases causality
might be inferred but never proven.
Causality is also confused at times with predictability. For example, being
exposed to business aspects from the family environment (like in family businesses)
might predict the entrepreneurial tendencies of an individual, but it does not cause one to become an entrepreneur. It also does not tell us what individuals do to become successful entrepreneurs. Predictions do not necessarily depend on causal
relationships, nor do they prove them. Later on, when we study hypothesis testing,
we will further discuss the issue of causality in the form of probabilistic thinking.
Another classification of research designs emerges with respect to the selection
of the sample population and will be referred to here as the control perspective. If a random
process has been followed (we will see later on what “random” means in statistics),
then the experimental label can be assigned (also called randomized). If multiple
measures or a control group are involved, then the design can be classified as quasi-
experimental, while if none of the conditions mentioned before are satisfied then we
have a non-experimental design. While the last two categories can carry a degree of bias
in them (an issue of internal validity), randomized designs are considered ideal for
studying cause and effect relationships.
Experimental designs usually involve a test/experimental group where some
form of intervention/treatment (this is the independent variable) takes place and a
control group where no intervention is applied. Individuals are randomly assigned to
each group and their behavior/response is measured before the application of the
intervention (pre-test) and afterwards (post-test). Because a comparison of some sort
is involved, experimental designs usually involve the correlational method for
statistical analysis. This should not confuse them, though, with the correlational design
that involves only one group. The environments are controlled by the researchers in
experimental designs to ensure only the intervention is applied (to the test group) and
no other factor is influencing the behavior/responses (in all the groups). If significant differences are observed between the two groups, one can be reasonably confident that the intervention was the reason the results were different. The ability to control the
situation to such an extent allows researchers to limit alternative explanations and
establish the cause and effect relationship between the intervention (independent
variable) and the result (dependent variable). Despite the high levels of evidence
experimental designs provide, researchers need to be cautious in the conclusions they
draw as the observed results could be artificial and not generalizable to the greater
population. For example, we could show images of terrorist attacks to the test group
and then ask both test and control groups to write how they feel about defense
spending by the government. This type of intervention is clear manipulation of the
test group and one should not consider the results as an indication of the need to
increase the defense budget.
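One common way (by no means the only one) to analyze such a design is to compare the post-test scores of the two groups with an independent-samples t-test. The sketch below does this on hypothetical scores; the group values and the 0.05 threshold are assumptions made for the illustration.

    # Illustrative sketch: comparing post-test scores of a randomly assigned
    # test group (exposed to the intervention) and a control group (not exposed).
    from scipy import stats

    post_test_treatment = [6.1, 5.8, 7.2, 6.5, 6.9, 7.4, 5.9, 6.6]  # hypothetical post-test scores
    post_test_control = [5.2, 4.9, 5.6, 5.1, 5.8, 5.0, 5.4, 5.3]    # hypothetical post-test scores

    t_stat, p_value = stats.ttest_ind(post_test_treatment, post_test_control)

    # With random assignment and a controlled environment, a small p-value
    # supports attributing the observed difference to the intervention.
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}, significant at 0.05: {p_value < 0.05}")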
If we are unable to randomly assign participants to test and control groups and
we must rely on existing groups and in cases where we are not in complete control of
the environment, we have the quasi-experimental research designs. Such designs are
often used in evaluation research and are also adopted when an experimental design is
impractical or impossible. For example, let us assume that we want to study gender
differences in leadership. Given that we cannot randomly assign someone to be a male
or a female (in other words, control the independent variable), we are forced to group
males in one group and females in another. Another case where experimental designs
are not applicable and require quasi-experimental designs is when the unit of analysis
is communities/groups instead of individuals. If that is the case it will be difficult to
isolate communities (especially if they are nearby) as their members can cross borders
and environmental changes can differentiate the two communities drastically (for example, one might experience a tornado strike) to the point of polluting/biasing the
results. In general, an issue of consideration with quasi-experimental designs is that
their findings cannot be generalized outside the population that participated in the
study (lack of external validity). Variations of quasi-experimental designs exist with
exotic names like regression-discontinuity design, proxy pretest design, etc., but their
presentation is beyond the scope of this book.
When we cannot control the level of randomization of the participants or the variables we are interested in cannot be manipulated, we have the non-experimental research design. Being unable to manipulate variables could be intentional, like when manipulation might lead to ethical and/or legal violations (for example, we cannot induce violence for the sake of studying it), or unavoidable, like when we cannot control gender (we cannot switch men into women and vice versa to study, say, the effects of gender on individuals’ behavior). Non-experimental designs can
observe associations among factors/variables that influence a phenomenon, but they
cannot establish cause and effect relationships. Despite their shortcomings, non-
experimental methods are very popular in social sciences research (actually, they
represent the majority of the published research). This is mainly due to their non-
invasive nature and the ease with which they are conducted (distributing a survey link
on the Internet is simple nowadays).
A final perspective for research designs is based on the time (or lack of it) of
investigation. If the measurements take place at a specific time and rely on existing
characteristics and differences rather than changes following interventions, then we
have a cross-sectional design. By specific time we don’t literally mean a moment in
time like a second or a minute but more like the period of data collection, usually in
the range of weeks and even a few months. In social sciences, for longer periods (for
example, years) the influences on a sample population from the rest of the
environment could effectively change perceptions, beliefs, or whatever else we might
be studying. Life events and social trends start affecting individuals, so our data might
be polluted from such events and not realistically represent what we measured. For
example, if we are interested in the influence of policies in the emergence of
entrepreneurship and we take a long time to collect our data, the final participants
might be exposed to different policy conditions (a government might have changed or
taken action) than the initial participants. In general, the time period of data collection
could be “zero” when all participants respond at the same time (like filling in a survey
in a classroom), spread across days and even weeks (like when filling in an online survey), or extend to months (like when interviews are conducted and different organizations are targeted). As a rule of thumb, for cross-sectional designs one should not go over a two-
or three-month period for data collection as environmental changes might have
accumulated significant changes in the domain of study.
Because the time dimension is not considered in cross-sectional designs, their
ability to measure change is limited to only recollections of change as it is experienced
by individuals. The results of such designs tend to be static as they reflect a snapshot
of the sample at the time of data collection, and there is always the possibility that another study conducted at another time will find different results.
Additionally, if only a specific factor/variable is studied it might be difficult to locate
and recruit participants who have similar profiles in order to eliminate covariances
(trends between factors) and influences of other factors. All the above make it difficult
to establish valid cause and effect relationships in cross-sectional designs unless
repeated measures at different times and settings confirm findings. Despite the
aforementioned shortcomings, cross-sectional designs dominate the published
research because of the ease and speed with which they can be conducted and the large
numbers of subjects they can reach (ensuring in this way the representativeness of
populations).
For instances where the passage of time is significant in what we study, the
longitudinal research design is more appropriate. In this design the same sample is
followed over time and measurements are repeated at regular intervals. In this way,
researchers can track changes in factors/variables and infer patterns over time. In
addition, this design allows the establishment of the magnitude and direction of cause
and effect relationships. Typically, measurements are taken before and after the
application of an intervention, allowing researchers to make inferences about the
impact and effect of an intervention on the sample population. In a way, these designs
allow researchers to get closer to experimental designs and facilitate prediction of
future outcomes as a result of the application of identifiable factors.
Because of the involvement of time in longitudinal designs they are often
confused with time-series. In the latter case variables are measured as a function of
time, forming a series of values in time. Time now becomes a continuous variable (scale or interval, as we will see later), while in conventional longitudinal designs it is a separator of groups of values (ordinal or nominal). Another significant difference between the two cases is that longitudinal designs can suggest causality while time series do not. An example of a time-series is the stock price of a company over time, while an example of a
conventional longitudinal design is patient health before and after an
intervention/treatment. If patient health improves after a treatment, we can deduce
with some certainty that it was the cause of the improvement.
Some critical issues that might arise in longitudinal designs include among
others the possibility of changes in the data collection process over time, the
preservation of the sample constitution and integrity, and the isolation of the variables
under investigation from environmental influences. Also, these designs affect our
ability to study multiple variables at the same time. An assumption that we make in
longitudinal designs, that can be challenged, is that the initial environmental trends (at
the beginning of data collection) will persist during the observation period. Despite these challenges, and provided we can afford the luxury of time and a large and representative sample, this design is ideal for studying change over long periods of
time (like the lifetime of individuals from childhood to adulthood).
A final design that we will see here is meta-analysis. Its time perspective is
exclusively focused on the past. This design concerns the systematic review of past
research about the phenomenon we are studying and the statistical analysis of the
findings of these past studies. In this way, the sample space increases dramatically and
allows researchers to study the effects of interest from both quantitative and qualitative
studies. While the availability of data might be sometimes overwhelming, the idea is
not simply to summarize findings but instead to create new knowledge from
combinations of variables that have not been analyzed before using synoptic reasoning
and statistical techniques. In cases where not enough past research exists or there are
strong dissimilarities in the results (heterogeneity), this type of design should be
avoided. In addition, attention should be paid to the criteria used for selecting the past
research studies, the objectives set, the precise definitions of the factors/variables and
outcomes that are used, and the justification of the statistical techniques used in the
analysis.
Meta-analyses oftentimes require content analysis (like with qualitative results).
In such cases, small deviations in the criteria used can lead to misinterpretations that
will affect the validity of the findings. That effect can also be aggravated due to large
samples that might not necessarily be valid. In conclusion, the quality of the past
research that is included in the analysis is critical regarding the credibility of the meta-
analysis results. Despite its shortcomings and the time and effort investment required
for meta-analysis, the findings can be valuable as they can validate results from multiple
studies and point to future directions of research. This type of analysis is also useful in
future studies as it can provide the basis upon which scenarios can be built.
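As a minimal illustration of how findings from past studies can be statistically combined, the sketch below applies fixed-effect, inverse-variance weighting to hypothetical effect sizes and standard errors; this is only one of several pooling approaches, and all numbers are invented for the example.

    # Illustrative sketch: fixed-effect, inverse-variance pooling of effect
    # sizes reported by past studies (all values are hypothetical).
    import math

    effects = [0.30, 0.45, 0.25, 0.38]     # effect sizes from four past studies
    std_errors = [0.10, 0.15, 0.08, 0.12]  # their reported standard errors

    weights = [1 / se ** 2 for se in std_errors]  # more precise studies receive more weight
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))

    print(f"Pooled effect = {pooled:.3f}, 95% CI half-width = {1.96 * pooled_se:.3f}")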

2.4.2 Sampling
Having identified the research design perspectives that are appropriate for a
research study, the next step involves identifying the sample, the variables/factors that
need to be measured, the instrument that we will use to collect data, and the way they
will be processed and analyzed to accomplish the study’s goals (Figure 2.4). The sample in this respect is the segment of the population that participates in the research. In this
section, we will discuss how we identify and organize our sampling process, while in
Chapter 4 we will discuss how we can process the data we collect from our samples.
Guided by the purpose, research questions, hypotheses, and the various
perspectives of our research design, we can profile our target population/entities (see
Chapter 3) and devise a sampling strategy to select the individuals/entities that qualify
for participation in our study. At this stage one could go ahead and discuss what will be measured, but knowing the sample characteristics first helps identify details, such as demographics, that will prove valuable when planning data collection with the instrument we will use (discussed in 2.4.4).
Important characteristics of samples include heterogeneity (in other words, representativeness), so that participants are included in proportion to the population; maximum variation, to ensure extremes are represented; and randomness, to eliminate selection bias and give every candidate an equal chance of selection. These characteristics are to a
great extent interrelated and express different views of the same property, which is for
the sample to be an accurate miniature reflection of the population it represents
(Figure 2.7). A final characteristic that is often ignored in the initial stages of research
design is sample accessibility. Researchers might have figured out the perfect way to
select individuals only to find out that they are not allowed to access them (like when
children are involved) or they are not available (too busy to participate).

Figure 2.7 Sample representativeness


The primary issue with sampling is representativeness. Having identified the population of a study, the sample needs to be an accurate representation of what exists
in the population. For example, if we want to study income distribution in a society
our sample needs to include representatives from all income levels proportionally. It
wouldn’t make sense to select our sample exclusively from upper- or lower-class neighborhoods, as that would bias the results of our study. To ensure representativeness one usually adjusts the sample size (large samples have better chances of getting closer to the population) and relies on randomness. By randomness we mean here that each unit of study has an equal (or proportionally representative) chance of being selected. When this is not possible, we resort to non-probabilistic methods for sample selection.
Equality is reflected in what we call random or probabilistic sampling. This
type of sampling ensures each unit (individuals, in many cases) has an equal chance of being selected, so a sample can consist of any possible combination of members. To ensure fair representation of all types of individuals, a sampling frame needs to be developed. This is a numbered listing of all individuals in a population. By
individuals here we refer to the unit of analysis we mentioned at the beginning of the
research design and include individuals, organizations, groups, etc. With the use of a
random number generator, individuals can be identified and selected for participation.
This simple/pure random process of selecting individuals is called simple random
sampling.
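As an illustration of the mechanics, the short Python sketch below draws a simple random sample from a hypothetical numbered sampling frame; the identifiers and the fixed seed are invented here only to make the example reproducible.

import random

# Hypothetical sampling frame: a numbered list of every unit in the population
# (here just identifiers P0001 ... P1000).
frame = [f"P{i:04d}" for i in range(1, 1001)]

# Simple random sampling: every unit has an equal chance of selection and
# every combination of 50 units is equally likely.
random.seed(42)                      # fixed seed only to make the example repeatable
sample = random.sample(frame, k=50)  # selection without replacement

print(sample[:5])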
Random sampling ensures a kind of equality among members of a population in terms of being chosen to participate in a study, but this can oftentimes lead to undesirable results (Figure 2.7), especially when certain demographic characteristics are critical for a specific research topic. For example, if we are studying income
distribution it would make sense to take into consideration the quota (strata) of the
population by social class. If the upper class in a society represents 10% of the total
population, the middle class 40%, and the lower class 50%, one would expect that our
sample must preserve these percentages in its participants. This type of sampling is
called stratified and assumes that some kind of census exists or can be conducted. In
practice and considering the social class example, one could create a numbered list of
100 upper-class individuals, followed by 400 middle-class individuals, followed by 500
lower-class individuals. We can then select (using a random number generator) 10
individuals from the upper-class category, 40 from the middle-class, and 50 from the
lower-class. The application of stratification with this type of random selection of
individuals is called random stratified sampling. If, instead of randomly selecting individuals, we start from the beginning of the list and choose (say) every 10th individual, we will end up again with a representative sample for the population (at least with respect to
social class). This form of sampling is called systematic random sampling and as
long as there is no pattern in the way individuals are listed (like if every 10th in our
example is a man) it will result in an adequately representative sample.

In practice, stratification for more than one variable might be required, like
considering gender and education along with social class. This adds complexity in ensuring representativeness, but as long as the initial pool of candidates and the selection of particular individuals are not controlled by the researcher, the sample could be random and representative enough. An issue that arises in many situations is the geographic distribution of populations, which tend to be spread across multiple regions and areas. For example, it might be that we are interested in racial bias across a region. With random sampling we might end up covering huge distances to reach whoever has been selected for participation. Instead, we might identify a few locations (close by) and consider them as representative of the population. This form
of sampling is called cluster sampling and as long as the cluster we select includes a
balanced mix of the demographic characteristics of the study’s population it can also
be, to an extent, considered representative.
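A minimal sketch of the idea, with hypothetical areas and residents, could look as follows: a few areas are selected at random and only their members enter the sample.

import random

random.seed(3)

# Hypothetical population spread across geographic areas (clusters); each
# entry maps an area name to a list of resident identifiers.
clusters = {f"area_{a}": [f"{a}-{i}" for i in range(200)] for a in range(20)}

# Cluster sampling: randomly select a few nearby areas and include every
# unit (or a sub-sample) from the chosen areas only.
chosen_areas = random.sample(list(clusters), k=3)
cluster_sample = [unit for area in chosen_areas for unit in clusters[area]]

print(chosen_areas, len(cluster_sample))   # 3 areas, 600 units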
Apart from the probability/random categories of sampling that we have seen
up to now, we have the general category of non-probability (non-probabilistic) sampling, where randomness is not ensured in any way. This does not necessarily mean that the samples are not representative; it simply indicates that the selection process is not based on probability theory. This fact poses a limitation to representativeness that researchers need to acknowledge as a potential bias or limitation of their research. Despite its shortcomings, non-probability sampling is quite popular in social sciences because of its practicality and the low demands it places on researchers in terms of resources, time, and effort.
Two general categories of non-probability sampling that are oftentimes
employed, especially in social sciences research, include convenience and purposive
sampling. Convenience sampling (also called accidental or haphazard) is a popular
form of sampling (often exercised when TV reporters interview people on the street)
and involves whatever population is easily available to the researcher. Asking for
volunteers by posting and promoting a call for participation is a form of convenience
sampling. The obvious challenge with this type of sampling is representativeness.
While the assumption is made that only qualified individuals will come forward, there
is no way to know if certain attributes of the population important to our research will
accurately be reflected in the sample. Large samples and/or a good screening process
might alleviate some of the deficiencies of convenience sampling.
Another popular form of non-probabilistic sampling is purposive sampling.
This applies to cases where we are targeting individuals based on their demographic
characteristics in our vicinity (convenience sampling implied). In essence, it is like
convenience sampling with a very aggressive screening process. This is an efficient and fast way of reaching a desired sample size and, as long as stratification (proportional representation) is not an issue, it will quickly compile a sample. A
subcategory of purposive sampling is snowball or chain sampling. In that case, we identify a few qualified participants and rely on them to promote the research call and
recommend qualified participants. This cascade effect, spreading the word in a targeted
way, is based on the assumption that like-minded individuals exposed to certain
conditions form networks or closely-tied social groups. As long as specialized
subgroups or isolated individuals are not of importance, we can end up with a qualified
sample for our research.
Combinations of sampling methods are also possible (often called multi-stage
sampling) and are of practical use for screening and pre-sampling purposes. One
might survey a population segment (cluster sampling) that will be further used to target
individuals (maybe for interviewing purposes). A typical case is when we select a cluster
and perform stratification to ensure accurate reflection of population demographics.
We can follow this later with random or systematic sampling. Combining sampling
methods can help maximize resources and eliminate as much bias as possible.
Before closing this section, we need to emphasize that regardless of the choice
of sampling method, issues with samples need to be considered and addressed either
in data collection or in the assumptions, limitations, and delimitations of research.
Beyond sample representativeness, which we discussed previously, there are numerous other issues that need to be considered when evaluating a sample's constitution. The representativeness of the population demographics in the sample
will not necessarily be reflected in the views expressed by the sample simply because
the views are not known in advance. To alleviate this possibility an appropriate sample
size will have to be determined (see Chapter 5). Another issue with samples that is of
particular importance in survey research is the responsiveness of the sample. Non-
responses to entire questionnaires or parts of them may result in errors and bias the
results of the data analysis. Also, certain types of individuals may be less likely to
respond. When organizations or groups are involved, certain kinds of information (for
example, financial data) that might be important for the research might be inaccessible
or unavailable. This issue might be resolved if sample accessibility is considered and
ensured early on. Proper use of statistical methods may also correct for non-responses.
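One common correction of this kind is post-stratification weighting, where respondents from under-represented groups are weighted up so that the sample again matches known population proportions. The sketch below uses hypothetical population shares and response counts.

# Hypothetical population shares and realized sample counts after non-response.
population_share = {"upper": 0.10, "middle": 0.40, "lower": 0.50}
sample_counts    = {"upper": 18,   "middle": 47,   "lower": 35}   # 100 respondents

n = sum(sample_counts.values())

# Post-stratification weights: respondents from under-represented groups count
# for more, restoring the population proportions in weighted analyses.
weights = {g: population_share[g] / (sample_counts[g] / n)
           for g in population_share}

print(weights)   # e.g., lower-class respondents get a weight above 1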

2.4.3 Variables and Factors


In quantitative research, we are interested in measuring quantities that
represent aspects of the entities involved in the phenomenon we study. The
characteristics of these entities are oftentimes represented/referred to as variables.
Under this term, we can find a wide range of measurable characteristics, including
some that can be measured quantitatively as numbers and some that cannot be directly
measured numerically and are called attributes (like gender, educational level,
perceptions, etc.). Much confusion also exists with related terms like factor, parameter,
statistic, and coefficient, among others. In addition, overlaps in terminology are
observed with terms like attribute and category. If something does not vary, we term it invariant or constant. When a constant plays the role of a multiplier of a variable, it can also be referred to as a coefficient. Rates of growth are oftentimes expressed as the product of a coefficient and a variable.
Before attempting to clarify the field, it is worth mentioning here that, regardless of the methodology we adopt, some form of comparison or reference to numbers will have to be made most of the time. Even in qualitative research one eventually needs to present, in some form of frequency, the persistence or absence of something among the themes that emerge (like in interviews). Statements like “the
majority/some/one/none/etc. of the participants stated …” have a
quantitative/numeric connotation simply because this is the only way to logically make
comparisons and draw inferences. While in quantitative research variables represent
numerically expressed characteristics, in qualitative research we might call them instead
categories or themes among others and they can also be quantified in terms of their
frequency of appearance.
In quantitative research, variables come in a variety of different names
depending on their subject (focus), what they express (type), and their position
(function) during their inference process (Figure 2.8). When our focus is on the
population, the variables tend to be referred to as parameters, while when we focus on
the sample they are called statistics. Parameters in science represent measurable
factors that are needed to define/describe a system, an operation, or a situation. In
that sense, they tend to be constant with respect to the time and place of the
observation of the system under investigation. For example, in social sciences
parameters might be used to express the average age of a population, the distribution
of income, etc. at the time of the study. Unless a census is performed, parameters tend
to be inferred from samples of the population.

Figure 2.8 Variables in quantitative research

The main classification of variables in quantitative research is with respect to their type or level of measurement, as it is alternatively known. By this we refer to the
relationship between the numerical values of the measurement, what they represent
and whether the representation has some sort of ordering in it or not. By ordering here
we mean anything that can be arranged either numerically or ordinally. Any variable
that can be expressed numerically (like age, weight, income, etc.) has an inherent
ordering of itself due to the natural ordering of the real numbers. In the case when the
amount of difference between two consecutive values needs to be constant (in other words, values are uniformly spaced across their domain), a variable is also called scale and can be further subdivided into interval scale and ratio scale. The difference between the two is that a ratio scale includes a true origin (zero point in most cases), while an interval scale can have an arbitrarily set origin (zero point). A case in point is the variable temperature, which can be expressed in Celsius, Fahrenheit, or Kelvin units. The Kelvin scale is a ratio scale because it starts at absolute zero and moves on to higher temperatures, while Celsius and Fahrenheit are interval scales. The zero in Celsius (32 in Fahrenheit) is the freezing point of water and corresponds to about 273 kelvin.
In cases where the differences between ordered amounts are not equally
spaced (or we are not sure of their exact spacing), we call these variables ordinal. This
includes cases, for example, like sizes that could exist in the form of small, medium,
large or qualities like well done, medium, rare (when we express the degree of cooking
of a steak). Time tends to organize things in order like past, present, future (the months
within a year, the days within a week, etc.) and in that sense entities referring to time
tend to be considered ordinal. The Likert scale that we will see in the next section is frequently used in social sciences to capture perspectives in an ordered form with options like ‘strongly disagree’, ‘disagree’, ‘neither agree nor disagree’, ‘agree’, and ‘strongly agree’. For statistical purposes this type of ordering can sometimes be considered equivalent to an interval scale of 1, 2, 3, 4, 5 or 0, 1, 2, 3, 4, depending on where one wishes to place the zero.
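A minimal sketch of this coding step, with hypothetical responses, simply maps each Likert option to its integer value.

# Hypothetical coding of Likert responses to an interval-like 1-5 scale, as
# described in the text (one could equally start the scale at zero).
likert_codes = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}

responses = ["agree", "strongly agree", "neither agree nor disagree", "agree"]
coded = [likert_codes[r] for r in responses]

print(coded)                     # [4, 5, 3, 4]
print(sum(coded) / len(coded))   # mean of the coded responses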
When our variables are assigned to represent entities that cannot be formally
expressed numerically, we call them a variety of names like nominal, categorical and,
occasionally, factors. Entities represented as such include gender, occupation,
opinions, perceptions, beliefs, etc. In the special category of variables like gender,
when only two options are available (male and female), the variables are also called
dichotomous. If an inherent ordering is implied (like small and big) then one can even
classify such variables as ordinal. Nominal variables are inherently difficult to process,
and their analysis might be limited (as we will see in the next chapters) in frequencies
of appearance for each category and potential relationships/dependencies among
them. In many cases, nominal variables are used to represent categories/groupings of
other types of variables (scales and ordinal) and as such they are viewed as factors that
act to categorize them. Any assignment of numerical values is done purely for labeling purposes and does not reflect in any way a relationship among the different values of the measure.
A final classification of variables is seen in terms of the function or role they
play in cause and effect relationships that express the phenomenon (or parts of it)
under investigation (Figure 2.9). On the “cause” side of a relationship we have the
variables that trigger the phenomenon and come with names like independent,
predictor (in correlational terminology), and the more abstract x (typically denoting
the unknown in mathematical functions). When an independent variable is “latent”
or unmeasured, or when it refers to a category, we can also call it a factor. These types
of variables might affect phenomena that are not easily expressed through one variable
and are more appropriately expressed through a construct that is represented by a
combination of variables. For example, a construct like leadership or entrepreneurship
might be difficult to express as a single variable and might require a multitude of
attributes/factors for its definition. Factors, in such a case, tend to represent variables.
On the “effect” side we have variables that are affected/change as a result of
“cause” variables and are called dependent, outcome (in correlational terminology),
criterion (used in non-experimental situations), or y (along with any other letter used
in expressing variables). In between (or outside, one might say) cause and effect we
have variables, generally called extraneous. When such variables do not influence the
existence of a relationship but influence its strength or direction, they are called moderators, and when they influence the existence of the relationship they are
called mediators. An additional type of variable naming that is used for the in-between
situations is control variables. These are variables that need to be kept constant during
the evolution of the phenomenon we study to eliminate interferences that might
otherwise cause unpredictable or false outcomes. For example, if we study how women experience gender discrimination (with gender as a categorical variable), we need to control for gender, since men (gender identity considerations aside) are unlikely to qualify as participants for a study of women’s experiences.

Figure 2.9 Variables in cause and effect relationships

An example can help clarify some of the various terms discussed previously.
Let us assume that we are researching the emergence of entrepreneurship and we want
to prove there is a relationship between parents’ entrepreneurial attitudes and their
descendants’ engagement in entrepreneurship. We are, in essence, trying to prove that
the parents’ entrepreneurial attitudes (cause) lead their children to become
entrepreneurs (effect). In this case, the independent/predictor/x variable is ‘parents’ entrepreneurial attitudes’ while the dependent/outcome/criterion/y variable is the
engagement in entrepreneurship of their children. While the aforementioned two
variables are the focus of our study there may be many more that could be of influence
and need to be considered. A moderator variable that could influence the relationship
we study could be the economic environment where parents and children live. If the
society experiences an economic recession, it might suppress or reinforce someone’s intention to become an entrepreneur and even eliminate any possibility of
entrepreneurial success. A mediator variable in our example could be the business
network that the entrepreneurial parents have developed and made available to their
child. This in many cases can be critical to the success of a child as an entrepreneur.

Other potential influences on the relationship we study could come from confounding/intervening variables that we didn’t consider, like the education level of the potential entrepreneur or personality traits like the need for autonomy and control. During our research, we might also want to exclude migrant
entrepreneurs and only consider the native population for our study or focus on a
specific racial profile. In this case, we have the origin or race of the entrepreneur as a
control variable. In quantitative terms, we might say that we control/factor for race.
Because the term factor is interchangeably used sometimes for variables, it is
worth mentioning here that factors are more generic and abstract forms of variables
(unlike weight, height, grades, money, etc.) that usually refer to entities that influence
or categorize a result. In research, such variables are seen as constituents of something
and their lack might indicate lack of the quality/property we are trying to study (like
entrepreneurship or leadership). In natural sciences, a variable like temperature might
be considered a factor (in the role of a moderator here) that accelerates a chemical
reaction, while in social sciences education might be a factor (for example, in the role
of an independent variable and as proxy for foresight) that is necessary for leadership
success. Another view of factors in social science is that they tend to create categories
that are nothing more than groupings according to shared characteristics. For example,
gender might be such a factor that is also expressed as a categorical variable.
Other categorizations of variables exist and appear in quantitative research, but
they tend to be specific to statistical practices and should be considered case by case.
As an example, we can mention here the classification of independent variables into between-subjects and within-subjects. The former case refers to variables or factors in
which a different group of subjects is used for each level of the variable. For example,
if two different interventions/treatments are considered and three groups of
patients/clients are identified (one for each treatment and a control group that will
receive no treatment), then treatment is a between-subjects variable. This occasionally
will be reflected in the name of the research design that can appear as a between-
subjects design. If only one group is involved and we test the various treatments in
sequence (without telling the participants which treatment or no treatment is applied)
and we measure the participants’ responses after each intervention, then the treatment
is considered a within-subjects variable and one can even name the research design a
within-subjects design.
Although we discussed variables and their types in this section, we shouldn’t
forget that in quantitative research variables need to have quantitative characteristics
and should be measurable as numbers. This brings us to the important issue of
operationalization of variables. By this we mean to accurately and clearly detail what
they mean, and what units express their values/attributes. For simple variables like
weight, height, and age this process might be relatively easy and could involve simply
the adoption of popular units like kg or pounds for weight, cm or feet for height, and
integers for age. Setting appropriate levels of measurement (the domain and range
of values variables can take) is also necessary to ensure valid entries are considered.
For example, if we are talking about adult individuals and depending on the type of
research we are doing, one might consider 50–120 kg for weight, 60–210 cm for
height and 18–90 for age as legitimate values. Other variables, like income, might
require a bracketing into categories of say below $20,000, between $20,000 and
$50,000, and above $50,000. Similarly, years in position could be below 2, between 2
and 5, between 5 and 10, and over 10. Certain other variables that are not so easy to measure (for example, ordinal and nominal ones), like perceptions and beliefs, might require a mapping of qualitative terms like small, medium, and large to integers like 1, 2, and 3 respectively. The Likert scale we mentioned before and will see again in the next
section is an example of such variable mapping. Overall, we are trying at this stage to
ensure that our variables and their descriptions are appropriate for the purpose and
research questions of our research and also reflect, in terms of terminology, the
appropriate data analysis methods we will use (like calling them predictor and criterion
or independent and dependent for the purposes of regression).
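As an illustration of operationalization, the sketch below encodes a legitimate age range and the income brackets mentioned above. The exact boundary conventions (for example, whether exactly $20,000 falls in the lower or the middle bracket) are a choice the researcher must document; the ones here are assumed for the example.

# Hypothetical operationalization of two variables: a legitimate range for age
# and income brackets like those described in the text.
AGE_RANGE = (18, 90)
INCOME_BRACKETS = [(0, 20_000, "below $20,000"),
                   (20_000, 50_000, "between $20,000 and $50,000"),
                   (50_000, float("inf"), "above $50,000")]

def valid_age(age):
    # Entries outside the admissible level of measurement are rejected.
    return AGE_RANGE[0] <= age <= AGE_RANGE[1]

def income_category(income):
    # Map a raw dollar amount to its ordinal bracket label.
    for low, high, label in INCOME_BRACKETS:
        if low <= income < high:
            return label

print(valid_age(17), valid_age(45))        # False True
print(income_category(34_000))             # between $20,000 and $50,000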

2.4.4 Instruments
With knowledge of the types of measurements that we need to perform we
come to the issue of the instrument that will be used for collecting the information we
want. At this stage, we are either going to use or modify an existing instrument that
has been used before in similar situations or we are going to develop a new instrument
from scratch. The former case is practically a lot easier than the latter, provided the
existing instrument is reliable (provides similar readings for similar measurements) and
valid enough (measures what it is supposed to measure and nothing else). A proper
literature review will have identified by now the existence and suitability of such an
instrument and all we will have to do is either adapt the wording to our situation
and/or maybe slightly modify the instrument. By this we don’t mean a drastic
restructuring of the instrument but rather a minor alteration that will not in any way
impact its reliability and validity. In the case where our instrument is a questionnaire,
altering the questions to update the context it addresses might be allowed as long as
the constructs it measures remain the same. For example, a questionnaire that has been
developed to measure certain leadership traits of managers in the transportation
industry might be used for managers in the energy industry with the only alteration
being the change of the word “transportation” to “energy”. Even adding or removing
one or two questions might be acceptable. A rule of thumb might be that altering more
than 10% of the instrument might require re-evaluation of its reliability and validity.
These alterations do not include demographics where one can always ask for more
details from the participants without impacting what the instrument measures. In
general, before using or modifying published instruments one needs to make sure their
respective publications adequately describe the constructs and variables they measure,
the coding schemes they use, as well as their performance characteristics (reliability
and validity).
For the purposes of this book we will discuss the process of developing an
instrument. We first need to focus on and understand the sources of data that we need
for our research. If archived data is our primary source, then a process like the
literature review will need to be followed to identify where they are located and how
they can be accessed (by requesting permission or through open access if they are
publicly available). In all other situations, a sample will have to be compiled and an
instrument will be developed for retrieving the appropriate information from the
research participants. According to our research questions and what we want to
measure, many possibilities are in general available. These usually fall under the general term survey (often confused with the term questionnaire), which refers to the overall method of data collection (even a literature review is a survey). One type of instrument
that is usually seen in natural sciences research is an apparatus (or a few of them) that
will measure the physical characteristics of the phenomenon we study or the
physiological characteristics of the individuals in our sample (when living organisms
are involved). As our focus is on social sciences research the development of
apparatuses as instruments will not be explored here.
In social sciences, the most popular methods for collecting information (in
addition to using archival data) are questionnaires, interviews, and observations. In
questionnaires, we have, as one would expect, a collection of questions and requests
for information (like demographics) that we ask the participants to answer and provide
respectively. These questions are developed per the research needs as stated by the research questions and hypotheses we adopted. While Appendix A provides a more
detailed description of the process of developing a questionnaire, we will present here
(Figure 2.10) the basic steps and issues that need to be addressed when developing a
questionnaire and the way we need to organize and structure it.
In general, our intention with our questions at this stage should be to either
validate the individual’s suitability for participating in the study or collect subject
matter information that will help answer our research questions. For practical
purposes, we can split the questions into demographics, validation, and subject matter
categories (Figure 2.10). Demographic questions (like age, gender, occupation, etc.) are used to collect information about participant characteristics, both to ensure participants meet the demographic profile of our study’s population and for use in our data analysis when answering the research questions. Validation questions are more
specific questions aiming at establishing the participants’ levels of understanding and
suitability to answer questions about the phenomenon we investigate. They are often
used for pre-screening purposes, but in the case of widely distributed questionnaires
where the researcher has no control over the participants they are of critical
importance in ensuring instrument validity.

Figure 2.10 Questionnaire development process

The last and most important category of questions concerns the subject matter
questions. A primary issue when developing a questionnaire is the appropriateness of
its questions in collecting the information needed to answer the research questions and
the hypotheses. A typical mistake in developing surveys is to include questions that
somehow relate to the research subject but do not directly relate to the established
research questions. Unless these questions serve some other function (warm-up and
validation purpose), they must be excluded. Each question must be grounded on
literature that raised the importance of the subject (construct or variable) the question
refers to in relation to our population and one of our research questions. This ensures
what is termed construct validity and it is usually addressed by considering the theory
underlying a construct and by evaluating the adequacy of the questionnaire in
measuring the construct. Given that a construct could often include various items and variables, this term is often used as an overarching one that can be broken down into content, criterion, convergent, and divergent validity, among others. Oftentimes these forms are seen as separate from construct validity, but the reality is they all refer to
the instrument and its accuracy. For example, convergent validity is used to ensure
different techniques/procedures are used to collect data about a construct. The triangulation that we will see later on is such a case. Divergent validity on the other
hand ensures that other constructs that might influence our research are not measured
by the instrument we developed and do not “pollute” in this way the readings we get
about our study’s constructs. The degree to which the items of a questionnaire
represent the domain of the property or trait that we want to measure is framed under
the content or face validity aspect. A sub-category of content validity is ecological
validity and is concerned with whether subjects were studied in an environment where
one would expect natural responses rather than laboratory artifacts.
A common practice for ensuring the aforementioned aspects of validity when
developing questionnaires is to have a panel of experts (2–3 usually) that review the
questions and provide initial feedback about their appropriateness in retrieving the information needed to address the research questions of the study. If the panel approves the questionnaire, the researcher should proceed to the next stage and pilot test the instrument in a subsection of the sample; otherwise, the process will have to be
repeated to alleviate the issues raised by the panel of experts. The incorporation of a
panel of experts aims at ensuring content validity.
The pilot study stage is a fundamental phase in the development of an
instrument before its full-scale release. It helps to evaluate the competency of the
questionnaire in addressing the information needs of the research and provides a
pragmatic view of its effectiveness and efficiency for data collection. The latter
includes estimates of the time it takes to complete it, its appropriateness in terms of
language, level of understanding, complexity, comfort, and compatibility with the
respondents’ experience in the subject matter. Oftentimes, in pilot studies, participants
are expected to comment on the quality, clarity, and practical aspects of the
questionnaire (like the time we mentioned before) as feedback to guide further
improvements of the instrument. The pilot study is the major test for a questionnaire
and the information collected should loop back to the development process to ensure
original limitations and inefficiencies are addressed appropriately.
Pilot studies also help address the aspect of validity that is termed criterion-
based validity and concerns the capability of the questionnaire to detect the absence
or presence of the criteria considered in the representation of the constructs and/or
traits that express the phenomenon we study and its various aspects. By selecting a test
group that exhibits the traits we plan to measure we can see the extent to which our
instrument/questionnaire reveals these traits. This presumes of course that the traits
in question can be measured in some other way (other researchers’ instruments that
have been proven in the field), in which case we can claim concurrent validity among
instruments. Only when the results of the pilot tests are encouraging and all aspects of
validity that we mentioned before have been addressed should the instrument be
considered ready for a full-scale release. Even at that stage, it is not unusual to go back
to the drawing board if evidence suggests that further calibration of the instrument is
required. If future results confirm the findings of our study, we can say that we have
ensured predictive validity.
We should mention here that pilot studies are not only necessary for testing
and validating a measurement instrument. Their need is of primary importance to
evaluating the feasibility of our recruitment strategy for the sample participants, the
randomization process (if one has been adopted), and the retention rates in terms of
participation and completion of the intervention we established during the data
recording process. Despite the advantages of pilot studies, we also need to be aware
of their limitations. Pilot studies cannot evaluate safety, effectiveness, and efficacy
issues and cannot provide meaningful effect size estimations (see Chapter 5 for more
on this). In addition, inferences about the hypotheses cannot be made and neither can
the studies be used for feasibility estimates, apart from generalizations about the inclusion and exclusion criteria of questions in the pilot study. It is suggested that researchers
form clear and realizable objectives regarding the purpose of their pilot studies and the
evaluation of their outcomes.
While questionnaires dominate quantitative research (especially in social
sciences), oftentimes interviews and observations might be enlisted to collect
quantitative data. These solutions are considered qualitative data collection techniques
so we will just mention their basic characteristics here. The interested reader can find
ample material on the Internet (including this book’s website). Starting with interviews,
the process we follow is to a great extent similar to questionnaires, with the only
difference that the researchers or their designees ask the questions and record the
answers. An interview protocol/guide, together with the interviewer, plays the role of the instrument. While the development of the interview protocol follows the same guidelines and process as questionnaires, the human element of the interviewer needs special consideration. Personal biases need to be set aside when conducting and transcribing an interview. In addition, the face-to-face interaction with the research participants needs to be as humane and, at the same time, as objective as possible. In case face-to-face interviews are not possible, telephone and online interviews can be used, and even asynchronous forms like email could be considered. Pilot interviews also need to be conducted to ensure the interviewees and the protocol perform as expected, and the protocol should be modified accordingly if needed.
Apart from interviews, another qualitative form of data collection popular in
social sciences is observations. The instrument in this case is none other than the
researcher who conducts the observations. The researcher could use in this case a
check-list of what needs to be observed or record information as it happens in front
of them. Observations are mainly considered a qualitative technique and, like the
interviews, when applied for quantitative data collection a preprocessing will have to
be performed on the collected information to retrieve the quantitative data that will be
used in the analysis. As with the interviews, the researcher needs to be as objective as
possible to eliminate personal biases that might influence the collected information.
A final topic relating to instrumentation that we will mention here is when
multiple instruments are used for data collection. This is called triangulation and it is
a typical strategy for ensuring the validity of the information we collect. Possible
combinations of data collection methods might include any three of the following:
questionnaires, interviews, observations, archival data. The last could be in the form
of past research results, census data, and government, organizational, company,
and/or media reports, among others. Usually, in triangulation, one of the data
collection methods (like the questionnaire) acts as the primary source of data, while
the others act as supporting sources that could confirm or reject the findings of the
primary source. Attention should be paid here that, for validity purposes, all three sources need to refer to the same data and not simply complement each other, since complementary sources introduce distinctly different data that would also be considered primary data (requiring their own validation). Further discussion on instrument validity has been
deferred to section 2.4.6 Evaluation of Findings.
Another requirement in academic research that we need to mention at this
stage is the need to provide ample information to participants (Figure 2.10) about the
research objectives, what their participation involves, and assurances about their
confidentiality and privacy. Even in cases where the purpose or parts of the research
cannot be revealed (probably because they could influence participant
behavior/responses), participants need to know that all aspects of safety have been
considered and their personal identifications will be protected from public exposure
without their consent. To ensure all proper ethical and safety issues have been
considered, researchers usually go through some sort of review process conducted by
an independent body (institutional review board – IRB) at their parent organization
and/or at the data collection site. Such boards consider any ethical and safety concerns
that might be raised and the measures that researchers take to comply with
international and national research guidelines before approving an instrument and the
entire research for accessing participants (including humans, animals, and any other
source of data).
Researchers are required to inform their research subjects ahead of their
participation in the research and ensure they understand everything involved in the
research process, harms or benefits, and how privacy and confidentiality are going to
be ensured. Typical solutions for the last point are to use randomly assigned aliases or even to avoid collecting any personally identifiable information like names, addresses, etc.
Additional information might be needed, depending on the research, that identifies
inclusion and exclusion criteria for participants. This will allow the researchers to
decide on a subject’s suitability for participation, as well as their rights to withdraw at
any time they wish to do so without any impact on them. Special cases of vulnerable populations (like children, the disabled, or criminals) must be considered per the
guidelines and requirements set by the IRB of the researchers’ parent organizations
and the source sites.

2.4.5 Data Collection, Processing, and Analysis


With the design decisions made we need to consider the practical aspects of
data collection, processing, and analysis. Outlining these in detail is important so others
can follow them if needed and replicate the study and reach (hopefully) similar results.
This is important for ensuring the reliability of our study, and in cases where we
investigate cause and effect relationships they will serve as the strongest proof that our
results represent the reality of the phenomenon under investigation. While the research
design represents the higher-order decisions about how we will conduct the research
(sample, instruments, etc.), in this phase the process needs to outline the exact steps
that we will take for collecting, processing, and analyzing the research data.
While most of the “what” will happen and “what” we will use has been addressed in the research design, we are more interested here in “how” we will collect the data, “how” we will process them (like conversion, coding, etc.), and eventually “how” we will analyze them. Given that the latter will be covered in the
following chapters (Chapters 3, 4, and 5) we will focus here on data collection and
processing. The discussion will be brief as most of this is research dependent and could
vary according to circumstances.
Data collection begins by identifying and inviting our sample to join our
research. In the case of a questionnaire, this might entail publishing it on the Internet
(Google Forms, SurveyMonkey, or a similar service) and distributing invitations for
participation physically (like with flyers, advertisements, announcements, etc.) and/or
online (like in social media sites, bulletin boards, etc.). Incentives for participation might be offered but should be kept to a minimum, as otherwise they might distract from the participation itself and could also attract unqualified participants. In the case of questionnaires, the general practice nowadays is to have
them online so that participants access them through a link included in the call for
participation. This way the logistics of accessing, submitting, and collecting the data are simplified. Most online platforms offer options for easily retrieving the data in the form of tables (like in Excel format). This makes it extremely easy afterwards to import them into statistical analysis software for further analysis. For interviews, one needs to
be more attentive in terms of accommodating the identified participants’ choice of
location and time unless there are specific reasons for meeting in a specified place and
time (like in interventions). For other forms of data collection (for example,
observations and archival data), the researcher will have to make arrangements with
the site that will provide the data as to how they will be collected.
Having retrieved our data from our various sources, we might need to screen them for ineligible and/or erroneous entries, for missing entries where one was required, inconsistencies, impossible data combinations, out-of-range values, etc. With questionnaires, this is relatively easy given they usually involve selections among existing options (per question) or single entries (for example, age) that can easily be checked for valid values. At this stage, we might also decide if additional classifications/conversions are required. This might include converting values to the same units (like having all currency in euros or dollars) or categorizing entries according to groupings; for age, for example, we might decide that instead of the actual age we provide categories like ‘below 30’, ‘between 30 and 40’, and ‘above 40’. Additionally, transformations might be required before the analysis phase, like converting monthly salary to yearly income, normalizing variables, applying formulas, etc.
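A minimal sketch of this screening and preparation step, using a few hypothetical questionnaire records, might look as follows.

# Hypothetical raw records as they might arrive from an online questionnaire.
raw = [
    {"id": 1, "age": 34,  "monthly_salary": 2500},
    {"id": 2, "age": 230, "monthly_salary": 3100},   # impossible age -> screened out
    {"id": 3, "age": 51,  "monthly_salary": None},   # missing entry -> screened out
]

def screen(record):
    # Keep only complete records with in-range values.
    return record["monthly_salary"] is not None and 18 <= record["age"] <= 90

clean = []
for r in filter(screen, raw):
    r = dict(r)
    r["yearly_income"] = 12 * r["monthly_salary"]          # transformation
    r["age_group"] = ("below 30" if r["age"] < 30
                      else "between 30 and 40" if r["age"] <= 40
                      else "above 40")                      # categorization
    clean.append(r)

print(clean)   # only record 1 survives the screening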
In cases when our data are in a narrative form (like in interviews and open-
ended questions), retrieving the quantitative information might require more effort
especially when the information is not expressed in numerical form. For example,
themes might need to be identified so that their frequency of appearance might be
used as their quantitative representation. The co-appearance or absence of themes might also be of value to our analysis, so records of such instances might be retrieved for further analysis.
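As an illustration, the sketch below counts the frequency and co-appearance of a few hypothetical themes identified across four interviews.

from collections import Counter

# Hypothetical themes identified in four interview transcripts during coding.
themes_per_interview = [
    ["autonomy", "risk taking", "family support"],
    ["risk taking", "autonomy"],
    ["family support", "networking"],
    ["autonomy", "networking", "risk taking"],
]

# Frequency of appearance of each theme across interviews (a simple
# quantitative representation of qualitative material).
frequencies = Counter(t for themes in themes_per_interview for t in themes)
print(frequencies.most_common())

# Co-appearance of two themes within the same interview.
both = sum(1 for t in themes_per_interview
           if "autonomy" in t and "risk taking" in t)
print(both)   # 3 interviews mention both themes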
Processing and analyzing the data comes after their collection and
“cleaning”/preparation for the analysis techniques we will apply. At this stage, we need
to have a clear analysis strategy for testing our hypotheses (when necessary) and
answering our research questions. The appropriateness of the statistical tests that we
will use needs to be discussed and justified in light of our research design (like making
sure variables/constructs meet the necessary assumptions of each statistical test). A
detailed presentation of the various methods and techniques available for data analysis
as well as proper reporting of the results can be found in the remaining chapters of
this book.

2.4.6 Evaluation of Findings


Following the results of the analysis (statistical or other), we move to one of
the most critical phases of research, the evaluation of findings (Figure 2.11). At this
stage, we need to appraise our results and report what they mean with respect to each
of the research questions of the study. Our interpretations need to consider any assumptions and limitations (including delimitations) imposed by our methodology, research design, data collection, and analysis, and they must be made in light of the theoretical framework of our study.

Figure 2.11 Final research phases

Assumptions come from a variety of sources and could be due to the methodology (quantitative in our case), research design, data collection, data analysis,
and data interpretation practices we followed. Although we raised the issues posed by
our choices of the aforementioned elements in previous sections, we will present here
some elements of importance. For both methodology and research design a basic
assumption is that the ones selected are the appropriate ones for our study and the
research questions we adopted. A design element that we haven’t appropriately
discussed and is a basic assumption in most research is that the sample of the study,
in addition to being representative and of the right size, includes participants qualified
to provide information about the phenomenon we study and that the instruments we
use can capture that information. We also assume the participants are willing to
provide the required information honestly and accurately.
Regarding limitations, we are usually concerned with those elements that we
cannot control, for example limited geographical coverage in our sample, finding
qualified participants, using an instrument that might not be as accurate and reliable as
we need, etc. We are mainly concerned here with weaknesses to interpretation and
validity and the extent to which our methodology and design choices have addressed
them. In most instances, any assumptions we made will become limitations. In the
same category as limitations we often include delimitations as specific choices that
narrow the scope of the research. One such case might be our choice to conduct
quantitative research, since it limits contributions from qualitative research. Similarly, we have to consider if the methods of analysis we choose carry with them conditions
that should also be addressed, otherwise they will limit aspects of the interpretation of
the results. The difference with limitations is that delimitations are usually under our
control (choices we make), while limitations might include externally imposed
restrictions. For example, our study might be delimited to a specific geographical area
or to a specific expertise for our sample. Because delimitations also impose restrictions on our research, they are often cited under the limitations. In general,
delimitations should not be viewed as either good or bad but as restrictions imposed
on the scope of the research. As long as one is aware of them in interpreting the
findings of research and makes note of them it should be fine.
An element that was mentioned before is validity. Issues with validity have been discussed in the various sections (methodology, research design, instrument development, etc.) as they were raised by the choices we made, so it is worth mentioning at this point some characteristics of validity that affect our evaluation of findings and usually show up in quantitative research. As we mentioned before, validity refers
to the accuracy of something in representing something else. It is directly associated
with realism and the notions of truth, although it can also be seen in relativism under
the guises of transferability, credibility, dependability, and trustworthiness, among
others. A general categorization of validity is in terms of the explanations we provide
and the generalizability of our results. The former is identified as internal validity or
validity of explanation and simply assesses whether the explanations and conclusions
derived from research realistically represent the phenomenon under investigation. The
latter is referred to as external validity or validity of generalization/population and
provides an assessment of the extent to which the results of a research can be
applicable and generalizable to a wider population, across populations, and across time
and/or contexts (ecological validity).
Internal validity focuses on sources of bias and error as a result of the research
design choices we made. If we consider some of the design elements in Figure 2.4 we
can easily see that such issues could be due to choices we made regarding the
perspective, population, variables, and instrument. Since instrument-related validity
was discussed in section 2.4.4 we will discuss here some of the remaining design
elements. Regarding the perspectives we might have adopted, validity becomes of
importance in designs (like the correlational) where cause and effect relationships are
sought. Justifying such relationships relies on proper selection of independent and
dependent variables, choosing the appropriate statistical test, making sure a
representative sample is selected, and that the data collection process and time do not
alter or affect in any way the environment of the phenomenon we study. Samples are
only abstractions of populations and as such can never equal populations. This reality
is one of the threats to internal validity and refers to selection bias in recruiting research
participants. This could be intentional by “polluting” the sample with participants of specific characteristics (like when we recruit offering rewards and “volunteering”
others — often called volunteer bias) — or unintentional and due to the availability of
potential participants (ecological validity). Additional issues with internal validity might arise in the form of testing effects, like when the participants get familiar with the
process and as a result they become more efficient or aware of the investigation subject
and process (learning effect) or when fatigue gets accumulated during the participation
process (experimental fatigue). A final point regarding internal validity is the
contribution or the effect time plays in it. By time we are not only considering how
generalizable the findings are over time but also the fact that in most cases participants
respond at different moments in time. For example, for a questionnaire it could be
that the time difference between the first and the last respondent is long enough
(months, sometimes) to allow for experiences to alter perceptions of the phenomenon
we investigate (history effects).
Regarding external validity, we need to keep in mind that it refers to the
generalizability of findings. As such, it could be affected by whatever influences
populations. This includes social changes and perceptions that change over time and
could risk making the results of our study outdated (history effects), different
population demographics from those of the sample making our study inapplicable,
and the extension of the findings to population situations different from the ones we
studied. Given that samples can never equal populations (unless in exceptional cases),
external validity is an ideal that could never be perfectly achieved. What we can
address, though, is the importance we place on it and the extent to which we can
generalize our findings.
After consideration of the assumptions and limitations that might affect the validity of our results, a discussion of the role of the findings in supporting or extending our theoretical framework should follow. In the case of applied
research (like for Doctor of Business Administration degrees or related), this
discussion needs to be in light of the theories that supported our investigation and
their applicability to the phenomenon we investigated. Additionally, provided our study or parts of it have been researched in the past by others, an extensive comparison of our findings with past findings in the field, as revealed by our literature review, should be considered. The
emphasis here should be on results that confirm or most importantly contradict past
findings. The latter might suggest further research to confirm or expand upon what
we found.

2.5 Conclusions and Recommendations


Following the directives of the purpose of our research and considering the limitations imposed by our various choices and the realities of the data collection and analysis, we are in a position to draw conclusions about the description of the
phenomenon under investigation and its representation in terms of the variables,
constructs, and parameters we have considered. We are mainly interested here in the
implications of our research in light of the significant descriptive information revealed
by our study and the acceptance or rejection of our hypotheses. Theoretical
implications need to address the suitability of the theoretical framework we developed
or expanded or the extent to which it represented the phenomenon we investigated
(in the case of applied degrees). Practical implications should consider application of
the results in the form of suggestions about products, services, policies, procedures,
strategies, etc. Overall, the research findings need to be put back into context by
discussing how they respond to the problem of the study, how they demonstrate
significance, and how they contribute to existing literature.
Caution is required at this stage so that we avoid drawing conclusions beyond
the scope of our research and what our research findings can support. If the
framework that we proposed was not adequately supported or the findings suggest
areas that will need further investigation, we need to clearly state this. Despite potential
disappointments for findings that do not prove what we wanted to prove, what we
didn’t find might be as important as what we found. Eliminating possible explanations
and/or constructs as influences on the phenomenon we studied might help point future
researchers toward more promising directions.
Once conclusions and recommendations are drawn, their dissemination
follows. This is probably the most important part of research as it informs the world
about our results and invites everyone to evaluate, critique, confirm, and expand upon
what we found. Chapter 7 of this book addresses the publication aspect of
dissemination of research findings. What is worth pointing out at this stage is that the
research process loops back to target other aspects of the problem we investigated
or serves as a stepping stone to address other related or similar problems (Figure 2.12).
Experience and knowledge gained can help point researchers in directions of new
inquiries that further expand our understanding of the natural and social worlds.
Appendix B includes a snapshot of the application of the approach outlined here on a
real research project.

Figure 2.12 The quantitative research process



3 Populations

Conclusions in quantitative research are based on inferences we make about


the population of our study. As such, defining populations, understanding their
characteristics, and expressing them accurately in quantitative form is important in
providing context and establishing their boundaries. Population characteristics are
specific to our unit of observation (individuals, groups, organizations, pieces of data,
etc.) and can include, in the case of people, specific attributes, traits, experiences,
attitudes, impressions, perceptions, etc.

3.1 Profiling
The process of specifying the sets of characteristics that describe and uniquely
define a population is called profiling. It involves establishing the extent to which a
characteristic (for example, university education) is shared among the population
members and is usually expressed in the form of frequency of occurrence for small
populations (oftentimes below 30 units/members) or in the form of a percentage of
the total population for larger populations (it makes no sense to state percentages for,
say, a population of 5 units). For example, in the case of the educational levels of
individuals who compose the population of a city, instead of the actual counts, we
might say that 12% had university-level education, 18% had technical-level education,
45% had finished high school, 16% elementary school, and 9% had no education at
all (Table 3.1). For an easier comparison of such results this information is usually
depicted in the form of bar and pie charts (Figure 3.1).

Table 3.1 Frequencies and percentages


Educational Level Frequency Percentage
University 12000 12
Technical 18000 18
High school 45000 45
Elementary 16000 16
No education 9000 9
Total 100000 100

While the previous profiling case of nominal/categorical characteristics can be


accurately described with frequencies and percentages, it cannot cover the case of scale
(interval or ratio) variables where continuous and discrete observations are needed to
represent the characteristics of the population we study. Such cases include, as we saw
in section 2.4.3, characteristics such as age, income, scores, etc. While these can also
be grouped in categories, most of the times they are treated as continuous variables
that reveal more details about the population. Some individuals in our population
might have distinct values different from anyone else’s (outliers), rendering the
representation of the population profile with charts (like the bar and pie chart of Figure
3.1) meaningless. A more appropriate form of graphical representation in such cases
is through a distribution, as we will see in section 3.3. For now, we will focus on ways
we can confer the profile of observations by abstracting them in representative values
like mean, median, mode, etc.

Figure 3.1 Profile representation for categorical variables

Let us assume that the population of individuals we study is distributed as


Figure 3.2 displays per a specific characteristic (say height). If we were to plot this
characteristic in the form of a bar chart/histogram1, we might observe something like
what Figure 3.2.a shows. We can see that there is a tendency for most of them to be
relatively tall (maybe the population was basketball players) and one would expect a
representative value to be towards the higher values of height. By representative here
we mean an educated guess with respect to height when one considers a population
like the one we study.
Communicating a continuous variable (Figure 3.2.b) or even a discrete one
with all the values it can get in the population (it could be in the millions) is not as easy
as with the percentages and the charts of categorical variables. We need to find
“creative” ways to express the shapes of the curves (Figure 3.2.b) that represent the
profiles of the characteristic we study in a compact and realistic way. Probably the
most classical value that can be used in such cases is the mean (symbolized with the
Greek letter μ) or average, as it is popularly known (or expected value in probability
distributions language). This is nothing more than the sum of all values divided by the
total number/count of values.

1 SPSS: Analyze => Descriptive Statistics => Frequencies

Figure 3.2 Bar chart and distribution of a scale variable

Keeping in mind that all population parameters are represented (by convention)
with Greek letters, we have:

(3.1)  μ = (x1 + x2 + … + xN) / N,  where N is the population size
or in its more compact form:

(3.2)  μ = (1/N) Σ xi  (the sum running over all N population values)
By using the mean, we presume that our population with respect to the
characteristic we measure is equivalent to a population where all individuals have
the value μ. We can imagine how misleading this can be, as oftentimes μ can take values
that don’t even exist in the population it comes from. Consider for example the set of
numbers 32, 36, 40, 26, 30, 28, 70, 34, which we will assume represent the distances of
municipalities (choose whatever unit of distance you want) from a city center across a
highway (Figure 3.3). For the purposes of this example let us also assume that all
municipalities have the same number of residents. We want to identify the location of
a firehouse that will serve all eight municipalities. By calculating the mean, in other
words if all populations were in one location, we come up with μ = 37. Oddly enough
this number is not included in the population/municipalities we study so we have an
entity/mean representing the population that does not belong in the population.

Figure 3.3 Population mean and median

While the mean might seem representative enough of a population, in many


cases it tends to be influenced by outliers (observations farther away from the others)
as Figure 3.3 shows. In our example, and considering equal chance of fires among the
municipalities, it seems that the solution the mean suggests might not be optimal. One
would want to build the firehouse where most of the houses reside. In such cases a
more representative parameter might be the median of the population (often
symbolized as Mdn, η, and Q2, among others), which is nothing else than the midpoint
of the frequency distribution of observed values. In our example, we can order the
numbers first from smallest to largest like 26, 28, 30, 32, 34, 36, 40, 70 and select the
one exactly at the midpoint. In our case, the midpoint is not an existing number and
is 33, the midpoint/average of 32 and 34. If we had an additional value, say 240, then
the numbers 26, 28, 30, 32, 34, 36, 40, 70, 240 will give us a median of 34 (the 5th
number in the 9 numbers of the population). We can see in this way that extreme
values influence the median much less than the mean, which with the addition of 240
shifts from its initial value of 37 to roughly 60 (536/9 ≈ 59.6, not ideal for our firehouse).
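As a quick numerical check of this example, the short Python sketch below (assuming the NumPy library is available; the book's own SPSS procedures are referenced in the footnotes) reproduces the mean and median of the municipality distances and shows the effect of the extreme value 240:

    import numpy as np

    distances = np.array([32, 36, 40, 26, 30, 28, 70, 34])
    print(distances.mean())       # 37.0  (the population mean, mu)
    print(np.median(distances))   # 33.0  (the midpoint of the ordered values)

    with_outlier = np.append(distances, 240)
    print(with_outlier.mean())      # ~59.6: the mean is pulled toward the outlier
    print(np.median(with_outlier))  # 34.0: the median barely moves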

Because the median provides an indication of where the midpoint of an


ordered list of values is, it is used as a measure of the central tendency of the list. In
simple terms, the median is the point where half of the population will be below that
point and the other half above it. While this is quite informative it doesn’t tell us much
about how these population halves are spread below and above the median. To get a
more descriptive idea of the values distribution we can treat each half independently
and calculate their median (and their mean if needed). The median of the lower half
of the ordered population is called first or lower quartile (symbolized as Q1), while
the median of the upper half is called third or upper quartile (symbolized as Q3).
The designation “quartile” is used because each point is found in quarterly (25%)
increments of the population (Figure 3.4). As one can guess, the second quartile Q2 is
no other than the median of the population. The distance between Q1 and Q3, called
the interquartile range (IQR), provides valuable information as it indicates the
central range around the median within which 50% of the observations are included.
This can be useful when we are interested in the density around the median, but we
should keep in mind that this information (as is with the quartiles anyway) ignores
information about the variability of the individual values.
Along with the minimum and maximum of the observations the quartiles are
pictorially captured in a box plot2 (bottom of Figure 3.4). Alternatively, one can
include the numbers in the order they appear with the extremes (minimum and
maximum) in what is called a 5-number summary like (min, Q1, median, Q3, max). In
the example of the firehouse of Figure 3.4, the 5-number summary will be (26, 29, 33,
38, 70). The extremes (minimum and maximum) are an indication of what we should
expect to see (nothing outside these values exists) when making observations and
define the boundaries of the population in terms of accepted values for the specific
characteristic. The range of the available observations as defined by the difference
between the minimum and maximum (in our case it will be 70 - 26 = 44) is referred to
as the absolute spread of the specific population characteristic.

2 SPSS: Graphs => Chart Builder Box-Plot
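A minimal Python sketch of the quartile logic described above (median of the lower and upper halves) might look as follows; note that statistical packages may use slightly different interpolation conventions and report marginally different quartile values:

    import numpy as np

    ordered = np.sort([32, 36, 40, 26, 30, 28, 70, 34])   # 26 ... 70
    half = len(ordered) // 2
    q1 = np.median(ordered[:half])         # 29.0 (median of the lower half)
    q2 = np.median(ordered)                # 33.0 (the population median)
    q3 = np.median(ordered[-half:])        # 38.0 (median of the upper half)
    iqr = q3 - q1                          # 9.0
    five_number = (ordered.min(), q1, q2, q3, ordered.max())
    print(five_number)                     # (26, 29.0, 33.0, 38.0, 70)
    print(ordered.max() - ordered.min())   # 44, the range/absolute spread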



Figure 3.4 Quartiles (top) and box plot (bottom)

A final parameter that is occasionally used (mainly when the values are discrete
and integer) to describe populations is the mode (symbolized Mo). This is nothing
else than the most frequent/popular value within the population. In a frequency
distribution graph, the mode is the value at the peak of the curve and in a bar graph the
value corresponding to the highest bar (green line in Figure 3.5). For example, in a 9th grade class
one would expect to find most students to be of age 14. Another example is the most
frequent item in a supermarket order (like water). In practice, mode is oftentimes
reported in census data as the most frequent value of a population. For example, when
census data report that a country has a young population, they directly refer to the
mode of the population age.

Figure 3.5 Characteristic population parameters

The main purpose of the parameters we’ve seen so far is to provide an


abstraction of the profile of the population we study due to our inability to efficiently
communicate the complete shape of the distribution curve. If we were to superimpose
these parameters on the graph of a distribution like the one in Figure 3.2.b, we will see
their relative positions (Figure 3.5). What happens as we consider more parameters is
that we get a better idea of the shape of the curve. With only one parameter, like the
population mean (Figure 3.6.a), it is difficult to say anything about the shape of the
curve. With the 5-number summary we see that the situation improves (Figure 3.6.b)
as we get a sense of how the curve might be, while when we consider all the parameters
we mentioned up to now (Figure 3.6.c) a more precise image begins to appear.

Figure 3.6 Parameters and curve shape approximation



One could continue adding parameters to approach the population


variability/curve even more (like including the midpoints between quartiles), but it is
apparent that this process adds complication and we lose the advantage that
abstractions, like the mean and median, offer in representing the characteristics of a
population. While the parameters we mentioned up to now can provide indications of
the population variability with respect to the characteristic we are studying, they can
pose major issues in some cases like when outliers exist and when positive and negative
values are involved that can cancel each other out when summed (Figure 3.7). The
resulting mean in such cases could provide completely misleading information or an
impossible representation of the population characteristic.
While the mean can be misleading at times, its popularity in representing
populations is far greater than the median or any other measure we will consider. This
is also due to the fact that other measures like the median, in addition to their own
challenges, can be cumbersome to calculate, as they would require a lot of effort
(imagine having to order the millions involved in a population to find the median),
rendering them in essence impractical. Also, the mean works nicely with statistical
techniques (see next chapter) and as such is the preferred choice for profiling
populations.

Figure 3.7 Influence of opposite sign values on the mean

To counter the deficiencies of the mean (Figure 3.7) we can complement it


with an additional metric that reflects the variability (also called dispersion) around it.
What we need, in other words, is a measure of how far, on average, are the values of
the population from the mean. One such possibility would be to consider the average
of the difference/deviation of each value from the mean (second column of Table
3.2). As we can see for our example values for the firehouse (Figure 3.3), this results
in a zero value for the mean, which is quite misleading, not to mention meaningless,
as it suggests zero variability (like if all values were the same). To eliminate the negative
results that the deviation calculations created we can consider the absolute value of the
deviations (third column of Table 3.2). This results in a mean that is called mean
absolute deviation (MAD) or mean absolute error, and it would suggest in our case
that most of the municipalities will be found in a range of 9 units from the mean. Since
the mean in our example is 37 the absolute deviation in our case (moving 9 units to
left and right of the mean) will include any value between 28 and 46. As Figure 3.8
depicts, this includes the municipalities at 32, 36, 40, 30, 28, and 34.

Table 3.2 Value variation calculations


x x-μ |x-μ| (x-μ)2
32 -5 5 25
36 -1 1 1
40 3 3 9
26 -11 11 121
30 -7 7 49
28 -9 9 81
70 33 33 1089
34 -3 3 9
Sum 296 0 72 1384
Mean 37 0 9 173

Figure 3.8 Mean absolute deviation and standard deviation

An alternative to MAD, that is preferred in statistical testing, is variance


(symbolized as σ2 or var(x)) and it is expressed as the mean of the squares of the
deviations of all values from the population mean:

(3.3)  σ² = (1/N) Σ (xi − μ)²
Variance should not be confused with variation as the latter is an assessment
of an observed condition that deviates from its expected or theoretical value while the
former is a quantifiable deviation away from a known baseline or expected value (the
mean in most cases). Squaring the differences from the mean, like we did here, has the
effect of converting it to a positive value regardless of its sign. In our case, it is also
magnifying the difference from the mean, as the last column of Table 3.2 shows,
resulting in a value of 173 units from the mean for σ2.

Given that the square power the variance carries might not mean much in
terms of our initial units (that are now squared), one can revert the influence of the
square by calculating the square root of the variance. This quantity is called standard
deviation (SD) and symbolized with the Greek letter σ (remember all population
parameters are symbolized by convention with Greek letters):

(3.4)  σ = √σ² = √[(1/N) Σ (xi − μ)²]
Taking the square root of the variance of our example (173) we get about 13. This value is higher than
MAD and, as Figure 3.8 shows, it includes more observations. One could say that σ is
more considerate of outliers and so it is more informative than MAD. While this is
true, one can see it as an advantage or disadvantage depending on the situation. For
example, if we don't want to be influenced by outliers, using the standard deviation could be a
disadvantage. In general, though, the mean and standard deviation (μ and σ) approach is easier to handle
in terms of the math involved and as such it is preferred in profiling populations and
samples, as we will see later. Mean and standard deviation are reported together in the
form μ = 37, σ = 13 or M = 37, SD = 13.
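The calculations of Table 3.2 can be verified with a few lines of Python (a sketch assuming NumPy; variable names are illustrative):

    import numpy as np

    x = np.array([32, 36, 40, 26, 30, 28, 70, 34])
    mu = x.mean()                 # 37.0
    mad = np.abs(x - mu).mean()   # 9.0   (mean absolute deviation)
    var = ((x - mu) ** 2).mean()  # 173.0 (population variance, sigma squared)
    sd = np.sqrt(var)             # ~13.2 (standard deviation, sigma)
    # np.var(x) and np.std(x) return the same population values because their
    # default ddof=0 divides by N; ddof=1 gives the sample versions of Chapter 4.
    print(f"M = {mu}, MAD = {mad}, var = {var}, SD = {sd:.1f}")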
An interesting observation with respect to both the variance and standard
deviation is when their values become very small or very large. The closer the values
approach zero, the more they suggest a constant instead of a variable, while the larger
they become, the more erratic the variable will appear, probably suggesting that it
doesn't relate at all to what we are studying.
If we were to indicate now the spread of the population around the mean that
a standard deviation distance includes in a profile curve, we will get the shaded areas
of Figure 3.9 (for the example of the basketball players (a) and the municipalities of
the firehouse (b)). The reason these two distributions were placed side by side was to
showcase the shapes a profile curve can take. The way the curve leans provides
valuable information about the tendencies in a population (like where the majority and
outliers are with respect to the mean) and as such the parameter skewness was derived
to express such tendencies. It is a measure of the asymmetry of the curve with respect
to the mean and is measured by a variety of formulas (like Pearson’s moment
coefficient of skewness). For practical purposes, one such formula is Skewness = 3*(μ
- η)/σ. A rule of thumb for considering significant deviations from a symmetric profile
is when it is more than twice the standard error (discussed later). For now, and based
on Figure 3.9, we can say that our basketball player population is negatively skewed
(Skewness < 0) with respect to height (Figure 3.9.a) and our municipality profiles are
positively skewed (Skewness > 0) with respect to distance (Figure 3.9.b).

A final parameter that will be mentioned here and is used to express the profile
curve of populations is kurtosis. It is a measure of the flatness of the tails of a
distribution curve (or as sometimes presented, a measure of the pointiness of the
curve). Positive values indicate leptokurtic/pointier curves, while negative values
indicate platykurtic/flatter curves (Figure 3.10). Zero kurtosis value is an indication of
a mesokurtic or normal curve. It is worth mentioning here that skewness and kurtosis
are not mutually exclusive. A curve that displays skewness can also show kurtosis
(although it would be evenly distributed across its tails).
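As an illustration, the practical skewness formula mentioned above can be computed directly, and libraries such as SciPy offer moment-based versions of both skewness and kurtosis (a sketch, not the book's SPSS output):

    import numpy as np
    from scipy import stats

    x = np.array([32, 36, 40, 26, 30, 28, 70, 34])
    pearson_skew = 3 * (x.mean() - np.median(x)) / x.std()  # ~0.91 > 0: positively skewed
    print(pearson_skew)
    print(stats.skew(x))      # moment-based skewness coefficient
    print(stats.kurtosis(x))  # excess kurtosis (0 corresponds to a mesokurtic/normal curve)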

Figure 3.9 Mean and standard deviation

Figure 3.10 Kurtosis effect of profile/distribution curve



In closing this profiling section, it is worth indicating that all the parameters
we considered stemmed from our inability to express the profile in terms of the details
of each individual observation. What we need to be aware of, though, is that
parameters are abstract representations of something complicated so at no time should
they be considered as possible substitutes for individual observations. In fact, our
tendency to oftentimes apply profiling characteristics to individuals at random (called
stereotyping) can greatly impede decision making.

3.2 Probabilities
Having discussed how quantification of population characteristics can guide
their representation with abstractions like parameters, we need to introduce here the
very important concept of likelihood. By this we refer here to the chances a specific
characteristic or groups of them have in appearing randomly. This likelihood of
appearance is termed probability3 (symbolized with p) and is valued by convention
between 0 and 1. What we do with probability is imagine that we squeeze (map is the
official term) the population to 100 units of observation (we will be calling them
individuals occasionally for convenience) and represent each characteristic with its
fraction/ratio in this special population. It’s like a form of imposed stratification of
the population for a better mental representation.

Figure 3.11 Probability definition

A popular way of conceptualizing profiles visually that directly relates to


probabilities is through Venn diagrams (Figure 3.11). A population set U (called
conformation space, sample space, universe set, or plain universe) is represented with
a rectangle that engulfs groupings (circles) of attributes/values A, B, and C in Figure

3 Actually, likelihood is proportional to probability, but for our purposes here we will
consider the proportionality constant equal to 1 making likelihood and probability the same.
80 QUANTITATIVE RESEARCH METHODS

3.11 as well as all remaining attributes (remaining/gray space of rectangle). If we


assume that the number of elements (also called cardinality) in the universe is n(U) and
the number of elements with attribute A is n(A), the probability of the attribute A
(similarly for the other attributes) is given by the formula:

(3.5)
In practice, we have three ways of calculating probabilities. We can apply a
theoretical formula that has been developed with formal mathematical techniques (we
will see this when we discuss distributions), empirically using direct observations (we
will see this when we discuss samples), and intuitively (we won’t see this) when we
base it on past information, other observations or simply on instinct. Obviously, one
would expect the last to be the least reliable estimation as it will be based on partial
information and influenced by observer subjectivity and bias. Consequently, we will
not be dealing with it in this book.
Leaving the theoretical calculation of probability aside for the moment, we will
discuss now the stuff that makes up its essence in as practical a way as possible and in
light of their role in quantitative research. In its most basic form, the probability of an
observation/event/outcome can be calculated as the ratio of the number of times the
observation can occur over the total number of outcomes in existence, as formula 3.5
indicates. Because the numerator can never exceed the denominator (which is the
universe), the result will always be between 0 (zero) and 1 (one). Considering for the
sake of simplicity only two attributes making up a universe U, there will only be
two possible geometric arrangements, as Figure 3.12 indicates.

Figure 3.12 Possible attributes arrangements

The situation in Figure 3.12.a represents a disjoint set of attributes where each
member of the populations has only one of the two attributes (A could include males
and B females), while the situation in Figure 3.12.b represents overlapping attributes
where some member can have both attributes together (A could be females and B
could be engineers). Questions that might be of interest here are how many individuals
(or what percentages of them) have one or the other attribute and how many have
both. The former represents the addition of the areas of each attribute (termed union
and symbolized with ∪, which should not be confused with the universe U),
while the latter is just the overlap (termed intersection and symbolized with ∩, an
upside-down ∪).
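For readers who prefer code to diagrams, Python's built-in sets mirror formula 3.5 and the union/intersection ideas directly; the membership lists below are purely hypothetical:

    U = set(range(10))   # a small universe of 10 members
    A = {0, 1, 2, 3, 4}  # members with attribute A (e.g., females)
    B = {3, 4, 5, 6}     # members with attribute B (e.g., engineers)

    p_A = len(A) / len(U)            # formula 3.5: n(A)/n(U) = 0.5
    p_A_or_B = len(A | B) / len(U)   # union: 7/10 = 0.7 (one or the other attribute)
    p_A_and_B = len(A & B) / len(U)  # intersection: 2/10 = 0.2 (both attributes)
    print(p_A, p_A_or_B, p_A_and_B)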
In our case of profiling populations, we are interested to know/predict what
is the probability of observing one or many attributes at an instance of time or what is
the probability of observing attributes at different instances of time. The classical
example of flipping a fair coin will help illustrate the point we want to make. For this
we will use another popular form of representing sets/profiles and probabilities, the
decision tree. In Figure 3.13 the decision tree of 4 fair coin flips is presented along
with the probabilities of each outcome. It is obvious that at each flip of the coin two
outcomes are possible with equal probabilities 0.5 (50% chance). As we move along
with flipping the coin the 0.5 probability of each outcome still holds but the cumulative
probabilities of the combinations of outcomes change as the conformation
space/universe changes since it includes more and more
specimens/individuals/attributes. While at the first flip there are only two
outcomes/species in the universe, H and T for heads and tails splitting the space
equally between them (0.5 probability each), at the second flip our universe includes
the species/outcomes HH, HT, TH, TT (as they appear from left to right in Figure
3.13). As a result, the individual observations/species share the universe space and
subsequently the probabilities (0.25 each). At the third flip of the coin we have a new
universe with species HHH, HHT, HTH, HTT, THH, THT, TTH, TTT and so on as
the flips continue.
Overall, at each flip of the coin a new universe is created where the sum of the
probabilities of its species/outcomes equals the whole universe (total p = 1). Things
get more interesting (math wise) when we are not interested in the order of appearance
of H and T in the individuals in our universe. This results in similar attributes (Figure
3.14) like, for example, in the third flip universe the individuals HHT, HTH, and THH
are all the same as they have two H and one T (we could imagine them as two headed
monsters). If we were interested in the probability of such individuals, then this would
be 0.125+0.125+0.125 = 0.375. Similarly, the probability of two tail individuals will be
0.375, while the single species HHH (3 headed monster) and TTT (3 tails monster)
each have 0.125 probability of appearance.
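The three-flip universe can be enumerated in a few lines of Python, which reproduces the 0.375 and 0.125 probabilities discussed above (a sketch using only the standard library):

    from itertools import product
    from collections import Counter

    flips = list(product("HT", repeat=3))  # the 8 species: HHH, HHT, ..., TTT
    p_each = 0.5 ** 3                      # 0.125 for every species
    heads = Counter(outcome.count("H") for outcome in flips)
    print(heads[3] * p_each)  # 0.125: the three-headed monster (HHH)
    print(heads[2] * p_each)  # 0.375: two heads and one tail (HHT, HTH, THH)
    print(heads[1] * p_each)  # 0.375: one head and two tails
    print(heads[0] * p_each)  # 0.125: TTT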

Figure 3.13 Decision tree for coin flip

Figure 3.14 Three-coin flip universe

In profiling populations, as we saw in section 3.1, we are interested to know


the proportions of each attribute of the population we study with respect to the whole
universe of attributes. In the example of the three-coin flip universe (Figure 3.14) we
see that this population was composed of one attribute/species with 3 heads, three
with 2 heads and 1 tail, three with 1 head and 2 tails, and one with 3 tails. In terms of
probabilities we can say that in that universe the probability of finding a three-headed
species is 0.125, the probability of finding a two-headed/one-tail species is 0.375, the
probability of finding a one-headed/two-tail species is 0.375, and the probability of
finding a three-tail species is 0.125. This sentence expresses a universe in what is called
a probability distribution.

3.3 Distributions
In the previous section, we saw how probabilities can help profile populations
by providing a simplified representation of the population in a ‘0’ (0%) to ‘1’ (100%)
range and how a probability distribution expresses the profile of populations in terms
of the probabilities of various outcomes. In this section, we will discuss further the
advantages of this type of profiling in the context of quantitative research. Let us
assume now another popular example of the application of probabilities, dice. A
typical situation involves throwing dice and betting on the sum of the outcomes of the
two dice. Our universe here includes every possible combination that might come up.
In Figure 3.15 we have a representation of all possible outcomes in the form of a
lattice diagram. The rows represent the outcomes of one of the dice, while the
columns represent the outcomes of the other die. The intersections, then, represent
possible combinations.

Figure 3.15 Dice universe

We can easily see from Figure 3.15 that we have 6x6 = 36 possible
combinations so the cardinality (number of elements) of our universe is 36.
Considering that in this game we are interested in the sum of the dice we can see that
many combinations can produce certain sums. For example, the sum of 7 can be
produced by 1 and 6, 2 and 5, 3 and 4, 4 and 3, 5 and 2, and 6 and 1 (red intersections
in Figure 3.15). These are in total 6 combinations out of the 36 available, so the
probability of seeing a sum of 7 is 6/36 or 0.167. In similar fashion, we can calculate
the probability of all other possible combinations (Table 3.3).

Table 3.3 Dice probability distribution


Dice Sum Frequency Formula Probability
1 0 0/36 0.000
2 1 1/36 0.028
3 2 2/36 0.056
4 3 3/36 0.083
5 4 4/36 0.111
6 5 5/36 0.139
7 6 6/36 0.167
8 5 5/36 0.139
9 4 4/36 0.111
10 3 3/36 0.083
11 2 2/36 0.056
12 1 1/36 0.028
13 0 0/36 0.000
Total 36 Total 1.000

Having the probability distribution from Table 3.3 we can easily produce a
graph of it (bar graph in Figure 3.16) to pictorially depict the population profile of
the dice universe. Because in quantitative research we usually deal with large
populations, specific probability values are not as significant as regions/groups
of probability below, above, or between specific values. For example,
in the case of the dice universe we might be interested in the probability to get
below, above, and between a certain outcome/sum. In such cases, all we have to do
is simply calculate the union (see previous section) of the elements that are found
within our range of interest.
For example, if we are interested in the probability of getting a sum less than
6 then all we must do is add the probabilities of getting a sum of 1, 2, 3, 4, and 5, which
from Table 3.3 is 0 + 0.028 + 0.056 + 0.083 + 0.111, giving us P(less than 6) = 0.28. If
we were interested in the probability of getting above 10 we would again add the values
for 11, 12, and 13 (0.056, 0.028, and 0) and we would get P(above 10) = 0.08. Finally, if
we were interested in the probability of getting between 7 (inclusive) and say 9
(inclusive), we would add the values for 7, 8, and 9 (0.167, 0.139, and 0.111) and get
P(between 7 and 9) = 0.42.
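The same region calculations can be scripted; the short Python sketch below enumerates the 36-element dice universe and reproduces the probabilities of Table 3.3 and the three regions just discussed:

    from itertools import product
    from collections import Counter

    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))  # 36 outcomes
    p = {s: c / 36 for s, c in counts.items()}                          # Table 3.3

    p_below_6 = sum(p.get(s, 0) for s in range(1, 6))    # ~0.28
    p_above_10 = sum(p.get(s, 0) for s in (11, 12, 13))  # ~0.08
    p_7_to_9 = sum(p[s] for s in (7, 8, 9))              # ~0.42
    print(round(p_below_6, 2), round(p_above_10, 2), round(p_7_to_9, 2))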
An interesting observation can be made at this point by considering the
probabilities we get through the aforementioned process and the area under the
probability distribution curve. For example, if we consider P(between 7 and 9) we can
see the sum 0.167 + 0.139 + 0.111 that contributed to the result as 1*0.167 + 1*0.139
+ 1*0.111 which in the graph is nothing other than the area of the 3 rectangular bars
(Area = Base*Height for each one of them) that form the area B in Figure 3.17. The
same case can be evident for P(less than 6) (area A) and P(above 10) (area C).

Figure 3.16 Dice universe probability distribution

Figure 3.17 Dice universe probability distribution areas



We come to a point of importance with respect to profiling populations, which


is that sections of the profile can be described in terms of the area under the
distribution curve of the population universe. Given that in most cases we will be
dealing with large populations with continuous values of their characteristic under
investigation, the profile graphs will appear as continuous curves in distribution charts.
In our case of the dice universe a line graph of the distribution will take the form
depicted in Figure 3.18.
If we were to apply our conclusion that the area under the curve represents
the probability of observing outcomes between certain values, we can see that the
probability of having an outcome between 1 and 5 would be the area of the shaded
triangle A. Applying the formula for the area of the triangle:
Area = Base*Height/2 becomes Area = 5*0.111/2 or Area = 0.28, which from
our previous discussion is the same as the probability for getting a sum of 5 and below.

Figure 3.18 Probability as area under the distribution curve

In the following sections we will see some of the probability distributions that
are of importance to quantitative research. Unlike our dice universe, these distributions
will refer to large populations and in their form as mathematical functions they will be
concerned with real numbers.

3.3.1 Normal Distribution


The most popular probability distribution in use is the normal distribution
(also called bell curve). It is a symmetric distribution around the mean of the
population, and it can be completely described by only the mean (μ) and the standard
deviation (σ). Considering x as the variable that represents an event/outcome/attribute
of a population profile and P(x) as the probability of the event x appearing, the
mathematical expression of the normal distribution density is given by formula 3.6.

(3.6)  P(x) = [1 / (σ√(2π))] e^(−(x − μ)² / (2σ²))
The graphical representation of (3.6) for a variety of values (μ, σ) is given in
Figure 3.19. The symmetry of the shapes around their respective means is apparent, as
is the influence of the standard deviation on the kurtosis (compare with Figure 3.10)
of each curve. The appeal of the normal distribution stems, among others, from the
fact that most distributions will approach the normal curve for large populations and
its simplicity in requiring only two parameters (μ and σ) for its description. The former
fact is key, as we will see later, in developing the central limit theorem, while the latter
makes it “easy” to work with in statistics (most statistical techniques depend on
distributions being normal).

Figure 3.19 Normal distribution graphs

Of special interest in quantitative research is the normal distribution with μ =


0 and σ = 1. It is called standardized normal distribution and the event values are
symbolized with z instead of the traditional x used in most functions (blue line in
Figure 3.20). Again, as we saw in the previous section, our interest is in the population
between segments of the attribute values (z in our case), so we need to be able to
calculate the area under our distribution curve. By integrating (3.6) across the values
of z we get the area under the curve from the leftmost part (which asymptotically is
zero) to each value of z. The graph (Figure 3.20) of this integration that represents the
area under the normal curve is referred to as normal cumulative distribution.

Figure 3.20 Standardized normal distribution

Because of the complexity of formula 3.6 and the difficulty of calculating the
area from the cumulative distribution, tables with precalculated probability values for
the most commonly used values of z have been developed. In Figure 3.21 we see the probability estimates for positive
values of z in the form of a table. Because the normal curve is symmetric, the same
values will apply for negative values of z (they have not been included in Figure 3.21
to save space). Normal distribution tables arrange the probability values at the
intersection of the first 2 digits of z (leftmost column in Figure 3.21) and the third digit
(2nd decimal place in top row). For example, if we are interested in the probability (area
under the curve) below z = 2.33, all we need to do is consider z = 2.3 + 0.03. We need
to locate the first two digits (2.3) in the first column in Figure 3.21 and the third digit
0.03 in the top row. The intersection of their corresponding line and column (red
arrows in Figure 3.21) will point to the solution of formula 3.6 and the probability
value we were looking for. In this case, we see that for z = 2.33 we get 0.9901, which
translates as having 99% of our population below z = 2.33. The same process
can be followed for negative values of z (an example will follow later on).
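Instead of the printed z-table, the same lookup can be done in software; a short sketch with SciPy (assuming it is installed) confirms the value read from Figure 3.21:

    from scipy.stats import norm

    print(norm.cdf(2.33))   # ~0.9901: about 99% of the population lies below z = 2.33
    print(norm.cdf(-2.33))  # ~0.0099: by symmetry, the area below a negative z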

Figure 3.21 Table of positive z values

The process can be reversed, for example, when we are interested in the z value
of a certain probability. Let us assume that we are interested to know the z value that
corresponds to 95% of the population (Figure 3.22). We need to find the area under
the normal distributions curve that represents the probability value of 0.95. In
searching within the table values we can see (origin of blue arrows in Figure 3.21) that
it is between the existing values of 0.9495 and 0.9505. By following the blue arrows in
Figure 3.21 we see that the first two digits of the z value (first column) are 1.6 and the
third (top row) is somewhere between 0.04 and 0.05. Assuming an approximate value
of 0.045 for the in-between point we can calculate the corresponding z as z = 1.6 +
0.045 or z = 1.645.

Figure 3.22 One-tail power and confidence


The example we saw previously where we calculated the z value that engulfs
95% of the population is of great significance in quantitative research and especially in
hypothesis testing as it is the most popular value used in reporting poll results and
making predictions. In hypothesis testing (Chapter 5) the probability value that
represents the area under the curve is referred to as the confidence level (suggesting
confidence in the percentage of the population expressing a certain attribute). Its
complementary value (remember, total probability needs to add up to 1 or 100%) is
symbolized with the Greek letter α and is referred to as the significance level. In the
case of the 0.95 probability (95% of the population), the significance level will be α = 0.05
(5% of the population). The corresponding value of z (symbolized with ẑ), in our
case 1.645, is called a one-tail or one-sided critical value.
Of interest in hypothesis testing (Chapter 5) is also the case when the confidence
level covers 0.95 probability (95% of the population) around the mean. In this case α will
again be 0.05 (5% of the population) but will now have to be split into two symmetrical and
equal areas at the tails of the curve, with each one of them corresponding to 0.025
(2.5% of the population). This case, as seen from Figure 3.23, will result in a two-tail
or two-sided critical values. Calculating the leftmost value requires z-tables with
negative values, which we don’t have available here so we will focus on calculating the
rightmost value and deduce the negative from it. The rightmost critical value will
correspond to 95% of the population plus the leftmost 2.5% (remember all areas in
the tables are calculated from the leftmost point of the curve), which results in a total
of 97.5% of the population or p = 0.975. For this value the process we follow in Figure
3.21 (pink cell) will give z = 1.96 as the rightmost critical value. Having this value, it is
easy to calculate the leftmost value based on the symmetry of the curve as it will be
the negative of the previous results (z = -1.96 in our case).

Figure 3.23 Two-tail power and confidence

Another set of significant critical values (although not as popular as the


previous) can be derived for the 99% and the 99.9% of the population. The
significance of these values will be discussed in Chapter 5, but for now the most
popular critical values are listed in Table 3.4. The interested reader can confirm these
values using the same process that we followed for the confidence of 0.95.

Table 3.4 Critical values for standardized normal distribution


Confidence   Significance (α)   z* 1-tail/1-sided   z* 2-tail/2-sided (leftmost)   z* 2-tail/2-sided (rightmost)
0.95 0.05 1.645 -1.960 1.960
0.99 0.01 2.330 -2.576 2.576
0.999 0.001 3.090 -3.291 3.291
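The critical values of Table 3.4 can be reproduced with the inverse cumulative (percent point) function of the standardized normal distribution; a SciPy sketch follows (the table's 2.330 entry is a common rounding of 2.326):

    from scipy.stats import norm

    for confidence in (0.95, 0.99, 0.999):
        alpha = 1 - confidence
        one_tail = norm.ppf(confidence)     # 1.645, 2.326, 3.090
        two_tail = norm.ppf(1 - alpha / 2)  # 1.960, 2.576, 3.291 (negate for the leftmost value)
        print(confidence, round(one_tail, 3), round(two_tail, 3))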

Along with the critical values, sometimes we are interested in the inverse
situation like the probabilities (percentages of the population) that can be found
between 1σ, 2σ, and 3σ deviations from the mean, which are depicted in Figure 3.24.
We can see that 68.3% of the population is within 1σ from the mean, 95.5% is within
2σ, and 99.7% is within 3σ. The significance of the standard deviation in representing
population segments can be seen and for practical purposes we can assume that in the
standardized normal curve almost all the population is included within three standard
deviations from the mean. The interested reader can verify the depicted values in
Figure 3.24 as well as the critical values of Table 3.4 by following the process outlined
in the previous paragraph.

Figure 3.24 Population under normal curve

Because of the significance of representing populations as the area under the


probability distribution curve we will visually showcase here the process of calculating
the population that falls within 1σ from the mean. Our goal is to calculate the area in
Figure 3.24 between -1σ and +1σ and prove that it corresponds to 68.3% of the
population. Because we only have areas/probabilities from the leftmost end of the
curve in the table of Figure 3.21 we will try and express the area of interest in this case
in relationship to areas from the leftmost part of the graph. Figure 3.25 depicts the
sequence of steps we need to follow. We can easily calculate the area under the curve
for +1σ (Figure 3.25.a) from the table of Figure 3.21 as A1 = 0.8413 (intersection of
1.0 in the first column and 0.00 in the top row). Similarly, we can calculate the area
under the curve for 0σ (Figure 3.25.b) as A2 = 0.5 (half the total population – first
entry in the table). By subtracting A2 from A1 (Figure 3.25.c) we get A3 = A1 - A2 or
A3 = 0.3413, which is the area between 0σ and +1σ. Due to the symmetry of the normal
curve we only must double this number (Figure 3.25.d) to get the area we are
interested in. In this case it will be A4 = 2 * A3 or A4 = 2 * 0.3413, which can be rounded
to 0.683 or in percentage form 68.3%. In a similar fashion the remaining population
percentages of Figure 3.24 can be derived.
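The same subtraction-of-areas logic, expressed in SciPy, verifies the percentages shown in Figure 3.24 (a sketch, not an SPSS procedure):

    from scipy.stats import norm

    for k in (1, 2, 3):
        within = norm.cdf(k) - norm.cdf(-k)
        print(k, round(within, 3))   # 0.683, 0.954, 0.997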

Figure 3.25 Population within 1 SD from the mean

The standardized normal distribution is a model distribution that is too ideal


to be observed in nature. What we do observe, though, is normal distributions around
different means based on the phenomenon we study. If, for example, we are studying
the income distribution of a suburb of New York we might find out that it has a mean
of say μ = $60,000 and a standard deviation of say σ = $10,000. Working out the
percentages of the population under certain values might get complicated unless we
transform the income distribution into a standardized normal distribution using the
transformation:

(3.7)  z = (x − μ) / σ
This formula will map every value x of the income distribution curve to its
corresponding value in the standardized normal distribution curve (Figure 3.26). If,
for example, we are interested in the percentage of the population in the suburb that
is below $35,000 all we need is to apply the transformation of formula 3.7 for x =
35,000, μ = 60,000 and σ = 10,000 and then look up the percentage in the table of
Figure 3.21. From formula 3.7 we get z = (35,000 – 60,000)/10,000 or z = - 2.50.
Looking up Figure 3.21 for positive z = 2.50 (2.5 in the first column and 0.00 in first
row) we get the probability value p = 0.9938. This means the population for z below
2.50 is 99.38%, so due to the symmetry of the normal curve the population for z below
-2.50 will be the complement of the previous or 1 – 0.9938 = 0.0062 or 0.62%
(apparently, it is a relatively affluent suburb).

Figure 3.26 Mapping a normal curve to the standardized normal curve

Following the inverse process, we can estimate, for example, what is the cut-
off point of the wealthiest 10% of the population. In this case, we consider the 90%
below the curve or p = 0.9 and by searching the table of Figure 3.21 we see this is
between 0.8997 and 0.9015 in the row of z = 1.2. The corresponding value in the top
row is between 0.08 and 0.09 so we will assume it is 0.085. Adding this to z = 1.2 we
get z = 1.285, which after substitution to formula 3.7 will produce x = 72,850. We can
conclude from this process that the wealthiest 10% of the population will be making
$72,850 and above. The transformation process that we followed can be done for any
normal distribution regardless of the mean or standard deviation.
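Both directions of the income example can be checked in code; the sketch below uses SciPy and the illustrative suburb parameters from the text, giving a slightly more precise cut-off than the table interpolation above:

    from scipy.stats import norm

    mu, sigma = 60_000, 10_000
    z = (35_000 - mu) / sigma             # -2.5 (formula 3.7)
    print(norm.cdf(z))                    # ~0.0062, i.e., 0.62% earn below $35,000

    cutoff = mu + sigma * norm.ppf(0.90)  # ~72,816: the wealthiest 10% earn above this
    print(round(cutoff))                  # the text's 72,850 comes from rougher table interpolation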

3.3.2 Chi-Square Distribution


The distribution we considered before concerns one variable. Very often in
statistics we need to deal with multiple independent variables. As we will see in the
next two chapters, the sum of squares of such independent variables (dimensions in
other words) appear in our calculations (variance is one such example), making the
study of such functions important. When these sums are composed from mutually
independent standardized variables, they form the chi-square (or chi-squared)
distribution. An important characteristic of this distribution is the degrees of freedom
(symbolized here as k) that represent the number of parameters in a population that
may be independently varied. This in most cases is equal to the population size N
minus one4. The formula of the chi-square distribution for k degrees of freedom is
given by:

χ²(k) = z1² + z2² + … + zk²

(z are the standardized independent variables), which results in:

μ = k    and    σ² = 2k    (the mean and the variance of the distribution)
Figure 3.27 shows a variety of chi-square distributions for increasing values of
k. What we can observe from the curve shapes is that the higher the degrees of
freedom the closer to the normal curve the distribution looks. This property is vital as
it allows (to a great extent) one to apply the transformation of formula 3.7 and reap
the benefits of the normal distribution calculations. Similar to the normal distribution,
tables exist (Figure 3.28) of the most popular combinations of degrees of freedom and
probabilities.

4 If there are N objects in a system, we can consider one as the reference frame/source
(say with coordinates 0,0,0 in three dimensions) so we only need the relative positions of the
remaining N-1 with respect to the source to completely describe the system.

Figure 3.27 Chi-distribution graphs

Figure 3.28 Table of chi-square values

The practical value of the chi-square distribution comes from the development
and application of the chi-square test statistic that we will see in the next chapter. We
briefly need to mention here that the statistic is built as the sum of the squares of the
differences between observed and expected frequency values. The number of


observed categories minus 1 in that case equals the degrees of freedom of the chi-square
distribution.
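For reference, the chi-square quantities discussed in this section are readily available in SciPy; the sketch below pulls a critical value (comparable to the entries of Figure 3.28) and confirms the mean and variance for k = 4 degrees of freedom:

    from scipy.stats import chi2

    k = 4
    print(chi2.ppf(0.95, df=k))             # ~9.488: 95% of the chi-square(4) area lies below this
    print(chi2.mean(df=k), chi2.var(df=k))  # 4.0 and 8.0: mean = k, variance = 2k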

3.3.3 Binomial Distribution


While the normal distribution deals with continuous variables (all possible
values of x), there are situations where the profile of a population includes only two
attributes, usually labeled as success and failure (also called Bernoulli trials). This is the
case where repeated trials/choices of an outcome are made where in each one the
probability of success or failure remains the same. Examples include machine
assemblies and quality control where we are interested in minimizing the number of
failed products in a certain amount of them, like having 30 defective products come
out of the assembly line for every 1,000 produced. In this case, we are targeting a
probability of failure of 30/1000 or 0.03 (3%) with a complementary probability of
success of 0.97 (97%) for 1,000 trials. This case is like the coin flipping of Figure 3.13
with the only difference being that it might not be a fair coin. For n trials, where the
probability of the outcome we are counting ("success") is p and the complementary
probability ("failure") is q = 1 − p, the probability of observing exactly x successes
among them is expressed by the binomial formula 3.8.

(3.8)  P(x) = [n! / (x! (n − x)!)] p^x q^(n−x)
In the case of our coin in Figure 3.13 it could be that we are interested in the
probability of getting three heads only in the 4 coin flips/trials. The only combinations
that satisfy that requirement are HHHT, HHTH, HTHH, and THHH. Since the
probability of each one of them is 0.0625 (Figure 3.13), the overall probability for
observing 3 heads (let’s assume it represents success) and 1 tail (let’s assume it
represents failure) will be 4*0.0625 or p(3) = 0.25. It is left to the reader to confirm
that formula 3.8 will produce the same result for p = 0.5, q = 1 – p = 0.5, n = 4, and
x = 3.
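The reader's confirmation can also be done in one line with SciPy's binomial mass function, and the normal-approximation parameters of formulas 3.9 and 3.10 for the quality-control example follow the same pattern (a sketch):

    from scipy.stats import binom

    print(binom.pmf(3, n=4, p=0.5))     # 0.25: three heads in four fair flips (formula 3.8)

    # defect example: n = 1000 trials, probability of a defect p = 0.03
    n, p = 1000, 0.03
    mu = n * p                          # 30 expected defects (formula 3.9)
    sigma = (n * p * (1 - p)) ** 0.5    # ~5.4 (formula 3.10)
    print(mu, round(sigma, 1))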

Figure 3.29 Binomial probability distributions

Figure 3.29 depicts binomial probability distributions for various probabilities


of success (0.2, 0.3, 0.5). It might be evident from the shape of the curves that they
closely resemble the normal. This is true to an extent, especially when the number of
trials becomes large (to accommodate at least 10 successes or failures as a rule of
thumb). This can be an advantage if we were to avoid the tedious calculations that
formula 3.8 imposes. The theoretical mean and standard deviation of the binomial
curve are given by, respectively:

(3.9)  μ = n p        and        (3.10)  σ = √(n p q)


For large populations, we can always consider them as an approximation to
the mean and standard deviation of a normal curve and calculate the probabilities from
the table of Figure 3.21. If the populations are not large the binomial distribution tables
like the one in Figure 3.30 can be used for the most popular values of the parameters
involved. The selection here is based on the number n of trials like, say, coin flips (first
column in Figure 3.30), the number of successes (second column in Figure 3.30), and
the probability of success for each trial (top row in Figure 3.30).

Figure 3.30 Table of binomial distribution values

3.3.4 t Distribution
In certain cases, we might know that our population attribute follows a normal
distribution, but we can only access a small section/sample of the population. For such
a case the t distribution (often called student t distribution – Student was the alias
the developer of the distribution used for anonymity purposes) has been developed.
Given that the size of the section of the population we study is small, it is to be
expected that the shape of the corresponding curve will change and will approach the
normal distribution curve as the sample size increases. The formula that relates the t-
statistic with the other parameters of the population is given by

t = (x̄ − μ) / (s / √n)    or equivalently    x̄ = μ + t (s / √n)

where x̄ and s are the sample mean and standard deviation and n is the sample size.

Figure 3.31 t probability distributions

Figure 3.32 t-statistic probability distribution

The probability density function is omitted due to its complexity and little
practical significance for the material in this book. The graph of the distribution,
though, is depicted in Figure 3.31 for comparison with the normal curve. The
distribution is dependent on the degrees of freedom (sample size – 1) and two values
are shown in Figure 3.31. It is apparent from the graph that the curves are approaching
the normal fast even with small increases in the sample size. As with the previous
distribution, popular values are collected in the form of tables like the one in Figure
3.32 where the absolute values of the t-statistic are displayed.
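A short SciPy sketch illustrates how quickly the t critical values approach the normal ones as the degrees of freedom grow:

    from scipy.stats import norm, t

    for df in (5, 30, 1000):
        print(df, round(t.ppf(0.975, df), 3))  # 2.571, 2.042, 1.962
    print(round(norm.ppf(0.975), 3))           # 1.960, the normal two-tail critical value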

3.3.5 F Distribution
A final distribution that we will briefly mention here is the F distribution. This
is used when we conduct analysis of variance (see next chapter) for multiple variables
and it provides a comparative measurement of the variances of two populations with
varying degrees of freedom (population sizes). One of the populations includes the
means of each variable, while the other includes the values of all variables as one group.
The degrees of freedom (population size) of the population of the means can be
referred to as k1 and its variance is called the between mean square or treatment mean
square (MST), while the corresponding degrees of freedom of the population of all
values can be referred to as k2 and their variance as within mean square or error mean
square (MSE). The sampling distribution model of the ratio of the two means
(MST/MSE) form what we call the F distribution (F is for the inventor Sir Ronald
Fisher). For two variables with degrees of freedom k and m and with their squares
distributed as chi-squared, the F-statistic is given by the formula:

F = (χ²(k) / k) / (χ²(m) / m)
The graph of the distribution for various combinations of the two population
sizes/degrees of freedom is shown in Figure 3.33. We can see that the higher the
degrees of freedom (population size) the closer the distribution gets to normal. This
closeness to normal is what will allow us to use the distribution (in the next chapter)
for statistical purposes.
While the formula of the corresponding probability distribution is beyond the
scope of this book, its value for different degrees of freedom and for a certain area
under the curve (probability) can be found in F-statistic tables online. Figure 3.34
displays the values of the statistic for p = 0.95 and various degrees of freedom for k1
and k2. The value of the statistic can be found as usual at the intersection of the
corresponding row and column.
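Values such as those in Figure 3.34 can be obtained programmatically; the SciPy sketch below asks for the point with 95% of the F(k1, k2) area below it, for an illustrative pair of degrees of freedom:

    from scipy.stats import f

    k1, k2 = 3, 10                        # illustrative degrees of freedom
    print(round(f.ppf(0.95, k1, k2), 3))  # ~3.708: the 0.95 critical value of F(3, 10)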

Figure 3.33 F probability distributions

Figure 3.34 F statistic table

3.3.6 Distribution-Free and Non-Parametric


The distributions that we have seen so far approach the normal distribution
for large populations. In life, though, we oftentimes deal with distributions that are far
from normal. Such distributions are called distribution-free or non-parametric
distributions. In Figure 3.35 we see what is called a bimodal distribution. It exhibits
two peaks/modes and one trough in between. Clearly, it is not normal (far from it)
it) so anything we discussed in the previous sections cannot be applied here. Luckily,
in math we can transform distributions to new ones in any way that is mathematically
acceptable (meaning applying the typical mathematical operators and functions).
Formula 3.7 was one such way and it allowed us to convert any normal distribution to
the standardized normal distribution.

Figure 3.35 Bimodal distribution

In practice, we need ways of evaluating how close to normal a distribution is.


One indication is the existence of outliers. If when applying the transformation of
formula 3.7 (converting to the standardized normal distribution) we observe values
below z = -3 or above z = 3, we can deduce that there are outliers and as a result our
curve is not normal. A better alternative would be to plot our distribution (vertical axis
in Figure 3.36) against the normal scores (horizontal axis in Figure 3.36) and see if
there is a similarity/correlation (see more in Chapter 4) between the two distributions.
The closer to the diagonal our points get, the closer to normal our distribution is. In
Figure 3.36 the distribution to the left (a) closely resembles the normal as the points
in the graph follow a linear pattern while the one on the right (b) does not resemble
normal as the points do not follow a linear pattern.

Figure 3.36 Normal plots

If we consider plotting the non-cumulative normal values (Figure 3.20, blue


line) against our distribution, the corresponding plot is called Q-Q plot5 (Quantile –
Quantile). If we were to consider the firehouse example we discussed previously with
the municipalities at distances 26, 28, 30, 32, 34, 36, 40, 70 from the city and we were
to develop the Q-Q plot for these values using SPSS, we will get the plot of Figure
3.37.a. It is obvious that our distribution is far from normal. If on the other hand, we
had instead as distances for the municipalities the values 24, 26, 28, 29, 30, 31, 32, 36,
then we would get the Q-Q plot of Figure 3.37.b indicating a better similarity with the
normal distribution. The reference line in the plots has intercept and slope equal to
the location and scale parameters of the theoretical distribution.

Figure 3.37 Q-Q plots

5 SPSS: Analyze => Descriptive => Q-Q Plot



If instead of the normal values we plot our distributions against the cumulative
normal values (Figure 3.20, red line), the corresponding plot is called a P-P plot
(Probability – Probability). This type of plot will work, in addition to normal values,
for exponential, lognormal (logarithms of the normal), etc. The reference line in such
plots is always the diagonal line y=x. The difference between the Q-Q plot and the P-
P plot is that the former magnifies the deviation from the proposed theoretical
distributions on the tails of the distribution and it is unaffected by changes in location
or scale, while the latter magnifies the middle and its linearity might be affected by
changes in location or scale. Using the same data as the ones used to produce the Q-
Q plots of Figure 3.37, SPSS produces the P-P plots of Figure 3.38.
In addition to the P-P and Q-Q plots that provide a visual interpretation of
how closely a curve resembles the normal distribution, there are two popular metrics:
the Kolmogorov-Smirnov6 test (KS test) and the Shapiro-Wilk. Both tests quantify
the difference of our distribution with the normal by using hypothesis testing (Chapter
5). The KS test is typically suggested for populations greater than 2,000 individuals,
while the Shapiro-Wilk test is suggested for populations less than 2,000.
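For readers who want to run these normality checks outside SPSS, the following minimal Python sketch (not from the book) applies the Shapiro-Wilk and Kolmogorov-Smirnov tests; the variable name distances and the reuse of the second firehouse data set are assumptions made only for the example.

    import numpy as np
    from scipy import stats

    # illustrative sample (the distances used for the Q-Q plot of Figure 3.37.b)
    distances = np.array([24, 26, 28, 29, 30, 31, 32, 36], dtype=float)

    # Shapiro-Wilk test (suggested for smaller data sets)
    sw_stat, sw_p = stats.shapiro(distances)

    # Kolmogorov-Smirnov test against a normal with the sample's mean and SD
    ks_stat, ks_p = stats.kstest(distances, 'norm',
                                 args=(distances.mean(), distances.std(ddof=1)))

    print(f"Shapiro-Wilk p = {sw_p:.3f}, Kolmogorov-Smirnov p = {ks_p:.3f}")
    # p-values above the chosen significance level (e.g., 0.05) are consistent with normality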

Figure 3.38 P-P plots

While the methods we discussed up to now can help us judge if a distribution


is near normal, they don’t address the issue of what to do when our distribution is not
normal. To approximate/convert a non-normal distribution to normal, some popular

6 SPSS: Analyze => Descriptive Statistics => Explore … => Plots => select Normality plots with tests

transformations include the log107 and sqrt (square root), among others. This means
that for every value of our distribution we calculate its base 10 logarithm, or its square
root and we use the newly found values in place of the regular x values. Figure 3.39.a
shows the graphs of the bimodal distribution (blue line) with its log10 (green curve)
and sqrt (red curve) transforms.

Figure 3.39 Transformation of a bimodal probability distribution

7 SPSS: Transform => Compute Variable => LG10



It is evident from the different curves that in this case the log10 smoothed out
the initial curve a lot more than sqrt, so although it still does not resemble the normal
curve it came a lot closer than any of the other curves. A different situation (blue
curve) is shown in Figure 3.39.b where the sqrt transform produces a closer to normal
curve than the log10 that produces even unacceptable/negative values for the
frequency at certain values. The reader can find other popular transformations in the
extant literature. In practice, it is best to apply a variety of transforms to see which one
gets our distribution closer to normal.
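As a rough illustration of this trial-and-error process, the Python sketch below (not from the book) applies the log10 and sqrt transforms to an illustrative skewed sample (the firehouse distances with the 70 outlier) and re-runs a normality test on each version.

    import numpy as np
    from scipy import stats

    x = np.array([26, 28, 30, 32, 34, 36, 40, 70], dtype=float)  # illustrative skewed data

    for name, values in [("raw", x), ("log10", np.log10(x)), ("sqrt", np.sqrt(x))]:
        stat, p = stats.shapiro(values)
        print(f"{name:>5}: Shapiro-Wilk p = {p:.3f}")
    # the transform with the largest p-value brings this data set closest to normal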

4 Samples

In the previous chapter, we saw how we can profile populations by identifying


selected parameters that abstractly describe their attributes. In life, though, we are
rarely fortunate enough to have census data that provide that information for every
member of the population, so we need to rely on samples of the population we study.
A reader who is familiar with the theoretical basis of the methods we use in the analysis
of samples can skip this chapter and consider the cheat sheets provided in Appendix
B for a quick reference to what test should be applied for each situation they are facing.
Having chosen our sampling method (see section 2.4.2) we can only hope that
our sample size is large enough to be able to realistically represent every attribute of
the characteristic of the population we study. To ensure such representativeness we will
assume, for analysis purposes throughout this chapter, that the sampling process is random.
Since, because of the variability in populations in terms of their various attributes, a
perfectly representative sample is something we cannot hope to achieve with great certainty,
different samples should result in different representations of the population (Figure 4.1).

Figure 4.1 Population samples

Our goal now becomes to establish relationships between population


parameters (symbolized with Greek letters) and sample metrics (called statistics
from now on and symbolized with English letters). The mean and standard deviation

of the population (μ and σ, respectively) and sample (x̅ and s, respectively) are naturally
our primary targets in establishing such relationships.
Considering the various sample distributions (Figure 4.2) we would expect
them to fall somewhere within the population distribution. If at this point we were to
consider the means of each possible sample it might be intuitively possible to see that
there will be more of them closer to the population mean since more members of the
population exist around that mean and are most likely to be selected in most samples.
In other words, one can intuitively see that the mean of all possible samples will be
our population mean. This intuitive conclusion has been formally proven and referred
to as the central limit theorem (CLT).

Figure 4.2 Sampling distribution

To be more precise, what the theorem states is that if we take random samples
of size N from the population, the distribution of the sample means (referred to as
sampling distribution) will approach the normal distribution the larger N becomes.
For instance, if we were to consider all possible samples of size N = 4 and calculate
their means, we will observe their distribution resembles the normal distribution. The
more we increase the sample size the closer we get to the normal curve. If we were to
continue experimenting with increased sample sizes, we would observe that around
the sample size of N = 30 we have an almost perfect match with the normal. As a rule
of thumb, this sample size is oftentimes considered the minimum for allowing
significant observations to be made.
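The CLT can be illustrated by simulation; the Python sketch below (not from the book) draws repeated samples of size N = 4 and N = 30 from a deliberately skewed (exponential) population and summarizes the resulting sampling distributions. The population and the number of repetitions are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)
    population = rng.exponential(scale=10, size=100_000)  # a clearly non-normal population

    for N in (4, 30):
        sample_means = [rng.choice(population, size=N).mean() for _ in range(5_000)]
        print(f"N = {N:>2}: mean of sample means = {np.mean(sample_means):.2f}, "
              f"SD of sample means = {np.std(sample_means):.2f}")
    # the mean of the sample means stays near the population mean (about 10),
    # while their spread shrinks roughly like the population SD divided by sqrt(N)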
The advantage of proving that the sampling distribution approaches the
normal distribution is that we can accurately represent it with its mean and standard
deviation. We have already mentioned that the mean of the sampling distribution is in

fact our population mean. Another fact that has been proven theoretically is that the
standard deviation (also referred to as standard error later) of the sampling
distribution relates with the population standard deviation through the relationship:

σx̅ = σ / √N     (4.1)
Because in research the sampling distribution will be unknown, we
approximate its standard deviation with our sample’s distribution standard deviation
s. Formula 4.1 will then become for practical purposes:

sx̅ = s / √N     (4.2)
Figure 4.3 summarizes the results of the CLT and its implications for
connecting sample statistics with population parameters. Given the conclusion of the
CLT and some of the assumptions involved like equating the sampling distribution
mean and standard deviation with that of our sample, it would be appropriate to have
a metric about our confidence in the conclusions made. In considering such a metric,
some of the observations we made in the previous chapter about population profiles
will be valuable. One such observation (Figure 3.24) concerns the percentages of the
population under the normal curve for integer values of z. We know, for example, that
with respect to the attribute we investigate, around 68% of the population (an
approximation of 68.3%) is within one standard deviation from the mean, around 95%
of the population (an approximation of 95.5%) is within two standard deviations, and
that the great majority (99.7%) will be within three standard deviations.
Considering the 95% population spread within two standard deviations from
the mean and given that the sampling distribution is the distribution of the means of
samples of size N, we can be “sure”/confident that the chances of our sample mean
x̅ (having the same chances as any other sample mean) being within that spread will be
95% (Figure 4.4.a). Reversing the thinking, we can say that we can be 95% confident
that the sampling mean (and as proven by CLT, the population mean) will be within
two sample distribution standard deviations from our sample mean (Figure 4.4.b). By
consulting Table 3.5 (critical values) we see that the value of z for 95% of the
population around the mean is -1.96 and 1.96 (a little different from z = 2 that we used
as an approximation).

Figure 4.3 Sample, sampling, and population distributions

Figure 4.4 Confidence intervals

By using formula 3.7 for z = -1.96 and z = 1.96, we can calculate the values of
our attribute x that will have 95% chance of including the sampling mean.
xL = x̅ − (1.96 ∗ sx̅ ) and xU = x̅ + (1.96 ∗ sx̅ )

where xL and xU denote the lower and upper boundaries of the attribute x. sx̅
is also known as standard error (symbolized SE from now on for compliance with
other publications). The range (xL, xU) from xL to xU is called the 95% confidence
interval (CI), while its half value (the range to the left and right of the mean) is called
margin of error (ME). The corresponding formulas are:
CI = xU - xL or CI = 2*1.96*SE and ME = 1.96*SE
Like the popular 95% confidence intervals (0.05 significance level), we can calculate the
99% (0.01 significance level) and 99.9% (0.001 significance level) confidence intervals.
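A minimal Python sketch (not from the book) of these calculations is shown below; the sample values are illustrative and the 1.96 multiplier corresponds to the 95% confidence interval.

    import numpy as np

    sample = np.array([80, 90, 100, 85, 95, 88, 92, 79, 99, 84], dtype=float)  # illustrative
    x_bar = sample.mean()
    SE = sample.std(ddof=1) / np.sqrt(len(sample))   # standard error (formula 4.2)

    ME = 1.96 * SE                                   # margin of error
    xL, xU = x_bar - ME, x_bar + ME
    print(f"mean = {x_bar:.1f}, 95% CI = ({xL:.1f}, {xU:.1f}), ME = {ME:.1f}")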
An alternative to the consideration of CLT as an estimator to confidence
intervals is bootstrapping8. This statistical analysis technique belongs to the general
category of resampling techniques (also known as random sampling with replacement)
and in addition to samples we “collect”/engage during our research, it can be applied
to samples that have been collected in the past. The technique is distribution
independent, so it is ideal for situations where we are unsure about the shape of the
sampling distribution. It is simple enough but relies heavily on computers to perform
calculations. The sample is considered here in the role of the population from which
we randomly extract individuals to form sub-samples (we will call them bootstrap
samples from now on)9. The same individual can exist in this way in multiple bootstrap
samples and thousands or even tens of thousands can be created in this way. The
statistic of interest (for example, the mean) can be measured in these huge numbers of
samples and their deviations from the population (our original sample) can be
estimated producing in this way a confidence interval and standard error. The
dependency of the method on our initial sample is considered by many as a weakness
of the method but there is theoretical proof that the method works in producing
reliable estimates.
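The resampling idea is simple to sketch in code; the following Python fragment (not from the book) builds 10,000 bootstrap samples from an illustrative data set and derives a percentile-based 95% confidence interval and standard error for the mean.

    import numpy as np

    rng = np.random.default_rng(0)
    sample = np.array([26, 28, 30, 32, 34, 36, 40, 70], dtype=float)  # illustrative sample

    boot_means = [rng.choice(sample, size=len(sample), replace=True).mean()
                  for _ in range(10_000)]
    ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
    print(f"bootstrap SE = {np.std(boot_means):.2f}, "
          f"95% CI = ({ci_low:.1f}, {ci_high:.1f})")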
While the analysis we performed up to now addressed the situation when the
variables we study are continuous (scale), we need to see how the situation will change
when dealing with categorical/nominal data. The challenge with this type of data is
that they are usually not expressed numerically but in text form. Examples of
these types of variables are color, gender, race, location (like countries), etc. In social
sciences, typical categorical variables that are investigated include feelings, perceptions,
beliefs, etc. The most familiar use of categorical values (to most readers at least) are

8 SPSS: Analyze => Choose test …=> Activate the bootstrap option when it appears
9 The name bootstrap is meant to indicate the absurdity of using a sample from a
sample which is like lifting ourselves up by pulling our boot straps up.

when polls are conducted. During elections people are asked which party they will vote
for and the results of the polls are then presented as proportions of their preferences.
While we cannot define categorical/nominal variables numerically, we can
nevertheless count the various instances of their attributes and express their presence
in populations and samples in terms of their frequency and probability of occurrence.
If we record, for example, the favorite car color of a sample of 500 individuals10 we
might find the frequencies listed in Table 4.1 along with their corresponding
proportions.
Notice the subtle switch of the wording from “probabilities” to
“proportions”? Although they are the same this is done to distinguish their references.
Probabilities is reserved for the population and proportions is reserved for the sample.
It would be interesting to see if something similar to CLT can be applied to nominal
variables to allow inferences between sample and population. The solution is simple
enough if one focuses on one category at a time. Consider for example only the color
White. We could take multiple samples from our population and record its proportion
in each one of them. If we were to plot all these values, we would get the sampling
distribution of the proportions for the color White (similarly for other colors).
Table 4.1 Nominal data values
Car Color Frequency Proportions
White 115 0.23
Silver 90 0.18
Black 105 0.21
Gray 70 0.14
Blue 30 0.06
Red 40 0.08
Brown 30 0.06
Green 5 0.01
Other 15 0.03
Total 500 1

Applying the same rationale as in the case of the means of continuous


variables, the sampling distribution for the proportions for the color White will have
the shape of the normal distribution centered around the population proportion p. To
avoid confusion with the already used Greek letters, our sample mean (the proportion

10 Derived from https://en.wikipedia.org/wiki/Car_colour_popularity



of the White color) will be denoted as μ(p̂). p̂ (pronounced p-hat) will refer to our
sample mean in the role of the predicted population proportion. The mean of the
sampling distribution of the proportions (which we presume equals our sample
mean/proportion of 0.23) is given by the formula:

μ(p̂) = p
Luckily for us, knowledge of the mean allows theory to provide us with the
standard deviation, which here is expressed by the formula:

σ(p̂) = √(p·q / N)     (4.3)
where N is the sample size and q = 1 – p (the probability of not having a White
color). Figure 4.5 displays the normal distributions centered around p and the 68-95-
99.7 rule of the percentages of the population under the curve.

Figure 4.5 Normal model centered at p

In the case of the White color we know from Table 4.1 that its proportion is
p = 0.23. Assuming this as the proportion in the population we can apply formula 4.3
and get the standard deviation σ(p̂) = 0.019. This means that if we were to draw
different samples from our population we would expect 68% of them to give us a
proportion for the White color within 0.23 ± 0.019 values (0.21 and 0.25, respectively),
95% of the sample to give us a proportion within 0.23 ± 2*0.019 values (0.19 and 0.27,

respectively), and 99.7% of the sample will give us a proportion within 0.23 ± 3*0.019
values (0.17 and 0.29, respectively). The amounts of the multiples of standard
deviations are called sampling errors (usually when polls are conducted). In our
example, we can say that the sampling error in the case of 95% of the sample (typical
choice in most polls) is 2*0.019 or 0.038 above and below the sample proportion of
0.23. This does not mean that we make an error in our estimate of the 0.23 proportion
but rather that the variability within the various samples (95% of them) will be between
0.19 and 0.27. The “error” labeling is misleading and a better alternative for ‘sampling
error’ would have been ‘sampling variability’.
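The White-color calculation can be reproduced with a few lines of Python (a sketch, not from the book), using formula 4.3 with p = 0.23 and N = 500.

    import math

    p, N = 0.23, 500
    q = 1 - p
    sd_p_hat = math.sqrt(p * q / N)                   # formula 4.3, about 0.019
    low, high = p - 2 * sd_p_hat, p + 2 * sd_p_hat    # the 95% (two SD) range
    print(f"SD(p-hat) = {sd_p_hat:.3f}, 95% of samples between {low:.2f} and {high:.2f}")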

4.1 Statistics
With the establishment of the CLT that allows us to connect our samples to
the population they came from we can begin the study of samples. Our main interest
in statistics is whether two or more samples come from the same population. If
the p-value (see previous chapter) of our statistic is greater than our chosen significance
level, we will be confident enough that the two samples come from the same population (the
null hypothesis, as we will see in the next chapter); otherwise we will deduce that they come
from different populations (alternative hypothesis).
We will focus first on studying the characteristics/statistics of samples
(referred to as descriptive statistics) and in the next chapter we will discuss how they
relate to the population (referred to as inferential statistics). While there are many ways
we can approach the subject of statistics, we will view it here from the specifics of the
number of samples/groups our study includes, the number of variables/dimensions
involved, and the type of data we have (Figure 4.6).

Figure 4.6 The statistics process



Based on the three distinct elements of number of samples/groups, number


of variables/dimensions, and type of data we have (scale or nominal), our statistical
methods space can be subdivided as Figure 4.7 depicts. Figure 4.8 captures the space
of possible methods as a three-story apartment building metaphor for convenience
and ease of presentation. We will start with one sample/group/category statistics (first
floor), move on to the two samples/groups/categories statistics (second floor), and
then to many groups/samples/categories statistics (third floor). We will reserve the
attic for the advanced methods we will discuss in Chapter 6. This process will help
familiarize the reader with the most popular methods of analysis in a step-wise fashion
where the “higher” methods will depend on the “lower” ones, the same way as new
knowledge is based on building blocks of previous knowledge.

Figure 4.7 Breakdown of statistical methods

As the various statistical tests are presented in the remaining sections of this
chapter, an effort will be made to explain the way they are structured and their
workings the first time they appear by providing a simplified numerical example.
Similar to how our understanding of the workings of a car engine improves our
driving, it is assumed here that understanding how the various statistical tests work
will improve our understanding of their appropriateness for the data analysis we are
planning. When subsequent applications or extensions of the same or similar methods
appear, it will be left to the reader to follow up with numerical examples as they abound
in the extant print and online literature (a Google search will prove the point).

Figure 4.8 The house of stats

4.2 One-sample Case


The one sample situation is the most typical in research and involves a
selection of individuals from a population based on criteria that align with the purpose
and research questions of the research. We will first discuss the situation when scale
variables are involved and then continue with nominal variables. For each one of these
categories we will first deal with the case of one variable then move on to two variables
and eventually to many variables (Figure 4.9).
The case of scale variables will be further subdivided into those that have a
normal-like distribution and the distribution-free (non-parametric) variables. For a
variable to belong in the former category, we must have first (see section 3.3.6) either
constructed the P-P or Q-Q plots of the variable and have visual confirmation of a
close to diagonal scatter plot or apply the Kolmogorov-Smirnov test (suggested for
populations greater than 2,000) or the Shapiro-Wilk (test (suggested for populations
less than 2,000). If the results indicate that our variable is not normally distributed, we
should apply some of the popular transformations (log10, sqrt, etc.) and test again for
normality. If our distribution still cannot be classified as normal, we should then treat
the variable as non-parametric (distribution-free).

Figure 4.9 One-sample situations

4.2.1 One Normal-like Scale Variable


When studying a scale variable whose distribution resembles the normal
distribution, we are interested in statistics/metrics that realistically express the
distribution of the values of our sample population and how they compare with other
samples or populations. In the first category, we use what we call measures of
location/central tendency11 which identify the most typical/summary values that
represent the sample and of the variability/dispersion of the sample values that show
how the individual measures vary. In the second category, we are interested in
comparisons or measures of relative position and shape between parts of the sample
or with hypothetical/theoretical values.
Typical measures of location are those that identify critical values of the
sample distribution and include the mean, median, and mode. They are also referred
to as measures of central tendency simply because they are representing the
“center”/peak of the distributions. Like the population mean μ, the sample mean x̅
will be given by:

x̅ = Σx / N
11 SPSS: Analyze => Descriptive Statistics



Typical measures of variability/dispersion include the range, variance, and


standard deviation. While the range is (similar to the population) the difference
between the maximum and minimum values in the sample, the variance s² and standard
deviation s are slightly different and given by:

s² = Σ(x − x̅)² / (N − 1)   and   s = √( Σ(x − x̅)² / (N − 1) )
The reader might notice that both these formulas include a division by N-1
instead of N as it was in the corresponding formulas (3.3 and 3.4) for the population
μ and σ. This is an attempt to lower a bias that s2 and s introduce in samples especially
when the number of individuals in the sample is small. Because the proof of this
approximation is beyond the scope of this book, we will demonstrate its validity (for
the inquiring reader) using a simplified example. Let us assume that the mean of a
population variable is 70 (whatever the units might be) and our sample is composed
of 3 individuals with values 80, 90, and 100 for the variable we study. If we were to
calculate the mean we will get x̅ = 90. For the standard deviation if we use N-1 (which
is 2 in this case) we get s = 10, while if we use N (which is 3) we get s = 8.16. The effect
the N has is in biasing the s towards smaller values, while the effect N-1 has is in
increasing s and allowing it to cover more distance. Because the variance and the
standard deviation include differences from the mean (like 80-90, 90-90, and 100-90)
the values close to the mean (90 in our case) add insignificant amounts (90-90 = 0 in
our case) and thus their influence in shaping the outcome is eliminated or becomes
negligible. In other words, we could even exclude the value 90 from the 80, 90, 100
set and we will still get the same standard deviation when using N. Reducing N to N-
1 alleviates the nullification/zeroing of the variables near the mean. In the case of the
numbers we used we can see that the increased s that the N-1 produces will include
the population mean (70) when considering two standard deviations (-20 and +20)
from the sample mean (90), while the s produced when using N does not. Of course,

the numbers are selectively chosen in this example but the idea of what the use of N-
1 does is accurately portrayed.
Typical measures of relative position and shape include skewness, kurtosis,
and percentiles. While the first two measure deviations from the symmetry and the
standardized normal curve, percentiles express the sections (percentages) of the
population below certain values of the observed variables. As such, they include the
quartiles, IQR, and their depiction as box-plots. Another metric that shows relative
position and an indication of the shape are the z values of the various
individuals/observations in the sample.
A final statistic of importance that is used to compare the mean of the sample
against a hypothetical/population mean is the t-statistic that is produced by the well-
known t-test (also known as Student t-test12 or independent t-test). The naming
similarity with the t-distribution is evident and since we are interested in just two
values, our sample mean x̅ and a hypothetical population mean μ (that is, we have a
population with two values), the t-distribution is appropriate here. If we assume that
the standard error is SE(x̅) (the standard deviation of the sampling distribution, which
we estimate from our sample), then we have:

t = (x̅ − μ) / SE(x̅)   and   SE(x̅) = s / √N
The evaluation of the values of t we get will depend on the assumptions we
made when forming the hypothesis of our research. In general, though, when the
significance of the test is above 0.05 (see Chapter 3) we can conclude (with 95%
confidence) that our sample mean is at most within about 2 standard errors (sampling
distribution standard deviations) of the population mean. In practice (when conclusions
are concerned), we can say that our sample mean is representative of (approximately equal
to) the population mean with 95% confidence.
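A minimal Python sketch of the one-sample t-test (not from the book) is shown below; it reuses the illustrative 80, 90, 100 sample and the hypothetical population mean of 70 from the discussion above.

    import numpy as np
    from scipy import stats

    sample = np.array([80, 90, 100], dtype=float)
    t_stat, p_value = stats.ttest_1samp(sample, popmean=70)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
    # a p-value above 0.05 means the sample mean is consistent with a population mean of 70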

4.2.2 One Non-Parametric Scale Variable


Non-parametric distributions pose a great challenge in that the traditional
metrics of the well-studied normal distribution do not apply. Someone could also
calculate here the mean, variance, standard deviation, etc., but their representativeness
of the actual population will be unrealistic. As we discussed in Chapter 3, the median
can be a better representative as a location measure than the mean when the spread of

12 SPSS: Analyze => Compare Means => One‐Sample T-test



the values is “abnormal” (there are outliers and/or multiple modes like in the bimodal
distribution).
The most popular statistical test in this case is the sign test (a simpler relative of
the Wilcoxon tests we will see later). It provides a simple comparison of the sample median with a
hypothetical/population median. The test is based on the sign of the difference
between the observed values and the hypothetical/population median to reduce the
problem into a binomial distribution problem. Let us assume that the observed values
(we will ignore the units as usual) in our sample are the values of the municipalities in
Figure 3.3 where we are interested in building a firehouse. Let us also assume that
these distances represent the sample of a population of municipalities of an extended
region around a city or, if we are interested in a large population, the distances of all
the municipalities around big cities in the US (of say half a million population and
above). We want to see how the median of our sample compares to the population
median that we hypothesize to be 38.

Table 4.2 Sign-test data


x η x-η sign
32 38 -6 -
36 38 -2 -
40 38 2 +
26 38 -12 -
30 38 -8 -
28 38 -10 -
70 38 32 +
34 38 -4 -

Table 4.2 summarizes the differences of the sample values with our
hypothetical population median and the sign of that difference (last column). Having
two values for the sign (positive and negative) should suggest a binomial distribution
(a process similar to the coin flip example with, say, plus-sign for heads and minus-
sign for tails). If we consider the plus-signs in Table 4.2 we see that we have two such
results in the total of eight values/signs. By checking the binomial distribution table
of Figure 3.29, we see that for 2 successes in 8 draws (considering equal chances for
success/plus-sign and failure/minus-sign or p = 0.5) we get p(2) = 0.109. This means
that the probability of observing the 2 plus-signs is 10.9%. In other words, the
probability of observing a median like 38 which results in two positives and 6 negatives
in our set has 10.9% (rather low) chances of appearance. Whether this value is

acceptable or not for the inferences we want to make will depend on the hypothesis
we formed (see next chapter).
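The sign-test calculation above reduces to a binomial probability, which the short Python sketch below (not from the book) reproduces for the firehouse distances and the hypothesized median of 38.

    from scipy import stats

    distances = [32, 36, 40, 26, 30, 28, 70, 34]
    hypothesized_median = 38

    plus_signs = sum(x > hypothesized_median for x in distances)   # 2 in this data set
    n = len(distances)                                             # 8 values/signs
    prob = stats.binom.pmf(plus_signs, n, 0.5)
    print(f"P(exactly {plus_signs} plus-signs out of {n}) = {prob:.3f}")   # about 0.109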

4.2.3 Two Normal-like Scale Variables


When dealing with two different variables we are interested in comparisons
between their individual values or in their spread across some desirable range of values.
The former case is easily addressed by converting each value to its corresponding z
value and comparing the z values, while the latter demands associating the two
variables in some way. Starting with the simple case of comparing specific values let
us consider the case of height (symbolized with h) versus weight (symbolized with w)
in adult males as expressed through their body-mass index (BMI). We sample a group
of male adults under some conditions (we will ignore for the purposes of this exercise)
and by calculating the means and standard deviations we find that for the weight x̅w =
82Kg and sw = 10Kg and for the height x̅h = 1.7m and sh = 0.1m. We are interested
in finding out if our sample is overweight, so we decide to check the top 10% height-
mark and see how it relates to the weights distribution. With respect to the
standardized normal curves the cut-off point is where the lower 90% of the area under
the curve lies. Looking at the table in Figure 3.21 we can see that the 90% or p = 0.9
is for zh = 1.285 (approximately). By substituting this value in formula 3.7 for height
we get that the cut-off height above which 10% of the population lies is h = 1.83. By
using a BMI calculator (the Internet is full of them) we can see that the cut-off weight
that corresponds to that height and above which an individual is considered
overweight is around 83Kg. Using the mean and standard deviation of the weight
distribution and applying formula 3.7 for this value of weight we get zw = 0.1. For this
value the table in Figure 3.21 produces a probability of 0.5398, which indicates that
around 54% of the population is below the cut-off weight of 83Kg. Considering its
complement, we can say that 46% of the population is above that weight. In other words,
the top 10% height-mark of the heights corresponds to the top 46% weight-mark of the
weights. A lot of “shorter” individuals have weights they shouldn’t have.
Another way of comparing individual values, according to what we saw in the
previous section, is by using the t-test. In this case, instead of comparing a value like
the mean to a hypothetical value, we might compare the means of the two variables
we study. The t-test will now be called paired t-test13 (unlike the independent t-test
that compares means of the same variable for different samples). In similar fashion to

13 One sample (Paired) SPSS: Analyze => Compare Means => Paired Samples T Test

the independent t-test we saw before, the formula for the t-statistic and its standard
error are given by:

t = x̅D / SE(x̅D)   and   SE(x̅D) = sD / √N

where D = x1 − x2 denotes the pairwise differences of the two variables, x̅D (which equals x̅1 − x̅2) their mean, and sD their standard deviation.
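In Python, the paired comparison can be sketched as below (not from the book); the two illustrative sets of scores stand for two measurements taken on the same individuals.

    import numpy as np
    from scipy import stats

    before = np.array([82, 75, 90, 68, 77, 85], dtype=float)  # illustrative first measurement
    after  = np.array([85, 74, 93, 70, 80, 88], dtype=float)  # illustrative second measurement

    t_stat, p_value = stats.ttest_rel(before, after)          # paired t-test
    print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")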
While comparing two variable means using the t-test, or individual values by
comparing their corresponding z values, is straightforward, comparing the spread
across a range of values is more challenging. What we are interested in here is possible
associations of the variables which might suggest dependency of one over the other.
This is the realm of causation so we need to be careful in the way we express the
conclusions of our findings. Everything we will discuss here will not address
causation but only “prove” associations between variables. Associations are nothing
more than the tendency of one variable to follow the rate of change of another in some
sort of fashion. This can be expressed mathematically in the form of a function like y
= f(x). We should recall here the various variable naming conventions that are used in
quantitative research (discussed in section 2.4.3). Variables that tend to trigger
something tend to be called independent, predictor, or plain x, while variables that
express the result of the trigger tend to be called dependent, criterion, outcome, or
even plain y. To simplify matters and avoid names that insinuate causation, we will use
the abstract algebraic convention of x and y here.
In defining associations let us look first at the scatter plots14 of the x and y
variable values of some sample populations (Figure 4.10). By association we mean
any shape (line or curve) of the relationship between x and y that seems to indicate
structure. We can see in Figure 4.10 that plot (a) shows a decline in a linear fashion of
the values of y as the values of x increase (negative association), plot (b) shows an
initial decline (negative association) followed by an increase of the y values as the x
values increase (positive association), and plot (c) shows no association at all as the
values appear randomly placed on the plot. We will see later on how to deal with some
of the randomness we observe, but for now it is sufficient to say that plots (a) and (b)
indicate some kind of association between x and y (linear for (a) and kind-of quadratic
for (b)), while (c) does not show any association at all. The curved (straight including)
line forms of association are also typically referred to as correlations.

14 SPSS: Graphs => Chart Builder =>Scatter



For a last time let us emphasize the importance of distinguishing between


correlation/association and causation. Despite the temptation to assign dependencies
in variables when distinct patterns emerge in scatter plots, it should be clear that
correlation/association does not imply causation. Causations suggest a causal factor
that precedes an effect in time while correlations suggest nothing about the
relationship of variables in time. In many cases the real causal variable might be hidden
and not one of the two variables that correlate. For example, when the roosters crow
in the morning, the Sun comes up; but unless the roosters have special metaphysical
powers (unlikely because they usually end up as food), you can be assured they do not
cause the Sun to come up.
While there could be many models that fit the data we might have (including
something like Figure 4.10.c, which could be approximated with a high-degree
polynomial), we will focus our investigation here on the situation in plot (a) of Figure
4.10. Our goal will be to develop a metric that can measure the extent/degree/strength
of the association between the two variables we observe. For this we will first convert
all values into their corresponding z values so we can have the same units (standard
deviations) for each of the two variables. The new plot is depicted in Figure 4.11.a and
although it appears to be somewhat distorted from the original the negative association
is still evident.

Figure 4.10 Scatter plots of x and y

Considering that the general trend in Figure 4.11.a is downward, we need to


establish a metric that supports and enhances that trend. If we were to see the plot as
four quadrants, we can see that the top left and bottom right ones (blue data points in
Figure 4.11.b) are those that enhance the negative trend, while the others weaken the
trend (red data points in Figure 4.11.b). There are also those data points that fall on
top of one of the axes (green data points in Figure 4.11.b) that have a neutral effect
(neither supporting nor weakening) the trend. If we look at the zx and zy coordinates
of the red data points we can easily observe that they are of the same sign (both negative
or positive) for each data point, while for the blue data points they have opposite signs.

One operation/metric that can catch sign differences is multiplication. If we consider


the multiplication of the zx and zy coordinates for each point we can see that the
product for the blue points will have a minus sign while the red data points will have
a positive sign (the green data points will produce zero in multiplications).

Figure 4.11 Scatter plots of Zx and Zy

Considering the average of all zxzy products, we can form our metric r as:

r = Σ(zx · zy) / (N − 1)
This is the well-known correlation coefficient15 or Pearson’s r and takes


values between -1 and +1. Negative values of r indicate negative associations (like in
our case), while positive values indicate positive associations. The closer we get to -1
or +1 the stronger our association will be, while the closer we get to zero the weaker
it will be. While getting close to -1 or +1 might seem ideal, in practice if we get a value
of r that is very close to any of these two values we should worry as it might very well

15 SPSS: Analyze => Correlate => Bivariate



be the case that x and y are the same variables. In math, close to perfect diagonal plots
(those that produce the -1 or +1 coefficients) are indications of functions of the form
y = x or y = -x (also called identities). The type of correlation we have seen here is also
known as bivariate correlation or Pearson correlation.
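The z-score construction of r can be checked directly in code; the Python sketch below (not from the book) computes r from the average of the zxzy products and compares it with SciPy's built-in Pearson correlation for an illustrative pair of variables.

    import numpy as np
    from scipy import stats

    x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
    y = np.array([9, 7, 8, 5, 4, 2], dtype=float)   # illustrative negative association

    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    r_manual = (zx * zy).sum() / (len(x) - 1)        # average of the zx*zy products

    r_scipy, p_value = stats.pearsonr(x, y)
    print(f"manual r = {r_manual:.3f}, scipy r = {r_scipy:.3f}, p = {p_value:.3f}")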
Having an indication of a strong association might tempt us to consider
modeling our scatter plot along a line (referred to as regression)16. This means we
might be interested in finding the equation of the line that best approaches the trend
of the data points. This process is called regression and focuses on identifying the
parameters that define the line that best fits/approximates our data values. From math,
we know that lines are expressed as functions of the form y = mx + b (referred to as
regression equation) where m represents the slope/angle of the line with the
horizontal axis and the constant term b is the y-intercept or the point where the line
intersects with the y axis. In statistics, the form of the line equation is usually shown
as ŷ = a0+a1x or ŷ = b0+b1x (if we are to stick with our constant term notation)
suggesting the polynomial origin of the line as one of many curves that derive from an
nth degree polynomial of the form a0+a1x+ a2x2+a3x3+ …… anxn. By considering this
form the reader could see that the scatter plot of Figure 4.10.b could be approximated
by ŷ = a0+a1x+ a2x2 (quadratic equation). Notice that instead of y we now use ŷ
(pronounced y-hat) — this is to distinguish the points produced by the line equation
as the predicted ŷ instead of the actual y in our data set.

The process we follow in math to calculate the coefficients of the line, m and
b (or the polynomial coefficients in the more general form), is by applying the least-
squares method. This is a standard approximation technique for calculating
coefficients (details can be found on the Internet for the interested reader). By applying
the least-squares approximation technique and combining it with the statistics we have
seen so far (mean, standard deviation, and the correlation coefficient), we get:

a1 = r · (sy / sx)   and   a0 = ȳ − a1 · x̅
One interesting observation from the formula for a1 (the slope of the line) is
that when the variables are standardized (so that sx = sy = 1) the slope equals r, whose
absolute value is always less than 1. This means that the slope of the regression line will

16 SPSS: Analyze => Regression => Linear =>…Plot (select Z values)



always be less than 1 or in terms of the angle the line makes with the horizontal axis it
will be less than 45 degrees (remember that slope = tan(angle)). In normalized diagrams
this will lead to having the predicted values of ŷ (vertical red lines in Figure 4.12)
always be smaller than its corresponding value of x (horizontal red lines in Figure 4.12).
The parallelogram that the red lines make with the axes will always be flatter (its
y/height smaller than its x/length). This property, which results from the values a1 can
take, is called regression to the mean (remember that the mean in the standardized normal
distribution is zero) as the predicted values of y tend to get closer to the mean (zero
in our case); this is how regression got its name.

Figure 4.12 Regression to the mean

Now that we have established a metric for measuring the strength of


associations and a model/line to describe them that also allows us to predict values of
y based on values of x, we need to consider a metric that could measure the strength
of the regression. In other words, we need something to show us if the deviations we
might observe from the regression line are too significant to ignore. This means we
need a measure of the effect of the dispersion/scattering of the data points around the
established trend/line. Similar to what we discussed when we developed the variance
and standard deviation metrics, we could consider here the differences between the
observed y from the predicted ŷ for each value of x. These differences are called
residuals and they will form their own distribution that we can study further. In

practice, we can get an indication of the residuals by plotting them versus their
corresponding x values. If a structure/form appears (like Figure 4.10.a and 4.10.b),
then we should worry as this would be an indication of the underlying influence of
another variable that our model/line did not consider. If on the other hand the plot
appears random (like Figure 4.10.c), then we can be reasonably confident that there are no
underlying influences and that the scattering of the residuals is random.
While the plot of the residuals can provide a strong indication of the validity
of our regression model, searching for a statistic that could do the same work is
preferable even if it is to provide additional support for the residual plot. Considering
that regression models will fall somewhere between the perfect correlation (r = -1 or
r = 1) and no correlation (r = 0), a metric will just need to show where in the domain
between perfect and no correlation our model falls. If we take into consideration that
the absolute value of positive and negative values of r is the same (the only difference
is the direction of the line), we can adopt the square of r as an indication of the
correlation strength independent of sign. This statistic is referred to as R2 (pronounced
R-squared)17 and is used as an indication of the variability in our data that can be
explained by our model/regression line.
To clarify the concepts and the statistics we mentioned in this section, we will
consider the following example. A sample of data concerning white stork populations
in some European countries along with their corresponding human populations for
the period between 1980 and 1990 is compiled in Table 4.318 (first three columns
ordered in increasing order of stork pairs). The means and standard deviations are also
displayed. By developing a scatter plot of the available data (Figure 4.13), we can see a
trend emerging in the form of a positive association. The correlation coefficient and
the regression line coefficients can be calculated by applying the formulas we presented
in this section and produce: r = 0.85 (an indication of a strong correlation as it is close
to 1), and a1 = 1412.4, a0 = 9,000,000 for the regression line.
Figure 4.14 displays the regression line (red color) with the predicted values
(red dots) for each data value (blue dots) and the residuals (black line segments). With
the given value of r = 0.85 we can calculate R-squared as R2 = 0.73 (rounded to two
decimal places). An interpretation for this value is that around 73% of the observed
variation in the data points can be accommodated by our model (regression line), while

17SPSS: Analyze => Regression => Linear => Save => Understandardized (x axis),
Standardized (y axis) = Chart Builder
18
Source: Mathews, R. (2000), Storks Deliver Babies. Teaching Statistics, 22, 2,
pp. 36–38, Wiley.

only the remaining 27% is unaccounted for. By plotting the residuals (shaded column
data in Table 4.3), we can see (Figure 4.15) that no specific form emerges and that
their distribution appears random. This is good news as it doesn’t suggest underlying
influences and confirms the 73% accountability of the model (red line).

Table 4.3 Regression test data


Country        Storks (pairs)   Humans       Predicted Humans   Residuals
Belgium 1 9,900,000 9001412 898588
Holland 4 15,000,000 9005650 5994350
Denmark 9 5,100,000 9012712 -3912712
Albania 100 3,200,000 9141240 -5941240
Switzerland 150 6,700,000 9211860 -2511860
Austria 300 7,600,000 9423720 -1823720
Portugal 1500 10,000,000 11118600 -1118600
Greece 2500 10,000,000 12531000 -2531000
Bulgaria 5000 9,000,000 16062000 -7062000
Hungary 5000 11,000,000 16062000 -5062000
Romania 5000 23,000,000 16062000 6938000
Spain 8000 39,000,000 20299200 18700800
Turkey 25000 56,000,000 44310000 11690000
Poland 30000 38,000,000 51372000 -13372000
Mean 5897 17392857 17329528 63329
SD 9550 15832657 13488846 8289946

Based on the analysis of the data we had (Table 4.3), we can deduce, with a
relatively high degree of certainty (we will talk more about this in the next chapter),
that there is a correlation between the size of the human population and the stork
population. The more storks we have in an area, the more populated it is. Considering
that human migrations during the 80s when the data were collected were minor, one
could be tempted to deduce that the higher the number of storks the higher the birth
rate in the human settlements. Would that also mean the storks are responsible for the
increase in births? The reader can see how tempting it could be to extend correlation
to causation, only to realize that unless the storks are the ones that bring the babies
(remember the familiar cartoons with the storks delivering newborns) the correlation
we proved is meaningless considering the evidence we have. A more plausible
explanation would involve another variable (hidden/lurking) that could be the
geographic area of each country in our data. The larger the country, the more
populated one would expect it to be in terms of both humans and storks.
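For readers who want to verify the stork example outside SPSS, the Python sketch below (not from the book) refits the regression of Table 4.3 with SciPy; the printed slope, intercept, r, and R-squared should approximately match the 1412.4, 9,000,000, 0.85, and 0.73 quoted in the text, up to rounding.

    import numpy as np
    from scipy import stats

    storks = np.array([1, 4, 9, 100, 150, 300, 1500, 2500,
                       5000, 5000, 5000, 8000, 25000, 30000], dtype=float)
    humans = np.array([9.9e6, 15e6, 5.1e6, 3.2e6, 6.7e6, 7.6e6, 10e6, 10e6,
                       9e6, 11e6, 23e6, 39e6, 56e6, 38e6])

    result = stats.linregress(storks, humans)
    residuals = humans - (result.intercept + result.slope * storks)  # cf. last column of Table 4.3
    print(f"a1 = {result.slope:.1f}, a0 = {result.intercept:.0f}, "
          f"r = {result.rvalue:.2f}, R-squared = {result.rvalue**2:.2f}")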

Figure 4.13 Stork example scatter plot

Figure 4.14 Regression line

While the presentation of the correlation and regression aimed at familiarizing


the reader with the concepts in a simplified form, the reader who wants to expand on
the subject will encounter a formal terminology for the assumptions we made. For
regression, the conditions of normality (assumed for this section), linearity,
multicollinearity, and homoscedasticity need to apply before we proceed.
Samples 131

Multicollinearity assumes that variables are measured without error (correct sample
size will ensure this condition). Homoscedasticity will be indicated by the scatterplot
and will ensure the variance of errors is the same across all levels of the independent
variable (standardized residuals will ideally be scattered around the horizontal line).
Assumptions will be required for all the tests we will discuss in this book but will be
left for the reader to explore unless of course they are critical for the understanding of
the material.

Figure 4.15 Scatter plot of residuals

In closing this section, it is worth mentioning the cases of correlations that do


not necessarily result in regressions. These are the cases where the scatter plots do not
suggest a line model but rather a more complicated but still distinct geometric curve.
As we mentioned previously these cases are dealt with by applying the least-squares
approximation to a polynomial of a degree of our preference. Unless the distribution
of the data points suggests a specific mathematical form like quadratic (a0 + a1x + a2x^2),
exponential (a0·e^(a1 + a2x + …)), logarithmic (a0·log(a1 + a2x + …)), etc., we start the
approximation with a high-order polynomial a0 + a1x + a2x^2 + a3x^3 + … + anx^n (plus an
error term that we ignore here), and by eliminating the coefficients that contribute the
least (small values of a), we end up with the form that closely approximates our data
values. Additional options include Fourier transforms and other techniques that are
beyond the scope of this book.

4.2.4 Two Non-Parametric Scale Variables


The case of scale variables that do not follow the normal distribution poses
many challenges, and as such special methods need to be considered. General
assumptions these methods make are that the sample individuals are randomly selected
and that the probability distributions of the variables involved are continuous. When
it comes to comparing non-parametric variables, the analog to the t-test we’ve seen
before is the Wilcoxon’s Rank or Wilcoxon’s Rank Sum test19 (ideal also when the
two variables represent different samples). To demonstrate the case, let us suppose the
values of two variables X1 and X2 are as shown in Table 4.4 with the first set (yellow
background) having 6 values and the second set (blue background) having 7 values.
The test performs initially an ordering of the two sets of variables as if they were from
the same population (first column in Table 4.5) and assigns each its ordering value
(second column in Table 4.5). It then proceeds to isolate and sum the orders that
correspond to each set (last two columns in Table 4.5). In our case, we end up with
Sum1 = 39 and Sum2 = 52.
The total sum of all orders (in our case numbers from 1 to 13) will be Sum1 +
Sum2 = 91 and will remain the same as long as the number of entries for X1 and X2
add up to 13. This means that if Sum1 gets smaller, Sum2 will be getting larger (so that
the total sums up to 91) as will the difference between the two. This difference would
then suggest that the two sets of values come from different populations or that the
two variables do not refer to the same thing. The Wilcoxon Rank Sum test considers
as its statistic the sum of the smallest population of values. Tables with various values
of the statistic have been developed (can be retrieved from the Internet) for minimum
and maximum values of the statistic. When the statistic is within the range provided in
the tables then the two sets of variables might be coming from the same set (or their
populations have the same mean). An equivalent to Wilcoxon’s Rank Sum test is the
Mann-Whitney U test (the reader can find more on this on the Internet). The
necessary condition for the application of rank tests is that the two distributions have
the same shape.
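A quick way to run this comparison in Python is through the Mann-Whitney U test mentioned above; the sketch below (not from the book) applies it to the two sets of values from Table 4.4.

    from scipy import stats

    x1 = [4, -3, -7, -5, 12, -6]
    x2 = [5, -2, 6, 8, -9, -10, 7]

    u_stat, p_value = stats.mannwhitneyu(x1, x2, alternative='two-sided')
    print(f"U = {u_stat}, p = {p_value:.3f}")
    # a p-value above the significance level suggests the two sets could come from the same population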
When we are interested in comparing/correlating the full range of two data
sets (the spread of their values), in the non-parametric case, we can use the equivalent
for the correlation coefficient, which in this case is called Spearman’s rank
(symbolized as rs)20. A case in point could be to see if the grades two instructors give
when evaluating a group of students are related. Assuming intervals of 10 in a 100%

19 SPSS: Analyze => Nonparametric Tests => Legacy Dialogs => 2 Related Samples
20 SPSS: Analyze > Correlate > Bivariate > Spearman (uncheck Pearson)

scale with 10 the lowest rank for a student and 100 the highest and a group of 8
students, Table 4.6 displays two evaluations sets and the process of producing the sum
of the squares of the differences of the two grades (X1 and X2 are the grades of the
first and second instructor, respectively).

Table 4.4 Data values

X1: 4, -3, -7, -5, 12, -6
X2: 5, -2, 6, 8, -9, -10, 7

Table 4.5 Ordered values

X-ordered   Order   X1 order   X2 order
-10          1                  1
-9           2                  2
-7           3       3
-6           4       4
-5           5       5
-3           6       6
-2           7                  7
4            8       8
5            9                  9
6            10                 10
7            11                 11
8            12                 12
12           13      13
SUM                  39         52

Table 4.6 Spearman’s rank data


X1 X2 X1-X2 SQR(X1-X2)
3 5 -2 4
6 4 2 4
4 3 1 1
9 5 4 16
5 4 1 1
2 4 -2 4
4 5 -1 1
6 7 -1 1
Sum 32

The formula for rs (a shortcut actually) is:

rs = 1 − (6 · Σd²) / (n · (n² − 1))

where d = X1 − X2 are the rank differences (whose squares sum to 32 in Table 4.6) and n = 8 is the number of students.
As with the regular correlation coefficient, strong correlation results in values


of close to +1 (positive correlation) and -1 (negative correlation), while weak
correlation results in values of rs close to zero. In the case of the data in Table 4.6, the
Spearman’s rank formula produces rs = 0.62 indicating a good (not strong) correlation
of the two variables.
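The same correlation can be obtained in Python; the sketch below (not from the book) applies SciPy's Spearman correlation to the two instructors' grades of Table 4.6. Because of the tied grades, the tie-corrected value it reports may differ slightly from the 0.62 of the shortcut formula.

    from scipy import stats

    x1 = [3, 6, 4, 9, 5, 2, 4, 6]   # grades from the first instructor (Table 4.6)
    x2 = [5, 4, 3, 5, 4, 4, 5, 7]   # grades from the second instructor

    rho, p_value = stats.spearmanr(x1, x2)
    print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")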
In case we are interested in a one-to-one (paired) comparison of two variables
(like when we apply two different treatments to the same individuals), the appropriate
method would be the Wilcoxon’s Signed Rank test. This works, up to a point, the
same way Wilcoxon’s Rank Sum works, but here we are ordering the absolute
difference of the two variables. When we have ordered the absolute values, we sum
the orders/ranks of those that have negative differences and those that have positive
differences. Table 4.7 displays a set of values for two variables X1 and X2 and the
process for producing the aforementioned sums (Sum+ = 9 and Sum- = 12). A point
of interest here is that in the event of multiple similar absolute values (in our case we
have two 1s that should have occupied first rank), we assign as rank the average of the
spread of ranks they occupy. The two 13s in the absolute difference in Table 4.7, for
example, should be in 4th and 5th place but because they are the same they share the
4.5 rank.

Table 4.7 Wilcoxon’s Signed Rank data


X1    X2    X1-X2   ABS(X1-X2)   Rank/Order   SUM+   SUM-
-3    -2    -1      1            1.5                 1.5
-7     6    -13     13           4.5                 4.5
-5     8    -13     13           4.5                 4.5
12    -9    21      21           6            6
 4     5    -1      1            1.5                 1.5
-6   -10    4       4            3            3
SUM                                           9      12

As before, the test statistic is the smallest of the two sums (this is 9 in our case).
The smaller this gets the stronger the difference between the two variables, and by
extension the effect of our intervention that produced the two sets of data. Values of
the statistics can be found (for the enquiring reader) on the Internet pre-calculated for
various sample sizes. In the example we presented here (Table 4.7), with a sample size
of 6 a paired difference signed rank table would give, for a significance level of 0.05, a value of 1,
meaning that only if we observed a test statistic (smallest sum) of 1 and below would
the two variables show significant differentiation between them. An assumption that
Wilcoxon’s Signed Rank makes is that the distribution of the differences between the
two variables (third column in Table 4.7) is symmetrical in shape.
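The Python sketch below (not from the book) runs Wilcoxon's signed-rank test on the paired data of Table 4.7; the reported statistic is the smaller of the two rank sums (9 in this example), and because of the tied ranks SciPy may fall back on a normal approximation for the p-value.

    from scipy import stats

    x1 = [-3, -7, -5, 12, 4, -6]
    x2 = [-2, 6, 8, -9, 5, -10]

    w_stat, p_value = stats.wilcoxon(x1, x2)   # works on the pairwise differences x1 - x2
    print(f"W = {w_stat}, p = {p_value:.3f}")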
As a final case, we need to consider the situation when one of the variables is
non-parametric while the other has a normal-like distribution. Given that non-
parametric variables are peculiar and need to be treated in ways we presented here, and
while there might be methods in literature for this type of situation, it is suggested that
we ignore the normality of the one variable and treat them all as non-parametric. In
this way, everything we have presented in this section can be applied. In the case where
the non-parametric variable displays distinct modes (like it is bimodal), a possible
treatment would be to break the data set in two. This case will be dealt with when we
discuss the two samples situation.

4.2.5 Many Normal-like Scale Variables


The many variables situation requires that we identify what exactly we are
comparing. Is it single entries/values of each variable? Is it specific entries like the
means? Is it the whole spectrum/spread of values of each variable with the
corresponding ones for the others? In the case of the single-value entries, we can
perform a comparison by converting the values of interest to their corresponding z
values and compare those values or their corresponding probabilities (Chapter 3).
The case of comparing special values like the means of multiple variables is a
more complicated one as up to now we have seen comparison of the means (t-test) of
pairs of variables. While this is possible here, as the number of variables grows the
combination of all pairs could increase drastically (for 10 variables, for example, the
number of all pair comparisons we can make is 45), making the need for an all-inclusive
metric a necessity. The method that was developed for the case of comparing the
means of multiple variables (usually representing multiple groups) is called analysis of
variance (ANOVA)21 and is based on a comparison of the differences of the means

SPSS: Analyze => Compare Means => One way ANOVA or SPSS:Analyze =>
21

Compare Means => Means => Option => check ANOVA Tables

(t-tests) with the variation between the variables. The corresponding test is called F-
test and its statistic is the F-ratio, expressed as the ratio of the variance
between/across the variables over the variance within the variables.
One condition that we need to consider when applying ANOVA is that it
assumes sphericity. This means that the variances of the differences between all group
combinations are near equal. If a group is an outlier, it will distort the results we
get. To avoid such a possibility, Mauchly’s test can be applied. This test is based on
the hypothesis that the variances are equal and produces a metric to support or reject
that possibility. For a significance level of 0.05, if the test produces a significance value above 0.05
then we can conclude that sphericity is preserved.
Let us demonstrate the ANOVA case through an example of a randomized
sample with N = 9 individuals/entries, each one represented by three (n = 3) variables
x1, x2, x3 (Table 4.8). The variability between variables is measured by considering the
mean of the means (yellow cell) of the three variables and seeing how each individual
mean varies from it. In essence we are treating each variable as having the same value
(the mean) across all 8 entries of it. The process of calculating the variance between
the means (left side of Table 4.8 below yellow cell) is exactly the same as we did for
the population in the previous chapter. We will end up with three means that will be
treated as a special sample of n = 3 entries/values. As we see in Table 4.8 the in-
between variables variation in this case (VarBetween) is 355.7.
For the within each variable variation we proceed by calculating the sums of
the squares of the difference of each variable value from its corresponding mean (as
we did with the variance calculation). This process will result in one sum for each
variable. We will treat these sums as a separate set of values and will calculate their
mean by dividing with (N-1)*n = 24 as the degrees of freedom of this special sample
(justification can be found in the literature). As Table 4.8 shows (right side), the within
variables variation (VarWithin) is 113. In order to calculate the F-ratio we need to
divide VarBetween by VarWithin. Eventually we get F = 3.144. By looking at an F
distribution table (Figure 3.33) for k1 = 2 and k2 = 24, we find the critical value 3.403 for the
F-statistic. Our statistic (3.144) is smaller than this cutoff, meaning that the means of
the 3 variables of our sample do not show any significant difference between them
(our sample is as typical as 95% of the samples we could draw).

Table 4.8 ANOVA data set


Between Variables Variation Within Variables Variation
x1 x2 x3 (x1- x̅1)2 (x2-x̅2)2 (x3-x̅3)2
43 28 38 3 22 47
25 23 34 267 93 119
52 33 62 114 0 293
60 49 25 348 267 396
40 35 51 2 5 37
27 20 54 205 160 83
36 40 47 28 54 4
51 37 50 93 19 26
38 29 43 11 13 4
x̅1 = 41.3     x̅2 = 32.7     x̅3 = 44.9
Mean of means: x̅ = 39.6
(x̅1 - x̅)2 = 2.9   (x̅2 - x̅)2 = 48.5   (x̅3 - x̅)2 = 27.7        Σ(x1-x̅1)2 = 1072   Σ(x2-x̅2)2 = 634   Σ(x3-x̅3)2 = 1009
Sum = 9*(x̅1 - x̅)2 + 9*(x̅2 - x̅)2 + 9*(x̅3 - x̅)2 = 711.4        Sum = Σ(x1-x̅1)2 + Σ(x2-x̅2)2 + Σ(x3-x̅3)2 = 2715
VarBetween = Sum/(n-1) = 355.7                                  VarWithin = Sum/((N-1)*n) = 113
F = VarBetween/VarWithin = 3.144
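
For readers who want to check such calculations outside SPSS, the following is a minimal Python sketch (not part of the book’s own material) that reproduces the F-ratio of Table 4.8 using the scipy library; the lists x1, x2, x3 simply hold the table’s values.

    from scipy import stats

    x1 = [43, 25, 52, 60, 40, 27, 36, 51, 38]
    x2 = [28, 23, 33, 49, 35, 20, 40, 37, 29]
    x3 = [38, 34, 62, 25, 51, 54, 47, 50, 43]

    # One-way ANOVA: F is the ratio of between-group to within-group variance.
    f_stat, p_value = stats.f_oneway(x1, x2, x3)
    print(f_stat, p_value)   # F is roughly 3.14 and p is just above 0.05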

Figure 4.16 Box-plots for ANOVA



If we were to compare the box-plots of the three variables of our example


(Figure 4.16.a), we could easily deduce that their overlaps are an indication that they
are not very different. A clearer difference would have been reflected in a box-plot
diagram like the one (hypothetical) in Figure 4.16.b. While in the case of ANOVA
box-plots can create a visual representation of the means and the spread of the variable
values, we should always stick to the actual statistic values for any conclusion we make.
A point of importance with ANOVA is that while it can tell us that at least one of the
variables/groups differs from the rest, it cannot tell us which pairs of variables are
responsible for the difference. For this we might still end up performing an
independent t-test (or a post-hoc comparison) for each pair.
ANOVA has gained popularity in social sciences because of the variety of
situations in which it can be used. To address the situations it can be applied in,
ANOVA has been “tweaked” under different names. We will mention here these
variations in light of the way they are practiced and the way they appear in statistical
analysis software.

• One-way ANOVA: This is the situation we presented previously. It is


called one-way because it considers one variable as a common factor
among the variables we study. In our Table 4.8 case, that
factor/category is the sample — all 3 x variables belong to the same
sample. In practice, we might have two variables, one nominal, say with
4 categories, called factor and one scale dependent on the factor. By
breaking the sample into four sets — one for every category — we
immediately end up with 4 sets of values (like we had x1, x2, and x3 in
Table 4.8). As an example, consider the situation where we are
interested in comparing student GPAs across the 4 years of university
studies. We have here two variables: one is the GPA that serves as the
scale variable (in the role of a dependent for ANOVA software) and
one is the student level (freshman, sophomore, junior, senior) that
serves as a factor variable. The ANOVA conditions (normal-like
distribution and similar variances) apply to the GPA values within each
level of this factor. By breaking the set of GPA values into freshman GPAs (x1),
sophomore GPAs (x2), junior GPAs (x3), and senior GPAs (x4), we
end up with 4 variables and can perform the ANOVA process as we
did in our Table 4.8 example.

• Two-way ANOVA: This is like ANOVA but with two factors. In the
case of the university students, apart from studying GPA throughout
the 4 years we might also be interested in studying it across genders.
The initial 4 groupings of the GPA values (one per study year) will now

have to be further subdivided per gender. Thankfully, the software we


will use will do all of this for us and we only have to identify the factors
(student level and student gender) that influence the dependent
variable (student GPA).

• ANCOVA (analysis of covariance): This is like the one-way


ANOVA when there is some covariance (influence/relationship)
between some of the scale variables. It could be, for example, that in
our university student case we suspect that time students spend in
student support affects their grades. In such a case, along with GPA
scores we might identify the time spent as an influential
variable/covariance and indicate that in the options of the software
we use. The formula for covariance is similar to variance but considers
two variables instead:

• One-way MANOVA (one-way multiple analysis of variance): This


is similar to one-way ANOVA but with two continuous variables in
the dependent role. In our university student example, we might be
interested in looking, per student level/year (one factor variable), at
two dependent variables: the GPA in the general education courses
(first dependent variable) and the GPA in the specialization courses
(second dependent variable).

• Two-way MANOVA: This is like the two-way ANOVA but with two
continuous variables in the dependent role. In our university student
example, we might be interested in looking at the student GPA in the
general education courses (first dependent variable) and their GPA in
the specialization courses (second dependent variable) while we factor
for student level (freshman, sophomore, junior, senior) and gender
(male, female).

• MANCOVA (multivariate analysis of covariance): This is similar


to ANCOVA but with two continuous variables in the dependent role.

In our student example, it will be like having GPA in general education


and GPA in specialization courses as dependent variables with
covariate the time spent in student support and factor for student level
and student gender.
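
As a complement to the SPSS menu paths given in the footnotes, the following is a hedged Python sketch of how the one-way, two-way, and ANCOVA variants above are typically requested with the statsmodels library. The student data frame and its column names (gpa, year, gender, support_hours) are invented stand-ins for the GPA example, generated only so the sketch runs.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Hypothetical student data, simulated purely for illustration.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "year": np.repeat(["freshman", "sophomore", "junior", "senior"], 25),
        "gender": np.tile(["male", "female"], 50),
        "support_hours": rng.uniform(0, 10, 100),
    })
    df["gpa"] = 3.0 + 0.05 * df["support_hours"] + rng.normal(0, 0.3, 100)

    # One-way ANOVA: GPA by student level (a single factor).
    print(sm.stats.anova_lm(smf.ols("gpa ~ C(year)", data=df).fit(), typ=2))
    # Two-way ANOVA: student level and gender, including their interaction.
    print(sm.stats.anova_lm(smf.ols("gpa ~ C(year) * C(gender)", data=df).fit(), typ=2))
    # ANCOVA: student level as factor plus support hours as a covariate.
    print(sm.stats.anova_lm(smf.ols("gpa ~ C(year) + support_hours", data=df).fit(), typ=2))

The multivariate variants (MANOVA, MANCOVA) have their own implementations as well (for example, statsmodels’ MANOVA class), following the same factor/dependent-variable logic.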
Following the situation of comparing means, we come to the situation where
we are interested in comparing the whole spread of each variable with the others. One
could consider in this case performing everything we did so far like considering all
combinations of pairs of variables (scatterplot, correlations, and regressions) but this
might get out of hand as the number of variables increases. To avoid such
complication, we will consider instead multiple regression. This is similar to the
regression we’ve seen with two variables but extended to include multiple variables.
The form of the regression equation then becomes ŷ = a0+a1x1+a2x2+a3x3+ ……+anxn
(plus an error term that we ignore here). This can be extended to include terms for
possible interactions between variables. In the case of 3 variables the form of the
equation with the interactions included becomes ŷ = a0+a1x1+a2x2+a3x3+a4x1x2+
a5x1x3+a6x2x3+a7x1x2x3. The coefficients can be produced by applying the least-squares
approximation (beyond the scope of this book) as in the simple regression case.
The typical assumptions that we made for the simple regression also apply in
the case of multiple regression. These include the assumptions that our variables are
nearly normally distributed, linearly related to the dependent variable, independent of
each other (no strong covariances exist among the predictors), and of similar variance.
need to consider if our model (that the software we use produced) is any good. This
is usually performed with an F-test (ANOVA) that allows us to test if the means of
the variables could realistically model the associations between our variables. If the F-
statistic is large (“reasonably” far from one), then the means are unlikely to be a good
model for the variables and the multiple regression might have a better chance. Most
software would then proceed to check if any of the coefficients of the multiple
regression model/line can account for the variation of y. This is usually performed
using the t-statistic (Student t) and the corresponding t distribution by considering
each coefficient in the regression equation as the mean of a hypothetical population
and comparing it to a hypothetical value of zero. If the test indicates that the
coefficient and the zero value belong to, say, the 95% of the hypothetical population,
we can conclude that our coefficient is similar to “zero” so the variable/coefficient it
represents does not contribute to the model after allowing for the effect of all the other
variables. Please note that linearity with y is still not excluded, it is simply negated in
our model in light of the effects of the other variables. We need to be cautious when
interpreting the multiple regression results in light of the aforementioned point. A
typical error made is the assumption that each coefficient indicates the effect of its
corresponding variable on the y. This is far from true. Multiple regression does not

identify the type of relationship between each variable xi and y (it could be linear,
quadratic, or something else even when the corresponding coefficient ai is close to
zero). What we see in multiple regression is the combined effect of all x variables and
not their individual contribution.
As we did with the case of the simple regression, we can evaluate the strength
of the (multiple) correlation by using residuals. The symbol we use for this statistic is R2
(the same one as for the simple regression), but in this case, it is called adjusted R-
squared. The adjustment is needed because the plain R2 can only go up as we add more
independent/predictor variables xi, even when the corresponding coefficients ai
contribute nothing of substance, so a penalty for the number of predictors is applied.
Unlike the normal regression model, the adjusted R2 no longer expresses the exact
fraction of variability accounted for by the model (it is not even guaranteed to lie
between 0 and 100% as the normal R2 is), so we need to be extra careful when
interpreting it. We should also complement the interpretation by inspecting the
scatterplots of the various xi with the y.
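
A hedged Python sketch of the workflow just described (the overall F-test, the t-tests on the coefficients, and the adjusted R2) is given below; the data frame and its columns x1, x2, x3, and y are simulated purely for illustration.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical data for three predictors and one outcome.
    rng = np.random.default_rng(1)
    df = pd.DataFrame({"x1": rng.normal(size=80),
                       "x2": rng.normal(size=80),
                       "x3": rng.normal(size=80)})
    df["y"] = 2 + 1.5 * df["x1"] - 0.8 * df["x2"] + rng.normal(size=80)

    model = smf.ols("y ~ x1 + x2 + x3", data=df).fit()
    print(model.fvalue, model.f_pvalue)   # overall F-test of the model
    print(model.params, model.pvalues)    # coefficients and their t-test p-values
    print(model.rsquared_adj)             # adjusted R-squared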
An alternative to multiple regression that is of interest when control or
mediator variables are involved is partial correlation22. While it is more of a case of
correlation between two variables the fact that there might be interference from others
leads to their consideration here as a case of many normal-like scale variables.
Presuming all the conditions of this category apply (our variables exhibit a linear
relationship, they are continuous and have a normal like distribution with no
significant outliers) partial correlation allows for measuring the strength of the
relationship while controlling for the interference of other variables. The process
begins by performing correlations of all the combinations of the variables we have
(dependent, independent, and suspected control/moderators/mediators). The partial
correlation coefficient is then calculated as a combination of the individual
correlations. In the case of the correlation of variables x and y, where we suspect
influences from another variable z, the partial correlation coefficient of x and y when
controlling for z is given by the formula:

rxy.z = (rxy - rxz*ryz) / √[(1 - rxz2)(1 - ryz2)]
As an example, let us assume that we are interested in studying the correlation


between employees’ performance, the amount of unpaid overtime they put in at work, and

22 SPSS: Analyze => Correlate => Partial



their fear of being fired. Let us suppose that various instruments measured all 3
variables for an organization’s workforce, and we got correlation coefficients rop = 0.2
for overtime versus performance, rof = 0.8 for overtime versus fear of being fired, and
rpf = -0.4 for performance versus fear of being fired. If we only studied the correlation
between overtime and work performance (small rop) it would appear that there is no
correlation between the two while logic would suggest that the more time and effort
one invests on job related tasks the better they will become. However, if we were to
consider the influence (control in statistical lingo) of the fear of being fired we can see
that the greater the fear of being fired, the more overtime employees put in (large rof), and
that the more this fear paralyzes them, the less they perform (negative rpf). If we
remove the factor of fear by applying the partial correlation formula, we get rop.f = 0.95,
which clearly supports what we would normally expect – overtime improves performance.
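
The arithmetic of this example can be checked with a few lines of Python; the sketch below simply plugs the three correlations of the (fictitious) example into the partial correlation formula.

    import math

    r_op, r_of, r_pf = 0.2, 0.8, -0.4   # overtime-performance, overtime-fear, performance-fear
    r_op_f = (r_op - r_of * r_pf) / math.sqrt((1 - r_of ** 2) * (1 - r_pf ** 2))
    print(round(r_op_f, 2))   # 0.95: overtime vs performance, controlling for fear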

4.2.6 Many Non-Parametric Scale Variables


The case of many non-parametric variables adds difficulties to their analysis to
the point that very few methods address their complexity. When it comes to
comparing single values of the different variables one could only really compare their
position with respect to the median and quartiles. The Wilcoxon’s Rank Sum test can
also be applied here when considering the different variables in pairs. In the case,
though, that we are interested in comparing all the variables as a set, a more appropriate
test is the Kruskal-Wallis H test (often called “one-way ANOVA on ranks”).
The corresponding statistic has a distribution that approximates the chi-square
distribution (section 3.3.2) for k-1 degrees of freedom (k is the number of variables
compared) and is given by the formula:

H = [12 / (N(N+1))] * (S12/N1 + S22/N2 + … + Sk2/Nk) - 3(N+1)

Ni and Si are the value counts and the sum of the ranks of the ith variable after
we order them (similar to the Wilcoxon process). N is the total of all value counts (N
= N1 + N2 + … + Nk). Let us consider as an example the data for three variables
in Table 4.9. After we rank-order them (first column in Table 4.10) we calculate the
sums of the orderings of each variable. Note that when multiple entries of the same
value exist across variables (like 5, 6, and 8 that appear twice in Table 4.10), the average
of their orders is assigned instead. By applying the formula for H (for N1 = 6, N2 = 7,
N3 = 8, and N = 21) we get H = 5.22.

Table 4.9 Non-parametric data
X1: 4, -3, -7, -5, 12, -6
X2: 5, -2, 6, 8, -9, -10, -4
X3: 11, 6, 5, 7, 13, 2, 8, 1

Table 4.10 Ranking process for the H (and Friedman) statistics
X-ordered:       -10  -9  -7  -6  -5  -4  -3  -2   1   2   4   5     5     6     6     7    8     8     11   12   13
Assigned order:    1   2   3   4   5   6   7   8   9  10  11  12.5  12.5  14.5  14.5  16   17.5  17.5  19   20   21
X1 orders: 11, 7, 3, 5, 20, 4                      S1 = 50      S12 = 2500      S12/N1 = 417
X2 orders: 12.5, 8, 14.5, 17.5, 2, 1, 6            S2 = 61.5    S22 = 3782.3    S22/N2 = 540
X3 orders: 19, 14.5, 12.5, 16, 21, 10, 17.5, 9     S3 = 119.5   S32 = 14280.3   S32/N3 = 1785
Total (S12/N1 + S22/N2 + S32/N3) = 2742

For the interpretation of the H statistic we need to also consider the variability
(shape of frequency curves) of the variables we study. If they have similar shapes
(Figure 4.17.a) then the test can provide a realistic comparison of the medians for the
different groups. However, if the distributions have different shapes (Figure 4.17.b)
then the test can only be used to compare mean ranks/orders (instead of the means
of the variables). In the case of our example, the statistic for the corresponding k-1
degrees of freedom (2 in our case) falls between the values for the 0.1 and 0.05
probabilities in the chi-square table of Figure 3.28. This means that if we are interested
in differences as rare as 10% of the population, then our sample does belong in this
category (something has happened in some of the variables that drastically
differentiates them from the others — maybe one drug was more effective). If, on the
other hand, we are interested in differences as rare as 5% of the population, then our
sample does not qualify as such. Depending on the shape of the variable distributions,
by “difference” in the previous sentences we will either refer to the medians (Figure
4.17.a) or the mean ranks (Figure 4.17.b) of each variable.
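
As a cross-check, scipy’s implementation of the Kruskal-Wallis test can be run on the Table 4.9 data; note that scipy applies a small correction for tied ranks, so its H value can differ slightly from the hand calculation above.

    from scipy import stats

    x1 = [4, -3, -7, -5, 12, -6]
    x2 = [5, -2, 6, 8, -9, -10, -4]
    x3 = [11, 6, 5, 7, 13, 2, 8, 1]

    # Kruskal-Wallis H test on the three independent groups.
    h_stat, p_value = stats.kruskal(x1, x2, x3)
    print(h_stat, p_value)   # H close to 5.2, p a little above 0.05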

Figure 4.17 Similar and dissimilar distributions

In situations where there is a one-to-one relationship between the variables we


study (like when the same population is exposed to, say, three different treatments), a
better comparison of all variables as a set can be achieved by Friedman’s rank test
(Fr-test)23. The corresponding statistic has a distribution that approximates the chi-
square distribution (section 3.3.2) for k-1 degrees of freedom and is given by the
formula:

Where N is the sample size, k is the number of variables we test and S are the
sum of the ranks of each value after we order them (similar to the Wilcoxon process).
Consider as an example the data in Table 4.9 but without the last value of X2 (-4) and
the last two values for X3 (8 and 1) so they all have an equal number of entries. After
following the same ranking process as for H-statistic and applying Friedman’s formula,

23SPSS: Analyze => Nonparametric => Related Samples => follow up with Graph
Builder => Boxplots

we will get Fr = 1674.8. This statistic is far higher than the corresponding values
for k-1 degrees of freedom (2 in our case) in the chi-square table of Figure 3.28, indicating
that the medians of the variables vary significantly. Apparently, there is something in the
three sets of values (like maybe an effective drug treatment) that distinguishes them from each other.
The final case of non-parametric variables we will discuss here is dichotomous
variables. These are variables that can take only one of two values like heads or tails,
yes or no, “0” or “1”, success or failure, etc., implying that the binomial distribution
will be involved. The suggested method in such cases is binomial logistic regression
or simply logistic regression24. We apply this method when the dichotomous variable
is in the role of the dependent variable, while the independent variables can be normal-
like or non-parametric. In that sense, it is like the mixed-case of variable types. In
logistic regression, instead of predicting y from the values of x, we predict the
probability of y occurring. For that, the model works with the logarithm of the odds
of y occurring (much like the log transform we applied earlier to make data normal).
This keeps the right-hand side of the equation a linear combination of the predictors
without being affected by the non-linearity of the relationship between the x values
and the probability itself. The logistic regression equation takes the form:

P(y) = 1 / (1 + e^-(a0 + a1x1 + a2x2 + … + anxn))
Given that the results of the equation are probabilities, their values will be
between 0 and 1. A value close to zero means that the y value is unlikely to have
occurred, while the opposite is true the closer we get to 1. The statistical software we
will use is going to produce the values of the coefficients using a maximum-likelihood
estimation. As with the normal regression where we used R2 to evaluate the closeness
of our predicted values to the observed ones, we use a specialized statistic here called
log likelihood, given by the formula:

LL = Σ [ yi*ln(P(yi)) + (1 - yi)*ln(1 - P(yi)) ]

It can be seen that the formula is based on sums that compare the predicted
probabilities with the actual outcomes and, as with R2, it is an indicator of the variance that the model

24 SPSS: Analyze => Regression => Binary Logistic =>…one dependent, many independent

explains. Oftentimes we use the log likelihood to test different models. One such
model (assumed as the baseline model) is when only the constant a0 of the polynomial
is considered and all other coefficients are zero. By evaluating the differences between
the suggested and the baseline model we get the chi-square metric:

χ2 = 2*[LL(suggested model) - LL(baseline model)]
As degrees of freedom, df in this case, we take the difference between the


degrees of freedom of the new and the baseline models. By locating our chi-square
value in the chi-square distribution table of Figure 3.28 we can find the probability
past which the model becomes rare enough to be “interesting”.
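
A hedged Python sketch of this workflow with statsmodels is shown below; the data are simulated only so the example runs, and the column names are arbitrary.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Simulated data with a dichotomous outcome y and two predictors.
    rng = np.random.default_rng(2)
    df = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
    p = 1 / (1 + np.exp(-(0.5 + 1.2 * df["x1"] - 0.7 * df["x2"])))
    df["y"] = rng.binomial(1, p)

    model = smf.logit("y ~ x1 + x2", data=df).fit()
    print(model.llf)      # log likelihood of the fitted model
    print(model.llnull)   # log likelihood of the constant-only (baseline) model
    print(2 * (model.llf - model.llnull), model.llr_pvalue)  # chi-square and its p-value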

4.2.7 Nominal Variables


Having considered the case of scale variables for one sample, we move on now
to discuss the case of nominal/categorical variables. These types of variables have no
inherent ordering of any form that could allow for the organization/ordering of their
values along some continuum, so we can only rely on their frequencies and by
extension their proportions in the population. This is not really limiting as we have
seen at the beginning of the chapter when we discussed CLT and saw how to associate
these proportions (in the role of the mean) with the population proportion and the
standard deviation (formula 4.3). We will start here with the case of a single variable,
we will then proceed as before with the case of two variables and conclude with the
case of many variables. Combinations of nominal and scale variables will be dealt with in
the sections on two and many samples later on.
For the case of one nominal variable let us consider the example of the car
color preferences (replicated in the first two columns of Table 4.11). We are interested
here to find out whether our sample data have any significance compared to
hypothetical values for the frequencies or proportions. These hypothetical values
comprise the model that we might want to prove for the property (car color) we study.
In other words, we want to see if our data fit the model we have in mind. For this
reason, the test we are about to present is called “goodness of fit” (also appears as
Pearson’s chi-square)25. Preconditions for the test are that the data are counts
(meaning positive integer values), and there are at least five entries for each attribute.

25SPSS: Analyze => Non Parametric => One Sample => Fields => Settings =>
Chi-square

The process of building the metric (Table 4.11) is similar to what we did with
continuous variables by considering their residuals, that is the differences between
what we observe in our sample from what we expect them to be in the population
according to the model we adopted. To avoid cancelation of the differences when we
sum the residual (some are negative, and some are positive), we end up raising each
one to the square. Because these squares will become higher the more values we have,
it is best at this point to use their relative differences, so we consider the divisions of
the residuals’ squares by their corresponding expected values. The sum of these final
values (referred to as components) is the statistic we call chi-square (symbolized as
χ2) and its distribution follows the chi-square distribution we saw in section 3.3.2.
Coming back to our example values of Table 4.11, we will assume the simplest
possible model, and that is that every color in our list has equal probability of
appearing in the preference list. Given that there are 9 categories/colors (we consider
“Other” as a distinct category), an equal-probability distribution would assume each
color to be chosen by 1/9 of the 500 respondents, or 55.56 of them. Table 4.11 outlines (from
left to right) the process of building the chi-square statistic according to the description
provided in the previous paragraph.
As we can see, in the car color case χ2 = 236.2. Considering the degrees of
freedom k as equal to the number of categories minus one (N-1 = 8 in our case), we
can see from the chi-square values of the table in Figure 3.28 that our value (236.2) is
higher than any of the entries in the table row with k = 8, indicating that the probability
of our sample approaching the model (equal probabilities for all colors) is less than
1‰. In other words, there is no real chance (at least higher than 1‰) based on our
findings that the various car colors are chosen with equal probabilities. This is quite
realistic as for example, we rarely, if ever, see any pink cars on the street.
Table 4.11 Nominal data for car color example
Car Color   Observed Frequency   Expected Frequency   Residual (Obs-Exp)   Residual2 (Obs-Exp)2   Component (Obs-Exp)2/Exp
White 115 55.56 59.44 3533.64 63.61
Silver 90 55.56 34.44 1186.42 21.36
Black 105 55.56 49.44 2444.75 44.01
Gray 70 55.56 14.44 208.64 3.76
Blue 30 55.56 -25.56 653.09 11.76
Red 40 55.56 -15.56 241.98 4.36
Brown 30 55.56 -25.56 653.09 11.76
Green 5 55.56 -50.56 2555.86 46.01
Other 15 55.56 -40.56 1644.75 29.61
SUM 500.00 500.00 0.00 13122.22 236.20
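
A minimal Python cross-check of the Table 4.11 calculation is given below; scipy’s chisquare function defaults to equal expected frequencies, which is exactly the model assumed here.

    from scipy import stats

    observed = [115, 90, 105, 70, 30, 40, 30, 5, 15]   # Table 4.11 frequencies
    chi2, p_value = stats.chisquare(observed)          # expected = 500/9 per color
    print(chi2, p_value)                               # about 236.2, p far below 0.001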

While the chi-square test can give us an indication of the significance of an


association of nominal and/or ordinal variables it does not tell anything about its
strength. In this case Cramer’s V26 (the square root of chi-square divided by the product of the
sample size and the degrees of freedom, taken as the smaller of rows-1 and columns-1) is a statistic that can be used to measure the strength of the
association, producing a value between zero (weak association) and one (strong
association). In the case of a 2x2 data matrix the Phi value is often cited (same formula
as Cramer’s V).
When sample sizes are too large (thousands of entries), a better alternative to
chi-square is the G-test of goodness of fit (also called likelihood ratio test or log-
likelihood ratio test or G2 test). This test uses the logarithm of the ratio of the
observed over the expected values (called likelihood ratio) as a basis to form its
statistic. Each logarithm then is multiplied by its observed value and the sum of all of
them is multiplied by 2. The last step is necessary so that the distribution resembles
the chi-square distribution. Using the entry data of Table 4.11 (first three columns),
Table 4.12 outlines the process of the G-test. The sum in this case becomes 128.31,
which when multiplied by 2 will give us the value of the statistic G = 256.62. We can
see that it is quite close to the 236.2 value of the chi-square statistic. After calculating
G, we follow the same steps as with the chi-square test (same degrees of freedom) to
deduce whether our result is significant or not.

Table 4.12 G-test statistic process


Car Color   Observed Frequency   Expected Frequency   Obs/Exp   ln(Obs/Exp)   Obs*ln(Obs/Exp)
White 115 55.56 2.07 0.73 83.67
Silver 90 55.56 1.62 0.48 43.42
Black 105 55.56 1.89 0.64 66.84
Gray 70 55.56 1.26 0.23 16.18
Blue 30 55.56 0.54 -0.62 -18.49
Red 40 55.56 0.72 -0.33 -13.14
Brown 30 55.56 0.54 -0.62 -18.49
Green 5 55.56 0.09 -2.41 -12.04
Other 15 55.56 0.27 -1.31 -19.64
SUM 128.31
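
For the same data, the G-test can be reproduced with scipy’s power_divergence function, as in the minimal sketch below.

    from scipy import stats

    observed = [115, 90, 105, 70, 30, 40, 30, 5, 15]
    g_stat, p_value = stats.power_divergence(observed, lambda_="log-likelihood")
    print(g_stat, p_value)   # G about 256.6, p far below 0.001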

26
SPSS: Analyze => Descriptive Statistics => Crosstabs => Statistics: Select
Chi-square, Cramer’s V, and Phi

We will see now what we can do when we have two variables that could
potentially relate to each other. Notice we are avoiding the word correlate here. Our
variables are nominal and as such they have no ordering so there is no sense that one
increases or decreases as we had in correlation and regression. To demonstrate the
case, we will use the same example of the car color but only for the top three popular
colors. We will consider an additional variable that will be the geographic location. The
attributes of this variable will include North America, Europe, Asia-Pacific, and rest
of the world as a single category. Table 4.13 (ignoring the yellow cells) includes the
data for our car color example spread among four car color options (White, Silver,
Black, and Other) and four geographic locations. Such tables are generally called
contingency tables or cross-tabulations27. They are quite popular in exploratory
analysis and typically display the frequency of observations along with the
corresponding percentage of each entry with respect to its group (the percentages in Table 4.13).
For the color Silver, for example, the row total is 90, so the percentage of people from
North America who prefer Silver is given by 14/90 or 15.6%. It goes like this for
the rest of the entries, except for the Sum column, where each color total is expressed as
a percentage of the sample size of 500. For the sum of the color Silver, its percentage of the sample total (500)
is 90/500 or 18%, and so on for the other color sums. Typically, contingency
tables are displayed with their corresponding bar chart (Figure 4.18).

Table 4.13 Contingency table


Car Color   North America   Europe   Asia-Pacific   Rest of the World   Sum
White 28 28 25 34 115
24.3% 24.3% 21.7% 29.6% 23%
Silver 14 13 13 50 90
15.6% 14.4% 14.4% 55.6% 18%
Black 20 24 22 39 105
19.0% 22.9% 21.0% 37.1% 21%
Other 36 37 41 76 190
18.9% 19.5% 21.6% 40.0% 38%
Sum 98 102 101 199 500

27
SPSS: Analyze => Descriptive Statistics => Crosstabs

While contingency analysis can provide a wealth of information about the


frequencies and their corresponding percentages across the two variables, it doesn’t
directly show a potential dependency between them. Given the nature of the variables
(categorical), the only thing we can really do here is test whether our variable
values/frequencies have any kind of relationship to hypothetical ones or as they are
called here expected values. In our car color example, we might be interested to see
if the color preferences show any dependence on the geographic region or if they are
the same across geographic regions. This type of test is called chi-square test for
independence (also appears as Pearson’s chi-square) and we can best demonstrate
this through an example.

Figure 4.18 Bar chart of contingency table

By assuming a homogeneous distribution of the car color preferences across


regions, we can build the expected frequencies (see Table 4.15) as follows. For the
color White, we see that the total across all geographic locations is 115 (Table 4.14 is
a copy of Table 4.13 without the percentages). Such totals are also called marginal
frequencies (they are at the margin of the tables). Given that we have a total of 500
values, the probability of observing the White color is 115/500 or 23%. For a
homogenous distribution, this percentage will have to be the same across all regions.
So, for North America the expected frequencies would be 23% of 98 (equals 22.54),
for Europe it would be 23% of 102 (equals 23.46), for Asia-Pacific it would be 23%
of 101 (equals 23.23), and for the rest of the world would be 23% of 199 (equals 45.77).
The process continues for the rest of the colors until Table 4.15 is completed with
expected values in each cell.

Table 4.14 Observed frequencies
Car Color   North America   Europe   Asia-Pacific   Rest of World   Sum
White   28   28   25   34   115
Silver   14   13   13   50   90
Black   20   24   22   39   105
Other   36   37   41   76   190
Sum   98   102   101   199   500

Table 4.15 Expected frequencies
Car Color   North America   Europe   Asia-Pacific   Rest of World
White   22.54   23.46   23.23   45.77
Silver   17.64   18.36   18.18   35.82
Black   20.58   21.42   21.21   41.79
Other   37.24   38.76   38.38   75.62

Table 4.16 (Observed - Expected)2 / Expected
Car Color   North America   Europe   Asia-Pacific   Rest of World   Sum
White   1.323   0.879   0.135   3.027   5.363
Silver   0.751   1.565   1.476   5.613   9.405
Black   0.016   0.311   0.029   0.186   0.543
Other   0.041   0.080   0.179   0.002   0.302
χ2 = 15.61

Table 4.17 (Observed - Expected) / √Expected
Car Color   North America   Europe   Asia-Pacific   Rest of World
White   1.150   0.937   0.367   -1.740
Silver   -0.867   -1.251   -1.215   2.369
Black   -0.128   0.557   0.172   -0.432
Other   -0.203   -0.283   0.423   0.044

By calculating (Observed - Expected)2/Expected for the component for each


cell (as we did in Table 4.11), we end up with Table 4.16. The sum of all components
is our chi-square statistic. In this case, we get x2 = 15.6. To find the corresponding
probability in the chi-square table (table of Figure 3.28), we need the degrees of
freedom. In the case of two variables the degrees of freedom can be found by the
product (N1-1)*(N2-1), where N1 is the number of attributes of one of the variables like car color
(4 in this case) and N2 is the number of attributes of the second variable — like location (also 4 in
our case). Based on the previous, the degrees of freedom are (4-1)*(4-1) = 9. From the
table of Figure 3.28, for k = 9 we can see that our statistic falls between the corresponding
values for p = 0.10 and p = 0.05. This means that the departures from the homogeneous
model that provided the expected values are not rare enough to be significant at the
0.05 level (although they would be at the 0.10 level). In general, we can say here that the
choice of color is not really dependent on the geographic region and that neutral colors
like White, Silver, and Black have a more or less universal appeal.
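
The whole test can be cross-checked in one call with scipy, as in the minimal sketch below (the matrix holds the observed frequencies of Table 4.14).

    from scipy import stats

    observed = [[28, 28, 25, 34],
                [14, 13, 13, 50],
                [20, 24, 22, 39],
                [36, 37, 41, 76]]
    chi2, p_value, dof, expected = stats.chi2_contingency(observed)
    print(chi2, dof, p_value)   # about 15.6 with 9 degrees of freedom, p about 0.08
    print(expected)             # the expected frequencies of Table 4.15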
As with the regression, it is worth studying here the residuals that are now
expressed as the difference between the observed and expected values for each cell
(remember in proportions every cell is like its own sampling distribution). Because in
this case the cell counts can be large, we standardize/scale them to ensure their mean
(if each cell was a distribution in itself) is zero and their standard deviation is the square
root of the expected frequency. This process produces the residuals which in our case

are also the square roots of the components but with the sign of the subtraction
(observed-expected) preserved. The standardized residuals of our example are shown
in Table 4.17. The standardization process results in expressing the distance in terms
of standard deviations. As such, the close to zero values in our table suggest closeness
to the mean and support for the findings of the chi-square test. Another case for the
chi-square test will appear in section 4.3 for two and many samples with the only
difference being the name. In those cases, it will be called chi-square test for
homogeneity. This is mentioned here for completeness purposes.
When the values of the cells in cross-tabulation become large (into thousands),
a better alternative to the chi-square test of independence (as in the case of the chi-
square goodness of fit test) is the G-test of independence. The math is similar to the
G-test of goodness of fit with the only difference being that the expected frequencies
are calculated based on the observed frequencies (similar to the chi-square test of
independence). Considering the values of Table 4.14 and Table 4.15, the G-test
process is displayed in Table 4.18. As a last step we need to multiply by 2 the final sum
(7.766) to get the statistic. We will end up with G2 = 15.531, which is almost the same
as the x2 value of 15.613 we got before.
Table 4.18 Observed*ln(Observed/Expected)
Car Color   North America   Europe   Asia-Pacific   Rest of the World   Sum
White 6.074 4.953 1.836 -10.107 2.756
Silver -3.236 -4.488 -4.360 16.676 4.592
Black -0.572 2.729 0.805 -2.695 0.268
Other -1.219 -1.719 2.707 0.381 0.150
Sum 7.766

While the chi-square and G-tests we mentioned here pretty much cover all
possible situations when two nominal variables are involved, there is the special case
when the two variables are paired where the McNemar test might be a better
alternative. This pairing is expressed as a dichotomous variable in the role of the
dependent variable like when we measure the impact of an intervention (“low” and
“high” or “success” and “failure”) compared to an alternative. For example, consider
that we have two promotional campaigns for a product — one with just the product
and the other with the product and an offer. We want to test to see if there are
differences in their effectiveness. Let’s assume we have 400 participants for our study.
We create 200 pairs and we expose one of the members in each pair to the plain product
campaign and the other to the product plus offer campaign. The participants record

with “Yes” or “No” whether they liked the campaign they were exposed to. Their data
are displayed in Table 4.19.
McNemar’s formula focuses only on the cells with one “Yes” and one “No”
(the discordant cells) and considers the square of their difference over their sum. In our
case it takes the form (54-48)2 / (54+48), resulting in 0.35 for its statistic. Looking at
the chi-square table of Figure 3.28 with (2-1)*(2-1) = 1 degree of freedom, we see that
the statistic corresponds to a probability of roughly 0.55, far above any conventional
significance threshold, so we can safely conclude there is no significant difference
between the two campaigns with respect to customer choices.
Table 4.19 McNemar test data
Product
and Offer
Campaign
Yes No
Product Yes 65 54
Campaign No 48 33
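
A hedged sketch of the same test with statsmodels follows; the correction flag is switched off so the statistic matches the hand formula above.

    from statsmodels.stats.contingency_tables import mcnemar

    table = [[65, 54],
             [48, 33]]   # Table 4.19
    result = mcnemar(table, exact=False, correction=False)
    print(result.statistic, result.pvalue)   # about 0.35 and about 0.55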

A problem with chi-square that was not mentioned before is that it does not
handle well situations where the cell entries are small (typically below 5). In these cases,
a better alternative is Fisher’s exact test of independence28. The hypothesis being
tested is that the variables are independent (one does not suggest the other). Unlike
most tests that develop a mathematical formula for calculating a statistic, here we
calculate directly the probability of getting the observed data among all possible
arrangements of values the variables can take.
Consider the situation of Table 4.20 with the values of two nominal variables.
The first variable has two attributes (Attr.11 and Attr.12) and so does the second
(Attr.21 and Attr.22). n11, n12, n21, and n22 are the observed frequencies, while the
rest of the value entries include the corresponding sums. The probability that Fisher’s
test calculates is given by considering the factorials (remember n! = 1*2*3*…*n) in
the formula:

p = [(n11+n12)! * (n21+n22)! * (n11+n21)! * (n12+n22)!] / [N! * n11! * n12! * n21! * n22!]

where N = n11 + n12 + n21 + n22 is the total count.

28 SPSS: Analyze => Descriptive Statistics => Crosstabs => Exact => Asymptotic

This will give us the p values for the corresponding arrangement of values
(Table 4.20). By creating all possible arrangements of values, we end up with a
distribution of probabilities similar to what we achieved when we considered the
example of the dice in section 3.3.

Table 4.20 Fisher’s exact test


Variables Attr.11 Attr.12
Attr.21 n11 n12 n11+n12
Attr.22 n21 n22 n21+n22
n11+n21 n12+n22 Sum

Table 4.21 Rare car color frequencies


Car Color   North America   Europe   Sum
Pink 4 2 6
Yellow 5 7 12
Sum 9 9 18

Table 4.22 Possible value arrangements
Number   Pink (NA, Europe)   Yellow (NA, Europe)   p
1   (0, 6)   (9, 3)   0.005
2   (1, 5)   (8, 4)   0.061
3   (2, 4)   (7, 5)   0.244
4   (3, 3)   (6, 6)   0.380
5   (4, 2)   (5, 7)   0.244
6   (5, 1)   (4, 8)   0.061
7   (6, 0)   (3, 9)   0.005

To illustrate the application of the method let us consider the car color
example with some rare car colors (pink and yellow) for North America and Europe

only. Table 4.21 shows the frequencies for these colors for the two regions we study
along with the sums of the corresponding rows and columns. By applying Fisher’s
formula with the factorials, we get p = 0.244. We need to compare this with all possible
arrangements (permutations) of the cell values that exist that also preserve the values
of the totals for each row and column. One easy way to find out how many there are
is to consider the lowest marginal sum (in this case 6), start with the lowest-frequency
cell (in this case the Pink color for Europe) by assigning it the
maximum allowable value (6 in this case), and create all possible arrangements by
reducing it by one until it becomes zero. All possible and allowable
(first and second line sums result in 6 and 12 respectively, and first and second column
sums result in 9) permutations are listed in Table 4.22 with their corresponding p
values on the side.

Figure 4.19 Distribution of p values

If we now plot the p values we got for the various arrangements (like we did
for Figures 3.18 and 3.19), we end up with the probability distribution of the Fisher’s
test values (Figure 4.19). With the p value of the observed arrangement (Table 4.21 –
also replicated as arrangement 5 in Table 4.22) we can calculate the area under the
curve for the one-tail (not recommended) or two-tail (preferable for Fisher’s test)
probability by simply adding the probabilities we found. In this way, we can say that the
probability of observing an arrangement of frequencies as rare as or rarer than the one we have
is 0.31 (adding the p values of arrangements 5, 6, and 7). For the two-tail case (preferred) we
can either double that value or add the symmetric regions under the curve (Figure
4.19).
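
The same two-tail probability can be obtained directly from scipy, as in the minimal sketch below.

    from scipy import stats

    table = [[4, 2],
             [5, 7]]   # Table 4.21
    odds_ratio, p_value = stats.fisher_exact(table, alternative="two-sided")
    print(p_value)   # about 0.62, in line with the two-tail reasoning above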

The final subject we will deal with in the nominal variables case is the multi-
variables situation (3 and above). We will discuss here the 3 variables case and leave it
for the reader to extrapolate to more variables. As one would expect, what was
developed in the previous paragraphs can be applied here, so performing chi-square
goodness of fit or G-test is applicable exactly as we did before. Some interesting
alternatives, though, of the multi-nominal variable situation are worth investigating
and will be discussed here.
One such alternative is when we study sections of the data in isolation of the
effect of some of the variables. In essence, we reduce the dimensionality of our tables
by “ignoring” some variables. As an example, for the 3 nominal variables we will
consider the car color example we’ve seen before (Table 4.13) but with the addition of
gender as another variable along with car color and region. Our contingency table will
now take the form of Table 4.23. If these were our original data instead of the data we
used up to now, then Table 4.13 was nothing more than a reduction of the three-
dimensional Table 4.23 (with variables car color, region, and gender) to a two-
dimensional Table 4.13 (with variables car color and region). Such reduced tables are
called partial tables.

Table 4.23 Observed frequencies for three nominal variables


Car Color   Gender   North America   Europe   Asia-Pacific   Rest of World   Sum
White Male 16 10 15 18 59
Female 12 18 10 16 56
Silver Male 9 7 5 27 48
Female 5 6 8 23 42
Black Male 12 16 12 16 56
Female 8 8 10 23 49
Other Male 20 20 20 30 90
Female 16 17 21 46 100
Sum 98 102 101 199 500

We are interested to see if there is a relationship/association between the three


variables. Given that the gender variable is dichotomous (only male and female
options), we will “freeze” it for each of its values and create the corresponding partial
tables for ‘male’ and ‘female’. If we assume that Gender = Male then Table 4.23
becomes Table 4.23.a, while for Gender = Female it becomes Tables 4.23.b. If we add
these two tables, we get what is called a marginal table (Table 4.24). This is how our
data will look if we were to ignore gender — of course this would be Table 4.23 that
we’ve used all along.

Table 4.23.a Males observed frequencies


Car Color   North America   Europe   Asia-Pacific   Rest of World   Sum
White 16 10 15 18 59
Silver 9 7 5 27 48
Black 12 16 12 16 56
Other 20 20 20 30 90
Sum 57 53 52 91 253

Table 4.23.b Females observed frequencies


Car Color   North America   Europe   Asia-Pacific   Rest of World   Sum
White 12 18 10 16 56
Silver 5 6 8 23 42
Black 8 8 10 23 49
Other 16 17 21 46 100
Sum 41 49 49 108 247

Table 4.24 Marginal table for gender


Car Color   North America   Europe   Asia-Pacific   Rest of World   Sum
White 28 28 25 34 115
Silver 14 13 13 50 90
Black 20 24 22 39 105
Other 36 37 41 76 190
Sum 98 102 101 199 500

The advantage of using partial tables is that they reduce a higher-dimensional problem
(three dimensions in Table 4.23) to a lower-dimensional one (two dimensions in
Tables 4.23.a, 4.23.b, and 4.24). This way we can apply whatever tests
we had available for the two-dimensional tables like chi-square test and G-test. If the
tests show that the values in any of the tables are “rare” enough we can conclude there
is potentially some association between color, region, and gender that could be further
investigated.
Caution is required here as we might occasionally run into what is known as
Simpson’s Paradox. This arises when the partial tables support an association in one
direction while the marginal table supports an association in the opposite direction.
Consider the example of 789 individuals (515 male and 274 female) who are asked to
choose between two rare car color choices, Sarcoline and Mikado. The partial tables

for male and female participants are shown in Tables 4.25.a and 4.25.b, while the
marginal table (the sums) is shown in Table 4.26.

Table 4.25.a Observed frequencies for males
Color   North America   Europe   Ratio (NA/Europe)
Sarcoline   80   400   0.20
Mikado   7   28   0.25

Table 4.25.b Observed frequencies for females
Color   North America   Europe   Ratio (NA/Europe)
Sarcoline   8   100   0.08
Mikado   16   150   0.11

Table 4.26 Observed frequencies, males + females
Color   North America   Europe   Ratio (NA/Europe)
Sarcoline   88   500   0.18
Mikado   23   178   0.13

The aforementioned tables also list the ratio of the North American to the European
counts for each color. From Table 4.25.a we can see that the ratio of North
Americans over Europeans that prefer Sarcoline is smaller than the corresponding
ratio for Mikado. If we look at Table 4.25.b, we see that the same is true for the female
population. Naturally, one would expect that when we create the marginal table from
the addition of the values of the male and female tables we would observe the same
analogy. Surprisingly though, we can see from Table 4.26 that the ratio of North
Americans over Europeans who prefer Sarcoline is higher than the corresponding
ratio for Mikado. The example here is meant to showcase the Simpson’s Paradox and
alert researchers to be careful when drawing conclusions from the marginal tables.
Having seen the case of partial and marginal tables we will discuss now an
analog to the regression we saw with scale variables. This is possible in situations where
the variable in the dependent role is the cell frequency (a count) and the explanatory
variables are categorical. In this case, we would want to do something similar to ANOVA
or multiple regression, but instead of the continuous variables x we will now have the
nominal variables (something like Frequency = a0+a1Color+a2Region+a3Gender for the multiple
regression case of Table 4.23). We usually include an error term also in such equations
but for simplicity we will ignore it here.
In order to include influences of individual variables and cross-influences
between variables the general linear model for the 3-variables case becomes:

Frequency =a0+a1Color+a2Region+a3Gender
+a4Color*Region+a5Color*Gender+a6Region*Gender
+a7Color*Region*Gender
The products between the variables will attempt to capture interaction effects
between them. The problem with the aforementioned regression equation is that our
variables cannot be expressed as continuous variable like say White = 1, Silver = 2,
Black = 3, and Other = 4 and similarly for Region and Gender. Given that the different
categories are mutually exclusive (in each cell in Table 4.23 there is only one color and
not multiple ones), we can consider binary representations for the existence of a color
(indicated with the value of 1) and its absence (indicated with the value of 0). The
numeric variables of this type are called dummy variables.
For the car color White we set it as 1 when there are no other colors and 0
when any of the other colors exist. We do the same for Silver, Black, and Other. We
treat the regions and gender similarly. Table 4.27 shows all the possible combinations
of values that exist for the 3 variables.

Table 4.27 Dummy variable representation


Car color White Silver Black Other
White 1 0 0 0
Silver 0 1 0 0
Black 0 0 1 0
Other 0 0 0 1

Region   North America   Europe   Asia-Pacific   Rest of the World
North America   1   0   0   0
Europe 0 1 0 0
Asia-Pacific 0 0 1 0
Rest of World 0 0 0 1

Gender Male Female


Male 1 0
Female 0 1

The regression equation will then take the form:



Frequency = a0+a1White+a2Silver+a3Black+a4Other
+a5Male+a6Female
+a7NorthAmerica +a8Europe +a9AsiaPacific+a10RestOfTheWorld
+a11WhiteMale + ……all possible interaction terms
Because we are considering linear relationships we need to use the natural
logarithms of the frequencies instead of the actual values (like in logistic regression).
When considering prediction, we just insert the appropriate dummy variables for the
case we are interested in the regression equation to get the predicted frequencies. For
example, if we are interested in the car color Black in Europe for Females then
according to the dummy variables representation (Table 4.27) for each entry the
regression equation will become:
ln(Frequency) = a0+a1*0+a2*0+a3*1+a4*0
+a5*0+a6*1
+a7*0+a8*1 +a9*0+a10*0
+a11*0*0 + ……all possible interaction terms
Apparently, the only terms that contribute are the ones that include Black,
Europe, and Females simply because they are 1 while everything else is zero. To avoid
collinearities that the constant term might introduce as it never disappears, we usually
consider dropping one of the attributes (usually the one with the most values). This
way the constant term can be set to represent the influence of the category we dropped.
It should be evident by now that the situation can increase in complication the more
variables we consider. This should not be a problem since this is dealt with by most
statistical software packages. The interested reader can find more on what we
discussed in the extant literature (Internet).
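
In practice, a model of this kind is commonly fitted as a Poisson regression of the cell frequencies on the dummy-coded factors, where the log link plays the role of the natural logarithm discussed above. The sketch below is one hedged way of doing this in Python with statsmodels for the Table 4.23 counts; the lower-case column names are invented for the illustration, and only the main effects and two-way interactions are requested to keep the output manageable.

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Rebuild Table 4.23 in long format: one row per color/gender/region cell.
    regions = ["NorthAmerica", "Europe", "AsiaPacific", "RestOfWorld"]
    counts = {  # male frequencies first, then female, in region order
        "White": [16, 10, 15, 18, 12, 18, 10, 16],
        "Silver": [9, 7, 5, 27, 5, 6, 8, 23],
        "Black": [12, 16, 12, 16, 8, 8, 10, 23],
        "Other": [20, 20, 20, 30, 16, 17, 21, 46],
    }
    rows = []
    for color, values in counts.items():
        for g, gender in enumerate(["Male", "Female"]):
            for r, region in enumerate(regions):
                rows.append({"color": color, "gender": gender,
                             "region": region, "freq": values[g * 4 + r]})
    df = pd.DataFrame(rows)

    # Poisson log-linear model: statsmodels creates the dummy variables and
    # drops one reference level per factor automatically.
    model = smf.glm("freq ~ (C(color) + C(region) + C(gender)) ** 2",
                    data=df, family=sm.families.Poisson()).fit()
    print(model.summary())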
Expanding on the previous discussion it would be interesting to see what
happens in the case where we have a mix of nominal and scale variables. The log-linear
regression equation can be extended in that case to include the scale variables and
potential combinations with the nominal variables in the generic form:
Frequency = a0+a1Nominal1+a2Nominal2+a3Nominal1*Nominal2+a4X1+
a5X2+a6X1*X2+a7Nominal1*X1+…
While the previous method can cover every possible situation when nominal
variables are involved, there is the special case of one nominal dependent variable and
multiple independent variables that are all dichotomous. In such cases Cochran’s Q
test is the appropriate choice and we will discuss it here. In the case of the car color

example we might be interested to find out how a particular fictitious car model (let’s
say SuperStat) will sell in the various colors in each geographic region. The answers
a sample provides are recorded as “Yes” or “No” and entered in Table 4.28 as “1” or
“0”, respectively.
Cochran’s Q is computed by the formula Q = (k-1)*[k*(S12 + S22 + S32 + S42) – (S1 +
S2 + S3 + S4)2] / (k*S – ΣSum2), where k is the number of dichotomous variables (in our
case this is 4 — equal to the number of geographic regions), S1 to S4 are the column
sums, S is the grand total of “1” responses, and ΣSum2 is the total of the squared row
sums (the Sum2 column of Table 4.28). After completing the calculations, we get
Q = 3*(4*113 - 441) / (4*21 - 55) = 33/29 ≈ 1.14. Treating this as a chi-square statistic
with our example’s degrees of freedom (k-1) = 3, we can see from the table of Figure 3.28
that the Q value is well below what would be needed for significance at the 0.05 level,
indicating that the observed pattern of answers is entirely typical. The proportions of
“Yes” responses do not differ significantly across the geographic regions, at least with
respect to the color popularity of the SuperStat car.

Table 4.28 Nominal data for SuperStat car color example


Car Color   North America   Europe   Asia-Pacific   Rest of the World   Sum   Sum2
White 1 1 0 1 3 9
Silver 0 1 1 0 2 4
Black 1 1 1 1 4 16
Gray 1 0 0 1 2 4
Blue 0 1 0 1 2 4
Red 0 0 1 1 2 4
Brown 0 0 0 1 1 1
Green 1 1 1 0 3 9
Other 1 1 0 0 2 4
Sum   S1 = 5   S2 = 6   S3 = 4   S4 = 6   S = 21   ΣSum2 = 55
Sum2   S12 = 25   S22 = 36   S32 = 16   S42 = 36
Total (S12 + S22 + S32 + S42) = 113
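
The value can be double-checked with statsmodels, as in the hedged sketch below, where the rows are the nine colors of Table 4.28 and the columns the four regions.

    import numpy as np
    from statsmodels.stats.contingency_tables import cochrans_q

    data = np.array([[1, 1, 0, 1],   # White
                     [0, 1, 1, 0],   # Silver
                     [1, 1, 1, 1],   # Black
                     [1, 0, 0, 1],   # Gray
                     [0, 1, 0, 1],   # Blue
                     [0, 0, 1, 1],   # Red
                     [0, 0, 0, 1],   # Brown
                     [1, 1, 1, 0],   # Green
                     [1, 1, 0, 0]])  # Other
    result = cochrans_q(data)
    print(result.statistic, result.pvalue)   # Q about 1.14, p well above 0.05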

4.3 Two and Many-Samples Case


Having completed/“graduated” from the one-sample situation (first floor
from our house of stats in Figure 4.8), we are ready to move into the two-samples case
(Figure 4.20) and later on the multi-sample situation. As before we will deal here with

the one, two, and many variables cases for scale and nominal data types and we will
base our analysis on the methods and statistics we developed in the previous sections.
As such, when we see variables in isolation from other variables, we can apply
everything that we did as if it was the one-sample case. The interest in our two-samples
case comes from comparing variables between two samples.
Let us first deal with the one scale variable that is normally distributed. We
have in this case two sets of values. It might come to mind that we faced a similar
situation when we dealt with two different variables in the one-sample situation. The
truth of the matter is that from the math point of view the two situations are identical
so one can apply everything that we did in the one-sample two-variables case in our
current two-samples one-variable case (Figure 4.21). So, if we are interested in
comparing the means of the two-samples variables that follow the normal distribution,
we just need to apply the independent t-test or if there is a one-to-one correspondence
between the variables the paired t-test. If our variables are non-parametric, Wilcoxon’s
Rank Sum for comparing the medians will be ideal, while if there is a one-to-one
correspondence between the variables the Wilcoxon’s Signed Rank might be applied.

Figure 4.20 Two-sample situation

Moving on to the situation of two variables, in each of the two samples we


could see a similarity with the application of ANOVA for means or multiple regression

for one-to-one correspondence if the variables are normally distributed and Kruskal-
Wallis H-test for medians or Friedman’s rank test for a one-to-one correspondence if
our variables are non-parametric. The same applies to the case of many variables. We
can choose to apply the method that suits our case as we see fit (Figure 4.21).
Things become a lot simpler in the case of two samples when nominal
variables are involved simply because we can consider “sample” as a nominal variable.
In our case of two samples it could be seen as a dichotomous variable since every value
in our data set will belong to one or the other sample. Our two-variables situation
becomes three variables with the introduction of the dichotomous variable sample.
This allows us to apply everything we mentioned in the one-sample many-variables
case here in our two-samples two-variables case. Extending this process, we can also
cover the situation of the two-samples many-variables cases (Figure 4.21).
It should be evident by now that the many-samples cases would be treated
similarly by introducing the Sample (or Groups or Sets or Categories) dimension into
our analysis (Figure 4.22). For the case of many-samples with one variable, for
example, one can easily see that in essence we have two variables. One nominal that
represents the sample that each value comes from and another to represent the actual
values.

Figure 4.21 From one sample to two



Figure 4.22 From one sample to many

To demonstrate the introduction of the Sample variable we will use the data
of Table 4.8 copied here in Table 4.29.a in the form of 3-sample data. When
considering Sample as a categorical variable we can have our 3 sets of scale variables
organized as one “sample” set with one nominal and one scale variable, as shown in
Table 4.29.b. For the case of nominal data, we will consider the car color example data
of Table 4.13 copied here in Table 4.30.a in the form of 4-sample data. Table 4.30.b
shows the same data organized as a 1-sample data with two nominal variables.
The same process we followed for reducing everything to one sample can be
followed the other way around and convert categorical variables to multi-sample cases.
A set of values, for example, for a certain scale variable that were collected from a
sample population of men and women can be considered as two samples, one with
only the values that correspond to the men and one with the values that correspond
to the women. This will allow everything that we discussed in the case of two samples
to be applied here.
In most cases, it will be up to the researcher and the type of research conducted
to rule on a method’s appropriateness and suitability for the data available for analysis.
The presentation of the methods we discussed in the previous sections do not
exhaustively cover all available methods, but they cover to a great extent the data
analysis needs for most social sciences research. Chapter 6 will deal with advanced
methods for more demanding data analysis.

Table 4.29.a The Table 4.8 data in 3-sample form
Sample1: 43, 25, 52, 60, 40, 27, 36, 51, 38
Sample2: 28, 23, 33, 49, 35, 20, 40, 37, 29
Sample3: 38, 34, 62, 25, 51, 54, 47, 50, 43

Table 4.29.b The same data as one set with a nominal variable Sample and a scale variable x
Sample      x
Sample1     43
Sample1     25
Sample1     52
Sample1     60
Sample1     40
Sample1     27
Sample1     36
Sample1     51
Sample1     38
Sample2     28
Sample2     23
Sample2     33
Sample2     49
Sample2     35
Sample2     20
Sample2     40
Sample2     37
Sample2     29
Sample3     38
Sample3     34
Sample3     62
Sample3     25
Sample3     51
Sample3     54
Sample3     47
Sample3     50
Sample3     43

Table 4.30.a The Table 4.13 data in 4-sample form
Car Color   Sample1   Sample2   Sample3   Sample4
White       28        28        25        34
Silver      14        13        13        50
Black       20        24        22        39
Other       36        37        41        76

Table 4.30.b The same data as one set with two nominal variables and a Frequency variable
Car Color   Sample    Frequency
White       Sample1   28
White       Sample2   28
White       Sample3   25
White       Sample4   34
Silver      Sample1   14
Silver      Sample2   13
Silver      Sample3   13
Silver      Sample4   50
Black       Sample1   20
Black       Sample2   24
Black       Sample3   22
Black       Sample4   39
Other       Sample1   36
Other       Sample2   37
Other       Sample3   41
Other       Sample4   76
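
The wide-to-long conversion illustrated by Tables 4.29.a and 4.29.b is a one-line operation in most statistical software; a minimal pandas sketch is shown below.

    import pandas as pd

    wide = pd.DataFrame({"Sample1": [43, 25, 52, 60, 40, 27, 36, 51, 38],
                         "Sample2": [28, 23, 33, 49, 35, 20, 40, 37, 29],
                         "Sample3": [38, 34, 62, 25, 51, 54, 47, 50, 43]})

    # melt() stacks the three columns into one nominal Sample variable and one
    # scale variable x, i.e., the layout of Table 4.29.b.
    long = wide.melt(var_name="Sample", value_name="x")
    print(long)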

5 Hypothesis Testing

In quantitative research, we aim at proving assertions about the population we study based on measurements of variables in samples. These assertions have a specific
name in research and are called hypotheses (also seen in section 2.3). They are tentative
explanations based on facts our samples provide that can be tested by further
investigation. This is an important point to make as it indicates that the acceptance of
hypotheses is provisional, as new data may emerge that disprove them.
Hypotheses come in pairs and they are framed as null (symbolized Ho) and alternative (symbolized Ha or H1) hypotheses. Traditionally, the null hypothesis
refers to the state of nature where things are similar or remain the same in alignment
with the general profile of the population we study. The alternative, however, refers
to the extreme/out of ordinary/rare state of nature that has been observed (hopefully)
in our samples. Our goal is to prove that our rare state/observations signify or reveal
something new about the population. As an example, let us assume that we are
interested in proving that family support contributes to entrepreneurial success while
the prevailing (hypothetical) norm is that entrepreneurs are self-made. The two forms
would be expressed as:
Ho: Family support does not significantly contribute to entrepreneurial
success.
Ha: Family support significantly contributes to entrepreneurial success.
The word “significantly” is critical, and we will discuss its role later on. The
rationale behind the way the hypotheses are formed has to do with the way statistical
logic works. We always try to disprove/reject the opposite of what we want to prove.
This is because it is always easier to attack an assertion than its absence. If we end up
rejecting the opposite, then we should have naturally proved what we originally hoped
to prove. In our example of entrepreneurial success, we suspect (we actually want to
prove) that family support contributes to entrepreneurial success. Instead of proving
this directly we will try and prove the opposite (Ho). If we decide that beyond any
reasonable doubt that Ho is false, then we can conclude that Ha (what we want to
prove) is true. It works the same way as in the case of a jury deciding on a guilty or not
guilty verdict. They have to presume a not guilty case (Ho) and if that does not
withstand the scrutiny of the facts then the alternative (guilty beyond a reasonable
doubt) is the only natural conclusion. This is not a far-fetched assumption since the
great majority of the Earth’s population are not going to be related to the offence
in question. From the point of view of the defendants it is always easier proving that
they are guilty (simply confess and describe the offence in detail) instead of disproving
it (you need alibis, convincing explanations, etc.). Most juries will accept the former
much more easily than the latter.
The logic of forming hypotheses creates much confusion (it looks unnatural)
for new researchers so we will expand a little more through a couple of examples. Let’s
assume that we want to prove that a certain population like say graduate students
“hates” statistics (that is before reading this book) while it is “rumored” that most
students (say 95% of the population) “love” statistics (ignoring any intermediate
states). The latter assertion will represent the state of nature as it represents the
assertion made by the majority of the population and will form our null hypothesis
like Ho: Graduate students love statistics with the alternative Ha: Graduate students do not love
statistics. Which is more sound/“easier” to prove or disprove? In statistical logic, it is
more solid/final disproving something than proving it as in the latter case there could
always be newer evidence/cases that can disprove the assertion. Intuitively also, isn’t
it easier to find people who hate statistics (reject the null hypothesis) than those who
love statistics (reject the alternative hypothesis)?
On another case, we might be interested in proving that we are not elephants.
Our set of hypotheses would then be Ho: I am an elephant and Ha: I am not an elephant
(Figure 5.1). Is it easier to disprove/attack that we are elephants (null hypothesis Ho)
than disprove that we are not (alternative Ha)?

Figure 5.1 Null and alternative hypotheses

One keyword that might have gone unnoticed in the previous discussion is
“prove”. What do we mean by proving something? How certain are we of the proof
we provided? In quantitative research these questions are answered by deciding how
likely or unlikely the observed values are in profiling the population we study. To
answer these questions, we will introduce the concept of “significance”. This is a
“subjectively” established reference limit we set with respect to the statistic we use in
our analysis, below which we assume our results acquire “meaning”/significance in
light of the research questions and hypotheses we set. We are interested, in essence, in
how likely it is for our observations/data to represent the population (null hypothesis).
The limit we want to set will be based on probability p. This value of the probability
limit we’ve set is called significance level or alpha level or power (symbolized with
the Greek letter α). Typical values for most quantitative research are 0.10, 0.05, and
0.01. When the p values that tests produce are below our set limit we conclude that
the observed values show something significantly different/rarer from the state of
nature we assumed through the null hypothesis, so we can deduce that the null
hypothesis should be rejected (we reject the presumed innocence of the accused).
Caution should be exercised in interpreting results in light of rejecting the null
hypothesis. We reject the hypothesis based on our observations and the limits we have
set. That does not mean the null hypothesis is false. This can be asserted only when
complete knowledge of the population members is available (practically impossible for
large populations).
An issue that needs addressing now is how do we decide to set the limit that
would indicate significance/rareness of observations (in our sample) with respect to
our hypothesized population as expressed by the null hypothesis. It is time to
remember the critical values we mentioned in Chapter 3 and more specifically in Table
3.5. These are nothing more than probability values with their associate z values from
the normal distribution. Considering, as an example, 95% of the population around
the mean (two-tail) or the low 95% of the population (one-tail) for the variable we
study, the pair of confidence β = 0.95 and power α = 0.05 produce z = 1.960 for the
two-tail case and z = 1.645 for the one-tail case. Just as an aside, the reason critical
values are considered and not standard deviations (like 2σ that corresponds to 95.5%
of the population around the mean) is just a matter of popularity.
By adopting α values as our critical values, we can decide whether a test statistic
and its corresponding p-value are significant enough to support or reject the null
hypothesis. While the aforementioned values were derived for the normal distribution,
their role as critical/alpha-level values is applied to all other distributions (t, chi-square,
etc.) and the tests that use them. In practice, we don’t even need to consider the
statistics the various tests produce and just focus on their corresponding p-values. If a
p-value is less than the alpha level we set (like a = 0.05, which for the two-tail case corresponds to the critical value 1.96), then we can conclude that our observations are significant enough.
An issue raised by the conclusions we make is how significant they are or in
other words what is the degree or level of error involved. This should bring to mind
Chapter 3 when we discussed the CLT and its consequence on errors regarding the
mean. At that point, we devised the formulas for the standard error (replicated below
for convenience) that can give the limits (confidence interval) between which our
prediction holds true for a certain percentage of the population we study (95% in this
case):
xL = 𝑥̅ - (1.96*SE) and xU = 𝑥̅ + (1.96*SE)
Let us demonstrate with an example how all this blend into supporting or
rejecting a null hypothesis. Let us assume that a car manufacturer assigns us to
investigate if a new car color they developed (we will call it StatCol) will appeal to
consumers in Quantland. The manufacturer is planning to go ahead with production
if the color appeals to at least 75% (p = 0.75) of the consumers. This number will then
form our null and alternative hypotheses as:
Ho: The proportion of the population that finds StatCol appealing is greater than p = 0.75.
Ha: The proportion of the population that finds StatCol appealing is not greater than p = 0.75.
We presented the color to a sample of 230 consumers and 161 of them
indicated they liked the color. Converting this to a proportion we get 𝑝̂ = 161/230 or
𝑝̂ = 0.7. This is below what the manufacturer expected so we would be tempted to
reject the null hypothesis if it wasn’t for a potential error that we need to consider.
With 𝑝̂ = 0.7 as our predicted mean we can calculate the standard deviation of the
sampling distribution of our sample size using formula 4.3. With q = 1-p or q = 0.3
and N = 230, formula 4.3 produces σ = 0.03 or SE = 0.03. Considering the formulas
for the limits we mentioned previously with x̅ = 0.7 we get:
xL = 0.7 – 1.96*0.03 or xL = 0.64 and xU = 0.7 + 1.96*0.03 or xU = 0.76
Because the hypothesized/manufacturer’s value (75%) is between these limits
we can conclude (with 95% certainty) the null hypothesis is not rejected. It could very
well be that in the actual population StatCol appeals to more than 75% of the people.
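
For readers who want to reproduce the arithmetic of this example, here is a minimal Python sketch; the variable names and the rounding shown are illustrative choices and not part of the original example.

    import math

    n = 230                            # consumers sampled
    liked = 161                        # consumers who found StatCol appealing
    p_hat = liked / n                  # observed proportion (about 0.7)
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of the proportion
    z = 1.96                           # two-tail critical value for 95% confidence

    x_lower = p_hat - z * se
    x_upper = p_hat + z * se
    print(f"p_hat = {p_hat:.2f}, SE = {se:.2f}")
    print(f"95% interval: [{x_lower:.2f}, {x_upper:.2f}]")
    # The hypothesized 0.75 lies inside the interval, so the null hypothesis is not rejected.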
While the previous analysis takes care of the error our statistics might have
produced, it still doesn’t address the situation where real populations behave
differently. This is due to the variability and influence our sample might introduce, as it might not have been, or might not have behaved as, an accurate representation of the population we study. Our inferences are based on the evidence at hand and even in the best of
circumstances we can still make the wrong decision. In hypothesis testing, there are
two possibilities of producing the wrong conclusion (Table 5.1). Either the null
hypothesis is true in reality and we rejected it based on our evidence (sample) or it is
false and we accepted it (failed to reject it). These two types of errors are known as
Type I and Type II errors. Figure 5.2 depicts all possible scenarios of errors and no
errors for the one-tail case.

Table 5.1 Error types


Reality/Population Evidence/Sample Error
Ho = true Accept (cannot reject) No Error
Ho = true Reject Type I
Ho = false Accept (cannot reject) Type II
Ho = false Reject No Error

We will discuss now two examples to highlight the different error types. In
disease testing the null hypothesis is usually the assumption that there is no disease
(the person is healthy), while the alternative is that the person is sick (Table 5.2). Type
I error is an error in our conclusion to reject the null hypothesis. This means we
erroneously reject it by concluding that a person is sick when they are not actually sick
(false positive). As a result, we might impose some unnecessary treatment but other
than that no harm is done. When we make a Type II error though, we conclude the
person is not sick (we accept the null hypothesis) when they actually are. Obviously in
this case the Type II error is the worse of the two, as we will fail to treat a person who is sick.

Figure 5.2 Critical and p-value arrangements

In another situation let us consider the case of a jury coming up with a verdict
(Table 5.3). The null hypothesis is that the accused is not guilty. When we make a Type
I error we find the accused guilty when they are not, while when we make a Type II
error we find them innocent when they are not. In this case the Type I error might be
considered worse as innocent people might end up in jail. The two cases we mentioned
were relatively easy in suggesting which error type is the worse. In many situations, the
decision is not so easy, and it really depends on our viewpoint and the research we are
conducting.
Table 5.2 Medical disease testing
Reality/Population   Evidence/Sample   Error
Healthy Healthy No Error
Healthy Sick /Positive Type I
Sick Healthy/Negative Type II
Sick Sick No Error

Table 5.3 Jury verdict


Reality/Population   Evidence/Sample   Error
Innocent Found Innocent No Error
Innocent Found Guilty Type I
Guilty Found Innocent Type II
Guilty Found Guilty No Error

As we saw before, our decision to reject or not reject/accept the null hypothesis is based on the critical value29 we adopted. For the case of Type I error we should have accepted the null hypothesis but failed to do so because our p-value was higher than the critical value (Figure 5.2 bottom left). To eliminate such errors, we need
to make sure our critical value is set at a much higher z value. In terms of using the
confidence and power terminology as depicted in Figure 3.24 we need to increase the
confidence (blue area under the curve) to include our p value or reduce the power (red
area under the curve). For the case of Type II error, we have that our null hypothesis
is false in our population/reality and we failed to reject it (Figure 5.2 top right). To alleviate such an error, we need to adopt a lower critical value (lower confidence) that is
below our p-value (which should be in the red area in Figure 5.2). An interesting
situation arises when we want to reduce both types of error. If we consider the two
error graphs that represent the two types of errors in Figure 5.2 (top right and bottom
left) we need to increase both blue and red areas at the same time. The only way we can do this is by increasing the overall area under the curve, meaning we need to increase the sample size.

29 Not to be confused with the power a, which is 1-β.
A point of consideration with error types is the case of metrics that reflect
combinations of other metrics like in the case of ANOVA where all possible t-tests among the variables are used to produce the test statistic. A case like this is when we take sample measurements at different times (repeated measures30). In such cases the a value we choose might be adjusted to properly reflect the a values of the individually contributing statistics. A typical such adjustment is the Bonferroni correction, and what it in essence does is divide internally our a value by the number of comparisons (like t-tests) made.
The adjustment overall ensures that the possibility for Type I error is reduced.
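
As a quick numeric illustration of this adjustment (the alpha level and the three p-values below are made-up numbers, not results from any test in this book), a Bonferroni-style comparison in Python looks like this:

    alpha = 0.05
    p_values = [0.030, 0.012, 0.041]        # hypothetical p-values from three pairwise t-tests
    adjusted_alpha = alpha / len(p_values)  # Bonferroni: divide alpha by the number of comparisons

    for i, p in enumerate(p_values, start=1):
        decision = "reject Ho" if p < adjusted_alpha else "fail to reject Ho"
        print(f"comparison {i}: p = {p:.3f} vs {adjusted_alpha:.4f} -> {decision}")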

5.1 Sample Size


The natural question that follows the previous discussion is by how much
should a sample be increased to minimize both Type I and Type II errors. Since we
are interested in areas under the curve, we might recall that one statistic that directly
relates to such areas is the standard deviation. We need a way to connect the area
relating standard deviation with the sample size. If we recall CLT we will see that we
have already developed such formulas for both the scale variables case (formula 4.2)
and the nominal variable case (formula 4.3) where proportions were used instead of
continuous variables. We used these formulas to calculate confidence intervals (lower
xL and upper xU limits) for various confidence levels. Eventually we developed the
margin of error (ME) formula that related ME with standard error (SE), which is as
we saw the standard deviation of the sampling distribution. For a critical value of 1.96
(95% confidence) we got ME = 1.96*SE.
By substituting the standard deviation (we guess this) in this formula we can
get a sample size estimation for the ME we desire. To showcase the process let us
consider the scale variables situation where we want to compare the mean of the time
it takes for a new drug to act against an existing drug on a population of patients. We
are going to use paired t-test and we want to calculate the appropriate sample size for
a critical value of 1.96 (95% confidence) and a margin of error (ME) of 10 minutes.
We suspect that the standard deviation will be similar to that of the old drug, which is, say, σ = 20 minutes. The margin of error, which is based on the standard error (the standard deviation of the sampling distribution, Chapter 4), is given by the formula:

ME = 1.96*SE = 1.96*(σ/√N)

Solving for N, for our example it becomes

N = (1.96*σ/ME)² = (1.96*20/10)²

and eventually results in N = 15.4 or N = 16 when rounded.

30 SPSS: Analyze => General Linear Model => Repeated Measures => … Enter the number of times the dependent variable has been measured … if you need plots make sure time is on the horizontal axis … => Options => Display Means for: time, Confidence interval adjustment: Bonferroni.


To showcase the nominal variables case let us consider the situation where we
want to estimate the voter support for a political party with 95% confidence and a
margin of error (ME) of 3%. The two-tail critical value for the 95% confidence is 1.96
(Table 3.5). Using the margin of error formula for nominal variables we get:

ME = 1.96*√(p̂*q̂/N)     (5.1)
Of course, we don’t know the predicted 𝑝̂ because we don’t have a sample yet,
but we can guess the worst-case scenario that maximizes the product p̂*q̂. Given that q̂ = 1 − p̂ the product becomes p̂ − p̂². For someone who is familiar with the quadratic equation (y = ax² + bx + c with maximum at x = −b/(2a)), the equation y = −p̂² + p̂ has a maximum at p̂ = 0.5. This also results in q̂ = 0.5, so the maximum value for p̂*q̂ is 0.5*0.5 or 0.25. Substituting the values we have in formula (5.1) we get:

0.03 = 1.96*√(0.25/N)  or  N = 0.25*(1.96/0.03)²
This eventually becomes N = 1067.1 or N = 1068 since we talk about
individuals. If the produced sample size is impractical for the research we are
conducting, we can increase the margin of error and get smaller sample sizes. For a
pilot study, for example, we might be comfortable with a 10% margin of error (ME =
0.1). In such a case formula 5.1 will produce N = 96 which is probably an easier to
achieve sample size.
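
The two sample size calculations above can be reproduced with a few lines of Python; this is only a sketch of the formulas already given, and the rounding up to whole participants is an added, but common, convention.

    import math

    z = 1.96                       # two-tail critical value for 95% confidence

    # Scale variable case (drug example): guessed sigma = 20 minutes, ME = 10 minutes
    sigma, me = 20, 10
    n_scale = (z * sigma / me) ** 2
    print(f"scale case: N = {n_scale:.1f} -> {math.ceil(n_scale)} participants")

    # Nominal/proportion case (voter example): worst case p_hat*q_hat = 0.25, ME = 3%
    me_prop = 0.03
    n_prop = 0.25 * (z / me_prop) ** 2
    print(f"proportion case: N = {n_prop:.1f} -> {math.ceil(n_prop)} participants")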

While the brief demonstration we gave here is an indication of sample size
calculations, it is recommended that the researchers use existing tools for calculating
sample sizes that consider a lot more parameters and the requirements of the various
statistical tests. A popular such tool that is freely available on the Internet is GPower
(http://www.gpower.hhu.de/en.html). Figure 5.3 displays the output of GPower for
a t-test.

Figure 5.3 GPower output

In the discussion up to now we used critical values and alpha-levels/power for
errors and sample size calculations. What we have not discussed yet is how we decide
on what alpha to use. This is where another important statistic comes into play, the
effect size. This is measured by the Cohen’s d statistic as the distance between the
null hypothesis value of po we have set, and the true population value p standardized
with the standard deviation. The observant reader will see that this is one of the
parameters GPower (Figure 5.3) requires for its sample size calculation. In practice,
the effect size becomes important when there are two samples/groups: an
experimental and a control. In such a case, po becomes the mean (or proportion) value for the experimental group and the true population value p becomes that of the control group. The
standard deviation in practice is taken as the standard deviation of the control group
that is already known. Cohen’s formula then becomes:

d = (po - p) / σ

If this seems similar to formula 3.7 that we used to convert a normal distribution to its standardized z values, it is because the two formulas are equivalent. An effect
size of 1.2, for example, means that the mean of our experimental/test group is 1.2
standard deviations above that of the control group. If we were to look for 1.2 in the
normal distribution table in Figure 3.21, we would find that it corresponds to p =
0.8849 (0.89 rounded), meaning that the mean of the experimental group exceeds about 89% of the values in the control group. In a similar way, various effect sizes can be
converted to percentages of the control group below or above the mean of the
experimental group.
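
The conversion from an effect size to a percentage of the control group can be sketched in Python as follows; the two groups of scores are hypothetical numbers chosen only to illustrate the calculation.

    from statistics import mean, stdev, NormalDist

    experimental = [34, 37, 41, 39, 36, 42, 38, 40]   # hypothetical test-group scores
    control      = [30, 33, 35, 31, 34, 32, 36, 29]   # hypothetical control-group scores

    d = (mean(experimental) - mean(control)) / stdev(control)   # effect size (Cohen's d)
    pct_below = NormalDist().cdf(d)    # share of the control group below the experimental mean
    print(f"d = {d:.2f}; the experimental mean exceeds about {pct_below:.0%} of the control group")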
The effect size has been related to the correlation coefficient, statistical
significance, standard deviation, and margin of error. These relations tend to carry
their own interpretations, so readers should look further for what makes the most
sense for their research. One suggestion for avoiding confusion with statistical
significance would be to report effect sizes together with their margins of error. When
used in statistical software it is recommended that users adopt the default values the
software suggests unless they are familiar with the meaning of the values they intend
to use in the context of their research. As a rule of thumb, values of d below 0.1 suggest
a trivial effect, values between 0.1 and 0.3 suggest a small effect, values between 0.3
and 0.5 suggest a moderate effect, and values above 0.5 suggest a large effect. In
comparison with the p-values, when the p values are into the thousandths (0.00X) and
below they are considered small enough and suggest significant findings (we reject the
null hypothesis), when the values are into the hundredths (0.0X) they are considered
relatively small and suggest somewhat significant findings (we reject the null
hypothesis with caution), and for values of the order of tenths they are not considered
small enough, suggesting insignificant findings (we do not reject the null hypothesis).
In addition to the effect size there are other measures of the effectiveness of
our predictions. These include, among others, sensitivity and specificity. The former
is a measure of detection of abnormal cases while the latter measures the detection of
normal cases. Improving one typically comes at the expense of the other. To evaluate these metrics, we
need to have performed a series of tests and have decided which ones correctly
predicted the null hypothesis (accept or reject) and which ones failed. If we consider
as true-positive (TP) and true-negative (TN) the number of correct predictions for Ho
and false-positive/Type I error (FP) and false-negative/Type II error (FN) the number
of incorrect predictions, then sensitivity and specificity are expressed by the formulas:
Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP)

Both formulas express the number of the cases they target as a fraction of the
total. Additional metrics, that are popular in instrument development, include the
positive predictive value (PPV) and negative predictive value (NPV) as:
PPV = TP/(TP+FP) and NPV = TN/(TN+FN)
All these formulas are valuable for instrument validity, but the subject goes
beyond the purposes of this book, so the reader will need to search for more in the
extant literature.
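
For completeness, the four metrics can be computed directly from the counts of correct and incorrect predictions; the counts in this Python sketch are hypothetical.

    TP, TN, FP, FN = 45, 80, 10, 15    # hypothetical true/false positives and negatives

    sensitivity = TP / (TP + FN)       # detection of abnormal (positive) cases
    specificity = TN / (TN + FP)       # detection of normal (negative) cases
    ppv = TP / (TP + FP)               # positive predictive value
    npv = TN / (TN + FN)               # negative predictive value

    print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
    print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")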

5.2 Reliability
The metrics mentioned in the previous section bring to attention the issue of
reliability when measurements are involved. Any instrument we develop to measure a
variable will have been influenced by uncertainties in the measurements. These could
be due to either its inability to perfectly capture information or the inability of the
source (mostly human participant in social sciences) to accurately communicate
information. The latter could be intentional when participants are unwilling/afraid to
share the truth or unintentional when they cannot remember or express something.
When an event is in the distant past it might be difficult to recall details. Also, when
sensitive groups like children, disabled, etc. are involved there might be limitations on
how they formulate and express their responses to instruments.
A more general classification of measurement errors is random and systematic.
Random errors can be due to response variations caused by participant emotional state, intentions, attitudes, and personality, to name a few. While their random nature results in variability in our data, their effect on summary statistics like the mean tends to be
negligible (positive errors will on the average be canceled out by negative ones).
Systematic errors on the other hand are usually associated with validity and tend to
have a directional distortion of the data as they persist in both nature and direction
throughout the sample. Such errors are usually attributed to environmental factors
during the time of data collection. For example, if data on people’s mood is collected
on a rainy day the bad weather might predispose everyone in the sample to a bad
mood.
All the uncertainties mentioned here will eventually lead to errors in the
measurements we make. This is best captured through the true score theory which
maintains that any measurement can be expressed through the general form:
Measurement = True Value + Error
This model for measurement can directly relate to the issue of reliability we are
discussing here as it suggests a definition of reliability in the form of
Reliability = True Value / Measurement


or
Reliability = True Value / (True Value + Error)
This can be seen as the proportion of truth in an observation or if we multiply
the previous ratio by 100 we can have reliability as the percentage of the true value in
a measurement. Given that reliability is an aspect of the whole sample not of any
particular observation a more appropriate way of expressing it is my considering
variability or variance of the scores. This will lead to a definition of reliability for a
sample as the ratio:
Reliability = (Variance of true scores) / (Variance of the measurements)
or
Reliability = (Variance of true scores) / (Variance of true scores + Variance of errors)
When there is no error in measurement the variance of the errors will be zero resulting
in:
Reliability = (Variance of true scores) / (Variance of true scores) = 1
When there is only error in measurement then the variance of the true scores will be zero
resulting in:
Reliability = 0 / (Variance of errors) = 0
These extremes give us the range of reliability values between 0 and 1. An intermediate
value of, say 0.75, would indicate that about 75% of the observed measurements are
attributed to true values and only the remaining 25% will be attributable to errors.
The practical challenge with the reliability formula is that we have no way of
measuring the variance of the true scores because they are “polluted” with the errors
in our measurements. This leaves us with the option to come up with estimates of its
value. Considering the case of repeated measurements of the value of a variable that
we study, we would expect that if there is no error in our repeated measurements all
the data values in our sample will be the same number. Since, in real life, there will be
errors in the measurements, the reliability of the measuring instrument can be seen as
the extent of the correlation between separate values.
Let’s demonstrate this assertion through an example. Let us say that we want
to see how reliable a scale that measures weight is. We take a 10 Kg object and we place it on the scale two times and record the readings. Let’s say the values were 10.02 and 9.99. The variance we observe is obviously due to device errors. What the two
readings have in common is that inside of them they include the true value of the
object’s weight which is 10 Kg. Alternatively, the correlation of these two values is an
indication of how close they are to the true value. This suggests an equivalence
between correlation of measurements and reliability.
Based on that assertion a formula for reliability between the two measurements
we made (labeled V1 and V2 here) will take the form of:

Reliability = correlation(V1, V2) = covariance(V1, V2) / (σV1 * σV2)
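
In practice this correlation is computed over repeated measurements of many objects or participants, not just one pair of readings; the following Python sketch uses ten hypothetical pairs of scale readings to estimate a test-retest reliability.

    import numpy as np

    # Hypothetical first and second readings of the same ten objects on the same scale
    v1 = np.array([10.02, 4.98, 20.05, 7.51, 15.03, 2.49, 12.00, 8.97, 5.52, 18.04])
    v2 = np.array([ 9.99, 5.01, 19.98, 7.49, 14.97, 2.51, 12.03, 9.02, 5.48, 17.99])

    reliability = np.corrcoef(v1, v2)[0, 1]   # test-retest reliability as a correlation
    print(f"test-retest reliability r = {reliability:.3f}")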
Within the context of quantitative research methods reliability concerns the repeatability of our results in light of multiple experiments/tests. Notice the keyword
‘results’. That means repeated tests on a variety of samples should always produce
more or less similar results. Reliability measures are meant to give a quantitative value
for this repeatability of results. It should not be confused with validity as a test might
be reliable producing the same statistic test after test and be invalid in that it doesn’t
measure what it is supposed to measure. I might weigh myself on my scale which
always shows the same weight (when I maintain my diet, that is) but 10 Kg more than
I am. The scale is very reliable, but the results are not valid.
Reliability is also related to relevance. We could, for example, measure poverty
by the amount of income individuals have, but this could be misleading as the same
amount of income might make someone a rich person in the Bronx and a poor person
in the Upper West Side. The instrument/income is reliable but not relevant.
When the development of instruments is involved, we are interested in this
case if some of the items in our questionnaire measure the same thing (internal consistency). A metric that allows such measurements is Cronbach’s alpha (a). The metric considers
the correlations among all variables in a set and reports their average. Its advantage is
that it is not affected by any ordering of the variables and that makes it the popular
metric in statistical software. Since it is a correlation coefficient, its value can range
from one (perfect correlation) to zero (no correlation). A high value (typically above
0.7) would suggest that the variables involved are probably measuring the same thing.
A typical test for interval (and ordinal for that matter) variables is the intra-class correlation coefficient (ICC). It can take values between -1.0 (low agreement) and
+1.0 (high agreement). For nominal variables, it would be better if we use Cohen’s
Kappa which produces values similar to Cronbach a. This is also ideal when we want
to see if two observers or tests measure the same thing.
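
As an illustration of the internal consistency idea, Cronbach's alpha can be computed by hand from its definition; the six participants and four questionnaire items below are hypothetical, and statistical packages report the same statistic directly.

    import numpy as np

    # Hypothetical responses of 6 participants (rows) to 4 Likert-type items (columns)
    items = np.array([
        [4, 5, 4, 4],
        [3, 3, 2, 3],
        [5, 5, 5, 4],
        [2, 2, 3, 2],
        [4, 4, 4, 5],
        [3, 2, 3, 3],
    ])

    k = items.shape[1]                              # number of items
    item_var = items.var(axis=0, ddof=1)            # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)       # variance of the summed scale

    alpha = (k / (k - 1)) * (1 - item_var.sum() / total_var)
    print(f"Cronbach's alpha = {alpha:.2f}")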
Some other concepts that occasionally tend to be confused with reliability are
that of accuracy and precision. Accuracy refers to the closeness of the statistic we use
in measuring the true value of the variable we study, while precision refers to the
closeness of its different measures/spread. The key to distinguishing the various
concepts (Figure 5.4) is to remember that reliability has to do with repeatability of the
experimental results when multiple tests (ideally from different researchers and
different samples) of the same statistic are involved. It is a prerequisite to validity as
we cannot have different results reflecting the same metric and being valid at the same
time. Validity on the other hand can easily be seen as synonymous to accuracy. Figure
5.4 has the results of a single shooter one day (top row) and his combined results with
another day (bottom row). The captions below each target showcase the concepts we
mentioned here.

Figure 5.4 Reliability, precisions, validity, and accuracy

In terms of experiments, precision also refers to the level of detail a
measurement can provide. A p value of 0.001 is more precise than a p value of 0.01,
but if our critical value p* is 0.1 then both p values are accurate. If repeated measures
of a statistic over different samples produce p values of 0.05, 0.06, and 0.04, then we
might conclude that the instrument we used is reliable. If all tests successfully reject or
fail to reject the null hypothesis, they will also be considered valid.

6 Advanced Methods of Analysis

Our main interest in quantitative research is to validate/prove models that we
build to describe phenomena (social in the case of this book). These models are
expressed through variables and constructs (usually agglomerations of variables under
an umbrella term) and their relationships (dependent, independent, covariant,
moderators, mediators, etc.) with each other. In many cases, we build a model based
on past research and the literature review we might have conducted and then try to
see if our observations fit the model, while in others we start from our observations
and try to build our model.
While the methods discussed up to now should address the quantitative needs
of many research projects, there are situations where more advanced methods of
analysis are required. Basic statistical methods can handle well situations where a
limited number of variables is involved, but they are not capable of addressing the
needs of complex phenomena that require the development of sophisticated
theoretical models with many interconnected variables. Additionally, there is a great
need nowadays for ensuring the reliability and validity of measuring instruments (like
questionnaires) that requires minimization of measurement errors. This is something
that advanced methods can handle a lot better than basic statistical techniques.
The boundary between what is advanced and what is not is arbitrary and more
dependent on the experience and skills of the researcher than anything else. For that
matter, the collection of methods that will be presented here is not exhaustive but to
a great extent cover the “advanced” needs of most quantitative research projects.
Caution is taken also not to cross over (a lot) to related fields like operations research
or extend to specialized techniques like artificial intelligence that are occasionally
recruited to address specific research needs.

6.1 Exploratory Factor Analysis and Principal Component Analysis
In many situations in quantitative research we might have suspected that some
factors/variables are involved in a phenomenon, but we are not sure which ones are
really significant and which ones could be redundant. More specifically we are
interested to know the minimum number of (independent) variables that describe or
accurately represent the phenomenon we study. In other situations, we might be
interested in developing a questionnaire that would measure the various components
of a trait or skill that is not directly measurable, for example the leadership,
entrepreneurship, or other skills of an individual, and is expressed through multiple
components, for example intelligence, knowledge, etc., that could be measured. We
might have a list of target components/constructs that we suspect contribute to the
trait we want to measure but we are not exactly sure which ones strongly contribute
and which ones do not. In such cases, factor analysis (FA) or, more specifically for this section, exploratory factor analysis (EFA) and its variants like principal component
analysis (PCA) might be the recommended method of quantitative research. While
there are close relationships with MANOVA, factor analysis is quite different and
more appropriate for discoveries than proofs.
As an example of what factor analysis does let us consider an attempt to
describe a car. If we have an idea of what a car is (like it has wheels, engine, etc.) but
we are not sure, then we are conducting a FA, while if we have no idea what a car is
(assume you are an alien) then we are conducting a PCA. Let’s assume a list of a car’s
various parts like what we have in the column ‘Car item’ in Table 6.1. By inspecting
the list, we can soon notice some redundancies like all wheels are the same, so they
could all be under a general category wheels. The same will apply to doors, seats, lights,
and mirrors. Some entries also correlate in that one can suggest another like mileage
could be used to predict engine condition and horse power could be related to
consumption. Additionally, some others might be irrelevant, like production year and
accidents, as the car might have been repaired with new parts and the car is in excellent
condition overall (antique cars tend to be in excellent condition) so the number of
accidents will say nothing about its condition. What this process did is factoring
(‘Factors’ column in Table 6.1) — identifying the minimum number of
variables/factors that are needed to describe our observation (in this case, the car).
Factor analysis starts by creating a matrix (R-matrix) with the correlation
coefficients of all combinations of variables. Table 6.2 shows the values of such a
matrix for a set of six hypothetical variables. The matrix is symmetric across the
diagonal so in its completed form the empty cells will be filled by their symmetric
values with respect to the diagonal. In some instances, we can use covariances (we
mentioned them when we discussed ANOVA) instead of the correlation coefficients.
Sometimes we might even include the covariances above the diagonal in Table 6.2 to
increase the amount of information we display in the table. It might be worth keeping
in mind that with covariances as with the correlation coefficients the sign indicates the
direction of the relationship (proportional or inversely proportional) between the two
variables.

Table 6.1 Factoring

Car item: trunk, engine, front left wheel, front right wheel, back left wheel, back right wheel, left front door, left front light, right front light, left back light, right back light, left front seat, right front seat, back seat, right front door, left mirror, right mirror, horsepower, mileage, consumption, engine condition, production year, number of accidents

Factors: trunk, engine, wheels, doors, lights, seats, mirrors, horsepower, mileage

One could observe from Table 6.2 or its corresponding matrix form that there
are high correlation values (closer to “1”) between variables Var1, Var2, and Var3
(yellow shading) and Var4, Var5, and Var6 (orange shading). This could mean that the
first group might be measuring one factor and the second another. We will call them
FactorYellow and FactorOrange. This would suggest that all our variables can in
essence be represented in a correlation coefficient space (axis FactorYellow and
FactorOrange) as having FactorYellow and FactorOrange coordinates (Figure 6.1). A
rough estimation of the coordinates of each of our variables across the two factors is
depicted in Table 6.3. For convenience, the average of the entries for each variable
across the factor regions was considered here in producing the factor coordinate
values. While the average is a specific form of multiple regression, a more appropriate
method (usually adopted in most statistical software) is to perform multiple regressions
with the R-matrix entries by using one of the variables as dependent/outcome and the
others as independent (like Var1 = a0+a1Var2+ a2Var3+ a3Var4+ a4Var5 + a5Var6 and
so on for the other variables). The R2 of those multiple regressions would be good
indicators of how much variability can be explained by the model or is common among
the variables and can be used as initial coordinates for the variable in Table 6.3. The
variance accounted by the model is called common variance, while its proportion to
the overall variance is called communality. If we were to see the Table 6.3 numeric
entries as a matrix (6x2 — 6 rows and 2 columns) we would have what is called a
factor matrix (it would be called component matrix in principal component
analysis).
If we were to consider, as we mentioned before, factors as dimensions
(meaning independent of each other) and we plot the values of Table 6.3 we will get
the scatter plot of Figure 6.1. The coordinates of each variable along the axes are called
factor loadings. If we had identified three factors, we would have to draw a third axis
perpendicular to the other two and so on for higher dimensions (although difficult to
visualize).
Table 6.2 Correlation coefficient matrix

Variables   Var1     Var2     Var3     Var4    Var5    Var6
Var1        1.000
Var2        0.839    1.000
Var3        0.710    0.739    1.000
Var4       -0.093    0.042   -0.320    1.000
Var5        0.028    0.056   -0.092    0.692   1.000
Var6       -0.057   -0.025    0.019    0.598   0.702   1.000

Table 6.3 Factors


Variables FactorYellow FactorOrange
Var1 0.775 -0.041
Var2 0.789 0.024
Var3 0.725 -0.131
Var4 -0.124 0.645
Var5 -0.003 0.697
Var6 -0.021 0.650

Figure 6.1 Factor coordinates plot

Because each factor is represented as an axis we can assume that a general linear model (like y = a0 + a1x1 + a2x2 + a3x3 + … + anxn but with a0 = 0 as the lines/axes intersect at zero and an error term that we will ignore cautiously) could be a model for our observations with the factor loadings as the coefficients. This can be expressed as:
FactorYellow = 0.775*Var1 + 0.789*Var2 + 0.725*Var3 – 0.124*Var4 – 0.003*Var5
– 0.021*Var6
and
FactorOrange = -0.041*Var1 + 0.024*Var2 - 0.131*Var3 + 0.645*Var4 + 0.697*Var5
+ 0.650*Var6
These equations can be used to evaluate scores (also known as weighted
averages) for each factor based on the values of the variables of an observation. It is
obvious that the coefficients of these equations (factor loads in this case) play a critical
role in how accurate the factor scores will be. It is possible that the factor weights we
used are not the best and there might be others that produce more realistic results.
One way to improve the coefficients is to consider the factor weights as our initial
estimates and use them to find better ones. For this we can multiply our factor matrix
with the inverse of the original correlation matrix. This way the initial correlations are
considered in our improved factor weights. As an aside, we have already considered
such influences by taking the averages of the correlation coefficients across the
variables that form the factors (Table 6.3), but in general one could start with arbitrary
values for the factor weights. In such a case the process we describe here will refine
the factor weights to better represent the observed values.

Apart from the method described here for discovering factors there are others
that can be applied and can be found in the literature. A distinction needs to be made
for cases where we have some idea/suspicion/hypothesis of what the factors are and
want to confirm their existence and for cases where we aim at discovering the factors.
For the former situation, explanatory/confirmatory factor analysis is required,
while for the latter exploratory factor analysis would be recommended. A variation
of the exploratory factor analysis is PCA. The two techniques are similar in that they
both process correlations (linear combinations) of variables and variances to explain a
set of observations. However, in FA we are more interested in the underlying factors
(latent variables) and not in the observed variable values because we care to
extrapolate our findings to the population while in PCA we focus on combinations of
variables that reflect the observed variable values in our sample without concern of
how they could infer the population. The latter can be circumvented if another sample
is used that reveals the same factors. In practice, we start with a model in mind in FA
and see how it fits the observations (accounts for the observed variability), while in
PCA we are just trying to reduce the number of variables by eliminating covariances
(while preserving variability). FA accounts for the common variance in the data and
does not produce any values for the identified components, while PCA accounts for
the maximal variance and can produce values for the identified factors. In terms of
modeling, FA derives a model and then estimates the factors, while PCA simply
identifies linear relationships between variables when they exist. In that sense, the
method we described with the data of Table 6.2 is more of a PCA than a FA as we
were trying to find a reduced set of dimensions (also called eigenvectors) that
accounted for most of the variance (also called eigenvalues). The terminology and
matrix transformation process for finding eigenvectors and eigenvalues is beyond the
scope of this book so the interested reader should look for more in the extant
literature/Internet. Considering the initial example of Table 6.1 (defining a car), if we
were doing a FA we might have confirmed our initial model of a car consisting of
wheels, engine, etc., while if we were doing a PCA we would have found the items like
wheels, engine, etc. that define a car. These different perspectives should guide a
researcher on which method is better for the research they are conducting.
The automated factor extraction process adopted in most statistical software
will produce a number of factors but might not be strong enough to tell us which ones
are significant unlike what we did by a simple visual inspection of the correlation
matrix. Also, the boundaries between factors might not be as clear as in Table 6.1. In
most cases we will need a way of telling which ones are the most influential. One way
of doing this is by plotting their eigenvalues in what is known as a scree plot. Table
6.4 shows the output of a FA from software. We can see what part of the total variance
is attributed to factors. Evidently, the first two factors in this case seem to amount for
87.5% of the total variance (45.4% and 42.1% for the first and second factors,
respectively). The scree plot in Figure 6.2 shows the same information in a visual form.
The sudden drop in the eigenvalue total that the component ‘3’ introduces (inflexion
point) is the cut-off point past which the remaining variability can be considered
insignificant compared to the contributions of the first two factors. Statistical software
usually retains all factors above an eigenvalue total of 1 (sometimes 0.7) and ignores
the rest.
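
To show where eigenvalues like those in Table 6.4 come from, the following Python sketch performs an eigen-decomposition of the Table 6.2 correlation matrix with numpy; it illustrates the mechanics of a PCA-style extraction only, so its output should not be expected to match the software output reported in Table 6.4 exactly.

    import numpy as np

    # Correlation matrix reconstructed from Table 6.2 (symmetric, full form)
    R = np.array([
        [ 1.000,  0.839,  0.710, -0.093,  0.028, -0.057],
        [ 0.839,  1.000,  0.739,  0.042,  0.056, -0.025],
        [ 0.710,  0.739,  1.000, -0.320, -0.092,  0.019],
        [-0.093,  0.042, -0.320,  1.000,  0.692,  0.598],
        [ 0.028,  0.056, -0.092,  0.692,  1.000,  0.702],
        [-0.057, -0.025,  0.019,  0.598,  0.702,  1.000],
    ])

    eigenvalues, eigenvectors = np.linalg.eigh(R)      # eigh handles symmetric matrices
    order = np.argsort(eigenvalues)[::-1]              # largest eigenvalue first
    eigenvalues = eigenvalues[order]                   # (reordered eigenvector columns give loadings)
    explained = 100 * eigenvalues / eigenvalues.sum()  # percent of total variance per component

    for i, (ev, pct) in enumerate(zip(eigenvalues, explained), start=1):
        print(f"component {i}: eigenvalue = {ev:.3f}, % of variance = {pct:.1f}")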
Table 6.4 Factor analysis output

Component   Eigenvalue Total   Eigenvalue % of Variance   Eigenvalue Cumulative % of Variance
1           2.356              45.4                        45.4
2           2.184              42.1                        87.5
3           0.205               3.9                        91.4
4           0.192               3.7                        95.1
5           0.173               3.3                        98.4
6           0.081               1.6                       100.0

Figure 6.2 Scree plot

Once we extract the factors there might still be room for improvement in
reaching the ideal where variables have the highest possible loads (coordinates
according to Figure 6.1) across one of the factors. If we were to see the scatter plot of
Figure 6.1 we can visually suspect that a better set of axes like the ones depicted in
Figure 6.3 (red lines) could have worked better in representing the factors. To achieve
this, we need to rotate our initial axes to get to the new axes. Rotation is a standard
technique applied to improve factor loading and can either be an orthogonal rotation
where the axes remain orthogonal to each other or oblique rotation like the one
performed in Figure 6.3.
The choice of rotation depends mainly on whether we are interested in
preserving the independence of factors among each other (orthogonal rotation) or we
are allowing correlations between them to exist (oblique rotation). Modern statistical
software allows for a variety of transformations depending on how the spread of
loadings for variables is distributed across factors. Either type of rotation and its variants requires matrix transformations that are beyond the scope of this book. Using
the default values provided by the statistical software we use is always a good starting
point from which interested researchers can expand to address the specific needs of
their study.
One point of interest before closing this section is the impact of sample size
on the limit for accepting factor loadings as significant. Based on a two-tailed alpha
level of 0.01 for sample sizes of 50, loadings of 0.7 and above can be considered
significant, while for sample sizes of 1,000, loadings of 0.2 and above might be
significant. While someone could guess the in-between situation, researchers will have
to consult the literature for what are the recommended loadings that indicate
significance for the research they conduct.

Figure 6.3 Axes rotation



A method similar to FA and PCA that is occasionally used is correspondence analysis, where we consider each variable/column as a dimension with the frequencies
as values. We then consider the geometric distance between each point and proceed
to reduce the dimensions one at a time while recalculating the distances. If for a certain
dimension the difference is not significant, we can eliminate it. We proceed like this as
long as the process eliminates dimensions. As always, more on the method can be found
in the extant literature.

6.2 Cluster Analysis


When multiple data values are involved, we are oftentimes interested in
grouping them (or splitting, one might say) according to their similarity. This might
sound similar to what we did in FA but the differences are significant as will be
highlighted with an example. Let us assume that we are presented with the target of
Figure 6.4 and we are asked what we can make of it. There is an apparent, although
not so clear grouping of the shots and there are some general conclusions we can draw
about their appearance and arrangement. For example, one might conclude based on
the apparent groupings that five shooters were likely involved (Figure 6.5.a). It will
also be apparent from the size of damage on the target that some bullets were of higher
caliber than others and that there is a clear tendency for shots towards the left-side of
the target (Figure 6.5.b). The latter might be the result of strong winds coming from
the right of the shooters. The process we followed to identify the shooters can be seen
as clustering, while the way we identified the different bullet calibers and the wind is
analogous to FA.

Figure 6.4 Target practice results

Cluster analysis is based on the development of a metric that groups/splits
observations in clusters according to similarities and/or in a hierarchical order
(hierarchical clustering). The latter is a typical process in phylogenetic analysis
where one categorizes into groups/kingdoms the various life forms like the animal
kingdom, the plant kingdom, etc. and further subdivides them into more specific
forms until we reach the individual species. In other words, we try to divide
observations into homogeneous and distinct groups. This is not to be confused with
classification where we try and predict which group an observation belongs to. The
groups in classifications are known while in cluster analysis they are not.

Figure 6.5 Cluster and factor analysis

The key entity in classification is the group. The most popular metric we use
for identifying closeness and by extension membership in a group is usually the
geometric/Euclidean distance of an observation to the group center. For two
observations in a three-dimensional space this distance is given by the formula:

d = √[(x1 - x2)² + (y1 - y2)² + (z1 - z2)²]
For cluster analysis, the coordinates/values along the dimensions we have are
standardized. Also, in cases where the importance of the dimension varies we can use
weights to indicate so. The distance in those cases will look like:

d = √[w1(x1 - x2)² + w2(y1 - y2)² + w3(z1 - z2)²]
We can start the clustering process by assuming every observation is a group
and start identifying observations as members of the closest group near them. We
repeat this process until groups are too far apart to be identified as close. Figure 6.6
showcases the process of clustering. At first pass (Figure 6.6.a), group I is formed with
the closest two observations. The next pass will form group II with another pair
(Figure 6.6.a). Subsequent passes (Figure 6.6.b and Figure 6.6.c) will expand the group
size by absorbing the closest observations/groups until eventually everything is
engulfed in one group (Figure 6.6.d).
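
A minimal Python sketch of this agglomerative process, using scipy's hierarchical clustering routines on a handful of made-up two-dimensional observations, is shown below; the centroid linkage roughly mirrors the centroid-based grouping described here, and the choice to cut the tree into three clusters is purely illustrative.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    # Made-up 2-D observations (e.g., shot coordinates on a target)
    points = np.array([
        [1.0, 1.2], [1.1, 0.9], [0.8, 1.0],       # one tight group
        [4.0, 4.1], [4.2, 3.9],                    # a second group
        [7.5, 1.0], [7.7, 1.3], [7.4, 0.8],        # a third group
    ])

    Z = linkage(points, method="centroid")    # agglomerative clustering with centroid linkage
    print(Z)                                  # each row: the two groups merged and their distance

    labels = fcluster(Z, t=3, criterion="maxclust")   # cut the tree into three clusters
    print("cluster membership:", labels)

The same Z matrix can be passed to scipy.cluster.hierarchy.dendrogram (together with matplotlib) to draw the kind of dendrogram shown in Figure 6.7.c.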
The question that needs to be answered is when do we stop the grouping
process? If we consider the example of Figure 6.6, we can see that the first couple of
passes (Figure 6.6.a) seem to have achieved some grouping (groups I and II) that
progressively got better (groups III and IV) and then they started getting worse (groups
V and VI) as they grouped far away observations. Visually we can see that passes III
and IV are probably the best options. Visual confirmation will be difficult to achieve
when thousands of observations are involved so we need a metric that will closely
resemble our visual ability for pattern recognition. An appropriate metric for this case
is the distance between the members of groups (superimposed in the circles of Figure
6.6 in the form of the diameter of each group). If we were to comparatively display
these distances (Figure 6.7.b) we would observe a sudden increase/jump from step IV
to V. This could be the trigger to where the groupings will stop getting better (unlike
the cases of random observations that don’t display such jumps). After that point an
“over-grouping” will emerge. This is the criterion used by most statistical software and
is considered quite effective for identifying clusters.

Figure 6.6 The clustering process



Figure 6.7 Grouping distance and dendrogram

A pictorial way of displaying clustering is through dendrograms (also called phylogenetic trees). Figure 6.7.c displays the dendrogram of the target practice case we discussed. Different coloring of the marks (Figure 6.7.a) is being used to show the process of grouping in the dendrogram, while the heights of the various branches equal
the distances of Figure 6.7.b between the observations that form the group. One point
worth mentioning here about the calculations of the distance between an observation
and a group or between groups is what we consider as the coordinates of a group. In
the example we presented here it was assumed that the geometric center (called
centroid) of the group was used with coordinates, for each of its dimensions, the
average/mean of the coordinates of the corresponding dimensions of its individual
group members. Other popular alternatives include the nearest neighbor/single
linkage (the pair of observations of the two groups with the smallest distance), the
farthest neighbor/complete linkage (the pair of observations between the two
groups with the largest distance), and the average linkage (the average of all pairs of
observations between the two groups).
Although we discussed here a typical clustering process, there are alternatives
that could be considered like the hierarchical divisive method that follows the exact
opposite process from what we described here (starting with one cluster that includes
everyone and breaking it down until each individual observation becomes a cluster). A
non-hierarchical method that is often used is the k-means method. In this process,
we assume an initial number of groups and then assign and reassign observations
according to their closeness to the centroid of the groups. The process stops when all
observations belong to groups where their distances from the group centroid is the
smallest possible.
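
The assign-and-update loop behind k-means can be sketched in a few lines of numpy; the data, the choice of k = 2, the naive initialization, and the fixed number of passes below are all illustrative simplifications (in practice one would use a library routine such as scipy.cluster.vq.kmeans2).

    import numpy as np

    # Made-up 2-D observations and a naive initial guess of k = 2 centroids
    X = np.array([[1.0, 1.1], [0.9, 1.3], [1.2, 0.8],
                  [5.0, 5.2], [5.3, 4.9], [4.8, 5.1]])
    centroids = X[[0, 3]].astype(float)

    for _ in range(10):   # a few assign/update passes (ignoring the empty-cluster edge case)
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)                  # nearest centroid for every observation
        centroids = np.array([X[labels == k].mean(axis=0)  # move centroids to their group means
                              for k in range(len(centroids))])

    print("final centroids:\n", centroids)
    print("cluster membership:", labels)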

The clustering process requires an additional step when nominal
variables/attributes are involved. We need in this case a way to define distance between
categories. This is done by considering as distance the number of similar attributes
between two observations over the total number of nominal variables we have. Let us
demonstrate the process through an example. Consider the demographic data of Table
6.5. If we were to consider Observation 1 (Male, Below 25, Undergraduate) with
Observation 2 (Female, 25–40, Graduate) we observe there are no similarities along
any variable so the distance between these two observations is 0 (zero). For
Observation 1 (Male, Below 25, Undergraduate) and Observation 3 (Male, Above 40,
Graduate) there is only one similarity (gender) across the three variables so the distance
between these two observations is 1/3 or 0.33. Similarly, between Observation 1
(Male, Below 25, Undergraduate) and Observation 4 (Male, Below 25, Graduate) there
are two similarities so the distance between them will be 2/3 or 0.67. We calculate all
other distances in a similar fashion (Table 6.6). With the distances in the form of
probabilities in Table 6.6, the clustering process described previously can be applied.
In practice, the complements/dissimilarities are used instead of the distances (Table
6.6).

Table 6.5 Demographics


Observation   Gender   Age Group   Education
1 Male Below 25 Undergraduate
2 Female 25-40 Graduate
3 Male Above 40 Graduate
4 Male Below 25 Graduate
5 Female 25-40 Undergraduate
6 Male Below 25 Undergraduate
7 Female Above 40 Graduate
8 Male Below 25 Undergraduate
9 Female 25-40 Graduate
10 Female Above 40 Graduate
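
The matching computation for the Table 6.5 demographics can be sketched in plain Python as follows; only the first few pairs are printed so the output can be compared with the corresponding entries of Table 6.6, and the function name is an arbitrary choice.

    # Observations from Table 6.5: (gender, age group, education)
    observations = [
        ("Male", "Below 25", "Undergraduate"), ("Female", "25-40", "Graduate"),
        ("Male", "Above 40", "Graduate"),      ("Male", "Below 25", "Graduate"),
        ("Female", "25-40", "Undergraduate"),  ("Male", "Below 25", "Undergraduate"),
        ("Female", "Above 40", "Graduate"),    ("Male", "Below 25", "Undergraduate"),
        ("Female", "25-40", "Graduate"),       ("Female", "Above 40", "Graduate"),
    ]

    def matching_score(a, b):
        # share of attributes on which the two observations agree
        return sum(x == y for x, y in zip(a, b)) / len(a)

    for i in range(1, 4):              # observations 2 to 4 compared with the ones before them
        for j in range(i):
            s = matching_score(observations[i], observations[j])
            print(f"obs {i+1} vs obs {j+1}: match = {s:.2f}, dissimilarity = {1 - s:.2f}")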

Another application of clustering concerns the case of dimension reduction
similar to what we did with PCA in the previous section. This is usually applied to
questionnaire development where we start with a lot of questions to make sure
everything is covered and then we cluster to remove similar and high correlated
questions. Assuming that the answers to questions can be quantified in some way, we
start by calculating the correlation coefficients (r) for all combinations of questions
(similar to the R-matrix in the previous section). For the distances between questions
Advanced Methods of Analysis 193

we then consider the complement/dissimilarity of the correlation coefficient (1-r).


From then on, the process continues as usual.
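A minimal sketch of this item-reduction idea, assuming the questionnaire answers sit in a pandas DataFrame with one column per question; the data and the column names are hypothetical, and Q3 is deliberately generated as a near-copy of Q1 so that the two items end up in the same cluster.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical answers: 50 respondents, five questions; Q3 is a near-copy of Q1
rng = np.random.default_rng(0)
q1 = rng.normal(size=50)
answers = pd.DataFrame({"Q1": q1,
                        "Q2": rng.normal(size=50),
                        "Q3": q1 + rng.normal(scale=0.1, size=50),
                        "Q4": rng.normal(size=50),
                        "Q5": rng.normal(size=50)})

r = answers.corr()        # R-matrix of correlation coefficients between questions
d = 1 - r                 # complement/dissimilarity (1 - r) used as the distance
Z = linkage(squareform(d.values, checks=False), method="average")
groups = fcluster(Z, t=0.3, criterion="distance")   # merge items closer than 1 - r = 0.3
print(dict(zip(answers.columns, groups)))           # Q1 and Q3 should share a group
```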

Table 6.6 Distances


        Obs.1  Obs.2  Obs.3  Obs.4  Obs.5  Obs.6  Obs.7  Obs.8  Obs.9  Obs.10
Obs.1   0.00
Obs.2   0.00   0.00
Obs.3   0.33   0.33   0.00
Obs.4   0.67   0.33   0.67   0.00
Obs.5   0.33   0.67   0.00   0.00   0.00
Obs.6   1.00   0.00   0.33   0.67   0.33   0.00
Obs.7   0.00   0.67   0.67   0.33   0.33   0.00   0.00
Obs.8   1.00   0.00   0.33   0.67   0.33   1.00   0.00   0.00
Obs.9   0.00   1.00   0.33   0.33   0.67   0.00   0.67   0.00   0.00
Obs.10  0.00   0.67   0.67   0.33   0.33   0.00   1.00   0.00   0.67   0.00
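The matching proportions just described take only a few lines to compute; the sketch below does the calculation for the ten observations of Table 6.5 (the attribute codes are abbreviations of the table's values).

```python
import numpy as np

# The ten observations of Table 6.5 as (Gender, Age group, Education)
obs = [("M", "<25", "U"), ("F", "25-40", "G"), ("M", ">40", "G"),
       ("M", "<25", "G"), ("F", "25-40", "U"), ("M", "<25", "U"),
       ("F", ">40", "G"), ("M", "<25", "U"), ("F", "25-40", "G"),
       ("F", ">40", "G")]

n = len(obs)
match = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        # proportion of attributes on which the two observations agree
        match[i, j] = sum(a == b for a, b in zip(obs[i], obs[j])) / len(obs[i])

dissimilarity = 1 - match          # the complement used as the distance in clustering
print(np.round(match, 2))          # e.g. match[0, 3] = 0.67, as in the text
print(np.round(dissimilarity, 2))
```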

A final point of interest with clustering nowadays is its application to social
network analysis. The popularity of Facebook and other social platforms allowed
billions of individuals to connect through the Internet in ways that were not possible
before. People of similar professional and/or personal interests, beliefs, perceptions,
and ideals form groups where they share their views and experiences. Identifying who
is connected to whom and what form that connection takes (direction, strength, etc.)
becomes very important as it can specify clusters in the form of groups, subgroups,
cliques (when everyone is connected to everyone else), gatekeepers (individuals that
connect clusters), etc. For more on social network analysis the readers should search
the extant literature.

6.3 Structural Equation Modeling


In cases where we are interested in confirming/proving theoretical models the
structural equation modeling (SEM) technique is often used. The goal of SEM in these
cases is to determine the extent to which our sample observations support the
theoretical model we used as a framework (usually for developing our instrument for
data collection). Such models include among others the regression we have already
seen, path analysis (discussed later), and confirmatory factor analysis (discussed in
another section).
The goal of SEM, regardless of the underlying model, is to identify dependent and independent variables that are latent (not directly observable or measurable, such as ambition, confidence, etc.). Such latent variables are usually represented by combinations of observed variables (our sample variables), and the analysis confirms whether the observed variables behave as the model suggests. For example, regression models can represent a phenomenon by explaining or predicting the influence of one or more independent variables on one dependent variable. Path models are similar to regression models but allow for multiple dependent variables, while confirmatory factor models aim at connecting observed variables to latent variables.
We develop a model (model specification) by determining every relationship
and parameter that is suspected to be involved in the phenomenon we study. SEM
then uses variance-covariance matrices (referred to as matrix Σ from now on) based
on the model and attempts to fit the observed variance-covariance matrices (referred
to as matrix S from now on). If inconsistencies/errors are observed, then the model
is deemed misspecified, suggesting that either some of the assumed relationships are
not there in reality or that some other variables might be needed to complete the
model. This will lead to model modification and the process will be repeated until a
satisfactory model is found.
A model can in general be under-identified when there is not enough information in the covariance matrix to define some of its parameters, just-identified when its parameters are exactly sufficient to explain the covariance matrix, and over-identified when there is more than one way of estimating its parameters. In the last two cases the model is usually considered adequate to explain the sample observations. Classifying a model's identification status requires criteria, which here take the form of conditions rather than exact metrics; we will now discuss the most important of them.
The first condition we will consider is the order condition under which the
number of parameters (also called free parameters) required to describe the model
cannot exceed the number of distinct entries in the variance/covariance matrix
mentioned before. Given the symmetry of the S matrix the elements above the
diagonal are the same as the symmetric entries below the diagonal (that is why we often
ignore such elements when depicting the matrix). So, for n observed variables the
matrix will have n*n elements and the total of the elements below (or above) the
diagonal plus the diagonal elements will be distinct and equal to n*(n+1)/2. If we also
consider the means of the variables involved (n in total) as necessary for the
description of the phenomenon we study, the number of free parameters becomes
n*(n+1)/2 + n or n*(n+3)/2. A consequence of the required number of parameters is that the number of free parameters cannot exceed this count of distinct entries, otherwise there will not be enough information to estimate all the parameters (this requirement is also known as the "t-rule"); in practice, the sample size should also be large enough to support the estimation. Another necessary condition that is far more difficult to
assess is the rank condition which requires a determination of the suitability of matrix
S for determining each parameter (matrix Σ). This is in general difficult to prove and we will refer the interested reader to the extant literature for more on this.
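A quick way to check the order condition in practice is to compare the number of free parameters with the count of distinct entries in S; the short sketch below simply codes the counting rule described above (the function names are, of course, our own).

```python
def distinct_covariance_entries(n_observed, include_means=False):
    """Number of distinct pieces of information in the observed matrix S."""
    entries = n_observed * (n_observed + 1) // 2
    if include_means:
        entries += n_observed           # n*(n+3)/2 when the means are modeled too
    return entries

def order_condition_met(n_free_parameters, n_observed):
    """Order condition: free parameters must not exceed the distinct entries in S."""
    return n_free_parameters <= distinct_covariance_entries(n_observed)

# Example: 10 observed variables give 55 distinct variances/covariances,
# so a model with 20 free parameters satisfies the order condition
print(distinct_covariance_entries(10), order_condition_met(20, 10))
```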
We will discuss now actions we can take in avoiding identification problems.
One of the methods used is a mapping between observed and latent variables and
ensuring that either the factor loading, or the variance of the latent variable is fixed to
‘1’. This ensures there is no indeterminacy between the two, but it might require
additional constraints. Additional methods exist but are beyond the scope of this book.
Having developed a model, the next step is to estimate its parameters. For this
we need a fitting function that will provide a metric of the difference between S and
Σ. The interested reader can find many such fitting functions in the extant literature
but the most popular ones include maximum likelihood (ML), generalized least
squares (GLS), and unweighted or ordinary least squares (ULS or OLS).
Regardless of the process we follow we will end up with a set of parameters as
descriptors of our model. The next step concerns the evaluation of the adequacy of
these parameters in providing a description of the model we adopted and by extension
of the phenomenon under investigation.
This evaluation can start by considering the parameters that are significantly
different from zero and their signs are the same as the ones in our model (indicating
significant contributions to the explanation of the phenomenon). From then on, we
can estimate their standard errors and form critical values as the ratio of the parameter
to its standard error. By conducting a t-test between the theoretical and observed
parameters we can deduce which ones are truly significant (for example, significant at a specified alpha level of say a = 0.05 for a two-tailed t-test). Finally, if the values we
observed are within expected ranges as suggested by the relevant literature and/or
other sources (for example, pilot tests), then we can confirm that all free parameters
have been identified and meaningfully interpreted. While at the level of the individual
parameter t-test will work, at the level of the whole model we will need chi-square to
measure the fit between Σ and S. When chi-square is close to zero (perfect fit) and the
values of the residuals matrix are also close to zero we can safely conclude that our
theoretical model (Σ) fits the data (S). A final stage after evaluation concerns the
modification of our model (also called specification search) to find parameters that are
more meaningful and better fit observations.

6.3.1 Path Analysis


When multiple dependent variables are involved in the description of a
phenomenon, we have seen that MANOVA and MANCOVA could be applied if the
variables were scale, while if the variables are nominal we could consider multiple
regression with dummy variables, partial and marginal tables, etc. An alternative and
more advanced method that can handle the case of multiple dependent variables
(meaning several regression equations) is path analysis. Although this might suggest
that the method is suitable for proving causations, this is not the case unless there is a
temporal relationship between the cause and effect variables which also correlate or
covary while other causes are controlled. If the aforementioned conditions are
confirmed over time and across multiple experiments, then we can assume that
causation has been established.
Let us demonstrate the path analysis process through an example. We will
assume that following our literature review and past research we have developed a
model for the emergence of entrepreneurship that involves 10 observed variables
(Figure 6.8). One-way arrow connectors represent direct effects/influences, while two-
way connector lines (usually drawn curved) represent covariances (Networking with
Social capital and Education with Social capital). The rationale for the covariances is
that there are influences outside the proposed model that affect the relationship of the
variables involved in the relationships. Finally, the model includes error terms
(ovals/circles in the diagram) for all dependent variables to make up for the variance
that the model will not be able to explain. These errors usually represent latent
variables that influence the phenomenon we study. In terms of dependent (also called
endogenous here) and independent (also called exogenous here) variables, the red
rectangles represent independent variables while all other rectangles represent
dependent variables.

Figure 6.8 Entrepreneurship model


In terms of a numerical representation of the model, path coefficients for
every path in the model are derived from the standardized correlation coefficients
(direct one-way arrows) and the covariances (two-way arrows). These coefficients are
then used to express the dependent variables in the model in linear forms like:
Opportunity = cmo * MarketEnvironment + ceo * Education + ErrorOp
Education = cfe * FamilySupport + ces * SocialCapital + ErrorEd
SocialCapital = ces * Education + cns * Networking + cse * Entrepreneurship +
ErrorSC
Networking = cfn * FamilySupport + cpn * Personality + cns * SocialCapital +
ErrorN
RiskTaking = cpr * Personality + ErrorRt
Entrepreneurship = coe * Opportunity + cse * SocialCapital + cre * RiskTaking +
ErrorEn
While the model depicted in Figure 6.8 is our hypothesized model, we have in
reality a number of possible models that could potentially exist. An alternative one, for
example, could have Personality connected to Education or Risk-taking connected to
Opportunity. These possible options are included in our hypothesized model in the
form of fixed parameters, while the connections we assumed are represented with free
parameters. When thinking in terms of the parameters we mentioned in SEM, the
aforementioned model requires 9 path coefficients (one for every arrow), 2
correlations among dependent variables (two-way arrows), 6 error variances, and 3
independent variable variances. This gives us a total of 20 free parameters that we wish
to estimate.
Because we have 10 observed variables the number of distinct values in our
sample observation matrix (10x10) is equal to 10*(10+1)/2 = 55 (elements in the
diagonal and below). Because this is higher than the number of free parameters (20),
the model will be over-identified. Following the process outlined in the SEM section,
the next step is to estimate the parameters of the model. In path analysis, this is done
by decomposing the correlation matrix.
We proceed by multiplying each equation by each of the variables on its right-hand side and substituting the products with their covariances. For example, considering the first
equation that expresses the Opportunity as a function of Education and
MarketEnvironment, we can first multiply both sides by Education first and then by
MarketEnvironment:
Opportunity*Education = cmo*MarketEnvironment*Education +
ceo*Education*Education + ErrorOp*Education
which becomes (note that all variables are standardized, so Var(Education) = 1, and the error term is assumed to be uncorrelated with the model variables, so Cov(ErrorOp,Education) = 0):
Cov(Opportunity,Education) = cmo*Cov(MarketEnvironment,Education) +
ceo*Var(Education) + Cov(ErrorOp,Education)
or
Cov(Opportunity,Education) = cmo*Cov(MarketEnvironment,Education) + ceo
Given that the covariances are known (from matrix S as discussed in the
previous section), the previous equation has only the coefficients as unknown.
A similar equation will be formed by multiplying the first equation
(Opportunity in our model) by the MarketEnvironment variable:
Opportunity * MarketEnvironment =
cmo* MarketEnvironment * MarketEnvironment +
ceo * Education * MarketEnvironment +
ErrorOp * MarketEnvironment
which will become:
Cov(Opportunity,MarketEnvironment) = cmo*Var(MarketEnvironment) +
ceo*Cov(Education,MarketEnvironment) + Cov(ErrorOp,MarketEnvironment)
or
Cov(Opportunity,MarketEnvironment) = cmo + ceo*Cov(Education,MarketEnvironment)
Doing the same for the other equations in the model we end up with a system
of equations that when solved will produce all the coefficients and error terms of our
model equations. While this process will provide numerical answers, the interpretation
of the results needs to be done in light of the theory and assumptions used in
developing our model.
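To see the decomposition at work, the sketch below solves the two equations derived above for cmo and ceo using numpy; the covariance values are made-up standardized values used purely for illustration.

```python
import numpy as np

# Hypothetical standardized covariances taken from an observed matrix S
cov_opp_edu = 0.50   # Cov(Opportunity, Education)
cov_opp_me = 0.45    # Cov(Opportunity, MarketEnvironment)
cov_edu_me = 0.30    # Cov(Education, MarketEnvironment)

# The two decomposition equations written as A @ [cmo, ceo] = b
A = np.array([[cov_edu_me, 1.0],     # Cov(Opp,Edu) = cmo*Cov(ME,Edu) + ceo
              [1.0, cov_edu_me]])    # Cov(Opp,ME)  = cmo + ceo*Cov(Edu,ME)
b = np.array([cov_opp_edu, cov_opp_me])

cmo, ceo = np.linalg.solve(A, b)
print(round(cmo, 3), round(ceo, 3))
```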

6.3.2 Confirmatory Factor Analysis


While FA and more specifically exploratory factor analysis was covered in
section 6.1 we will deal here with its structural equation relative, confirmatory factor
analysis (CFA). In this case, we will assume a model that includes observable and latent
variables and try to confirm it with the observations we have. As before, we will
demonstrate the application of the method through an example. By borrowing some
constructs from the emergence of entrepreneurship case we saw before we might
suspect that support from family, market environment, and opportunity could be
expressions of the theoretical construct Environment, while qualities like risk-taking,
education, and networking might be expressions of the theoretical construct
Individual. A potential model could look like the one depicted in Figure 6.9.

Figure 6.9 Confirmatory factor model examples


We will assume that all the variables in rectangles are observables (we can
measure them with some questionnaire) and their associated errors represent
measurement error or variation not attributed to the common factor. Error variances
could be correlated if they share something common, like the same instrument for
example. The model could be more complicated with correlations (double arrows)
between observable variables, but for the purposes of demonstration here we will only
assume correlation between the two factors (Environment and Individual).
Considering coefficients (in this case called loadings as we saw in EFA) for the
relationships between the factors and observable variables we can express the model
of Figure 6.9 with the following equations:
MarketEnvironment = fem * Environment + ErrorM
Opportunity = feo * Environment + ErrorOp
FamilySupport = fef * Environment + ErrorFs
Networking = fin * Individual + ErrorN
RiskTaking = fir * Individual + ErrorRt
Education = fie * Individual + ErrorEd
By considering the order and ranking conditions (see the beginning of SEM)
as having been met, we assume the model is identified and proceed to
model/parameter estimation as we described before. Eventually, we will have
parameters estimated and we can focus on interpreting their significance. This means
deciding if their values reflect the theory or theoretical framework we are trying to
prove. Only then can we consider the model confirmed. It will then be left to follow-up experiments to confirm the validity of our model.
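As a rough illustration of what the estimation tries to achieve, the sketch below builds the model-implied covariance matrix (Σ = ΛΦΛ' + Θ) for the two-factor model of Figure 6.9 from assumed loadings and compares it with a hypothetical "observed" matrix S; all numbers are invented for demonstration.

```python
import numpy as np

# Assumed loadings of (MarketEnv, Opportunity, FamilySupport) on Environment and
# (Networking, RiskTaking, Education) on Individual
L = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
              [0.0, 0.7], [0.0, 0.8], [0.0, 0.6]])
Phi = np.array([[1.0, 0.4],           # factor variances and their correlation
                [0.4, 1.0]])
Theta = np.diag(1 - np.sum((L @ Phi) * L, axis=1))   # residual (error) variances

Sigma = L @ Phi @ L.T + Theta         # model-implied covariance matrix

# A hypothetical "observed" matrix S; the fit functions work with the residuals S - Sigma
S = Sigma + np.random.default_rng(1).normal(scale=0.02, size=Sigma.shape)
S = (S + S.T) / 2                     # keep the perturbed matrix symmetric
print(np.round(S - Sigma, 3))
```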

6.4 Time Series Analysis


While in the methods we have seen so far, our focus was on describing and
explaining, when possible, social phenomena, there are cases when we are interested
in predicting outcomes and improving processes. Although this will bring us close to
the domain of operations research, we will briefly mention here how we deal with
streams of data as they are produced in time as a result of a process. These types of
data are produced regularly by businesses and could include, among others, sales
figures, revenue, earnings, etc. over time. All these data are called time series data
and their data streams are called time series.
These data are of course of the ratio scale type and can be subjected either to
descriptive or inferential analysis. The former can provide a numerical and graphical
presentation and interpretation of the data streams and their patterns, while the latter
can be used to make forecasts and provide estimates of their reliability. This is typically
the case when we try to predict trends, whether it is the economy, the stock market,
etc.
A common technique used to describe a time series is through an index
number. Such numbers measure the rate of change of the data in their stream during
a period of time. The beginning of the time period is also referred to as base period.
In the world of business and economics, indexes are developed to measure price
changes and production/quantities. These indexes (symbolized as It) are expressed as
the ratio of the price or quantity of the item of interest at the point of interest (xt) over
its value at the base period (x0), multiplied by 100 to take the form of a percentage:
It = (xt / x0) * 100
In addition to the simple type of indexes mentioned before we also have
composite indexes that represent combinations (additions usually) of variables (like
prices or quantities) over their same combinations during their base period (times 100
to make into percentages). Consumer price indexes are such types of indexes. If we
also consider in these indexes weights for each variable before combining them to
account for the quantity sold, then we have weighted composite indexes. For a
stream of a set (x,y) of N number of values that represent a target period t with
quantities y at prices x and a corresponding set for a base period t0, such an index will
be expressed as:
It = 100 * Σ(xt * yt) / Σ(x0 * y0), with the sums running over the N items
While indexes can give us an idea about the data stream we study (similar to the mean and median we have seen in previous chapters), they can be quite misleading if the data in the stream fluctuate irregularly and rapidly. In such cases, we try to apply a technique called smoothing to remove the rapid fluctuations. One of the types of smoothing we can apply is exponential smoothing, which is a form of weighted average of past and present values of the time series. Consider a weight w (also called the smoothing constant) in the range between 0 and 1 that is applied throughout the series of data x. A value near 0 places the emphasis on the past (heavier smoothing), while a value near 1 places the emphasis on the most recent values (lighter smoothing). Formula 6.1 expresses the smoothing function.

Xt = wxt + (1-w)Xt-1        (6.1)
The smoothing process will produce replacement values X as follows:
X1 = x 1
X2 = wx2 + (1-w)X1
X3 = wx3 + (1-w)X2
………………………..
XN = wxN + (1-w)XN-1
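The recursion above is easy to code; a minimal Python sketch, applied to the first few closing values of Table 6.7 (any of the weights used there can be passed in):

```python
def exponential_smoothing(series, w):
    """Return the smoothed values X for smoothing constant w (0 < w < 1)."""
    smoothed = [series[0]]                        # X1 = x1
    for x in series[1:]:
        smoothed.append(w * x + (1 - w) * smoothed[-1])
    return smoothed

closing = [116.95, 116.64, 115.97, 115.82, 115.19, 115.19, 113.30]
for w in (0.1, 0.3, 0.5):
    print(w, [round(v, 2) for v in exponential_smoothing(closing, w)])
```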
Table 6.7 showcases the smoothing process with data for the closing value of
Apple’s stock during Fall 2016. Three different weights are considered, and their
influence can be seen in the smoothing they produce in Figure 6.10. If we are interested
in “long-term” trends we might consider the smoothing the weight of 0.1 produced
(blue line), which shows a decreasing tendency of the stock value. If we are interested
in capturing daily fluctuation, then the smoothing the weight of 0.5 has produced
might be more appropriate. In between we have the effect of the weight of 0.3. It is
evident that according to our interests we might consider which weight is more
appropriate.
The method we discussed, along with indexes, provides a descriptive representation of the data stream we study. While this is important in terms of understanding our time series, our interest is in making predictions or, more appropriately, forecasting future values of the series. Based on what we just did, we can continue applying the smoothing formula for the next future period t+1 by placing all the emphasis on the past that is already known. That means that we need to use w = 0 in formula 6.1 to produce the next values of the series, XN+1 = XN, XN+2 = XN+1 and so on until we reach our target future time. It is evident from this process that all the future values will equal the last smoothed value XN in our original series, so one would expect that the farther in time we move the less accurate our forecasts will become.

Table 6.7 Smoothing of Apple stock


Closed w = 0.5 w = 0.3 w = 0.1
116.95 116.95 116.95 116.95
116.64 116.80 116.86 116.92
115.97 116.38 116.59 116.82
115.82 116.10 116.36 116.72
115.19 115.65 116.01 116.57
115.19 115.42 115.76 116.43
113.30 114.36 115.02 116.12
113.95 114.15 114.70 115.90
112.12 113.14 113.93 115.52
111.03 112.08 113.06 115.07
109.95 111.02 112.13 114.56
109.11 110.06 111.22 114.02
109.90 109.98 110.82 113.61
109.49 109.74 110.42 113.19
110.52 110.13 110.45 112.93
111.46 110.79 110.76 112.78
111.57 111.18 111.00 112.66
111.79 111.49 111.24 112.57
111.23 111.36 111.23 112.44
111.80 111.58 111.40 112.37
111.73 111.65 111.50 112.31
110.06 110.86 111.07 112.08
109.95 110.40 110.73 111.87
109.99 110.20 110.51 111.68
107.11 108.65 109.49 111.23
105.71 107.18 108.36 110.67
108.43 107.81 108.38 110.45
107.79 107.80 108.20 110.18
110.88 109.34 109.01 110.25
111.06 110.20 109.62 110.33
110.41 110.30 109.86 110.34
108.84 109.57 109.55 110.19
109.83 109.70 109.64 110.16

Figure 6.10 Weighted exponential smoothing

One way to alleviate this problem is to consider adding some trend influence in the form of a component in the smoothed function we are using. One such model is the Holt forecasting model, expressed by the following pair of smoothed/weighted average equations:
Xt = wXxt + (1-wX)(Xt-1 + Tt-1) and Tt = wT(Xt - Xt-1) + (1-wT)Tt-1
This set of equations updates both the series values and the trends with
separate weights for each one of them.
The smoothing process starts from the second series value and continues until
it reaches the end of that time series and even to the point in the future we are
forecasting:
X2 = x2 and T2 = x2 – x1
X3 = wXx3 + (1-wX)(X2 + T2) and T3 = wT(X3 – X2) + (1-wT)T2
……

XN = wXxN + (1-wX)(XN-1 + TN-1) and TN = wT(XN – XN-1) + (1-wT)TN-1
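A sketch of the same Holt recursion in Python, with separate weights for the level and the trend and a simple forecast that projects the last level with the last trend; the short price series is again taken from the first values of Table 6.7.

```python
def holt_smoothing(series, w_level, w_trend):
    """Holt's model: smoothed levels X and trends T per the equations above."""
    X = [series[0], series[1]]              # X2 = x2 (the first value is kept as is)
    T = [0.0, series[1] - series[0]]        # T2 = x2 - x1
    for x in series[2:]:
        level = w_level * x + (1 - w_level) * (X[-1] + T[-1])
        trend = w_trend * (level - X[-1]) + (1 - w_trend) * T[-1]
        X.append(level)
        T.append(trend)
    return X, T

def holt_forecast(X, T, steps_ahead):
    """Project the last level with the last trend (w = 0 for future periods)."""
    return X[-1] + steps_ahead * T[-1]

closing = [116.95, 116.64, 115.97, 115.82, 115.19, 115.19, 113.30]
levels, trends = holt_smoothing(closing, w_level=0.5, w_trend=0.5)
print([round(v, 2) for v in levels], round(holt_forecast(levels, trends, 3), 2))
```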


Table 6.8 shows the results of the Holt forecasting model for the same weights that we
used in Table 6.7. The results are also plotted (Figure 6.11) for comparison with
corresponding plots of Figure 6.10. It should be evident from the comparison of the
graphs that the Holt model captures the visual trend with more consistency for the
various weight values.

Table 6.8 Holt smoothing


Closed    w = 0.5    w = 0.3    w = 0.1
116.95 116.95 116.95 116.95
116.64 116.64 116.64 116.64
115.97 116.15 116.28 116.32
115.82 115.74 115.86 115.99
115.19 115.31 115.43 115.64
115.19 115.01 115.03 115.28
113.30 114.04 114.47 114.89
113.95 113.46 113.79 114.45
112.12 112.69 113.07 113.96
111.03 111.35 112.12 113.39
109.95 110.12 110.93 112.71
109.11 109.08 109.71 111.92
109.90 109.01 108.82 111.05
109.49 109.21 108.44 110.21
110.52 109.74 108.53 109.48
111.46 110.80 109.11 108.95
111.57 111.52 109.98 108.66
111.79 111.85 110.80 108.60
111.23 111.69 111.31 108.70
111.80 111.68 111.57 108.88
111.73 111.82 111.74 109.09
110.06 110.98 111.61 109.29
109.95 110.06 111.03 109.40
109.99 109.80 110.40 109.40
107.11 108.35 109.54 109.26
105.71 106.26 108.08 108.93
108.43 106.61 106.87 108.40
107.79 107.51 106.59 107.82
110.88 109.19 107.24 107.37
111.06 110.90 108.59 107.21
110.41 111.08 109.88 107.34
108.84 110.01 110.33 107.63
109.83 109.56 110.17 107.92

Figure 6.11 Holt model smoothing

Using this smoothing process for forecasting is done in the same way as before. We
consider w = 0 for the future values since we are basing our predictions on the past.
XN+1 = XN + TN and TN+1 = TN (6.2)
XN+2 = XN+1 + TN+1 and TN+2 = TN+1 (6.3)
If we substitute XN+1 and TN+1 from their previous values (6.2) in (6.3) we get:
XN+2 = XN + TN + TN and TN+2 = TN or XN+2 = XN + 2TN
If we continue in this way, we will get XN+3 = XN + 3TN and so on, until after t periods we reach our target date with:
XN+t = XN + t*TN
If we consider what we did with linear regression in Chapter 4, it should be evident that the equations we just derived look like regression equations. This would suggest that a regression equation of the form X = a0 + a1t could also be considered in forecasting situations. Which approach to use will depend on the research problem we study. For example, if we are dealing with the variability that real-life situations create, like the Apple stock values in Table 6.7, we would be better off with the smoothing-based forecasting just described.
A better working approach that requires a lot more information would be to
use what is known as an additive model. In such a model, various influences are
considered as added components that influence the time series values like:
Xt = Tt + Ct + St + Rt

• Tt is usually called secular trend and represents long-term trends/influences (could be years to decades). The data values in Table
6.7 that led to the shape of the blue smoothed line of Figure 6.10 could
be an indication of such a trend — in our case this is a downward
trend.
• Ct is called cyclical effect/trend and represents fluctuations around
the secular term like recessions and growth (could be months/quarters
to years). In Figure 6.10 the shape of the green line could be an
indication of such a trend.
• St is called seasonal effect and accounts for fluctuations during
specific time periods (could be hours, days, or months/quarters). The
value of the stock could be such an effect (red or gray lines in Figure
6.10).
• Rt is the residual effect and represents the remainder after all previous
effects have been accounted for. This might include unpredictable
events like the death of the CEO, the destruction of a production
facility, etc.
Combining this form with the regression model we presented before we have
what is known as a seasonal model, especially if the individual terms that are added
to the regression equation represent quarters:
Xt = a0 + a1t + a2Q1 + a3Q2 + a4Q3 + Error
Similar to additive models we have multiplicative models that produce the
natural logarithm of a prediction instead:
lnXt = a0 + a1t + a2Q1 + a3Q2 + a4Q3 + Error
In such forms we can consider that the term (a1t) represents the secular trend Tt, the term (a2Q1 + a3Q2 + a4Q3) represents the seasonal and cyclical effects (Ct + St), and the Error term represents the residual effect Rt. The multiplicative model is often regarded as a better forecasting model than the previous ones we mentioned.
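A sketch of fitting such a seasonal model with ordinary least squares via numpy; the quarterly series and the dummy coding (with Q4 as the baseline) are illustrative assumptions, and taking the logarithm of the series instead would give the multiplicative form.

```python
import numpy as np

# Hypothetical quarterly sales showing a trend and a seasonal pattern
sales = np.array([10, 14, 12, 8, 11, 15, 13, 9, 12, 16, 14, 10], dtype=float)
t = np.arange(1, len(sales) + 1)
quarter = (t - 1) % 4                          # 0, 1, 2, 3 repeating

# Design matrix: intercept, trend t, and dummies for Q1, Q2, Q3 (Q4 is the baseline)
X = np.column_stack([np.ones_like(t, dtype=float), t,
                     (quarter == 0).astype(float),
                     (quarter == 1).astype(float),
                     (quarter == 2).astype(float)])

coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
a0, a1, a2, a3, a4 = coef
print(np.round(coef, 3))

# Forecast for the next period (t = 13, a Q1 observation)
print(round(a0 + a1 * 13 + a2, 2))
```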

A critical issue that has not been addressed yet is the accuracy of the various forecasting methods and the metrics that we can use to measure the differences between the forecasts and the actual values they predict (when the latter become available). This difference is also referred to as the forecast error. In practice, some of the metrics we use include the mean absolute deviation (MAD), the root mean squared error (RMSE), and the mean absolute percentage error (MAPE):
MAD = (1/N) Σ|xt - Ft|,   RMSE = sqrt[(1/N) Σ(xt - Ft)^2],   MAPE = (100/N) Σ|xt - Ft|/|xt|,
where xt is the actual value and Ft the forecasted value for period t.
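These metrics take only a few lines to compute; a sketch with made-up actual and forecasted values:

```python
import numpy as np

actual = np.array([110.5, 111.2, 109.8, 108.4])     # made-up realized values
forecast = np.array([110.0, 111.0, 110.5, 109.0])   # made-up forecasts
errors = actual - forecast

mad = np.mean(np.abs(errors))                  # mean absolute deviation
rmse = np.sqrt(np.mean(errors ** 2))           # root mean squared error
mape = 100 * np.mean(np.abs(errors / actual))  # mean absolute percentage error
print(round(mad, 3), round(rmse, 3), round(mape, 3))
```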

Consideration of errors in time series needs to be done in association with the forms of trend we consider, as some of them, like the secular and seasonal trends, include cyclical influences that could support or inhibit a trend periodically. If we were to examine the residuals of our estimations (real values minus forecasted values), they may fluctuate in a systematic way around the main trend (the regression line, for example). This creates a form of correlation amongst the residuals (often referred to as autocorrelation) that, in the case of neighboring residuals, is also called first-order autocorrelation. The presence of such correlations can be identified by specific metrics like the Durbin-Watson d-statistic, but their discussion is beyond the scope of this book.

6.5 Bayesian Analysis


Bayesian analysis (also known as Bayesian inference and Bayesian statistics) refers to the application of Bayes’ theorem to update the probabilities of hypotheses as improved information becomes available. It encapsulates the notion of subjective probability, which refers to our initial estimates based on preconceived notions and stereotypes, or early predictions made without the consideration of all facts. The use of new evidence that Bayesian statistics considers allows for the dynamic analysis of sequences of data in real time, making this type of analysis widely applicable to science, engineering, and medicine, among others. The consequence of this practice is that we no longer consider a fixed population that can be expressed through fixed parameters, but rather a flexible one where parameters vary in response to changes (new evidence).
Before we reveal more about the nuances of Bayesian analysis let us first
consider an example to highlight the case. Suppose we have an organization where
employees are either lawyers or engineers. We know from experience (movies
included) that at work 95% of lawyers wear suits while only 15% of engineers wear
suits. We visit the organization one day and the person who greets us is wearing a suit.
Based on what we know, would we assume the greeter is a lawyer or an engineer? The
situation is depicted in Figure 6.12.a below as a tree structure.

Figure 6.12 Tree representation of a decision scenario

Our “facts” up to that point would suggest that chances are the employee in front of us is a lawyer (0.95 probability). As in real-life situations, though, there are times when information becomes available that might change our minds. In this case let us say that through prior research we learn that 90% of the organization’s employees are engineers (perhaps this is a construction firm) and only 10% are lawyers. That would obviously change our perception of the situation we are facing (Figure 6.12.b). This new “reality” reveals a different set of facts: we now have a 0.9*0.15 or 0.135 probability of an employee being an engineer who wears a suit and a 0.1*0.95 or 0.095 probability of an employee being a lawyer who wears a suit. These updated pieces of evidence are joint probabilities, each obtained by multiplying a prior proportion by a conditional probability (or likelihood), symbolized as P(Observation|Evidence) and interpreted as the probability of the Observation given the Evidence. In our case the likelihoods are P(Suit|Engineer) = 0.15 and P(Suit|Lawyer) = 0.95.
Based on the information from our prior research we now have an updated set of facts: a 13.5% chance that a randomly chosen employee is a suit-wearing engineer and a 9.5% chance that they are a suit-wearing lawyer. This would suggest that the employee who greeted us has more chance of being an engineer than a lawyer. In fact, if we want to calculate the probability of someone being an engineer or a lawyer based on our evidence that the employee is wearing a suit, we can apply the definition of probability presented in Chapter 3 to the universe of the suit-wearing employees only.

Figure 6.13 Venn diagram representation of a decision scenario

The image in Figure 6.13.a displays a more realistic representation (in terms of surface areas) of the employee situation based on all of the available evidence. The shaded areas represent those wearing suits: although suit wearers dominate among the lawyers, the shaded area that represents engineers wearing suits is much larger. This is more evident in Figure 6.13.b, where only those employees in the organization who are wearing suits are considered. From that image we can calculate the actual probabilities (called posterior) of someone being an engineer or a lawyer given the observation that they are wearing a suit. Accordingly, the posterior probability that the employee who greeted us is an engineer is 0.135/0.23 or 0.59 (59%) and that they are a lawyer is 0.095/0.23 or 0.41 (41%). So, the chances are 59% for engineer against 41% for lawyer.
If we were to express these calculations in a more detailed form (combining Figures 6.12.b and 6.13.b), we could say that the posterior probability of an employee being an engineer or a lawyer, given that he or she is wearing a suit, is 0.135/0.23 = 0.15*0.9/(0.135+0.095) and 0.095/0.23 = 0.95*0.1/(0.135+0.095) respectively. These latter expressions can be derived from the following general formula, also known as Bayes’ formula/theorem:

P(Something|Observation) = P(Observation|Something) * P(Something) / P(Observation)
P(Something|Observation) is the probability (called the posterior) of having Something given an Observation. In the case of our example, the probability of having an engineer given our observation that the employee is wearing a suit was found to be 0.59 (59%).
P(Observation|Something) is the probability (called the likelihood) of the Observation given Something. In the case of our example, the probability of an employee wearing a suit when we know he/she belongs to the engineers was given as 0.15 (15%).
P(Something) is the probability (called the prior) of Something in the total population. In the case of our example, the probability of any employee being an engineer was given as 0.9 (90%).
P(Observation) is the probability (called the marginal) of the Observation in the total population. In the case of our example, the probability of observing a suit was 0.135 + 0.095 or 0.23 (23%).
In the case of multiple events (a decision tree with more branches like say
lawyers, engineers and secretaries in our example) the only thing that changes in Bayes
formula is the denominator which needs to include all contributions to the property
we observe (the suit in the example). In a general form this can be expressed as:
P(Observation) = P(Observation|Something1) * P(Something1) +
P(Observation|Something2) * P(Something2) +
P(Observation|Something3) * P(Something3) + …
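The suit example, and its extension to more categories, translates directly into code; a minimal sketch of Bayes' formula (the function name is ours):

```python
def posteriors(priors, likelihoods):
    """Bayes' formula for several mutually exclusive categories.
    priors[k] = P(k); likelihoods[k] = P(observation | k)."""
    joints = {k: priors[k] * likelihoods[k] for k in priors}
    marginal = sum(joints.values())            # P(observation), the denominator
    return {k: joint / marginal for k, joint in joints.items()}

# The suit example: priors from the staff mix, likelihoods of wearing a suit
result = posteriors({"engineer": 0.9, "lawyer": 0.1},
                    {"engineer": 0.15, "lawyer": 0.95})
print({k: round(v, 2) for k, v in result.items()})   # engineer 0.59, lawyer 0.41
```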
The reader might have noticed that with Bayesian analysis we are dealing with probabilities similar to how we dealt with proportions in categorical/nominal variables at the beginning of Chapter 4. This probability approach of Bayesian statistics distinguishes it from what we did in Chapters 4, 5, and 6, which is commonly referred to as frequentist statistics. The approach forms the basis of many decision-making situations, as we will see in the next section.
What is interesting about Bayes’ formula is that it can be used in what we call Bayesian inference. Because our populations in reality are not known, in this type of inference we have models (mathematical formulations) of the population, expressed through parameters that get updated as new evidence arrives. To illustrate what this means, let us consider the tossing of a coin as a case in point (as we did in Chapter 3). We might be interested in the probability, for example, of 3 heads out of 4 tosses given the coin is fair. Using the binomial distribution with parameters 3 successes in 4 trials (Section 3.3.3) or a tree representation of the alternatives, we get a probability of 0.25 for our observation. This might have us worry about the fairness of the coin. This is where Bayesian inference helps, as it draws our attention to the coin rather than the outcome. In other words, it rephrases the question from:
What is the probability of 3 heads in 4 tosses given that we have a fair coin? to
Given the observation of 3 heads in 4 tosses, what is the probability of the coin being fair?
In terms of Bayes’ theorem we will have:
P(FairCoin|Observation) = P(Observation|FairCoin) * P(FairCoin)/P(Observation)
In reality, our estimates are based on guesses and are better captured as distributions (more precisely beta distributions in Bayesian statistics) that represent the strength of our beliefs about the parameters of a population based on previous experience. This approach allows us to work even in situations with no previous experience (uninformative priors). Because a detailed coverage of the subject is beyond the scope of this book, for now we will just focus on how it relates to the things we covered in previous chapters and more specifically the concept of significance.
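A short sketch of this shift in perspective using scipy: the binomial answers the first question, while a Beta prior updated by the data describes our belief about the coin's probability of heads. The flat Beta(1, 1) prior is an assumption made purely for illustration.

```python
from scipy.stats import binom, beta

# Frequentist-style question: probability of the data given a fair coin
print(round(binom.pmf(3, n=4, p=0.5), 3))             # 0.25

# Bayesian-style question: belief about the coin after seeing 3 heads in 4 tosses.
# With a flat Beta(1, 1) prior, the posterior is Beta(1 + heads, 1 + tails).
posterior = beta(1 + 3, 1 + 1)
print(round(posterior.mean(), 3))                      # posterior mean of P(heads)
print([round(v, 3) for v in posterior.interval(0.95)]) # a 95% credibility interval
```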
In Chapters 4 and 5, our main criterion of evaluation was the p-value. We
tended to interpret p-values as the probability of a sample having a certain parameter.
For example, in a t-test with p = 0.05 on a sample with mean 30 we would say that
there is a 95% probability that the sample mean represents the actual population. If
we were to perform the same test with different sample sizes, we would get different
t-scores and hence different p-values leaving the acceptance or rejection of the null
hypothesis uncertain and dependent on the sample size. This uncertainty will
eventually spill over to confidence intervals leaving us with no way of knowing which
values are most probable.
The equivalent of the p-value in Bayesian analysis is the Bayes factor. The null hypothesis here assumes a probability distribution concentrated entirely at the parameter value (the mean of 30 in the previous example or the 3 successes in the coin case) and zero everywhere else. The alternative hypothesis in this case assumes all values of the parameter are possible (a uniform distribution). Because prior and posterior odds are involved, the
Bayes factor is calculated as the ratio of the posterior odds to the prior odds. To reject
the null hypothesis, we usually consider Bayes factors below 0.1. The confidence
intervals in this approach become credibility intervals.
While confidence intervals suggest our belief that a certain percentage of
samples (95% in most cases) will contain the parameter we investigate (true value of
the population) the credibility interval expresses the probability (95%) that the
population mean is within the limits of the interval. If the logic behind these
expressions seem similar keep in mind that they are conceptually different as the
former considers that a true value of the population can be captured while the latter
assumes there is an inherent uncertainty in capturing it. Bayesians assume that updated
information can improve our convergence to reality.
Despite the attractiveness of the Bayesian approach in reflecting the reality that experience improves our predictions, this approach does have some shortcomings. For one thing, priors are mostly subjective in nature and influenced by our preconceived notions and stereotypes of situational characteristics. Personal beliefs and environmental influences might contribute to bias in our estimation/guessing of priors, limiting the application of a balanced and reliable Bayesian approach. Experience in such cases becomes a determining factor for a successful implementation of a Bayesian model. An additional challenge with Bayesian analysis is the computational complexity of the multi-factor situations that we see in real-life cases. Considering all these challenges, the choice of whether we follow a frequentist or a Bayesian approach to our analysis should be problem dependent. The two approaches can complement each other, as each can address the other's flaws and in this way mitigate real-world problems.

6.6 Decision Analysis


Rules of thumb, intuition, and experience might not be enough for someone
to comprehend the multiple dimensions of decision making, especially nowadays that
there are vast amounts of information that need to be processed. Using formal
techniques in this process is imperative for reaching successful outcomes. Decision
analysis methods provide the theoretical background for quantitative support for
decision makers. The purpose of formal decision analysis is to select decisions from a
set of possible alternatives, especially in the case where uncertainties regarding the
future exist. These uncertainties are usually expressed in terms of the probabilities of
the various outcomes and our goal will be to optimize the resulting return or payoff
in terms of some decision criterion.
Data and probabilities we assign to processes and outcomes are formalized in
models that will allow mathematical processing to reach a solution. In addition, models
allow understanding of a situation in terms of the variables affecting it and add
predictive capabilities and visualization of outcomes. Developing a model is reaching
a balance between reality and representation. Too analytical models tend to be difficult
to understand and apply in decision making, while too simplistic ones will be viewed
as a waste of time for stating the obvious. The model that provides insight and can
handle the most probable alternatives is what creates solutions and adds value to a
decision maker.
The basic principle upon which models for decision analysis are built is that of
a system. In the systems perspective everything can be determined from its parts
and their interaction with the environment. In terms of modeling, that means that
models are assumed to be closed in that they inherently include or represent every
possible solution that can exist. Although this might sound quite rigid and far from
reality where situations change dynamically due to uncertainties in the parameters that
define them, for certain cases models become good approximations of reality.
For our purposes here we will consider two types of decision models:
deterministic ones where the value of a decision is completely judged by the outcome,
and probabilistic ones where in addition to outcomes we have an amount of risk to
consider for each decision. The two types of models are directly related to past and
future experience as in the first case we know exactly what happened while in the latter
we have to guess with some degree of certainty. The concept of probability is very
important in real-life situations since there is no such thing as complete knowledge of
events, issues, and people involved. Of major concern in decision analysis and
something we should always be aware of is the degree of reliability of our estimation
and the probability distributions we use to represent situations along with the fact that
emotions can be involved in risky decisions.

Decision analysis is based on combining decision alternatives with possible
future events in pictorial representations that make analysis and location of optimum
solutions easy to identify and view. From the plethora of models applied in research
and real-life situations, we will present here simplified forms of three of the most
popular and frequently used models. Payoff table analysis, decision trees, and game
theory have been extensively studied and used in decision making and negotiations
and can be quite practical for situations with few alternatives and states of nature. An
alternative to the aforementioned normative methods is simulations. They can be
developed to explore complex phenomena and due to their importance, they will also
be discussed here.

6.6.1 Payoff Table Analysis


Payoff tables represent situations as matrices of rows and columns. They are
ideal when we are facing finite sets of discrete decision alternatives whose outcome is
a function of a single future event. The rows of the matrix correspond to decision
alternatives, while the columns correspond to possible mutually exclusive future events
(states of nature). They presume free choice of decision alternatives and no control
of future states. We will demonstrate the application of payoff tables through an
example. Let us assume that we are interested in investing $1,000 for a year. We consult
a broker who recommends five potential investments (like gold, corporate or
government bonds, etc.). We assume five possibilities of the direction the market can
take (large rise, small rise, no change, small fall, large fall) during the investment year
and through research we identified payoffs (gains or losses for the $1,000 investment)
for each situation and the five investment options (Table 6.9). With the information
available before us we need to consider our options and decide on an “optimal”
investment.
It should be obvious from Table 6.9 that Investment2 outperforms Investment5 regardless of the direction the market takes, so for all purposes we can ignore the latter (Table 6.10). It now comes down to what we perceive as appropriate for us in choosing an investment. This is more of a personality/attitude issue than anything else. A pessimist, for example, would consider what they can make under the worst possible scenario for each investment (yellow cells in Table 6.11.a), while an optimist would consider the highest payoff for each investment (yellow cells in Table 6.11.b). Investment4, of course, doesn't pose any challenge, as it pays the same regardless of what the market does, so it could be considered a riskless choice. For moderate decision makers, on the other hand, an alternative would be to sum up all possible outcomes for each investment/row (Table 6.11.c) and choose the highest total; with equally likely states of nature these sums rank the alternatives the same way expected values (EV) would. Another possibility that is often considered is the maximum regret case. In such situations, we compute for each entry the difference between the maximum gain for its market direction and the entry itself (Table 6.11.d). For a Large Rise, for example, the maximum gain is 500 (Investment3), so we subtract every entry in the Large Rise column from 500. Again, depending on our degree of optimism and pessimism, we could decide on the appropriate investment/row.

Table 6.9 Payoff Table


States of Nature
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Investment1 -100 100 200 300 0
Investment2 250 200 150 -100 -150
Investment3 500 250 100 -200 -600
Investment4 60 60 60 60 60
Investment5 200 150 150 -200 -150

Table 6.10 Reduced Payoff Table


Market Direction
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Investment1 -100 100 200 300 0
Investment2 250 200 150 -100 -150
Investment3 500 250 100 -200 -600
Investment4 60 60 60 60 60

Table 6.11.a Payoff Table Pessimist's View


Market Direction
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Investment1 -100 100 200 300 0
Investment2 250 200 150 -100 -150
Investment3 500 250 100 -200 -600
Investment4 60 60 60 60 60

Table 6.11.b Payoff Table Optimist's View


Market Direction
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Investment1 -100 100 200 300 0
Investment2 250 200 150 -100 -150
Investment3 500 250 100 -200 -600
Investment4 60 60 60 60 60

Table 6.11.c Payoff Table Neutral View


Market Direction
Decision Large Small No Small Large
Sum
Alternatives Rise Rise Change Fall Fall
Investment1 -100 100 200 300 0 500
Investment2 250 200 150 -100 -150 350
Investment3 500 250 100 -200 -600 50
Investment4 60 60 60 60 60 300

Table 6.11.d Payoff Table Maximum Regret


Market Direction
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Investment1 600 150 0 0 60
Investment2 250 50 50 400 210
Investment3 0 0 100 500 660
Investment4 440 190 140 240 0

In cases where there is no knowledge of the probabilities of the states of nature
(market in our case), subjective criteria will influence the decision process. Decision
maker personalities can range from pessimistic to conservative to optimistic,
influencing the strategy for approaching a decision problem and selecting an option.
A pessimistic decision maker usually expects the worst possible result no matter what
decision is made, while an optimistic one feels that luck is always shining and whatever
decision they make, the best possible outcome will occur. Somewhere in the middle
we find the conservative decision maker who ensures a guaranteed minimum payoff
regardless of which state of nature occurs. The reward is always a function of risk so
great rewards come with higher risk and one needs to balance what they need with
what they can afford.
If some knowledge of the probabilities of the various states of nature exists, payoff table analysis can assure an optimal decision (in the long run) by taking into consideration every possible state of nature. In our example, we might have additional information from an expert that we want to consider. Table 6.12 shows the probabilities the expert assigned to the various directions the market can take.
Considering the neutral view perspective of Table 6.11.c we can repeat the process we
followed there, only this time each value is multiplied by the corresponding probability
of each market direction. The same process can be applied to the maximum regret
case of Table 6.11.d if we are interested to see what it produces with the added
information of the probabilities. In case we are interested in gains we can multiply the
maximum gain for each market direction with the probability of that market direction
happening and sum the results over all market conditions. We then get what is known
as expected return of perfect information (ERPI).
Table 6.12 Payoff table with expert probabilities and expected values
Market Direction
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Probability 0.20 0.30 0.30 0.10 0.10 EV
Investment1 -100 100 200 300 0 100
Investment2 250 200 150 -100 -150 130
Investment3 500 250 100 -200 -600 125
Investment4 60 60 60 60 60 60
Max 500 250 200 300 60 ERPI
Gain 100 75 60 30 6 271
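The EV column and the ERPI of Table 6.12 can be reproduced with a few lines of numpy; this is only a sketch of the calculation using the payoffs and probabilities above.

```python
import numpy as np

payoffs = np.array([[-100, 100, 200,  300,    0],    # Investment1
                    [ 250, 200, 150, -100, -150],    # Investment2
                    [ 500, 250, 100, -200, -600],    # Investment3
                    [  60,  60,  60,   60,   60]])   # Investment4
probs = np.array([0.20, 0.30, 0.30, 0.10, 0.10])

ev = payoffs @ probs                     # expected value of each investment
erpi = payoffs.max(axis=0) @ probs       # expected return of perfect information
print(np.round(ev, 2))                   # [100. 130. 125.  60.]
print(round(erpi, 2))                    # 271.0
```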

A very useful technique to enhance the analysis we made is to combine it with
Bayesian analysis and utility theory. Bayesian analysis uses sample information to
aid decision making by fine-tuning probability estimates, while utility theory allows for
utility values that reflect the decision maker’s perspective for each possible outcome.
For the investment example, we presented here it might be that we have information
on the success or failure rates of the expert regarding the direction the market takes.
It could be for example that when the market experiences a large rise the expert
predicted it correctly 80% of the times. Let us assume the rates of success of the expert
for the series of market conditions Large Rise, Small Rise, No Change, Small Fall,
Large Fall are 80%, 70%, 50%, 40%, and 0% and corresponding failure rates are 20%,
30%, 50%, 60%, and 100%.
By multiplying the expert provided probabilities with his/her success and
failure rates (Table 6.13) we get the joint probabilities of each prediction. The
percentage of each of these probabilities with respect to the corresponding joint
probabilities total (0.56 and 0.44 in the case of the expert’s success or failure
respectively) results in what is called posterior probabilities. These are used the same
way we calculated EV and will produce the expected value of perfect information for
expert success (EVPI+) and failure (EVPI-).
With respect to utility theory mentioned before, one can assign probabilities
based on an ordering of the payoffs. Table 6.14 displays the ordering of the payoffs of
Table 6.10 in ascending order and their assigned utilities in the form of probabilities
along the spectrum from 0 to 1. These probabilities from now on can be used for the
analysis instead of the actual payoffs. Following the expected value process (sum over
the products of probability times utility across all market directions for every
investment), we get what is called the expected utility (Table 6.15). This process
seems to suggest Investment3 as the better alternative.

Table 6.13 Joint probabilities payoff table

                                        Market Direction
                                   Large   Small   No      Small   Large
                                   Rise    Rise    Change  Fall    Fall
Probability                        0.20    0.30    0.30    0.10    0.10
Expert success rate                0.80    0.70    0.50    0.40    0.00    Sum
Joint probabilities (success)      0.16    0.21    0.15    0.04    0.00    0.56
Posterior probabilities (success)  0.29    0.38    0.27    0.07    0.00
                                                                           EV     EVPI+  EVPI-
Investment1                        -100    100     200     300     0      500     84     120
Investment2                         250    200     150    -100    -150    350    179      67
Investment3                         500    250     100    -200    -600     50    249     -33
Investment4                          60     60      60      60      60    300     60      60
Probability                        0.20    0.30    0.30    0.10    0.10
Expert failure rate                0.20    0.30    0.50    0.60    1.00    Sum
Joint probabilities (failure)      0.04    0.09    0.15    0.06    0.10    0.44
Posterior probabilities (failure)  0.09    0.20    0.34    0.14    0.23

Table 6.14 Utility values

Payoff Probabilities
-600.00 0.00
-200.00 0.25
-150.00 0.30
-100.00 0.35
0.00 0.50
60.00 0.60
100.00 0.65
150.00 0.70
200.00 0.75
250.00 0.85
300.00 0.90
500.00 1.00

Table 6.15 Payoff table with utilities


Market Direction
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Expected
Probability 0.20 0.30 0.30 0.10 0.10 Utility
Investment1 0.35 0.65 0.75 0.90 0.50 0.630
Investment2 0.85 0.75 0.70 0.35 0.30 0.670
Investment3 1.00 0.85 0.65 0.25 0.00 0.675
Investment4 0.60 0.60 0.60 0.60 0.60 0.600

The choice of action for the methods discussed here is, as always, left to the decision maker. Risk-averse individuals try to reduce uncertainty, so they tend to “underestimate” utility, while risk-taking ones assign increasingly higher utility as the earnings increase. Risk-neutral ones tend to be more reserved in their estimation and assignment of utilities. Figure 6.14 provides a pictorial depiction of the aforementioned categories.

Figure 6.14 Risk attitudes

6.6.2 Decision Trees


A popular process in decision making concerns the development of a decision
tree. We have seen the simple case of developing a decision tree when discussing
probabilities in the case of the coin flipping example (Figure 3.15). In decision analysis,
the tree can become complicated and difficult to traverse to find an optimum solution.
From the perspective of decision analysis, trees are seen as sequential representations
of all possible events involved in a scenario with combinations of decision and state
nodes to represent decision alternatives and their outcomes. Decision nodes
(represented by square shapes) are the available alternatives to a decision maker at each
stage of a decision while states of nature (represented by circles) are the possible
natural events that can happen following a decision. The latter is expressed in terms
of probabilities for each possible outcome.

Figure 6.15 displays the decision tree of a development project case where a company needs to decide about investing to get a permit to build a plant that it will later sell for profit. The company is also considering hiring a consultant to provide a prediction of whether the application to the city council for a permit to build will be successful or not. The expert's records suggest that 40% of the time he predicts approval and 60% of the time denial. When he predicts approval, it is also known that 70% of the time the approval is indeed granted while 30% of the time it is denied. The diagram of Figure 6.15 shows all possible outcomes with the additional information that pertains to the problem (expert fee €5,000, purchase option fee €20,000, price of land €300,000, development permit application fee €30,000, development cost €500,000, revenue from sale €950,000, and a 40% approval rate by the city council).

Figure 6.15 Decision tree analysis

The process starts by deciding if we will hire an expert to give us a prediction
about the fate of a permit application. If we don’t hire one (upper branch of tree in
Figure 6.15) then we need to decide whether we wish to go ahead and buy the land
(cost €300,000) or pay for a purchase option (cost €20,000) that will reserve the land
for us until we hear about the fate of our development permit application. Of course,
there is always the option of doing nothing, in which case we will end up with nothing.
The other alternative in our first decision node is to go ahead and hire an expert (cost
€5,000) who successfully predicts approval 40% of the time and denial 60% of the
time.

For the purposes of illustration, we will follow here only one branch of the
tree and leave it to the reader to confirm the remainder of the tree. Let us say we
choose the option where the expert predicts denial to our permit application (0.6
probability of this happening). At that point (bottom branch) we still have the option
to go ahead with our application, but we need to decide if we will buy the land or the
purchase option. Let us say we decide to avoid the big investment of buying the land
and we go with the purchase option alternative (cost €20,000) which is still at the
bottom branch of the decision tree. We then submit our permit application (cost
€30,000) which has a 20% chance of approval and 80% chance of denial. Let us assume
here that we end up with approval (second from bottom branch). We will then have
to buy the land (cost €300,000), complete the development (cost €500,000), and
eventually sell for €950,000. If we extract from the revenue all the costs, we end up
with a profit of €95,000. The probability of something like this happening is the
multiplication of the probabilities (see Chapter 3) we encountered along the branch of
the tree we followed. In this case this is the probability of the expert predicting denial
times the probability of having our permit approved (all other probabilities along the
path are 1 as they represent certainties). This product results in a probability of 0.12
or 12% for obtaining the profit of €95,000.
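The arithmetic of this branch can be verified with a few lines of code; the following sketch simply mirrors the figures quoted above:

    # Cash flows along the branch: expert fee, purchase option, permit
    # application fee, land purchase, development cost, and final sale
    revenue = 950_000
    costs = [5_000, 20_000, 30_000, 300_000, 500_000]
    profit = revenue - sum(costs)       # 95,000

    # Branch probability: expert predicts denial (0.6) times
    # permit approved after a denial prediction (0.2)
    probability = 0.6 * 0.2             # 0.12

    print(profit, probability)          # 95000 0.12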
Traversing the other branches of the tree we get to all possible leaves that
represent the expected payoffs of each branch. Green leaves represent profits while
the pink represent losses. Whether we choose one of the branches (obviously one that
leads to profit) at a decision node depends on our attitude towards risk (Figure 6.14)
and any additional information we might happen to have at the time of the decision.
The optimal branch we end up following, based on the risk we are willing to take, is
called the critical path.
The formalism of decision trees is very powerful in situations with clear
objectives and predetermined states of nature. As in most methods, there are always
disadvantages that we need to consider when selecting the decision tree method. One
such disadvantage stems from our inability to assign realistic probabilities for future
events. In addition, the more complicated a situation is, the more branches we need to
represent it realistically, making detection of the optimum path difficult.

6.6.3 Game Theory


In many decision-making scenarios, the situations we face are zero-sum
transactions, meaning that whatever one party gains the other loses. Such cases are
ideal for applying game theory to determine the optimal decision. The payoffs here are
based on the actions taken by competing individuals who are seeking to maximize their
return. From that perspective, decision theory can also be viewed as a special case of
game theory in which we play against nature.

Parameters taken into consideration when applying game theory include:

• Number of players – we can have situations with two players like in the game
of chess or many players like in the game of poker.
• Total return – it can be a zero-sum game like poker among friends and non-
zero-sum game like poker in a casino where the “house” takes a cut.
• Player turns – turns can be sequential, where each player takes a turn that affects
the states of nature (as in Monopoly), or simultaneous, where each state of nature
is defined after all players declare their moves (as in rock-paper-scissors).
The most popular case of game theory is the prisoner’s dilemma. In this
situation two suspects are caught for a criminal act and, during their separate
interrogations, an attempt is made to motivate them to confess by presenting them with
different options. If both confess their participation in the crime, each receives a reduced
jail time of 5 years. If one denies while the other confesses, the one who denies receives
10 years of jail time while the confessor receives only 1 year. Otherwise (if they both deny)
they will each end up with 2 years in jail (maybe for a minor offence like being
present and passive at the crime scene). Such a scenario can easily be represented with
a payoff table like the one displayed in Table 6.16.
Table 6.16 Prisoner’s dilemma payoff table

                                  Suspect A
                         Confess             Deny
Suspect B   Confess      A: 5,  B: 5         A: 10, B: 1
            Deny         A: 1,  B: 10        A: 2,  B: 2

Provided each suspect knows that the other also received the same alternatives,
and excluding potential loyalties and any other influences, it appears that there is an
optimal option (called Nash equilibrium) and that is to confess. Consider suspect B
for example. If he assumes that suspect A is going to confess, then he is better off
confessing too (top left cell in Table 6.16) as he will receive a 5-year sentence, while if
he denies (bottom left cell) he will receive a 10-year sentence. If he assumes the other
suspect will deny, again he is better off confessing and receiving a 1-year sentence
instead of denying and receiving a 2-year sentence. Rationality in the prisoner’s
dilemma case suggests that the choice is not the globally optimum where both deny
but instead the case of both confessing. The problem with the global optimum that
makes it unlikely to be chosen (all other factors like influences and loyalties excluded)
is that it is an unstable state in that it can always be improved for each one of the
suspects individually as there is always a better alternative (1-year sentence).
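The best-response reasoning can also be checked programmatically. The following sketch (with jail years taken from Table 6.16, where fewer years are better) scans all strategy pairs and reports the only one from which neither suspect can improve by switching unilaterally:

    # Jail years for (A's choice, B's choice) -> (years for A, years for B)
    years = {
        ("Confess", "Confess"): (5, 5),
        ("Confess", "Deny"):    (1, 10),
        ("Deny",    "Confess"): (10, 1),
        ("Deny",    "Deny"):    (2, 2),
    }
    choices = ["Confess", "Deny"]

    for a in choices:
        for b in choices:
            years_a, years_b = years[(a, b)]
            # Can A or B get fewer years by deviating while the other stays put?
            a_improves = any(years[(alt, b)][0] < years_a for alt in choices)
            b_improves = any(years[(a, alt)][1] < years_b for alt in choices)
            if not a_improves and not b_improves:
                print("Nash equilibrium:", a, b)   # prints: Confess Confess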

Let us consider now a more appropriate example of a business situation where
game theory can be of value. Let us assume we are assigned to plan the promotion
strategy of a supermarket (B) competing against a rival (A) in certain food categories
for a certain week. We will assume that Supermarket A is in general promoting four
product categories (Fruit, Dairy, Meat, and Bakery) while Supermarket B will be
promoting three (Fruit, Dairy, and Meat). From past promotion strategies we know
what gains or losses have been achieved for the various combinations of strategies
between the two competitors. Table 6.17 shows the gains (let’s say percentage increase
or decrease in sales) of one against the other based on what promotion strategy each
one chooses.
Table 6.17 Payoff table for competing strategies

                               Supermarket A
                      Fruit   Dairy   Meat   Bakery
Supermarket   Fruit     2       2      -8      6
B             Dairy    -2       0       6     -4
              Meat      2      -7       1     -3

The objective here is to find the best strategy that will ensure gains for B
regardless of the strategy A will adopt. Assuming the promotion efforts will be
continued for some time it might be worth considering the appropriate blend of
strategies across time that could maximize our efforts. For the long run let us assume
we will be promoting Fruit x1 percent of the time, Dairy x2 percent of the time, and
Meat x3 percent of the time. Since the x represents probabilities it will always be:
x1 + x2 + x3 = 1 (6.2.1)
Considering a minimum overall gain V (referred to as expected value)
regardless of the strategy A follows we can formulate the following equations:
When A chooses:
Fruit (see Fruit column in Table 6.17) we need 2x1 – 2x2 + 2x3 ≥ V (6.2.2)
Dairy (see Dairy column in Table 6.17) we need 2x1 – 7x3 ≥ V (6.2.3)
Meat (see Meat column in Table 6.17) we need -8x1 + 6x2 + x3 ≥ V (6.2.4)
Bakery (see Bakery column in Table 6.17) we need 6x1 – 4x2 - 3x3 ≥ V (6.2.5)
Our aim is to find the solution to the system of equations (6.2.1)–(6.2.5) that maximizes
V ≥ 0. The case where V = 0 is called a fair game. The solutions can be produced in
practice with a variety of methods that are available in almost all quantitative methods
software. For the given system and for a fair game (V = 0), the Solver add-in of Excel
produces x1 = 0.39, x2 = 0.5, and x3 = 0.11. This means that as long as Supermarket B
promotes Fruit 39% of the time, Dairy 50% of the time, and Meat 11% of the time it
will in effect neutralize Supermarkets A’s promotion efforts. By choosing a different
value for V we can work out different promotion percentages.
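The same system can also be solved outside Excel with any linear-programming routine. A minimal sketch using SciPy's linprog, with V treated as a fourth decision variable to be maximized, is shown below; the output should be close to the Solver figures quoted above, subject to rounding:

    from scipy.optimize import linprog

    # Variables: x1, x2, x3, V.  Maximizing V means minimizing -V.
    c = [0, 0, 0, -1]

    # Each column of Table 6.17 gives payoff . x >= V, rewritten as
    # -(payoff . x) + V <= 0 to fit linprog's A_ub @ x <= b_ub form.
    A_ub = [
        [-2,  2, -2, 1],   # Fruit column
        [-2,  0,  7, 1],   # Dairy column
        [ 8, -6, -1, 1],   # Meat column
        [-6,  4,  3, 1],   # Bakery column
    ]
    b_ub = [0, 0, 0, 0]

    A_eq = [[1, 1, 1, 0]]   # x1 + x2 + x3 = 1
    b_eq = [1]

    bounds = [(0, 1), (0, 1), (0, 1), (0, None)]
    result = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    print(result.x)   # approximately [0.39, 0.50, 0.11, 0.00]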
The reader needs to realize that while game theory will provide some ideal
solutions, these solutions refer to a particular instance in time and that future times
will require updating the payoffs based on new information that might become
available. Also, as one player adopts an optimum strategy so could another one (by
using again game theory), so the situations in real life are far from static. The
interaction amongst multiple players will complicate the playing field so combinations
of decision making (and data mining nowadays) techniques might be required to
achieve a true competitive advantage.

6.7 Simulations
Simulations are nothing more than artificial imitations of reality (in our case
with the use of computers). They are descriptive techniques that decision makers can
use to conduct experiments when a problem is too complex to be treated with
normative methods like the ones presented before and also lacks an analytic
representation that would allow numerical optimization approaches. Simulations
require proper definition of the problem, the development of a suitable model, the
validation of the model, and finally the design of the experiment. By running the
simulation many times (performing lots of trials) we will be able to see prevalent
patterns that lead to optimal behavior.
In practice, there are different types of simulations:

• Probabilistic simulations – can be applied for discrete or continuous
probability distributions;
• Time-dependent and time-independent;
• Object-oriented simulations – different entities are represented with properties
and behaviors;
• Visual simulations.
The process of developing a simulation starts with the definition of the
problem in terms of mathematics or some other technique (like object-oriented
programming, automata theory, etc.). We then proceed by constructing the model,
designing the simulation experiment, conducting the experiment multiple times, and
evaluating the results. After each evaluation, we go back to the design phase to calibrate
the model until it runs within the parameters we set at the beginning of the process as
representing the phenomenon we study. A critical part of simulation experiments is
the implementation of a realistic random event generator. Randomness is quite
difficult to achieve artificially even when events are equally likely. In practice, we use
software that uses algorithms to produce what we call pseudorandom
events/numbers. For most purposes, pseudorandom values are good enough because
they are virtually indistinguishable from truly random numbers.
We will demonstrate the application of simulations with an example as we did
in previous sections. Suppose that a product manufacturer in a promotion effort
inserts three different memorabilia A, B, and C in the product packages. 20% of the
packages contain A, 30% contain B, and the remaining 50% contain C. One would
wonder how many products they need to buy to get the complete set of memorabilia.
Here the percentages represent the model of the phenomenon, so we need to proceed
by designing an experiment that will simulate reality as much as possible.

Table 6.18 Simulation results

Run   Randomly generated outcomes              Products   Average
 1    3 8 0                                        3          3
 2    1 9 0 8 4                                    5          4
 3    7 0 1 9 1 2                                  6          5
 4    2 9 9 3 9 1                                  6          5
 5    1 7 6 5 7 3                                  6          5
 6    0 2 1 3 9                                    5          5
 7    3 5 9 8 3 7 8 5 3 1                         10          6
 8    1 4 9                                        3          6
 9    2 0 0 2 0 8                                  6          6
10    4 0 6                                        3          5
11    7 7 0 0 6 1 5 3                              8          6
12    2 9 9 8 5 7 4 1                              8          6
13    7 4 8 5 5 7 3 4 8 2 4 2 3 9 9 7 7 3 1       19          7
14    3 4 1 0 9                                    5          7
15    6 2 6 5 7 3 9 0                              8          7
16    2 0 8                                        3          7
17    5 4 1                                        3          6
18    7 5 1 3                                      4          6
19    6 4 7 2 1                                    5          6
20    6 4 6 0                                      4          6

As the experimental apparatus we will use a random number generator
(function RANDBETWEEN(0,9) in Excel) that will produce numbers between 0 and
9 (including 0 and 9) as representative of the percentages of each offer type in the
packages. Numbers 0 and 1 will represent A, numbers 2, 3, and 4 will represent B, and
numbers 5, 6, 7, 8, and 9 will represent C. For every run of the experiment we will be
retrieving random numbers/offers until all products are collected. Table 6.18 shows
the results of the simulation for 20 runs of the experiment along with the number of
products bought to complete the set and the average among the simulations (from the
first and up until the current one). From the results, we can deduce that we would
need on the average to buy 6 products to get all 3 memorabilia.
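The same experiment can be reproduced in code instead of a spreadsheet; a minimal sketch of the setup described above is:

    import random

    def products_needed():
        """Buy products until all three memorabilia (A, B, C) are collected."""
        collected, bought = set(), 0
        while collected != {"A", "B", "C"}:
            digit = random.randint(0, 9)   # plays the role of RANDBETWEEN(0,9)
            if digit <= 1:                 # 0-1 -> A (20%)
                collected.add("A")
            elif digit <= 4:               # 2-4 -> B (30%)
                collected.add("B")
            else:                          # 5-9 -> C (50%)
                collected.add("C")
            bought += 1
        return bought

    runs = [products_needed() for _ in range(10_000)]
    print(sum(runs) / len(runs))   # long-run average number of products

With many more runs than the 20 of Table 6.18 the average settles between 6 and 7, close to the estimate obtained above.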
In the previous example the numbers created by the random number generator
had equal probabilities of appearance. There are other cases, though, where this is not
true. For example, we might be interested in deciding the size of a restaurant we should
open in a certain area and we need to calculate the optimal number of tables that the
area can comfortably service. If we find a place where too many tables can be
accommodated the place might look empty, so the customers might believe it is not a
popular place and leave, while if the place is too small it will have few tables and
customers might decide to leave when they frequently see it filled. We need a way here
to simulate customers coming in, eating, and leaving so eventually we can decide the
optimum number of tables the neighborhood can support and subsequently the ideal
size for our restaurant. We make the assumption that customers could be coming alone
or in groups of say up to 6. Research would suggest that people rarely go alone to the
restaurant or in big groups, so a normal distribution of the group size would be more
appropriate with probably a mean at group size of 3 and standard deviation 1 (meaning
68.3% of the groups will be between 2 and 4 people). Similarly, we can presume an
average time of stay that is normally distributed with a mean of 50 minutes and
standard deviation 10 minutes (data from research) and time between groups’ arrivals
again normally distributed with a mean of 20 minutes and standard deviation of 10
minutes. To set up the experiment we will need three random number generators that
follow the normal distribution (one for the group size, one for the time of stay, and
one for the in-between groups time). Simulations will run for different numbers of
tables (say 20, 30, and 40) and the time to fill up the restaurant will be recorded. The
averages for each of the different numbers of tables would indicate if the store is filled
and by what time this happens. The optimal solution will be produced by the number
of tables that can sustain a steady flow of customers with all the tables filled. Setting
up and running the simulation is better done by developing a computer program that
goes beyond the scope of this book31. The results of three such simulation runs are
shown in Table 6.19. Even though the number of runs is small it can be seen that 11
tables would be ideal for a restaurant operating under the assumptions of our scenario.

31 The code for the simulation in Java can be found on the book’s website.

Table 6.19 Simulation results

Simulation   Number of   Customers   Customers
Run          Tables      Served      Left
1              6            84          51
1              7            90          33
1              8           107          18
1              9           113          26
1             10           118          15
1             11           141           4
1             12           122           0
1             13           134           0
1             14           129           0
1             15           132           0
2              6            76          53
2              7            78          43
2              8           109          26
2              9           111          31
2             10           120           7
2             11           128           2
2             12           139           0
2             13           127           0
2             14           146           0
2             15           141           0
3              6            89          47
3              7            90          41
3              8            98          28
3              9           113          16
3             10           119          10
3             11           121           7
3             12           131           0
3             13           136           0
3             14           137           0
3             15           125           0

Apart from the algorithmic approach to simulations that we discussed here
there is a popular technique called Markov chains that allows the simulation of a
phenomenon in the form of transitions between states. A real-life application of this
technology is the PageRank algorithm Google uses to determine the order of their
search results. To see how this works, consider, for example, weather prediction from
one day to the next. Let’s assume, for simplicity, only three states like Sunny, Rainy,
and Cloudy for the dominant characteristic of a day and assume conditional
probabilities between these states as represented in Table 6.20. Figure 6.16 displays the
same information in the more popular steady state form.
Table 6.20 Weather transition matrix

Present \ Future     Sunny           Rainy           Cloudy
Sunny                P(S|S) = 0.6    P(R|S) = 0.1    P(C|S) = 0.3
Rainy                P(S|R) = 0.7    P(R|R) = 0.2    P(C|R) = 0.1
Cloudy               P(S|C) = 0.4    P(R|C) = 0.4    P(C|C) = 0.2

Figure 6.16 Markov chain steady state representation

Based on this transition matrix and given our present state we can use a
random number generator as we did in the memorabilia example and produce weather
predictions for any sequence of days. Assuming, for example, that today is a rainy day,
we might get the random number 0.45. From the Rainy row in Table 6.20 we can see
that this falls within 0.7 so we can assume P(S|R) or that tomorrow will be a sunny
day. If the next random number we get is 0.8, then from the Sunny row we see that in
order to reach 0.8 we have to move to the Cloudy column (since 0.6 + 0.1 = 0.7 < 0.8), so we can
assume P(C|S), or that the day after tomorrow will be a cloudy day. By adding more
states of nature (Partly Cloudy, Snowy, etc.) and adjusting the transition matrix to
reflect reality as much as possible we might end up with more realistic predictions of
the weather in our location.
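A short sketch of this sampling procedure, using the transition probabilities of Table 6.20, could look as follows:

    import random

    states = ["Sunny", "Rainy", "Cloudy"]
    # Rows of Table 6.20: P(next state | current state)
    transition = {
        "Sunny":  [0.6, 0.1, 0.3],
        "Rainy":  [0.7, 0.2, 0.1],
        "Cloudy": [0.4, 0.4, 0.2],
    }

    def next_state(current):
        r = random.random()        # uniform random number in [0, 1)
        cumulative = 0.0
        for state, p in zip(states, transition[current]):
            cumulative += p
            if r < cumulative:     # e.g. r = 0.45 from Rainy falls under 0.7 -> Sunny
                return state
        return states[-1]

    # Predict a week of weather starting from a rainy day
    day, forecast = "Rainy", []
    for _ in range(7):
        day = next_state(day)
        forecast.append(day)
    print(forecast)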
Concluding this section on simulation, we need to keep in mind that
simulations are straightforward, they allow for a great amount of time compression,
they can handle unstructured and complex problems, and they allow manipulation of
parameters to evaluate alternatives. Despite their strengths, they have disadvantages
like there is no guarantee that an optimal solution will be achieved, and the process
can be slow and costly especially when a lot of computational power is required. In
addition, they are specific to the situation we are facing so when something changes
in the model they might need to be redesigned from scratch.

6.8 Social Network Analysis


Trillions of connections are built every day through social media like Facebook
and Twitter. Our clicks build relationships that, when aggregated, form social networks
that link each one of us with each other and various entities. Some network entities
can be tangible like documents, locations, products and services while others are
intangible like opinions, preferences, and emotions to name a few. Social network
analysis addresses the need to represent visually, analyze, and generate insight from
networks that could be of value to organizations and individuals. It is based on graph
theory and it takes on familiar forms like organizational charts, transportation
networks (like metro maps), the World Wide Web, and the popular “six degrees of
separation” between individuals. While traditional quantitative methods depart from
the attributes of units of analysis (characteristics within an entity), social network
analysis departs from their relationships (characteristics between entities). In terms of
measurement one can see the former as collecting information about
attributes/demographics (gender, age, likes, etc.) while the latter collects information
about a network location (where in the network an entity is) and the form and strength
of the connections it forms with other network entities.
An example of social networks as they are typically represented (node-link
diagram) is displayed in the graph of Figure 6.17. The image displays a visual
representation of a slice (499 connections) of the author’s LinkedIn network. The
coloring distinguishes connections by industry type. Blue nodes represent
professionals in information technology and data science, orange nodes are those in
education, red nodes are those in publishing, and green nodes are those that identified
themselves in the role of CEO. Interesting observations can be made about the way
the various industries are clustered in sections of the graph or the way certain ones are
spread throughout the graph. From this representation, it should also be evident that
some individuals hold key positions connecting many others or liaise between clusters
(the bottom left cluster in Figure 6.17 seems to be connected to the main graph
through an orange node) while others are more isolated and remote with single or very
few connections.

Figure 6.17 LinkedIn network connections

The complex webs that emerge from our social interactions create the need
for metrics that capture key locations in networks and inform about influences and
trends invisible to regular analytics techniques. Connections among social network
entities may be implicit or explicit. The first type concerns inferred connections due
to someone’s behavior while the second concerns connections that we intentionally
establish as when we follow someone or connect to a friend or coworker. The latter
needs approval/consent from both parties (also called undirected connection) while
the former is a unidirectional/directed privilege one gets from their participation in a
network. In many cases it is the undirected connections that have more value especially
to those with access to the network data as these reveal strong ties such as in the case
of two people following each other.
Another point of importance with respect to connections is that they carry
different weights. For example, if two people exchange multiple messages then we can
naturally deduce they have a stronger connection (meaning that potentially they can
influence each other) than two people that rarely exchange messages. To make sense
of the significance of the various connections in a social network we need metrics such
as the frequency of message exchanges.
Before we delve more into a discussion of some of the most popular metrics,
it is worth defining some key characteristics of networks and their elements. As a
commonly accepted definition, networks represent collections of entities that are
interlinked in some way. The individual entities are also labeled as nodes or vertices
and they can represent people, objects, concepts and any entity that is independently
meaningful to the network (such as transactions, “likes", etc.). Social networks,
specifically, include people that interact with other people, organizations, and artifacts.
Although specific attributes of each node are not, in general, necessary for network
analysis, their presence can only add value and can more accurately profile the
individuals and their relationships.
The connections between the entities are called edges or links or ties among
others and can be directed (single arrow connectors – also known as asymmetric edges)
such as when one person (identified as the origin) “influences” another (identified as
the destination) or undirected (straight line connectors with no arrows – also known
as symmetric edges). Another characteristic of edges, as we mentioned previously, is
their weight; when their weights are zero (also called unweighted or binary) they simply
indicate the existence of a relationship. Weights might also indicate an edge’s strength
or frequency (how many messages exchanged). Due to the space restrictions of this
section only unweighted metrics will be considered. Addressing weighted graphs might
be done by conversion. This is easily done by assuming a cutoff weight below which
no connection is assumed to exist. For example, if the weights represent the number
of times a web link was clicked then if we assume a cutoff point of say 5 then any links
that were clicked less than 5 times will be considered insignificant and will be
eliminated leaving the graph with the most “popular” links.
While there is a variety of ways to represent networks, the most popular form
by far is the network graph (Figure 6.18). Such graphs make it easy to identify key
players (like C and D), isolated entities (like I), terminal nodes (like H and G), and
reciprocal relationships (like AC, BC, and DE). Because this form is difficult to
process computationally, alternative representations include the matrix (also called
adjacency matrix) representation (Table 6.21) and the edge-list representation (Table
6.22). Directional influences between the origins (rows) and destinations (columns) are
represented as “1” in the matrix form while everything else is represented with “0”.
While this form is easy to process, much of the space is wasted on redundant
information (“0”s). The edge-list representation eliminates the space challenges of
the matrix form by including only existing connections, but it adds processing time as
it requires multiple traversals of the list to calculate network metrics.

Figure 6.18 Network graph

Table 6.21 Matrix representation of a network

Vertex 1/Origin \ Vertex 2/Destination
        A   B   C   D   E   F   G   H   I
   A    0   0   1   0   0   0   0   0   0
   B    0   0   1   0   0   0   0   0   0
   C    1   1   0   1   0   0   0   1   0
   D    0   0   1   0   1   0   0   0   0
   E    0   0   0   1   0   0   0   0   0
   F    0   0   0   1   0   0   1   0   0
   G    0   0   0   0   0   1   0   0   0
   H    0   0   1   0   0   0   0   0   0
   I    0   0   0   0   0   0   0   0   0

Table 6.22 Edge-list representation of a network


Vertex 1/Origin Vertex 2/Destination
A C
B C
C A
C B
C H
D C
D E
E D
F D
F G

Having a formal representation of a social network leads to the adoption of
metrics that can be used to extract meaningful information like the network’s evolution
over time and the relative positions of individuals and clusters of them within the
network. What we need, in most cases, is metrics that can represent the network as a
whole and help us answer questions like how dense it is, how we might decompose it
into its components/sections/groups, what similarities/demographics can be deduced
from closely connected vertices (homophily), what are the key individual vertices that
hold the network together, etc. Typical metrics range from simplistic ones, like the number
of connections an entity makes, to more advanced ones like density and centrality.
Typical metrics that describe the network as a whole include:
Absolute size is the total number of entities/vertices in a network. In the case
of Figure 6.18 this is 9 while in the case of Figure 6.17 it is 499. This is not a very
informative metric as it fails to capture the power or value of the network which should
also consider the influence or support one can expect from a network. It is one case
to have 9 acquaintances and another to have 9 strong supporters who can make things
happen.
Effective size is a more realistic measure of a network’s potential as it removes
redundancies such as when two members know/connect to the same members. In
such cases, one of the members can be eliminated without any loss of
potential/strength for the network. From another point of view this is a measure of
the clusters that exist in the network. The closer it gets to the absolute size the more
diverse the network is. In the case of Figure 6.17 the effective size was 443.4 which is
close to 499, suggesting the snapshot contains a relatively diverse set of individuals.
Density is a measure of the connectivity of vertices and is expressed as the
proportion of the connections present in a network over all possible connections that could
be made. For a network with n vertices there are n(n-1)/2 possible undirected connections,
so in Figure 6.18 we have a total of 7 connections over a possible total of 36 (9*8/2).
This results in a density of about 0.19 or 19%. In contrast, the author’s much larger
network in Figure 6.17 exhibits a density of only 0.0073 or 0.73%. The closer we get to
100% the more closed the network becomes, meaning everyone is connected to everyone
else, while the closer we get to zero the more isolated the members of the network become.
High values are possible for small networks (like the one in Figure 6.18) but they are
almost impossible for large networks (like the World Wide Web).
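In practice these metrics are computed with software. A brief sketch using the networkx library, building the undirected version of the Figure 6.18 network from the connections of Table 6.22, illustrates the size, connection count, and density calculations (the degree counts printed at the end anticipate the degree centrality metric discussed below):

    import networkx as nx

    G = nx.Graph()                      # undirected graph
    G.add_nodes_from("ABCDEFGHI")       # include the isolated vertex I
    G.add_edges_from([("A", "C"), ("B", "C"), ("C", "H"), ("C", "D"),
                      ("D", "E"), ("F", "D"), ("F", "G")])

    print(G.number_of_nodes())          # absolute size: 9
    print(G.number_of_edges())          # 7 connections
    print(round(nx.density(G), 2))      # about 0.19
    print(dict(G.degree()))             # direct connections per vertex, e.g. C: 4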
When it comes to individual vertices/entities, a group of metrics is used under
the general label of centrality. These metrics capture the importance (centeredness) of a
vertex within a network according to certain criteria, so that entities in the center or on
the periphery of a network can be identified.
Additionally, entities that act as mediators/connectors of groups can also be identified.

Typically, when such entities connect clusters of entities they are called gatekeepers
or brokers.
Specific centrality metrics include:
Betweenness centrality relates to how far apart individuals are, measured as the
smallest number of neighbor-to-neighbor jumps that separates two individuals. The length
of this shortest path is called the “geodesic distance” and it is considered when we
are interested in how often an individual is in the shortest path that connects two
others. This is an indication of the bridging capabilities of that individual and its
removal from the network can be similar to collapsing a bridge in real life. There could
be other ways to reach two points/individuals but accessing them through the bridge
might be the more efficient one. When a connection between two individuals is not
possible, we consider that as a structural hole or a missing gap. Such cases are potential
opportunities to create more value for the network. In the case of organizational
structures leaders can identify such gaps (disconnects between units) and invest
resources in “bridging” otherwise separate organizational units. Table 6.23 displays the
betweenness centrality scores of the Figure 6.18 network. The reader can confirm the
values (like from G to B is 4 hops) by tracing the path from one individual to another.
The higher the number of jumps the higher the potential to form direct connections
leveraging the existing connections already established in the network. For
comparison, the author’s betweenness in the 499-sample cross-section in Figure 6.17
is 113552 (calculated by LinkedIn).
Closeness centrality is the average distance of an individual from any other
individual in the network, much like the distance between two points in physical space.
A low value is an indicator that an individual is connected (small distances) with most
others in the network, while a high value would suggest someone is on the periphery of
the network. The lower the distance values the closer we are, and the faster
information/messages reach their destination, while the higher the values the longer our
messages need to travel to reach their destination. The closeness values for the
connected individuals of Figure 6.18 are displayed in Table 6.23. Key individuals like
C and D have as expected the lowest values while the more isolated G has the highest
one.
Eigenvector centrality is capturing the importance of someone’s
connections in terms of how connected they are. For example, an individual like F in
Figure 6.18 is connected to the very influential D so that in a sense D is a form of a
proxy of F’s influence. The metric is calculated for every individual by multiplying each
of its row entries from Table 6.21 with its corresponding closeness centrality in Table
6.23 and adding all at the end. For example, eigenvector centrality for the most critical
individual C is 1*2.4+1*2.4+0*1.7+1*1.6+0*2.4+0*2.1+0*3+1*2 = 8.4. Google is
using a variant of eigenvector centrality to rank web pages (PageRank algorithm) based
on how they link to each other.
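Since the calculation described above uses only the adjacency matrix of Table 6.21 and the closeness values of Table 6.23, it can be scripted in a few lines. The following sketch reproduces the eigenvector centrality row of Table 6.23:

    # Neighbors of each connected vertex (rows of Table 6.21)
    adjacency = {
        "A": {"C"}, "B": {"C"}, "C": {"A", "B", "D", "H"}, "D": {"C", "E"},
        "E": {"D"}, "F": {"D", "G"}, "G": {"F"}, "H": {"C"},
    }
    closeness = {"A": 2.4, "B": 2.4, "C": 1.7, "D": 1.6,
                 "E": 2.4, "F": 2.1, "G": 3.0, "H": 2.0}

    # Multiply each row entry by the neighbor's closeness centrality and sum
    for vertex, neighbors in adjacency.items():
        score = sum(closeness[n] for n in neighbors)
        print(vertex, round(score, 1))   # e.g. C 8.4, D 4.1, F 4.6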
Degree centrality is a vertex/individual’s total direct connections. For
example, in Figure 6.18, C’s degree centrality is 4 while D’s is 3. The lower the degree
centrality the less influential is an individual within a network. Individual I has zero
connections so it is the least influential individual in that network. Table 6.23 displays
the degree centrality for all other individuals in the Figure 6.18 network. Caution
should be exercised in interpreting the metric, as low counts can sometimes be
misleading. An individual might have only two connections, but if those connections
tie them to large parts of the network, that individual can still be influential despite
the low degree centrality, since their removal from the network might cause the network to
collapse.

Table 6.23 Betweenness Centrality

Vertex 1/Origin \ Vertex 2/Destination
                           A     B     C     D     E     F     G     H
   A                       0     2     1     2     3     3     4     2
   B                       2     0     1     2     3     3     4     2
   C                       1     1     0     1     2     2     3     2
   D                       2     2     1     0     1     1     2     2
   E                       3     3     2     1     0     2     3     3
   F                       3     3     2     1     2     0     1     3
   G                       4     4     3     2     3     1     0     0
   H                       2     2     2     2     3     3     4     0
Closeness Centrality      2.4   2.4   1.7   1.6   2.4   2.1   3     2
Eigenvector Centrality    1.7   1.7   8.4   4.1   1.6   4.6   2.1   1.7
Degree Centrality         1     1     4     3     1     2     1     1

7 Publishing Research

A natural step at the end of any academic research is the dissemination of the
results in the greater academic and professional communities to inform and invite
critique. Only when repeated efforts to challenge the research findings fail can we say
with some certainty that the findings contribute to our understanding of the
phenomenon under investigation until new research proves otherwise. Dissemination
in academia is traditionally done through conference presentations and academic
publications (journal, books, etc.). The latter nowadays has been supplemented with
online repositories (like arXiv.org, ssrn.com, and researchgate.net) where even
preliminary findings can be presented in an effort to invite feedback that will further
guide the efforts of researchers.
The most popular options available for publishing research include peer-
reviewed academic journals, conference proceedings, and academic research books
(not to be confused with textbooks). For other forms of dissemination like newsletters,
commentaries, etc., readers should consult the specific publisher’s guidelines. The
journal and conference proceedings options generally follow the same style and
formatting rules as oftentimes conference proceedings are published as special issues
of journals or as academic research books. When it comes to book publishing, some
publishers might request a specific style, but in general the structure and style of the
print material is left to the editor (for conferences) or to the lead author (for academic
research books). For this reason, the details of book publishing are left for the
researcher to explore through the websites of publishers. Some book publishers
dedicated to academic research include Routledge, Springer, etc. Prestigious
institutions like MIT and Oxford University tend to have their own publishing houses
so interested authors can find details about what and how they accept for publication
on their respective websites. One special case of publication, the research dissertation,
will be discussed here as it is of great interest and probably the starting point for many
researchers. Dissertations almost always are written in a research book style and often
end up published as books.

7.1 Journal Publication


The focus of this section is on the form and requirements that are generally
expected when researchers are interested in publishing in peer-reviewed academic
journals. These journals are produced by major publishers like Taylor & Francis,
Pearson, Springer, Elsevier, etc., usually on behalf of associations or groups of
scientists or by the associations themselves like IEEE, APA, etc. There is usually an
Editor in Chief with Associate Editors for specialized needs and a board of reviewers
that usually covers the breadth and depth of the field the journal is covering. Their
primary function is to screen the material that will be published for appropriateness
for the journal domain and ensure the journal structure is followed and the
submissions pass the scientific rigor of the review process. The latter is usually through
a double-blind review process whereby the editors assign two reviewers to
anonymously evaluate the submitted material (oftentimes stripped of any author
details).
Based on the outcome of the review process the authors of submitted material
are informed whether their work has been accepted for publication by the journal,
whether revisions are required before publication, or if it has been rejected. For
prospective authors, even when their work is rejected there is value as they receive
feedback from the reviewers of the paper on the areas that were not appropriately
covered and supported. This way, researchers can learn from each other and improve
in the process the quality of their research.
Prospective authors can find the details of the publication process on the
journal’s website, along with additional information about the journal’s submission
acceptance rates and possibly an indication of the journal’s popularity as a source of
references for researchers. The latter is usually indicated through the calculation of an
impact factor metric that organizations like Thomson Reuters produce for
academic journals. Impact factors can range from 0 to high numbers (40 and above) for a
few select journals, but the great majority of journals fall below 10, with the most
probable values around 3 or even lower. This is far from a fair process, as quality
work can be found even in journals with impact factors below 1, but, as in any social
function, tradition, prestige, and even politics in the form of author affiliations can carry
a publication a long way. Having a prestigious journal name behind someone’s
research is useful, but what will ultimately make research “famous” is the quality of the
work presented and its dissemination by the researcher in more interactive and
engaging modes like conferences, presentations and, nowadays, bulletin and discussion
boards/groups in professional associations and social network sites.
While the material that will follow here covers the general requirements in
terms of style and structure that appear in the majority of academic journals,
researchers should always check the specific requirements set by their target journal
(usually found on the journal’s website). Another point of reference for the discussion
that follows is that we are mainly focusing here on original research (excluding
newsletters, commentaries, etc.) and on social sciences research, but the deviations for
other fields of research should be minimal and usually concern the citation style and
formatting. In general, publication of research is an account of the research process as
outlined in Figure 2.12.

The reason a uniform style is found has primarily to do with convenience when
reviewing research papers as we can easily locate the sections that are of interest to us
and retrieve key points and findings. Style helps express the key elements of
quantitative research (like statistics, tables, graphs, etc.) in a consistent way that allows
retrieval and processing without distractions. This in addition provides clarity in
communication and allows researchers to focus on the substance of their research.
Research paper styles have been recommended by major scientific bodies like APA,
developed by the American Psychological Association, but in general what is known
as the IMRaD (Introduction, Methods, Results, and Discussion) structure is the
standard many journals follow with minor deviations like separating the review of the
literature from the introduction. If we add the title page at the beginning of this
structure and the references at the end, we have a complete journal publication
structure. Before we proceed with a discussion of the aforementioned structure it is
worth pointing out that occasionally journals will impose a word count limit on the
length of a manuscript mainly due to space restriction in the journal and in an effort
to restrain authors from getting “carried away” with their presentation. Typical size
limits are set around 10,000 words and less. Presumably, if more is required the authors
should consider alternative routes like publishing their research as a book. Many
publishers specialize in such publications and even encourage authors to publish their
research even as a collection of similar research, like with conference proceedings.

7.1.1 Title Page


Every piece of published work comes with a title that serves to identify the
work and convey its context as much as possible. Titles summarize the work that will
follow and should be a concise statement about the topic, and the variables and
constructs involved and their relationships. At the same time, titles should be able to
stand alone as representatives of the whole research. In this respect, titles in the form
of questions are not popular and when used are meant to suggest all the above.
Regarding the length of the title, there are no specific guidelines, but titles typically
average between 10 and 15 words. Following the research title, we usually have the
names, affiliations, and contact details of the authors of the publication with markings
for the corresponding author who is available to answer queries and follow up with
the published material.
The next piece of information that follows is the abstract of the research. This
is a one-paragraph summary of the contents of the research. It includes the
phenomenon under investigation, the essential features of the study like its
methodology and research design, the profile of the population of the study, and the
sampling process that was used. This should be followed by the basic findings,
including metrics like statistical significance, confidence intervals, effect sizes, etc., and
the main conclusions of the research. In essence, the abstract is a compressed IMRaD.

A typical breakdown of the extent of the various sections in an abstract (for those
journals that do not force the breakdown) could be 25% Introduction, 25% Methods,
35% Results, and 15% Discussion. In terms of length, typical abstract requirements
range between 150 and 250 words. Like the title, the abstract should be able to stand
on its own if separated from the rest of the paper. Abstracts tend to be available for
free as promotional material and as such are freely distributed. Table 7.1 shows the
breakdown of a hypothetical research publication on the subject of workplace
employee spirituality.
Table 7.1 Abstract structure for journal publication
Introduction Workplace spirituality from an employee’s perspective is of great
importance in making the workplace productive and satisfying
while contributing to integrating work-life balance values into
organizational behavior. Literature suggests that a theoretical
framework that considers spirituality as a vital constituent of
employees directly influencing their performance at work is needed
if organizations are to treat their employees with respect while
reaping the benefits of spirituality.
Methods By integrating existing research on workplace spirituality, a
correlational research design was adopted to evaluate the impact of
spirituality on employee performance in the workplace. A self-
administered questionnaire was developed with 8 items using a five-
point Likert scale. The questionnaire was screened by a panel of
experts and pilot-tested to 20 qualified individuals. The calibrated
form of the questionnaire was further completed by 214
participants who adhered to the eligibility criteria of the study.
Results The results indicate that employee workplace spirituality is best
captured by 5 factors that showed significant levels of correlation
with their work performance. These include: (a) belief in “higher
power” (r(46) = 0.78, p< 0.01) that provides meaning and purpose
whether that power is in the form of a deity, the individual,
principles, or the universe in some form, (b) the belief that work is
part of a higher power plan and so an acceptable and valuable part
of life (r(38) = 0.62, p< 0.01), (c) the need to support one’s lifestyle
in accordance with the high power directives (r(42) = 0.71, p<
0.01), the need for self-actualization (r(34) = 0.52, p< 0.01), and
skills to endure the hardships imposed by the workplace (r(40) =
0.68, p< 0.01).
Discussion The results of the research suggest a strong connection between
spirituality as expressed by 5 factors and workplace employee
performance. Further research might be required to identify ways
that organization can use to integrate spiritual practices in the
workforce.

Following the abstract narrative, a keyword section is required. This usually
contains on the average a list of 4 to 6 keywords that identify the research area and can
be used for indexing purposes. Usually, these are the keywords we would expect
someone to use to find our research in a search engine.

7.1.2 IMRaD Sections


The core of a research publication starts with the introduction. As in most
types of writing, we begin by explaining what problem is studied and why it is/was
important and necessary to research it. For applied research this might present the
need for understanding or solving a social problem, while for theoretical research
(oftentimes referred to as basic research) it might concern the development of theory
or extension of existing theory to new cases. Our discussion should be as neutral as
possible, presenting arguments from all sides of the debate. The introduction needs to
build up the case for the problem the research will address.
If the journal is one that strictly adheres to the IMRaD standard this is the
place where we need to present a review of the literature that relates to our research
subject (see section 2.1). If not, then at this stage basic references to the specific subject
will need to be briefly mentioned. This is where the current state of the research field
is presented with an emphasis on “gaps” that needed to be addressed with additional
research. The material should be presented as proof of the timeliness and necessity for
addressing the gap through our research. A point of interest here is that a literature
gap is not a sufficient reason by itself as there are an infinite number of gaps that might
not be of great importance to humanity at this time of our evolution. For example,
there is a gap in the academic literature on alien societies, but we can be assured that
unless we have regular encounters with aliens the subject will be of very little
importance to our societies (excluding probably to those individuals who claim to have
been abducted by aliens). The gap is support for the significance of the study, but it
should not dominate the importance of the study that is primarily its contribution to
theory, practice, or both.
This discussion should be followed by a clear statement of the purpose of our
research (obviously to address the problem raised in the previous material) with an
explicit list of the hypotheses that we formed and their appropriateness to the research
design. In some cases, the hypotheses might be preceded by the research questions
that were used to create them, but the majority of quantitative journal publications will
skip the research questions as they can be directly implied by the hypotheses. Both
research questions and hypotheses will need to show their relevance to the theory and
the constructs used as the framework for the research and should be a natural and
logical outcome of the previous discussion.

Having discussed what our research is all about we move on to discussing what
the research did. This is where the various theoretical constructs and variables will be
operationalized, and a detailed description of the methods used will be discussed. The
details should be sufficient for other researchers to replicate the study and confirm or
disprove the findings of our study. Readers should also have sufficient information to
evaluate the appropriateness of the methods we used for the hypotheses we set and
the type of data we collected. References to past research that used similar methods
for similar studies should be provided as a support for the choices we made.
Everything we presented in section 2.4 of Chapter 2 (like research design,
sampling, instruments, etc.) is material that will be mentioned here so the interested
reader is referred to those sections for additional information. Specifics about
experimental manipulation and interventions, if used, also need to be discussed within
their specific context. It is suggested that the methods/methodology section is written
in past tense and passive voice to reduce researcher biases when discussing the choices
they made (depersonalizing the presentation).
After the methods, we proceed to the results section where we summarize the
collected data, the analysis we performed on them, and the results of our research.
This needs to be done in sufficient detail to provide a complete picture of the results
to someone with a professional knowledge of quantitative methods (Chapters 4, 5, and
6). No citations for the methods used are necessary in this section unless a justification
for a special procedure is required to interpret the results. The language used in
reporting statistical results is more or less standard, so we will provide here a list of the
way such results could be presented.
Mean and standard deviation are always presented as a pair like (M = 25.3, SD
= 1.8). Alternatively, in a narrative form, we might say that the mean for our sample
for VariableX was 25.3 variable units (SD = 1.8). Substitute ‘units’ with the units used
for the variable. Test results should be presented with their associated p values. For a
t-test this could be in the form “there was a significant effect for VariableX (M = 25.3,
SD = 1.8), t(10) = 1.26, p = 0.05”. Similarly, for a chi-square test we might say “the
percentage of our sample that expressed VariableX did not differ by VariableY, x2(4,
N=89) = 0.93, p > 0.05”. Correlations could be of the form of “VariableX and
VariableY were strongly correlated, r(46) = 0.78, p< 0.01”, and ANOVA could be
“one-way analysis of variance showed that the effect of VariableX was significant,
F(3,27) = 5.94, p = .007”. Finally, regression results can be in the form of “a significant
regression equation was found between VariableX and VariableY, F(1,210) = 37.29, p
< 0.01, with R2 = 0.12”. Most other statistical tests discussed in this book can be
presented in similar ways.
After the presentation of the results we come to the last main section of the
paper, the discussion of the research findings. This section is often titled “Conclusions
and Recommendations”. At this stage, we should be in the position to evaluate and
interpret the results in light of the hypotheses we adopted. This means we will either
accept or reject our hypotheses and present a rational explanation for our decision and
its implications for the subject of our research. Further, the discussion needs to
compare the findings of our research with past research findings that could support
or oppose our results and provide explanations for potential similarities and
differences. Biases, assumptions, and limitations that we acknowledged should be
addressed as part of our validity and reliability analysis. The discussion section should
end with a well-supported commentary on the importance of the findings and the
direction future research efforts should take to confirm and expand our research. This
is important as it will be used by future researchers as a basis upon which they will
justify the need for their research in the same area. Section 2.5 of Chapter 2 can provide
additional details of what could be included in this part of the paper.

7.1.3 References
The last part of a journal publication is, if not the most “torturous”, for
sure the most boring one (based on anecdotal evidence and personal experience).
Citing research work and referencing at the end is a requirement for every research
publication as it provides the sources used to make statements about claims and facts
that related to our research in some way. By the time researchers reach this stage they
will have undoubtedly seen hundreds of citation and reference styles through the
review of the literature they have conducted so some familiarity with referencing styles
will have been picked up along the way.
Popular styles nowadays include the American Psychological Association
(APA), Modern Language Association (MLA), Institute of Electrical and Electronics
Engineers (IEEE), Chicago Manual of Style, Harvard, etc. These have been developed
by different associations and journals to ensure compliance and in addressing the
needs of specific disciplines. APA for example is predominantly used in social sciences,
while IEEE is very popular in engineering and sciences. Overall, there are great
similarities between them as they all need to sufficiently describe the source material,
but the differences could be enough to lead to paper rejection if not properly
addressed. Table 7.2 demonstrates the APA, MLA, and IEEE styles for a research
journal and book as produced by Zotero (mentioned below). For additional types of
referencing and in-text citation styles the reader should refer to websites that explain
the various styles.
Luckily for researchers, there is software that has been developed to manage
references. Zotero is a popular free software that comes along with a citation manager
and plugins for browsers like Firefox and Chrome and also Microsoft Word. It allows
for the creation of a citation library that multiple researchers can access and update
online. It can also produce a bibliography in any of the popular formats available.

Similar functionality is provided by other products like RefWorks, Mendeley, etc.
Interested researchers should spend some time familiarizing themselves with such
products as they are one of the best investments of time one can afford when it comes
to academic publishing.
Table 7.2 Reference styles
Journal APA Harkiolakis, N., Prinia, D., & Mourad, L. (2012). Research
initiatives of the European Union in the areas of sustainability,
entrepreneurship, and poverty alleviation. Thunderbird
International Business Review, 54(1), 73–78.
MLA Harkiolakis, Nicholas, Despina Prinia, and Lara Mourad.
“Research Initiatives of the European Union in the Areas of
Sustainability, Entrepreneurship, and Poverty Alleviation.”
Thunderbird International Business Review 54.1 (2012): 73–78. Print.
IEEE N. Harkiolakis, D. Prinia, and L. Mourad, “Research initiatives
of the European Union in the areas of sustainability,
entrepreneurship, and poverty alleviation,” Thunderbird
International Business Review, vol. 54, no. 1, pp. 73–78, 2012.
Book APA Harkiolakis, N. (2017). Leadership Explained: Leading Teams
in the 21st Century. Routledge.
MLA Harkiolakis, Nicholas. Leadership Explained: Leading Teams in
the 21st Century. Routledge, 2017.
IEEE N. Harkiolakis, Leadership Explained: Leading Teams in the
21st Century. Routledge, 2017.

7.2 Dissertations
A special category of published research concerns dissertations. These can be
at the master’s level (M.Sc., M.Ed., MFA, etc.) or the doctorate level (Ph.D., DBA,
Ed.D., D.Eng., etc.). The differences are mainly in the length of the manuscript (with
the doctorate being more extended), which generally reflects the amount of time
dedicated to the degree (1–2 years for a master’s and 3+ additional years for a doctorate)
and the contribution of the work to theory (mainly the Ph.D. domain) and/or
practice (mainly the DBA and master’s domain).
Regarding structure, the great majority of dissertations follow a standard five-
chapter structure which is the IMRaD format with the interjection of a literature
chapter after the introduction. This is deemed necessary due to the extensive coverage
of the research topic of a dissertation in terms of what has been done in the past.
Because the previous section has discussed what is to be included in the various
IMRaD sections, we will present the various chapters of a quantitative dissertation
with a brief discussion. The reader should keep in mind that additional entries are
required before these chapters and include:

• Title page: include the title of the research, type of degree, school and
department, author name, and publication year.
• Abstract page: Same as section 7.1.1.
• Acknowledgments page: everyone who has contributed to the
research in any form or means should be acknowledged here.
• Table of Contents page.
• List of Tables page: should mirror table titles within the body of the
paper according to the school’s referencing style.
• List of Figures page: should mirror figure titles according to the
school’s referencing style.
The five chapters (discussed next) will have to be followed by the references
section and any appendices mentioned in the main body of the text. When the
dissertation is complete and after it has been properly defended, the researcher can
proceed with the publication process. Apart from publishing it as a printed book, there
are dedicated databases like ProQuest that accept dissertations and make them
available to anyone interested.

7.2.1 Chapter 1 Introduction


Provide here a brief overview of what will be discussed in the chapter.

• Background: Introduce the field and provide the background that
leads to the study of the problem under investigation. This is usually
2–3 pages and is a brief review of the literature in the research topic
area that serves as background to the problem/phenomenon of the
study. It also serves as an introduction to key terms that define the
domain of the research topic.
• Problem Statement: Be specific about the problem the research will
address and support it with multiple recent peer-reviewed citations
(within the last 5 years, preferably the last 3), that is, citations
confirming that the problem presented is still unaddressed and still
important. Identify the theoretical gap that exists in the
literature, if there is one (required for Ph.D.s).
• Purpose of the Study: Indicate the purpose of the study (it must
address the problem mentioned previously) and the methodology.
Provide a brief profile of the study’s population.
• Theoretical Framework: Identify the theoretical foundations upon
which the study is built. Apart from a justification of the theories and
constructs that will support the research, the discussion also needs to
address their appropriateness for the study. In
the case of Ph.D.s this section should close with the extensions to
theory (or the development of a new theory) that the research is trying
to achieve. This section would normally include the definitions of key
terms and constructs used, along with the research questions and
hypotheses of the study, because they are part of the model that
will be developed (for Ph.D.s) or used. In many dissertations, though,
they form separate sections, so for compliance reasons they will also
be discussed separately here.
• Definition of Key Terms: The terms that will be used to form the
research questions and hypotheses, along with any others needed to
accompany them, should be clearly and concisely defined and
supported with citations.
• Research Questions: Ensure that the research questions are aligned
with the purpose of the research and are based on the theoretical
framework selected for the research. An easy way to ensure alignment
is by taking the sentence that expresses the purpose of the research
and converting it to a question. This can be a central research question
(CRQ) upon which the research questions (RQs) will expand.
• Hypotheses: In quantitative research, each research
question that is not exploratory in nature will have to be followed by a
pair of null and alternative hypotheses (a brief example follows this list).
• Nature of the Study: Describe the research methodology and design
and discuss their appropriateness for our research (purpose, research
questions, and hypotheses). Briefly also discuss the data collection and
analysis methods (a more detailed discussion should be reserved for
Chapter 3). Make sure citation support is provided. 2–3 pages should
be enough for this section.
• Significance of the Study: Here is where we need to discuss the
importance of our contribution both to theory (mainly Ph.Ds) and
practice. The benefits of the answers to the research questions should
be emphasized as well as the positive impact of completing the study.
• Summary: Here we need to restate the key points of Chapter 1.
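As a brief, hypothetical example of the pairing described in the Hypotheses item above: suppose the purpose of a study is to compare job satisfaction between remote and on-site employees. The research question “Is there a difference in mean job satisfaction between remote and on-site employees?” would be accompanied by the null hypothesis H0: there is no difference in mean job satisfaction between the two groups (μ1 = μ2) and the alternative hypothesis Ha: there is a difference (μ1 ≠ μ2). The inferential tests reported in Chapter 4 then determine whether H0 can be rejected at the chosen significance level.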

7.3 Chapter 2 Literature Review


The size of this chapter is usually 30 or more pages for sufficiently
detailed coverage of the research area. Although sources other than peer-reviewed
journals could be included, the great majority of the citations should be from recent
peer-reviewed scholarly publications. All discussion should be in accordance with, and
in relation to, the purpose of the study.
At the beginning of this chapter provide a brief overview of how the chapter
is organized. It should then be followed by:

• Search Strategy: As with all the parts of the research it should be clear
how the material was acquired so that future researchers can replicate
and validate the study. A paragraph describing the databases used for
searches, keywords used, and the screening process should be
sufficient here.
• Topics/Subtopics: The past and current state of affairs in the area of
our research should be presented, followed by a critical analysis and
synthesis of the key elements as they relate to our research. A historical
account of the theoretical approaches used to study similar phenomena,
with their advantages, disadvantages, and implications for practice,
helps frame the subject. This discussion should
include both theoretical and practical perspectives. It should be
comprehensive and should flow logically. A mistake that should be
avoided is treating the literature review as an annotated bibliography
in which the material of one reference simply follows another.
• Summary: Key points of the chapter should be summarized.
Emphasize similarities between research approaches and the
dissertation topic along with differences, omissions, and the challenges
they pose.

7.4 Chapter 3 Methods


Introduce the chapter and provide a brief overview of what will be discussed.

• Research Design: Discuss the research design of the study and
provide justification of its selection over other alternatives considering
the purpose and goals of our research. Provide a profile of the
population that will be studied and describe the steps that will be
followed in detail along with any assumptions made with respect to the
design.
• Sampling: Describe the sampling method that was followed and
justify the sample size calculations (section 2.4.2; a short computational
sketch of such a calculation follows this list). If multiple data
sources are involved, they should be accounted for here. The selection
and screening process of participants could also be part of this section
unless it is deferred to the data collection section.
• Materials and Instruments: Provide a detailed account of all the
instruments that will be used to extract data from each source in the
study. Identify the variables that will be used and describe their
operational characteristics (section 2.4.3). Present any assumptions that
relate to the materials and instruments (section 2.4.4).
• Data Collection and Processing: Describe how the data are going to
be collected, any preprocessing that will take place, and the statistical
techniques (Chapters 4 and 6) that will be used for their analysis. Any
assumptions about the adopted statistical models (e.g. normality,
homogeneity of variance, etc.) should be accounted for here.
• Assumptions and Limitations: While assumptions and limitations
should be discussed at the point they are made, it is always good
practice to also bring them together in one place, as this shows awareness of them
and allows others to quickly locate them. Assumptions and limitations
provide the ground upon which further research on the subject will be
conducted. Focus here on issues that might affect the validity and
reliability of the research (response rate, honesty, etc.) and potential
biases of the sources. Include how biases have been considered and
countered and how threats to validity have been addressed.
• Ethical Considerations: This section does not always appear at this
point in dissertations, as many researchers do not have to go through
IRB boards and government regulations. It is nevertheless recommended
practice to include this section to showcase how the research complied with
standards such as obtaining informed consent for subject
participation. Additionally, the section should explain how issues of
confidentiality and privacy have been dealt with.
• Summary: Key points of Chapter 3 need to be summarized here.
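As mentioned in the Sampling item above, the sample size calculation of section 2.4.2 has to be justified. The sketch below is a minimal illustration in Python of one common approach, Cochran’s formula for estimating a proportion with a finite population correction; the population size, confidence level, and margin of error are assumed values chosen only for illustration.

    import math

    # Minimal sketch: sample size for estimating a proportion (Cochran's
    # formula) with a finite population correction. All input values are
    # illustrative assumptions.
    def sample_size(population, margin_of_error=0.05, z=1.96, p=0.5):
        # p = 0.5 is the most conservative choice (largest required sample)
        n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
        # Adjust for a finite population
        return math.ceil(n0 / (1 + (n0 - 1) / population))

    print(sample_size(population=2000))  # about 323 respondents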

7.5 Chapter 4 Results


Introduce the chapter and provide a brief overview of what will be discussed.

• Data Analysis: Include the results of the analysis here. The
demographics of the sample along with descriptive statistics should be
presented first so the readers can get an idea of the sample profile and
the potential for bias or misrepresentation of the population that might
exist; a brief computational sketch of such summaries follows this list. If, for
example, 90% of the participants were male, then a
strong gender bias exists and should be accounted for. If the sample
size is small enough to afford presenting the raw data, a table format should
be adopted; otherwise, summative results in the form of percentages
and pie or bar charts should be selected. This presentation should be
followed by inferential statistics and the acceptance or rejection of the
study’s hypotheses along with an account of potential violations of
assumptions that can affect the interpretations of the findings. The
presentation of results should point to the significant results that will form
the core upon which the following section will be built. The reporting of
results in this section should be as plain and factual as possible, to
eliminate any influence that might bias the reader.
• Evaluation of Findings: This is where we will discuss what the
findings mean considering the theoretical framework of the research
and in accordance with the purpose and research questions of the
study. An account of the value of the results in relation to past research
and the assumptions made as well as potential explanations for
contradicting or unexpected results should be presented. The
discussion should not be extended beyond what the results afford.
• Summary: Key points of Chapter 4 need to be summarized here.
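As a brief computational sketch of the demographic and descriptive summaries referred to in the Data Analysis item, the lines below use Python with the pandas library; the file name and column names are hypothetical and stand in for whatever variables a particular study collects.

    import pandas as pd

    # Minimal sketch: sample profile and descriptive statistics before any
    # inferential analysis. File and column names are hypothetical.
    df = pd.read_csv("survey_responses.csv")

    # Demographic profile: counts and percentages per category
    print(df["gender"].value_counts())
    print((df["gender"].value_counts(normalize=True) * 100).round(1))

    # Descriptive statistics (mean, std, quartiles) for the scale variables
    print(df[["age", "satisfaction_score"]].describe())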

7.6 Chapter 5 Discussion


This is the final chapter in a dissertation and can be found under different
headings, popular ones being Conclusions and Recommendations. As usual, it
should start with a brief overview of what will be discussed, followed by:

• Significant Findings: Present here the significant results of the study
and their consequences for both theory and practice. Emphasize
deviations that might influence the interpretations made and might
signify changes in how things are explained or practiced. Any ethical
considerations regarding the impact of any assumptions made should
also be addressed as they might affect the interpretation of the results.
Typically, one can discuss each research question individually and draw
the appropriate conclusions.
• Suggestions for Future Research: The dissertation should close with
recommendations for future researchers based on the findings of the
research. Areas that haven’t been addressed by the study or significant
findings that might suggest areas of interest should be presented. This
section should serve as a guide for someone who is interested in
expanding the specific field of study.
• Summary: Summarize major findings and benefits of the research.

Appendix A Questionnaire Structure

Questionnaires are a primary tool for data collection in quantitative research,
especially in the social sciences. Their design should be strictly guided by the
research questions and hypotheses of the research and should ensure the information
needed for data analysis is captured as accurately and reliably as possible. Since their
development process has already been discussed in section 2.4.4, in this appendix we
will specifically discuss the way a questionnaire can be structured and styled.
Having set a clear goal for the questionnaire, we can establish the general
principles that will guide its development. Typical concerns include the way it will be
broken down into sections, if any, and the type and format of questions that will be
included, along with the types of answers that will be allowed. From the quantitative
research point of view that interests us here, closed-ended questions will be the
focus of the discussion.
A list of important issues/principles for consideration includes:

• Filtering or screening questions: In most cases, like when
questionnaires are distributed online, the researchers often
do not know who is actually taking the questionnaire and whether they
are qualified to participate or not. For this reason, we might have
questions at the beginning to screen out unqualified participants before they
answer questions unintentionally. For example, if only females
should take part, it would be useful to have a question requesting the
gender of the participant before allowing them to proceed to
subsequent questions. In addition to ensuring that participants are
representative of the population of a research study, this group of
questions (demographics, as we will see later) allows the description of
participants so other researchers can replicate the study, and also allows
evaluation of the potential of the study findings to be generalized to
other populations.
• Introductory questions: These are questions that relate to the subject
matter but require little effort to answer, imposing minimal
distress. They elicit basic factual information and are relatively
straightforward to answer. The purpose of such questions is to engage
the participants without threatening or offending them and in this way
ensure their continuing participation.
• Grouping questions: Because jumping from one subject to another
might confuse people, it is good practice to group related questions
together. This will allow the respondents to focus and concentrate on
the issues the group of questions addresses before moving to another
topic. Caution should be exercised in this respect, as grouping could provoke
patterned, automatic, and unidirectional responses, setting up a trend
that distorts the answers respondents would have provided if the questions were
spread further apart in the questionnaire. Open-ended or even
demographic questions might be interjected occasionally to eliminate
such possibilities.
• Reliability checks: For questions that might feel threatening, as
when addressing controversial or sensitive issues, there will be a degree
of unwillingness among participants to answer truthfully despite the
measures we take to ensure anonymity. Such questions might have to
be duplicated at different parts of the questionnaire and expressed
differently to measure the consistency of the respondent and ensure
the reliability of the questions (a short computational sketch of such a
consistency check follows this list).
• Sensitive issues biases: Oftentimes (as we also mentioned before)
our research might require sensitive information. These types of
questions tend to bias the respondents toward implicitly popular
responses unless alternative responses that extend the range beyond
what is normally expected are included. This issue is topic specific, so
the researchers will have to consult with experts in the field on how to
eliminate such biases. Conducting a pilot study will help identify
potential issues and help develop appropriate questions.
• Wording considerations: The profile of the participants might reveal
cultural characteristics (colloquial expressions, jargon, etc.) that should
be considered when choosing the wording of questions. Appropriate,
simple, and straightforward wording should persist throughout the questionnaire.
When unfamiliar terms are included, the questions should include (or be
preceded by) a definition or explanation of the special term. In such
cases, we must make an effort to eliminate potential biases that we
might infuse in the way we provide definitions and explanations.
Similarly, vague words or phrases should be avoided as they lead to
ambiguity. The same applies to double-barreled questions (those
that introduce multiple issues), as they can confuse participants and
affect the responses they provide. Additionally, we should avoid
using text decorations like bold and italics that place emphasis on
words or phrases, as they might inappropriately influence respondents,
unless this is done for clarification purposes. The same is true for
emotional words and phrases, as they carry the power to elicit emotions
and influence responses. If the purpose of the research requires the
elicitation of emotions, it is best done through a narrative that
precedes the question.
• Questionnaire length: The extent of a questionnaire is defined by the
information required to answer the hypotheses and research questions
of a study and nothing more. Any information that is peripheral to the
study should be considered unnecessary and excluded, despite the
interest it may have. Additionally, it should be feasible to complete
the questionnaire in the time allotted to potential participants.
Lengthy questionnaires might force participants to rush through the
answers or even deter them from answering all questions.
• Question format: Despite the advantage of closed-ended questions in
allowing comparison between participants and easy manipulation for
statistical analysis, they might miss valuable information
that cannot be covered by the available choices. Additionally, the
respondents might feel that what expresses them best lies between or
outside the available options. While open-ended questions have no
particular formatting requirements, as they simply invite narrative
responses, closed-ended questions can come in a variety of formats.
For online questionnaires (and oftentimes for print), checkboxes
might be suitable for capturing multiple selections, while radio
buttons might be preferable if we want to force the participants to
select one option from those available. Studying the formatting of
questionnaires used in similar instances can help decide accordingly.
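As noted in the Reliability checks item above, related or deliberately duplicated items can be used to gauge response consistency. A common summary of such consistency is Cronbach’s alpha; the sketch below computes it in Python with numpy on a small, invented matrix of Likert scores (one row per respondent, one column per item), which stands in for real collected responses.

    import numpy as np

    # Minimal sketch: Cronbach's alpha over a set of related questionnaire
    # items. 'scores' is an invented respondents-by-items matrix of Likert
    # values; real data would come from the collected responses.
    def cronbach_alpha(items):
        items = np.asarray(items, dtype=float)
        k = items.shape[1]                          # number of items
        item_vars = items.var(axis=0, ddof=1)       # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)   # variance of summed scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    scores = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
    print(round(cronbach_alpha(scores), 2))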
To ensure the aforementioned elements are covered sufficiently it is best to
structure the questionnaire in a way that captures the necessary information and
ensures the sample population is screened and qualified to answer the questionnaire.
A typical questionnaire structure includes the following sections:

• Introduction: Provide a brief introduction of the purpose of the
research and the structure of the questionnaire.
• Demographics: Questions that address demographic information are
always specific to the research topic and research questions. These
types of questions are usually grouped before or after the subject-
specific questions. Placed at the beginning, as done here, they can
also serve as screening and warmup questions, since they are not perceived
as threatening or as raising any special issues; placed at the
end, they allow the participants to focus early on the questions that
are closer to the subject without being distracted by
demographics.
• Subject matter specific questions: Groups of questions should be
identifiable unless there is a specific reason for mixing them (like the
elimination of groupthink). A correspondence with the research
questions and hypotheses of the study should be inherent in the
question groupings. In case participants need to expand on the subject in their
own words, open-ended questions can be included at
the end of each group.
• Closing: In quantitative questionnaires, including closing questions
(oftentimes called venting questions) is not as necessary as it is, for
example, in interviews, but we could ask exploratory
questions in the hope of recovering information that identifies potential
omissions in our theoretical framework or suggests future research
directions. At the end, make sure a thank-you message is included.
Whatever form and structure we decide on for a questionnaire, a pilot test will
be the ultimate judge of its effectiveness in capturing the intended information.
Refinements should be expected as normal practice in academic research, and variants
could be suggested by future research. Calibrating a questionnaire to act as an
instrument is as much an art as it is a practice.

Appendix B Research Example Snapshot

(Figure: snapshot of a complete research example.)

Appendix C Cheat Sheets

The plethora of available statistical methods often creates the need
to simplify the process of selecting the appropriate method for the research
we are conducting. This need leads to flow charts, decision trees, and other
forms of representing the available options in an effort to ease the selection process.
Based on what is discussed in this book, the following table summarizes the available
options according to the number and types of variables involved.

Table C1. Cheat sheet of statistical tests

One variable
  Scale (interval, ratio)
    Normally distributed: measures of location and variability (mean, variance, σ);
      QQ and PP plots / KS test; Student t-test - independent t-test (hypothetical mean)
    Distribution free / non-parametric: median, quartiles, box plot;
      sign test / Wilcoxon's test
  Nominal/categorical
    Chi-square goodness of fit; G-test goodness of fit

Two variables
  Scale (interval, ratio)
    Normally distributed: paired t-test (one sample); independent t-test (two samples);
      correlation; regression
    Distribution free / non-parametric: Wilcoxon's rank / rank sum or Mann-Whitney;
      Spearman's rank; Wilcoxon's signed rank (paired)
  Nominal/categorical
    Cross-tabulations / contingency tables; chi-square test for independence;
      G-test for independence (large entries); McNemar's test (paired dichotomous);
      Fisher's exact test (small entries)

Many variables
  Scale (interval, ratio)
    Normally distributed: one-way ANOVA; multiple regression; partial correlation
    Distribution free / non-parametric: Kruskal-Wallis H / one-way ANOVA on ranks;
      Friedman's rank test
  Nominal/categorical
    Chi-square test for independence; G-test goodness of fit (large entries);
      partial and marginal tables; multiple regression (dummy variables);
      Cochran's Q test (dichotomous Xs); logistic regression (dichotomous)
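For readers who work in a scripting environment rather than the SPSS menus of Table C2, the sketch below shows how a few of the tests listed above can be invoked with Python's scipy library; the data values are invented purely to make the calls runnable.

    import numpy as np
    from scipy import stats

    # Invented data: one scale variable measured in two groups, and observed
    # counts for one categorical variable with four categories.
    group_a = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.4])
    group_b = np.array([10.8, 11.1, 10.5, 11.6, 10.9, 11.3])
    observed = np.array([18, 22, 31, 29])

    # One scale variable: one-sample t-test against a hypothetical mean of 12
    print(stats.ttest_1samp(group_a, popmean=12.0))

    # Two scale samples: independent t-test and its non-parametric counterpart
    print(stats.ttest_ind(group_a, group_b))
    print(stats.mannwhitneyu(group_a, group_b))

    # One categorical variable: chi-square goodness of fit (equal expected counts)
    print(stats.chisquare(observed))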
An alternative way of selecting the appropriate statistical technique is depicted
in the figure that follows; it is based on the dimension perspective instead of the
variables perspective of the previous table.

Figure C1. Cheat sheet of statistical methods

Regarding the use of software, Table C2 replicates Table C1 with the
SPSS menu options inserted for selected methods.

Table C2. Cheat sheet of statistical tests in SPSS

One variable
  Scale (interval, ratio)
    Measures of location, variability, etc.
      SPSS: Analyze => Descriptive Statistics => Frequencies
      SPSS: Graphs => Chart Builder => Box-Plot
    Normally distributed
      QQ and PP plots / KS test
        SPSS: Analyze => Descriptive Statistics => Explore => Plots => Normality plots with tests
        SPSS: Analyze => Descriptive Statistics => Q-Q Plot
        SPSS: Analyze => Nonparametric Tests => Legacy Dialogs => 1-Sample K-S (to check normality)
        SPSS: Transform => Compute Variable => LG10 (log transformation)
      Student t-test (one sample, hypothetical mean)
        SPSS: Analyze => Compare Means => One-Sample T Test
    Distribution free / non-parametric
      Binomial test
        SPSS: Analyze => Nonparametric Tests => Binomial
      Sign test / Wilcoxon's test
        Two independent samples (Mann-Whitney U test)
          SPSS: Analyze => Nonparametric Tests => Legacy Dialogs => 2 Independent Samples
        Two paired samples (Wilcoxon sign test)
          SPSS: Analyze => Nonparametric Tests => Legacy Dialogs => 2 Related Samples
  Nominal/categorical
    Bar graphs and pie charts
    Chi-square goodness of fit
      SPSS: Analyze => Nonparametric Tests => One Sample => Fields => Settings => Chi-square
    G-test goodness of fit

Two variables
  Scale (interval, ratio)
    Normally distributed
      Paired t-test (one sample, paired)
        SPSS: Analyze => Compare Means => Paired-Samples T Test
      Independent t-test (two samples)
        SPSS: Analyze => Compare Means => Independent-Samples T Test
      Correlation and regression
        SPSS: Graphs => Chart Builder => Scatter
        SPSS: Analyze => Correlate => Bivariate
        SPSS: Analyze => Regression => Linear => Plots (select Z values)
        R-squared => SPSS: Analyze => Regression => Linear => Save => Unstandardized
          (x axis), Standardized (y axis) => Chart Builder
    Distribution free / non-parametric
      Wilcoxon's rank / rank sum (paired)
        SPSS: Analyze => Nonparametric Tests => Legacy Dialogs => 2 Related Samples
      Mann-Whitney
      Spearman's rank
        SPSS: Analyze => Correlate => Bivariate => Spearman (uncheck Pearson)
      Wilcoxon's signed rank
  Nominal/categorical
    One sample
      Cross-tabulations / contingency tables
        SPSS: Analyze => Descriptive Statistics => Crosstabs
      Chi-square test for independence
    Two or many samples
      Chi-square test for homogeneity
      G-test for independence
    McNemar's test (dichotomous)
    Fisher's exact test (N <= 5)
      SPSS: Analyze => Descriptive Statistics => Crosstabs => Exact => Asymptotic

Many variables
  Scale (interval, ratio)
    Normally distributed
      One-way ANOVA
        SPSS: Analyze => Compare Means => One-Way ANOVA, or Analyze => Compare Means
          => Means => Options => check ANOVA table
      Multiple regression
        SPSS: Analyze => Regression => Linear (one dependent, many independent)
      Partial correlation
        SPSS: Analyze => Correlate => Partial
    Distribution free / non-parametric
      Kruskal-Wallis H / one-way ANOVA on ranks
      Friedman's rank test
        SPSS: Analyze => Nonparametric Tests => Related Samples (follow up with
          Chart Builder => Boxplots)
  Nominal/categorical
    Multiple regression (dummy variables)
    Cochran's Q test
    Logistic regression
      SPSS: Analyze => Regression => Binary Logistic (one dependent, many independent)

INDEX

5-number summary ......................... 72 between-subjects design .....................54


Absolute size ................................... 234 between-subjects variable ..............54
absolute spread ................................. 72 bimodal distribution ........................ 106
accuracy .............................................. 178 Binomial Distribution ........................97
additive model ................................ 207 binomial logistic regression ....... 145
adjacent matrix .................................. 232 bivariate correlation ...................... 126
adjusted R-squared........................ 141 Bonferroni ........................................ 172
aim......................................................... 28 bootstrapping ................................... 112
alpha level......................................... 168 box plot ................................................72
alternative hypothesis ....................... 166 cardinality .............................................80
American Psychological Association categorical variable ..........52
.......................................................... 239 causal designs .............................. 40, 41
American Psychological Association Causal-comparative .........................40
.......................................................... 243 Causality ...............................................41
analysis of covariance ................... 139 causation .................................. 123, 124
analysis of variance ........................ 135 cause and effect ...................................21
ANCOVA ......................................... 139 central limit theorem .................... 109
ANOVA............................................. 135 central tendency ......................... 72, 118
ANOVA repeated measures........ 172 centrality .......................................... 234
APA ........................................... 239, 243 centroid ............................................ 191
Assumptions ............................. 63, 248 CFM ................................................... 199
authenticity........................................... 18 chain sampling..................................49
autocorrelation .................................. 208 Chicago Manual of Style ................. 243
autocorrelation first-order ............... 208 Chi-Square Distribution .....................95
average .................................................. 70 chi-square test for homogeneity 152
average linkage ............................... 191 chi-square test for independence
bar chart ............................................... 68 ......................................................... 150
base period ....................................... 201 Closeness centrality ...................... 235
Bayes factor ....................................... 213 CLT .................................................... 109
Bayes formula ................................. 211 Cluster Analysis ................................ 188
Bayes theorem ................................ 211 cluster sampling ...............................48
Bayes’ theorem ............................... 209 Cohen’s d ......................................... 174
Bayesian analysis ........................... 218 Cohen’s Kappa ............................... 178
Bayesian Analysis .............................. 209 common variance............................. 183
Bayesian inference ......................... 209 communality ................................... 183
Bayesian statistics .............................. 209 Comparative designs.........................40
bell-curve.............................................. 86 complete linkage.............................. 191
between mean square ....................... 101 component matrix ......................... 183
Betweenness centrality ................. 235 components .................................... 147

composite indexes ........................ 201 Degree centrality ........................... 236


Conclusions .........................................66 degrees of freedom.................. 95, 101
concurrent validity ...........................58 delimitations ..................................... 63
conditional probabilities ............. 210 dendograms .................................... 191
confidence ..........................................90 Density ............................................. 234
confidence interval........................ 112 dependent variable .......................... 52
confirmatory factor analysis ....... 185 descriptive analysis ........................... 200
Confirmatory Factor Analysis ........ 199 descriptive research .......................... 39
conformation space ............................79 Design Perspectives ........................... 38
confounding variable ......................54 Deterministic model ........................ 214
construct validity .......................... 18, 57 deviation .............................................. 75
constructionism ...................................15 dichotomous variable ..................... 52
Constructionism ..................................17 directed connection ......................... 231
Consumer price indexes .............. 201 dispersion .................................... 75, 118
content validity .................................57 Dissertations ..................................... 244
contingency table analysis .......... 149 distribution free ............................. 102
contingency tables ........................ 149 distribution-free ............................. 117
control variable .......................... 53, 54 Distributions ....................................... 82
Convenience sampling ...................48 divergent validity ............................. 57
convergent parallel mixed methods dominance............................................... 27
............................................................27 dummy variables ........................... 159
convergent validity ...........................57 ecological validity ...................... 58, 64
correlation ....................................... 123 edge list representation .................... 232
correlation coefficient .................. 125 edges ................................................. 232
correlational causal-comparative.......41 EFA .................................................... 181
Correlational design ..........................41 effect size ......................................... 174
correspondence analysis ............. 188 Effective size................................... 234
covariance........................................ 139 eigenvalues ...................................... 185
Cramer’s V....................................... 148 Eigenvector centrality .................. 235
credibility interval............................. 213 eigenvectors .................................... 185
criterion validity ................................57 endogenous variables ....................... 196
criterion-based validity ...................58 Epistemology ...................................... 13
critical realism ......................................11 error mean square ............................ 101
Cronbach a ...................................... 178 error random ..................................... 176
cross-sectional design ......................43 error systematic ................................ 176
cross-tabulations ................... 148, 149 error Type I ....................................... 169
cyclical effect .................................. 207 error Type II ..................................... 169
Data Analysis .............................. 61, 248 Evaluation of Findings ...................... 62
Data Collection .......................... 61, 248 evidentialists ........................................ 14
Data Processing.......................... 61, 248 exogenous variables ......................... 196
Decision Analysis ............................. 214 Expected Frequency .................... 147
Decision nodes .............................. 220 expected utility ............................... 219
decision tree .......................................81 expected value .......................... 70, 224
Decision Trees.................................. 220 expected values ...................... 150, 215

Experimental designs ....................... 42 Holt forecasting model ................ 204


experimental fatigue ........................... 65 homophily ......................................... 234
explanatory research ......................... 40 homoscedasticity .............................. 130
explanatory sequential mixed Hypotheses ........................... 21, 34, 246
methods ........................................... 27 Hypothesis Testing .......................... 166
Exploratory Factor Analysis............ 180 idealism .................................................11
Exploratory research......................... 38 IEEE.................................................. 243
exploratory sequential mixed impact factor ..................................... 238
methods ........................................... 27 IMRaD............................................... 239
exponential......................................... 131 independent t-test ......................... 120
exponential smoothing ................. 201 independent variable .......................52
ex-post-facto........................................ 41 index ................................................. 201
external validity ............................ 18, 64 inferential analysis ............................ 200
extraneous variable .......................... 52 inflexion point ................................ 186
extremes ............................................... 72 Institute of Electrical and Electronics
face validity ........................................ 58 Engineers ....................................... 243
factor loadings ................................ 183 institutional review board ..................60
factor matrix .................................... 183 Instruments ..........................................55
Factors .................................................. 49 inter-class correlation coefficient
fair game ........................................... 224 ......................................................... 178
forecast error.................................... 208 internal realism ....................................11
forecasting ........................................ 202 internal validity .................................64
free parameters ............................... 194 Internal validity....................................18
frequency.............................................. 68 interquartile range............................72
frequency of occurrence .................... 68 interrogatives diamond.......................14
frequentist .......................................... 213 interrogatives” dia ...............................14
frequentist statistics .......................... 212 intersection...........................................81
Friedman’s rank test ..................... 144 interval scale variable ......................51
Fr-test ................................................. 144 intervening variable .........................54
furthest neighbor ............................ 191 interviews .............................................59
future studies ....................................... 45 IQR .......................................................72
G2 test ................................................ 148 IRB ........................................................60
Game Theory .................................... 222 Journal Publication .......................... 237
generalized least squares ............. 195 k-means method ............................ 191
goodness of fit ................................. 146 Kolmogorov-Smirnov test...... 105, 117
GPower .............................................. 174 Kruskal-Wallis H ........................... 142
graph theory....................................... 230 kurtosis ................................................78
G-test of goodness of fit ............... 148 latent variable ....................................52
G-test of independence ................ 152 latent variables ....................... 185, 193
heterogeneity ................................ 45, 46 lattice diagram ..................................83
hierarchical clustering .................. 188 learning effect ......................................65
hierarchical divisive method ....... 191 least-squares ................................... 126
histogram ............................................. 69 leptokurtic ............................................78
history effects ...................................... 65 level of measurement ......................51

levels of measurement ....................55 MLA ................................................... 243


likelihood ...........................79, 210, 211 model just-identified ........................ 194
likelihood ratio test ....................... 148 model specification ....................... 194
Likert scale ...........................................52 model under-identified .................... 194
limitations...........................................63 moderators......................................... 53
Limitations ...................................... 248 Modern Language Association ....... 243
linearity .............................................. 130 multicollinearity ................................ 130
Literature Review ....................... 30, 246 multiple regression ................. 41, 140
log likelihood .................................. 145 multi-stage sampling...................... 49
log10................................................... 106 multivariate analysis of covariance
logarithmic ........................................ 131 ......................................................... 139
logistic regression ......................... 145 Nash equilibrium .......................... 223
log-likelihood ratio test ............... 148 nature of the study .......................... 36
longitudinal research design ............44 Nature of the Study ....................... 246
lurking variables ............................... 129 nearest neighbor ............................ 191
MAD .....................................................75 network graph ................................ 232
MANCOVA .................................... 139 node-link diagram ............................ 230
Mann-Whitney test ....................... 132 nodes ................................................. 232
MANOVA........................................ 139 nominal variable .............................. 52
margin of error ............................... 112 Nominal Variables ........................... 146
marginal ........................................... 211 non-experimental design ........................ 42
marginal frequencies .................... 150 non-experimental research design 43
marginal probability ..................... 211 non-parametric ............................... 117
marginal table ................................ 156 non-parametric distributions ..... 102
Markov chains ................................ 228 non-probability sampling.............. 48
matrix representation....................... 232 normal cumulative distribution ... 88
maximum .............................................72 normal curve ....................................... 78
maximum likelihood .................... 195 Normal Distribution .......................... 86
maximum regret ............................ 215 Normal distribution tables ................ 88
McNemar test ................................... 152 null hypothesis .................................. 166
mean.....................................................69 objectives ............................................. 28
mean absolute deviation ................75 observations ........................................ 59
mean absolute error ............................75 Observed Frequency .................... 147
mean absolute percentage error ..... 208 observed variables ......................... 194
measures of location ........................ 118 one-sided critical value .................. 90
measures of relative position .......... 118 one-tail critical value ...................... 90
measures of shape ............................ 118 One-way ANOVA ......................... 138
median .................................................71 one-way ANOVA on ranks ......... 142
mediators ............................................53 One-way MANOVA ..................... 139
mesokurtic............................................78 one-way multiple analysis of
meta-analysis .....................................45 variance ......................................... 139
Methodology........................................18 operationalization of variables .... 54
minimum ..............................................72 ordinal variables ............................... 51
Mixed methods....................................25 ordinary least squares .................. 195

outcome variable .............................. 52 prisoner’s dilemma ....................... 223


outliers ................................................. 71 probabilistic model .......................... 214
over-identified ................................... 194 probabilistic sampling ....................47
paired t-test ...................................... 122 Probabilities .........................................79
paradigm............................................... 16 probability distribution...................82
parameters................................. 50, 108 problem statement ..............................28
partial correlation ........................... 141 Problem Statement ....................... 245
partial tables .................................... 156 proportions ...................................... 113
Path Analysis ..................................... 195 pseudorandom ............................... 226
path coefficients ............................. 197 Publishing Research......................... 237
payoff ................................................. 214 purpose perspective .........................38
Payoff Table Analysis ....................... 215 purpose statement ...............................28
PCA .................................................... 181 purposive sampling .........................48
Pearson correlation ........................ 126 p-value ............................................... 168
Pearson’s chi-square ............ 146, 150 Q-Q plot............................................ 104
Pearson’s moment coefficient of qualitative methodology .....................23
skewness ........................................... 77 Quantitative .........................................21
Pearson’s r ........................................ 125 Quantitative Research Process ..........28
percentage ............................................ 68 quartile lower .....................................72
phenomenon ....................................... 19 quartile third ......................................72
Phi quartile upper ....................................72
Phi value ........................................ 148 quasi-experimental...................................42
phylogenetic analysis .................... 188 quasi-experimental research designs
phylogenetic trees .......................... 191 ............................................................42
pie chart................................................ 68 questionnaire........................................56
pilot study ........................................... 58 random error .................................... 176
platykurtic ............................................ 78 random event generator .............. 226
polls..................................................... 115 random sampling .............................47
population ............................................ 29 random sampling with replacement
Populations .......................................... 68 ......................................................... 112
positive predictive value................... 176 random stratified sampling ...........47
positivism ............................................. 15 randomized ..........................................42
Positivism ............................................. 16 randomness ................................... 46, 47
posterior ............................................ 211 range......................................................72
posterior probabilities ................... 218 rank condition ................................ 194
power ................................................... 90 ratio scale variable ...........................51
P-P plot .............................................. 105 Realism .................................................10
pragmatism ................................... 15, 26 Recommendations ..............................66
precision ............................................. 178 References ......................................... 243
predictability ........................................ 41 regression ........................................ 126
predictive validity ............................. 59 regression equation ...................... 126
predictor variable ............................. 52 regression to the mean ................ 127
Principle Component Analysis ....... 180 Relativism .............................................11
prior ........................................... 210, 211 reliability ...............................................55

Reliability ........................................... 176 simple predictive studies ................... 41


representativeness ............................47 simple random sampling .............. 47
resampling ......................................... 112 Simulations ........................................ 225
research applied ................................ 241 single linkage.................................. 191
research basic .................................... 241 skewed negatively ............................... 77
Research Design ................... 36, 38, 247 skewed positively................................ 77
Research Design Perspectives ...........38 skewness ............................................ 77
research questions ...............................30 slope .................................................. 126
Research Questions ................... 34, 246 smoothing ........................................ 201
research theoretical .......................... 241 snowball sampling .......................... 49
residual effect ................................. 207 social network analysis ................ 193
residuals ........................................... 127 Social Network Analysis ................. 230
Risk averse ........................................ 220 Spearman’s rank ............................ 132
risk taking .......................................... 220 specificity ......................................... 175
Risk-neutral ....................................... 220 sphericity ......................................... 136
R-matrix ........................................... 181 sqrt...................................................... 106
R-squared ........................................ 128 standard deviation ........................... 77
sample one ........................................ 117 standard error ......................... 110, 112
Sample Size ....................................... 172 standardized normal distribution 87
sample space ........................................79 states of nature ....................... 215, 220
Samples .............................................. 108 statistic ............................................... 50
Samples Many ................................... 161 statistics............................................ 108
samples two ...................................... 161 Statistics ............................................. 115
Sampling ...................................... 45, 247 steady state ........................................ 229
sampling distribution ........... 109, 113 stereotyping ......................................... 79
sampling errors .............................. 115 stratified sampling .......................... 47
sampling frame .................................47 Structural Equation Modeling ........ 193
sampling variability ...................... 115 student t distribution ...................... 99
scale variable .....................................51 Student t-test................................... 120
scatter plots ..................................... 123 subjective probability ................... 209
scree plot .......................................... 185 system ............................................... 214
screened publications .........................32 systematic error ................................ 176
search plan ...........................................31 systematic random sampling ....... 47
seasonal effect ................................ 207 systems perspective ...................... 214
secular trend ................................... 207 t Distribution ...................................... 99
selection bias ........................................46 the order condition ........................ 194
SEM ................................................... 193 Theoretical Framework ............. 32, 245
sensitivity ......................................... 175 theory of truth .......................................9
Shapiro-Wilk ..................................... 105 third quartile ..................................... 72
Shapiro-Wilk test.............................. 117 time series ........................................ 200
sign test ............................................ 121 Time Series Analysis ................ 200, 230
significance ..................................... 167 time series data .............................. 200
Significance..................................... 246 time-series ........................................... 44
significance level ........................... 168 transferability ...................................... 25

transformations ................................. 106 utility theory .................................... 218


transition matrix ................................ 229 validity............................................ 55, 64
treatment mean square ..................... 101 variability ........................................... 118
triangulation ...................................... 60 Variables ...............................................49
true score theory ............................. 176 variance ...............................................76
t-rule ................................................... 194 variation ..............................................76
t-test ................................................... 120 Venn diagrams ..................................79
two-sided critical values ................. 90 vertices.............................................. 232
two-tail critical values ..................... 90 volunteer bias ......................................65
Two-way ANOVA.......................... 138 weighted averages ......................... 184
Two-way MANOVA ..................... 139 weighted composite indexes...... 201
Type I error ....................................... 169 Wilcoxon’s Rank............................ 132
Type II error ...................................... 169 Wilcoxon’s Rank Sum .................. 132
undirected connection...................... 231 Wilcoxon’s Singed Rank ............. 134
union ..................................................... 81 Wilcoxon’s test ............................... 121
unit of analysis .............................. 29, 37 within mean square .......................... 101
unit of observation ...................... 38, 68 within-subjects design ........................54
universality of the proof .................... 22 y-intercept ....................................... 126
universe ................................................ 79 y-intercept ....................................... 126
universe set .......................................... 79
