Quantitative Research Methods
July 2020
Nicholas Harkiolakis
Preface
Attempting to describe quantitative research methods in one volume of material is quite a challenging endeavor, and it should be noted here that this book by no means attempts to exhaustively present everything under the sun on the subject. Interested readers will need to expand on what is presented here by searching the extant literature for what exists and what best suits their research needs. Having said that, the book does cover the great majority of quantitative methods found in social sciences research.
The motivation for developing this book came from years of delivering quantitative methods courses for graduate programs in Europe and the USA. Through exposure to such programs it became apparent that while most students had had some exposure, mainly to statistics, by the time they entered graduate studies most of their understanding of and familiarity with quantitative techniques had been forgotten or was only vaguely remembered. In many cases, what remained was the impression of how much they “hated” the subject. Overcoming this negative predisposition required a re-introduction of basic concepts and a fast-track approach to higher and more advanced methods of analysis.
These realities guided the development of this book, and so the assumption is made that the reader doesn’t know anything about quantitative research or about research in general. All concepts presented in the book are defined and introduced. Also, alternative and overlapping expressions and keywords used in quantitative research are presented so the reader can identify them in their readings of academic research. Whether this “zero-to-hero” approach succeeded is left for the reader to judge.
Additional effort was made to include examples that are easily replicated in spreadsheets like Excel so that users can manually repeat them at their convenience. Regarding the use of software, commands for executing the various methods in SPSS are given in footnotes to avoid diverting from the core narrative of the text. The
interested reader can easily retrieve a plethora of material from the Internet with step-
by-step instructions for most of the analysis techniques discussed here and for the
most popular statistical software packages. The book’s website at
www.harkiolakis.com/books/quan provides additional material for executing the
methods discussed here with SPSS, as well as all book images in higher resolution and
links to other sources online. For instructors who are interested in using the book as
a textbook, data sets and exercises on the methods included in the book are available
upon request.
Table of Contents
1 Philosophical Foundations.............................................................. 8
1.1 Ontology ....................................................................................................... 9
1.1.1 Realism ............................................................................................ 10
1.1.2 Relativism........................................................................................ 11
1.2 Epistemology .............................................................................................. 13
1.2.1 Positivism ........................................................................................ 16
1.2.2 Constructionism .............................................................................. 17
1.3 Methodology ............................................................................................... 18
1.3.1 Quantitative ..................................................................................... 21
1.3.2 Qualitative ....................................................................................... 23
1.3.3 Mixed methods ................................................................................ 25
2 The Quantitative Research Process ............................................. 28
2.1 Literature Review ....................................................................................... 30
2.2 Theoretical Framework ............................................................................... 32
2.3 Research Questions and Hypotheses .......................................................... 34
2.4 Research Design ......................................................................................... 36
2.4.1 Research Design Perspectives ......................................................... 38
2.4.2 Sampling.......................................................................................... 45
2.4.3 Variables and Factors ...................................................................... 49
2.4.4 Instruments ...................................................................................... 55
2.4.5 Data Collection, Processing, and Analysis...................................... 61
2.4.6 Evaluation of Findings .................................................................... 62
2.5 Conclusions and Recommendations ........................................................... 66
3 Populations ..................................................................................... 68
3.1 Profiling ...................................................................................................... 68
3.2 Probabilities ................................................................................................ 79
3.3 Distributions................................................................................................ 82
1 Philosophical Foundations
The origin of the word research can be traced to the French rechercher (to seek out) as a composite, suggesting a repeated activity (re-) of searching (chercher). Going even further, we arrive at its Latin root in the form of circare (to wander around) and eventually at circle. This wandering around in some sort of cyclical fashion is quite intuitive, as we will see later on, since it accurately reflects the process we follow in modern research. The only difference is that the circles go deeper and deeper into what the “real” world reveals to us. Noticing the quotes around the word “real” might have revealed the direction we will follow here in challenging what “real” means and accepting how it affects the process we follow when we investigate a research topic.
Concerns about the nature of reality are vital in deciding about the nature of
truth and the ways to search for it. This brings us to the realm of philosophy or more
specifically to its branch of metaphysics where we deal with questions of existence.
Providing an explanation of the world and understanding its function (to the extent
possible) allow us to accept what is real and make sense of it. Our perception of the
metaphysical worldview is the cause of our behavior and the motivation for moving
on in life. Questions about the origin of the universe belong to the branch of
metaphysics that is called cosmology, while questions about the nature of reality and
the state of being are the concern of the branch of metaphysics called ontology. The
latter is essential to the way we approach and conduct research as it provides a
foundation for describing what exists and the truth about the objects of research.
In addition to the issues about reality that research needs to address based on
our ontological stance, there are epistemological assumptions that guide our approach
to learning. These address issues about what knowledge is — what we can know and
how we learn. These questions, of course, are based on the assumption that knowledge
about the world can be acquired in an “objective”/real way, connecting in this way
with ontology. The interplay between epistemology and ontology, as we will soon see, is reflected in the different research traditions that have been adopted to guide past and present research efforts, as they determine the theory and methods that will be utilized in conducting research.
The ontological and epistemological stances we adopt are considered
paradigms and reflect the researcher’s understanding of the nature of existence from
first principles that are beyond “logical” debate. As paradigms, they are accepted as
self-sufficient logical constructs (dogmas in a way) that are beyond the scrutiny of
proof or doubt. Selecting one is more of an intuitive leap of faith than an “objective”
process of logical and/or empirical conclusions. Both ontology and epistemology are
tightly related to what can be called “theory of truth”. This is an expression of the way
arguments connect the various concepts we adopt in research and the conditions that
make these arguments true or false depending on the ontological and epistemological
posture we adopt. In that respect, arguments can represent concepts, intentions,
conditions, correlations, and causations that we accept as true or false with a certain
degree of confidence.
Typical theories of truth include the instrumentalist, coherence, and
correspondence theories. The last reflects the classical representations that Plato and
Aristotle adopted where something is true when it “reflects” reality. This posture can
be heavily challenged in the world of social sciences since perceptions of individuals
vary and can suggest different views of reality, making it impossible to have a universal
agreement on social “facts”. Such perceptions can influence the way beliefs fit together
(cohere) in producing an explanation of the phenomenon we investigate. This is also
the basic posture of coherence theory, which postulates that truth is interpretation-
oriented as it is constructed in social settings.
Finally, the instrumentalist view of truth emphasizes the interrelation between
truth and action and connects the positive outcomes of an action to the truth behind
the intention that led to that action. This is more of a results-oriented perception of
truth and the basis of the pragmatist epistemology as we will see later on. In terms,
now, of supporting theories, research can be descriptive like when we make factual
claims of what leaders and organizations do, instrumental like when we study the
impact and influence of behavior, and normative when we try to provide evidence that
supports direction. For each one of these “categories” of research, ontology and
epistemology are there to provide philosophical grounds and guidance.
1.1 Ontology
Of particular interest to research is ontology, the branch of metaphysics that
deals with the nature of being and existence or in simplified terms what reality is.
Although it isn’t clear what really exists and how it relates to other things, one can always resort to degrees of belief that ensure commitment to answers and, by extension,
acceptance of a particular theory of the world. In this way, an ontological stance will
provide an acceptable dogma of how the world is built and more specifically, with
respect to social sciences, which we are interested in here, the nature of the social
phenomenon under investigation.
The various ontological stances that will be presented here are not an
exhaustive account of what has been developed in the field, but they are meant to
serve as a brief introduction to the core trends, or mainstream in a way, that have been
developed and persist today. Adopting an ontological stance towards the things that
exist in society, in terms of their nature and the representations we form of them,
will help guide the choice of methodology that best suits our research aims. Believing
in an objective or subjective reality will also suggest, to an extent, the data collection
method (like survey, interviews, etc.) that will best reveal and/or prove, to a degree of
certainty, the relationships and dependencies of the entities we study.
Ontological stances are divided primarily according to their belief in the
existence of external entities. They can range from the extreme position that there are
no external entities, which is the domain and main position of nihilism, to the existence
of universal truths about entities, as realism posits. In between, we have, among others,
relativism with its position of subjective realities that depend on agreement between
observers, nominalism with the position that there is no truth and all facts are human
creations, and Platonism where abstract objects exist but in a non-physical realm. For
the purposes of this book, realism and relativism will be considered as the most
popular ontological stances in research and especially due to the support they provide
for quantitative and qualitative research.
1.1.1 Realism
Realism’s premise is that the world exists regardless of who observes it. In our
case this means that the universe existed before us and will continue to exist after we
are gone (not just as individuals but also as a species). The philosophical position of realism has been quite controversial as it can be accepted or rejected in part according to one’s focus. For example, an individual might be a realist regarding the macroscopic nature of the natural world while being a non-realist regarding human
concepts like ethics and aesthetics. This dualism of treating the “outside” world as
“objective” and the “inside” world of human thought as “subjective” is mainly based
on agreement about the existence of the world’s artifacts. The external perception of
the objective reality is mainly based on our everyday experience that the objects in the
natural world exist by the mere fact that their defining characteristics persist when
observed over time and that their properties seem to transcend human language,
culture, and beliefs. For example, the moon is recognizable by everyone in the world (nowadays) as an object that orbits the Earth, suggesting it is real and independent of
human interpretation. Accepting its “realness” allows us to do research about it and
define its properties and relationship to other objects, spreading in this way its
“realness” to other objects (including us who make observations about it).
The whole truth will never be revealed to us as all the facts that support it
cannot be revealed. For this reason, we need theories that allow us to imagine
(correctly or incorrectly) what is hidden from our senses and cannot directly be
observed. In research, a realist stance is based on empiricism. Reality or the true nature
of what we are trying to find is out there and will be revealed to us in time. The
assumption is made here that the observable world is composed of elemental and
discrete entities that when combined produce the complex empirical patterns that we
observe. The researchers’ tools are their senses, their impressions and perceptions as
formulated by past experiences, and their rationality. By identifying and describing the
elemental constructs of a phenomenon, realists can study their interaction and form
theories that explain the phenomenon. A lot of these explanations will require
abstractions (like socialism, for example) as they cannot be observed as elemental
entities but rather as aggregates of simpler ones. Understanding social structures will
help transform them in ways that better support human growth.
Many variations of realism have been developed from various philosophical
schools to address deviations from the generic realist path. Among them, critical
realism, idealism, and internal realism hold prominent positions. The first two have been going at each other for some time now as rivals for the title of true representative of realism. Critical realists insist on the separation between social and physical objects based on the belief that social structures cannot exist in isolation and are derivatives of the activities they govern and their agents’ perceptions. In turn, these structures tend to
influence the perceptions of their agents, creating in this sense a closed system between
agents and their social construct. Idealists critique the positions of critical realists by
suggesting their interpretations and subject of inquiry fall into metaphysics as they
construct imaginary entities that further impose an ideology that, as most ideologies,
can be oppressive and exploitable.
The variation of realism that comes in the form of internal realism becomes
interesting with respect to social sciences research. The position here is that the world
is real but it is beyond our capabilities and means to study it directly. A universal truth
exists, but we can only see glimpses of it. In business research, realism can be reflected
in the assumption that entities like sellers and buyers interact in a physical (not
imaginary) setting so decisions must reflect the outside reality and not our internal and
subjective representation of it. In that respect, researchers and practitioners need to
try and understand what really happens in the marketplace in terms of the properties
of the various entities and the way they exert forces and interact among themselves
and their environment. However, it is obvious that due to our limitations in receiving
and interpreting all the signals of reality we can only gain an imperfect and mainly
probabilistic understanding of the world. A divide between the real world and the
particular view we perceive at a certain time and place will always exist and as such our
investigations and the interpretations we provide will always be linked to our
experiences as researchers.
1.1.2 Relativism
Relativism takes an opposite stance to that of realism by taking the position
that what we perceive as reality is nothing more than the product of conventions and
frameworks of assessment that we have popularized and agreed upon as representing the
truth. In that sense, truth is a human fabrication and no one assessment of human
conduct is more valid than another. Understanding is context and location specific, and rationality, among other things, is a subject of research rather than an undisputed conceptual resource for researchers. Relativists view reality as time dependent, and something that was considered true at some point in time could easily be proven false at a later time when experience and resources reveal another aspect of the
phenomenon or situation under investigation. Research provides revelations of reality
and discourses help develop practices and representations that help us experience and
make sense of the world.
A popular interpretation of relativism is that there is a single reality that exists out there, but it is not directly accessible through experimentation and observation. We
end up deducing a lot of the underlying nature of things (like the electrons in atoms)
by observing the way they influence macroscopic phenomena. The accuracy of our
observations can never go beyond a certain threshold because the mere act of
observing means we interact with the object of the observation and thus alter the
aspect of it (variable) that we intended to observe. This can be seen in social sciences,
for example, when we ask people how aware they are of a phenomenon. The question
itself brings the phenomenon in question into the subject’s awareness, affecting in this
way the underlying awareness they had of the phenomenon before the question was
asked.
Because reality is seen as relative to the individual and to social groups, relativism suggests approaching research with humility. As a result, research should not be seen as a revelation of a universal truth but more as a reflection of the period, the
researchers, and the context of the research. Making judgments about social
phenomena like culture should be done with caution and always avoid biases that could
privilege one interpretation or culture, for example, over others. The values held by
one culture might be totally different from another without making one better or more
just than the other.
While it might seem that no progress can be made since everything is subjective and situation dependent, we must not consider relativism as opposing the scientific method. On the contrary, it should be seen as a complement or an added
filter that can provide an alternative view based on the influence of the observer in
interpreting information and formulating theories. Critics of relativism believe that the
lack of underlying assumptions that can be considered true undermines the possibility
of the development of commonly accepted theories that will explain the parts of the
world we observe. Another criticism concerns the inherent contradiction of relativism that results from the universality of the claim that everything is relative to the place, time, and context in which it is observed. The universality of such a statement is something only a realist would make. In other words, if everything is relative, then the statement that everything is relative is itself relative; asserting it as an absolute truth amounts to a realist stance. A further criticism of relativism that needs to be considered in social sciences research is its susceptibility to the perceived authority of the carriers of truth. Protagonists with influence and power are bound to promote an adoption of the truth that aligns with their perception of it.
influences could be exerted by social entities (government, business, etc.) with stakes
in the specific and general area of the truth.
1.2 Epistemology
The etymology of the word epistemology suggests the discourse about the
formal observation and acquisition of knowledge. This branch of philosophy is concerned with the sources and structure of knowledge and the conditions under
which beliefs/knowledge become justified. As sources of knowledge, here we consider
“reliable” ones like testimonies, memory, reason, perception, and introspection, and
exclude volatile forms like desires, emotions, prejudice, and biases. Perception through
our five senses is the primary entry point of information into the mind where it can be
retained in memory for further processing through reason. Testimony is an indirect form of knowledge whereby we rely on someone else to provide credible information about someone or something else. Finally, introspection, as a unique
capacity of humans to inspect their own thinking, can supplement reason in making
decisions about the nature of evidence and truth.
From the standpoint of the academic research that we are interested in here, epistemology is vital in defining our approach to data collection and analysis and also in the interpretation of findings in search of the underlying truth of the phenomenon
we are studying. A more practical perspective that is also of interest to research
concerns the creation and dissemination of knowledge in a domain of study. Again, as
we did with the case of ontology, we will discuss here the main schools of thought that
guide social sciences research, like positivism and constructionism. In positivism, as
we will see next, the social world exists in reality and we try to accurately represent it
as knowledge (left image in Figure 1.3), while in constructionism learning takes place
in a social context as we confirm our knowledge with others (right image in Figure
1.3). As we will see in the next sections, the aforementioned epistemological stances
align to a great extent with particular ontological stances. While this alignment is welcome, it should not be considered absolute or a one-to-one relationship.
1.2.1 Positivism
Positivism is based on the idea that the social world exists externally and can
be studied and represented accurately by human knowledge only when it has been
validated empirically. This aligns well with the realist perspective and the
corresponding theory of truth, but it shouldn’t be seen as a one-to-one
correspondence. Social entities and their interactions are seen as observables that can
be expressed through the appropriate choice of parameters and variables. These can
be studied and empirically tested to reveal the true nature of social phenomena. For
example, organizational structure and culture exist and can be studied to provide proof
of their influence on organizational performance.
The dominance that positivism grants to observation of the external world over any other source of knowledge was initially a reaction to metaphysical speculation. This resulted in the establishment of the word ‘paradigm’ in the social sciences as a description of scientific discoveries as they happen in practice rather than the way they are presented in academic publications. While traditional theories provide explanations for phenomena, new observations and experiments might challenge established theories. Enlightened scientists might then come along and provide radical insights that both accommodate past theories and at the same time explain the new observations. These breakthroughs do not come through the traditional way of advancing science by incremental application of existing practices but are based on the revolutionary thinking of exceptional individuals who leap to new ways of seeing the world.
Positivism is a paradigm that dominated the physical and social sciences for
many centuries before being criticized in the latter half of the twentieth century for its
inappropriateness in describing complex social phenomena that are formed by the
volatile nature of human behavior. Because of this volatility, social phenomena are
difficult to replicate and so it becomes hard to consistently produce the same results.
One way of increasing the reliability of research findings, as we will see later, is by
repeating research procedures. If the same results are observed, then the empirical
evidence can provide strong support for the foundation of theories.
Under positivism, the social sciences aim at identifying the fundamental laws behind human expression and behavior that establish cause and effect relationships among elementary propositions that are independent of each other. This is something like the elements in chemistry, which combine under certain laws to create more complex chemical and biological structures. Research follows a hypothesis-and-deduction process, where the former formulates assumptions about fundamental laws and the latter deduces the observations that will support or reject the truth of the hypotheses. For observations to be accurate representations/measures of reality, a clear and precise definition (operationalization) of concepts and variables is required.
1.2.2 Constructionism
In response to the “absolute” nature of learning through observation and
experimentation of measurable entities, constructionism was developed during the last
half of the twentieth century. Its purpose was to address the subjective nature of social
experience and interaction. In the constructionist paradigm, it is argued that our
perception of the world is a social construct formed by commonly agreed beliefs
among people and that these constructs should be investigated by research. We assume
here that the internal beliefs of individuals shape their perceptions of their external
reality to the extent that they behave as if their constructed reality is the actual reality,
making the argument about an objective reality unimportant.
One way of converging the multitude of constructs of reality that various
people hold is through a negotiation process that will eventually end in a shared
understanding of a phenomenon/reality. The challenge this process poses is with the
degree of understanding of the negotiating parties, their power and skill asymmetries
in conducting negotiations, and the motivations each party has for reaching an
agreement. The job of the researcher becomes the collection of facts in the form of
subject statements and responses, the identification of patterns of social interaction
that persist over time, and the development of constructs that capture and uniquely
identify beliefs and meaning that people place on their experience.
The way people communicate and express their beliefs and positions, and an understanding of what drives them to interact the way they do, take here the place of the cause and effect relationships that in other approaches form the basis for understanding and explaining phenomena. Social interactions are not a direct response
to external stimuli but go through a process of developing an agreed-upon meaning before a reaction materializes. The grounds upon which constructionism is developed
align it almost perfectly with relativism.
A major challenge with constructionism regarding research is the fact that
when dealing with external events, like how the market behaves or how an
organization interacts with its stakeholders, an external perspective is required. The
issue is no longer how we perceive reality but rather what the reality of external
stakeholders is. Another challenge we face is the inability to compare views of
individuals as they are subjectively formed and do not represent accurate/realistic
reflections of their outside world.
To address many of the challenges constructionists face with respect to quality
(like validity in positivism), compliance with a set of criteria is sought in
constructionism-based research. Prominent among them is authenticity, whereby the
researchers need to display understanding of the issue under investigation.
Additionally, they need to demonstrate their impartiality in interpreting findings as
expert methodologists. In that direction, identification of correct operational measures
for the concepts being studied will support construct validity. Internal and external
validity are also of concern as they aim at establishing causal relationships and ensure
the generalization of findings, respectively.
1.3 Methodology
Armed with our beliefs about the nature of reality and learning we move on to
adopting the process we will follow in collecting information and data about the
phenomenon we will study. Methodology, as a field of study, concerns the systematic
study and analysis of the methods used for observations and drawing conclusions. In
short, methodology is the philosophy of methods and encompasses the rules for
reaching valid conclusions (epistemology) and the domain/‘objects’ (ontology) of
investigation that form an observable phenomenon. Its philosophical roots suggest
that it is expressed as an attitude towards inquiry and learning that is grounded in our
perception of reality and forms the guiding principles of our behavior.
While it has strong philosophical roots, it is in essence the connecting link
between philosophy and practice as it provides the general strategy for making and
processing observations. Because “practice” is a core ingredient of methodology, it means that the subject of inquiry/research has already been identified and considered in the choice of methodology. Here is where ontology and epistemology come
into play. Our beliefs about reality and learning can suggest ways of approaching
inquiry and the way we observe the phenomenon under investigation. For example,
suppose that we are studying work burnout. A positivist perspective would assume
that burnout exists in reality and proceed to formulate tests using large samples that
would measure it and confirm its existence while providing details of the various
variables and parameters that influence its expression. In this way, a cause and effect
relationship will be established between the individual influencers/variables. Among
them, one could find control variables like the environment (suppressive and
authoritarian) and independent variables (we will talk about them soon) like genetic
predisposition, etc. The focus here during data collection and analysis is more on observations of the phenomenon (the “What”, “When”, “Where”, and “Who”; Figure 1.1), leaving the “How” and “Why” as generalizations for the interpretation and conclusions phase.
In contrast to positivism, we can consider a constructionist perspective
that would focus on aspects of the environment that are considered by individuals as
contributing to burnout and how they manage themselves in such situations.
Researchers would arrange for interviews with those who have experienced burnout.
By recording the individuals’ stories and appropriately probing about the phenomenon, the researcher develops themes that persist across individuals, and in this way an explanation of the phenomenon surfaces. The focus here is more on the “How” and “Why” (Figure 1.1), leaving the identification of commonalities (the “What”, “When”, “Where”, and “Who”) for the interpretation and conclusions. In a way, it is like going
from effect to cause, while in positivism the perspective will move from cause to effect. A
point of interest here is that the researcher is not defining the phenomenon and its
characteristics but leaving it up to the research subjects to do so.
When considering the methodological approaches to research we shouldn’t forget that one common theme is that they all seek the ‘truth’ and to make sense of what we observe/sense. All methods also aim at providing transparent and convincing evidence for the conclusions they draw, and they are all judged on the validity and accuracy of their predictions. A “limitation” of proof in social sciences that is important to mention here is that while the path we follow in proving something might lead to success (or, better yet, acceptance by others), it will not necessarily lead to positive change that improves the lives of others. Research can only bring awareness and understanding, so we should be clear that change is something reserved for policymakers, individuals, and groups.
It is worth pointing out here what a phenomenon is, as we frequently refer to
it as the core element or the essence of research. By its etymology, a phenomenon is
something that appears, meaning it is observed. In our case, we will also add the
element of repeated appearance as otherwise it might not deserve the effort one can
devote to its study. Value from research comes from the understanding we gain about something that we can later use to make predictions and optimally deal with similar situations. Understanding comes from being able to represent the complexity of what we observe.
The choice of methodology based on the data we plan to collect is one guiding
principle, but it is not the only one. As we mentioned before, our epistemological and
ontological stances play a crucial role in the type of methodology we choose, and one
might go for example for a qualitative methodology (like in a case study of a single
organization) that includes the collection and processing of quantitative data (like its
balance sheets for a period of time). Another element that might influence the choice
of methodology is the role of the researcher in data collection and interpretation.
Quantitative researchers attempt to maintain objectivity by distancing themselves from
their sample population to avoid interference from their personal beliefs and
perceptions that could potentially influence the sample’s responses. Biased observations and interpretations are anathema to scientific research and, along with ignorance, can do more damage than good. Such distancing is considered impossible by qualitative researchers, who posit that one’s personal beliefs and feelings cannot be eliminated and that only by being aware of them can realistic interpretations and conclusions be drawn. The following sections present the various methodologies in
more detail and provide more information about their underlying philosophies that
will (hopefully) guide someone to correctly choose and apply them in practice.
1.3.1 Quantitative
Quantitative methods and analysis are based on meaning derived from data in numerical form like scale scores, ratings, durations, counts, etc. This is in contrast
to qualitative methods where meaning is derived from narratives. The numbers of
quantitative research can come directly from measurements during observation or
indirectly by converting collected information into numerical form (like with a Likert
scale, which we will see later). While this definition of quantitative research covers the
basics of what it is, a more in-depth representation defines quantitative methodologies
as an attempt to measure (positivist stance mainly) an objective reality (realist stance
mainly). In other words, we assume that the phenomenon under study is real (not a
social construct) and can be represented (knowable) by estimating parameters and
measuring meaningful variables that can represent the state of entities that are involved
in the phenomenon under study.
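The book's hands-on examples rely on spreadsheets and SPSS, but for readers who prefer scripting, the short Python sketch below is offered purely as an illustration of the conversion step just described. The five-point response labels, the sample responses, and the choice to average items into a scale score are assumptions made for the example, not prescriptions from the book.

# A minimal sketch (illustrative only): converting Likert-type responses
# to numerical codes and summarizing them per participant.
likert_map = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

# Hypothetical responses of three participants to three questionnaire items.
responses = [
    ["Agree", "Strongly agree", "Neither agree nor disagree"],
    ["Disagree", "Agree", "Agree"],
    ["Strongly disagree", "Neither agree nor disagree", "Disagree"],
]

for i, answers in enumerate(responses, start=1):
    scores = [likert_map[a] for a in answers]   # text responses to numbers
    scale_score = sum(scores) / len(scores)     # average the items into a scale score
    print(f"Participant {i}: item scores {scores}, scale score {scale_score:.2f}")

Averaging items in this way treats ordinal responses as if they were interval data, a common but debated convention that many of the statistical techniques discussed in later chapters rely on; the same conversion can, of course, be done with a lookup formula in a spreadsheet.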
Associations among such variables can further be used to establish
relationships and ultimately suggest cause and effect that will predict the value of one
variable (effect) based on the observations of another (cause). This process is
particularly well suited to situations where we form hypotheses in an effort to explain something. Hypotheses are statements of truth about facts that can be further tested by investigation. This usually follows when theories or models are formed in an attempt to provide “solid” (statistically, that is) evidence for their statements and assertions. Model testing is, in other words, the domain of the quantitative research methodology.
In order to produce reliable evaluations, quantitative research requires large
numbers of participants and the analysis is done through statistical tools. Large
samples ensure better representativeness and generalizability of findings as well as
proper application of the statistical tests. The investigator and the investigated are
independent entities and, therefore, the investigator can study a phenomenon without influencing it or being influenced by it. This assumption has been challenged, especially when attitudes, beliefs, and behavior in general are studied. This is
something that quantitative researchers usually defend by emphasizing that the focus
of quantitative methods is not on what behavior means but on what causes it and how
it is explained.
In addition to the aforementioned criticism we should not forget that
quantitative methods do not serve as direct observations of a phenomenon as the
researchers are usually in control of the environment, so the results are seen to an
extent as “laboratory results” instead of as “life observations”. Using instruments like questionnaires is also considered an intervention, and such instruments can be seen as expressions of the designers/researchers rather than independent and objective measurement instruments. The issue of objectivity is a subject of constant debate among methodologies and is closely related to the internal and external validity that will be discussed in follow-up chapters.
1.3.2 Qualitative
While qualitative methodology is not the focus of this book, a brief mention
is deemed necessary as it is vital for researchers to understand what is available to them
before committing to a methodology for their research. In general, a qualitative
methodology is defined as everything that is not quantitative. While this type of
definition by exclusion might reflect to a great extent the truth, it is best if we can
define qualitative research with respect to what it is instead of what it isn’t. In search
of a definition of what qualitative research is, the roots of the word quality itself can
serve as a point of departure.
Quality comes with many meanings, ranging from the essence of something to
the level of something with respect to another/similar something that is taken as a
standard. By “something”, in the context of research, we refer to theoretical constructs
that represent assemblies of conceptual elements that appear as independent entities
influencing or representing phenomena or aspects of them. In that respect the first
definition of quality can form the basis upon which qualitative methodology is
expressed. While the latter definition might seem irrelevant to a research methodology, this is far from the truth. Qualitative research is grounded in comparing theoretical
constructs as it is in the act of comparison that new constructs are developed. Such
constructs are necessary in social sciences when human behavior needs to be studied.
Humans are moved by needs, and in response to environmental triggers (both physical
and social) they build an understanding of the world around them that helps them
make sense of it and respond. One definition that captures all this is to see qualitative
research as a methodology that aims to understand human behavior and more
specifically the beliefs, perceptions, and motivations that guide decision making and
behavior.
1.3.3 Mixed Methods
While mixed methods might be seen as a move away from the methodology and a step closer to practice as an approach to
examining a research problem, there are many researchers who view the methodology
as a separate and independent epistemological way of approaching research. This
brings mixed methods into the realm of pragmatism (somewhere in between
positivism and constructionism). Others, though, believe that paradigms are not to be
seen as distinct but rather as overlapping or with fluid boundaries where one gives rise
to and supports the other (Figure 1.6), so combining them is quite an acceptable way
of conducting research.
The main issue with mixed methods is the sequence and extent of applying the
quantitative and qualitative components of inquiry. For example, we might be
interested in understanding the context of a real-life phenomenon like leadership in
transient teams (teams that are formed for a specific task and then dissolved). If we
are not aware of the characteristics of leadership that lead to success or failure of
leaders in such teams (or there is no strong research behind it), we might decide to
begin with a qualitative methodology that through in-depth interviews with select
leaders of transient teams will reveal key characteristics. We will then follow this with a quantitative methodology whereby, through the development of a widely distributed questionnaire, we will seek to confirm that the identified characteristics (or some of them) exist universally in transient teams. Alternatively, we might already know (for example, from
past research or similar cases) the range of possible characteristics, so we might decide
to start with a quantitative methodology to identify the specific leadership
characteristics that affect transient team performance and then follow with a
qualitative methodology (for example, in-depth interviews) to identify the reasons
behind the influence of the specific characteristics.
The three possibilities of mixed methods designs are outlined in Figure 1.4.
We can start with a quantitative approach (large sample) that will provide findings like
certain characteristics (attitudes, traits, perceptions, etc.) of a population and follow up
with a qualitative approach (specific small sample) for explanations to surface, and
thus the method is called explanatory sequential mixed methods. On the other hand, we have exploratory sequential mixed methods, where we can start with a qualitative approach (small sample) to explore the range and depth of a phenomenon and continue with a quantitative approach (on a wider sample) to provide the “proof” and confirm what
we found. Finally, we can have both methods working in parallel (convergent parallel
mixed methods), and aim at calibrating each of them and comparing their results to
converge (or diverge) for an interpretation, enhancing in this way the validity of the
research findings. An example of the parallel application of methods is when they are
combined in one instrument like a questionnaire that has both closed-ended and open-
ended questions. Other possibilities of combinations have also been identified, like
conducting one before and after an intervention, or having quantitative data in a
qualitative research and vice-versa, but they are beyond the scope of this book.
Apart from the sequence of execution of the various methodologies, of great importance is dominance. The simple way of seeing dominance is in terms of the time that is devoted to each methodology. This is a rather simplistic view, though, as the importance of each in the research might differ. For example, one might dedicate plenty of time and resources to in-depth interviews that confirm past research findings while spending very little time posting a questionnaire online and processing the results. While the interviewing part might seem to dominate, the importance of the quantitative part is at a much higher level.
Along with the advantages of mixed methods research like the well-rounded
investigation of a phenomenon and the insight they could provide for guiding future
choice of methods, there are criticisms that need to be considered before adopting
such methodologies. Prominent among them are the contradictions of the underlying
paradigms that need to coexist and provide support for the research, as well as the
competencies that the investigators need to have as they need to cover the full
spectrum of quantitative and qualitative methodologies. Additionally, mixed methods
studies take significantly more time and more resources to complete, making them unsuitable when time and resources are of the essence. In closing, mixed methods might
be ideal for comparing quantitative and qualitative perspectives for instrument
calibration, for discovering the parameters and variables involved in phenomena, for
understanding and providing support with raw/quantitative data for the “how” and
“why” of phenomena, and in support of interventions and change initiatives (like for
marginalized populations).
2 The Quantitative Research Process
Given that our focus in this book is the social world, we need to add another
important element that usually accompanies the purpose and that is the population of
the study. A brief profile of the entities (people, groups, organizations, etc.) is
necessary to provide the social context of the phenomenon we study and also suggest
later on the way we will sample that population to select those who will participate as
research subjects. A related concept that we will see later on, which identifies our atomic elements of study, is the “unit of analysis”.
With problem and purpose statements formulated, and armed with their philosophical beliefs about the nature of reality (ontology) and inquiry (epistemology), researchers choose the philosophy that will guide their methodology, which in our case here will be quantitative. From this point on the quantitative aspect will be the
predominant aspect of research and the focus of our discussion. The identification of
the entities that will be considered in the representation of the phenomenon will be
first done through the research questions that will be adopted. These will include the
entities that will be measured and/or the type of relationship among them that we
hope to establish. In developing the research questions that align with the problem
and purpose of the study, researchers need to know what other researchers have done
in the past and the types of theories that have been developed to explain similar
phenomena and closely related theoretical constructs (Figure 2.1). This is done
through a review of the extant literature to identify information relevant to the study
that can serve as a point of departure for developing research questions.
A point of discussion, debate actually, is whether the research questions should
guide the literature review. While the argument that the research questions, and not
the research topic, will dictate the parameters of the literature review has merit, the
truth is that by the time they come to developing the research questions researchers
have already gone through a literature review process (even if it is not an extensive and
exhaustive one). This is to ensure the phenomenon of their investigation is a significant
and valid one for scientific research (Figure 2.1) and that it hasn’t been researched before (unless of course the purpose of the study is to repeat a study for reliability purposes, something that is out of the scope of this book). Additionally, as we will
see in the next section, in formulating our research questions it is vital to know what
relevant information (constructs, variables, theories, etc.) has already been investigated
or recommended for the explanation of the phenomenon.
The stance we will follow in this book is that the purpose will result in research
questions that are informed by the literature review and the theoretical framework we
choose as our initial point of departure in applying, building, or extending theory. The
research questions in return will provide further direction in reviewing the literature
and shaping the theoretical framework, whether this is in the form of existing theories,
additional constructs, or rearrangements of pieces from multiple theories and
constructs. In essence, a feedback loop is established that informs and influences its
elements until it settles down to a three-way supporting structure of research
questions, literature review, and theoretical framework (Figure 2.1). The purpose of
the research and the researcher’s preconceptions or understanding of the phenomenon
under investigation can point to where one enters the feedback loop.
2.1 Literature Review
A literature review allows researchers to establish what has been done in a field of study (justifying, in this way, their own research), gain insights about the methodologies (if any) and practices used in the past
to study the phenomenon, and identify closely related phenomena that could
potentially serve as points of departure in our inquiry. At a deeper level, the literature
review can even suggest new theoretical constructs and even specific variables that we
need to consider in our study. This is critically important for the development of the
research questions as it can suggest what needs to be included or avoided.
In addition to revealing what has been done in a field of study, a literature
review also adds credibility to research as it showcases the researcher’s knowledge and
understanding of the state of the art in a topic. This includes familiarity with the
phenomenon under investigation, the history of its conceptualization, terminology,
research methods and practices, and assertions and findings. Additionally, when a
literature review is included in publications it serves to inform the reader and, in many
cases, it is considered a legitimate scholarly work that can be published by itself.
Before deciding on a literature review process, one needs to consider what type
of review they are going to conduct. The motivation and goal for conducting the
review will define its focus, the extent of its coverage in terms of subject matter depth
and breadth, and its point of view perspective (critical evaluation of its findings). With
respect to justifying research, the motivation could be to exhaustively present what has
been done in a field with the goal of revealing persistent themes,
generalizations/theories, or gaps that haven’t been addressed yet. While the latter can
help develop the research questions and provide justification and legitimacy for
conducting the research, persistent themes and generalizations/theories can also help
develop the theoretical framework upon which our research will be based.
Conducting a literature review is like doing research. The only difference is
that instead of people, the subjects of investigation are research articles. While the
process of developing a literature review is beyond the scope of this book we can
briefly state that the process begins by establishing a search plan that involves
identifying keywords that will be used in search engines (like Google Scholar at
scholar.google.com) or library databases (like EBSCO, PROQUEST, etc.) where
academic research publications are collected and cataloged. When search results
become available, the researchers will work on an initial screening for relevance to the
topic of investigation, credibility, and publication date. While relevance will be decided
based on the breadth and depth in the field of study one wants to cover (reading the
abstract will give an indication of this), credibility can be decided on the widespread
popularity and acceptance of a scientific journal (impact factors can reveal this) and
the position (affiliation), expertise, and experience of the authors of a publication.
Regarding the publication date, for fast-advancing topics (in terms of
discovery) like technology one might consider only publications within the last three
years to match the life cycle of new developments in the field that could have either
addressed the problem of the study or made it obsolete because something new has
replaced it. In other fields like social sciences, for example, the change is usually not
that fast (excluding revolutions and radical social upheavals), so one might assume a
five-year window into the recent past for searches. Exceptions to the consideration of only recently published research could be seminal pieces of work that have withstood the passage of time and theories that could be considered in the theoretical framework of
our research.
Working on screened publications, researchers will move on to categorize the
published material according to themes that will form the core of the literature review.
Each theme will later be developed under its own heading and will include a critical
and comparative evaluation of the published research under its domain. The general
structure of a literature review starts with the goals and motivation for developing it,
presents the search strategy that was followed in retrieving and screening published
research, moves on to thematically present its findings, and concludes with a
summative and comparative presentation of the core themes that are directly related
to the phenomenon under investigation. A description of the search strategy is
necessary primarily for validity and reliability purposes. Future researchers who want
to validate the review should be able to follow the same process and (hopefully) come
up with the same or similar results.
2.2 Theoretical Framework
Developing the theoretical framework begins with identifying existing theories that could explain the phenomenon under investigation or aspects of it. This could result in one or many theories, as some might address specific aspects of
the phenomenon we study while others might deal with its more generic
characteristics. Multiple theories may exist (Figure 2.2) that could overlap in some
aspects (theoretical constructs) and deviate in some others. Overlaps (constructs XY1
and XY2 in Figure 2.2), might indicate areas with strong influence that have also been
validated and proven in providing accurate explanations, while gaps might suggest the
need for individual constructs (new or existing ones) that need to be considered for a
thorough description of the phenomenon we study (construct Z in Figure 2.2). The
latter is a case where the theoretical framework might inform the literature review and
suggest searches that will support new or existing constructs.
When considering and selecting the various elements that will form our
theoretical framework (Figure 2.3), caution should be exercised in choosing the
number of theories we will consider. Too many might diffuse our focus on the
phenomenon and provide a level of granularity that adds unnecessary complexity. Too
few (usually one) might miss important theoretical constructs or parameters that might
have already been investigated and whose inclusion might be necessary for a complete
coverage and description of the various aspects of the phenomenon we study. A rule
of thumb might be to avoid considering more than three. If convenience is an issue or
the subject of our study has not been investigated in the past, one theory might very
well serve as the basis for developing our framework. In terms of theoretical
constructs, a balance between generic and specific is needed. National culture as a proxy for influence, for example, might be too generic in the case of the identification of an entrepreneurial opportunity that we showcased previously, and family environment might instead serve as a better alternative.
2.3 Research Questions and Hypotheses
Depending on how much is already known about the phenomenon, quantitative research questions typically take forms such as:
What theoretical constructs (or variables) describe (or influence) the phenomenon under investigation?
or
Is X a component of (or influencing) the phenomenon under investigation?
What is the association (or relationship) between X and Y in the phenomenon under investigation?
or
Is there an association/relationship between X and Y in the phenomenon under investigation?
An example could help highlight the possibilities here. Let us assume that we
want to study the problem of leaders underperforming in crisis situations in
multicultural environments. The problem statement in one sentence can be expressed
in the form: ‘The specific problem this research will address is the lack of understanding of the
personality traits that help leaders in multicultural organizational environments perform better in crisis
situations.’
This sentence of course will have to be expanded on and supported with
adequate and recent (last 5 years at most) citations to ensure the problem is valid,
significant, and current. With a problem statement like the aforementioned, one would
expect a purpose statement of the form: ‘The purpose of this research is to provide an
understanding of the personality traits that help leaders in multicultural organizational environments
perform better in crisis situations.’
In a case where we don’t know what theoretical constructs and variables are
involved in the description of the phenomenon, a research question can be expressed
in the form: ‘What personality traits help leaders in multicultural organizational environments
perform better in crisis situations?’ If our literature review has hinted that certain traits,
extroversion for example, help leaders in general perform in crisis situations, we might
formulate a research question of the form: ‘Do leaders with the extrovert personality trait
perform better in multicultural organizational environments in crisis situations?’ This, alternatively,
can be expressed in the form: ‘To what extent does the extrovert personality trait help leaders in
multicultural organizational environments perform better in crisis situations?’, or in the more
familiar form: ‘Is there an association (or relationship optionally) between the extrovert personality
trait and a leader’s ability to perform better in crisis situations in multicultural organizational
environments?’ This variation covers a lot more ground than the previous version, as it does not only aim to establish existence (or non-existence, if the “extent” of influence is zero); it also hopes to identify the strength of the influence the particular trait exerts.
A point of interest here is that the “what” form of expressing a research
question can also appear in qualitative research, but the “relationship” form can only
exist in the quantitative methodology we study here. This latter form needs to be
accompanied by a set of hypotheses (null and alternative) that exhaustively confirm or
reject the existence of what is assumed and stated. A hypothesis is nothing more than
a presumed statement of the existence or non-existence of a population characteristic
or relationship between two variables. The validity of such statements is assessed by
the use of statistical methods applied to the data generated through empirical inquiry
of the phenomenon under investigation.
While we will discuss the hypotheses in a separate chapter later on, suffice it
to say here that for each research question we have two forms that are mutually
exclusive: the null and the alternative. For the example of leaders’ performance in crisis
situations showcased here, the research question we adopted will suggest the null form:
‘There is no association (or relationship optionally) between the extrovert personality trait and the
leader’s ability to perform better in crisis situations in multicultural organizational environments’,
with the alternative form being: ‘There is an association (or relationship optionally) between the
extrovert personality trait and the leader’s ability to perform better in crisis situations in multicultural
organizational environments.’
Hypotheses are a characteristic unique to quantitative research, and they are more of a research design element than anything else. They are directly tied to the data
analysis process and the statistical techniques used in developing support for the
existence (or not) of relationships among variables that quantitatively (in numerical
form) represent theoretical constructs as well as estimations of the strength of the
relationships. They are included here because of their direct relevance to the research questions and for their conceptual integration within the research process.
Before venturing into the research design perspectives and the process of
collecting and analyzing data it is worth mentioning here the very important concept
of unit of analysis, specific mainly to social sciences research (Figure 2.4). While we are typically interested in the way individuals interact with one another and with their environment, we are often interested in how identifiable entities like an industry, the
market, social interactions (like, say, negotiations), teams, groups, and even societies
as a whole behave at some time period in their life. In these cases, the unit of analysis
becomes the entity we are studying. This is not to be confused with the unit of
observation/our sample, as in most cases it will be individuals who due to their
expertise or position express their opinion about the state and behavior of the unit of
analysis. An example might help clarify the case. If we are studying employee
performance, then, obviously, our unit of analysis is the individual employee. If, on
the other hand, we are studying organizational performance, then our unit of analysis
is the organization. Regardless of the unit of analysis, the research data might be
provided in both cases by the same individuals. A point of importance is that the unit of analysis does not have to be an entity. It can very well be a social interaction or artifact like partnerships, beliefs, etc. For example, we might be interested in studying
how humor is used in negotiations. In this case, humor in the specific setting is our
unit of analysis. The data of course will be provided by individuals (units of
observation or sample). In this example, another unit of analysis (there can be more
than one) could be the negotiation as a form of social interaction. In practice, units of
analysis should be deduced or inferred by the purpose of the study or the research
questions.
This mainly aims at testing a hypothesis and conclusively confirming in this way findings that can later on be generalized. It is often debated whether confirmatory work can qualify as exploratory, because the former is expected to be representative, reliable, and valid, while the latter is vague and inconclusive. As a general practice, in the quantitative domain it is good to leave hypothesis testing out of exploratory designs.
Presumably, when the stage of exploration has been completed one moves on
to describing what has been found. This phase is the domain of the descriptive
research design and it is usually launched as a separate investigation/research. It is used
to obtain refined information about variables identified through explorative research
in terms of definitions, limitations, range, and units of measurement that will describe
them. As such, descriptive designs require the development or adoption of proper
instrumentation (like scales) for measuring variables and constructs. In case
observations are utilized, such designs might not qualify as quantitative. Regardless,
descriptive research is ideal as a precursor to more quantitative designs like the explanatory designs we will discuss next. In addition, and due to the large amounts of data collected and processed, such designs can result in important recommendations for practice. Caution needs to be exercised, as the results cannot be used for discovery or proof, especially when they cannot be replicated (as in the case of observations).
Good descriptions will provoke the “why” questions that fuel explanatory
research. Answering the “why” involves developing explanations that establish cause
and effect relations among factors (causes) and outcomes (effects) for the
phenomenon under investigation. For that reason, explanatory designs are also seen
as causal designs. Such designs identify the conditions/factors that need to exist for
the phenomenon to be observed. Hypothesis testing (discussed later on) is specifically
developed to address such conditional statements. A case in point in social sciences is
when we want to measure the impact a specific change can have (say a policy to enforce
entrepreneurial growth) on the behavior of members of the society (like potential
entrepreneurs). In such a case, causality is established by validating an association
between the change that will be expressed through a factor (also seen as independent
variable) and the behavior (also seen here as dependent variable) it affects. Variations
in the independent variable/factor need to cause variation of the dependent
variable/phenomenon, making sure also that a third variable is not influencing the
observed variations (non-spuriousness).
According to the nature of the explanation/causation that is sought,
explanatory designs can be further subdivided (Figures 2.5 and 2.6) into the general categories of comparative, when two groups are compared; causal-comparative, when we try to understand the reasons why two groups are different; and correlational, when no attempt is made to understand cause and effect. Comparative designs are the simplest
ones and focus on examining differences in one variable (the dependent/factor)
between two groups or subjects. In this case, the subjects and measurement
instruments need to be described in detail.
examined using a controlled experiment. This usually involves two groups, one of
which expresses the factor we are studying while the other one doesn’t (often called
control group). An example of such a case is when we try to establish the relationship
between experience and job satisfaction. One would expect that professionals with
more experience (one group) are going to be more satisfied with their jobs compared
to professionals with less experience (second/control group), establishing in this way
a cause (experience) and effect (job satisfaction) relationship. Alternative names to
causal-comparative can exist like ex-post-facto and correlational causal-comparative
when correlational models are used to investigate possible cause and effect
relationships.
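As an illustration of the two-group comparison described above, the sketch below compares hypothetical job-satisfaction scores for a more-experienced group and a less-experienced control group using SciPy's independent-samples t-test; the scores and group definitions are invented purely for demonstration.

import numpy as np
from scipy import stats

# Hypothetical job-satisfaction scores (1-10 scale) for two existing groups
high_experience = np.array([8, 7, 9, 6, 8, 7, 9, 8])   # more than 10 years of experience
low_experience = np.array([6, 5, 7, 6, 5, 6, 7, 5])    # less than 3 years of experience

# Independent-samples t-test; Welch's version does not assume equal variances
t_stat, p_value = stats.ttest_ind(high_experience, low_experience, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

A significant difference would be consistent with, though not proof of, the presumed cause and effect relationship, since the groups were not randomly formed.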
Correlational designs, finally, work on establishing an association between
two variables in single groups when no manipulation of the variables can take place or
is allowed. In the previous example of experience and job satisfaction, a correlational
design will work on a group of experienced professionals and try to establish if job
satisfaction is high. Along with causal-comparative, correlational designs are usually
the precursors to experimental research. A further subdivision of correlational designs can be made into simple correlational studies, which focus on the relationship between two variables, and predictive studies, where the relationship is used for prediction. If one variable (often called the predictor) is used to predict the performance of a second variable (often called the outcome or criterion) we have the category of simple predictive studies, while when multiple variables/predictors are involved, we have the category of multiple regression.
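A minimal sketch of the correlational and predictive cases, assuming hypothetical experience and satisfaction data: Pearson's r quantifies the simple association, while a least-squares line (via SciPy's linregress) treats experience as the predictor of the satisfaction criterion.

import numpy as np
from scipy import stats

experience = np.array([2, 4, 6, 8, 10, 12, 14, 16])    # years (predictor)
satisfaction = np.array([5, 5, 6, 6, 7, 8, 8, 9])       # score (criterion)

# Simple correlation between the two variables
r, p = stats.pearsonr(experience, satisfaction)

# Simple predictive (regression) view: satisfaction predicted from experience
slope, intercept, r_value, p_value, std_err = stats.linregress(experience, satisfaction)

print(f"r = {r:.2f}, predicted satisfaction at 11 years = {intercept + slope * 11:.1f}")

With several predictors, the same idea extends to multiple regression, discussed in later chapters.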
Explanatory/causal designs help with the provision of proofs about the
workings of the world by accepting or rejecting the influence of variables that attempt
to express a phenomenon and its aspects. They have the great advantage of allowing
for control of the subject population (in terms of meeting selection criteria) and can
be replicated to enforce greater confidence in the research findings. Despite their
advantages, caution must be exercised in their application and in the interpretation of
the results, as not all relationships are causal. There is, for example, an almost perfect association between the rooster’s wake-up call in the morning and the Sun rising, but, as we all know, the call is not the reason the Sun rises. In social environments, the
issue of establishing causality becomes more complex due to the multitude of
extraneous and confounding influences of various variables. In such cases causality
might be inferred but never proven.
Causality is also confused at times with predictability. For example, being exposed to business aspects through the family environment (as in family businesses) might predict the entrepreneurial tendencies of an individual, but it does not cause one to become an entrepreneur. It also does not tell us what individuals do to become successful entrepreneurs. Predictions do not necessarily depend on causal relationships, nor do they prove them. Later on, when we study hypothesis testing, we will further discuss the issue of causality in the form of probabilistic thinking.
Another classification of research designs emerges with respect to the selection of the sample population and will be referred to here as the control perspective. If a random
process has been followed (we will see later on what “random” means in statistics),
then the experimental label can be assigned (also called randomized). If multiple
measures or a control group are involved, then the design can be classified as quasi-
experimental, while if none of the conditions mentioned before are satisfied then we
have a non-experimental design. While the last two categories can carry a degree of bias
in them (an issue of internal validity), randomized designs are considered ideal for
studying cause and effect relationships.
Experimental designs usually involve a test/experimental group where some
form of intervention/treatment (this is the independent variable) takes place and a
control group where no intervention is applied. Individuals are randomly assigned to
each group and their behavior/response is measured before the application of the
intervention (pre-test) and afterwards (post-test). Because a comparison of some sort
is involved, experimental designs usually involve the correlational method for
statistical analysis. This should not confuse them, though, with the correlational design
that involves only one group. The environments are controlled by the researchers in
experimental designs to ensure only the intervention is applied (to the test group) and
no other factor is influencing the behavior/responses (in all the groups). If significant differences are observed between the two groups, one can be reasonably confident that the intervention was the reason the results were different. The ability to control the
situation to such an extent allows researchers to limit alternative explanations and
establish the cause and effect relationship between the intervention (independent
variable) and the result (dependent variable). Despite the high levels of evidence
experimental designs provide, researchers need to be cautious in the conclusions they
draw as the observed results could be artificial and not generalizable to the greater
population. For example, we could show images of terrorist attacks to the test group
and then ask both test and control groups to write how they feel about defense
spending by the government. This type of intervention is clear manipulation of the
test group and one should not consider the results as an indication of the need to
increase the defense budget.
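The sketch below simulates the logic of a randomized pre-test/post-test design with hypothetical scores: participants are randomly assigned to test and control groups, an assumed intervention effect is added to the test group, and the pre-to-post change is compared between the groups. The effect size and noise levels are arbitrary assumptions for illustration.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical pre-test scores for 20 participants
participants = rng.normal(50, 10, size=20)

# Random assignment to test and control groups
indices = rng.permutation(20)
test_pre, control_pre = participants[indices[:10]], participants[indices[10:]]

# Post-test: the test group receives an intervention assumed to add about 5 points
test_post = test_pre + rng.normal(5, 2, size=10)
control_post = control_pre + rng.normal(0, 2, size=10)

# Compare the pre-to-post change between the two groups
t, p = stats.ttest_ind(test_post - test_pre, control_post - control_pre)
print(f"t = {t:.2f}, p = {p:.3f}")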
If we are unable to randomly assign participants to test and control groups and must rely on existing groups, or in cases where we are not in complete control of the environment, we have quasi-experimental research designs. Such designs are
often used in evaluation research and are also adopted when an experimental design is
impractical or impossible. For example, let us assume that we want to study gender
differences in leadership. Given that we cannot randomly assign someone to be a male
or a female (in other words, control the independent variable), we are forced to group
males in one group and females in another. Another case where experimental designs
are not applicable and require quasi-experimental designs is when the unit of analysis
is communities/groups instead of individuals. If that is the case, it will be difficult to isolate communities (especially if they are nearby), as their members can cross borders, and environmental changes can differentiate the two communities drastically (for example, one might experience a tornado) to the point of polluting/biasing the
results. In general, an issue of consideration with quasi-experimental designs is that
their findings cannot be generalized outside the population that participated in the
study (lack of external validity). Variations of quasi-experimental designs exist with
exotic names like regression-discontinuity design, proxy pretest design, etc., but their
presentation is beyond the scope of this book.
When we cannot control the level of randomization of the participants or the variables we are interested in cannot be manipulated, we have the non-experimental research design. Being unable to manipulate variables could be intentional, as when manipulation might lead to ethical and/or legal violations (for example, we cannot induce violence for the sake of studying it), or unavoidable, as when we cannot control gender (switching men into women and vice versa to study the effects of gender on, say, individuals’ behavior is not possible). Non-experimental designs can
observe associations among factors/variables that influence a phenomenon, but they
cannot establish cause and effect relationships. Despite their shortcomings, non-
experimental methods are very popular in social sciences research (actually, they
represent the majority of the published research). This is mainly due to their non-
invasive nature and the ease with which they are conducted (distributing a survey link
on the Internet is simple nowadays).
A final perspective for research designs is based on the time (or lack of it) of
investigation. If the measurements take place at a specific time and rely on existing
characteristics and differences rather than changes following interventions, then we
have a cross-sectional design. By specific time we don’t literally mean a moment in
time like a second or a minute but more like the period of data collection, usually in
the range of weeks and even a few months. In social sciences, for longer periods (for
example, years) the influences on a sample population from the rest of the
environment could effectively change perceptions, beliefs, or whatever else we might
be studying. Life events and social trends start affecting individuals, so our data might
be polluted from such events and not realistically represent what we measured. For
example, if we are interested in the influence of policies in the emergence of
entrepreneurship and we take a long time to collect our data, the final participants
might be exposed to different policy conditions (a government might have changed or
taken action) than the initial participants. In general, the time period of data collection could be “zero”, when all participants respond at the same time (like filling in a survey in a classroom); it could be spread across days and even weeks, as when filling in an online survey; or it could extend to months, as when interviews are conducted or different organizations are targeted. As a rule of thumb, for cross-sectional designs one should not go over a two- or three-month period for data collection, as environmental changes might have accumulated significant changes in the domain of study.
Because the time dimension is not considered in cross-sectional designs, their
ability to measure change is limited to only recollections of change as it is experienced
by individuals. The results of such designs tend to be static as they reflect a snapshot
of the sample at the time of data collection, and there is always the possibility that another study conducted at another time will find different results.
Additionally, if only a specific factor/variable is studied it might be difficult to locate
and recruit participants who have similar profiles in order to eliminate covariances
(trends between factors) and influences of other factors. All the above make it difficult
to establish valid cause and effect relationships in cross-sectional designs unless
repeated measures at different times and settings confirm findings. Despite the
aforementioned shortcomings, cross-sectional designs dominate the published
research because of the ease and speed with which they can be conducted and the large
numbers of subjects they can reach (ensuring in this way the representativeness of
populations).
For instances where the passage of time is significant in what we study, the
longitudinal research design is more appropriate. In this design the same sample is
followed over time and measurements are repeated at regular intervals. In this way,
researchers can track changes in factors/variables and infer patterns over time. In
addition, this design allows the establishment of the magnitude and direction of cause
and effect relationships. Typically, measurements are taken before and after the
application of an intervention, allowing researchers to make inferences about the
impact and effect of an intervention on the sample population. In a way, these designs
allow researchers to get closer to experimental designs and facilitate prediction of
future outcomes as a result of the application of identifiable factors.
Because of the involvement of time in longitudinal designs they are often
confused with time-series. In the latter case variables are measured as a function of
time, forming a series of values over time. Time now becomes a continuous variable (scale or interval, as we will see later), while in conventional longitudinal designs it acts as a separator of groups of values (ordinal or nominal). Another significant difference between the two cases is that longitudinal designs can suggest causality while time series do not. An example of a time-series is the stock price of a company over time, while an example of a conventional longitudinal design is patient health before and after an
intervention/treatment. If patient health improves after a treatment, we can deduce
with some certainty that it was the cause of the improvement.
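A minimal sketch of the before/after logic of such a longitudinal comparison, using hypothetical patient-health scores and a paired t-test (the same individuals measured at two points in time); the scores are assumed values for illustration only.

import numpy as np
from scipy import stats

# Hypothetical health scores for the same patients before and after a treatment
before = np.array([62, 58, 70, 65, 60, 68, 55, 63])
after = np.array([68, 61, 75, 70, 66, 72, 59, 69])

# Paired t-test: each "after" value is compared with its own "before" value
t, p = stats.ttest_rel(after, before)
print(f"mean improvement = {np.mean(after - before):.1f}, t = {t:.2f}, p = {p:.3f}")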
Some critical issues that might arise in longitudinal designs include among
others the possibility of changes in the data collection process over time, the
preservation of the sample constitution and integrity, and the isolation of the variables
under investigation from environmental influences. Also, these designs affect our
ability to study multiple variables at the same time. An assumption that we make in
longitudinal designs, that can be challenged, is that the initial environmental trends (at
the beginning of data collection) will persist during the observation period. Despite these challenges, and provided we can afford the luxury of time and a large and representative sample, this design is ideal for studying change over long periods of time (like the lifetime of individuals from childhood to adulthood).
A final design that we will see here is meta-analysis. Its time perspective is
exclusively focused on the past. This design concerns the systematic review of past
research about the phenomenon we are studying and the statistical analysis of the
findings of these past studies. In this way, the sample space increases dramatically and
allows researchers to study the effects of interest from both quantitative and qualitative
studies. While the availability of data might be sometimes overwhelming, the idea is
not simply to summarize findings but instead to create new knowledge from
combinations of variables that have not been analyzed before using synoptic reasoning
and statistical techniques. In cases where not enough past research exists or there are
strong dissimilarities in the results (heterogeneity), this type of design should be
avoided. In addition, attention should be paid to the criteria used for selecting the past
research studies, the objectives set, the precise definitions of the factors/variables and
outcomes that are used, and the justification of the statistical techniques used in the
analysis.
Meta-analyses oftentimes require content analysis (like with qualitative results).
In such cases, small deviations in the criteria used can lead to misinterpretations that
will affect the validity of the findings. That effect can also be aggravated by large samples that might not necessarily be valid. In conclusion, the quality of the past
research that is included in the analysis is critical regarding the credibility of the meta-
analysis results. Despite its shortcomings and the time and effort investment required
for meta-analysis, the findings can be valuable as they can validate results from multiple
studies and point to future directions of research. This type of analysis is also useful in
future studies as it can provide the basis upon which scenarios can be built.
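As a rough illustration of the statistical side of a meta-analysis, the sketch below pools hypothetical effect sizes from past studies using a fixed-effect, inverse-variance weighting; the effect sizes and variances are invented, and a real meta-analysis would also examine heterogeneity and study quality before pooling.

import numpy as np

# Hypothetical effect sizes (e.g., standardized mean differences) and their variances
effects = np.array([0.30, 0.45, 0.25, 0.50, 0.35])
variances = np.array([0.02, 0.04, 0.03, 0.05, 0.02])

# Fixed-effect model: weight each study by the inverse of its variance
weights = 1.0 / variances
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled effect = {pooled:.2f} (SE = {pooled_se:.2f})")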
2.4.2 Sampling
Having identified the research design perspectives that are appropriate for a
research study, the next step involves identifying the sample, the variables/factors that
need to be measured, the instrument that we will use to collect data, and the way they
will be processed and analyzed to accomplish the study’s goals (Figure 2.4). Sample in
this respect is the segment of the population that participates in the research. In this
section, we will discuss how we identify and organize our sampling process, while in
Chapter 4 we will discuss how we can process the data we collect from our samples.
Guided by the purpose, research questions, hypotheses, and the various
perspectives of our research design, we can profile our target population/entities (see
Chapter 3) and devise a sampling strategy to select the individuals/entities that qualify
for participation in our study. At this stage one could go ahead and discuss what will
be measured, but knowledge of the sample characteristics could help identify details
like demographics that might provide valuable information when considered for data
collection with the instrument we will use (discussed in 2.4.4).
Important characteristics of samples include: heterogeneity (representativeness
in other words) to include participants proportional to the population, maximum
variation to ensure extremes are represented, and randomness to eliminate selection
bias and ensure equal chances for selecting participants. These characteristics are to a
great extent interrelated and express different views of the same property, which is for
the sample to be an accurate miniature reflection of the population it represents
(Figure 2.7). A final characteristic that is often ignored in the initial stages of research
design is sample accessibility. Researchers might have figured out the perfect way to
select individuals only to find out that they are not allowed to access them (like when
children are involved) or they are not available (too busy to participate).
In practice, stratification for more than one variable might be required, like
considering gender and education along with social class. This adds complexity in
ensuring representativeness, but as long as the initial pool of candidates and the
selection of particular individuals is not controlled by the researcher the sample could
be random and representative enough. An issue that arises in many situations is the
geographic distribution of populations that tend to be spread across multiple
geographic regions and areas. For example, it might be that we are interested in racial
bias across a region. With random sampling we might end up covering huge distances
according to what has been selected for participation. Instead, we might identify a few
locations (close by) and consider them as representative of the population. This form
of sampling is called cluster sampling and as long as the cluster we select includes a
balanced mix of the demographic characteristics of the study’s population it can also
be, to an extent, considered representative.
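The sketch below illustrates, with a hypothetical sampling frame, the difference between simple random sampling and proportional stratified sampling; cluster sampling would instead randomly select a few of the location groups and survey everyone (or a random subset) within them. The frame, strata, and sample size are all assumptions for demonstration.

import random

random.seed(7)

# Hypothetical sampling frame: (person id, gender, location)
frame = [(i, random.choice(["F", "M"]), random.choice(["north", "south", "east"]))
         for i in range(1000)]

# Simple random sample of 50 participants
simple_sample = random.sample(frame, 50)

# Proportional stratified sample by gender: sample from each stratum separately
strata = {"F": [p for p in frame if p[1] == "F"], "M": [p for p in frame if p[1] == "M"]}
stratified_sample = []
for gender, members in strata.items():
    n = round(50 * len(members) / len(frame))   # proportional allocation
    stratified_sample += random.sample(members, n)

print(len(simple_sample), len(stratified_sample))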
Apart from the probability/random categories of sampling that we have seen
up to now, we have the general category of non-probability(-stic) sampling where
randomness is not ensured in any way. This does not necessarily mean that the samples
are not representative, it is simply an indication that the selection process is not based
on any part of probability theory. This fact will pose a limitation to representativeness
that researchers need to acknowledge as potential bias or limitation to their research.
Despite its shortcomings, non-probability sampling is quite popular in social sciences
because of its practicality and the low demand in terms of resources, time, and effort
that it imposes on the researchers.
Two general categories of non-probability sampling that are oftentimes
employed, especially in social sciences research, include convenience and purposive
sampling. Convenience sampling (also called accidental or haphazard) is a popular
form of sampling (often exercised when TV reporters interview people on the street)
and involves whatever population is easily available to the researcher. Asking for
volunteers by posting and promoting a call for participation is a form of convenience
sampling. The obvious challenge with this type of sampling is representativeness.
While the assumption is made that only qualified individuals will come forward, there
is no way to know if certain attributes of the population important to our research will
accurately be reflected in the sample. Large samples and/or a good screening process
might alleviate some of the deficiencies of convenience sampling.
Another popular form of non-probabilistic sampling is purposive sampling.
This applies to cases where we are targeting individuals based on their demographic
characteristics in our vicinity (convenience sampling implied). In essence, it is like
convenience sampling with a very aggressive screening process. This is a very efficient
and fast process of reaching a desired sample size and as long as stratification
(proportional representation) is not an issue, it will efficiently compile a sample. A
observed with terms like attribute and category. If something does not vary, we term
it invariant or constant. When a constant plays the role of the multiplier of a variable, it can also be referred to as a coefficient. Rates of growth are oftentimes expressed as the product of a coefficient and a variable.
Before attempting to clarify the field, it is worth mentioning here that
regardless of the methodology we adopt most of the time some form of comparison
or reference to numbers will have to be made. Even in qualitative research one needs
to eventually present in some form of frequency the persistence or lack of something
among the themes that might emerge (like in interviews). Statements like “the
majority/some/one/none/etc. of the participants stated …” have a
quantitative/numeric connotation simply because this is the only way to logically make
comparisons and draw inferences. While in quantitative research variables represent
numerically expressed characteristics, in qualitative research we might call them instead
categories or themes among others and they can also be quantified in terms of their
frequency of appearance.
In quantitative research, variables come in a variety of different names
depending on their subject (focus), what they express (type), and their position
(function) during their inference process (Figure 2.8). When our focus is on the
population, the variables tend to be referred to as parameters, while when we focus on
the sample they are called statistics. Parameters in science represent measurable
factors that are needed to define/describe a system, an operation, or a situation. In
that sense, they tend to be constant with respect to the time and place of the
observation of the system under investigation. For example, in social sciences
parameters might be used to express the average age of a population, the distribution
of income, etc. at the time of the study. Unless a census is performed, parameters tend
to be inferred from samples of the population.
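A small sketch of the parameter/statistic distinction under assumed data: the mean age of the full (hypothetical) population is a parameter, while the mean of a random sample drawn from it is a statistic used to estimate that parameter.

import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population of 10,000 ages
population_ages = rng.integers(18, 90, size=10_000)
mu = population_ages.mean()            # parameter (population mean)

# A random sample of 100 individuals
sample_ages = rng.choice(population_ages, size=100, replace=False)
x_bar = sample_ages.mean()             # statistic (sample mean) estimating mu

print(f"parameter mu = {mu:.1f}, statistic x-bar = {x_bar:.1f}")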
within a year, the days within a week, etc.) and in that sense entities referring to time
tend to be considered ordinal. The Likert scale that we will see in the next section is
frequently used in social sciences to capture perspectives in an ordering form with
options like ‘strongly disagree’, ‘disagree’, ‘neither agree nor disagree’, ‘agree’, and ‘strongly agree’. For statistical purposes this type of ordering can sometimes be considered equivalent to an interval scale of 1, 2, 3, 4, 5 or 0, 1, 2, 3, 4, depending on where one wishes to place the zero.
When our variables are assigned to represent entities that cannot be formally
expressed numerically, we call them a variety of names like nominal, categorical and,
occasionally, factors. Entities represented as such include gender, occupation,
opinions, perceptions, beliefs, etc. In the special category of variables like gender,
when only two options are available (male and female), the variables are also called
dichotomous. If an inherent ordering is implied (like small and big) then one can even
classify such variables as ordinal. Nominal variables are inherently difficult to process,
and their analysis might be limited (as we will see in the next chapters) in frequencies
of appearance for each category and potential relationships/dependencies among
them. In many cases, nominal variables are used to represent categories/groupings of
other types of variables (scales and ordinal) and as such they are viewed as factors that
act to categorize them. Any assignment of numerical values is done purely for labeling purposes and does not reflect in any way a relationship among the different values of the measure.
A final classification of variables is seen in terms of the function or role they
play in cause and effect relationships that express the phenomenon (or parts of it)
under investigation (Figure 2.9). On the “cause” side of a relationship we have the
variables that trigger the phenomenon and come with names like independent,
predictor (in correlational terminology), and the more abstract x (typically denoting
the unknown in mathematical functions). When an independent variable is “latent” or unmeasured, or when it refers to a category, we can also call it a factor. These types
of variables might affect phenomena that are not easily expressed through one variable
and are more appropriately expressed through a construct that is represented by a
combination of variables. For example, a construct like leadership or entrepreneurship
might be difficult to express as a single variable and might require a multitude of
attributes/factors for its definition. Factors, in such a case, tend to represent variables.
On the “effect” side we have variables that are affected/change as a result of
“cause” variables and are called dependent, outcome (in correlational terminology),
criterion (used in non-experimental situations), or y (along with any other letter used
in expressing variables). In between (or outside, one might say) cause and effect we
have variables generally called extraneous. When such variables do not influence the existence of a relationship but influence its strength (and possibly its direction), they are called moderators, and when they account for the existence of the relationship they are called mediators. An additional type of variable naming that is used for the in-between situations is control variables. These are variables that need to be kept constant during the evolution of the phenomenon we study to eliminate interferences that might otherwise cause unpredictable or false outcomes. For example, if we study how women experience gender discrimination (gender being a categorical variable), we need to control for gender, as it is unlikely that men can speak to the experience of women.
An example can help clarify some of the various terms discussed previously.
Let us assume that we are researching the emergence of entrepreneurship and we want
to prove there is a relationship between parents’ entrepreneurial attitudes and their
descendants’ engagement in entrepreneurship. We are, in essence, trying to prove that
the parents’ entrepreneurial attitudes (cause) lead their children to become
entrepreneurs (effect). In this case, the independent/predictor/x variable is ‘parents’ entrepreneurial attitudes’ while the dependent/outcome/criterion/y variable is the
engagement in entrepreneurship of their children. While the aforementioned two
variables are the focus of our study there may be many more that could be of influence
and need to be considered. A moderator variable that could influence the relationship
we study could be the economic environment where parents and children live. If the
society experiences an economic recession, it might suppress or reinforce the intention
of someone to become an entrepreneur and even eliminate any possibility of
entrepreneurial success. A mediator variable in our example could be the business
network that the entrepreneurial parents have developed and made available to their
child. This in many cases can be critical to the success of a child as an entrepreneur.
Potential other influences to the relationship we study could come from other
confounding/intervening variables like the education level that we didn’t consider
and personality traits like the need for autonomy and control of the potential
entrepreneur. During our research, we might also want to exclude migrant
entrepreneurs and only consider the native population for our study or focus on a
specific racial profile. In this case, we have the origin or race of the entrepreneur as a
control variable. In quantitative terms, we might say that we control/factor for race.
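To show how a moderator can enter the analysis, the sketch below uses hypothetical, simulated data and the statsmodels formula interface to regress children’s entrepreneurial engagement on parents’ entrepreneurial attitudes, the economic environment, and their interaction; a significant interaction term would indicate moderation. All variable names and coefficients are illustrative assumptions and do not come from the example above.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200

# Hypothetical standardized scores for parents' attitudes and the economic environment
df = pd.DataFrame({
    "parent_attitude": rng.normal(0, 1, n),
    "economy": rng.normal(0, 1, n),
})
# Simulated outcome where the effect of attitude depends on the economy (moderation)
df["engagement"] = (0.5 * df["parent_attitude"] + 0.2 * df["economy"]
                    + 0.3 * df["parent_attitude"] * df["economy"]
                    + rng.normal(0, 1, n))

# The x * m term expands to the main effects plus their interaction (x + m + x:m)
model = smf.ols("engagement ~ parent_attitude * economy", data=df).fit()
print(model.params)
print(model.pvalues)

A mediator, by contrast, would be tested by checking whether the effect of the parents' attitudes on engagement is carried through the mediating variable (for example, access to the parents' business network).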
Because the term factor is interchangeably used sometimes for variables, it is
worth mentioning here that factors are more generic and abstract forms of variables
(unlike weight, height, grades, money, etc.) that usually refer to entities that influence
or categorize a result. In research, such variables are seen as constituents of something
and their lack might indicate lack of the quality/property we are trying to study (like
entrepreneurship or leadership). In natural sciences, a variable like temperature might
be considered a factor (in the role of a moderator here) that accelerates a chemical
reaction, while in social sciences education might be a factor (for example, in the role
of an independent variable and as proxy for foresight) that is necessary for leadership
success. Another view of factors in social science is that they tend to create categories
that are nothing more than groupings according to shared characteristics. For example,
gender might be such a factor that is also expressed as a categorical variable.
Other categorizations of variables exist and appear in quantitative research, but
they tend to be specific to statistical practices and should be considered case by case.
As an example, we can mention here a classification in between-subjects and within-
subjects for independent variables. The former case refers to variables or factors in
which a different group of subjects is used for each level of the variable. For example,
if two different interventions/treatments are considered and three groups of
patients/clients are identified (one for each treatment and a control group that will
receive no treatment), then treatment is a between-subjects variable. This occasionally
will be reflected in the name of the research design that can appear as a between-
subjects design. If only one group is involved and we test the various treatments in
sequence (without telling the participants which treatment or no treatment is applied)
and we measure the participants’ responses after each intervention, then the treatment
is considered a within-subjects variable and one can even name the research design a
within-subjects design.
Although we discussed variables and their types in this section, we shouldn’t
forget that in quantitative research variables need to have quantitative characteristics
and should be measurable as numbers. This brings us to the important issue of
operationalization of variables. By this we mean to accurately and clearly detail what
they mean, and what units express their values/attributes. For simple variables like
weight, height, and age this process might be relatively easy and could involve simply
the adoption of popular units like kg or pounds for weight, cm or feet for height, and
integers for age. Setting appropriate levels of measurement (the domain and range
of values variables can take) is also necessary to ensure valid entries are considered.
For example, if we are talking about adult individuals and depending on the type of
research we are doing, one might consider 50–120 kg for weight, 60–210 cm for
height and 18–90 for age as legitimate values. Other variables, like income, might
require a bracketing into categories of say below $20,000, between $20,000 and
$50,000, and above $50,000. Similarly, years in position could be below 2, between 2
and 5, between 5 and 10, and over 10. Certain other variables that are not so easy to
measure (for example, ordinal and nominal), like perceptions and beliefs, might require
a mapping between qualitative terms like small, medium, large to integers like 1, 2, and
3 correspondingly. The Likert scale we mentioned before and will see again in the next
section is an example of such variable mapping. Overall, we are trying at this stage to
ensure that our variables and their descriptions are appropriate for the purpose and
research questions of our research and also reflect, in terms of terminology, the
appropriate data analysis methods we will use (like calling them predictor and criterion
or independent and dependent for the purposes of regression).
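A short sketch of these operationalization choices under hypothetical data: bracketing a continuous variable (age) into categories with pandas and mapping Likert labels onto integers. The brackets and coding mirror the examples in the text and are not the only valid choices.

import pandas as pd

responses = pd.DataFrame({
    "age": [24, 37, 52, 29, 41],
    "agreement": ["agree", "strongly agree", "neither agree nor disagree",
                  "disagree", "agree"],
})

# Bracket age into the categories used in the text
responses["age_group"] = pd.cut(responses["age"], bins=[0, 30, 40, 120],
                                labels=["below 30", "between 30 and 40", "above 40"])

# Map the Likert labels onto a 1-5 integer scale
likert = {"strongly disagree": 1, "disagree": 2, "neither agree nor disagree": 3,
          "agree": 4, "strongly agree": 5}
responses["agreement_score"] = responses["agreement"].map(likert)

print(responses)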
2.4.4 Instruments
With knowledge of the types of measurements that we need to perform we
come to the issue of the instrument that will be used for collecting the information we
want. At this stage, we are either going to use or modify an existing instrument that
has been used before in similar situations or we are going to develop a new instrument
from scratch. The former case is practically a lot easier than the latter, provided the
existing instrument is reliable (provides similar readings for similar measurements) and
valid enough (measures what it is supposed to measure and nothing else). A proper
literature review will have identified by now the existence and suitability of such an
instrument and all we will have to do is either adapt the wording to our situation
and/or maybe slightly modify the instrument. By this we don’t mean a drastic
restructuring of the instrument but rather a minor alteration that will not in any way
impact its reliability and validity. In the case where our instrument is a questionnaire,
altering the questions to update the context it addresses might be allowed as long as
the constructs it measures remain the same. For example, a questionnaire that has been
developed to measure certain leadership traits of managers in the transportation
industry might be used for managers in the energy industry with the only alteration
being the change of the word “transportation” to “energy”. Even adding or removing
one or two questions might be acceptable. A rule of thumb might be that altering more
than 10% of the instrument might require re-evaluation of its reliability and validity.
These alterations do not include demographics where one can always ask for more
details from the participants without impacting what the instrument measures. In
general, before using or modifying published instruments one needs to make sure their
respective publications adequately describe the constructs and variables they measure,
the coding schemes they use, as well as their performance characteristics (reliability
and validity).
For the purposes of this book we will discuss the process of developing an
instrument. We first need to focus on and understand the sources of data that we need
for our research. If archived data is our primary source, then a process like the
literature review will need to be followed to identify where they are located and how
they can be accessed (by requesting permission or through open access if they are
publicly available). In all other situations, a sample will have to be compiled and an
instrument will be developed for retrieving the appropriate information from the
research participants. According to our research questions and what we want to
measure, many possibilities are in general available. These are usually under the general
term survey (often mixed with the term questionnaire), usually referring to the general
method of data collection (even a literature review is a survey). One type of instrument
that is usually seen in natural sciences research is an apparatus (or a few of them) that
will measure the physical characteristics of the phenomenon we study or the
physiological characteristics of the individuals in our sample (when living organisms
are involved). As our focus is on social sciences research the development of
apparatuses as instruments will not be explored here.
In social sciences, the most popular methods for collecting information (in
addition to using archival data) are questionnaires, interviews, and observations. In
questionnaires, we have, as one would expect, a collection of questions and requests
for information (like demographics) that we ask the participants to answer and provide
respectively. These questions are developed per the research needs as stated by the
research questions and hypothesis we adopted. While Appendix A provides a more
detailed description of the process of developing a questionnaire, we will present here
(Figure 2.10) the basic steps and issues that need to be addressed when developing a
questionnaire and the way we need to organize and structure it.
In general, our intention with our questions at this stage should be to either
validate the individual’s suitability for participating in the study or collect subject
matter information that will help answer our research questions. For practical
purposes, we can split the questions into demographics, validation, and subject matter
categories (Figure 2.10). Demographic questions (like age, gender, occupation, etc.) are used to collect information about participant characteristics that could be necessary to ensure they meet the demographic profile of our study’s population and for use in our data analysis when answering the research questions. Validation questions are more
specific questions aiming at establishing the participants’ levels of understanding and
suitability to answer questions about the phenomenon we investigate. They are often
used for pre-screening purposes, but in the case of widely distributed questionnaires
where the researcher has no control over the participants they are of critical
importance in ensuring instrument validity.
The last and most important category of questions concerns the subject matter
questions. A primary issue when developing a questionnaire is the appropriateness of
its questions in collecting the information needed to answer the research questions and
the hypotheses. A typical mistake in developing surveys is to include questions that
somehow relate to the research subject but do not directly relate to the established
research questions. Unless these questions serve some other function (warm-up and
validation purpose), they must be excluded. Each question must be grounded on
literature that raised the importance of the subject (construct or variable) the question
refers to in relation to our population and one of our research questions. This ensures
what is termed construct validity and it is usually addressed by considering the theory
underlying a construct and by evaluating the adequacy of the questionnaire in
measuring the construct. Given that a construct could often include various items and
variables, this term is often used as an overarching one that can be broken down into content, criterion, convergent, and divergent validity, among others. Oftentimes these items are seen as separate from construct validity, but the reality is that they all refer to the instrument and its accuracy. For example, convergent validity is used to ensure
considered ready for a full-scale release. Even at that stage, it is not unusual to go back
to the drawing board if evidence suggests that further calibration of the instrument is
required. If future results confirm the findings of our study, we can say that we have
ensured predictive validity.
We should mention here that pilot studies are not only necessary for testing
and validating a measurement instrument. They are also of primary importance for evaluating the feasibility of our recruitment strategy for the sample participants, the
randomization process (if one has been adopted), and the retention rates in terms of
participation and completion of the intervention we established during the data
recording process. Despite the advantages of pilot studies, we also need to be aware
of their limitations. Pilot studies cannot evaluate safety, effectiveness, and efficacy
issues and cannot provide meaningful effect size estimations (see Chapter 5 for more
on this). In addition, inferences about the hypotheses cannot be made, and neither can the studies be used for feasibility estimates apart from generalizations about the inclusion and exclusion criteria of questions in the pilot study. It is suggested that researchers
form clear and realizable objectives regarding the purpose of their pilot studies and the
evaluation of their outcomes.
While questionnaires dominate quantitative research (especially in social
sciences), oftentimes interviews and observations might be enlisted to collect
quantitative data. These solutions are considered qualitative data collection techniques
so we will just mention their basic characteristics here. The interested reader can find
ample material on the Internet (including this book’s website). Starting with interviews,
the process we follow is to a great extent similar to questionnaires, with the only
difference that the researchers or their designees ask the questions and record the
answers. An interview protocol/guide plays the role of the instrument, together with the interviewer, who is also part of the instrument. While the development of the interview protocol follows the same guidelines and process as questionnaires, the human element of the interviewer needs special consideration. Personal biases need to be set aside when conducting and transcribing an interview. In addition, the face-to-face interaction with the research participants needs to be as humane and, at the same time, as objective as possible. In cases where face-to-face interviews are not possible, telephone and online interviews are options, and even asynchronous forms like email could be considered. Pilot interviews also need to be conducted to ensure that the interviewers and the protocol perform as expected, with the protocol modified accordingly if needed.
Apart from interviews, another qualitative form of data collection popular in
social sciences is observations. The instrument in this case is none other than the
researcher who conducts the observations. The researcher could use in this case a
check-list of what needs to be observed or record information as it happens in front
of them. Observations are mainly considered a qualitative technique and, like the
interviews, when applied for quantitative data collection a preprocessing will have to
be performed on the collected information to retrieve the quantitative data that will be
used in the analysis. As with the interviews, the researcher needs to be as objective as
possible to eliminate personal biases that might influence the collected information.
A final topic relating to instrumentation that we will mention here is when
multiple instruments are used for data collection. This is called triangulation and it is
a typical strategy for ensuring the validity of the information we collect. Possible
combinations of data collection methods might include any three of the following:
questionnaires, interviews, observations, archival data. The last could be in the form
of past research results, census data, and government, organizational, company,
and/or media reports, among others. Usually, in triangulation, one of the data
collection methods (like the questionnaire) acts as the primary source of data, while
the others act as supporting sources that could confirm or reject the findings of the
primary source. Attention should be paid here that, for validity purposes, all sources need to refer to the same data and not merely complement each other, since complementary sources provide distinctly different data that would also be considered primary data (requiring their own validation). Further discussion on instrument validity has been
deferred to section 2.4.6 Evaluation of Findings.
Another requirement in academic research that we need to mention at this
stage is the need to provide ample information to participants (Figure 2.10) about the
research objectives, what their participation involves, and assurances about their
confidentiality and privacy. Even in cases where the purpose or parts of the research
cannot be revealed (probably because they could influence participant
behavior/responses), participants need to know that all aspects of safety have been
considered and their personal identifications will be protected from public exposure
without their consent. To ensure all proper ethical and safety issues have been
considered, researchers usually go through some sort of review process conducted by
an independent body (institutional review board – IRB) at their parent organization
and/or at the data collection site. Such boards consider any ethical and safety concerns
that might be raised and the measures that researchers take to comply with
international and national research guidelines before approving an instrument and the
entire research for accessing participants (including humans, animals, and any other
source of data).
Researchers are required to inform their research subjects ahead of their
participation in the research and ensure they understand everything involved in the
research process, harms or benefits, and how privacy and confidentiality are going to
be ensured. A typical solution for the last point is to use randomly assigned aliases or to avoid recording any personally identifiable information like names, addresses, etc.
Additional information might be needed, depending on the research, that identifies
inclusion and exclusion criteria for participants. This will allow the researchers to
decide on a subject’s suitability for participation, as well as their rights to withdraw at
any time they wish to do so without any impact on them. Special cases of vulnerable
populations like children, disabled, criminals, etc. must be considered per the
guidelines and requirements set by the IRB of the researchers’ parent organizations
and the source sites.
observations and archival data), the researcher will have to make arrangements with
the site that will provide the data as to how they will be collected.
Having retrieved our data from our various sources, we might need to screen them for ineligible and/or erroneous entries, missing entries where one was required, inconsistencies, impossible data combinations, out-of-range values, etc. With questionnaires this is relatively easy, given that they usually involve selections among existing options (per question) or single entries (for example, age) that can easily be checked against what participants were expected to provide. At this stage, we might also decide whether additional classifications/conversions are required. This might include converting values to the same units (for example, expressing all currency amounts in euros or in dollars) or categorizing entries into groups. For age, for instance, we might decide to replace the actual age with categories like 'below 30', 'between 30 and 40', and 'above 40'. Additional transformations might also be required before the analysis phase, such as converting monthly salary to yearly income, normalizing variables, applying formulas, etc.
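To make this screening and recoding step concrete, here is a minimal sketch in Python using the pandas library (an illustrative choice; the book itself relies on spreadsheets and SPSS). The column names, cut-off rules, and values are hypothetical.

import pandas as pd

# Hypothetical raw entries; in practice these would come from the collected instrument.
df = pd.DataFrame({
    "age": [25, 34, 41, 17, None, 29],
    "salary_monthly": [2500, 3100, -50, 4000, 2800, 3600],
})

# Screening: flag missing entries and out-of-range values (adults only, non-negative pay).
problems = df[df["age"].isna() | (df["age"] < 18) | (df["salary_monthly"] < 0)]

# Recoding: group age into categories and convert monthly salary to yearly income.
df["age_group"] = pd.cut(df["age"], bins=[0, 29, 40, 200],
                         labels=["below 30", "between 30 and 40", "above 40"])
df["salary_yearly"] = df["salary_monthly"] * 12

print(problems)
print(df)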
In cases when our data are in narrative form (as with interviews and open-ended questions), retrieving the quantitative information might require more effort, especially when the information is not expressed numerically. For example, themes might need to be identified so that their frequency of appearance can be used as their quantitative representation. The co-appearance or absence of themes might also be of value to our analysis, so records of such occurrences might be kept for further analysis, as in the sketch below.
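As a minimal sketch of this kind of quantification, the following Python snippet counts theme frequencies and theme co-appearances across a handful of invented interview summaries; the themes themselves are hypothetical and serve only as an illustration.

from collections import Counter
from itertools import combinations

# Each set holds the themes identified in one (hypothetical) interview.
interviews = [
    {"trust", "workload", "autonomy"},
    {"trust", "workload"},
    {"autonomy"},
    {"trust", "autonomy"},
]

theme_counts = Counter(theme for themes in interviews for theme in themes)
pair_counts = Counter(pair for themes in interviews
                      for pair in combinations(sorted(themes), 2))

print(theme_counts)   # how often each theme appears across interviews
print(pair_counts)    # how often two themes co-appear in the same interview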
Processing and analyzing the data comes after their collection and
“cleaning”/preparation for the analysis techniques we will apply. At this stage, we need
to have a clear analysis strategy for testing our hypothesis (when necessary) and
answering our research questions. The appropriateness of the statistical tests that we
will use needs to be discussed and justified in light of our research design (like making
sure variables/constructs meet the necessary assumptions of each statistical test). A
detailed presentation of the various methods and techniques available for data analysis
as well as proper reporting of the results can be found in the remaining chapters of
this book.
research design, data collection, and analysis and in light of the theoretical framework
of our study.
3 Populations
3.1 Profiling
The process of specifying the sets of characteristics that describe and uniquely define a population is called profiling. It involves establishing the extent to which a characteristic (for example, university education) is shared among the population members; this is usually expressed as a frequency of occurrence for small populations (oftentimes below 30 units/members) or as a percentage of the total population for larger populations (it makes no sense to state percentages for,
say, a population of 5 units). For example, in the case of the educational levels of
individuals who compose the population of a city, instead of the actual counts, we
might say that 12% had university-level education, 18% had technical-level education,
45% had finished high school, 16% elementary school, and 9% had no education at
all (Table 3.1). For an easier comparison of such results this information is usually
depicted in the form of bar and pie charts (Figure 3.1).
in section 2.4.3, characteristics such as age, income, scores, etc. While these can also be grouped in categories, most of the time they are treated as continuous variables that reveal more details about the population. Some individuals in our population might have distinct values different from anyone else's (outliers), rendering the representation of the population profile with charts (like the bar and pie chart of Figure 3.1) meaningless. A more appropriate form of graphical representation in such cases is through a distribution, as we will see in section 3.3. For now, we will focus on ways we can convey the profile of observations by abstracting them into representative values like the mean, median, mode, etc.
Keeping in mind that all population variables are represented (by convention)
with Greek letters, we have:
μ = (x1 + x2 + … + xN) / N     (3.1)
or in its more compact form:
μ = (1/N)·Σ xi     (3.2)
By using the mean, we presume that our population, with respect to the characteristic we measure, is equivalent to a population where all individuals have the value μ. We can imagine how misleading this can be, as oftentimes μ takes values that don't even exist in the population it comes from. Consider for example the set of numbers 32, 36, 40, 26, 30, 28, 70, 34, which we will assume represent the distances of municipalities (choose whatever unit of distance you want) from a city center across a highway (Figure 3.3). For the purposes of this example let us also assume that all municipalities have the same number of residents. We want to identify the location of a firehouse that will serve all eight municipalities. By calculating the mean (in other words, treating all the municipalities as if they were in one location), we come up with μ = 37. Oddly enough, this number is not included in the population/municipalities we study, so we have an entity (the mean) representing the population that does not belong to the population.
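For readers who want to verify the arithmetic programmatically, a two-line check in Python (an illustrative alternative to the spreadsheet calculations suggested in this book) reproduces the mean of the example.

from statistics import mean

distances = [32, 36, 40, 26, 30, 28, 70, 34]   # municipality distances from the city center
print(mean(distances))                          # 37, a value no municipality actually has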
A final parameter that is occasionally used (mainly when the values are discrete integers) to describe populations is the mode (symbolized Mo). This is nothing
else than the most frequent/popular value within the population. In a frequency
distribution graph, the mode is the point at the top of the curve and in a bar graph the
frequency of the highest bar (green line in Figure 3.5). For example, in a 9th grade class
one would expect to find most students to be of age 14. Another example is the most
frequent item in a supermarket order (like water). In practice, mode is oftentimes
reported in census data as the most frequent value of a population. For example, when
census data report that a country has a young population, they directly refer to the
mode of the population age.
that we get a better idea of the shape of the curve. With only one parameter, like the
population mean (Figure 3.6.a), it is difficult to say anything about the shape of the
curve. With the 5-number summary we see that the situation improves (Figure 3.6.b)
as we get a sense of how the curve might be, while when we consider all the parameters
we mentioned up to now (Figure 3.6.c) a more precise image begins to appear.
left and right of the mean) will include any value between 28 and 46. As Figure 3.8
depicts, this includes the municipalities at 32, 36, 40, 30, 28, and 34.
σ² = (1/N)·Σ (xi − μ)²     (3.3)
Variance should not be confused with variation as the latter is an assessment
of an observed condition that deviates from its expected or theoretical value while the
former is a quantifiable deviation away from a known baseline or expected value (the
mean in most cases). Squaring the differences from the mean, like we did here, has the effect of converting each difference to a positive value regardless of its sign. In our case, it also magnifies the differences from the mean, as the last column of Table 3.2 shows, resulting in a value of σ² = 173 (in squared units).
Given that the squared units the variance carries might not mean much in terms of our initial units, one can reverse the influence of the squaring by calculating the square root of the variance. This quantity is called the standard deviation (SD) and is symbolized with the Greek letter σ (remember, all population parameters are symbolized by convention with Greek letters):
σ = √σ² = √[ (1/N)·Σ (xi − μ)² ]     (3.4)
Taking the square root of the variance in our example (173) we get σ = 13 (13.15, to be exact). This value is higher than MAD and, as Figure 3.8 shows, it includes more observations. One could say that σ is more sensitive to outliers and so it is more informative than MAD. While this is true, one can see it as an advantage or a disadvantage depending on the situation. For example, if we don't want to be influenced by outliers, using σ could be a disadvantage. In general, though, the variance (μ and σ) approach is easier to handle in terms of the math involved and as such it is preferred in profiling populations and samples, as we will see later. The mean and standard deviation are reported together in the form μ = 37, σ = 13 or M = 37, SD = 13.
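The variance and standard deviation of the same example (formulas 3.3 and 3.4) can be checked with Python's statistics module, whose population versions divide by N.

from statistics import pvariance, pstdev

distances = [32, 36, 40, 26, 30, 28, 70, 34]
print(pvariance(distances, mu=37))   # 173, matching formula 3.3 (division by N)
print(pstdev(distances, mu=37))      # about 13.15, reported as sigma = 13 in the text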
An interesting observation can be made with respect to both the variance and the standard deviation when their values become very small or very large. The closer the values approach zero, the more they suggest a constant instead of a variable, while the higher the values become, the more erratic the variable will appear, probably suggesting that it doesn't relate at all to what we are studying.
If we were to indicate now the spread of the population around the mean that
a standard deviation distance includes in a profile curve, we will get the shaded areas
of Figure 3.9 (for the example of the basketball players (a) and the municipalities of
the firehouse (b)). The reason these two distributions were placed side by side was to
showcase the shapes a profile curve can take. The way the curve leans provides
valuable information about the tendencies in a population (like where the majority and
outliers are with respect to the mean) and as such the parameter skewness was derived
to express such tendencies. It is a measure of the asymmetry of the curve with respect
to the mean and is measured by a variety of formulas (like Pearson’s moment
coefficient of skewness). For practical purposes, one such formula is Skewness = 3*(μ
- η)/σ. A rule of thumb for considering significant deviations from a symmetric profile
is when it is more than twice the standard error (discussed later). For now, and based
on Figure 3.9, we can say that our basketball player population is negatively skewed
(Skewness < 0) with respect to height (Figure 3.9.a) and our municipality profiles are
positively skewed (Skewness > 0) with respect to distance (Figure 3.9.b).
A final parameter that will be mentioned here and is used to express the profile
curve of populations is kurtosis. It is a measure of the flatness of the tails of a
distribution curve (or as sometimes presented, a measure of the pointiness of the
curve). Positive values indicate leptokurtic/pointier curves, while negative values
indicate platykurtic/flatter curves (Figure 3.10). Zero kurtosis value is an indication of
a mesokurtic or normal curve. It is worth mentioning here that skewness and kurtosis
are not mutually exclusive. A curve that displays skewness can also show kurtosis
(although it would be evenly distributed across its tails).
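As a rough numerical check of these two shape parameters, the sketch below applies the rule-of-thumb skewness formula mentioned above to the firehouse distances and, assuming the scipy library is available, also computes the moment-based skewness and the excess kurtosis.

import numpy as np
from scipy.stats import skew, kurtosis

x = np.array([32, 36, 40, 26, 30, 28, 70, 34])

# Rule-of-thumb (Pearson) skewness from the text: 3 * (mean - median) / sigma
print(3 * (x.mean() - np.median(x)) / x.std())   # positive: the long tail is to the right

print(skew(x))        # moment-based skewness, also positive because of the 70 outlier
print(kurtosis(x))    # excess kurtosis; a value of 0 corresponds to the normal curve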
In closing this profiling section, it is worth indicating that all the parameters
we considered stemmed from our inability to express the profile in terms of the details
of each individual observation. What we need to be aware of, though, is that
parameters are abstract representations of something complicated so at no time should
they be considered as possible substitutes for individual observations. In fact, our
tendency to oftentimes apply profiling characteristics to individuals at random (called
stereotyping) can greatly impede decision making.
3.2 Probabilities
Having discussed how quantification of population characteristics can guide
their representation with abstractions like parameters, we need to introduce here the
very important concept of likelihood. By this we refer to the chances a specific characteristic, or group of characteristics, has of appearing at random. This likelihood of appearance is termed probability3 (symbolized with p) and is valued by convention between 0 and 1. What we do with probability is imagine that we squeeze (map is the official term) the population onto 100 units of observation (we will be calling them individuals occasionally for convenience) and represent each characteristic with its fraction/ratio in this special population. It's like a form of imposed stratification of the population for a better mental representation.
3 Actually, likelihood is proportional to probability, but for our purposes here we will
consider the proportionality constant equal to 1 making likelihood and probability the same.
p = (number of outcomes in which the observation/event occurs) / (total number of possible outcomes)     (3.5)
In practice, we have three ways of calculating probabilities. We can apply a theoretical formula that has been developed with formal mathematical techniques (we will see this when we discuss distributions), estimate them empirically using direct observations (we will see this when we discuss samples), or estimate them intuitively (we won't see this) by basing them on past information, other observations, or simply on instinct. Obviously, one would expect the last to be the least reliable estimation as it will be based on partial information and influenced by observer subjectivity and bias. Consequently, we will not be dealing with it in this book.
Leaving the theoretical calculation of probability aside for the moment, we will now discuss the concepts that make up its essence in as practical a way as possible and in light of their role in quantitative research. In its most basic form, the probability of an observation/event/outcome can be calculated as the ratio of the number of times the observation can occur over the total number of outcomes in existence, as formula 3.5 indicates. Because the numerator can never exceed the denominator (which is the universe), the result will always be between 0 (zero) and 1 (one). Considering, for the sake of simplicity, only two attributes making up a universe U, there will only be two possible geometric arrangements, as Figure 3.12 indicates.
The situation in Figure 3.12.a represents a disjoint set of attributes where each member of the population has only one of the two attributes (A could include males and B females), while the situation in Figure 3.12.b represents overlapping attributes where some members can have both attributes together (A could be females and B could be engineers). Questions that might be of interest here are how many individuals (or what percentages of them) have one or the other attribute and how many have both. The former represents the addition of the areas of each attribute (termed union and symbolized with ∪, not always the same as the universe U, as we will see soon), while the latter is just the overlap (termed intersection and symbolized with ∩ – an upside-down U).
In our case of profiling populations, we are interested to know/predict what
is the probability of observing one or many attributes at an instance of time or what is
the probability of observing attributes at different instances of time. The classical
example of flipping a fair coin will help illustrate the point we want to make. For this
we will use another popular form of representing sets/profiles and probabilities, the
decision tree. In Figure 3.13 the decision tree of 4 fair coin flips is presented along
with the probabilities of each outcome. It is obvious that at each flip of the coin two
outcomes are possible with equal probabilities of 0.5 (a 50% chance). As we keep flipping the coin, the 0.5 probability of each outcome still holds, but the cumulative probabilities of the combinations of outcomes change as the configuration space/universe changes, since it includes more and more specimens/individuals/attributes. While at the first flip there are only two
outcomes/species in the universe, H and T for heads and tails splitting the space
equally between them (0.5 probability each), at the second flip our universe includes
the species/outcomes HH, HT, TH, TT (as they appear from left to right in Figure
3.13). As a result, the individual observations/species share the universe space and
subsequently the probabilities (0.25 each). At the third flip of the coin we have a new
universe with species HHH, HHT, HTH, HTT, THH, THT, TTH, TTT and so on as
the flips continue.
Overall, at each flip of the coin a new universe is created where the sum of the
probabilities of its species/outcomes equals the whole universe (total p = 1). Things
get more interesting (math wise) when we are not interested in the order of appearance
of H and T in the individuals in our universe. This results in similar attributes (Figure
3.14) like, for example, in the third flip universe the individuals HHT, HTH, and THH
are all the same as they have two H and one T (we could imagine them as two headed
monsters). If we were interested in the probability of such individuals, then this would
be 0.125+0.125+0.125 = 0.375. Similarly, the probability of two tail individuals will be
0.375, while the single species HHH (3 headed monster) and TTT (3 tails monster)
each have 0.125 probability of appearance.
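The same bookkeeping can be done by brute force: the short Python sketch below enumerates the universe of three flips and counts the two-headed individuals, reproducing the 0.375 probability.

from itertools import product

universe = list(product("HT", repeat=3))                  # HHH, HHT, ..., TTT (8 outcomes)
two_heads = [o for o in universe if o.count("H") == 2]    # HHT, HTH, THH

print(len(universe))                     # 8
print(len(two_heads) / len(universe))    # 0.375, as computed above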
3.3 Distributions
In the previous section, we saw how probabilities can help profile populations
by providing a simplified representation of the population in a ‘0’ (0%) to ‘1’ (100%)
range and how a probability distribution expresses the profile of populations in terms
of the probabilities of various outcomes. In this section, we will discuss further the
advantages of this type of profiling in the context of quantitative research. Let us
consider now another popular example of the application of probabilities: dice. A typical situation involves throwing two dice and betting on the sum of their outcomes. Our universe here includes every possible combination that might come up. In Figure 3.15 we have a representation of all possible outcomes in the form of a lattice diagram. The rows represent the outcomes of one of the dice, while the columns represent the outcomes of the other die. The intersections, then, represent
possible combinations.
We can easily see from Figure 3.15 that we have 6x6 = 36 possible
combinations so the cardinality (number of elements) of our universe is 36.
Considering that in this game we are interested in the sum of the dice we can see that
many combinations can produce certain sums. For example, the sum of 7 can be
produced by 1 and 6, 2 and 5, 3 and 4, 4 and 3, 5 and 2, and 6 and 1 (red intersections
in Figure 3.15). These are in total 6 combinations out of the 36 available, so the
probability of seeing a sum of 7 is 6/36 or 0.167. In similar fashion, we can calculate
the probability of all other possible combinations (Table 3.3).
Having the probability distribution from Table 3.3 we can easily produce a graph of it (bar graph in Figure 3.16) to pictorially depict the population profile of the dice universe. Because in quantitative research we usually deal with large populations, specific probability values are not as significant as regions/groups of probabilities below, above, or between specific values. For example, in the case of the dice universe we might be interested in the probability of getting below, above, or between certain outcomes/sums. In such cases, all we have to do is simply calculate the union (see previous section) of the elements that are found within our range of interest.
For example, if we are interested in the probability of getting a sum less than
6 then all we must do is add the probabilities of getting a sum of 1, 2, 3, 4, and 5, which
from Table 3.3 is 0 + 0.028 + 0.056 + 0.083 + 0.111, giving us P(less than 6) = 0.28. If we were interested in the probability of getting above 10 we would again add the values for 11, 12, and 13 (0.056, 0.028, and 0) and we would get P(above 10) = 0.08. Finally, if
we were interested in the probability of getting between 7 (inclusive) and say 9
(inclusive), we would add the values for 7, 8, and 9 (0.167, 0.139, and 0.111) and get
P(between 7 and 9) = 0.42.
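These sums of table entries can also be reproduced by enumerating the 36 combinations directly, as the following Python sketch shows (small rounding differences from the table values are expected).

from itertools import product

sums = [a + b for a, b in product(range(1, 7), repeat=2)]   # all 36 two-dice outcomes
n = len(sums)

print(sums.count(7) / n)                           # 0.167: six of the 36 combinations
print(sum(1 for s in sums if s < 6) / n)           # 0.278: P(less than 6)
print(sum(1 for s in sums if s > 10) / n)          # 0.083: P(above 10)
print(sum(1 for s in sums if 7 <= s <= 9) / n)     # 0.417: P(between 7 and 9)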
An interesting observation can be made at this point by considering the probabilities we get through the aforementioned process and the area under the probability distribution curve. For example, if we consider P(between 7 and 9) we can see the sum 0.167 + 0.139 + 0.111 that contributed to the result as 1*0.167 + 1*0.139 + 1*0.111, which in the graph is nothing other than the area of the 3 rectangular bars (Area = Base*Height for each one of them) that form area B in Figure 3.17. The same can be seen for P(less than 6) (area A) and P(above 10) (area C).
In the following sections we will see some of the probability distributions that
are of importance to quantitative research. Unlike our dice universe, these distributions
will refer to large populations and in their form as mathematical functions they will be
concerned with real numbers.
population, and it can be completely described by only the mean (μ) and the standard
deviation (σ). Considering x as the variable that represents an event/outcome/attribute
of a population profile and P(x) as the probability of the event x appearing, the
mathematical expression of the normal distribution density is given by formula 3.6.
P(x) = (1 / (σ·√(2π))) · e^( −(x − μ)² / (2σ²) )     (3.6)
The graphical representation of (3.6) for a variety of values (μ, σ) is given in
Figure 3.19. The symmetry of the shapes around their respective means is apparent, as
is the influence of the standard deviation on the kurtosis (compare with Figure 3.10)
of each curve. The appeal of the normal distribution stems, among other things, from the fact that many distributions approach the normal curve for large populations and from its simplicity in requiring only two parameters (μ and σ) for its description. The former fact is key, as we will see later, in developing the central limit theorem, while the latter makes it "easy" to work with in statistics (most statistical techniques depend on distributions being normal).
symbolized with z instead of the traditional x used in most functions (blue line in
Figure 3.20). Again, as we saw in the previous section, our interest is in the population
between segments of the attribute values (z in our case), so we need to be able to
calculate the area under our distribution curve. By integrating (3.6) across the values
of z we get the area under the curve from the leftmost part (which asymptotically is
zero) to each value of z. The graph (Figure 3.20) of this integration that represents the
area under the normal curve is referred to as normal cumulative distribution.
Because of the complexity of formula 3.6 and the difficulty of calculating the area from the cumulative distribution, tables with precomputed probability values for the most commonly used values of z have been developed. In Figure 3.21 we see the probability estimates for positive
values of z in the form of a table. Because the normal curve is symmetric, the same
values will apply for negative values of z (they have not been included in Figure 3.21
to save space). Normal distribution tables arrange the probability values at the
intersection of the first 2 digits of z (leftmost column in Figure 3.21) and the third digit
(2nd decimal place in top row). For example, if we are interested in the probability (area
under the curve) below z = 2.33, all we need to do is consider z = 2.3 + 0.03. We need
to locate the first two digits (2.3) in the first column in Figure 3.21 and the third digit
0.03 in the top row. The intersection of their corresponding line and column (red
arrows in Figure 3.21) will point to the solution of formula 3.6 and the probability
value we were looking for. In this case, we see that for z = 2.33 we get 0.9901, which translates as having 99% of our population below z = 2.33. The same process can be followed for negative values of z (an example will follow later on).
The process can be reversed, for example, when we are interested in the z value
of a certain probability. Let us assume that we are interested to know the z value that
corresponds to 95% of the population (Figure 3.22). We need to find the area under
the normal distributions curve that represents the probability value of 0.95. In
searching within the table values we can see (origin of blue arrows in Figure 3.21) that
it is between the existing values of 0.9495 and 0.9505. By following the blue arrows in
Figure 3.21 we see that the first two digits of the z value (first column) are 1.6 and the
third (top row) is somewhere between 0.04 and 0.05. Assuming an approximate value
of 0.045 for the in-between point we can calculate the corresponding z as z = 1.6 +
0.045 or z = 1.645.
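If a statistical library is available, the table lookups above can be replaced by direct function calls; the sketch below assumes scipy and reproduces both directions of the lookup.

from scipy.stats import norm

print(norm.cdf(2.33))    # 0.9901: the area below z = 2.33, as read from the table
print(norm.ppf(0.95))    # 1.645: the z value that leaves 95% of the area below it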
easy to calculate the leftmost value based on the symmetry of the curve as it will be
the negative of the previous results (z = -1.96 in our case).
Along with the critical values, sometimes we are interested in the inverse
situation like the probabilities (percentages of the population) that can be found
between 1σ, 2σ, and 3σ deviations from the mean, which are depicted in Figure 3.24.
We can see that 68.3% of the population is within 1σ from the mean, 95.5% is within
2σ, and 99.7% is within 3σ. The significance of the standard deviation in representing
population segments can be seen and for practical purposes we can assume that in the
standardized normal curve almost all of the population is included within three standard
deviations from the mean. The interested reader can verify the depicted values in
Figure 3.24 as well as the critical values of Table 3.4 by following the process outlined
in the previous paragraph.
percentages of the population under certain values might get complicated unless we
transform the income distribution into a standardized normal distribution using the
transformation:
z = (x − μ) / σ     (3.7)
This formula will map every value x of the income distribution curve to its
corresponding value in the standardized normal distribution curve (Figure 3.26). If,
for example, we are interested in the percentage of the population in the suburb that
is below $35,000 all we need is to apply the transformation of formula 3.7 for x =
35,000, μ = 60,000 and σ = 10,000 and then look up the percentage in the table of
Figure 3.21. From formula 3.7 we get z = (35,000 – 60,000)/10,000 or z = - 2.50.
Looking up Figure 3.21 for positive z = 2.50 (2.5 in the first column and 0.00 in first
row) we get the probability value p = 0.9938. This means the population for z below 2.50 is 99.38%, so due to the symmetry of the normal curve the population for z below -2.50 will be the complement of the previous, or 1 – 0.9938 = 0.0062 or 0.62% (apparently, it is a relatively affluent suburb).
Following the inverse process, we can estimate, for example, the cut-off point for the wealthiest 10% of the population. In this case, we consider the 90% of the area below the curve, or p = 0.9, and by searching the table of Figure 3.21 we see this is
between 0.8997 and 0.9015 in the row of z = 1.2. The corresponding value in the top
row is between 0.08 and 0.09 so we will assume it is 0.085. Adding this to z = 1.2 we
get z = 1.285, which after substitution to formula 3.7 will produce x = 72,850. We can
conclude from this process that the wealthiest 10% of the population will be making
$72,850 and above. The transformation process that we followed can be done for any
normal distribution regardless of the mean or standard deviation.
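The whole suburb example can be reproduced with the same kind of function calls (scipy assumed). Note that the exact 90th percentile corresponds to z = 1.2816, so the computed cut-off comes out slightly below the $72,850 obtained by interpolating the printed table.

from scipy.stats import norm

mu, sigma = 60_000, 10_000

# Share of the population below $35,000, i.e. below z = (35,000 - 60,000) / 10,000 = -2.5
print(norm.cdf(35_000, loc=mu, scale=sigma))   # 0.0062, about 0.62%

# Income cut-off for the wealthiest 10% (the 90th percentile)
print(norm.ppf(0.90, loc=mu, scale=sigma))     # about 72,816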
Figure 3.27 shows a variety of chi-square distributions for increasing values of
k. What we can observe from the curve shapes is that the higher the degrees of
freedom the closer to the normal curve the distribution looks. This property is vital as
it allows (to a great extent) one to apply the transformation of formula 3.7 and reap
the benefits of the normal distribution calculations. Similar to the normal distribution,
4 If there are N objects in a system, we can consider one as the reference frame/source (say with coordinates 0,0,0 in three dimensions), so we only need the relative positions of the remaining N-1 with respect to the source to completely describe the system.
tables exist (Figure 3.28) of the most popular combinations of degrees of freedom and
probabilities.
The practical value of the chi-square distribution comes from the development
and application of the chi-square test statistic that we will see in the next chapter. We
briefly need to mention here that the statistic is built as the sum of the squares of the
p(x) = [n! / (x!·(n − x)!)] · p^x · q^(n−x)     (3.8)
In the case of our coin in Figure 3.13 it could be that we are interested in the probability of getting exactly three heads in the 4 coin flips/trials. The only combinations
that satisfy that requirement are HHHT, HHTH, HTHH, and THHH. Since the
probability of each one of them is 0.0625 (Figure 3.13), the overall probability for
observing 3 heads (let’s assume it represents success) and 1 tail (let’s assume it
represents failure) will be 4*0.0625 or p(3) = 0.25. It is left to the reader to confirm
that formula 3.8 will produce the same result for p = 0.5, q = 1 – p = 0.5, n = 4, and
x = 3.
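The same result can be obtained from a statistical library instead of applying formula 3.8 by hand; assuming scipy is available:

from scipy.stats import binom

print(binom.pmf(3, 4, 0.5))   # k = 3 heads, n = 4 flips, p = 0.5  ->  0.25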
3.3.4 t Distribution
In certain cases, we might know that our population attribute follows a normal distribution, but we can only access a small section/sample of the population. For such a case the t distribution (often called the Student t distribution – 'Student' was the alias the developer of the distribution used for anonymity purposes) has been developed. Given that the section of the population we study is small, it is to be expected that the shape of the corresponding curve will deviate from the normal, approaching the normal distribution curve as the sample size increases. The formula that relates the t-statistic with the other parameters of the population is given by t = (x̅ − μ) / (s/√N) or, equivalently, t = (x̅ − μ)·√N / s.
The probability density function is omitted due to its complexity and little
practical significance for the material in this book. The graph of the distribution,
though, is depicted in Figure 3.31 for comparison with the normal curve. The
distribution is dependent on the degrees of freedom (sample size – 1) and two values
are shown in Figure 3.31. It is apparent from the graph that the curves approach the normal quickly, even with small increases in the sample size. As with the previous
distribution, popular values are collected in the form of tables like the one in Figure
3.32 where the absolute values of the t-statistic are displayed.
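Assuming scipy is available, values of the kind tabulated in Figure 3.32 can be reproduced with the distribution's percent-point function; the degrees of freedom below are chosen only to illustrate the approach toward the normal value.

from scipy.stats import t, norm

print(t.ppf(0.975, df=5))     # about 2.571: heavier tails with only 5 degrees of freedom
print(t.ppf(0.975, df=30))    # about 2.042: already close to the normal value
print(norm.ppf(0.975))        # about 1.960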
3.3.5 F Distribution
A final distribution that we will briefly mention here is the F distribution. This
is used when we conduct analysis of variance (see next chapter) for multiple variables
and it provides a comparative measurement of the variances of two populations with
varying degrees of freedom (population sizes). One of the populations includes the
means of each variable, while the other includes the values of all variables as one group.
The degrees of freedom (population size) of the population of the means can be
referred to as k1 and its variance is called the between mean square or treatment mean
square (MST), while the corresponding degrees of freedom of the population of all
values can be referred to as k2 and their variance as within mean square or error mean
square (MSE). The sampling distribution model of the ratio of the two mean squares (MST/MSE) forms what we call the F distribution (F is for its inventor, Sir Ronald Fisher). For two variables with degrees of freedom k and m and with their squares distributed as chi-squared, the F-statistic is given by the formula F = (χ²k / k) / (χ²m / m), where χ²k and χ²m denote the corresponding chi-squared variables.
The graph of the distribution for various combinations of the two population
sizes/degrees of freedom is shown in Figure 3.33. We can see that the higher the
degrees of freedom (population size) the closer the distribution gets to normal. This
closeness to normal is what will allow us to use the distribution (in the next chapter)
for statistical purposes.
While the formula of the corresponding probability distribution is beyond the
scope of this book, its value for different degrees of freedom and for a certain area
under the curve (probability) can be found in F-statistic tables online. Figure 3.34
displays the values of the statistic for p = 0.95 and various degrees of freedom for k1
and k2. The value of the statistic can be found as usual at the intersection of the
corresponding row and column.
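The same kind of lookup can be done in code; the sketch below assumes scipy and uses arbitrary degrees of freedom to show how the p = 0.95 values behave.

from scipy.stats import f

print(f.ppf(0.95, dfn=3, dfd=20))    # about 3.10 for k1 = 3 and k2 = 20
print(f.ppf(0.95, dfn=3, dfd=120))   # about 2.68: smaller as the degrees of freedom grow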
two peaks/modes and one trough in between. Apparently, it is not normal (far from
it) so anything we discussed in the previous sections cannot be applied here. Luckily,
in math we can transform distributions to new ones in any way that is mathematically
acceptable (meaning applying the typical mathematical operators and functions).
Formula 3.7 was one such way and it allowed us to convert any normal distribution to
the standardized normal distribution.
If instead of the normal values we plot our distributions against the cumulative
normal values (Figure 3.20, red line), the corresponding plot is called a P-P plot
(Probability – Probability). This type of plot will work, in addition to normal values, for exponential, lognormal (a variable whose logarithm is normally distributed), and other distributions. The reference line in such
plots is always the diagonal line y=x. The difference between the Q-Q plot and the P-
P plot is that the former magnifies the deviation from the proposed theoretical
distributions on the tails of the distribution and it is unaffected by changes in location
or scale, while the latter magnifies the middle and its linearity might be affected by
changes in location or scale. Using the same data as the ones used to produce the Q-
Q plots of Figure 3.37, SPSS produces the P-P plots of Figure 3.38.
In addition to the P-P and Q-Q plots that provide a visual interpretation of
how closely a curve resembles the normal distribution, there are two popular metrics:
the Kolmogorov-Smirnov6 test (KS test) and the Shapiro-Wilk. Both tests quantify
the difference of our distribution with the normal by using hypothesis testing (Chapter
5). The KS test is typically suggested for populations greater than 2,000 individuals,
while the Shapiro-Wilk test is suggested for populations less than 2,000.
6SPSS: Analyze => Descriptive Statistics => Explore … => Plots => select
Normality plots with tests
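Both tests are also available in general statistical libraries. The following sketch assumes scipy and numpy, with randomly generated values standing in for a real sample; the KS call compares the standardized data against the standard normal.

import numpy as np
from scipy.stats import shapiro, kstest

rng = np.random.default_rng(1)
data = rng.normal(loc=50, scale=10, size=200)    # stand-in for an observed variable

print(shapiro(data))                             # large p-value: no evidence against normality
z = (data - data.mean()) / data.std(ddof=1)      # standardize before comparing with N(0, 1)
print(kstest(z, "norm"))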
transformations include the log107 and sqrt (square root), among others. This means
that for every value of our distribution we calculate its base 10 logarithm, or its square
root and we use the newly found values in place of the regular x values. Figure 3.39.a
shows the graphs of the bimodal distribution (blue line) with its log10 (green curve)
and sqrt (red curve) transforms.
It is evident from the different curves that in this case the log10 smoothed out
the initial curve a lot more than sqrt, so although it still does not resemble the normal
curve it came a lot closer than any of the other curves. A different situation (blue
curve) is shown in Figure 3.39.b were the sqrt transform produces a closer to normal
curve than the log10 that produces even unacceptable/negative values for the
frequency at certain values. The reader can find other popular transformations in the
extant literature. In practice, it is best to apply a variety of transforms to see which one
gets our distribution closer to normal.
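The transforms themselves are one-liners in most environments; the sketch below applies them with numpy to a small, hypothetical right-skewed set of values.

import numpy as np

x = np.array([1, 2, 2, 3, 4, 6, 9, 15, 40, 120], dtype=float)   # hypothetical skewed values

print(np.log10(x).round(2))   # log10 compresses the large values strongly
print(np.sqrt(x).round(2))    # sqrt compresses them more gently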
4 Samples
of the population (μ and σ, respectively) and sample (x̅ and s, respectively) are naturally
our primary targets in establishing such relationships.
Considering the various sample distributions (Figure 4.2) we would expect
them to fall somewhere within the population distribution. If at this point we were to
consider the means of each possible sample it might be intuitively possible to see that
there will be more of them closer to the population mean since more members of the
population exist around that mean and are most likely to be selected in most samples.
In other words, one can intuitively see that the mean of the means of all possible samples will be our population mean. This intuitive conclusion has been formally proven and is referred
to as the central limit theorem (CLT).
To be more precise, what the theorem states is that if we take random samples
of size N from the population, the distribution of the sample means (referred to as
sampling distribution) will approach the normal distribution the larger N becomes.
For instance, if we were to consider all possible samples of size N = 4 and calculate their means, we would observe that their distribution resembles the normal distribution. The
more we increase the sample size the closer we get to the normal curve. If we were to
continue experimenting with increased sample sizes, we would observe that around
the sample size of N = 30 we have an almost perfect match with the normal. As a rule
of thumb, this sample size is oftentimes considered the minimum for allowing
significant observations to be made.
The advantage of proving that the sampling distribution approaches the
normal distribution is that we can accurately represent it with its mean and standard
deviation. We have already mentioned that the mean of the sampling distribution is in
fact our population mean. Another fact that has been proven theoretically is that the
standard deviation (also referred to as standard error later) of the sampling
distribution relates with the population standard deviation through the relationship:
σx̅ = σ / √N     (4.1)
Because in research the population standard deviation σ will typically be unknown, we approximate it with our sample's standard deviation s. Formula 4.1 will then become, for practical purposes:
sx̅ = s / √N     (4.2)
Figure 4.3 summarizes the results of the CLT and its implications for
connecting sample statistics with population parameters. Given the conclusion of the
CLT and some of the assumptions involved like equating the sampling distribution
mean and standard deviation with that of our sample, it would be appropriate to have
a metric about our confidence in the conclusions made. In considering such a metric,
some of the observations we made in the previous chapter about population profiles
will be valuable. One such observation (Figure 3.24) concerns the percentages of the
population under the normal curve for integer values of z. We know, for example, that
with respect to the attribute we investigate, around 68% of the population (an
approximation of 68.3%) is within one standard deviation from the mean, around 95%
of the population (an approximation of 95.5%) is within two standard deviations, and
that the great majority (99.7%) will be within three standard deviations.
Considering the 95% population spread within two standard deviations from
the mean and given that the sampling distribution is the distribution of the means of
samples of size N, we can be “sure”/confident that the chances of our sample mean
x̅ (having the same chances as any other distribution) being within that spread will be
95% (Figure 4.4.a). Reversing the thinking, we can say that we can be 95% confident
that the sampling mean (and as proven by CLT, the population mean) will be within
two sample distribution standard deviations from our sample mean (Figure 4.4.b). By
consulting Table 3.5 (critical values) we see that the value of z for 95% of the
population around the mean is -1.96 and 1.96 (a little different from z = 2 that we used
as an approximation).
By using formula 3.7 for z = -1.96 and z = 1.96, we can calculate the values of
our attribute x that will have 95% chance of including the sampling mean.
xL = x̅ − (1.96 ∗ sx̅ ) and xU = x̅ + (1.96 ∗ sx̅ )
where xL and xU denote the lower and upper boundaries of the attribute x. sx̅
is also known as standard error (symbolized SE from now on for compliance with
other publications). The range (xL, xU) from xL to xU is called the 95% confidence
interval (CI), while its half value (the range to the left and right of the mean) is called
margin of error (ME). The corresponding formulas are:
CI = xU - xL or CI = 2*1.96*SE and ME = 1.96*SE
Like the popular 95% confidence intervals (0.05 significance level), we can calculate the 99% (0.01 significance level) and 99.9% (0.001 significance level) confidence intervals.
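Putting the pieces together, a confidence interval requires only the sample mean, the standard error of formula 4.2, and the critical value; the numbers in the sketch below are hypothetical.

import math

x_bar, s, n = 37, 14, 40           # hypothetical sample mean, standard deviation, and size
se = s / math.sqrt(n)              # standard error (formula 4.2)
me = 1.96 * se                     # margin of error for 95% confidence

print((x_bar - me, x_bar + me))    # (xL, xU): roughly 32.7 to 41.3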
An alternative to the consideration of CLT as an estimator to confidence
intervals is bootstrapping8. This statistical analysis technique belongs to the general
category of resampling techniques (also known as random sampling with replacement)
and in addition to samples we “collect”/engage during our research, it can be applied
to samples that have been collected in the past. The technique is distribution
independent, so it is ideal for situations where we are unsure about the shape of the
sampling distribution. It is simple enough but relies heavily on computers to perform
calculations. The sample is considered here in the role of the population from which
we randomly extract individuals to form sub-samples (we will call them bootstrap
samples from now on)9. The same individual can appear in multiple bootstrap samples, and thousands or even tens of thousands of such samples can be created. The statistic of interest (for example, the mean) can be measured in this huge number of samples and its deviations from the "population" (our original sample) can be estimated, producing a confidence interval and standard error. The
dependency of the method on our initial sample is considered by many as a weakness
of the method but there is theoretical proof that the method works in producing
reliable estimates.
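A bare-bones version of the procedure is easy to write; the sketch below assumes numpy and reuses the municipality distances as a stand-in sample, drawing 10,000 bootstrap samples of the mean.

import numpy as np

rng = np.random.default_rng(0)
sample = np.array([32, 36, 40, 26, 30, 28, 70, 34])

boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()   # one bootstrap sample mean
    for _ in range(10_000)
])

print(boot_means.std())                          # bootstrap standard error of the mean
print(np.percentile(boot_means, [2.5, 97.5]))    # percentile-based 95% confidence interval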
While the analysis we performed up to now addressed the situation when the
variables we study are continuous (scale), we need to see how the situation will change
when dealing with categorical/nominal data. The challenge with this type of data is
that they usually are not expressed numerically as they are in text form. Examples of
these types of variables are color, gender, race, location (like countries), etc. In social
sciences, typical categorical variables that are investigated include feelings, perceptions,
beliefs, etc. The most familiar use of categorical values (to most readers at least) is
8 SPSS: Analyze => Choose test … => Activate the bootstrap option when it appears
9 The name bootstrap is meant to indicate the absurdity of using a sample from a sample, which is like lifting ourselves up by pulling our boot straps up.
when polls are conducted. During elections people are asked which party they will vote
for and the results of the polls are then presented as proportions of their preferences.
While we cannot define categorical/nominal variables numerically, we can
nevertheless count the various instances of their attributes and express their presence
in populations and samples in terms of their frequency and probability of occurrence.
If we record, for example, the favorite car color of a sample of 500 individuals10 we
might find the frequencies listed in Table 4.1 along with their corresponding
proportions.
Notice the subtle switch of the wording from "probabilities" to "proportions"? Although they are conceptually the same, this is done to distinguish their referents: "probabilities" is reserved for the population and "proportions" is reserved for the sample.
It would be interesting to see if something similar to CLT can be applied to nominal
variables to allow inferences between sample and population. The solution is simple
enough if one focuses on one category at a time. Consider for example only the color
White. We could take multiple samples from our population and record its proportion
in each one of them. If we were to plot all these values, we would get the sampling
distribution of the proportions for the color White (similarly for other colors).
Table 4.1 Nominal data values
Car Color Frequency Proportions
White 115 0.23
Silver 90 0.18
Black 105 0.21
Gray 70 0.14
Blue 30 0.06
Red 40 0.08
Brown 30 0.06
Green 5 0.01
Other 15 0.03
Total 500 1
of the White color) will be denoted as μ(p̂). p̂ (pronounced p-hat) will refer to our
sample mean in the role of the predicted population proportion. The mean on the
sampling distribution of the proportions (which we presume equals our sample
mean/proportion of 0.23) is given by the formula μ(p̂) = p.
Luckily for us, knowledge of the mean allows theory to provide us with the
standard deviation which here is expressed by the formula:
σ(p̂) = √( p·q / N )     (4.3)
where N is the sample size and q = 1 – p (the probability of not having a White
color). Figure 4.5 displays the normal distributions centered around p and the 68-95-
99.7 rule of the percentages of the population under the curve.
In the case of the White color we know from Table 4.1 that its proportion is
p = 0.23. Assuming this as the proportion in the population we can apply formula 4.3
and get the standard deviation σ(p̂) = 0.019. This means that if we were to draw
different samples from our population we would expect 68% of them to give us a
proportion for the White color within 0.23 ± 0.019 values (0.21 and 0.25, respectively),
95% of the sample to give us a proportion within 0.23 ± 2*0.019 values (0.19 and 0.27,
respectively), and 99.7% of the sample will give us a proportion within 0.23 ± 3*0.019
values (0.17 and 0.29, respectively). These multiples of the standard deviation are called sampling errors (usually in the context of polls). In our example, we can say that the sampling error in the case of 95% of the samples (the typical choice in most polls) is 2*0.019 or 0.038 above and below the sample proportion of
0.23. This does not mean that we make an error in our estimate of the 0.23 proportion
but rather that the variability within the various samples (95% of them) will be between
0.19 and 0.27. The “error” labeling is misleading and a better alternative for ‘sampling
error’ would have been ‘sampling variability’.
4.1 Statistics
With the establishment of the CLT that allows us to connect our samples to
the population they came from we can begin the study of samples. Our main interest
in statistics is whether two or more samples come from the same population. If the p-value (see previous chapter) of our statistic is greater than our chosen significance level, we will be confident enough that the two samples come from the same population (the null hypothesis, as we will see in the next chapter); otherwise we will deduce that they come from different populations (the alternative hypothesis).
We will focus first on studying the characteristics/statistics of samples
(referred to as descriptive statistics) and in the next chapter we will discuss how they
relate to the population (referred to as inferential statistics). While there are many ways
we can approach the subject of statistics, we will view it here from the specifics of the
number of samples/groups our study includes, the number of variables/dimensions
involved, and the type of data we have (Figure 4.6).
As the various statistical tests are presented in the remaining sections of this
chapter, an effort will be made to explain the way they are structured and their
workings the first time they appear by providing a simplified numerical example.
Similar to how our understanding of the workings of a car engine improves our
driving, it is assumed here that understanding how the various statistical tests work
will improve our understanding of their appropriateness for the data analysis we are
planning. When subsequent applications or extensions of the same or similar methods
appear, it will be left to the reader to follow up with numerical examples as they abound
in the extant print and online literature (a Google search will prove the point).
The reader might notice that both these formulas include a division by N-1
instead of N as it was in the corresponding formulas (3.3 and 3.4) for the population
μ and σ. This is an attempt to lower a bias that s2 and s introduce in samples especially
when the number of individuals in the sample is small. Because the proof of this
approximation is beyond the scope of this book, we will demonstrate its validity (for
the inquiring reader) using a simplified example. Let us assume that the mean of a
population variable is 70 (whatever the units might be) and our sample is composed
of 3 individuals with values 80, 90, and 100 for the variable we study. If we were to
calculate the mean we will get x̅ = 90. For the standard deviation, if we use N-1 (which is 2 in this case) we get s = 10, while if we use N (which is 3) we get s = 8.16. The effect N has is to bias s towards smaller values, while the effect N-1 has is to increase s and allow it to cover more distance. Because the variance and the
standard deviation include differences from the mean (like 80-90, 90-90, and 100-90)
the values close to the mean (90 in our case) add insignificant amounts (90-90 = 0 in
our case) and thus their influence in shaping the outcome is eliminated or becomes
negligible. In other words, we could even exclude the value 90 from the 80, 90, 100
set and we will still get the same standard deviation when using N. Reducing N to N-1 alleviates this nullification/zeroing effect of the values near the mean. In the case of the
numbers we used we can see that the increased s that the N-1 produces will include
the population mean (70) when considering two standard deviations (-20 and +20)
from the sample mean (90), while the s produced when using N does not. Of course,
the numbers are selectively chosen in this example but the idea of what the use of N-
1 does is accurately portrayed.
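Most statistical libraries expose the two versions through a single argument; in numpy, for instance, the ddof parameter switches the divisor between N and N-1.

import numpy as np

x = np.array([80, 90, 100])
print(np.std(x, ddof=0))   # 8.16: divides by N and understates the spread
print(np.std(x, ddof=1))   # 10.0: divides by N-1, the convention used for samples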
Typical measures of relative position and shape include skewness, kurtosis,
and percentiles. While the first two measure deviations from the symmetry and the
standardized normal curve, percentiles express the sections (percentages) of the
population below certain values of the observed variables. As such, they include the
quartiles, the IQR, and their depiction as box-plots. Another metric that shows relative position and gives an indication of the shape is the z value of the various individuals/observations in the sample.
A final statistic of importance that is used to compare the mean of the sample against a hypothetical/population mean is the t-statistic that is produced by the well-known t-test (also known as the Student t-test12 or independent t-test). The naming
similarity with the t-distribution is evident and since we are interested in just two
values, our sample mean x̅ and a hypothetical population mean μ (that is, we have a
population with two values), the t-distribution is appropriate here. If we assume that
the standard error is SE(x̅) (the standard deviation of the sampling distribution, which we approximate from the sample as in formula 4.2), then we have:
SE(x̅) = s / √N and t = (x̅ − μ) / SE(x̅)
The evaluation of the values of t we get will depend on the assumptions we made when forming the hypothesis of our research. In general, though, when the significance of the test is above 0.05 (see Chapter 3) we can conclude (with 95% confidence) that our sample mean is within about 2 standard errors of the hypothesized population mean. In practice (when conclusions are concerned), we can say that our sample mean is representative of (approximately equal to) the population mean with 95% confidence.
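Assuming scipy is available, the one-sample t-test of the 80, 90, 100 sample against the hypothesized mean of 70 takes a single call.

from scipy.stats import ttest_1samp

result = ttest_1samp([80, 90, 100], popmean=70)
print(result.statistic)    # t = (90 - 70) / (10 / sqrt(3)), about 3.46
print(result.pvalue)       # about 0.074: not below 0.05 with such a tiny sample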
the values is “abnormal” (there are outliers and/or multiple modes like in the bimodal
distribution).
The most popular statistical test in this case is the sign test (also referred to as Wilcoxon's test). It provides a simple comparison of the sample median with a hypothetical/population median. The test is based on the sign of the difference between each observed value and the hypothetical/population median, reducing the problem to a binomial distribution problem. Let us assume that the observed values
(we will ignore the units as usual) in our sample are the values of the municipalities in
Figure 3.3 where we are interested in building a firehouse. Let us also assume that
these distances represent the sample of a population of municipalities of an extended
region around a city or, if we are interested in a large population, the distances of all
the municipalities around big cities in the US (of say half a million population and
above). We want to see how the median of our sample compares to the population
median that we hypothesize to be 38.
Table 4.2 summarizes the differences of the sample values with our
hypothetical population median and the sign of that difference (last column). Having
two values for the sign (positive and negative) should suggest a binomial distribution
(a process similar to the coin flip example with, say, plus-sign for heads and minus-
sign for tails). If we consider the plus-signs in Table 4.2 we see that we have two such
results in the total of eight values/signs. By checking the binomial distribution table
of Figure 3.29, we see that for 2 successes in 8 draws (considering equal chances for
success/plus-sign and failure/minus-sign or p = 0.5) we get p(2) = 0.109. This means
that the probability of observing exactly 2 plus-signs is 10.9%. In other words, the probability that a population median of 38 would result in two positives and six negatives in our set is 10.9% (rather low). Whether this value is
acceptable or not for the inferences we want to make will depend on the hypothesis
we formed (see next chapter).
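The binomial calculation at the heart of the sign test is a one-liner if scipy is available; the cumulative version is what a one-tailed p-value would use.

from scipy.stats import binom

print(binom.pmf(2, 8, 0.5))   # 0.109: exactly 2 plus-signs in 8 fair trials
print(binom.cdf(2, 8, 0.5))   # 0.145: 2 or fewer plus-signs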
13 One sample (Paired) SPSS: Analyze => Compare Means => Paired Samples T Test
the independent t-test we saw before, the formula for the t-statistic and its standard
error are given by t = x̅D / SE(x̅D) and SE(x̅D) = sD / √N, where x̅D and sD are the mean and standard deviation of the paired differences.
While comparing two variable means using a t-test, or individual values by comparing their corresponding z values, is straightforward, comparing how the variables vary together across a range of values is more challenging. What we are interested in here are possible associations of the variables, which might suggest dependency of one on the other. This borders on the realm of causation, so we need to be careful in the way we express the conclusions of our findings. Nothing we discuss here will address causation; it will only "prove" associations between variables. Associations are nothing
more than the tendency of one variable to follow the rate of change of another in some
sort of fashion. This can be expressed mathematically in the form of a function like y
= f(x). We should recall here the various variable naming conventions that are used in
quantitative research (discussed in section 2.4.3). Variables that tend to trigger
something tend to be called independent, predictor, or plain x, while variables that
express the result of the trigger tend to be called dependent, criterion, outcome, or
even plain y. To simplify matters and avoid names that insinuate causation, we will use
the abstract algebraic convention of x and y here.
In defining associations let us look first at the scatter plots14 of the x and y
variable values of some sample populations (Figure 4.10). By association we mean
any shape (line or curve) of the relationship between x and y that seems to indicate
structure. We can see in Figure 4.10 that plot (a) shows a decline in a linear fashion of
the values of y as the values of x increase (negative association), plot (b) shows an
initial decline (negative association) followed by an increase of the y values as the x
values increase (positive association), and plot (c) shows no association at all as the
values appear randomly placed on the plot. We will see later on how to deal with some
of the randomness we observe, but for now it is sufficient to say that plots (a) and (b)
indicate some kind of association between x and y (linear for (a) and kind-of quadratic
for (b)), while (c) does not show any association at all. The curved (straight including)
line forms of association are also typically referred to as correlations.
Considering the average of all zxzy products, we can form our metric r as r = Σ(zx·zy) / (N − 1).
be the case that x and y are the same variables. In math, close to perfect diagonal plots
(those that produce the -1 or +1 coefficients) are indications of functions of the form
y = x or y = -x (also called identities). The type of correlation we have seen here is also
known as bivariate correlation or Pearson correlation.
Having an indication of a strong association might tempt us to consider modeling our scatter plot along a line (referred to as regression)16. This means we might be interested in finding the equation of the line that best approaches the trend of the data points. This process focuses on identifying the parameters that define the line that best fits/approximates our data values. From math,
we know that lines are expressed as functions of the form y = mx + b (referred to as
regression equation) where m represents the slope/angle of the line with the
horizontal axis and the constant term b is the y-intercept or the point where the line
intersects with the y axis. In statistics, the form of the line equation is usually shown
as ŷ = a0 + a1x or ŷ = b0 + b1x (if we are to stick with our constant-term notation), suggesting the polynomial origin of the line as one of many curves that derive from an nth-degree polynomial of the form a0 + a1x + a2x² + a3x³ + … + anxⁿ. By considering this form the reader could see that the scatter plot of Figure 4.10.b could be approximated by ŷ = a0 + a1x + a2x² (a quadratic equation). Notice that instead of y we now use ŷ
(pronounced y-hat) — this is to distinguish the points produced by the line equation
as the predicted ŷ instead of the actual y in our data set.
The process we follow in math to calculate the coefficients of the line, m and
b (or the polynomial coefficients in the more general form), is by applying the least-
squares method. This is a standard approximation technique for calculating
coefficients (details can be found on the Internet for the interested reader). By applying
the least-squares approximation technique and combining it with the statistics we have
seen so far (mean, standard deviation, and the correlation coefficient), we get a1 = r·(sy/sx) and a0 = y̅ − a1·x̅.
One interesting observation from the formula for a1 (the slope of the line) is that, for standardized variables (where sx = sy = 1), its absolute value reduces to |r|, which is always less than 1. This means that the slope of the regression line in standardized form will always be less than 1, or, in terms of the angle the line makes with the horizontal axis, it will be less than 45 degrees (remember that slope = tan(angle)). In normalized diagrams
this will lead to having the predicted values of ŷ (vertical red lines in Figure 4.12)
always be smaller than its corresponding value of x (horizontal red lines in Figure 4.12).
The rectangle that the red lines form with the axes will always be flatter than it is wide (its
y/height smaller than its x/length). This property, which results from the values a1 can
take, is called regression to the mean (remember that the mean of the standardized normal
distribution is zero): the predicted values of y tend to lie closer to the mean (zero
in our case) than the corresponding x values, and this is how regression got its name.
The differences between the actual and the predicted values (y − ŷ) are called residuals. In
practice, we can get an indication of how well the line fits by plotting the residuals against their
corresponding x values. If a structure/form appears (like in Figures 4.10.a and 4.10.b),
then we should worry, as this would be an indication of the underlying influence of
another variable that our model/line did not consider. If, on the other hand, the plot
appears random (like Figure 4.10.c), then we have no indication of underlying
influences and the scattering of the residuals can be treated as random.
While the plot of the residuals can provide a strong indication of the validity
of our regression model, a statistic that can do the same job is desirable, even if only
to provide additional support for the residual plot. Considering
that regression models will fall somewhere between the perfect correlation (r = -1 or
r = 1) and no correlation (r = 0), a metric will just need to show where in the domain
between perfect and no correlation our model falls. If we take into consideration that
the absolute value of positive and negative values of r is the same (the only difference
is the direction of the line), we can adopt the square of r as an indication of the
correlation strength independent of sign. This statistic is referred to as R2 (pronounced
R-squared)17 and is used as an indication of the variability in our data that can be
explained by our model/regression line.
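Continuing the earlier sketch, the slope, intercept, and R² can be obtained directly from the statistics discussed above; the data values are again hypothetical:

```python
# A minimal sketch (hypothetical data) of the least-squares line and R².
import numpy as np

x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.5, 3.0, 4.5, 6.5, 8.0])

r = np.corrcoef(x, y)[0, 1]                 # correlation coefficient
a1 = r * y.std(ddof=1) / x.std(ddof=1)      # slope: a1 = r * sy / sx
a0 = y.mean() - a1 * x.mean()               # intercept: a0 = mean(y) - a1 * mean(x)

y_hat = a0 + a1 * x                         # predicted values
residuals = y - y_hat                       # plot these against x as a check
r_squared = r ** 2                          # variability explained by the line
print(round(a1, 2), round(a0, 2), round(r_squared, 2))
```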
To clarify the concepts and the statistics we mentioned in this section, we will
consider the following example. A sample of data concerning white stork populations
in some European countries along with their corresponding human populations for
the period between 1980 and 1990 is compiled in Table 4.318 (first three columns
ordered in increasing order of stork pairs). The means and standard deviations are also
displayed. By developing a scatter plot of the available data (Figure 4.13), we can see a
trend emerging in the form of a positive association. The correlation coefficient and
the regression line coefficients can be calculated by applying the formulas we presented
in this section and produce: r = 0.85 (an indication of a strong correlation as it is close
to 1), and a1 = 1412.4, a0 = 9,000,000 for the regression line.
Figure 4.14 displays the regression line (red color) with the predicted values
(red dots) for each data value (blue dots) and the residuals (black line segments). With
the given value of r = 0.85 we can calculate R-squared as R2 = 0.73 (rounded to two
decimal places). An interpretation of this value is that around 73% of the observed
variation in the data points can be accounted for by our model (regression line), while
17 SPSS: Analyze => Regression => Linear => Save => Unstandardized (x axis),
Standardized (y axis) = Chart Builder
18 Source: Mathews, R. (2000). Storks Deliver Babies. Teaching Statistics, 22(2), 36–38. Wiley.
only the remaining 27% is unaccounted for. By plotting the residuals (shaded column
data in Table 4.3), we can see (Figure 4.15) that no specific form emerges and that
their distribution appears random. This is good news as it doesn’t suggest underlying
influences and supports the model's 73% share of explained variation (red line).
Based on the analysis of the data we had (Table 4.3), we can deduce, with a
relatively high degree of certainty (we will talk more about this in the next chapter),
that there is a correlation between the size of the human population and the stork
population. The more storks we have in an area, the more populated it is. Considering
that human migrations during the 80s when the data were collected were minor, one
could be tempted to deduce that the higher the number of storks the higher the birth
rate in the human settlements. Would that also mean the storks are responsible for the
increase in births? The reader can see how tempting it could be to extend correlation
to causation, only to realize that, unless the storks are the ones that bring the babies
(remember the familiar cartoons with storks delivering newborns), the correlation
we found cannot support such a causal claim with the evidence we have. A more
plausible explanation would involve another (hidden/lurking) variable, such as the
geographic area of each country in our data. The larger the country, the more
populated one would expect it to be in terms of both humans and storks.
The absence of multicollinearity requires that the independent variables are not strongly
correlated with each other, and the variables are also assumed to be measured without error
(an adequate sample size helps with the latter). Homoscedasticity will be indicated by the scatterplot
and ensures that the variance of the errors is the same across all levels of the independent
variable (standardized residuals will ideally be scattered randomly around the horizontal line).
Assumptions are required for all the tests we will discuss in this book but will be
left for the reader to explore unless, of course, they are critical for the understanding of
the material.
19 SPSS: Analyze => Nonparametric Tests => Legacy Dialogs => 2 Related Samples
20 SPSS: Analyze > Correlate > Bivariate > Spearman (uncheck Pearson)
Consider a grading scale with 10 as the lowest rank for a student and 100 as the highest, and a group of 8
students. Table 4.6 displays the two evaluation sets and the process of producing the two sums
of the ranks of the differences of the two grades (X1 and X2 are the grades of the
first and second instructor, respectively).
As before, the test statistic is the smallest of the two sums (this is 9 in our case).
The smaller this gets the stronger the difference between the two variables, and by
extension the effect of our intervention that produced the two sets of data. Values of
the statistics can be found (for the enquiring reader) on the Internet pre-calculated for
various sample sizes. In the example we presented here (Table 4.7), with a sample size
of 6, a paired-difference signed-rank table would give, for a significance level of 0.05, a critical value of 1,
meaning that only if we observed a test statistic (smallest sum) of 1 or below would
the two variables show a significant difference between them. An assumption that
Wilcoxon's Signed Rank test makes is that the distribution of the differences between the
two variables (third column in Table 4.7) is symmetrical in shape.
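Statistical libraries implement the signed-rank test directly, which is convenient for checking hand calculations like the one above; the grades in the sketch below are hypothetical and not those of Table 4.6:

```python
# A minimal sketch (hypothetical grades) of Wilcoxon's Signed Rank test.
from scipy import stats

grades_1 = [72, 85, 60, 90, 78, 66, 83, 70]   # first instructor
grades_2 = [75, 80, 68, 92, 74, 70, 88, 77]   # second instructor

# The statistic reported is the smaller of the two rank sums of the differences.
statistic, p_value = stats.wilcoxon(grades_1, grades_2)
print(statistic, round(p_value, 3))
```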
As a final case, we need to consider the situation when one of the variables is
non-parametric while the other has a normal-like distribution. Given that non-
parametric variables need the special treatment we presented here, and while there
might be methods in the literature for this type of situation, it is suggested that
we ignore the normality of the one variable and treat them all as non-parametric. In
this way, everything we have presented in this section can be applied. In the case where
the non-parametric variable displays distinct modes (like it is bimodal), a possible
treatment would be to break the data set in two. This case will be dealt with when we
discuss the two samples situation.
21 SPSS: Analyze => Compare Means => One way ANOVA or SPSS: Analyze =>
Compare Means => Means => Option => check ANOVA Tables
ANOVA compares the variation within the variables (which pairwise t-tests would capture)
with the variation between the variables. The corresponding test is called the F-test
and its statistic is the F-ratio, expressed as the ratio of the variance
between/across the variables over the variance within the variables.
One condition that we need to consider when applying ANOVA is that it
assumes sphericity. This means that the variances of the differences between all group
combinations are near equal. If a group is an outlier, it will distort the results we will
get. To avoid such a possibility, Mauchly's test can be applied. This test is based on
the hypothesis that the variances are equal and produces a metric to support or reject
that possibility. For a significance level of 0.05, if the test produces a significance value above 0.05
then we can conclude that sphericity is preserved.
Let us demonstrate the ANOVA case through an example of a randomized
sample with N = 9 individuals/entries, each one represented by three (n = 3) variables
x1, x2, x3 (Table 4.8). The variability between variables is measured by considering the
mean of the means (yellow cell) of the three variables and seeing how each individual
mean varies from it. In essence we are treating each variable as having the same value
(the mean) across all 9 of its entries. The process of calculating the variance between
the means (left side of Table 4.8 below yellow cell) is exactly the same as we did for
the population in the previous chapter. We will end up with three means that will be
treated as a special sample of n = 3 entries/values. As we see in Table 4.8, the
between-variables variation in this case (VarBetween) is 355.7.
For the within each variable variation we proceed by calculating the sums of
the squares of the difference of each variable value from its corresponding mean (as
we did with the variance calculation). This process will result in one sum for each
variable. We will treat these sums as a separate set of values and will calculate their
mean by dividing with (N-1)*n = 24 as the degrees of freedom of this special sample
(justification can be found in the literature). As Table 4.8 shows (right side), the within
variables variation (VarWithin) is 113. In order to calculate the F-ratio we need to
divide VarBetween by VarWithin. Eventually we get F = 3.144. By looking at an F
distribution table (Figure 3.33) for k1 = 2 and k2 = 24, we find the critical value 3.403 for the
F-statistic. This means that our statistic (3.144) is smaller than the corresponding critical value,
so the means of the 3 variables of our sample do not show any significant difference
between them (they are as typical as 95% of the population).
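The same F-ratio logic (and the 3.403 critical value for k1 = 2 and k2 = 24) can be reproduced in software; the sketch below uses hypothetical values for the three variables rather than the entries of Table 4.8:

```python
# A minimal sketch (hypothetical data) of the one-way ANOVA F-ratio.
import numpy as np
from scipy import stats

x1 = np.array([23, 25, 28, 30, 22, 27, 26, 29, 24])
x2 = np.array([31, 35, 29, 38, 33, 36, 30, 34, 32])
x3 = np.array([27, 26, 31, 29, 25, 30, 28, 32, 27])

f_ratio, p_value = stats.f_oneway(x1, x2, x3)   # VarBetween / VarWithin
critical = stats.f.ppf(0.95, dfn=2, dfd=24)     # ≈ 3.40 for k1 = 2, k2 = 24
print(round(f_ratio, 3), round(p_value, 3), round(critical, 3))
```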
• Two-way ANOVA: This is like ANOVA but with two factors. In the
case of the university students, apart from studying GPA throughout
the 4 years we might also be interested in studying it across genders.
The initial 4 groupings of the GPA values (one per study year) will now
be split further by gender, resulting in 8 groups whose means are compared.
• Two-way MANOVA: This is like the two-way ANOVA but with two
continuous variables in the dependent role. In our university student
example, we might be interested in looking at the student GPA in the
general education courses (first dependent variable) and their GPA in
the specialization courses (second dependent variable) while we factor
for student level (freshman, sophomore, junior, senior) and gender
(male, female).
With many predictors it is also difficult to identify the type of relationship between each variable xi and y (it could be linear,
quadratic, or something else, even when the corresponding coefficient ai is close to
zero). What we see in multiple regression is the combined effect of all x variables and
not their individual contribution.
As we did with the case of the simple regression, we can evaluate the strength
of the (multi) correlation by using residuals. The symbol we use for this statistic is R2
(the same one as for the simple regression), but in this case, it is called adjusted R-
squared. This is because as we add more independent/predictor variables xi the plain R2
can only increase (the sum of squared residuals can only decrease), even when the added
variables contribute little, so an adjustment for the number of predictors is needed. Unlike the normal regression model, the adjusted R2 no
longer suggests the fraction of variability accounted for by the model (it is not even
contained between 0 and 100% as the normal R2), so we need to be extra careful when
interpreting it. We should also complement the interpretation by inspecting the
scatterplots of the various xi with the y.
An alternative to multiple regression that is of interest when control or
mediator variables are involved is partial correlation22. While it is essentially a
correlation between two variables, the fact that there might be interference from others
leads to its consideration here as a case of many normal-like scale variables.
Presuming all the conditions of this category apply (our variables exhibit a linear
relationship, they are continuous and have a normal like distribution with no
significant outliers) partial correlation allows for measuring the strength of the
relationship while controlling for the interference of other variables. The process
begins by performing correlations of all the combinations of the variables we have
(dependent, independent, and suspected control/moderators/mediators). The partial
correlation coefficient is then calculated as a combination of the individual
correlations. In the case of the correlation of variables x and y where we suspect
influences from another variable z the partial correlation coefficient of x and y when
controlling for z is given by the formula:

rxy.z = (rxy − rxz·ryz) / √[(1 − rxz²)·(1 − ryz²)]

Consider, for example, the relationship between the overtime employees put in, their work performance, and
their fear of being fired. Let us suppose that various instruments measured all 3
variables for an organization’s workforce, and we got correlation coefficients rop = 0.2
for overtime versus performance, rof = 0.8 for overtime versus fear of being fired, and
rpf = -0.4 for performance versus fear of being fired. If we only studied the correlation
between overtime and work performance (small rop) it would appear that there is no
correlation between the two while logic would suggest that the more time and effort
one invests on job related tasks the better they will become. However, if we were to
consider the influence (control in statistical lingo) of the fear of being fired we can see
that the greater the fear of being fired the more overtime employees put in (large rof) and
that the more the fear of being fired paralyzes them the less they perform (negative
rpf). If we were to remove the factor of fear by applying the partial correlation formula
we get rop.f = 0.95, which clearly supports what we would normally expect: overtime
improves performance.
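The 0.95 value can be verified with a few lines of Python that simply apply the partial correlation formula to the three coefficients quoted above:

```python
# Verifying the partial correlation r_op.f from the quoted coefficients.
import math

r_op = 0.2    # overtime vs performance
r_of = 0.8    # overtime vs fear of being fired
r_pf = -0.4   # performance vs fear of being fired

r_op_f = (r_op - r_of * r_pf) / math.sqrt((1 - r_of ** 2) * (1 - r_pf ** 2))
print(round(r_op_f, 2))   # 0.95, as in the text
```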
The Kruskal-Wallis statistic is computed as

H = [12 / (N(N + 1))] · (S1²/N1 + S2²/N2 + … + Sk²/Nk) − 3(N + 1)

where Ni and Si are the value count and the sum of the ranks of the ith variable after
we order all the values together (similar to the Wilcoxon process), and N is the total of all value counts (N
= N1 + N2 + … + Nk). Let us consider as an example the data for three variables
in Table 4.9. After we rank-order them (first column in Table 4.10) we calculate the
sums of the rank orders of each variable. Note that when multiple entries of the same
value exist across variables (like 5, 6, and 8, which appear twice in Table 4.10), the average
of their ranks is assigned instead. By applying the formula for H (for N1 = 6, N2 = 7,
N3 = 8, and N = 21) we get H = 5.22.
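Statistical libraries compute H (with a correction for tied ranks) directly; the sketch below uses hypothetical data with the same group sizes rather than the values of Table 4.9:

```python
# A minimal sketch (hypothetical data) of the Kruskal-Wallis H test.
from scipy import stats

x1 = [12, 15, 14, 10, 18, 11]             # N1 = 6 values
x2 = [22, 19, 25, 21, 24, 20, 23]         # N2 = 7 values
x3 = [16, 13, 17, 15, 14, 19, 12, 18]     # N3 = 8 values

h, p_value = stats.kruskal(x1, x2, x3)    # ranks all values together
critical = stats.chi2.ppf(0.95, df=2)     # ≈ 5.99 for k - 1 = 2 df
print(round(h, 2), round(p_value, 3), round(critical, 2))
```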
For the interpretation of the H statistic we need to also consider the variability
(shape of frequency curves) of the variables we study. If they have similar shapes
(Figure 4.17.a) then the test can provide a realistic comparison of the medians for the
different groups. However, if the distributions have different shapes (Figure 4.17.b)
then the test can only be used to compare mean ranks/orders (instead of the means
of the variables). In the case of our example the statistic for the corresponding k − 1
degrees of freedom (2 in our case) falls, in the table of Figure 3.27, between
the 0.1 and 0.05 probabilities. This means that if we are interested in differences as rare
as 10% of the population then our sample does belong in this category (something has
happened in some of the variables that drastically differentiated them from the others;
maybe a drug was more effective). If, on the other hand, we are interested in
differences as rare as 5% of the population then our sample does not qualify as such.
Depending on the shape of the variable distributions, by "differences" in the previous
sentences we refer either to the means (Figure 4.17.a) or to the means of the ranks
(Figure 4.17.b) of each variable.
Friedman's statistic takes the form

Fr = [12 / (N·k·(k + 1))] · (S1² + S2² + … + Sk²) − 3·N·(k + 1)

where N is the sample size, k is the number of variables we test, and Si is the
sum of the ranks of the ith variable's values after we order them (similar to the Wilcoxon process).
Consider as an example the data in Table 4.9 but without the last value of X2 (-4) and
the last two values of X3 (8 and 1) so that all variables have an equal number of entries. After
following the same ranking process as for the H-statistic and applying Friedman's formula,
23SPSS: Analyze => Nonparametric => Related Samples => follow up with Graph
Builder => Boxplots
we will get Fr = 1674.8. This statistic is by far higher than the corresponding values
for k-1 degrees of freedom (2 in our case) in the table of Figure 3.27, indicating that the medians of the
variables vary significantly. Apparently, there is something in the three sets of variables
(like maybe an effective drug treatment) that distinguishes them from each other.
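Friedman's test is also available in standard libraries; note that the implementation below ranks the k related measurements within each individual (the textbook-standard procedure), so its output is not directly comparable to a hand calculation that ranks all values together. The data are hypothetical:

```python
# A minimal sketch (hypothetical repeated measurements) of Friedman's test.
from scipy import stats

x1 = [12, 15, 14, 10, 18, 11]   # three measurements of the same 6 subjects
x2 = [22, 19, 25, 21, 24, 20]
x3 = [16, 13, 17, 15, 14, 19]

fr, p_value = stats.friedmanchisquare(x1, x2, x3)   # ranks within each subject
print(round(fr, 2), round(p_value, 3))
```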
The final case of non-parametric variables we will discuss here is dichotomous
variables. These are variables that can take only one of two values, like heads or tails,
yes or no, "0" or "1", success or failure, etc., implying that the binomial distribution
will be involved. The suggested method in such cases is binomial logistic regression
or simply logistic regression24. We apply this method when the dichotomous variable
is in the role of the dependent variable, while the independent variables can be normal-
like or non-parametric. In that sense, it is like the mixed case of variable types. In
logistic regression, instead of predicting y from the values of x, we predict the
probability of y occurring. For that we transform our original scale values to their
logarithms (same thing we did when we applied the log transform to make data
normal). This allows us to treat the values as linear without being affected by the non-
linearity of our variables. The logistic regression equation takes the form:

P(y) = 1 / (1 + e^−(a0 + a1x1 + a2x2 + … + anxn))
Given that the results of the equation are probabilities, their values will be
between 0 and 1. A value close to zero means that the y value is unlikely to have
occurred, while the opposite is true the closer we get to 1. The statistical software we
will use is going to produce the values of the coefficients using a maximum-likelihood
estimation. As with the normal regression where we used R2 to evaluate the closeness
of our predicted values to the observed ones, we use a specialized statistic here called
log likelihood, given by the formula:

LL = Σ [ yi·ln(P(yi)) + (1 − yi)·ln(1 − P(yi)) ]

It can be seen that the formula is based on sums that compare the predicted
probabilities with the actual outcomes and, as with R2, it is an indicator of the variance that the model
24 SPSS: Analyze => Regression => Linear =>…one dependent, many independent
explains. Oftentimes we use the log likelihood to test different models. One such
model (assumed as the baseline model) is when only the constant a0 of the polynomial
is considered and all other coefficients are zero. By evaluating the differences between
the suggested and the baseline model we get the chi-square metric:

χ2 = 2·[LL(suggested model) − LL(baseline model)]
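As an illustration (not the book's example), the sketch below fits a logistic regression on hypothetical data and reports the log likelihoods of the suggested and baseline models together with the resulting chi-square:

```python
# A minimal sketch (hypothetical data) of logistic regression with its
# log likelihood, baseline (constant-only) log likelihood, and chi-square.
import numpy as np
import statsmodels.api as sm

x = np.array([1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0])
y = np.array([0,   0,   0,   0,   1,   0,   1,   1,   1,   1])   # dichotomous

model = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
print(model.params)      # a0 and a1 on the log-odds scale
print(model.llf)         # log likelihood of the suggested model
print(model.llnull)      # log likelihood of the baseline model
print(model.llr)         # chi-square = 2 * (llf - llnull)
print(model.predict())   # predicted probabilities, all between 0 and 1
```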
25SPSS: Analyze => Non Parametric => One Sample => Fields => Settings =>
Chi-square
The process of building the metric (Table 4.11) is similar to what we did with
continuous variables by considering their residuals, that is the differences between
what we observe in our sample from what we expect them to be in the population
according to the model we adopted. To avoid cancellation of the differences when we
sum the residuals (some are negative and some are positive), we square each one.
Because these squares become larger the more values we have, it is best at this point
to use relative differences, so we divide each squared residual by its corresponding
expected value. The sum of these final
values (referred to as components) is the statistic we call chi-square (symbolized as
χ2) and its distribution follows the chi-square distribution we saw in section 3.3.2.
Coming back to our example values of Table 4.11, we will assume the simplest
possible model and that is that every color in our list has equal probability for
appearance in the preference list. Given that there are 9 categories/colors (we consider
"Other" as a distinct category), an equal probability distribution would expect each
color to be chosen by 1/9 of the sample of 500, or 55.56 respondents. Table 4.11 outlines (from
left to right) the process of building the chi-square statistic according to the description
provided in the previous paragraph.
As we can see in the car color case, χ2 = 236.2. Considering the degrees of
freedom k as equal to the number of entries minus one (N-1 will be 8 in our case), we
can see from the chi-square values of the table in Figure 3.28 that our value (236.2) is
higher than any of the entries in the table row with k = 8, indicating that the probability
of our sample approaching the model (equal probabilities for all colors) is less than
1‰. In other words, there is no real chance (at least higher than 1‰) based on our
findings that the various car colors are chosen with equal probabilities. This is quite
realistic as for example, we rarely, if ever, see any pink cars on the street.
Table 4.11 Nominal data for car color example
Car Color   Observed Frequency   Expected Frequency   Residual (Obs-Exp)   Residual² (Obs-Exp)²   Component (Obs-Exp)²/Exp
White       115                  55.56                59.44                3533.64                63.61
Silver      90                   55.56                34.44                1186.42                21.36
Black       105                  55.56                49.44                2444.75                44.01
Gray        70                   55.56                14.44                208.64                 3.76
Blue        30                   55.56                -25.56               653.09                 11.76
Red         40                   55.56                -15.56               241.98                 4.36
Brown       30                   55.56                -25.56               653.09                 11.76
Green       5                    55.56                -50.56               2555.86                46.01
Other       15                   55.56                -40.56               1644.75                29.61
SUM         500.00               500.00               0.00                 13122.22               236.20
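The chi-square value of Table 4.11 can be reproduced with a couple of lines of Python:

```python
# Reproducing the chi-square statistic of Table 4.11.
from scipy import stats

observed = [115, 90, 105, 70, 30, 40, 30, 5, 15]   # observed frequencies
expected = [500 / 9] * 9                           # equal-probability model

chi2, p_value = stats.chisquare(observed, f_exp=expected)
print(round(chi2, 1), p_value)   # ≈ 236.2 and a p value far below 0.001
```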
26 SPSS: Analyze => Descriptive Statistics => Crosstabs => Statistics: Select Chi-square, Cramer's V, and Phi
We will see now what we can do when we have two variables that could
potentially relate to each other. Notice we are avoiding the word correlate here. Our
variables are nominal and as such they have no ordering so there is no sense that one
increases or decreases as we had in correlation and regression. To demonstrate the
case, we will use the same example of the car color but only for the top three popular
colors. We will consider an additional variable that will be the geographic location. The
attributes of this variable will include North America, Europe, Asia-Pacific, and rest
of the world as a single category. Table 4.13 (ignoring the yellow cells) includes the
data for our car color example spread among four car color options (White, Silver,
Black, and Other) and four geographic locations. Such tables are generally called
contingency tables or cross-tabulations27. They are quite popular in categorical data
analysis and typically display the frequency of observations along with the
corresponding percentage of each entry with respect to its group (yellow highlights).
For the color Silver, for example, the total is 90 so the percentage of people from
North America who prefer Silver will be given by 14/90 or 15.6%. It goes like this for
the rest of the entries, except the sums, where the reference total is our sample
size of 500. For the sum of the color Silver, its percentage of the sample total (500)
will be 90/500 or 18%, and so on for the other color sums. Typically, contingency
tables are displayed with their corresponding bar chart (Figure 4.18).
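A chi-square test of independence on such a cross-tabulation can be sketched as follows; apart from the color totals implied by the text and the Silver/North America count of 14 mentioned above, the cell counts are hypothetical:

```python
# A minimal sketch of the chi-square test of independence on a cross-tabulation.
# Only the color totals and the Silver/North America count of 14 come from the
# text; the remaining cell counts are hypothetical.
import numpy as np
from scipy import stats

#                  N. America  Europe  Asia-Pacific  Rest of world
table = np.array([[40,         35,     25,           15],    # White
                  [14,         30,     28,           18],    # Silver
                  [30,         25,     35,           15],    # Black
                  [60,         50,     45,           35]])   # Other

chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(round(chi2, 2), round(p_value, 4), dof)
```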
27 SPSS: Analyze => Descriptive Statistics => Crosstabs
The standardized residuals are the square roots of the components with the sign of the subtraction
(observed − expected) preserved. The standardized residuals of our example are shown
in Table 4.17. The standardization process results in expressing the distance in terms
of standard deviations. As such, the close to zero values in our table suggest closeness
to the mean and support for the findings of the chi-square test. Another case for the
chi-square test will appear in section 4.3 for two and many samples with the only
difference being the name. In those cases, it will be called chi-square test for
homogeneity. This is mentioned here for completeness purposes.
When the values of the cells in cross-tabulation become large (into thousands),
a better alternative to the chi-square test of independence (as in the case of the chi-
square goodness of fit test) is the G-test of independence. The math is similar to the
G-test of goodness of fit with the only difference being that the expected frequencies
are calculated based on the observed frequencies (similar to the chi-square test of
independence). Considering the values of Table 4.14 and Table 4.15, the G-test
process is displayed in Table 4.18. As a last step we need to multiply the final sum
(7.766) by 2 to get the statistic. We end up with G = 15.531, which is almost the same
as the χ2 value of 15.613 we got before.
Table 4.18 Observed*ln(Observed/Expected)
Car Color   North America   Europe    Asia-Pacific   Rest of the World   Sum
White        6.074           4.953     1.836         -10.107             2.756
Silver      -3.236          -4.488    -4.360          16.676             4.592
Black       -0.572           2.729     0.805          -2.695             0.268
Other       -1.219          -1.719     2.707           0.381             0.150
G                                                                         7.766
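Using the same hypothetical cross-tabulation as in the earlier sketch, the G statistic can be computed directly from its definition:

```python
# A minimal sketch of the G statistic, G = 2 * sum(O * ln(O / E)), on the same
# hypothetical cross-tabulation used in the chi-square sketch above.
import numpy as np

observed = np.array([[40.0, 35.0, 25.0, 15.0],
                     [14.0, 30.0, 28.0, 18.0],
                     [30.0, 25.0, 35.0, 15.0],
                     [60.0, 50.0, 45.0, 35.0]])

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()   # as in the chi-square test

g = 2 * np.sum(observed * np.log(observed / expected))
print(round(g, 3))   # close to the chi-square value for the same table
```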
While the chi-square and G-tests we mentioned here pretty much cover all
possible situations when two nominal variables are involved, there is the special case
when the two variables are paired where the McNemar test might be a better
alternative. This pairing is expressed as a dichotomous variable in the role of the
dependent variable like when we measure the impact of an intervention (“low” and
“high” or “success” and “failure”) compared to an alternative. For example, consider
that we have two promotional campaigns for a product — one with just the product
and the other with the product and an offer. We want to test to see if there are
differences in their effectiveness. Let’s assume we have 200 participants for our study.
We create 100 pairs and we expose one of the members in the pair to the plain product
campaign and the other to the product plus offer campaign. The participants record
with “Yes” or “No” whether they liked the campaign they were exposed to. Their data
are displayed in Table 4.19.
McNemar’s formula focuses only on the cells with one “Yes” and one “No”
(yellow highlights) and considers the square of their difference over their sum. In our
case it takes the form (54 − 48)² / (54 + 48), resulting in 0.35 for its statistic. Looking at
the chi-square table of Figure 3.28 with (2-1)*(2-1) or 1 degree of freedom, we see that
the statistic corresponds to a p value of around 0.65. This is nowhere near significance, so we can
safely conclude there is no significant difference between the two campaigns with
respect to customer choices.
Table 4.19 McNemar test data
                           Product and Offer Campaign
                           Yes      No
Product Campaign    Yes    65       54
                    No     48       33
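The McNemar statistic of the example follows directly from the two discordant cells:

```python
# Reproducing McNemar's statistic from the two discordant cells of Table 4.19.
b = 54   # pairs where the product-only member said "Yes", the offer member "No"
c = 48   # pairs where the product-only member said "No", the offer member "Yes"

statistic = (b - c) ** 2 / (b + c)
print(round(statistic, 2))   # 0.35, far below the 3.84 critical chi-square
                             # value for 1 degree of freedom at the 0.05 level
```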
A problem with chi-square that was not mentioned before is that it does not
handle well situations where the cell entries are small (typically below 5). In these cases,
a better alternative is Fisher's exact test of independence28. The hypothesis being
tested is that the variables are independent (one does not suggest
the other). Unlike most tests that develop a mathematical formula for calculating a
statistic, we calculate here the probability of getting the observed data given all possible
values the variables can take.
Consider the situation of Table 4.20 with the values of two nominal variables.
The first variable has two attributes (Attr.11 and Attr.12) and so does the second
(Attr.21 and Attr.22). n11, n12, n21, and n22 are the observed frequencies, while the
rest of the value entries include the corresponding sums. The probability that Fisher's
test calculates is given by considering the factorials (remember that n! = 1·2·3·…·n) in
the formula:

p = (R1!·R2!·C1!·C2!) / (N!·n11!·n12!·n21!·n22!)

where R1 and R2 are the row totals, C1 and C2 are the column totals, and N is the overall total.
28 SPSS: Analyze => Descriptive Statistics => Crosstabs => Exact => Asymptotic
This will give us the p values for the corresponding arrangement of values
(Table 4.20). By creating all possible arrangements of values, we end up with a
distribution of probabilities similar to what we achieved when we considered the
example of the dice in section 3.3.
To illustrate the application of the method let us consider the car color
example with some rare car colors (pink and yellow) for North America and Europe
only. Table 4.21 shows the frequencies for these colors for the two regions we study
along with the sums of the corresponding rows and columns. By applying Fisher’s
formula with the factorials, we get p = 0.244. We need to compare this with all possible
arrangements (permutations) of the cell values that exist that also preserve the values
of the totals for each row and column. One easy way to find out how many there are
is to consider the lowest marginal sum (in this case 6), start with the lowest-frequency
cell (in this case the Pink color for Europe) by assigning it the
maximum allowable value (6 in this case), and create all possible arrangements by
reducing it by one until it becomes zero (yellow highlights). All possible and allowable
(first and second line sums result in 6 and 12 respectively, and first and second column
sums result in 9) permutations are listed in Table 4.22 with their corresponding p
values on the side.
If we now plot the p values we got for the various arrangements (like we did
for Figures 3.18 and 3.19) we end up with the probability distribution of the Fisher's
test values (Figure 4.19). With the p value of the observed arrangement (Table 4.21,
also replicated as arrangement 5 in Table 4.22) we can calculate the area under the
curve for the one-tailed (not recommended) or two-tailed (preferable for Fisher's test)
probability by simply adding the probabilities we found. In this way, we can say that the
probability of observing an arrangement of frequencies as rare as or rarer than the one we have
is 0.31 (adding the p values of arrangements 5, 6, and 7). For the two-tailed case (preferred) we
can either double that value or add the symmetric regions under the curve (Figure
4.19).
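For reference, the same one-tailed and two-tailed probabilities can be obtained in software. The cell values below are an assumed reconstruction based on the marginal totals (row sums 6 and 12, column sums 9 and 9) and the reported p = 0.244 for the observed arrangement:

```python
# A sketch of Fisher's exact test.  The cell values are an assumed
# reconstruction consistent with the marginal totals and the quoted p = 0.244.
from scipy import stats

table = [[4, 2],   # Pink:   North America, Europe
         [5, 7]]   # Yellow: North America, Europe

_, p_one_tail = stats.fisher_exact(table, alternative="greater")
_, p_two_tail = stats.fisher_exact(table, alternative="two-sided")
print(round(p_one_tail, 2), round(p_two_tail, 2))   # ≈ 0.31 and ≈ 0.62
```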
The final subject we will deal with in the nominal variables case is the multi-
variables situation (3 and above). We will discuss here the 3 variables case and leave it
for the reader to extrapolate to more variables. As one would expect, what was
developed in the previous paragraphs can be applied here, so performing chi-square
goodness of fit or G-test is applicable exactly as we did before. Some interesting
alternatives, though, of the multi-nominal variable situation are worth investigating
and will be discussed here.
One such alternative is when we study sections of the data in isolation from the
effect of some of the variables. In essence, we reduce the dimensionality of our tables
by “ignoring” some variables. As an example, for the 3 nominal variables we will
consider the car color example we’ve seen before (Table 4.13) but with the addition of
gender as another variable along with car color and region. Our contingency table will
now take the form of Table 4.23. If these were our original data instead of the data we
used up to now, then Table 4.13 was nothing more than a reduction of the three-
dimensional Table 4.23 (with variables car color, region, and gender) to a two-
dimensional Table 4.13 (with variables car color and region). Such reduced tables are
called partial tables.
The advantage of using partial tables is that it reduces the problem of studying
a higher-dimensional problem (three dimensions in Table 4.23) to a lower-dimensional
problem (two in Tables 4.23.a, 4.23.b, and 4.24). This way we can apply whatever tests
we had available for the two-dimensional tables like chi-square test and G-test. If the
tests show that the values in any of the tables are “rare” enough we can conclude there
is potentially some association between color, region, and gender that could be further
investigated.
Caution is required here as we might occasionally run into what is known as
Simpson’s Paradox. This arises when the partial tables support an association in one
direction while the marginal table supports an association in the opposite direction.
Consider the example of 789 individuals (515 male and 274 female) who are asked to
choose between two rare car color choices, Sarcoline and Mikado. The partial tables
for male and female participants are shown in Tables 4.25.a and 4.25.b, while the
marginal table (the sums) is shown in Table 4.26.
The aforementioned tables also list the relative percentages of the entries with
respect to the geographic region. From Table 4.25.a we can see that the ratio of North
Americans over Europeans that prefer Sarcoline is smaller than the corresponding
ratio for Mikado. If we look at Table 4.25.b, we see that the same is true for the female
population. Naturally, one would expect that when we create the marginal table from
the addition of the values of the male and female tables we would observe the same
pattern. Surprisingly though, we can see from Table 4.26 that the ratio of North
Americans over Europeans who prefer Sarcoline is higher than the corresponding
ratio for Mikado. The example here is meant to showcase the Simpson’s Paradox and
alert researchers to be careful when drawing conclusions from the marginal tables.
Having seen the case of partial and marginal tables we will discuss now an
analog to the regression we saw with scale variables. This is possible in situations where
the dependent variable is a scale variable (the cell frequency) and the independent
variables are categorical. In this case, we would want to do something similar to ANOVA
or multiple regression, but instead of the continuous variables x we will now have the nominal
variables, something like Frequency = a0 + a1Color + a2Region + a3Gender for the multiple
regression case of Table 4.23. We usually include an error term also in such equations
but for simplicity we will ignore it here.
In order to include influences of individual variables and cross-influences
between variables the general linear model for the 3-variables case becomes:
Frequency =a0+a1Color+a2Region+a3Gender
+a4Color*Region+a5Color*Gender+a6Region*Gender
+a7Color*Region*Gender
The products between the variables will attempt to capture interaction effects
between them. The problem with the aforementioned regression equation is that our
variables cannot meaningfully be expressed as continuous variables like, say, White = 1, Silver = 2,
Black = 3, and Other = 4, and similarly for Region and Gender. Given that the different
categories are mutually exclusive (in each cell in Table 4.23 there is only one color and
not multiple ones), we can consider binary representations for the existence of a color
(indicated with the value of 1) and its absence (indicated with the value of 0). The
numeric variables of this type are called dummy variables.
For the car color White we set it as 1 when there are no other colors and 0
when any of the other colors exist. We do the same for Silver, Black, and Other. We
treat the regions and gender similarly. Table 4.27 shows all the possible combinations
of values that exist for the 3 variables.
Frequency = a0+a1White+a2Silver+a3Black+a4Other
+a5Male+a6Female
+a7NorthAmerica +a8Europe +a9AsiaPacific+a10RestOfTheWorld
+a11WhiteMale + ……all possible interaction terms
Because we are considering linear relationships we need to use the natural
logarithms of the frequencies instead of the actual values (like in logistic regression).
When considering prediction, we just insert the appropriate dummy-variable values for the
case we are interested in into the regression equation to get the predicted frequencies. For
example, if we are interested in the car color Black in Europe for Females then
according to the dummy variables representation (Table 4.27) for each entry the
regression equation will become:
ln(Frequency) = a0+a1*0+a2*0+a3*1+a4*0
+a5*0+a6*1
+a7*0+a8*1 +a9*0+a10*0
+a11*0*0 + ……all possible interaction terms
Apparently, the only terms that contribute are the ones that include Black,
Europe, and Females simply because they are 1 while everything else is zero. To avoid
collinearities that the constant term might introduce as it never disappears, we usually
consider dropping one of the attributes (usually the one with the most values). This
way the constant term can be set to represent the influence of the category we dropped.
It should be evident by now that the situation can increase in complication the more
variables we consider. This should not be a problem since this is dealt with by most
statistical software packages. The interested reader can find more on what we
discussed in the extant literature (Internet).
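Dummy coding of this kind is typically automated; the sketch below (hypothetical rows, not the book's data) shows one way to produce 0/1 columns, dropping one attribute per variable as discussed above:

```python
# A minimal sketch (hypothetical rows) of 0/1 dummy coding with pandas.
import pandas as pd

data = pd.DataFrame({
    "Color":  ["White", "Silver", "Black", "Other", "Black"],
    "Region": ["Europe", "NorthAmerica", "Europe", "AsiaPacific", "Europe"],
    "Gender": ["Female", "Male", "Female", "Male", "Male"],
})

# drop_first=True drops one attribute per variable so that the constant term
# of the regression represents the dropped (reference) categories.
dummies = pd.get_dummies(data, drop_first=True)
print(dummies)
```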
Expanding on the previous discussion it would be interesting to see what
happens in the case where we have a mix of nominal and scale variables. The regression
equation we just described can be extended in that case to include the scale variables and
potential combinations with the nominal variables in the generic form:
Frequency = a0 + a1Nominal1 + a2Nominal2 + a3Nominal1*Nominal2 + a4X1 +
a5X2 + a6X1*X2 + a7Nominal1*X1 + …
While the previous method can cover every possible situation when nominal
variables are involved, there is the special case of one nominal dependent variable and
multiple independent variables that are all dichotomous. In such cases Cochran's Q
test is the appropriate choice and we will discuss it here. In the case of the car color
example we might be interested to find out how a particular fictitious car model (let’s
say SupperStat) will sell in the various colors in each geographic region. The answers
a sample provides are recorded as “Yes” or “No” and entered in Table 4.28 as “1” or
“0”, respectively.
Cochran’s Q is computed by the formula (k-1)*[k*(S12+ S22+ S32+ S42) –(S1 +
S2 + S3 + S4)] / (k*S – S2), where k is the number of dichotomous variables (in our
case this is 4 — equal to the number of geographic regions). After completing the
calculations, we get Q = 44.6. Considering this as our chi-square statistic or comparing
it with the chi-square statistic for the p value we might be interested in, we can make
assertions about the rareness of our observations. With our example’s degrees of
freedom (k-1) = 3 we can see that the Q value is higher than any of the chi-square
values in the table of Figure 3.28, indicating that the observed values are rarer than
even 1‰. The different geographic regions do not seem to behave the same way,
at least with respect to the color popularity of the SupperStat car.
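A sketch of the Q calculation, written out from the standard form of the formula and using hypothetical 0/1 answers (not those of Table 4.28), is shown below:

```python
# A minimal sketch (hypothetical 0/1 answers) of Cochran's Q for k = 4 related
# dichotomous variables, written out from the standard form of the formula.
import numpy as np
from scipy import stats

answers = np.array([[1, 1, 0, 0],      # rows = respondents
                    [1, 0, 0, 0],      # columns = the four regions
                    [1, 1, 1, 0],
                    [0, 1, 0, 0],
                    [1, 1, 0, 1],
                    [1, 0, 0, 0],
                    [1, 1, 1, 0],
                    [1, 0, 0, 0]])

k = answers.shape[1]
col_sums = answers.sum(axis=0)         # S1 ... S4 ("Yes" counts per region)
row_sums = answers.sum(axis=1)         # "Yes" counts per respondent
total = answers.sum()

q = (k - 1) * (k * np.sum(col_sums ** 2) - total ** 2) / (k * total - np.sum(row_sums ** 2))
p_value = stats.chi2.sf(q, df=k - 1)
print(round(q, 2), round(p_value, 3))
```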
For two samples we will again consider the one, two, and many variables cases for scale and nominal data types, and we will
base our analysis on the methods and statistics we developed in the previous sections.
As such, when we see variables in isolation from other variables, we can apply
everything that we did as if it was the one-sample case. The interest in our two-samples
case comes from comparing variables between two samples.
Let us first deal with the one scale variable that is normally distributed. We
have in this case two sets of values. It might come to mind that we faced a similar
situation when we dealt with two different variables in the one-sample situation. The
truth of the matter is that from the math point of view the two situations are identical
so one can apply everything that we did in the one-sample two-variables case in our
current two-samples one-variable case (Figure 4.21). So, if we are interested in
comparing the means of the two-samples variables that follow the normal distribution,
we just need to apply the independent t-test or if there is a one-to-one correspondence
between the variables the paired t-test. If our variables are non-parametric, Wilcoxon’s
Rank Sum for comparing the medians will be ideal, while if there is a one-to-one
correspondence between the variables the Wilcoxon’s Signed Rank might be applied.
For two variables in each sample we can apply ANOVA, or repeated-measures ANOVA
for one-to-one correspondence if the variables are normally distributed, and the Kruskal-
Wallis H-test for medians or Friedman's rank test for a one-to-one correspondence if
our variables are non-parametric. The same applies to the case of many variables. We
can choose to apply the method that suits our case as we see fit (Figure 4.21).
Things become a lot simpler in the case of two samples when nominal
variables are involved simply because we can consider “sample” as a nominal variable.
In our case of two samples it could be seen as a dichotomous variable since every value
in our data set will belong to one or the other sample. Our two-variables situation
becomes three variables with the introduction of the dichotomous variable sample.
This allows us to apply everything we mentioned in the one-sample many-variables
case here in our two-samples two-variables case. Extending this process, we can also
cover the situation of the two-samples many-variables cases (Figure 4.21).
It should be evident by now that the many-samples cases would be treated
similarly by introducing the Sample (or Groups or Sets or Categories) dimension into
our analysis (Figure 4.22). For the case of many-samples with one variable, for
example, one can easily see that in essence we have two variables. One nominal that
represents the sample that each value comes from and another to represent the actual
values.
To demonstrate the introduction of the Sample variable we will use the data
of Table 4.8 copied here in Table 4.29.a in the form of 3-sample data. When
considering Sample as a categorical variable we can have our 3 sets of scale variables
organized as one “sample” set with one nominal and one scale variable, as shown in
Table 4.29.b. For the case of nominal data, we will consider the car color example data
of Table 4.13 copied here in Table 4.30.a in the form of 4-sample data. Table 4.30.b
shows the same data organized as a 1-sample data with two nominal variables.
The same process we followed for reducing everything to one sample can be
followed the other way around and convert categorical variables to multi-sample cases.
A set of values, for example, for a certain scale variable that were collected from a
sample population of men and women can be considered as two samples, one with
only the values that correspond to the men and one with the values that correspond
to the women. This will allow everything that we discussed in the case of two samples
to be applied here.
In most cases, it will be up to the researcher and the type of research conducted
to rule on a method’s appropriateness and suitability for the data available for analysis.
The methods we presented in the previous sections do not
exhaustively cover all available methods, but they cover to a great extent the data
analysis needs for most social sciences research. Chapter 6 will deal with advanced
methods for more demanding data analysis.
5 Hypothesis Testing
In a courtroom, for example, it is much easier to provide evidence that someone did something
than to prove that they did not do it (you need alibis, convincing explanations, etc.). Most juries will accept the former
much more easily than the latter.
The logic of forming hypotheses creates much confusion (it looks unnatural)
for new researchers so we will expand a little more through a couple of examples. Let’s
assume that we want to prove that a certain population like say graduate students
“hates” statistics (that is before reading this book) while it is “rumored” that most
students (say 95% of the population) “love” statistics (ignoring any intermediate
states). The latter assertion will represent the state of nature as it represents the
assertion made by the majority of the population and will form our null hypothesis
like Ho: Graduate students love statistics with the alternative Ha: Graduate students do not love
statistics. Which is more sound/"easier" to prove or disprove? In statistical logic, it is
more solid/final to disprove something than to prove it, as in the latter case there could
always be newer evidence/cases that disprove the assertion. Intuitively also, isn't
it easier to find people who hate statistics (reject the null hypothesis) than to find those who
love statistics (reject the alternative hypothesis)?
In another case, we might be interested in proving that we are not elephants.
Our set of hypotheses would then be Ho: I am an elephant and Ha: I am not an elephant
(Figure 5.1). Isn't it easier to disprove/attack the claim that we are elephants (null hypothesis Ho)
than to disprove the claim that we are not (alternative Ha)?
One keyword that might have gone unnoticed in the previous discussion is
“prove”. What do we mean by proving something? How certain are we of the proof
we provided? In quantitative research these questions are answered by deciding how
likely or unlikely the observed values are in profiling the population we study. To
answer these questions, we will introduce the concept of “significance”. This is a
"subjectively" established reference limit we set with respect to the statistic we use. Based
on it, we can compute the limits within which our prediction holds true for a certain
percentage of the population we study (95% in this case):
xL = 𝑥̅ - (1.96*SE) and xU = 𝑥̅ + (1.96*SE)
Let us demonstrate with an example how all this blends into supporting or
rejecting a null hypothesis. Let us assume that a car manufacturer assigns us to
investigate if a new car color they developed (we will call it StatCol) will appeal to
consumers in Quantland. The manufacturer is planning to go ahead with production
if the color appeals to at least 75% (p = 0.75) of the consumers. This number will then
form our null and alternative hypotheses as:
Ho: The proportion of the population that finds StatCol appealing is greater than p = 0.75.
Ha: The proportion of the population that finds StatCol appealing is not greater than p = 0.75.
We presented the color to a sample of 230 consumers and 161 of them
indicated they liked the color. Converting this to a proportion we get 𝑝̂ = 161/230 or
𝑝̂ = 0.7. This is below what the manufacturer expected so we would be tempted to
reject the null hypothesis if it wasn’t for a potential error that we need to consider.
With 𝑝̂ = 0.7 as our predicted mean we can calculate the standard deviation of the
sampling distribution of our sample size using formula 4.3. With q = 1-p or q = 0.3
and N = 230, formula 4.3 produces σ = 0.03 or SE = 0.03. Considering the formulas
for the limits we mentioned previously with x̅ = 0.7 we get:
xL = 0.7 – 1.96*0.03 or xL = 0.64 and xU = 0.7 + 1.96*0.03 or xU = 0.76
Because the hypothesized/manufacturer’s value (75%) is between these limits
we can conclude (with 95% certainty) the null hypothesis is not rejected. It could very
well be that in the actual population StatCol appeals to more than 75% of the people.
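The interval calculation above is easy to reproduce:

```python
# Reproducing the StatCol limits: p-hat = 161/230, SE = sqrt(p*q/N), 95% bounds.
import math

p_hat = 161 / 230            # observed proportion (0.7)
se = math.sqrt(p_hat * (1 - p_hat) / 230)
x_lower = p_hat - 1.96 * se  # ≈ 0.64
x_upper = p_hat + 1.96 * se  # ≈ 0.76
print(round(se, 2), round(x_lower, 2), round(x_upper, 2))
```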
While the previous analysis takes care of the error our statistics might have
produced, it still doesn't address the situation where the real population behaves
differently. This is due to the influence our sample might introduce, as it
might not have been an accurate representation of the population we
study. Our inferences are based on the evidence at hand and even in the best of
circumstances we can still make the wrong decision. In hypothesis testing, there are
two possibilities of producing the wrong conclusion (Table 5.1). Either the null
hypothesis is true in reality and we rejected it based on our evidence (sample) or it is
false and we accepted it (failed to reject it). These two types of errors are known as
Type I and Type II errors. Figure 5.2 depicts all possible scenarios of errors and no
errors for the one-tail case.
We will discuss now two examples to highlight the different error types. In
disease testing the null hypothesis is usually the assumption that there is no disease
(the person is healthy), while the alternative is that the person is sick (Table 5.2). Type
I error is an error in our conclusion to reject the null hypothesis. This means we
erroneously reject it by concluding that a person is sick when they are not actually sick
(false positive). As a result, we might impose some unnecessary treatment but other
than that no harm is done. When we make a Type II error though, we conclude the
person is not sick (we accept the null hypothesis) when they actually are. Obviously in
this case the Type II error is the worse of the two, as we will fail to treat a person who is sick.
In another situation let us consider the case of a jury coming up with a verdict
(Table 5.3). The null hypothesis is that the accused is not guilty. When we make a Type
I error we find the accused guilty when they are not, while when we make a Type II
error we find them innocent when they are not. In this case the Type I error might be
considered worse as innocent people might end up in jail. The two cases we mentioned
made it relatively easy to suggest which error type is worse. In many situations, the
decision is not so easy, and it really depends on our viewpoint and the research we are
conducting.
Table 5.2 Medical disease testing
Reality/Population   Evidence/Sample    Error
Healthy              Healthy            No Error
Healthy              Sick/Positive      Type I
Sick                 Healthy/Negative   Type II
Sick                 Sick               No Error
One way we can do this is by increasing the overall area under the curve, meaning we need to
increase the sample size.

A point of consideration with error types is the case of metrics that reflect
combinations of other metrics, like in the case of ANOVA where all possible t-tests
among the variables are used to produce the test statistic. A case like this is when we
take sample measurements at different times (repeated measures30). In such cases the
α value we choose might be adjusted to properly reflect the α values of the individually
contributing statistics. A typical such adjustment is Bonferroni's, and what it in essence
does is divide internally our α value by the number of comparisons (like t-tests) made.
The adjustment overall ensures that the possibility of a Type I error is reduced.
30 SPSS: Analyze => General Linear Model => Repeated Measures => …Enter the
number of times the dependent variable has been measured… if you need plots make sure time is on
the horizontal axis… => Options => Display Means for: time, Confidence interval
adjustment: Bonferroni.
The margin of error, that is, the standard deviation of the sampling distribution (Chapter 4), is given by the formula:

E = √(p̂ · q̂ / N)   (5.1)
Of course, we don't know the predicted p̂ because we don't have a sample yet,
but we can consider the worst-case scenario that maximizes the product p̂·q̂. Given that
q̂ = 1 − p̂, the product becomes p̂ − p̂². For someone who is familiar with the
quadratic equation (y = ax² + bx + c, with its extremum at x = -b/(2a)), the equation
y = p̂ − p̂² has a maximum at p̂ = 0.5. This also results in q̂ = 0.5, so the maximum
value of p̂·q̂ is 0.5*0.5 or 0.25. Substituting this value into formula (5.1) allows us to
solve for the sample size N needed to achieve a desired margin of error.
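A quick numerical check that the product p̂·q̂ indeed peaks at p̂ = 0.5:

```python
# A quick numerical check that p*q = p*(1 - p) is maximized at p = 0.5.
import numpy as np

p = np.linspace(0, 1, 101)
product = p * (1 - p)
print(p[np.argmax(product)], product.max())   # 0.5 and 0.25
```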
Both formulas express the number of cases they target as a fraction of the
total. Additional metrics that are popular in instrument development include the
positive predictive value (PPV) and negative predictive value (NPV), with TP, TN, FP, and FN
denoting the counts of true positives, true negatives, false positives, and false negatives, respectively:
PPV = TP/(TP+FP) and NPV = TN/(TN+FN)
All these formulas are valuable for instrument validity, but the subject goes
beyond the purposes of this book, so the reader will need to search for more in the
extant literature.
5.2 Reliability
The metrics mentioned in the previous section bring to attention the issue of
reliability when measurements are involved. Any instrument we develop to measure a
variable will be influenced by uncertainties in the measurements. These could
be due either to its inability to perfectly capture information or to the inability of the
source (mostly human participants in social sciences) to accurately communicate
information. The latter could be intentional when participants are unwilling/afraid to
share the truth or unintentional when they cannot remember or express something.
When an event is in the distant past it might be difficult to recall details. Also, when
sensitive groups like children, disabled, etc. are involved there might be limitations on
how they formulate and express their responses to instruments.
A more general classification of measurement errors is random and systematic.
Random errors can be due to response variations caused by participant emotional state,
intentions, attitudes, and personality, to name a few. While their random nature results
in variability in our data, their effect on summary statistics like the mean tends to be
negligible (positive errors will on average be canceled out by negative ones).
Systematic errors on the other hand are usually associated with validity and tend to
have a directional distortion of the data as they persist in both nature and direction
throughout the sample. Such errors are usually attributed to environmental factors
during the time of data collection. For example, if data on people's mood is collected
on a rainy day the bad weather might predispose everyone in the sample to a bad
mood.
All the uncertainties mentioned here will eventually lead to errors in the
measurements we make. This is best captured through the true score theory which
maintains that any measurement can be expressed through the general form:
Measurement = True Value + Error
This model of measurement relates directly to the issue of reliability we are
discussing here, as it suggests a definition of reliability as the proportion of the variability
in the measurements that is due to the true values rather than to error. Consider, for
example, two repeated measurements of an object's weight, whose true value is 10 Kg.
The closer the two measurements are to each other, the more trust we can place in the
instrument's ability to capture the object's weight. Alternatively, the correlation of these two values is an
indication of how close they are to the true value. This suggests an equivalence
between correlation of measurements and reliability.
Based on that assertion, a formula for the reliability between the two measurements
we made (labeled V1 and V2 here) will take the form of their correlation:

Reliability = Covariance(V1, V2) / [SD(V1)·SD(V2)]

Related to reliability are the notions of accuracy and precision. Accuracy refers to how well we succeed
in measuring the true value of the variable we study, while precision refers to the
closeness of its different measures/spread. The key to distinguishing the various
concepts (Figure 5.4) is to remember that reliability has to do with repeatability of the
experimental results when multiple tests (ideally from different researchers and
different samples) of the same statistic are involved. It is a prerequisite to validity as
we cannot have different results reflecting the same metric and being valid at the same
time. Validity on the other hand can easily be seen as synonymous to accuracy. Figure
5.4 has the results of a single shooter one day (top row) and his combined results with
another day (bottom row). The captions below each target showcase the concepts we
mentioned here.
One could observe from Table 6.2 or its corresponding matrix form that there
are high correlation values (closer to “1”) between variables Var1, Var2, and Var3
(yellow shading) and Var4, Var5, and Var6 (orange shading). This could mean that the
first group might be measuring one factor and the second another. We will call them
FactorYellow and FactorOrange. This would suggest that all our variables can in
essence be represented in a correlation coefficient space (axis FactorYellow and
FactorOrange) as having FactorYellow and FactorOrange coordinates (Figure 6.1). A
rough estimation of the coordinates of each of our variables across the two factors is
depicted in Table 6.3. For convenience, the average of the entries for each variable
across the factor regions was considered here in producing the factor coordinate
values. While the average is a specific form of multiple regression, a more appropriate
method (usually adopted in most statistical software) is to perform multiple regressions
with the R-matrix entries by using one of the variables as dependent/outcome and the
others as independent (like Var1 = a0+a1Var2+ a2Var3+ a3Var4+ a4Var5 + a5Var6 and
so on for the other variables). The R2 of those multiple regressions would be a good indicator of how much variability can be explained by the model or is common among the variables, and can be used as initial coordinates for each variable in Table 6.3. The variance accounted for by the model is called common variance, while its proportion of the overall variance is called communality. If we were to see the Table 6.3 numeric entries as a matrix (6x2 — 6 rows and 2 columns) we would have what is called a factor matrix (it would be called a component matrix in principal component analysis).
If we consider, as we mentioned before, factors as dimensions (meaning independent of each other) and plot the values of Table 6.3, we get the scatter plot of Figure 6.1. The coordinates of each variable along the axes are called
factor loadings. If we had identified three factors, we would have to draw a third axis
perpendicular to the other two and so on for higher dimensions (although difficult to
visualize).
Table 6.2 Correlation coefficient matrix

Variables   Var1     Var2     Var3     Var4     Var5     Var6
Var1        1.000
Var2        0.839    1.000
Var3        0.710    0.739    1.000
Var4        0.042   -0.093   -0.320    1.000
Var5        0.028    0.056   -0.092    0.692    1.000
Var6       -0.057    0.019   -0.025    0.598    0.702    1.000
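As an illustration (not part of the book's worked example), the eigenvalue/eigenvector extraction described above can be reproduced with a few lines of Python from the correlation matrix of Table 6.2. The two dominant eigenvalues correspond to the FactorYellow/FactorOrange grouping discussed above; the exact figures will not match Table 6.4, which comes from a different extraction.

import numpy as np

# Correlation matrix R from Table 6.2 (lower triangle mirrored to the upper one)
R = np.array([
    [ 1.000,  0.839,  0.710,  0.042,  0.028, -0.057],
    [ 0.839,  1.000,  0.739, -0.093,  0.056,  0.019],
    [ 0.710,  0.739,  1.000, -0.320, -0.092, -0.025],
    [ 0.042, -0.093, -0.320,  1.000,  0.692,  0.598],
    [ 0.028,  0.056, -0.092,  0.692,  1.000,  0.702],
    [-0.057,  0.019, -0.025,  0.598,  0.702,  1.000]])

eigenvalues, eigenvectors = np.linalg.eigh(R)            # eigendecomposition of R
order = np.argsort(eigenvalues)[::-1]                    # sort largest eigenvalue first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

explained = 100 * eigenvalues / eigenvalues.sum()        # % of total variance per component
loadings = eigenvectors * np.sqrt(np.abs(eigenvalues))   # component loadings
print(np.round(eigenvalues, 3))
print(np.round(explained, 1))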
Apart from the method described here for discovering factors there are others
that can be applied and can be found in the literature. A distinction needs to be made
for cases where we have some idea/suspicion/hypothesis of what the factors are and
want to confirm their existence and for cases where we aim at discovering the factors.
For the former situation, confirmatory factor analysis (CFA) is required, while for the latter exploratory factor analysis (EFA) would be recommended. A closely related technique to exploratory factor analysis is principal component analysis (PCA). The two techniques are similar in that they
both process correlations (linear combinations) of variables and variances to explain a
set of observations. However, in FA we are more interested in the underlying factors (latent variables) than in the observed variable values, because we want to extrapolate our findings to the population, while in PCA we focus on combinations of variables that reflect the observed values in our sample, without concern for how they generalize to the population. The latter limitation can be circumvented if another sample is used that reveals the same factors. In practice, in FA we start with a model in mind and see how it fits the observations (accounts for the observed variability), while in PCA we are just trying to reduce the number of variables by eliminating covariances
(while preserving variability). FA accounts for the common variance in the data and does not directly produce scores for the identified factors, while PCA accounts for the maximal variance and can produce scores for the identified components. In terms of modeling, FA derives a model and then estimates the factors, while PCA simply identifies linear combinations of variables when they exist. In that sense, the method we described with the data of Table 6.2 is more of a PCA than an FA, as we were trying to find a reduced set of dimensions (corresponding to eigenvectors) that accounted for most of the variance (their eigenvalues). The terminology and
matrix transformation process for finding eigenvectors and eigenvalues is beyond the
scope of this book so the interested reader should look for more in the extant
literature/Internet. Considering the initial example of Table 6.1 (defining a car), if we
were doing a FA we might have confirmed our initial model of a car consisting of
wheels, engine, etc., while if we were doing a PCA we would have found the items like
wheels, engine, etc. that define a car. These different perspectives should guide a
researcher on which method is better for the research they are conducting.
The automated factor extraction process adopted in most statistical software will produce a number of factors but might not, on its own, tell us which ones are significant, unlike the simple visual inspection of the correlation matrix we performed earlier. Also, the boundaries between factors might not be as clear as in Table 6.2. In most cases we will need a way of telling which ones are the most influential. One way of doing this is by plotting their eigenvalues in what is known as a scree plot. Table 6.4 shows the output of a FA from software. We can see what part of the total variance is attributed to each factor. Evidently, the first two factors in this case account for 87.5% of the total variance (45.4% and 42.1% for the first and second factors,
respectively). The scree plot in Figure 6.2 shows the same information in a visual form.
The sudden drop in the eigenvalue total that the component ‘3’ introduces (inflexion
point) is the cut-off point past which the remaining variability can be considered
insignificant compared to the contributions of the first two factors. Statistical software
usually retains all factors above an eigenvalue total of 1 (sometimes 0.7) and ignores
the rest.
Table 6.4 Factor analysis output

Component   Eigenvalue Total   % of Variance   Cumulative % of Variance
1           2.356              45.4            45.4
2           2.184              42.1            87.5
3           0.205              3.9             91.4
4           0.192              3.7             95.1
5           0.173              3.3             98.4
6           0.081              1.6             100.0
Once we extract the factors there might still be room for improvement in reaching the ideal where each variable has the highest possible loading (its coordinate in Figure 6.1) on just one of the factors. If we look at the scatter plot of
Figure 6.1, we can visually suspect that a better set of axes, like the ones depicted in Figure 6.3 (red lines), would have represented the factors better. To achieve
this, we need to rotate our initial axes to get to the new axes. Rotation is a standard
technique applied to improve factor loading and can either be an orthogonal rotation
where the axes remain orthogonal to each other or oblique rotation like the one
performed in Figure 6.3.
The choice of rotation depends mainly on whether we are interested in
preserving the independence of factors among each other (orthogonal rotation) or we
are allowing correlations between them to exist (oblique rotation). Modern statistical
software allows for a variety of transformations depending on how the spread of
loadings for variables is distributed across factors. Either type of rotation and its variants requires matrix transformations that are beyond the scope of this book. Using
the default values provided by the statistical software we use is always a good starting
point from which interested researchers can expand to address the specific needs of
their study.
One point of interest before closing this section is the impact of sample size
on the limit for accepting factor loadings as significant. Based on a two-tailed alpha
level of 0.01 for sample sizes of 50, loadings of 0.7 and above can be considered
significant, while for sample sizes of 1,000, loadings of 0.2 and above might be
significant. While one could interpolate for sample sizes in between, researchers will have to consult the literature for the recommended loadings that indicate significance for the research they conduct.
kingdom, the plant kingdom, etc. and further subdivides them into more specific
forms until we reach the individual species. In other words, we try to divide
observations into homogeneous and distinct groups. This is not to be confused with classification, where we try to predict which group an observation belongs to. The groups in classification are known, while in cluster analysis they are not.
The key entity in cluster analysis is the group. The most popular metric we use for identifying closeness, and by extension membership in a group, is usually the
geometric/Euclidean distance of an observation to the group center. For two
observations in a three-dimensional space this distance is given by the formula:
For cluster analysis, the coordinates/values along the dimensions we have are
standardized. Also, in cases where the importance of the dimension varies we can use
weights to indicate so. The distance in those cases will look like:
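The two distance formulas appear as images in the print edition; in their standard form, the (optionally weighted) Euclidean distance is the square root of the sum of the (weighted) squared differences along each dimension. A minimal Python sketch with hypothetical points:

import math

def euclid(a, b, w=None):
    """Euclidean distance between points a and b, optionally weighted per dimension."""
    w = w or [1.0] * len(a)
    return math.sqrt(sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b)))

print(euclid((1, 2, 3), (4, 6, 3)))             # 5.0 in three-dimensional space
print(euclid((1, 2, 3), (4, 6, 3), (2, 1, 1)))  # weighted version of the same distance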
measured, such as ambition, confidence, etc.). Such variables are usually represented
by combinations of observed variables and confirm the observed variables (our
sample variables) as part of a model. For example, regression models can represent a
phenomenon by explaining or predicting the influence of one or more independent
variables on one dependent variable. Path models are similar to regression models but
allow for multiple dependent variables, while confirmatory factor models aim at
connecting observed variables to latent variables.
We develop a model (model specification) by determining every relationship
and parameter that is suspected to be involved in the phenomenon we study. SEM
then uses variance-covariance matrices (referred to as matrix Σ from now on) based
on the model and attempts to fit the observed variance-covariance matrices (referred
to as matrix S from now on). If inconsistencies/errors are observed, then the model
is deemed miss-specified, suggesting that either some of the assumed relationships are
not there in reality or that some other variables might be needed to complete the
model. This will lead to model modification and the process will be repeated until a
satisfactory model is found.
A model can in general be under-identified when there is not enough
information in the covariance matrix to define some of its parameters, it could be just-
identified when its parameters are sufficient to explain the covariance matrix, and over-identified when there is more than one way of estimating its parameters. The last two cases usually render the model adequate to explain the sample observations. Classifying a model's identification status requires criteria, which here take the form of conditions rather than exact metrics; we will now discuss the most important of them.
The first condition we will consider is the order condition under which the
number of parameters (also called free parameters) required to describe the model
cannot exceed the number of distinct entries in the variance/covariance matrix
mentioned before. Given the symmetry of the S matrix the elements above the
diagonal are the same as the symmetric entries below the diagonal (that is why we often
ignore such elements when depicting the matrix). So, for n observed variables the
matrix will have n*n elements and the total of the elements below (or above) the
diagonal plus the diagonal elements will be distinct and equal to n*(n+1)/2. If we also
consider the means of the variables involved (n in total) as necessary for the
description of the phenomenon we study, the number of free parameters becomes
n*(n+1)/2 + n or n*(n+3)/2. A consequence of the required number of parameters
is that the sample size should be at least as large as the number of free parameters
otherwise there will not be enough information to estimate all the parameters (this is
also known as the “t-rule”). Another necessary condition that is far more difficult to
assess is the rank condition which requires a determination of the suitability of matrix
S for determining each parameter (of matrix Σ). This is in general difficult to prove, and we refer the interested reader to the extant literature for more on this.
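As a quick illustration of the order condition, with six observed variables the matrix S has 6*7/2 = 21 distinct variance/covariance entries, or 6*9/2 = 27 free parameters if the six means are also included; a model requiring more parameters than that cannot be identified, and by the t-rule the sample should contain at least that many cases.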
We will now discuss actions we can take to avoid identification problems. One of the methods used is to map observed to latent variables and ensure that either the factor loading or the variance of the latent variable is fixed to '1'. This ensures there is no indeterminacy between the two, but it might require
additional constraints. Additional methods exist but are beyond the scope of this book.
Having developed a model, the next step is to estimate its parameters. For this
we need a fitting function that will provide a metric of the difference between S and
Σ. The interested reader can find many such fitting functions in the extant literature
but the most popular ones include maximum likelihood (ML), generalized least
squares (GLS), and unweighted or ordinary least squares (ULS or OLS).
Regardless of the process we follow we will end up with a set of parameters as
descriptors of our model. The next step concerns the evaluation of the adequacy of
these parameters in providing a description of the model we adopted and by extension
of the phenomenon under investigation.
This evaluation can start by considering the parameters that are significantly different from zero and whose signs are the same as the ones in our model (indicating significant contributions to the explanation of the phenomenon). From there on, we
can estimate their standard errors and form critical values as the ratio of the parameter
to its standard error. By conducting a t-test between the theoretical and observed
parameters we can deduce which ones are truly significant (for example, significant at a specified alpha level of, say, a = 0.05 for a two-tailed t-test). Finally, if the values we observed are within expected ranges as suggested by the relevant literature and/or other sources (for example, pilot tests), then we can confirm that all free parameters have been identified and meaningfully interpreted. While a t-test works at the level of individual parameters, at the level of the whole model we need the chi-square statistic to measure the fit between Σ and S. When chi-square is close to zero (perfect fit) and the
values of the residuals matrix are also close to zero we can safely conclude that our
theoretical model (Σ) fits the data (S). A final stage after evaluation concerns the
modification of our model (also called specification search) to find parameters that are
more meaningful and better fit observations.
(meaning several regression equations) is path analysis. Although this might suggest
that the method is suitable for proving causations, this is not the case unless there is a
temporal relationship between the cause and effect variables which also correlate or
covary while other causes are controlled. If the aforementioned conditions are
confirmed over time and across multiple experiments, then we can assume that
causation has been established.
Let us demonstrate the path analysis process through an example. We will
assume that following our literature review and past research we have developed a
model for the emergence of entrepreneurship that involves 10 observed variables
(Figure 6.8). One-way arrow connectors represent direct effects/influences, while two-
way connector lines (usually drawn curved) represent covariances (Networking with
Social capital and Education with Social capital). The rationale for the covariances is that there are influences outside the proposed model that affect the relationships of the variables involved. Finally, the model includes error terms
(ovals/circles in the diagram) for all dependent variables to make up for the variance
that the model will not be able to explain. These errors usually represent latent
variables that influence the phenomenon we study. In terms of dependent (also called
endogenous here) and independent (also called exogenous here) variables, the red
rectangles represent independent variables while all other rectangles represent
dependent variables.
Opportunity*Education = cmo*MarketEnvironment*Education + ceo*Education*Education + ErrorOp*Education
which becomes (note that all variables are standardized, so their variances are '1', Education*Education reduces to Var(Education) = 1, and any term involving an error is '0' because errors reflect variance in the model unrelated to the variables):
Cov(Opportunity,Education) = cmo*Cov(MarketEnvironment,Education) + ceo*Var(Education) + Cov(ErrorOp,Education)
or
Cov(Opportunity,Education) = cmo*Cov(MarketEnvironment,Education) + ceo
Given that the covariances are known (from matrix S as discussed in the
previous section), the previous equation has only the coefficients as unknown.
A similar equation will be formed by multiplying the first equation
(Opportunity in our model) by the MarketEnvironment variable:
Opportunity*MarketEnvironment = cmo*MarketEnvironment*MarketEnvironment + ceo*Education*MarketEnvironment + ErrorOp*MarketEnvironment
which will become:
Cov(Opportunity,MarketEnvironment) = cmo*Var(MarketEnvironment) + ceo*Cov(Education,MarketEnvironment) + Cov(ErrorOp,MarketEnvironment)
or
Cov(Opportunity,MarketEnvironment) = cmo + ceo*Cov(Education,MarketEnvironment)
Doing the same for the other equations in the model we end up with a system
of equations that when solved will produce all the coefficients and error terms of our
model equations. While this process will provide numerical answers, the interpretation
of the results needs to be done in light of the theory and assumptions used in
developing our model.
We will assume that all the variables in rectangles are observables (we can measure them with some questionnaire) and their associated errors represent measurement error or variation not attributed to the common factor. Error variances
could be correlated if they share something common, like the same instrument for
example. The model could be more complicated with correlations (double arrows)
between observable variables, but for the purposes of demonstration here we will only
assume correlation between the two factors (Environment and Individual).
Considering coefficients (in this case called loadings as we saw in EFA) for the
relationships between the factors and observable variables we can express the model
of Figure 6.9 with the following equations:
MarketEnvironment = fem * Environment + ErrorM
Opportunity = feo * Environment + ErrorOp
FamilySupport = fef * Environment + ErrorFs
Networking = fin * Individual + ErrorN
RiskTaking = fir * Individual + ErrorRt
Education = fie * Individual + ErrorEd
By considering the order and rank conditions (see the beginning of the SEM section)
as having been met, we assume the model is identified and proceed to
model/parameter estimation as we described before. Eventually, we will have
parameters estimated and we can focus on interpreting their significance. This means
deciding if their values reflect the theory or theoretical framework we are trying to
prove. Only then can we consider the model confirmed. It will then be left to follow-up experiments to confirm the validity of our model.
presentation and interpretation of the data streams and their patterns, while the latter
can be used to make forecasts and provide estimates of their reliability. This is typically
the case when we try to predict trends, whether it is the economy, the stock market,
etc.
A common technique used to describe a time series is through an index
number. Such numbers measure the rate of change of the data in their stream during
a period of time. The beginning of the time period is also referred to as base period.
In the world of business and economics, indexes are developed to measure price
changes and production/quantities. These indexes (symbolized as It) are expressed as
the ratio of the price or quantity of the item of interest at the point of interest (xt) over
its value at the base period (x0) multiplied by 100 to take the form of a percentage.
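For example, if an item's price was 80 in the base period and is 92 at the point of interest, the index is It = (92/80)*100 = 115, indicating a 15% increase over the base period (the numbers here are hypothetical).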
While indexes can give us an idea about the data stream we study (similar to
the mean and median we have seen in previous chapters), they can be quite misleading
if the data in the stream fluctuate irregularly and rapidly. In such cases, we try and
apply a technique called smoothing to remove the rapid fluctuations. One of the types
of smoothing we can apply is exponential smoothing and it is a form of a weighted
average of past and present values of the time series. Considering a weight w (also
202 QUANTITATIVE RESEARCH METHODS
called smoothing constant) in the range between ‘0’ and ‘1’ that will be applied
throughout the series of data x. A value near ‘0’ places emphasis on the past while a
value near ‘1’ places an emphasis/weight on the future. Formula 6.1 expresses the
smoothing function.
(6.1)
The smoothing process will produce replacement values X as follows:
X1 = x 1
X2 = wx2 + (1-w)X1
X3 = wx3 + (1-w)X2
………………………..
XN = wxN + (1-w)XN-1
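As a rough illustration (the closing values below are made-up placeholders, not the actual Apple prices of Table 6.7), the smoothing recursion can be written in a few lines of Python:

import numpy as np

def exp_smooth(x, w):
    """Exponential smoothing with constant w: Xt = w*xt + (1-w)*X(t-1)."""
    X = [x[0]]                                   # X1 = x1
    for xt in x[1:]:
        X.append(w * xt + (1 - w) * X[-1])
    return np.array(X)

prices = [113.0, 112.5, 113.9, 112.7, 114.1]     # hypothetical closing values
for w in (0.1, 0.3, 0.5):
    print(w, np.round(exp_smooth(prices, w), 2))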
Table 6.7 showcases the smoothing process with data for the closing value of
Apple’s stock during Fall 2016. Three different weights are considered, and their
influence can be seen in the smoothing they produce in Figure 6.10. If we are interested
in “long-term” trends we might consider the smoothing the weight of 0.1 produced
(blue line), which shows a decreasing tendency of the stock value. If we are interested
in capturing daily fluctuation, then the smoothing the weight of 0.5 has produced
might be more appropriate. In between we have the effect of the weight of 0.3. It is evident that, depending on our interest, we choose the weight that is most appropriate.
The methods we discussed, along with indexes, provide a descriptive representation of the data stream we study. While this is important in terms of understanding our time series, our interest is in making predictions or, more appropriately, forecasting future values of the series. Based on what we just did, we can continue applying the smoothing formula for the next future period t+1 by placing the emphasis on the past that is already known. That means that we need to use w = 0 in formula 6.1 to produce the next values of the series, XN+1 = XN, XN+2 = XN+1, and so on until we reach our target future time. It is evident from this process that all the future values will equal the last smoothed value XN of our original series, so one would expect that the farther in time we move the less accurate our forecasts will become.
One way to alleviate this problem is to consider adding some trend influence
in the form of a component in the smoothing function we are using. One such model is the Holt forecasting model, expressed by the following pair of smoothed/weighted-average equations:
Xt = wXxt + (1-wX)(Xt-1 + Tt-1) and Tt = wT(Xt - Xt-1) + (1-wT)Tt-1
This set of equations updates both the series values and the trends with
separate weights for each one of them.
The smoothing process starts from the second series value and continues until
it reaches the end of that time series and even to the point in the future we are
forecasting:
X2 = x2 and T2 = x2 – x1
X3 = wXx3 + (1-wX)(X2 + T2) and T3 = wT(X3 – X2) + (1-wT)T2
……
Table 6.8 shows the results of the Holt forecasting model for the same weights that we
used in Table 6.7. The results are also plotted (Figure 6.11) for comparison with
corresponding plots of Figure 6.10. It should be evident from the comparison of the
graphs that the Holt model captures the visual trend with more consistency for the
various weight values.
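A corresponding Python sketch of the Holt recursion (with user-chosen weights wX and wT; the forecasting helper matches the rule derived in the text below):

def holt(x, wX, wT):
    """Holt smoothing: level X and trend T updated with separate weights."""
    X, T = [x[1]], [x[1] - x[0]]                          # X2 = x2, T2 = x2 - x1
    for t in range(2, len(x)):
        X.append(wX * x[t] + (1 - wX) * (X[-1] + T[-1]))
        T.append(wT * (X[-1] - X[-2]) + (1 - wT) * T[-1])
    return X, T

def holt_forecast(X, T, steps):
    """Forecasts with w = 0 for future periods: X(N+t) = X(N) + t*T(N)."""
    return [X[-1] + t * T[-1] for t in range(1, steps + 1)]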
Using this smoothing process for forecasting is done in the same way as before. We
consider w = 0 for the future values since we are basing our predictions on the past.
XN+1 = XN + TN and TN+1 = TN (6.2)
XN+2 = XN+1 + TN+1 and TN+2 = TN+1 (6.3)
If we substitute XN+1 and TN+1 from their previous values (6.2) in (6.3) we get:
XN+2 = XN + TN + TN and TN+2 = TN or XN+2 = XN + 2TN
If we follow on this, we will get XN+3 = XN + 3TN and so on until we reach our target date after t periods:
XN+t = XN + t*TN
If we consider what we did with linear regression in Chapter 4, it should be
evident that the equations we just derived look like the regression equations. This
would suggest that a regression equation of the form X = a0+a1t could be considered
in forecasting situations. This will depend on the research problem we study. For example, if we are considering the variability that real-life situations create, like the Apple stock values in Table 6.7, we would be better off with the smoothing-based forecasting approach.
A better working approach that requires a lot more information would be to
use what is known as an additive model. In such a model, various influences are
considered as added components that influence the time series values, like:
Xt = Tt + Ct + St + Rt
where the components typically represent trend, cyclical, seasonal, and random (irregular) influences, respectively.
A critical issue that has not been addressed yet is the accuracy of the various
forecasting methods and the metrics that we can use to measure the differences of the
forecasts with the actual values when they become available. The latter difference is also referred to as the forecast error. In practice, some of the metrics we use include the
mean absolute deviation (MAD), the root mean squared error (RMSE), and the mean
absolute percentage error (MAPE):
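The printed formulas are not reproduced here; using their standard definitions, the three metrics can be computed as follows for hypothetical forecasts and actual values:

import numpy as np

actual   = np.array([100.0, 102.0,  99.0, 105.0])    # hypothetical observed values
forecast = np.array([ 98.0, 103.0, 101.0, 104.0])    # hypothetical forecasts
error = actual - forecast

mad  = np.mean(np.abs(error))                        # mean absolute deviation
rmse = np.sqrt(np.mean(error ** 2))                  # root mean squared error
mape = 100 * np.mean(np.abs(error / actual))         # mean absolute percentage error
print(mad, rmse, mape)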
Figure 6.12 Tree representation of a decision scenario
Our “facts” up to that point would suggest that chances are the employee in
front of us is a lawyer (0.95 probability). As in real life situations, though, there are
times when information can be available to us that might change our minds. In this
case let us say that through prior research we learn that 90% of the organization’s
employees are engineers -perhaps this is a construction firm - and only 10% are
lawyers. That would obviously change our perception of the situation we are facing
(Figure 6.12.b). This new "reality" would reveal a different set of facts to us, as we now have a 0.9*0.15 or 0.135 probability of an employee being an engineer who wears a suit and a 0.1*0.95 or 0.095 probability of an employee being a lawyer who wears a suit. Each of these updated pieces of evidence (joint probabilities) is the product of a prior (the proportion of engineers or lawyers) and a likelihood, also known as a conditional probability and symbolized as P(Observation|Evidence). This is interpreted as the probability of an Observation given the Evidence. In our case the likelihoods are P(Suit|Engineer) = 0.15 and P(Suit|Lawyer) = 0.95, producing the joint probabilities 0.135 and 0.095.
Based on the information from our prior research we now have an updated set of facts: a 13.5% chance that a randomly selected employee is an engineer wearing a suit and a 9.5% chance that they are a lawyer wearing a suit. This would suggest that the employee who greeted us has more chance of being an engineer than a lawyer. In fact, if we wanted to calculate the probability of
someone being an engineer or lawyer based on our evidence that the employee is
wearing a suit, we can apply the definition of probability presented in Chapter 3 for
the universe of the suit wearing employees only.
Figure 6.13 Venn diagram representation of a decision scenario
The image in Figure 6.13.a displays a more realistic representation (in terms of
surface areas) of the employee situation based on all of the available evidence. It can
be seen from the shaded areas (those wearing suits) that, although suit-wearers dominate among the lawyers, the shaded area that represents engineers wearing suits is much larger. This is even more evident in Figure 6.13.b, where only the employees in the organization who wear suits are considered.
From that image we can calculate the actual probabilities (called posterior) of someone being an engineer or a lawyer given the observation that they are wearing a suit. Accordingly, the posterior probability of the employee who greeted us being an engineer is 0.135/0.23 or 0.59 (59%) and of being a lawyer is 0.095/0.23 or 0.41 (41%). So, the chances are 59% for engineer against 41% for lawyer.
If we were to express, now, these calculations in a more detailed form (combining Figures 6.12.b and 6.13.b), we could say that the posterior probabilities of an employee being an engineer or a lawyer, given that he is wearing a suit, are
0.135/0.23 = 0.15*0.9/(0.135+0.095) and 0.095/0.23 = 0.95*0.1/(0.135+0.095)
respectively. These latter expressions can be derived from the following general formula, also known as the Bayes formula/theorem, written here in the terms of our example:
P(Engineer|Suit) = P(Suit|Engineer)*P(Engineer) / [P(Suit|Engineer)*P(Engineer) + P(Suit|Lawyer)*P(Lawyer)]
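A tiny Python sketch of this update, using the priors and likelihoods of the example (the function name is ours, not the book's):

def posterior(priors, likelihoods):
    """Bayes' theorem: joint = prior * likelihood, then normalize by the evidence."""
    joints = {k: priors[k] * likelihoods[k] for k in priors}
    evidence = sum(joints.values())                    # P(Suit) = 0.135 + 0.095 = 0.23
    return {k: round(joints[k] / evidence, 3) for k in joints}

print(posterior({"Engineer": 0.9, "Lawyer": 0.1},
                {"Engineer": 0.15, "Lawyer": 0.95}))   # about 0.587 vs 0.413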
there is a 95% probability that the sample mean represents the actual population. If
we were to perform the same test with different sample sizes, we would get different
t-scores and hence different p-values leaving the acceptance or rejection of the null
hypothesis uncertain and dependent on the sample size. This uncertainty will
eventually spill over to confidence intervals leaving us with no way of knowing which
values are most probable.
The equivalent of the p-value in Bayesian analysis is the Bayes factor. The
null hypothesis here assumes a probability distribution concentrated entirely at the hypothesized parameter value (the mean 30 in the previous example or 3 successes in the die case) and zero everywhere else. The alternative hypothesis in this case assumes all values of the parameter as
possible (uniform distribution). Because prior and posterior odds are involved the
Bayes factor is calculated as the ratio of the posterior odds to the prior odds. To reject
the null hypothesis, we usually consider Bayes factors below 0.1. The confidence
intervals in this approach become credibility intervals.
While confidence intervals suggest our belief that a certain percentage of
samples (95% in most cases) will contain the parameter we investigate (true value of
the population) the credibility interval expresses the probability (95%) that the
population mean is within the limits of the interval. If the logic behind these expressions seems similar, keep in mind that they are conceptually different: the former considers that a true value of the population can be captured, while the latter
assumes there is an inherent uncertainty in capturing it. Bayesians assume that updated
information can improve our convergence to reality.
Despite the attractiveness of the Bayesian approach in reflecting the reality that
experiences improve our predictions this approach does have some shortcomings. For
one thing priors are mostly subjective in nature and influenced by our preconceived
notions and stereotypes of situational characteristics. Personal beliefs and
environmental influences might contribute to bias in our estimation/guessing of priors
limiting the application of a balanced and reliable Bayesian approach. Experience in
such cases becomes a determining factor for a successful implementation of a Bayesian
model. An additional challenge with Bayesian analysis is the computational complexity of the multi-factor situations that we see in real-life cases. Considering all these challenges, the choice of whether we follow a frequentist or a Bayesian approach to our analysis should be problem dependent. The two approaches can complement each other, as each can address the flaws of the other and in this way mitigate real-world problems.
entry in the Large Rise column from 500. Again, depending on our degree of optimism
and pessimism, we could decide on the appropriate investment/row.
information of the probabilities. In case we are interested in gains we can multiply the
maximum gain for each market direction with the probability of that market direction
happening and sum the results over all market conditions. We then get what is known
as expected return of perfect information (ERPI).
Table 6.12 Reduced payoff table
Market Direction
Decision Large Small No Small Large
Alternatives Rise Rise Change Fall Fall
Probability 0.20 0.30 0.30 0.10 0.10 EV
Investment1 -100 100 200 300 0 100
Investment2 250 200 150 -100 -150 130
Investment3 500 250 100 -200 -600 125
Investment4 60 60 60 60 60 60
Max 500 250 200 300 60 ERPI
Gain 100 75 60 30 6 271
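As an illustrative check of Table 6.12 (not part of the book's text), the expected values and the expected return of perfect information can be computed directly from the payoffs and probabilities:

import numpy as np

probs   = np.array([0.20, 0.30, 0.30, 0.10, 0.10])        # market direction probabilities
payoffs = np.array([[-100, 100, 200,  300,    0],         # Investment1
                    [ 250, 200, 150, -100, -150],         # Investment2
                    [ 500, 250, 100, -200, -600],         # Investment3
                    [  60,  60,  60,   60,   60]])        # Investment4

ev   = payoffs @ probs                                    # expected value per investment
erpi = (payoffs.max(axis=0) * probs).sum()                # expected return of perfect information
print(ev, erpi)                                           # [100. 130. 125. 60.] and 271.0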
the products of probability times utility across all market directions for every
investment), we get what is called the expected utility (Table 6.15). This process
seems to suggest Investment3 as the better alternative.
The choice of action for the methods discussed here is as always left to the
decision maker. Risk-averse individuals try to reduce uncertainty, so they tend to "underestimate" utility, while risk-taking ones assign increasingly higher utility as the earnings increase. Risk-neutral ones tend to be more reserved in their estimation and
assignment of utilities. Figure 6.14 provides a pictorial depiction of the
aforementioned categories.
Figure 6.15 displays the decision tree of a development project case where a
company needs to decide about investing to get a license permit to build a plant that
will later sell for profit. The company is also considering a consultant to provide a
prediction of whether the application to the city council for a permit to build will be
successful or not. The expert's records suggest that 40% of the time he predicts approval and 60% of the time denial. When he predicts approval, it is also known that 70% of the time the permit is indeed granted while 30% of the time it is denied. The diagram of Figure 6.15 shows all possible outcomes with the additional information that pertains to the problem (expert fee €5,000, purchase option fee €20,000, price of land €300,000, development permit application fee €30,000, development cost €500,000, revenue from sale €950,000, and 40% overall approval rate by the city council).
For the purposes of illustration, we will follow here only one branch of the
tree and leave it to the reader to confirm the remainder of the tree. Let us say we
choose the option where the expert predicts denial to our permit application (0.6
probability of this happening). At that point (bottom branch) we still have the option
to go ahead with our application, but we need to decide if we will buy the land or the
purchase option. Let us say we decide to avoid the big investment of buying the land
and we go with the purchase option alternative (cost €20,000) which is still at the
bottom branch of the decision tree. We then submit our permit application (cost
€30,000) which has a 20% chance of approval and 80% chance of denial. Let us assume
here that we end up with approval (second from bottom branch). We will then have
to buy the land (cost €300,000), complete the development (cost €500,000), and
eventually sell for €950,000. If we extract from the revenue all the costs, we end up
with a profit of €95,000. The probability of something like this happening is the
multiplication of the probabilities (see Chapter 3) we encountered along the branch of
the tree we followed. In this case this is the probability of the expert predicting denial
times the probability of having our permit approved (all other probabilities along the
path are 1 as they represent certainties). This product results in a probability of 0.12
or 12% for obtaining the profit of €95,000.
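The arithmetic for the branch just described can be verified in a couple of lines (figures taken from the text):

# Costs and revenue (in euros) along the branch followed in the text
costs  = 5_000 + 20_000 + 30_000 + 300_000 + 500_000   # expert, option, application, land, development
profit = 950_000 - costs                               # = 95,000
probability = 0.6 * 0.2                                # expert predicts denial, permit approved anyway
print(profit, probability)                             # 95000 and 0.12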
Traversing the other branches of the tree we get to all possible leaves that
represent the expected payoffs of each branch. Green leaves represent profits while
the pink ones represent losses. Whether we choose one of the branches (obviously one that leads to profit) at a decision node depends on our attitude towards risk (Figure 6.14) and any additional information we might happen to have at the time of the decision. Based on the risk we are willing to take, the optimal branch we follow is called the critical path.
The formalism of decision trees is very powerful in situations with clear
objectives and predetermined states of nature. As in most methods, there are always
disadvantages that we need to consider when selecting the decision tree method. One
such disadvantage stems from our inability to assign realistic probabilities for future
events. In addition, the more complicated a situation is, the more branches we need to represent it realistically, making detection of the optimum path difficult.
• Number of players – we can have situations with two players like in the game
of chess or many players like in the game of poker.
• Total return – it can be a zero-sum game like poker among friends and non-
zero-sum game like poker in a casino where the “house” takes a cut.
• Player turns – it could be sequential where each player takes a turn that affects
the states of nature like monopoly or simultaneous where each state of nature
is defined after all players declare their moves like in paper-rock-scissors.
The most popular case of game theory is the prisoner’s dilemma. In this
situation two suspects are caught for a criminal act and during their separate
interrogation an attempt is made to motivate them to confess by presenting them with
different options. One is to confess participation in the crime and receive a reduced
jail time of 5 years and the other is to deny, in which case the penalty if the other
suspect confesses will be 10 years jail time, otherwise (if they both do not confess)
they will both end up with 2 years in jail each (maybe for a minor offence like being
present and passive in the crime scene). Such a scenario can easily be represented with
a payoff table like the one displayed in Table 6.16.
Table 6.16 Prisoner's dilemma payoff table (each cell shows years in jail for Suspect B, Suspect A)

                            Suspect A
                        Confess      Deny
Suspect B   Confess     5, 5         1, 10
            Deny        10, 1        2, 2
Provided each suspect knows that the other also received the same alternatives,
and excluding potential loyalties and any other influences, it appears that there is an
optimal option (called Nash equilibrium) and that is to confess. Consider suspect B
for example. If he assumes that suspect A is going to confess, then he is better off
confessing too (top left cell in Table 6.16) as he will receive a 5-year sentence, while if
he denies (bottom left cell) he will receive a 10-year sentence. If he assumes the other
suspect will deny, again he is better off confessing and receiving a 1-year sentence
instead of denying and receiving a 2-year sentence. Rationality in the prisoner's dilemma case suggests that the choice is not the global optimum, where both deny, but instead the case where both confess. What makes the global optimum unlikely to be chosen (all other factors like influences and loyalties excluded) is that it is an unstable state: each suspect individually can always improve on it, as there is always a better alternative (the 1-year sentence).
The objective here is to find the best strategy that will ensure gains for B
regardless of the strategy A will adopt. Assuming the promotion efforts will be
continued for some time it might be worth considering the appropriate blend of
strategies across time that could maximize our efforts. For the long run let us assume
we will be promoting Fruit x1 percent of the time, Dairy x2 percent of the time, and Meat x3 percent of the time. Since the x's represent proportions (probabilities), it must always be that:
x1 + x2 + x3 = 1 (6.2.1)
Considering a minimum overall gain V (referred to as expected value)
regardless of the strategy A follows we can formulate the following equations:
When A chooses:
Fruit (see Fruit column in Table 6.17) we need 2x1 – 2x2 + 2x3 ≥ V (6.2.2)
Dairy (see Dairy column in Table 6.17) we need 2x1 – 7x3 ≥ V (6.2.3)
Meat (see Meat column in Table 6.17) we need -8x1 + 6x2 + x3 ≥ V (6.2.4)
Bakery (see Bakery column in Table 6.17) we need 6x1 – 4x2 - 3x3 ≥ V (6.2.5)
Our aim is to find the solution to the system of equations (6.2.1)–(6.2.5) that maximizes V ≥ 0. The case when V = 0 is called a fair game. The solutions can be produced in
practice with a variety of methods that are available in almost all quantitative methods
software. For the given system and for a fair game (V = 0), the Solver add-in of Excel
produces x1 = 0.39, x2 = 0.5, and x3 = 0.11. This means that as long as Supermarket B promotes Fruit 39% of the time, Dairy 50% of the time, and Meat 11% of the time, it will in effect neutralize Supermarket A's promotion efforts. By choosing a different value for V we can work out different promotion percentages.
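The same system can also be solved outside Excel; a sketch using scipy's linear programming routine follows (it maximizes V, so the mix it returns should be close to, but not necessarily identical to, the fair-game solution quoted above):

from scipy.optimize import linprog

# Variables: x1, x2, x3, V.  Maximize V  <=>  minimize -V.
c = [0, 0, 0, -1]
# Each constraint (6.2.2)-(6.2.5) rewritten as  -(payoff row)·x + V <= 0
A_ub = [[-2,  2, -2, 1],
        [-2,  0,  7, 1],
        [ 8, -6, -1, 1],
        [-6,  4,  3, 1]]
b_ub = [0, 0, 0, 0]
A_eq, b_eq = [[1, 1, 1, 0]], [1]                 # x1 + x2 + x3 = 1  (6.2.1)
bounds = [(0, 1), (0, 1), (0, 1), (None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x)                                     # approximately [0.39, 0.5, 0.11, ~0]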
The reader needs to realize that while game theory will provide some ideal
solutions, these solutions refer to a particular instance in time and that future times
will require updating the payoffs based on new information that might become
available. Also, as one player adopts an optimal strategy, so can another (by again using game theory), so situations in real life are far from static. The interaction amongst multiple players will complicate the playing field, so combinations
of decision making (and data mining nowadays) techniques might be required to
achieve a true competitive advantage.
6.7 Simulations
Simulations are nothing more than artificial imitations of reality (in our case
with the use of computers). They are descriptive techniques that decision makers can
use to conduct experiments when a problem is too complex to be treated with
normative methods like the ones presented before and also lacks an analytic
representation that would allow numerical optimization approaches. Simulations
require proper definition of the problem, the development of a suitable model, the
validation of the model, and finally the design of the experiment. By running the
simulation many times (performing lots of trials) we will be able to see prevalent
patterns that lead to optimal behavior.
In practice, there are different types of simulations:
packages. Numbers 0 and 1 will represent A, numbers 2, 3, and 4 will represent B, and
numbers 5, 6, 7, 8, and 9 will represent C. For every run of the experiment we will be
retrieving random numbers/offers until all products are collected. Table 6.18 shows
the results of the simulation for 20 runs of the experiment along with the number of
products bought to complete the set and the running average across the simulations (from the first up to the current one). From the results, we can deduce that we would need on average to buy about 6 products to get all 3 memorabilia.
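A short Python version of this simulation (our own sketch, independent of the 20 runs recorded in Table 6.18) illustrates the idea; with many runs the long-run average settles between 6 and 7 purchases:

import random

def buys_to_complete():
    """Buy products until all three memorabilia (A, B, C) have been collected."""
    collected, buys = set(), 0
    while len(collected) < 3:
        d = random.randint(0, 9)                       # digits 0-9, as in the text
        collected.add("A" if d <= 1 else "B" if d <= 4 else "C")
        buys += 1
    return buys

runs = [buys_to_complete() for _ in range(10_000)]
print(sum(runs) / len(runs))                           # average number of purchases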
In the previous example the numbers created by the random number generator
had equal probabilities of appearance. There are other cases, though, where this is not
true. For example, we might be interested in deciding the size of a restaurant we should
open in a certain area and we need to calculate the optimal number of tables that the
area can comfortably service. If we find a place where too many tables can be
accommodated the place might look empty, so the customers might believe it is not a
popular place and leave, while if the place is too small it will have few tables and
customers might decide to leave when they frequently see it filled. We need a way here
to simulate customers coming in, eating, and leaving so eventually we can decide the
optimum number of tables the neighborhood can support and subsequently the ideal
size for our restaurant. We make the assumption that customers could be coming alone
or in groups of say up to 6. Research would suggest that people rarely go alone to the
restaurant or in big groups, so a normal distribution of the group size would be more
appropriate with probably a mean at group size of 3 and standard deviation 1 (meaning
68.3% of the groups will be between 2 and 4 people). Similarly, we can presume an
average time of stay that is normally distributed with a mean of 50 minutes and
standard deviation 10 minutes (data from research) and time between groups’ arrivals
again normally distributed with a mean of 20 minutes and standard deviation of 10
minutes. To set up the experiment we will need three random number generators that
follow the normal distribution (one for the group size, one for the time of stay, and
one for the in-between groups time). Simulations will run for different numbers of
tables (say 20, 30, and 40) and the time to fill up the restaurant will be recorded. The averages for each of the different numbers of tables would indicate whether the restaurant is filled and by what time this happens. The optimal solution will be produced by the number of tables that can sustain a steady flow of customers with all the tables filled. Setting
up and running the simulation is better done by developing a computer program that
goes beyond the scope of this book31. The results of three such simulation runs are shown in Table 6.19. Even though the number of runs is small, it can be seen that 11 tables would be ideal for a restaurant operating under the assumptions of our scenario.
31 The code for the simulation in Java can be found on the book’s website.
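For readers who do not want to consult the book's Java program, a very rough Python sketch of the same idea follows (one table per group regardless of its size, arrivals and stays drawn from the normal distributions given above, run lengths and table counts chosen arbitrarily); it is an illustrative approximation, not the book's implementation:

import random

def time_to_fill(tables, sim_minutes=600):
    """Minutes until every table is occupied at once; None if the restaurant never fills."""
    clock, departures = 0.0, []                            # departure times of seated groups
    while clock < sim_minutes:
        clock += max(1, random.gauss(20, 10))              # minutes until the next group arrives
        size = min(6, max(1, round(random.gauss(3, 1))))   # group size (drawn but not used further here)
        stay = max(5, random.gauss(50, 10))                # minutes the group stays
        departures = [d for d in departures if d > clock]  # free the tables that finished
        if len(departures) < tables:
            departures.append(clock + stay)                # seat the group at one table
        if len(departures) == tables:
            return clock
    return None

for tables in (10, 11, 12, 15):
    runs = [time_to_fill(tables) for _ in range(200)]
    filled = [r for r in runs if r is not None]
    print(tables, len(filled), round(sum(filled) / len(filled)) if filled else None)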
probabilities between these states as represented in Table 6.20. Figure 6.16 displays the
same information in the more popular steady state form.
Table 6.20 Weather transition matrix

Present \ Future   Sunny          Rainy          Cloudy
Sunny              P(S|S) = 0.6   P(R|S) = 0.1   P(C|S) = 0.3
Rainy              P(S|R) = 0.7   P(R|R) = 0.2   P(C|R) = 0.1
Cloudy             P(S|C) = 0.4   P(R|C) = 0.4   P(C|C) = 0.2
Based on this transition matrix and given our present state we can use a
random number generator as we did in the memorabilia example and produce weather
predictions for any sequence of days. Assuming, for example, that today is a rainy day, we might get the random number 0.45. From the Rainy row in Table 6.20 we can see that this falls within the first 0.7 of the interval, so we can assume P(S|R), or that tomorrow will be a sunny day. If the next random number we get is 0.8, then from the Sunny row we see that 0.8 exceeds the cumulative probability 0.7 of Sunny and Rainy (0.6 + 0.1) and falls within the Cloudy column, so we can assume P(C|S), or that the day after tomorrow will be a cloudy day. By adding more
states of nature (Partly Cloudy, Snowy, etc.) and adjusting the transition matrix to
reflect reality as much as possible we might end up with more realistic predictions of
the weather in our location.
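A minimal Python sketch of this process (using the rows of Table 6.20 as cumulative intervals, as in the example):

import random

states = ["Sunny", "Rainy", "Cloudy"]
P = {"Sunny":  [0.6, 0.1, 0.3],        # transition probabilities from Table 6.20
     "Rainy":  [0.7, 0.2, 0.1],
     "Cloudy": [0.4, 0.4, 0.2]}

def next_state(current):
    r, cumulative = random.random(), 0.0
    for state, p in zip(states, P[current]):
        cumulative += p
        if r < cumulative:             # e.g. r = 0.45 from "Rainy" falls within 0.7 -> "Sunny"
            return state
    return states[-1]

day, forecast = "Rainy", []
for _ in range(7):                     # predict a week ahead
    day = next_state(day)
    forecast.append(day)
print(forecast)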
Concluding this section on simulation, we need to keep in mind that
simulations are straightforward, they allow for a great amount of time compression,
they can handle unstructured and complex problems, and they allow manipulation of
parameters to evaluate alternatives. Despite their strengths, they have disadvantages: there is no guarantee that an optimal solution will be achieved, and the process can be slow and costly, especially when a lot of computational power is required. In
addition, they are specific to the situation we are facing so when something changes
in the model they might need to be redesigned from scratch.
(the bottom left cluster in Figure 6.17 seems to be connected to the main graph
through an orange node) while others are more isolated and remote with single or very
few connections.
The complex webs that emerge from our social interactions create the need
for metrics that capture key locations in networks and inform about influences and
trends invisible to regular analytics techniques. Connections among social network
entities may be implicit or explicit. The first type concerns inferred connections due
to someone's behavior, while the second concerns connections that we intentionally establish, as when we follow someone or connect to a friend or coworker. The latter needs approval/consent from both parties (also called an undirected connection), while the former is a unidirectional/directed privilege one gets from one's participation in a network. In many cases it is the undirected connections that have more value, especially to those with access to the network data, as these reveal strong ties, such as in the case of two people following each other.
Another point of importance with respect to connections is that they carry
different weights. For example, if two people exchange multiple messages then we can
naturally deduce they have a stronger connection (meaning that potentially they can
influence each other) than two people that rarely exchange messages. To make sense
of the significance of the various connections in a social network we need metrics such
as the frequency of message exchanges.
Before we delve more into a discussion of some of the most popular metrics,
it is worth defining some key characteristics of networks and their elements. As a
commonly accepted definition, networks represent collections of entities that are
interlinked in some way. The individual entities are also labeled as nodes or vertices
and they can represent people, objects, concepts and any entity that is independently
meaningful to the network (such as transactions, “likes", etc.). Social networks,
specifically, include people that interact with other people, organizations, and artifacts.
Although specific attributes of each node are not, in general, necessary for network
analysis, their presence can only add value and can more accurately profile the
individuals and their relationships.
The connections between the entities are called edges or links or ties among
others and can be directed (single arrow connectors – also known as asymmetric edges)
such as when one person (identified as the origin) “influences” another (identified as
the destination) or undirected (straight line connectors with no arrows – also known
as symmetric edges). Another characteristic of edges, as we mentioned previously, is
their weight; when edges carry no weights (also called unweighted or binary edges) they simply indicate the existence of a relationship. Weights might indicate an edge's strength or frequency (how many messages were exchanged). Due to the space restrictions of this section only unweighted metrics will be considered. Weighted graphs can be addressed by conversion: we assume a cutoff weight below which no connection is considered to exist. For example, if the weights represent the number of times a web link was clicked and we assume a cutoff point of, say, 5, then any links that were clicked fewer than 5 times will be considered insignificant and eliminated, leaving the graph with only the most "popular" links.
While there is a variety of ways to represent networks the most popular form
by far is the network graph (Figure 6.18). Such graphs make it easy to identify key
players (like C and D), isolated entities (like I), terminal nodes (like H and G), and reciprocal relationships (like AC, BC, and DE). Because this form is difficult to process computationally, alternative representations include the matrix (also called adjacency matrix) representation (Table 6.21) and the edge-list representation (Table 6.22). Directional influences between the origins (rows) and destinations (columns) are represented as "1" in the matrix form, while everything else is represented with "0". While this form is easy to process, much of the space is wasted on redundant information ("0"s). The edge-list representation eliminates the space challenges of the matrix form by including only existing connections, but it adds processing time as it requires multiple traversals of the list to calculate network metrics.
C 1 1 0 1 0 0 0 1 0
D 0 0 1 0 1 0 0 0 0
E 0 0 0 1 0 0 0 0 0
F 0 0 0 1 0 0 1 0 0
G 0 0 0 0 0 1 0 0 0
H 0 0 1 0 0 0 0 0 0
I 0 0 0 0 0 0 0 0 0
Typically, when such entities connect clusters of entities they are called gatekeepers
or brokers.
Specific centrality metrics include:
Betweenness centrality captures how often an individual lies on the shortest path that connects two other individuals. The shortest path itself is called the "geodesic distance" and is calculated as the smallest number of neighbor-to-neighbor jumps that separates the two. Betweenness is thus an indication of the bridging capabilities of an individual, and its
removal from the network can be similar to collapsing a bridge in real life. There could
be other ways to reach two points/individuals but accessing them through the bridge
might be the more efficient one. When a connection between two individuals is not
possible, we consider that as a structural hole or a missing gap. Such cases are potential
opportunities to create more value for the network. In the case of organizational
structures leaders can identify such gaps (disconnects between units) and invest
resources in "bridging" otherwise separate organizational units. Table 6.23 displays the betweenness centrality scores of the Figure 6.18 network. The reader can confirm the underlying distances (like from G to B being 4 hops) by tracing the path from one individual to another. The higher the number of jumps, the higher the potential to form new direct connections by leveraging the connections already established in the network. For
comparison, the author’s betweenness in the 499-sample cross-section in Figure 6.17
is 113552 (calculated by LinkedIn).
Closeness centrality is the average distance of an individual from every other individual in the network, with distance understood much like the distance between two points in physical space. A low value indicates that an individual is connected (small distances) with most others in the network, while a high value suggests someone is on the periphery of the network. The lower the distance values the closer we are, and the faster information/messages reach their destination, while the higher the value the longer our
connected individuals of Figure 6.18 are displayed in Table 6.23. Key individuals like
C and D have as expected the lowest values while the more isolated G has the highest
one.
Eigenvector centrality is capturing the importance of someone’s
connections in terms of how connected they are. For example, an individual like F in
Figure 6.18 is connected to the very influential D so that in a sense D is a form of a
proxy of F’s influence. The metric is calculated for every individual by multiplying each
of its row entries from Table 6.21 with its corresponding closeness centrality in Table
6.23 and adding all at the end. For example, eigenvector centrality for the most critical
individual C is 1*2.4+1*2.4+0*1.7+1*1.6+0*2.4+0*2.1+0*3+1*2 = 8.4. Google is
using a variant of eigenvector centrality to rank web pages (the PageRank algorithm) based on how they link to each other.
Degree centrality is a vertex/individual’s total direct connections. For
example, in Figure 6.18, C’s degree centrality is 4 while D’s is 3. The lower the degree
centrality the less influential is an individual within a network. Individual I has zero
connections so it is the least influential individual in that network. Table 6.23 displays
the degree centrality for all other individuals in the Figure 6.18 network. Caution should be exercised in interpreting the metric, as low counts can sometimes be misleading. An individual might be connected to only two others who each connect with large parts of the network, rendering that individual influential despite their low degree centrality, since their removal from the network might cause the network to collapse.
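As an illustration (assuming the networkx library and reading the connections of Figure 6.18 as an undirected graph with edges A-C, B-C, C-D, C-H, D-E, D-F, and F-G; the isolated node I is left out), the standard versions of these metrics can be computed as follows. Note that networkx reports closeness as a reciprocal (higher means more central, the opposite of the distance-based convention above), and its eigenvector centrality is computed from the adjacency structure rather than the simplified closeness-weighted sum of the example, so the numbers will differ from the hand calculations.

import networkx as nx

G = nx.Graph()
G.add_edges_from([("A", "C"), ("B", "C"), ("C", "D"), ("C", "H"),
                  ("D", "E"), ("D", "F"), ("F", "G")])   # Figure 6.18 read as undirected

print(nx.degree_centrality(G))          # C and D stand out, as in the text
print(nx.betweenness_centrality(G))
print(nx.closeness_centrality(G))       # reciprocal convention: higher = more central
print(nx.eigenvector_centrality(G))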
7 Publishing Research
A natural step at the end of any academic research is the dissemination of the
results in the greater academic and professional communities to inform and invite
critique. Only when repeated efforts to challenge the research findings fail can we say
with some certainty that the findings contribute to our understanding of the
phenomenon under investigation until new research proves otherwise. Dissemination
in academia is traditionally done through conference presentations and academic
publications (journal, books, etc.). The latter nowadays has been supplemented with
online repositories (like arXiv.org, ssrn.com, and researchgate.net) where even
preliminary findings can be presented in an effort to invite feedback that will further
guide the efforts of researchers.
The most popular options available for publishing research include peer-
reviewed academic journals, conference proceedings, and academic research books
(not to be confused with textbooks). For other forms of dissemination like newsletters,
commentaries, etc., readers should consult the specific publisher’s guidelines. The
journal and conference proceedings options generally follow the same style and
formatting rules as oftentimes conference proceedings are published as special issues
of journals or as academic research books. When it comes to book publishing, some publishers might request a specific style, but in general it is left to the editor (for conference proceedings) or to the lead author (for academic research books) to decide the structure and style of the print material. For this reason, the details of book publishing are left for the
researcher to explore through the websites of publishers. Some book publishers
dedicated to academic research include Routledge, Springer, etc. Prestigious
institutions like MIT and Oxford University tend to have their own publishing houses
so interested authors can find details about what and how they accept for publication
on their respective websites. One special case of publication, the research dissertation,
will be discussed here as it is of great interest and probably the starting point for many
researchers. Dissertations are almost always written in a research book style and often end up published as books.
that usually covers the breadth and depth of the field the journal is covering. Their
primary function is to screen the material that will be published for appropriateness
for the journal domain and ensure the journal structure is followed and the
submissions pass the scientific rigor of the review process. The latter is usually conducted through a double-blind review process whereby the editors assign two reviewers to
anonymously evaluate the submitted material (oftentimes stripped of any author
details).
Based on the outcome of the review process, the authors of submitted material
are informed whether their work has been accepted for publication by the journal,
whether revisions are required before publication, or whether it has been rejected.
Even when their work is rejected, there is value for prospective authors, as they receive
feedback from the reviewers on the areas that were not appropriately covered and
supported. In this way, researchers can learn from each other and, in the process,
improve the quality of their research.
Prospective authors can find the details of the publication process on the
journal’s website, along with additional information such as the acceptance rate for
submissions and, possibly, an indication of how popular the journal is as a source of
references for researchers. The latter is usually expressed through an impact factor
metric that organizations like Thomson Reuters produce for academic journals.
Impact factors can range from 0 to quite high numbers (40 and above) for a few select
journals, but the great majority of journals fall below 10, with the most probable values
around 3 or lower. This is by no means a perfectly fair process, as quality work can be
found even in journals with impact factors below 1; as in any social endeavor,
tradition, prestige, and even politics in the form of author affiliations can carry a
publication a long way. Having a journal’s name lend weight to someone’s research is
useful, but what will ultimately make research “famous” is the quality of the work
presented and its dissemination by the researcher in more interactive and engaging
modes, like conferences, presentations and, nowadays, bulletin and discussion
boards/groups in professional associations and social network sites.
While the material that follows covers the general requirements in terms of
style and structure that appear in the majority of academic journals, researchers should
always check the specific requirements set by their target journal (usually found on the
journal’s website). Another point of reference for the discussion that follows is that
we are mainly focusing on original research (excluding newsletters, commentaries, etc.)
and on social sciences research, but the deviations for other fields of research should
be minimal and usually concern citation style and formatting. In general, a research
publication is an account of the research process as outlined in Figure 2.12.
The reason a fairly uniform style is found across journals has primarily to do with
convenience when reviewing research papers, as it lets readers easily locate the sections
that are of interest and retrieve key points and findings. Style helps express the key
elements of quantitative research (like statistics, tables, graphs, etc.) in a consistent way
that allows retrieval and processing without distractions. It additionally provides clarity
in communication and allows researchers to focus on the substance of their research.
Research paper styles have been recommended by major scientific bodies (the APA
style, developed by the American Psychological Association, is one example), but in
general what is known as the IMRaD (Introduction, Methods, Results, and Discussion)
structure is the standard many journals follow, with minor deviations such as
separating the review of the literature from the introduction. If we add the title page
at the beginning of this structure and the references at the end, we have a complete
journal publication structure. Before we proceed with a discussion of the
aforementioned structure, it is worth pointing out that journals will occasionally
impose a word count limit on the length of a manuscript, mainly due to space
restrictions in the journal and in an effort to restrain authors from getting “carried
away” with their presentation. Typical limits are set at around 10,000 words or less.
Presumably, if more is required, the authors should consider alternative routes, like
publishing their research as a book. Many publishers specialize in such publications
and even encourage authors to publish related studies as a collection, as is done with
conference proceedings.
A typical breakdown of the relative extent of the various sections within an abstract
(for those journals that do not impose a breakdown) could be 25% Introduction, 25%
Methods, 35% Results, and 15% Discussion. In terms of length, typical abstract
requirements range between 150 and 250 words; for a 200-word abstract, the
breakdown above would translate to roughly 50 words for the Introduction, 50 for the
Methods, 70 for the Results, and 30 for the Discussion. Like the title, the abstract
should be able to stand on its own if separated from the rest of the paper. Abstracts
tend to be made available for free as promotional material and as such are freely
distributed. Table 7.1 shows the breakdown of a hypothetical research publication on
the subject of workplace employee spirituality.
Table 7.1 Abstract structure for journal publication

Introduction: Workplace spirituality from an employee’s perspective is of great
importance in making the workplace productive and satisfying while contributing to
integrating work-life balance values into organizational behavior. The literature
suggests that a theoretical framework that considers spirituality as a vital constituent
of employees, directly influencing their performance at work, is needed if
organizations are to treat their employees with respect while reaping the benefits of
spirituality.

Methods: By integrating existing research on workplace spirituality, a correlational
research design was adopted to evaluate the impact of spirituality on employee
performance in the workplace. A self-administered questionnaire was developed with
8 items using a five-point Likert scale. The questionnaire was screened by a panel of
experts and pilot-tested with 20 qualified individuals. The calibrated form of the
questionnaire was then completed by 214 participants who adhered to the eligibility
criteria of the study.

Results: The results indicate that employee workplace spirituality is best captured by
5 factors that showed significant levels of correlation with work performance. These
include: (a) belief in a “higher power” (r(46) = 0.78, p < 0.01) that provides meaning
and purpose, whether that power is in the form of a deity, the individual, principles,
or the universe in some form, (b) the belief that work is part of a higher power’s plan
and so an acceptable and valuable part of life (r(38) = 0.62, p < 0.01), (c) the need to
support one’s lifestyle in accordance with the higher power’s directives (r(42) = 0.71,
p < 0.01), (d) the need for self-actualization (r(34) = 0.52, p < 0.01), and (e) skills to
endure the hardships imposed by the workplace (r(40) = 0.68, p < 0.01).

Discussion: The results of the research suggest a strong connection between
spirituality, as expressed by the 5 factors, and workplace employee performance.
Further research might be required to identify ways that organizations can use to
integrate spiritual practices in the workforce.
Having discussed what our research is all about, we move on to describing how
the research was conducted. This is where the various theoretical constructs and
variables are operationalized and where a detailed description of the methods used is
given. The details should be sufficient for other researchers to replicate the study and
confirm or disprove its findings. Readers should also have sufficient information to
evaluate the appropriateness of the methods we used for the hypotheses we set and
the type of data we collected. References to past research that used similar methods
for similar studies should be provided as support for the choices we made.
Everything presented in section 2.4 of Chapter 2 (like research design,
sampling, instruments, etc.) is material that will be mentioned here, so the interested
reader is referred to those sections for additional information. Specifics about
experimental manipulation and interventions, if used, also need to be discussed within
their specific context. It is suggested that the methods/methodology section be written
in the past tense and passive voice to reduce researcher bias when discussing the
choices made (depersonalizing the presentation).
After the methods, we proceed to the results section, where we summarize the
collected data, the analysis performed on them, and the results of our research.
This needs to be done in sufficient detail to provide a complete picture of the results
to someone with a professional knowledge of quantitative methods (Chapters 4, 5, and
6). No citations for the methods used are necessary in this section, unless a justification
for a special procedure is required to interpret the results. The language used in
reporting statistical results is more or less standard, so we provide here a list of the
ways such results could be presented.
Mean and standard deviation are always presented as a pair, like (M = 25.3, SD
= 1.8). Alternatively, in narrative form, we might say that the mean for our sample
for VariableX was 25.3 variable units (SD = 1.8), substituting ‘units’ with the units used
for the variable. Test results should be presented with their associated p values. For a
t-test this could be in the form “there was a significant effect for VariableX (M = 25.3,
SD = 1.8), t(10) = 2.23, p = 0.05”. Similarly, for a chi-square test we might say “the
percentage of our sample that expressed VariableX did not differ by VariableY, χ2(4,
N = 89) = 0.93, p > 0.05”. Correlations could take the form “VariableX and
VariableY were strongly correlated, r(46) = 0.78, p < 0.01”, and ANOVA could be
reported as “a one-way analysis of variance showed that the effect of VariableX was
significant, F(3, 27) = 5.94, p = .007”. Finally, regression results can be in the form “a
significant regression equation was found between VariableX and VariableY, F(1, 210)
= 37.29, p < 0.01, with R2 = 0.12”. Most other statistical tests discussed in this book
can be presented in similar ways.
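For authors who compute their statistics in Python rather than in SPSS, the short sketch below (using NumPy and SciPy on entirely hypothetical, randomly generated data) shows how such report strings can be assembled programmatically; the variable names and values are illustrative only and do not correspond to any study in this book.

    # A minimal sketch: assembling report strings of the kind listed above.
    # The data are randomly generated and purely illustrative.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = rng.normal(25, 2, size=11)            # hypothetical VariableX (n = 11)
    y = 0.5 * x + rng.normal(0, 1, size=11)   # hypothetical VariableY, related to x

    # Mean and standard deviation pair: (M = ..., SD = ...)
    print(f"(M = {x.mean():.1f}, SD = {x.std(ddof=1):.1f})")

    # One-sample t-test against a hypothetical reference mean of 24: t(df), p
    t_stat, t_p = stats.ttest_1samp(x, popmean=24)
    print(f"t({len(x) - 1}) = {t_stat:.2f}, p = {t_p:.3f}")

    # Pearson correlation: r(df), p, with df = N - 2
    r, r_p = stats.pearsonr(x, y)
    print(f"r({len(x) - 2}) = {r:.2f}, p = {r_p:.3f}")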
After the presentation of the results we come to the last main section of the
paper, the discussion of the research findings. This section is often titled “Conclusions”.
7.1.3 References
The last part of a journal publication is, if not the most “torturous”, at least
the most boring one (based on anecdotal evidence and personal experience).
Citing sources in the text and listing the references at the end is a requirement for
every research publication, as it documents the sources used to support statements
about claims and facts that relate to our research in some way. By the time researchers
reach this stage they will have undoubtedly seen hundreds of citations and references
through their review of the literature, so some familiarity with referencing styles
will have been picked up along the way.
Popular styles nowadays include those of the American Psychological Association
(APA), the Modern Language Association (MLA), the Institute of Electrical and
Electronics Engineers (IEEE), the Chicago Manual of Style, Harvard, etc. These have
been developed by different associations and journals to ensure consistency and to
address the needs of specific disciplines. APA, for example, is predominantly used in
the social sciences, while IEEE is very popular in engineering and the sciences.
Overall, there are great similarities between them, as they all need to sufficiently
describe the source material, but the differences can be enough to lead to paper
rejection if not properly addressed. Table 7.2 demonstrates the APA, MLA, and IEEE
styles for a research journal article and a book, as produced by Zotero (mentioned
below). For additional types of references and in-text citation styles, the reader should
refer to websites that explain the various styles.
Luckily for researchers, software has been developed to manage references.
Zotero is a popular free tool that includes a citation manager and plugins for browsers
like Firefox and Chrome, as well as for Microsoft Word. It allows for the creation of
a citation library that multiple researchers can access and update online. It can also
produce a bibliography in any of the popular formats available.
7.2 Dissertations
A special category of published research concerns dissertations. These can be
at the master’s level (M.Sc., M.Ed., MFA, etc.) or the doctorate level (Ph.D., DBA,
Ed.D., D.Eng., etc.). The differences are mainly in the length of the manuscript (with
the doctorate being more extensive), which generally reflects the amount of time
dedicated to the degree (1-2 years for a master’s and 3+ additional years for a
doctorate), and in the contribution of the work to theory (mainly the Ph.D. domain)
and/or practice (mainly the DBA and master’s domain).
Regarding structure, the great majority of dissertations follow a standard five-
chapter structure, which is the IMRaD format with the interjection of a literature
review chapter after the introduction. This is deemed necessary because of the
extensive coverage a dissertation must give to what has been done in the past on the
research topic. Because the previous section discussed what is to be included in the
various IMRaD sections, we will present the various chapters of a quantitative dissertation
with a brief discussion. The reader should keep in mind that additional entries are
required before these chapters and include:
• Title page: Includes the title of the research, type of degree, school and
department, author name, and publication year.
• Abstract page: Same as section 7.1.1.
• Acknowledgments page: Everyone who has contributed to the
research in any form or means should be acknowledged here.
• Table of Contents page.
• List of Tables page: Should mirror the table titles within the body of the
paper according to the school’s referencing style.
• List of Figures page: Should mirror the figure titles according to the
school’s referencing style.
The five chapters (discussed next) will have to be followed by the references
section and any appendices mentioned in the main body of the text. When the
dissertation is complete and after it has been properly defended, the researcher can
proceed with the publication process. Apart from publishing it as a printed book, there
are dedicated databases like ProQuest that accept dissertations and make them
available to anyone interested.
constructs that will support the research, the discussion needs to also
address the appropriateness of the aforementioned for the study. In
the case of Ph.D.s, this section should close with the extensions to
theory (or the development of a new theory) that the research is trying
to achieve. This section would normally include the definitions of key
terms and constructs used, along with the research questions and
hypotheses of the study, because they are part of the model that will be
developed (for Ph.D.s) or used. In many dissertations, though, they
form separate sections, so for compliance reasons they will also be
discussed separately here.
• Definition of Key Terms: The terms that will be used to form the
research questions and hypotheses, and any others that need to
accompany them, should be clearly and concisely presented and
supported with citations.
• Research Questions: Ensure that the research questions are aligned
with the purpose of the research and are based on the theoretical
framework selected for the research. An easy way to ensure alignment
is to take the sentence that expresses the purpose of the research
and convert it to a question. This can be a central research question
(CRQ) upon which the research questions (RQs) will expand.
• Hypotheses: In quantitative research, each research question that is
not exploratory in nature will have to be followed by a pair of null and
alternative hypotheses (a brief illustration is given after this list).
• Nature of the Study: Describe the research methodology and design
and discuss their appropriateness for our research (purpose, research
questions, and hypotheses). Briefly also discuss the data collection and
analysis methods (a more detailed discussion should be reserved for
Chapter 3). Make sure citation support is provided. 2–3 pages should
be enough for this section.
• Significance of the Study: Here is where we need to discuss the
importance of our contribution both to theory (mainly Ph.Ds) and
practice. The benefits of the answers to the research questions should
be emphasized as well as the positive impact of completing the study.
• Summary: Here we need to restate the key points of Chapter 1.
• Search Strategy: As with all the parts of the research it should be clear
how the material was acquired so that future researchers can replicate
and validate the study. A paragraph describing the databases used for
searches, keywords used, and the screening process should be
sufficient here.
• Topics/Subtopics: The past and current state of affairs in the area of
our research should be presented, followed by a critical analysis and
synthesis of the key elements as they relate to our research. A historical
account of the theoretical approaches used to study similar
phenomena, along with their advantages and disadvantages and their
implications for practice, would help frame the subject. This discussion
should include both theoretical and practical perspectives, should be
comprehensive, and should flow logically. A mistake to avoid is treating
the literature review as an annotated bibliography, with the material of
one reference simply following another.
• Summary: Key points of the chapter should be summarized.
Emphasize similarities between research approaches and the
dissertation topic along with differences, omissions, and the challenges
they pose.
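To make the alignment from purpose to research questions to hypotheses concrete, here is a brief hypothetical illustration, reusing the workplace-spirituality example of Table 7.1 (the wording is illustrative, not prescriptive):

    Purpose: To examine the relationship between workplace spirituality and employee performance.
    CRQ: What is the relationship between workplace spirituality and employee performance?
    RQ1: Is there a statistically significant relationship between belief in a “higher power” and employee performance?
    H10: There is no statistically significant relationship between belief in a “higher power” and employee performance.
    H1a: There is a statistically significant relationship between belief in a “higher power” and employee performance.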
The plethora of available statistical methods often creates the need to simplify
the process of selecting the appropriate method for the research we are conducting.
This need has led to flow charts, decision trees, and other forms of representing the
available options in an effort to ease the selection process. Based on what is discussed
in this book, the summary below lists the distribution-free/non-parametric options
according to the number and types of variables involved, with the corresponding SPSS
menu paths where available.
• One dichotomous variable: Binomial test (SPSS: Analyze > Nonparametric Tests >
Binomial).
• One sample, or two paired/related samples: Wilcoxon signed-rank / rank-sum test
(SPSS: Analyze > Nonparametric Tests > Legacy Dialogs > 2 Related Samples).
• Two independent samples: Mann-Whitney U test.
• Two ranked/ordinal variables: Spearman’s rank correlation (SPSS: Analyze >
Correlate > Bivariate > Spearman, uncheck Pearson).
• Three or more independent groups: Kruskal-Wallis H test / one-way ANOVA on
ranks.
• One dependent and many independent variables: logistic regression for a categorical
dependent (SPSS: Analyze > Regression > Binary Logistic); the parametric counterpart
for a continuous dependent is multiple linear regression (SPSS: Analyze > Regression
> Linear).
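To illustrate how such a selection aid can be mechanized, the short Python sketch below encodes a few of the distribution-free choices listed above. The function name and the simplified rules are illustrative only; a real choice also depends on measurement level, sample size, and the assumption checks discussed in earlier chapters.

    # A minimal, deliberately incomplete sketch of a test-selection helper.
    def suggest_nonparametric_test(groups: int, paired: bool, outcome: str) -> str:
        if outcome == "dichotomous" and groups == 1:
            return "Binomial test"
        if outcome == "ordinal_or_continuous":
            if groups == 1 or (groups == 2 and paired):
                return "Wilcoxon signed-rank test"
            if groups == 2 and not paired:
                return "Mann-Whitney U test"
            if groups > 2 and not paired:
                return "Kruskal-Wallis H test (one-way ANOVA on ranks)"
        if outcome == "two_ranked_variables":
            return "Spearman's rank correlation"
        return "Consult the full selection table or a statistician"

    print(suggest_nonparametric_test(groups=2, paired=False,
                                     outcome="ordinal_or_continuous"))
    # -> Mann-Whitney U test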