
Vol. 8(1), pp. 1-27, March 2020
DOI: 10.14662/IJARER2020.015
Copyright © 2020. Author(s) retain the copyright of this article
ISSN: 2360-7866
International Journal of Academic Research in Education and Review
http://www.academicresearchjournals.org/IJARER/Index.htm

Review

An Overview of Data Analysis and Interpretations in Research

Dawit Dibekulu Alem
Lecturer at Mekdela Amba University, College of Social Sciences and Humanities, Department of English Language and Literature. Email: [email protected]

Accepted 16 March 2020

Research is a scientific endeavor that generates new knowledge and solves existing problems. Data analysis is the crucial part of research that makes the results of a study meaningful: it is the process of collecting, transforming, cleaning, and modeling data with the goal of discovering the required information, and it supports the researcher in reaching a conclusion. To say simply that data analysis is important for research is an understatement; no research can survive without it. It can be applied in two ways, qualitatively and quantitatively. Both are beneficial because they help to structure the findings from different sources of data collection such as survey research, to break a macro problem into micro parts, and to act as a filter when acquiring meaningful insights from a huge data set. Furthermore, every researcher has to sort through the pile of data he or she has collected before reaching a conclusion on the research question; mere data collection is of no use. Data analysis proves crucial in this process, provides a meaningful base for critical decisions, and helps to create a complete dissertation proposal. After the analysis, results are reported through qualitative and quantitative methods. Quantitative data analysis mainly uses numbers, graphs, charts, equations, and statistics (descriptive and inferential). Qualitative data is data represented in a verbal or narrative format, collected through focus groups, interviews, open-ended questionnaire items, and other less structured situations.

Key Words: data, data analysis, qualitative and quantitative data analysis

Cite This Article As: Dawit DA (2020). An Overview of Data Analysis and Interpretations in Research. Inter. J.
Acad. Res. Educ. Rev. 8(1): 1-27

INTRODUCTION

Research can be considered as an area of investigation to solve a problem within a short period of time or in the coming long future. As explained by Kothari (2004), research in common parlance refers to a search for knowledge. It can also be defined as a scientific and systematic search for pertinent information on a specific topic. In fact, research is an art of scientific investigation. The Advanced Learner's Dictionary of Current English (Oxford, 1952, p. 1069), cited in Kothari (2004), lays down the meaning of research as "a careful investigation or inquiry especially through search for new facts in any branch of knowledge." Moreover, Redman and Mory (1923), cited in Kothari (2004), define research as a "systematized effort to gain new knowledge."

In research, getting relevant data and using these data properly is mandatory. The task of data collection begins after a research problem has been defined and the research design/plan chalked out. While deciding about the
method of data collection to be used for the study, the researcher should keep in mind two types of data, viz., primary and secondary. The primary data are those which are collected afresh and for the first time, and thus happen to be original in character. The secondary data, on the other hand, are those which have already been collected by someone else and which have already been passed through the statistical process. The researcher would have to decide which sort of data he would be using (thus collecting) for his study, and accordingly he will have to select one or the other method of data collection. The methods of collecting primary and secondary data differ, since primary data are to be originally collected, while in the case of secondary data the nature of data collection work is merely that of compilation. Whatever the case, the data used in any research should be analyzed properly, either qualitatively or quantitatively, based on the nature of the data collected. Data collected from various sources can be gathered, reviewed, and then analyzed to form some sort of finding or conclusion.

There are a variety of specific data analysis methods, some of which include data mining, text analytics, business intelligence, and data visualization. Patton (1990) stated that data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains. It can be done qualitatively or quantitatively. Data analysis is the central step in both qualitative and quantitative research.

Whatever the data are, it is their analysis that, in a decisive way, forms the outcomes of the research. The purpose of analyzing data is to obtain usable and useful information. The analysis, irrespective of whether the data is qualitative or quantitative, may describe and summarize the data, identify relationships between variables, compare variables, identify differences between variables, and forecast outcomes. Sometimes, data collection is limited to recording and documenting naturally occurring phenomena, for example by recording interactions, which may be taken as a qualitative type; qualitative analysis is concentrated on analyzing such recordings. On the other hand, data may be collected numerically using questionnaires and rating scales, and these data are mostly analyzed using quantitative techniques.

With this introduction, this paper focuses on data analysis: its concepts, techniques, expected assumptions, advantages, and some limitations of selected data analysis techniques. That is, the concept of data analysis and its processing steps are treated in the first part of the paper. In the second part, concepts of qualitative and quantitative data analysis methods are explained in detail. Moreover, emphasis is given to descriptive and inferential statistical methods of data analysis. Finally, how to write the summary, conclusions and recommendations based on the findings gained qualitatively as well as quantitatively is included.

Data Analysis

Concept of Data Analysis

What do we mean when we say data in the first place? The 1973 Webster's New Collegiate Dictionary defines data as "factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation." The 1996 Webster's II New Riverside Dictionary Revised Edition defines data as "information, especially information organized for analysis." The Merriam-Webster Online Dictionary defines data as: factual information (as measurements or statistics) used as a basis for reasoning, discussion, or calculation; information output by a sensing device or organ that includes both useful and irrelevant or redundant information and must be processed to be meaningful; or information in numerical form that can be digitally transmitted or processed.

Taking from the above definitions, a practical approach to defining data is that it is numbers, characters, images, or other methods of recording, in a form which can be assessed to make a determination or decision about a specific action. Many believe that data on its own has no meaning; only when interpreted does it take on meaning and become information. By closely examining data (data analysis) we can find patterns to perceive information, and then information can be used to enhance knowledge (The Free On-line Dictionary of Computing, 1993-2005, Denis Howe).

Simply, data analysis is changing the collected raw data into meaningful facts and ideas to be understood either qualitatively or quantitatively. It is studying the tabulated material in order to determine inherent facts or meanings. It involves breaking down existing complex factors into simpler parts and putting the parts together in new arrangements for the purpose of interpretation. As to Kothari (2004), data analysis includes the comparison of the outcomes of the various treatments upon the several groups and the making of a decision as to the achievement of the goals of research. The analysis, irrespective of whether the data is qualitative or quantitative, may be to describe and summarize the data, identify relationships between variables, compare variables, identify differences between variables, and forecast outcomes, as mentioned in the introduction.

According to Ackoff (1961), a plan of analysis can and
should be prepared in advance before the actual collection of material. A preliminary analysis on the skeleton plan may, as the investigation proceeds, develop into a complete final analysis, enlarged and reworked as and when necessary. This process requires an alert, flexible and open mind. Caution is necessary at every step.

In the process of data analysis, statistical methods have contributed a great deal. Simple statistical calculation finds a place in almost any study dealing with large or even small groups of individuals, while complex statistical computations form the basis of many types of research. It may not be out of place, therefore, to enumerate some statistical methods of analysis used in educational research. The analysis and interpretation of data represent the application of deductive and inductive logic to the research process.

Technically speaking, processing implies editing, coding, classification and tabulation of collected data so that they are amenable to analysis. The term analysis refers to the computation of certain measures along with searching for patterns of relationship that exist among data groups (Kothari, 2004). Thus, "in the process of analysis, relationships or differences supporting or conflicting with original or new hypotheses should be subjected to statistical tests of significance to determine with what validity data can be said to indicate any conclusions". But authors such as Selltiz et al. (1959) do not like to make a difference between processing and analysis. They opine that analysis of data in a general way involves a number of closely related operations which are performed with the purpose of summarizing the collected data and organizing these in such a manner that they answer the research question(s). We, however, shall prefer to observe the difference between the two terms as stated here in order to understand their implications more clearly.

Generally, data analysis in research is divided into qualitative and quantitative data analysis. The data, after collection, has to be processed and analyzed in accordance with the outline laid down for the purpose at the time of developing the research plan. This is essential for a scientific study and for ensuring that we have all relevant data for making contemplated comparisons and analysis.
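The processing operations just named (editing, coding, classification and tabulation) can be sketched as a small pipeline. The following is an illustrative sketch only: the survey fields, category codes and class limits are hypothetical examples, not taken from this paper.

```python
# Illustrative sketch of the four data-processing operations
# (editing, coding, classification, tabulation) on a toy survey.
# All field names, codes and class limits are hypothetical.
from collections import Counter

raw_responses = [
    {"age": "23", "satisfaction": "satisfied"},
    {"age": "", "satisfaction": "very satisfied"},   # omission: age missing
    {"age": "41", "satisfaction": "Satisfied"},      # inconsistent case
]

# 1. Editing: detect errors and omissions, make entries uniform.
edited = []
for r in raw_responses:
    if not r["age"]:                 # omission: drop (or return for field editing)
        continue
    edited.append({"age": int(r["age"]),
                   "satisfaction": r["satisfaction"].lower()})

# 2. Coding: assign numerals to answers; the classes are exhaustive
# and mutually exclusive (each answer fits exactly one code).
codes = {"very satisfied": 1, "satisfied": 2, "neutral": 3, "dissatisfied": 4}
for r in edited:
    r["satisfaction_code"] = codes[r["satisfaction"]]

# 3. Classification: arrange data into homogeneous groups
# (here, classification according to class intervals on age).
def age_class(age):
    return "20-29" if age < 30 else "30-39" if age < 40 else "40+"
for r in edited:
    r["age_class"] = age_class(r["age"])

# 4. Tabulation: summarize the data in compact form (a frequency table).
table = Counter((r["age_class"], r["satisfaction_code"]) for r in edited)
for (age_band, code), n in sorted(table.items()):
    print(age_band, code, n)
```

In practice each stage would be far richer (field and central editing, a full codebook, checks that classes are exhaustive and mutually exclusive), but the order of operations is the one described above.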
Data Processing Operations

In the data analysis process we need to focus on the following data-processing operation stages, as suggested by Kothari (2004).

1. Editing: Editing of data is a process of examining the collected raw data (especially in surveys) to detect errors and omissions and to correct these when possible. As a matter of fact, editing involves a careful scrutiny of the completed questionnaires and/or schedules. Editing is done to assure that the data are accurate, consistent with other facts gathered, uniformly entered, as complete as possible, and well arranged to facilitate coding and tabulation (Kothari, 2004). So, this indicates that editing is the process of data correction. According to Kothari (2004), editing can be done in two ways: one can talk of field editing and central editing.

Field editing consists in the review of the reporting forms by the investigator for completing (translating or rewriting) what the latter has written in abbreviated and/or illegible form at the time of recording the respondents' responses. This type of editing is necessary in view of the fact that individual writing styles often can be difficult for others to decipher. On the other hand, central editing should take place when all forms or schedules have been completed and returned to the office. It implies that all forms should get a thorough editing by a single editor in a small study and by a team of editors in the case of a large inquiry. Editor(s) may correct obvious errors such as an entry in the wrong place, an entry recorded in months when it should have been recorded in weeks, and the like.

2. Coding: Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes. Such classes should be appropriate to the research problem under consideration. They must also possess the characteristic of exhaustiveness (i.e., there must be a class for every data item) and also that of mutual exclusivity, which means that a specific answer can be placed in one and only one cell in a given category set (Kothari, 2004). In addition, coding is necessary for efficient analysis, and through it the several replies may be reduced to a small number of classes which contain the critical information required for analysis. Coding decisions should usually be taken at the designing stage of the questionnaire. This makes it possible to pre-code the questionnaire choices, which in turn is helpful for computer tabulation, as one can straightforwardly key-punch from the original questionnaires (Neuman, 2000).

3. Classification: Most research studies result in a large volume of raw data which must be reduced into homogeneous groups if we are to get meaningful relationships. This fact necessitates classification of data, which happens to be the process of arranging data in
groups or classes on the basis of common characteristics. Data having a common characteristic are placed in one class, and in this way the entire data get divided into a number of groups or classes. Classification can be one of the following two types, depending upon the nature of the phenomenon involved:

(a) Classification according to attributes: data are classified on the basis of common characteristics which can either be descriptive (such as literacy, sex, honesty, etc.) or numerical (such as weight, height, income, etc.). Descriptive characteristics refer to qualitative phenomena which cannot be measured quantitatively; only their presence or absence in an individual item can be noticed. Data obtained this way on the basis of certain attributes are known as statistics of attributes, and their classification is said to be classification according to attributes. Such classification can be simple classification or manifold classification (Kothari, 2004).

(b) Classification according to class intervals: unlike descriptive characteristics, numerical characteristics refer to quantitative phenomena which can be measured through some statistical units. Data relating to income, production, age, weight, etc. come under this category. Such data are known as statistics of variables and are classified on the basis of class intervals. All the classes or groups, with their respective frequencies taken together and put in the form of a table, are described as a grouped frequency distribution or simply a frequency distribution. Classification according to class intervals usually involves the following three main problems:

(i) How many classes should there be? What should be their magnitudes?

There can be no specific answer with regard to the number of classes. The decision about this calls for the skill and experience of the researcher. However, the objective should be to display the data in such a way as to make it meaningful for the analyst. Typically, we may have 5 to 15 classes. With regard to the second part of the question, we can say that, to the extent possible, class intervals should be of equal magnitudes, but in some cases unequal magnitudes may result in better classification. Hence the researcher's objective judgment plays an important part in this connection. Multiples of 2, 5 and 10 are generally preferred while determining class magnitudes. Some statisticians adopt the following formula, suggested by Sturges as cited in Kothari (2004), for determining the size of the class interval:

i = R / (1 + 3.3 log N)

where i = size of class interval;
R = range (i.e., the difference between the values of the largest item and the smallest item among the given items);
N = number of items to be grouped.

It should also be kept in mind that in case one or two or very few items have very high or very low values, one may use what are known as open-ended intervals in the overall frequency distribution.

(ii) How to choose class limits?

While choosing class limits, the researcher must take into consideration the criterion that the mid-point of a class interval (generally worked out by taking the sum of the upper limit and lower limit of a class and then dividing this sum by 2) and the actual average of the items of that class interval should remain as close to each other as possible. Consistent with this, the class limits should be located at multiples of 2, 5, 10, 20, 100 and such other figures. Class limits may generally be stated in any of the following forms:

Exclusive type class intervals: They are usually stated as follows:

10–20 read as above 10 and under 20
20–30 read as above 20 and under 30
30–40 read as above 30 and under 40
40–50 read as above 40 and under 50

Thus, under exclusive type class intervals, the items whose values are equal to the upper limit of a class are grouped in the next higher class. For example, an item whose value is exactly 30 would be put in the 30–40 class interval and not in the 20–30 class interval. In simple words, we can say that under exclusive type class intervals, the upper limit of a class interval is excluded, and items with values less than the upper limit (but not less than the lower limit) are put in the given class interval.

Inclusive type class intervals: They are usually stated as follows:

11–20
21–30
31–40
41–50

In inclusive type class intervals the upper limit of a class interval is also included in the concerned class interval. Thus, an item whose value is 20 will be put in the 11–20 class interval. The stated upper limit of the class interval 11–20 is 20, but the real limit is 20.99999, and as such the 11–20 class interval really means 11 and under 21.
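The rules above can be checked numerically. The short sketch below applies Sturges' formula as given here (with a base-10 logarithm, following the presentation cited from Kothari) and bins values into exclusive and inclusive type intervals; the sample values, bin widths and starting limits are invented for illustration.

```python
# Sturges' rule for the size of class intervals, as cited in
# Kothari (2004): i = R / (1 + 3.3 * log10(N)).
# The sample values below are invented for illustration.
import math

values = [12, 15, 18, 22, 25, 27, 30, 33, 38, 41, 44, 48]
R = max(values) - min(values)        # range of the items
N = len(values)                      # number of items to be grouped
i = R / (1 + 3.3 * math.log10(N))    # suggested class-interval size
print(round(i, 2))                   # about 7.89 for this sample

# Exclusive type: the upper limit is excluded, so a value of
# exactly 30 falls in 30-40, not in 20-30.
def exclusive_bin(x, width=10, start=10):
    lower = start + ((x - start) // width) * width
    return f"{lower}-{lower + width}"

# Inclusive type: the upper limit is included, so a value of
# exactly 20 falls in 11-20.
def inclusive_bin(x, width=10, start=11):
    lower = start + ((x - start) // width) * width
    return f"{lower}-{lower + width - 1}"

print(exclusive_bin(30), inclusive_bin(20))
```

Here exclusive_bin(30) returns "30-40", matching the rule that an item equal to an upper limit moves to the next higher class, while inclusive_bin(20) returns "11-20", since the upper limit is included.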
When the phenomenon under consideration happens to be a discrete one (i.e., it can be measured and stated only in integers), then we should adopt inclusive type classification. But when the phenomenon happens to be a continuous one, capable of being measured in fractions as well, we can use exclusive type class intervals.

4. Tabulation: When a mass of data has been assembled, it becomes necessary for the researcher to arrange the same in some kind of concise and logical order. This procedure is referred to as tabulation. Thus, tabulation is the process of summarizing raw data and displaying the same in compact form (i.e., in the form of statistical tables) for further analysis. In a broader sense, tabulation is an orderly arrangement of data in columns and rows. As Kothari (2004) stated, tabulation is essential because:

it conserves space and reduces explanatory and descriptive statements to a minimum,
it facilitates the process of comparison,
it facilitates the summation of items and the detection of errors and omissions, and
it provides a basis for various statistical computations.

Generally, in the process of data analysis the above four steps need to be critically applied, because without applying the above processing operations one cannot do good data analysis.

Qualitative Data Analysis

Concept of Qualitative Data Analysis

Data that is represented either in a verbal or narrative format is qualitative data. These types of data are collected through focus groups, interviews, open-ended questionnaire items, and other less structured situations. A simple way to look at qualitative data is to think of qualitative data in the form of words. Migrant & Seasonal Head Start (2006) stated:

Qualitative data analysis is the classification and interpretation of linguistic (or visual) material to make statements about implicit and explicit dimensions and structures of meaning-making in the material and what is represented in it. Meaning-making can refer to subjective or social meanings.

From the above explanation we can understand that qualitative data analysis is one way of data analysis which helps to describe or interpret the data through words which transfer information through different dimensions.

Qualitative data analysis is the range of processes and procedures whereby we move from the qualitative data that have been collected into some form of explanation, understanding or interpretation of the people and situations we are investigating (Cohen et al., 2007). It is usually based on an interpretative philosophy. The idea is to examine the meaningful and symbolic content of qualitative data. It refers to non-numeric information such as interview transcripts, notes, video and audio recordings, images and text documents.

Qualitative data analysis can be divided into the following five categories:

1. Content analysis: This refers to the process of categorizing verbal or behavioral data to classify, summarize and tabulate the data. According to Cohen et al. (2007), content analysis is the procedure for the categorization of verbal or behavioral data for the purpose of classification, summarization and tabulation. Content analysis can be done on two levels:

a. Descriptive: what is the data? and
b. Interpretative: what was meant by the data?

2. Narrative analysis: This method involves the reformulation of stories presented by respondents, taking into account the context of each case and the different experiences of each respondent. In other words, narrative analysis is the revision of primary qualitative data by the researcher. Narratives are transcribed experiences. Every interview/observation has a narrative aspect which the researcher has to sort out and reflect upon, enhance, and present in a revised shape to the reader. The core activity in narrative analysis is to reformulate stories presented by people in different contexts and based on their different experiences.

3. Discourse analysis: A method of analyzing naturally occurring talk (spoken interaction) and all types of written text. It focuses on how people express themselves verbally in their everyday social life, i.e., how language is used in everyday situations:

a. Sometimes people express themselves in a simple and straightforward way;
b. Sometimes people express themselves vaguely and indirectly;
c. The analyst must refer to the context when interpreting the message, because the same phenomenon can be described in a number of different ways depending on context.
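The descriptive level of content analysis (what is the data?) can be illustrated with a minimal sketch that categorizes verbal responses and tabulates the result. The category names, keyword lists and responses below are invented for illustration; real content analysis would rest on a carefully developed coding scheme rather than simple keyword matching.

```python
# Minimal sketch of content analysis at the descriptive level:
# categorize verbal responses by keyword and tabulate the counts.
# Categories, keywords and responses are invented examples.
from collections import Counter

categories = {
    "workload": ["busy", "hours", "overtime"],
    "support":  ["help", "mentor", "support"],
}

responses = [
    "The overtime hours made the term very busy.",
    "My mentor gave real support when I struggled.",
    "Without help from colleagues I could not cope.",
]

def categorize(text):
    # Return every category whose keywords appear in the response.
    text = text.lower()
    hits = [c for c, kws in categories.items() if any(k in text for k in kws)]
    return hits or ["uncategorized"]

# Tabulate category frequencies across all responses.
tally = Counter(c for r in responses for c in categorize(r))
print(tally)
```

The tabulated counts correspond to the classification-summarization-tabulation sequence described above; the interpretative level (what was meant by the data?) would still require the researcher's own reading of each response in context.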
4. Framework analysis: This is a more advanced method that consists of several stages: familiarization (transcribing and reading the data); identifying a thematic framework (an initial coding framework developed both from a priori issues and from emergent issues); coding (using numerical or textual codes to identify specific pieces of data which correspond to different themes); charting (charts created using headings from the thematic framework); and mapping and interpretation (searching for patterns, associations, concepts and explanations in the data).

5. Grounded theory: As to Corbin and Nicholas (2005), this method of qualitative data analysis starts with an analysis of a single case to formulate a theory. Then, additional cases are examined to see if they contribute to the theory. The analysis starts with an examination of a single case from a 'pre-defined' population in order to formulate a general statement (a concept or hypothesis) about the population. Afterwards the analyst examines another case to see whether it fits the statement. If it does, a further case is selected; if it does not fit, there are two options: either the statement is changed to fit both cases, or the definition of the population is changed in such a way that the case is no longer a member of the newly defined population. Then another case is selected and the process continues. In such a way one should be able to arrive at a statement that fits all cases of a population-as-defined. This method is only for a limited set of analytic problems: those that can be solved with some general overall statement (Cohen et al., 2007).

Aims of Qualitative Data Analysis

The analysis of qualitative data can have several aims. Neuman (2000) explained that the first aim may be to describe a phenomenon in some or greater detail. The phenomenon can be the subjective experiences of a specific individual or group (e.g. the way people continue to live after a fatal diagnosis). This can focus on the case (individual or group) and its special features and the links between them. The analysis can also focus on comparing several cases (individuals or groups) and on what they have in common or on the differences between them. The second aim may be to identify the conditions on which such differences are based. This means looking for explanations for such differences (e.g. circumstances which make it more likely that coping with a specific illness situation is more successful than in other cases). And the third aim may be to develop a theory of the phenomenon under study from the analysis of empirical material (e.g. a theory of illness trajectories).

Advantages and Disadvantages of Qualitative Analysis

Advantages of Qualitative Analysis

Qualitative data analysis has different advantages. Denscombe (2007) stated that there are a number of advantages, such as:

The first is that the data and the analyses are 'grounded'. A particular strength associated with qualitative research is that the descriptions and theories such research generates are 'grounded in reality'. This is not to suggest that they depict reality in some simplistic sense, as though social reality were 'out there' waiting to be 'discovered'. But it does suggest that the data and the analysis have their roots in the conditions of social existence. There is little scope for 'armchair theorizing' or 'ideas plucked out of thin air'.

The second is that there is a richness and detail to the data. The in-depth study of relatively focused areas, the tendency towards small-scale research and the generation of 'thick descriptions' mean that qualitative research scores well in terms of the way it deals with complex social situations. It is better able to deal with the intricacies of a situation and do justice to the subtleties of social life.

The third is that there is tolerance of ambiguity and contradictions. To the extent that social existence involves uncertainty, accounts of that existence ought to be able to tolerate ambiguities and contradictions, and qualitative research is better able to do this than quantitative research (Maykut and Morehouse, 1994, as cited in Denscombe, 2007). This is not a reflection of a weak analysis. It is a reflection of the social reality being investigated.

Lastly, there is the prospect of alternative explanations. Qualitative analysis, because it draws on the interpretive skills of the researcher, opens up the possibility of more than one explanation being valid. Rather than a presumption that there must be, in theory at least, one correct explanation, it allows for the possibility that different researchers might reach different conclusions, despite using broadly the same methods.

Disadvantages of Qualitative Analysis

In relation to the disadvantages, Denscombe (2007)
also explained the following disadvantages:

First, the data might be less representative. The flip side of qualitative research's attention to thick description and the grounded approach is that it becomes more difficult to establish how far the findings from the detailed, in-depth study of a small number of instances may be generalized to other similar instances. Provided sufficient detail is given about the circumstances of the research, however, it is still possible to gauge how far the findings relate to other instances, but such generalizability is still more open to doubt than it is with well-conducted quantitative research.

Second, interpretation is bound up with the 'self' of the researcher. Qualitative research recognizes more openly than does quantitative research that the researcher's own identity, background and beliefs have a role in the creation of data and the analysis of data. The research is 'self-aware'. This means that the findings are necessarily more cautious and tentative, because it operates on the basic assumption that the findings are a creation of the researcher rather than a discovery of fact. Although it may be argued that quantitative research is guilty of trying to gloss over the point, which equally well applies, the greater exposure of the intrusion of the 'self' in qualitative research inevitably means more cautious approaches to the findings (Denscombe, 2007).

Third, there is a possibility of de-contextualizing the meaning. In the process of coding and categorizing the field notes, texts or transcripts, there is a possibility that the words (or images, for that matter) get taken literally out of context. The context is an integral part of the qualitative data, and the context refers both to events surrounding the production of the data and to events and words that precede and follow the actual extracted pieces of data that are used to form the units for analysis. There is a very real danger for the researcher that in coding and categorizing the data, the meaning of the data is lost or transformed by wrenching it from its location (a) within a sequence of data (e.g. interview talk), or (b) within surrounding circumstances which have a bearing on the meaning of the unit as it was originally conceived at the time of data collection (Denscombe, 2007).

Fourth, there is the danger of oversimplifying the explanation. In the quest to identify themes in the data and to develop generalizations, the researcher can feel pressured to underplay, possibly disregard, data that 'doesn't fit'. Inconsistencies, ambiguities and alternative explanations [...] and avoid attempts to oversimplify matters (Denscombe, 2007).

Fifth, the analysis takes longer. The volume of data that a researcher collects will depend on the time and resources available for the research project. When it comes to the analysis of that data, however, it is almost guaranteed that it will seem like a daunting task (Denscombe, 2007).

Quantitative Data Analysis

Quantitative data is expressed in numerical terms, in which the numeric values could be large or small. Numerical values may correspond to a specific category or label. Quantitative analysis yields statistically reliable and generalizable results. In quantitative research we classify features, count them, and even construct more complex statistical models in an attempt to explain what is observed. Findings can be generalized to a larger population, and direct comparisons can be made between two corpora, so long as valid sampling and significance techniques have been used (Bryman and Cramer, 2005). Thus, quantitative analysis allows us to discover which phenomena are likely to be genuine reflections of the behavior of a language or variety, and which are merely chance occurrences. The more basic task of just looking at a single language variety allows one to get a precise picture of the frequency and rarity of particular phenomena, and thus their relative normality or abnormality.

However, the picture of the data which emerges from quantitative analysis is less rich than that obtained from qualitative analysis. For statistical purposes, classifications have to be of the hard-and-fast (so-called "Aristotelian") type: an item either belongs to class x or it doesn't. So, in the above example about the phrase "the red flag", we would have to decide whether to classify "red" as "politics" or "color". As can be seen, many linguistic terms and phenomena do not therefore belong to simple, single categories: rather, they are more consistent with the recent notion of "fuzzy sets", as in the red example. Quantitative analysis is therefore an idealization of the data in some cases. Also, quantitative analysis tends to sideline rare occurrences. To ensure that certain statistical tests (such as chi-squared) provide reliable results, it is essential that minimum frequencies are obtained, meaning that categories may have to be collapsed into one another, resulting in a loss of data richness (Dawson, 2002). So, in general, quantitative data analysis mainly uses numbers,
explanations can be frustrating in the way they inhibit a graphs, charts, equations, statistics( inferential and
nice clear generalization – but they are an inherent descriptive), ANOVA, ANCOVA, regression, and
feature of social life. Social phenomena are complex, and correlation etc.
the analysis of qualitative data needs to acknowledge this
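To make the numerical summaries mentioned here concrete, the following minimal Python sketch computes the mean, median, mode and standard deviation for the eight test scores that are used in the worked examples later in this article (15, 20, 21, 20, 36, 15, 25, 15); the code relies only on the standard library.

```python
import statistics

# The eight test scores from the worked examples in this article.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

mean = statistics.mean(scores)      # 167 / 8 = 20.875
median = statistics.median(scores)  # middle of the ordered scores -> 20
mode = statistics.mode(scores)      # most frequent value -> 15
stdev = statistics.stdev(scores)    # sample standard deviation (a measure of spread)

print(mean, median, mode, round(stdev, 2))
```

Note that the mean, median and mode of the same data set can all differ, which is exactly the point made in the examples under the descriptive statistics section below.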
8 Inter. J. Acad. Res. Educ. Rev.
Statistical Analysis of Data

Statistics is the body of mathematical techniques or processes for gathering, describing, organizing and interpreting numerical data. Since research often yields such quantitative data, statistics is a basic tool of measurement and research. The researcher who uses statistics is concerned with more than the manipulation of data; statistical methods go back to the fundamental purposes of analysis. Research in education may deal with two types of statistical data application: descriptive statistical analysis and inferential statistical analysis. To understand the difference between descriptive and inferential statistics, you must first understand the difference between populations and samples.

A population is the entire collection of a carefully defined set of people, objects, or events (Celine, 2017).

So, a population is the broader group of people to whom your results will apply. For example, if a researcher wants to conduct a research in education (e.g. grade 8 students' language skills in Debre Markos Administration town primary schools), all grade 8 students in that specific area are considered to be the population from which the samples will be taken.

A sample is a subset of the people, objects, or events selected from that population (Celine, 2017).

So, a sample is the group of individuals who participate in your study. For example, selected grade 8 students from the total population can be the sample for the research.

A. Descriptive Statistics

Descriptive statistics is the type of statistics that probably springs to most people's minds when they hear the word "statistics." In this branch of statistics, the goal is to describe. As Weiss (1999) stated, numerical measures are used to tell about features of a set of data. There are a number of items that belong in this portion of statistics, such as: the average, or measure of the center of a data set, consisting of the mean, median, mode, or midrange; the spread of a data set, which can be measured with the range or standard deviation; overall descriptions of data such as the five-number summary; measurements such as skewness and kurtosis; the exploration of relationships and correlation between paired data; and the presentation of statistical results in graphical form. These measures are important and useful because they allow scientists to see patterns among data, and thus to make sense of that data. Descriptive statistics consist of methods for organizing and summarizing information (Weiss, 1999).

A parameter is a descriptive characteristic of a population (Hinkle, Wiersma, & Jurs, 2003). For example, if we found the average language skills of all the grade 8 students mentioned above in the town (the population), the resulting average (also called the mean) would be a population parameter. To obtain this average, we first need to tabulate the measured skill score of every student. When calculating this mean, we are engaging in descriptive statistical analysis.

As Weiss (1999) explained, descriptive statistical analysis focuses on the exhaustive measurement of population characteristics. You define a population, assess each member of that population, and compute a summary value (such as a mean or standard deviation) based on those values. It is concerned with the numerical description of a particular observed group, and any similarity to those outside the group cannot be taken for granted. The data describe one group and that one group only. Much simple educational research involves descriptive statistics and provides valuable information about the nature of a particular group or class.

Data collected from tests and experiments often have little meaning or significance until they have been classified or rearranged in a systematic way. This procedure leads to the organization of the material under a few heads:
(i) Determination of the range of the interval between the largest and smallest scores.
(ii) Decision as to the number and size of the groups to be used in classification.

The class interval is therefore helpful for grouping the data into suitable units, and the number and size of these class intervals will depend upon the range of scores and the kinds of measures with which one is dealing. The number of class intervals which a given range will yield can be determined approximately by dividing the range by the interval tentatively chosen.

According to Agresti and Finlay (1997), the most commonly used methods of analyzing data statistically are: calculating the frequency distribution, usually in percentages of items under study; testing data for normality of distribution, skewness and kurtosis; calculating percentiles and percentile ranks; calculating measures of central tendency (mean, median and mode) and establishing norms; calculating measures of dispersion (standard deviation, mean deviation, quartile deviation and range); calculating measures of relationship, i.e. the coefficient of
correlation, reliability and validity by the Rank-difference and Product-moment methods; and the graphical presentation of data through the frequency polygon curve, histogram, cumulative frequency polygon, Ogive, etc.

While analyzing data, investigators usually make use of as many of the above simple statistical devices as necessary for the purpose of their study. There are two kinds of descriptive statistics that social scientists use:

Measures of central tendency - the mean, median, and mode are included under this category. Measures of central tendency capture general trends within the data and are calculated and expressed as the mean, median, and mode. A mean tells scientists the mathematical average of all of a data set, such as the average age at first marriage; the median represents the middle of the data distribution, like the age that sits in the middle of the range of ages at which people first marry; and the mode might be the most common age at which people first marry (Huck, 2004).

The above explanation indicates that the central tendency of a distribution is an estimate of the "center" of a distribution of values. A measure of central tendency is a central or typical value for a probability distribution. Let us see the following examples:

Example one: consider the test score values: 15, 20, 21, 20, 36, 15, 25, 15. The sum of these 8 values is 167, so the mean is 167/8 = 20.875.

Example two: if there were 500 scores in an ordered list, the median would be the average of scores 250 and 251. If we order the 8 scores shown above, we get: 15, 15, 15, 20, 20, 21, 25, 36. There are 8 scores, and scores 4 and 5 represent the halfway point. Since both of these scores are 20, the median is 20. If the two middle scores had different values, you would have to interpolate to determine the median.

Example three: in a bimodal distribution there are two values that occur most frequently. Notice that for the same set of 8 scores we got three different values -- 20.875, 20, and 15 -- for the mean, median and mode respectively. If the distribution is truly normal (i.e., bell-shaped), the mean, median and mode are all equal to each other.

Measures of spread - variance, standard deviation, quartiles and others are included under this category. As Strauss and Corbin (1990) stated:

Measures of spread describe how the data are distributed and relate to each other, including: the range, the entire range of values present in a data set; the frequency distribution, which defines how many times a particular value occurs within a data set; quartiles, subgroups formed within a data set when all values are divided into four equal parts across the range; mean absolute deviation, the average of how much each value deviates from the mean; variance, which illustrates how much of a spread exists in the data; and standard deviation, which illustrates the spread of data relative to the mean.

So, the above explanation shows that measures of spread are often visually represented in tables, pie and bar charts, and histograms to aid in the understanding of the trends within the data. These are ways of summarizing a group of data by describing how spread out the scores are. They describe how similar or varied the set of observed values is for a particular variable (data item). For example, the mean score of our 100 students may be 65 out of 100. However, not all students will have scored 65 marks. Rather, their scores will be spread out: some will be lower and others higher. Measures of spread help us to summarize how spread out these scores are. To describe this spread, a number of statistics are available to us, including the range, quartiles, absolute deviation, variance and standard deviation.

Generally, descriptive statistics includes the construction of graphs, charts, and tables, and the calculation of various descriptive measures such as averages, measures of variation, and percentiles. In fact, the greater part of this discussion deals with descriptive statistics.

B. Inferential Statistics

The second type of statistics is inferential statistics. Inferential statistical analysis involves the process of sampling: the selection for study of a small group that is assumed to be related to the large group from which it is drawn. As Agresti and Finlay (1997) stated, the small group is known as the sample; the large group, the population or universe. A statistic is a measure based on a sample. A statistic computed from a sample may be used to estimate a parameter, the corresponding value in the population from which it is selected. This is a set of methods used to make a generalization, estimate, prediction or decision. Inferential statistics is the mathematics and
logic of how this generalization from sample to population can be made. The fundamental question is: can we infer the population's characteristics from the sample's characteristics?

Example: if the average listening-skills test score of 350 randomly selected grade 8 students in the town of Debre Markos (out of 2,000 students) is calculated to be 75%, this is a sample result from which we can make a generalization about the total population (2,000 students); this is inferential statistics.

The major use of inferential statistics is to use information from a sample to infer something about a population. Inferential statistics consist of methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population (Weiss, 1999).

Inferential statistics are produced through complex mathematical calculations that allow scientists to infer trends about a larger population based on a study of a sample taken from it. Scientists use inferential statistics to examine the relationships between variables within a sample and then make generalizations or predictions about how those variables will relate to a larger population.

A measured value based upon sample data is a statistic. A population value estimated from a statistic is a parameter. A sample is a small proportion of a population selected for analysis. By observing the sample, certain inferences may be made about the population. Samples are not selected haphazardly, but are chosen in a deliberate way so that the influence of chance or probability can be estimated. The basic idea of inference is to estimate the parameters with the help of sample statistics, which play an extremely important role in educational research. These basic ideas, of which the concept of an underlying distribution is a part, comprise the foundation for testing hypotheses using statistical techniques.

The parameters are never known for certain unless the entire population is measured, and then there is no inference. We look at the statistics and their underlying distributions, and from them we reason to tenable conclusions about the parameters.

It is usually impossible to examine each member of the population individually. So scientists choose a representative subset of the population, called a statistical sample, and from this analysis they are able to say something about the population from which the sample came. There are two major divisions of inferential statistics, which Agresti and Finlay (1997) stated as follows:

A confidence interval gives a range of values for an unknown parameter of the population by measuring a statistical sample. This is expressed in terms of an interval and the degree of confidence that the parameter is within the interval.

Tests of significance, or hypothesis testing, where scientists make a claim about the population by analyzing a statistical sample. By design, there is some uncertainty in this process. This can be expressed in terms of a level of significance.

The above explanation shows us that, in statistics, a confidence interval (CI) is a type of interval estimate (of a population parameter) that is computed from the observed data. The confidence level is the frequency (i.e., the proportion) of possible confidence intervals that contain the true value of their corresponding parameter. And once sample data has been gathered through an observational study or experiment, statistical inference allows analysts to assess evidence in favor of some claim about the population from which the sample has been drawn. The methods of inference used to support or reject claims based on sample data are known as tests of significance.

Furthermore, Howell (2002) stated that a statistic is a numerical value that is computed from a sample, describes some characteristic of that sample such as the mean, and can be used to make inferences about the population from which the sample is drawn. For example, if you were to compute the average amount of insurance sold by your sample of 100 agents, that average would be a statistic because it summarizes a specific characteristic of the sample. Remember that the word "statistic" is generally associated with samples, while "parameter" is generally associated with populations. In a similar vein, Weiss (1999) noted that, in contrast to descriptive statistics, inferential statistical analysis involves using information from a sample to make inferences, or estimates, about the population.

Inferential statistics consist of methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population (Weiss, 1999). In short, inferential statistics includes methods like point estimation, interval estimation and hypothesis testing, which are all based on probability theory.

Example (Descriptive and Inferential Statistics)

Consider the event of tossing a die. The die is rolled 100 times and the results form the sample data. Descriptive statistics is used to group the sample data into the following table.
Outcome of the roll    Frequency in the sample data
1                      10
2                      20
3                      18
4                      16
5                      11
6                      25

Inferential statistics can now be used to verify whether the die is fair or not.
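One standard choice for this check (an assumption here, since the text does not name a specific test) is a chi-square goodness-of-fit test. A minimal Python sketch using the frequencies in the table above:

```python
# Chi-square goodness-of-fit test for the 100 die rolls summarized above.
# A fair die predicts an expected frequency of 100/6 for each face.
observed = [10, 20, 18, 16, 11, 25]
expected = sum(observed) / 6  # about 16.67 rolls per face

# Sum of (observed - expected)^2 / expected over the six faces.
chi_square = sum((o - expected) ** 2 / expected for o in observed)
print(round(chi_square, 2))  # 9.56
```

With 6 - 1 = 5 degrees of freedom, the critical value at the 0.05 significance level is about 11.07; since 9.56 < 11.07, this sample alone does not provide significant evidence that the die is unfair.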
Generally, descriptive and inferential statistics are interrelated. It is almost always necessary to use methods of descriptive statistics to organize and summarize the information obtained from a sample before methods of inferential statistics can be used to make a more thorough analysis of the subject under investigation. Furthermore, the preliminary descriptive analysis of a sample often reveals features that lead to the choice of the appropriate inferential method to be used later. Sometimes it is possible to collect the data from the whole population. In that case it is possible to perform a descriptive study on the population as well as, usually, on the sample. Only when an inference is made about the population based on information obtained from the sample does the study become inferential.

Analysis of Variance (ANOVA)

Concept of ANOVA

One of the methods for quantitative data analysis is analysis of variance. According to Kothari (2004), Professor R.A. Fisher was the first man to use the term "variance" and, in fact, it was he who developed a very elaborate theory concerning ANOVA, explaining its usefulness in the practical field. Later on, Professor Snedecor and many others contributed to the development of this technique.

ANOVA is essentially a procedure for testing the difference among different groups of data for homogeneity. "The essence of ANOVA is that the total amount of variation in a set of data is broken down into two types, that amount which can be attributed to chance and that amount which can be attributed to specified causes" (Bryman and Cramer, 2005). There may be variation between samples and also within sample items.

Cramer (2005) stated that the specific analysis of variance test that we will study is often referred to as the one-way ANOVA. It consists in splitting the variance for analytical purposes. Hence, it is a method of analyzing the variance to which a response is subject into its various components, corresponding to various sources of variation. Through this technique one can explain whether various varieties of seeds or fertilizers or soils differ significantly, so that a policy decision could be taken accordingly concerning a particular variety in the context of agricultural research (Cramer, 2005). Similarly, the differences in various types of feed prepared for a particular class of animal, or various types of drugs manufactured for curing a specific disease, may be studied and judged to be significant or not through the application of the ANOVA technique. Likewise, the manager of a big concern can analyze the performance of the various salesmen of his concern in order to know whether their performances differ significantly (Neuman, 2006). ANOVA can be one-way or two-way ANOVA.

One-way (single-factor) ANOVA - Under the one-way ANOVA, we consider only one factor, and the reason for said factor to be important is that several possible types of samples can occur within that factor (Neuman, 2006; Armstrong, Eperjesi and Gilmartin, 2002). We then determine if there are differences within that factor.

Two-way ANOVA - This technique is used when the data are classified on the basis of two factors. For example, the agricultural output may be classified on the basis of different varieties of seeds and also on the basis of different varieties of fertilizers used. A business firm may have its sales data classified on the basis of different salesmen and also on the basis of sales in different regions. In a factory, the various units of a product produced during a certain period may be classified on the basis of different varieties of machines used and also on the basis of different grades of labour (Neuman, 2006). Such a two-way design may or may not have repeated measurements of each factor.

Assumptions of ANOVA

Like so many of our inference procedures, ANOVA has some underlying assumptions which should be in place in
order to make the results of the calculations completely trustworthy. In relation to the assumptions of ANOVA, Huck (2004) stated that: (i) subjects are chosen via a simple random sample; (ii) within each group/population, the response variable is normally distributed; and (iii) while the population means may be different from one group to the next, the population standard deviation is the same for all groups.

Fortunately, ANOVA is somewhat robust (i.e., results remain fairly trustworthy despite mild violations of these assumptions). Assumptions (ii) and (iii) are close enough to being true if, after gathering simple random samples from each group, we: look at normal quantile plots for each group and, in each case, see that the data points fall close to a line; and compute the standard deviations for each group sample, and see that the ratio of the largest to the smallest group sample standard deviation is no more than two (Cramer, 2005; Neuman, 2006; Armstrong, Eperjesi and Gilmartin, 2002).

Uses of ANOVA

The one-way analysis of variance for independent groups applies to an experimental situation where there might be more than two groups. The t-test is limited to two groups, but ANOVA can analyze as many groups as you want. It examines the relationship between variables when a nominal-level independent variable has 3 or more categories and the dependent variable is a normally distributed interval/ratio-level variable; it produces an F-ratio, which determines the statistical significance of the result. It reduces the probability of a Type I error (which would occur if we did multiple t-tests rather than one single ANOVA) (Singh, 2007).

In relation to the use of ANOVA, Mordkoff (2016) stated that one-way ANOVA is used to test for significant differences among sample means; it differs from the t-test since more than 2 groups are tested simultaneously; one factor (the independent variable, also called the "grouping" variable) is analyzed; and the dependent variable should be interval or ratio, while the independent variable is usually nominal.

A two-way ANOVA is an extension of the one-way ANOVA. With a one-way, you have one independent variable affecting a dependent variable. With a two-way ANOVA, there are two independent variables. Use a two-way ANOVA when you have one measurement variable (i.e. a quantitative variable) and two nominal variables. In other words, if your experiment has a quantitative outcome and you have two categorical explanatory variables, a two-way ANOVA is appropriate. Assumptions for the two-way ANOVA: the population must be close to a normal distribution, samples must be independent, population variances must be equal, and groups must have equal sample sizes (Mordkoff, 2016).

Generally, an ANOVA tests whether one or more sample means are significantly different from each other. To determine which, or how many, sample means are different requires post hoc testing. Two sample means may fail to differ significantly when the difference between them is small and variability is high; conversely, even with the same difference between the means, if the variances are reduced the means can be significantly different.

Analysis of Co-variance (ANCOVA)

The Analysis of Covariance (generally known as ANCOVA) is a technique that sits between analysis of variance and regression analysis. It has a number of purposes, but the two that are perhaps of most importance are: to increase the precision of comparisons between groups by accounting for variation on important prognostic variables, and to "adjust" comparisons between groups for imbalances in important prognostic variables between these groups.

When we measure covariates and include them in an analysis of variance we call it analysis of covariance (or ANCOVA for short). There are two reasons for including covariates in ANOVA:

1. To reduce within-group error variance: In the discussion of ANOVA and t-tests we got used to the idea that we assess the effect of an experiment by comparing the amount of variability in the data that the experiment can explain against the variability that it cannot explain. If we can explain some of this "unexplained" variance (SSR) in terms of other variables (covariates), then we reduce the error variance, allowing us to more accurately assess the effect of the independent variable (SSM) (Hinkle, Wiersma, & Jurs, 2003).

2. Elimination of confounds: In any experiment, there may be unmeasured variables that confound the results (i.e. variables that vary systematically with the experimental manipulation). If any variables are known to influence the dependent variable being measured, then ANCOVA is ideally suited to remove the bias of these variables. Once a possible confounding variable has been identified, it can be measured and entered into the analysis as a covariate (Hinkle, Wiersma, & Jurs, 2003).

The above two explanations indicate that the reason for including covariates is that covariates are variables that a researcher seeks to control for (statistically
subtract the effects of) by using such techniques as multiple regression analysis (MRA) or analysis of covariance (ANCOVA).

There are other reasons for including covariates in ANOVA; readers interested in the detailed computation of ANCOVA may consult fuller treatments of the topic (Stevens, 2002; Wildt & Ahtola, 1978). Imagine, for example, that a researcher conducting a study of Viagra suddenly realized that the libido of the participants' sexual partners would affect the participants' own libido (especially because the measure of libido was behavioral). The study was therefore repeated on a different set of participants, but this time a measure of the partner's libido was taken. The partner's libido was measured in terms of how often they tried to initiate sexual contact.

Analysis of Covariance (ANCOVA) is an extension of ANOVA that provides a way of statistically controlling the (linear) effect of variables one does not want to examine in a study. These extraneous variables are called covariates, or control variables. (Covariates should be measured on an interval or ratio scale.) (Vogt, 1999). It allows you to remove covariates from the list of possible explanations of variance in the dependent variable. ANCOVA does this by using statistical techniques (such as regression to partial out the effects of covariates) rather than direct experimental methods to control extraneous variables. ANCOVA is used in experimental studies when researchers want to remove the effects of some antecedent variable. For example, pretest scores are used as covariates in pretest-posttest experimental designs. ANCOVA is also used in non-experimental research, such as surveys or nonrandom samples, or in quasi-experiments when subjects cannot be assigned randomly to control and experimental groups. Although fairly common, the use of ANCOVA for non-experimental research is controversial (Vogt, 1999).

Assumptions and Issues in ANCOVA

In addition to the assumptions underlying the ANOVA, there are two major assumptions that underlie the use of ANCOVA; both concern the nature of the relationship between the dependent variable and the covariate (Howell, 2002; Huck, 2004; Vogt, 1999). They stated the assumptions as follows:

The first is that the relationship is linear. If the relationship is nonlinear, the adjustments made in the ANCOVA will be biased; the magnitude of this bias depends on the degree of departure from linearity, especially when there are substantial differences between the groups on the covariate. Thus it is important for the researcher, in preliminary analyses, to investigate the nature of the relationship between the dependent variable and the covariate (by looking at a scatter plot of the data points), in addition to conducting an ANOVA on the covariate (Howell, 2002; Huck, 2004; Vogt, 1999).

The second assumption has to do with the regression lines within each of the groups (Howell, 2002; Huck, 2004; Vogt, 1999). We assume the relationship to be linear. Additionally, however, the regression lines for these individual groups are assumed to be parallel; in other words, they have the same slope. This assumption is often called homogeneity of regression slopes, or parallelism, and is necessary in order to use the pooled within-groups regression coefficient for adjusting the sample means; it is one of the most important assumptions for the ANCOVA.

Failure to meet this assumption implies that there is an interaction between the covariate and the treatment. This assumption can be checked with an F test on the interaction of the independent variable(s) with the covariate(s). If the F test is significant (i.e., there is a significant interaction), then this assumption has been violated and the covariate should not be used as is. A possible solution is converting the continuous scale of the covariate to a categorical (discrete) variable, making it an additional independent variable, and then using a factorial ANOVA to analyze the data.

Moreover, the assumptions underlying the ANCOVA are slightly modified from those for the ANOVA; conceptually, however, they are the same. According to Hinkle, Wiersma, & Jurs (2003), ANCOVA has the following assumptions:

Assumption 1: The cases represent a random sample from the population, and the scores on the dependent variable are independent of each other, known as the assumption of independence.

The test will yield inaccurate results if the independence assumption is violated. This is a design issue that should be addressed prior to data collection. Using random sampling is the best way of ensuring that the observations are independent; however, this is not always possible. The most important thing to avoid is having known relationships among participants in the study.

Assumption 2: The dependent variable is normally distributed in the population for any specific value of the covariate and for any one level of a factor (independent variable), known as the assumption of normality.
This assumption describes multiple conditional distributions of the dependent variable, one for every combination of values of the covariate and levels of the factor, and requires them all to be normally distributed. To the extent that population distributions are not normal and sample sizes are small, p values may be invalid. In addition, the power of ANCOVA tests may be reduced considerably if the population distributions are non-normal and, more specifically, thick-tailed or heavily skewed. The assumption of normality can be checked with skewness values (e.g., within ±3.29 standard deviations).

Assumption 3: The variances of the dependent variable for the conditional distributions are equal, known as the assumption of homogeneity of variance. To the extent that this assumption is violated and the group sample sizes differ, the validity of the results of the one-way ANCOVA should be questioned. Even with equal sample sizes, the results of the standard post hoc tests should be mistrusted if the population variances differ. The assumption of homogeneity of variance can be checked with Levene's F test.

This situation happens when researchers are trying to run an analysis of covariance (ANCOVA) model because they have a categorical independent variable and a continuous covariate. The problem arises when a coauthor, committee member, or reviewer insists that ANCOVA is inappropriate in this situation because one of the following ANCOVA assumptions is not met: the independent variable and the covariate are independent of each other, and there is no interaction between the independent variable and the covariate (Helwig, 2017).

Regression and Correlation

A. Regression

Regression analysis is used in statistics to find trends in data. For example, we might guess that there is a connection between how much we eat and how much we weigh; regression analysis can help us quantify that. Regression analysis will provide us with an equation for a graph so that we can make predictions about our data. For example, if we have been putting on weight over the last few years, it can predict how much we will weigh in ten years' time if we continue to put on weight at the same rate. It will also give us a slew of statistics (including a p-value and a correlation coefficient) to tell us how accurate our model is. Most elementary statistics courses cover very basic techniques, like making scatter plots and performing linear regression. However, we may come across more advanced techniques like multiple regression (Gogtay, Deshpande, and Thatte, 2017).

Furthermore, Huck (2004) stated that regression analysis is a way of predicting an outcome variable from one predictor variable (simple regression) or several predictor variables (multiple regression).

This tool is incredibly useful because it allows us to go a step beyond the data that we collected. So, this indicates that regression analysis is a statistical technique for investigating the relationship among variables. O'Brien and Scott (2012) stated the concept of regression as follows:

    Regression is particularly useful to understand the predictive power of the independent variables on the dependent variable once a causal relationship has been confirmed. To be precise, regression helps a researcher understand to what extent the change of the value of the dependent variable causes the change in the value of the independent variables, while other independent variables are held unchanged (p. 3).

From the above explanation we can understand that regression is a tool for quantitative analysis which is used to understand which among the independent variables are related to the dependent variable, and to explore the forms of these relationships used to infer causal relationships between the independent and dependent variables.

In regression analysis, the problem of interest is the nature of the relationship itself between the dependent variable (response) and the (explanatory) independent variable. The regression equation describes the relationship between two variables and is given by the general format:

Y = a + bX + ε

Where: Y = dependent variable;
X = independent variable;
a = intercept of regression line;
b = slope of regression line; and
ε = error term

In this format, given that Y is dependent on X, the slope b indicates the unit change in Y for every unit change in X. If b = 0.66, it means that every time X increases (or decreases) by a certain amount, Y increases (or decreases) by 0.66 that amount. The intercept a indicates the value of Y at the point where X = 0. Thus if X indicated market returns, the intercept would show how the dependent variable performs when the market has a flat quarter where returns are 0. In investment parlance, a manager has a positive alpha because a linear
Dawit 15

regression between the manager's performance and the performance of the market has an intercept number a greater than zero.

Assumptions for regression: There are assumptions to be taken into consideration to use regression as a tool of data analysis. Gogtay, Deshpande, and Thatte (2017) stated the following assumptions:

Assumption 1: The relationship between the independent variables and the dependent variable is linear. The first assumption of multiple regression is that the relationship between the IVs and the DV can be characterized by a straight line. A simple way to check this is by producing scatter plots of the relationship between each of our IVs and our DV.

Assumption 2: There is no multicollinearity in your data. This is essentially the assumption that your predictors are not too highly correlated with one another.

Assumption 3: The values of the residuals are independent. This is basically the same as saying that we need our observations (or individual data points) to be independent from one another (or uncorrelated). We can test this assumption using the Durbin-Watson statistic.

Assumption 4: The variance of the residuals is constant. This is called homoscedasticity, and is the assumption that the variation in the residuals (or amount of error in the model) is similar at each point across the model. In other words, the spread of the residuals should be fairly constant at each point of the predictor variables (or across the linear model). We can get an idea of this by looking at our original scatter plot, but to properly test this, we need to ask SPSS to produce a special scatter plot for us that includes the whole model (and not just the individual predictors). To test this fourth assumption, we need to plot the standardized values our model would predict against the standardized residuals obtained.

Assumption 5: The values of the residuals are normally distributed. This assumption can be tested by looking at the distribution of residuals.

Assumption 6: There are no influential cases biasing your model. Significant outliers and influential data points can place undue influence on your model, making it less representative of your data as a whole.

B. Correlation

Correlation is a measure of association between two variables. The variables are not designated as dependent or independent. As O'Brien and Scott (2012) explained, the two most popular correlation coefficients are Spearman's correlation coefficient rho and Pearson's product-moment correlation coefficient. When calculating a correlation coefficient for ordinal data, select Spearman's technique. For interval or ratio-type data, use Pearson's technique.

Ott (1993) stated that:

    Correlation is a measure of the strength of a relationship between two variables. Correlations do not indicate causality and are not used to make predictions; rather they help identify how strongly and in what direction two variables co-vary in an environment.

So, from the above definition we can deduce that correlation analysis is useful when researchers are attempting to establish if a relationship exists between two variables. The correlation coefficient is a measure of the degree of linear association between two continuous variables.

Pearson r correlation: Pearson r correlation is widely used in statistics to measure the degree of the relationship between linearly related variables (Gogtay, Deshpande, and Thatte, 2017). For example, in the stock market, if we want to measure how two commodities are related to each other, Pearson r correlation is used to measure the degree of relationship between the two commodities.

Assumptions: For the Pearson r correlation, both variables should be normally distributed. Other assumptions include linearity and homoscedasticity. Linearity assumes a straight-line relationship between each of the variables in the analysis, and homoscedasticity assumes that the data are normally distributed about the regression line (Gogtay, Deshpande, and Thatte, 2017).

Spearman rank correlation: Spearman rank correlation is a non-parametric test that is used to measure the degree of association between two variables. It was developed by Spearman, thus it is called the Spearman rank correlation (Gogtay, Deshpande, and Thatte, 2017). The Spearman rank correlation test does not make any assumptions about the distribution of the data and is the appropriate correlation analysis when the variables are measured on a scale that is at least ordinal.

Assumptions: The Spearman rank correlation test does not make any assumptions about the distribution. The assumptions of Spearman rho correlation are that data must be at least ordinal and scores on one variable must be monotonically related to the other variable.

The value of a correlation coefficient can vary from -1 to +1. A value of -1 indicates a perfect negative correlation, while +1 indicates a perfect positive correlation. A correlation of zero means there is no relationship between the two variables. When there is a negative correlation between two variables, as the value of one variable increases, the value of the other variable decreases, and vice versa (Gogtay, Deshpande, and Thatte, 2017). In other words, for a negative correlation, the variables work opposite each other. When there is a positive correlation between two variables, as the value of one variable increases, the value of the other variable also increases. The variables move together.

|-----------------------------|-----------------------------|-----------------------------|-----------------------------|
-1.00                        -.50                          0.00                         +.50                        +1.00
strong negative relationship                        weak or none                        strong positive relationship

The standard error of a correlation coefficient is used to determine the confidence intervals around a true correlation of zero. If your correlation coefficient falls outside of this range, then it is significantly different from zero. The standard error can be calculated for interval or ratio-type data (i.e., only for Pearson's product-moment correlation).

Example: A company wanted to know if there is a significant relationship between the total number of salespeople and the total number of sales. They collect data for five months.

Variable 1 (salespeople)    Variable 2 (sales)
207                         6907
180                         5991
220                         6810
205                         6553
190                         6190

Correlation coefficient = .921
Standard error of the coefficient = .068
t-test for the significance of the coefficient = 4.100
Degrees of freedom = 3
Two-tailed probability = .0263

Generally, as Hinkle, Wiersma, and Jurs (2003) stated, the goal of a correlation analysis is to see whether two measurement variables covary, and to quantify the strength of the relationship between the variables, whereas regression expresses the relationship in the form of an equation.

Types of Regression

Gogtay, Deshpande, and Thatte (2017) stated that essentially in research, there are three common types of regression analyses that are used, viz. linear, logistic and Cox regression. These are chosen depending on the type of variables that we are dealing with. Cox regression is a special type of regression analysis that is applied to survival or "time to event" data and will be discussed in detail in the next article in the series. Linear regression can be simple linear or multiple linear regression, while logistic regression could be polynomial in certain cases.

The type of regression analysis to be used in a given situation is primarily driven by the following three metrics: the number and nature of the independent variable/s, the number and nature of the dependent variable/s, and the shape of the regression line.

A. Linear regression: Linear regression is the most basic and commonly used regression technique and is of two types, viz. simple and multiple regression. You can use simple linear regression when there is a single dependent and a single independent variable. Both the variables must be continuous, and the line describing the relationship is a straight line (linear). Multiple linear regression, on the other hand, can be used when we have one continuous dependent variable and two or more independent variables. Importantly, the independent variables could be quantitative or qualitative (O'Brien and Scott, 2012). They added that, in simple linear regression, the outcome or dependent variable Y is predicted by only one independent or predictive variable. Multiple regression is not just a technique on its own. It is, in fact, a family of techniques that can be used to explore the relationship between one continuous dependent variable and a number of independent variables or predictors. Although multiple regression is based on correlation, it enables a more sophisticated exploration of the interrelationships among variables.

Both the independent variables here could be expressed either as continuous data or qualitative data. A linear relationship should exist between the dependent and independent variables.
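The salespeople example above can be reproduced with a short, self-contained Python sketch using the textbook least-squares and Pearson formulas (standard library only):

```python
import math

x = [207, 180, 220, 205, 190]        # Variable 1: number of salespeople
y = [6907, 5991, 6810, 6553, 6190]   # Variable 2: number of sales

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)

# Pearson correlation coefficient and least-squares line Y = a + bX
r = sxy / math.sqrt(sxx * syy)
b = sxy / sxx                 # slope: unit change in Y per unit change in X
a = my - b * mx               # intercept: predicted Y when X = 0

# t test for the significance of r, with n - 2 degrees of freedom
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(round(r, 3))   # 0.921, as reported in the example above
print(round(t, 3))   # about 4.1, with n - 2 = 3 degrees of freedom
```

Running this reproduces the example's correlation coefficient of .921 and its t statistic of about 4.1 on 3 degrees of freedom.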

B. Logistic regression: This type of regression analysis is used when the dependent variable is binary in nature. For example, if the outcome of interest is death in a cancer study, any patient in the study can have only one of two possible outcomes: dead or alive. The impact of one or more predictor variables on this binary variable is assessed. The predictor variables can be either quantitative or qualitative. Unlike linear regression, this type of regression does not require a linear relationship between the predictor and dependent variables.

For logistic regression to be meaningful, the following criteria must be satisfied: the independent variables must not be correlated amongst each other, and the sample size should be adequate. If the dependent variable is non-binary and has more than two possibilities, we use multinomial or polynomial logistic regression.
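The logistic model relates predictors to a binary outcome through the logistic (sigmoid) function rather than a straight line. A minimal Python sketch of the model's prediction step follows; the coefficients are made up for illustration, since in practice they are estimated from data by maximum likelihood:

```python
import math

def predicted_probability(a, b, x):
    """Logistic model: P(outcome = 1 | x) = 1 / (1 + e^-(a + b*x)).

    Unlike the linear model Y = a + bX, the output is a probability
    bounded between 0 and 1, which suits a binary dependent variable.
    """
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# Hypothetical coefficients a = -2.0, b = 0.5 for a single predictor x.
p = predicted_probability(-2.0, 0.5, 4)   # a + b*x = 0, so p = 0.5
print(round(p, 2))                        # 0.5
```

Because the sigmoid squeezes any value of a + bX into the interval (0, 1), no linear relationship between predictor and outcome is required, which is the point made in the paragraph above.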

Table 1: Types of regression (adopted from Gogtay, Deshpande, and Thatte, 2017)

The following table summarizes the basic types of regression:

Type of regression                  | Dependent variable and its nature      | Independent variable(s) and their nature       | Relationship between variables
Simple linear                       | One, continuous, normally distributed  | One, continuous, normally distributed          | Linear
Multiple linear                     | One, continuous                        | Two or more, may be continuous or categorical  | Linear
Logistic                            | One, binary                            | Two or more, may be continuous or categorical  | Need not be linear
Polynomial (multinomial) logistic   | Non-binary                             | Two or more, may be continuous or categorical  | Need not be linear
Cox (proportional hazards)          | Time to an event                       | Two or more, may be continuous or categorical  | Rarely linear
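Table 1 is essentially a decision rule, and it can be expressed as a small Python helper. The function and category names below are ours, chosen for illustration, and the mapping is a simplification of the table:

```python
def choose_regression(outcome_type: str) -> str:
    """Pick a regression family from the nature of the dependent
    variable, following the decision logic of Table 1 (simplified)."""
    mapping = {
        "continuous": "linear regression (simple or multiple)",
        "binary": "logistic regression",
        "multi-category": "multinomial (polynomial) logistic regression",
        "time-to-event": "Cox proportional hazards regression",
    }
    try:
        return mapping[outcome_type]
    except KeyError:
        raise ValueError(f"unknown outcome type: {outcome_type!r}")

print(choose_regression("binary"))   # logistic regression
```

In practice the choice also depends on the number and nature of the independent variables and the shape of the regression line, as stated above; this sketch captures only the first-cut decision driven by the dependent variable.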

Multiple Correlation and Regression

When there are two or more independent variables, the analysis concerning the relationship is known as multiple correlation, and the equation describing such a relationship is called the multiple regression equation. Here we explain multiple correlation and regression taking only two independent variables and one dependent variable (convenient computer programs exist for dealing with a great number of variables).

In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (= predictor) variable (X) and the other the dependent (= outcome) variable Y (Quirk, 2007). Prediction: if you know something about X, this knowledge helps you predict something about Y.

In simple linear regression, the outcome or dependent variable Y is predicted by only one independent or predictive variable. It should be stressed that only in very rare cases can the dependent variable be explained by one independent variable.

Assumptions behind Multiple Regression

Multiple regression makes a number of assumptions about the data, and it is important that these are met. The assumptions concern: sample size, multicollinearity of IVs, linearity, absence of outliers, homoscedasticity, and normality. Tests of these assumptions are numerous, so we will only look at a few of the more important ones.

a. Sample size: You will encounter a number of recommendations for a suitable sample size for multiple regression analysis (Tabachnick & Fidell, 2007). As a simple rule, you can calculate the following two values: 104 + m and 50 + 8m, where m is the number of independent variables, and take whichever is the largest as the minimum number of cases required. For example, with 4 independent variables, we would

require at least 108 cases [104 + 4 = 108; 50 + 8 × 4 = 82]. With 8 independent variables we would require at least 114 cases [104 + 8 = 112; 50 + 8 × 8 = 114]. With stepwise regression, we need at least 40 cases for every independent variable (Pallant, 2007). However, when any of the following assumptions is violated, larger samples are required.

b. Multicollinearity of independent variables: Any two independent variables with a Pearson correlation coefficient greater than .9 between them will cause problems. Remove independent variables with a tolerance value less than 0.1. A tolerance value is calculated as 1 − R², and is reported in SPSS (Tabachnick & Fidell, 2007).

c. Linearity: Standard multiple regression only looks at linear relationships. You can check this roughly using bivariate scatterplots of the dependent variable and each of the independent variables (Tabachnick & Fidell, 2007).

d. Absence of outliers: Outliers, such as extreme cases, can have a very strong effect on a regression equation. They can be spotted on scatter plots in the early stages of your analysis. There are also a number of more advanced techniques for identifying problematic points. These are very important in multiple regression analysis, where you are interested not only in extreme values but in unusual combinations of independent values.

e. Homoscedasticity: This assumption is similar to the assumption of homogeneity of variance with ANOVAs. More advanced methods include examining residuals. It requires that there be equality of variance in the independent variables for each value of the dependent variable. We can do this in a crude way with the scatter plots for each independent variable against the dependent variable (Tabachnick & Fidell, 2007). If there is equality of variance, then the points of the scatter plot should form an evenly balanced cylinder around the regression line.

f. Normality: The dependent and independent variables should be normally distributed.

When we talk about multiple regression, it can be: standard multiple regression (all of the independent, or predictor, variables are entered into the equation simultaneously), hierarchical multiple regression (the independent variables are entered into the equation in the order specified by the researcher based on their theoretical approach), and stepwise multiple regression (the researcher provides SPSS with a list of independent variables and then allows the program to select which variables it will use and in which order they can go into the equation, based on statistical criteria).

Uses of Correlation and Regression

There are three main uses for correlation and regression. Cohen (1988) stated that:

    One is to test hypotheses about cause-and-effect relationships. In this case, the experimenter determines the values of the X-variable and sees whether variation in X causes variation in Y. For example, giving people different amounts of a drug and measuring their blood pressure.

    The second main use for correlation and regression is to see whether two variables are associated, without necessarily inferring a cause-and-effect relationship. In this case, neither variable is determined by the experimenter; both are naturally variable. If an association is found, the inference is that variation in X may cause variation in Y, or variation in Y may cause variation in X, or variation in some other factor may affect both X and Y.

    The third common use of linear regression is estimating the value of one variable corresponding to a particular value of the other variable.

Advantages and Disadvantages of Quantitative Analysis

1 Advantages of Quantitative Analysis

Denscombe (2007) stated the following advantages of quantitative analysis. First, it is scientific: Quantitative data lend themselves to various forms of statistical techniques based on the principles of mathematics and probability. Such statistics provide the analyses with an aura of scientific respectability. The analyses appear to be based on objective laws rather than the values of the researcher. Second, confidence: Statistical tests of significance give researchers additional credibility in terms of the interpretations they make and the confidence they have in their findings. Third, measurement: The analysis of quantitative data provides a solid foundation for description and analysis. Interpretations and findings are based on measured quantities rather than impressions, and these are, at least in principle, quantities that can be checked by others for authenticity. Fourth, analysis: Large volumes of quantitative data can be analyzed relatively quickly, provided adequate preparation and planning has occurred in advance. Once the procedures are 'up and running', researchers can interrogate their

results relatively quickly. Fifth, presentation: Tables and charts provide a succinct and effective way of organizing quantitative data and communicating the findings to others. Widely available computer software aids the design of tables and charts, and takes most of the hard labor out of statistical analysis.

Disadvantages of Quantitative Analysis

According to Denscombe (2007), the following are some limitations of quantitative data analysis. First, quality of data: Quantitative data are only as good as the methods used to collect them and the questions that are asked. As with computers, it is a matter of 'garbage in, garbage out'. Second, technicist: There is a danger of researchers becoming obsessed with the techniques of analysis at the expense of the broader issues underlying the research. Particularly with the power of computers at researchers' fingertips, attention can sway from the real purpose of the research towards an overbearing concern with the technical aspects of analysis. Third, data overload: Large volumes of data can be a strength of quantitative analysis but, without care, they can start to overload the researcher. Too many cases, too many variables, too many factors to consider: the analysis can be driven towards too much complexity, and the researcher can get swamped. Fourth, false promise: Decisions made during the analysis of quantitative data can have far-reaching effects on the kinds of findings that emerge. In fact, the analysis of quantitative data, in some respects, is no more neutral or objective than the analysis of qualitative data. For example, the manipulation of categories and the boundaries of grouped frequencies can be used to achieve a data fix, to show significance where other combinations of the data do not. Quantitative analysis is not as scientifically objective as it might seem on the surface.

DISCUSSION OF RESULTS

Qualitative Data Result

Research Gateway shows us how to discuss the results that we have found in relation to both our research questions and existing knowledge. This is our opportunity to highlight how our research reflects, differs from, and extends current knowledge of the area in which we have chosen to carry out research. This section is our chance to demonstrate exactly what we know about this topic by interpreting our findings and outlining what they mean. At the end of our discussion we should have discussed all of the results that we found and provided an explanation for our findings.

The discussion section should not be simply a summary of the results we have found; at this stage we will have to demonstrate original thinking. First, we should highlight and discuss how our research has reinforced what is already known about the area. Many students make the mistake of thinking that they should have found something new; in fact, very few research projects have findings that are unique. Instead, we are likely to have a number of findings that reinforce what is already known about the field, and we need to highlight these, explaining why we think this has occurred. Second, we may have discovered something different, and if this is the case, we will have plenty to discuss. We should outline what is new and how this compares to what is already known. We should also attempt to provide an explanation as to why our research identified these differences. Third, we need to consider how our results extend knowledge about the field. Even if we found similarities between our results and the existing work of others, our research extends knowledge of the area by reinforcing current thinking. We should state how it does this, as this is a legitimate finding. It is important that this section is comprehensive and well structured, making clear links back to the literature we reviewed earlier in the project. This will allow us the opportunity to demonstrate the value of our research, and it is therefore very important to discuss our work thoroughly.

The resources in this section of the gateway should help us to:

Interpret the research: the key to a good discussion is a clear understanding of what the research means. This can only be done if the results are interpreted correctly.

Discuss coherently: a good discussion presents a coherent, well-structured explanation that accounts for the findings of the research, making links between the evidence obtained and existing knowledge.

As always, use the Gateway resources appropriately. As usual, the resources have been included because we believe they provide accessible, practical and helpful information on how to discuss our work. On the other hand, don't forget that our institution will have requirements of us and our project that override any information that we get from this Gateway. For example, we might not have to produce a separate discussion section (it depends on different institutions and research types), as this may need to be included with the presentation of results. This is often the case for qualitative research, so we must be sure what is needed. Find out, and then use the Gateway accordingly.

When crafting our findings, the first thing we want to think about is how we will organize them. Our

findings represent the story we are going to tell in other non-textual elements to help the reader understand
response to the research questions we have answered. the data. Make sure that non-textual elements do not
Thus, we will want to organize that story in a way that stand in isolation from the text but are being used to
makes sense to us and will make sense to our reader. supplement the overall description of the results and to
We want to think about how we will present the findings help clarify key points being made (Agresti, and Finlay,
so that they are compelling and responsive to the 1997). Further information about how to effectively
research question(s) we answered. These questions may present data using charts and graphs can be found here.
not be the questions we set out to answer but they will Quantitative Research is used to quantify the problem
definitely be the questions we answered. We may by way of generating numerical data or data that can be
discover that the best way to organize the findings is first transformed into usable statistics. It is used to quantify
by research question and second by theme. There may attitudes, opinions, behaviors, and other defined
be other formats that are better for telling our story. Once variables and generalize results from a larger sample
we have decided how we want to organize the findings, population. Quantitative Research uses measurable data
we will start the chapter by reminding our reader of the to formulate facts and uncover patterns in research Huck
research questions. We will need to differentiate between (2004). So, for quantitative data you will need to decide in
is presenting raw data and using data as evidence or what format to present your findings i.e. bar charts, pie
examples to support the findings we have identified charts, histograms etc. You will need to label each table
(Cohen et.al.,2007).Here are some points to consider: and figure accurately and include a list of tables and a list
Our findings should provide sufficient evidence from our of figures with corresponding page numbers in your
data to support the conclusions we have made. Evidence Contents page or Appendices.
takes the form of quotations from interviews and excerpts Following is a list of characteristics and advantages of
from observations and documents, ethically we have to using quantitative methods: The data collected is
make sure we have confidence in our findings and numeric, allowing for collection of data from a large
account for counter-evidence (evidence that contradicts sample size, Statistical analysis allows for greater
our primary finding) and not report something that does objectivity when reviewing results and therefore, results
not have sufficient evidence to back it up, our findings are independent of the researcher, Numerical results can
should be related back to our conceptual framework, our be displayed in graphs, charts, tables and other formats
findings should be in response to the problem presented that allow for better interpretation, Data analysis is less
(as defined by the research questions) and should be the time-consuming and can often be done using statistical
“solution” or “answer” to those questions ,and We should software, Results can be generalized if the data are
focus on data that enables us to answer your research based on random samples and the sample size was
questions, not simply on offering raw data (Neuman, 2000).
Qualitative research presents “best examples” of raw data to demonstrate an analytic point, not simply to display data. Numbers (descriptive statistics) help our reader understand how prevalent or typical a finding is. Numbers are helpful and should not be avoided simply because this is a qualitative dissertation.

Quantitative Data Result

A quantitative data result is one type of research result, presented in quantitative form such as numbers and statistics. As Creswell (2013), Neuman and Robson (2004), and Neuman and Neuman (2006) stated, quantitative methods are used to examine the relationship between variables, with the primary goal being to analyze and represent that relationship mathematically through statistical analysis. This is the type of research approach most commonly used in scientific research problems.
The findings of your study should be written objectively and in a succinct and precise format. In quantitative studies, it is common to use graphs, tables, and charts. Data collection methods can be relatively quick, depending on the type of data being collected, and numerical quantitative data may be viewed as more credible and reliable, especially to policy makers, decision makers, and administrators (Neuman & Robson, 2004).
For qualitative data you may want to include quotes from interviews. Any sample questionnaires or transcripts can be included in your Appendices. Qualitative analysis and discussion will often demand a higher level of writing/authoring skill to clearly present the emergent themes from the research. It is easy to become lost in a detailed presentation of the narrative and lose sight of the need to give priority to the broader themes. Creswell (2013) stated that quantitative research deals in numbers, logic, and an objective stance. Quantitative research focuses on numeric and unchanging data and detailed, convergent reasoning rather than divergent reasoning [i.e., the generation of a variety of ideas about a research problem in a spontaneous, free-flowing manner]. So, quantitative data results are presented in the same way.
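As an illustration of the kind of numerical summary a quantitative results section typically reports, the sketch below computes the sample size, mean, standard deviation, and an approximate 95% confidence interval in plain Python. The scores are invented for the example (they are not data from any study cited here), and the normal-approximation interval is an assumption of this sketch; small samples would usually use a t-based interval instead.

```python
# Minimal sketch: descriptive statistics and an approximate 95% CI
# for an invented set of test scores (illustration only).
from statistics import NormalDist, mean, stdev

scores = [72, 85, 61, 90, 78, 66, 88, 74, 81, 69]  # hypothetical scores

n = len(scores)
m = mean(scores)                 # sample mean
s = stdev(scores)                # sample standard deviation
z = NormalDist().inv_cdf(0.975)  # ~1.96 for a two-sided 95% interval
margin = z * s / n ** 0.5        # normal-approximation margin of error

print(f"n = {n}, M = {m:.2f}, SD = {s:.2f}")
print(f"95% CI for the mean: [{m - margin:.2f}, {m + margin:.2f}]")
```

Reporting n, M, SD, and the interval together follows the advice, discussed in this section, of always giving descriptive statistics and confidence intervals alongside any inferential result.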
Singh (2007) stated that a quantitative data result has these main characteristics: the data are usually gathered using structured research instruments; the results are based on larger sample sizes that are representative of the population; the research study can usually be replicated or repeated, given its high reliability; the researcher has a clearly defined research question to which objective answers are sought; all aspects of the study are carefully designed before data is collected; the data are in the form of numbers and statistics, often arranged in tables, charts, figures, or other non-textual forms; the project can be used to generalize concepts more widely, predict future results, or investigate causal relationships; and the researcher uses tools, such as questionnaires or computer software, to collect numerical data.
In quantitative data result presentation, Bryman and Cramer (2005) explained the following things to keep in mind when reporting the results of a study using quantitative methods. Explain the data collected and their statistical treatment as well as all relevant results in relation to the research problem you are investigating; interpretation of results is not appropriate in this section. Report unanticipated events that occurred during your data collection, and explain how the actual analysis differs from the planned analysis. Explain your handling of missing data and why any missing data do not undermine the validity of your analysis. Explain the techniques you used to “clean” your data set. Choose a minimally sufficient statistical procedure, provide a rationale for its use and a reference for it, and specify any computer programs used. Describe the assumptions for each procedure and the steps you took to ensure that they were not violated. When using inferential statistics, provide the descriptive statistics, confidence intervals, and sample sizes for each variable as well as the value of the test statistic, its direction, the degrees of freedom, and the significance level [report the actual p value]. Avoid inferring causality, particularly in nonrandomized designs or without further experimentation. Use tables to provide exact values; use figures to convey global effects. Keep figures small in size, and include graphic representations of confidence intervals whenever possible. Always tell the reader what to look for in tables and figures.
Generally, quantitative methods emphasize objective measurements and the statistical, mathematical, or numerical analysis of data collected through polls, questionnaires, and surveys, or by manipulating pre-existing statistical data using computational techniques. Quantitative research focuses on gathering numerical data and generalizing it across groups of people or to explain a particular phenomenon.

Summary, Conclusion, and Recommendations

Summary

A research summary is a professional piece of writing that describes your research to some prospective audience. The main priority of a research summary is to provide the reader with a brief overview of the whole study. To write a quality summary, it is vital to identify the important information in a study and condense it for the reader. Having a clear knowledge of your topic or subject matter enables you to easily comprehend the contents of your research summary (Philip, 1986).

Globio (2017) stated that the guidelines in writing the summary of findings are the following.

1. There should be a brief statement about the main purpose of the study, the population or respondents, the period of the study, the method of research used, the research instrument, and the sampling design.

Example: A research study on the teaching of science in the high schools of Province A may be summarized as: This study was conducted for the purpose of determining the status of teaching science in the high schools of Province A. The descriptive method of research was utilized and the normative survey technique was used for gathering data. The questionnaire served as the instrument for collecting data. All the teachers handling science and a 20% representative sample of the students were the respondents. The inquiry was conducted during the school year 1989-'90.

2. The findings may be lumped all together, but clarity demands that each specific question under the statement of the problem must be written first, to be followed by the findings that would answer it. The specific questions should follow the order they are given under the statement of the problem.

Example: How qualified are the teachers handling science in the high schools of Province A? Of the 59 teachers, 31 or 52.54% were BSC graduates and three or 5.08% were MA degree holders. The rest, 25 or 42.37%, were non-BSC baccalaureate degree holders with at least 18 education units. Less than half of all the teachers, only 27 or 45.76%, were science majors, and the majority, 32 or 54.24%, were non-science majors.
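The percentage breakdown used in the Province A example (a hypothetical study, as the text notes) can be recomputed with a few lines of Python; the group counts below are taken from that worked example.

```python
# Recompute the percentage breakdown of the 59 hypothetical science
# teachers by qualification (counts from the worked example above).
teachers = {"BSC graduates": 31, "MA degree holders": 3,
            "non-BSC degree holders": 25}
total = sum(teachers.values())  # 59 teachers in all

for group, count in teachers.items():
    print(f"{group}: {count} or {100 * count / total:.2f}%")
```

This is the kind of arithmetic check worth running before a summary of findings is written, since a single mistyped percentage undermines the credibility of the whole table.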
3. The findings should be textual generalizations, that is, a summary of the important data consisting of text and numbers. Every statement of fact should consist of words, numbers, or statistical measures woven into a meaningful statement. No deductions, inferences, or interpretations should be made; otherwise they will only be duplicated in the conclusion.

Only the important findings, the highlights of the data, should be included in the summary, especially those upon which the conclusions should be based. Findings are not explained nor elaborated upon anymore. They should be stated as concisely as possible.
The summary actually is found at the beginning of the written piece and will often lead to a concise abstract of the work, which will aid with search engine searches (Erwin, 2013). The summary of any written paper that delves into a research-related topic will provide the reader with a high-level snapshot of the entire written work. The summary will give a brief background of the topic, highlight the research that was done and significant details in the work, and finalize the work's results, all in one paragraph. Only top-level information should be provided in this section, and it should make the reader want to read more after they see the summary.

Conclusions

The conclusion is one part of research. Girma Tadess (2014) stated that the conclusion may be the most important part of the research. The writer must not merely repeat the introduction, but explain in expert-like detail what has been learned, explained, decided, proven, etc. The writer must reveal the way in which the paper's thesis might have significance in society. It should strive to answer questions that the readers logically raise. The writer should point out the importance or implications of the research on the area of societal concern.

Guidelines in writing the conclusions. According to Philip (1986), the following should be the characteristics of the conclusions:

1. Conclusions are inferences, deductions, abstractions, implications, interpretations, general statements, and/or generalizations based upon the findings. Conclusions are the logical and valid outgrowths of the findings. They should not contain any numerals, because numerals generally limit the forceful effect or impact and scope of a generalization. No conclusions should be made that are not based upon the findings.

Example: The conclusion that can be drawn from the findings in No. 2 under the summary of findings is this: All the teachers were qualified to teach in the high school, but the majority of them were not qualified to teach science.

2. Conclusions should appropriately answer the specific questions raised at the beginning of the investigation, in the order they are given under the statement of the problem. The study becomes almost meaningless if the questions raised are not properly answered by the conclusions.

Example: If the question raised at the beginning of the research is “How adequate are the facilities for the teaching of science?” and the findings show that the facilities are less than the needs of the students, the answer and the conclusion should be: “The facilities for the teaching of science are inadequate”.

3. Conclusions should point out what was factually learned from the inquiry. However, no conclusions should be drawn from the implied or indirect effects of the findings.

Example: From the findings that the majority of the teachers were non-science majors and the facilities were less than the needs of the students, what has been factually learned is that the majority of the teachers were not qualified to teach science and the science facilities were inadequate.

It cannot be concluded that science teaching in the high schools of Province A was weak, because there are no data telling that the science instruction was weak. The weakness of science teaching is an indirect or implied effect of the non-qualification of the teachers and the inadequacy of the facilities. This is better placed under the summary of implications.
If there is a specific question which runs this way, “How strong is science instruction in the high schools of Province A as perceived by the teachers and students?”, then a conclusion to answer this question should be drawn. However, the respondents should have been asked how they perceived the degree of strength of the science instruction, whether it is very strong, strong, fairly strong, weak or very weak. The conclusion should be based upon the responses to the question.

4. Conclusions should be formulated concisely, that is, brief and short, yet they should convey all the necessary information resulting from the study as required by the specific questions.
Without any strong evidence to the contrary, conclusions should be stated categorically. They should be worded as if they are 100 percent true and correct. They should not give any hint that the researcher has some doubts about their validity and reliability. The use of qualifiers such as probably, perhaps, may be, and the like should be avoided as much as possible.

5. Conclusions should refer only to the population, area, or subject of the study. Take, for instance, the hypothetical teaching of science in the high schools of Province A: all conclusions about the faculty, facilities, methods, problems, etc. refer only to the teaching of science in the high schools of Province A.

Conclusions should not be repetitions of any statements anywhere in the thesis. They may be recapitulations if necessary, but they should be worded differently and they should convey the same information as the statements recapitulated.
In drawing the conclusion, we should be aware of some dangers to avoid in drawing up conclusions (Bacani, et al., pp. 48-52): bias; incorrect generalization (an incorrect generalization is made when there is a limited body of information or when the sample is not representative of the population); incorrect deduction (this happens when a general rule is applied to a specific case); incorrect comparison (a basic error in statistical work is to compare two things that are not really comparable); abuse of correlation data (a correlation study may show a high degree of association between two variables); limited information furnished by any one ratio; and misleading impressions concerning the magnitude of the base variable.
The conclusion will be towards the end of the work and will be the logical closure to all the work, found at the end of the document. As Erwin (2013) stated, the conclusion will have more detailed information than the summary, but it should not be a repeat of the entire body of the work. The conclusion should revisit the main points of the research and the results of the investigation. This section should be where all the research is pulled together and all open topics are closed. The final results and a call to action should be included in this phase of the writing.

RECOMMENDATIONS

A recommendations section should be included in a report when the results and conclusions indicate that further work must be done, or when the writer needs to discuss several possible options to best remedy a problem. The writer should not introduce new ideas in the recommendations section, but rely on the evidence presented in the results and conclusions sections. Via the recommendations section, the writer is able to demonstrate that he or she fully understands the importance and implications of his or her research by suggesting ways in which it may be further developed (Berk, Hart, Boerema, and Hands, 1998).
Furthermore, Erwin (2013) described that, for recommending similar research to be conducted, the recommendation should be: It is recommended that similar research should be conducted in other places. Other provinces should also make inquiries into the status of the teaching of science in their own high schools so that, if similar problems and deficiencies are found, concerted efforts may be exerted to improve science teaching in all high schools in the country.

Research Report Writing

A research report is a condensed form or a brief description of the research work done by the researcher. It involves several steps to present the report in the form of a thesis or dissertation. A research paper can be used for exploring and identifying scientific, technical and social issues. If it's your first time writing a research paper, it may seem daunting, but with good organization and focus of mind, you can make the process easier on yourself (Berk, Hart, Boerema, and Hands, 1998). In addition to this, Erwin (2013) stated that writing a research paper involves four main stages: choosing a topic, researching your topic, making an outline, and doing the actual writing. The paper won't write itself, but by planning and preparing well, the writing practically falls into place. Also, try to avoid plagiarism. So, in each section of the paper we will need to write critically. Most of the considerations in each section are explained as follows:

a. Introduction: The introduction is a critical part of your paper because it introduces the reasons behind your paper's existence. It must state the objectives and scope of your work, present what problem or question you address, and describe why this is an interesting or important challenge (Erwin, 2013). In addition, it is important to introduce appropriate and sufficient references to prior works so that readers can understand the context and background of the research and the specific reason for your research work. Having explored those, the objectives and scope of your work must be clearly stated. The introduction may explain the approach that is characteristic of your work, and mention the essence of the conclusion of the paper.
b. Methods: The Methods section provides sufficient detail of the theoretical and experimental methods and materials used in your research work so that any reader would be able to repeat your research work and reproduce the results. Be precise, complete and concise: include only relevant information. For example, provide a reference for a particular technique instead of describing all the details.

c. Results: The Results section presents the facts and findings of the study, effectively using figures and tables. Wilkinson (1991) explained that this section must present the results clearly and logically to highlight potential implications. Combine the use of text, tables, and figures to digest and condense the data, and to highlight important trends and extract relationships among different data items. Figures must be well designed, clear, and easy to read. Figure captions should be succinct yet provide sufficient information to understand the figures without reference to the text.

d. Discussion: In the Discussion section, present your interpretation and conclusions gained from your findings. You can discuss how your findings compare with other experimental observations or theoretical expectations. Refer to your characteristic results described in the Results section to support your discussion, since your interpretation and conclusion must be based on evidence. By properly structuring this discussion, you can show how your results can solve the current problems and how they relate to the research objectives that you have described in the Introduction section. This is your chance to clearly demonstrate the novelty and importance of your research work (Wilkinson, 1991).

e. Conclusions: The Conclusion section summarizes the important results and impact of the research work. Future work plans may be included if they are beneficial to readers (Singh, 2007).

f. Acknowledgments: The Acknowledgments section is to recognize financial support from funding bodies and scientific and technical contributions that you have received during your research work.

g. References: The References section lists prior works referred to in the other sections. It is vitally important, from an ethical viewpoint, to fully acknowledge all previously published works that are relevant to your research. Whenever we use previous knowledge, we must acknowledge the source. Readers benefit from complete references, as it enables them to position our work in the context of current research. Ensure that the references given are sufficient as well as current, and accessible by the readers (Singh, 2007).

Writing a research paper needs critical attention; in this area, different experts have stated that research report writing needs critical rewriting, editing, and revising stages to make it more effective, accurate and acceptable. Among them, Philip (1986) stated that the following tips may be useful in writing the paper:
You need not start writing the text from the Introduction. Many authors actually choose to begin with the Results section, since all the materials that must be described are available. This may provide good motivation for carrying out the procedure most effectively. Your paper must also be interesting and relevant to your readers. Consider what your readers want to know rather than what you want to write. Describe your new ideas precisely in an early part of your paper so that your results are readily understood. Otherwise, do not use lengthy descriptions of the details. For example, writing too many equations and showing resembling figures or too detailed tables should be avoided. Clarity and conciseness are extremely important.
He also added that, during and after writing your draft, you must edit your writing by reconsidering your starting plan or original outline. You may decide to rewrite portions of your paper to improve logical sequence, clarity, and conciseness. This process may have to be repeated over and over. When editing is completed, you can send the paper to your co-authors for improvement. When all the co-authors agree on your draft, it is ready to be submitted to the journal. It is worth performing one final check of grammatical and typographical errors. English correction of the manuscript by a native speaker is highly recommended before your submission if you are not a native speaker. Unclear description prohibits constructive feedback in the review process.
In relation to writing and editing, Philip (1986), Singh (2007), Wilkinson (1991) and Erwin (2013) suggest that the following tips may be useful in writing the paper.
First, we need not start writing the text from the Introduction. Many authors actually choose to begin with the Results section, since all the materials that must be described are available. This may provide good motivation for carrying out the procedure most effectively.
Second, our paper must be interesting and relevant to our readers. Consider what our readers want to know rather than what we want to write. Describe our new ideas precisely in an early part of our paper so that our results are readily understood. Otherwise, we should not use lengthy descriptions of the details. For example, writing too many equations and showing resembling figures or too detailed tables should be avoided. Clarity and conciseness are extremely important.
Third, during and after writing our draft, we must edit our writing by reconsidering our starting plan or original outline. We may decide to rewrite portions of our paper to improve logical sequence, clarity, and conciseness. This process may have to be repeated over and over.
Fourth, when editing is completed, we can send the paper to our co-authors for improvement. When all the co-authors agree on the draft, it is ready to be submitted to the journal (if journal publication is needed). It is worth performing one final check of grammatical and typographical errors.
Fifth, English correction of the manuscript by a native speaker is highly recommended before our submission if we are not native speakers. Unclear description prohibits constructive feedback in the review process.
A research report has its own format or organization which is accepted in different fields of study. John (1970) stated that research has the following format:

Preliminary Section

This part includes: Title Page (be specific; tell what, when, where, etc.; in one main title and a subtitle, give a clear idea of what the paper investigated), Acknowledgments (if any; include only if special help was received from an individual or group), Table of Contents, Abstract (summarizes the report including the hypotheses, procedures, and major findings), List of Tables (if any), and List of Figures (if any).

Main Body

CHAPTER ONE: INTRODUCTION

This part of the paper includes: Background of the study (an overview of the study; this is a general introduction to the topic), Statement of the problem (this is a short reiteration of the problem), Objectives of the study (what is the goal to be gained from a better understanding of this question?), Scope of the study, Limitation of the study (explain the limitations that may invalidate the study or make it less than accurate), Significance of the study (comment on why this question merits investigation), Origination of the study, and Definition of Terms (define or clarify any term or concept that is used in the study in a non-traditional manner or in only one of many interpretations).

CHAPTER TWO: REVIEW OF RELATED LITERATURE

This part of the thesis states the analysis of previous research. It gives the reader the necessary background to understand the study by citing the investigations and findings of previous researchers, and it documents the researcher's knowledge and preparation to investigate the problem.

CHAPTER THREE: METHODOLOGY

The methodology part includes: Design of the study (description of the research design and procedures used), Sample method and size/sampling procedures, Sources of data (give complete information about who, what, when, where, and how the data were collected), and Methods and Instruments of Data Gathering (explain how the data were limited to the amount which was gathered; if all of the available data were not utilized, how was a representative sample achieved?). This chapter gives the reader the information necessary to exactly replicate (repeat) the study with new data; or, if the same raw data were available, the reader should be able to duplicate the results. It is written in the past tense, but without reference to or inclusion of the results determined from the analysis.

CHAPTER FOUR: DATA ANALYSIS

This chapter contains text with appropriate tables and figures. Describe the patterns observed in the data. Use tables and figures to help clarify the material when possible.

CHAPTER FIVE: SUMMARY, CONCLUSION AND RECOMMENDATION

Under this part of our paper we will include: Restatement of the Problem, Description of Procedures, Major Findings (reject or fail to reject Ho), Conclusions, and Recommendations for Further Investigation. This section condenses the previous sections, succinctly presents the results concerning the hypotheses, and suggests what else can be done.

Reference Section

This part includes: End Notes (if in that format of citation), Bibliography or Literature Cited, and Appendix.

SUMMARY

In summing up, research is a scientific field which helps to generate new knowledge and solve existing problems. To achieve this, we need to pass through different stages. Among these, data analysis is the crucial part of research, which makes the result of the study more effective. Data analysis is a process of collecting, transforming, cleaning, and modeling data with the goal of discovering the required information. The results so obtained are communicated, suggesting conclusions, and supporting decision-making.
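The collect, clean, transform, and model cycle just described can be made concrete with a toy sketch in Python. The survey records below, and the defects in them, are invented purely for illustration; real cleaning decisions (what counts as missing, how duplicates are detected) depend on the study design.

```python
# Toy illustration of cleaning, transforming, and summarizing data.
raw_records = [
    {"id": 1, "score": "78"},
    {"id": 2, "score": "85"},
    {"id": 3, "score": ""},    # missing response
    {"id": 4, "score": "61"},
    {"id": 1, "score": "78"},  # duplicate entry
]

# Cleaning: drop records with missing scores or duplicate ids.
seen, clean = set(), []
for rec in raw_records:
    if rec["score"] and rec["id"] not in seen:
        seen.add(rec["id"])
        clean.append(rec)

# Transforming: convert the text responses to numbers.
scores = [int(rec["score"]) for rec in clean]

# Modeling/summarizing: reduce the data to interpretable figures.
summary = {"n": len(scores), "mean": sum(scores) / len(scores)}
print(summary)
```

Note how each stage shrinks the raw pile of data toward something a reader can interpret, which is exactly the filtering role of analysis described above.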
Data analysis is done to verify whether our results are valid, reproducible and unquestionable, and it is a process used to transform, remodel and revise certain information (data) with a view to reaching a certain conclusion for a given situation or problem.
It can be applied in two ways, qualitatively and quantitatively. Whichever of the two the researcher applies, using the most effective data analysis in research work is essential. Due to this, data analysis is beneficial because it helps in structuring the findings from different sources of data collection, such as survey research; it is again very helpful in breaking a macro problem into micro parts; and it acts like a filter when it comes to acquiring meaningful insights out of a huge data-set. Furthermore, every researcher has to sort out the huge pile of data that he or she has collected before reaching a conclusion on the research question. Mere data collection is of no use to the researcher. Data analysis proves to be crucial in this process, provides a meaningful base for critical decisions, and helps to create a complete dissertation proposal. So, after analyzing the data, the results will be provided by qualitative and quantitative methods of data results.
In research work, the summary, conclusion and recommendation are the most important parts, which need to be written effectively and efficiently to make the paper more convincing and reputable. And writing a research report needs critical attention to make the report more academic and effective.
Generally, one of the most important uses of data analysis is that it helps in keeping human bias away from research conclusions with the help of proper statistical treatment. With the help of data analysis, a researcher can filter both qualitative and quantitative data for an assignment writing project. Thus, it can be said that data analysis is of utmost importance for both the research and the researcher.

REFERENCES

Ackoff, R.L. (1961). The Design of Social Research. Chicago: University of Chicago Press.
Addison, W. (2017). Medical Statistics Course. MD/PhD students, Faculty of Medicine.
Agresti, A. & Finlay, B. (1997). Statistical Methods for the Social Sciences (3rd ed.). Prentice Hall.
Berk, M., Hart, B., Boerema, D., and Hands, D. (1998). Writing Reports: Resource Materials for Engineering Students. University of South Australia.
Bryman, A. and Cramer, D. (2005). Quantitative Data Analysis with SPSS 12 and 13: A Guide for Social Scientists. London: Routledge.
Business Dictionary. (2017). Data Analysis. Retrieved from http://www.businessdictionary.com/definition/data-analysis.html
Celine. (2017). "Difference between Population and Sample." Retrieved on May 5, 2018 from http://www.differencebetween.net/miscellaneous/difference-between-population-and-sample/
Cohen, L., Manion, L. and Morrison, K. (2007). Research Methods in Education. London and New York: Routledge.
Cohen, J.W. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). London and New York: Routledge.
Creswell, J. W. (2002). Educational Research: Planning, Conducting, and Evaluating Quantitative. Prentice Hall.
Creswell, J. W. (2013). Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. Sage Publications, Incorporated.
Daniel, M. (2010). Doing Quantitative Research in Education with SPSS (2nd ed.). London: SAGE Publications.
Dawson, C. (2002). Practical Research Methods: A User-friendly Guide to Mastering Research Techniques and Projects. Oxford: How To Books Ltd.
Denscombe, M. (2007). The Good Research Guide for Small-scale Social Research Projects (3rd ed.). Open University Press: McGraw-Hill Education.
Earl, R.B. (2010). The Practice of Social Research (12th ed.). Belmont, CA: Wadsworth Cengage.
Erwin, M. G. (2013). Thesis Writing: Summary, Conclusions, and Recommendations. Retrieved from http://thesisadviser.blogspot.com/2013/02/thesis-writing-summary-conclusions-and.html
Field, A. (2002). Discovering Statistics Using SPSS for Windows. London: Sage.
Freeman, J. and Young, T. (2017). Correlation Coefficient: Association between Two Continuous Variables. Retrieved on May, 2017 from http://www.epa.gov/bioindicators/statprimer/index.html
Gogtay, N.J., Deshpande, S., and Thatte, U.M. (2017). Principles of Correlation Analysis. J Assoc Phy Ind 2017; 65:78-81.
Hill, M.H. (2013). Format of Research Reports. [Adapted from: John W. Best, Research in Education, 2nd ed. (Englewood Cliffs, NJ: Prentice-Hall, 1970)]. Retrieved from http://www.jsu.edu/dept/geography/mhill/research/researchf.html
Hinkle, D. E., Wiersma, W., & Jurs, S. G. (2003). Applied Statistics for the Behavioral Sciences (5th ed.). Boston, MA: Houghton Mifflin Company.
Howell, D. C. (2002). Statistical Methods for Psychology (5th ed.). Pacific Grove, CA: Duxbury. Retrieved from https://www.sheffield.ac.uk/polopoly_fs/1.43991!/file/Tutorial-14-correlation.pdf
Huck, S. W. (2004). Reading Statistics and Research (4th ed.). Boston, MA: Allyn and Bacon.
Juliet, C.J. and Nicholas, L. H. (2005). Grounded Theory. London, Thousand Oaks: New Delhi.
Kothari, C.R. (2004). Research Methodology (2nd ed.). New Delhi: New Age International (P) Limited Publishers.
Leech, N. L., Barrett, K. C., & Morgan, G. A. (2005). SPSS for Intermediate Statistics: Use MED819, ANCOVA. Retrieved on May 5, 2018 from http://www.mas.ncl.ac.uk/~njnsm/medfac/doc/ancova.pdf
Mordkoff, J.T. (2016). Introduction to ANOVA. Retrieved from http://www2.psychology.uiowa.edu/faculty/mordkoff/GradStats/part%203a/Intro%20to%20ANOVA.pdf
Nathaniel, E. H. (2017). Analysis of Covariance. Retrieved from http://users.stat.umn.edu/~helwig/notes/acov-Notes.pdf
Neuman, W. L. (2000). Social Research Methods: Qualitative and Quantitative Approaches (4th ed.). Boston: Allyn and Bacon.
Neuman, W. L., & Robson, K. (2004). Basics of Social Research. Pearson.
Norgaard, R. (n.d.). Results and Discussion Sections in the Scientific Research Article. University of Colorado at Boulder. Retrieved on May 10, 2018 from resess.unavco.org/lib/downloads/RESESS.13.results&discussion.051313.pdf
O'Brien, D. and Scott, P.S. (2012). "Correlation and Regression", in Approaches to Quantitative Research – A Guide for Dissertation Students, Ed. Chen, H. Oak Tree Press.
Ott, R.L. (1993). An Introduction to Statistical Methods and Data Analysis (4th ed.). Belmont, CA: Duxbury Press.
Patton, M. Q. (1990). Qualitative Evaluation and Research Methods (2nd ed.). Newbury Park, CA: Sage.
Philip, C.K. (1986). Successful Writing at Work (2nd ed.). Retrieved from ftp://nozdr.ru/biblio/kolxo3/L/LEn/Kolin%20P.%20Successful%20writing%20at%20work%20(Wadsworth,%202009)(ISBN%200547147910)(O)(753s)_LEn_.pdf
Quirk, T.J., and Rhiney, E. (2016). Multiple Correlations and Multiple Regressions. Excel for Statistics. Springer, Cham.
Selltiz, C., Jahoda, M., Deutsch, M., and Cook, S.W. (1959). Research Methods in Social Relations (rev. ed.). New York: Holt, Rinehart and Winston, Inc.
Singh, K. (2007). Quantitative Social Research Methods. Los Angeles, CA: Sage Publications.
Strauss, A. and Corbin, J. (1990). Basics of Qualitative Research: Grounded Theory Procedures and Techniques. London: Sage.
The Advanced Learner's Dictionary of Current English, Oxford, 1952, p. 1069.
Weiss, N.A. (1999). Introductory Statistics. Addison Wesley.
Wilkinson, A. (1991). The Scientist's Handbook for Writing Papers and Dissertations. Englewood Cliffs, New Jersey: Prentice Hall.