
SCS 301 Research Methods in Computing Notes

The document outlines the course SCS301: Research Methods in Computing at Murang’a University of Technology, detailing the definition of research, its purpose, and the importance of research skills for managers. It categorizes research into qualitative and quantitative types, discusses various research methodologies, and defines key research terms and components. Additionally, it highlights the significance of both qualitative and quantitative methods, their advantages and disadvantages, and various classifications of research by purpose and analysis methods.

SCS301: RESEARCH METHODS IN COMPUTING

MURANG’A UNIVERSITY OF TECHNOLOGY

SCS301: RESEARCH METHODS IN COMPUTING

School: SCHOOL OF COMPUTING AND INFORMATION TECHNOLOGY
Department: COMPUTER SCIENCE
Lecturer’s Name: TIRUS MUYA
Email Address: [email protected]_


Definition of research
Different authors have defined research as follows:
➢ Research is carrying out a diligent inquiry or a critical examination of a given
phenomenon.
➢ Research involves a critical analysis of existing conclusions or theories with
regard to newly discovered facts i.e. it’s a continued search for new knowledge
and understanding of the world around us.
➢ Research is a process of arriving at effective solutions to problems through
systematic collection, analysis and interpretation of data.

What is Business Research?


It is a systematic inquiry whose objective is to provide information to solve
managerial problems (Cooper and Schindler, 2003).

Research and Scientific Method


The scientific method encourages a rigorous, impersonal mode of procedure
dictated by the demands of logic and objective procedure. It is based on the
following basic postulates:
➢ It relies on empirical evidence
➢ It utilizes relevant concepts
➢ It is committed to only objective considerations
➢ It presupposes ethical neutrality i.e. it aims at nothing but making only
adequate and correct statements about population objects
➢ It results in probabilistic predictions
➢ Its methodology is made known to all concerned for critical scrutiny and for
use in testing the conclusions through replication.
➢ It aims at formulating the most general axioms, or what can be termed
scientific theories.

Purpose of Research
➢ To discover new knowledge
➢ To describe a phenomenon
➢ To enable prediction.
➢ To enable control i.e. the ability to regulate the phenomenon under study.
➢ To enable explanation of a phenomenon i.e. accurate observation and
measurement of a given phenomenon.
➢ To enable theory development and validation of existing theories. Theory
development involves formulating concepts, laws and generalizations about a
given phenomenon.
➢ Research provides one with the knowledge and skills needed for the fast-paced
decision-making environment

Why Managers need Better Information


➢ Explosive growth and influence of the internet
➢ Stakeholders demanding greater influence: Workers, shareholders, customers
and the general public are demanding to be included in company
decision-making.

➢ More vigorous competition – domestic and global


➢ More government intervention
➢ More complex decisions: There are more variables to consider in every
decision.
➢ Maturing of management as a group of disciplines
➢ Greater computing power and speed: Today’s computers are powerful and easy
to use for analyzing data, which helps in decision-making.
➢ New perspectives on established research methodologies

Sources of Knowledge
➢ Research
➢ Experience: Empiricists attempt to describe, explain, and make predictions
through observation.
➢ Tradition: Rationalists believe all knowledge can be deduced from known laws
or basic truths of nature
➢ Authority: Authorities serve as important sources of knowledge, but should be
judged on integrity and willingness to present a balanced case.
➢ Intuition: it is the perception, explanation or insight into phenomena by
instinct.

The Value of Acquiring Research Skills


➢ To gather more information before selecting a course of action
➢ To do a high-level research study
➢ To understand research design
➢ To evaluate and resolve a current management dilemma
➢ To establish a career as a research specialist

Definition of basic terms used in research


➢ Population: it refers to an entire group of individuals, events or objects having
a common observable characteristic.
➢ Sample: It is a smaller group obtained from the accessible population.
➢ Sampling: It is the process of selecting a number of individuals for a study in
such a way that the individuals selected represent the population.
➢ Variable: It is a measurable characteristic that assumes different values among
the subjects. They can be dependent, independent, intervening, confounding
or antecedent variables.
➢ Data: refers to all information a researcher gathers for his or her study. Can be
secondary data or primary data.

➢ Parameter: It is a measurable characteristic of a population, e.g. the
population mean. The corresponding value computed from a sample is called a
statistic.
➢ Statistics: It is the science of organizing, describing and analyzing data. It
is divided into descriptive and inferential statistics.
➢ Objective: it refers to the specific aspects of the phenomenon under study that
the researcher desires to bring out at the end of the research study.
➢ Literature review: It involves locating, reading and evaluating reports of
previous studies, observations and opinions related to the planned study.
➢ Hypothesis: It is a researcher’s anticipated explanation or opinion regarding
the result of the study.
➢ Theory: It is a set of concepts or constructs and the interrelations that are
assumed to exist among those concepts. It provides the basis for establishing
the hypothesis to be tested in the study.
➢ A construct is an image or idea specifically invented for a given research and/or
theory-building purpose
➢ A concept is a bundle of meanings or characteristics associated with certain
events, objects, conditions, situations, and behaviors. Concepts have been
developed over time through shared usage
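
The relationship between a population, a sample and sampling can be illustrated with a short Python sketch. The data are made up, the variable names are illustrative assumptions, and simple random sampling is used only as one possible way of selecting individuals. The sketch also contrasts a value computed on the whole population with the same value computed on the sample.

```python
import random
import statistics

# Hypothetical population: daily study hours for 1,000 students (made-up data).
rng = random.Random(42)  # fixed seed so that the sketch is repeatable
population = [round(rng.uniform(0, 8), 1) for _ in range(1000)]

# Sampling: select 50 individuals so that each has an equal chance of selection.
sample = rng.sample(population, k=50)

# A characteristic of the whole population (here, its mean).
population_mean = statistics.mean(population)

# The same characteristic computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"Population size: {len(population)}, sample size: {len(sample)}")
print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

The two means will usually be close but not identical. The gap between them is the kind of sampling error that inferential statistics tries to quantify.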

Components of research
1. Identification of the research area and topic.
2. Statement of the problem.
3. Literature review.
4. Methodology design
5. Sampling frame and sampling techniques.
6. Data collection tools, design and techniques.
7. Data analysis methods.
8. Report writing techniques.


TYPES OF RESEARCH
Different authors have classified research into various categories.
Qualitative research
It includes designs, techniques and measures that do not produce discrete
numerical data. Qualitative data can be collected through direct observation,
participant observation or interview method. Qualitative research includes an
“array of interpretive techniques which seek to describe, decode, translate and
otherwise come to terms with the meaning, not the frequency, of certain more or
less naturally occurring phenomena in the social world”. Qualitative research aims
to achieve an in-depth understanding of a situation. Qualitative research is
designed to tell the researcher how (process) and why (meaning) things happen as
they do. Qualitative techniques are used at both the data collection and data
analysis stages of a research project. At the data collection stage, the array of
techniques includes focus groups, individual depth interviews, case studies,
ethnography, grounded theory, action research and observation. During analysis,
the qualitative researcher uses content analysis of written or recorded materials
drawn from personal expressions by participants and behavioural observations.
Aspect                    | Qualitative                                                | Quantitative
--------------------------+------------------------------------------------------------+-------------------------------------------------------------
Focus of research         | Understand and interpret                                   | Describe, explain and predict
Researcher involvement    | High, researcher is participant or catalyst                | Limited, controlled to prevent bias
Research purpose          | In-depth understanding: theory building                    | Describe or predict: build and test theory
Sample design             | Non-probabilistic: purposive                               | Probabilistic
Research design           | May evolve or adjust during the course of the period       | Determined before commencing the project
                          | Often uses multiple methods simultaneously or sequentially | Uses single method or mixed methods
                          | Consistency is not expected                                | Consistency is critical
                          | Involves longitudinal approach                             | Involves either a cross-sectional or a longitudinal approach
Participant preparation   | Pre-tasking is common                                      | No preparation desired to avoid biasing the participant
Data type and preparation | Verbal or pictorial descriptions                           | Verbal descriptions
                          | Reduced to verbal codes                                    | Reduced to numerical codes for computerized analysis
Data analysis             | Human analysis following computer or human coding          | Computerized analysis


Quantitative research
It includes designs, techniques and measures that produce discrete numerical or
quantifiable data.

Advantages of using both qualitative and quantitative methods


1. Since in many cases a researcher has several objectives, some of these
objectives are better assessed using quantitative methods while others are
better assessed using qualitative methods.
2. Both methods supplement each other i.e. qualitative methods provide the
in-depth explanations while quantitative methods provide the data needed
to test hypotheses.
3. Since both methods have a bias, using both types of research helps to avoid
such bias in that each method can be used to check the other.

Disadvantages of using both qualitative and quantitative methods


1. It is expensive
2. Researchers may not have sufficient training in both methods to be able to
use them effectively.

Classification by purpose
1. Basic / Pure / Fundamental Research
Basic researchers are interested in deriving scientific knowledge i.e. they are
motivated by intellectual curiosity and need to come up with a particular
solution. It focuses on generating new knowledge in order to refine or expand
existing theories. It does not consider the practical application of the findings
to actual problems or situations.
2. Applied research
It is conducted for the purpose of applying or testing theory and evaluating its
usefulness in solving problems. It provides data to support a theory, guide
theory revision or suggest the development of a new theory.
3. Action research
It is conducted with the primary intention of solving a specific, immediate and
concrete problem in a local setting e.g. investigating ways of overcoming
water shortage in a given area. It is not concerned with whether the results can
be generalized to any other setting.
4. Evaluation Research
It is the process of determining whether the intended results were realized.

Types of evaluation research


i. Needs assessment
A need is a discrepancy between an existing set of conditions and a desired
set of conditions. The results of needs assessment study provide the
foundation for developing new programmes and for making changes in
existing ones.
ii. Formative evaluation
Helps to collect data about a programme while it is still being developed
e.g. an educational programme, a marketing strategy etc.
iii. Summative evaluation
It is done after the programme has been fully developed. It is conducted
to evaluate how worthwhile the final programme has been especially
compared to similar programmes.

Classification by methods of analysis


1. Descriptive research
It is the process of collecting data in order to test hypotheses or to answer
questions concerning the current status of the subjects in the study. It
determines and reports the way things are. It attempts to describe such things
as possible behaviour, attitudes, values and characteristics.
Steps involved in descriptive research
➢ Formulating the objectives of a study
➢ Designing the methods of data collection
➢ Selecting the sample
➢ Data collection
➢ Analyzing the results
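
As an illustration of the last step, the following Python sketch summarizes a small set of numerical observations so that they "report the way things are". The data are hypothetical, and the particular summary measures chosen here are an assumption about what a study might report, not a prescribed list.

```python
import statistics

# Hypothetical data: hours of computer use per day reported by ten students.
hours = [2, 3, 3, 4, 5, 5, 5, 6, 7, 10]

summary = {
    "n": len(hours),
    "mean": statistics.mean(hours),
    "median": statistics.median(hours),
    "mode": statistics.mode(hours),
    "stdev": statistics.stdev(hours),  # sample standard deviation
    "range": max(hours) - min(hours),
}

for name, value in summary.items():
    print(f"{name:<7} {value:.2f}")
```

Measures of central tendency (mean, median, mode) describe a typical value, while the standard deviation and range describe how spread out the observations are.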
2. Causal-comparative research
It is used to explore relationships between variables. It determines reasons or
causes for the current status of the phenomenon under study. The variables of
interest cannot be manipulated unlike in experimental research.
Steps in causal-comparative research
➢ Define the research question
➢ Select a group that possesses the characteristics, which the researcher
wants to study.
➢ Select a comparison group which does not display the characteristics
under study but which is similar to the group in other respects.
➢ Collect data on both the experimental and control groups
➢ Analyze the data
Advantages of causal-comparative study
➢ Allows a comparison of groups without having to manipulate the
independent variables
➢ It can be done solely to identify variables worthy of experimental
investigation
➢ It is relatively cheap.
Disadvantages of causal-comparative study
➢ Interpretations are limited because the researcher does not know
whether a particular variable is a cause or result of a behaviour being
studied.
➢ There may be a third variable which could be affecting the established
relationship but which may not be established in the study.
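
A minimal Python sketch of the comparison at the heart of a causal-comparative study is given below. The two groups and their scores are hypothetical, and comparing group means is only one possible way of analyzing such data. Note that the researcher does not assign anyone to a group; the groups already differ on the characteristic of interest.

```python
import statistics

# Hypothetical exam scores (made-up data) for two pre-existing groups.
# Students were not assigned to groups by the researcher.
owns_laptop = [68, 72, 75, 80, 64, 77]
no_laptop = [60, 65, 70, 58, 66, 62]

mean_with = statistics.mean(owns_laptop)
mean_without = statistics.mean(no_laptop)
difference = mean_with - mean_without

print(f"Mean score, laptop owners: {mean_with:.2f}")
print(f"Mean score, non-owners:    {mean_without:.2f}")
print(f"Difference in means:       {difference:.2f}")
```

A difference between the groups does not by itself show that laptop ownership causes higher scores. A third variable, such as family income, could explain both.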


3. Correlation Methods
It describes in quantitative terms the degree to which variables are related. It
explores relationships between variables and also tries to predict a subject’s
score on one variable given his or her score on another variable.
Steps in correlational research
➢ Problem statement
➢ Selection of subjects
➢ Data collection
➢ Data analysis

Advantages of the correlational method


➢ Permits one to analyze inter-relationships among a large number of
variables in a single study.
➢ Allows one to analyze how several variables either singly or in combination
might affect a particular phenomenon being studied.
➢ The method provides information concerning the degree of relationship
between variables being studied.

Disadvantages of the correlational method


➢ Correlation between two variables does not necessarily imply causation
although researchers often tend to interpret such a relationship to mean
causation.
➢ Since the correlation coefficient is an index that can be computed for any two
variables, a relationship may appear even when common sense dictates that such
variables are not related.
➢ The correlation coefficient is very sensitive to the size of the sample.
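
The degree of relationship between two variables is commonly expressed with the Pearson correlation coefficient, which the notes do not name explicitly; it is assumed here as one widely used index. The sketch below computes it from first principles on hypothetical data. The function name pearson_r and the data are illustrative.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: weekly study hours and test scores for six students.
study_hours = [2, 4, 5, 7, 8, 10]
test_scores = [50, 55, 65, 70, 72, 85]

r = pearson_r(study_hours, test_scores)
print(f"r = {r:.2f}")
```

The sensitivity to sample size is easy to see with this function: for any two observations with distinct values, for example pearson_r([1, 2], [9, 3]), the coefficient is always exactly +1 or -1, however unrelated the variables really are.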

Classification by type of research


1. Survey Research
A survey is an attempt to collect data from members of a population in order to
determine the current status of that population with respect to one or more
variables. A survey study is therefore a self-report study, which requires the
collection of quantifiable information from the sample. It is a form of descriptive research.
Steps involved in Survey research
➢ Problem statement
➢ Defining Objectives
➢ Selecting a Sample
➢ Preparing the instruments
➢ Data analysis
Purpose of survey research
i. It seeks to obtain information that describes existing phenomena by asking
individuals about their perceptions, attitudes, behaviour or values.
ii. Can be used for explaining or exploring the existing status of two or more
variables, at a given point in time.
iii. It is the most appropriate to measure characteristics of large populations.


Limitations of Survey research
i. They are dependent on the cooperation of respondents.
ii. Information unknown to the respondents cannot be tapped in a survey
e.g. amount saved per year
iii. Requesting information which is considered secret and personal
encourages incorrect answers.
iv. Surveys cannot be aimed at obtaining forecasts of things to come.
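
Self-reported survey answers are often tabulated as frequencies and percentages. The Python sketch below does this for a single questionnaire item. The responses and the five-point agreement scale are hypothetical and are assumed only for illustration.

```python
from collections import Counter

# Hypothetical answers to one survey item on a 5-point agreement scale.
responses = [
    "Agree", "Strongly agree", "Neutral", "Agree", "Disagree",
    "Agree", "Strongly agree", "Agree", "Neutral", "Strongly disagree",
]
scale = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]

counts = Counter(responses)
total = len(responses)

# Report every option in scale order, including options nobody chose.
percentages = {option: 100 * counts[option] / total for option in scale}
for option in scale:
    print(f"{option:<18} {counts[option]:>2}  {percentages[option]:5.1f}%")
```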

2. Historical research
Involves the study of a problem that requires collecting information from the
past

Purpose of Historical Research


➢ Aims at arriving at conclusions concerning causes, effects or trends of past
occurrences that may help explain present events and anticipate future
events.
➢ Attempts to interpret ideas or events that had previously seemed unrelated.
➢ Synthesizes old data or merges old data with new historical facts that the
researcher or other researchers have discovered.
➢ To reinterpret past events that have been studied.

Steps involved in historical research


➢ Identifying and delineating the problem.
➢ Developing hypothesis or hypotheses that one is interested in testing.
➢ Collecting and classifying resource materials, determining facts by internal
and external criticism.
➢ Organizing facts into results
➢ Interpreting data in terms of stated hypothesis or theory.
➢ Synthesizing and presenting the research in an organized form.

3. Observational Research
The current status of a phenomenon is determined not by asking but by
observing. This helps to collect objective information.

Steps
➢ Selection and definition of the problem.
➢ Sample selection.
➢ Definition of the observational information.
➢ Recording observational information
➢ Data analysis and interpretation.


Types of observational research


1. Non-participant observation
The observer is not directly involved in the situation to be observed.
2. Naturalistic Observation
Behaviour is studied and recorded as it normally occurs.
3. Simulation observation.
The researcher creates the situation to be observed and tells subjects to be
observed what activities they are to engage in. Disadvantage – the setting is
not natural and the behaviour exhibited by the subjects may not be the
behaviour that would occur in a natural setting.
4. Participant observation
The observer becomes part of or a participant in the situation. This may not be
ethical.
5. Case studies
A case study is an in-depth investigation of an individual, group, institution or
phenomenon. It aims to determine factors and relationships among the factors
that have resulted in the behaviour under study.
6. Content analysis
It involves observation and detailed description of objects, items or things that
comprise the sample. The purpose is to study existing documents such as
books and magazines in order to determine factors that explain a specific
phenomenon.
Steps
a) Decide on the unit of analysis
b) Sample the content to be analyzed
c) Coding
d) Data analysis
e) Compiling results and interpretations.

Advantages
a) Researchers are able to economize in terms of time and money.
b) Errors that arise during the study are easier to detect and correct.
c) The method has no effect on what is being studied.
Disadvantages
a) It is limited to recorded communication.
b) It is difficult to ascertain the validity of the data.
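
The content analysis steps listed earlier (unit of analysis, sampling, coding, data analysis, compiling results) can be sketched in Python as follows. The documents, the categories and the keyword-based coding scheme are all hypothetical. Real coding schemes are usually richer and are often applied by human coders.

```python
import re
from collections import Counter

# Hypothetical sampled documents (step b); in practice these would be texts
# drawn from books, magazines or other recorded communication.
documents = [
    "The network outage delayed online classes for two days.",
    "Students reported that slow network speeds hurt online learning.",
    "The new lab improved access to computers and online resources.",
]

# Hypothetical coding scheme (step c): category -> keywords that signal it.
coding_scheme = {
    "connectivity": {"network", "outage", "speeds"},
    "e-learning": {"online", "classes", "learning"},
    "facilities": {"lab", "computers", "resources"},
}

def code_document(text, scheme):
    """Count how many words in one document fall under each category."""
    # Unit of analysis (step a): the single word.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for word in words:
        for category, keywords in scheme.items():
            if word in keywords:
                counts[category] += 1
    return counts

# Data analysis (step d): total the category counts across all documents.
totals = Counter()
for doc in documents:
    totals.update(code_document(doc, coding_scheme))

# Compiling results (step e).
for category, count in totals.most_common():
    print(f"{category}: {count}")
```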

Characteristics of Good Research


Following the standards of the scientific method
a) Purpose clearly defined: Researcher distinguishes between symptoms of the
organization’s problem, the manager’s perception of the problem and the
research problem
b) Research process detailed: Researcher provides complete research proposal

c) Research design thoroughly planned: Exploratory procedures are outlined
with constructs defined, sample unit is clearly described along with sampling
methodology, data collection procedures are selected and designed
d) Limitations frankly revealed: Desired procedure is compared with actual
procedure in report, desired sample is compared with actual sample in the
report, impact on findings and conclusions is detailed.
e) High ethical standards applied: Safeguards are in place to protect study
participants, organizations, clients and researchers. Recommendations do
not exceed the scope of the study. The study’s methodology and limitations
sections reflect researcher’s restraint and concern for accuracy
f) Findings presented unambiguously: Findings are clearly presented in words,
tables and graphs. Findings are logically organized to facilitate reaching a
decision about the manager’s problem. Executive summary of conclusions is
outlined. Detailed table of contents is tied to the conclusions and findings
presentation
g) Conclusions justified: Decision-based conclusions are matched with detailed
findings
h) Researcher’s experience reflected: Researcher provides experience /
credentials with report
i) Adequate analysis for decision-maker’s needs: Sufficiently detailed findings
are tied to collection instruments.

IDENTIFICATION OF RESEARCH AREA


The research process starts by formulating a research problem that can be
investigated through research procedures.
Identifying a research problem
The first step in selecting a research problem is to identify the broad area that one
is interested in. Such an area should be related to the professional interests and
goals of the researcher e.g. low-cost housing, productivity of workers, small-scale
businesses etc.
The second step is to identify a specific problem within it that will form the basis
of the research study. The research problem should be an important one i.e. it
should
a) Lead to findings that have widespread implications in a particular area
b) Challenge some commonly held truism
c) Review the inadequacies of existing laws, views or policies
d) Cover a reasonable scope i.e. be neither too narrow nor too general.


Defining the research problem


A research problem refers to some difficulty which the researcher experiences in
the context of either a theoretical or practical situation and wants to obtain a
solution for the same. A research problem exists if the following conditions are
met:-
a) There must be an individual or a group which has some difficulty or the
problem.
b) There must be some objective(s) to be attained.
c) There must be alternative means or courses of action for obtaining the
objective(s) one wishes to attain.
d) There must be some doubt in the mind of a researcher with regard to the
selection of alternatives.
e) There must be some environment(s) to which the difficulty pertains.

Selecting the problem


The following points must be observed by a researcher in selecting a research
problem or a subject of study:
a) A subject which is overdone should not normally be chosen, for it will be a
difficult task to throw any new light in such a case.
b) A controversial subject should not become the choice of an average researcher.
c) Problems that are too narrow or too vague should be avoided.
d) The subject selected for research should be familiar and feasible so that the
related research material or sources of research are within one’s reach.
e) The importance of the subject, the qualifications and the training of a
researcher, the costs involved and the time factor must be considered.
f) The selection of a problem must be preceded by a preliminary study.

Defining the problem


It involves the task of laying down boundaries within which a researcher shall
study the problem with a predetermined objective in view. The following steps
can be followed:-
a) Statement of the problem in a general way
b) Understanding the nature of the problem: Understand the origin and nature
of the problem e.g. by discussing it with those who raised it in order to find
out how the problem originally came about. The researcher should keep in
view the environment within which the problem is to be studied and
understood.
c) Surveying the available literature: The researcher must be well conversant
with relevant theories in the field, reports and records, as well as all other
relevant literature.
d) Developing ideas through discussions
e) Rephrasing the research problem: This means putting the research problem in as
specific terms as possible so that it may become operationally viable and may
help in the development of working hypotheses.

The following should also be observed when defining a research problem:


a) Technical terms and words or phrases with special meanings used in the
statement of the problem, should be clearly defined.
b) Basic assumptions or postulates if any relating to the research problem
should be clearly stated.
c) A straightforward statement of the value of the investigation should be
provided.
d) The suitability of the time-period and the sources of data available must
also be considered by the researcher in defining the problem.
e) The scope of the investigation or the limits within which the problem is to
be studied must be mentioned explicitly in defining a research problem.

Certain factors determine the scope of a research study. These include:


a) The time available to carry it out
b) The money available to carry it out
c) The availability of equipment if needed to carry it out
d) The availability of subjects or the units of study.

Ways of identifying a specific research problem from the broad area.


(a) Existing theories
(b) Existing literature
(c) Discussions with experts
(d) Previous research studies
(e) Replication
(f) The media
(g) Personal experiences.

STATING THE PROBLEM


A research study starts with a brief introductory section. The researcher introduces
briefly the general area of study, and then narrows down to the specific problem
to be studied.

Characteristics of a good problem statement


➢ It should be written clearly and in such a way that the reader’s interest is
captured immediately.
➢ The specific problem identified in the problem statement should be
objectively researchable
➢ The scope of the specific research problem should be indicated
➢ The importance of the study in adding new knowledge should be stated
clearly
➢ The problem statement must give the purpose of the research.


STATING THE PURPOSE


The purpose of a study crystallizes the researcher’s inquiry into a particular area of
knowledge in a given field. If the purpose is accurately expressed, the research
process will be carried out with ease. The purpose of the study should meet the
following criteria:
➢ It must be indicated clearly, unambiguously and in a declarative manner.
➢ The purpose should indicate the concepts or variables in the study.
➢ Where possible, the relationships among the variables should be stated.
➢ The purpose should state the target population.
➢ The variables and target population given in the purpose should be
consistent with the variables and target population operationalised in the
methods section of the study.

In stating the purpose of the study, the researcher should choose the right words
to convey the focus of the study effectively. Use of subjective or biased words or
sentences should be avoided.
Examples
Biased            Neutral
To show           To determine
To prove          To compare
To confirm        To investigate
To verify         To differentiate
To check          To explore
To demonstrate    To find out
To indicate       To examine
To validate       To inquire
To explain        To establish
To illustrate     To test

Stating the Objectives


Research objectives are those specific issues within the scope of the stated purpose
that the researcher wants to focus upon and examine in the study.
Characteristics of a good objective
➢ Specific
➢ Measurable
➢ Achievable
➢ Reliable
➢ Time bound
Objectives guide the researcher in formulating testable hypotheses.
In stating the objectives of the study, the researcher should choose the right words
to convey the focus of the study effectively. Use of subjective or biased words or
sentences should be avoided.


FORMULATING HYPOTHESES
A hypothesis is a researcher’s prediction regarding the outcome of the study. It
states possible differences, relationships or causes between two variables or
concepts. Hypotheses are derived from or based on existing theories, previous
research, personal observations or experiences. The test of a hypothesis involves
collection and analysis of data that may either support or fail to support the
hypothesis. If the results fail to support a stated hypothesis, it does not mean that
the study has failed but it implies that the existing theories or principles need to be
revised or retested under various situations.
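
To make the idea of data that "support or fail to support" a hypothesis concrete, the Python sketch below tests a hypothetical hypothesis that two user interfaces differ in task-completion time. A permutation test is used here as one possible technique; it is not prescribed by these notes. The data, the function name permutation_test and the 0.05 significance level are all assumptions made for illustration.

```python
import random
import statistics

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign observations to the two groups at random.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Share of random regroupings at least as extreme as the observed one.
    return extreme / n_permutations

# Hypothetical task-completion times in seconds (made-up data).
old_interface = [41, 45, 39, 48, 44, 46, 43, 47]
new_interface = [35, 38, 33, 40, 36, 37, 34, 39]

p_value = permutation_test(old_interface, new_interface)
alpha = 0.05  # conventional significance level, assumed here
if p_value < alpha:
    print(f"p = {p_value:.4f}: the data support the hypothesis of a difference")
else:
    print(f"p = {p_value:.4f}: the data fail to support the hypothesis")
```

A large p-value would not mean the study has failed. It would only mean that these data do not support the stated hypothesis.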

Purpose of hypothesis
➢ It provides direction by bridging the gap between the problem and the
evidence needed for its solution.
➢ It ensures collection of the evidence necessary to answer the question posed in
the statement of the problem.
➢ It enables the investigator to assess the information he or she has collected from
the standpoint of both relevance and organisation.
➢ It sensitizes the investigator to certain aspects of the situation that are relevant
regarding the problem at hand.
➢ It permits the researcher to understand the problem with greater clarity and use
the data to find solutions to problems.
➢ It guides the collection of data and provides the structure for their meaningful
interpretation in relation to the problem under investigation.
➢ It forms the framework for the ultimate conclusions as solutions.

Characteristics of a good hypothesis


A sound review of literature or of existing theories often leads to good hypotheses.
1. Should state clearly and briefly the expected relationships between variables.
2. Must be based on a sound rationale derived from theory or previous research
or professional experience.
3. Must be consistent with common sense or generally accepted truths.
4. Must be testable within a reasonable time.
5. Must be related to empirical phenomena. Words like ought, should and bad
should be avoided since they reflect moral judgment.
6. Variables stated in the hypothesis must be consistent with the purpose
statement, objectives and operationalized variables in the method section.
7. Must be as simple and as concise as the complexity of the concepts involved
allows.
8. It must be stated in such a way that its implications can be deduced in the form
of empirical operations with respect to which the relationship can be validated or
refuted.


Assumptions and Limitations


➢ An assumption is any fact that a researcher takes to be true without actually
verifying it. It puts some boundary around the study and provides the reader
with vital information, which influences the way results of the study are
interpreted.
➢ A limitation is an aspect of a research study that may influence the results negatively
but over which the researcher has no control. A common limitation in social
science studies is the scope of the study, which sometimes may not allow
generalizations. Sample size may also be another limitation.

LITERATURE REVIEW
The review of literature involves the systematic identification, location and
analysis of documents containing information related to the research problem
being investigated. It should be extensive and thorough because it is aimed at
obtaining detailed knowledge of the topic being studied.

Purpose of literature review


➢ To determine what has already been done related to the research problem
being studied. This will help the researcher to:
- Avoid unnecessary and unintentional duplication.
- Form the framework within which the research findings are to be
interpreted.
- Demonstrate his or her familiarity with the existing body of
knowledge.
➢ Helps reveal the strategies, procedures and measuring instruments that have
been found useful in investigating the problem in question. This will help the
researcher to:
- Avoid mistakes that have been made by other researchers
- Benefit from other researchers’ experiences
- Clarify how to use certain procedures, which one may only have
learned in theory.
➢ Helps to suggest other procedures and approaches that will help improve
the research study.
➢ Familiarizes the researcher with previous studies, which facilitates
interpretation of the results of the study. If there is a contradiction, the
literature review might provide rationale for the discrepancy.
➢ It helps the researcher to limit the research problem and to define it better.
➢ Helps to determine new approaches and stimulates new ideas. The researcher
may be alerted to research possibilities, which have been overlooked in the
past.
➢ Approaches that have been proved to be futile will be revealed through
literature review.
➢ Specific suggestions and recommendations for further research can be found
by reviewing literature.
➢ It pulls together, integrates and summarizes what is known in an area, thus
helping to reveal gaps in information and areas where major questions still
remain.

Steps in carrying out literature review.


1. Familiarize yourself with the library before beginning the literature review.
2. Make a list of key words or phrases to guide your literature search.
3. With the key words and phrases related to the study, one should go to the
source of literature.
4. Summarize the references on cards for easy organisation of the literature.
5. Once collected, the literature should be analyzed, organized and reported in
an orderly manner.
6. Make an outline of the main topics or themes in order of presentation.
7. Analyze each reference in terms of the outline made and establish where it will
be most relevant.
8. The literature should be organized in such a way that the more general is
covered first before the researcher narrows down to that which is more specific
to the research problem.

Sources of literature
(a) Primary sources: direct descriptions of an occurrence by an individual
who actually observed or witnessed it.
(b) Secondary sources: publications written by an author who
was not a direct observer of, or participant in, the events described.

Examples
➢ Scholarly journals
➢ Theses and dissertations
➢ Government documents
➢ Papers presented at conferences
➢ Books
➢ References quoted in books
➢ International indices
➢ Abstracts
➢ Periodicals
➢ The Africana section of the library
➢ Reference section of the library
➢ Grey literature
➢ Inter-library loan
➢ The British lending library
➢ The internet
➢ Microfilm


Evaluating information sources


Researchers evaluate and select information sources based on five factors that can
be applied to any type of source, whether printed or electronic. These are:-
a) Purpose: The purpose is what the author is trying to accomplish e.g. to
enlighten, to define terms, to entertain etc.
b) Scope: what is the date of publication? What time period does this source
cover? How much of the topic is covered and to what depth? Is the
material covered local, regional or international?
c) Authority: The author and the author’s credentials should be given both in
printed and electronic sources.
d) Audience: When evaluating the plausible audience of a source, look for key
indicators including vocabulary, types of information and questions or
directions that guide the search.
e) Format: It relates to how the information is presented and how easy it is to
find a specific piece of information.

Tips on good review of literature


a) Do not conduct a hurried review, as you risk overlooking important studies.
b) Do not rely too heavily on secondary sources.
c) Check daily newspapers as they contain very educative, current information.
d) Copy the references correctly in the first place so as to avoid the frustration of
trying to retrace a reference later.
e) Do not concentrate only on findings; also check the methodology and
measurement of variables.

ETHICS IN RESEARCH
Ethics are norms or standards of behaviour that guide moral choices about our
behaviour and our relationship with others. Ethics differ from legal constraints, in
which generally accepted standards have defined penalties that are universally
enforced. The goal of ethics in research is to ensure that no one is harmed or
suffers adverse consequences from research activities.

As the research is designed, several ethical considerations must be balanced e.g.


➢ Protect the rights of the participant or subject.
➢ Ensure the sponsor receives ethically conducted and reported research.
➢ Follow ethical standards when designing research
➢ Protect the safety of the researcher and team
➢ Ensure the research team follows the design

1. Ethical treatment of participants


In general, the research must be designed in such a manner that the respondent
does not suffer physical harm, discomfort, pain, embarrassment or loss of privacy.
To safeguard against these, the researcher should follow the following guidelines:
➢ Explain the study benefits
➢ Obtain informed consent
➢ Explain respondents’ rights and protection

(a) Benefits
Whenever direct contact is made with a respondent, the researcher should discuss
the study benefits, being careful to neither overstate nor understate the benefits.
An interviewer should begin an introduction with his or her name, the name of
the research organisation and a brief description of the purpose and benefits of
the research. This puts the respondent at ease, lets them know to whom they are
speaking and motivates them to answer questions truthfully. Inducements to
participate, financial or otherwise, should not be disproportionate to the task or
presented in a fashion that results in coercion.

Deception occurs when the respondents are told only part of the truth or when
the truth is fully compromised. The benefits to be gained by deception should be
balanced against the risks to the respondents. When possible, an experiment or
interview should be designed to reduce reliance on deception. In addition, the
respondent’s rights and well-being must be adequately protected. In instances
where deception in an experiment could produce anxiety, a subject’s medical
condition should be checked to ensure that no adverse physical harm follows.

(b) Informed consent


Securing informed consent from respondents is a matter of fully disclosing the
procedures of the proposed survey or other research design before requesting
permission to proceed with the study. There are exceptions that argue for a
signed consent form. When dealing with children, it is wise to have a parent or
other person with legal standing sign a consent form. If the researchers offer only
limited protection of confidentiality, a signed form detailing the types of limits
should be obtained. For most business research, oral consent is sufficient.

In situations where respondents are intentionally or accidentally deceived, they
should be debriefed once the research is complete. Debriefing involves several
activities following the collection of data e.g.
➢ Explanation of any deception.
➢ Description of the hypothesis, goal or purpose of the study.
➢ Post study sharing of results.
➢ Post study follow-up medical or psychological attention.
According to Neuman and Wiegand (2000), a full blown consent statement
would contain the following: -
➢ A brief description of the purpose and procedure of the research, including
the expected duration.
➢ A statement of any risks, discomforts or inconveniences associated with
participation.
➢ A guarantee of anonymity or at least confidentiality, and an explanation of
both.
➢ The identification, affiliation and sponsorship of the research as well as
contact information.
➢ A statement that participation is completely voluntary and can be
terminated at any time without penalty.
➢ A statement of any procedures that may be used.
➢ A statement of any benefits to the class of subjects involved.
➢ An offer to provide a free copy of a summary of the findings.

(c) Rights to privacy


All individuals have a right to privacy and researchers must respect that right. The
privacy guarantee is important not only to retain validity of the research but also
to protect respondents. Once the guarantee of confidentiality is given, protecting
that confidentiality is essential. The researcher can protect respondents’
confidentiality in several ways, which include: -
➢ Obtaining signed nondisclosure documents
➢ Restricting access to respondent identification.
➢ Revealing respondent information only with written consent.
➢ Restricting access to data instruments where the respondent is identified.
➢ Nondisclosure of data subsets.

Researchers should restrict access to information that reveals names, telephone
numbers, addresses or other identifying features. Only researchers who have signed
nondisclosure and confidentiality forms should be allowed access to the data. Links
between the data or database and the identifying information file should be
weakened. Individual interview response sheets should be inaccessible to everyone
except the editors and data entry personnel.
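One possible way to weaken the link between responses and identities is to keep them in two separate files that share only a random code. The sketch below is illustrative and not part of the original notes; the function name `split_identifiers` and the field names `name`, `phone` and `answer` are hypothetical.

```python
import secrets

def split_identifiers(records, id_fields=("name", "phone")):
    """Split each record into an identity row and a data row.

    The two rows share only a random code, so the data file on its
    own does not show who gave which response.
    """
    identity_file, data_file = [], []
    for record in records:
        code = secrets.token_hex(8)  # random 16-character hexadecimal code
        identity = {f: record[f] for f in id_fields if f in record}
        data = {k: v for k, v in record.items() if k not in id_fields}
        identity_file.append({"code": code, **identity})
        data_file.append({"code": code, **data})
    return identity_file, data_file

# Made-up records, for illustration only.
responses = [
    {"name": "Respondent A", "phone": "0700 000 001", "answer": 4},
    {"name": "Respondent B", "phone": "0700 000 002", "answer": 2},
]
identity_file, data_file = split_identifiers(responses)
```

Under this arrangement the identity file would be stored separately and opened only by staff who have signed the nondisclosure forms, while analysts work with the data file alone.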

Occasionally, data collection instruments should be destroyed once the data are in
a data file. Data files that make it easy to reconstruct the profiles or identification
of individual respondents should be carefully controlled. For very small groups,
data should not be made available because it is often easy to pinpoint a person
within the group. Employee-satisfaction survey feedback in small units can be
easily used to identify an individual through descriptive statistics.
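The point about small groups can be sketched as a reporting rule that withholds results for any unit with too few respondents. The threshold of five used here is an assumption chosen for illustration, as are the function name `unit_means` and the sample data.

```python
from collections import defaultdict
from statistics import mean

def unit_means(scores, min_group_size=5):
    """Return the mean score per unit, withholding small units.

    `scores` is a list of (unit, score) pairs. Units with fewer than
    `min_group_size` respondents are reported as None.
    """
    by_unit = defaultdict(list)
    for unit, score in scores:
        by_unit[unit].append(score)
    return {
        unit: (mean(values) if len(values) >= min_group_size else None)
        for unit, values in by_unit.items()
    }

# Made-up satisfaction scores: five respondents in Sales, two in Audit.
survey = [("Sales", 4), ("Sales", 3), ("Sales", 5), ("Sales", 4),
          ("Sales", 4), ("Audit", 1), ("Audit", 2)]
report = unit_means(survey)
```

Here the Sales mean is reported, while the two-person Audit unit is withheld because its members could be identified from the summary.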

Privacy is more than confidentiality. A right to privacy means one has the right to
refuse to be interviewed or to refuse to answer any question in an interview.
Potential participants have a right to privacy in their own homes, including not
admitting researchers and not answering telephones. They have the right to
engage in private behaviour in private places without fear of observation. To
address these rights, ethical researchers can do the following:-
➢ Inform respondents of their right to refuse to answer any questions or
participate in the study.
➢ Obtain permission to interview respondents
➢ Schedule field and phone interviews.
➢ Limit the time required for participation.
➢ Restrict observation to public behaviour only.

2. Ethics and the sponsor


There are ethical considerations to keep in mind when dealing with the research
client or sponsor. Whether undertaking product, market, personnel, financial or
other research, a sponsor has the right to receive ethically conducted research.

(a) Confidentiality
Sponsors have a right to several types of confidentiality including sponsor
nondisclosure, purpose nondisclosure and findings nondisclosure.
➢ Sponsor nondisclosure: Companies have a right to dissociate themselves
from the sponsorship of a research project. Due to the sensitive nature of
the management dilemma or the research question, sponsors may hire an
outside consulting or research firm to complete research projects. This is
often done when a company is testing a new product idea, to prevent
potential consumers from being influenced by the company’s current image
or industry standing. If a company is contemplating entering a new market,
it may not wish to reveal its plans to competitors. In such cases, it is the
responsibility of the researcher to respect this desire and devise a plan to
safeguard the identity of the sponsor.
➢ Purpose nondisclosure: It involves protecting the purpose of the study or its
details. A research sponsor may be testing a new idea that is not yet
patented and may not want the competitor to know his plans. It may be
investigating employee complaints and may not want to spark union
activity. The sponsor might also be contemplating a new public stock
offering, where advance disclosure would spark the interest of authorities or
cost the firm thousands of shillings.
➢ Findings nondisclosure: Even if a sponsor feels no need to hide its identity or the
study’s purpose, most sponsors want research data and findings to be
confidential, at least until the management decision is made.

(b) Right to quality research


An important ethical consideration for the researcher and the sponsor is the
sponsor’s right to quality research. The right entails:
➢ Providing a research design appropriate for the research question.
➢ Maximizing the sponsor’s value for the resources expended
➢ Providing data handling and reporting techniques appropriate for
the data collected.
From the proposal through the design to data analysis and the final report, the
researcher guides the sponsor on the proper techniques and interpretations. Often
sponsors will have heard about a sophisticated data-handling technique and will
want it used even when it is inappropriate for the problem at hand. The
researcher should propose the design most suitable for the problem. The
researcher should not propose activities designed to maximize researcher revenue
or minimize researcher effort at the sponsor’s expense. The ethical researcher
should report findings in ways that minimize the drawing of false conclusions. He
should also use charts, graphs and tables to show the data objectively, despite the
sponsor’s preferred outcomes.

(c) Sponsor’s Ethics


Occasionally, research specialists may be asked by sponsors to participate in
unethical behaviour. Compliance by the researcher would be a breach of ethical
standards. Some examples to be avoided are:
➢ Violating respondent confidentiality
➢ Changing data or creating false data to meet a desired objective
➢ Changing data presentations or interpretations.
➢ Interpreting data from a biased perspective.
➢ Omitting sections of data analysis and conclusions.
➢ Making recommendations beyond the scope of the data collected.

The ethical course often requires confronting the sponsor’s demand and taking the
following actions: -
➢ Educate the sponsor on the purpose of research
➢ Explain the researcher’s role in fact finding versus the sponsor’s role in
decision-making.
➢ Explain how distorting the truth or breaking faith with respondents leads to
future problems
➢ Failing moral suasion, terminate the relationship with the sponsor.

3. Researchers and team members


Researchers have an ethical responsibility for their team’s safety as well as their
own, and for protecting the anonymity of both the sponsor and the respondent.
(a) Safety
It is the researcher’s responsibility to design a project so the safety of all
interviewers, surveyors, experimenters, or observers is protected. Several factors
may be important to consider in ensuring a researcher’s right to safety e.g. some
urban areas and undeveloped rural areas may be unsafe for research assistants,
therefore a team member can accompany the researcher. It is unethical to require
staff members to enter an environment where they feel physically threatened.
Researchers who are insensitive to these concerns face both research and legal
risks.

(b) Ethical behaviour of assistants


Researchers should require ethical compliance from team members just as sponsors
expect ethical behaviour from the researcher. Assistants are expected to carry out
the sampling plan, to interview or observe respondents without bias and to
accurately record all necessary data. Unethical behaviour such as filling in an
interview sheet without having asked the respondent the questions cannot be
tolerated. The behaviour of the assistants is under the direct control of the
responsible researcher or field supervisor. If an assistant behaves improperly in an
interview or shares a respondent’s interview sheet with an unauthorized person, it is
the researcher’s responsibility. All researchers’ assistants should be well trained and
supervised.

(c) Protection of anonymity


Researchers and assistants protect the confidentiality of the sponsor’s information
and the anonymity of the respondents. Each researcher handling data should be
required to sign a confidentiality and nondisclosure statement.

RESEARCH DESIGN

Definition of research design


Kerlinger, F. N. (1986) defines research design as:
“The plan and structure of investigation so conceived as to obtain answers to
research questions. The plan is the overall scheme or program of the research. It
includes an outline of what the investigator will do from writing hypotheses and
their operational implications to the final analysis of data… A research design
expresses both the structure of the research problem and the plan of investigation
used to obtain empirical evidence on relations of the problem.”

Therefore a research design is the strategy for a study and the plan by which the
strategy is to be carried out. It specifies the methods and procedures for the
collection, measurement, and analysis of data.

ESSENTIALS OF RESEARCH DESIGN


The design:
a) Is an activity- and time-based plan
b) Is always based on the research question
c) Guides the selection of sources and types of information
d) Is a framework for specifying the relationships among the study’s variables
e) Outlines procedures for every research activity.

CLASSIFICATIONS OF DESIGNS
Research can be classified using eight different descriptors as shown in the table
below:
Category                                    Options
The degree to which the research question   ➢ Exploratory study
has been crystallized                       ➢ Formal study
The method of data collection               ➢ Monitoring
                                            ➢ Interrogation / communication
The power of the researcher to produce      ➢ Experimental
effects in the variables under study        ➢ Ex post facto
The purpose of the study                    ➢ Descriptive
                                            ➢ Causal
The time dimension                          ➢ Cross-sectional
                                            ➢ Longitudinal
The topical scope – breadth and depth       ➢ Case
of the study                                ➢ Statistical study
The research environment                    ➢ Field setting
                                            ➢ Laboratory research
                                            ➢ Simulation
The participants’ perceptions of            ➢ Actual routine
research activity                           ➢ Modified routine

1. Degree to which the research question has been crystallized


A study may be viewed as an exploratory study or a formal study. The essential
distinctions between these two options are the degree of structure and the
immediate objective of the study.
➢ Exploratory studies tend toward loose structures with the objective of
discovering future research tasks. Their immediate purpose is to develop
hypotheses or questions for further study.
➢ A formal study begins where the exploration leaves off: it begins with a
hypothesis or research question and involves precise procedures and data
source specifications. Its goal is to test the hypotheses or answer the research
questions posed.

2. Method of data collection


➢ Monitoring: It includes studies in which the researcher inspects the activities of
a subject or the nature of some material without attempting to elicit responses
from anyone e.g. an observation of the actions of a group of decision makers.

➢ Interrogation / communication: the researcher questions the subjects and
collects their responses by personal or impersonal means. The collected data
may result from
i. Interview or telephone conversations
ii. Self-administered or self-reported instruments sent through the mail,
left in convenient locations, or transmitted electronically or by other
means
iii. Instruments presented before and / or after a treatment or stimulus
condition in an experiment.

3. Researcher control of variables


➢ Experimental: the researcher attempts to control and / or manipulate the
variables in the study. It is appropriate when one wishes to discover whether
certain variables produce effects in other variables. Experimentation provides
the most powerful support for a hypothesis of causation.
➢ Ex post facto: Investigators have no control over the variables in the sense of
being able to manipulate them. They can only report what has happened or
what is happening. It is important that researchers using this design do not
influence the variables since doing so will introduce bias. The researcher is
limited to holding factors constant by judicious selection of subjects according
to strict sampling procedures and by statistical manipulation of findings.

4. Purpose of the study


➢ Descriptive study: research concerned with finding out who, what,
where, when, or how much.
➢ Causal study: research concerned with learning why, i.e. how one variable produces
changes in another. It tries to explain the relationships among variables.

5. The time dimension


➢ Cross-sectional studies: they are carried out once and represent a snapshot of
one point in time.
➢ Longitudinal studies: they are repeated over an extended period and track changes
over time.
6. The topical scope
➢ Statistical studies: they are designed for breadth rather than depth. They
attempt to capture a population’s characteristics by making inferences from
a sample’s characteristics. Hypotheses are tested quantitatively.
Generalizations about findings are presented based on the
representativeness of the sample and the validity of the design.

➢ Case studies: they place more emphasis on a full contextual analysis of
fewer events or conditions and their interrelations. Although hypotheses are
often used, the reliance on qualitative data makes support or rejection more
difficult. An emphasis on detail provides valuable insight for problem
solving, evaluation and strategy. This detail is secured from multiple sources
of information. It allows evidence to be verified and avoids missing data.

7. The research environment


➢ Field setting: it is where the research occurs under actual environmental
conditions
➢ Laboratory research: it is where the research occurs under staged or
manipulated conditions
➢ Simulation: To simulate is to replicate the essence of a system or process.
Simulations are increasingly used in operations research. The major
characteristics of various conditions and relationships in actual situations are
often represented in mathematical models. Role-playing and other behavioural
activities may also be viewed as simulations.
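As an illustration of representing a real situation in a mathematical model, the sketch below simulates a single-server queue, for example one technician at a help desk. The scenario, the function name `average_wait` and the choice of exponentially distributed arrival and service times are assumptions made for this example; they do not come from the notes.

```python
import random

def average_wait(arrival_rate, service_rate, customers=10_000, seed=0):
    """Estimate the average time a customer waits before being served.

    Gaps between arrivals and service durations are drawn from
    exponential distributions with the given rates.
    """
    rng = random.Random(seed)
    clock = 0.0           # arrival time of the current customer
    server_free_at = 0.0  # time at which the server becomes free
    total_wait = 0.0
    for _ in range(customers):
        clock += rng.expovariate(arrival_rate)
        start = max(clock, server_free_at)  # wait if the server is busy
        total_wait += start - clock
        server_free_at = start + rng.expovariate(service_rate)
    return total_wait / customers

quiet_day = average_wait(arrival_rate=0.5, service_rate=1.0)
busy_day = average_wait(arrival_rate=0.9, service_rate=1.0)
```

Changing `arrival_rate` lets a researcher study how waiting time responds to demand without disturbing the real system.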

8. Participants’ perceptions
The usefulness of a design may be reduced when people in a disguised study
perceive that research is being conducted. Participants’ perceptions influence the
outcomes of the research in subtle ways. There are three levels of perception:
➢ Participants perceive no deviations from everyday routines
➢ Participants perceive deviations, but as unrelated to the researcher.
➢ Participants perceive deviations as researcher-induced.
In all research environments and control situations, researchers need to be vigilant
to effects that may alter their conclusions. Participants’ perceptions serve as a
reminder to classify one’s study by type, to examine validation strengths and
weaknesses and to be prepared to qualify results accordingly.

MAJOR TYPES OF RESEARCH DESIGN


(a) Exploratory studies
Exploration is particularly useful when researchers lack a clear idea of the
problems they will meet during the study. Through exploration researchers
develop concepts more clearly, establish priorities, develop operational definitions
and improve the final research design. Other factors that necessitate the use of
exploration are
a) To save time and money
b) If the area of investigation is new
c) Important variables may not be known or thoroughly defined
d) Hypotheses for the research may be needed
e) A researcher can explore to be sure if it is practical to do a formal study in the
area.

Despite its obvious value, researchers and managers give exploration less attention
than it deserves. Exploration is sometimes linked to old biases about qualitative
research i.e. subjectiveness, non-representativeness and non-systematic design.

When we consider the scope of qualitative research, several approaches are
adaptable for exploratory investigations of management questions:
a) In-depth interviewing – usually conversational rather than structured.
b) Participant observation – to perceive first hand what participants in the setting
experience
c) Films, photographs and videotapes – to capture the life of the group under
study.
d) Case studies – for an in-depth contextual analysis of a few events or conditions
e) Document analysis – to evaluate historical or contemporary confidential or
public records, reports, government documents and opinions.

Where these approaches are combined, four exploratory techniques emerge with
wide applicability for the management researcher: -
a) Secondary data analysis
b) Experience surveys
c) Focus groups
d) Two-stage designs

An exploratory study is finished when the researchers have achieved the
following:
➢ Established the major dimensions of the research task
➢ Defined a set of subsidiary investigative questions that can be used as a guide
to a detailed research design.
➢ Developed several hypotheses about possible causes of a management
dilemma.
➢ Learned that certain other hypotheses are such remote possibilities
that they can be safely ignored in any subsequent study.
➢ Concluded additional research is not needed or is not feasible.

(b) Descriptive Studies


It is the process of collecting data in order to test hypotheses or to answer
questions concerning the current status of the subjects in the study. It determines
and reports the way things are.
It provides answers to questions such as who, what, when, where and how, and attempts
to describe such things as possible behaviour, attitudes, values and characteristics.
(c) Causal Research
It is used to explore relationships between variables. It determines reasons or
causes for the current status of the phenomenon under study. The variables of
interest cannot be manipulated unlike in experimental research.
Advantages of causal study
➢ Allows a comparison of groups without having to manipulate the
independent variables
➢ It can be done solely to identify variables worthy of experimental
investigation
➢ They are relatively cheap.
Disadvantages of causal study
➢ Interpretations are limited because the researcher does not know whether a
particular variable is a cause or result of a behaviour being studied.
➢ There may be a third variable which could be affecting the established
relationship but which may not be established in the study.

(d) Correlation Methods

It describes in quantitative terms the degree to which variables are related. It
explores relationships between variables and also tries to predict a subject’s score
on one variable given his or her score on another variable.
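A minimal sketch of both ideas follows: the Pearson correlation coefficient as the measure of the degree of relationship, and a least-squares line as the means of prediction. The notes do not name a specific coefficient, so the choice of Pearson's r is an assumption, and the study-hours and marks data are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def predict(x, y, new_x):
    """Predict y at new_x from the least-squares line through (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y + slope * (new_x - mean_x)

hours = [2, 4, 6, 8, 10]      # invented weekly study hours
marks = [50, 55, 65, 70, 80]  # invented exam marks
r = pearson_r(hours, marks)            # about 0.99 for this data
predicted = predict(hours, marks, 7)   # 67.75 for this data
```

A value of r close to +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.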

Advantages of the correlational method


➢ Permits one to analyze inter-relationships among a large number of
variables in a single study.
➢ Allows one to analyze how several variables either singly or in combination
might affect a particular phenomenon being studied.
➢ The method provides information concerning the degree of relationship
between variables being studied.

Disadvantages of the correlational method


➢ Correlation between two variables does not necessarily imply causation
although researchers often tend to interpret such a relationship to mean
causation.
➢ Since the correlation coefficient is only an index, any two variables will
yield some value of the coefficient, even when common sense dictates that such variables
are not related.
➢ The correlation coefficient is very sensitive to the size of the sample.

THE SAMPLE DESIGN


It refers to the technique or procedure the researcher adopts in selecting
items for the sample.
Factors to consider in developing a sample design
a) Type of universe; finite or infinite
b) Sampling unit: geographic (state, district or village), construction unit (house or
flat) or social unit (family, club, school or individual).
c) Source list (sampling frame): contains the names of all items of a
universe. The list should be comprehensive, correct, reliable and
appropriate.
d) The size of the sample. Should be efficient, representative, reliable and
flexible.
e) Parameters of interest
f) Budgetary constraint
g) Sampling procedure.

Criteria for selecting a sampling procedure


Two costs are involved in a sampling analysis i.e. the cost of collecting the data
and the cost of an incorrect inference resulting from the data. Two causes of
incorrect inferences are systematic bias and sampling error. A systematic bias
results from errors in the sampling procedures and it cannot be reduced or
eliminated by increasing the sample size. Systematic bias is the result of the
following factors:-
➢ Inappropriate sampling frame
➢ Defective measuring device
➢ Non-respondents
➢ Indeterminacy principle – individuals act differently when kept under
observation.
➢ Natural bias in reporting data e.g. government tax – downward bias, social
organizations – upward bias.
Sampling error is the random variation in the sample estimates around the true
population parameter. It decreases as the size of the sample increases and
tends to be smaller in a homogeneous population.
While selecting a sampling procedure, the researcher must ensure that the
procedure causes a relatively small sampling error and helps to control the
systematic bias in a better way.
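The difference between the two causes of incorrect inference can be seen in a small simulation. The sketch below is illustrative only: the population is artificial (normally distributed values with mean 50 and standard deviation 10), and the defective sampling frame that omits values below 45 is a made-up example of systematic bias.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so that the run is repeatable
population = [random.gauss(50, 10) for _ in range(10_000)]
true_mean = mean(population)

def average_error(sample_size, frame, trials=200):
    """Mean absolute gap between sample means and the true mean."""
    gaps = []
    for _ in range(trials):
        sample = random.sample(frame, sample_size)
        gaps.append(abs(mean(sample) - true_mean))
    return mean(gaps)

# Sampling error: a complete frame, small versus large samples.
error_small = average_error(25, population)
error_large = average_error(400, population)

# Systematic bias: a defective frame that omits every value below 45.
biased_frame = [v for v in population if v >= 45]
bias_small = average_error(25, biased_frame)
bias_large = average_error(400, biased_frame)
```

With the complete frame, the error shrinks as the sample grows. With the defective frame, the estimates stay well away from the true mean however large the sample, which is why systematic bias has to be controlled through the sampling procedure itself.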

Steps in sampling design


Identification of the: -
a) Relevant population
b) Type of universe i.e. finite or infinite
c) Parameters of interest
d) Sampling frame
e) Type of sample i.e. probabilistic or non-probabilistic
f) Size of the sample needed

Characteristics of a good sample design


a) Must result in a truly representative sample
b) Must result in a small sampling error
c) Must be viable in the context of funds available for the research study
d) Must ensure that systematic bias is controlled in a better way
e) Must be such that the results of the sample study can be applied in general
for the universe with a reasonable level of confidence.

The methodology section of a research study describes the procedures that are to
be followed in conducting the study. The techniques of obtaining data are
developed.
Population: It’s a complete set of individuals, cases or objects with some
observable characteristics.
A census is a count of all the elements in a population.
Sample: A sample is a subset of a particular population. The target population is
that population to which a researcher wants to generalize the results of the study.
There must be a rationale for defining and identifying the accessible population
from the target population.
Sampling: It’s the process of selecting a sample from a population.


Reasons for sampling


a) Cost
b) Time: Greater speed of data collection
c) Destructive nature of certain tests
d) Greater accuracy of results
e) Physical impossibility of checking all items in the population.
f) Availability of population elements.

Characteristics of a good sample


a) Accuracy: It’s the degree to which bias is absent from the sample. An
unbiased sample is one in which the underestimators and the
overestimators are balanced among the members of the sample.
b) Precision of estimate: Precision is measured by the standard error of
estimate, a type of standard deviation measurement. The smaller the standard error of
estimate, the higher the precision of the sample.

Factors that influence the sample size


➢ Dispersion / variance: The greater the dispersion or variance within the
population, the larger the sample must be to provide estimation precision.
➢ Precision of the estimate: the greater the desired precision of the estimate, the
larger the sample must be.
➢ Interval range: The narrower the interval range, the larger the sample must
be.
➢ Confidence level: The higher the confidence level in the estimate, the larger
the sample must be.
➢ Number of subgroups: The greater the number of subgroups of interest
within a sample, the greater the sample size must be, as each subgroup must
meet minimum sample size requirements.
➢ If the calculated sample size exceeds 5% of the population, sample size may
be reduced without sacrificing precision.
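
The last point can be illustrated with the finite population correction. The notes do not state a formula for it; the sketch below assumes the commonly used adjustment n_adj = n / (1 + (n - 1) / N), and the function name and the figures are hypothetical.

```python
import math


def adjust_for_finite_population(n: int, population_size: int) -> int:
    """Shrink an initial sample size n using the finite population correction.

    Assumed formula (not given in the notes): n_adj = n / (1 + (n - 1) / N).
    """
    if n <= 0 or population_size <= 0:
        raise ValueError("n and population_size must be positive")
    adjusted = n / (1 + (n - 1) / population_size)
    return math.ceil(adjusted)  # round up so the precision target is still met


# Hypothetical case: 984 is about 20% of a population of 5,000, well above 5%.
print(adjust_for_finite_population(984, 5000))
```

When the population is very large relative to n, the adjustment leaves the sample size practically unchanged.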

Sampling procedures:
There are two major ways of selecting samples:
➢ Probability sampling methods
➢ Non - Probability sampling methods

1. Probability Sampling Methods


Samples are selected in such a way that each item or person in the population has
a known (nonzero) likelihood of being included in the sample.

Types of Probability sampling methods


a) Simple Random Sampling:
A sample is selected so that each item or person in the population has the same
chance of being included.

Advantages
➢ Easy to implement with automatic dialing and with computerized voice
response systems.
Disadvantages
➢ Requires a listing of population elements.
➢ Takes more time to implement
➢ Uses larger sample sizes
➢ Produces larger errors
➢ Expensive
b) Systematic Random Sampling:
The items or individuals of the population are arranged in some manner. A
random starting point is selected and then every kth member of the population
is selected for the sample.
Advantages
➢ Simple to design
➢ Easier to use than simple random sampling.
➢ Easy to determine sampling distribution of mean or proportion.
➢ Less expensive than simple random sampling.
Disadvantages
➢ Periodicity within the population may skew the sample and results.
➢ If the population list has a monotonic trend, a biased estimate will result
based on the start point.
c) Stratified Random Sampling:
A population is divided into subgroups called strata and a sample is selected
from each stratum. After the population is divided into strata, either a
proportional or a non-proportional sample can be selected. In a proportional
sample, the number of items in each stratum is in the same proportion as in the
population while in a non-proportional sample, the number of items chosen in
each stratum is disproportionate to the respective numbers in the population.
Advantages
➢ Researcher controls sample size in strata
➢ Increased statistical efficiency
➢ Provides data to represent and analyze subgroups.
➢ Enables use of different methods in strata.

Disadvantages
➢ Increased error will result if subgroups are selected at different rates
➢ Expensive, especially if strata in the population have to be created.

d) Cluster Sampling:
The population is divided into internally heterogeneous subgroups and some
are randomly selected for further study. It is used when it is not possible to
obtain a sampling frame because the population is either very large or
scattered over a large geographical area. A multi-stage cluster sampling method
can also be used.
Advantages
➢ Provides an unbiased estimate of population parameters if properly done.
➢ Economically more efficient than simple random.
➢ Lowest cost per sample, especially with geographic clusters.
➢ Easy to do without a population list.

Disadvantages
➢ More error (lower statistical efficiency) due to subgroups being
homogeneous rather than heterogeneous.
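
The four probability sampling methods above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the sampling frame of 100 numbered units, the strata and cluster names, and all function names are hypothetical, and the stratified version uses proportional allocation only.

```python
import random


def simple_random_sample(population, n, rng):
    """Every element has the same chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Random start within the first k elements, then every kth element."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, fraction, rng):
    """Proportional allocation: draw the same fraction from every stratum."""
    sample = []
    for members in strata.values():
        n = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every element in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical sampling frame of 100 units
strata = {"undergraduate": population[:80], "graduate": population[80:]}
clusters = {f"campus_{i}": population[i * 20:(i + 1) * 20] for i in range(5)}

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(strata, 0.1, rng))
print(cluster_sample(clusters, 2, rng))
```

Note that only the cluster version works without a full list of population elements: it needs a list of clusters, and elements are enumerated only inside the clusters that were chosen.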

2. Non - Probability Sampling Methods


It is used when a researcher is not interested in selecting a sample that is
representative of the population.
a) Convenience or Accidental Sampling
It involves selecting cases or units of observation as they become available to
the researcher e.g. asking a question to the radio listeners, roommates or
neighbours.
b) Purposive Sampling: There are two main types: judgement and quota.
i. Judgement Sampling: Occurs when a researcher selects sample members to
conform to some criterion. It allows the researcher to use cases that have the
required information with respect to the objectives of his or her study e.g.
educational level, age group, religious sect etc.
ii. Quota Sampling
The researcher purposively selects subjects to fit the quotas identified e.g.
a) Gender: Male or Female.
b) Class Level: Graduate or Undergraduate
c) School: Humanities, Science or Human Resource Development.
d) Religion: Muslim, Protestant, Catholic, Jewish.
e) Fraternal affiliation: member or nonmember.
f) Social economic class: Upper, middle or lower.

Advantage
Widely used by pollsters, marketers and other researchers.
Disadvantages
a) It gives no assurance that the sample is representative of the variables being
studied.
b) The data used to provide controls may be outdated or inaccurate.
c) There is a practical limit on the number of simultaneous controls that can
be applied to ensure precision.
d) Since the choice of subjects is left to field workers, they may choose only
friendly looking people.


c) Snowball sampling

It is used when the population that possesses the characteristics under study is not
well known and can be best located through referral networks. Initial subjects are
identified who in turn identify others. It is commonly used in studies of drug
cultures, teenage gang activities, the Mungiki sect, insider trading, the Mau Mau etc.

Sampling error
It’s the difference between a sample statistic and its corresponding population
parameter. The sampling distribution of the sample means is a probability
distribution of possible sample means of a given sample size.
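
As a small illustration of these two definitions, the sketch below computes the sampling error of one sample mean and then enumerates the sampling distribution of the mean for every possible sample of a given size. The five-value population and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def sampling_error(sample, population):
    """Sample statistic minus population parameter (here: the mean)."""
    return mean(sample) - mean(population)


def sampling_distribution_of_mean(population, n):
    """Means of all possible samples of size n drawn without replacement."""
    return [mean(s) for s in combinations(population, n)]


population = [2, 4, 6, 8, 10]  # hypothetical population
sample = [2, 4, 6]             # one possible sample of size 3

print(sampling_error(sample, population))
sample_means = sampling_distribution_of_mean(population, 3)
print(sorted(sample_means))
print(mean(sample_means))      # centre of the sampling distribution
```

Individual samples over- or underestimate the population mean, but the mean of all possible sample means coincides with it.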

Statistical Inference
Sample information is used to shed some light on the population characteristics
i.e. we infer population properties based on findings from the sample. Statistical
inference falls into two main areas i.e. statistical estimation and hypothesis testing.
Statistical Estimation: The characteristics of the sample (sample statistics) are used to
estimate or approximate some unknown population characteristics.
Hypothesis testing: The population characteristics are known or assumed. The
sample characteristics are used to verify or ascertain this assumed or known
population characteristic.
The assignment of values to a population parameter based on a sample is called
estimation. The value assigned to a population parameter based on the value of a
sample statistic is called an estimate of the population parameter. The sample
statistic used to estimate a population parameter is called an estimator.
Estimation can be undertaken in two forms, namely point estimation or interval
estimation.

Selecting the sample size to estimate a population mean


One of the most common questions asked of statisticians is, how large should the
sample taken in a survey be? The answer to this question depends on three
factors:-
i. The parameter to be estimated

ii. The desired confidence level of the interval estimator

iii. The maximum error of estimation, where the error of estimation is the
absolute difference between the point estimator and the parameter e.g. the
point estimator of $\mu$ is $\bar{x}$ so that the error of estimation = $|\bar{x} - \mu|$

The maximum error of estimation is also called the error bound and is denoted B.
Suppose the parameter of interest in an experiment is the population mean $\mu$. The
confidence interval estimator (assuming a normal population, with the population
variance known) is $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. If we want to estimate $\mu$ to within a certain
specified bound B, we will want the confidence interval estimator to be $\bar{x} \pm B$. As
a consequence, we have $z_{\alpha/2}\,\sigma/\sqrt{n} = B$. Solving for $n$, we get the following result:

$n = \left( \frac{z_{\alpha/2}\,\sigma}{B} \right)^2$

A popular method of approximating $\sigma$ is to begin by approximating the range of
the random variable. A conservative estimate of $\sigma$ is the range divided by 4 i.e.
$\sigma \approx \text{Range}/4$. This produces a larger value of $\sigma$, which results in a larger value of
$n$, which then estimates $\mu$ with an interval at least as good as was specified.
Examples
1. A production manager would like to estimate the mean time required for
workers to complete a task on an assembly line. Assume that she knows that
$\sigma$ is 80 seconds. How large a sample should she draw to estimate $\mu$ to within
5 seconds with (i) 90% confidence (ii) 95% confidence (iii) 99% confidence?

2. Find $n$, given that we want to estimate $\mu$ to within 10 units with 95%
confidence, assuming that …

3. The operations manager of a large production plant would like to estimate the
average amount of time a worker takes to assemble a new electronic
component. After observing a number of workers assembling similar devices,
she noted that the shortest time taken was 10 minutes and the longest time
taken was 22 minutes. How large a sample of workers should she take if she
wants to estimate the mean assembly time to within 20 seconds? Assume that
the confidence level is to be 99%.

4. Determine the sample size necessary to estimate $\mu$ to within 10 units with
99% confidence. We know that the range of the population is 200 units.
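
The sample size rule for a mean, n = (z × σ / B)², with z the two-sided critical value for the chosen confidence level, can be sketched as follows for Examples 1 and 3. This is one possible way to carry out the arithmetic, assuming Python's standard-library `statistics.NormalDist` for the critical value; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value z_{alpha/2} of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_for_mean(sigma: float, bound: float, confidence: float) -> int:
    """n = (z_{alpha/2} * sigma / B)^2, rounded up to a whole unit."""
    z = z_critical(confidence)
    return math.ceil((z * sigma / bound) ** 2)


def sigma_from_range(low: float, high: float) -> float:
    """Conservative approximation: sigma is about range / 4."""
    return (high - low) / 4


# Example 1: sigma = 80 seconds, B = 5 seconds.
for level in (0.90, 0.95, 0.99):
    print(level, sample_size_for_mean(80, 5, level))

# Example 3: times range from 10 to 22 minutes (in seconds); B = 20 seconds.
sigma = sigma_from_range(10 * 60, 22 * 60)
print(sample_size_for_mean(sigma, 20, 0.99))
```

Answers worked by hand from a printed z table (for example z = 2.575 at 99%) may differ from these by one unit, because the table values are rounded.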

Selecting the sample size to estimate a population proportion


The sample size necessary to estimate p is

$n = \left( \frac{z_{\alpha/2}\sqrt{\hat{p}\,\hat{q}}}{B} \right)^2$, where $\hat{q} = 1 - \hat{p}$.

1. The manager of a bank feels that 35% of branches will have enhanced yearly
collection of deposits after introducing a hike in interest rate. Determine the
sample size such that the estimated proportion is within plus or minus 0.06 at a
confidence level of (i) 90% (ii) 95% and (iii) 99%.

2. How large a sample should be taken in order to estimate p to within 0.01
with 95% confidence? Assume that:

a) You have no information about the value of p

b) p is believed to be approximately 0.10

c) p is believed to be approximately 0.90

3. The director of a management school feels that 55% of students will have
enhanced performance if additional input is given to them. Determine the
sample size such that the estimated proportion is within plus or minus 0.10 at a
confidence level of 95%.
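
The proportion exercises above follow the same pattern as the mean: n = z² × p̂ × q̂ / B², with q̂ = 1 − p̂. A minimal sketch follows; the function name is hypothetical, and using p̂ = 0.5 when nothing is known about p is a common convention assumed here (it maximises p̂q̂), not something stated in the notes.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p_hat: float, bound: float, confidence: float) -> int:
    """n = (z_{alpha/2} * sqrt(p_hat * q_hat) / B)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q_hat = 1 - p_hat
    return math.ceil((z * math.sqrt(p_hat * q_hat) / bound) ** 2)


# Exercise 1: p_hat = 0.35, B = 0.06.
for level in (0.90, 0.95, 0.99):
    print(level, sample_size_for_proportion(0.35, 0.06, level))

# Exercise 2: B = 0.01 at 95% confidence.
print(sample_size_for_proportion(0.5, 0.01, 0.95))   # (a) assumed worst case p = 0.5
print(sample_size_for_proportion(0.10, 0.01, 0.95))  # (b)
print(sample_size_for_proportion(0.90, 0.01, 0.95))  # (c)

# Exercise 3: p_hat = 0.55, B = 0.10 at 95% confidence.
print(sample_size_for_proportion(0.55, 0.10, 0.95))
```

Cases (b) and (c) need the same sample size because p̂q̂ is symmetric around 0.5.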

MEASUREMENT
Introduction

While people measure things casually in daily life, research measurement is more
precise and controlled. In measurement, one settles for measuring properties of
the objects rather than the objects themselves. An event is measured in terms of its
duration i.e. what happened during it, who was involved, where it occurred etc.
Measurement is the basis for all systematic inquiry because it provides us with the
tools for recording differences in the outcome of variable change.

Definition of Measurement

Measurement is the procedure by which we assign numerals, numbers, or other
distinguishing values to variables according to rules. These rules help us determine
the kinds of values we will assign to certain observable phenomena or variables.
They also determine the quality of measurement. Precision and exactness in
measurement are vitally important. The measures are what are actually used to
test the hypotheses. A researcher needs good measures for both independent and
dependent variables.

Measurement is a three-part process that includes:

a) Selecting observable empirical events


b) Developing a set of mapping rules: a scheme for assigning numbers or
symbols to represent aspects of the event being measured.
c) Applying the mapping rules to each observation of that event

Mapping rules have four characteristics:

a) Classification: Numbers are used to group or sort responses. No order
exists.

b) Order: Numbers are ordered. One number is greater than, less than or
equal to another number.
c) Distance: Differences between numbers are ordered. The difference
between any pair of numbers is greater than, less than or equal to the
difference between any other pair of numbers.
d) Origin: The number series has a unique origin indicated by the number
zero. This is an absolute and meaningful zero point.

Measurement consists of two basic processes called conceptualization and
operationalization, followed by a more advanced process of determining the levels
of measurement, and then even more advanced methods of assessing reliability and
validity.

Conceptualization is the process of taking a construct or concept and refining it by
giving it a conceptual or theoretical definition. Ordinary dictionary definitions will
not do. Instead, the researcher takes keywords in their research question or
hypothesis and finds a clear and consistent definition that is agreed-upon by others
in the scientific community. Conceptualization is often guided by the theoretical
framework, perspective, or approach the researcher is committed to.

Operationalization is the process of taking a conceptual definition and making it
more precise by linking it to one or more specific, concrete indicators or
operational definitions. These are usually things with numbers in them that reflect
empirical or observable reality. For example, if the type of crime one has chosen
to study is theft (as representative of crime in general), creating an operational
definition for it means at least choosing between petty theft and grand theft (false
taking of less or more than $150).

LEVELS OF MEASUREMENT

A level of measurement is a scale by which a variable is measured. For 50 years,
with few detractors, science has used the Stevens (1951) typology of measurement
levels (scales). There are three things to remember about this typology:

➢ Anything that can be measured falls into one of the four types;
➢ The higher the level of measurement, the more precision in measurement; and
➢ Every level up contains all the properties of the previous level.

The four levels of measurement, from lowest to highest, are:

a) Nominal level: observations are classified under a common
characteristic e.g. sex, race, marital status, employment status, language,
religion etc. It helps in sampling.
b) Ordinal level: items or subjects are not only grouped into categories but
are also ranked in some order e.g. greater than, less than, superior,
happier than, poorer, above etc. It helps in developing a Likert scale.
c) Interval level: numerals are assigned to each measure and ranked. The
intervals between numerals are equal. The numerals used represent
meaningful quantities but the zero point is not meaningful e.g. test scores,
temperature.
d) Ratio level: has all the characteristics of the other levels and in addition the
zero point is meaningful. Mathematical operations can be applied to yield
meaningful values e.g. height, weight, distance, age, area etc.
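
The statement that every level contains all the properties of the previous level can be made concrete by pairing the four levels with the four mapping-rule characteristics listed earlier (classification, order, distance, origin). The sketch below is illustrative only; the class name, the dictionary and the variable-to-level assignments are hypothetical, with the example variables taken from the list above.

```python
from enum import IntEnum


class Level(IntEnum):
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Property that each level adds on top of the levels below it.
ADDED_PROPERTY = {
    Level.NOMINAL: "classification",
    Level.ORDINAL: "order",
    Level.INTERVAL: "distance",
    Level.RATIO: "origin",
}


def properties(level: Level) -> list[str]:
    """All mapping-rule characteristics available at the given level."""
    return [ADDED_PROPERTY[lv] for lv in Level if lv <= level]


# Hypothetical assignments based on the examples in the notes.
variables = {
    "religion": Level.NOMINAL,
    "temperature": Level.INTERVAL,
    "age": Level.RATIO,
}
for name, level in variables.items():
    print(name, level.name, properties(level))
```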

Sources of measurement differences

The ideal study should be designed and controlled for precise and unambiguous
measurement of the variables. Since 100% control is unattainable, error occurs.
Much potential error is systematic (results from a bias) while the remainder is
random (occurs erratically). Some of the major sources of error are:
(a) The respondent: opinion differences that affect measurement come from
relatively stable characteristics of the respondent e.g. employee status, ethnic
group and social class. Temporary factors like fatigue, boredom, anxiety and
other distractions also limit the ability to respond accurately and fully.
Hunger, impatience or general variations in mood will also have an impact.
(b) The situational factors: any condition that places a strain on the interview
or measurement session can have serious effects on the interviewer-respondent
rapport. If another person is present, that person can distort
responses by joining in, by distracting or by merely being present. If the
respondents believe anonymity is not ensured, they may be reluctant to
express certain feelings.
(c) The measurer: the interviewer can distort responses by re-wording,
paraphrasing, or re-ordering questions. Stereotypes in appearance and action
introduce bias. Inflections of voice or unconscious prompting with smiles and
nods may encourage or discourage certain replies. Incorrect coding, careless
tabulation and faulty statistical calculation may introduce further errors in
data analysis.
(d) The data collection instrument: a defective instrument can cause distortion in
two major ways:
➢ It can be too confusing and ambiguous e.g. the use of complex
words, leading questions, ambiguous meanings, multiple questions.
➢ It can sample poorly from the universe of content items. Seldom
does the instrument explore all the potentially important issues.


TYPES OF VARIABLES

A variable is a measurable characteristic that assumes different values among the
subjects. According to Mugenda and Mugenda (2003), variables can be classified
into the following categories:

1. Independent variables / Predictor variables


It is a variable that a researcher manipulates in order to determine its effect or
influence on another variable. Independent variables predict the amount of
variation that occurs in another variable.

Types of independent variables


i. Experimental variables: These are variables over which the researcher has
manipulative control. They are commonly used in biological and
physical sciences e.g. influence of amount of fertilizer on the yield of
wheat, influence of alcohol on reaction time.

ii. Measurement types of independent variables: These are variables which have
already occurred. They have fixed properties that the researcher cannot
manipulate or influence. Most of these variables are either environmental or
personological e.g. age, gender, marital status, race, colour, geographical
location, nationality, soil type, altitude etc. (e.g. influence of nationality on
choice of food).

2. Dependent variables / criterion variables


It is the variable that is measured, predicted or monitored and is expected to be
affected by manipulation of an independent variable. It indicates
the total influence arising from the effects of the independent variable. It varies as
a function of the independent variable e.g. influence of hours studied on
performance in a statistical test, influence of distance from the supply center on
cost of building materials.

3. Extraneous variables
They are those variables that affect the outcome of a research study either because
the researcher is not aware of their existence or if the researcher is aware, she or
he has no control over them.

Extraneous variables are often classified into three types:

a) Subject variables, which are the characteristics of the individuals being
studied that might affect their actions. These variables include age, gender,
health status, mood, background, etc.
b) Experimental variables are characteristics of the persons conducting the
experiment which might influence how a person behaves. Gender, the
presence of racial discrimination, language, or other factors may qualify as
such variables.
c) Situational variables are features of the environment in which the study or
research was conducted, which have a bearing on the outcome of the
experiment in a negative way. Included are the air temperature, level of
activity, lighting, and the time of day.

4. Control variables / concomitant / covariate or blocking variables


They are extraneous variables that are built into the study. Extraneous variables
are variables, which influence the results of a study when they are not controlled.

Reasons for introducing control variables:


➢ It increases the validity of the data.
➢ It leads to more convincing generalizations.

Since absolute control of extraneous variables is not possible in any study, results
are interpreted on the basis of degrees of confidence rather than certainty.

Once the major extraneous variables are identified, the researcher can control
them by:-
i. Building the extraneous variable into the study, i.e. including it as an
independent variable. E.g. in determining the effect of alcohol on reaction
time, sex may influence reaction time. Therefore, sex can be introduced as
an independent variable. Using regression, one can measure the effect of
alcohol on reaction time while controlling for sex.
ii. Including it in the study but only at one level e.g. reaction time is the
dependent variable, alcohol level the independent variable and sex the
extraneous variable. Sex can be controlled by sampling only females or only
males of a given age. The disadvantage of this method is that generalizations
are limited to a smaller population.
iii. Removing the effects of the extraneous variable by statistical procedures
i.e. by partialling out its effects on the dependent variable. This can be done by:
➢ Analysis of co-variance
➢ Partial correlation.
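
As an illustration of the statistical route, a first-order partial correlation removes the influence of one extraneous variable z from the correlation between x and y. The notes do not give a formula; the sketch assumes the standard one, r_xy.z = (r_xy − r_xz × r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)), and the correlation values are hypothetical.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, holding z constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical correlations: x = alcohol level, y = reaction time, z = sex.
print(round(partial_correlation(r_xy=0.60, r_xz=0.50, r_yz=0.50), 3))
```

If z is unrelated to both x and y, the partial correlation equals the ordinary correlation, so nothing is lost by controlling for it.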
5. Intervening variables

They are a special case of extraneous variables. The difference between
intervening and extraneous variables is in the assumed relationship among the
variables. An intervening variable is a hypothetical internal state that is used to
explain relationships between observed variables, such as independent and
dependent variables, in empirical research. With an extraneous variable, there is
no causal link between the independent and dependent variables, but they are
independently associated with a third variable, the extraneous variable. An
intervening variable is recognized as being caused by the independent variable
and as being a determinant of the dependent variable.

Independent → intervening → dependent

The total effect of an independent variable on a dependent variable can be
subdivided into direct and indirect effects.

➢ Indirect effects are those transmitted through an intervening variable.
➢ Direct effects are not transmitted through another variable.

The choice of the right intervening variables helps one not only to determine
accurately the total effects of an independent variable on the dependent variable
but also partition the total effects into direct and indirect.

Examples of intervening variables include: motivation, intelligence, intention, and
expectation.

6. Antecedent variables

An antecedent variable does not interfere with the established relationship
between an independent and a dependent variable but clarifies the influence that
precedes such a relationship.

Antecedent → independent → dependent

Conditions that must hold for a variable to be classified as an antecedent variable:

➢ The variables, including the antecedent variable, must be related in some
logical sequence.
➢ When the antecedent variable is controlled for, the relationship between the
independent and the dependent variables should not disappear. Rather it
should be enhanced.
➢ When the independent variable is controlled for or its influence removed,
there should not be any relationship between the antecedent variable and the
dependent variable.

e.g. political stability – attracts investors – increased job opportunities – high
standards of living – reduction of poverty.

7. Suppressor variables

It is an extraneous variable which, when not controlled for, conceals a relationship
between two variables. When a suppressor variable is introduced in the study
as a control variable, a true relationship emerges.

8. Distorter variables

It is a variable that converts what was thought of as a positive relationship into a
negative relationship and vice-versa. Its effects lead a researcher into drawing
erroneous conclusions from the data. When the distorter variable is controlled, a
true relationship is obtained. Consideration of distorter variables in a study
reduces the chances of making a Type I error (rejecting a true null hypothesis) or a
Type II error (failing to reject a false null hypothesis).

9. Exogenous and endogenous variables

They are commonly used in testing hypothesized causal models. Path analysis (a
procedure that tests causal links among several variables) is often used in testing
the validity of causal relationships in a theory or model.

[Figure: path diagram showing the causal links among variables A, B, C and D]

C and D are called endogenous variables. Each endogenous variable is caused or
explained by the variables that precede it. E.g. D is caused by A, B and C.

A and B are called exogenous variables. They lack hypothesized causes in the
model.
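
The distinction can be expressed mechanically: in a causal model, a variable with no hypothesized causes is exogenous, and any variable with at least one cause is endogenous. In the sketch below, the link from A, B and C to D follows the text, while the causes listed for C are an assumption made only for illustration.

```python
# Hypothetical causal model: each key is caused by the variables listed for it.
# The notes state that D is caused by A, B and C; the causes of C are assumed.
causes = {
    "A": [],
    "B": [],
    "C": ["A", "B"],
    "D": ["A", "B", "C"],
}

# Exogenous variables have no hypothesized causes inside the model.
exogenous = sorted(v for v, parents in causes.items() if not parents)
# Endogenous variables are explained by at least one other variable.
endogenous = sorted(v for v, parents in causes.items() if parents)

print("Exogenous:", exogenous)
print("Endogenous:", endogenous)
```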

Validity and Reliability in Research

The quality of a research study depends to a large extent on the accuracy of the
data collection procedures. Reliability and validity measure the relevance and
correctness of the data.

Reliability

Reliability is the extent to which an experiment, test, or any measuring procedure
yields the same result on repeated trials. Without the agreement of independent
observers able to replicate research procedures, or the ability to use research tools
and procedures that yield consistent measurements, researchers would be unable
to satisfactorily draw conclusions, formulate theories, or make claims about the
generalizability of their research. In addition to its important role in research,
reliability is critical for many parts of our lives, including manufacturing, medicine
and sports. Reliability is such an important concept that it has been defined in
terms of its application to a wide range of activities.

Reliability is influenced by random error. Random error is the deviation from a
true measurement due to factors that have not effectively been addressed by the
researcher. As random error increases, reliability decreases.

Causes of random error


➢ Inaccurate coding
➢ Ambiguous instruction to the subjects
➢ Interviewer’s fatigue
➢ Interviewee’s fatigue
➢ Interviewer’s bias
Research instruments yield data that have two components: the true value or
score and an error component. The error component of the data reflects the
limitations of the instrument. There are three types of errors that arise at the time
of data collection:
➢ Error due to the inaccuracy of the instrument
➢ Error due to the inaccuracy of scoring by the researcher
➢ Unexplained error

Ways of Assessing Reliability


➢ Test-Retest
➢ Equivalent form
➢ Internal consistency
➢ Interrater reliability

1. The Test-Retest technique


It involves administering the same instrument twice to the same group of subjects,
but after some time. Stability reliability (sometimes called test-retest reliability) is
the agreement of measuring instruments over time. To determine stability, a
measure or test is repeated on the same subjects at a future date. Results are
compared and correlated with the initial test to give a measure of stability.

An example of stability reliability would be the method of maintaining weights
used by the Kenya Bureau of Standards. Platinum objects of fixed weight (one
kilogram, half kilogram, etc.) are kept locked away. Once a year they are taken
out and weighed, allowing scales to be reset so they are "weighing" accurately.
Keeping track of how much the scales are off from year to year establishes stability
reliability for these instruments. In this instance, the platinum weights themselves
are assumed to have a perfectly fixed stability reliability.
Disadvantages
➢ Subjects may be sensitized by the first testing hence will do better in the second
test
➢ Difficulty in establishing a reasonable period between the two testing sessions.
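
The comparison of the two administrations is usually summarised as a correlation coefficient. A minimal sketch, assuming Pearson's r as the measure of stability; the scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five subjects on two administrations of one instrument.
first = [12, 15, 9, 20, 17]
second = [13, 14, 10, 19, 18]
print(round(pearson_r(first, second), 3))
```

A coefficient close to 1 indicates that subjects keep roughly the same relative standing across the two sessions, i.e. high stability.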

2. Equivalent form

Equivalency reliability is the extent to which two items measure identical concepts
at an identical level of difficulty. Equivalency reliability is determined by relating
two sets of test scores to one another to highlight the degree of relationship or
association. In quantitative studies and particularly in experimental studies, a
correlation coefficient, statistically referred to as r, is used to show the strength of
the correlation between a dependent variable (the subject under study) and one
or more independent variables, which are manipulated to determine effects on the
dependent variable. An important consideration is that equivalency reliability is
concerned with correlational, not causal, relationships.

For example, a researcher studying university Bachelor of Commerce students
happened to notice that when some students were studying for finals, their
holiday shopping began. Intrigued by this, the researcher attempted to observe
how often, or to what degree, these two behaviors co-occurred throughout the
academic year. The researcher used the results of the observations to assess the
correlation between studying throughout the academic year and shopping for
gifts. The researcher concluded there was poor equivalency reliability between the
two actions. In other words, studying was not a reliable predictor of shopping for
gifts.

Two instruments are used. Specific items in each form are different but they are
designed to measure the same concept. They are the same in number, structure
and level of difficulty e.g. TOEFL, GRE

Advantages
➢ Estimates the stability of the data as well as the equivalence of the items in the
two forms

Disadvantages
➢ Difficulty in constructing two tests, which measure the same concept (time and
resources).

3. Internal consistency technique


Internal consistency is the extent to which tests or procedures assess the same
characteristic, skill or quality. It is a measure of the precision between the
observers or of the measuring instruments used in a study. This type of reliability
often helps researchers interpret data and predict the value of scores and the limits
of the relationship among variables.
For example, a researcher designs a questionnaire to find out about college
students' dissatisfaction with a particular textbook. Analyzing the internal
consistency of the survey items dealing with dissatisfaction will reveal the extent to
which items on the questionnaire focus on the notion of dissatisfaction.
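
One widely used statistic for internal consistency is Cronbach's alpha. The notes do not name a specific statistic, so its use here is an assumption; the ratings below are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of scores per questionnaire item, with the same
    respondents in the same order in every list.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(pvariance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / pvariance(totals))


# Hypothetical 1-5 ratings from five students on three dissatisfaction items.
items = [
    [4, 5, 2, 3, 1],
    [4, 4, 2, 3, 2],
    [5, 5, 1, 3, 1],
]
print(round(cronbach_alpha(items), 3))
```

Values near 1 suggest that the items move together and therefore focus on the same notion; low values suggest the items are measuring different things.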

4. Interrater reliability
Interrater reliability is the extent to which two or more individuals (coders or
raters) agree. Interrater reliability addresses the consistency of the implementation
of a rating system.
A test of interrater reliability would be the following scenario: Two or more
researchers are observing a high school classroom. The class is discussing a movie
that they have just viewed as a group. The researchers have a sliding rating scale (1
being most positive, 5 being most negative) with which they are rating the
students' oral responses. Interrater reliability assesses the consistency of how the
rating system is implemented. For example, if one researcher gives a "1" to a
student response, while another researcher gives a "5," the interrater
reliability would obviously be low. Interrater reliability is dependent upon the ability
of two or more individuals to be consistent. Training, education and monitoring
skills can enhance interrater reliability.
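
Agreement between two raters can be quantified in more than one way. The sketch below shows simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance; neither statistic is named in the notes, so both are assumptions, and the ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of responses to which both raters gave the same rating."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category at random.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (1 = most positive, 5 = most negative) of five responses.
rater_a = [1, 2, 3, 1, 2]
rater_b = [1, 2, 3, 1, 5]
print(percent_agreement(rater_a, rater_b))
print(round(cohens_kappa(rater_a, rater_b), 3))
```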

Ways of improving reliability


➢ Minimize external sources of variation
➢ Standardize conditions under which measurement occurs
➢ Improve investigator consistency by using only well trained, supervised and
motivated persons to conduct the research
➢ Broaden the sample of measurement questions by adding similar questions to
the data collection instrument or adding more observers or occasions to an
observation study.
➢ Improve internal consistency of an instrument by excluding data from analysis
drawn from measurement questions eliciting extreme responses.

Validity

Validity refers to the degree to which a study accurately reflects or assesses the
specific concept that the researcher is attempting to measure. It is the degree to
which results obtained from the analysis of data actually represent the
phenomenon under study. It is the accuracy and meaningfulness of inferences,
which are based on the research results. It has to do with how accurately the data
obtained in the study represents the variables of the study. If such data is a true
reflection of the variables, then inferences based on such data will be accurate and
meaningful. Validity is largely determined by the presence or absence of systematic
error in the data e.g. using a faulty scale to measure.

Types of validity
(a) Construct validity

Construct validity seeks agreement between a theoretical concept and a specific
measuring device or procedure. For example, a researcher inventing a new IQ test
might spend a great deal of time attempting to "define" intelligence in order to
reach an acceptable level of construct validity.

Construct validity can be broken down into two sub-categories: convergent
validity and discriminant validity. Convergent validity is the actual general
agreement among ratings, gathered independently of one another, where
measures should be theoretically related. Discriminant validity is the lack of a
relationship among measures which theoretically should not be related.

To understand whether a piece of research has construct validity, three steps
should be followed. First, the theoretical relationships must be specified. Second,
the empirical relationships between the measures of the concepts must be
examined. Third, the empirical evidence must be interpreted in terms of how it
clarifies the construct validity of the particular measure being tested.

(b) Content validity


Content Validity is based on the extent to which a measurement reflects the
specific intended domain of content.
Content validity can be illustrated using the following examples: Researchers aim
to study mathematical learning and create a survey to test for mathematical skill. If
these researchers only tested for multiplication and then drew conclusions from
that survey, their study would not show content validity because it excludes other
mathematical functions. Although the establishment of content validity for
placement-type exams seems relatively straight-forward, the process becomes
more complex as it moves into the more abstract domain of socio-cultural studies.
For example, a researcher needing to measure an attitude like self-esteem must
decide what constitutes a relevant domain of content for that attitude. For socio-
cultural studies, content validity forces the researchers to define the very domains
they are attempting to study.

The usual procedure in assessing the content validity of a measure is to use
professionals or experts in the particular field. The instrument is given to two
groups of experts. One group is requested to assess what concept the instrument is
trying to measure. The other group is asked to determine whether the set of items
or checklist accurately represents the concept under study.

(c) Criterion related validity


Criterion related validity, also referred to as instrumental validity, is used to
demonstrate the accuracy of a measure or procedure by comparing it with
another measure or procedure which has been demonstrated to be valid. For
example, imagine a hands-on driving test has been shown to be an accurate test of
driving skills. By comparing the scores on the written driving test with the scores
from the hands-on driving test, the written test can be validated by using a
criterion related strategy in which the hands-on driving test is compared to the
written test.
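
The comparison in the driving-test example is often summarised with a correlation coefficient. The sketch below uses Pearson's r as one plausible choice (the notes do not prescribe a statistic), and both score lists are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for the same six candidates
written_test = [55, 62, 70, 48, 81, 90]    # measure being validated
hands_on_test = [58, 60, 75, 50, 79, 88]   # criterion already shown to be valid
print(round(pearson_r(written_test, hands_on_test), 2))
```

A coefficient near 1 would support the claim that the written test tracks the validated hands-on test; a coefficient near 0 would not.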
Types
a) Predictive validity – refers to the degree to which obtained data predicts
the future behaviour of subjects e.g. B. Com graduates
b) Concurrent validity – refers to the degree to which data are able to predict
the behaviour of subjects in the present and not in the future e.g. psychiatry

Internal and external validity


Researchers should be concerned with both external and internal validity.
a) External validity refers to the extent to which the results of a study are
generalizable or transferable. External validity is the degree to which
research findings can be generalized to populations and environments
outside the experimental setting. It has to do with representativeness of the
sample with regard to the target population.
b) Internal validity refers to (1) the rigor with which the study was conducted
(e.g., the study's design, the care taken to conduct measurements, and
decisions concerning what was and wasn't measured) and (2) the extent to
which the designers of a study have taken into account alternative
explanations for any causal relationships they explore. In studies that do not
explore causal relationships, only the first of these definitions should be
considered when assessing internal validity. Internal validity depends on the
degree to which extraneous variables have been controlled for in the study.
Internal and external validity are inversely related to each other.

Threats to internal validity


a) History – refers to occurrence of events that influence experimental units
during the course of the study
b) Maturation – refers to the biological or psychological processes which occur
among the subjects in a relatively short time and which influence research
findings
c) Instrumentation -
d) Pre-testing – solution – use equivalent form tests
e) Statistical regression
f) Attrition – subjects dropping out of the study before completion, which leads to
error and bias in the sample
g) Differential selection – occurs when subjects are systematically selected for a
study, e.g. volunteers versus non-volunteers; the resulting bias leads to error
h) Selection – maturation interaction
i) Ambiguity - when correlation is taken for causation
j) Apprehension - when people are scared to respond to your study
k) Demoralization - when people get bored with your measurements
l) Diffusion - when people figure out your test and start mimicking symptoms

Threats to external validity


a) Accessible and target population
b) Control of extraneous variables
c) Pre-test treatment interaction
d) Explicit description of the sample
e) Multi-treatment interference

RESEARCH INSTRUMENTS

The research instruments that are widely used include


a) Questionnaires
b) Interviews
c) Observations

QUESTIONNAIRES
Each item in the questionnaire is developed to address a specific objective,
research question or hypothesis of the study. The researcher must also know how
information obtained from each questionnaire item will be analysed.

Types of questions used in questionnaires


1 Structured or closed-ended questions
They are questions, which are accompanied by a list of possible alternatives from
which respondents select the answer that best describes their situation.
Advantages of Structured or closed-ended questions
➢ They are easier to analyse since they are in an immediately usable form
➢ They are easier to administer
➢ They are economical to use in terms of time and money

Disadvantages of Structured or closed-ended questions


➢ They are more difficult to construct
➢ Responses are limited and the respondent is compelled to answer questions
according to the researcher’s choices

2 Unstructured or open-ended questions

They refer to questions, which give the respondent complete freedom of response.
The amount of space provided is always an indicator of whether a brief or lengthy
answer is desired.

Advantages of Unstructured or open-ended questions


➢ They permit a greater depth of response
➢ They are simple to formulate
➢ The respondent’s responses may give an insight into his feelings, background,
hidden motives, interest and decisions.

Disadvantages of Unstructured or open-ended questions


➢ There is a tendency of the respondents providing information, which does not
answer the stipulated research questions or objectives.
➢ The responses given may be difficult to categorize and hence difficult to
analyze quantitatively
➢ Responding to open-ended questions is time consuming, which may put some
respondents off.

3 Contingency questions
In particular cases, certain questions are applicable to certain groups of
respondents. In such cases, follow-up questions are needed to get further
information from the relevant sub-group only. These subsequent questions, which
are asked after the initial questions, are called ‘contingency questions’ or ‘filter
questions’. The purpose of these kinds of questions is to probe for more
information. They also simplify the respondent’s task, in that they will not be
required to answer questions that are not relevant to them.
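
In an electronic questionnaire, contingency questions are usually implemented as skip logic. The sketch below is illustrative only: the filter question `owns_computer` and the follow-up wording are hypothetical, not taken from any instrument in this course.

```python
def follow_up_questions(answers):
    """Return the contingency questions that apply to this respondent."""
    follow_ups = []
    # Filter question: only computer owners see the follow-ups
    if answers.get("owns_computer") == "yes":
        follow_ups.append("Which operating system does it run?")
        follow_ups.append("How many hours a week do you use it?")
    return follow_ups

print(follow_up_questions({"owns_computer": "yes"}))
print(follow_up_questions({"owns_computer": "no"}))   # nothing further to answer
```

Respondents outside the relevant sub-group receive an empty list, so they are never shown questions that do not apply to them.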

4 Matrix questions
These are questions that share the same set of response categories. They are
used whenever scales such as the Likert scale are being used.

Advantages of matrix questions


➢ When questions or items are presented in matrix form, they are easier to
complete and hence the respondent is unlikely to be put off.
➢ Space is used efficiently
➢ It is easy to compare responses given to different items.

Disadvantages of matrix questions


➢ Some respondents, especially the ones that may not be too keen to give right
responses, might form a pattern of agreeing or disagreeing with statements.
➢ Some researchers use them when in fact the kind of information being sought
could better be obtained in another format.
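
The first disadvantage, respondents falling into a pattern of agreeing or disagreeing, can be screened for during data cleaning. The sketch below flags respondents who gave the identical answer to every item in a matrix. The respondent IDs and answers are hypothetical, and a flag is only a prompt for closer inspection, not proof of careless responding.

```python
def is_straight_lined(answers):
    """True when every item in the matrix received the identical response."""
    return len(answers) > 1 and len(set(answers)) == 1

# Hypothetical 5-item Likert matrix (1 = strongly disagree ... 5 = strongly agree)
grid = {
    "R01": [4, 4, 4, 4, 4],
    "R02": [5, 2, 4, 3, 4],
    "R03": [1, 1, 1, 1, 1],
}
flagged = [rid for rid, answers in grid.items() if is_straight_lined(answers)]
print(flagged)
```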

Rules for constructing questionnaires and questionnaire items


1. List the objectives that you want the questionnaire to accomplish before
constructing the questionnaire.
2. Determine how information obtained from each questionnaire item will be
analyzed.
3. Ensure clarity and avoid ambiguity.
4. If a concept has several meanings and that concept must be used in a
question, the intended meaning must be defined.
5. Construct short questions.
6. Items should be stated as positively as possible.
7. Double-barreled items should be avoided.
8. Leading and biased questions should be avoided.
9. Very personal and sensitive questions should be avoided.
10. Simple words that are easily understandable should be used.
11. Questions that assume facts with no evidence should be avoided.
12. Avoid psychologically threatening questions.
13. Include enough information in each item so that it is meaningful to the
respondent.

Tips on how to organize or order items in a questionnaire


1. Begin with non-threatening, interesting items.
2. It is not advisable to put important questions at the end of a long
questionnaire.
3. Have some logical order when putting items together.
4. Arrange the questions according to themes being studied.
5. If the questionnaire is arranged into content sub-sections, each section
should be introduced with a short statement concerning its content and
purpose.
6. Socio-economic questions should be asked at the end because respondents
may be put off by personal questions at the beginning of the questionnaire.

Presentation of the questionnaire


1. Make the questionnaire attractive by using quality paper. It increases the
response rate.
2. Organize and lay out the questions so that the questionnaire is easy to
complete.
3. All the pages and items in a questionnaire should be numbered.
4. Brief but clear instructions must be included.
5. Make your questionnaire short.

Pretesting the questionnaire


The questionnaire should be pretested on a selected sample that is similar to the
actual sample the researcher plans to study. This is important because:-
➢ Questions that are vague will be revealed in the sense that the respondents
will interpret them differently.
➢ Comments and suggestions made by respondents during pretesting should be
seriously considered and incorporated.
➢ Pretesting will reveal deficiencies in the questionnaire.
➢ It helps to test whether the methods of analysis are appropriate.

Ways of administering questionnaires


Questionnaires are mainly administered using three methods:
i. Self-administered questionnaires
Questionnaires are sent to the respondents through mail or hand-delivery,
and they complete them on their own.
ii. Researcher-administered questionnaires
The researcher can decide to use the questionnaire to interview the
respondents. This is mostly done when the subjects may not have the ability
to easily interpret the questions probably because of their educational level.
iii. Use of the internet
The people sampled for the research receive and respond to the
questionnaires through their web sites or e-mail addresses.

The letter of transmittal / Cover letter


The letter of transmittal / Cover letter should accompany every questionnaire.
Contents of a letter of transmittal
➢ It should explain the purpose of the study.
➢ It should explain the importance and significance of the study.
➢ A brief assurance of confidentiality should be included in the letter.
➢ If the study is affiliated to a certain institution or organisation, it is advisable
to have an endorsement from such an institution or organisation.
➢ In a sensitive research, it may be necessary to assure the anonymity of
respondents.
➢ The letter should contain specific deadline dates by which the completed
questionnaire is to be returned.

Follow-up techniques
➢ Sending a polite follow-up letter asking the subjects to respond
➢ Sending another copy of the questionnaire together with a follow-up letter.

Response rate
It refers to the percentage of subjects who respond to questionnaires. Many
authors believe that a response rate of 50% is adequate for analysis and reporting.
If the response rate is low, the researcher must question the representativeness of
the sample.
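
The calculation is simple enough to sketch. The function name and the figures below are hypothetical; the 50% threshold is the one quoted above.

```python
def response_rate(returned, distributed):
    """Percentage of distributed questionnaires that were returned."""
    if distributed <= 0:
        raise ValueError("distributed must be positive")
    return 100.0 * returned / distributed

ADEQUATE_RATE = 50.0   # threshold cited in these notes

rate = response_rate(returned=120, distributed=200)
print(rate, "adequate" if rate >= ADEQUATE_RATE else "question representativeness")
```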

INTERVIEWS
An interview is an oral (face to face) administration of a questionnaire or an
interview schedule. To obtain accurate information through interviews, a
researcher needs to obtain the maximum co-operation from respondents.
Interviews are particularly useful for getting the story behind a participant's
experiences. The interviewer can pursue in-depth information around a topic.
Interviews may be useful as follow-up to certain respondents to questionnaires,
e.g., to further investigate their responses. Usually open-ended questions are asked
during interviews.

Guidelines for preparation for Interview


1. Choose a setting with little distraction. Avoid bright lights or loud noises, ensure the
interviewee is comfortable (you might ask them if they are), etc. Often, they
may feel more comfortable at their own places of work or homes.
2. Explain the purpose of the interview.
3. Address terms of confidentiality. Note any terms of confidentiality. (Be careful
here. Rarely can you absolutely promise anything. Courts may get access to
information, in certain circumstances.) Explain who will get access to their
answers and how their answers will be analyzed. If their comments are to be
used as quotes, get their written permission to do so.
4. Explain the format of the interview. Explain the type of interview you are
conducting and its nature. If you want them to ask questions, specify if they're to
do so as they have them or wait until the end of the interview.
5. Indicate how long the interview usually takes.
6. Tell them how to get in touch with you later if they want to.
7. Ask them if they have any questions before you both get started with the
interview.
8. Don't count on your memory to recall their answers. Ask for permission to
record the interview or bring along someone to take notes.

Types of interview approaches


(a) Informal, conversational interview - no predetermined questions are
asked, in order to remain as open and adaptable as possible to the
interviewee's nature and priorities; during the interview, the interviewer
"goes with the flow".
(b) General interview guide approach - the guide approach is intended to
ensure that the same general areas of information are collected from each
interviewee; this provides more focus than the conversational approach,
but still allows a degree of freedom and adaptability in getting information
from the interviewee.
(c) Standardized, open-ended interview - here, the same open-ended
questions are asked to all interviewees (an open-ended question is where
respondents are free to choose how to answer the question, i.e., they
don't select "yes" or "no" or provide a numeric rating, etc.); this approach
facilitates faster interviews that can be more easily analyzed and compared.
(d) Closed, fixed-response interview - where all interviewees are asked the
same questions and asked to choose answers from among the same set of
alternatives. This format is useful for those not practiced in interviewing.

Sequence of Questions
1. Get the respondents involved in the interview as soon as possible.
2. Before asking about controversial matters (such as feelings and conclusions),
first ask about some facts. With this approach, respondents can more easily
engage in the interview before warming up to more personal matters.
3. Intersperse fact-based questions throughout the interview to avoid long lists
of fact-based questions, which tend to leave respondents disengaged.
4. Ask questions about the present before questions about the past or future.
It's usually easier for them to talk about the present and then work into the
past or future.
5. The last questions might be to allow respondents to provide any other
information they prefer to add and their impressions of the interview.

Wording of Questions
➢ Wording should be open-ended. Respondents should be able to choose
their own terms when answering questions.
➢ Questions should be as neutral as possible. Avoid wording that might
influence answers, e.g., evocative, judgmental wording.
➢ Questions should be asked one at a time.
➢ Questions should be worded clearly. This includes knowing any terms
particular to the program or the respondents' culture.
➢ Be careful asking "why" questions. This type of question implies a cause-effect
relationship that may not truly exist. These questions may also cause
respondents to feel defensive, e.g., that they have to justify their response,
which may inhibit their responses to this and future questions.

While Carrying Out Interview
➢ Occasionally verify the tape recorder (if used) is working.
➢ Ask one question at a time.
➢ Attempt to remain as neutral as possible. That is, don't show strong
emotional reactions to their responses. Patton suggests acting as if "you've
heard it all before."
➢ Encourage responses with occasional nods of the head, "uh huh"s, etc.
➢ Be careful about the appearance when note taking. That is, if you jump to
take a note, it may appear as if you're surprised or very pleased about an
answer, which may influence answers to future questions.
➢ Provide transition between major topics, e.g., "we've been talking about
(some topic) and now I'd like to move on to (another topic)."

➢ Don't lose control of the interview. This can occur when respondents stray
to another topic, take so long to answer a question that time begins to run
out, or even begin asking questions to the interviewer.

Immediately After Interview


➢ Verify if the tape recorder, if used, worked throughout the interview.
➢ Make any notes on your written notes, e.g., to clarify any scratchings,
ensure pages are numbered, fill out any notes that don't make sense, etc.
➢ Write down any observations made during the interview. For example,
where did the interview occur and when, was the respondent particularly
nervous at any time? Were there any surprises during the interview? Did the
tape recorder break?

Personal interviews
People selected to be part of the sample are interviewed in person by a trained
interviewer.
Requirements for success
Three broad conditions must be met in order to have a successful personal
interview:
➢ The participant must possess the information being targeted by the
investigative questions
➢ The participant must understand his or her role in the interview as the
provider of accurate information
➢ The participant must perceive adequate motivation to cooperate

Increasing the participant’s receptiveness


The first goal in an interview is to establish a friendly relationship with the
participant. Three factors will help increase participant receptiveness. The
participant must:
➢ Believe that the experience will be pleasant and satisfying
➢ Believe that answering the survey is an important and worthwhile use of his or
her time
➢ Dismiss any mental reservations that he or she might have about participation.

The technique of stimulating participants to answer more fully and relevantly is
termed probing. Since it presents a great potential for bias, a probe should be
neutral and appear as a natural part of the conversation. Appropriate probes
should be specified by the designer of the data collection instrument. There are
several probing styles e.g.
➢ A brief assertion of understanding and interest e.g. comments such as “I see”
“yes”.
➢ An expectant pause
➢ Repeating the question
➢ Repeating the participant’s reply
➢ A neutral question or comment
➢ Question clarification.

Problems likely to be encountered during personal interviews


In personal interviews, the researcher must deal with bias and cost.
Biased results arise from three types of error:
(a) Sampling error
It’s the difference between a sample statistic and its corresponding population
parameter. The sampling distribution of the sample means is a probability
distribution of possible sample means of a given sample size.
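
A small simulation can make both sentences concrete. The sketch below repeatedly draws samples of a fixed size from a hypothetical population and records the difference between each sample mean and the population mean. The population values, sample size and seed are assumptions chosen only for illustration.

```python
import random
from statistics import mean

def sampling_errors(population, sample_size, draws, seed=0):
    """Sample mean minus population mean, for `draws` random samples."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    mu = mean(population)              # population parameter
    return [mean(rng.sample(population, sample_size)) - mu
            for _ in range(draws)]

population = list(range(40, 100))      # hypothetical exam marks
errors = sampling_errors(population, sample_size=10, draws=1000)
print(min(errors), max(errors))        # spread of the sampling error
print(mean(errors))                    # averages out close to zero
```

The list of sample means behind `errors` is an empirical approximation of the sampling distribution of the mean for samples of size 10.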

(b) Non-response error


This occurs when the responses of participants differ in some systematic way from
the responses of non-participants. It occurs when the researcher:
➢ Cannot locate the person to be studied
➢ Is unsuccessful in encouraging that person to participate
Solutions to reduce errors of non-response are
➢ Establishing and implementing callback procedures
➢ Creating a non-response sample and weighting results from this sample
➢ Substituting another individual for the missing non-participant.
(c) Response error
Occurs when the data reported differ from the actual data. It can occur during the
interview or during preparation of data analysis.
➢ Participant-initiated error occurs when the participant fails to answer fully and
accurately either by choice or because of inaccurate or incomplete knowledge.
This can be addressed by using trained interviewers who are knowledgeable about
such problems.
➢ Interviewer error can be caused by:-
- Failure to secure full participant cooperation
- Failure to consistently execute interview procedures
- Failure to establish appropriate interview environment
- Falsification of individual answers or whole interviews
- Inappropriate influencing behaviour
- Failure to record answers accurately and completely
- Physical presence bias.

Advantages of Personal interviews


➢ Good cooperation from the respondents
➢ Interviewer can answer questions about survey, probe for answers, use
follow-up questions and gather information by observation.
➢ Special visual aids and scoring devices can be used.
➢ Illiterate and functionally illiterate respondents can be reached
➢ Interviewer can prescreen respondent to ensure he / she fits the population
profile.
➢ Responses can be entered directly into a portable microcomputer to reduce
error and cost when using computer assisted personal interviewing.

Disadvantages of Personal interviews


➢ High costs
➢ Need for highly trained interviewers
➢ Longer period needed in the field collecting data
➢ May be wide geographic dispersion
➢ Follow-up is labour intensive
➢ Not all respondents are available or accessible
➢ Some respondents are unwilling to talk to strangers in their homes
➢ Some neighbourhoods are difficult to visit
➢ Questions may be altered or respondent coached by interviewers.

Telephone interviews
People selected to be part of the sample are interviewed on the telephone by a
trained interviewer.
Advantages of Telephone interviews
➢ Lower costs than personal interviews
➢ Expanded geographic coverage without dramatic increase in costs
➢ Uses fewer, more highly skilled interviewers
➢ Reduced interviewer bias
➢ Faster completion time
➢ Better access to hard-to-reach respondents through repeated callbacks
➢ Can use computerized random digit dialing
➢ Responses can be entered directly into a computer file to reduce error and
cost when using computer assisted telephone interviewing.

Disadvantages of Telephone interviews


➢ Response rate is lower than for personal interview
➢ Higher costs if interviewing geographically dispersed sample
➢ Interview sample must be limited
➢ Many phone numbers are unlisted or not working, making directory listings
unreliable
➢ Some target groups are not available by phone
➢ Responses may be less complete
➢ Illustrations cannot be used.
➢ Respondents may not be honest with their responses since it is not a face to
face situation

Rules pertaining to interviews


The interviewer must
➢ Be pleasant

➢ Show genuine interest in getting to know respondents without appearing like
spies.
➢ Be relaxed and friendly.
➢ Be very familiar with the questionnaire or the interview guide.
➢ Have a guide which indicates what questions are to be asked and in what
order.
➢ Interact with the respondent as an equal.
➢ Pretest the interview guide before using it to check for vocabulary, language
level and how well the questions will be understood.
➢ Inform the respondent about the confidentiality of the information given.
➢ Not ask leading questions
➢ Remain neutral in an interview situation in order to be as objective as
possible.

An interview schedule
It’s a set of questions that the interviewer asks when interviewing. It makes it
possible to obtain data required to meet specific objectives of the study.

Note taking during interviews


It refers to the method of recording in which the interviewer records the
respondent’s responses during the interview.

Advantages
➢ It facilitates data analysis since the information is readily accessible and
already classified into appropriate categories.
➢ If taken well, no information is left out.

Disadvantages of note taking


➢ It may interfere with the communication between the respondent and the
interviewer.
➢ It might upset the respondent if the answers are personal and sensitive.
➢ If it is delayed, important details may be forgotten.
➢ It makes the interview lengthy and boring.

Tape recording
The interviewer’s questions and the respondent’s answers are recorded either
using a tape recorder or a video tape.
Advantages
➢ It reduces the tendency for the interviewer to make unconscious selection of
data in the course of the recording.
➢ The tape can be played back and studied more thoroughly.
➢ A person other than the interviewer can evaluate and categorize responses.
➢ It speeds up the interview.
➢ Communication is not interrupted.

Disadvantages
➢ It changes the interview situation since respondents get nervous.
➢ Respondents may be reluctant to give sensitive information if they know they
are being taped.
➢ Transcribing the tapes before analysis is time consuming and tedious.

Advantages of interviews
➢ It provides in-depth data, which is not possible to get using a questionnaire.
➢ It makes it possible to obtain data required to meet specific objectives of the
study.
➢ They are more flexible than questionnaires because the interviewer can adapt to
the situation and get as much information as possible.
➢ Very sensitive and personal information can be extracted from the
respondent.
➢ The interviewer can clarify and elaborate the purpose of the research and
effectively convince respondents about the importance of the research.
➢ They yield higher response rates

Disadvantages of interviews
➢ They are expensive – traveling costs
➢ It requires a higher level of skill
➢ Interviewers need to be trained to avoid bias
➢ Not appropriate for large samples
➢ Responses may be influenced by the respondent’s reaction to the interviewer.

OBSERVATION
Observation is one of the few options available for studying records, mechanical
processes, small children and complex interactive processes. Data can be
gathered as the event occurs. Observation includes a variety of monitoring
situations that cover non-behavioural and behavioural activities.

The observer-participant relationship


Interrogation presents a clear opportunity for interviewer bias. The problem is
less pronounced with observation but is still real. The relationship between
observer and participant may be viewed from three perspectives:
➢ Whether the observation is direct or indirect
➢ Whether the observer’s presence is known or unknown to the participant
➢ What role the observer plays

Guidelines for the qualification and selection of observers


➢ Concentration: Ability to function in a setting full of distractions
➢ Detail-oriented: Ability to remember details of an experience
➢ Unobtrusive: Ability to blend with the setting and not be distinctive
➢ Experience level: Ability to extract the most from an observation study

Advantages of observation
Enables one to:
➢ Secure information about people or activities that cannot be derived from
experiment or surveys
➢ Reduce obtrusiveness
➢ Avoid participant filtering and forgetfulness
➢ Secure environmental context information
➢ Optimize the naturalness of the research setting

Limitations of observation
➢ Difficulty of waiting for long periods to capture the relevant phenomena
➢ The expense of observer costs and equipment
➢ Reliability of inferences from surface indicators
➢ The problem of quantification and disproportionately large records

Observation forms, schedules or checklists


The researcher must define the behaviours to be observed and then develop a
detailed list of behaviours. During data collection, the researcher checks off each as
it occurs. This permits the observer to spend time thinking about what is occurring
rather than on how to record it and this enhances the accuracy of the study.
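
A checklist of this kind maps naturally onto a tally. In the sketch below the three behaviours are hypothetical examples for a classroom observation; a real study would substitute its own predefined list.

```python
# Hypothetical behaviours defined before data collection begins
CHECKLIST = ["asks question", "answers question", "off-task talk"]

def tally(observed_events):
    """Count how often each predefined behaviour was checked off."""
    counts = {behaviour: 0 for behaviour in CHECKLIST}
    for event in observed_events:
        if event not in counts:
            # Anything outside the checklist signals an incomplete definition
            raise ValueError(f"behaviour not on checklist: {event!r}")
        counts[event] += 1
    return counts

session = ["asks question", "off-task talk", "asks question", "answers question"]
print(tally(session))
```

Rejecting events that are not on the list mirrors the requirement that behaviours be defined in advance rather than invented during observation.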

DATA ANALYSIS
DATA PREPARATION AND DESCRIPTION
Once the data begins to flow in, attention turns to data analysis. If the project has
been done correctly, the analysis planning is already done.
Data preparation
This includes editing, coding and data entry. These activities ensure the accuracy of
the data and their conversion from raw form to reduced and classified forms that
are more appropriate for analysis.
Editing
Editing detects errors and omissions, corrects them when possible and certifies that
minimum data quality standards have been achieved. The editor’s purpose is to
guarantee that data are:
➢ Accurate
➢ Consistent with intent of the question and other information in the
survey
➢ Uniformly entered
➢ Complete
➢ Arranged to simplify coding and tabulation
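
Some of these checks, completeness and consistency with the permitted answers in particular, can be partly automated once the data are in electronic form. The sketch below is one possible approach; the field names and the allowed values are hypothetical.

```python
def edit_record(record, required, allowed):
    """Return a list of problems found in one completed questionnaire."""
    problems = []
    for field in required:                       # completeness check
        if record.get(field) in (None, ""):
            problems.append(f"{field}: missing")
    for field, valid in allowed.items():         # consistency check
        value = record.get(field)
        if value not in (None, "") and value not in valid:
            problems.append(f"{field}: unexpected value {value!r}")
    return problems

REQUIRED = ["age", "sex"]
ALLOWED = {"sex": {"M", "F"}}

print(edit_record({"age": 25, "sex": "F"}, REQUIRED, ALLOWED))   # clean record
print(edit_record({"age": "", "sex": "X"}, REQUIRED, ALLOWED))   # two problems
```

The function only reports problems; deciding how to correct them remains the editor's job.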
Field editing
In large projects, field editing review is a responsibility of the field supervisor. It
should be done soon after the data have been gathered. During the stress of data
collection, the researcher often uses ad hoc abbreviations and special symbols.
Soon after the interview, experiment or observation, the investigator should
review the reporting forms. It is difficult to complete what was abbreviated or
written in shorthand or noted illegibly if the entry is not caught that day. When
entry gaps are present from interviews, a call back should be made rather than
guessing what the respondent ‘probably would have said’. Self-interviewing has
no place in quality research.
Central editing
For a small study, the use of a single editor produces maximum consistency. In
large studies, the tasks may be broken down so that each editor can deal with one
entire section. This approach will not identify inconsistencies between answers in
different sections. However, this problem can be handled by identifying points of
possible inconsistency and having one editor check specifically for them.

Rules to guide editors in their work


➢ Be familiar with instructions given to interviewers and coders
➢ Do not destroy, erase or make illegible the original entry by the interviewer;
original entries should be crossed out with a single line so that they remain legible.
➢ Make all entries on an instrument in some distinctive colour and in a
standardized form.
➢ Initial all answers changed or supplied.
➢ Place initials and date of editing on each instrument completed.
Coding
Coding involves assigning numbers or other symbols to answers so the responses
can be grouped into a limited number of classes or categories. The classifying of
data into limited categories sacrifices some data detail but is necessary for efficient
analysis. Coding helps the researcher to reduce several thousand replies to a few
categories containing the critical information needed for analysis. In coding,
categories are the partitioning of a set and categorization is the process of using
rules to partition a body of data.
Coding rules
The categories should be:
➢ Appropriate to the research problem and purpose: Categories must provide
the best partitioning of data for testing hypotheses and showing relationships.
➢ Exhaustive
➢ Mutually exclusive
➢ Derived from one classification principle
Coding closed questions
The responses to closed questions include scaled items and others for which
answers can be anticipated. When codes are established early in the research
process, it is possible to pre-code the questionnaire. Pre-coding is particularly
helpful for data entry because it makes the intermediate step of completing a
coding sheet unnecessary. The data are accessible directly from the questionnaire.
A respondent, interviewer, field supervisor or researcher is able to assign an
appropriate numerical response on the instrument by checking, circling or printing
it in the proper coding location.

Coding open-ended questions


Open-ended questions are used where insufficient information or the lack of a
hypothesis prohibits preparing response categories in advance, or where there is a
need to measure sensitive or disapproved behaviour, discover salience or encourage
natural modes of expression. Content analysis is commonly used to analyse open-ended
questions. Converse and Presser (1986) define content analysis as a research
technique for the objective, systematic and quantitative description of the manifest
content of a communication.

Content analysis follows a systematic process i.e.


➢ Selection of a unitization scheme. The units may be syntactical, referential,
propositional or thematic
➢ Selection of a sampling plan
➢ Development of recording and coding instructions
➢ Data reduction
➢ Inferences about the context
➢ Statistical analysis

Content analysis guards against selective perception of the content, provides for
the rigorous application of reliability and validity criteria and is amenable to
computerization.
“Don’t know” replies
“Don’t know” replies are evaluated in light of the question’s nature and the
respondent. While many “don’t know” replies are legitimate, some result from questions
that are ambiguous or from an interviewing situation that is not motivating. It is
better to report “don’t know” replies as a separate category unless there are compelling
reasons to treat them otherwise.
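
The coding ideas above can be illustrated with a minimal Python sketch. The codebook, the numeric codes and the sample answers are all hypothetical and chosen only for illustration; the point is that each answer maps to exactly one category, that "don't know" keeps its own code, and that answers the codebook cannot handle are returned to the editor rather than guessed.

```python
# Hypothetical codebook for one closed (scaled) question.
# The codes are illustrative only and do not come from any real instrument.
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
    "don't know": 9,  # reported as its own category, not merged with "neutral"
}


def code_response(raw_answer):
    """Return the numeric code for one answer, or None if it cannot be coded."""
    key = raw_answer.strip().lower()
    return CODEBOOK.get(key)


def code_responses(raw_answers):
    """Split answers into coded values and answers that need editing."""
    coded, uncoded = [], []
    for answer in raw_answers:
        code = code_response(answer)
        if code is None:
            uncoded.append(answer)  # send back to the editor, do not guess
        else:
            coded.append(code)
    return coded, uncoded


answers = ["Agree", "don't know", " Neutral ", "maybe", "Strongly agree"]
coded, uncoded = code_responses(answers)
print(coded)    # numeric codes ready for data entry
print(uncoded)  # answers outside the codebook
```

Because every key maps to a single code, the categories are mutually exclusive; whether they are exhaustive depends on how many answers end up in the uncoded list.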

Data entry
Data entry converts information gathered by secondary or primary methods to a
medium for viewing and manipulation. Data entry is accomplished by keyboard
entry from pre-coded instruments, optical scanning, real time keyboarding,
telephone pad data entry, bar codes, voice recognition, optical mark recognition
(OMR) and data transfers from electronic notebooks and laptop computers.
Database programs, spreadsheets and editors in statistical software programs e.g.
SPSS and SAS offer flexibility for entering, manipulating and transferring data for
analysis, warehousing and mining.

Data description
The objective of descriptive statistical analysis is to develop sufficient knowledge
to describe a body of data. This is accomplished by understanding the data levels
for the measurements we choose, their distributions and characteristics of location,
spread and shape. The discovery of miscoded values, missing data and other
problems in the data set is enhanced with descriptive statistics.
There are three general areas that make up the field of statistics: descriptive
statistics, relational statistics, and inferential statistics.

DESCRIPTIVE STATISTICS
Descriptive statistics fall into one of two categories: measures of central tendency
(mean, median, and mode) or measures of dispersion (standard deviation and
variance). Their purpose is to explore hunches that may have come up during the
course of the research process, but most people compute them to look at the
normality of their numbers. Examples include descriptive analysis of sex, age, race,
social class, and so forth.
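
As a rough illustration, the measures named above can be computed with Python's standard `statistics` module. The ages below are made-up values used only for the sketch, and the sample (n - 1) versions of the variance and standard deviation are assumed.

```python
import statistics

# Hypothetical ages of six respondents (illustrative data only).
ages = [21, 23, 23, 25, 30, 34]

# Measures of central tendency
mean_age = statistics.mean(ages)      # arithmetic average
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value

# Measures of dispersion (sample versions, dividing by n - 1)
variance_age = statistics.variance(ages)
std_dev_age = statistics.stdev(ages)

print(mean_age, median_age, mode_age, variance_age, round(std_dev_age, 2))
```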

VISUAL DISPLAYS OF DATA


In addition to numerical summaries of location, spread and shape, visual displays
can be used to provide a complete and accurate impression of distribution and
variable relationships.
➢ Frequency tables array data from highest to lowest values with counts and
percentages. They are most useful for inspecting the range of responses and
their repeated occurrence.
➢ Bar charts and pie charts are appropriate for relative comparisons of
nominal data.
➢ Histograms are optimally used with continuous variables where intervals
group the responses.
➢ Stem and leaf displays present actual data values using a histogram type
device that allows inspection of spread and shape.
➢ Box plots use the five-number summary to convey a detailed picture of a
distribution’s main body, tails and outliers.
➢ Control charts display sequential measurements of a process together with
a centre line and control limits. The selection of a control chart depends on
the level of data one is measuring. Control charts help managers focus on
special causes of variation by revealing whether a system is under control and
substantiating results from improvements.
➢ The Pareto diagram is a bar chart whose percentages sum to 100 percent.
The causes of the problem under investigation are sorted in decreasing
importance with bar height descending from left to right. Its pictorial array
reveals the highest concentration of quality improvement potential in the
fewest number of remedies.
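
The Pareto ordering described above can be sketched in a few lines of Python. The defect totals are borrowed from the shirt-manufacturing exercise in the contingency table section of these notes; the function name `pareto_table` is a hypothetical helper, not part of any library.

```python
# Defect totals taken from the shirt-manufacturing exercise in these notes.
defects = {
    "Improperly stitched": 55,
    "Unaligned buttons": 32,
    "Inconsistent colour": 44,
}


def pareto_table(counts):
    """Sort causes by decreasing count and add percentage and cumulative percentage."""
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for cause, count in ordered:
        percentage = 100.0 * count / total
        cumulative += percentage
        rows.append((cause, count, round(percentage, 1), round(cumulative, 1)))
    return rows


pareto_rows = pareto_table(defects)
for row in pareto_rows:
    print(row)
```

The first rows of the table are the causes with the greatest improvement potential, and the cumulative column reaches 100 percent on the last row.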

INFERENTIAL STATISTICS
Hypothesis: It’s a statement about a population parameter developed for the
purpose of testing.
Hypothesis testing: It’s a procedure based on sample evidence and probability
theory to determine whether the hypothesis is a reasonable statement.
Procedure for testing a hypothesis
1. State the null and alternate hypothesis
2. Identify the test statistic
3. Formulate a decision rule and identify the rejection region
4. Compute the value of the test statistic
5. Make a conclusion.
State the null hypothesis (HO) and alternate hypothesis (HA)
➢ The null hypothesis is a statement about the value of a population parameter.
It should be stated as “There is no significant difference between
……………”. It should always contain an equal sign.
➢ The alternate hypothesis is a statement that is accepted if sample data provide
enough evidence that the null hypothesis is false.
One-tailed and Two-tailed tests
➢ A test is one tailed when the alternate hypothesis states a direction e.g.
Ho: The mean income of women is equal to the mean income of men
HA: The mean income of women is greater than the mean income of
men
➢ A test is two tailed if no direction is specified in the alternate hypothesis
Ho: There is no difference between the mean income of women and the
mean income of men
HA: There is a difference between the mean income of women and the
mean income of men
Identify the test statistic
A test statistic is the statistic that will be used to test the hypothesis e.g.
$z$, $t$, $F$ and $\chi^2$ (chi-square)
Formulating a decision rule and identifying the rejection region
A decision rule is a statement of the conditions under which the null hypothesis is
rejected and the conditions under which it is not rejected. It is determined by the
level of significance, which is designated by $\alpha$ and lies between 0 and 1.
Compute the value of the test statistic and make a conclusion.
The value of the test statistic is determined from the sample information, and is
used to determine whether to reject the null hypothesis or not.

Types of errors that can be committed


i. Type I error: it is rejecting the null hypothesis, when it is true.
ii. Type II error: It is not rejecting the null hypothesis, when it is false.

Null hypothesis    Do not reject HO    Reject HO
HO is true         Correct decision    Type I error
HO is false        Type II error       Correct decision

TESTING THE POPULATION MEAN WHEN THE POPULATION VARIANCE IS KNOWN
When the population variance $\sigma^2$ is known and the population is normally
distributed, the test statistic for testing a hypothesis about $\mu$ is
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.

Estimating the population mean when the population variance is known


The confidence interval estimator of $\mu$ when $\sigma$ is known is
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.

Examples
1. A study by the Coca-Cola Company showed that the typical adult Kenyan
consumes 18 gallons of Coca-Cola each year. According to the same survey,
the standard deviation of the number of gallons consumed is 3.0. A random
sample of 64 college students showed they consumed an average (mean) of 17
gallons of cola last year. At the 0.05 significance level, can we conclude that
there is a significant difference between the mean consumption rate of college
students and other adults?
2. The manager of a departmental store is thinking about establishing a new
billing system for the store’s credit customers. After a thorough financial
analysis, she determines that the new system will not be cost effective if the
average monthly account is less than Sh. 70,000. A random sample of 200
monthly accounts is drawn, for which the mean monthly account is Sh.
66,000. With $\alpha = 0.05$, is there sufficient evidence to conclude that the new
system will not be cost effective? Assume that the population standard
deviation is Sh. 30,000.
3. Past experience indicates that the monthly long distance telephone bill per
household in a particular community is normally distributed, with a mean of
Sh. 1012 and a standard deviation of Sh. 327. After an advertising campaign
that encouraged people to make long distance telephone calls more
frequently, a random sample of 57 households revealed that the mean
monthly long distance bill was Sh. 1098. Can we conclude at the 10%
significance level that the advertising campaign was successful?
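
One possible way to work Example 1 above is sketched below in Python. It assumes a two-tailed test (the question asks about a difference in either direction) and uses `statistics.NormalDist` from the standard library (Python 3.8 or later) instead of printed z tables. The function name `z_test_mean` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test for a mean when the population sigma is known."""
    std_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / std_error                # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # critical value
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-tailed p-value
    ci = (sample_mean - z_crit * std_error,            # confidence interval
          sample_mean + z_crit * std_error)            # for the population mean
    return z, z_crit, p_value, ci


# Example 1: H0: mu = 18, HA: mu != 18, sigma = 3.0, n = 64, sample mean = 17
z, z_crit, p_value, ci = z_test_mean(17, 18, 3.0, 64, alpha=0.05)
reject_h0 = abs(z) > z_crit

print(round(z, 2), round(z_crit, 2), reject_h0)
print(tuple(round(limit, 3) for limit in ci))
```

Here the test statistic falls in the rejection region, so under these assumptions the null hypothesis of no difference would be rejected at the 0.05 level. Examples 2 and 3 are one-tailed, so the critical value would instead come from `inv_cdf(1 - alpha)` (or its negative).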

Testing the population proportion


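The null and alternate hypotheses of tests of proportions are set up in the same
way as the hypotheses of tests about the mean and variance. The test statistic for
the population proportion $p$ is $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$, where
$\hat{p}$ is the sample proportion and $q = 1 - p$.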

Confidence interval estimator of $p$ is
$\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.

Example:
1. An inventor has developed a system that allows visitors to museums, zoos and
other attractions to get information at the touch of a digital code. For
example, zoo patrons can listen to an announcement (recorded on a
microchip) about each animal they see. It is anticipated that the device would
rent for $3.00 each. The installation cost for the complete system is expected
to be about $400,000. The ABC zoo is interested in having the system
installed, but the management is uncertain about whether to take the risk. A
financial analysis of the problem indicates that if more than 10% of the zoo
visitors rent the system, the zoo will make a profit. To help make the decision,
a random sample of 400 zoo visitors is given details of the system’s capabilities
and cost. If 48 people say that they would rent the device, can the
management of the zoo conclude at the 5% significance level that the
investment would result in a profit?
2. In a random sample of 100 units from an assembly line, 22 were defective.
(a) Does this provide sufficient evidence at the 10% significance level to
allow us to conclude that the defective rate among all units exceeds
10%?
(b) Find a 99% confidence interval estimate of the defective rate.
3. A manufacturer of computer chips claims that more than 90% of his products
conform to specifications. In a random sample of 1,000 chips drawn from a
large production run, 75 were defective. Do the data provide sufficient
evidence at the 1% level of significance to enable us to conclude that the
manufacturer’s claim is true?
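
Example 1 above (the zoo rental system) can be worked along the following lines. This is a sketch that assumes a one-tailed test with H0: p = 0.10 against HA: p > 0.10, and it again uses `statistics.NormalDist` for the critical value; `z_test_proportion` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(successes, n, p0):
    """z statistic for a test about a population proportion p0."""
    p_hat = successes / n                     # sample proportion
    std_error = sqrt(p0 * (1 - p0) / n)       # standard error under H0
    return (p_hat - p0) / std_error


# Example 1: 48 of 400 visitors would rent; profit requires more than 10%.
z = z_test_proportion(48, 400, 0.10)
z_crit = NormalDist().inv_cdf(1 - 0.05)       # one-tailed, 5% significance
reject_h0 = z > z_crit

print(round(z, 2), round(z_crit, 3), reject_h0)
```

Under these assumptions the statistic does not reach the critical value, so the sample would not give the zoo sufficient evidence that more than 10% of visitors would rent the device.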

Chi-square test of a multinomial experiment


A multinomial experiment is a generalized version of a binomial experiment that
allows for more than two possible outcomes on each trial of the experiment.

Properties of a multinomial experiment


➢ The experiment consists of a fixed number of trials.
➢ The outcome of each trial can be classified into exactly one of $k$ categories
called cells
➢ The probability $p_i$ that the outcome of a trial will fall into cell $i$ remains
constant for each trial, for $i = 1, 2, \ldots, k$; moreover, $p_1 + p_2 + \cdots + p_k = 1$.
➢ Each trial of the experiment is independent of the other trials.

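Test statistic is $\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$, where $o_i$ is the
observed frequency and $e_i = np_i$ is the expected frequency of cell $i$.

Rejection region is $\chi^2 > \chi^2_{\alpha, k-1}$.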

Example
1. Two companies A and B have recently conducted aggressive advertising
campaigns in order to maintain and possibly increase their respective shares of
the market for a particular product. These two companies enjoy a dominant
position in the market. Before advertising campaigns began, the market share
for Company A was 45% while Company B had a market share of 40%.
Other competitors accounted for the remaining market share of 15%. To
determine whether these market shares changed after the advertising
campaigns, a marketing analyst solicited the preferences of a random sample
of 200 consumers of this product. Of the 200 consumers, 100 indicated a
preference for Company A’s product, 85 preferred Company B’s product
and the remainder preferred one or another of the products distributed by
other competitors. Conduct a test to determine at the 5% level of
significance, whether the market shares have changed from the levels they
were at before the advertising campaigns occurred.
2. To determine if a single die is balanced, or fair, the die was rolled 600 times.
The observed frequencies with which each of the six sides of the die turned up
are recorded in the following table: -
Face 1 2 3 4 5 6
Observed frequency 114 92 84 101 107 102
Is there sufficient evidence to conclude at the 5% level of significance, that the
die is not fair?
3. Grades assigned by an economics instructor have historically followed a
symmetrical distribution.
Grade A B C D F
Percentage 5 25 40 25 5

A sample of 150 grades revealed the following


Grade A B C D F
Number 11 32 62 29 16
Can we conclude at the 1% level of significance that this year’s grades are
distributed differently than they were in the past?
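
A minimal sketch of Example 2 above (the die rolled 600 times) follows. The statistic is the sum over cells of (observed - expected)² / expected, with k - 1 degrees of freedom. The Python standard library has no chi-square distribution, so the critical value 11.0705 for 5 degrees of freedom at the 5% level is taken from a chi-square table and hard-coded; that constant is an assumption of the sketch, not something computed here.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Example 2: observed frequencies for faces 1 to 6.
observed = [114, 92, 84, 101, 107, 102]
n = sum(observed)                       # 600 rolls
expected = [n / 6] * 6                  # a fair die gives 100 per face

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1 = 5

# Table value for alpha = 0.05 and 5 degrees of freedom (hard-coded).
CRITICAL_VALUE = 11.0705
reject_h0 = chi2 > CRITICAL_VALUE

print(round(chi2, 2), degrees_of_freedom, reject_h0)
```

With these figures the statistic is well below the table value, so the sample would not provide sufficient evidence that the die is unfair.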

Rule of five
For the discrete distribution of the test statistic to be adequately approximated
by the continuous chi-square distribution, the conventional rule is to require that
the expected frequency for each cell be at least 5. Where necessary, cells should be
combined in order to satisfy this condition. The choice of cells to be combined
should be made in such a way that meaningful categories result from the
combination.
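
The rule of five can be checked mechanically before the test is run. The sketch below uses the historical grade percentages from Example 3; the sample of 60 grades is a hypothetical smaller sample added only to show a case where cells would have to be combined. Both function names are illustrative helpers.

```python
def expected_frequencies(n, percentages):
    """Expected cell counts for a sample of size n."""
    return [n * pct / 100 for pct in percentages]


def cells_below_minimum(expected, minimum=5):
    """Indexes of cells whose expected frequency violates the rule of five."""
    return [i for i, e in enumerate(expected) if e < minimum]


grade_percentages = [5, 25, 40, 25, 5]   # grades A, B, C, D, F

expected_150 = expected_frequencies(150, grade_percentages)
expected_60 = expected_frequencies(60, grade_percentages)  # hypothetical sample

print(expected_150, cells_below_minimum(expected_150))
print(expected_60, cells_below_minimum(expected_60))
```

For the sample of 150 every cell passes. For the hypothetical sample of 60, grades A and F fall below 5, so they would have to be merged with neighbouring grades (for example A with B and F with D) to keep the categories meaningful.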

CHI-SQUARE TEST OF A CONTINGENCY TABLE


A contingency table is a rectangular table in which items from a population are
classified according to two characteristics. The objective is to analyze the
relationship between two qualitative variables i.e. to investigate whether a
dependence relationship exists between two variables or whether the variables are
statistically independent. The number of degrees of freedom for a contingency
table with $r$ rows and $c$ columns is $(r - 1)(c - 1)$.

Examples
1. The trustee of a company’s pension plan has solicited the opinions of a sample
of the company’s employees regarding a proposed revision of the plan. A
breakdown of the responses is shown in the table below: -
Response   Lower level management   Middle management   Top management
For        67                       32                  11
Against    63                       18                  9
Is there sufficient evidence at the 5% significance level, to conclude that the
responses differ among the three groups of employees?

2. The operations manager at a shirt manufacturing plant has been concerned
about the large number of defects that the company’s three shifts have been
producing. There appear to be three types of defects: Improper stitching,
buttons not aligned with button holes and inconsistent colouring. The manager
decides to investigate the problem. As a first step to improving the quality, she
wants to know if the number and type of defects are the same for all three
shifts. A random sample of one day’s shirt production is taken. The number of
each type of defect and the number of perfect shirts for each are shown in the
following table.
Shift
Shirt condition 1 2 3 Total
Perfect 224 249 238 711
Improperly stitched 15 19 21 55
Unaligned buttons 8 12 12 32
Inconsistent colour 17 16 11 44
Total 264 296 282 842

Do these results allow the operations manager to conclude that at the 10%
significance level, there are differences in quality among the three shifts?

3. There are three distinct types of hardware wholesalers: independents
(independently owned), wholesaler voluntaries (groups of independents
acting together) and retailer cooperatives (retailer owned). In a random
sample of 137 retailers, the retailers were categorized according to the type of
wholesaler they primarily used and according to their store location as shown
in the table below:
Store location          Retailer cooperatives   Wholesaler voluntaries   Independents
Multiple locations      14                      10                       5
Free-standing           29                      26                       13
Others (mall, strips)   20                      14                       6
At the 5% significance level, is there sufficient evidence to conclude that
the type of wholesaler primarily used by a retailer is related to the retailer’s
location?
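
Example 1 above (the pension plan responses) can be worked with the short sketch below. Expected frequencies are computed as (row total × column total) / grand total, and the degrees of freedom as (rows - 1)(columns - 1). As the standard library has no chi-square distribution, the critical value 5.9915 for 2 degrees of freedom at the 5% level is copied from a chi-square table and is an assumption of the sketch; `contingency_chi_square` is a hypothetical helper name.

```python
def contingency_chi_square(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two classifications are independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, degrees_of_freedom


# Rows: For, Against. Columns: lower, middle and top management.
pension_table = [
    [67, 32, 11],
    [63, 18, 9],
]

chi2, degrees_of_freedom = contingency_chi_square(pension_table)

# Table value for alpha = 0.05 and 2 degrees of freedom (hard-coded).
CRITICAL_VALUE = 5.9915
reject_h0 = chi2 > CRITICAL_VALUE

print(round(chi2, 3), degrees_of_freedom, reject_h0)
```

Under these assumptions the statistic is below the table value, so the data would not show that responses differ among the three groups at the 5% level. The same function can be applied to the tables in Examples 2 and 3 with the appropriate critical values.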
RELATIONAL STATISTICS
Relational statistics fall into one of three categories: univariate, bivariate, and
multivariate analysis. Univariate analysis is the study of one variable for a sub-
population. Bivariate analysis is the study of a relationship between two variables.
Multivariate analysis is the study of relationship between three or more variables.
The relational statistics include correlation, regression, discriminant analysis,
conjoint analysis, factor analysis and cluster analysis
➢ Discriminant analysis: It is used to classify people or objects into groups based
on several predictor variables. The groups are defined by a categorical variable
with two or more values, whereas the predictors are metric. The effectiveness
of the discriminant equation is based not only on its statistical significance but
also on its success in correctly classifying cases to groups.
➢ Conjoint analysis: It is a technique that typically handles non-metric
independent variables. It allows the researcher to determine the importance of
product or service attributes and the levels or features that are most desirable.
Respondents provide preference data by ranking or rating cards that describe
products. These data become utility weights of product characteristics by means
of optimal scaling and log linear algorithms.
➢ Factor analysis: It attempts to reduce the number of variables and discover the
underlying constructs that explain the variance. A correlation matrix is used to
derive a factor matrix from which the best linear combination of variables may
be extracted.
➢ Cluster analysis: It is a set of techniques for grouping similar objects or people.
The cluster procedure starts with an undifferentiated group of people, events or
objects and attempts to reorganize them into homogeneous sub-groups.

REGRESSION ANALYSIS
Regression involves developing a mathematical equation that analyses the
relationship between the variable to be forecast (dependent variable) and the
variables that the statistician believes are related to the forecast variable
(independent variable).
Regression is the estimation of unknown values or the prediction of one variable
from known values of other variables.
Types of regression
➢ Simple linear regression: Involves a relationship between two variables only.
➢ Multiple regression: Analyses or considers the relationship between three or
more variables.

Simple Regression
The first step in establishing the relationship between X and Y is to obtain
observations on the two variables and analyze the data using a scatter diagram to
indicate whether a positive or negative relationship exists between X and Y. The
relationship can be approximated by a straight line. Algebraically, the relationship
is $Y = a + bX$, where $a$ is the intercept and $b$ is the slope.

The above function is deterministic since it gives an exact relationship between X and
Y. When the line is plotted, not all the points will fall on the line because of the
following reasons:-
➢ Omission of other explanatory variables from the function
➢ Random behavior of human beings
➢ Imperfect specification of the functional form of the model
➢ Errors of aggregation
➢ Errors of measurement

To account for the deviations of some points from the straight line, the error term
$e$ is introduced. The introduction of the error term makes the function stochastic:
$Y = a + bX + e$. To estimate the values of the coefficients $a$ and $b$, we need
observations on Y, X and the error term. However, the error term is not
observable and therefore we make assumptions about the error term.

Assumptions of the error term


➢ The error term is a real random variable which has a mean of zero and
constant variance ( Assumption of homoscedasticity)
➢ The error term is normally distributed
➢ The error term corresponding to different values of X for different periods are
not correlated (assumption of no autocorrelation)
➢ There is no relationship between the explanatory variables and the error term
➢ The explanatory variables are measured without error. The error absorbs the
influence of omitted variables and errors of measurement in the dependent
variable.
All the above assumptions are called stochastic assumptions

Other assumptions
➢ The explanatory variables are not perfectly linearly related or correlated (No
multicollinearity)
➢ The variables are correctly aggregated
➢ The relation being estimated is identified
➢ The relationship is correctly specified

The regression equation of Y on X


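➢ It is used to predict the values of Y from the given values of X.
➢ It is expressed as follows: $Y = a + bX$
➢ To determine the values of $a$ and $b$, the following two normal equations are
to be solved simultaneously:
$\sum Y = na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
➢ Alternatively, the values of $a$ and $b$ can be obtained using the following
formulas:
$b = \frac{SS_{xy}}{SS_{xx}}$ and $a = \bar{Y} - b\bar{X}$, where
$SS_{xy} = \sum XY - \frac{(\sum X)(\sum Y)}{n}$ and $SS_{xx} = \sum X^2 - \frac{(\sum X)^2}{n}$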

Example
1. A random sample of eight auto drivers insured with a company and having
similar auto insurance policies was selected. The following table lists their
driving experience (in years) and the monthly auto insurance premium (in
Sh.000) paid by them.
Driving experience (years)                    5   2   12  9   15  6   25  16
Monthly auto insurance premium (in Sh. 000)   64  87  50  71  44  56  42  60
i. Find the least squares regression line by identifying the appropriate
dependent and independent variable

ii. Interpret the meaning of the constants calculated in part (i) above.
iii. Compute the coefficient of correlation and coefficient of determination and
interpret their values.

2. A farmer wanted to find out the relationship between the amount of fertilizer
used and the yield of corn. He selected seven acres of his land on which he
used different amounts of fertilizer to grow corn. The following table gives the
amount (in kg) of fertilizer used and the yield (in Tonnes) of corn for each of
the seven acres.
Fertilizer used 120 80 100 70 88 75 110
Yield of corn 138 112 129 96 119 104 134
i. Find the least squares regression line by identifying the appropriate
dependent and independent variable.

ii. Interpret the meaning of the constants calculated in part (i) above.
iii. Compute the coefficient of correlation and coefficient of determination and
interpret their values.
iv. Predict the yield of corn per acre for 105 kg of fertilizer used.

3. In an attempt to get a better idea of some of the determinants of medical
expenditures by families, a social worker collected data on family size and
average weekly medical bills, with the results shown in the following table;
Family size                            2   2   4   5   7   3   8    10  5   2   3   5   2
Weekly medical expenses (in Sh. '00)   20  28  52  50  78  35  102  88  51  22  29  49  25

i. Find the least squares regression line by identifying the appropriate
dependent and independent variable.

ii. Interpret the meaning of the constants calculated in part (i) above.
iii. Compute the coefficient of correlation and coefficient of determination and
interpret their values.
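
As an illustration, Example 1 above (driving experience and insurance premium) can be worked with the sketch below. It assumes the line is written as Y = a + bX, with the premium as the dependent variable Y and driving experience as the independent variable X, and it uses the sums-of-squares form of the least squares formulas. The symbols a and b and the function name `least_squares` are choices made for this sketch.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return intercept a, slope b and correlation r for the line Y = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n

    b = ss_xy / ss_xx                  # slope
    a = sum_y / n - b * sum_x / n      # intercept: mean of Y minus b times mean of X
    r = ss_xy / sqrt(ss_xx * ss_yy)    # coefficient of correlation
    return a, b, r


experience = [5, 2, 12, 9, 15, 6, 25, 16]     # X, years
premium = [64, 87, 50, 71, 44, 56, 42, 60]    # Y, Sh. 000 per month

a, b, r = least_squares(experience, premium)
r_squared = r ** 2                            # coefficient of determination

print(round(a, 4), round(b, 4), round(r, 4), round(r_squared, 4))


def predict(x):
    """Predicted premium for x years of driving experience."""
    return a + b * x
```

The negative slope says that, in this sample, each extra year of experience is associated with a lower monthly premium, and the coefficient of determination gives the share of the variation in premiums that experience accounts for. A call such as `predict(10)` returns the fitted premium for a driver with ten years of experience.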
CORRELATION
Definition: It is the existence of some definite relationship between two or more
variables. Correlation analysis is a statistical tool used to describe the degree to
which one variable is linearly related to another variable.

Types of Correlation
Correlation may be classified in the following ways:-
Positive and negative correlation
Correlation is said to be positive if two series move in the same direction,
otherwise it is negative (opposite Direction).
Linear and Non-Linear correlation
Correlation is linear if the amount of change in one variable tends to bear a
constant ratio to the amount of change in the other variable otherwise it is non-
linear.
Simple, partial and multiple correlation
Simple correlation is where two variables are studied while partial or multiple
involves three or more variables.

Methods of calculating simple correlation


1. Scatter diagram
2. Karl Pearson’s coefficient of correlation
3. Spearman’s rank correlation coefficient
4. Method of least squares

Scatter diagram
It is a chart that portrays the relationship between two variables.
Advantages
➢ It is simple and non-mathematical method of studying correlation between
variables.
➢ It is not influenced by the size of extreme values
Limitation
➢ One cannot establish the exact degree of correlation between the variables.

Karl Pearson’s coefficient of correlation (Product moment coefficient of correlation)
The coefficient of correlation (r) is a measure of strength of the linear relationship
between two variables.

$r = \frac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{\left(\sum X^2 - n\bar{X}^2\right)\left(\sum Y^2 - n\bar{Y}^2\right)}}$

Interpretation of the coefficient of correlation


➢ When r = +1, there is a perfect positive correlation between the variables
➢ When r = -1, there is a perfect negative correlation between the variables
➢ When r = 0, there is no correlation between the variables
➢ The closer r is to +1 or to –1, the closer the relationship between the variables
and the closer r is to 0, the less close the relationship.
➢ The closeness of the relationship is not proportional to r.
The following table lists the interpretations for various correlation coefficients:
Value Comment
0.8 to 1.0 Very strong
0.6 to 0.8 Strong
0.4 to 0.6 Moderate
0.2 to 0.4 Weak
0.0 to 0.2 Very weak
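
A short sketch of the product moment calculation, using the fertilizer and corn yield data from the regression examples in these notes. The function `strength` applies the interpretation table above to the absolute value of r; since the table's ranges share their end points, assigning each boundary value to the stronger category is an assumption of the sketch.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product moment coefficient of correlation between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    denominator = sqrt(
        (sum(x * x for x in xs) - n * x_bar ** 2)
        * (sum(y * y for y in ys) - n * y_bar ** 2)
    )
    return numerator / denominator


def strength(r):
    """Verbal interpretation of r based on the table of value ranges."""
    size = abs(r)
    if size >= 0.8:
        return "Very strong"
    if size >= 0.6:
        return "Strong"
    if size >= 0.4:
        return "Moderate"
    if size >= 0.2:
        return "Weak"
    return "Very weak"


fertilizer = [120, 80, 100, 70, 88, 75, 110]    # kg
corn_yield = [138, 112, 129, 96, 119, 104, 134]

r = pearson_r(fertilizer, corn_yield)
print(round(r, 3), strength(r))
```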

Advantage
➢ It summarizes in one figure the degree of correlation and whether it is positive
or negative.

Limitations
➢ It assumes linear relationship regardless of the fact whether that assumption is
true or not.
➢ The coefficient can be misinterpreted.


➢ The value of the coefficient is unduly affected by the extreme values.
➢ It is time consuming.
Method of least squares
$r = \frac{SS_{xy}}{\sqrt{SS_{xx} \cdot SS_{yy}}}$
Spearman’s Rank Correlation
Definition
➢ It is the correlation between the ranks assigned to individuals by two different
characters.
➢ It is a non-parametric technique for measuring strength of relationship between
paired observations of two variables when the data are in ranked form.

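It is denoted by $R$ or $\rho$:
$R = 1 - \frac{6\sum d_i^2}{N(N^2 - 1)} = 1 - \frac{6\sum d^2}{N^3 - N}$
where $d$ is the difference between the two ranks of an individual and $N$ is the
number of paired observations.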
In rank correlation, there are two types of problems:-
i. Where actual ranks are given
ii. Where actual ranks are not given

Where actual ranks are given


Steps:
➢ Take the differences of the two ranks i.e. (R1-R2) and denote these differences
by d.
➢ Square these differences and obtain the total $\sum d^2$
➢ Use the formula $R = 1 - \frac{6\sum d^2}{N^3 - N}$
Example
The ranks given by two judges to 10 individuals are given below.
Individual    1  2  3  4   5  6  7  8  9   10
Judge 1 (X)   1  2  7  9   8  6  4  3  10  5
Judge 2 (Y)   7  5  8  10  9  4  1  6  3   2
Calculate the spearman’s rank correlation.
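
The steps above can be followed for the two judges with a short sketch; `spearman_from_ranks` is a hypothetical helper name, and the formula used is R = 1 - 6Σd² / (N³ - N).

```python
def spearman_from_ranks(ranks_x, ranks_y):
    """Spearman's rank correlation when the ranks are already given."""
    n = len(ranks_x)
    # Step 1: differences between the two ranks of each individual.
    differences = [rx - ry for rx, ry in zip(ranks_x, ranks_y)]
    # Step 2: square the differences and total them.
    sum_d_squared = sum(d ** 2 for d in differences)
    # Step 3: apply the formula.
    return 1 - 6 * sum_d_squared / (n ** 3 - n), sum_d_squared


judge_1 = [1, 2, 7, 9, 8, 6, 4, 3, 10, 5]
judge_2 = [7, 5, 8, 10, 9, 4, 1, 6, 3, 2]

R, sum_d_squared = spearman_from_ranks(judge_1, judge_2)
print(sum_d_squared, round(R, 4))
```

For these ranks the squared differences total 128, which gives a weak positive rank correlation of about 0.22 between the two judges.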

Where ranks are not given


Ranks can be assigned by taking either the highest value as 1 or the lowest value as
1. The same method should be followed for all the variables.

Example
Calculate the Rank correlation coefficient for the following data of marks given to
1st year B Com students:
CMS 100 45 47 60 38 50
CAC 100 60 61 58 48 46
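
When only marks are given, the ranks have to be assigned first. The sketch below ranks the CMS 100 and CAC 100 marks with the highest mark as rank 1 and then applies R = 1 - 6Σd² / (N³ - N). It assumes there are no tied marks (true for this data); tied values would need averaged ranks, which this sketch does not handle. The function names are illustrative.

```python
def rank_descending(values):
    """Rank values with the highest as 1. Assumes no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(value) + 1 for value in values]


def spearman_from_values(xs, ys):
    """Spearman's rank correlation from raw (untied) values."""
    ranks_x = rank_descending(xs)
    ranks_y = rank_descending(ys)
    n = len(xs)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n ** 3 - n)


cms_100 = [45, 47, 60, 38, 50]
cac_100 = [60, 61, 58, 48, 46]

cms_ranks = rank_descending(cms_100)
cac_ranks = rank_descending(cac_100)
R = spearman_from_values(cms_100, cac_100)

print(cms_ranks, cac_ranks, round(R, 2))
```

Ranking with the lowest mark as 1 instead would give the same coefficient, provided the same convention is used for both subjects.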
Merits of the Rank method
➢ It is simpler to understand and easier to apply compared to the Karl Pearson’s
method.
➢ Where the data are of qualitative nature like honesty, efficiency, intelligence
etc, the method can be used with great advantage.
➢ It is the only method that can be used where we are given the ranks and not
the actual values.
Limitations
➢ The method cannot be used for finding out correlation in a grouped frequency
distribution.
➢ Where the number of observations exceeds 30, the calculations become quite
tedious and require a lot of time.

Coefficient of determination (r2)


It is the square of the correlation coefficient. It shows the proportion of the total
variation in the dependent variable Y that is explained or accounted for by the
variation in the independent variable X. e.g. If the value of r = 0.9, r2 = 0.81, this
means 81% of the variation in the dependent variable has been explained by the
independent variable.

REPORT WRITING TECHNIQUES


A quality presentation of research findings can have an inordinate effect on a
reader’s or a listener’s perceptions of a study’s quality. Recognition of this fact
should prompt a researcher to make a special effort to communicate skillfully and
clearly. Research reports contain findings, analysis, interpretations, conclusions and
recommendations. Research reports differ depending on their aims and their
readership. Reports should be clearly organized, physically inviting and easy to
read. Writers can achieve these goals if they are careful with mechanical details,
writing style and comprehensibility.

Writing a research proposal and research reports


A proposal is a document, which details an intended activity. The formats for
writing proposals differ from institution to institution or from department to
department. Generally, a research proposal should include the following prefatory
items; the title page, declaration, table of contents, list of figures and tables, list of
acronyms and abbreviations and an abstract. It will also have chapter one:
Introduction, Chapter Two: Literature review and Chapter Three: Methodology.
In addition it will also have the references, time schedule, budget and any
appendices.

The final research report will have what is contained in the proposal (apart from
the time schedule and budget) and in addition dedication, acknowledgement,
chapter four: Data analysis and findings and chapter five: Summary, conclusions
and recommendations.

Prefatory items
Prefatory items do not have a direct bearing on the research itself. They assist the
reader in using the research report. They can include: -
Title page:
The title page should include the title of the report, the date and for whom and
by whom it was prepared. The title should be brief but should include the
variables included in the study, the type of relationship among the variables and
the population to which the results may be applied.
Declaration
Here the researcher declares that the work is his or her original work.
Dedication
Some researchers wish to dedicate their work to a person or persons they deem
special in their lives.
Acknowledgements
During the research process, the researcher may require help from other
individuals or organisations. The researcher should acknowledge the help
received from these individuals and organisations.
Table of contents and list of figures and tables
Any report with several sections that total more than six to ten pages should have
a table of contents. If there are many tables, charts or other exhibits, they should
also be listed after the table of contents in a separate list of tables or list of figures.
List of abbreviations and acronyms
All abbreviations and acronyms used in the report should be explained. An
abbreviation is a short form of a word while an acronym is a contraction formed
by taking the first letter of several words.
Abstract
A proposal abstract is a summary of what the researcher intends to do. It should
be brief, precise and to the point.
Chapter One
1.0 Introduction
The introduction prepares the reader for the report by describing the parts of the
report.
1.1 Background to the problem
In the background, the researcher should broadly introduce the topic under
investigation. The researcher introduces briefly the general area of study, and then
narrows down to the specific problem to be studied. The background enables the
reader to have an idea of what is happening regarding the area under
investigation.
1.2 The problem Statement
The researcher states the problem under investigation. The researcher should
describe the factors that make the stated problem a critical issue to warrant the
study. Relevant literature can be referred to. It should be brief and precise.
1.3 The objectives of the study
Research objectives are those specific issues within the scope of the stated purpose
that the researcher wants to focus upon and examine in the study. The objectives
should be specific, measurable, achievable, realistic and time bound. Objectives
guide the researcher in formulating testable hypotheses.
1.4 Research questions
These are the questions that the researcher would like to answer by
undertaking the study. They should be formulated from the objectives of the
study.
1.5 Research Hypothesis
A hypothesis is a researcher's prediction regarding the outcome of the study. It
states possible differences, relationships or causes between two variables or
concepts. Hypotheses are derived from or based on existing theories, previous
research, personal observations or experiences. The test of a hypothesis involves
collection and analysis of data that may either support or fail to support the
hypothesis. If the results fail to support a stated hypothesis, it does not mean that
the study has failed but it implies that the existing theories or principles need to be
revised or retested under various situations.
1.6 Scope of the study
This section indicates the boundary of the study.
1.7 Significance / Justification of the study
The justification helps to answer the following questions. Why is this work
important? What are the implications of doing it? How does it link to other
knowledge? How does it stand to inform policy making? The significance must be
strong enough to warrant the use of time, energy and money in carrying out the
research.
1.8 Assumptions and limitations of the study
An assumption is any fact that a researcher takes to be true without actually
verifying it. It puts some boundary around the study and provides the reader with
vital information, which influences the way results of the study are interpreted. A
limitation is an aspect of the research that may influence the results negatively but
over which the researcher has no control. A common limitation in social science
studies is the scope of the study, which sometimes may not allow generalizations.
Sample size may also be another limitation.

Chapter Two
2.0 Literature Review
The purpose of the literature review is to situate your research in the context of
what is already known about a topic. It need not be exhaustive, but it needs to show
how your work will benefit the whole. It should provide the theoretical basis for
your work, show what has been done in the area by others, and set the stage for
your work.
In a literature review you should give the reader enough ties to the literature that
they feel confident that you have found, read, and assimilated the literature in the
field. It should probably move from the more general to the more focused studies,
but need not be exhaustive, only relevant.
The literature review should clearly present the holes in the knowledge that need
to be plugged and by so doing, situate your work. It is the place where you
establish that your work will fit in and be significant to the discipline.
Chapter Three
3.0 Research Methodology
This section should make clear to the reader the way that you intend to approach
the research question and the techniques and logic that you will use to address it.
3.1 Research design
The coverage of the design must be adapted to the purpose. In an experimental
study, the materials, tests, equipment, control conditions and other devices should
be described. In descriptive or ex post facto designs, it may be sufficient to cover
the rationale for using one design instead of competing alternatives. The strengths
and weaknesses of the design can be identified and the instrumentation and
materials discussed.
3.2 The target population
The researcher should explicitly define the target population being studied.
3.3 Sampling strategy
Explanations of the sampling methods, uniqueness of the chosen parameters or
other points that need explanation should be covered with brevity.
3.4 Data Collection Tools and Techniques
This part of the report describes the specifics of gathering the data. Its contents
depend on the design. This might include the data that you anticipate collecting
and a description of the instruments you will use. Detailed copies of the data
collection tools e.g. questionnaires, interview schedule or observation schedule
should be attached as an appendix.
3.5 Data Analysis
This section summarizes the methods used to analyze the data. It describes data
handling, preliminary analysis, statistical tests, computer programs and other
technical information. The rationale for the choice of analysis approaches should
be clear. A brief commentary on assumptions and appropriateness of use should
be presented.

Chapter Four
4.0 Data analysis and Findings
The objective is to explain the data rather than draw interpretations or
conclusions. When quantitative data can be presented, it should be done as simply
as possible with charts, graphics and tables. The data need not include everything
collected. Only material important to the reader’s understanding of the problem
and the findings should be included. Findings that support the hypothesis and
those that do not should both be included.
Chapter Five
5.0 Summary and Conclusions
The summary is a brief statement of the essential findings. Sectional summaries
may be used if there are many specific findings. These may be combined into an
overall summary. Conclusions represent inferences drawn from the findings.
Conclusions may be presented in a tabular form for easy reading and reference.
Summary findings may be subordinated under the related conclusion statement.
Recommendations
There are usually a few ideas about corrective actions. In academic research, the
recommendations are often further study suggestions that broaden or test
understanding of the subject area. In applied research, the recommendations will
usually be for managerial action rather than research action. The writer may offer
several alternatives with justifications.
References
The use of secondary data requires a reference or a bibliography. Proper citation,
style and formats are unique to the purpose of the report.
Appendixes
The appendixes are the place for complex tables, statistical tests, supporting
documents, copies of forms and questionnaires, detailed descriptions of the
methodology, instructions to field workers and other evidence important for later
support. The reader who wishes to learn about technical aspects of the study and
to look at statistical breakdowns will want a complete appendix.
Time schedule
It is a list of the major activities and the anticipated time period each will
take to accomplish. The time is usually given in months. Activities to be
undertaken may overlap.
Budget
A budget is a list of items that will be required to carry out the research and their
approximate cost. It should be detailed enough and precise on items needed,
prices per unit and total cost. Details of requirements in each budget will be
governed by the type of research.
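
As a minimal sketch of how such a budget can be laid out and totalled, the Python snippet below lists each item with a quantity and a price per unit, then computes the line totals and the overall cost. All item names, quantities and prices are hypothetical figures chosen for illustration, and no particular currency is implied.

```python
# Each entry: (item, quantity, price per unit). All figures are hypothetical.
budget_items = [
    ("Printing of questionnaires", 200, 10.0),
    ("Travel to data collection sites (trips)", 4, 1500.0),
    ("Data entry assistant (days)", 5, 1000.0),
]


def line_total(quantity, unit_price):
    """Cost of one budget line: quantity multiplied by price per unit."""
    return quantity * unit_price


# Print one row per item, then the overall total
for item, quantity, unit_price in budget_items:
    cost = line_total(quantity, unit_price)
    print(f"{item:<42} {quantity:>5} x {unit_price:>8.2f} = {cost:>10.2f}")

total_cost = sum(line_total(q, p) for _, q, p in budget_items)
print(f"{'Total':<42} {total_cost:>29.2f}")
```

Keeping the quantity and the unit price separate makes it easy for a reviewer to check how each figure in the budget was arrived at.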

Characteristics of a Good Proposal:


a) The need for the proposed activity is clearly established, preferably with
data.
b) The most important ideas are highlighted and repeated in several places.
c) The objectives of the project are given in detail.
d) There is a detailed schedule of activities for the project, or at least sample
portions of such a complete project schedule.
e) Collaboration with all interested groups in planning of the proposed
project is evident in the proposal.
f) The commitment of all involved parties is evident, e.g., letters of
commitment in the appendix and cost sharing stated in both the narrative
of the proposal and the budget.
g) The budget and the proposal narrative are consistent.
h) The uses of money are clearly indicated in the proposal narrative as well as
in the budget.
i) All of the major matters indicated in the proposal guidelines are clearly
addressed in the proposal.
j) The agreement of all project staff and consultants to participate in the
project was acquired and is so indicated in the proposal.
k) All governmental procedures have been followed with regard to matters
such as civil rights compliance and protection of human subjects.
l) Appropriate detail is provided in all portions of the proposal.
m) All of the directions given in the proposal guidelines have been followed
carefully.
n) Appendices have been used appropriately for detailed and lengthy
materials which the reviewers may not want to read but are useful as
evidence of careful planning, previous experience, etc.
o) The length is consistent with the proposal guidelines and/or funding agency
expectations.
p) The budget explanations provide an adequate basis for the figures used in
building the budget.
q) If appropriate, there is a clear statement of commitment to continue the
project after external funding ends.
r) The qualifications of project personnel are clearly communicated.
s) The writing style is clear and concise. It speaks to the reader, helping the
reader understand the problems and proposal. Summarizing statements and
headings are used to lead the reader.

Guidelines for writing a good research report


a) Break large units of text into smaller units with headings to show the
organisation of the topics.
b) Relieve difficult text with visual aids when possible.
c) Emphasize important material and de-emphasize secondary material through
sentence construction and judicious use of italicizing, underlining, capitalizing
and parentheses.
d) Use ample space and wide margins to create a positive psychological effect on
the reader.
e) Choose words carefully, opting for the known and short rather than the
unknown and long.
f) Repeat and summarize critical and difficult ideas so readers can have time to
absorb them.
g) Review the writing to ensure the tone is appropriate.
h) Proofread the final document to correct any errors.
i) Use short paragraphs.
j) Indent parts of text that represent listings, long quotations or examples.
k) Use headings and subheadings to divide the report and its major sections into
homogeneous topical parts.
