Research Methods Lesson Notes

This document provides an overview of descriptive statistics. It discusses the importance of statistics in research for describing, summarizing, and explaining data. Descriptive statistics is divided into topics including statistical symbols used, organizing data through statistical tables, rank ordering, and constructing frequency distributions. The document provides examples and steps for organizing data through these various descriptive statistic methods.


CHAPTER EIGHT

DESCRIPTIVE STATISTICS

Introduction

Research and innovation are essential to the enrichment, progress and development of any field of knowledge. A great deal of research is carried out, and statistical methods help researchers carry out this research successfully.

Statistics is the branch of mathematics that deals with the collection, organization, and analysis of numerical data, and with such problems as experiment design and decision making.

Statistics is divided into two major divisions: descriptive and inferential statistics. In this chapter we will concern ourselves with descriptive statistics.

Topic 1: Importance of statistics in research

Descriptive statistics is the use of statistics to describe, summarize, and explain or make
sense of a given set of data.

Guilford (1973) summarized the importance of statistics in research:


• They permit the most exact kind of description.
• They force us to be definite and exact in our thinking.
• They enable us to summarise our results in a meaningful and convenient form.
• They enable us to draw general conclusions.
• They enable us to predict.
• They enable us to analyse some of the causal factors underlying complex and otherwise bewildering events.

Statistics, in general, renders valuable services in the following dimensions:


• In the collection of data
• In classification, organization and summarization of numerical facts.
• In drawing conclusions and inferences or making predictions on the basis of data
(Mangal, 1990).

Topic 2: Statistical Symbols

The following are statistical symbols:


• X - The raw score or the midpoint of the class interval in a frequency distribution
• X-M (or x) - The deviation score indicating how far a score deviates from the mean
• ∑ X - Sum of the scores in a series
• i - The class interval in a frequency distribution.
• f - Frequency that is the number of cases with specific score in a particular class
interval.
• n – Sample size
• N – Population size
• cf – Cumulative frequency
• M, Md, Mo – Mean, median, mode
• Q1, Q2, Q3, etc. – First quartile, second quartile, third quartile, etc
• P1, P2, P3, etc. – First percentile, second percentile, third percentile, etc
• σ – Standard deviation
• r – Product moment correlation coefficient
• ρ – Spearman’s rank order correlation coefficient
• H0 – Null hypothesis
• HA – Alternative hypothesis

Topic 3: Organisation of Data

The word data is the plural of datum. It means the evidence or facts describing a group or situation. Examples of data are the heights of students, intelligence scores and the weights of learners.

To understand the meaning of data and derive useful conclusions, the data have to be organised or arranged in some systematic way. Organisation of data is the arrangement of original data in a proper way for deriving useful interpretation (Mangal, 1990; Kasomo, 2007). It can be done in the following ways:

➢ Organising in the form of statistical tables.
➢ Organising in the form of rank order.
➢ Organising in the form of frequency distributions (that is, distribution into suitable groups).

Statistical Tables

Here, you tabulate or arrange the data in some properly selected classes and the
arrangement is described by titles and sub-titles. Such tables can list original raw data as
well as percentages, means, standard deviations (Mangal, 1990). The following are
general rules for constructing tables:

➢ The title should be simple, concise and unambiguous. It usually appears at the top of the
table.
➢ The table should be suitably divided into columns and rows according to the nature
of data and purpose. These rows and columns should be arranged in a logical
order to facilitate comparisons.
➢ The heading of each column or row should be as brief as possible. Two or more
columns or rows with similar headings may be grouped under a common heading
to avoid repetition.
➢ Sub-totals for each separate classification, and a general total for all classes combined,
should be given at the bottom or right of the items concerned.
➢ The units in which the data are given must be mentioned, preferably, in the
headings of columns or rows.
➢ Necessary footnotes, providing essential explanations of points open to ambiguous
interpretation of the tabulated data, must be given at the bottom of the table.
➢ Provide the source of data at the end of the table.
➢ The table should be simple to allow easy interpretation.

Rank Order

The original data may be arranged in an ascending or descending series exhibiting an order with respect to the rank or merit position of each individual. For this purpose, we would use a descending series, arranging the highest score at the top and the lowest at the bottom.

Example: Fifty students obtained the following scores on their psychology test,
tabulate these data in the form of rank order:

63, 22, 27, 33, 57, 37, 38, 40, 54, 41, 55, 43, 45, 62, 69, 29, 34, 57, 58, 38, 53, 40, 46,
48, 49, 59, 46, 47, 48, 41, 54, 43, 44, 64, 31, 35, 59, 36, 39, 53, 42, 52, 45, 42, 43, 44,
46, 47, 51, 39

Solution: The rank order tabulation of this data will be as given below:

Rank Scores Rank Scores Rank Scores

1 69 21 47 41 38
2 64 22 46 42 37
3 63 23 46 43 36
4 62 24 46 44 35
5 59 25 45 45 34
6 59 26 45 46 33
7 58 27 44 47 31
8 57 28 44 48 29
9 57 29 43 49 27
10 55 30 43 50 22

11 54 31 43
12 54 32 42
13 53 33 42
14 53 34 41
15 52 35 41
16 51 36 40
17 49 37 40
18 48 38 39
19 48 39 39
20 47 40 38
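The rank-order tabulation can be reproduced with a short script. Below is a minimal sketch in Python (a tool these notes do not prescribe), using the fifty psychology scores from the example:

```python
# Rank-order tabulation: sort the fifty psychology scores in descending
# order and pair each score with its rank (1 = highest score).
scores = [63, 22, 27, 33, 57, 37, 38, 40, 54, 41, 55, 43, 45, 62, 69,
          29, 34, 57, 58, 38, 53, 40, 46, 48, 49, 59, 46, 47, 48, 41,
          54, 43, 44, 64, 31, 35, 59, 36, 39, 53, 42, 52, 45, 42, 43,
          44, 46, 47, 51, 39]

ranked = sorted(scores, reverse=True)          # descending series
rank_order = list(enumerate(ranked, start=1))  # (rank, score) pairs

print(rank_order[0])   # (1, 69)  -> the highest score gets rank 1
print(rank_order[-1])  # (50, 22) -> the lowest score gets rank 50
```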

Frequency Distributions

One useful way to view the data of a variable is to construct a frequency distribution
(i.e., an arrangement in which the frequencies, and sometimes percentages, of the
occurrence of each unique data value are shown). By frequency of a datum, we mean the
number of times the datum is repeated in a given series.

In this form of organisation and arrangement of data, you group the numerical data into
some arbitrarily chosen classes or groups. For this purpose, usually, the data are
distributed into groups of data (classes) and each datum is allotted a place in the
respective group or class. It is also seen how many times a particular datum or group of
data occur in the given data. In this way, frequency distribution may be considered a
method of presenting a collection of groups of data in each group of data or class.

Frequency distribution is organised via a number of steps:


➢ Step 1: finding the range. You can obtain the range by subtracting the lowest
datum from the highest datum. In the above rank order example, the range is:
69 - 22 = 47
➢ Step 2: Determining the class interval or grouping interval. This you can do by
following any of the following rules:
o Rule 1: to obtain the class interval, denoted i, divide the range by the
number of classes desired. As a general rule, if the series has fewer than
50 cases, the number of classes should not exceed 10; if the series has
between 50 and 100 cases, use between 10 and 15 classes; and if the
series has more than 100 items, use 15 or more classes.
o Rule 2: the class interval is decided first and then the number of classes is
determined
In the above case, using Rule 1, the range is 47 and, since the number of cases in our
series is 50, 10 classes should be sufficient. The class interval will be 47/10 = 4.7,
which rounds to 5, the nearest whole number.
➢ Step 3: now write out the frequency distribution. You do this by drawing three
columns with the following headings:
o Classes of the distribution, the lowest class is decided and the rest of the
classes are added up to the highest, in this case, let us use the lowest class
as: 20-24
o Tallies - done on data into proper classes
o Frequencies – the number of data in each class are written in the third
column and the total frequency entered.
➢ The above data would be presented as shown in the following table:

TABLE 3.1 Construction of a Frequency Distribution Table


Classes of Data Tallies Frequencies
65-69 l 1
60-64 lll 3
55-59 llll l 6
50-54 llll l 6
45-49 llll llll 10
40-44 llll llll l 11
35-39 llll ll 7
30-34 lll 3
25-29 ll 2
20-24 l 1
Total Frequencies (N) = 50
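The steps above can be sketched in code. The following is a minimal Python illustration (Python is an assumption; the notes do not prescribe a tool) that rebuilds the frequencies of Table 3.1 from the raw scores:

```python
# Build the frequency distribution of Table 3.1: class interval i = 5,
# lowest class 20-24, counting how many scores fall into each class.
scores = [63, 22, 27, 33, 57, 37, 38, 40, 54, 41, 55, 43, 45, 62, 69,
          29, 34, 57, 58, 38, 53, 40, 46, 48, 49, 59, 46, 47, 48, 41,
          54, 43, 44, 64, 31, 35, 59, 36, 39, 53, 42, 52, 45, 42, 43,
          44, 46, 47, 51, 39]
i = 5
classes = [(low, low + i - 1) for low in range(20, 70, i)]  # (20,24) ... (65,69)
freq = {c: sum(1 for s in scores if c[0] <= s <= c[1]) for c in classes}

for low, high in reversed(classes):  # highest class first, as in the table
    print(f"{low}-{high}: {freq[(low, high)]}")
```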

While grouping the raw scores into a frequency distribution, we assume that the mid-point
of the class interval is the score obtained by each of the individuals represented by that
interval. In the above table, notice, for example, the class interval 35-39, had original
values of 39, 39, 38, 38, 37, 36 and 35 respectively. In grouping we assume that these
measures have the value of the midpoint of their respective class or group, i.e. 37. They
are represented by a single measure 37. This assumption leads to grouping error.

Cumulative Frequency and Cumulative Percentage Frequency Distribution

A Frequency Distribution Table tells you the way in which frequencies are distributed over
the various class intervals but it does not tell you the total number of cases or the
percentage of cases lying below or above a class interval. You can perform this task with
the help of cumulative frequency and cumulative percentage frequency distribution.
The following table on the above data will illustrate how this is done:

TABLE 3.2 Computation of cumulative frequencies and cumulative percentage frequencies

Class interval Frequency Cumulative frequency Cumulative percentage frequency
65-69 1 50 100.00
60-64 3 49 98.00
55-59 6 46 92.00
50-54 6 40 80.00
45-49 10 34 68.00
40-44 11 24 48.00
35-39 7 13 26.00
30-34 3 6 12.00
25-29 2 3 6.00
20-24 1 1 2.00

Cumulative frequencies are, thus, obtained by adding successively, starting from the
bottom, the individual frequencies. In the above table when we start from the bottom, the
first cumulative frequency is written as 1 against the lowest class interval, i.e. 20-24, the
next cumulative frequency is 3 because 1 is added to the frequency of the second class
interval which is 2, then 6 by adding 3 to the frequency of the third class interval which is
3 and so on.
To convert the cumulative frequencies into cumulative percentage frequencies, you just
take the particular cumulative frequency and multiply it by 100/N, where N is the total
number of frequencies.
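The same computation can be sketched in a few lines of Python (an assumed tool, not one the notes prescribe), working upward from the bottom class:

```python
# Cumulative frequencies: add the class frequencies successively from the
# bottom class upward, then convert to percentages by multiplying by 100/N.
freqs_bottom_up = [1, 2, 3, 7, 11, 10, 6, 6, 3, 1]  # classes 20-24 ... 65-69
N = sum(freqs_bottom_up)

cum = []
running = 0
for f in freqs_bottom_up:
    running += f
    cum.append(running)

cum_pct = [c * 100 / N for c in cum]
print(cum)      # [1, 3, 6, 13, 24, 34, 40, 46, 49, 50]
print(cum_pct)  # [2.0, 6.0, 12.0, 26.0, 48.0, 68.0, 80.0, 92.0, 98.0, 100.0]
```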

Graphic Representations of Data

Another excellent way to describe your data is to construct graphical representations of the data.
According to Mangal (1990), the following are the advantages of graphic representation
of data:
➢ Attractive and appealing to the eye.
➢ Provides more lasting effect on the brain.
➢ Comparative analysis and interpretation may effectively and easily be made.
➢ Various statistics like median, mode, quartiles, may be easily computed.
➢ May help in proper estimation, evaluation and interpretation of characteristics of
items and individuals
➢ The real value of graphical representation lies in its economy and effectiveness.
It carries a lot of communication power.
➢ Helps in forecasting as it indicates the trend and movements of the data in the
past

Some common graphical representations are bar graphs, histograms, line graphs, and
scatter plots.

Bar Graphs
A bar graph uses vertical bars to represent the data.

• The height of each bar usually represents the frequency of the category that sits on
the X axis.

• Note that, by tradition, the X axis is the horizontal axis and the Y axis is the vertical
axis.

• Bar graphs are typically used for categorical variables.

Histograms
A histogram is a graphic that shows the frequencies and shape that characterize a
quantitative variable.

• In statistics, we often want to see the shape of the distribution of quantitative
variables; having your computer programme provide you with a histogram is a
simple way to do this.

Line Graphs
A line graph uses one or more lines to depict information about one or more variables.

• A simple line graph might be used to show a trend over time (e.g., with the years on
the X axis and the population sizes on the Y axis).

• Line graphs are used for many different purposes in research. For example, they are
used in factorial experimental designs to depict the relationship between two
categorical independent variables and the dependent variable.

• A line graph can also be used to show, for example, that the "sampling distribution
of the mean" is normally distributed.

• Line graphs have in common their use of one or more lines within the graph (to depict
the levels or characteristics of a variable or to depict the relationships among
variables).

Scatterplots
A scatterplot is used to depict the relationship between two quantitative variables.

• Typically, the independent or predictor variable is represented by the X axis (i.e.,
the horizontal axis) and the dependent variable is represented by the Y axis (i.e.,
the vertical axis).
Activity 15
What are the characteristics of a normal distribution?

Measures of Central Tendency


Measures of central tendency provide descriptive information about the single numerical
value that is considered to be the most typical of the values of a quantitative variable.

• Three common measures of central tendency are the mode, the median, and the
mean.

The mode is simply the most frequently occurring number.


The median is the centre point in a set of numbers; it is also the fiftieth percentile.
• To get the median by hand, you first put your numbers in ascending or descending
order.

• Then you check to see which of the following two rules applies:

• Rule One. If you have an odd number of numbers, the median is the centre
number (e.g., three is the median for the numbers 1, 1, 3, 4, 9).

• Rule Two. If you have an even number of numbers, the median is the average of
the two innermost numbers (e.g., 2.5 is the median for the numbers 1, 2, 3, 7).

The mean is the arithmetic average (e.g., the average of the numbers 2, 3, 3, and 4, is
equal to 3).
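The three measures, with the same worked examples as above, can be computed directly with Python's standard library (a sketch only; the notes do not prescribe a tool):

```python
# Mode, median, and mean, matching the hand rules described above.
import statistics

print(statistics.mode([2, 3, 3, 4]))       # 3 -> the most frequent value
print(statistics.median([1, 1, 3, 4, 9]))  # 3 -> centre number (Rule One, odd count)
print(statistics.median([1, 2, 3, 7]))     # 2.5 -> average of 2 and 3 (Rule Two)
print(statistics.mean([2, 3, 3, 4]))       # 3 -> the arithmetic average
```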

Activity 16
1. Which measure of central tendency would be most suitable for each of the
following sets of data?
i. 1, 23, 25, 26, 27, 23, 29, 30
ii. 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 50
iii. 1, 1, 2, 3, 4, 1, 2, 6, 6, 8, 3, 4, 5, 6, 7
iv. 1, 101, 104, 106, 111, 108, 109, 200

2. If you randomly select a sample of 20 twelve-year-old girls (Sample A) and
then select a sample of 500 twelve-year-olds (Sample B), and calculate the
mean weight and height for each sample, which is likely to give a better
estimate of the population mean weight and height?

Measures of Variability
Measures of variability tell you how "spread out" or how much variability is present in a
set of numbers. They tell you how different your numbers tend to be. Note that measures
of variability should be reported along with measures of central tendency because they
provide very different but complementary and important information. To fully interpret one
(e.g., a mean), it is helpful to know about the other (e.g., a standard deviation).

An easy way to get the idea of variability is to look at two sets of data, one that is highly
variable and one that is not very variable. For example, which of these two sets of
numbers appears to be the most spread out, Set A or Set B?

• Set A. 93, 96, 98, 99, 99, 99, 100

• Set B. 10, 29, 52, 69, 87, 92, 100


If you said Set B is more spread out, then you are right! The numbers in set B are more
"spread out"; that is, they have more variability.
All of the measures of variability should give us an indication of the amount of variability
in a set of data. We will discuss three indices of variability: the range, the variance, and
the standard deviation.

Range
A relatively crude indicator of variability is the range (i.e., the difference between
the highest and lowest numbers).

• For example the range in Set A shown above is 7, and the range in Set B shown
above is 90.

Variance and Standard Deviation


Two commonly used indicators of variability are the variance and the standard deviation.

• Higher values for both of these indicators indicate a larger amount of variability than
do lower numbers.

• Zero stands for no variability at all (e.g., for the data 3, 3, 3, 3, 3, 3, the variance and
standard deviation will equal zero).

• When you have no variability, the numbers are a constant (i.e., the same number).

• (Basically, to compute the variance by hand, you set up three columns: each score,
its deviation from the mean, and its squared deviation. You get the sum of the third
column and then plug the relevant numbers into the variance formula.)

• The variance tells you (exactly) the average squared deviation from the mean, in
"squared units."

• The standard deviation is just the square root of the variance (i.e., it brings the
"squared units" back to regular units).

• The standard deviation tells you (approximately) how far the numbers tend to vary
from the mean. (If the standard deviation is 7, then the numbers tend to be about
7 units from the mean. If the standard deviation is 1500, then the numbers tend to
be about 1500 units from the mean.)
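As a sketch (in Python, which these notes do not prescribe), the range, variance, and standard deviation of Sets A and B above can be computed as follows:

```python
# Range, variance, and standard deviation for Sets A and B above.
import statistics

set_a = [93, 96, 98, 99, 99, 99, 100]
set_b = [10, 29, 52, 69, 87, 92, 100]

range_a = max(set_a) - min(set_a)    # 7
range_b = max(set_b) - min(set_b)    # 90
var_b = statistics.pvariance(set_b)  # average squared deviation from the mean
sd_b = statistics.pstdev(set_b)      # square root of the variance

print(range_a, range_b)
print(round(var_b, 2), round(sd_b, 2))  # Set B is far more variable than Set A
```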

Virtually everyone in research is already familiar with the normal curve. If data are
normally distributed, then an easy rule to apply to the data is what we call "the
68, 95, 99.7 percent rule." That is:
• Approximately 68% of the cases will fall within one standard deviation of the mean.

• Approximately 95% of the cases will fall within two standard deviations of the mean.

• Approximately 99.7% of the cases will fall within three standard deviations of the
mean.
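The rule can be checked empirically by simulating normally distributed data. A minimal Python sketch (assuming a mean of 100 and SD of 15, chosen here only for illustration):

```python
# Empirical check of the 68, 95, 99.7 percent rule on simulated normal data.
import random

random.seed(1)
mu, sigma, n = 100, 15, 100_000
data = [random.gauss(mu, sigma) for _ in range(n)]

# Percentage of simulated cases within 1, 2 and 3 SDs of the mean.
within = {k: sum(1 for x in data if abs(x - mu) <= k * sigma) / n * 100
          for k in (1, 2, 3)}
print(within)  # roughly 68%, 95% and 99.7%
```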

Measures of Relative Standing


Measures of relative standing are used to provide information about where a particular
score falls in relation to the other scores in a distribution of data. Two commonly used
measures of relative standing are percentile ranks and Z-scores.

For each type of standard score below, you can determine the mean by simply looking at
its stated mean, and the standard deviation by looking at how much the scores increase
as you move from the mean to 1 SD.

• Z-scores have a mean of 0 and a standard deviation of 1. Therefore, if you
converted any set of scores (e.g., the set of student grades on a test) to z-scores,
then that new set WILL have a mean of zero and a standard deviation of one.

• IQ has a mean of 100 and a standard deviation of 15.

• SAT has a mean of 500 and a standard deviation of 100.

• Note: percentile ranks are a different type of score; because they have only ordinal
measurement properties, the concept of standard deviation is not relevant to them.

Activity 18
If you have a negative z-score, does it fall above or below the mean?
With a negative z-score do the majority of the population score higher or
lower than you?

Percentile Ranks
A percentile rank tells you the percentage of scores in a reference group (i.e., in the
norming group) that fall below a particular raw score.
• For example, if your percentile rank is 93 then you know that 93 percent of the scores
in the reference group fall below your score.
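This definition translates directly into code. A minimal Python sketch, in which the norming-group scores are invented for illustration:

```python
# Percentile rank: the percentage of scores in the norming group that
# fall below a given raw score. The reference scores are hypothetical.
def percentile_rank(reference_scores, raw_score):
    below = sum(1 for s in reference_scores if s < raw_score)
    return below / len(reference_scores) * 100

norming_group = [55, 60, 62, 65, 68, 70, 72, 75, 80, 93]
print(percentile_rank(norming_group, 93))  # 90.0 -> 90% of the group scored below 93
```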

Z-Scores
A z-score tells you how many standard deviations (SD) a raw score falls from the mean.

• A z-score of 2 says a score falls two standard deviations above the mean.

• A z-score of -3.5 says the score falls three and a half standard deviations below the
mean.
To transform a raw score into z-score units, just use the following formula:

Z-score = (Raw score - Mean) / Standard Deviation
For example, you know that the mean for IQ scores is 100 and the standard deviation for
IQ scores is 15.

Therefore, if your IQ is 115, you can get your z-score:

Z-score = (115 - 100) / 15 = 15 / 15 = 1

An IQ of 115 falls one standard deviation above the mean.
Note that once you have a set of z-scores, you can convert to any other scale by using
this formula: New score = Z-score(SD of new scale) + mean of the new scale.

• For example, let’s convert a z-score of three to an IQ score

• New score=3(15) + 100 (remember, the mean of IQ scores is 100 and the standard
deviation of IQ scores is 15). Therefore, the new score (i.e., the IQ score converted
from the z-score of 3 using the formula just provided) is equal to 145 (3 times 15
is 45, and when 100 is added you get 145).
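Both formulas can be written as small functions. A Python sketch (an assumed tool; the IQ scale's mean of 100 and SD of 15 are taken from the notes):

```python
# z-score transformation, and conversion of a z-score to a new scale
# (here the IQ scale, with mean 100 and SD 15, as in the notes).
def z_score(raw, mean, sd):
    return (raw - mean) / sd

def rescale(z, new_mean, new_sd):
    return z * new_sd + new_mean

print(z_score(115, 100, 15))  # 1.0 -> one SD above the mean
print(rescale(3, 100, 15))    # 145 -> a z-score of 3 on the IQ scale
```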

Activity 19
Suppose that your marks in Mathematics and English are 65% and 71%
respectively. Which is your better subject in comparison with the rest of the
group, if the group means and SDs are 60 and 5 (Mathematics) and 65 and 7
(English)?

Examining Relationships among Variables

We have been talking about relationships among variables throughout your textbook. For
example, we have already talked about correlation, partial correlation, analysis of
variance which is used for factorial designs and analysis of covariance.
At this point in this chapter on descriptive statistics, we introduce two additional
techniques that you also can use for examining relationships among variables:
contingency tables and regression analysis.

Contingency Tables

When all of your variables are categorical, you can use contingency tables to see if your
variables are related.

• A contingency table is a table displaying information in cells formed by the
intersection of two or more categorical variables.

When interpreting a contingency table, remember to use the following two rules:

• Rule One. If the percentages are calculated down the columns, compare across the
rows.
• Rule Two. If the percentages are calculated across the rows, compare down the
columns.

• When you follow these rules you will be comparing the appropriate rates (a rate is the
percentage of people in a group who have a specific characteristic).

• When you listen to the local and national news, you will often hear the announcers
compare rates.

• The failure of some researchers to follow the two rules just provided has resulted in
misleading statements about how categorical variables are related; so be careful.
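Rule One can be sketched in code. In this Python illustration (the group names and counts are invented, and Python itself is an assumption), percentages are calculated down the columns and then compared across the rows:

```python
# Contingency-table rates: percentage down the columns, then compare
# across the rows (Rule One). The counts below are hypothetical.
table = {  # rows: outcome, columns: group
    "passed": {"Group 1": 30, "Group 2": 10},
    "failed": {"Group 1": 20, "Group 2": 40},
}
col_totals = {col: sum(row[col] for row in table.values())
              for col in ("Group 1", "Group 2")}

rates = {outcome: {col: count / col_totals[col] * 100
                   for col, count in row.items()}
         for outcome, row in table.items()}
print(rates["passed"])  # compare 60.0% (Group 1) with 20.0% (Group 2)
```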

Regression Analysis
Regression analysis is a set of statistical procedures used to explain or predict the values
of a quantitative dependent variable based on the values of one or more independent
variables.

• In simple regression, there is one quantitative dependent variable and one
independent variable.

• In multiple regression, there is one quantitative dependent variable and two or more
independent variables.

The regression equation has several components (e.g., the Y-intercept and the regression
coefficients). Here are the important definitions:

• Regression equation - the equation that defines the regression line.

• Here is a simple regression equation showing the relationship between starting
salary (Y, your dependent variable) and GPA (X, your independent variable):

Ŷ = 9,234.56 + 7,638.85 (X)

• The 9,234.56 is the Y intercept; it is the point at which the regression line crosses the
Y axis (a little below $10,000; specifically, at $9,234.56).

• The 7,638.85 is the simple regression coefficient, which tells you the average amount
of increase in starting salary that occurs when GPA increases by one unit (it is also
the slope or the rise over the run).

• Now, you can plug in a value for X (i.e., GPA) and easily get the predicted
starting salary.

• If you put in a 3.00 for GPA in the above equation and solve it, you will see that the
predicted starting salary is $32,151.11

• Now plug in another number within the range of the data (how about 3.5) and see
what the predicted starting salary is (check your work: it is $35,970.54).
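Both predictions can be checked with a two-line function. A Python sketch of the equation given above (Python is an assumption; the coefficients are those of the notes' example):

```python
# Prediction from the regression equation above:
# predicted salary = 9,234.56 + 7,638.85 * GPA
def predicted_salary(gpa):
    return 9234.56 + 7638.85 * gpa

print(round(predicted_salary(3.0), 2))  # approximately 32151.11
print(round(predicted_salary(3.5), 2))  # approximately 35970.54
```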

• The main difference is that in multiple regression, the regression coefficient is now
called a partial regression coefficient, and this coefficient provides the predicted
change in the dependent variable given a one unit change in the independent
variable, controlling for the other independent variables in the equation. In other
words, you can use multiple regression to control for other variables (i.e., for what
we called in earlier chapters statistical control).
CHAPTER NINE
INFERENTIAL STATISTICS

Introduction

• Inferential statistics is divided into estimation and hypothesis testing, and
estimation is further divided into point and interval estimation.

Inferential statistics is defined as the branch of statistics that is used to make inferences
about the characteristics of a population based on sample data.

• The goal is to go beyond the data at hand and make inferences about population
parameters.

• In order to use inferential statistics, it is assumed that either random selection or
random assignment was carried out (i.e., some form of randomization must be
assumed).

Statisticians use Greek letters to symbolize population parameters (i.e., numerical
characteristics of populations, such as means and correlations) and Roman letters to
symbolize sample statistics (i.e., numerical characteristics of samples, such as means
and correlations).
For example, we use the Greek letter mu (i.e., μ) to symbolize the population mean and
the Roman letter X with a bar over it, X̄ (called "X bar"), to symbolize the sample
mean.

Sampling Distributions
One of the most important concepts in inferential statistics is that of the sampling
distribution. That's because the use of sampling distributions is what allows us to make
"probability" statements in inferential statistics.

• A sampling distribution is defined as "the theoretical probability distribution of the
values of a statistic that results when all possible random samples of a particular
size are drawn from a population." (For simplicity, you can view the idea of "all
possible samples" as taking a million random samples. That is, just view it as taking
a whole lot of samples!)

• One specific type of sampling distribution is called the sampling distribution of the
mean. If you wanted to generate this distribution through the laborious process of
doing it by hand (which you would NOT need to do in practice), you would randomly
select a sample, calculate the mean, randomly select another sample, calculate
the mean, and continue this process until you had calculated the means for all
possible samples. This process will give you a lot of means, and you can construct
a line graph to depict your sampling distribution of the mean.
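The repeated-sampling process just described can be simulated. A Python sketch (the population values are synthetic, generated only for illustration; in practice the computer does this for you):

```python
# Simulating the sampling distribution of the mean: draw many random
# samples, compute each sample mean, and inspect those means.
import random
import statistics

random.seed(7)
# A hypothetical population of 10,000 values (mean about 50, SD about 10).
population = [random.gauss(50, 10) for _ in range(10_000)]
pop_mean = statistics.mean(population)

# Draw 2,000 random samples of size 30 and record each sample mean.
sample_means = [statistics.mean(random.sample(population, 30))
                for _ in range(2_000)]

mean_of_means = statistics.mean(sample_means)    # close to the population mean
standard_error = statistics.stdev(sample_means)  # SD of the sampling distribution

print(round(pop_mean, 1), round(mean_of_means, 1), round(standard_error, 2))
```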

• The sampling distribution of the mean is approximately normally distributed (as long
as your sample size is about 30 or more).

• Also, note that the mean of the sampling distribution of the mean is equal to the
population mean! That tells you that repeated sampling will, over the long run,
produce the correct mean. The spread or variance shows you that sample means
will tend to be somewhat different from the true population mean in most particular
samples.

Although we just described the sampling distribution of the mean, it is important to
remember that a sampling distribution can be obtained for any statistic. For example, you
could also obtain the following sampling distributions:

• Sampling distribution of the percentage (or proportion).

• Sampling distribution of the variance.

• Sampling distribution of the correlation.

• Sampling distribution of the regression coefficient.

• Sampling distribution of the difference between two means.

The standard deviation of a sampling distribution is called the standard error. In other
words, the standard error is just a special kind of standard deviation and you learned what
a standard deviation was in the last chapter.

• The smaller the standard error, the less the amount of variability present in a
sampling distribution.

It is important to understand that researchers do not actually empirically construct
sampling distributions! When conducting research, researchers typically select only one
sample from the population of interest; they do not collect all possible samples.

• The computer programme that a researcher uses (e.g., SPSS or SAS) applies the
appropriate sampling distribution for you.
• The computer programme will look at the type of statistical analysis you select (and
also consider certain additional information that you have provided, such as the
sample size in your study), and then the statistical programme selects the
appropriate sampling distribution.
• (It's kind of like the Greyhound Bus analogy: Leave the driving to us...SPSS will take
care of generating the appropriate sampling distribution for you if you give it the
information it needs.)

So please remember that the idea of sampling distributions (i.e., the idea of probability
distributions obtained from repeated sampling) underlies our ability to make probability
statements in inferential statistics.

Estimation
The key estimation question is "Based on our random sample, what is our estimate of the
population parameter?"

• The basic idea is that you are going to use your sample data to provide information
about the population.

There are actually two types of estimation.

• They can be first understood through the following analogy: Let's say that you take
your car to your local car dealer's service department and you ask the service
manager how much it will cost to repair your car. If the manager says it will cost
you $500 then she is providing a point estimate. If the manager says it will cost
somewhere between $400 and $600 then she is providing an interval estimate.

In other words, a point estimate is a single number, and an interval estimate is a range of
numbers.

• A point estimate is the value of your sample statistic (e.g., your sample mean or
sample correlation), and it is used to estimate the population parameter (e.g., the
population mean or the population correlation).

• For example, if you take a random sample of adults living in the United States
and find that the average income for the people in your sample is $45,000,
then your best guess, or point estimate, for the population of adults in the U.S.
will be $45,000.

In the above example, you used the value of the sample mean as the estimate of the
population mean.

• Again, whenever you engage in point estimation, all you need to do is use the
value of your sample statistic as your "best guess" (i.e., your estimate) of the
(unknown) population parameter. We often like to put an interval around our
point estimate, because sampling error is always present in sampling, so the
actual population value will usually differ somewhat from the point estimate.
• An interval estimate (also called a confidence interval) is a range of numbers inferred
from the sample that has a known probability of capturing the population parameter
over the long run (i.e., over repeated sampling).

• The "beauty" of confidence intervals is that we know their probability (over the long
run) of including the true population parameter (you can't do this with a point
estimate).

• Specifically, if you have the computer provide you with a 95 percent confidence
interval (based on your data), then you will be able to be "95% confident" that it will
include the population parameter. That is, your “level of confidence” is 95%.

• For example, you might take the point estimate of annual income of U.S. adults of
$45,000 (used earlier as a point estimate) and surround it by a 95% confidence
interval. You might find that the confidence interval is $43,000 to $47,000. In this
case, you can be "95% confident" that the average income is somewhere between
$43,000 and $47,000.

• If you have the computer programme give you a 99% confidence interval, then you
can be "99% confident" that the confidence interval provided will include the
population parameter (i.e., it will capture the true parameter 99% of the time in the
long run).

You might ask: So why don’t we just use 99% confidence intervals rather than 95%
intervals, since you will make fewer mistakes?

• The answer is that for a given sample size, the 99% confidence interval will be wider
(i.e., less precise) than a 95% confidence interval. For example, the interval
$40,000 to $50,000 is wider than the interval $43,000 to $47,000.

• 95% confidence intervals are popular with many researchers. However, you may, at
times, want to use other confidence intervals (e.g., 90% confidence intervals or
99% confidence intervals).
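The arithmetic behind a confidence interval for a mean can be sketched in a few lines of Python. The income figures below are made up for illustration, and the t critical value of 2.262 (for 9 degrees of freedom at the 95% level) is taken from a standard t table:

```python
import math
import statistics

# Hypothetical sample of incomes (made-up numbers for illustration only)
sample = [42000, 47000, 45000, 43500, 46500, 44000, 48000, 41000, 45500, 47500]

n = len(sample)
mean = statistics.mean(sample)           # the point estimate
sd = statistics.stdev(sample)            # sample standard deviation (n - 1)
se = sd / math.sqrt(n)                   # standard error of the mean

# With n = 10 the two-sided t critical value (df = 9, 95% level) is about 2.262
t_crit = 2.262
lower = mean - t_crit * se
upper = mean + t_crit * se

print(f"Point estimate: {mean:.0f}")
print(f"95% CI: ({lower:.0f}, {upper:.0f})")
```

Notice that widening the critical value (e.g., to the 99% level) widens the interval, which is exactly the precision trade-off described above.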

Hypothesis Testing

Hypothesis testing is the branch of inferential statistics that is concerned with how well
the sample data support a null hypothesis and when the null hypothesis can be rejected
in favour of the alternative hypothesis.

• First note that the null hypothesis is usually the prediction that there is no relationship
in the population.
• The alternative hypothesis is the logical opposite of the null hypothesis and says
there is a relationship in the population.

• We use hypothesis testing when we expect a relationship to be present; in other
words, we usually hope to "nullify" the null hypothesis and tentatively accept the
alternative hypothesis.
• Here is the key question that is answered in hypothesis testing: "Is the value of my
sample statistic unlikely enough (assuming that the null hypothesis is true) for me
to reject the null hypothesis and tentatively accept the alternative hypothesis?"

• Note that it is the null hypothesis that is directly tested in hypothesis testing (not the
alternative hypothesis).

An Analogy from Jurisprudence

The Kenyan criminal justice system operates on the assumption that the defendant is
innocent until proven guilty beyond a reasonable doubt. In hypothesis testing, this
assumption is called the null hypothesis. That is, researchers assume that the null
hypothesis is true until the evidence suggests that it is not likely to be true. The
researcher's null hypothesis might be that a technique of counseling does not work any
better than no counseling. The researcher is kind of like a prosecuting attorney. The
prosecuting attorney brings someone to trial when he or she believes there is some
evidence against the accused, and the researcher brings a null hypothesis to "trial" when
he or she believes there is some evidence against the null hypothesis (i.e., the researcher
actually believes that the counseling technique does work better than no counseling). In
the courtroom, the jury decides what constitutes reasonable doubt, and they make a
decision about guilt or innocence. The researcher uses inferential statistics to determine
the probability of the evidence under the assumption that the null hypothesis is true. If this
probability is low, the researcher is able to reject the null hypothesis and accept the
alternative hypothesis. If this probability is not low, the researcher is not able to reject the
null hypothesis. No matter what decision is made, things are still not completely settled
because a mistake could have been made. In the courtroom, decisions of guilt or
innocence are sometimes overturned or found to be incorrect. Similarly, in research, the
decision to reject or not reject the null hypothesis is based on probability, so researchers
sometimes make a mistake. However, inferential statistics gives researchers the
probability of their making a mistake.

• Here is the main point: In the Kenyan system of jurisprudence, a defendant is
"presumed innocent" until evidence calls this assumption into question. That is, the
jury is told to assume that a person is innocent until they have heard all of the
evidence and can make a decision. Likewise, in hypothesis testing, the null
hypothesis is assumed to be true (i.e., it is assumed that there is no relationship)
until evidence clearly calls this assumption into question.

• In jurisprudence, the jury rejects the claim of innocence (rejects the null) in the face
of strong evidence to the contrary and makes the opposite conclusion that the
defendant is guilty. Likewise, in hypothesis testing, the researcher rejects the null
hypothesis in the face of strong evidence to the contrary.

• In hypothesis testing, "strong evidence to the contrary" is found in a small probability
value, which says the research result is unlikely if the null hypothesis is true. When
the researcher rejects the null hypothesis (i.e., rejects the assumption of no
relationship), he or she tentatively accepts the alternative hypothesis (which
says there is a relationship in the population).

• In short, in the procedure called hypothesis testing, the researcher states the null
and alternative hypotheses. Then if the probability value is small, the researcher
rejects the null hypothesis and goes with the alternative hypothesis and makes the
claim that statistical significance has been found.

• Be sure to notice that the null hypothesis has the equality sign in it and the
alternative hypothesis has the "not equals" sign in it.

You may be wondering, when do you actually reject the null hypothesis and make the
decision to tentatively accept the alternative hypothesis?

• Earlier we mentioned that you reject the null hypothesis when the probability of your
result assuming a true null is very small. That is, you reject the null when the
evidence would be unlikely under the assumption of the null.

• In particular, you set a significance level (also called the alpha level) to use in your
research study, which is the point at which you would consider a result to be very
unlikely. Then, if your probability value is less than or equal to your significance
level, you reject the null hypothesis.

• It is essential that you understand the difference between the probability value (also
called the p-value) and the significance level (also called the alpha level).
• The probability value is a number that is obtained from the SPSS computer printout.
It is based on your empirical data, and it tells you the probability of your result or a
more extreme result when it is assumed that there is no relationship in the
population (i.e., when you are assuming that the null hypothesis is true which is
what we do in hypothesis testing and in jurisprudence).

• The significance level is just that point at which you would consider a result to be
"rare." You are the one who decides on the significance level to use in your
research study. A significance level is not an empirical result; it is the level that you
set so that you will know what probability value will be small enough for you to
reject the null hypothesis.

• The significance level that is usually used in education is .05.


• It boils down to this: if your probability value is less than or equal to the significance
level (e.g., .05) then you will reject the null hypothesis and tentatively accept the
alternative hypothesis. If not (i.e., if it is > .05) then you will fail to reject the null
hypothesis. You just compare your probability value with your significance level.

• You must memorize the definitions of probability value and significance level right
away because they are at the heart of hypothesis testing. At the most simple level,
the process just boils down to seeing whether your probability value is less than
(or equal to) your significance level. If it is, you are happy because you can reject
the null hypothesis and make the claim of statistical significance (still don’t forget
the last step of determining practical significance).

• The final step after conducting a hypothesis test, you must interpret your results,
make a substantive, real-world decision, and determine the practical significance
of your result.

• Statistical significance does not tell you whether you have practical significance.

• If a finding is statistically significant then you can claim that the evidence suggests
that the observed result (e.g., your observed correlation or your observed
difference between two means) was probably not just due to chance. That is, there
probably is some non-zero relation present in the population.

• An effect size indicator can aid in your determination of practical significance and
should always be examined to help interpret the strength of a statistically
significant relationship. An effect size indicator is defined as a measure of the
strength of a relationship.

• A finding is practically significant when the difference between the means or the size
of the correlation is big enough, in your opinion, to be of practical use. For example,
a correlation of .15 would probably not be practically significant, even if it was
statistically significant. On the other hand, a correlation of .85 would probably be
practically significant.

• Practical significance requires you to make a non-quantitative decision and to think


about many different factors such as the size of the relationship, whether an
intervention would transfer well to the real world, the costs of using a statistically
significant intervention in the real world, etc. It is a decision that YOU make.

The next idea is for you to realize that you will either make a correct decision about
statistical significance or you will make an error whenever you conduct a hypothesis test.

• When the null hypothesis is true you can make the correct decision (i.e., fail to reject
the null) or you can make the incorrect decision (rejecting the true null). The
incorrect decision is called a Type I error or a "false positive" because you have
erroneously concluded that there is an effect or relationship in the population.

• When the null hypothesis is false you can also make the correct decision (i.e.,
rejecting the false null) or you can make the incorrect decision (failure to reject the
false null). The incorrect decision is called a Type II error or a "false negative"
because you have erroneously concluded that there is no effect or relationship in
the population.

• You need to memorize the definitions of Type I and Type II errors, and after working
with many examples of hypothesis testing they will become easier to ponder.
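A short simulation can make the idea of a Type I error concrete: if we repeatedly test a null hypothesis that is actually true, we should reject it (i.e., commit a Type I error) about 5% of the time when the significance level is .05. The sketch below uses simulated data drawn from a population in which the null hypothesis really is true (mean = 0, sd = 1), so every rejection is a false positive:

```python
import random

random.seed(42)

# Simulate many hypothesis tests in which the null hypothesis is TRUE:
# every sample comes from a population with mean 0 and sd 1, so any
# rejection is a Type I error ("false positive").
n, trials = 30, 5000
critical_z = 1.96              # two-sided critical z value for alpha = .05
false_positives = 0

for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    sample_mean = sum(sample) / n
    z = sample_mean / (1 / n ** 0.5)   # z = mean / (sigma / sqrt(n)), sigma = 1
    if abs(z) > critical_z:
        false_positives += 1           # rejected a true null -> Type I error

print(f"Type I error rate: {false_positives / trials:.3f}")  # close to .05
```

The long-run rejection rate hovers near the significance level, which is why the significance level is also called the Type I error rate.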

Hypothesis Testing in Practice

In this last section of the chapter, I apply the process of hypothesis testing (which is also
called "significance testing") to the data set.
• Since we are now using this data set for inferential statistics, we will assume that the
25 people were randomly selected.

• Note that there are three quantitative variables and two categorical variables (can
you list them?).

• Also note that I will use the significance level of .05 for all of my statistical tests below.


• Before we test some hypotheses, we want to point out the reason WHY we use
hypothesis or significance testing: We do it because researchers do not want to
interpret findings that are not statistically significant because these findings are
probably nothing but a reflection of chance fluctuations.

Note that in all of the following examples we will be doing the same thing: we will get
the p-value and compare it to the preset significance level of .05 to see if the
relationship is statistically significant. We will then also interpret the results by
looking at the data, looking at an effect size indicator, and thinking about the
practical importance of the result.

• Again, after practice, significance testing becomes very easy because you do the same
procedure every single time. Determining the practical significance is probably the
hardest part.

t-Test for Independent Samples

One frequently used statistical test is called the t-test for independent samples. We use
it when we want to determine whether the difference between two groups is statistically
significant.
Here is an example of the t-test for independent samples using our recent college
graduate data set:

• Research Question: Is the average starting salary for males significantly different
from the average starting salary for females?

• Here are the hypotheses (note that they are stated in terms of population parameters):

• Null Hypothesis Ho: μM = μF (i.e., the population mean for males equals the population
mean for females)

• Alternative Hypothesis H1: μM ≠ μF (i.e., the population mean for males does not equal
the population mean for females)

The probability value was .048 (we got this off of the SPSS printout).

• Since the probability value of .048 is less than the significance level of .05, I reject
the null hypothesis and accept the alternative.

• I conclude that the difference between the two means is statistically significant.

• Now I would need to look at the actual means and interpret them for substantive and
practical significance.

• The males’ mean is $34,333.33 and the females’ mean is $31,076.92.

• I can simply look at these means and see how different they are.

• To help in judging how different the means are, I also calculated an effect size
indicator called eta-squared which was equal to .16. This tells me that gender
explains 16% of the variance in starting salary in my data set.

• I conclude that males earn more than females, and because this is an important
issue in society, I also conclude that this difference is practically significant.
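The computation behind an independent-samples t-test can be sketched in plain Python. The salary figures below are made up (they are not the chapter's data set), and the critical value 2.228 for 10 degrees of freedom comes from a standard t table:

```python
import statistics

# Hypothetical starting salaries (made-up numbers, not the chapter's data set)
males   = [34000, 36000, 33000, 35000, 33500, 34500]
females = [31000, 30000, 32000, 31500, 30500, 31400]

n1, n2 = len(males), len(females)
m1, m2 = statistics.mean(males), statistics.mean(females)
v1, v2 = statistics.variance(males), statistics.variance(females)

# Pooled-variance t statistic for independent samples
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t = (m1 - m2) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5
df = n1 + n2 - 2

# Eta-squared effect size: proportion of variance explained by group membership
eta_sq = t ** 2 / (t ** 2 + df)

print(f"t({df}) = {t:.2f}, eta-squared = {eta_sq:.2f}")
# Compare |t| with the two-sided critical value for df = 10 at alpha = .05 (about 2.228)
print("statistically significant" if abs(t) > 2.228 else "not significant")
```

In practice, SPSS reports the exact p-value for you; the point here is only to show what the statistic and the eta-squared effect size are made of.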

One-Way Analysis of Variance


One-way analysis of variance is used to compare two or more group means for
statistical significance.
Here is an example using our “recent college graduate” data set:
• Research Question: Is there a statistically significant difference in the starting
salaries of education majors, arts and sciences majors, and business majors?

• Here are the hypotheses (note that they are stated in terms of population parameters):

• Null Hypothesis. Ho: μE = μA&S = μB (i.e., the population means for education students,
arts and sciences students, and business students are all the same)

• Alternative Hypothesis. H1: Not all equal (i.e., the population means are not all the
same)

The probability value was .001 (I got this off of my SPSS printout).

• Since .001 is less than .05, I reject the null hypothesis and accept the alternative. I
conclude that at least two of the means are significantly different.

• The effect size indicator, eta-squared, was equal to .467, which says that almost 47
percent of the variance in starting salary was explained or accounted for by
differences in college major.

• Now we need to find out which of the three means are different.

• In order to decide which of these three means are significantly different, I must follow
the "post hoc testing" procedure explained in the next section. Notice that if I had done
an ANOVA with an independent variable that was composed of only two groups, I
would not need follow-up tests (which are only needed when there are three or
more groups).
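The sums of squares behind a one-way ANOVA can also be sketched directly. The salary numbers below are hypothetical (and chosen for clarity, so the effect size is larger than the .467 reported above); the critical F value 3.89 for 2 and 12 degrees of freedom comes from a standard F table:

```python
import statistics

# Hypothetical starting salaries by major (illustrative numbers only)
groups = {
    "education":         [29000, 30000, 29500, 30500, 28500],
    "arts_and_sciences": [32000, 33000, 31500, 32500, 32000],
    "business":          [36000, 37500, 36500, 38000, 35500],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = statistics.mean(all_scores)

# Between-groups and within-groups sums of squares
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                 for g in groups.values())
ss_within = sum((x - statistics.mean(g)) ** 2
                for g in groups.values() for x in g)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
F = (ss_between / df_between) / (ss_within / df_within)

# Eta-squared: proportion of total variance explained by group membership
eta_sq = ss_between / (ss_between + ss_within)
print(f"F({df_between}, {df_within}) = {F:.2f}, eta-squared = {eta_sq:.3f}")
# Critical F for (2, 12) df at alpha = .05 is about 3.89
print("statistically significant" if F > 3.89 else "not significant")
```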

Post Hoc Tests in Analysis of Variance


Here are the three average starting salaries for the three groups examined in the previous
analysis of variance (i.e., these are the three sample means):

• Education: $29,500

• Arts and Sciences: $32,300

• Business: $36,714.29

The question in post hoc testing is "Which pairs of means are significantly different?"
In this case that results in three post hoc tests that need to be conducted:

1. First, is the difference between education and arts and sciences significantly
different?
• Here are the null and alternative hypotheses for this first post hoc test:
• Null Hypothesis Ho: μE = μA&S (i.e., the population mean for education majors
equals the population mean for arts and sciences majors)

• Alternative Hypothesis H1: μE ≠ μA&S (i.e., the population mean for education
majors does not equal the population mean for arts and sciences majors)

• The Bonferroni "adjusted" p-value, which I got off the SPSS printout, was .233.

• Since .233 is > .05, I fail to reject the null that the population means for
education and arts and sciences are equal.

• In short, this difference was not statistically significant.

2. Second, is the difference between education and business significantly different?

• Here are the null and alternative hypotheses for this second post hoc test:

• Null Hypothesis Ho: μE = μB (i.e., the population mean for education majors
equals the population mean for business majors)

• Alternative Hypothesis H1: μE ≠ μB (i.e., the population mean for education
majors does not equal the population mean for business majors)

• The adjusted p-value was .001.

• Since .001 is < .05, we reject the null hypothesis that the two population
means are equal.

• We make the claim that the difference between the means is statistically
significant.

• We also claim that the salaries are higher for business than for education
students in the populations from which they were randomly selected.

• Because this finding could affect many students’ choices about majors and
because it may also reflect the nature of salary setting by the private versus
public sectors, We also conclude that this difference is practically
significant.

3. Third, is the difference between arts and sciences and business significantly
different?

• Here are the null and alternative hypotheses for this third post hoc test:
• Null Hypothesis Ho: μB = μA&S (i.e., the population mean for business majors
equals the population mean for arts and sciences majors)

• Alternative Hypothesis H1: μB ≠ μA&S (i.e., the population mean for business
majors does not equal the population mean for arts and sciences majors)

• The adjusted p-value was .031.

• Since .031 is < .05, we reject the null hypothesis that the two population
means are equal.

• We make the claim that this difference between the means is statistically
significant.

• We also claim that the salaries are higher for business than for arts and
sciences students in the populations from which they were randomly
selected.

• Because this finding could affect students’ choices about majoring in business
versus arts and sciences, I believe that this finding is practically significant.

In short, based on my post hoc tests, I have found that two of the differences in starting
salary were statistically significant, and, in my view, these differences were also
practically significant.
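Post hoc p-values like the ones above are usually "adjusted" for the number of comparisons. A minimal sketch of the Bonferroni adjustment follows; the raw p-values are hypothetical, chosen so that the adjusted values mirror the ones reported in the example:

```python
# Bonferroni adjustment: with k pairwise comparisons, multiply each raw
# p-value by k (capped at 1.0) before comparing it with the significance level.
# The raw p-values below are made up for illustration.
raw_p = {
    "education vs arts&sciences": 0.0777,
    "education vs business":      0.0003,
    "arts&sciences vs business":  0.0103,
}
k = len(raw_p)
alpha = 0.05

for comparison, p in raw_p.items():
    adjusted = min(p * k, 1.0)
    decision = "reject H0" if adjusted <= alpha else "fail to reject H0"
    print(f"{comparison}: adjusted p = {adjusted:.3f} -> {decision}")
```

The adjustment makes each individual test more conservative so that the overall (family-wise) Type I error rate stays near .05 across all three comparisons.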

The t-Test for Correlation Coefficients


This test is used to determine whether an observed correlation coefficient is statistically
significant.
Here is an example using our “recent college graduate” data set:

• Research Question: Is there a statistically significant correlation between GPA (X)
and starting salary (Y)?

• Here are the hypotheses:

• Null Hypothesis. H0: ρXY = 0 (i.e., there is no correlation in the population)

• Alternative Hypothesis. H1: ρXY ≠ 0 (i.e., there is a correlation in the population)

• The observed correlation in the sample was .63.

• The probability value was .001.

• Since .001 is < .05, I reject the null hypothesis.


• The observed correlation was statistically significant.

• We conclude that GPA and starting salary are correlated in the population.

• If you square the correlation coefficient you obtain a "variance accounted for" effect
size indicator: .63 squared is .397, which means that almost 40 percent of the
variance in starting salary is explained or accounted for by GPA.

• Because the effect size is large and because GPA is something that students can
control through studying, we conclude that this statistically significant correlation is
also practically significant.

The t-Test for Regression Coefficients


This test is used to determine whether a regression coefficient is statistically significant.
The multiple regression equation analyzed in the last chapter is shown here again, but
this time we will test each of the two regression coefficients for statistical significance.

Ŷ = 3,890.05 + 4,675.41(X1) + 26.13(X2)

where:
Ŷ is the predicted starting salary
3,890.05 is the Y intercept (the predicted starting salary when GPA and GRE Verbal are both zero)
4,675.41 is the regression coefficient for grade point average
26.13 is the regression coefficient for GRE Verbal
X1 is grade point average (GPA)
X2 is GRE Verbal
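The regression equation above can be turned directly into a small prediction function; the example graduate's GPA and GRE Verbal score are hypothetical:

```python
# Predicted starting salary from the chapter's regression equation:
# Y_hat = 3,890.05 + 4,675.41 * GPA + 26.13 * GRE_Verbal
def predict_salary(gpa, gre_verbal):
    return 3890.05 + 4675.41 * gpa + 26.13 * gre_verbal

# A hypothetical graduate with a 3.5 GPA and a GRE Verbal score of 550
print(f"Predicted starting salary: {predict_salary(3.5, 550):.2f}")
```

Each coefficient gives the predicted change in starting salary for a one-unit increase in that predictor, holding the other predictor constant, which is what the significance tests below evaluate.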
Research Question One: Is there a statistically significant relationship between starting
salary (Y) and GPA (X1) controlling for GRE Verbal (X2)? That is, is the first regression
coefficient statistically significant?

• Here are the hypotheses:

• Null Hypothesis. H0: βYX1.X2 = 0 (i.e., the population regression coefficient expressing
the relationship between starting salary and GPA, controlling for GRE Verbal is
equal to zero; that is, there is no relationship)

• Alternative Hypothesis. H1 : βYX1.X2 ≠ 0 (i.e., the population regression coefficient


expressing the relationship between starting salary and GPA, controlling for GRE
Verbal is NOT equal to zero; that is, there IS a relationship)

• The observed regression coefficient was 4,675.41.

• The probability value was .035

• Since .035 is < .05, I conclude that the relationship expressed by this regression
coefficient is statistically significant.
• A good measure of effect size for regression coefficients is the squared semi-partial
correlation (sr²). In this case it is equal to .10, which means that 10% of
the variance in starting salary is uniquely explained by GPA.

• Because GPA is something we can control and because it explains a good
amount of variance in starting salary, we conclude that the relationship expressed
by this regression coefficient is practically significant.

Research Question Two: Is there a statistically significant relationship between starting
salary (Y) and GRE Verbal (X2), controlling for GPA (X1)? That is, is the second
regression coefficient statistically significant?

• Here are the hypotheses:

• Null Hypothesis. H0: βYX2.X1 = 0 (i.e., the population regression coefficient expressing
the relationship between starting salary and GRE Verbal, controlling for GPA is
equal to zero; that is, there is no relationship)
• Alternative Hypothesis. H1 : βYX2.X1 ≠ 0 (i.e., the population regression coefficient
expressing the relationship between starting salary and GRE Verbal, controlling
for GPA is NOT equal to zero; that is, there IS a relationship)

• The observed regression coefficient was 26.13.

• The probability value was .014

• Since .014 is < .05, we conclude that the relationship expressed by this regression
coefficient is statistically significant.

• A good measure of effect size for regression coefficients is the squared semi-partial
correlation (sr²). In this case it is equal to .15, which means that 15% of
the variance in starting salary is uniquely explained by GRE Verbal.

• Because GRE Verbal is also something we can work at (as well as take preparation
programmes for) and because it explains 15% of the variance in starting
salary, we conclude that the relationship expressed by this regression coefficient
is practically significant.

The Chi-Square Test for Contingency Tables

This test is used to determine whether a relationship observed in a contingency table is
statistically significant.

• Research Question: Is the observed relationship between college major and gender
statistically significant?

• The probability value was .046. Since .046 is < .05, we reject the null hypothesis and
conclude that the relationship between college major and gender is statistically
significant.


• The effect size indicator used for this contingency table is Cramer’s V. It was equal
to .496, which tells us that the relationship is moderately large.

• Because the effect size indicator suggested a moderately large relationship and
because of the importance of these variables in the real world, we would also
conclude that this relationship is practically significant.
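A sketch of the chi-square statistic and Cramér's V for a small, made-up gender-by-major table (the counts below are hypothetical, not the chapter's data):

```python
# Chi-square test of independence for a hypothetical gender x major table.
# Rows: male, female; columns: education, arts & sciences, business.
observed = [
    [2, 4, 6],
    [6, 4, 3],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - expected) ** 2 / expected

# Cramer's V effect size for an r x c table: sqrt(chi^2 / (N * (k - 1))),
# where k is the smaller of the number of rows and columns
k = min(len(observed), len(observed[0]))
cramers_v = (chi_sq / (grand_total * (k - 1))) ** 0.5

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square({df}) = {chi_sq:.2f}, Cramer's V = {cramers_v:.3f}")
```

Cramér's V ranges from 0 (no relationship) to 1 (perfect relationship), which is what lets us describe a value like .496 as "moderately large."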
CHAPTER TEN
WRITING THE RESEARCH REPORT

Introduction

The purpose of this final chapter is to provide useful advice on how to organize and write
a research paper that has the potential for publication.
There are four main sections in this chapter:
1. General Principles Related to Writing the Research Report.
2. Writing Quantitative Research Reports Using the APA Style.
3. Writing Qualitative Research Reports.
4. Writing Mixed Research Reports.

General Principles Related to Writing the Research Report


We begin this section with some general writing tips and by listing some sources on
writing.

• Simple, clear, and direct communication should be your most important goal when
you write a research report.

Language The following three guidelines will help you select appropriate language in
your report:
1. Choose accurate and clear words that are free from bias. One way to do this is to be
very specific rather than less specific.
2. Avoid labelling people whenever possible.
3. Write about your research participants in a way that acknowledges their participation.

• For example, avoid the impersonal term "subject" or "subjects"; terms such as
"research participants," "children," or "adults" are preferable.

Keeping in mind the above guidelines, you should give special attention to the following
issues which are explained more fully in our chapter and, especially, in the APA
Publication Manual:

• Gender. The bottom line is to avoid sexist language.

• Sexual Orientation. Terms such as homosexual should be replaced with terms such
as lesbians, gay men, and bisexual women or men. Specific instances of sexual
behavior should be referred to with terms such as same gender, male-male,
female-female, and male-female.

• Racial and Ethnic Identity. Ask participants about their preferred designations and
use them. When writing this term, capitalize it (e.g., African American).
• Disabilities. Do not equate people with their disability. For example, refer to a
participant as a person who has cancer rather than as a cancer victim.

• Age. Acceptable terms are boy and girl, young man and young woman, male
adolescent and female adolescent. Older person is preferred to elderly. Call people
eighteen and older men and women.

Editorial Style
Italics.

• As a general rule, use italics infrequently. If you are submitting a paper for
publication, you can now use italics directly rather than using underlines to signal
what is to be italicized.

Abbreviations

• Use abbreviations sparingly, and try to use conventional abbreviations (such as IQ,
e.g., cf., i.e., etc.).

Headings

• The APA Manual and our chapter specify five different levels of headings and the
combinations in which they are to be used in your report.

• If you are using two levels of headings, centre the first level and use upper and lower
case letters (i.e., do not use all caps), and place the second heading on the left
side in upper and lower case letters and in italics. Here is an example:

Method
Procedure

• If you are using three levels of headings, do the first two levels as just shown for two
levels. The third level should be in upper and lowercase letters, italicized, indented,
and ending with a period.

• Here is an example of how to use three levels of headings:

Method
Procedure
Instruments. (Start the text on this same line)
Quotations

• Quotations of fewer than 40 words should be inserted into the text and enclosed in
double quotation marks. Quotations of 40 or more words should be displayed in a
free standing block of lines without quotation marks. The author, year, and specific
page from which the quote is taken should always be included.
Numbers

• Use words for numbers that begin a sentence and for numbers that are below ten.

• See the APA Publication Manual for exceptions to this rule.

Physical Measurements
• APA recommends using metric units for all physical measurements. You can also
use other units, as long as you include the metric equivalent in parentheses.

Activity 29
What are some of the commonly used metric units?

Presentation of Statistical Results

• Provide enough information to allow the reader to corroborate the results. See your
book and the APA manual for specifics (e.g., an analysis of variance significance
test of four group means would be presented like this: F(3, 32) = 8.79, p = .03).

• Note that the use of an equal sign is preferred when reporting probability values.

• If a probability value is less than .001, then use p < .001 rather than p = .000
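A tiny helper function can illustrate the two reporting rules above (report an exact value with an equal sign, but use "p < .001" for very small values); the function itself is an illustration, not part of APA style:

```python
# Format a p-value the way the guidelines above describe:
# exact value with an equal sign, "p < .001" for very small values.
def format_p(p):
    if p < 0.001:
        return "p < .001"
    # APA style drops the leading zero for values that cannot exceed 1
    return f"p = {p:.3f}".replace("0.", ".")

print(format_p(0.03))      # p = .030
print(format_p(0.0002))    # p < .001
```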

Reference Citations in the Text

• APA format is an author-date citation method. The text shows the specifics.

• Here is one example: "Smith (1999) found that . . ."

• Frequently you will put references at the end of sentences. Here is an example,
“Mastery motivation has been found to affect achievement with very young children
(Turner & Johnson, 2003).”

Reference List

• All citations in the text must appear in the reference list.


• Here are two examples:

American Psychological Association (1994). Publication manual of the American


Psychological Association (4th ed.). Washington, DC: Author.
Turner, L.A., & Johnson, R. B. (2003). A model of mastery motivation for at-risk
preschoolers. Journal of Educational Psychology, 95(3), 495-505.
Typing

• Double space all material.


• Use 1-inch margins.

• Use only one space between the end of a sentence and the beginning of the next
sentence.

Writing Quantitative Research Reports Using the APA Style


There are seven major parts to the research report:
1. Title page
2. Abstract
3. Introduction
4. Method
5. Results
6. Discussion
7. References
We will make a few brief comments on each of these below.

• Discussion of author notes, footnotes, tables, figure captions, and figures is only in
the textbook (and, of course, in the APA Publication Manual).

1. Title Page

• Your paper title should summarize the main topic of the paper in about 10 to 12
words.

2. Abstract

• This should be a comprehensive summary of about 120 words. For a
manuscript submitted for review, it is typed on a separate page.

3. Introduction

• This section is not labelled. It should present the research problem and place it in
the context of other research literature in the area.

4. Method

• This section does not start on a separate page in a manuscript being submitted for
review.

• The most common subsections are Participants (e.g., list the number of participants,
their characteristics, and how they were selected), Apparatus or Materials or
Instruments (e.g., list materials used and how they can be obtained), and
Procedure (e.g., provide a step-by-step account of what the researcher and
participants did during the study so that someone could replicate it).

5. Results

• This does not start on a separate page in your manuscript.

• It is where you report on the results of your data analysis and statistical significance
testing.
• Be sure to report the significance level that you are using (e.g., "An alpha level of .05
was used in this study") and report your observed effect sizes along with the tests
of statistical significance.

• Tables and figures are expensive but can be used when they effectively illustrate
your ideas.

6. Discussion

• This is where you interpret and evaluate your results presented in the previous
section.

• Be sure to state whether your hypotheses were supported.

• Also, answer the following questions:

1. What does the study contribute?
2. How has it helped solve the study problem?
3. What conclusions and theoretical implications can be drawn from the study?
4. What are the limitations of the study?
5. What are some suggestions for future research in this area?

7. References

• Centre the word References at the top of the page and double-space all entries.

Writing Qualitative Research Reports


We recommend that qualitative researchers also follow the guidelines given above when
writing manuscripts for publication, using the same seven major parts that were
discussed for the quantitative research report.

• Title Page and Abstract. The goals are exactly the same as before. You should
provide a clear and descriptive title. The abstract should describe the key focus of
the study, its key methodological features, and the most important findings.

• Introduction. Clearly explain the purpose of your study and situate it in any research
literature that is relevant to your study. In qualitative research, research questions
will typically be stated in more open-ended and general forms such as the
researcher hopes to "discover," "explore a process," "explain or understand," or
"describe the experiences."

• Method. It is important that qualitative researchers always include this section in their
reports. This section includes information telling how the study was done, where it
was done, with whom it was done, why the study was designed as it was, how the
data were collected and analyzed, and what procedures were carried out to ensure
the validity of the arguments and conclusions made in the report.

• Results. The overriding concern when writing the results section is to provide
sufficient and convincing evidence. Remember that assertions must be backed up
with empirical data. The bottom line is this: It's about evidence.
i. You will need to find an appropriate balance between description and
interpretation in order to write a useful and convincing results section.
ii. Several specific strategies are discussed in the chapter (e.g., providing quotes,
following interpretative statements with examples, etc.).
iii. Regardless of the specific format of your results section, you must always
provide data (i.e., descriptions, quotes, data from multiple sources, and so
forth) that back up your assertions.
iv. Effective ways to organize the results section are organizing the content around
the research questions, a typology created in the study, the key themes, or
around a conceptual scheme used in the study.
v. It can also be very helpful to use diagrams, matrices, tables, figures, etc. to
help communicate your ideas in a qualitative research report.

• Discussion. You should state your overall conclusions and offer additional
interpretations in this section of the report. Even if your research is exploratory, it
is important to fit your findings back into the relevant research literature. You may
also make suggestions for future research here.
Writing Mixed Research Reports

• First, know your audience and write in a manner that clearly communicates.

• The suggestions already discussed in this chapter for quantitative and qualitative
also apply for mixed research.

• In general, try to use the same seven headings discussed above.

• Here are a few organization options:

1. Organize the introduction, method, and results by research question.

2. Organize some sections (e.g., method and results) by research paradigm
(quantitative and qualitative).

3. Write essentially two separate subreports (one for the qualitative part and
one for the quantitative part).

4. NOTE: In all cases, if you are writing a mixed research report, mixing must
take place somewhere (e.g., at a minimum, the findings must be related and
“mixed” in the discussion section).
