Practical Research 2 Q2 Lesson
Once the necessary data have been collected, the next step is to organize the raw data
for analysis. The researcher must check the data for accuracy, consistency, and completeness,
and arrange them systematically to facilitate coding and tabulation.
Every research methodology requires a data analysis plan. The plan specifies the
statistical measures to use to address the research questions. The appropriate methods of
data analysis are determined by the type of data, the variables to be used, the number of
cases, and the distribution of the variables.
Purpose of a Data Analysis Plan
The purpose of a data analysis plan is to obtain useful information to answer the research
questions of interest. It may be used to:
Describe data sets;
Determine the degree of relationship of variables;
Determine differences between variables;
Predict outcomes; and
Compare variables.
All of the above can be accomplished by using any one or a combination of the following data
analysis strategies:
Exploratory Data Analysis
This type of data analysis is used when it is not clear what to expect from the data. This strategy
uses numerical and visual presentations such as graphs. Since the research of interest is new,
the data may contain irregularities such as missing values, unusual distributions, unusually
small or large values, or invalid data.
Descriptive Data Analysis
This type of data analysis is used to describe, show, or summarize data in a meaningful way,
leading to a simple interpretation of data. Descriptive data analyses do not allow you to
formulate conclusions beyond the data that you described. The commonly used descriptive
statistics are those that analyze the distribution of data such as frequency, percentage,
measures of central tendency and measures of dispersion.
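The descriptive statistics named above can be sketched in a few lines of Python. This is only an illustration: the quiz scores are hypothetical, and the variable names are chosen for this example.

```python
from collections import Counter
import statistics

# Hypothetical quiz scores for ten students (illustrative data only).
scores = [7, 8, 8, 9, 10, 8, 7, 9, 8, 6]
n = len(scores)

# Frequency and percentage of each distinct score.
frequency = Counter(scores)
percentage = {score: 100 * count / n for score, count in frequency.items()}

# Measures of central tendency.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Measures of dispersion.
score_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)

for score in sorted(frequency):
    print(f"score {score}: f = {frequency[score]}, {percentage[score]:.0f}%")
print("mean:", mean_score, "median:", median_score, "mode:", mode_score)
print("range:", score_range, "sample SD:", round(sample_sd, 2))
```

Each printed figure summarizes only these ten scores; none of them supports a conclusion about students outside the data set.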
Inferential Data Analysis
Inferential statistics test hypotheses about a set of data to reach conclusions or make
generalizations beyond merely describing the data. Inferential statistics include tests of
significance of difference, such as the t-test and analysis of variance (ANOVA), and tests of
relationship, such as the Pearson product-moment correlation coefficient (Pearson r),
Spearman rho, linear regression, and the chi-square test.
Quantitative Analysis in Evaluation
Determining the level of measurement of the quantitative data is important before proceeding
with the analysis. The choice of statistical measures depends on the level of measurement
of the data. The following are the levels of measurement:
• Nominal Scale
• Ordinal Scale
• Interval Scale
• Ratio Scale
Nominal Scale
A nominal scale of measurement is used for labelling variables. Data measured on it are
sometimes called categorical data. Basketball players wear sports shirts with numbers, but the
numbers are just a way to identify the players. Likewise, if you want to categorize respondents
based on gender, you could use 1 for male and 2 for female. No order or distance is implied.
A Yes or No response is another example of nominal data. The numbers assigned to the
categories have no quantitative value. Some examples of variables measured on a nominal
scale are gender, religious affiliation, race, or ethnic group.
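A short Python sketch can make the point about nominal codes concrete. The responses below are hypothetical; the coding (1 for male, 2 for female) follows the example in the text.

```python
from collections import Counter

# Hypothetical survey responses coded as in the text: 1 = male, 2 = female.
gender_codes = [1, 2, 2, 1, 2, 2, 1, 2]
labels = {1: "male", 2: "female"}

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(labels[code] for code in gender_codes)
print(counts)

# The arithmetic mean of the codes can be computed, but the result does not
# correspond to any category and has no interpretation.
meaningless_average = sum(gender_codes) / len(gender_codes)
print(meaningless_average)
```

The frequency count is a valid summary, while the average of the codes is not, because the codes are only labels.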
Ordinal Scale
An ordinal scale of measurement assigns order to items according to the characteristic being
measured. It involves the ranking of individuals, attitudes, and characteristics. The order in the
honor roll (first honor, second honor, third honor), order of agreement (strongly agree, agree,
strongly disagree), and economic status (low, average, high) are some examples.
Ranks such as first, second, third, and so on are assigned, but the numbers have no meaning
beyond establishing the ranking among a set of data. You can talk about ordering, but the size
of the difference between ranks is not specified.
Interval Scale
The interval scale has equal units of measurement, thereby making it possible to interpret both
the order of the scale scores and the distance between them. However, interval scales do not
have a “true zero”.
With interval data, addition and subtraction are possible, but multiplication and division are
not meaningful.
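Temperature in degrees Celsius is a common example of an interval scale, although it is not one given in this lesson. The sketch below, using hypothetical readings, suggests why differences are meaningful on such a scale while ratios are not.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Hypothetical readings on an interval scale (degrees Celsius).
morning_c, noon_c = 10, 20

# Differences are meaningful: noon is 10 degrees Celsius warmer than morning.
difference_c = noon_c - morning_c

# Ratios are not: the ratio changes when the same two temperatures are
# expressed in Fahrenheit, because neither zero means "no temperature".
ratio_c = noon_c / morning_c
ratio_f = celsius_to_fahrenheit(noon_c) / celsius_to_fahrenheit(morning_c)

print(difference_c)
print(ratio_c)
print(ratio_f)
```

If 20 degrees were truly "twice as hot" as 10 degrees, the ratio would not depend on the unit used.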
Ratio Scale
The ratio scale is considered the highest level of measurement. It has the characteristics of an
interval scale, but it also has a true zero point. Because of this property, all statistical
operations can be performed on ratio scales. All descriptive and inferential statistics may be
applied. Values can be added, subtracted, multiplied, and divided.
1.1 Mean
Often called the arithmetic average of a set of data, the mean is the sum of the
observed values in the distribution divided by the number of observations. It is
frequently used for interval or ratio data. The symbol x̄ (X bar) is used to denote the
arithmetic mean.
\[ \bar{x} = \frac{\text{sum of observations}}{\text{number of observations}} \]
The formula is:
\[ \bar{x} = \frac{\sum x}{n} \]
The following examples show the calculation of the mean for ungrouped data, that
is, a list of data that is not organized in any way.
Example 1:
Find the mean of the measurements
18, 26, 27, 29, 30
Solution:
Substitute the measurements into the formula.
\[ \bar{x} = \frac{\sum x}{n} = \frac{18 + 26 + 27 + 29 + 30}{5} = \frac{130}{5} = 26 \]
Note that the value falls near the middle of the data set.
Answer: x̄ = 26
Suppose the third measurement was 17 rather than 27. The mean would then be
120/5 = 24. Thus, the mean changes when any one of the values in the set of
observations is changed.
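The calculation in Example 1, and the effect of changing one value, can be checked with a short Python sketch. The function name arithmetic_mean is chosen for this illustration.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)

measurements = [18, 26, 27, 29, 30]
print(arithmetic_mean(measurements))   # 26.0

# Replace the third measurement (27) with 17, as described in the text.
changed = [18, 26, 17, 29, 30]
print(arithmetic_mean(changed))        # 24.0
```

Changing a single observation by 10 shifts the mean by 10/5 = 2, which is why the mean is sensitive to every value in the set.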
\[ \bar{x} = \frac{\sum x}{n} = \frac{1960}{20} = 98 \]
You can use the mean when the numbers you have can be added or when
characteristics are measured on a numerical scale like those used to describe
height, weight, or score on a test.
1.2 Weighted Mean
The formula is:
\[ \bar{x}_w = \frac{\sum fx}{n} \]
where:
f = frequency
x = numerical value or item in a set of data
n = number of observations in the data set
Example 1:
Find the mean of the heights of 50 SHS students, summarized as follows:
Solution:
Using the above data, the weighted mean is equal to the sum of the column 𝑓𝑥, divided by the
total number of observations
\[ \text{Weighted mean } \bar{x}_w = \frac{\sum fx}{n} = \frac{2905}{50} = 58.1 \text{ inches} \]
When the data is grouped into classes, the class midpoint represents the “x” in the formula.
Example 2:
Solve for the mean of the data below.
\[ \bar{x}_w = \frac{\sum fx}{n} = \frac{2965}{50} = 59.3 \]
Answer: 59.3
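The weighted mean formula can be sketched in Python as follows. The frequency tables below are hypothetical and are not the tables used in the two examples; the function name weighted_mean and the class limits are likewise assumptions made for illustration.

```python
def weighted_mean(values, frequencies):
    """Return sum(f * x) / n, where n is the total frequency."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    n = sum(frequencies)
    return total_fx / n

# Hypothetical frequency table (not the table from the examples).
heights = [56, 58, 60, 62]      # x, in inches
counts = [10, 20, 15, 5]        # f, number of students

print(weighted_mean(heights, counts))

# For grouped data, x is the class midpoint: (lower limit + upper limit) / 2.
classes = [(50, 54), (55, 59), (60, 64)]   # hypothetical class limits
class_counts = [5, 10, 5]
midpoints = [(low + high) / 2 for low, high in classes]

print(midpoints)
print(weighted_mean(midpoints, class_counts))
```

In the first table the column fx sums to 2930 over n = 50 observations; in the grouped table the midpoints 52, 57, and 62 stand in for x.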
1.3 Median
The median is the midpoint of the distribution. It represents the point in the data
where 50% of the values fall below that point and 50% fall above it. When the
distribution has an even number of observations, the median is the average of the
two middle scores. The median is the most appropriate measure of central tendency
for ordinal data.
Example 1:
Consider this set with an odd number of numerical values:
7,8,8,9,10,12,23
By inspection, the median is 9 because half of the values (7, 8, 8) are below 9 and half
(10, 12, 23) are above 9. Since n = 7 is odd, the median has rank
\( \frac{n+1}{2} = \frac{7+1}{2} = 4 \), that is, it is the 4th item, and is equal to 9.
Answer: The Median is 9
Example 2:
Consider this set with an even number of numerical values:
12,15,18,22,30,32
The two middle values are 18 and 22. Taking the average of the two middle values, that is,
18 + 22 = 40 divided by 2, the median is 20.
Answer: The Median is 20
Example 3:
Find the median for the set of measurements.
15,20,12,26,3,30,14
Solution:
We first rank the measurements from the smallest to the largest: 3, 12, 14, 15, 20, 26, 30.
Since the number of cases is odd, the median has rank
\( \frac{n+1}{2} = \frac{7+1}{2} = 4 \), that is, it is the 4th item, and is equal to 15.
Answer: The Median is 15
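The ranking procedure used in the three examples can be sketched as a small Python function. The function name median is chosen for this illustration, and the sketch assumes a non-empty list of numbers.

```python
def median(values):
    """Return the middle value of the ranked data.

    For an even number of values, average the two middle values.
    """
    ranked = sorted(values)
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        # Rank (n + 1) / 2 in 1-based counting is index n // 2 in 0-based counting.
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2

print(median([7, 8, 8, 9, 10, 12, 23]))       # 9
print(median([12, 15, 18, 22, 30, 32]))       # 20.0
print(median([15, 20, 12, 26, 3, 30, 14]))    # 15
```

Sorting first matters: in Example 3 the unranked list has 26 in the fourth position, but the median is the fourth item of the ranked list, which is 15.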
1.4 Mode
The mode is the most frequently occurring value in a set of observations. When two
values share the highest frequency, the distribution is bimodal; when more than two
values share it, the distribution is multimodal. When every value occurs the same
number of times, there is no mode. The mode is appropriate for nominal data.
Example 1:
The ages of eleven (11) persons assembled in a room are as follows:
16, 19, 18, 18, 25, 25, 25, 30, 34, 36, and 38
Answer: Mode = 25
Example 2:
The number of hours spent by 10 students in an internet café was as follows:
2, 2, 2, 3, 3, 4, 4, 4, 5, 5
Solution:
Both 2 and 4 have a frequency of 3. The data is therefore bimodal.
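Finding the mode amounts to counting how often each value occurs. The sketch below applies that idea to the values listed in the two examples; the function name modes is chosen for this illustration, and the sketch assumes a non-empty list.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes, or an empty list when every
    distinct value occurs equally often (no mode)."""
    counts = Counter(values)
    highest = max(counts.values())
    most_frequent = sorted(v for v, c in counts.items() if c == highest)
    # If every distinct value shares the highest frequency, there is no mode.
    if len(most_frequent) == len(counts) and len(counts) > 1:
        return []
    return most_frequent

ages = [16, 19, 18, 18, 25, 25, 25, 30, 34, 36, 38]
hours = [2, 2, 2, 3, 3, 4, 4, 4, 5, 5]

print(modes(ages))        # [25]
print(modes(hours))       # [2, 4]
print(modes([1, 2, 3]))   # []
```

A result with one value is a single mode, two values indicate a bimodal distribution, and an empty list corresponds to the "no mode" case described in the text.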