CTED Lecture On Statistical Tools
This lesson is kept simple because you already learned the material in high school.
However, it should not be skipped, because these tools are the foundation of many
descriptive and inferential statistics. In other words, this is just a review.
For every group of scores, some values stand out as the most typical: they are the
scores that are commonly obtained or around which the others cluster. This is called
central tendency, which serves as an index of how X-values cluster toward a central
value. There are three measures of central tendency: the mean, the median and the mode.
I. Objectives:
At the end of the lesson, the student is expected to:
A. Differentiate the characteristics of mean, median and mode.
B. Determine the mean, median and mode in a given set of data correctly.
II. Topics:
A. Mean
B. Median
C. Mode
III. Presentation
A. Mean.
Example: 84 + 79 + 82 + 93 + 76 = 414 (Total)
414 ÷ 5 = 82.8 (Mean)
B. Median.
Illustration:
Odd number of scores: 2 3 4 7 9
Median = 4
Even number of scores: 16 18 19 20 22 23 25 26
20 + 22 = 42
42 ÷ 2 = 21 (Median)
C. Mode. This is the most frequently occurring score in a set of scores. An easy way
to determine the mode is to arrange the scores in either ascending or descending order.
Example: 44 37 37
32 45 30
37 42 45
33 36 39
Arranged: 30 32 33 36 37 37 37 39 42 44 45 45
Mode = 37 (the number repeated many times)
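The three examples above can be checked with a short Python sketch. This is only an illustration using the standard `statistics` module; the variable names are made up for this example and are not part of the lesson.

```python
from statistics import mean, median, mode

# Scores taken from the examples in this lesson.
scores_for_mean = [84, 79, 82, 93, 76]
odd_set = [2, 3, 4, 7, 9]
even_set = [16, 18, 19, 20, 22, 23, 25, 26]
scores_for_mode = [44, 37, 37, 32, 45, 30, 37, 42, 45, 33, 36, 39]

mean_value = mean(scores_for_mean)   # total 414 divided by 5 scores
median_odd = median(odd_set)         # middle score of an odd-sized set
median_even = median(even_set)       # average of the two middle scores
mode_value = mode(scores_for_mode)   # most frequently occurring score

print("Mean:", mean_value)
print("Median (odd set):", median_odd)
print("Median (even set):", median_even)
print("Mode:", mode_value)
```

Note that `median` sorts the scores internally, so the list does not have to be arranged beforehand.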
D. Weighted Mean. The weighted mean is computed from weighted values based on the
Likert point scale. The points are expressed as weights with arbitrary descriptions
that depend on the situation or condition where the scale is applied. The scale could
be progressive or regressive when it is one-directional, or integeric when descriptions
I. Objectives
In this lesson, the student is expected to
A. Calculate values using the Likert-point scale.
B. Make simple interpretations about values obtained in calculations.
II. Topics
A. The Likert-point Scale
B. Calculation of Weighted Mean
III. Presentation
This is still a part of the lesson on measures of central tendency. The description
of a Likert-point scale is already given in the introductory paragraph. In a sense, it is an
arbitrary way of giving a value or a number to a certain description so that the value or
values can be calculated.
Illustrations:
Progressive/Regressive          Integeric
Weight   Description            Weight   Description
5        Excellent              5        Strongly Agree
4        Very Good              4        Agree
3        Good                   3        Uncertain
2        Fair                   2        Disagree
1        Poor                   1        Strongly Disagree
A Likert point scale does not always have 5 weights. It can have 3 or any number
up to 10 (or more). To facilitate interpretation, each weight (point) or range of values
should have an equivalent description. Here is an example of how to use a Likert point
scale in determining the weighted mean:
School “D” released the following report in 2011 regarding programs they have
conducted among their students
(27×5) + (51×4) + (19×3) + (13×2) + (10×1) = ∑fx; ∑fx ÷ N = Wm
Procedure of Calculation:
$Wm = \frac{\sum fx}{N}$
where: Wm is the weighted mean value
       ∑ is the summation sign
       f is the frequency
       x is the weight
       N is the number of cases
       ∑fx is the sum of the partial products
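As a rough check of the calculation above, here is a minimal Python sketch of the weighted mean. It assumes that N is the sum of the five frequencies (27 + 51 + 19 + 13 + 10), since the report table itself is not reproduced here; the function name `weighted_mean` is hypothetical.

```python
def weighted_mean(frequencies, weights):
    """Return (sum of fx, N, Wm) for paired frequencies and Likert weights."""
    sum_fx = sum(f * x for f, x in zip(frequencies, weights))
    n = sum(frequencies)  # assumption: N is the total of the frequencies
    return sum_fx, n, sum_fx / n

# Frequencies and weights from the School "D" example.
frequencies = [27, 51, 19, 13, 10]
weights = [5, 4, 3, 2, 1]

sum_fx, n_cases, wm = weighted_mean(frequencies, weights)
print("Sum of fx:", sum_fx)
print("N:", n_cases)
print("Wm:", round(wm, 2))
```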
Some simple analyses that could be derived from the values are:
Note:
To interpret the Wm values, you have to make a scale. In this example, subtract the
lowest value (1) from the highest value (5), so 5 - 1 = 4. Divide 4 by the highest
value (5); the resulting value, 0.8, is used as the interval. Thus:
Weight/Interval    Interpretation
4.21-5.00          Always
3.41-4.20          Often
2.61-3.40          Sometimes
1.81-2.60          Rarely
1.00-1.80          Never
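The scale in this note can be sketched in Python as follows. This is only an illustration of the rule described above (interval = (highest - lowest) ÷ highest); the function names `build_scale` and `interpret` are hypothetical, and the labels are the five interpretations used in this example.

```python
def build_scale(lowest, highest, labels):
    """Pair each label (lowest band first) with the upper bound of its band."""
    interval = (highest - lowest) / highest  # rule from the note: 4 / 5 = 0.8
    bounds = [round(lowest + interval * (k + 1), 2) for k in range(len(labels))]
    return list(zip(bounds, labels))

def interpret(wm, scale, lowest=1.0):
    """Return the description of the band that contains the weighted mean."""
    if wm < lowest:
        raise ValueError("weighted mean is below the scale")
    for upper, label in scale:
        if wm <= upper:
            return label
    raise ValueError("weighted mean is above the scale")

labels = ["Never", "Rarely", "Sometimes", "Often", "Always"]
scale = build_scale(1, 5, labels)

print(scale)
print(interpret(3.6, scale))
```

A weighted mean of 3.6, for instance, falls in the 3.41-4.20 band.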
Score
Practicum Paper
7
Measures of Central Tendency
45 48 49 52 53 57 57 57 57 59 61 62 63 68 72 75 76 77 79 80
Mode: 57
Multimodal
D. The ages of teachers with osteoporosis are 43; 32; 36; 41; 30; 45; 47; 29
Score
18
Practicum Paper
Scholastic Problem                   5     4     3     2     1     ∑fx     Wm     Rank   Interpretation
Developing Readers                   39    62    111   50    38    914     3.05   2      Moderate
Severely Wasted Nutritional Status   6     13    41    180   60    625     2.08   1      Low
Low Performance Level                46    95    89    42    28    989     3.30   3      Moderate
Poor Attendance                      122   71    68    29    10    1,166   3.89   4      High
Low Student-Book Ratio               86    143   51    12    8     1,187   3.96   5      High
Overall Mean                                                               3.23          Moderate
Fill up the rows and columns with the correct data or information.
Why? Because this is the lowest scholastic problem in District C. Books are among
the most important resources for students. They are materials that help in
learning.
5. Write a general statement about the data obtained.
Generally, the five scholastic problems in District C are rated "Moderate".
The spread of values around the central tendency is called dispersion or “variability”.
Measures of dispersion serve as an index of the spread of x-values away from the central
value. There are four common measures of dispersion and variation: the range, the
variance, the standard deviation and the coefficient of variation.
I. Objectives:
At the end of the lesson, the student is expected to:
A. Define the different measures of dispersion.
B. Use the rule in calculating measures of dispersion correctly.
II. Topics:
A. Range
B. Variance
C. Standard deviation
D. Coefficient of variation
III. Presentation:
A. Range. This is the difference between the highest and the lowest values.
27 Range
Example: $sd = \sqrt{\frac{\sum (dx)^2}{N-1}} = \sqrt{\frac{120}{12-1}} = \sqrt{\frac{120}{11}} = \sqrt{10.91} = 3.30$
Example: $CV = \frac{sd}{M} \times 100 = \frac{3.30}{8} \times 100 = 0.4125 \times 100 = 41.25\%$
Score
Practicum Paper
Measures of Dispersion and Variation 10
Examine very well the examples at the left column then do the activities in the right
column.
Left column (example)                Right column (activity)
     Xi    dx    (dx)²                        Xi    dx    (dx)²
F    11    -1    1                   Tato     16    -1    1
G    13    1     1                   Ismael   16    -1    1
H    13    1     1                   Hubert   16    -1    1
I    13    1     1                   Aisha    19    2     4
J    15    3     9                   Ashley   21    4     16
K    16    4     16                  Herbert  24    7     49
L    17    5     25                  Gil      26    9     81
N = 12  ∑Xi = 144  ∑dx = 0  ∑(dx)² = 96      N = 12  ∑Xi = 204  ∑(dx)² = 232
M = 12                                       M = 17
B. Show the Range using Number Line system. Look at the model above.
12 12 13 14 15 16 16 16 19 21 24 26
Range = 14
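Left column (example):
$V = \frac{\sum (dx)^2}{N-1} = \frac{96}{12-1} = \frac{96}{11} = 8.73$
$s.d. = \sqrt{\frac{\sum (dx)^2}{N-1}} = \sqrt{\frac{96}{12-1}} = \sqrt{8.73} = 2.95$
Right column (activity):
$V = \frac{\sum (dx)^2}{N-1} = \frac{232}{12-1} = \frac{232}{11} = 21.09$
$s.d. = \sqrt{\frac{\sum (dx)^2}{N-1}} = \sqrt{\frac{232}{12-1}} = \sqrt{21.09} = 4.59$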
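The right-column results can be reproduced with a short Python sketch. It uses the twelve values listed in Part B and the N - 1 rule shown above; the variable names are illustrative only, and the coefficient of variation line simply applies CV = sd ÷ M × 100 to the same data.

```python
from math import sqrt

# The twelve values from the number line in Part B.
values = [12, 12, 13, 14, 15, 16, 16, 16, 19, 21, 24, 26]

n = len(values)
m = sum(values) / n                          # mean (M)
value_range = max(values) - min(values)      # highest minus lowest
sum_dx2 = sum((x - m) ** 2 for x in values)  # sum of squared deviations
variance = sum_dx2 / (n - 1)                 # V = sum of (dx)^2 / (N - 1)
sd = sqrt(variance)                          # standard deviation
cv = sd / m * 100                            # coefficient of variation in %

print("M:", m, "Range:", value_range)
print("V:", round(variance, 2), "sd:", round(sd, 2), "CV %:", round(cv, 2))
```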
D. Measure of dispersion using Coefficient of Variation
This lesson and the succeeding ones tackle inferential statistics. It is in this
type of statistics that inferences are made about whether relationships, associations
or correlations exist in the samples used. Testing the null hypothesis is involved here.
The last set of lessons in the outline of topics in the Course Outline given to you may
already be integrated here. This is not only to save time but also to make the learning
of the unit more interesting and meaningful. (Electronic calculation will be demonstrated
to you when we meet.)
I. Topics:
Measures of Correlation
1. Spearman’s Coefficient of Correlation
2. Kendall’s tau
3. Pearson’s Coefficient of Correlation
4. Chi-Square
5. Kendall Coefficient of Concordance W
(There are other tools used to determine correlation/association between 2 or
more variables, but these are the most common ones).
II. Objectives:
III. Presentation:
Formula: $r = 1 - \frac{6(\sum D^2)}{N(N^2-1)}$
where: 1 and 6 are constants. Subtracting from 1 means that the value of r cannot be more than 1.
Thus, the table below is a guide to interpret the coefficient of correlation.
The coefficient shows the strength (or weakness) of the association between two
variables. Types of correlation could be no correlation, perfect positive correlation and
perfect negative correlation.
Example:
Diseases commonly diagnosed in the Nutrition Department of “E” Hospital
Males Females
Diseases Frequency Rank Frequency Rank
Hypertension 9 2 12 3
Renal failure 78 10 54 8
Tuberculosis 32 6 42 7
Anorexia 11 3 17 4
Colon cancer 53 8 68 9
Anemia 25 5 97 10
Cardio-vascular (heart) 64 9 34 5
Diabetes 5 1 40 6
Obesity 41 7 11 2
Marasmus 22 4 8 1
1. Take the ranks of both groups. However, the ranks of the first group should be
arranged from lowest to highest.
2. The ranks of the second group will be paired with the ranks of the first group.
3. Determine the differences of the ranks horizontally and write them in column
called “D”.
4. Square the differences and write them in the column called “D²”.
5. Get the total value of the D². (∑D²)
6. Calculate the value of r using the rule.
7. Look at the chart for the meaning of the value.
Illustration:
Males   Females   D     D²
1       6         -5    25
2       3         -1    1
3       4         -1    1
4       1         3     9
5       10        -5    25
6       7         -1    1
7       2         5     25
8       9         -1    1
9       5         4     16
10      8         2     4
                  ∑D² = 108
$r = 1 - \frac{6(\sum D^2)}{n(n^2-1)} = 1 - \frac{6(108)}{10(10^2-1)} = 1 - \frac{648}{990} = 1 - 0.65 = 0.35$
low correlation (look at the table)
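The seven steps above can be sketched in Python as shown below. This is a minimal illustration, not a full implementation: the helper `ranks_ascending` is hypothetical and assumes there are no tied frequencies, which holds for this example.

```python
def ranks_ascending(values):
    """Rank values from lowest (rank 1) to highest; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(rank_x, rank_y):
    """Return (sum of D squared, r) using r = 1 - 6*sum(D^2) / (n(n^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return sum_d2, 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Frequencies from the "E" Hospital table, in the order the diseases are listed.
males = [9, 78, 32, 11, 53, 25, 64, 5, 41, 22]
females = [12, 54, 42, 17, 68, 97, 34, 40, 11, 8]

sum_d2, rho = spearman_rho(ranks_ascending(males), ranks_ascending(females))
print("Sum of D squared:", sum_d2)
print("r:", round(rho, 2))
```

Sorting the male ranks first, as in step 1, is not needed in code because each pair of ranks stays together.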
To determine the significance of correlation, the t-value is used. The formula is:
$t = r\sqrt{\frac{n-2}{1-r^2}}$
where: r is the computed r-value
$t = 0.35\sqrt{\frac{10-2}{1-0.35^2}} = 1.06$ (if you are using a scientific calculator)
$t = 0.35\sqrt{\frac{8}{0.88}} = 0.35(3.01)$ (if you are using an ordinary calculator)
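The t-value formula can be checked with the small Python sketch below. The function name `t_value` is hypothetical. The critical value 2.306 (two-tailed, 0.05 level, df = n - 2 = 8) is taken from a standard t table and is an assumption here, since the handout does not list it.

```python
from math import sqrt

def t_value(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)) for a correlation r from n pairs."""
    return r * sqrt((n - 2) / (1 - r ** 2))

t_spearman = t_value(0.35, 10)

# Assumed tabular value: two-tailed t at the 0.05 level with df = 8.
CRITICAL_T_DF8 = 2.306

print("t-value:", round(t_spearman, 2))
print("Significant at 0.05:", abs(t_spearman) > CRITICAL_T_DF8)
```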
To make a decision about the null hypothesis (Ho), compare the computed
value against the tabular value (or critical value) at the chosen level of
significance, such as 0.05 or 0.01.
In this example, the computed t-value is lower than the tabular value.
Therefore, the decision is to accept the Ho, and the Ho is taken as the conclusion:
there is no significant correlation between the diseases commonly
diagnosed among the males and females in the Nutrition Department of “E”
Hospital.
Kendall’s tau (named after its author, Maurice Kendall) works much like Spearman’s r.
The difference lies in the value of N: Kendall’s tau is used when N is less than 10.
The formula is:
$T = \frac{C-D}{C+D}$
where: C = concordant pairs (or higher numbers)
       D = discordant pairs (or lower numbers)
Example:
Non-teaching
Assignments of Male Female
Classroom Teachers Frequency Rank Frequency Rank
Librarian 34 3 61 4
Scout Leader/Master 67 4 27 3
Nutrition Coordinator 3 1 5 1
Guidance Teacher 8 2 9 2
Procedure:
1. Arrange the ranks of the first variable from lowest to highest.
2. Pair the ranks of the second variable side by side with the first variable.
3. Count the number of concordant and discordant pairs.
4. Collect the values in the C and D columns separately.
5. Determine the value of T
6. Calculate the significance of correlation using t-value.
Illustration:
Male Female C D
1 1 3 0
2 2 2 0
3 4 0 1
4 3
Total 5 1
$T = \frac{C-D}{C+D} = \frac{5-1}{5+1} = \frac{4}{6} = 0.67$ (moderate correlation)
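The counting of concordant and discordant pairs can be sketched in Python as follows. This is a minimal illustration of the procedure above; the function name `kendall_tau` is hypothetical and ties are not handled.

```python
def kendall_tau(rank_x, rank_y):
    """Return (C, D, T) where T = (C - D) / (C + D); assumes no tied ranks."""
    # Step 1 and 2: order the pairs by the first variable's rank.
    second = [y for _, y in sorted(zip(rank_x, rank_y))]
    concordant = discordant = 0
    # Step 3: for each rank, count higher and lower ranks that come after it.
    for i in range(len(second)):
        for j in range(i + 1, len(second)):
            if second[j] > second[i]:
                concordant += 1
            elif second[j] < second[i]:
                discordant += 1
    tau = (concordant - discordant) / (concordant + discordant)
    return concordant, discordant, tau

# Ranks from the table of non-teaching assignments, in the order listed.
male_ranks = [3, 4, 1, 2]
female_ranks = [4, 3, 1, 2]

c, d, tau = kendall_tau(male_ranks, female_ranks)
print("C:", c, "D:", d, "T:", round(tau, 2))
```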
To test the null hypothesis, use the t-value as illustrated in Spearman’s rho.
Pearson’s coefficient of correlation is also known as Pearson’s r. It is a kind of
correlation where two (2) values from each respondent are correlated. It takes into
account each and every score and produces a coefficient between -1.00 and +1.00. There
are two methods of determining Pearson’s r: the deviation method and the difference
method. (You learned this already when we discussed the reliability and validity of
instruments.)
For convenience, let us take the deviation method. The formula used is:
$r_{xy} = \frac{\sum dxdy}{(N-1)(sdx)(sdy)}$
Example: 10 students whose major is English had the following scores in Prelim and Midterm
in Literature. Find the degree of correlation between their scores in prelim (x) with their scores in
midterm (y).
Student   x     y     dx    dy    (dx)²   (dy)²   dxdy
A         12    8     6     1     36      1       6
B         10    12    4     5     16      25      20
C         8     6     2     -1    4       1       -2
D         7     4     1     -3    1       9       -3
E         6     9     0     2     0       4       0
F         5     10    -1    3     1       9       -3
G         5     7     -1    0     1       0       0
H         4     6     -2    -1    4       1       2
I         2     5     -4    -2    16      4       8
J         1     3     -5    -4    25      16      20
Mx = 6   My = 7   ∑(dx)² = 104   ∑(dy)² = 70   ∑dxdy = 48
1. $sdx = \sqrt{\frac{\sum (dx)^2}{N-1}} = \sqrt{\frac{104}{10-1}} = 3.40$
2. $sdy = \sqrt{\frac{\sum (dy)^2}{N-1}} = \sqrt{\frac{70}{10-1}} = 2.79$
3. $r_{xy} = \frac{\sum dxdy}{(N-1)(sdx)(sdy)}$
4. $r_{xy} = \frac{48}{(9)(3.40)(2.79)} = 0.56$ (moderate correlation)
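The four steps of the deviation method can be followed in the Python sketch below. It is only an illustration using the ten pairs of scores in the example; the function name `pearson_r_deviation` is hypothetical.

```python
from math import sqrt

def pearson_r_deviation(x, y):
    """Return (sdx, sdy, sum of dxdy, r) using the deviation method."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [xi - mx for xi in x]            # deviations from Mx
    dy = [yi - my for yi in y]            # deviations from My
    sdx = sqrt(sum(d * d for d in dx) / (n - 1))
    sdy = sqrt(sum(d * d for d in dy) / (n - 1))
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    r = sum_dxdy / ((n - 1) * sdx * sdy)
    return sdx, sdy, sum_dxdy, r

# Scores of students A to J (x = first test, y = second test).
x_scores = [12, 10, 8, 7, 6, 5, 5, 4, 2, 1]
y_scores = [8, 12, 6, 4, 9, 10, 7, 6, 5, 3]

sdx, sdy, sum_dxdy, r_xy = pearson_r_deviation(x_scores, y_scores)
print("sdx:", round(sdx, 2), "sdy:", round(sdy, 2))
print("Sum of dxdy:", sum_dxdy, "r:", round(r_xy, 2))
```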
Ho. The correlation between the scores of the Clinical Nutrition students in
Quiz 1 and Quiz 2 in Biostatistics is not significant.
Ha. The correlation between the scores of the Clinical Nutrition students in
Quiz 1 and Quiz 2 in Biostatistics is significant.
t-value: $t = r\sqrt{\frac{n-2}{1-r^2}} = 0.56\sqrt{\frac{10-2}{1-0.56^2}} = 0.56\sqrt{\frac{8}{0.69}} = 0.56\sqrt{11.59}$
Since the computed t-value is less than the tabular t-value at the 0.05 level of
significance, the Ho is accepted. Thus, the correlation between the scores of the
Clinical Nutrition students in Quiz 1 and Quiz 2 in Biostatistics is not significant.
Conclusion
The evidence does not warrant that the relationship between variables does not exist. The
result showed that in 95 of 100 trials the variables are not related or are independent,
and in only 5 of 100 trials does a relationship exist among the variables (Subong).
Score
Instruction: Calculate the degree of correlation using the deviation method between the
marks of the students in English and Mathematics Test also the null hypothesis (Ho)
shown below the data.
Given:
K 5 5 -4 -4 16 16 16
L 4 4 -5 -5 25 25 25
N = 12 ∑=108 ∑=108 ∑(dx)² ∑(dy)² ∑dxdy
M=9 M=9
sdx =
sdy =
r xy =
Meaning:_____________________________________
t-value =
Null Hypothesis (Ho): The correlation coefficient between the marks of the students in
the English and Math is not statistically significant.
Analysis:
____________________________________________________________________.
________________________________________________________________
_____.
Decision for Ho:
_____________________________________________________________.
Conclusion:
_________________________________________________________________.
_____________________________________________________________
____.
_____________________________________________________________
____.
_____________________________________________________________
____.
_____________________________________________________________
____.
The Chi-Square test is used to measure the association between two nominal variables.
It is only applicable to relationships between variables with nominal data. There are
two types of Chi-Square test. These are:
The formula is: $X^2 = \sum \frac{(fo-fe)^2}{fe}$
A class with 54 students in TVL focused on Bread and Pastry was surveyed as to
their willingness to observe different laboratory procedures done in Everlasting Bakery in
the city. The result showed that:
Computed $X^2 = \sum \frac{(fo-fe)^2}{fe} = 16.78$
Degree of freedom (df) = k - 1 = 3 - 1 = 2
Ho. The responses of the students on their willingness to observe different laboratory
procedures in a bakery do not vary.
Ha. The responses of the students on their willingness to observe laboratory procedures
in a bakery vary.
Since the computed Chi-Square value of 16.78 is greater than the tabular
value of 5.99 at the 0.05 level of significance, the Ho that the students’ responses on
their willingness to observe different laboratory procedures in Everlasting Bakery do
not vary is rejected. The alternative hypothesis (Ha) that the students’ responses differ
is accepted. This shows that the students’ willingness to observe the laboratory
procedures in the bakery varies. In other words, the students’ responses were not
uniform, so it is not a wise decision to go on with the activity.
There were 50 Student Teachers who were on their off-campus or internship year.
Determine whether or not their attitudes in their training are associated with gender.
Data:
Gender
Attitudes Male Female Total
Very Good 9 5 14
Good 12 7 19
Fair 8 9 17
Total 29 21 50
Analysis: The computed X² value is less than the tabular X² value.
In other words, the attitudes of the students have nothing to do with their gender. The
two are independent.
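The analysis of the Student Teachers table can be checked with the Python sketch below. It applies the formula X² = ∑(fo - fe)²/fe with expected frequencies computed as (row total × column total) ÷ grand total, which is the usual rule for a contingency table and is an assumption here because the handout does not show this step. The function name is hypothetical; the tabular value 5.991 (df = 2, 0.05 level) is the one quoted elsewhere in this handout.

```python
def chi_square_independence(table):
    """Return (X squared, df) for a table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (fo - fe) ** 2 / fe
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: Very Good, Good, Fair. Columns: Male, Female.
observed = [[9, 5], [12, 7], [8, 9]]

chi2_value, df = chi_square_independence(observed)
TABULAR_X2 = 5.991  # df = 2 at the 0.05 level

print("X squared:", round(chi2_value, 2), "df:", df)
print("Reject Ho:", chi2_value > TABULAR_X2)
```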
This is a Chi-Square test when the degree of freedom is equal to 1, (df=1) and the
contingency table has a cell frequency of less than 5. The formula is:
Gender
Male Female Total
Performance
Very Good 9 4 13
Good 3 7 10
Total 12 11 23
Decision: Ho is accepted.
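For comparison, the exact probability for the 2 x 2 table above can be computed with the Python sketch below. Note the assumption: this uses the standard Fisher exact probability (the hypergeometric formula with fixed row and column totals), which may not be the same formula as the one intended in this handout. The function names are hypothetical.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of one 2 x 2 table when all marginal totals are fixed."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

def fisher_exact_two_sided(a, b, c, d):
    """Return (probability of observed table, two-sided p-value)."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    p_total = 0.0
    # Go through every table that keeps the same marginal totals.
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(k, row1 - k, col1 - k, row2 - (col1 - k))
        if p <= p_observed * (1 + 1e-9):  # as likely as, or less likely than, observed
            p_total += p
    return p_observed, p_total

# Very Good: 9 male, 4 female. Good: 3 male, 7 female.
p_observed, p_two_sided = fisher_exact_two_sided(9, 4, 3, 7)
print("P(observed table):", round(p_observed, 4))
print("Two-sided p-value:", round(p_two_sided, 4))
```

Under this assumption the two-sided p-value is above 0.05, which agrees with the decision stated above.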
The symbol used for this tool is W, which is why it is called Kendall’s W. It is a
non-parametric test used to determine the relationship among 3 or more variables using
ranks. Hence, the data are ordinal. Once the degree of correlation is obtained, its
significance is tested with the Chi-Square test. The formula is:
                                             Age Group
                                             Young Adults   Middle Adults   Late Adults
Perceptions                                  f     Rank     f     Rank      f     Rank
A. Being a heavy smoker.                     11    1        15    2         25    3
B. Poor nutrition.                           22    2.5      38    4         42    4
C. Untreated respiratory tract infection     35    4        2     1         7     1
D. Exposure to a person with tuberculosis.   22    2.5      25    3         16    2
N
Groups / Perceptions   A      B        C      D
Young Adults           1      2.5      4      2.5
Middle Adults          2      4        1      3
Late Adults            3      4        1      2
∑Tj                    6      10.5     6      7.5
∑Tj²                   36     110.25   36     56.25    Total = 238.5
Null Hypothesis (Ho): The concordance of the ranks of the perceptions of the
male adults on the causes of tuberculosis is not significant.
To Calculate:
Low correlation
To test the Ho, use the formula: X² = m(N-1)W
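The calculation for the age-group example can be sketched in Python as follows. Because the formula for W is not reproduced above, this sketch assumes the common form W = 12S ÷ (m²(N³ - N)), where S is the sum of squared deviations of the rank totals from their mean, m is the number of groups and N is the number of items ranked. No correction for tied ranks is applied, and the function name is hypothetical.

```python
def kendall_w(rank_matrix):
    """Return (rank totals, W, X squared) for m rows of ranks over N items."""
    m = len(rank_matrix)        # number of groups doing the ranking
    n = len(rank_matrix[0])     # number of items being ranked
    totals = [sum(col) for col in zip(*rank_matrix)]
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)
    w = 12 * s / (m ** 2 * (n ** 3 - n))  # assumed formula, no tie correction
    chi2 = m * (n - 1) * w                # X squared = m(N - 1)W
    return totals, w, chi2

# Rows: Young, Middle and Late Adults. Columns: perceptions A to D.
ranks = [
    [1, 2.5, 4, 2.5],
    [2, 4, 1, 3],
    [3, 4, 1, 2],
]

totals, w, chi2_w = kendall_w(ranks)
print("Rank totals:", totals)
print("W:", round(w, 2), "X squared:", round(chi2_w, 2))
```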
Score
Practicum Paper Fisher’s Exact Test
Name: _______________________________________________________ 10
Part I. Calculate the value of X² using Fisher’s Exact Test then give the analysis and
interpretation of the result of computation.
Ho. There is no relationship between socio-economic level and general health status of
males beyond 40 years old.
SES
Health Status High Low Total
Good 15 4 19
Poor 5 3 8
Total 20 7 27
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________
X²=____________
_
Analysis:_________________________________________________________.
Decision: _______________________________________________________.
Conclusion:___________________________________________________.
____________________________________________________.
Part II. Make a 2 x 2 contingency table; put the values in the corresponding cell then
determine the value of the X² and make an analysis and interpretation of results obtained.
Ho.
__________________________________________________________________
__________________________________________________________________
____________.
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
____________________________________
Analysis: _______________________________________________________________.
Conclusion: ____________________________________________________________.
_______________________________________________________________.
Score
Practicum Paper
Kendall Coefficient of Concordance (W) 25
Name: ___________________________________________________
Part I. Complete the statistical processes needed to determine the value of W then test the
null hypothesis (Ho).
Determine the weighted mean values of the following using the weights and
interpretations as indicated below:
B. Teenagers (N=40)
Beliefs on Food Habits                                                     1     2     3     ∑fx   Wm   Rank   Interpretation
1. Drinking coffee doesn’t make you sleep.                                 1     10    29
2. Eating plenty of rice makes you fat.                                    12    12    16
3. Frequent drinking of carbonated drinks is one cause of hyperacidity.    21    11    8
4. If you want to lose weight, do not eat anything after 6:00 pm.          35    2     3
5. Fair skin is the effect of eating fresh fruits and vegetables.          7     26    7
Overall Weighted Mean=
C. Young Adults (N=45)
b. Teenagers? _____________________________________________
c. Young Adults?__________________________________________
3. Which among the groups has the highest level of belief based on the overall weighted
mean value? _________________________________.
4. Which among the groups has the lowest level of belief based on the overall
weighted mean value? _____________________________
Part III. Put the data in a matrix and determine the coefficient of correlation using
Kendall Coefficient of Concordance W.
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
__________________________________________________________________
______________________________________________________.
Ho. The correlation of the ranks given by the three groups of respondents on their belief on food
habits is not significant.
Analysis: ___________________________________________________________________________.
Conclusion: ________________________________________________________________.
Score
Practicum Paper Chi-Square
Name ___________________________________________________
5
Calculate the following and supply the necessary information.
There were 120 sample pupils diagnosed with reading difficulties. Determine whether
this problem is associated with their socio-economic status.
Ho. The reading proficiency of the 120 sample pupils is not associated with their
socio-economic status.
SES
Reading High Average Low Total
Proficiency
Approaching 9 17 32 58
Beginning 7 40 15 62
Total 16 57 47 120
df= 2 Critical value: 0.05 = 5.991
X² =___________
Analysis: ______________________________________________________________.
Conclusion: ___________________________________________________________.
___________________________________________________________.
___________________________________________________________.
Score
Practicum Paper
Kendall’s Tau
Name: _________________________________________________
10
Instruction: Find the degree of correlation between weight gain of male and female Grade
4 pupils with severely wasted nutritional status during the first 6 months of nutrition
therapy.
0.75 5 11
1.00 9 12
1.25 13 7
1.50 12 8
1.75 11 10
Total
Ho. There is no significant correlation between weight gain of male and female Grade 4
pupils with severely wasted nutritional status during the first 6 months of nutrition
therapy.
Analysis: ______________________________________________________________.
Conclusion: __________________________________________________________.
___________________________________________________________.
Score
Practicum Paper Spearman’s rho
10
Name: ______________________________________________________
Instruction: Calculate the degree of correlation of the given situation below using
Spearman’s rho then test the significance of correlation.
Heart diseases 54 85
Tuberculosis 16 51
Renal failure 27 18
Anorexia 32 59
Anemia 79 35
Scurvy 52 27
Diabetes 48 63
Colon cancer 94 41
Hepa-A 12 9
Hepa-B 80 72
∑D²=
Meaning: ________________________
t-value =
Analysis: ______________________________________________________________.
Conclusion: ___________________________________________________________.
___________________________________________________________.
___________________________________________________________.