Statistics 201
Introduction to Statistics
Module 1
Introduction
Examples:
Measures of central tendency
Variability
Skewness
Kurtosis
Examples:
Sampling/sampling distribution
Estimation
Testing hypotheses using z-test, t-test, chi-square, F-test,
ANOVA
Levels of Measurement
o Nominal scale
Numbers or symbols are used only to categorize objects, persons, or
characteristics into groups. When numerical values or symbols serve
solely to identify the group to which an object, person, or
characteristic belongs, these values constitute nominal
measurements.
o Ordinal scale
An improvement on the nominal scale. Data are ranked
from bottom to top or from low to high.
o Interval scale
Possesses the properties of the nominal and ordinal
scales. The distance between any two numbers on the
scale is known, but the scale has no true zero point.
o Ratio scale
Possesses all the properties of the nominal, ordinal,
and interval scales. In addition, it has an absolute zero
point. Data can be classified, placed in proper order, and
compared as ratios.
Types of data
Classification of Variables
Collection of Data
When you want to know whether several boxes of light bulbs are free
from defects, it would be time-consuming to examine all of them piece by
piece.
Instead, you can examine a few samples from each box. This
process is called sampling, and the defined set from which the sample is
drawn is called the population.
Sampling
Activity Number 1
TOPIC: POPULATION AND SAMPLE
Name:
___________________________________________________________________________
1. 10000 .05
2. 20000 .05
3. 30000 .05
4. 5000 .05
5. 10000 95%
6. 20000 95%
7. 30000 95%
8. 5000 95%
9. 10000 99%
Module 2
Sampling Techniques
20 - 85 14 - 88 16 - 85 13 - 88
29 - 92 17 - 85 21 - 92 15 - 90
38 - 92 22 - 85 23 - 92 18 - 86
40 - 85 26 - 95 25 - 85 19 - 87
43 - 88 33 - 90 30 - 88 24 - 87
44 - 93 34 - 85 31 - 93 27 - 88
36 - 89 28 - 91
37 - 89 32 - 88
45 - 86 35 - 89
39 - 89
41 - 90
42 - 89
Illustrative examples
A. With the given data in the table on page I, consider a sample size of
15. How can you randomly select the 15 from the population?
Simple Random sampling is a procedure where a sample is selected in such a way that
every element is as likely to be selected as any other element from the population.
\[ n = \frac{N}{1 + N e^{2}} \]
Where:
e = margin of error (usually from .01 to .05, consistent with the level of
significance used in testing hypotheses)
N = population size
n = sample size
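The sample-size formula above can be sketched in a few lines of Python. The function name `slovin_sample_size` is hypothetical, and rounding the result up to the next whole unit is an assumption (the text does not state a rounding rule).

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Return n = N / (1 + N * e^2), rounded up (rounding rule is an assumption)."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)


# Population sizes taken from Activity Number 1, all at e = .05
for N in (5000, 10000, 20000, 30000):
    print(N, slovin_sample_size(N, 0.05))
```

For N = 10000 and e = .05, the denominator is 1 + 10000(.0025) = 26, so n = 10000/26, or about 385 after rounding up.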
No.  Group  Score     No.  Group  Score     No.  Group  Score     No.  Group  Score
 1     3     88       12     4     90       23     3     91       34     2     85
 2     4     85       13     4     88       24     4     87       35     4     89
 3     1     90       14     2     88       25     3     88       36     3     89
 4     1     85       15     4     90       26     2     95       37     3     89
 5     3     86       16     3     85       27     4     88       38     1     92
 6     2     90       17     2     85       28     4     91       39     4     89
 7     3     88       18     4     86       29     1     92       40     1     85
 8     2     92       19     4     87       30     3     87       41     4     90
 9     4     90       20     1     85       31     3     92       42     4     89
10     1     92       21     3     85       32     4     88       43     1     88
11     2     86       22     2     85       33     2     90       44     1     93
                                                                  45     3     86
C. With the data in B, consider again a sample size of 15. How can you
obtain a random sample after classifying the data into groups?
Stratified random sampling is used when the population can naturally be
classified into groups or strata.
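One way to draw the stratified sample of 15 is to allocate the sample to each group in proportion to its size and then sample randomly within each group. The sketch below assumes proportional allocation (the text does not name an allocation rule) and assumes the group memberships read from the data in B, where the second digit of each entry is taken as the group number. The names `proportional_allocation` and `stratified_sample` are hypothetical.

```python
import random

# Member numbers per group, as read from the data in B (an assumption about
# how the garbled table is laid out).
strata = {
    1: [3, 4, 10, 20, 29, 38, 40, 43, 44],
    2: [6, 8, 11, 14, 17, 22, 26, 33, 34],
    3: [1, 5, 7, 16, 21, 23, 25, 30, 31, 36, 37, 45],
    4: [2, 9, 12, 13, 15, 18, 19, 24, 27, 28, 32, 35, 39, 41, 42],
}


def proportional_allocation(strata, sample_size):
    """Give each stratum a share of the sample proportional to its size."""
    total = sum(len(members) for members in strata.values())
    return {g: round(sample_size * len(m) / total) for g, m in strata.items()}


def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample (without replacement) inside each stratum."""
    rng = random.Random(seed)
    return {g: sorted(rng.sample(m, allocation[g])) for g, m in strata.items()}


allocation = proportional_allocation(strata, 15)
sample = stratified_sample(strata, allocation, seed=1)
print(allocation)
print(sample)
```

With 45 members and a sample of 15, each group contributes one third of its members. In general the rounded shares may not add up exactly to the sample size and would need a small adjustment.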
Activity:
What is the average age of all the teachers? Find the average age of a
sample size of 15 using:
Frequency distribution
28 26 21 15 20 16
32 15 18 19 16 14
25 14 22 21 13 9
12 9 18 15 12 10
9 11 12 9 10 11
6 6 7 8 6 8
7 6 8 8 3 4
3 5 5 0 2 1
N=
1. Determine the range. It is the difference between the highest score (HS)
and the lowest score (LS) in the list of data.
Range = HS - LS
= 32 - 0 = 32
2. Decide on the number of class intervals. This step does not entail any
computation. All you have to do is decide on the desired number of steps
or classes. In our example, 11 classes are used.
3. Determine the size of the class intervals by dividing the range by the
desired number of class intervals and then rounding the result up. The
class size, denoted by i, is the number of integer values included in each
class. The usual class sizes are 2, 3, 5, or 10.
i = 32/11
= 2.91, or 3
4. Determine the lower and upper limits of the first class interval. It
should include the smallest value in the list of data. Class marks are
the midpoints of the classes.
5. Determine the lower and upper class limits of the succeeding class
intervals by adding the size of the class interval to the lower and
upper limits of the preceding class interval until the highest class
interval is obtained.
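The five steps can be sketched in Python as follows. The score list is the example data above, read as eight rows of six values (an assumption about the original layout), and the helper name `frequency_table` is hypothetical. The first class is assumed to start at the lowest score.

```python
import math

scores = [
    28, 26, 21, 15, 20, 16,
    32, 15, 18, 19, 16, 14,
    25, 14, 22, 21, 13, 9,
    12, 9, 18, 15, 12, 10,
    9, 11, 12, 9, 10, 11,
    6, 6, 7, 8, 6, 8,
    7, 6, 8, 8, 3, 4,
    3, 5, 5, 0, 2, 1,
]


def frequency_table(data, num_classes):
    """Return rows of (lower limit, upper limit, class mark, frequency)."""
    low, high = min(data), max(data)
    size = math.ceil((high - low) / num_classes)  # steps 1 to 3
    rows = []
    lower = low                                   # step 4: first class holds the minimum
    while lower <= high:                          # step 5: add the class size repeatedly
        upper = lower + size - 1
        count = sum(lower <= x <= upper for x in data)
        rows.append((lower, upper, (lower + upper) / 2, count))
        lower += size
    return rows


table = frequency_table(scores, 11)
for lower, upper, mark, f in table:
    print(f"{lower:>2}-{upper:<2}  class mark = {mark:>4}  f = {f}")
print("N =", sum(row[3] for row in table))
```

The rows come out from the lowest class upward; the tables in this module list the highest class first, so reverse the list if that layout is wanted.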
Activity Number 2
TOPIC: Frequency Distribution Table
Name:
___________________________________________________________________________
52 61 91 41 40 48 22 55 63 34
88 55 62 58 98 51 30 73 57 49
95 40 85 66 87 27 65 48 96 45
45 36 75 71 85 20 92 50 50 57
72 90 77 65 70 33 61 81 72 70
Activity Number 3
TOPIC: Frequency Distribution Table
Name:
___________________________________________________________________________
34 36 21 28 20 40 22 30
21 30 38 32 24 27 25 33
25 27 30 29 31 28 27 23
33 28 18 33 25 23 27 28
27 15 27 19 36 31 20 31
The Mean
\[ \bar{x} = \frac{\sum x}{n} \]
Example:
Find the mean of the following set of scores:
88 85 87 83 89 84
Solution:
\[ \bar{x} = \frac{88 + 85 + 87 + 83 + 89 + 84}{6} = \frac{516}{6} = 86 \]
\[ \bar{x} = \frac{\sum fx}{N} \]
Where:
\bar{x} = mean
f = frequency in the class interval
x = midpoint of the class interval
N = total number of observations
The application of the above formula is illustrated below. Find the mean of
the following frequency:
Class Interval    f     x     fx
75-79             4     77    308
70-74             8     72    576
65-69             8     67    536
60-64             10    62    620
55-59             9     57    513
50-54             7     52    364
45-49             4     47    188
N = 50                        Σfx = 3105
\[ \bar{x} = \frac{\sum fx}{N} = \frac{3105}{50} = 62.10 \]
The calculation of the arithmetic mean may be shortened using the coded
deviation method.
\[ \bar{x} = AM + \left( \frac{\sum fd'}{N} \right) i \]
Where:
f = frequency
AM = assumed mean (any class mark)
d' = unit coded deviation from the assumed mean
i = size of the class interval
N = total frequencies
Class Interval    f     x          d'    fd'
75-79             4     77         3     12
70-74             8     72         2     16
65-69             8     67         1     8
60-64             10    62 (AM)    0     0
55-59             9     57         -1    -9
50-54             7     52         -2    -14
45-49             4     47         -3    -12
N = 50                             Σfd' = 36 - 35 = 1
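As a check, the long method and the coded deviation method can be computed side by side. This is a minimal sketch using the frequency distribution above; the variable names are illustrative only.

```python
# (lower limit, upper limit, frequency) for each class interval
classes = [
    (75, 79, 4), (70, 74, 8), (65, 69, 8), (60, 64, 10),
    (55, 59, 9), (50, 54, 7), (45, 49, 4),
]
N = sum(f for _, _, f in classes)

# Long method: mean = sum(f * x) / N, where x is the class midpoint
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
mean_long = sum_fx / N

# Coded deviation method: mean = AM + (sum(f * d') / N) * i
AM = 62  # assumed mean: class mark of 60-64
i = 5    # class size
sum_fd = sum(f * (((lo + hi) / 2 - AM) / i) for lo, hi, f in classes)
mean_coded = AM + (sum_fd / N) * i

print(sum_fx, mean_long)
print(sum_fd, mean_coded)
```

Both methods give 62.10: the long method as 3105/50, and the coded method as 62 + (1/50)(5).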
Interpretation:
If all scores of 50 students were added and divided by 50, then each
student would get a score equal to 62.10
Activity:
A. Find the mean for each set of data. Round each answer to the nearest
tenth.
1. 24 27 12 18 9 20
2. 51 40 63 32 45 78
3. 33 28 90 87 61 59
B. Given the following frequency distribution, find the mean using the
long method.
Class Interval f X fx
95-99 6
90-94 4
85-89 7
80-84 9
75-79 12
70-74 15
65-69 4
60-64 5
55-59 2
50-54 3
45-49 3
N=
C. Given the following frequency distribution, find the mean using the
coded deviation method.
The Median
When the scores are arranged in order, the middle value is the median if the number of
observations is odd; if the number of observations is even, the average of the two middle
scores is the median.
Example
1. What is the median of these scores?
91 87 93 89 94
Solution:
Arranging the scores in sequence, we have:
87 89 91 93 94
The middle score is 91. So the median is 91. In this set, there are 2
numbers above the median and 2 numbers below it.
When the number of scores is even, the median is the average of the
two middle scores. For middle scores of 126 and 128, the median is (126 + 128)/2 = 127.
\[ \tilde{x} = L + \left( \frac{\frac{n}{2} - cf}{f_m} \right) i \]
where:
\tilde{x} = median
L = lower limit (lower class boundary) of the class containing the median
n/2 = one-half of the total number of cases
cf = cumulative frequency immediately below the median class
fm = frequency of the class containing the median
i = size of the class interval
Steps in computing the median:
1. Set up the less-than cumulative frequency column.
2. Find n/2, one-half of the total number of cases.
3. Get the cf of the class immediately below the median class.
4. Determine the frequency (fm) of the median class.
5. Determine the class size.
6. Apply the formula by substituting the given values.
Class Interval    f          cf
75-79             4          50
70-74             8          46
65-69             8          38
60-64             10 (fm)    30
55-59             9          20 (cf)
50-54             7          11
45-49             4          4
N = 50
Given:
n/2 = 50/2 = 25
L = 59.5
i=5
fm = 10
cf = 20
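Solution:
\[ \tilde{x} = L + \left( \frac{\frac{n}{2} - cf}{f_m} \right) i = 59.5 + \left( \frac{25 - 20}{10} \right) 5 = 59.5 + 2.5 = 62 \]
Interpretation:
Half of the 50 students (25) scored above 62, and the other half scored below it.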
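The median computation can be sketched in Python as follows. The function name `grouped_median` is hypothetical, and the sketch assumes integer class limits, so that the lower class boundary is the lower limit minus 0.5.

```python
def grouped_median(classes, class_size):
    """Median of grouped data.

    classes: list of (lower limit, upper limit, frequency), lowest class first.
    """
    n = sum(f for _, _, f in classes)
    half = n / 2
    cf = 0  # cumulative frequency below the current class
    for lower, _, f in classes:
        if cf + f >= half:       # this is the median class
            L = lower - 0.5      # lower class boundary (integer limits assumed)
            return L + (half - cf) / f * class_size
        cf += f
    raise ValueError("empty distribution")


classes = [
    (45, 49, 4), (50, 54, 7), (55, 59, 9), (60, 64, 10),
    (65, 69, 8), (70, 74, 8), (75, 79, 4),
]
median = grouped_median(classes, 5)
print(median)
```

For the distribution above, the loop stops at the class 60-64 with cf = 20, which reproduces 59.5 + (5/10)(5) = 62.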
Activity:
A. Given the following frequency distribution, find the median. Interpret the result.
Class Interval f cf
173-177 2
168-172 1
163-167 3
158-162 2
153-157 4
148-152 5
143-147 8
138-142 6
133-137 4
128-132 2
123-127 2
118-122 1
N=
B. The following data show the scores in a mastery test of one of the classes of Mr.
Corachea. Find the median and interpret the result.
32 63 39 55 87 44 80 46 49 57
53 54 31 74 48 61 71 85 81 80
31 45 79 56 75 46 49 82 50 71
57 51 47 47 61 50 69 35 58 63
83 65 76 59 62 77 73 75 44 52
The Mode
The mode (denoted by \hat{x}) is the score that occurs most frequently, that is, the score
with the greatest frequency. Consider the following data:
1. 2 6 21 3 8 7
2. 44 48 44 55 68 70
3. 10 10 6 2 4 2
In 1, no score occurs more than once, so the distribution has no mode.
In 2, the mode is 44 since it is the score that appears most frequently. This distribution is said to be
unimodal.
In 3, the modes are 10 and 2. Since the distribution has two modes, it is said to be bimodal.
The mode of a frequency distribution lies within the class interval with the highest
frequency, which is known as the modal class. The midpoint of the modal class is sometimes
called the crude mode.
When the set of measurements is tabulated in a frequency distribution, the mode can be
obtained from the formula:
\[ \hat{x} = L_{mo} + \left( \frac{d_1}{d_1 + d_2} \right) i \]
where:
\hat{x} = mode
Lmo = lower limit of the modal class (the class with the highest
frequency)
d1 = difference between the frequency of the modal class
and the frequency of the next lower class interval
d2 = difference between the frequency of the modal class
and the frequency of the next higher class interval
i = size of the class interval
Class Interval    f
75-79             4
70-74             8
65-69             8     (used for d2)
60-64             10    (modal class)
55-59             9     (used for d1)
50-54             7
45-49             4
N = 50
The modal class is the class interval 60-64 since it has the largest frequency.
Given:
Lmo = 59.5
d1 = 10 - 9 = 1
d2 = 10 - 8 = 2
Solution:
\[ \hat{x} = L_{mo} + \left( \frac{d_1}{d_1 + d_2} \right) i \]
= 59.5 + (1/(1 + 2))(5)
= 59.5 + (1/3)(5)
= 59.5 + 1.67
= 61.17
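The same computation can be sketched in Python. The function name `grouped_mode` is hypothetical; the sketch assumes integer class limits, a single modal class, and treats a missing neighbouring class as having frequency 0.

```python
def grouped_mode(classes, class_size):
    """Mode of grouped data.

    classes: list of (lower limit, upper limit, frequency), lowest class first.
    """
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))                        # position of the modal class
    f_lower = freqs[k - 1] if k > 0 else 0             # next lower class
    f_higher = freqs[k + 1] if k < len(freqs) - 1 else 0  # next higher class
    d1 = freqs[k] - f_lower
    d2 = freqs[k] - f_higher
    L_mo = classes[k][0] - 0.5                         # lower class boundary
    return L_mo + d1 / (d1 + d2) * class_size


classes = [
    (45, 49, 4), (50, 54, 7), (55, 59, 9), (60, 64, 10),
    (65, 69, 8), (70, 74, 8), (75, 79, 4),
]
mode = grouped_mode(classes, 5)
print(round(mode, 2))
```

Here d1 = 10 - 9 = 1 and d2 = 10 - 8 = 2, matching the worked solution.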
Activity:
Class
Interval f
120-124 3
115-119 2
110-114 4
105-109 3
100-104 12
95-99 14
90-94 9
85-89 4
80-84 3
75-79 1
70-74 4
65-69 1
N= 60
B. The following data shows the result of a mathematics test administered to 40 students
in one of the classes of Mr. Corachea.
34 36 21 28 20 40 22 30
21 30 38 32 24 27 25 33
25 27 30 29 31 28 27 23
33 28 18 33 25 23 27 28
27 15 27 19 36 31 20 31
C. The data below shows the scores of 40 students in Math Test. Construct a frequency
distribution and find the mode.
52 73 84 78 76 95 70 73
53 66 93 56 93 84 84 81
69 55 90 87 75 70 49 83
84 73 75 91 71 72 72 87
88 45 73 78 63 79 76 82
Activity Number 4
TOPIC: Measures of Central Tendency
Name:
___________________________________________________________________________
Answer : ______
2. 14 12 25 15 20 22 28
Answer : ________
3. 21 8 12 15 18 16 15 21 16 15 26 19
Answer : _______
4. 34 36 44 49 37 50 28 45 38 29 27
Answer : ________
5. 1.7 2.1 2.0 1.9 1.7 2.2 2.3 2.9 1.4 1.65 1.8 2.2
2.8 2.5 2.5
Answer : ______
Activity Number 5
TOPIC: Measures of Central Tendency
Name:
___________________________________________________________________________
Answer : ______
2. 14 12 25 15 20 22 28
Answer : ________
3. 21 8 12 15 18 16 15 21 16 15 26 19
Answer : _______
4. 34 36 44 49 37 50 28 45 38 29 27
Answer : ________
5. 1.7 2.1 2.0 1.9 1.7 2.2 2.3 2.9 1.4 1.65 1.8 2.2
2.8 2.5 2.5
Answer : ______
Activity Number 6
TOPIC: Measures of Central Tendency
Name:
___________________________________________________________________________
Answer : ______
2. 14 12 25 15 20 22 28
Answer : ________
3. 21 8 12 15 18 16 15 21 16 15 26 19
Answer : _______
4. 34 36 44 49 37 50 28 45 38 29 27
Answer : ________
5. 1.7 2.1 2.0 1.9 1.7 2.2 2.3 2.9 1.4 1.65 1.8 2.2
2.8 2.5 2.5
Answer : ______
Activity Number 7
TOPIC: Measures of Central Tendency
Name:
___________________________________________________________________________
Consider the following data. Complete the table below using data analysis
(Microsoft Excel)
1. 18 12 14 23 16 26 17 13 10 19
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
2. 14 12 25 15 20 22 28
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
3. 21 8 12 15 18 16 15 21 16 15 26 19
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
4. 34 36 44 49 37 50 28 45 38 29 27
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
5. 1.7 2.1 2.0 1.9 1.7 2.2 2.3 2.9 1.4 1.65 1.8 2.2
2.8 2.5 2.5
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
Activity Number 8
TOPIC: Measures of Central Tendency
Name:
___________________________________________________________________________
Solution:
Solution:
The table shows the cost of computer rentals paid by 160 students per week.
f x
200-219 7
180-199 8
160-179 10
140-159 18
120-139 29
100-119 30
80-99 20
60-79 18
40-59 20
N=160
Solution:
Solution:
Activity Number 9
TOPIC: Measures of Central Tendency
Name:
___________________________________________________________________________
Solution:
Solution:
The table shows the cost of computer rentals paid by 160 students per week.
f cf
200-219 7
180-199 8
160-179 10
140-159 18
120-139 29
100-119 30
80-99 20
60-79 18
40-59 20
N=160
Solution:
Solution:
Activity Number 10
TOPIC: Measures of Central Tendency
Name:
___________________________________________________________________________
Solution:
f
116-121 2
110-115 4
104-109 5
98-103 11
92-97 10
86-91 7
80-85 1
N=40
Solution:
The table shows the cost of computer rentals paid by 160 students per week.
f
200-219 7
180-199 8
160-179 10
140-159 18
120-139 29
100-119 30
80-99 20
60-79 18
40-59 20
N=160
Solution:
Solution:
Activity Number 11
TOPIC: Frequency Distribution Table
Name:
___________________________________________________________________________
52 61 91 41 40 48 22 55 63 34
88 55 62 58 98 51 30 73 57 49
95 40 85 66 87 27 65 48 96 45
45 36 75 71 85 20 92 50 50 57
72 90 77 65 70 33 61 81 72 70
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
Activity Number 12
The following data show the scores in a mastery test of one of the
classes of Mr. Corachea.
32 63 39 55 87 44 80 46 49 57
53 54 31 74 48 61 71 85 81 80
31 45 79 56 75 46 49 82 50 71
57 51 47 47 61 50 69 35 58 63
83 65 76 59 62 77 73 75 44 52
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
34 36 21 28 20 40 22 30
21 30 38 32 24 27 25 33
25 27 30 29 31 28 27 23
33 28 18 33 25 23 27 28
27 15 27 19 36 31 20 31
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
Quartiles, deciles, and percentiles are measures that divide a distribution
into parts. Percentiles divide the distribution into one hundred parts,
deciles into ten parts, and quartiles into four parts. These measures are also
useful for locating the position of an individual in a group and for
categorizing data: for instance, quartiles for 4 categories, deciles for 10
categories, and percentiles for any desired number of categories.
For ungrouped data arranged from lowest to highest, the quartiles are located
at the following positions:
Q1 = (n/4)th score
Q2 = (2n/4)th score
Q3 = (3n/4)th score
Example:
The following is the list of scores resulting from Mathematics Examination
administered to 40 students:
91 61 46 62 54
62 93 90 99 76
48 83 59 96 66
94 52 51 59 62
89 100 92 70 59
91 73 68 49 54
85 43 78 50 45
99 69 77 42 46
Solution:
Arrange the scores from the lowest to the highest as indicated below.
Scores (X)
42 54 68 90
43 54 69 91
45 59 70 91
46 59 73 92
46 59 76 93
48 61 77 94
48 62 78 96
50 63 83 98
51 62 85 99
52 (10th, Q1)   66 (20th, Q2)   89 (30th, Q3)   100
\[ D_k = \frac{kn}{10} \]
Where:
Dk = the kth decile
k = 1, 2, ..., 9
n = the sample size
Example
The following is the list of scores resulting from Mathematics examination
administered to 40 students (arranged in an array from lowest to highest). Solve for D3, D5,
and D8:
42    54                68    90
43    54 (12th, D3)     69    91 (32nd, D8)
45    59                70    91
46    59                73    92
46    59                76    93
48    61                77    94
48    62                78    96
50    63                83    98
51    62                85    99
52    66 (20th, D5)     89    100
Solution:
\[ P_k = \frac{kn}{100} \]
Where:
Pk = the kth percentile
k = 1, 2, ..., 99
n = the sample size
Example
Below are the scores of 40 students in Mathematics Examination (arranged in an array from
lowest to highest). Solve for P50, P66, and P98:
42    54                68                90
43    54                69                91
45    59                70                91
46    59                73                92
46    59                76                93
48    61                77 (26th, P66)    94
48    62                78                96
50    63                83                98
51    62                85                99 (39th, P98)
52    66 (20th, P50)    89                100
Solution:
It can be noted that the positions of the score of 66 in the above distribution
are the same when the following measures are computed:
Md = Q2 = D5 = P50
Md = 20th = 66
Q2 = 20th = 66
D5 = 20th = 66
P50 = 20th = 66
So that Md = Q2 = D5 = P50
Quartiles are score points which divide the distribution into four equal parts. Twenty-five
percent of the scores fall below the first quartile (Q1), fifty percent fall below the second
quartile (Q2), and seventy-five percent fall below the third quartile (Q3).
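The following are the formulas for Q1, Q2, and Q3 under grouped data.
\[ Q_1 = L + \left( \frac{\frac{1n}{4} - F}{f} \right) i \]
\[ Q_2 = L + \left( \frac{\frac{2n}{4} - F}{f} \right) i \]
\[ Q_3 = L + \left( \frac{\frac{3n}{4} - F}{f} \right) i \]
In general:
\[ Q_k = L + \left( \frac{\frac{kn}{4} - F}{f} \right) i \]
Where:
Qk = quartile, where k is 1, 2, or 3
L = lower limit (lower class boundary) of the class containing the quartile
n = sample size
F = less-than cumulative frequency of the class just below the quartile class
f = frequency of the class containing the quartile
i = size of the class interval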
Example
Find the first, second and third quartiles of the frequency distribution of the scores of
fifty students in a History class. Divide them into four equal-subgroups.
Scores    f     F
45-49     2     50
40-44     6     48
35-39     11    42    (Q3 class, L = 34.5)
30-34     10    31    (Q2 class, L = 29.5)
25-29     12    21    (Q1 class, L = 24.5)
20-24     5     9
15-19     4     4
N = 50
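\[ Q_1 = L + \left( \frac{\frac{1n}{4} - F}{f} \right) i \]
= 24.5 + ((12.5 - 9)/12)(5)
= 24.5 + (3.5/12)(5)
= 24.5 + 1.46
= 25.96
\[ Q_2 = L + \left( \frac{\frac{2n}{4} - F}{f} \right) i \]
= 29.5 + ((25 - 21)/10)(5)
= 29.5 + (4/10)(5)
= 29.5 + 20/10
= 29.5 + 2
= 31.5
\[ Q_3 = L + \left( \frac{\frac{3n}{4} - F}{f} \right) i \]
= 34.5 + ((37.5 - 31)/11)(5)
= 34.5 + 32.5/11
= 34.5 + 2.95
= 37.45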
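Because the quartile, decile, and percentile formulas for grouped data differ only in the divisor (4, 10, or 100), one function can cover all three. The following is a minimal sketch; the name `grouped_quantile` and its parameters are hypothetical, and integer class limits are assumed so that the lower class boundary is the lower limit minus 0.5.

```python
def grouped_quantile(classes, class_size, k, parts):
    """k-th quantile when the distribution is cut into `parts` equal parts.

    classes: list of (lower limit, upper limit, frequency), lowest class first.
    parts: 4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    n = sum(f for _, _, f in classes)
    target = k * n / parts
    F = 0  # less-than cumulative frequency below the current class
    for lower, _, f in classes:
        if F + f >= target:          # class that contains the quantile
            L = lower - 0.5
            return L + (target - F) / f * class_size
        F += f
    raise ValueError("k is out of range for this distribution")


history = [
    (15, 19, 4), (20, 24, 5), (25, 29, 12), (30, 34, 10),
    (35, 39, 11), (40, 44, 6), (45, 49, 2),
]
q1 = grouped_quantile(history, 5, 1, 4)
q2 = grouped_quantile(history, 5, 2, 4)
q3 = grouped_quantile(history, 5, 3, 4)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```

Calling the same function with `parts=10` or `parts=100` gives deciles and percentiles for the same History class distribution.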
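Deciles divide a distribution into ten equal parts. They are denoted by D1,
D2, ..., D9. The computation is similar to that of the median and quartiles. The formulas for
D1, D2, D5, and D9 are:
\[ D_1 = L + \left( \frac{\frac{1n}{10} - F}{f} \right) i \]
\[ D_2 = L + \left( \frac{\frac{2n}{10} - F}{f} \right) i \]
\[ D_5 = L + \left( \frac{\frac{5n}{10} - F}{f} \right) i \]
\[ D_9 = L + \left( \frac{\frac{9n}{10} - F}{f} \right) i \]
In general:
\[ D_k = L + \left( \frac{\frac{kn}{10} - F}{f} \right) i \]
Where:
Dk = decile, where k is 1, 2, 3, ..., 9
L = lower limit (lower class boundary) of the class containing the decile
n = sample size
F = less-than cumulative frequency of the class just below the decile class
f = frequency of the class containing the decile
i = size of the class interval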
Example
Find the values of D1, D5, and D9 from the given frequency distribution of the scores
in a History class of fifty students
Scores    f     F
45-49     2     50
40-44     6     48    (D9 class, L = 39.5)
35-39     11    42
30-34     10    31    (D5 class, L = 29.5)
25-29     12    21
20-24     5     9     (D1 class, L = 19.5)
15-19     4     4
N = 50
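\[ D_1 = L + \left( \frac{\frac{1n}{10} - F}{f} \right) i \]
= 19.5 + ((5 - 4)/5)(5)
= 19.5 + (1/5)(5)
= 19.5 + 1
= 20.5
\[ D_5 = L + \left( \frac{\frac{5n}{10} - F}{f} \right) i \]
= 29.5 + ((25 - 21)/10)(5)
= 29.5 + (4/10)(5)
= 29.5 + 2
= 31.5
\[ D_9 = L + \left( \frac{\frac{9n}{10} - F}{f} \right) i \]
= 39.5 + ((45 - 42)/6)(5)
= 39.5 + 2.5
= 42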
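Percentiles are the ninety-nine score points which divide a distribution into one
hundred equal parts. For example, the 2nd percentile (P2) separates the lowest 2% from the
remaining 98%. The formula for percentiles is shown below.
\[ P_k = L + \left( \frac{\frac{kn}{100} - F}{f} \right) i \]
Where:
Pk = percentile, where k is 1, 2, 3, ..., 99
L = lower limit (lower class boundary) of the class containing the percentile
n = sample size
F = less-than cumulative frequency of the class just below the percentile class
f = frequency of the class containing the percentile
i = size of the class interval
\[ P_{33} = L + \left( \frac{\frac{33n}{100} - F}{f} \right) i \]
\[ P_{66} = L + \left( \frac{\frac{66n}{100} - F}{f} \right) i \]
\[ P_{75} = L + \left( \frac{\frac{75n}{100} - F}{f} \right) i \]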
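Example. Solve for P20, P50, and P80 from the frequency distribution of the scores of 50
students in History.
Scores    f     F
45-49     2     50
40-44     6     48
35-39     11    42    (P80 class, L = 34.5)
30-34     10    31    (P50 class, L = 29.5)
25-29     12    21    (P20 class, L = 24.5)
20-24     5     9
15-19     4     4
N = 50
\[ P_{20} = L + \left( \frac{\frac{20n}{100} - F}{f} \right) i \]
= 24.5 + ((10 - 9)/12)(5)
= 24.5 + 5/12
= 24.5 + 0.42
= 24.92
\[ P_{50} = L + \left( \frac{\frac{50n}{100} - F}{f} \right) i \]
= 29.5 + ((25 - 21)/10)(5)
= 29.5 + 20/10
= 29.5 + 2
= 31.5
\[ P_{80} = L + \left( \frac{\frac{80n}{100} - F}{f} \right) i \]
= 34.5 + ((40 - 31)/11)(5)
= 34.5 + 45/11
= 34.5 + 4.09
= 38.59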
Activity Number 13
TOPIC: Quartiles
Name:
___________________________________________________________________________
52 61 91 41 40 48 22 55 63 34
88 55 62 58 98 51 30 73 57 49
95 40 85 66 87 27 65 48 96 45
45 36 75 71 85 20 92 50 50 57
72 90 77 65 70 33 61 81 72 70
38 24 36 29 28 35 34 26
42 22 37 49 32 19 33 9
20 23 23 53 34 24 24 21
28 23 14 43 48 11 46 14
49 29 27 42 18 8 7 50
The following data show the scores in a mastery test of one of the
classes of Mr. Corachea.
32 63 39 55 87 44 80 46 49 57
53 54 31 74 48 61 71 85 81 80
31 45 79 56 75 46 49 82 50 71
57 51 47 47 61 50 69 35 58 63
83 65 76 59 62 77 73 75 44 52
76 64 68 67 68 27
78 59 72 71 67 68
54 62 64 72 61 67
39 57 57 75 69 61
34 36 21 28 20 40 22 30
21 30 38 32 24 27 25 33
25 27 30 29 31 28 27 23
33 28 18 33 25 23 27 28
27 15 27 19 36 31 20 31
Activity Number 14
TOPIC: Deciles
Name:
___________________________________________________________________________
52 61 91 41 40 48 22 55 63 34
88 55 62 58 98 51 30 73 57 49
95 40 85 66 87 27 65 48 96 45
45 36 75 71 85 20 92 50 50 57
72 90 77 65 70 33 61 81 72 70
The following data show the scores in a mastery test of one of the
classes of Mr. Corachea.
32 63 39 55 87 44 80 46 49 57
53 54 31 74 48 61 71 85 81 80
31 45 79 56 75 46 49 82 50 71
57 51 47 47 61 50 69 35 58 63
83 65 76 59 62 77 73 75 44 52
39 57 57 75 69 61
34 36 21 28 20 40 22 30
21 30 38 32 24 27 25 33
25 27 30 29 31 28 27 23
33 28 18 33 25 23 27 28
27 15 27 19 36 31 20 31
Activity Number 15
TOPIC: Percentiles
Name:
___________________________________________________________________________
Given the data below: look for P25 , P50, and P75
The following are scores obtained by a group of 40 students in an
achievement test.
48 46 44 48 43 34 46 45
35 30 37 25 29 43 59 47
42 45 32 38 37 36 41 67
26 31 73 30 25 31 38 52
28 75 30 35 36 36 55 78
52 61 91 41 40 48 22 55 63 34
88 55 62 58 98 51 30 73 57 49
95 40 85 66 87 27 65 48 96 45
45 36 75 71 85 20 92 50 50 57
72 90 77 65 70 33 61 81 72 70
The following data show the scores in a mastery test of one of the
classes of Mr. Corachea.
32 63 39 55 87 44 80 46 49 57
53 54 31 74 48 61 71 85 81 80
31 45 79 56 75 46 49 82 50 71
57 51 47 47 61 50 69 35 58 63
83 65 76 59 62 77 73 75 44 52
34 36 21 28 20 40 22 30
21 30 38 32 24 27 25 33
25 27 30 29 31 28 27 23
33 28 18 33 25 23 27 28
27 15 27 19 36 31 20 31
Activity Number 16
TOPIC: Quartiles
Name:
__________________________________________________________________________
Find Q2:
f cf
67-69 8
64-66 25
61-63 44
58-60 16
55-57 7
N=100
Solution:
f cf
116-121 2
110-115 4
104-109 5
98-103 11
92-97 10
86-91 7
80-85 1
N=40
Solution:
The table shows the cost of computer rentals paid by 160 students per week. Find Q1:
f cf
200-219 7
180-199 8
160-179 10
140-159 18
120-139 29
100-119 30
80-99 20
60-79 18
40-59 20
N=160
Solution:
Activity Number 17
TOPIC: Deciles
Name:
___________________________________________________________________________
Find D4:
f cf
67-69 8
64-66 25
61-63 44
58-60 16
55-57 7
N=100
Solution:
Solution:
f cf
116-121 2
110-115 4
104-109 5
98-103 11
92-97 10
86-91 7
80-85 1
N=40
The table shows the cost of computer rentals paid by 160 students per week. Find D7:
f cf
200-219 7
180-199 8
160-179 10
140-159 18
120-139 29
100-119 30
80-99 20
60-79 18
40-59 20
N=160
Solution:
Activity Number 18
TOPIC: Percentiles
Name:
___________________________________________________________________________
Find P25:
f cf
67-69 8
64-66 25
61-63 44
58-60 16
55-57 7
N=100
Solution:
Solution:
f cf
116-121 2
110-115 4
104-109 5
98-103 11
92-97 10
86-91 7
80-85 1
N=40
The table shows the cost of computer rentals paid by 160 students per week. Find P75:
f cf
200-219 7
180-199 8
160-179 10
140-159 18
120-139 29
100-119 30
80-99 20
60-79 18
40-59 20
N=160
Solution:
Measures of Dispersion/variation
Module 5
Set A 10 11 9 12 11 10 12 11 11 9
10 12 9 11 10 12
Set B 8 10 11 9 11 9 11 12 10 15
8 9 10 13 14 10 9 11
What is the mean of the two sets of scores?
If they have the same mean, where do they differ?
Find the highest and the lowest values in each distribution.
How far is each of these values from the mean of the distribution?
In Set A, the scores are very close to one another, which suggests that the
students in this group have almost the same abilities and would therefore be
more teachable and progress at about the same rate.
Set B consists of both very slow and very fast learners. These students are more
difficult to manage as a group because of their diverse abilities.
\[ AD = \frac{\sum |x - \bar{x}|}{n} \]
where:
AD = average deviation
x = score
\bar{x} = mean
n = number of scores
x     x̄     |x - x̄|
12    7     5
13    7     6
6     7     1
3     7     4
9     7     2
1     7     6
7     7     0
15    7     8
2     7     5
2     7     5
Σx = 70     Σ|x - x̄| = 42
Solution:
x̄ = Σx/n
= 70/10
= 7
AD = Σ|x - x̄|/n
= 42/10
= 4.2
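The worked example can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
scores = [12, 13, 6, 3, 9, 1, 7, 15, 2, 2]

n = len(scores)
mean = sum(scores) / n

# Absolute deviation of each score from the mean
abs_deviations = [abs(x - mean) for x in scores]
average_deviation = sum(abs_deviations) / n

print(mean, sum(abs_deviations), average_deviation)
```

The absolute deviations add up to 42, so the average deviation is 42/10 = 4.2, as in the solution above.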
\[ AD = \frac{\sum f |m - \bar{x}|}{n} \]
where:
AD = average deviation
Σf|m - x̄| = sum of the products of the frequency and
the absolute difference between the midpoint and the mean
n = sample size
30-34 10
25-29 13
20-24 8
15-19 5
------------------------------------------------------------------------------------------
n = 60 Σfm= Σf/m-x/ =
The formula:
\[ s = \sqrt{\frac{\sum d^2}{n - 1}} \]
where:
s = standard deviation
Σd² = sum of squared deviations from the mean
n = number of items
Calculate the standard deviation of the given scores in an Algebra quiz: 18,
20, 22, 15, 16, 12, 17, 21, 10, 19.
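A minimal sketch of this computation in Python follows, using the n - 1 divisor from the formula for ungrouped data. The standard library function `statistics.stdev` uses the same divisor and serves as a cross-check.

```python
import math
import statistics

scores = [18, 20, 22, 15, 16, 12, 17, 21, 10, 19]

n = len(scores)
mean = sum(scores) / n

# d = deviation of each score from the mean; square and add
sum_squared_dev = sum((x - mean) ** 2 for x in scores)
s = math.sqrt(sum_squared_dev / (n - 1))

print(mean, sum_squared_dev, round(s, 2))
print(round(statistics.stdev(scores), 2))  # same n - 1 definition
```

The mean is 17 and the squared deviations add up to 134, so s is the square root of 134/9, or about 3.86.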
The formula:
\[ s = \sqrt{\frac{\sum f d^2}{n}} \]
where:
s = standard deviation
Σfd² = sum of the products of frequency and
squared deviation
n = number of items
In calculating the standard deviation for grouped data, the steps to follow
are:
Variance
where:
x - scores/values
x - mean
n - total number of cases
Where:
Xm - class mark or midpoint
x - mean
n - number of cases
f - class frequencies
\[ Q = \frac{Q_3 - Q_1}{2} \]
Where:
Q = quartile deviation
Q3 = third quartile
Q1 = first quartile
2 = constant
Example:
Below are the scores of ten students in problem solving. Solve for
Quartile Deviation.
Scores
X
11
8
12
7
5
3
4
15
9
6
Solution: Arrange the data in an array from the lowest to the highest score.
3
4
      Q1 is at the 2.5th position: (4 + 5)/2 = 4.5
5
6
7
8
9
      Q3 is at the 7.5th position: (9 + 11)/2 = 10
11
12
15
\[ Q = \frac{Q_3 - Q_1}{2} \]
= (10 - 4.5)/2
= 5.5/2
= 2.75
For grouped data, the same formula applies:
\[ Q = \frac{Q_3 - Q_1}{2} \]
Where:
Q = quartile deviation
Q3 = third quartile
Q1 = first quartile
2 = constant
For Example:
Scores    f     F
45-49     2     50
40-44     6     48
35-39     11    42    (Q3 class)
30-34     10    31
25-29     12    21    (Q1 class)
20-24     5     9
15-19     4     4
n = 50
\[ Q_3 = L + \left( \frac{\frac{3n}{4} - F}{f} \right) i \]
= 34.5 + ((37.5 - 31)/11)(5)
= 34.5 + 2.95
= 37.45
\[ Q_1 = L + \left( \frac{\frac{1n}{4} - F}{f} \right) i \]
= 24.5 + ((12.5 - 9)/12)(5)
= 24.5 + 1.46
= 25.96
\[ Q = \frac{Q_3 - Q_1}{2} \]
= (37.45 - 25.96)/2
= 11.49/2
= 5.75
Activity:
48 46 44 48 43 34 46 25
35 30 37 25 29 43 59 57
42 45 32 38 37 36 41 77
26 31 73 30 25 31 38 32
28 75 30 35 36 36 55 48
Activity Number 19
TOPIC: Average Deviation
Name:
___________________________________________________________________________
Compute for the average deviation:
x     |x - x̄|
85
75
89
95
69
86
75
94
88
82
x     |x - x̄|
15
12
18
20
15
24
14
x     |x - x̄|     x     |x - x̄|
25                12
25                15
14                18
26                19
35                30
25                25
50                24
41                19
24                28
29                31
Activity Number 20
TOPIC: Standard Deviation
Name:
___________________________________________________________________________
x     x - x̄ (d)     d²
85
75
89
95
69
86
75
94
88
82
x     x - x̄ (d)     d²
15
12
18
20
15
24
14
x     x - x̄ (d)     d²     x     x - x̄ (d)     d²
25 12
25 15
14 18
26 19
35 30
25 25
50 24
41 19
24 28
29 31
Activity Number 21
TOPIC: Variance
Name:
___________________________________________________________________________
Compute for the variance:
x     x - x̄     (x - x̄)²
85
75
89
95
69
86
75
94
88
82
x     x - x̄     (x - x̄)²
15
12
18
20
15
24
14
x     x - x̄     (x - x̄)²     x     x - x̄     (x - x̄)²
25 12
25 15
14 18
26 19
35 30
25 25
50 24
41 19
24 28
29 31
Activity Number 22
TOPIC: Quartile Deviation
Name:
___________________________________________________________________________
Compute for the quartile deviation:
x
85
75
89
95
69
86
75
94
88
82
x
15
12
18
20
15
24
14
x x
25 12
25 15
14 18
26 19
35 30
25 25
50 24
41 19
24 28
29 31
Activity Number 23
TOPIC: Average Deviation
Name:
___________________________________________________________________________
Class Interval    f    x    fx    |x - x̄|    f|x - x̄|
70-78 3
61-69 1
52-60 3
43-51 10
34-42 12
25-33 11
N=
Class Interval    f    x    fx    |x - x̄|    f|x - x̄|
97-107 1
86-96 7
75-85 5
64-74 9
53-63 9
42-52 9
31-41 6
20-30 4
N=
Class Interval    f    x    fx    |x - x̄|    f|x - x̄|
87-93 1
80-86 6
73-79 7
66-72 3
59-65 7
52-58 8
45-51 11
38-44 3
31-37 4
N=
Class Interval    f    x    fx    |x - x̄|    f|x - x̄|
93-103 1
82-92 1
71-81 8
60-70 20
49-59 6
38-48 5
27-37 1
N=
Activity Number 24
TOPIC: Standard Deviation
Name:
__________________________________________________________________________
Class Interval    f    x    fx    x - x̄ (d)    d²    fd²
70-78 3
61-69 1
52-60 3
43-51 10
34-42 12
25-33 11
N=                                                   Σfd² =
Class Interval    f    x    fx    x - x̄ (d)    d²    fd²
97-107 1
86-96 7
75-85 5
64-74 9
53-63 9
42-52 9
31-41 6
20-30 4
N=                                                   Σfd² =
Class Interval    f    x    fx    x - x̄ (d)    d²    fd²
87-93 1
80-86 6
73-79 7
66-72 3
59-65 7
52-58 8
45-51 11
38-44 3
31-37 4
N=                                                   Σfd² =
Class Interval    f    x    fx    x - x̄ (d)    d²    fd²
93-103 1
82-92 1
71-81 8
60-70 20
49-59 6
38-48 5
27-37 1
N=                                                   Σfd² =
Activity Number 25
TOPIC: Variance
Name:
___________________________________________________________________________
Class Interval    f    x    fx    (x - x̄)    (x - x̄)²    f(x - x̄)²
70-78 3
61-69 1
52-60 3
43-51 10
34-42 12
25-33 11
N=                                                   Σf(x - x̄)² =
Class Interval    f    x    fx    (x - x̄)    (x - x̄)²    f(x - x̄)²
97-107 1
86-96 7
75-85 5
64-74 9
53-63 9
42-52 9
31-41 6
20-30 4
N=                                                   Σf(x - x̄)² =
Class Interval    f    x    fx    (x - x̄)    (x - x̄)²    f(x - x̄)²
87-93 1
80-86 6
73-79 7
66-72 3
59-65 7
52-58 8
45-51 11
38-44 3
31-37 4
N=                                                   Σf(x - x̄)² =
Class Interval    f    x    fx    (x - x̄)    (x - x̄)²    f(x - x̄)²
93-103 1
82-92 1
71-81 8
60-70 20
49-59 6
38-48 5
27-37 1
N=                                                   Σf(x - x̄)² =
Activity Number 26
TOPIC: Quartile Deviation
Name:
___________________________________________________________________________
f F
70-78 3
61-69 1
52-60 3
43-51 10
34-42 12
25-33 11
N=
f F
97-107 1
86-96 7
75-85 5
64-74 9
53-63 9
42-52 9
31-41 6
20-30 4
N=
f F
87-93 1
80-86 6
73-79 7
66-72 3
59-65 7
52-58 8
45-51 11
38-44 3
31-37 4
N=
f F
93-103 1
82-92 1
71-81 8
60-70 20
49-59 6
38-48 5
27-37 1
N=
Elements of Hypothesis Testing and Measures of Correlation
Module 6
What is a hypothesis?
There are two forms of a hypothesis, namely, the null hypothesis and
the alternative hypothesis. The null hypothesis, H0, is the hypothesis to be
tested, and it represents what the investigator doubts to be true. On the
other hand, the alternative hypothesis, H1, is the operational statement of
the theory that the experimenter or researcher believes to be true and
wishes to prove.
Measures of Correlation
The most meaningful research is that which seeks to find and to verify
relationships between and among variables. In correlational studies,
researchers determine whether a relationship exists between two (or more)
quantitative variables, such as score and age, or reading comprehension and
word-problem solving skill in mathematics. Such relationships are often
used in prediction and, at times, to suggest causation. Although a causal
relationship cannot be proven through correlational research, researchers
hope eventually to make causal statements as an outgrowth of their work.
Value               Interpretation
0.00 to ±0.20       Negligible correlation
±0.21 to ±0.40      Low or slight correlation
±0.41 to ±0.70      Marked/Moderate correlation
±0.71 to ±0.90      High correlation
±0.91 to ±0.99      Very high correlation
±1.00               Perfect correlation
The formula:
Speed Capacity
20.45 3.52
30.86 4.15
14.40 3.84
22.50 3.93
29.14 4.25
12.45 3.16
16.25 3.17
38.00 4.76
17.49 3.12
24.00 3.78
Spearman Rho
The formula:
\[ \rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \]
where:
ρ = Spearman correlation coefficient
d = difference in ranks of each of the
given pairs of ranks
n = number of pairs of measurements
Example:
Seven aspirants of the Outstanding Freshman Student were ranked
according to their grade point average and performance in interview. The
data were tabulated and ranked, where the highest rank is 1 and the lowest
is 7.
Ranking of Outstanding Freshman Student Contenders
Consider the data presented below. Compute the correlation coefficient and
test its significance at the .05 level.
X 6 11 9 14 5 3
Y 5 2 3 1 7 11
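The coefficient for the X and Y data can be sketched in Python as follows. The helper `descending_ranks` is hypothetical; it gives rank 1 to the highest value, following the ranking convention of the previous example, and it assumes there are no tied values (true for this data). The significance test is not included in the sketch.

```python
def descending_ranks(values):
    """Rank values so that the largest gets rank 1 (assumes no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


x = [6, 11, 9, 14, 5, 3]
y = [5, 2, 3, 1, 7, 11]

rank_x = descending_ranks(x)
rank_y = descending_ranks(y)

# d = difference between the paired ranks
d_squared = [(rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y)]
n = len(x)
rho = 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

print(rank_x, rank_y)
print(sum(d_squared), rho)
```

Here the ranks of Y are exactly the reverse of the ranks of X, so Σd² = 70 and ρ = 1 - 420/210 = -1, a perfect negative correlation.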
The t-test
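The formula:
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left(\frac{SS_1 + SS_2}{n_1 + n_2 - 2}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]
where:
t = computed t value
x̄1 = mean of group 1
x̄2 = mean of group 2
SS1 = sum of squares of group 1
SS2 = sum of squares of group 2
n1, n2 = number of cases in group 1 and group 2
for SS1
\[ SS_1 = \sum x_1^2 - \frac{(\sum x_1)^2}{n_1} \]
for SS2
\[ SS_2 = \sum x_2^2 - \frac{(\sum x_2)^2}{n_2} \]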
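The pooled sum-of-squares formula above can be traced step by step in code. This is a minimal sketch: the function names are illustrative, and the two small groups of scores are hypothetical values chosen only to keep the arithmetic easy to follow by hand.

```python
import math


def sum_of_squares(scores):
    """SS = sum(x^2) - (sum(x))^2 / n, as defined for SS1 and SS2."""
    n = len(scores)
    return sum(s * s for s in scores) - (sum(scores) ** 2) / n


def independent_t(group1, group2):
    """t for two independent groups using the pooled sums of squares."""
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    pooled = (sum_of_squares(group1) + sum_of_squares(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Hypothetical scores, not taken from the module.
group1 = [5, 7, 9]
group2 = [2, 4, 6]

t = independent_t(group1, group2)
df = len(group1) + len(group2) - 2
print(round(t, 3), df)
```

Here each group has SS = 8, the pooled variance is 16 / 4 = 4, and the difference between the means is 3, so t is about 1.84 with 4 degrees of freedom.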
The t-test for correlated samples is used when comparing the means
before and after the treatment. It is also used to compare the mean of the
pretest and the posttest.
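The formula:
\[ t = \frac{\bar{D}}{\sqrt{\dfrac{\sum D^2 - \dfrac{(\sum D)^2}{n}}{n(n - 1)}}} \]
where:
D̄ = mean of the differences between the paired scores
D = difference between each pair of scores (for example, posttest minus pretest)
n = number of pairs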
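The correlated-samples formula can be sketched in code as follows. The function name correlated_t is illustrative, the pretest and posttest scores are hypothetical, and D is assumed here to be posttest minus pretest.

```python
import math


def correlated_t(pretest, posttest):
    """t for correlated samples: mean difference over its standard error."""
    differences = [after - before for before, after in zip(pretest, posttest)]
    n = len(differences)
    mean_d = sum(differences) / n
    sum_d = sum(differences)
    sum_d_squared = sum(d * d for d in differences)
    standard_error = math.sqrt((sum_d_squared - sum_d ** 2 / n) / (n * (n - 1)))
    return mean_d / standard_error


# Hypothetical scores for four students, not taken from the module.
pretest = [10, 12, 9, 14]
posttest = [12, 15, 10, 17]

t = correlated_t(pretest, posttest)
df = len(pretest) - 1
print(round(t, 2), df)
```

With these scores the differences are 2, 3, 1 and 3, so ΣD = 9, ΣD² = 23, the mean difference is 2.25, and t is about 4.70 with n - 1 = 3 degrees of freedom.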
Directions: Listed below are the important attributes of good mathematics teaching.
Please rate the following statements using the scales below.
The solution and formulas for the one-way analysis of variance are as follows:
Degrees of Freedom:
\[ F = \frac{MS_{BET}}{MS_W} \]
Example
Three brands of milk were tried and compared on three groups of 9 children
each to find out whether they increase the weight of the subjects.
The data are shown in the following table in terms of weight gain in pounds.
Computation
Respondent No.   A     B     C     A²    B²    C²
1                4.4   3.1   2.9
2                4.0   2.9   2.7
3                3.5   3.7   3.1
4                5.2   3.8   3.5
5                4.7   4.1   3.4
6                2.6   3.0   3.4
7                4.2   3.9   2.8
8                3.7   3.2   3.5
9                3.5   3.0   3.3
Σx = ΣA + ΣB + ΣC
\[ MS_W = \frac{SS_W}{d.f._W} \]
\[ F = \frac{MS_{BET}}{MS_W} \]
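The computation for the three milk brands can be sketched as follows. The function name one_way_anova is illustrative. Because the intermediate formulas are not shown in the text, the sketch assumes the usual correction-factor approach: SS total = Σx² - (Σx)²/N, SS between = Σ(group total² / group size) - (Σx)²/N, and SS within as the remainder.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for group in groups for x in group]
    n_total = len(all_scores)
    k = len(groups)

    # Correction factor: (sum of all scores)^2 / N
    correction = sum(all_scores) ** 2 / n_total

    ss_total = sum(x * x for x in all_scores) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between

    df_between = k - 1
    df_within = n_total - k

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Weight gain in pounds, copied from the table above.
brand_a = [4.4, 4.0, 3.5, 5.2, 4.7, 2.6, 4.2, 3.7, 3.5]
brand_b = [3.1, 2.9, 3.7, 3.8, 4.1, 3.0, 3.9, 3.2, 3.0]
brand_c = [2.9, 2.7, 3.1, 3.5, 3.4, 3.4, 2.8, 3.5, 3.3]

f_value, df_bet, df_w = one_way_anova([brand_a, brand_b, brand_c])
print(round(f_value, 2), df_bet, df_w)
```

The computed F is then compared with the tabled critical value for 2 and 24 degrees of freedom at the chosen level of significance.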
In 200 tosses of a coin, 115 heads and 85 tails were observed. Test
the hypothesis that the coin is fair using a .05 level of significance.
Step 1 Hypothesis
H0: O = E
Step 2 Level of significance
α = .05
Step 3 Reject the null hypothesis if the computed value is greater than
the critical value.
Step 4 Test statistic
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
df = k - 1
Worksheet Computation
O      E      (O - E)²   (O - E)² / E
115    100    225        2.25
85     100    225        2.25
200    200    450        4.50
Step 5 Since the computed value of 4.50 is greater than the critical value of
3.84, we reject the null hypothesis; thus, a significant difference
exists.
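The coin-toss worksheet can be reproduced with a few lines of code. This is a minimal sketch; the function name chi_square_goodness_of_fit is illustrative, and the critical value of 3.84 is the one quoted in Step 5 rather than something the code looks up.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Chi-square = sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [115, 85]    # heads, tails in 200 tosses
expected = [100, 100]   # a fair coin gives 100 of each

chi_square = chi_square_goodness_of_fit(observed, expected)
df = len(observed) - 1          # df = k - 1
critical_value = 3.84           # quoted in the text for df = 1, alpha = .05
reject_null = chi_square > critical_value

print(chi_square, df, reject_null)
```

The result matches the worksheet: 2.25 + 2.25 = 4.50, which exceeds 3.84, so the null hypothesis is rejected.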
The formula:
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
Where:
χ² - Chi-square
fo - Observed frequency
fe - Expected frequency
Example No. 1
Variable 1            Variable 2
Research Skills       Highest Educational Attainment
                      Bachelors   Masters   Doctorate   Total
Very Skillful         15          9         8           32
Moderately Skillful   10          7         6           23
Not Skillful          5           6         6           17
Total                 30          22        20          72
Example 2:
Religious Involvement   Variable 2
                        Cultural Practices
                        Always   Sometimes   Not   Total
Catholic                112      120         29    261
Non-Catholics           131      101         90    322
None at all             16       25          13    54
Total                   259      246         132   637
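The two contingency tables above can be run through the chi-square formula with a short script. This is a minimal sketch: the function name chi_square_independence is illustrative, and it assumes the usual rules that each expected frequency is (row total × column total) / grand total and that df = (rows - 1)(columns - 1), neither of which is spelled out in the text.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a table of observed frequencies.

    `table` is a list of rows and excludes the Total row and column.
    """
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, f_observed in enumerate(row):
            # Expected frequency under independence (assumed rule).
            f_expected = row_totals[i] * column_totals[j] / grand_total
            chi_square += (f_observed - f_expected) ** 2 / f_expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Example No. 1: research skills by highest educational attainment.
example_1 = [[15, 9, 8],
             [10, 7, 6],
             [5, 6, 6]]

# Example 2: religious involvement by cultural practices.
example_2 = [[112, 120, 29],
             [131, 101, 90],
             [16, 25, 13]]

chi_1, df_1 = chi_square_independence(example_1)
chi_2, df_2 = chi_square_independence(example_2)
print(round(chi_1, 2), df_1)
print(round(chi_2, 2), df_2)
```

Each computed value is then compared with the tabled critical value for its degrees of freedom, following the same decision rule as in the coin-toss example.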
Activity Number 27
TOPIC: Pearson Product Moment Correlation Coefficient
Name:
___________________________________________________________________________
Prof. Henry Joel administered a test twice to his 10 students in an Elementary Statistics class,
with a one-day interval between administrations. The test given after one day was exactly the
same as the test given the first time. Scores were gathered for the first test (FT) and the second test (ST).
Student FT ST
1 36 38
2 26 34
3 38 38
4 15 27
5 17 25
6 28 26
7 32 35
8 35 36
9 12 19
10 35 38
Prof. Vinci Glenn administered a test twice to the students in his Biology class, with a
one-week interval between administrations. Show the complete solution. Interpret the result.
Student FT ST
1 12 20
2 20 22
3 19 23
4 17 20
5 25 25
6 22 20
7 15 19
8 16 18
9 23 25
10 21 24
Prof. Gene James administered a test twice to his 15 students in a Physics class, with a one-day
interval between administrations. The second test was exactly the same as the first.
Scores were gathered for the first test (FT) and the second test (ST).
Student FT ST
1 25 34
2 33 25
3 35 29
4 40 37
5 19 25
6 18 23
7 35 32
8 33 36
9 16 25
10 25 30
11 32 30
12 24 22
13 19 35
14 26 20
15 38 29
Solve for the correlation of the parallel forms of a 70-item Mathematics Achievement Test.
Student X Y
1 68 67
2 69 68
3 66 66
4 65 64
5 70 69
6 55 53
7 50 51
8 48 50
9 40 40
10 52 51
11 60 59
12 42 40
13 50 49
14 45 44
15 40 41
16 38 38
17 35 34
18 63 63
19 47 48
20 42 43
Activity Number 28
TOPIC: Spearman Rho
Name:
___________________________________________________________________________
Student X Y
1 68 67
2 69 68
3 66 66
4 65 64
5 70 69
6 55 53
7 50 51
8 48 50
9 40 40
10 52 51
11 60 59
12 42 40
13 50 49
14 45 44
15 40 41
16 38 38
17 35 34
18 63 63
19 47 48
20 42 43
Activity Number 29
TOPIC: Test of Goodness of Fit
Name:
___________________________________________________________________________
1. YMC Clothing wants to know whether customers prefer any color over the other colors in shirts.
A random sample of 200 shirts sold is selected and the colors are noted. At α = 0.05, is there a
color preference for the shirts?
Expected
2. TMC Foods wishes to see whether there is any preference among the flavors of corn chips
sold at SM Batangas. A random sample of sales is selected, and the data are shown here. At
α = 0.05, are the flavors selected with equal frequency?
No. Sold 32 24 40 76
Observed 32 24 40 76
Expected
3. TMC Computers believes that 50% of its customers purchase PC computers, 25% purchase
printers, and 25% purchase scanners. A sample of purchases shows the following distribution.
At α = 0.05, is its assumption correct?
No. 44 25 19
Purchases
Observed 44 25 19
Expected
4. TMC Finances reported that 32% of loans granted by banks were for home mortgages, 28%
for car purchases, 20% for credit cards, 12% for real estate, and 8% for miscellaneous needs. A
random sample of 100 loans shows the following:
Loan           Home Mortgage   Car Purchase   Credit Card   Real Estate   Miscellaneous
No. of Loans   44              25             19            8             4
Observed       44              25             19            8             4
Expected
Activity Number 30
TOPIC: Chi-square
Name:
___________________________________________________________________________
1. TMC Research conducts a survey of drivers to determine if there is any difference in their
choice of brand based on their gender. These are the results:
Men 80 90 50
Women 40 60 40
Test whether there is any difference in the proportion of drivers who prefer a particular
brand based on gender at α = 0.05 level of significance.
2. TMC Research conducts a further survey to determine if there is a relationship between the
proportion of drivers who express satisfaction or dissatisfaction with the performance of
their American, Japanese or Korean cars. These are the results:
Not Satisfied 90 40 20
Test whether there is any difference in the proportion of drivers who express satisfaction or
dissatisfaction with the performance of their American, Japanese or Korean cars at α = 0.05
level of significance.
3. A survey is conducted among workers in Makati City to determine if there is any difference
between the proportions of women and men who drive to work, take the bus to work or
take the MRT to work. The results are as follows:
Women 40 80 60
Test whether there is any difference between the proportions of women and men who drive
to work, take the bus to work or take the MRT to work at α = 0.05 level of significance.
Activity Number 31
1. A marketing analyst wishes to see whether there is a difference in the average time a
customer has to wait in a checkout line in three large department stores.
At α = 0.05, is there a significant difference in the mean waiting times of customers for each
store?
2. James Reid, the manager of TMS Computer Store is conducting a study of the number of
customers who pay by personal check, by a credit card and by bank card. During the course
of five different days, he records the following number of purchases made by each method.
Test if there is any difference in the mean number of purchases made with each method at α
= 0.05 level of significance.
Activity Number 32
TOPIC: T-Test
Name:
___________________________________________________________________________
Student X Y
1 68 67
2 69 68
3 66 66
4 65 64
5 70 69
6 55 53
7 50 51
8 48 50
9 40 40
10 52 51
11 60 59
12 42 40
13 50 49
14 45 44
15 40 41
16 38 38
17 35 34
18 63 63
19 47 48
20 42 43
Student FT ST
1 12 20
2 20 22
3 19 23
4 17 20
5 25 25
6 22 20
7 15 19
8 16 18
9 23 25
10 21 24