Ge 4 Midterm PPT Statistics
THE MODERN
WORLD
Secillano, Patrick Joseph N.
UST-LEGAZPI PRAYER
n = 305
STRATIFIED SAMPLING
Stratified Random Sampling - This method is used when the
population is too big to handle as a whole, so N is divided into
subgroups called strata. A sample is then randomly selected from
each stratum, but consideration must be given to the size of the
random sample drawn from each subgroup.
Proportional Allocation: $n_i = \frac{N_i}{N} \times n$
Optimum Allocation: $n_i = \frac{N_i \sigma_i}{N_1 \sigma_1 + N_2 \sigma_2 + N_3 \sigma_3 + \cdots + N_k \sigma_k} \times n$ (for k strata)
Example: A researcher wants a sample of size 263 to be drawn from a
population which is divided into three strata of sizes CHS = 120, CASE = 120,
and CBMA = 450. Determine the sample size for each stratum by
adopting proportional allocation.
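The proportional allocation in this example can be sketched in Python as follows. This is only an illustration: the function name `proportional_allocation` and the dictionary layout are assumptions, not part of the slides.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Return the unrounded share n_i = (N_i / N) * n for each stratum."""
    total = sum(strata_sizes.values())  # N, the population size
    return {name: size / total * sample_size
            for name, size in strata_sizes.items()}


strata = {"CHS": 120, "CASE": 120, "CBMA": 450}
shares = proportional_allocation(strata, 263)

for name, share in shares.items():
    # Show the exact share and the nearest whole number of respondents.
    print(f"{name}: {share:.2f} -> {round(share)}")
```

Rounding each share separately gives 46 + 46 + 172 = 264, one more than the requested 263, so one stratum would have to be adjusted by hand. The slides do not state which adjustment rule to use.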
Formula: c = 1 + 3.322log(n)
c = 1 + 3.322log(40)
c = 6.32204…
c=6
Note: if the highest value in the raw data is not
reached yet, add another class
Step 3: Solve for the Class Size ( k )
Formula: $k = \frac{\text{Highest} - \text{Lowest}}{1 + 3.322\log(n)}$
k = 62/[1+3.322log(40)]
k = 9.806955…
k = 10
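The two computations above (number of classes c and class size k) can be checked with a short Python sketch. It assumes `log` means the base-10 logarithm and takes the highest and lowest scores (180 and 118) from the raw data used in this example.

```python
import math

n = 40                       # number of observations
highest, lowest = 180, 118   # extreme values in the raw data

c_exact = 1 + 3.322 * math.log10(n)      # number of classes before rounding
k_exact = (highest - lowest) / c_exact   # class size before rounding

c = round(c_exact)   # rounded number of classes
k = round(k_exact)   # rounded class size

print(f"c = {c_exact:.5f} -> {c}")
print(f"k = {k_exact:.5f} -> {k}")
```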
Step 4: In creating the class intervals always
start with the lowest value.
CLASS INTERVALS
118 – 127
128 – 137
138 – 147
148 – 157
158 – 167
168 – 177
178 – 187 (You add another class!)
Step 5: Tally or get the frequency/ies per class
interval
CLASS INTERVALS    FREQUENCY
118 – 127          3
128 – 137          6
138 – 147          11
148 – 157          10
158 – 167          6
168 – 177          3
178 – 187          1
Note: Try to make color codes
120 133 180 138
140 150 170 153
161 149 124 168
148 139 161 142
130 143 137 147
156 151 128 118
165 138 147 167
146 150 149 129
142 158 152 130
175 148 142 159
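The tally in Step 5 can be reproduced with a small Python sketch. The class size of 10 and the seven classes (six plus the added one) come from the steps above; the variable names are illustrative.

```python
scores = [
    120, 133, 180, 138, 140, 150, 170, 153, 161, 149,
    124, 168, 148, 139, 161, 142, 130, 143, 137, 147,
    156, 151, 128, 118, 165, 138, 147, 167, 146, 150,
    149, 129, 142, 158, 152, 130, 175, 148, 142, 159,
]

width = 10      # class size k from Step 3
n_classes = 7   # six classes from Step 2 plus the added class
start = min(scores)   # Step 4: always start with the lowest value

table = []
for j in range(n_classes):
    lower = start + j * width
    upper = lower + width - 1   # limits are whole numbers, e.g. 118 - 127
    freq = sum(lower <= s <= upper for s in scores)
    table.append((lower, upper, freq))

for lower, upper, freq in table:
    print(f"{lower} - {upper}: {freq}")
```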
Step 6: Solve for the class mark
• The class mark is the midpoint of the class interval.
• Formula: (lower limit + upper limit)/2
CLASS INTERVALS CLASSMARK
118 – 127 (118+127)/2 = 122.5
128 – 137 132.5
138 – 147 142.5
148 – 157 152.5
158 – 167 162.5
168 – 177 172.5
178 - 187 182.5
Step 7: Solve for the relative frequency
• In solving for the relative frequency:
• Formula: $RF = \frac{f}{\Sigma f} \times 100\%$
CLASS INTERVALS    RELATIVE FREQUENCY
118 – 127          7.5%
128 – 137          15%
138 – 147          27.5%
148 – 157          25%
158 – 167          15%
168 – 177          7.5%
178 – 187          2.5%
Step 8: Solve for Lower and Upper Class Boundaries
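• Lower boundary: Subtract 0.5 from the lower limit of each class
• Upper boundary: Add 0.5 to the upper limit of each class
• Note: 0.5 is NOT a fixed constant. It is half the gap between the upper limit of one class and the lower limit of the next, so it varies depending on the data provided.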
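Steps 6 to 8 can be sketched together in Python. The class limits and frequencies are taken from the tables above; computing the boundary adjustment as half the gap between consecutive classes is a common convention assumed here, and it gives 0.5 for this data.

```python
# (lower limit, upper limit, frequency) for each class interval
classes = [
    (118, 127, 3), (128, 137, 6), (138, 147, 11), (148, 157, 10),
    (158, 167, 6), (168, 177, 3), (178, 187, 1),
]

total = sum(f for _, _, f in classes)        # sum of frequencies
half_gap = (classes[1][0] - classes[0][1]) / 2   # (128 - 127) / 2 = 0.5

rows = []
for lower, upper, f in classes:
    rows.append({
        "class_mark": (lower + upper) / 2,     # Step 6
        "relative_freq": f / total * 100,      # Step 7, in percent
        "lower_boundary": lower - half_gap,    # Step 8
        "upper_boundary": upper + half_gap,
    })

for (lower, upper, _), row in zip(classes, rows):
    print(lower, upper, row["class_mark"],
          f'{row["relative_freq"]:.1f}%',
          row["lower_boundary"], row["upper_boundary"])
```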
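$\sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n$
Where:
• $a_i$ = function
• $i = 1$ and $n$ = lower and upper bounds of summation
Determine the sum
$\sum_{k=1}^{4} (k + 2) = (1 + 2) + (2 + 2) + (3 + 2) + (4 + 2) = 18$
$\sum_{k=3}^{5} 3k = 3(3) + 3(4) + 3(5) = 36$
$\sum_{k=0}^{4} (-1)^k (2k + 1) = (-1)^0[2(0) + 1] + (-1)^1[2(1) + 1] + (-1)^2[2(2) + 1] + (-1)^3[2(3) + 1] + (-1)^4[2(4) + 1] = 5$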
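The three sums can be verified with Python generator expressions. The third summand is read here as (-1)^k (2k + 1) for k = 0 to 4, which is an assumption, since the signs and the upper bound are not legible in the source. It is the reading that yields the stated result of 5.

```python
# Python's range(a, b) stops at b - 1, so the upper bound is written as bound + 1.
s1 = sum(k + 2 for k in range(1, 5))                    # k = 1..4
s2 = sum(3 * k for k in range(3, 6))                    # k = 3..5
s3 = sum((-1) ** k * (2 * k + 1) for k in range(0, 5))  # k = 0..4 (assumed reading)

print(s1, s2, s3)
```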
FORMULA: MEASURES OF CENTRAL TENDENCY

Mean
$\text{Mean} = \frac{\Sigma fx}{n}$
Where:
f – frequency
x – class mark / midpoint
n – total number of frequency

Median
$Md = L_{md} + \left(\frac{\frac{n}{2} - f_{m-1}}{f_m}\right)(i)$
Where:
$L_{md}$ - is the lower class boundary of the median group
n - is the total number of frequency
$f_{m-1}$ - is the cumulative frequency of the groups before the median group
$f_m$ - is the frequency of the median group
i - is the class width/size

Mode
$Mo = l_{mo} + \left(\frac{f_o - f_1}{2f_o - f_1 - f_2}\right)(i)$
Where:
$l_{mo}$ - is the lower class boundary of the modal class
$f_1$ - is the frequency of the group before the modal class
$f_o$ - is the frequency of the modal class
$f_2$ - is the frequency of the group after the modal class
i - is the class width/size
Given: List of pre-board examination scores of 40 BSN
graduating students of University of Santo Tomas – Legazpi.
Solve for the descriptive statistical measures and interpret the
results in two to three sentences.
120 133 180 138
140 150 170 153
161 149 124 168
148 139 161 142
130 143 137 147
156 151 128 118
165 138 147 167
146 150 149 129
142 158 152 130
175 148 142 159
MEASURES OF CENTRAL TENDENCY
Step 1: Solve for the Mean
CLASS INTERVALS FREQUENCIES (f) CLASS MARK (x) fx
118 – 127 3 122.5 367.5
128 – 137 6 132.5 795
138 – 147 11 142.5 1567.5
148 – 157 10 152.5 1525
158 – 167 6 162.5 975
168 – 177 3 172.5 517.5
178 - 187 1 182.5 182.5
n = 40 Σfx = 5930
$\text{Mean} = \frac{\Sigma fx}{n} = \frac{5{,}930}{40} = 148.25$ or $148$
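The grouped mean can be checked with a few lines of Python, using the frequencies and class marks from the table above. The variable names are illustrative.

```python
freqs = [3, 6, 11, 10, 6, 3, 1]
marks = [122.5, 132.5, 142.5, 152.5, 162.5, 172.5, 182.5]

sum_fx = sum(f * x for f, x in zip(freqs, marks))  # sum of f times x
n = sum(freqs)                                     # total frequency
mean = sum_fx / n

print(sum_fx, n, mean)
```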
MEASURES OF CENTRAL TENDENCY
Step 2: Solve for the Median
CLASS INTERVALS    FREQUENCIES (f)    LOWER BOUNDARY       <cf
118 – 127          3                  118 – 0.5 = 117.5    3
128 – 137          6                  127.5                9
138 – 147          11                 137.5                20
148 – 157          10                 147.5                30
158 – 167          6                  157.5                36
168 – 177          3                  167.5                39
178 – 187          1                  177.5                40
                   n = 40

$Md = L_{md} + \left(\frac{\frac{n}{2} - f_{m-1}}{f_m}\right)(i)$
$= 137.5 + \frac{20 - 9}{11} \times 10$
$= 137.5 + 10$
$= 147.5$
1. Determine the Median class: n/2 = 40/2 = 20; Median Class: 138 – 147
2. Determine the lower boundary of the median class: L = 137.5
3. Determine cumulative frequency BEFORE the median class: <cf = 9
4. Determine the frequency of the median class: f = 11
5. Determine the class size: i = 10
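The five steps for the median can be sketched as a small Python function. The name `grouped_median` is hypothetical, and the median class is taken to be the first class whose cumulative frequency reaches n/2, as in step 1.

```python
def grouped_median(classes, width):
    """classes: list of (lower boundary, frequency) pairs in ascending order."""
    n = sum(f for _, f in classes)
    cum_before = 0   # cumulative frequency before the current class
    for lower_boundary, f in classes:
        if cum_before + f >= n / 2:   # this is the median class
            return lower_boundary + (n / 2 - cum_before) / f * width
        cum_before += f
    raise ValueError("no classes given")


classes = [(117.5, 3), (127.5, 6), (137.5, 11), (147.5, 10),
           (157.5, 6), (167.5, 3), (177.5, 1)]
median = grouped_median(classes, width=10)
print(median)
```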
MEASURES OF CENTRAL TENDENCY
Step 3: Solve for the Mode
CLASS INTERVALS    FREQUENCIES (f)
118 – 127          3
128 – 137          6
138 – 147          11
148 – 157          10
158 – 167          6
168 – 177          3
178 – 187          1
                   n = 40

$Mo = l_{mo} + \left(\frac{f_o - f_1}{2f_o - f_1 - f_2}\right)(i)$
$= 137.5 + \frac{11 - 6}{2(11) - 6 - 10} \times 10$
$= 137.5 + \frac{5}{6} \times 10$
$= 137.5 + 8.33\ldots$
$= 145.833\ldots$
1. Determine the Modal class: Get the highest frequency; Modal Class: 138 – 147
2. Determine the lower boundary of the modal class: L = 137.5
3. Determine the frequency before the modal class: 𝑓1 = 6
4. Determine the frequency of the modal class: $f_o$ = 11
5. Determine the frequency after the modal class: 𝑓2 = 10
6. Determine the class size: i = 10
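The six steps for the mode can be sketched in Python as follows. Treating a missing neighbor as frequency 0 when the modal class is the first or last class is an assumption added here so the function is total; it does not arise in this example.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode of grouped data using the formula with f_o, f_1 and f_2."""
    i = freqs.index(max(freqs))   # modal class = highest frequency
    f_o = freqs[i]
    f_1 = freqs[i - 1] if i > 0 else 0                # class before (assumed 0 if none)
    f_2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # class after (assumed 0 if none)
    return lower_boundaries[i] + (f_o - f_1) / (2 * f_o - f_1 - f_2) * width


lower_boundaries = [117.5, 127.5, 137.5, 147.5, 157.5, 167.5, 177.5]
freqs = [3, 6, 11, 10, 6, 3, 1]
mode = grouped_mode(lower_boundaries, freqs, width=10)
print(round(mode, 3))
```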
NORMAL
DISTRIBUTION
How would you describe a Normal Distribution?
•A normal distribution is a bell-shaped frequency
distribution curve. Most of the data values in a
normal distribution tend to cluster around the
mean. The further a data point is from the mean,
the less likely it is to occur. Many
quantities, such as intelligence, height, and blood
pressure, approximately follow a normal
distribution.
What are the Characteristics of a Normal
Distribution?
•Normal distributions are
symmetric, unimodal, and
asymptotic, and the mean,
median, and mode are all
equal. A normal distribution
is perfectly symmetrical
around its center.
EMPIRICAL RULE
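• 68.26% of the data lie within 1 standard deviation of the mean (z = -1 to z = 1)
• 95.44% of the data lie within 2 standard deviations of the mean (z = -2 to z = 2)
• 99.74% of the data lie within 3 standard deviations of the mean (z = -3 to z = 3)
[Figure: normal curve with the horizontal axis marked from z = -5 to z = 5]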
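The empirical-rule percentages can be approximated with the error function in Python's standard library. This sketch assumes a standard normal variable; the helper name `within` is illustrative.

```python
import math


def within(k):
    """Probability that a standard normal value lies within k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {within(k) * 100:.2f}%")
```

The computed values (about 68.27%, 95.45%, and 99.73%) differ from the slide figures in the last digit because the slide figures are twice the four-decimal z-table areas (0.3413, 0.4772, 0.4987).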
CONVERSION OF RAW SCORE TO Z-SCORE
• The standard score or z-score measures how many standard deviations a
given value (x) is above or below the mean. The z-scores are useful in
comparing observed values. A positive z-score indicates that the score or
observed value is above the mean, whereas a negative z-score indicates
that the score or observed value is below the mean.
• Formula for sample: $z = \frac{x - \bar{x}}{s}$
• Formula for population: $z = \frac{x - \mu}{\sigma}$
EXAMPLE
• On a sample final examination in integral calculus, the mean
was 75 and the standard deviation was 12. Determine the
standard score of a student who received a score of 60
assuming that the scores are normally distributed.
SOLUTION
• Solve: $z = \frac{x - \bar{x}}{s} = \frac{60 - 75}{12} = -1.25$
• This indicates that 60 is 1.25 standard deviations below the
mean.
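The conversion can be written as a one-line Python function. The name `z_score` is illustrative; the same function serves the sample and the population formula, since only the symbols differ.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z = z_score(60, mean=75, sd=12)
print(z)
```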
AREA OF THE NORMAL CURVE (Z-SCORES)
• The total area under the normal curve is equal to 1.
• The probability that a normal random variable X equals any
particular value is 0.
• The probability that X is greater than a equals the area under the
normal curve bounded by a and plus infinity (as indicated by the
non-shaded area in the figure).
• The probability that X is less than a equals the area under the
normal curve bounded by a and minus infinity (as indicated by the
shaded area in the figure).
CASES IN SOLVING THE AREA
OF A NORMAL CURVE
CASE 1
z = 0 and z = ± a
Note: Just determine the area on the z table.
CASE 2
z = a and z = b or z = -a and z = -b (both on the same side)
Note: Subtract the areas of two z-scores.
CASE 3
z = a and z = -b or z = -a and z = b (on opposite sides)
Note: Add the areas of two z-scores.
CASE 4
z = a (to the right) or z = -a (to the left)
Note: Subtract the area from 0.5.
CASE 5
z = -a (to the right) or z = a (to the left)
Note: Add 0.5 to the Area
EXAMPLE 1: Find the area between z = -1.5 and z = - 2.5
Step 1: Use CASE 2: z scores are both on the same side.
Step 2: Sketch the normal curve and plot the z-scores.
[Figure: normal curve with z = -2.5, z = -1.5, and z = 0 marked on the horizontal axis]
Step 3: Look for the area of the z-scores in the z-table. Note: The negative sign in a z-score
only indicates that it is plotted on the left side of the curve.
Area of -1.5 = 0.4332 and Area of -2.5 = 0.4938
Step 4: Subtract the smaller area from the larger area.
$A = A_2 - A_1$
A = 0.4938 – 0.4332
A = 0.0606 or 6.06%
Step 5: Interpret: The area between z = -1.5 and z = -2.5 is 0.0606 or 6.06%
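The table lookup in this example can be cross-checked with the standard normal cumulative distribution, built here from `math.erf`. This is a sketch: the helper names `phi` and `area_between` are assumptions, and the function works directly with cumulative areas instead of the mean-to-z areas printed in a z-table.

```python
import math


def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def area_between(a, b):
    """Area under the standard normal curve between two z-scores."""
    low, high = sorted((a, b))
    return phi(high) - phi(low)


area = area_between(-1.5, -2.5)
print(round(area, 4))
```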
EXAMPLE 2: The mean height of 2nd Year 2B students at UST-Legazpi is 164
cm and the standard deviation is 10 centimeters. Assuming the heights are
normally distributed, what percent of the heights is greater than 168
centimeters?
Step 1: Convert 168 to z-score
$z = \frac{x - \bar{x}}{s} = \frac{168 - 164}{10} = 0.4$
Step 2: Sketch the normal curve:
[Figure: normal curve with z = 0 (164 cm) and z = 0.4 (168 cm) marked]
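Step 3: Find the area of z = 0.4 in the z table.
Area of 0.4 = 0.1554
Step 4: Use CASE 4: the area to the right of z = 0.4 is 0.5 minus the table area.
A = 0.5 – 0.1554
A = 0.3446 or 34.46%
Step 5: Interpret: About 34.46% of the heights are greater than 168 centimeters.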
$\rho = 1 - \frac{6\Sigma d^2}{n(n^2 - 1)}$
Where:
x 1 2 3 4 5 6
y 5 10 15 15 25 35
SOLUTION:
Step 1: Construct a table of values.
x          y          xy          $x^2$            $y^2$
1          5          5           1                25
2          10         20          4                100
3          15         45          9                225
4          15         60          16               225
5          25         125         25               625
6          35         210         36               1,225
Σx = 21    Σy = 105   Σxy = 465   $\Sigma x^2$ = 91    $\Sigma y^2$ = 2,425
Step 2: Use the formula, where n = 6
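$r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{[n(\Sigma x^2) - (\Sigma x)^2][n(\Sigma y^2) - (\Sigma y)^2]}}$
$r = \frac{6(465) - (21)(105)}{\sqrt{[6(91) - (21)^2][6(2{,}425) - (105)^2]}}$
$r = \frac{585}{\sqrt{370{,}125}} = 0.96157$ or $0.962$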
Interpretation: It indicates that there is a strong positive correlation between the time in hours spent in
studying and the scores on a test.
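The computation of r can be reproduced in Python from the raw x and y values, following the same sums as the table of values. The variable names are illustrative.

```python
import math

x = [1, 2, 3, 4, 5, 6]
y = [5, 10, 15, 15, 25, 35]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(numerator, round(r, 3))
```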
Example 2: In a regional finals for the mathematical device, two judges were asked to rank eight
contestants (A, B, C, …, H) based on their overall performance. Calculate Spearman's rank correlation
coefficient and determine how strong the correlation is between the scores of the two judges. The table
shows the resulting ranks.
Contestant          A  B  C  D  E  F  G  H
First Judge (x)     5  2  4  3  6  1  8  7
Second Judge (y)    3  4  5  2  6  1  7  8
Solution: n =8
Contestants    x    y    d     $d^2$
A              5    3    2     4
B              2    4    -2    4
C              4    5    -1    1
D              3    2    1     1
E              6    6    0     0
F              1    1    0     0
G              8    7    1     1
H              7    8    -1    1
                               $\Sigma d^2$ = 12
$\rho = 1 - \frac{6\Sigma d^2}{n(n^2 - 1)}$
$\rho = 1 - \frac{6(12)}{8(8^2 - 1)}$
$\rho = 1 - \frac{72}{504} = 0.8571$
Interpretation: It indicates that there is a strong positive correlation between the scores of the two
judges.
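The Spearman computation can be checked in Python from the two lists of ranks. This sketch applies the formula as given on the slide, which assumes there are no tied ranks. The variable names are illustrative.

```python
first_judge = [5, 2, 4, 3, 6, 1, 8, 7]    # ranks x for contestants A to H
second_judge = [3, 4, 5, 2, 6, 1, 7, 8]   # ranks y for contestants A to H
n = len(first_judge)

# Sum of squared rank differences d = x - y
sum_d2 = sum((a - b) ** 2 for a, b in zip(first_judge, second_judge))
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2, round(rho, 4))
```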
FIN…
PRAYER FOR OUR COUNTRY
Almighty God, bless our nation and make it true
to the ideals of freedom and justice and
brotherhood for all who make it great. Guard us
from war, from fire and wind, from compromise
and disease from fear and confusion. Be close to
our president and statesmen; give them vision
and courage, as they ponder decisions affecting
peace and the future of the world. Make us more
deeply aware of our heritage; realizing not only
our rights but also our duties and responsibilities
as citizens. Make this great land and all its people
know clearly Your will, that we may fulfill the
destiny ordained for us in the salvation of the
nations, and the restoring of all things in Christ.
Amen.