Stat Review Lecture (Complete)
PROFESSIONAL EDUCATION
BASIC CONCEPTS
A. Definition of Terms:
1. Statistics – the science of the classification and arrangement of facts based on number of
occurrences for the deduction of general assertions
2. Descriptive Statistics – the category of statistics that provides methods for summarizing and
describing numerical data. Tables, graphs and charts are used to display the data for easy
understanding.
3. Data – a collection of any number of related observations on one or more variables.
4. Data Array – the arrangement of raw data by observations in either ascending or descending
order.
5. Raw or Ungrouped Data – information that has not been classified, arranged or organized.
6. Sample – a collection of some, but not all, of the elements of the population under study in
which the statistician is interested.
7. Tabulation – the process of classifying or grouping of scores in a systematic arrangement.
8. Class Frequency – it is the number of measures or observations in an interval.
9. Class Interval – these are the classes or ranges of values that the observations can assume. Each
class interval has a lower limit and an upper limit.
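10. Class Boundaries/Real or Exact Limits – these are the exact limits of a class, obtained by
extending each class limit by 0.5 when the scores are whole numbers. Each boundary lies midway
between the upper limit of one class and the lower limit of the succeeding class (e.g., 99.5
between the classes 95-99 and 100-104).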
11. Class Mark - it is the midpoint of a class interval in a frequency distribution and taken as the
average of the lower and upper limits.
12. Class Size – it is the width of class intervals and measures the interval between the first value of
one class and the first value of the next class.
13. Graph – a picture of the given numerical data.
14. Frequency Distribution – a tabulation where the test scores are ranked and scores having the
same values are combined. It reports the number of observations or scores that are included in
each of the class intervals in a table. It permits the researcher to see at a glance how these
measurements are distributed.
B. Sample Scores
100 106 104 103 105
79 92 91 89 105
110 95 96 108 113
93 102 80 90 101
110 78 107 75 127
99 86 113 98 95
101 103 124 100 87
122 118 104 84 99
109 106 116 111 115
109 88 114 72 102
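(1)      (2)       (3)  (4)   (5)  (6)   (7)    (8)     (9)     (10)      (11)    (12)
c.i.     Tally       f    d    fd  Cf<  Cf</N    MP      d1      d1²      ƒd1²   ƒ|d1|
                                         (%)          (MP − M)
125-129  I           1    5     5   50   100    127    26.3   691.69    691.69    26.3
120-124  II          2    4     8   49    98    122    21.3   453.69    907.38    42.6
115-119  III         3    3     9   47    94    117    16.3   265.69    797.07    48.9
110-114  IIII-I      6    2    12   44    88    112    11.3   127.69    766.14    67.8
105-109  IIII-III    8    1     8   38    76    107     6.3    39.69    317.52    50.4
100-104  IIII-IIII  10    0     0   30    60    102     1.3     1.69     16.90    13.0
95-99    IIII-I      6   -1    -6   20    40     97    -3.7    13.69     82.14    22.2
90-94    IIII        4   -2    -8   14    28     92    -8.7    75.69    302.76    34.8
85-89    IIII        4   -3   -12   10    20     87   -13.7   187.69    750.76    54.8
80-84    II          2   -4    -8    6    12     82   -18.7   349.69    699.38    37.4
75-79    III         3   -5   -15    4     8     77   -23.7   561.69   1685.07    71.1
70-74    I           1   -6    -6    1     2     72   -28.7   823.69    823.69    28.7
TOTAL               50        -13                                      7840.50   498.0
Column notes: c.i. = class interval; ƒ = class frequency; d = coded deviation from the class of the
assumed mean (AM = 102); Cf< = less-than cumulative frequency; Cf</N = cumulative percentage;
MP = class mark (midpoint); d1 = MP − M, the deviation of the class mark from the mean (M).
Totals: N = ∑ƒ = 50; ∑ƒ(MP) = 5035; ∑ƒd = -13; ∑ƒd1² = 7840.5; ∑ƒ|d1| = 498
Mean (M) = 100.70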
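The tabulation above can be reproduced from the raw scores. The following is a minimal Python sketch, assuming class intervals of size 5 with the lowest class starting at 70 (as in the table); the function and variable names are illustrative and not part of the lecture.

```python
# The 50 sample scores from section B.
scores = [
    100, 106, 104, 103, 105, 79, 92, 91, 89, 105,
    110, 95, 96, 108, 113, 93, 102, 80, 90, 101,
    110, 78, 107, 75, 127, 99, 86, 113, 98, 95,
    101, 103, 124, 100, 87, 122, 118, 104, 84, 99,
    109, 106, 116, 111, 115, 109, 88, 114, 72, 102,
]

CLASS_SIZE = 5      # width of each class interval (assumed from the table)
LOWEST_LIMIT = 70   # lower limit of the lowest class, 70-74


def frequency_distribution(values, lowest_limit, class_size):
    """Return rows (lower, upper, class_mark, f, cf) from the lowest class up."""
    n_classes = (max(values) - lowest_limit) // class_size + 1
    rows = []
    cumulative = 0
    for k in range(n_classes):
        lower = lowest_limit + k * class_size
        upper = lower + class_size - 1
        f = sum(lower <= v <= upper for v in values)  # class frequency
        cumulative += f                               # less-than cumulative frequency
        class_mark = (lower + upper) / 2              # midpoint of the class
        rows.append((lower, upper, class_mark, f, cumulative))
    return rows


table = frequency_distribution(scores, LOWEST_LIMIT, CLASS_SIZE)
for lower, upper, mark, f, cf in reversed(table):     # highest class first
    print(f"{lower:>3}-{upper:<3}  f={f:>2}  Cf<={cf:>2}  MP={mark:g}")
```

The frequencies and cumulative frequencies printed should agree with columns 3 and 6 of the table.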
D. Measures of Central Tendency – a single measure in a series which represents all the scores made by
a group.
1. Mean – the best known and most reliable measure of central tendency. It is the sum of the
separate measures divided by their number.
Characteristics of Mean
1. An interval statistic
2. A calculated average
3. Value is determined by every case in the distribution
4. Affected by the extreme measures
5. Sum of deviations about the mean is zero
6. Can be subjected to numerous mathematical computations
7. Most widely used
8. Represents average quantities
a. Computation of the mean: the scores are added and their sum is divided by the sum of
frequencies (the number of scores).
Formula:
X̄ = ∑ƒx / N
where: X̄ = mean
       ∑ = summation sign
       ƒ = class frequency
       x = class mark
       N = sum of the frequencies
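Example:
Method I
X̄ = ∑ƒx / N
  = 5035 / 50
  = 100.70
Method II
X̄ = AM + (∑ƒd / N) i
where: AM = assumed mean (the class mark of the class with d = 0)
       d = coded deviation of each class from the assumed-mean class
       i = class size
X̄ = 102 + (-13 / 50) 5
  = 102 + (-0.26) 5
  = 102 + (-1.30)
  = 100.70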
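Both methods can be checked with a short Python sketch. It assumes the class marks and frequencies of the frequency table, with AM = 102 and class size 5; the variable names are illustrative.

```python
# Class marks (MP) and frequencies (f) from the frequency table, highest class first.
class_marks = [127, 122, 117, 112, 107, 102, 97, 92, 87, 82, 77, 72]
frequencies = [1, 2, 3, 6, 8, 10, 6, 4, 4, 2, 3, 1]
n = sum(frequencies)

# Method I: sum of (frequency x class mark) divided by N.
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))
mean_method_1 = sum_fx / n

# Method II: assumed mean plus the mean coded deviation times the class size.
assumed_mean = 102
class_size = 5
coded_deviations = [(x - assumed_mean) // class_size for x in class_marks]  # 5, 4, ..., -6
sum_fd = sum(f * d for f, d in zip(frequencies, coded_deviations))
mean_method_2 = assumed_mean + (sum_fd / n) * class_size

print(sum_fx, sum_fd)                                   # 5035 -13
print(round(mean_method_1, 2), round(mean_method_2, 2)) # 100.7 100.7
```

Method II gives the same result with smaller numbers, which is why it is convenient for hand computation.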
Formula:
WX̄ = (N1X̄1 + N2X̄2 + N3X̄3) / (N1 + N2 + N3)
Where: WX̄ = weighted mean
       X̄ = mean of each distribution
       N = frequency of each distribution
Section   Mean   N
I         86     32
II        84     45
III       88     40
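Applying the weighted mean formula to the three sections above can be sketched as follows. The dictionary layout and names are illustrative assumptions; only the means and frequencies come from the table.

```python
# (mean, N) for each section, taken from the table above.
sections = {"I": (86, 32), "II": (84, 45), "III": (88, 40)}

total_n = sum(n for _, n in sections.values())                # N1 + N2 + N3
weighted_sum = sum(mean * n for mean, n in sections.values()) # N1*X1 + N2*X2 + N3*X3
weighted_mean = weighted_sum / total_n

print(weighted_sum, total_n)     # 10052 117
print(round(weighted_mean, 2))   # 85.91
```

The result sits closer to 84 than a simple average of the three means (86) would, because Section II has the most cases.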
2. Median – the point on the scale such that exactly one-half of the cases in the distribution are above it
and the other half of the cases are below it.
Girls 1 2 3 4 5 6 7
Ages 13 14 15 16 18 19 19
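Formula (odd number of cases):
Position of Md = (N + 1) / 2
               = (7 + 1) / 2
               = 4th score
Md = 16
Formula (even number of cases):
Position of Md = [N/2 + (N + 2)/2] / 2
               = [8/2 + (8 + 2)/2] / 2
               = (4 + 5) / 2
               = 9 / 2 = 4.5th score
Md = 28.5 or 28 or 29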
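The two position formulas can be sketched in Python as below. The ages come from the example above; the eight-value list is hypothetical, chosen only so that the 4th and 5th values are 28 and 29, since the data for the even case is not shown. The function name is illustrative.

```python
def median_with_position(values):
    """Return (position of the median, median) for ungrouped data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) / 2                  # odd N: a single middle score
        return position, ordered[int(position) - 1]
    lower_pos = n // 2                          # even N: the two middle scores
    upper_pos = (n + 2) // 2
    position = (lower_pos + upper_pos) / 2
    median = (ordered[lower_pos - 1] + ordered[upper_pos - 1]) / 2
    return position, median


ages = [13, 14, 15, 16, 18, 19, 19]                    # from the example above
hypothetical_even = [22, 25, 27, 28, 29, 31, 33, 35]   # assumed data, N = 8

print(median_with_position(ages))               # (4.0, 16)
print(median_with_position(hypothetical_even))  # (4.5, 28.5)
```

For an even number of cases the median falls halfway between the two middle scores, which is why a position such as 4.5th is reported.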
Characteristics of Mode:
1. A nominal statistic
2. An inspection average
3. The most frequently occurring value
4. Usually occurs near the center of the distribution
5. Cannot be manipulated mathematically
6. Some distributions have more than one mode
7. Most popular score
Formula:
Mo = Lmo + [d1 / (d1 + d2)] i
where: Mo = mode
       Lmo = lower limit of the modal class (the assumed mode)
       d1 = frequency of the modal class minus the frequency of the class directly below it
       d2 = frequency of the modal class minus the frequency of the class directly above it
       i = class interval (class size)
Example:
Mo = 100 + [(10 – 6) / ((10 – 6) + (10 – 8))] 5
   = 100 + [4 / (4 + 2)] 5
   = 100 + (4 / 6) 5
   = 100 + (0.667) 5
   = 100 + 3.33
Mo = 103.33
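The mode formula can be sketched as a small Python function. The inputs are those of the example (modal class 100-104 with frequency 10, neighbouring frequencies 6 and 8, class size 5); the function and parameter names are illustrative.

```python
def grouped_mode(lower_limit, f_modal, f_below, f_above, class_size):
    """Mode of grouped data by interpolation within the modal class."""
    d1 = f_modal - f_below   # modal frequency minus the frequency of the class below
    d2 = f_modal - f_above   # modal frequency minus the frequency of the class above
    return lower_limit + d1 / (d1 + d2) * class_size


mode = grouped_mode(lower_limit=100, f_modal=10, f_below=6, f_above=8, class_size=5)
print(round(mode, 2))   # 103.33
```

Because d1 is larger than d2 here, the mode is pulled toward the upper end of the modal class.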
1. What is RANGE?
Range is simply the difference between the highest score and the lowest score in a
distribution.
Example:
99,105, 72, 119, 127, 114
Range = HS – LS
= 127 – 72
= 55
*When to use Range?
When one wants a quick and simple approximation of the spread or variability of scores/
values.
Interpretation:
If the range is small the scores are close together whereas if the range is large the scores are more
spread out.
2. Standard Deviation – the square root of the mean of the squares of the deviations taken from the
arithmetic mean.
Steps:
1. First compute the mean.
   X̄ = ∑X / N
2. Compute the deviation of each score from the mean.
   d = X − X̄
   Where: d = the deviation of each score from the mean
          X = a raw score
          X̄ = the mean
3. Square each of these deviations and add up this column. As a check of our work, the sum of the
   deviations about the mean (the unsquared d column) should be zero.
4. To find the standard deviation, substitute into the following formula:
   SD = √(∑d² / N)
*NOTE: in some texts, the formula for the standard deviation is given as:
Example:
X      Deviation from Mean (d = X − X̄)    Squared Deviation (d²)
40     -30                                900
45     -25                                625
55     -15                                225
65     -5                                 25
70     0                                  0
75     5                                  25
80     10                                 100
85     15                                 225
90     20                                 400
95     25                                 625
∑X = 700                                  ∑d² = 3150
X̄ = ∑X / N
  = 700 / 10
  = 70
SD = √(∑d² / N)
   = √(3150 / 10)
   = √315
SD = 17.75
X      X²
40     (40)² = 1600
45     (45)² = 2025
55     (55)² = 3025
65     (65)² = 4225
70     (70)² = 4900
75     (75)² = 5625
80     (80)² = 6400
85     (85)² = 7225
90     (90)² = 8100
95     (95)² = 9025
∑X = 700     ∑X² = 52150
SD = √[∑X²/N − (∑X/N)²]
   = √[52150/10 − (700/10)²]
   = √(5215 − 4900)
   = √315
   = 17.75
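Both computations of the standard deviation can be verified with a short Python sketch. It divides by N, as the worked examples do; `statistics.pstdev` from the standard library is used only as a cross-check, and the variable names are illustrative.

```python
import math
import statistics

scores = [40, 45, 55, 65, 70, 75, 80, 85, 90, 95]
n = len(scores)
mean = sum(scores) / n

# Deviation method: square root of the mean squared deviation.
deviations = [x - mean for x in scores]
sd_deviation = math.sqrt(sum(d ** 2 for d in deviations) / n)

# Raw-score method: uses only the sum of X and the sum of X squared.
sum_x = sum(scores)
sum_x2 = sum(x ** 2 for x in scores)
sd_raw = math.sqrt(sum_x2 / n - (sum_x / n) ** 2)

print(sum(deviations))                          # 0.0, the check described in step 3
print(sum_x2)                                   # 52150
print(round(sd_deviation, 2), round(sd_raw, 2)) # 17.75 17.75
print(round(statistics.pstdev(scores), 2))      # 17.75 (population SD, divides by N)
```

The two methods are algebraically equivalent, so they agree apart from rounding.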
Steps:
1. Find the mean.
2. Subtract the mean from the class mark and assign it as d1.
3. Multiply the frequency by d1; the product is assigned ƒd1.
4. Square d1 and multiply it by the frequency; the product is assigned ƒd1².
Formula:
SD = √(∑ƒd1² / N)
   = √(7840.5 / 50)
   = √156.81
   = 12.52
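The grouped-data steps can be sketched in Python as follows, assuming the class marks and frequencies of the frequency table; the variable names are illustrative.

```python
import math

# Class marks (MP) and frequencies (f) from the frequency table, highest class first.
class_marks = [127, 122, 117, 112, 107, 102, 97, 92, 87, 82, 77, 72]
frequencies = [1, 2, 3, 6, 8, 10, 6, 4, 4, 2, 3, 1]
n = sum(frequencies)

# Step 1: the mean of the grouped data.
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Steps 2 to 4: d1 = class mark minus mean, then f * d1 squared for each class.
sum_f_d1_squared = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, class_marks))

sd = math.sqrt(sum_f_d1_squared / n)
print(round(sum_f_d1_squared, 2))   # 7840.5
print(round(sd, 2))                 # 12.52
```

Note that each deviation is squared before it is weighted by the frequency; squaring the product ƒd1 instead would give a different and incorrect total.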
3. Average Deviation/Mean Deviation – the mean deviation (MD), also called the average deviation, is the
mean of the absolute deviations of all the separate measures in a series taken from a measure of central
tendency, which is usually the arithmetic mean and seldom the median.
a. When to use the Mean/Average Deviation – when one wants to measure the extent to which each
individual value in the distribution deviates from the mean of the distribution.
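As an illustration, the mean deviation of the 50 sample scores can be computed from the frequency table. The lecture does not state a formula here, so the sketch assumes the usual grouped form MD = ∑ƒ|MP − M| / N; the total 498 in the last column of the table is consistent with that assumption. Variable names are illustrative.

```python
# Class marks (MP) and frequencies (f) from the frequency table, highest class first.
class_marks = [127, 122, 117, 112, 107, 102, 97, 92, 87, 82, 77, 72]
frequencies = [1, 2, 3, 6, 8, 10, 6, 4, 4, 2, 3, 1]
n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Sum of the absolute deviations of the class marks from the mean, weighted by f.
sum_abs_dev = sum(f * abs(x - mean) for f, x in zip(frequencies, class_marks))
mean_deviation = sum_abs_dev / n   # assumed formula: MD = sum(f|MP - M|) / N

print(round(sum_abs_dev, 1))       # 498.0
print(round(mean_deviation, 2))    # 9.96
```

Absolute values are needed because the signed deviations about the mean always sum to zero.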
4. Quartile Deviation – one-half of the distance between the 75th percentile and the 25th percentile in a
frequency distribution.
Q1 (25th Percentile) is the first quartile of a set of scores, the point below which lies 25% of the
number of cases.
Q3 (75th Percentile) is the third quartile of a set of scores, the point below which lies 75% of
the number of cases.
5. Deciles – these are values which divide the distribution into ten equal parts. (Stanines, by contrast, divide it into nine.)
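Example:
D6 = 99.5 + [(6(50)/10 − 20) / 10] 5
   = 99.5 + [(300/10 − 20) / 10] 5
   = 99.5 + [(30 − 20) / 10] 5
   = 99.5 + (10 / 10) 5
   = 99.5 + (1) 5
   = 99.5 + 5
   = 104.5
Here 99.5 is the lower boundary of the class containing D6 (100-104), 50 is N, 20 is the cumulative
frequency below that class, 10 is the frequency of that class, and 5 is the class size.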
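The same interpolation gives quartiles, deciles and percentiles of grouped data. The following is a minimal Python sketch, assuming the lower class boundaries and frequencies of the frequency table; `grouped_quantile` and its parameters are illustrative names, not part of the lecture.

```python
# (lower class boundary, frequency) for each class, lowest class first.
classes = [
    (69.5, 1), (74.5, 3), (79.5, 2), (84.5, 4), (89.5, 4), (94.5, 6),
    (99.5, 10), (104.5, 8), (109.5, 6), (114.5, 3), (119.5, 2), (124.5, 1),
]
CLASS_SIZE = 5


def grouped_quantile(k, parts, classes, class_size):
    """Point below which k/parts of the cases lie (parts = 4, 10 or 100)."""
    n = sum(f for _, f in classes)
    target = k * n / parts        # e.g. 6N/10 for the sixth decile
    cumulative = 0                # cumulative frequency below the current class
    for lower_boundary, f in classes:
        if cumulative + f >= target:
            return lower_boundary + (target - cumulative) / f * class_size
        cumulative += f
    raise ValueError("k must not exceed parts")


d6 = grouped_quantile(6, 10, classes, CLASS_SIZE)
q1 = grouped_quantile(1, 4, classes, CLASS_SIZE)
q3 = grouped_quantile(3, 4, classes, CLASS_SIZE)
quartile_deviation = (q3 - q1) / 2
p10 = grouped_quantile(10, 100, classes, CLASS_SIZE)
p90 = grouped_quantile(90, 100, classes, CLASS_SIZE)

print(d6)                              # 104.5
print(q1, q3)                          # 92.625 109.1875
print(round(quartile_deviation, 2))    # 8.28
print(p10, round(p90, 2))              # 82.0 116.17
```

The quartile deviation and the two percentiles obtained this way match the values 8.28, 82 and 116.17 that appear in the later example for K, which suggests that example is based on the same table.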
6. Percentile
The percentiles are the points that divide the total number of scores into exactly one hundred equal
parts. There are ninety-nine (99) percentiles, each marking the point below which a certain
percentage of the test scores falls. For example, the seventh percentile (P7) is the point at or
below which 7% of the test scores in the distribution lie, while 93% lie above it.
The percentiles are computed in exactly the same manner as the median.
STEPS IN FINDING THE PERCENTILE FROM THE RAW SCORES ARE AS FOLLOWS:
1. Arrange the scores from highest to lowest or from lowest to highest.
2. Determine Pk where Pk is the kth percentile and k is equal to 1, 2, 3, … 99.
3. Computation of the percentile is very much similar to the computation of the quartile and decile.
4. Consider the following formulas for ungrouped and grouped data:
[Figure: positively skewed distribution; the mode (Mo), median (Md) and mean (M) fall in that order from left to right.]
Example:
Table 1. Computation of the sum of Cubes of deviations about Arithmetic mean in a symmetrical
distribution.
Table 2. Computation of the sum of squared deviation and sum of cubes of deviation times frequency above
the arithmetic mean in a positively skewed distribution.
Skewness = [∑ƒ(X − X̄)³ / N] / [(∑ƒ(X − X̄)² / N) √(∑ƒ(X − X̄)² / N)]
Wherein: X = midpoint of the class limits
         ƒ = frequency
         X̄ = arithmetic mean
         N = number of cases
This is the computation of skewness in Table 2.
Given: ∑ƒ(X − X̄)³ = 31830.56999
       ∑ƒ(X − X̄)² = 3635.1
       N = 40
Skewness = (31830.56999 / 40) / [(3635.1 / 40) √(3635.1 / 40)]
         = 795.7642498 / [90.8775 (9.53297)]
         = 795.7642498 / 866.3324812
         = 0.92
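The skewness computation can be checked with a short Python sketch. It takes the three given totals as inputs, since Table 2 itself is not reproduced; the function and parameter names are illustrative.

```python
import math


def skewness_from_sums(sum_f_dev_cubed, sum_f_dev_squared, n):
    """Third moment about the mean divided by the variance times the SD."""
    third_moment = sum_f_dev_cubed / n
    variance = sum_f_dev_squared / n
    return third_moment / (variance * math.sqrt(variance))


skewness = skewness_from_sums(31830.56999, 3635.1, 40)
print(round(skewness, 2))   # 0.92
```

A positive result indicates a tail to the right, a negative result a tail to the left, and a value near zero a roughly symmetrical distribution.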
A distribution can be positively or negatively skewed.
[Figure: three curves labeled Normal, Negatively Skewed and Positively Skewed.]
K = Q.D / (P90 – P10)
Where: Q.D = quartile deviation of the scores/class intervals
       P90, P10 = the 90th and 10th percentiles of the scores/class intervals
Example:
K = 8.28 / (116.17 − 82)
  = 8.28 / 34.17
  = .242317
  = .24
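A minimal Python sketch of this ratio, using the three values given in the example; the function name is illustrative.

```python
def kurtosis_ratio(quartile_deviation, p90, p10):
    """K = Q.D / (P90 - P10), as defined above."""
    return quartile_deviation / (p90 - p10)


k = kurtosis_ratio(quartile_deviation=8.28, p90=116.17, p10=82)
print(round(k, 2))   # 0.24
```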
G. Standard Scores
These are linearly derived scores that represent the area under the normal curve.
1. How to compute for STANDARD SCORES?
a. Find the difference between the individual’s raw score and the mean of the normative group.
b. Divide the difference by the standard deviation of the normative group.
Formula:
Z = (X − X̄) / SD
Where: X = any raw score
       X̄ = mean of the group
       SD = standard deviation of the group
       Z = standard score
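Example I.
Which score is better in an intelligence test: a score of 145 on Test I or a score of 60 on Test II?
Test I                      Test II
X̄ = 100                     X̄ = 40
SD = 15                     SD = 5
Z = (145 − 100) / 15        Z = (60 − 40) / 5
  = 45 / 15                   = 20 / 5
  = 3                         = 4
The score of 60 on Test II is better, because its z-score (4) is higher than that of Test I (3).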
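The comparison in Example I can be sketched in Python as follows; the function name is illustrative.

```python
def z_score(raw_score, mean, sd):
    """Standard score: distance of a raw score from the mean in SD units."""
    return (raw_score - mean) / sd


z_test_1 = z_score(145, mean=100, sd=15)
z_test_2 = z_score(60, mean=40, sd=5)

print(z_test_1, z_test_2)   # 3.0 4.0
print("Test II" if z_test_2 > z_test_1 else "Test I")   # Test II
```

Converting both scores to z-scores puts them on a common scale, so scores from tests with different means and standard deviations can be compared directly.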
Z = (141 − 148) / 3.2       Z = (112 − 113) / 4.3
  = −7 / 3.2                  = −1 / 4.3
  = -2.19                     = -0.23
The test in GEN. SCIENCE has a higher score.
OTHER FACTS:
A distribution that contains at least 30 scores will tend to approximate the normal curve.
An important use of the normal curve for physical educators is in determination of percentile ranks.
When distributions are not normally distributed the curve is skewed.
Positive – tail to the right
Negative – tail to the left
So an IQ of 85 is equivalent to a z-score of -1, and our question becomes what proportion of the normal
curve is lower than a z-score of -1. Appendix A only shows the proportion of scores between a z-score and
the mean (we can ignore the negative sign, as the table only uses positive values). So we must look up the
proportion of scores between a z-score of -1 and the mean and then subtract this value from .50 (50%). So
.50 - .3413 (Appendix A value) = .1587, or 15.87% of the population has IQ scores lower than 85.
We can also solve this problem by using the normal curve figure we have seen in this lesson. Just add up
the proportions of the curve below -1 standard deviation: 13.59 + 2.14 + 0.13 = 15.86%.
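The table lookup can be cross-checked with Python's standard library. This sketch assumes an IQ scale with mean 100 and standard deviation 15, which is consistent with an IQ of 85 being a z-score of -1; `area_between_mean_and` is an illustrative helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist()                 # mean 0, SD 1
iq_scale = NormalDist(mu=100, sigma=15)        # assumed IQ scale


def area_between_mean_and(z):
    """Proportion of the normal curve between the mean and a z-score."""
    return abs(standard_normal.cdf(z) - 0.5)


below_85 = iq_scale.cdf(85)                    # proportion with IQ below 85

print(round(area_between_mean_and(-1.0), 4))   # 0.3413
print(round(below_85, 4))                      # 0.1587
```

The small difference between 15.87% and the 15.86% obtained from the figure comes from rounding the individual segments of the curve.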
z-SCORE   AREA (between the mean and z)
0.0       0.0000
0.5       0.1915
0.86      0.3051
1         0.3413
1.52      0.4357
2.5       0.4938
3.99      0.5000
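[Figures: normal curves showing the area between the mean and z for Z = 1.00 (0.3413), Z = 0.5 (0.1915), Z = 1.52 and Z = 2.5.]
[Figure: the normal curve with equivalent score scales. Percent of cases under portions of the normal
curve: 34.13% between the mean and 1 SD on each side, 13.59% between 1 SD and 2 SD on each side.]
z-scores            -3     -2     -1     0      1      2      3
Percentile scores   0.1    2      16     50     84     98     99.9
T-scores            20     30     40     50     60     70     80
Stanines            1  2  3  4  5  6  7  8  9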