Lecture 2-3 Data Analysis Location & Dispersion
Example
Geometric Mean
⚫ The geometric mean is defined as the positive n-th root of the
product of the n observations. Symbolically,
$G = \sqrt[n]{x_1 x_2 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n}$
⚫ Find geometric mean of rate of growth: 34, 27, 45, 55, 22, 34
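The exercise above can be checked with a short Python sketch. The helper name `geometric_mean` is chosen here for illustration; it averages logarithms, which is equivalent to taking the n-th root of the product for positive values.

```python
import math

def geometric_mean(values):
    """Return the n-th root of the product of positive observations."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    # Averaging the logarithms equals taking the n-th root of the product,
    # and avoids overflow when the raw product gets large.
    return math.exp(sum(math.log(v) for v in values) / len(values))

growth_rates = [34, 27, 45, 55, 22, 34]
gm = geometric_mean(growth_rates)
print(f"Geometric mean of growth rates: {gm:.2f}")
```

The result lies between 34 and 35, below the arithmetic mean of the same values (about 36.2), as expected for data that are not all equal.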
Harmonic Mean
⚫ The harmonic mean is the number of observations divided
by the sum of the reciprocals of the observations:
$H = \dfrac{n}{\sum_{i=1}^{n} 1/x_i}$
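A minimal Python sketch of this definition follows. The two speeds are hypothetical values picked only to illustrate the calculation; they do not come from the lecture.

```python
def harmonic_mean(values):
    """Number of observations divided by the sum of their reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive values")
    return len(values) / sum(1 / v for v in values)

# Hypothetical example: speeds (km/h) over two legs of equal distance.
speeds = [40, 60]
hm = harmonic_mean(speeds)
print(f"Harmonic mean: {hm:.2f}")  # 2 / (1/40 + 1/60) = 48
```

The harmonic mean (48) is lower than the arithmetic mean (50) because the reciprocals give more weight to the smaller value.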
Median
⚫ The median is the value at position (n + 1)/2 in the ordered data.
● Range
✔ Difference between maximum and minimum values
● Interquartile Range
✔ Difference between third and first quartile (Q3 - Q1)
● Variance
✔ Average of the squared deviations from the mean
● Standard Deviation
✔ Square root of the variance
Variability
[Figure: a data set with variability compared with a data set with no variability]
The Range
• The range, R, of a set of n measurements is the
difference between the largest and smallest
measurements.
• Example: A botanist records the number of
petals on 5 flowers:
5, 12, 6, 8, 14
• The range is R = 14 – 5 = 9.
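The petal example can be reproduced in a couple of lines of Python; the variable names are illustrative.

```python
petals = [5, 12, 6, 8, 14]

# Range = largest measurement minus smallest measurement.
data_range = max(petals) - min(petals)
print(f"R = {max(petals)} - {min(petals)} = {data_range}")
```

Because the range uses only the two extreme values, a single unusual measurement can change it substantially.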
Quartiles
• The quartiles Q1, Q2, and Q3 divide the ordered data into four parts, each containing 25% of the measurements.
Percentile
50th Percentile ≡ Median (Q2)
25th Percentile ≡ Lower Quartile (Q1)
75th Percentile ≡ Upper Quartile (Q3)
Interquartile Range:
IQR=Q3 – Q1
• The position of the p-th percentile in the ordered data is (p/100)(n + 1)
Example
The prices ($) of 18 brands of walking shoes:
40 60 65 65 65 68 68 70 70
70 70 70 70 74 75 75 90 95
✔ The position of Q1 is 0.25(18 + 1) = 4.75, so Q1 is 3/4 of the way between the 4th and 5th ordered
measurements, or Q1 = 65 + 0.75(65 - 65) = 65.
✔ The position of Q3 is 0.75(18 + 1) = 14.25, so Q3 is 1/4 of the way between the 14th and 15th ordered
measurements, or
Q3 = 74 + 0.25(75 - 74) = 74.25
✔ and
IQR = Q3 – Q1 = 74.25 - 65 = 9.25
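The quartile calculation for the shoe prices can be sketched in Python as follows. The function `percentile` is a hypothetical helper written for this example; it applies the (n + 1) position rule and interpolates linearly between neighbouring ordered measurements. Other software may use different interpolation rules and give slightly different quartiles.

```python
def percentile(values, p):
    """p-th percentile using the position rule (p/100) * (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    position = p / 100 * (n + 1)          # 1-based position in the ordered data
    position = min(max(position, 1), n)   # keep the position inside 1..n
    lower = int(position)                 # ordered measurement just below
    fraction = position - lower           # how far to move toward the next one
    if lower == n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

prices = [40, 60, 65, 65, 65, 68, 68, 70, 70,
          70, 70, 70, 70, 74, 75, 75, 90, 95]

q1 = percentile(prices, 25)   # position 4.75
q3 = percentile(prices, 75)   # position 14.25
iqr = q3 - q1
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```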
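90-th percentile P90
⚫ The position of the 90-th percentile is
0.9(18 + 1) = 17.1
⚫ P90 is therefore 1/10 of the way between the 17th and 18th ordered
measurements, or P90 = 90 + 0.1(95 - 90) = 90.5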
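As a cross-check, Python's standard library offers `statistics.quantiles`, whose `"exclusive"` method uses the same (n + 1) position rule. The sketch below splits the shoe prices into 10 equal parts and reads off the ninth cut point as P90.

```python
import statistics

prices = [40, 60, 65, 65, 65, 68, 68, 70, 70,
          70, 70, 70, 70, 74, 75, 75, 90, 95]

# n=10 returns the 9 cut points P10, P20, ..., P90.
deciles = statistics.quantiles(prices, n=10, method="exclusive")
p90 = deciles[8]

position = 0.9 * (len(prices) + 1)   # 17.1: between the 17th and 18th values
print(f"Position = {position:.1f}, P90 = {p90}")
```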
The Variance
• The variance of a population of N measurements
is the average of the squared deviations of the
measurements about their mean μ.
For the petal data (x̄ = 45/5 = 9):
x_i    x_i - x̄    (x_i - x̄)²
5      -4         16
12      3          9
6      -3          9
8      -1          1
14      5         25
Sum 45  0         60
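The deviation table can be rebuilt step by step in Python. This is a sketch using the petal data; it prints both the population form (divide by N) and the sample form (divide by n - 1) so the two can be compared.

```python
petals = [5, 12, 6, 8, 14]
n = len(petals)

mean = sum(petals) / n                         # 45 / 5 = 9
deviations = [x - mean for x in petals]        # x_i - mean
squared = [d ** 2 for d in deviations]         # (x_i - mean)^2

for x, d, sq in zip(petals, deviations, squared):
    print(f"{x:>3} {d:>5.0f} {sq:>5.0f}")

sum_sq = sum(squared)                          # 60
population_variance = sum_sq / n               # divide by N
sample_variance = sum_sq / (n - 1)             # divide by n - 1
print(f"Sum of squared deviations = {sum_sq:.0f}")
print(f"Population variance = {population_variance}, sample variance = {sample_variance}")
```

The deviations always sum to zero, which is why they are squared before averaging.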
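Two Ways to Calculate the Sample Variance
Use the calculation formula:
$s^2 = \dfrac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}$
x_i    x_i²
5       25
12     144
6       36
8       64
14     196
Sum 45 465
s² = (465 - 45²/5) / (5 - 1) = (465 - 405) / 4 = 15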
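A short Python sketch of the calculation (shortcut) formula for the petal data, compared against `statistics.variance` from the standard library, which computes the sample variance with an n - 1 divisor.

```python
import statistics

petals = [5, 12, 6, 8, 14]
n = len(petals)

sum_x = sum(petals)                    # 45
sum_x2 = sum(x * x for x in petals)    # 465

# Shortcut formula: no need to compute individual deviations.
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

s2_library = statistics.variance(petals)
print(f"Shortcut formula: {s2_shortcut}, statistics.variance: {s2_library}")
```

Both routes give the same value; the shortcut is convenient by hand, while the deviation form is usually safer numerically for large values.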
Example- ungrouped data
⚫ Sample: The moisture contents (%) of six kraft paper specimens are
6.7, 6.0, 6.4, 6.4, 5.9, and 5.8.
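The summary measures for this ungrouped sample can be obtained with Python's `statistics` module. The sketch treats the six readings as a sample, so the variance uses the n - 1 divisor.

```python
import statistics

moisture = [6.7, 6.0, 6.4, 6.4, 5.9, 5.8]

mean = statistics.mean(moisture)
variance = statistics.variance(moisture)   # sample variance, divisor n - 1
std_dev = statistics.stdev(moisture)       # square root of the sample variance
data_range = max(moisture) - min(moisture)

print(f"mean = {mean:.2f}, s^2 = {variance:.3f}, s = {std_dev:.3f}, range = {data_range:.1f}")
```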
Suppose s = 2.
⚫ Mathematically,
Skewness
⚫ Mathematically,
Kurtosis
⚫Peakedness of a distribution
⚫ Leptokurtic: high and thin
⚫ Mesokurtic: normal in shape
⚫ Platykurtic: flat and spread out
[Figure: leptokurtic, mesokurtic, and platykurtic curves]
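The sketch below computes moment-based coefficients of skewness and kurtosis in Python. This is an assumption: it uses one common definition (third and fourth central moments scaled by powers of the second), which may differ from the formula intended in these slides. The function names and the use of the petal data are illustrative only.

```python
def central_moment(values, k):
    """k-th central moment: average of (x - mean)^k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def moment_skewness(values):
    # Assumed definition: m3 / m2^(3/2); 0 for a symmetric distribution.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def moment_kurtosis(values):
    # Assumed definition: m4 / m2^2; about 3 for a normal (mesokurtic) shape.
    return central_moment(values, 4) / central_moment(values, 2) ** 2

petals = [5, 12, 6, 8, 14]
skew = moment_skewness(petals)
kurt = moment_kurtosis(petals)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```

Under this definition, values above 3 point to a leptokurtic shape and values below 3 to a platykurtic one; with only five observations the numbers are purely illustrative.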
Skewness and Kurtosis
Example - grouped data
42, 53, 68, 66, 72, 74, 99, 69, 49, 50, 41, 76, 98, 77, 79, 60, 84, 80, 90, 52, 82, 50, 79, 84, 81,
85, 67, 79, 76, 96, 43, 65, 54, 42, 51, 61, 78, 73, 64, 86, 75, 77, 59, 69, 78, 83, 56, 81, 70, 94,
63, 95, 99, 80, 71

CI         Mid-Value (x)   Tally Marks       Frequency (f)   Cumulative Frequency
40 - 50    45              IIII              5               5
50 - 60    55              IIII III          8               13
60 - 70    65              IIII IIII         10              23
70 - 80    75              IIII IIII IIII    15              38
80 - 90    85              IIII IIII         10              48
90 - 100   95              IIII II           7               55

N = Σ f_i = f1 + f2 + f3 + f4 + f5 + f6
  = 5 + 8 + 10 + 15 + 10 + 7 = 55

Definitions
For the median:
L = lower limit of the median class
h = magnitude (width) of the median class
fm = frequency of the median class
c = cumulative frequency of the pre-median class
For the mode:
L = lower limit of the modal class
h = magnitude (width) of the modal class
f1 = frequency of the modal class
f0 = frequency of the pre-modal class
f2 = frequency of the post-modal class

Mean: $\bar{x} = \dfrac{1}{N}\sum_{i=1}^{n} f_i x_i = \dfrac{1}{N}\,[f_1 x_1 + f_2 x_2 + \cdots + f_n x_n] = 71.91$
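The grouped-data calculations can be sketched in Python as follows. The mean uses the frequency-weighted mid-values from the table. The median and mode lines are an assumption: they use the standard textbook interpolation formulas, Median = L + ((N/2 - c)/fm)·h and Mode = L + ((f1 - f0)/(2·f1 - f0 - f2))·h, which match the symbols defined above but are not written out in this section.

```python
# (lower limit, upper limit, frequency) for each class interval
classes = [(40, 50, 5), (50, 60, 8), (60, 70, 10),
           (70, 80, 15), (80, 90, 10), (90, 100, 7)]

N = sum(f for _, _, f in classes)                        # 55

# Mean: frequency-weighted average of the class mid-values.
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / N

# Median (assumed standard formula): first class whose cumulative
# frequency reaches N/2 is the median class.
cumulative = 0
for lo, hi, f in classes:
    if cumulative + f >= N / 2:
        median = lo + (N / 2 - cumulative) / f * (hi - lo)
        break
    cumulative += f

# Mode (assumed standard formula): the class with the largest frequency.
k = max(range(len(classes)), key=lambda i: classes[i][2])
L, upper, f1 = classes[k]
f0 = classes[k - 1][2] if k > 0 else 0
f2 = classes[k + 1][2] if k < len(classes) - 1 else 0
mode = L + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - L)

print(f"N = {N}, mean = {mean:.2f}, median = {median:.2f}, mode = {mode:.2f}")
```

For this table the median class and the modal class are both 70 - 80, so L = 70 and h = 10 in both formulas.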