Chapter 3-Numerical Descriptive Measures
Objectives
Summary Definitions
The Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Where n = sample size
Xi = observed values
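The formula for the sample mean can be checked with a few lines of Python. This is only a sketch: the function name `sample_mean` and the five data values are made up for illustration and do not come from the slides.

```python
def sample_mean(values):
    """Arithmetic mean: the sum of the observed values divided by the sample size n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / n


# Hypothetical sample, chosen only for illustration.
sample = [11, 12, 13, 14, 15]
print(sample_mean(sample))  # 13.0
```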
Applied Statistics for Business
ĐẠI HỌC FPT CẦN THƠ
The Median
Median position = (n + 1)/2 position in the ordered data
⚫ If the number of values is odd, the median is the middle number.
⚫ If the number of values is even, the median is the average of the two
middle numbers.
Note that (n + 1)/2 is not the value of the median, only the position of the median in the ranked data.
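The position rule and the odd/even cases can be sketched in Python as follows. The function name `median_by_position` and the sample values are assumptions made for this illustration.

```python
def median_by_position(values):
    """Median found through the (n + 1)/2 position in the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    position = (n + 1) / 2  # 1-based position, not the median itself
    if n % 2 == 1:
        # Odd n: the position is a whole number, so take the middle value.
        return ordered[int(position) - 1]
    # Even n: the position ends in .5, so average the two middle values.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


# Hypothetical samples, chosen only for illustration.
print(median_by_position([7, 3, 9]))     # 7
print(median_by_position([7, 3, 9, 5]))  # 6.0
```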
The Mode
⚫ Value that occurs most often.
⚫ Not affected by extreme values.
⚫ Used for either numerical or categorical data.
⚫ There may be no mode.
⚫ There may be several modes.
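The cases of no mode, one mode, and several modes can be illustrated with a short Python sketch. The function name `modes` is hypothetical, and treating "every value occurs once" as "no mode" is a convention assumed here.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often (an empty list means no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Assumed convention: if no value repeats, there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


# Hypothetical data, chosen only for illustration.
print(modes([1, 2, 3, 4]))            # []
print(modes([1, 2, 2, 3, 3]))         # [2, 3]
print(modes(["red", "blue", "red"]))  # ['red']
```

The last call shows that the mode also works for categorical data.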
Review Example
▪ Geometric mean
▪ Used to measure the rate of change of a variable over time.
$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$
▪ Where Ri is the rate of return in time period i.
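A minimal Python sketch of both geometric mean formulas is given below. The function names and the two-period returns (a 50% loss followed by a 100% gain) are assumptions chosen for illustration.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    n = len(values)
    return math.prod(values) ** (1 / n)


def geometric_mean_return(rates):
    """Geometric mean rate of return; each rate is a proportion (0.10 means 10%)."""
    n = len(rates)
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / n) - 1


# Hypothetical returns: the investment halves, then doubles back to its start.
rates = [-0.5, 1.0]
print(sum(rates) / len(rates))       # arithmetic mean of the rates: 0.25
print(geometric_mean_return(rates))  # geometric mean rate of return: 0.0
```

In this example the arithmetic mean suggests a 25% average gain even though the investment ends where it began, while the geometric mean rate of return is 0.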
Central Tendency
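⚫ Arithmetic mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
⚫ Median: middle value in the ordered array
⚫ Mode: most frequently observed value
⚫ Geometric mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$, the rate of change of a variable over time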
Measures of Variation
Variation
The Range
Example (dot plot on an axis from 0 to 14; smallest value 1, largest value 13):
Range = 13 - 1 = 12
Two dot plots, each spanning 7 to 12:
Range = 12 - 7 = 5 in both cases
▪ Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
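The two outlier examples above can be reproduced in Python. The function name `value_range` is an assumption; the data are the two lists from the slide.

```python
def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


without_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 5]
with_outlier = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 120]

print(value_range(without_outlier))  # 4
print(value_range(with_outlier))     # 119
```

Changing a single value from 5 to 120 moves the range from 4 to 119, although the other 24 values are unchanged.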
The Sample Variance
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Where X̄ = arithmetic mean
n = sample size
Xi = ith value of the variable X
The Sample Standard Deviation
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
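The sample variance and sample standard deviation, both using the n - 1 divisor, can be sketched in Python as follows. The function names and the eight data values are assumptions for illustration only.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed for the sample variance")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical sample: mean 16, sum of squared deviations 130.
sample = [10, 12, 14, 15, 17, 18, 18, 24]
print(sample_variance(sample))  # 130 / 7, about 18.57
print(sample_std(sample))       # about 4.31
```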
Three data sets, each shown as a dot plot on an axis from 11 to 21:
Data A: Mean = ? S = ?
Data B: Mean = ? S = ?
Data C: Mean = ? S = ?
Summary Characteristics
Coefficient of variation:
$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
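As a rough illustration, the coefficient of variation can be computed with Python's standard `statistics` module, whose `stdev` function uses the n - 1 divisor. The function name and the sample values are assumptions.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation relative to the sample mean, as a percentage."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return s / mean * 100


# Hypothetical sample, chosen only for illustration (mean 16, S about 4.31).
sample = [10, 12, 14, 15, 17, 18, 18, 24]
print(round(coefficient_of_variation(sample), 2))
```

Because CV is expressed as a percentage of the mean, it can be used to compare the variability of data sets measured in different units.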
Z score:
$$Z = \frac{X - \bar{X}}{S}$$
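The Z score of every value in a sample can be sketched in Python as below. The function name `z_scores` and the data are hypothetical.

```python
import statistics


def z_scores(values):
    """Distance of each value from the sample mean, in sample standard deviations."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return [(x - mean) / s for x in values]


# Hypothetical sample, chosen only for illustration (mean 16, S about 4.31).
sample = [10, 12, 14, 15, 17, 18, 18, 24]
for x, z in zip(sample, z_scores(sample)):
    print(x, round(z, 2))
```

Values below the mean get negative Z scores and values above it get positive ones.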
Shape of a Distribution
Skewness
Skewness statistic: < 0 (left-skewed), 0 (symmetric), > 0 (right-skewed)
Kurtosis
⚫ Sharper peak than bell-shaped (kurtosis > 0)
⚫ Bell-shaped (kurtosis = 0)
⚫ Flatter than bell-shaped (kurtosis < 0)
– Constructing a boxplot.
Quartile Measures
⚫ Quartiles split the ranked data into 4 segments with an
equal number of values per segment.
Q1 Q2 Q3
◼ The first quartile, Q1, is the value for which 25% of the
values are smaller and 75% are larger.
◼ Q2 is the same as the median (50% of the values are
smaller and 50% are larger).
◼ Only 25% of the values are greater than the third quartile.
Quartile Measures
Calculating The Quartiles: Example
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22
(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data,
so Q1 = (12+13)/2 = 12.5.
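The quartile calculation for this sample can be sketched in Python. Only the Q1 position (n + 1)/4 and the averaging at a .5 position come from the example above. The Q3 position 3(n + 1)/4 and the rule of rounding other fractional positions to the nearest whole position are assumptions of this sketch, as are the function names.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position in ordered data.

    Whole position: take that value. Position ending in .5: average the two
    neighbouring values. Any other fraction: round to the nearest position
    (an assumed convention).
    """
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    if fraction == 0.5:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[round(position) - 1]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the (n + 1)-based position rules."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two values are needed")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q2 = value_at_position(ordered, (n + 1) / 2)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q2, q3


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 12.5 16 19.5
print(q3 - q1)     # interquartile range: 7.0
```

Statistical software often uses other interpolation rules, so quartiles reported by a package may differ slightly from these values.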
⚫ Measures like Q1, Q3, and IQR that are not influenced by
outliers are called resistant measures.
Example (five-number summary: minimum, Q1, median, Q3, maximum): 12 30 45 57 70
Interquartile range = Q3 - Q1 = 57 - 30 = 27
[Figure: three boxplots, each marking Q1, Q2, and Q3, with the distance comparisons >, ≈, and < respectively]
Boxplot Example
Five-number summary marked on the boxplot: 0 2 3 5 27
The mean µ
⚫ The population mean is the sum of the values in the
population divided by the population size, N.
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Where μ = population mean
N = population size
Xi = ith value of the variable X
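The Variance σ²
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
The Standard Deviation σ
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$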
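The population formulas divide by N rather than by n - 1. A minimal Python sketch, with hypothetical function names and a made-up population of eight values:

```python
import math


def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    N = len(values)
    if N == 0:
        raise ValueError("the population must contain at least one value")
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N


def population_std(values):
    """Population standard deviation: square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical population: mean 16, sum of squared deviations 130.
population = [10, 12, 14, 15, 17, 18, 18, 24]
print(population_variance(population))  # 16.25
print(population_std(population))       # about 4.03
```

Treating the same eight values as a sample would divide by 7 instead of 8 and give a slightly larger variance.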
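⚫ In a bell-shaped distribution, approximately 68% of the values are within 1 standard deviation of the mean (µ ± 1σ).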
⚫ Approximately 95% of the values are within µ ± 2σ.
⚫ Approximately 99.7% of the values are within µ ± 3σ.
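The 68%, 95%, and 99.7% figures can be checked against an exact normal distribution using `NormalDist` from Python's standard `statistics` module. The function name `share_within` is an assumption, and the check uses a perfectly normal curve, which real data only approximate.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k) * 100:.2f}%")
```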
Chebyshev’s Rule
⚫ Regardless of how the data are distributed, at least (1 - 1/k²) x 100% of the values will fall within k standard deviations of the mean (for k > 1).
– Examples:
At least                          Within
(1 - 1/2²) x 100% = 75%           k = 2 (μ ± 2σ)
(1 - 1/3²) x 100% = 88.89%        k = 3 (μ ± 3σ)
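The Chebyshev bound is a one-line calculation, sketched here in Python with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's rule requires k > 1")
    return (1 - 1 / k ** 2) * 100


for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 2))
# 2 75.0
# 3 88.89
```

These are lower bounds that hold for any distribution, so they are smaller than the percentages that apply to bell-shaped data.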
⚫ The Covariance.
⚫ The Coefficient of Correlation.
The Covariance
$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
⚫ Only concerned with the strength of the relationship.
⚫ No causal effect is implied.
Interpreting Covariance
Coefficient of Correlation
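$$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$$
Where,
$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
$$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
$$S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$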
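The sample covariance and the coefficient of correlation can be sketched together in Python. The function names and the small paired data sets are assumptions made for illustration.

```python
import math


def sample_covariance(xs, ys):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(values):
    """Sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def correlation(xs, ys):
    """Coefficient of correlation r = cov(X, Y) / (S_X * S_Y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


# Hypothetical paired data, chosen only for illustration.
x = [1, 2, 3, 4]
y_up = [2, 4, 6, 8]    # rises exactly with x
y_down = [8, 6, 4, 2]  # falls exactly with x
print(sample_covariance(x, y_up))
print(correlation(x, y_up))    # close to +1
print(correlation(x, y_down))  # close to -1
```

The covariance depends on the units of X and Y, while r is unit-free and always lies between -1 and +1.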
Features of the Coefficient of Correlation
[Figure: scatter plots of Y against X for r = -1, r = -.6, r = +1, r = +.3, and r = 0]
Pitfalls in Numerical Descriptive Measures
Ethical Considerations
Chapter Summary