Chapter 3
Numerical Descriptive
Techniques (9 hours)
Chapter outline
In this chapter you learn:
◼ 1. Measures of centre and location
Central Tendency (Location)
Variation (Dispersion)
◼ The variability of the set of measurements, that is, the spread of the data.
Measures of Central Tendency:
The Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Where $n$ = sample size
$X_i$ = observed values
Measures of Central Tendency:
The Mean (con’t)
First data set: 11, 12, 13, 14, 15
Mean = (11 + 12 + 13 + 14 + 15) / 5 = 65 / 5 = 13
Second data set: 11, 12, 13, 14, 20
Mean = (11 + 12 + 13 + 14 + 20) / 5 = 70 / 5 = 14
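The two five-value examples above can be checked with a few lines of Python. This is a minimal sketch; the function name `mean` and the variable names are illustrative choices, not part of the original slides.

```python
def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)

first = [11, 12, 13, 14, 15]
second = [11, 12, 13, 14, 20]  # same data with 15 replaced by the extreme value 20

print(mean(first))   # 13.0
print(mean(second))  # 14.0
```

Changing a single value from 15 to 20 moves the mean from 13 to 14, which shows how the mean responds to extreme values.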
Numerical Descriptive
Measures for a Population
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Where μ = population mean
N = population size
Xi = ith value of the variable X
Arithmetic Mean
◼ The arithmetic mean (mean) is the most
common measure of central tendency
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
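◼ Geometric mean:
$$\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$$
◼ Geometric mean rate of return:
$$\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$$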
◼ Where Ri is the rate of return in time period i
◼ Stock price: observed each month, January to December
Monthly return:
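As a hedged illustration, assume the monthly return is the simple return between consecutive prices, $R_t = (P_t - P_{t-1}) / P_{t-1}$; the slides do not show this formula, so it is an assumption here. The sketch below applies the geometric mean rate of return to a hypothetical three-price series (the prices and function names are invented for illustration).

```python
import math

def simple_returns(prices):
    """Period-over-period simple returns (assumed definition, not from the slides)."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def geometric_mean_return(returns):
    """[(1 + R1)(1 + R2)...(1 + Rn)]^(1/n) - 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

prices = [100.0, 50.0, 100.0]           # hypothetical prices
returns = simple_returns(prices)         # [-0.5, 1.0]
arithmetic = sum(returns) / len(returns)
geometric = geometric_mean_return(returns)

print(arithmetic)  # 0.25
print(geometric)   # 0.0
```

The price ends where it started, so the geometric mean return of 0 describes the investment's growth correctly, while the arithmetic mean of the two returns (25%) overstates it.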
Example
[Figure: two dot plots on a scale from 11 to 20; Median = 13 in both]
◼ The location of the median when the values are in numerical order (smallest to largest):
$$\text{Median position} = \frac{n+1}{2} \text{ position in the ordered data}$$
◼ If the number of values is odd, the median is the middle number
◼ If the number of values is even, the median is the average of the two middle numbers
◼ Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
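The position rule can be sketched in Python as follows. The function name `median` is an illustrative choice; the even-length branch averages the two middle values, which is the usual convention. The eight-value example reuses the closing prices from the Thinking Challenge later in this chapter.

```python
def median(values):
    """Median via the (n + 1) / 2 position rule on the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2  # zero-based index of position (n + 1) / 2 when n is odd
    if n % 2 == 1:
        return ordered[mid]
    # n even: position (n + 1) / 2 falls between two values, so average them
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([11, 12, 13, 14, 15]))               # 13
print(median([11, 12, 13, 14, 20]))               # 13
print(median([17, 16, 21, 18, 13, 16, 12, 11]))   # 16.0
```

Replacing 15 with 20 leaves the median at 13, in contrast to the mean, which moves from 13 to 14.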
Measures of Central Tendency:
The Mode
[Figure: two dot plots; one on a scale from 0 to 14 with Mode = 9, one on a scale from 0 to 6 with No Mode]
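A minimal sketch of finding the mode with Python's `collections.Counter`. The underlying data of the two plots are not recoverable from the slides, so both lists below are hypothetical, and treating "every value occurs once" as "no mode" is an assumed convention.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: no mode (assumed convention)
    return sorted(v for v, c in counts.items() if c == top)

with_mode = [0, 1, 3, 5, 9, 9, 9, 13, 14]   # hypothetical data
without_mode = [0, 1, 2, 3, 4, 5, 6]        # hypothetical data

print(modes(with_mode))     # [9]
print(modes(without_mode))  # []
```

Returning a list also covers data with several modes, since ties for the highest count are all reported.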
Measures of Central Tendency:
Which Measure to Choose?
Skewness statistic: < 0 (left-skewed), 0 (symmetric), > 0 (right-skewed)
Measures of Variation
Variation
Example (data plotted on a scale from 0 to 14; smallest value 1, largest value 13):
Range = 13 - 1 = 12
Measures of Variation:
Why The Range Can Be Misleading
[Figure: two data sets with different distributions, each on a scale from 7 to 12]
Range = 12 - 7 = 5 in both cases
▪ Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
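The outlier example above can be reproduced directly. This is a small sketch; `value_range` is an illustrative name.

```python
def value_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)

# The 25 values from the example: eleven 1s, eight 2s, four 3s, then 4 and 5
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = base[:-1] + [120]  # replace the last value (5) with 120

print(value_range(base))          # 4
print(value_range(with_outlier))  # 119
```

Only one of the 25 values changed, yet the range grew from 4 to 119, because the range depends solely on the two extreme observations.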
◼ Average (approximately) of squared deviations
of values from the mean
◼ Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Where $\bar{X}$ = arithmetic mean
$n$ = sample size
$X_i$ = ith value of the variable X
Measures of Variation:
The Sample Standard Deviation
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Measures of Variation:
The Standard Deviation
Sample data ($X_i$): 10 12 14 15 17 18 18 24
$n = 8$, Mean $= \bar{X} = 16$
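The worked calculation for this sample is missing from the slide, so the following sketch carries it out with the sample variance formula (divisor n - 1). The function name is illustrative.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

data = [10, 12, 14, 15, 17, 18, 18, 24]
s2 = sample_variance(data)   # 130 / 7
s = math.sqrt(s2)

print(round(s2, 4))  # 18.5714
print(round(s, 4))   # 4.3095
```

The squared deviations from the mean of 16 are 36, 16, 4, 1, 1, 4, 4 and 64, which sum to 130; dividing by n - 1 = 7 gives the variance, and its square root is the standard deviation.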
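◼ Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
◼ Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$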
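To see the effect of dividing by N rather than n - 1, the sketch below uses Python's standard `statistics` module on the eight values from the earlier standard deviation example. Treating those eight values as an entire population is an assumption made purely for comparison.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]

pop_var = statistics.pvariance(data)    # divides by N
samp_var = statistics.variance(data)    # divides by n - 1
pop_sd = statistics.pstdev(data)
samp_sd = statistics.stdev(data)

print(pop_var)             # 16.25
print(round(samp_var, 4))  # 18.5714
print(round(pop_sd, 4))    # 4.0311
print(round(samp_sd, 4))   # 4.3095
```

The sample versions are always slightly larger than the population versions for the same numbers, and the gap shrinks as the number of observations grows.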
Sample statistics versus
population parameters
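Measure              Population parameter   Sample statistic
Mean                 $\mu$                  $\bar{X}$
Variance             $\sigma^2$             $S^2$
Standard deviation   $\sigma$               $S$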
Interpreting Standard
Deviation: Empirical Rule
Interquartile Range
1. Measure of dispersion
2. Also called midspread
3. Difference between upper and lower quartiles
◼ Interquartile Range = QU – QL
4. Spread in middle 50%
5. Not affected by extreme values
Thinking Challenge
◼ You’re a financial analyst for Prudential-Bache
Securities. You have collected the following
closing stock prices of new stock issues: 17,
16, 21, 18, 13, 16, 12, 11.
◼ What are the quartiles, Q1 and Q3, and the
interquartile range?
Quartile Solution*
Q1
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
QL is the median of the bottom half, the
average of the two middle scores
(12 + 13)/2 = 12.5
Quartile Solution*
Q3
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
QU is the median of the top half, the
average of the two middle scores
(17 + 18)/2 = 17.5
Interquartile Range Solution*
Interquartile Range
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
Interquartile Range = Q3 – Q1
= 17.5 – 12.5 = 5
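The quartile solution above takes each quartile as the median of one half of the ordered data. A minimal Python sketch of that method follows; the function names are illustrative, and excluding the middle value from both halves when n is odd is an assumption, since the slides only show an even-sized example. Other quartile conventions exist and can give slightly different values.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Q1 and Q3 as the medians of the bottom and top halves (needs n >= 2)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                  # bottom half
    upper = ordered[len(ordered) - half:]   # top half
    return median(lower), median(upper)

prices = [17, 16, 21, 18, 13, 16, 12, 11]
q1, q3 = quartiles(prices)
iqr = q3 - q1

print(q1, q3, iqr)  # 12.5 17.5 5.0
```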
Box Plot
[Figure: box plot on a scale from 4 to 12]
◼ Sample covariance:
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
◼ Only concerned with the strength of the relationship
◼ No causal effect is implied
Interpreting Covariance
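◼ Sample coefficient of correlation:
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} = \frac{\operatorname{cov}(X, Y)}{S_X S_Y}$$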
Features of
Correlation Coefficient, r
◼ Unit free
◼ Ranges between –1 and 1
◼ The closer to –1, the stronger the negative linear
relationship
◼ The closer to 1, the stronger the positive linear
relationship
◼ The closer to 0, the weaker the linear
relationship
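The sample covariance and the correlation coefficient r = cov(X, Y) / (S_X S_Y) can be sketched in plain Python as below. The data are hypothetical, chosen so that the relationships are perfectly linear, and the function names are illustrative.

```python
import math

def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

def sample_sd(values):
    # The covariance of a variable with itself is its sample variance
    return math.sqrt(sample_covariance(values, values))

def correlation(xs, ys):
    return sample_covariance(xs, ys) / (sample_sd(xs) * sample_sd(ys))

x = [1, 2, 3, 4, 5]            # hypothetical data
y_up = [2, 4, 6, 8, 10]        # rises exactly with x
y_down = [10, 8, 6, 4, 2]      # falls exactly with x

print(sample_covariance(x, y_up))         # 5.0
print(round(correlation(x, y_up), 6))     # 1.0
print(round(correlation(x, y_down), 6))   # -1.0
```

Rescaling `y_up` (for example, multiplying every value by 100) changes the covariance but not r, which illustrates why r is described as unit free.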
Scatter Plots of Data with Various
Correlation Coefficients
[Figure: six scatter plots of Y against X. Top row: r = -1, r = -.6, r = 0. Bottom row: r = +1, r = +.3, r = 0]
Applications of standard deviation
[Figure: control chart over time, with UCL at +3σ above the process average and LCL at -3σ below it]
Control Chart Basics
Common Cause Variation: range of expected variability
[Figure: control chart over time, with UCL at +3σ above the process mean and LCL at -3σ below it]
UCL = Process Mean + 3 Standard Deviations
LCL = Process Mean – 3 Standard Deviations
Process Variability
Special Cause of Variation:
A measurement this far from the process average is very
unlikely if only expected variation is present
±3σ → 99.7% of process values should be in this range
[Figure: control chart over time showing UCL, process mean, and LCL]
UCL = Process Mean + 3 Standard Deviations
LCL = Process Mean – 3 Standard Deviations
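A minimal sketch of the three-sigma rule: compute the limits, then flag measurements that fall outside them as candidates for special cause variation. The process mean, standard deviation, and measurements are hypothetical, and the function names are illustrative.

```python
def control_limits(process_mean, process_sd):
    """Return (LCL, UCL) = process mean -/+ 3 standard deviations."""
    return process_mean - 3 * process_sd, process_mean + 3 * process_sd

def special_cause_points(measurements, process_mean, process_sd):
    """List (time index, value) pairs that fall outside the control limits."""
    lcl, ucl = control_limits(process_mean, process_sd)
    return [(t, x) for t, x in enumerate(measurements, start=1)
            if x < lcl or x > ucl]

# Hypothetical process: mean 10.0, standard deviation 0.5
lcl, ucl = control_limits(10.0, 0.5)
flagged = special_cause_points([10.2, 9.8, 10.1, 12.0, 9.9], 10.0, 0.5)

print(lcl, ucl)  # 8.5 11.5
print(flagged)   # [(4, 12.0)]
```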
Using Control Charts
[Figure: control chart over time showing UCL, process mean, and LCL]
Process Not in Control
$$\bar{R} = \frac{\sum_{i=1}^{k} R_i}{k}$$
where:
Ri = ith subgroup range
k = number of subgroups
R Chart Control Limits
$$UCL = D_4(\bar{R})$$
$$LCL = D_3(\bar{R})$$
where:
D4 and D3 are taken from the table
(Appendix Table E.11) for subgroup size = n
R Chart Example
$$\bar{R} = \frac{\sum R_i}{k} = \frac{3.85 + 4.27 + \cdots + 4.22}{7} = 3.894$$
[Figure: R chart of minutes by day (days 1 to 7), with UCL = 8.232, $\bar{R}$ = 3.894, LCL = 0]
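The limits in this example can be reproduced from $\bar{R}$ = 3.894. The subgroup size is not stated in the text; the quoted limits are consistent with the commonly tabulated constants D4 = 2.114 and D3 = 0 for subgroups of size 5, so n = 5 is an assumption in the sketch below. Only two of the seven subgroup ranges are visible in the slide, so the code starts from the reported average rather than the raw ranges.

```python
def r_chart_limits(r_bar, d3, d4):
    """Return (LCL, UCL) for an R chart: D3 * R-bar and D4 * R-bar."""
    return d3 * r_bar, d4 * r_bar

R_BAR = 3.894        # average subgroup range from the example
D3, D4 = 0.0, 2.114  # assumed table constants for subgroup size n = 5

lcl, ucl = r_chart_limits(R_BAR, D3, D4)

print(round(lcl, 3))  # 0.0
print(round(ucl, 3))  # 8.232
```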
$$\bar{\bar{X}} = \frac{\sum_{i=1}^{k} \bar{X}_i}{k}$$
where:
$\bar{X}_i$ = ith subgroup average
k = number of subgroups
Computing Control Limits
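◼ The upper and lower control limits for an $\bar{X}$ chart are generally defined as
UCL = Process Mean + 3 Standard Deviations
LCL = Process Mean – 3 Standard Deviations
◼ Use $\frac{\bar{R}}{d_2 \sqrt{n}}$ to estimate the standard deviation of the process average, where $d_2$ is from appendix Table E.11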
Computing Control Limits
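◼ The upper and lower control limits for an $\bar{X}$ chart are generally defined as
UCL = Process Mean + 3 Standard Deviations
LCL = Process Mean – 3 Standard Deviations
◼ so
$$UCL = \bar{\bar{X}} + 3 \frac{\bar{R}}{d_2 \sqrt{n}}$$
$$LCL = \bar{\bar{X}} - 3 \frac{\bar{R}}{d_2 \sqrt{n}}$$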
Computing Control Limits
$$UCL = \bar{\bar{X}} + A_2(\bar{R})$$
$$LCL = \bar{\bar{X}} - A_2(\bar{R})$$
where $A_2$ (from Table E.11) $= \frac{3}{d_2 \sqrt{n}}$
X Chart Example
$$\bar{\bar{X}} = \frac{\sum \bar{X}_i}{k} = \frac{5.32 + 6.59 + \cdots + 6.79}{7} = 5.814$$
$$\bar{R} = \frac{\sum R_i}{k} = \frac{3.85 + 4.27 + \cdots + 4.22}{7} = 3.894$$
[Figure: $\bar{X}$ chart of minutes by day (days 1 to 7), with UCL = 8.061, $\bar{\bar{X}}$ = 5.814, LCL = 3.567]
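The limits in this example follow from the grand mean of 5.814 and the average range of 3.894. The subgroup size is not stated; the quoted limits match the commonly tabulated constants for n = 5 (d2 = 2.326, A2 = 0.577), so those values are assumptions in this sketch. The code also checks that A2 agrees with 3 / (d2 √n).

```python
import math

def xbar_chart_limits(grand_mean, r_bar, a2):
    """Return (LCL, UCL) = grand mean -/+ A2 * R-bar."""
    return grand_mean - a2 * r_bar, grand_mean + a2 * r_bar

GRAND_MEAN = 5.814   # average of the subgroup averages from the example
R_BAR = 3.894        # average subgroup range from the example
N, D2, A2 = 5, 2.326, 0.577  # assumed subgroup size and table constants

a2_from_d2 = 3 / (D2 * math.sqrt(N))
lcl, ucl = xbar_chart_limits(GRAND_MEAN, R_BAR, A2)

print(round(a2_from_d2, 3))  # 0.577
print(round(lcl, 3))         # 3.567
print(round(ucl, 3))         # 8.061
```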
$$\sigma^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (R_i - E(R))^2$$
Two Assets With Same Expected Return But
Different (Continuous) Probability Distributions
[Figure: probability density against return (%) for Stock 1 and Stock 2, on a return scale from 0 to 15]
Return and Risk of a portfolio
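$$R_P = R_1 w_1 + R_2 w_2$$
$$\sigma_P^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \operatorname{cov}(R_1, R_2)$$
$$\sigma_P^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \rho_{12} \sigma_1 \sigma_2$$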
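The two-asset formulas can be sketched as follows. All weights, returns, standard deviations, and correlations are hypothetical, and the function names are illustrative; the variance uses the correlation form, with cov(R1, R2) = ρ12 σ1 σ2.

```python
import math

def portfolio_return(w1, r1, w2, r2):
    """Weighted average of the two asset returns."""
    return r1 * w1 + r2 * w2

def portfolio_variance(w1, sd1, w2, sd2, rho12):
    """w1^2 sd1^2 + w2^2 sd2^2 + 2 w1 w2 rho12 sd1 sd2."""
    return (w1 ** 2 * sd1 ** 2 + w2 ** 2 * sd2 ** 2
            + 2 * w1 * w2 * rho12 * sd1 * sd2)

# Hypothetical inputs: equal weights, returns 8% and 12%, risks 10% and 20%
w1, w2 = 0.5, 0.5
rp = portfolio_return(w1, 0.08, w2, 0.12)
sd_uncorrelated = math.sqrt(portfolio_variance(w1, 0.10, w2, 0.20, 0.0))
sd_perfect = math.sqrt(portfolio_variance(w1, 0.10, w2, 0.20, 1.0))

print(round(rp, 4))               # 0.1
print(round(sd_uncorrelated, 4))  # 0.1118
print(round(sd_perfect, 4))       # 0.15
```

With perfect positive correlation the portfolio risk equals the weighted average of the two risks (15%); with zero correlation it drops to about 11.2%, while the expected return is unchanged.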
The Question Being Asked in VaR