CQF January 2016 M1S1 Explorations
CQF Lecture One exercises are explorations intended for you to experiment with (1) returns estimation, (2) histograms and (3) Q-Q plots. The purpose is to see how robust (or not) the estimates and rules are. By estimating returns over different timescales you re-discover the scaling of volatility σ with the square root of time, √δt.
1. Multi-period projection with log-returns vs. linear returns for the asset price St:

R^s_t = S_{t+1}/S_t − 1 (linear) vs. R^c_t = ln(S_{t+1}/S_t) (compounded),

which are linked in every period by

R^s_t = exp(R^c_t) − 1
where R1M is the monthly return over [0, 1M], R2M is the monthly return over [1M, 2M], and so on to R12M, the monthly return over [11M, 12M] (strictly following the notation, we would have written Rt as R11M). The multi-period returns are simple to obtain.
This is also a relationship between the long-term simple rate and a set of forward rates. The notation is fitting because r1M applies over [0, 1M] and f12M is a forward rate known at the start of the [11M, 12M] period.
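The link between the two return definitions and their multi-period aggregation can be checked numerically. A minimal sketch in Python (the document's own code is in R) with hypothetical monthly prices:

```python
import math

# Hypothetical monthly prices; S[0] is the price at time 0.
S = [100.0, 102.0, 99.0, 104.0]

# One-period returns: simple (linear) vs. compounded (log).
simple = [S[t + 1] / S[t] - 1 for t in range(len(S) - 1)]
log_r = [math.log(S[t + 1] / S[t]) for t in range(len(S) - 1)]

# The two are linked by R^s = exp(R^c) - 1 in every period.
assert all(abs(rs - (math.exp(rc) - 1)) < 1e-12
           for rs, rc in zip(simple, log_r))

# Multi-period aggregation: log-returns ADD across periods,
# simple returns COMPOUND (multiply as gross returns).
total_log = sum(log_r)                                # ln(S_T / S_0)
total_simple = math.prod(1 + r for r in simple) - 1   # S_T / S_0 - 1
assert abs(total_simple - (math.exp(total_log) - 1)) < 1e-12
```

The convenience of log-returns for projection over [0, 12M] is exactly this additivity; the simple return over the same horizon is recovered at the end via exp(·) − 1.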
The exercises have been edited by CQF Faculty, Richard Diamond, [email protected]
There are several rules about the use of returns that create common pitfalls in estimation and portfolio management.
While returns are free from a scale in dollars, they are not unitless: their unit is time. Returns are always estimated over some 'timescale'; the most common is the daily timescale, where for Rt+τ − Rt we take τ = 1/252.
The square-root rule for volatility, σ√δt, applies under the assumption that the compounded returns are invariants: they behave identically and independently across time.
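The square-root rule can be checked by simulation. A minimal sketch (Python rather than the document's R; iid Gaussian daily log-returns stand in for the invariants):

```python
import random
import statistics

# Simulated iid Gaussian daily log-returns play the role of invariants.
random.seed(42)
sigma_daily = 0.01
n_days = 100_000
daily = [random.gauss(0.0, sigma_daily) for _ in range(n_days)]

# Aggregate into non-overlapping 21-day blocks by summing log-returns.
monthly = [sum(daily[i:i + 21]) for i in range(0, n_days - 20, 21)]

# Under iid returns, volatility scales with the square root of time:
# sigma(21 days) should be close to sigma(1 day) * sqrt(21).
sigma_monthly = statistics.stdev(monthly)
ratio = sigma_monthly / (sigma_daily * 21 ** 0.5)
```

With dependent (e.g. autocorrelated) returns the ratio drifts away from one, which is why the invariance assumption matters.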
The familiar variance minimisation (MVP, Markowitz) objective function is defined for linear returns:

argmin over w of { λ wᵀ Στ w − wᵀ µτ }

where the subscript in µτ and Στ means their estimation from market data of the respective frequency, e.g., τ = 1/252 for daily.
Linear returns (not log-returns) aggregate across assets: the return of a portfolio of N assets is the weighted average of the individual asset returns.
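This aggregation property is easy to verify. A minimal two-asset sketch in Python with hypothetical weights and prices:

```python
import math

# Hypothetical weights (summing to 1) and start/end prices for two assets.
w = [0.6, 0.4]
S0 = [100.0, 50.0]
S1 = [110.0, 48.0]

# Linear returns aggregate: the portfolio's linear return is exactly the
# weighted average of the assets' linear returns.
asset_linear = [s1 / s0 - 1 for s0, s1 in zip(S0, S1)]
port_linear = sum(wi * ri for wi, ri in zip(w, asset_linear))

# Cross-check against the portfolio value ratio computed directly
# (each unit of weight wi grows by the asset's gross return).
V1 = sum(wi * s1 / s0 for wi, s0, s1 in zip(w, S0, S1))
assert abs(port_linear - (V1 - 1)) < 1e-12

# Log-returns do NOT aggregate this way: the weighted average of asset
# log-returns differs from the portfolio's log-return.
avg_log = sum(wi * math.log(s1 / s0) for wi, s0, s1 in zip(w, S0, S1))
assert abs(math.log(V1) - avg_log) > 1e-4
```

This is the pitfall behind using log-returns in the Markowitz objective: the cross-asset weighted average is only exact for linear returns.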
2. A histogram is a graphic representation of the probability density function f(x), a pdf; it is also a discrete representation. To build a histogram one actually engages in density estimation as follows:

f(x) = nj / (N h)

where nj is the varying number of observations in a bucket, h is our bandwidth, the bucket/window size on the scale of real numbers R, and N is the sample size.
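The estimator above can be implemented directly. A minimal Python sketch (hypothetical Gaussian sample, fixed bucket width h) that also checks the defining property of a density, that the buckets integrate to one:

```python
import random

# Hypothetical sample of returns and a fixed bucket (bin) width h.
random.seed(0)
x = [random.gauss(0.0, 0.01) for _ in range(10_000)]
N = len(x)
h = 0.002

lo = min(x)
n_bins = int((max(x) - lo) / h) + 1

# Count observations n_j per bucket, then estimate f_j = n_j / (N h).
counts = [0] * n_bins
for xi in x:
    counts[min(int((xi - lo) / h), n_bins - 1)] += 1
density = [nj / (N * h) for nj in counts]

# A density must integrate to one: sum over buckets of f_j * h = 1.
total_mass = sum(f * h for f in density)
```

Shrinking h (more, narrower buckets) reproduces the raw data; widening it smooths the estimate, which is the trade-off discussed next.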
As with proper 'kernel smoothing' methods, a small bandwidth h will produce a histogram that merely repeats the plotted data (low smoothness), while a sufficiently high bandwidth will smooth the representation into a symmetric histogram, as if the returns were Normally distributed.
Here, 1000 breakpoints were used. It is recommended that you experiment by setting the number of breakpoints higher (which gives low smoothness) and lower, but not too low.
3. Building a Q-Q plot in explicit steps is more straightforward in Excel because you can
follow the transformations and organise the data in matching columns.
Implementing the steps, you will obtain a result like the following table. 'Historic' stands for the actual S&P 500 return, followed by the 'Scaled' return (Z-score), indexes, and the Standard Normal Percentile, which corresponds to the cumulative probability given by i/N. Notice one past negative return that was in excess of 21 standard deviations!
Table 1: Left: Inputs for a Q-Q plot. Right: Empirical PDF and CDF
Plot the scaled returns (Z-scores) from step (a) against the theoretical percentiles from
step (d). For the perfectly Normal log-returns the Q-Q plot would be a straight line.
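The same steps can be sketched outside Excel. A minimal Python version (a hypothetical Gaussian sample stands in for the S&P 500 returns; note the text's probability i/N is replaced by i/(N+1) here, because at the top point qnorm(N/N) = qnorm(1) is +infinity):

```python
import random
import statistics

# Hypothetical sample standing in for the S&P 500 log-returns.
random.seed(1)
returns = [random.gauss(0.0005, 0.01) for _ in range(500)]
N = len(returns)

# Step (a): scale to Z-scores and sort (the empirical quantiles).
mu, sd = statistics.mean(returns), statistics.stdev(returns)
z = sorted((r - mu) / sd for r in returns)

# Step (d): theoretical Standard Normal percentiles at i/(N + 1).
theo = [statistics.NormalDist().inv_cdf(i / (N + 1))
        for i in range(1, N + 1)]

# The Q-Q plot is the scatter of (theo[i], z[i]); for Normal data the
# points hug the 45-degree line, so the two series are highly correlated.
mt, mz = statistics.mean(theo), statistics.mean(z)
cov = sum((a - mt) * (b - mz) for a, b in zip(theo, z))
corr = cov / (sum((a - mt) ** 2 for a in theo)
              * sum((b - mz) ** 2 for b in z)) ** 0.5
```

For genuine S&P 500 returns the tails of the scatter bend away from the line, which is the fat-tails evidence the exercise is after.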
3.2 Q-Q Plot with R libraries

Time series experiments are best conducted in a suitable environment. Most student and analytical tools (EViews, Stata, SPSS, SAS) package the modelling into standardised statistical tests and procedures. They allow you to modify parameters but rarely the procedure itself.
We have selected the R environment for its convenience in manipulating time series and the intuitive facilities of the programming language. It is also noticeable that the statistical tests were implemented by specialists: the output is organised to present what is important.
A quant job requires active knowledge of the chosen libraries, e.g., 'zoo' objects for time series analysis, the 'quantlib' main library of functions for a financial quant, and the 'urca' library for working with non-stationary time series (cointegration testing).
R Code: The code below reads from the SPX.xls file distributed with Lecture One (saved as .csv). It selects a certain sub-sample and plots a histogram and a Q-Q plot.
######################################################################
# 2015. Richard Diamond. Queries to [email protected] #
# CQF Lecture One #
######################################################################
# P1 Invariants. IID Check (histogram)
# prices.this is assumed to hold the SPX data read earlier, e.g.
# prices.this = read.csv("SPX.csv")  # the distributed SPX.xls saved as .csv
returns.this = diff(log(prices.this$Close))
hist(returns.this, breaks=1000, main = "Histogram of SPX Log-Returns", xlab="%")
The following section of the R code replicates the steps that you might like to do in
Excel in the first instance. It provides an example of matrix manipulation in R. Code
comments abridged.
######################################################################
# 2015. Richard Diamond. Queries to [email protected] #
# CQF Lecture One. Q-Q Plot Step-by-step #
######################################################################
# sreturns.this is assumed to hold the scaled (Z-score) returns and
# sreturns.N the sample size, e.g. sreturns.N = length(sreturns.this).
# Pre-allocate the matrix the loop fills in.
sreturns.data = matrix(0, nrow = sreturns.N, ncol = 3)
for(i in 1:sreturns.N)
{
sreturns.data[i,1] = i             # index; could also use seq(1, sreturns.N)
sreturns.data[i,2] = i/sreturns.N  # cumulative probability i/N
sreturns.data[i,3] = qnorm(i/sreturns.N) # Can this be more efficient computationally?
                                         # Yes: qnorm is vectorised, qnorm((1:sreturns.N)/sreturns.N)
}
# Sort the vector of scaled returns and bind it as the first column of the data matrix
sreturns.data = cbind(sort(as.vector(sreturns.this)), sreturns.data)