
Programming with Applications in Finance

FIE450
Lecture Notes
Nils Friewald

January 11, 2022


Contents

1 Introduction

2 Stock index characteristics
2.1 Reading a text file into R
2.2 Accessing data content
2.3 Data processing
2.4 Expected return and volatility
2.5 Exercises

3 Volatility
3.1 EWMA model
3.1.1 Maximum likelihood approach
3.1.2 Functions in R
3.1.3 Optimization in R
3.1.4 Searching for the optimal EWMA parameter
3.2 GARCH model
3.3 Volatility forecasts
3.4 Option-Implied Volatility
3.5 Exercises

4 Monte-Carlo simulation
4.1 Fundamental Theorem of Asset Pricing
4.2 Principles of Monte-Carlo simulation
4.3 A model for the index price
4.4 Monte-Carlo Error
4.5 Variance reduction
4.6 Interpreting the results
4.7 Exercises

5 Data processing
5.1 Obtaining the data
5.2 Data cleansing
5.3 Rolling observations forward in time
5.4 One stock price observation per month
5.5 Return computation
5.6 Market capitalization and weights
5.7 Market returns
5.8 From long to wide-format
5.9 Risk-free interest rate
5.10 Exercises

6 Mean-Variance Portfolios
6.1 Optimization problem
6.2 Expected returns and covariances
6.3 Solving for the optimal portfolio
6.4 Single Index Model
6.4.1 Estimating the Single Index model
6.4.2 Expected returns and risk in the single index model
6.4.3 Solving for the optimal portfolio
6.5 Capital allocation line
6.6 Exercises

Index


Course Information
Who should take this course?
You should take this course if you

• have successfully completed FIE400 (“Investments”);

• have almost zero knowledge in R;

• plan to pursue a career in the financial industry.

An alternative to this course is BUS455 (“Applied Programming and Data Analysis for Business”):

• Covers SQL, Python, and R

• There, programming itself is the focus

Note that if you decide to take the alternative course, please do not forget to withdraw from this course.

What you should not expect from this course


• This course will not be an in-depth R course

• The objective of this course is to solve finance problems by means of R

Teaching style
• 12 lectures (January 10 to February 16):

– Monday 12:15–13:45 and 14:00–14:45 with a break in between


– Wednesday 10:15–11:45 and 12:00–12:45 with a break in between
– Alternate dates: February 21 and 23

• Combination of regular lectures and “mini cases”

• Cases shall be independently worked out during class and implemented in R

• Active student participation is strongly encouraged

• Use of quizzes to further encourage participation1

1 https://kahoot.it


Assessment
• Group-based assignments

• Assignment 1: hand-out: 02.02.2022 2 p.m., hand-in: 07.02.2022 11 a.m.

• Assignment 2: hand-out: 16.02.2022 2 p.m., hand-in: 21.02.2022 11 a.m.

• Assignment “reflection note”: hand-out: 22.02.2022 2 p.m., hand-in: 25.02.2022 2 p.m.

• Each group assignment must be answered in English and counts 47.5% towards your final grade.
The reflection note counts for the remaining 5%. Each group assignment has R code as its
deliverable and needs to be submitted electronically.

• It is your responsibility to form groups

Some important rules


• No compulsory class attendance

• Don’t expect any OS-related support

• Don’t send us any R code; neither I nor my teaching assistant (TA) will search for errors
in your code.

• If you have specific questions contact my TA (Diego Bonelli) on Canvas

• We do not respond to questions on topics that have already been covered in depth during
the lectures. That is, we will not reiterate course content for those who were absent.

• “I will be working in Oslo. Can I still be enrolled in the course and complete the assign-
ments?” Yes, but it is your responsibility to form groups for the assignments.

• For more questions and answers, see the FAQ announcement on Canvas


1 Introduction
This course introduces you to the programming language R. R is an extremely powerful language.
It is very similar to the commercial S language and, to some degree, also comparable
to Matlab. However, unlike these languages, which are rather expensive, R is free.
R is developed under the GNU license, which means that basically everyone can contribute to
developing R. Compared to more established languages such as C++ and FORTRAN, R is much
more user-friendly, so you can become productive quickly. R is widely used in academia and has
also become very popular in the financial industry.
To get started, you first need to install R on your computer. Download R by choosing the
appropriate package for your operating system.2 The installation process is pretty straightforward.
Run R by selecting the icon in the menu. When you launch R it will open the R console.
This is the area where you type in your commands.
Consider R as a calculator. You can type in a command after the prompt “>” and execute
the command by pressing enter/return. What happens is that the command will be evaluated
and the result will be printed out to the console.3

10

## [1] 10

The [1] indicates that the number on the right refers to the first number in the output. While
useless in this case it may be more informative when printing more numbers. Of course, you
can also compute more “complicated” expressions such as:

10 * 2 + 4

## [1] 24

Spaces between operators and/or numbers are not mandatory; they just make the
code more readable. We usually do not use the console like this because whenever you need
to redo a calculation you need to type in all the expressions in exactly the same order again.
This is cumbersome. Instead, we write an R script, which is a set of R commands. Basically, any
text editor is appropriate for this purpose. For example, you could use Notepad or Wordpad. A
typical workflow then looks like this:

1. Write your script in a text editor.

2. Select all.

3. Copy-paste from editor to console.

4. Evaluate the outcome.


5. If necessary, go back to the editor and edit your script.

6. Otherwise, you may wish to save your script for later use.

2 https://cran.r-project.org/
3 Note that R code in this document does not display the prompt. First, this is for better readability and, second, it also allows you to cut and paste code from the document into the console. You may further notice that any output starts with two hash signs “##”. A hash sign starts a comment in your script, which will not be evaluated by R.

However, an even better approach is to use an integrated development environment (IDE).


An IDE normally includes a console, syntax-highlighting, support of direct code execution, as
well as tools for plotting, command history, debugging, workspace management, etc. A widely
used IDE for R is RStudio. The basic version is open source and thus free. You can download
and install RStudio for various types of operating systems.4 Download the appropriate package
and install it on your computer.5 Launch RStudio. Then open the editor window (or script
window) with File → New → R script. You should get something that looks like Figure 1. Let
me very briefly explain the most important parts of the development environment.

Figure 1: RStudio

• Again, the top left window that you have just opened is the editor window (or script
window). You will write your collection of commands into this window.
• The bottom left window is the console window (or command window). You can either
write R commands directly into this window or let RStudio execute your script in the
editor window. You do so by pressing Code → Run Region → Run All or by pressing the
corresponding button.

4 https://www.rstudio.com/products/rstudio/download/
5 As you will later see, I am not using RStudio. Instead, I am using emacs, which is my favorite editor. However, emacs has a ton of shortcuts and without remembering at least some of them emacs is almost useless.

Equipped with an IDE we are now ready to start writing programs. Let’s begin with the
famous “Hello world!” example, although it does not provide much insight. But basically every
programming tutorial or book starts with this trivial example. Thus, we follow this unwritten
rule and do the same. Write the following code fragment into the script window and then
execute it as described above.

cat("Hello World!")

The command cat is a function that prints its argument (the one enclosed by the parentheses)
to the console. That’s it. I don’t explain RStudio any further. There are many more things you
can do but I will leave it up to you to explore other functionalities. You are now familiar with
the main use case, that is, to write an R script and run the script. Let’s continue with more
interesting stuff.


2 Stock index characteristics

Mini case 1 Stock index characteristics


One of your clients would like to invest in the OBX Total Return Index (OBX). He heard
about diversification and knows that this is the only “free lunch” in finance. But he wonders
what return he can expect by investing in the index. He is also worried about the risk he
would be exposed to. Thus, he asks for your advice. Can you help him?

An apparent route to follow is to download the index and measure its past return and
volatility. The index is available on the webpage of the Oslo Stock Exchange.6 Download all
index prices available since first launched and save the file as “OBX.xlsx” at a location where
you will find it again.7 Open the file with Excel. What we see is the daily last, high and
low prices as well as the turnover. You may have noticed that the data is given in reverse
chronological order, that is, the most recent observation is at the top. Of course, we could now
easily do all the calculations in Excel but since this course is about programming we will do it
with R instead. Thus, we first convert the file into a text format. We do so by exporting the file
into a CSV file first. Click on File → Save as and then choose “CSV UTF-8 (Comma delimited)
(*.csv)”. This format can be read into R easily. Save the file as “OBX.csv”. Note, however, that
the exact format of your CSV file depends on your operating system, system settings as well as
the general settings in Excel. This is important to keep in mind when loading the file into R
because we need to tell R how the format looks like. Thus, the following description will most
likely not apply to everyone.

2.1 Reading a text file into R


Anyway, we try to read the data using the following command:

obx <- read.csv("OBX.csv")

read.csv is a pre-defined function in R. As its name suggests it reads CSV files which are
just data files in text format. The function needs at least one argument which is the name of
the file (including its path if necessary). Arguments are always enclosed in parentheses after
the function name. The data from the file is read in and immediately assigned to a variable.
There are two assignment operators in R, both of them equivalent: <- and = . There
are some rules to follow in choosing variable names. First, variable names are case sensitive,
that is, obx is different from OBX. Second, do not use special characters, with the exception being
“.” and “_”. Third, you are allowed to use numbers but not at the beginning of the variable
name. Once you have created a variable it remains accessible in the R environment. To be more
precise, all entities that R creates and manipulates are known as objects. The specific type of
this object in our case is a data frame. A data frame is comparable to a single spreadsheet in
Excel. There is one important exception: all values of a column must be of the same data type,
be it characters, numbers or something else. You can continue working with the variable and
even overwrite it. By the way, if we had not assigned the result to a variable in the
above case, the result would have been printed to the console instead. If the data is big it may
take a while until all the data is printed. So be careful.

6 https://www.oslobors.no/ob_eng/markedsaktivitet/#/details/OBX.OSE/overview. Please note, if you click this link you are going to be redirected to EURONEXT, which now maintains the OBX index.
7 Since EURONEXT has been in charge of publishing the index, the description that follows is not valid anymore. The reason is that the data is structured differently, e.g., EURONEXT only publishes daily price information. Still, the description that follows describes the old data that was available through the Oslo Stock Exchange.
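
To illustrate these naming rules, here is a small sketch; the variable names are arbitrary examples of mine and not used elsewhere in these notes:

obx.copy <- obx       # "." is allowed in a variable name
price_1 <- 765.94     # "_" and digits are allowed, but a name must not start with a digit
# OBX                 # would fail: names are case sensitive, so OBX is not the same as obx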
We now want to verify whether the import of the file was done correctly, for example,
whether each column was indeed interpreted as a separate column. We can use the function
head , which displays the first six rows of the data:

head(obx)

## OBX Last High Low Official.turnover..NOK.


## 1 12.01.18 765.94 767.33 762.08 3,834,616,180.00
## 2 11.01.18 766.33 768.40 762.39 3,809,699,158.00
## 3 10.01.18 766.88 767.71 762.41 3,739,229,520.00
## 4 09.01.18 766.75 770.17 761.61 3,865,224,741.00
## 5 08.01.18 761.58 763.80 758.04 3,371,386,194.00
## 6 05.01.18 760.51 760.51 753.93 4,125,052,960.00

If you only want to print the first three rows you would instead need to write:

head(obx, 3)

## OBX Last High Low Official.turnover..NOK.


## 1 12.01.18 765.94 767.33 762.08 3,834,616,180.00
## 2 11.01.18 766.33 768.40 762.39 3,809,699,158.00
## 3 10.01.18 766.88 767.71 762.41 3,739,229,520.00

Anyway, in my case the data appears to be imported correctly. How about yours? Does
your data only show up with one single column? This may happen if the columns in your CSV
file are not separated by a comma (“,”) but by another character. For example, columns could
be separated by semicolons or pipes instead. Recall that the export properties of
Excel depend on your specific settings. We need to tell the function how to interpret your CSV
file. Let’s consult the help page for further instructions. We write:

?read.csv

The help page provides a lengthy description about what the function is doing, its usage, the
arguments needed, the return value, references to other related functions, and some examples.
These help pages are extremely useful. From the Usage section we see, for example, that the
first argument is always the filename. The second argument tells the function whether there is
a header (that is column names) in the CSV file. This argument has a default value by saying
header=TRUE . The third argument describes how columns of the CSV files are separated. By
default the function assumes that they are separated by a comma (“,”). The fourth argument
describes how vectors of characters (i.e. strings) are quoted in the text file. The fifth argument
tells the function what character to use for decimal points, and so on. Suppose you want to call
the function with the separator argument set to a comma, then you would need to write:


obx <- read.csv("OBX.csv", sep = ",")
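
If your Excel export instead uses semicolons as separators and commas as decimal marks (a common European-locale setting), a hedged sketch of the corresponding call would be the following; read.csv2 is a convenience wrapper with exactly these defaults:

obx <- read.csv("OBX.csv", sep = ";", dec = ",")
obx <- read.csv2("OBX.csv")  # equivalent shortcut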

But why do we need to add sep= to define the separator but no file= to provide the
filename? This is indeed a good question. This is a so-called named argument. To better
understand this concept let’s go back to the help page. As you notice we can provide several
arguments to the functions. Without providing the argument name R simply matches the list
of arguments one by one according to the function definition in the help page. So by writing

obx <- read.csv("OBX.csv", ",")

## Error in !header: invalid argument type

R would interpret the first argument as the filename and the second as the header. This call
fails because the argument for the header needs to be of a different type. R throws an error in
this case. To make a correct function call we would need to write:

obx <- read.csv("OBX.csv", TRUE, ",")

But this is cumbersome because we do not really need to provide the second argument.
Again, it is set by default to TRUE , a data type that we discuss later. So we prefer to name all
arguments instead which allows us to omit those arguments that we do not really need because
they are set to an appropriate default value. You could also name the first argument with
file= but this is not needed here because the first argument refers to the filename anyway.
So there cannot be any confusion when not providing the argument name. Let’s move on. In
order to get an idea of how large the data set is we can use the following command to get its
dimension:

dim(obx)

## [1] 5532 5

This function returns, first, the number of rows and, second, the number of columns. It is
always given exactly in this order! Alright, the numbers seem quite plausible. Next we will
have a closer look at the data.

2.2 Accessing data content


To access specific positions within the data we need a special operator: [ . This operator allows
us to select certain rows and columns. For example, let’s display the content in the first row
and second column.

obx[1, 2]

## [1] 765.94

Apparently, the first argument in the [ operator refers to the row index and the second
to the column index. Note that the white space in this command is not a requirement. It
just makes your script more readable. Anyway, the command displays a number. You may
think, sure, it’s a price, what else should we expect? However, if the numbers in your text file,
for example, use commas (“,”) instead of a period (“.”) as a decimal point, they would not be
interpreted as numbers. To see this, just have a look at the column with the turnover, which
is column five.

obx[1, 5]

## [1] "3,834,616,180.00"

This column does not contain numbers (although it looks like it) but instead is of type factor,
a data type that we will also discuss later. Why does this happen? Well, the turnover uses
a comma to separate thousands. So why are they not interpreted as separate columns then?
Well, Excel further encloses the turnover with double quotes.8 Note, there is no direct way in
R to interpret the turnover column correctly. You would need to go back to Excel, change the
format of that column and then export the file again.
How do we show the complete first row instead of just one entry? Well, just leave out the
second argument in the above command.

obx[1, ]

## OBX Last High Low Official.turnover..NOK.


## 1 12.01.18 765.94 767.33 762.08 3,834,616,180.00

This is the first row of our data frame as indicated by “1” on the very left. In fact “1” is a row
name and does not necessarily have to be a number but can also be a string. In our case, read.csv
automatically assigned row names to our data set. We also observe that the first row of the
text file was (correctly) interpreted as containing the column headers.
So far we have only used indices to access columns. Usually, it makes the code more readable
when we use the names of the columns instead. For example, to print the first element of the
column Last we may use the following lines of code, all of them equivalent:

obx[1, "Last"]

## [1] 765.94

obx$Last[1]

## [1] 765.94

obx[["Last"]][1]

## [1] 765.94

Whew, this looks complicated. Let’s discuss each line of code separately. The first is very similar
to what we have done so far. The only thing that has changed is that we use the column
name instead of the column index. The second line of code makes use of the list operator $
which directly addresses one particular column. (In fact, a data frame is similar to a list, which
we will cover later.) After we have selected the entire column we select the first element of
that vector. The third line of code is exactly the same as the one before. We just use the [[
operator instead of the $ operator to access the desired column.

8 You see this by opening the CSV file with a text editor, e.g. RStudio.
Suppose we now want to access the first three elements of column Last . Instead of providing
a single row index, we provide a vector of indices. To create a vector we use the function c .
This is how we create a simple vector:

c(1, 2, 3)

## [1] 1 2 3

To access the first three elements of column Last we write, for example:

obx$Last[c(1, 2, 3)]

## [1] 765.94 766.33 766.88

However, if you would like to display the first 50 rows the previous approach will be too
cumbersome. An easy way out of this problem is to use the operator : instead. This operator
creates a vector (or sequence) of numbers.

1:50

## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
## [19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
## [37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50

The above command creates a vector between 1 and 50. Further, you see that every line
starts with a number given in square brackets. This number refers to the index of the number
given to the right of it. (In our example, this information is not very useful because the vector that we
have created itself refers to the indices.) To combine everything (with the output hidden):

obx[1:50, ]

Finally, what about displaying the last three rows of the data set? We use the function
nrow to determine the number of rows in our data frame.

obx[(nrow(obx) - 2):nrow(obx), ]

## OBX Last High Low Official.turnover..NOK.


## 5530 03.01.96 79.7573 NA NA 440,230,415.00
## 5531 02.01.96 77.9852 NA NA 217,560,390.00
## 5532 29.12.95 NA NA NA 213,542,580.00

An easier way to obtain the same result is to use the function tail with the appropriate
arguments.


tail(obx, n = 3)

## OBX Last High Low Official.turnover..NOK.


## 5530 03.01.96 79.7573 NA NA 440,230,415.00
## 5531 02.01.96 77.9852 NA NA 217,560,390.00
## 5532 29.12.95 NA NA NA 213,542,580.00

As you can see from the output, whenever there is data missing we get NA for “not available”.
This is a special data type in R that we will repeatedly encounter when working with data.

2.3 Data processing


Sometimes we wish to rename the columns. We do so by using the functions names and c .

names(obx) <- c("Date", "Last", "High", "Low", "Turnover")


obx[1, ]

## Date Last High Low Turnover


## 1 12.01.18 765.94 767.33 762.08 3,834,616,180.00

You may have noticed that the function call differs from the previous example. It is now on
the left hand side of the assignment operator. This needs some explanation. Recall that obx is
an object. An object may have attributes which can be altered. In our case we alter the column
names of obx by assigning a vector of new names. To create a vector we use the function c .
You can also use names directly to check what the column names are. The function returns
you a vector of strings.

names(obx)

## [1] "Date" "Last" "High" "Low" "Turnover"


[End of lecture 1]

Dealing with dates and/or times in data needs special attention. The reason is that both
can be formatted in several different ways. For example, do you know how to interpret
“03/02/2002”? Does this date refer to a day in March or February? Of course, it depends
on the convention whether it is American or English. R will not understand it either unless we
provide guidance.

obx[1, 1]

## [1] "12.01.18"

As this output shows R did not convert the column to type Date. (In fact, it was converted to
type factor but, again, more on this later.) Let’s do the conversion using the function as.Date .
We see from one of the previous outputs that the first part refers to the day, the second to the
month and the third to the two-digit year. Note that this does not necessarily apply in your
case! You may have a totally different date format in your text file, depending on your system
settings. To convert the column into a Date object we use:


obx[, 1] <- as.Date(obx[, 1], format = "%d.%m.%y")

The first argument to the function is the date column (or vector) that we would like to
convert into a Date object. The second tells the function how it needs to interpret the content.
%m refers to the month, %d to the day and %y to the (two-digit) year. We also tell the function
that the components are separated by a period (“.”). If your date is formatted differently you
need to apply another conversion specification. You can get a complete list of all available
specifications by calling the help page of ?strftime .9 So far so good. Again we need to add
format= so that R understands what the second argument stands for. Look at the help page.
From the Usage section we see that the first argument is always the object that we intend to
convert. The second argument can be either format or origin but both have totally different
meanings. So R cannot know which of the two arguments we provide unless we name them.
Finally, we write the result back to the same column, that is, we overwrite its previous content.
Now let’s check whether it was correctly converted:

obx[1, 1]

## [1] "2018-01-12"

This seems to be correct. To be really sure that the column is now indeed of type Date use
the function class .

class(obx[, 1])

## [1] "Date"

By the way, Date objects are internally stored as integer numbers which refer to the number
of days elapsed since 1970-01-01. So if you add 1 to a Date object it returns the following
date. For example:

as.Date("2018-01-02") + 1

## [1] "2018-01-03"
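
You can reveal this internal representation with as.numeric ; for example, the day after the origin is stored as the integer 1:

as.numeric(as.Date("1970-01-02"))

## [1] 1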

Next we need to sort the data in chronological order for which we use the function order .
This function returns the indices that would sort an unsorted vector. A little
example will demonstrate its usage.

v <- c(50, 30, 40, 10)


i <- order(v)
i

## [1] 4 2 3 1

v[i]

## [1] 10 30 40 50
9 Alternatively, you can access the information online: https://stat.ethz.ch/R-manual/R-devel/library/base/html/strptime.html


We now apply this idea to our data.

obx <- obx[order(obx$Date), ]


head(obx, 3)

## Date Last High Low Turnover


## 5532 1995-12-29 NA NA NA 213,542,580.00
## 5531 1996-01-02 77.9852 NA NA 217,560,390.00
## 5530 1996-01-03 79.7573 NA NA 440,230,415.00

Since we do not use the columns Low , High , and Turnover we remove these columns
from our data frame. We do this by selecting the columns that shall remain:

obx <- obx[, c("Date", "Last")]

So far we have only printed (part of) the data which gives us a first impression of how it
looks. However, to look at the complete time series of our index it is probably better to
plot it.

plot(obx$Date, obx$Last, type = "l", xlab = "", ylab = "OBX")


[Figure: plot of the OBX index level over time]


The first argument of the plot function gives the x coordinates, and the second the y
coordinates. The argument type="l" tells the function to draw lines (instead of just points).
The arguments xlab and ylab define the labels of both axes.
In order to estimate the index characteristics we first need to compute returns. We use log
returns for this purpose, that is

r_t = \log(p_t / p_{t-1})    (1)


We can accomplish this using the following line of R code:

obx$r = c(NA, diff(log(obx$Last)))

The arithmetic function log computes the natural logarithm of its argument. Since we
provide the complete column vector, the function computes the log of each number in that vector.
The function diff takes the successive differences of a vector of numbers. For example:

diff(c(1, 2, 4, 8))

## [1] 1 2 4

Since we take differences we lose one observation. To end up with a vector of the same length
as the initial price vector we add an NA at the beginning of the vector. This makes sense
because there is no return available at t = 0. The next step is to add a new column r to
our data frame. We now have done all the necessary steps and are ready to compute some
descriptive statistics.
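
As a side note (it will be useful for Exercise 2 later), simple returns can be computed along the same lines; a minimal sketch, where r.simple is a hypothetical column name of my choosing:

obx$r.simple <- c(NA, diff(obx$Last)/obx$Last[-nrow(obx)])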
Before we do so we save our data frame to a file because for very large CSV files it may take
a while to import the data which is not what we want. Given that we have imported the CSV
file correctly we can save it as an R data file, which uses compression and will thus not be a
text file anymore. The R data files typically have “RData” as an extension (suffix).

save(obx, file = "OBX.RData")

Let’s remove obx from the environment and then load the data in again but this time from
the R data file. Of course, you won’t notice any speed advantage here because the file is so tiny.

rm(obx)
load("OBX.RData")
head(obx)

## Date Last r
## 5532 1995-12-29 NA NA
## 5531 1996-01-02 77.9852 NA
## 5530 1996-01-03 79.7573 0.022469208
## 5529 1996-01-04 79.7102 -0.000590716
## 5528 1996-01-05 79.4104 -0.003768215
## 5527 1996-01-08 80.3380 0.011613392
[Kahoot! quiz 1]


2.4 Expected return and volatility


Now we are ready to estimate the expected return and the volatility of the index using historical
data. We will use the functions mean and sd .

mean(obx$r)

## [1] NA

sd(obx$r)

## [1] NA

Oops, what has happened? Recall that we have missing values ( NA ) in the data. Thus, we
also have NA s in the returns. If there is just one single NA in the vector, both functions
(and many others) will return NA . That makes you aware that something might be wrong
here. Well, in our case we know what is going on and enforce the calculation by providing the
appropriate argument:

mean(obx$r, na.rm = TRUE)

## [1] 0.0004131256

sd(obx$r, na.rm = TRUE)

## [1] 0.01492927

Alright, we now have the result. But the return and volatility look very small. Don’t forget
we calculated both on daily data and, thus, we have a daily expected return and volatility. We
need to scale them to get the results on an annual basis. The daily mean return is scaled by multiplying
it by the number of trading days in a year, while the volatility is scaled by the square root of
the number of days. Do you remember why? Under independent daily returns the variance of the
annual return is the sum of the daily variances, so the variance scales with the number of days and
the volatility with its square root.

mu <- mean(obx$r, na.rm = TRUE) * 250


sigma <- sd(obx$r, na.rm = TRUE) * sqrt(250)
mu

## [1] 0.1032814

sigma

## [1] 0.2360525

So what is the Sharpe ratio, that is the ratio of return to risk?

mu/sigma

## [1] 0.4375357


A Sharpe ratio like this is not too bad. But beware, this is an ex-post figure and not an ex-ante
measure.
While the volatility can be estimated reasonably well it is much harder to get a precise
estimate for the expected return if only a limited number of observations is available (e.g., less
than 15 years). Let’s verify this. How good is our estimate?
For this we make use of an important theorem in statistics. It is the elementary central limit
theorem that states that if the n observations of random variables r_1, r_2, . . . , r_n are independent
and identically distributed with expectation µ and variance σ^2, then the sample mean

\bar{r}_n = \frac{1}{n} \sum_{i=1}^{n} r_i = \hat{\mu}    (2)

is an estimate of the true mean µ and satisfies

\hat{\mu} \xrightarrow{\,n \to \infty\,} N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)    (3)

Because the term \sigma/\sqrt{n} is so important it is called the standard error (SE). Knowing the
distribution of the sample mean (in the limit) is an extremely powerful result. This allows
us to make the following inference. What is the probability p that the true value µ lies in the
confidence interval [µ̂−z ·SE; µ̂+z ·SE] with z being the corresponding quantile of the standard
normal distribution?
To compute the interval we need to determine SE and z for a given confidence level p. SE
can be calculated once we know σ. However, the parameter σ would typically be unknown in a
setting in which µ is also unknown. There is a way out of this dilemma. We can easily estimate
the standard deviation:
\hat{\sigma} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (r_i - \bar{r}_n)^2}    (4)

For brevity we now define the variables r and n and compute SE:

r <- na.omit(obx$r)
n <- length(r)
SE <- sd(r)/sqrt(n) * 250
SE

## [1] 0.05018987

The function na.omit takes a vector and returns a vector with all NA s removed. The
function length determines the length of a vector, that is, the number of elements it
contains. We annualize the standard error because we want to make inferences about the
annualized return. Be careful, the standard error is annualized by multiplying the daily standard
error by 250: the annualized mean is 250 times the daily mean, so its standard error scales by the
same factor (and not by the square root of 250).
Now we need to determine z. From statistics we know that for a standard normal random
variable Z we have

\mathrm{Prob}(Z \le -z) \equiv \Phi(-z) = \frac{1-p}{2}    (5)

with Φ being the standard normal distribution function (see Figure 2). Taking the inverse we get

z = -\Phi^{-1}\left(\frac{1-p}{2}\right)    (6)

Figure 2: Density of a standard normal distributed variable.

The (standard) normal distribution function Φ in R is called pnorm and its inverse function
Φ^{-1} is qnorm . Try to reproduce my hand-made plot using the normal density function
dnorm (a small sketch follows after the next code block). Let’s assume a confidence probability of p = 0.99, then
p <- 0.99
z <- -qnorm((1 - p)/2)
z

## [1] 2.575829
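
As an aside, a minimal sketch of how one might reproduce the density plot in Figure 2 with dnorm (the grid limits are arbitrary choices of mine):

x <- seq(-4, 4, by = 0.01)  # grid of points on the x-axis
y <- dnorm(x)               # standard normal density at each point
plot(x, y, type = "l", xlab = "", ylab = "")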

So what is the confidence interval?

c(mu - z * SE, mu + z * SE)

## [1] -0.02599912 0.23256194


[End of lecture 2]

This is how we learned to compute confidence intervals in the statistics course. Alternatively,
we could compute the interval bounds directly from the normal distribution function because we
know the two moments µ̂ and σ̂ = SE. We again use qnorm but add the mean and standard
deviation as arguments:

qnorm(0.5 - p/2, mu, SE)

## [1] -0.02599912

qnorm(0.5 + p/2, mu, SE)

## [1] 0.2325619


The bottom line is that the mean estimate is still not very precise and can vary tremendously
despite the long sample period of 22 years!

2.5 Exercises
1. Repeat the exercise in this section with return observations starting in 2010. What is
the 99.9% confidence interval?

2. Download the complete history of monthly stock prices in USD for IBM (Ticker: IBM) from
finance.yahoo.com.10 Plot the complete time-series of stock prices. Compute simple and
log returns. Based on both series compute an estimate for the expected return. Which
one is larger? Why? Further compute the 1% quantiles.

3. Download the complete history of daily and monthly stock prices in USD for Microsoft
(Ticker: MSFT) from finance.yahoo.com. For both sampling frequencies compute the
annualized means, standard deviations, variances, and 99% confidence intervals. Compare
the results.
[Kahoot! quiz 2]

10 Historical data can be downloaded by clicking on the tab Historical Data.


3 Volatility

Mini case 2 Volatility


Your client is unsatisfied with your work. He lets you know that he has an MBA in finance
and knows how to compute means and standard deviations. That is not what he is paying
you for. In the conversation with him he mentions terms such as “volatility regimes”,
“volatility clusters” and the like. Feeling embarrassed, you go back to your desk and start
thinking about what your client really meant. How can you improve your work so that you
don’t lose him as a client?

First, plot the returns and analyze their pattern. What do you see?

plot(obx$Date, obx$r, type = "l", xlab = "", ylab = "r")


[Figure: time series of daily OBX log returns]

Well, we make the following observations and conclusions:

• Volatility does not stay constant.

• Periods of high and low volatility, so-called “volatility clusters”.

• The sample standard deviation might be too simple a measure of risk.


• Remember, you are not so much interested in past volatility but in future volatility. You
need to predict volatility!

Let’s define the volatility σ_t of a market variable on day t as estimated at the end of day t − 1.
The square of the volatility is referred to as the variance, σ_t^2. Recall that the definition of the
unbiased sample variance of a series of n return observations r_{t−n}, . . . , r_{t−2}, r_{t−1} is

\sigma_t^2 = \frac{1}{n-1} \sum_{i=1}^{n} (r_{t-i} - \bar{r})^2    (7)

We make the following reasonable assumptions:

• The daily mean, r̄, is zero.

• n − 1 is replaced by n.

Thus we get

\sigma_t^2 = \frac{1}{n} \sum_{i=1}^{n} r_{t-i}^2    (8)

Every return observation contributes equally to the variance. That is, the one back in 1996
carries the same weight as the one from yesterday. To estimate the current level of volatility it is much
more sensible to put more weight on more recent observations. A more general model for the
variance is

\sigma_t^2 = \sum_{i=1}^{n} \alpha_i r_{t-i}^2    (9)

with α_i being the weight for the return observation on day t − i. If we choose α_{i−1} > α_i then less weight
is given to older observations. The weights must sum to unity. Let’s discuss some specific
models.

3.1 EWMA model


The Exponentially Weighted Moving Average (EWMA) model has weights that decrease exponentially
as we move back in time. Specifically, we have α_i = λ α_{i−1}, where λ is constant and
between 0 and 1. This leads to a simple formula for the variance estimates. We get

\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1-\lambda) r_{t-1}^2    (10)

The estimate of the volatility σ_t made at the end of day t − 1 is a weighted combination of
the estimate σ_{t−1} and the return realization r_{t−1}. If we substitute for σ_{t−1} and continue that
for all n observations we get

\sigma_t^2 = (1-\lambda) \sum_{i=1}^{n} \lambda^{i-1} r_{t-i}^2 + \lambda^n \sigma_{t-n}^2    (11)

Assuming that λ^n σ_{t−n}^2 is sufficiently small we can simplify to

\sigma_t^2 = (1-\lambda) \sum_{i=1}^{n} \lambda^{i-1} r_{t-i}^2    (12)

• σ_t is the volatility estimate for day t given all information up to (and including) day t − 1.
It’s a forecast!

• The weights on the squared returns decline by a factor of λ as we move back in time.

• λ governs how responsive the estimate of the daily volatility is to the most recent
return realization.

• A low λ gives a great deal of weight to recent return observations, whereas a high λ
produces estimates that respond slowly to new information.

• In practice λ is often set to 0.94, as suggested by RiskMetrics, but it can also be estimated by
maximum likelihood (more on this later).
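
To get a feel for how quickly these weights decay, one can print the first few of them; a small sketch, assuming the RiskMetrics value λ = 0.94:

lambda <- 0.94
(1 - lambda) * lambda^(0:4)  # weights on the five most recent squared returns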

Now we are ready to implement Eq. (12). Let’s assume that λ = 0.94. We first create a
vector of weights lambda.vec and then multiply it with the corresponding squared return series
r.vec . Let’s do this step by step:

n <- length(r)
lambda <- 0.94
lambda.vec <- lambda^(0:(n - 1))
r.vec <- r[length(r):1]
sigma2 <- (1 - lambda) * sum(lambda.vec * r.vec^2)
sigma <- sqrt(sigma2)
sigma * sqrt(250)

## [1] 0.09773794

Given a vector of n return observations r_{t−n}, . . . , r_{t−1}, this is the predicted (annualized)
volatility for day t based on the EWMA model. Note that this volatility is way lower than the
sample standard deviation that we computed earlier which was:

sd(r.vec) * sqrt(250)

## [1] 0.2360525

A shortcut to compute the EWMA volatility is:

sigma2 <- sum((1 - lambda) * lambda^(0:(length(r) - 1)) * rev(r)^2)


sigma <- sqrt(sigma2)
sigma * sqrt(250)

## [1] 0.09773794


Instead of selecting the most recent return observation first, the second most recent next and
so on, we simply reverse the complete return vector using the function rev .
As an exercise we now compute and plot the “historical” EWMA volatilities using an expanding
window. That is, each day, we use all past observations available for our calculation.
To put it differently, we “loop” over the entire return series. We thus first need to introduce
the for statement. How does that work? A little example will demonstrate.

v <- 1:5
for (i in v) {
cat(i, "\n")
}

## 1
## 2
## 3
## 4
## 5

The variable i will “run” through the vector v taking each value one by one. In each
cycle we print the value of the counter variable i to the console. Recall that cat prints the
value of a variable. The “\n” is a special character that starts a new line. Putting everything
together we now compute the historical EWMA volatilities like this:

sigma2 <- c()


for (i in 1:length(r)) {
sigma2 <- c(sigma2, sum((1 - lambda)*lambda^(0:(i - 1))*rev(r[1:i])^2))
}
sigma <- sqrt(sigma2)

Note that we first initialize the variance vector sigma2 , that is we create the variable but
do not write anything into the vector. It is of length zero. We then sequentially add variance
estimates to sigma2 by going forward in time. What does the resulting time series of volatilities
look like?

plot(sigma * sqrt(250), xlab = "", ylab = "EWMA", type = "l")

[Figure: historical EWMA volatility estimates (annualized), computed on an expanding window]

Are the EWMA volatility estimates at the beginning of the sample period good forecasts?
Why? Why not? The EWMA volatility estimate of the total sample period is a reasonable risk
measure that we could report to our client. It is a better short-term predictor than the sample
standard deviation.
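
As an aside, essentially the same historical series can be obtained more cheaply by applying the recursion in Eq. (10) directly instead of recomputing the full weighted sum each day. A minimal sketch, assuming the return vector r and lambda from above; the names sigma2.rec and sigma.rec are mine, the recursion is seeded with the first squared return, and the result differs from the expanding-window series only in the (negligible) weight placed on the very first observation:

sigma2.rec <- numeric(length(r))  # pre-allocate the variance vector
sigma2.rec[1] <- r[1]^2           # seed the recursion
for (i in 2:length(r)) {
  sigma2.rec[i] <- lambda * sigma2.rec[i - 1] + (1 - lambda) * r[i]^2
}
sigma.rec <- sqrt(sigma2.rec)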

3.1.1 Maximum likelihood approach


We have estimated the EWMA volatility by assuming that λ = 0.94. But does that really reflect
the data? Recall that λ determines the weighting scheme. If λ = 1 all returns are weighted
equally and we end up (almost) with the sample standard deviation. Lower values of λ put more
weight on more recent return observations. Clearly, λ tells us how quickly any volatility spike
fades out over time. Again, a value of one assumes constant volatility as all return observations
enter the formula with equal weights. Thus, looking at the time pattern of returns should tell us
something about the time-varying volatility and thus about λ. We therefore directly estimate λ
from the data using the maximum likelihood approach. We start with a simple example to show
how maximum likelihood estimation in general works and then apply it to the EWMA model.
Suppose we have n daily return observations r_{t−n}, . . . , r_{t−1} and we would like to estimate the
constant volatility of the returns. Let’s assume:
• Daily returns are normally distributed.
• The mean of the daily return is zero. Again, this is a reasonable assumption because the
mean is so much smaller compared to the return variation.


• Returns are independent over time.

What is the likelihood of observing r_t if we knew its volatility σ (or, equivalently, its variance
σ^2)? Well, since we assume returns to be normally distributed, it is simply given by its
probability density function. From standard statistics textbooks we know that the density of r = r_t
is given by

\frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{r_t}{\sigma}\right)^2}    (13)

assuming µ is zero. What is the likelihood of observing all n observations together? Again,
we know from statistics that if returns are independent then the likelihood of observing the n
observations is just the product of the probability densities of all individual observations. The
so-called likelihood function is:

\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{r_{t-i}}{\sigma}\right)^2}    (14)

[End of lecture 3]

We want to maximize this function, i.e. we want to search for the parameters that increase
the likelihood of observing exactly the given returns. The only unknown is the variance. Before
we maximize the function we apply a little trick here. We first take the logarithm of this
expression. Since the log function is monotone it does not matter for the result but it eases
the computation. Recall, the logarithm of a product of variables is equal to the sum of the
logged variables. The log likelihood function is:
n
-\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} r_{t-i}^2    (15)

We maximize this function by differentiating with respect to σ^2 (not w.r.t. σ) and setting the
final expression to zero:

-\frac{n}{\sigma^2} + \frac{1}{\sigma^4} \sum_{i=1}^{n} r_{t-i}^2 = 0    (16)

The variance that maximizes the log likelihood function is given by:

\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} r_{t-i}^2    (17)
Of course, this expression looks familiar to us. The only difference is the 1/n versus the
1/(n − 1) that we find in the equation of the sample standard deviation. In this example we
applied the log likelihood method and found an analytical solution for the variance σ 2 . Note
that for more complex cases we may not get a closed-form expression for the parameter
in question and, thus, we need to resort to a numerical approach. This is what we do next with
regard to the EWMA model.
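
Before moving on, Eq. (17) can be checked numerically against the sample standard deviation; a quick sketch, assuming the return vector r from Section 2 (with n large and the mean close to zero, the two annualized figures should be nearly identical):

sqrt(mean(r^2) * 250)  # maximum likelihood estimate: divides by n
sd(r) * sqrt(250)      # sample estimate: divides by n - 1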
In what follows we use the same method to derive a solution numerically for the EWMA
volatility. The only difference to the example above is that the variance is assumed to be time-
varying, that is, we have σ_t^2. The log likelihood function (after some simplifications) is given
by


\sum_{i=1}^{n} \left( -\log(\sigma_{t-i}^2) - \frac{r_{t-i}^2}{\sigma_{t-i}^2} \right)    (18)

with the expression for the EWMA variance given in Eq. (12).
Again, we want to maximize the log likelihood function numerically. How can we do this in
R? For this we first need to discuss the following topics:

• How to define functions in R?

• How to optimize in R?

3.1.2 Functions in R
Let’s first define a function div in R that divides a number a by b. This is how you do it:

div <- function(a, b) {
result <- a/b
return(result)
}

We define the function div using the command function . For the function name the same
rules apply as for any other variable name. In parentheses we tell R what arguments the func-
tions requires. In our example we have the arguments a and b which are both being processed
inside the function body. Note that we do not necessarily need to write return(result) , in-
stead we could just write result or a/b . But it is good programming style since it makes
immediately clear what the function returns. After having defined the function we can enter its
name.

div

## function(a, b) {
## result <- a/b
## return(result)
## }

If we just enter the function name then R returns the definition of the function that is,
it does not call the function. We know, in order to call the function we need to provide its
arguments:

div(4, 2)

## [1] 2

div(2, 4)

## [1] 0.5


As you see, it is highly important (in this case) to provide the arguments in the right order.
Thus, you need to know how your function is defined. If you use internal functions or functions
provided by someone else make sure that you know how to run it. Check the help page!
Nevertheless we have some flexibility in calling the function. This brings us back to the “named
argument” concept. Both calls return the same result:

div(4, 2)

## [1] 2

div(b = 2, a = 4)

## [1] 2

In the first example R interprets the arguments as given in the function definition. In the second
example we tell R how to interpret the arguments and thus we do not strictly need to follow the
function definition in terms of the sequence of the arguments. This gives us some flexibility.
Sometimes you want to have default values for some of the arguments so that you do not
necessarily need to provide them for each function call. Suppose for the moment, that you may
want to divide a number a by 2 in most of the cases. You could define a function:

div2 <- function(a, b = 2) {


a/b
}

Now you have the option whether to provide the denominator or not:

div2(4)

## [1] 2

div2(4, 3)

## [1] 1.333333

There is one more important thing to mention about functions. A frequent requirement is
to allow one function to pass on argument settings to another nested function. This can be done
by including an extra argument, literally ... , to the function, which may then be passed on.
Suppose we want to write a function plot.normal that plots the normal density with mean
mu and volatility sigma from x1 to x2 . This is one way we could do it:

plot.normal <- function(x1, x2, mu, sigma) {


x <- seq(x1, x2, by = 0.01)
y <- dnorm(x, mu, sigma)
plot(x, y, type = "l", xlab = "", ylab = "")
}


Note that the previous function makes use of the function seq which creates a vector of values
starting with x1 and stopping at x2 with increments given by by . We can also define
the function above using the ... argument, which does just the same but is shorter and more
versatile.

plot.normal <- function(x1, x2, ...) {


x <- seq(x1, x2, by = 0.01)
y <- dnorm(x, ...)
plot(x, y, type = "l", xlab = "", ylab = "")
}
plot.normal(-3, 3, 1, 1.5)

But what is the difference? We did not specify the properties of the normal density in plot.normal .
Using the ... argument in our function specification allows us to pass on further arguments
in the function call. These arguments are then used in the function body. In our case they are
provided to the function dnorm . There they are interpreted as the mean and the volatility.
Check the dnorm help page to verify that this is indeed true.
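
Because the extra arguments are simply handed on to dnorm , they can also be passed by name using dnorm 's own argument names mean and sd ; a small sketch:

plot.normal(-3, 3, mean = 1, sd = 1.5)  # same plot as before, arguments named
plot.normal(-3, 3)                      # falls back to the standard normal density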

3.1.3 Optimization in R
Now let’s do some optimization. I will demonstrate the optimization procedure with a little
example. Assume we want to find the maximum of the function y = f(x) = −(x − 2)² which
looks like this:

x <- seq(-1, 5, by = 0.1)


y <- -(x - 2)^2
plot(x, y, type = "l")


[Figure: plot of y = −(x − 2)² for x from −1 to 5]

It is quite obvious from the figure as well as from the equation that the maximum must be at
x = 2. However, we would like to find it numerically using optimization. The optimization
problem formally is given by:

\max_x f(x) = -(x-2)^2    (19)

We maximize the objective function f by varying x. Let’s define the function f in R first:

f <- function(x) {
y <- -(x - 2)^2
return(y)
}

One way to search for x is to do by trial and error so that f (x) is maximized:

f(1.5)

## [1] -0.25

f(2)

## [1] 0


f(2.5)

## [1] -0.25

Of course, this may be too cumbersome if we have a very complex function and no idea where
the optimal solution is. Thus we use an optimizer similar to the Excel Solver. There are many
optimizers available in R, each to be used for a specific need. For example, some optimizers
are designed for linear, some for non-linear problems. Some optimizers allow you to specify
boundaries, some do not. We are going to use nlm here. Note, however, that this function carries
out a minimization instead of a maximization. This is no problem because we can redefine
our objective function. Maximizing f (x) is the same as minimizing −f (x). Let’s redefine f
accordingly.

f <- function(x) {
y <- (x - 2)^2
return(y)
}

Then apply the optimization and assign the result to a variable res :

res <- nlm(f, 0)


res

## $minimum
## [1] 0
##
## $estimate
## [1] 2
##
## $gradient
## [1] 0
##
## $code
## [1] 1
##
## $iterations
## [1] 2

The first argument of nlm is the name of the objective function which in our case is f . The
second argument is used to tell the solver where to start to search for the minimum. The
function returns a list which is a collection of variables of different types similar to a data frame.
The list contains the following components:

minimum
Value of the estimated minimum of f .

estimate
Point at which the minimum value of f is obtained.


gradient
Gradient at the estimated minimum of f .

code
An integer indicating why the optimization process has been terminated. For example, “1”
indicates a probable solution.

iterations
Number of iterations performed.
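
The individual components of this list can be accessed with the $ operator, for example:

res$estimate

## [1] 2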
[Kahoot! quiz 3]
3.1.4 Searching for the optimal EWMA parameter
Now it is time to come back to our original problem. Remember we have the EWMA volatility
model that has one parameter which is λ. We want to estimate λ so that the model best reflects
our historical data. We need to:

1. Define a function ewma.var that computes the EWMA variance following Eq. (12).

2. Define a function hist.ewma.var that computes a vector of historical EWMA variances.

3. Specify an objective function which is the maximum likelihood function given in Eq. (18).
Let’s call this function ewma.ll.fun .

4. Choose an appropriate optimization function. We will use nlminb because it allows us


to bound the λ. Remember λ is defined only between 0 and 1. Then we specify a starting
value for λ and start the optimization.

5. Verify the output result.

Let’s do this step by step. We start by defining a function that computes the EWMA variance
given the parameter λ.

## Computes the EWMA variance


##
## r: Vector of return observations
## lambda: EWMA parameter
## Returns the EWMA variance
ewma.var <- function(r, lambda) {
sigma2 <- sum((1 - lambda)*lambda^(0:(length(r) - 1))*rev(r)^2)
return(sigma2)
}

It is always a good idea to comment what a function does, what arguments are needed and
what exactly the function returns. Now we use the previous function to compute the historical
EWMA variances.


## Computes historical EWMA variances


##
## r: Vector of return observations
## lambda: EWMA parameter
## Returns a variance vector
hist.ewma.var <- function(r, lambda) {
sigma2 <- c()
for (i in 1:length(r)) {
sigma2 <- c(sigma2, ewma.var(r[1:i], lambda))
}
return(sigma2)
}

We then define the log likelihood function which serves as the objective function in our
optimization problem.

## Log-likelihood function for EWMA volatility model.


##
## lambda: EWMA parameter
## r: Return vector
## Returns the negative of the log likelihood.
ewma.ll.fun <- function(lambda, r) {
sigma2 <- hist.ewma.var(r, lambda)
sigma2 <- sigma2[-length(sigma2)]
r <- r[-1]
log.ll <- sum(-log(sigma2) - r^2/sigma2)
return(-log.ll)
}

The first argument of the objective function is the parameter to be searched for, i.e. lambda .
This is important because every optimization routine in R assumes the first argument of the
objective function to be the unknown parameter. We then calculate a vector of historical
EWMA variances. Next we need to match the vector of forecasted and realized variances. That
is why we drop the last forecast (because we do not have a corresponding realization) and the
first realized variance (because we have no forecast right at the beginning of our sample period).
Note that we use negative indices to de-select values in a vector. Finally, we return the negative
of the likelihood function because almost all optimization routines in R minimize instead of
maximize a function.
We choose to use nlminb for this purpose because, contrary to nlm, it allows us to constrain
the parameters. This makes sense here because λ is only defined between 0 and 1.

res <- nlminb(0.5, ewma.ll.fun, lower = 1e-06, upper = 1 - 1e-06,
    r = r)

The first parameter shall be a sensible starting value for λ, i.e. the objective function needs
to be defined there. Further note that contrary to nlminb the first parameter for nlm was
the objective function. The second argument is the objective function (the negative of the


likelihood function). The variable lower is the lower bound of λ. Here we do not choose zero
because zero is not a valid value for the objective function. Try it. The variable upper is the
upper bound of λ. The same applies as for the lower bound restriction. Finally, r is the vector
of all return observations. This is the ... argument in the nlminb function definition. So
nlminb just passes on the variable to our objective function ewma.ll.fun . Now let’s look at
the results. The function returns an object of type list:

res

## $par
## [1] 0.9231018
##
## $objective
## [1] -43123.89
##
## $convergence
## [1] 0
##
## $iterations
## [1] 7
##
## $evaluations
## function gradient
## 11 9
##
## $message
## [1] "relative convergence (4)"

Let us interpret the obtained results:

par:
The optimal λ that maximizes the log-likelihood or (equivalently) minimizes the objective
function. It is almost the same value as suggested by RiskMetrics!

objective:
Value of the objective function.

convergence:
Indicates the optimization routine has converged (see help page).

iterations:
Number of iterations needed until it is considered to have converged.

Again, the parameter λ that we have estimated is very close to the one recommended by
RiskMetrics. You can even test different lengths of the sample period; the parameter does not
change dramatically. What is the EWMA volatility forecast given λ = 0.92?


sqrt(ewma.var(r, lambda = 0.94) * 250)

## [1] 0.09773794

sqrt(ewma.var(r, lambda = res$par) * 250)

## [1] 0.09517522

The difference is negligible. Again, the EWMA estimate would be a reasonable estimate for
today’s volatility of the OBX index. However, to impress your client even further you look for
a more advanced variance model which we discuss in the next section.

3.2 GARCH model


We now move on to a more sophisticated but very popular type of volatility model. This is the
Generalized Autoregressive Conditional Heteroskedasticity model or, more simply, the GARCH
model. So why might it be necessary to use a more complex model than the EWMA model?
Let's first look at its specification and then discuss the difference to the EWMA model. The
GARCH(1,1) model is defined as

σ_t^2 = γ V_L + α r_{t−1}^2 + β σ_{t−1}^2    (20)

with

γ + α + β = 1    (21)

We see that the variance is calculated from a long-run variance rate V_L as well as from σ_{t−1}^2 and
r_{t−1}^2. In this respect it is similar to the EWMA model with a small but important extension.
The “(1,1)” indicates that the forecast σ_t^2 depends on the most recent squared return observation
r_{t−1}^2 and the most recent estimate of the variance rate σ_{t−1}^2. Setting ω ≡ γ V_L we get

σ_t^2 = ω + α r_{t−1}^2 + β σ_{t−1}^2    (22)

Note that for a stable GARCH(1,1) model we need α + β < 1 because otherwise we would
have a negative weight on the long-term variance.
The GARCH(1,1) model recognizes that over time the variance tends to get pulled back
to its long-run mean V_L. Thus, the GARCH process is mean-reverting, whereas
the EWMA model does not incorporate mean-reversion. This is the important difference which
makes the GARCH model more appealing than the EWMA model.
Moreover, in contrast to the EWMA model, there are no standard parameters that can be
used. We need to estimate them using maximum likelihood. The log likelihood function is the
same as for the EWMA model. Recall that it is given by

Σ_{i=1}^{n} [ −log(σ_{t−i}^2) − r_{t−i}^2 / σ_{t−i}^2 ]    (23)

The only difference, of course, is how σ_t^2 is defined. Analogously to the EWMA model we
first define a function garch.var that computes the GARCH variance following Eq. (22):


## Computes the GARCH variance


##
## r: Vector of return observations
## omega: GARCH parameter
## alpha: GARCH parameter
## beta: GARCH parameter
## Returns a vector of GARCH variances
garch.var <- function(r, omega, alpha, beta) {
sigma2 <- r[1]^2
for (i in 2:length(r)) {
sigma2 <- c(sigma2, omega + alpha*r[i]^2 + beta*sigma2[i - 1])
}
return(sigma2)
}

This needs some explanation. The formula for the GARCH variance is a recursive formula, that
is, it depends on previous variance estimates.

t − n:
First return observation in our sample. We cannot make any forecast at t − n because we do
not have any information before that date.

t − n + 1:
We still cannot apply our recursive formula given by Eq. (22) because we do not know the
forecast σ_{t−n}^2. However, at some point we need to start. We thus make the following
reasonable assumption: σ_{t−n+1}^2 = r_{t−n}^2. This is the first line in our function.

t − n + 2:
From now on we can recursively apply Eq. (22) using the for statement.

We now define the log likelihood function to estimate the GARCH model:

## Log-likelihood function for GARCH volatility model.


##
## par: Vector of GARCH parameters
## r: Vector of return observations
## Returns the negative of the log likelihood.
garch.ll.fun <- function(par, r) {
omega <- par[1]
alpha <- par[2]
beta <- par[3]
sigma2 <- garch.var(r, omega, alpha, beta)
r <- r[-1]
sigma2 <- sigma2[-length(sigma2)]
ll <- sum(-log(sigma2) - r^2/sigma2)
return(-ll)
}


Again, the first argument of the objective function must always be the parameter that we are
looking for. In this case, however, we have three parameters (i.e. ω, α, and β). This is no
problem because we can put them into one single vector called par that shall be the first
argument of the function. We are now ready to do the optimization.

res <- nlminb(c(0.001, 0.3, 0.3), garch.ll.fun, lower = 1e-06,
    upper = 1 - 1e-06, r = r)

What do we find?

res

## $par
## [1] 3.253296e-06 1.101613e-01 8.744163e-01
##
## $objective
## [1] -43204.7
##
## $convergence
## [1] 0
##
## $iterations
## [1] 26
##
## $evaluations
## function gradient
## 63 94
##
## $message
## [1] "relative convergence (4)"

The long-term variance VL = ω/γ is

omega <- res$par[1]
alpha <- res$par[2]
beta <- res$par[3]
gamma <- 1 - alpha - beta
VL <- omega/gamma
VL

## [1] 0.0002109465

sqrt(VL * 250)

## [1] 0.2296446

This is nearly the same as the unconditional standard deviation that we have estimated earlier.
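
As a quick sanity check (assuming the vector r still holds the daily returns used above), we can compare this with the plain annualized sample standard deviation:

sqrt(var(r) * 250)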


3.3 Volatility forecasts


Recall, we have used EWMA and GARCH to get a volatility forecast σ_t that was made at the
end of day t − 1 with all the information available up to that point in time. What if we want
to make a forecast of the volatility for day t + m ahead? For the EWMA model the estimate
for day t + m is exactly the same as the one we made for today t. That is

E_{t−1}[σ_{t+m}^2] = σ_t^2    (24)

To put it differently, although we weight recent return observations more heavily than past
observations, the daily volatility forecast remains constant.
What about the GARCH model? Here we need to be more careful because the GARCH
model was specified as a mean-reverting process. This implies that the volatility is reverting
to its long-run mean V_L. For example, if the volatility today is above its long-run mean we
“expect” the future volatility to be pulled back to its long-run mean. The opposite is true if it is
below its long-run mean. Formally, the expected variance m + 1 days ahead is

E_{t−1}[σ_{t+m}^2] = V_L + (α + β)^m (σ_t^2 − V_L).    (25)

This equation tells us that our prediction of future volatility is not constant, in contrast to
the EWMA volatility model or the simple sample standard deviation. We get a term structure
of volatilities.
It's time now to put everything together and make a forecast, for example, for the return
from t to t + m. I am going to use the following notation: r_{t→t+m}. This return is not known
yet; it is random, of course. It consists of all m + 1 daily (random) returns up to t + m, that is:

r_{t→t+m} = r_t + r_{t+1} + . . . + r_{t+m}    (26)


Beware, this is not a daily return anymore. What’s the risk, that is the variance of this
return? Let’s compute it using the “variance operator”:

Var[rt→t+m ] = Var[rt ] + Var[rt+1 ] + . . . Var[rt+m ] (27)


Note the variance of the sum is the sum of the variance of each daily return only if returns are
independent (which we earlier assumed). We are almost there. If we assume the variance to be
constant such as for the EWMA model we get
m
(28)
X
Var[rt→t+m ] = Var[rt+i ] = m · σ 2
i=0

This is the standard rule you are familiar with for scaling variances (and volatilities)!
Again, for the GARCH model the forecasted variances are not constant but mean-reverting.
Thus we have
m
(29)
X
2
Var[rt→t+m ] = σt+i
i=0

where σt+i
2 is defined as in Eq. (25).
Let’s make a real forecast using both models. Suppose we want to make a one-year forecast.
For the EWMA model we get an annualized volatility of


ewma.sigma2 <- ewma.var(r, lambda = 0.94)
sqrt(ewma.sigma2 * 250)

## [1] 0.09773794

Using the GARCH model we need to compute predictions for all 250 trading days.

m <- 249
garch.sigma2 <- garch.var(r, omega, alpha, beta)
garch.sigma2.t <- garch.sigma2[length(garch.sigma2)]
garch.sigma2.vec <- VL + (alpha + beta)^(0:m) * (garch.sigma2.t -
VL)

Let’s plot the forecasted annualized GARCH volatilities:

garch.sigma.vec <- sqrt(garch.sigma2.vec)
plot(garch.sigma.vec * sqrt(250), xlab = "Days", ylab = "GARCH Volatility",
    type = "l")
[Figure: forecasted annualized GARCH volatility; x-axis: Days (0–250), y-axis: GARCH Volatility (roughly 0.12–0.22)]


sqrt(sum(garch.sigma2.vec))

## [1] 0.2065874

How do you interpret both results? Why is the GARCH volatility higher than the EWMA
volatility?

3.4 Option-Implied Volatility
So far we have used historical data to estimate future volatility. An alternative is to use options
traded on the index. We know that the price of an option depends on the volatility of the
underlying asset (in our case it's the OBX index). Given that we observe the price of the
option we can back out its implied volatility. This approach does not rely on historical data.
In fact, the volatility that we estimate is forward-looking because option prices incorporate the
expectation about the future risk of the underlying.
First, we need to get information about options traded on the OBX index. We get this
information from the Oslo Stock Exchange.11 There are several options traded. Which one to
choose? Well, that depends. In principle it shouldn't matter because according to the Black-
Scholes model there is just one volatility that prices all options traded on a given underlying.
Recall that according to the model volatility is constant! We thus would like to have an option
that is traded frequently so that its price indeed reflects investors' opinion about the future
prospects of the underlying index. The most liquid options are usually traded at-the-money,
that is, the strike is close to the current value of the index which trades at 766 as of January
24, 2018. Options with shorter expirations typically also have higher liquidity. We decide to
take a call, but put options would also be fine. The following quotes are obtained from the Oslo
Stock Exchange on January 24, 2018:

Ticker     Type  Strike  Maturity    Buy   Sell
OBX8B770   Call  770     2018-02-16  7.50  8.75

Note how large the bid-ask spread of this option is. What does this mean? We take the
mid-quote of the prices.

C.market <- mean(c(7.5, 8.75))

We then use the Black-Scholes formula. If you can't remember it, here is the formula again:

d_1 = [ln(S_0/K) + (r_f + σ^2/2)T] / (σ √T)    (30)

d_2 = d_1 − σ √T    (31)

C = S_0 · Φ(d_1) − e^{−r_f T} · K · Φ(d_2)    (32)

You should already be familiar with this famous equation. It will be more of a challenge to
transform these three equations into an executable R script. We will first define the parameters
11 Options on OBX: https://www.oslobors.no/ob_eng/markedsaktivitet/#/derivativeUnd/OBX.OSE


as given above, that is, we assign the numbers to the variables. For the risk-free interest rate
r_f we use the Nibor, which should reflect the rate at which banks can borrow from each other. You
will get the Nibor rate for the corresponding maturity also from the Oslo Stock Exchange.12
As an approximation we use the 1-month Nibor rate.

S0 <- 766.12  # OBX index
K <- 770  # Strike price
T <- as.numeric(as.Date("2018-02-16") - as.Date("2018-01-24"))/365  # Maturity
rf <- 0.008  # Risk-free interest rate

Again, it is good programming style to comment your script. You do this using the character
“#”. We compute the number of days between the valuation date and maturity by taking the
difference between those dates. Remember, this returns the number of days in between as a
date-difference object, which we then need to transform into a numeric using the command as.numeric.
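
The following small illustration shows what happens in that line (using the same two dates):

d <- as.Date("2018-02-16") - as.Date("2018-01-24")
d  ## a "difftime" object: Time difference of 23 days
as.numeric(d)  ## 23, a plain number we can divide by 365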
Don't forget to compute the year fraction of the time-to-maturity. Next, we define a function
that computes the price of a European call option.

call <- function(S0, sigma, K, rf, T) {
  d1 <- (log(S0/K) + (rf + sigma^2/2) * T)/sigma/sqrt(T)
  d2 <- d1 - sigma * sqrt(T)
  C <- S0 * pnorm(d1) - exp(-rf * T) * K * pnorm(d2)
  return(C)
}
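
As a quick plausibility check we can evaluate the function for some trial volatility, say 20% (the particular value is arbitrary; we only want to confirm that the function runs and returns a price):

call(S0, 0.2, K, rf, T)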

Now we must define the objective function. But what do we really want to maximize or minimize
here? We do not have a maximum likelihood function as for the EWMA or GARCH models.
Actually, we don't need one here. Note that we are looking for a target σ that, when plugged into
the Black-Scholes equation, returns the observed market price of the call option C_market.
The subscript reminds us that this is the market price. Thus we can write for the objective
function

ε = (C(S_0, σ, K, r_f, T) − C_market)^2,    (33)

where C(S_0, σ, K, r_f, T) is the model price C_model and ε refers to the squared error between
model and market price. We thus wish to minimize this function. The optimization problem is:

min_σ ε = (C(S_0, σ, K, r_f, T) − C_market)^2.    (34)

Let's put everything together and define the objective function in R.

obj.fun <- function(sigma, C.market, S0, K, rf, T) {
  C.model <- call(S0, sigma, K, rf, T)
  eps <- (C.model - C.market)^2
  return(eps)
}
12 Nibor: https://www.oslobors.no/ob_eng/markedsaktivitet/#/list/nibor/quotelist


Start the optimizer with a sensible starting value:

res <- nlm(obj.fun, p = 0.2, C.market = C.market, S0 = S0, K = K,
    rf = rf, T = T)

What does the optimizer find?

res

## $minimum
## [1] 3.642174e-15
##
## $estimate
## [1] 0.1270815
##
## $gradient
## [1] -3.920127e-08
##
## $code
## [1] 1
##
## $iterations
## [1] 4

We find that the option-implied volatility is above the EWMA volatility but still much lower
than the GARCH volatility estimate and the simple sample standard deviation.
Proud of having estimated four different measures of volatility, you now go back to your
client. But what estimate do you advise him to use? It very much depends on the investment
horizon of your client. For short-term investments of less than three months, say, your client
should use the option-implied volatility or the EWMA volatility. For “long-term” investments
your client should instead use the GARCH volatility estimate.

3.5 Exercises
1. Download the complete history of monthly stock prices in USD for Google (Ticker: GOOG)
from finance.yahoo.com. Estimate an EWMA model to forecast the 1-month volatility.

2. Download the complete history of monthly stock prices in USD for Google (Ticker: GOOG)
from finance.yahoo.com. Estimate a GARCH(1,1) model to forecast the 1-month volatil-
ity.

3. Go to finance.yahoo.com and search for call options on Google. Choose the shortest
maturity and compute the implied volatilities of a range of options with varying strike
prices. Assume a risk-free interest rate of 50 bp. Can you replicate the implied volatilities
that are given on the same page?


4 Monte-Carlo simulation

Mini case 3 Exotic options


You are an expert witness in a court case where a bank was accused by an investor of
having incorrectly quoted the price of a security. The investor sued the bank claiming that
the selling price (i.e. bid price) of this product was too low and thus unfair. The security is
a knock-out warrant call or down-and-out call option, respectively, on the EURO STOXX
50. Certificates have become very popular among investors in the recent past because they
allow the trading of more sophisticated products compared with regular options. Barrier
options are also less expensive than standard options. You have been asked to determine
the market price of this security and evaluate whether the selling price was fair. You were
provided with the following information:

WKN PR1LWN
Name Knock-out warrant call
Type Down-and-out call option
Underlying EUROSTOXX 50
Strike price 3220
Barrier level 3220
Selling date January 27, 2017
Selling price 0.84
Maturity April 27, 2017
Ratio 1:100

A down-and-out call option is a barrier option. The payoff depends on whether the price of
the underlying falls below a certain barrier level during its life time. A knock-out option ceases
to exist when the underlying falls below a certain barrier level.
Let’s make this a little bit more formal now. A down-and-out call option with barrier b,
strike K, and expiration T has the payoff

I{τ (b) > T }(S(T ) − K)+ , (35)


where

τ (b) = inf{t : S(t) < b} (36)


is the first time in t ∈ [t_0, T] the price of the underlying asset, S, drops below b, and I{·} denotes
the indicator of the event in braces, i.e. I is one if the expression within braces is true and zero
otherwise. The payoff function as specified above is an example of a continuously monitored
option. That means that it is continuously checked whether the underlying falls below the
barrier. In contrast, for discretely monitored barrier options one only looks at certain dates
such as every end-of-month.


4.1 Fundamental Theorem of Asset Pricing


In general, the price of any claim (or payoff) is the discounted expected value under a risk-
neutral probability measure. This is referred to as the Fundamental Theorem of Asset Pricing.
Thus, we can write for the price of the call option:

C = E[ e^{−rT} · I{τ(b) > T}(S(T) − K)^+ ]    (37)

where I{τ(b) > T}(S(T) − K)^+ is the payoff at T.

Normally, we would try to solve the above equation in closed-form. That is, obtain an
equation where we plug in all the numbers to calculate the price of the option. However,
discretely monitored barrier options cannot be solved analytically. Therefore, one either needs to
make some simplifications or resort to a numerical approach. A widely used numerical approach
in finance is Monte-Carlo simulation. Although continuously monitored barrier options can be
priced in closed-form we still frequently rely on Monte-Carlo techniques. The reason is that
Monte-Carlo methods are straightforward to implement and are the most general technique for
the pricing of derivatives.

4.2 Principles of Monte-Carlo simulation


Let me introduce the principles of Monte-Carlo simulation first.

• Monte-Carlo simulation essentially uses the law of large numbers to evaluate the expecta-
tion E[X] of a random variable X.

• The law of large numbers states that the sample mean of a sequence of independent and
identically distributed random variables X1 , . . . , Xn , converges to the expected value E[X],
i.e.
n
1X
(38)
n→∞
Xn = Xi → E[X]
n
i=1

• The mean X n is an estimator of the expected value E[X] and is usually denoted by X̂n .

Let's do a little example. If we roll a die, what is the expected roll? In this case it's trivial
to determine. It's simply the sum of each outcome times the probability of its occurrence. More
formally, the expected value of a roll X is

E[X] = Σ_{i=1}^{n} Prob[X = x_i] · x_i = (1/6) · 1 + . . . + (1/6) · 6 = 3.5.    (39)

So the answer is 3.5. But what if we did not know this? Alternatively and quite intuitively,
you would start rolling the die many times and write down the rolls. After many rounds you
take the mean of the rolls. You should end up close to the true value of 3.5. Thus we can
compute or estimate an expectation by simply simulating and then taking the mean of the
outcomes. This is a very important result. Let's try it numerically using the computer. Since
the rolls are all equally likely to occur, they are uniformly distributed. How can we sample
from such a distribution in R?


n <- 1000
roll <- sample(1:6, n, replace = TRUE)

We have used the command sample to sample from a given set of elements. The elements are
a sequence of numbers between 1 and 6 corresponding to the outcomes of rolling a die. We
repeat this n times and tell the function to sample with replacement. That means any roll can
occur again in the next trial.

table(roll)

## roll
## 1 2 3 4 5 6
## 164 182 155 182 155 162

mean(roll)

## [1] 3.468

We use the function table to get a summary of all our draws. It returns the number of
occurrences of each outcome in our simulation. We see that they all occur with almost the same
frequency. We then compute the mean, which intuitively is an estimate of the expected roll. We
see it is fairly close to the expected value.
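
We can also gauge how precise this estimate is by computing its standard error, a quantity we will use extensively later in this section:

sd(roll)/sqrt(n)  ## standard error of the estimated expected roll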
We are now ready to apply the Monte-Carlo principle in a more general setting. In fact it
allows us to compute any expectation of a random variable. Note that the law of large numbers
is a numerical method for evaluating integrals, because

E[X] ≡ ∫ x · f(x) dx    (40)

with f being some density function of X. The law of large numbers can also be applied to a
function of a random variable, i.e. Y = g(X):

E[Y] = ∫ g(x) f(x) dx    (41)

Look at the similarity between the general definition of the expected value and the one we have
used to compute the expected dice roll. The only difference is that in the dice example we have
a discrete distribution, whereas the above equation refers to a continuous distribution.

4.3 A model for the index price


How do we apply the Monte-Carlo method to price our option? We can't simulate discounted
payoffs directly because we don't know what distribution to draw from. Note that this distribution
cannot be derived given the complexity of the payoff function. However, we know that the
payoff is determined by the underlying of the option. Thus, we could “simulate” paths for the
index value. Once we know the evolution of a path, we are able to compute the option payoff by
simply applying the payoff function conditional on that path. Did the index price fall below the
barrier? If so, there is no payout. Was the price always above the barrier? Then it's simply the
payoff of a regular call at maturity. The idea is simple. But how do we simulate a single path?


Well, we first need a model for the dynamics of the index price which defines its distribution.
In the dice example above it was straightforward. The roll was uniformly distributed. Here we
follow the Black-Scholes model by assuming the dynamics to follow the stochastic differential
equation (SDE)

dS(t) = r_f S(t) dt + σ S(t) dW(t).    (42)

The coefficient on dt is the mean rate of return, σ denotes the volatility of the index price
and W is a standard Brownian motion or Wiener process (this brings the uncertainty into
future prices). The dW in the above equation refers to a Gaussian increment where
W(t + Δt) − W(t) ∼ N(0, √Δt). In taking the rate of return to be the same as the risk-free
interest rate r_f we implicitly describe the risk-neutral dynamics of the index price. Whew! That
looks complicated! That needs some explanation.
The SDE describes the price changes over infinitesimally small time steps dt. Let's make a
simplification by assuming that the price only changes, for example, from one day to the next.
That is, we move from dt to Δt. The discrete version of the SDE is

ΔS(t) = r_f S(t) Δt + σ S(t) ε √Δt,    (43)

where ε is a random variable from a normal distribution with mean 0 and standard deviation 1.
After rearranging we can make it even more explicit what the equation tells us:

(S(t + Δt) − S(t)) / S(t) = r_f Δt + σ ε √Δt    (44)
This equation models the relative change in the price, which consists of a deterministic part
and a stochastic part. This enables us to simulate the value of the index at time t + Δt given
its price today at t. The uncertainty comes from drawing a normal random variable ε. Once
we get the value S(t + Δt) we use the same equation to simulate the price one further time step
ahead, that is S(t + 2Δt), and so on. One simulation trial involves constructing a complete path
for S. We then continue simulating a second path, etc.
In practice it is more accurate to simulate from the continuous model instead. We thus need
to solve the SDE first. (You don't need to do this.) The solution of the SDE is

S(t + Δt) = S(t) · exp[ (r_f − σ^2/2) Δt + σ ε √Δt ],    (45)

where ε again refers to a standard normal random variable. It's worth remembering this equation
because it is so widely used in finance!
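
As a minimal illustration of Eq. (45), here is a single one-step simulation with made-up inputs (a price of 100, a risk-free rate of 0.2%, a volatility of 16% and a time step of one trading day); the actual inputs for our option are defined further below:

## One step of Eq. (45) with hypothetical parameter values
100 * exp((0.002 - 0.5 * 0.16^2) * (1/250) + 0.16 * sqrt(1/250) * rnorm(1))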
You may have noticed that our process is growing with the risk-free interest rate. Why is
that? Shouldn’t the expected return be much higher since we are exposed to risk? In fact, we
have estimated the expected return in the first case study but now we are simulating the price
process with the risk-free rate. Let’s look into this in more detail:

• The model above is valid in a “risk-neutral” world.

• This is an important principle in finance and is called risk-neutral valuation.

• In a risk-neutral world investors do not increase the rate of return beyond the risk-free
rate to be compensated for risk.


• Of course, we are not risk-neutral, we are risk-averse. We simply pretend to be so because
it makes valuation much easier.

• We know that risky cash flows (such as those from our option) need to be discounted at
the right rate. The higher the risk, the higher the discount rate. But what rate shall we
use exactly? In the real world, everyone is differently risk-averse.

• It turns out that once we know that we can replicate the option by the underlying and
a risk-free investment, risk preferences become unimportant. In fact, we are then in a
risk-neutral world which has two very important implications:

  – The expected return on any investment is the risk-free rate.

  – The discount rate used for the expected payoff of any investment is the risk-free
rate (see later).

Now it's time to simulate a few paths of our stock index. The only ingredients we need are
the risk-free interest rate r_f and the volatility σ of the EURO STOXX 50. We could download
historical prices of the index and then use one of our previous models to estimate the volatility.
However, there is a much better and easier approach. We use the volatility index based on the
EURO STOXX 50 (VSTOXX). The VSTOXX indices are based on EURO STOXX 50 real-time
options prices and are designed to reflect the market expectations of near-term up to long-term
volatility by measuring the square root of the implied variance across all options of a given time
to expiration. Find the end-of-day volatility of the index on January 27, 2017.13 We also look
for the end-of-day price of the EURO STOXX 50 on the same webpage. Furthermore, we know
from the product specification the time-to-maturity, the strike price and the barrier level.
We define all these variables next. We assume the risk-free interest rate to be r_f = 20 bp. We
are going to simulate n = 5000 scenarios (that is, paths).

sigma <- 0.16
S0 <- 3303
T <- 0.25
b <- K <- 3220
rf <- 0.002
f <- 0.01
n <- 5000

Next we have to decide on the time grid. We are going to simulate (as an approximation) the
index twice a day. Thus the time step is

dt <- 1/250 * 0.5

Note that in doing so we monitor the barrier option discretely. That is, we only check twice a
day whether the stock index has fallen below the barrier. However, we can still simulate finer
intervals later on to see by how much it affects the price of the option. Now we are ready to
simulate an index path. Let's define a function for this task.

13 VSTOXX: https://www.stoxx.com/index-details?symbol=V2TX


## Simulates one path of a stock index.
##
## S0: Today's index price
## rf: risk-free interest rate
## sigma: volatility
## dt: time step
## T: time horizon
## Returns the path of index prices as a vector.
simulate.path <- function(S0, rf, sigma, dt, T) {
  S <- S0
  t <- seq(dt, T, by = dt)
  for (i in 1:length(t)) {
    Si <- S[length(S)]*exp((rf - 0.5*sigma^2)*dt +
      sigma*sqrt(dt)*rnorm(1))
    S <- c(S, Si)
  }
  return(S)
}

The function above iterates over the time grid and simulates the price one time-step ahead
given today’s price. It then continues using tomorrow’s price to simulate the one for the day
after tomorrow. It stops when it reaches the time horizon T . Be careful here. You must not
simulate an entire path based solely on today’s price. Always take the most recent simulated
price. To make this more explicit look at the following figure.

Figure 3: How to simulate a price path

Think carefully about why only the right figure can be correct. How might one specific simulated
path look? Here is an example:


S <- simulate.path(S0, rf, sigma, dt, T)
plot(S, type = "l")
abline(h = b)
[Figure: one simulated index path; x-axis: Index (simulation step, 0–120), y-axis: S (roughly 3200–3500), with a horizontal line at the barrier level b]
What is the value of the option in this case? We need many more scenarios to determine the
expected value of the option. We thus define another function that simulates a set of n paths.

## Simulates a set of price paths.
##
## S0: today's index price
## rf: risk-free interest rate
## sigma: volatility
## dt: time step
## T: time horizon
## n: number of scenarios
## Returns the simulated paths as a matrix
simulate.paths <- function(S0, rf, sigma, dt, T, n) {
  S <- c()
  for (i in 1:n) {
    S <- cbind(S, simulate.path(S0, rf, sigma, dt, T))
  }
  return(S)
}

This function uses the function simulate.path which returns a vector of stock index
prices. We then create a matrix object. A matrix is very similar to a data frame. The major
difference is that all its elements must be of the same type (for example, numerics). We use
the command cbind to append columns to another column or, more generally, a matrix. The
following example illustrates how this works:

v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
cbind(v1, v2)

## v1 v2
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6

If we need to start off with a matrix and then append columns one by one, we can write:

m <- c()
v1 <- c(1, 2, 3)
cbind(m, v1)

## v1
## [1,] 1
## [2,] 2
## [3,] 3

Of course, the arguments of cbind must either have the same number of elements or must be
NULL. Otherwise R will complain. By the way, a similar function exists to append rows to each
other: rbind. Now let's simulate n = 5000 paths:

S <- simulate.paths(S0, rf, sigma, dt, T, n = 5000)

...you may notice that this takes quite a while. And we would need many more paths to get a
reliable estimate for our option price, as you will see later. Why is this so slow and what can
we do to increase speed? The reason the code is so slow is that we have used for loops.
The for loops in the code above go through each time step and each scenario one by one.
That makes R slow. R is much more efficient if you allow it to process commands in bulk.
How do we achieve this? Well, first we try to simulate a single path without using a for loop.
The way you do this is to draw the random variables all at once and then process them more
efficiently. For example:

t <- seq(dt, T, by = dt)
m <- length(t)
e <- exp((rf - 0.5 * sigma^2) * dt + sigma * sqrt(dt) * rnorm(m))


This piece of code makes use of the fact that part of Eq. (45) can be computed all at once and
with a single line of code. The result is a vector and assigned to the variable e . This variable
refers to the exponential function in Eq. (45). We then just need to compute the cumulative
product because

S1 = S0 · e1
S2 = S1 · e2 = S0 · e1 · e2
S3 = S2 · e3 = S0 · e1 · e2 · e3
··· = ··· (46)

The function to compute a cumulative product in R is called cumprod. (If you ever need to
compute the cumulative sum, use cumsum.)

S <- c(S0, S0 * cumprod(e))

We could use the previous code to make the function simulate.path more efficient. However,
we would then still need to do this n times, that is, for each scenario separately. This is, of
course, an improvement, but there is no reason to stop here. Why not just continue and do
everything at once?

t <- seq(dt, T, by = dt)
m <- length(t)
e <- matrix(exp((rf - 0.5 * sigma^2) * dt + sigma * sqrt(dt) *
    rnorm(n * m)), m, n)

Yuk yuk, that looks complicated. Let’s discuss the last line of code. We basically did the
same as before. The difference is that now we do not just draw m random variables but m · n.
We then compute the exponential expression in Eq. (45) as we did before. This would return
a vector of length m · n. However, we instead need it in a “spreadsheet” format, that is where
the rows refer to the time steps and the columns to the scenarios. We thus create a matrix that
shall have m rows and n columns. The function to define a matrix is matrix . Some examples
shall demonstrate how to use this function:

matrix(1:6, 2, 3)

## [,1] [,2] [,3]


## [1,] 1 3 5
## [2,] 2 4 6

The first argument refers to the elements the matrix shall consist of. The second argument to
the number of rows. The third argument to the number of columns. The elements are always
filled up columnwise by default, that is, first column then second column, etc. If you want to
have it filled up rowwise write:


matrix(1:6, 2, 3, byrow = TRUE)

## [,1] [,2] [,3]


## [1,] 1 2 3
## [2,] 4 5 6

If you only provide two numbers as the first argument but the dimension of the matrix is larger
R will try to recycle the first two elements to fill up the matrix. This is an important principle
in R.

matrix(c(1, 2), 2, 3)

## [,1] [,2] [,3]


## [1,] 1 1 1
## [2,] 2 2 2

Coming back to our problem: given our variable e we now need to compute the cumulative
product over each column. We do this using the function apply:

S <- S0 * apply(e, 2, cumprod)

The first argument of the apply function is typically a matrix. The second argument, which
is set to 2, means that the function given as the third argument is applied columnwise. If we want
to apply something rowwise we would need to set the second argument to 1. See the help page.

m <- matrix(1:6, 2, 3)
m

## [,1] [,2] [,3]


## [1,] 1 3 5
## [2,] 2 4 6

apply(m, 2, sum)

## [1] 3 7 11

apply(m, 1, sum)

## [1] 9 12

Now let’s put everything together into a function called simulate.paths.fast :

simulate.paths.fast <- function(S0, rf, sigma, dt, T, n) {
  t <- seq(dt, T, by = dt)
  m <- length(t)
  e <- matrix(exp((rf - 0.5 * sigma^2) * dt + sigma * sqrt(dt) *
    rnorm(n * m)), m, n)
  S <- apply(e, 2, cumprod)
  S <- S * S0
  S <- rbind(S0, S)
  return(S)
}

The following code demonstrates how much faster the more efficient function is compared with
the initial one. We are going to simulate n = 5000 paths:

set.seed(1)
system.time(S1 <- simulate.paths(S0, rf, sigma, dt, T, n = 5000))

## user system elapsed
## 5.095 0.223 5.333

set.seed(1)
system.time(S2 <- simulate.paths.fast(S0, rf, sigma, dt, T, n = 5000))

## user system elapsed
## 0.06 0.00 0.06

The function system.time measures the time needed to execute the code provided as an
argument. set.seed initializes the random number generator. In doing so we ensure that
both functions use exactly the same sequence of random numbers and thus should yield the
same price paths. This allows us later to verify whether the two pieces of code indeed result in
the same option values. If we did not initialize the random number generator before calling
each function, we would not know why the final option values differ: they could differ either
because we use a different set of random numbers or because we computed them differently.
Anyway, the result above indicates that simulate.paths.fast is much, much faster than
simulate.paths. The CPU time elapsed for calling simulate.paths is more than 100 times
that of simulate.paths.fast. This is a tremendous increase in speed. Thus it's worth spending
some time to make code more efficient in R.
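
To see what set.seed does, note that resetting the seed reproduces exactly the same random draws:

set.seed(1)
rnorm(3)
set.seed(1)
rnorm(3)  ## identical to the previous three draws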
Next we want to verify whether the sets of paths of both functions indeed yield the same
option price. We therefore define a function that determines the discounted payoffs for each
scenario.

## Computes the option payoff for each scenario.
##
## S: Matrix of stock index prices with (m + 1) rows and n columns.
## K: Strike price
## b: Boundary level
## rf: risk-free interest rate
## f: Scaling factor for the payoff
## Returns a vector of discounted option payoffs.


payoffs <- function(S, K, b, rf, f) {
  I <- apply(S > b, 2, all)
  X <- I*pmax((S[nrow(S), ] - K), 0)*f
  X0 <- exp(-rf*T)*X
  return(X0)
}

The previous code probably deserves some explanation. The first line in the body determines
whether the index price of a given path is always above the barrier level. We again
make use of the function apply because we need to determine this condition for every column,
that is, for each path. We first test whether S > b, which returns a matrix of many TRUEs and
FALSEs. Recall this object is of type logical. The function all tests whether the condition is
TRUE for the entire path. For example:
x <- c(100, 110, 90, 120)
y <- 100
x > y

## [1] FALSE TRUE FALSE TRUE

all(x > y)

## [1] FALSE

The first line, thus, corresponds to the indicator function given by the payoff equation. You
may object that I have said the indicator function is either 0 or 1, but the apply function
returns either TRUE or FALSE. This is the same because FALSE is always interpreted as 0
and TRUE always as 1.

1 != 1
## [1] FALSE
1 == 1
## [1] TRUE
TRUE == 0
## [1] FALSE
TRUE == 1
## [1] TRUE
FALSE == 0
## [1] TRUE
FALSE == 1
## [1] FALSE


We check whether two expressions are equal (not equal) using the operator == (!=). Thus,
multiplying a number by FALSE returns 0. The second line in our function body does exactly
this: it multiplies the indicator function with the payoff function. The third line discounts the
payoff with the risk-free interest rate. That's it. Using our two sets of scenarios we compute
the value of the option following Eq. (37). We approximate the expectation using the mean
of the discounted payoffs.
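
One more function used in payoffs that deserves a quick look is pmax. It takes the element-wise maximum and thus applies the call payoff to every scenario at once:

pmax(c(-10, 0, 25), 0)

## [1] 0 0 25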

X0 <- payoffs(S1, K, b, rf, f)
mean(X0)

## [1] 0.9633144

X0 <- payoffs(S2, K, b, rf, f)
mean(X0)

## [1] 0.9633144

As you see, we get exactly the same results no matter whether we use the inefficient or the efficient
function. This confirms that both functions do exactly the same thing. But more importantly, is this
a fair price? Recall the investor sold the security for 0.84. Actually, we cannot say, because
we do not know how large the error is that we have made using our numerical technique, which
gives us just an approximation of the true value. Next we look at the Monte-Carlo error and
try to quantify it.

4.4 Monte-Carlo Error


So far we have an estimate of the option price. Again, we do not yet know how accurate it is.
Just providing an estimated price to the court is not sufficient! We also need to tell how reliable
it is. What do I mean by this? Let’s try to get some more estimates of the option price.

for (i in 1:10) {
S <- simulate.paths.fast(S0, rf, sigma, dt, T, n = 5000)
X0 <- payoffs(S, K, b, rf, f)
C <- mean(X0)
cat(i, ":", C, "\n")
}

## 1 : 0.9633144
## 2 : 0.9407483
## 3 : 0.9467339
## 4 : 0.9200611
## 5 : 0.9370876
## 6 : 0.939594
## 7 : 0.9357843
## 8 : 0.9643571
## 9 : 0.9135791
## 10 : 0.9532851


We see that drawing a new set of paths leads to a different option price. We also see that
it varies quite a bit. The lowest value is around 0.9 which is already fairly close to the selling
value. However, we need to compute the range in which we expect the true value to be given
a certain probability. This is called the confidence interval. (We had this already earlier when
we computed the expected return of the OBX index.)
We next need to define the Monte-Carlo error (MCE) for a given number of simulation trials
n. It is defined as the difference between the estimate X̂n and the expectation E[X], i.e.

M CE ≡ X̂n − E[X] (47)


Note, the Monte Carlo error (MCE) is itself a random variable. To quantify the error we need its
probability distribution. We can derive the distribution of the MCE from an important result
from statistics which is, again, the central limit theorem. Recall that the elementary central
limit theorem states that if X1 , X2 , . . . , Xn are independent and identically distributed with
expectation µ and variance σ 2 , then the standardized sample mean X̂n converges in distribution
to a standard normal. It follows that the MCE is approximately normally distributed:
 
σ
(48)
n→∞
M CE ⇒ N 0; √
n

Also recall that the term σ/√n is referred to as the standard error (SE). Cutting the error in half
requires quadrupling the number of simulations. Adding one decimal place of precision requires
100 times as many simulations. But where do we get σ from? Remember, the parameter σ
would typically be unknown in a setting in which µ is unknown. However, we can easily estimate
it using the sample standard deviation.
An often-made mistake by students is to confuse this σ with the volatility of the underlying. This
is not the same! The σ needed to compute the SE is the variation in the discounted payoffs,
because that is what we compute the arithmetic mean from. Thus, the previous statistical theory
is all we need to compute the MCE, which is basically given by the standard error.

SE <- sd(X0)/sqrt(length(X0))
SE

## [1] 0.02571523

Let’s compute the confidence interval for a confidence probability α = 99% probability.

alpha <- 0.99
z <- -qnorm((1 - alpha)/2)
c(mean(X0) - z * SE, mean(X0) + z * SE)

## [1] 0.8870471 1.0195232

This is a rather wide confidence interval. Though we can get a more precise estimate by
simulating more scenarios, this is not the most efficient way. Again, we would need to simulate,
e.g., four times as many scenarios to cut the MCE by half. There are better ways to gain
precision. We will discuss one approach next.


4.5 Variance reduction


Instead of simulating more scenarios, which is computationally expensive, we can simulate the
random numbers in a smarter way so as to decrease the SE. There are many different approaches
but here we discuss the simplest one, which is antithetic sampling. The idea of antithetic
sampling (i.e. antithetic variates) is to introduce dependence between pairs of replications. For
example, if we draw a standard normal distributed random number Z_i then we shall also draw
−Z_i. In terms of the price process this means that for every large realization there is also a small
one, which shall result in a reduction in variance. In a simulation driven by independent
normal random variables (like in our example) we implement antithetic sampling by drawing a
sequence Z_1, Z_2, . . . , Z_n of independent standard normal variables and pairing this sequence
with −Z_1, −Z_2, . . . , −Z_n.
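
As a tiny illustration of this pairing, each draw is simply mirrored around zero:

z <- rnorm(3)
cbind(z, -z)  ## each row is one antithetic pair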
Recall that our objective is to estimate the expectation E[Y ] which refers to the expected
discounted payoff of our security. Using the antithetic sampling implementation described
above produces a sequence of pairs of discounted payoffs (Y1 , Ỹ1 ), (Y2 , Ỹ2 ), . . . , (Yn , Ỹn ) with the
following important properties:

• the pairs (Y1 , Ỹ1 ), (Y2 , Ỹ2 ), . . . , (Yn , Ỹn ) are identically and independently distributed (i.i.d.)

• for each i, Yi and Ỹi have the same distribution but are not necessarily independent

Clearly, the antithetic variates estimator is simply the average of all 2n observations:

Ŷ_AV = (1/2n) ( Σ_{i=1}^{n} Y_i + Σ_{i=1}^{n} Ỹ_i ) = (1/n) Σ_{i=1}^{n} (Y_i + Ỹ_i)/2    (49)

From the last equation it becomes evident that Ŷ_AV is the sample mean of n independent
observations. Thus, when calculating the variance to determine the SE we need to do that
based on the independent sample, i.e.

σ_AV^2 = Var[ (Y + Ỹ)/2 ].    (50)

Like before, we estimate the second moment (i.e. the variance) using the sample standard
deviation, from which we calculate the SE. Now let's implement antithetic sampling.

## Simulates 2*n antithetic pairs of price paths.
##
## S0: today's price
## rf: risk-free interest rate
## sigma: volatility
## dt: time step
## T: time horizon
## n: number of paired paths
## Returns the simulated paths as a matrix
simulate.paths.fast.as <- function(S0, rf, sigma, dt, T, n) {
  t <- seq(dt, T, by = dt)
  m <- length(t)
  z <- rnorm(n*m)
  z.as <- -z
  Z <- matrix(c(z, z.as), m, n*2)
  e <- exp((rf - 0.5 * sigma^2) * dt + sigma * sqrt(dt)*Z)
  S <- apply(e, 2, cumprod)
  S <- S * S0
  S <- rbind(S0, S)
  return(S)
}

We first draw n*m standard normal random variables. We then compute their antithetic
counterparts. Based on the total set of random variables we compute the price paths. The
following code simulates paths based on antithetic sampling and plots one specific antithetic
pair.
n <- 2500
S.as <- simulate.paths.fast.as(S0, rf, sigma, dt, T, n = n)
plot(S.as[, 1], type = "l", col = "blue", xlab = "", ylab = "Antithetic pair",
ylim = range(S.as[, c(1, n + 1)]))
lines(S.as[, n + 1], type = "l", col = "red")
[Figure: one antithetic pair of simulated price paths (blue and red); y-axis: Antithetic pair (roughly 3000–3600), x-axis: simulation step (0–120)]

The function range determines the minimum and maximum value of its argument. The
argument ylim tells the function plot over what range the values shall be plotted on the y-axis.
The function lines is used in a similar way as plot but does not create a new plot, which
would otherwise override the old one.
We now compute the discounted option payoffs and the estimated option value. Finally we
determine the SE and the confidence interval.

X0.as <- payoffs(S.as, K, b, rf, f)
mean(X0.as)

## [1] 0.9753244

X.pairs <- (X0.as[1:n] + X0.as[(n + 1):(2 * n)])/2
SE <- sd(X.pairs)/sqrt(n)
SE

## [1] 0.02240072

alpha <- 0.99
z <- -qnorm((1 - alpha)/2)
c(mean(X0.as) - z * SE, mean(X0.as) + z * SE)

## [1] 0.9176239 1.0330248

Don't forget to calculate the means of all the antithetic pairs first, which is important when
we want to determine the standard deviation: the sample needs to be independent! Clearly, the
SE is lower with antithetic sampling than without it. But the improvement is not much, which
is also evident from comparing the confidence intervals. The question is under what condition
an antithetic variate estimator is better than an ordinary Monte Carlo estimator. Using
antithetics reduces variance if

Var[Ŷ_AV] < Var[ (1/2n) Σ_{i=1}^{2n} Y_i ],    (51)

where Ŷ_AV is defined in Eq. (49). It directly follows that the necessary condition for antithetic
sampling to reduce variance is

Cov[Y, Ỹ] < 0.    (52)

To put it differently, this condition requires that the negative dependence in the inputs Z
(i.e. the standard normal random variables) translates into a negative correlation in the outputs
Y (i.e. the discounted payoffs). A simple sufficient condition is the monotonicity of the payoff
function. Is this condition satisfied here?

cor(X0.as[1:n], X0.as[(n + 1):(2 * n)])

## [1] -0.274992


We see that the condition holds true but the absolute correlation is much lower than the
correlation of the input parameters which is −1.

4.6 Interpreting the results


We now interpret the results. The selling price for the investor was outside the confidence
interval. We would conclude that the selling price was significantly lower than the fair market
price. However, keep in mind that the bank always buys (the investor sells) a security at a price
that is lower, and sells at a price that is higher, than the fair market price. This is the bid-ask
spread from which the bank profits.
Assume for the moment that the fair price (optimistically) was at the lower bound of the
confidence interval, that is at 0.92. Then the spread the bank earns is

abs(0.84/0.92 - 1) * 2

## [1] 0.173913

The function abs takes the absolute value of its argument. This is a reasonable spread for a
security like this. Now assume (pessimistically) that the fair value was at the upper bound of
the confidence interval, that is at 1.03, then the spread would be

abs(0.84/1.03 - 1) * 2

## [1] 0.368932

This would probably be too much. However, spreads beyond 10% are not unlikely. Look at
the quotes for options on the OBX. They can be huge too. But be careful here. We made the
assumption of discretely monitoring the underlying to verify whether it falls below the barrier
level. In fact we do it just twice a day. It may still be the case that between the monitoring
dates the stock index could have fallen below the barrier and knocked out the option. To get
a more realistic estimate we need to rely on a finer time grid. Let's monitor the underlying ten
times a day, for example.

S <- simulate.paths.fast.as(S0, rf, sigma, dt = 1/250 * 0.1,
    T, n = 2500)
X0 <- payoffs(S, K, b, rf, f)
X.pairs <- (X0[1:n] + X0[(n + 1):(2 * n)])/2
SE <- sd(X.pairs)/sqrt(n)
c(mean(X0) - z * SE, mean(X0) + z * SE)

## [1] 0.8471707 0.9628857

## [1] 0.8471707 0.9628857

As you see the selling price is now very close to the lower bound of the confidence interval
and thus the fair market price. The reason is that a finer time grid makes it more likely that
the underlying falls below the boundary which in turn makes the call option less valuable. We
close this case by concluding that the price was fair.


4.7 Exercises
1. Use Monte Carlo simulation with 10000 scenarios to price an at-the-money European call
option with expiration in 2.5 years. The underlying trades at 85 and has a volatility of
35% p.a. The risk-free interest rate is 80 bp. Verify the obtained estimate using the
Black-Scholes option pricing formula.

2. Use Monte Carlo simulation with 5000 pairs of antithetic variates to price an at-the-money
European put option with maturity in 0.5 years. The underlying trades at 50 and has a
volatility of 15% p.a. The risk-free interest rate is 50 bp. Verify the obtained estimate
using the Black-Scholes option pricing formula for put options.


5 Data processing

Mini case 4 Data processing


You are an intern in an investment bank. Your boss asks you to download stock price data
of all stocks traded on the Oslo Stock Exchange (OSE). Your job is to clean and process
the data so as to make it ready to be used for a trading strategy.

5.1 Obtaining the data


We start by obtaining Norwegian stock market data through Amadeus 2.0, a tool that can be
launched online using a browser.14 We want to download stock price data on a monthly basis:

1. Use your credentials to login.

2. Goto tab Browser.

3. In Data Sources select Monthly Equity Prices.

4. In Data Columns select TradeDate, SecurityId, SecurityName, SecurityType, Last, AdjLast,


and SharesIssued.

5. In Restrictions add Trade Date GreaterOrEqualThan 1980-01-01 and Trade Date LessThan
2000-01-01. The reason is that only 50000 stock prices can be downloaded at once.

6. Deselect Preview.

7. Export data as Stocks-1980-2000.csv.

8. Repeat the previous steps for the periods “2000–2010”, “2010–2014”, and “2014–2018”.

stocks1 <- read.csv("Stocks-1980-2000.csv", header = FALSE, sep = ";",
    dec = ",")
stocks2 <- read.csv("Stocks-2000-2010.csv", header = FALSE, sep = ";",
    dec = ",")
stocks3 <- read.csv("Stocks-2010-2014.csv", header = FALSE, sep = ";",
    dec = ",")
stocks4 <- read.csv("Stocks-2014-2018.csv", header = FALSE, sep = ";",
    dec = ",")

Merge the data frames into one single data frame.

stocks <- rbind(stocks1, stocks2, stocks3, stocks4)
nrow(stocks)

## [1] 131822

14 Amadeus 2.0: https://mora.rente.nhh.no/borsprosjektet/amadeus/client/publish.htm


5.2 Data cleansing


Before starting with data cleansing, use the R functions head, tail and summary on the final
data frame to get a sense of what the data look like.

summary(stocks)

## V1 V2 V3
## Length:131822 Min. : 6000 Length:131822
## Class :character 1st Qu.: 6245 Class :character
## Mode :character Median : 48327 Mode :character
## Mean : 500791
## 3rd Qu.:1249743
## Max. :2101943
##
## V4 V5 V6
## Length:131822 Min. : 0.01 Min. : 0.0
## Class :character 1st Qu.: 5.04 1st Qu.: 6.5
## Mode :character Median : 33.78 Median : 30.0
## Mean : 81.82 Mean : 475.8
## 3rd Qu.: 102.50 3rd Qu.: 83.5
## Max. :25000.00 Max. :339915.6
## NA's :57459 NA's :57482
## V7 V8
## Min. :0.000e+00 Mode:logical
## 1st Qu.:0.000e+00 NA's:131822
## Median :2.488e+06
## Mean :5.845e+07
## 3rd Qu.:3.015e+07
## Max. :2.027e+10
##

The function summary is particularly useful because it computes the minimum, 25% quan-
tile, median, mean, 75% quantile, and maximum of all columns that are of type numeric.
For non-numeric datatypes no such statistic can be computed, of course. For example, for
SecurityName only the number of occurrences of each name is reported. It’s now time to
discuss this data type a little bit further. Before doing so, however, we assign sensible column
names first and drop the last column which seems to be empty.

stocks$V8 <- NULL


colnames(stocks) <- c("Date", "SecurityId", "SecurityName", "SecurityType",
"Last", "AdjLast", "SharesIssued")
The column SecurityName is of type factor. It is worth spending some time to explain
what a factor is because of its importance in R. Whenever a column consists of strings (or
character vectors) it will be automatically converted into a factor when read in. What is a
factor? A factor is used to store categorical information. One could also use a character variable


instead, but factors typically have many advantages compared with characters. A good example
is firms' credit ratings. Assume we know the ratings of several firms and want to store them
in a factor. We would use the function factor:

ratings <- factor(c("BBB", "A", "BB", "BB", "BBB", "AAA", "AA",
    "AA"))
ratings

## [1] BBB A BB BB BBB AAA AA AA
## Levels: A AA AAA BB BBB

The output shows that R converts the character vector into a categorical variable with all
the categories shown below. In older versions of R, factors had a lower memory consumption
than character vectors because not all occurrences of a string were stored in memory separately;
instead a “pointer” referred to the string. There is mostly no memory advantage anymore
in newer versions of R. Factors, however, have other pros. First, they can be reordered. Second,
factors are internally represented as numerical values, which is often very useful. Third, we can
easily change the labels in a factor. Let's do some examples to demonstrate what I mean by
that.
As you know credit ratings are measured on an ordinal scale meaning that a AAA is better
than AA but it does not say anything about how much “better” it is. Looking at the levels of
the factor in our previous example we see that these are not ordered to be consistent with a
standard credit rating scale. We can do this as follows:

ratings <- factor(ratings, levels = c("AAA", "AA", "A", "BBB",
    "BB", "B", "CCC", "CC", "C"))
ratings

## [1] BBB A BB BB BBB AAA AA AA
## Levels: AAA AA A BBB BB B CCC CC C

The first argument is simply the previous factor that we are going to redefine. The second
argument reorders the levels so that the ordering is consistent with a rating scale. AAA refers
to the highest rating, AA to the second highest and so on. You may have also noticed that
I have added additional levels (“CCC”, “CC”, “C”) to complete our scale despite having no
observation of firms having these very low ratings.
Again, we may find it useful to work with numbers when using credit ratings. For example,
we could use a credit rating variable in a regression model. The following function converts a
factor into a numerical value. Thus, it is important that the ordering of the levels is correct.

as.numeric(ratings)

## [1] 4 3 5 5 4 1 2 2

To access the levels of a factor we use the function levels :


levels(ratings)

## [1] "AAA" "AA" "A" "BBB" "BB" "B" "CCC" "CC" "C"

Now assume that we want to map the credit ratings onto our own rating scale which shall be
represented by different letters (for example, Moody's and Standard and Poor's use different
notations).

factor(ratings, labels = c("a", "b", "c", "d", "e"))

## [1] d c e e d a b b
## Levels: a b c d e

Other advantages of using factors are that they are converted automatically to dummy
variables in regressions (a small sketch of this follows below). They are also extremely useful
when we want to “summarize” them, in particular if the number of levels is relatively low, as in our example:

summary(ratings)

## AAA AA A BBB BB B CCC CC C


## 1 2 1 2 2 0 0 0 0
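
To see the dummy-variable point in action, here is a minimal sketch with invented credit spreads
(the numbers are purely illustrative); lm expands the factor into one dummy variable per
observed level:

## Hypothetical illustration: regress invented credit spreads on the rating factor.
## droplevels() removes the unused rating categories so that lm() does not create
## all-zero dummy columns; the first remaining level (AAA) acts as the reference.
spread <- c(1.9, 0.8, 2.5, 2.7, 2.1, 0.2, 0.5, 0.6)  ## made-up spreads in percent
fit <- lm(spread ~ droplevels(ratings))
coefficients(fit)  ## intercept (AAA) plus one dummy coefficient per other level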

We see factors can be quite useful. We now continue with the data cleansing. We first select
“Ordinary Shares” only. Then we remove the corresponding column. Further we make sure there is a
price observation available and that the price ( Last ) is larger than NOK 10. We also need to
make sure that there are shares outstanding.

stocks <- stocks[stocks$SecurityType == "Ordinary Shares", ]


stocks$SecurityType <- NULL
stocks <- stocks[!is.na(stocks$AdjLast), ]
stocks <- stocks[!is.na(stocks$Last), ]
stocks <- stocks[stocks$Last > 10, ]
stocks <- stocks[stocks$SharesIssued > 0, ]
nrow(stocks)

## [1] 37063

Note that the third line in the previous code removes all rows in our data where AdjLast is
NA . To do so we must use the function is.na ! The ! inverts the selection, i.e. any TRUE
becomes FALSE and vice versa. We do the same for Last . Note that SharesIssued may
contain very large numbers.15 Next we convert the date column into a Date object.

15 About 20 firms in the sample have had, at least once, more than 2 billion shares issued. Albeit large, this
number is not unreasonable. For example, Norsk Hydro has more than 2 billion shares outstanding. What is
suspicious, however, is that for some observations the number of shares is exactly 2^31 − 1. To computer
scientists this number is the maximum value of a 32-bit signed integer. This points towards
a bug in Amadeus.


stocks$Date <- as.Date(stocks$Date, format = "%m/%d/%Y")

5.3 Rolling observations forward in time


The next step is to make sure that we have only one observation per stock and month. The
data is supposed to be already on a monthly frequency. At least that is what we have selected
in Amadeus. However, this might not be the case and is worth checking.
We wish to have end-of-month price observations for each firm. However, you may have
noticed that some trades occurred not at the end of the month but in the middle of the month,
or even worse, at the beginning of the month. Apparently, these firms have traded infrequently
and what we see is, in fact, the last trade that occurred in a given month. Since we want to have
the trades synchronized we should only select observations that are close to the end of each
month. We first define a sequence of “true” (auxiliary) end-of-month dates using the following
procedure:

months <- seq(as.Date("1980-01-01"), as.Date("2018-02-01"), by = "1 month")


months <- months - 1

As you see we cannot go directly from the end of one month to the next because months
have different numbers of days. That is why we first create a vector of starting dates of each
month and then subtract one single day to get the last date of the previous month. This does the
trick. We then “roll forward” the trading days to the auxiliary dates. For example, if the
trading day is “2005-04-06”, we roll it forward to the end-of-month date which is “2005-04-30”.
Here we make use of the function cut and of factors.

stocks$Date.aux <- cut(stocks$Date, months, right = TRUE)

The first argument of cut is a vector that can be ordered. The second argument defines the
boundaries of the intervals into which the first argument shall be “cut”. The third argument
indicates if the intervals should be closed on the right. Let’s do some basic examples to see how
cut works:

b <- c(0, 5, 10, 15)


v <- c(1, 7, 11, 0, 5, 10, 15)
c <- cut(v, b, right = TRUE)
c

## [1] (0,5] (5,10] (10,15] <NA> (0,5] (5,10] (10,15]


## Levels: (0,5] (5,10] (10,15]

as.numeric(c)

## [1] 1 2 3 NA 1 2 3

You see that cut returns a factor with the labels being the intervals. Note that “(” stands for
left open and “]” for right closed, i.e. 0 does not belong to the interval (0, 5] whereas the
number 5 does. The factor can be converted into a numeric or, as in our case, into a Date object.


However, going back to our real example we see that we did not exactly get the desired result
because it seems we have rolled the trading date backwards to the previous end-of-month date.
To see this

stocks$Date[1]

## [1] "1980-01-01"

stocks$Date.aux[1]

## [1] 1979-12-31
## 457 Levels: 1979-12-31 1980-01-31 1980-02-29 ... 2017-12-31

The following trick helps us in getting the desired result.

i <- as.numeric(stocks$Date.aux)
stocks$Date.aux <- months[i + 1]
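
To see why adding one to the index works, here is the same idea applied to the small cut
example from above (a toy sketch; the boundary vector plays the role of months ):

b <- c(0, 5, 10, 15)
v <- c(1, 7, 11)
i <- as.numeric(cut(v, b, right = TRUE))  ## interval index, i.e. position of the left boundary in b
b[i + 1]  ## 5 10 15: each value "rolled forward" to the upper boundary of its interval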

Let’s check once more whether it’s correct now:

stocks$Date[1:5]

## [1] "1980-01-01" "1980-01-03" "1980-01-30" "1980-02-03"


## [5] "1980-02-28"

stocks$Date.aux[1:5]

## [1] "1980-01-31" "1980-01-31" "1980-01-31" "1980-02-29"


## [5] "1980-02-29"

5.4 One stock price observation per month


Next, we need to check whether there is more than one price observation in a given month. In
general this should not be the case but it is always worth checking. When working with data it
is good practice to know which data fields (i.e. columns) uniquely identify each observation.
For example, in a telephone book we prefer to find only one name (first name plus last name)
listed at a given address. Otherwise we don't know which number to call. This also applies
to data management and, in general, is referred to as the “key”. That is, we would like to
have only one end-of-month price observation for each stock. If there are more, then there is
maybe something wrong. We make use of the function aggregate to compute the number of
observations per month and firm.

num <- aggregate(stocks$SecurityId, list(stocks$Date.aux, stocks$SecurityId),
    length)

The first argument of aggregate is a vector on which we apply the function given by the third
argument. The second argument defines how to group the first vector. We want to compute the


length (that is the number of observations) per month ( Date.aux ) and firm ( SecurityId ).
This information must be provided as a list for which we use the function list . A list object
is a very versatile data type because it can comprise other objects of different data types and
different lengths. A few examples:

list(c(1, 2, 3), c(100, 200), 7)

## [[1]]
## [1] 1 2 3
##
## [[2]]
## [1] 100 200
##
## [[3]]
## [1] 7

list(c("A", "B"), c(4, 5, 6))

## [[1]]
## [1] "A" "B"
##
## [[2]]
## [1] 4 5 6

list(letters = c("A", "B"), numbers = c(3, 5, 6))

## $letters
## [1] "A" "B"
##
## $numbers
## [1] 3 5 6

The last line demonstrates that we can also give names to the list’s constituents. How does
aggregate work? The following example demonstrates how to use this function:

df <- data.frame(id = c("a", "a", "a", "b", "b"), val = 1:5)


df

## id val
## 1 a 1
## 2 a 2
## 3 a 3
## 4 b 4
## 5 b 5

aggregate(df$val, list(df$id), sum)


## Group.1 x
## 1 a 6
## 2 b 9

aggregate(df$val, list(df$id), length)

## Group.1 x
## 1 a 3
## 2 b 2

Is there really just one observation per firm and month?

head(num[num$x > 1, ])

## Group.1 Group.2 x
## 65 1992-06-30 6002 2
## 384 1985-09-30 6015 2
## 987 1989-05-31 6035 2
## 1684 1994-01-31 6041 2
## 1766 1993-12-31 6042 2
## 1931 1993-12-31 6048 2

The result above shows that there are some cases where we have more than one observation.
To give you a specific example:

stocks[stocks$Date.aux == "1992-06-30" & stocks$SecurityId ==


6002, ]

## Date SecurityId SecurityName Last AdjLast


## 20179 1992-06-08 6002 Actinor Shipping 95 95
## 20198 1992-06-29 6002 Actinor Shipping 76 76
## SharesIssued Date.aux
## 20179 4054975 1992-06-30
## 20198 4054975 1992-06-30

What shall we do? We will take the most recent observation in a given month. This does the
trick:

stocks <- stocks[order(stocks$SecurityId, stocks$Date), ]


stocks$row <- 1:nrow(stocks)
rows <- aggregate(stocks$row, list(stocks$Date.aux, stocks$SecurityId),
max)
stocks <- stocks[rows$x, ]
stocks$row <- NULL

We first ordered the data frame based on SecurityId and Date . We then added a new column
to the data frame which contains the row number. Next we use the function aggregate to


determine the largest row number within each group as defined by SecurityId and Date .
Since we have ordered the data frame beforehand we made sure that the highest row number
in each group refers to the most recent observation of the stock price in a given month. The
variable rows allows selecting the corresponding rows in stocks . This ensures that we end
up with only one observation per month and firm.
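
The same pattern applied to a toy data frame (made-up values) may make the logic clearer:

df <- data.frame(id = c("a", "a", "b", "b", "b"),
    day = c(1, 3, 2, 5, 9), p = c(10, 11, 20, 21, 22))
df <- df[order(df$id, df$day), ]  ## sort so the last row per group is the most recent
df$row <- 1:nrow(df)
keep <- aggregate(df$row, list(df$id), max)  ## highest row number per group
df[keep$x, c("id", "day", "p")]  ## keeps (a, 3, 11) and (b, 9, 22)
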
Next we need to make sure that the actual trade is not too old to be considered as an
end-of-month trade. We define a trade as not being too old if it occurs at most five days before
the end-of-month.

stocks$delta.t <- as.numeric(stocks$Date.aux - stocks$Date)


summary(stocks$delta.t)

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## 0.000 1.000 1.000 1.664 2.000 30.000

stocks <- stocks[stocks$delta.t <= 5, ]


stocks$delta.t <- NULL

From now on we do not need the actual trading date and only refer to the auxiliary end-of-month
date. Thus, we remove the trading date and rename the auxiliary date.

stocks$Date <- stocks$Date.aux


stocks$Date.aux <- NULL
End of lecture 8

5.5 Return computation

At last, we are ready to compute the returns. The question is how to compute returns. There are
essentially two ways to do it: (i) simple returns (i.e. arithmetic returns, discretely compounded
returns, net returns) and (ii) log returns (i.e. continuously compounded returns). I denote simple
returns with R_t, where

R_t = \frac{p_t}{p_{t-1}} - 1   (53)

and log returns with r_t, respectively, where

r_t = \log p_t - \log p_{t-1} .   (54)


Simple returns are the most popular way of expressing rates of return. Simple returns are not
additive over time:

Rt→t+2 6= Rt→t+1 + Rt+1→t+2 (55)


But they are additive across assets:
p
Rt→t+1 = ω a Rt→t+1
a
+ ω b Rt→t+1
b
(56)
And while log returns aggregate over time:

Page 71
Nils Friewald FIE 450en

rt→t+2 = rt→t+1 + rt+1→t+2 (57)


Log returns do not aggregate across assets:
p
rt→t+1 6= ω a rt→t+1
a
+ ω b rt→t+1
b
(58)
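
A quick numerical check of the time-aggregation property with made-up prices:

p <- c(100, 110, 99)  ## made-up prices at t, t+1, t+2
R <- p[-1]/p[-length(p)] - 1  ## simple returns: 0.1 and -0.1
r <- diff(log(p))  ## log returns
p[3]/p[1] - 1  ## two-period simple return: -0.01, not sum(R) = 0
log(p[3]/p[1])  ## two-period log return
sum(r)  ## identical: log returns add over time
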
In analyzing portfolios it is easiest to use simple returns because they aggregate when we
form portfolios. Thus, we use simple returns in our further analysis. We first make sure that
the data frame is appropriately sorted before computing the returns using the function tapply .

stocks <- stocks[order(stocks$SecurityId, stocks$Date), ]
stocks$R <- unlist(tapply(stocks$AdjLast, stocks$SecurityId,
    function(v) c(v[-1]/v[-length(v)] - 1, NA)))

The function tapply is very similar to aggregate but can more easily cope with functions that
do not just return one value per group but several values. Since each group produces data with
varying length tapply returns a list. We use unlist to convert the list into a simple vector.
Note also that here we made use of an anonymous function definition. This is just a shortcut.
We could also have defined our function explicitly. The following simple examples demonstrate
the use of tapply .

df <- data.frame(id = c("a", "a", "a", "b", "b"), val = c(100,


200, 120, 140, 160))
df

## id val
## 1 a 100
## 2 a 200
## 3 a 120
## 4 b 140
## 5 b 160

fn <- function(v) {
c(NA, diff(v))
}
tapply(df$val, list(df$id), fn)

## $a
## [1] NA 100 -80
##
## $b
## [1] NA 20

tapply(df$val, list(df$id), function(v) c(NA, diff(v)))

## $a
## [1] NA 100 -80

##
## $b
## [1] NA 20

unlist(tapply(df$val, list(df$id), fn))

## a1 a2 a3 b1 b2
## NA 100 -80 NA 20

Clearly, the last two function calls produce the same result. The first explicitly defines a function
while the second uses an anonymous function definition. Note that we defined the returns in a
“forward-looking” manner, i.e. the return in a given row for time t defines the relative price
change from t to t + 1. This is just a convention and eases the further processing.
Given the returns we need to check whether they are all on a monthly frequency. Although
we have ensured that stock prices are all end-of-month values there could be cases where a stock
stops trading for some period and then trades again. This would distort our return computation
because the underlying price observations may be far apart from each other. We first make
sure that the data frame is sorted and then use tapply again to compute the time differences
(if tapply throws an error, use as.numeric(diff(v)) instead of diff(v) ).

stocks <- stocks[order(stocks$SecurityId, stocks$Date), ]


stocks$delta.t <- unlist(tapply(stocks$Date, list(stocks$SecurityId),
function(v) c(as.numeric(diff(v)), NA)))
summary(stocks$delta.t)

## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's


## 28.00 30.00 31.00 45.87 31.00 5691.00 640

summary(stocks$R)

## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's


## -0.9999 -0.0610 0.0006 0.0130 0.0766 3.0625 640

Apparently there are cases where a stock trades just once, which results in an NA . Further, we
observe trades that are many days apart from each other. Thus, we only consider returns
based on price observations within a month, i.e. 31 calendar days.

stocks <- stocks[!is.na(stocks$delta.t), ]


stocks <- stocks[stocks$delta.t <= 31, ]
stocks$delta.t <- NULL
nrow(stocks)

## [1] 30650

5.6 Market capitalization and weights


As a next step we compute the market capitalization of each (traded) stock and its weight
relative to the total market capitalization at the end of each month. We need the weights later
on to construct a market portfolio. We first add a column MarketCap that shows the market
capitalization in millions of Norwegian kroner:

stocks$MarketCap <- stocks$Last * stocks$SharesIssued/1e+06

Note that we use the unadjusted stock price because any change due to stock splits must be
reflected in SharesIssued already. Next we compute the total market capitalization each
month and plot the time series.

res <- aggregate(stocks$MarketCap, list(stocks$Date), sum)


names(res) <- c("Date", "TotalMarketCap")
plot(res$Date, res$TotalMarketCap,
type="l", xlab="", ylab="Total Market Capitalization [mln]")
[Figure: Total Market Capitalization [mln] over time, 1980–2018]

In order to compute the weights we need to bring the total market capitalization back into
our data frame stocks , i.e. we need to merge stocks and res . To do so we use the
function merge .

stocks <- merge(stocks, res, by = "Date")


nrow(stocks)

## [1] 30650


The function merge requires at least three arguments. The first two refer to the data frames to
merge and the third tells the function on which columns to merge.
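As with aggregate and reshape , a toy example with made-up values may help:

## merge() matches the rows of two data frames on a common key column
a <- data.frame(Date = c("2017-01-31", "2017-02-28"), R = c(0.01, -0.02))
b <- data.frame(Date = c("2017-01-31", "2017-02-28"), Total = c(100, 105))
merge(a, b, by = "Date")  ## one row per date with the columns of both data frames

We are now ready to compute the weights.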

stocks$Weight <- stocks$MarketCap/stocks$TotalMarketCap


summary(stocks$Weight)

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## 0.0000212 0.0011761 0.0033605 0.0146493 0.0099487 1.0000000

Of course, the weight vector for each month must always sum up to one.
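
This is easy to verify in the data (a quick sanity check, assuming stocks is the data frame built
above):

w.sum <- aggregate(stocks$Weight, list(stocks$Date), sum)
summary(w.sum$x)  ## every monthly sum of weights should be (numerically) equal to 1
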

5.7 Market returns


Given the individual stock returns and weights it is now straightforward to compute an
equally-weighted and a value-weighted market return.

market.ew <- aggregate(stocks$R, list(stocks$Date), mean)


names(market.ew) <- c("Date", "RM.ew")
stocks$h <- stocks$R*stocks$Weight
market.vw <- aggregate(stocks$h, list(stocks$Date), sum)
names(market.vw) <- c("Date", "RM.vw")

Now we plot both market indices by computing the cumulative returns first.

RM.ew <- cumprod(1 + market.ew$RM.ew)


RM.vw <- cumprod(1 + market.vw$RM.vw)
plot(market.ew$Date, RM.ew, type = "l", xlab = "", ylab = "Market index",
col = "red")
lines(market.vw$Date, RM.vw, type = "l", col = "blue")
legend("topleft", c("Equally-weighed", "Value-weighed"), lwd = 2,
col = c("red", "blue"))


[Figure: Market index over time, 1980–2018; legend: Equally-weighted (red), Value-weighted (blue)]

We use the function legend to add a legend to the plot. The first argument tells the
function where to put the legend in the plot window. The second provides the legend labels
and the third and fourth refer to the line width and color. We now save the current data to
an R data file. We can write several data objects into the same file.

save(stocks, market.ew, market.vw, file = "Stocks.RData")

5.8 From long to wide-format


Our data frame does not really have the desired format yet. Currently it is in “long-format”, meaning
that firms are simply stacked up. However, it would be much more convenient to have the
returns of all firms of a given month in one row. This is referred to as the “wide-format”. A
powerful function to reshape data is reshape .

stocks.wide <- reshape(stocks[, c("Date", "SecurityId", "R")],


v.names = "R", idvar = "Date", timevar = "SecurityId", direction = "wide")
stocks.wide <- stocks.wide[order(stocks.wide$Date), ]
stocks.wide[1:4, 1:4]

## Date R.6143 R.6049 R.6288


## 1 1980-01-31 0.01111030 NA NA


## 2 1980-02-29 -0.09890118 NA NA
## 3 1980-05-31 NA 0.022221315 -0.05600039
## 5 1980-06-30 NA 0.008696503 0.02542392

The first argument of reshape is the data frame that we wish to have in wide-format.
Note that we need to select the right columns first. The argument v.names tells the function
which column shall be transformed into wide-format. Since we want to have the returns of all
stocks as columns we set this argument to R . The next argument specifies the key in the wide
format which is, of course, the Date column. timevar tells reshape that we want the
returns of each SecurityId to vary over time. The final argument specifies the direction of
the reshaping, which is wide .
The following little example demonstrates how to transform a standard data frame
from long to wide-format and then back again.

df <- data.frame(id = c("a", "a", "b"), t = c(1, 2, 2), val = c(100,


20, 40), f = c(10, 32, 32))
df

## id t val f
## 1 a 1 100 10
## 2 a 2 20 32
## 3 b 2 40 32

df2 <- reshape(df, v.names = "val", idvar = "t", timevar = "id",


direction = "wide")
df2

## t f val.a val.b
## 1 1 10 100 NA
## 2 2 32 20 40

reshape(df2, direction = "long")

## t f id val
## 1.a 1 10 a 100
## 2.a 2 32 a 20
## 1.b 1 10 b NA
## 2.b 2 32 b 40

The original data frame contains firms ( id ) for which we have some firm specific information
( val ) over time ( t ) and some only time-dependent information ( f ). We then follow exactly
the same procedure to transform the market capitalization into a wide-format.

market.cap.wide <- reshape(stocks[, c("Date", "SecurityId", "MarketCap")],


v.names = "MarketCap", idvar = "Date", timevar = "SecurityId",
direction = "wide")


market.cap.wide <- market.cap.wide[order(market.cap.wide$Date),


]
market.cap.wide[1:4, 1:4]

## Date MarketCap.6143 MarketCap.6049 MarketCap.6288


## 1 1980-01-31 235.350 NA NA
## 2 1980-02-29 237.965 NA NA
## 3 1980-05-31 NA 523.125 312.5
## 5 1980-06-30 NA 534.750 295.0

Given that stocks does not have any missing values in any of its columns, it is assured that
if there is a return for some stock from month t to t + 1 there will also be a market capitalization
for month t. To finish, we save both data frames in wide-format to the same R data file.

save(stocks.wide, market.cap.wide, file = "Stocks-Wide.RData")


Kahoot! 6

End of lecture 9

5.9 Risk-free interest rate

In this section we construct a time-series of risk-free interest rates that correspond to a monthly
holding period. We need the risk-free interest rates to compute excess returns which is an
important ingredient in any trading strategy. In one of our earlier examples we already used the
Nibor as the risk-free interest rate. We will do so again here, whenever possible. Unfortunately,
the Nibor is not available as a single download for the entire sample period. We have to use
three samples that differ in their sampling frequency and the time period they cover. Further,
they stem from different data sources. We download all data and save them as CSV files.

1980–1985
1-month Eurokrone money market interest rates17

1986–2013
1-month Nibor18

2014–2018
1-month Nibor19

17 http://www.norges-bank.no/en/Statistics/Historical-monetary-statistics/Short-term-interest-rates/
18 http://www.norges-bank.no/en/Statistics/Historical-monetary-statistics/Short-term-interest-rates/
19 https://www.oslobors.no/ob_eng/markedsaktivitet/#/details/NIBOR1M.NIBOR/overview

We start by first processing the earliest sample.

df1 <- read.csv("MMR-1980-1985.csv", skip = 12)


df1 <- df1[, c(1, 2)]
names(df1) <- c("Date", "rf")
months <- seq(as.Date("1959-05-01"), as.Date("1986-12-01"), by = "1 month")
months <- months - 1

df1$Date <- months


df1$rf <- df1$rf/100
df1 <- df1[df1$Date >= "1980-01-01", ]

Note that we need to skip the first twelve rows using the skip argument in read.csv because
the data series starts only thereafter. Further we only select the first and second columns, which
refer to the date column and the 1-month money market rate. We then replace the dates with
the end-of-month dates of the previous interest rate period to make it consistent with how we
set up the stock returns. Recall that we have used a forward-looking notation. A return at
month t refers to the period t to t + 1. We use the same setup here. After having done so we
convert the rates into decimals and restrict the sample to start in 1980. Next we use the
1-month Nibor rates from Norges Bank.

df2 <- read.csv("Nibor-1986-2013.csv", skip = 16)


df2 <- df2[, c(1, 5)]
names(df2) <- c("Date", "rf")
months <- seq(as.Date("1986-01-01"), as.Date("2013-12-01"), by = "1 month")
months <- months - 1
df2$Date <- months
df2$rf <- df2$rf/100

Finally, we use the 1-month Nibor rates from Oslo Stock Exchange. Note that since the data is
on a daily frequency we use the first rate available in each month.

df3 <- read.csv("Nibor-2014-2018.csv", skip = 0)


df3 <- df3[, c(1, 2)]
names(df3) <- c("Date", "rf")
df3$Date <- as.Date(df3$Date, format = "%d.%m.%y")
df3 <- df3[order(df3$Date), ]
df3 <- df3[df3$Date >= "2013-03-01" & df3$Date <= "2018-01-31",
]
months <- seq(as.Date("2013-03-01"), as.Date("2018-02-01"), by = "1 month")
df3$Date2 <- cut(df3$Date, months)
df3 <- aggregate(df3$rf, list(df3$Date2), head, n = 1)
names(df3) <- c("Date", "rf")
df3$Date <- as.Date(df3$Date) - 1
df3$rf <- df3$rf/100

Let’s plot all three time series to see whether they indeed fit together.

plot(df1$Date, df1$rf, xlab = "", ylab = "", xlim = range(df1$Date,


df2$Date, df3$Date), ylim = range(df1$rf, df2$rf, df3$rf),
type = "l", col = "black")
lines(df2$Date, df2$rf, col = "blue")
lines(df3$Date, df3$rf, col = "red")


[Figure: 1-month risk-free interest rates, 1980–2018 (Eurokrone rates in black, Nibor from Norges Bank in blue, Nibor from Oslo Stock Exchange in red)]

The spike in the interest rate is due to a currency crisis. We then need to combine all three
time series and make sure that they do not overlap. Further, we scale the interest rates so that
they reflect a monthly investment period.

range(df1$Date)

## [1] "1980-01-31" "1986-11-30"

range(df2$Date)

## [1] "1985-12-31" "2013-11-30"

range(df3$Date)

## [1] "2013-02-28" "2017-12-31"

rf <- rbind(df1[df1$Date < "1985-12-31", ], df2, df3[df3$Date >


"2013-11-30", ])
rf$rf <- rf$rf/12
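
A quick sanity check that the spliced series does not contain duplicated months (a sketch using
the rf data frame just created):

sum(duplicated(rf$Date))  ## should be 0, i.e. each end-of-month date appears once
range(rf$Date)  ## should run from January 1980 to December 2017
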

Let’s store the risk-free interest rate data to a file.


save(rf, file = "Riskfree-Rate.RData")

5.10 Exercises
1. Compute for each month the number of return observations in the data frame stocks
that we used during the lecture. Create a plot showing the time-series of observations
each month.

2. Download the complete daily stock price information for the following stocks as given by
their tickers from finance.yahoo.com: IBM, AAPL, XOM, KO and GS. Merge all samples,
compute daily simple returns, and transform the data frame into wide-format. Compute
the annualized mean and volatility. Determine also the covariance and the correlation
using the functions cov and cor .

3. Obtain the 3-month Nibor rates from Norges Bank and Oslo Stock Exchange and extend
this series with the 3-month Norwegian Treasury Bills in the primary market to construct
a monthly time-series of risk-free interest rates.


6 Mean-Variance Portfolios

Mini case 5 Mean-Variance Portfolios


You work in an asset management company. A rich Norwegian client calls you. He tells
you that he would like to invest 50 million NOK in Norwegian stocks only and asks for
your advice in finding an optimal portfolio that delivers a target return of 5% p.a. You
immediately tell him that it is more advantageous to diversify and invest in other asset
classes too, preferably in international ones. However, your client refuses, probably because
he’s a real patriot. You promise him to think about the problem and to call him back once
you have found an optimal solution.

6.1 Optimization problem


Our objective is to find an optimal portfolio. This is a portfolio which has the lowest risk for a
given expected return or the highest expected return for a given risk. We will make use of the
matrix notation to formulate the objective function. The following equation shows the expected
portfolio return in matrix notation for a portfolio of two assets, which, when expanded, results
in a familiar expression:

E[R_p] = \mu_p = \underbrace{\omega^\top}_{1 \times 2} \underbrace{\mu}_{2 \times 1}
       = \begin{pmatrix} \omega_1 & \omega_2 \end{pmatrix}
         \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}
       = \omega_1 \mu_1 + \omega_2 \mu_2 ,   (59)

where µi is the expected return of stock i and ωi its weight. The portfolio variance (squared
volatility) can also be written in matrix notation. For the expansion we use the fact that
σ11 = σ1^2, σ22 = σ2^2 and the symmetry of the covariance matrix, i.e. σ12 = σ21.

Var[R_p] = \sigma_p^2 = \underbrace{\omega^\top}_{1 \times 2} \times \underbrace{\Sigma}_{2 \times 2} \times \underbrace{\omega}_{2 \times 1}
         = \begin{pmatrix} \omega_1 & \omega_2 \end{pmatrix}
           \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}
           \begin{pmatrix} \omega_1 \\ \omega_2 \end{pmatrix}
         = \begin{pmatrix} \omega_1 \sigma_{11} + \omega_2 \sigma_{21} & \omega_1 \sigma_{12} + \omega_2 \sigma_{22} \end{pmatrix}
           \begin{pmatrix} \omega_1 \\ \omega_2 \end{pmatrix}
         = \omega_1^2 \sigma_{11} + \omega_1 \omega_2 \sigma_{21} + \omega_1 \omega_2 \sigma_{12} + \omega_2^2 \sigma_{22}
         = \omega_1^2 \sigma_1^2 + 2 \omega_1 \omega_2 \sigma_{12} + \omega_2^2 \sigma_2^2   (60)

Here, σij is the covariance between stock i and j and Σ refers to the covariance matrix. We
wish to find a portfolio, that is the weights, that yields the highest expected return for a given
target variance of the portfolio. Or, to put it differently, we look for the lowest variance for a
given target expected return. This is a classic optimization problem. So let's first define the
problem formally:

\min_{\omega} Var[R_p]   (61)

subject to

E[R_p] = \mu^* \quad \text{and} \quad \sum_i \omega_i = 1,   (62)
where the expected portfolio return and variance are computed as given in (59) and (60). This
is a so-called Quadratic Programming Problem.
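
A tiny numerical check of the matrix expressions (59) and (60) with made-up inputs (two assets
with volatilities 20% and 30%, correlation 0.5, weights 60/40):

w.toy <- c(0.6, 0.4)
mu.toy <- c(0.08, 0.12)  ## made-up expected returns
Sig.toy <- matrix(c(0.04, 0.03, 0.03, 0.09), 2, 2)  ## sigma12 = 0.2 * 0.3 * 0.5 = 0.03
t(w.toy) %*% mu.toy  ## expected portfolio return, 0.096
t(w.toy) %*% Sig.toy %*% w.toy  ## portfolio variance in matrix form
0.6^2 * 0.04 + 2 * 0.6 * 0.4 * 0.03 + 0.4^2 * 0.09  ## expanded form, same number (0.0432)
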

6.2 Expected returns and covariances


Before we can start, we first must estimate the expected excess returns and the covariance
matrix. Let’s load the data from the various files that our intern has created.

load("Stocks-Wide.RData") ## Stock returns


load("Stocks.RData") ## Market return
load("Riskfree-Rate.RData") ## Risk-free interest rate

We then merge the stock returns with the risk-free interest rate and the market return using
the function merge . This helps us to align the dates of the different data samples.

df <- merge(rf, stocks.wide, by = "Date")


df <- merge(market.vw, df, by = "Date")
df[1:2, 1:4]

## Date RM.vw rf R.6143


## 1 1980-01-31 0.01111030 0.01116667 0.01111030
## 2 1980-02-29 -0.09890118 0.01263333 -0.09890118

The previous lines of code add two columns to the stock returns, i.e. the risk-free interest rate
( rf ) and the market return ( RM.vw ). We add the market return for later usage. We then
deduct the risk-free interest rate from all stock returns to get excess returns.

rf <- df[, c("Date", "rf")]


RM <- df[, c("Date", "RM.vw")]
R <- df[, -c(2, 3)]
R[, -1] <- R[, -1] - rf$rf
RM[, -1] <- RM[, -1] - rf$rf

Note that the last two lines of the previous code snippet again demonstrate how replication in
R works. To reiterate the principle of replication:

m <- matrix(1:4, 2, 2)
m

## [,1] [,2]
## [1,] 1 3
## [2,] 2 4

m - c(1, 2)


## [,1] [,2]
## [1,] 0 2
## [2,] 0 2

Since we might use the excess returns again we save the data to a file.

save(R, RM, file = "Excess-Returns.RData")

We pretend to be on December 31, 2017 because this is the last price observation in our data.
We will only consider stocks that have been traded on that day and the end of the previous
month to make sure that we could potentially invest in the stocks going forward. We further
restrict the sample in that we select only stocks that have been traded at least 75% of the time
during the last 60 months.

not.NA <- !is.na(R[R$Date == "2017-11-30", ])


R <- R[, not.NA]
R <- tail(R, n = 60)
RM <- tail(RM, n = 60)
liq <- apply(R, 2, function(v) sum(!is.na(v))/length(v))
summary(liq)

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## 0.01667 0.39167 0.71667 0.65631 1.00000 1.00000

R <- R[, liq >= 0.75]


End of lecture 10

With apply we determine the proportion of non-missing observations. The returns, correlations
and covariances are computed as follows:

mu <- apply(R[, -1], 2, mean, na.rm = TRUE) * 12


Rho <- cor(R[, -1], use = "pairwise.complete.obs")
Sigma <- cov(R[, -1], use = "pairwise.complete.obs") * 12

The argument use tells the functions cor and cov , respectively, that pairwise complete
observations shall be used for computing the correlation and the covariance. Note that we scale
the returns and covariances accordingly so that we have these metrics in annual terms. This
scaling does not change our results but eases the interpretation of returns and variances. Let's
look at the estimates for the expected return and correlation to check whether they make any
sense:

summary(mu)

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## -0.49208 0.05422 0.15567 0.15217 0.25835 0.64163

summary(Rho[lower.tri(Rho)])

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## -0.37834 -0.04845 0.05454 0.06913 0.16607 0.78225


We use the function lower.tri to get the lower triangular part of the correlation matrix so
that we do not double count the correlation coefficients when computing the summary statistics.
Clearly, some expected excess return estimates are negative. Who would be willing to invest in
a risky stock with a negative expected excess return? Nobody! These are just estimated badly.
We set them to zero.

mu[mu < 0] <- 0

6.3 Solving for the optimal portfolio


Now we are ready to solve the quadratic programming problem. For doing so we need an appro-
priate algorithm. Try to avoid “general purpose” optimizers. Instead, think about the specific
problem that you have and then use an optimizer that is suitable for solving the problem. In
our case solve.QP does the job. Before we use the function we need to load the correspond-
ing package using require . A package contains functions that may have been developed by
authors other than the core development team. Loading the package basically means to make
the functions available in the R environment. If the package itself is not available you may need
to install it first using the function install.packages . A further important note
on packages: don't just use every package available on CRAN because basically everyone can
contribute packages. Some may be faulty. Also look at the maintainers and/or authors of the
packages and check whether you consider them trustworthy. solve.QP implements the dual
method of Goldfarb and Idnani (1982, 1983) for solving quadratic programming problems of
the form

\min_b \left( -d^\top b + \tfrac{1}{2} b^\top D b \right)   (63)

with the constraints

A^\top b \geq b_0 .   (64)
Let's assume we wish to determine the optimal risky portfolio that promises a return of 5%
p.a. with the lowest risk. We first need to specify the variables in Equations (63) and (64).
Since our problem only consists of a quadratic but no linear term we set d = 0. In the notation
above b corresponds to the weight vector ω which solve.QP will search for. The variable D
corresponds to the covariance matrix Σ. Thus we have

d = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} .   (65)
Next we need to define our two linear constraints. First, all weights shall sum up to one and,
second, the return shall be 0.05.


 
\underbrace{\begin{pmatrix} 1 & 1 & \cdots & 1 \\ \mu_1 & \mu_2 & \cdots & \mu_n \end{pmatrix}}_{A^\top}
\underbrace{\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}}_{b}
= \underbrace{\begin{pmatrix} 1 \\ 0.05 \end{pmatrix}}_{b_0}   (66)

The following lines of code do the optimization. The function t transposes a vector or matrix,
which means that columns and rows are swapped. The function rep replicates the first argu-
ment by the number given as the second argument. Note that the last argument to solve.QP
says that the first two constraints (out of two) shall be interpreted as equalities instead of
inequalities.
require(quadprog)
A <- t(rbind(1, mu))
mu.star <- 0.05
d <- rep(0, length(mu))
b0 <- c(1, mu.star)
solve.QP(Dmat = Sigma, dvec = d, Amat = A, bvec = b0, meq = 2)

## Error in solve.QP(Dmat = Sigma, dvec = d, Amat = A, bvec = b0, meq = 2): matrix
D in quadratic function is not positive definite!

Oh dear, what's wrong here? What does it mean that the matrix is not positive definite?
Well, if a covariance matrix is not positive definite it is singular, so we cannot invert it. In our
case Sigma is not invertible. But apparently, solve.QP needs to invert this matrix to solve the
entire problem. So why can it be that Sigma is not positive definite? This is a common problem
in empirics if you estimate a covariance (and also a correlation) matrix of high dimensionality.
Remember we have 53 firms which is quite a lot. There is a remedy to this issue because there
are ways to convert such nearly positive definite matrices into truly positive definite ones. The
function is called nearPD and is available in the package Matrix .

require(Matrix)
Sigma2 <- nearPD(Sigma)$mat
Sigma2 <- as.matrix(Sigma2)

After having transformed Sigma into a positive definite matrix, the result is not given as a plain
matrix object. Thus, we need to convert it into a matrix object using as.matrix .
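
To see what nearPD actually changed, one can compare the smallest eigenvalues of the two
matrices (a quick check; a symmetric matrix is positive definite exactly when all its eigenvalues
are strictly positive):

min(eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values)  ## should be <= 0 (or numerically zero) here
min(eigen(Sigma2, symmetric = TRUE, only.values = TRUE)$values)  ## strictly positive after nearPD

Now let's try again: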

res <- solve.QP(Dmat = Sigma2, dvec = d, Amat = A, bvec = b0,


meq = 2)
omega <- res$solution

What are the weights?


hist(omega)

[Figure: Histogram of omega]

The R function hist is used to plot a histogram. Using the weights we then check whether we
end up with the desired return. We also compute the portfolio standard deviation.

t(omega) %*% as.matrix(mu)

## [,1]
## [1,] 0.05

sqrt(t(omega) %*% Sigma2 %*% as.matrix(omega))

## [,1]
## [1,] 0.000101595

We use the operator %*% to do matrix multiplication. Be careful here, the matrices have to
have the right dimensions so that they can be multiplied. And what are the results? Well, we
get the desired portfolio return using the weights omega . But look at the portfolio standard
deviation! It's tiny and already in annual terms! It basically means that you earn 5% without
risk. Too good to be true. What's wrong here? A big disadvantage of the Markowitz portfolio
optimization is that it is very sensitive to the input parameters, in particular, to the expected
return estimates. Recall that these are very hard to estimate. It is always sensible to add


constraints to the optimization. For example, we require all weights to be positive because our
investor cannot short stocks. How do we do this? We need to define A and b0 as follows. Note
that the equality applies to the first two equations only while the inequality applies to the rest.
   
\underbrace{\begin{pmatrix}
1 & 1 & \cdots & 1 \\
\mu_1 & \mu_2 & \cdots & \mu_n \\
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}}_{A^\top}
\underbrace{\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}}_{b}
= (\geq)
\underbrace{\begin{pmatrix} 1 \\ 0.05 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}}_{b_0}   (67)

This is how we implement the constrained optimization:

A <- t(rbind(1, mu, diag(1, length(mu))))


mu.star <- 0.05
b0 <- c(1, mu.star, rep(0, length(mu)))
res <- solve.QP(Dmat = Sigma2, dvec = d, Amat = A, bvec = b0,
meq = 2)
omega <- res$solution
t(omega) %*% as.matrix(mu)

## [,1]
## [1,] 0.05

sqrt(t(omega) %*% Sigma2 %*% as.matrix(omega))

## [,1]
## [1,] 0.09029294

summary(omega)

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## 0.00000 0.00000 0.00000 0.01887 0.02423 0.16736

The function diag creates a diagonal matrix with the diagonal values given by the first ar-
gument and the number of rows and columns by its second argument. The results are much
more convincing now than before. However, another disadvantage of the Markowitz portfolio
optimization is the large number of parameters to be estimated. In our case we have:

Estimates of expected returns: 53
Estimates of variances: 53
Estimates of covariances: (53^2 − 53)/2 = 1378
Total estimates: 1484
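
The same count in R, for a general number of stocks n (here n = 53):

n <- 53
c(n, n, (n^2 - n)/2, n + n + (n^2 - n)/2)  ## expected returns, variances, covariances, total (1484)
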

This is quite a lot. And the number of estimates grows quadratically with the number of
stocks we consider. This implies that the more estimates you have the more errors you are


going to make. We know already how difficult it is to estimate, for example, expected returns.
A similar issue arises for the covariance (or correlation) matrix. Since it is huge it can happen
that correlation coefficients can be mutually inconsistent and thus lead to nonsensical results.
The higher the number of estimates the larger the total estimation error. We know, garbage in
– garbage out. This seems to be the case in the previous computation. What can we do? We
can either put more constraints to the optimization problem or simplify our model by using an
index model instead. This is what we do next.

6.4 Single Index Model


This model simplifies how we describe the sources of risk and allows us to use a smaller,
consistent set of estimates of risk parameters and premiums. We assume the following model.
We argue that a single return depends to some degree on a common factor and to some degree
on an idiosyncratic factor. In principle several factors could be considered here as long as they
affect the individual stocks. In this case we consider the market index (Rm ) as the common
factor.

R_i = \alpha_i + \beta_i R_m + \epsilon_i   (68)

The intercept αi in this equation is the security's expected return when the market return is
zero. In an arbitrage-free world we should not expect to find a non-zero α. Therefore, it is often
sensible to assume it to be zero. The slope coefficient βi is the security beta and measures the
sensitivity of the security to the market index. The residual εi is zero-mean and corresponds to
the firm-specific surprise in returns.

6.4.1 Estimating the Single Index model


We will now estimate the index model. We rely on linear regression to estimate the coefficients.
We use the R function lm which stands for “linear model”. We do it for each and every stock.

reg <- apply(R[, -1], 2, function(v) {
    res <- lm(v ~ RM$RM.vw)
    c(coefficients(res), var(residuals(res)))
})
rownames(reg) <- c("alpha", "beta", "var.eps")

This probably needs some further explanation. I will demonstrate how to run a regression using
a very simple example. Let’s regress y on x as follows:

x <- c(4, 2, 6, 7, 3, 4)
y <- c(100, 200, 140, 160, 170, 190)
lmres <- lm(y ~ x)
lmres

##
## Call:
## lm(formula = y ~ x)
##


## Coefficients:
## (Intercept) x
## 192.5 -7.5

plot(x, y)
abline(lmres)
[Figure: Scatter plot of y against x with the fitted regression line]

The notation to tell R to regress y on x is y ~ x . lm returns an object which comprises a
lot of information. You get a summary of the regression results using summary .

summary(lmres)

##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## 1 2 3 4 5
## -6.250e+01 2.250e+01 -7.500e+00 2.000e+01 -2.842e-14
## 6
## 2.750e+01


##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 192.500 41.926 4.591 0.0101 *
## x -7.500 9.007 -0.833 0.4519
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 37.5 on 4 degrees of freedom
## Multiple R-squared: 0.1477,Adjusted R-squared: -0.06534
## F-statistic: 0.6933 on 1 and 4 DF, p-value: 0.4519

You get the coefficients with:

coefficients(lmres)

## (Intercept) x
## 192.5 -7.5

You get the residuals with:

residuals(lmres)

## 1 2 3 4
## -6.250000e+01 2.250000e+01 -7.500000e+00 2.000000e+01
## 5 6
## -2.842171e-14 2.750000e+01

You get the coefficients, standard errors, t-values and p-values all together with:

summary(lmres)$coefficients

## Estimate Std. Error t value Pr(>|t|)


## (Intercept) 192.5 41.926275 4.5913929 0.01009621
## x -7.5 9.007209 -0.8326664 0.45186013

You get the R^2 with:

summary(lmres)$r.squared

## [1] 0.1477273

summary(lmres)$adj.r.squared

## [1] -0.06534091
End of lecture 11


6.4.2 Expected returns and risk in the single index model


It immediately follows from the single index model that the expected return is given as

\mu_i = E[R_i] = E[\alpha_i + \beta_i R_m + \epsilon_i] = \alpha_i + \beta_i \mu_m   (69)

and its variance is

\sigma_i^2 = Var[R_i] = Var[\alpha_i + \beta_i R_m + \epsilon_i] = \beta_i^2 \sigma_m^2 + \sigma_{\epsilon,i}^2 .   (70)

The covariance between the stock and the index based on the single index model is

Cov[R_i, R_m] = \beta_i \sigma_m^2   (71)

while the covariance between stock i and j with i \neq j is

\sigma_{ij} = Cov[R_i, R_j] = \beta_i \beta_j \sigma_m^2 .   (72)

The next step is to write down the expressions for the expected portfolio return and variance
in the familiar matrix notation

E[R_p] = \mu_p = \omega^\top \mu   (73)

where the vector µ is defined by Equation (69). The portfolio variance is given by

Var[R_p] = \sigma_p^2 = \omega^\top \times \Sigma \times \omega,   (74)

with the variance terms defined in Equation (70) and the covariance terms in Equation (72).
To make it more clear how the covariance matrix of the stock returns based on the single index
model looks like, here is the expression:

\Sigma = \begin{pmatrix}
\beta_1^2 \sigma_m^2 + \sigma_{\epsilon,1}^2 & \beta_1 \beta_2 \sigma_m^2 & \cdots & \beta_1 \beta_n \sigma_m^2 \\
\beta_1 \beta_2 \sigma_m^2 & \beta_2^2 \sigma_m^2 + \sigma_{\epsilon,2}^2 & \cdots & \beta_2 \beta_n \sigma_m^2 \\
\vdots & \vdots & \ddots & \vdots \\
\beta_1 \beta_n \sigma_m^2 & \beta_2 \beta_n \sigma_m^2 & \cdots & \beta_n^2 \sigma_m^2 + \sigma_{\epsilon,n}^2
\end{pmatrix} .   (75)
Equipped with all these expressions we now determine the vector of expected stock returns
and the covariance matrix.

alpha <- reg[1, ] * 12


beta <- reg[2, ]
var.eps <- reg[3, ] * 12
mu.index <- mean(RM$RM.vw) * 12
var.index <- var(RM$RM.vw) * 12
mu <- beta * mu.index
Sigma <- var.index * (as.matrix(beta) %*% beta)
diag(Sigma) <- diag(Sigma) + var.eps

Note that I have scaled returns, variances and covariances so that they are in annual terms.
When markets are efficient a stock should not have a non-zero α. We thus set the intercept to
zero when determining the expected (excess) stock returns. The R function diag accesses the
diagonal of a matrix.
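
A quick consistency check of this construction (using the objects defined above): off-diagonal
entries of Sigma should equal the product of the two betas times the index variance, Equation
(72), while diagonal entries add the idiosyncratic variance, Equation (70).

Sigma[1, 2] - beta[1] * beta[2] * var.index  ## should be 0, Equation (72)
Sigma[1, 1] - (beta[1]^2 * var.index + var.eps[1])  ## should be 0, Equation (70)
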


6.4.3 Solving for the optimal portfolio


We have now all the necessary ingredients to do the optimization. Let’s do a few examples. We
start by finding the optimal solution to get a target expected excess return of 5% p.a.

A <- t(rbind(1, mu))


mu.star <- 0.05
d <- rep(0, length(mu))
b0 <- c(1, mu.star)
res <- solve.QP(Dmat = Sigma, dvec = d, Amat = A, bvec = b0,
meq = 2)
omega <- res$solution
t(omega) %*% as.matrix(mu)

## [,1]
## [1,] 0.05

sqrt(t(omega) %*% Sigma %*% as.matrix(omega))

## [,1]
## [1,] 0.07206284

summary(omega)

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## 0.0008164 0.0075522 0.0174065 0.0188679 0.0262146 0.0529453

The previous result makes more sense compared with the result that we obtained based on the
full estimation of the covariance matrix. Note that the weights of the optimal solution are all
positive without explicitly imposing a constraint. This, however, does not necessarily apply to
all optimal portfolios, that is, portfolios that yield other target returns. To guarantee that
weights are positive for other portfolios we impose the no-shortselling constraint.

A <- t(rbind(1, mu, diag(1, length(mu))))


mu.star <- 0.05
b0 <- c(1, mu.star, rep(0, length(mu)))
res <- solve.QP(Dmat = Sigma, dvec = d, Amat = A, bvec = b0,
meq = 2)
omega <- res$solution
t(omega) %*% as.matrix(mu)

## [,1]
## [1,] 0.05

sqrt(t(omega) %*% Sigma %*% as.matrix(omega))

## [,1]
## [1,] 0.07206284


summary(omega)

## Min. 1st Qu. Median Mean 3rd Qu. Max.


## 0.0008164 0.0075522 0.0174065 0.0188679 0.0262146 0.0529453

Would you advise your client to invest in these stocks with the corresponding weights? To
answer this question we need to plot the frontier first and check whether the obtained portfolio
is efficient.

mu.p.vec <- seq(0, 0.2, length = 100)


sigma.p.vec <- c()
for (i in 1:length(mu.p.vec)) {
mu.star <- mu.p.vec[i]
b0 <- c(1, mu.star, rep(0, length(mu)))
res <- solve.QP(Dmat = Sigma, dvec = d, Amat = A, bvec = b0,
meq = 2)
omega <- res$solution
sigma.p.vec <- c(sigma.p.vec, sqrt(t(omega) %*% Sigma %*%
as.matrix(omega)))
}
plot(sigma.p.vec, mu.p.vec, type = "l", xlim = c(0, max(sigma.p.vec)),
xlab = "Volatility", ylab = "Expected return")


[Figure: Efficient frontier; expected return against volatility]

We see that the optimal portfolio that delivers a target return of 5% is indeed efficient.
That is, there is no other portfolio that has a lower risk and still yields 5%. Further, we could
also improve the performance by taking into consideration a risk-free bank account, which we
are going to do next.

6.5 Capital allocation line


So far we have put all the investor's money entirely into stocks. But of course, he could also
leave some money in the bank account and only invest a fraction in risky stocks. The investor
could even borrow money and invest more than his wealth in stocks. This process of deciding
how much to invest in the risky and how much in the risk-less portfolio is referred to as capital
allocation. The process of determining the optimal risky portfolio is denoted as asset
allocation.
From the previous graph it immediately becomes clear that there is one optimal risky portfolio
given that we have the opportunity of investing in a risk-less asset. This is the tangency portfolio
and the line connecting the origin with the tangency portfolio is the capital allocation line (CAL).
How do we find the tangency portfolio? By inspecting the previous figure you may notice that
the tangency portfolio is the one that makes the CAL steepest. It has the highest Sharpe ratio.
Thus, this is again a job for the optimizer: find the weights of the risky assets that make the
CAL steepest, that is, maximize the Sharpe ratio. Thus, the optimization problem is given by:


\max_{\omega} \frac{E[R_p]}{\sqrt{Var[R_p]}} = \max_{\omega} \frac{\omega^\top \mu}{\sqrt{\omega^\top \Sigma \omega}}   (76)

subject to

E[R_p] = \mu^* , \qquad \sum_i \omega_i = 1.   (77)

Unfortunately, this does not immediately look like a quadratic programming problem. Thus, it
seems we cannot use solve.QP anymore. However, a little trick shows that the problem is
still quadratic. Let's divide the numerator and denominator of the objective function by the
scalar k ≡ ω^⊤µ, that is,

\max_{w} \frac{1}{\sqrt{\frac{\omega^\top}{\omega^\top \mu} \, \Sigma \, \frac{\omega}{\omega^\top \mu}}} = \max_{w} \frac{1}{\sqrt{w^\top \Sigma w}} ,   (78)

where w ≡ ω/k. The previous optimization problem is then again equivalent to

\min_{w} \; w^\top \Sigma w   (79)

subject to

E[R_p] = \mu^* ,   (80)

which is a standard quadratic programming problem. The result of this optimization problem
is a portfolio on the CAL. To get the tangency portfolio (that is, the one fully invested in the
risky assets) we need to rescale the weight vector w so that it sums to 1. Let's do this first
using the unconstrained version.

## Note: A must here contain only the return constraint and the non-negativity
## constraints (the sum-to-one row is dropped), otherwise its dimensions do not
## match b0; the weights are rescaled to sum to one afterwards.
A <- t(rbind(mu, diag(1, length(mu))))
mu.star <- 0.05
b0 <- c(mu.star, rep(0, length(mu)))
d <- rep(0, length(mu))
res <- solve.QP(Dmat = Sigma, dvec = d, Amat = A, bvec = b0,
    meq = 1)
w <- res$solution
omega <- w/sum(w)
omega

## [1] 1.929254e-02 1.972085e-02 2.258105e-02 2.347323e-02


## [5] 2.507384e-02 2.551734e-02 2.354814e-03 2.338602e-02
## [9] 7.234191e-03 2.415994e-02 7.792567e-02 3.677751e-02
## [13] 2.198006e-03 1.771949e-02 1.252667e-02 1.744818e-02
## [17] 3.675803e-02 7.105055e-03 4.695372e-03 4.879321e-02
## [21] 9.916029e-03 1.177341e-02 5.050441e-03 5.290306e-02
## [25] 1.753444e-02 9.346389e-02 0.000000e+00 8.955204e-03
## [29] 5.932608e-03 6.354556e-02 1.321753e-02 8.523523e-19
## [33] 3.805419e-03 2.710655e-19 5.177206e-03 3.799824e-03


## [37] 5.623203e-03 1.874810e-02 9.415682e-03 1.823120e-02


## [41] 9.252439e-03 -1.185227e-18 1.723012e-02 3.302689e-02
## [45] 1.298694e-02 5.097800e-02 1.580560e-02 -1.929734e-19
## [49] -2.196933e-19 1.110242e-02 5.932440e-03 2.134806e-02
## [53] 2.050329e-02

y <- t(omega) %*% as.matrix(mu)


x <- sqrt(t(omega) %*% Sigma %*% as.matrix(omega))
points(x, y, pch = 4, lwd = 4, col = "darkred")
abline(0, y/x, lwd = 2, col = "darkred", lty = 2)
[Figure: Efficient frontier with the tangency portfolio marked and the capital allocation line]

The points function adds points to an existing plot, drawn with the symbol defined by
pch ; here it marks the tangency portfolio. Since the investor desires to earn 5% p.a. in excess
of the risk-free interest rate we advise him to invest the following fraction of his NOK 50 million
in the risky tangency portfolio:

0.05/y

## [,1]
## [1,] 0.5906588

In doing so, we can expect to earn 5% p.a. with a volatility of:


0.05/y * x

## [,1]
## [1,] 0.06753101

The corresponding weights and NOK amounts to be invested in the stocks are given by:

round(omega, 2)

## [1] 0.02 0.02 0.02 0.02 0.03 0.03 0.00 0.02 0.01 0.02 0.08
## [12] 0.04 0.00 0.02 0.01 0.02 0.04 0.01 0.00 0.05 0.01 0.01
## [23] 0.01 0.05 0.02 0.09 0.00 0.01 0.01 0.06 0.01 0.00 0.00
## [34] 0.00 0.01 0.00 0.01 0.02 0.01 0.02 0.01 0.00 0.02 0.03
## [45] 0.01 0.05 0.02 0.00 0.00 0.01 0.01 0.02 0.02

omega * 0.05/c(y) * 5e+07

## [1] 5.697654e+05 5.824148e+05 6.668848e+05 6.932333e+05


## [5] 7.405042e+05 7.536019e+05 6.954459e+04 6.906578e+05
## [9] 2.136469e+05 7.135141e+05 2.301374e+06 1.086148e+06
## [13] 6.491357e+04 5.233087e+05 3.699493e+05 5.152959e+05
## [17] 1.085573e+06 2.098331e+05 1.386681e+05 1.441007e+06
## [21] 2.928495e+05 3.477034e+05 1.491544e+05 1.562383e+06
## [25] 5.178434e+05 2.760263e+06 0.000000e+00 2.644735e+05
## [29] 1.752073e+05 1.876687e+06 3.903525e+05 2.517247e-11
## [33] 1.123852e+05 8.005362e-12 1.528981e+05 1.122200e+05
## [37] 1.660697e+05 5.536864e+05 2.780728e+05 5.384209e+05
## [41] 2.732517e+05 -3.500325e-11 5.088561e+05 9.753812e+05
## [45] 3.835425e+05 1.505530e+06 4.667859e+05 -5.699071e-12
## [49] -6.488188e-12 3.278870e+05 1.752024e+05 6.304710e+05
## [53] 6.055225e+05

With the function round we round decimals as given by the first argument to the number of
digits as given by the second argument. Finally, the ex-ante Sharpe ratio is:

y/x

## [,1]
## [1,] 0.7404006
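
As a final check, the Sharpe ratio is scale-invariant, so the unscaled solution w gives the same
value (a quick verification using the objects above):

(t(w) %*% as.matrix(mu))/sqrt(t(w) %*% Sigma %*% as.matrix(w))  ## should equal y/x from above
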

6.6 Exercises
1. Use the estimates of the single index model and find the optimal portfolio that has a
volatility of 8% p.a. assuming you cannot short stocks.

2. Write a function that computes the unconstrained frontier, i.e. the expected returns and
the corresponding volatilities. The function shall be defined as follows:


## Computes the frontier.


##
## mu: vector of expected returns of length n
## Sigma: covariance matrix (n x n)
## mu.p.vec: vector of desired portfolio returns of length m
## Returns a vector of corresponding portfolio variances of length m
frontier <- function(mu, Sigma, mu.p.vec) {
}

3. Use the estimates based on a single-index model and do an unconstrained portfolio opti-
mization to find the tangency portfolio. Backtest this strategy using a rolling window
of exactly 60 return observations. That is, start at the earliest date possible where you
have at least 60 return observations. Find the tangency portfolio. Invest in this tangency
portfolio for one month. Find a new tangency portfolio and invest in the new one. Plot
the return series this strategy generates. What is the Sharpe ratio?
End of lecture 12


Index
..., 30 log, 17
:, 13 lower.tri, 84
<-, 9 matrix, 52
<, 55 mean, 18
==, 55 merge, 74
=, 9 na.omit, 19
>, 55 names, 14
FALSE, 55 nearPD, 86
TRUE, 55 nlminb, 34
[[, 12 nlm, 32
[, 11 nrow, 13
$, 12 order, 15
abline, 49 plot, 16
abs, 61 pmax, 54
aggregate, 68 points, 96
all, 54 qnorm, 20
apply, 53 range, 59
as.Date, 14 rbind, 51
as.matrix, 86 read.csv, 9
as.numeric, 42 rep, 86
cat, 8 require, 85
cbind, 50 reshape, 76
class, 15 residuals, 89
coefficients, 89 return, 28
cor, 81 rev, 24
cov, 81 rm, 17
cumprod, 52 rnorm, 48
cumsum, 52 round, 98
cut, 67 sample, 45
c, 13 save, 17
diag, 88 sd, 18
diff, 17 seq, 29
dim, 11 set.seed, 54
dnorm, 20 solve.QP, 85
exp, 42 sqrt, 18
factor, 65 summary, 64
for, 25 sum, 24
function, 28 system.time, 54
head, 10 table, 46
hist, 86 tail, 13
install.packages, 85 tapply, 72
is.na, 66 t, 86
legend, 75 unlist, 72
length, 19
levels, 65
lines, 59
list, 68
lm, 89
load, 17

