0.1 Installation of R Packages
In this note, we briefly introduce the R program to be used extensively in the course. Specific
packages and their commands for performing statistical analyses discussed in the lectures
will be given when needed. Our goal is to make the empirical analysis as easy as possible so
that students can reproduce the results shown in the lecture notes and textbook.
R is free software available from http://www.r-project.org. It runs on many operating
systems, including Linux, Mac OS X, and Windows. One can click CRAN on the above web
page to select a nearby CRAN mirror and then download and install the software and selected
packages. The simplest way to install the program is to follow the online instructions and
to use the default options. Because R is open-source software, thousands of packages
developed by researchers around the world are available for various statistical analyses. For
financial time series analysis, the Rmetrics project of Dr. Diethelm Wuertz and his associates has
many useful packages, including fBasics and fGarch. We use many functions of these packages
in the lectures. We also use some other packages that are powerful and easy to use, e.g.,
the evir package for extreme value analysis and the rugarch package for additional
volatility models.
R commands are case sensitive and must be typed exactly as shown.
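As a minimal sketch of how the packages named above might be installed from the R Console, the code below checks which of them are already present and leaves the actual installation commented out, since it requires an Internet connection. The variable names pkgs, installed, and missing_pkgs are illustrative, and we assume all five packages are available on the chosen CRAN mirror.

```r
# Packages mentioned in this note. Their availability on your CRAN mirror
# is an assumption; check the mirror if an installation fails.
pkgs <- c("fBasics", "fGarch", "evir", "rugarch", "quantmod")

# Check which packages are already installed (no Internet access needed).
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
print(missing_pkgs)

# Install only what is missing. Uncomment to run (requires Internet access).
# install.packages(missing_pkgs)

# Load a package for the current session once it is installed.
# Names are case sensitive: library(Quantmod) would fail.
# library(quantmod)
```

Installation is needed only once per machine, whereas library must be called in every new R session that uses the package.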
Economic Data (FRED) of the Federal Reserve Bank of St. Louis. The package is quantmod by
Jeffrey A. Ryan. It is highly recommended that one install it. Another useful package for
financial data is Quandl. The basic version of Quandl is also free.
Once installed, the quantmod package allows users with an Internet connection to use ticker
symbols to access daily stock data from Yahoo and Google Finance and to use series names
to access thousands of economic and financial time series from FRED. The command is
getSymbols. The package also has some useful plotting functions, e.g., chartSeries, which
produces time series plots of closing price and trading volume. The default options of
these two commands are sufficient for basic analysis of financial time series. One can use
subcommands to further enhance the capabilities of the package, such as specifying the time
span of interest in getSymbols. Interested readers may consult the documentation of
the package for a description of the available commands. Here we provide a simple
demonstration. Figure 1 shows the time plots of daily closing price and trading volume
of Apple stock from January 3, 2008 to January 28, 2015. The plot also shows the price
and volume of the last observation. The subcommand theme="white" of chartSeries is
used to set the background of the time plot. The default is black. Figure 2 shows the time
plot of monthly U.S. unemployment rates from January 1948 to December 2014. Figure 3
shows the time plot of daily interest rates of 10-year Treasury notes from January 3, 2007 to
January 28, 2015. These are the interest rates from the Chicago Board Options Exchange
(CBOE) obtained from Yahoo Finance. Since there is no volume, the subcommand TA=NULL
is used to omit the time plot of volume in chartSeries. The commands head and tail
show, respectively, the first and the last six rows of the data.
2015-01-27 112.42 112.48 109.03 109.14 95568700 109.14
2015-01-28 117.63 118.12 115.31 115.31 145448000 115.31
> chartSeries(AAPL,theme="white") # Obtain time plot of closing price and trading volume
[Plot: AAPL [2008-01-03/2015-01-28]. Last: 115.31. Volume (millions), last: 145,448,000.]
Figure 1: Time plots of daily closing price and trading volume of Apple stock from January
3, 2008 to January 28, 2015.
> getSymbols("^TNX")
[1] "TNX"
> head(TNX)
TNX.Open TNX.High TNX.Low TNX.Close TNX.Volume TNX.Adjusted
2007-01-03 4.66 4.69 4.64 4.66 0 4.66
2007-01-04 4.66 4.66 4.60 4.62 0 4.62
2007-01-05 4.59 4.70 4.58 4.65 0 4.65
2007-01-08 4.67 4.68 4.65 4.66 0 4.66
2007-01-09 4.66 4.67 4.64 4.66 0 4.66
2007-01-10 4.67 4.70 4.66 4.68 0 4.68
> tail(TNX)
TNX.Open TNX.High TNX.Low TNX.Close TNX.Volume TNX.Adjusted
2015-01-21 1.80 1.86 1.77 1.85 0 1.85
2015-01-22 1.94 1.95 1.81 1.90 0 1.90
2015-01-23 1.82 1.85 1.80 1.82 0 1.82
2015-01-26 1.80 1.84 1.80 1.83 0 1.83
2015-01-27 1.79 1.83 1.75 1.83 0 1.83
2015-01-28 1.82 1.83 1.72 1.72 0 1.72
> chartSeries(TNX,theme="white",TA=NULL) # Obtain time plot without trading volume
Remark: The quantmod package updates financial data daily. By default,
getSymbols downloads data up to the most recent observation available.
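The commands discussed in this section can be collected into one short sketch. It assumes that quantmod is installed and an Internet connection is available, so the download and plotting steps run only in an interactive session. The helper name fetch_series is hypothetical and not part of quantmod. The arguments src, auto.assign, from, and to belong to getSymbols, while theme and TA belong to chartSeries.

```r
# A sketch only: requires the quantmod package and an Internet connection.
# fetch_series is a hypothetical helper name, not a quantmod function.
fetch_series <- function(symbol, src = "yahoo", ...) {
  # auto.assign = FALSE returns the data instead of creating a variable
  # named after the symbol in the workspace.
  quantmod::getSymbols(symbol, src = src, auto.assign = FALSE, ...)
}

# The guard skips the download when the script is run non-interactively
# or when quantmod is not installed.
if (interactive() && requireNamespace("quantmod", quietly = TRUE)) {
  library(quantmod)

  aapl   <- fetch_series("AAPL", from = "2008-01-03", to = "2015-01-28")
  unrate <- fetch_series("UNRATE", src = "FRED")  # FRED series name
  tnx    <- fetch_series("^TNX", from = "2007-01-03", to = "2015-01-28")

  chartSeries(aapl, theme = "white")             # price and volume panels
  chartSeries(unrate, theme = "white")           # monthly unemployment rate
  chartSeries(tnx, theme = "white", TA = NULL)   # omit the volume panel
}
```

Returning the data from the helper, rather than relying on the automatically created variable, makes it explicit which object each plot uses. The data available from each source may differ from what is shown in this note.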
[Plot: UNRATE [1948-01-01/2014-12-01]. Last: 5.6.]
Figure 2: Time plot of U.S. monthly unemployment rates from January 1948 to December
2014.
[Plot: TNX [2007-01-03/2015-01-28].]
Figure 3: Time plot of Chicago Board Options Exchange interest rates of 10-year treasury
note from January 3, 2007 to January 28, 2015.
0.3 Some Basic R Commands
After starting R, the first thing to do is to set the working directory. By working directory,
we mean the computer directory where data sets reside and output will be stored. This can
be done in two ways. The first method is to click on the command File. A pop-up window
appears that allows one to select the desired directory. The second method is to type in the
desired directory in the R Console using the command setwd, which stands for set working
directory. See the demonstration below.
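A minimal sketch of the second method follows. It uses a temporary directory as a stand-in for a course folder, so the path is an assumption for illustration only. The companion command getwd reports the current working directory.

```r
# Remember the current working directory so that it can be restored.
old_dir <- getwd()

# Switch to another directory. tempdir() is only a stand-in here;
# in practice, supply the path of your own course folder.
setwd(tempdir())
new_dir <- getwd()
print(new_dir)

# Restore the original working directory.
setwd(old_dir)
```

Use forward slashes in paths, even on Windows, because a backslash inside quotes is treated as an escape character in R.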
R is an object-oriented program. It handles many types of objects. For the purposes of the
course, we do not need to study details of an object in R. Explanations will be given when
needed. It suffices now to say that R allows one to assign values to variables and refer to
them by names. The assignment operator is <-, but = can also be used. For instance, x <-
10 assigns the value 10 to the variable "x". Here R treats "x" as a vector of real numbers
whose first element is 10. There are several ways to load data into the R working
space, depending on the data format. For simple text data, the command is read.table.
For *.csv files, the command is read.csv. The data file is specified in either single or
double quotes; see the R demonstration. R treats the data as an object and refers to it by
the assigned name. For both loading commands, R stores the data in a data frame, which has
a matrix-like layout of rows and columns.
As such, one can use the command dim (i.e., dimension) to see the size of the data. Finally,
the basic operations in R are similar to those we commonly use and the command to exit R
is q().
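To make the loading commands concrete without relying on the course data files, the following sketch writes a tiny comma-separated file to a temporary location and reads it back. The file itself is hypothetical. Its two data rows reuse the first two rows of the IBM return data shown in the demonstration.

```r
# Assignment: both forms create a variable in the workspace.
x <- 10
y = 10
print(x[1])        # x is a vector, and this is its first element

# Write a small, hypothetical csv file to a temporary location.
tmp <- tempfile(fileext = ".csv")
writeLines(c("date,return",
             "20010102,-0.002206",
             "20010103,0.115696"), tmp)

# Load it with read.csv. header = TRUE says the first line holds names.
da <- read.csv(tmp, header = TRUE)

# read.table reads the same file once the separator is given explicitly.
da2 <- read.table(tmp, header = TRUE, sep = ",")

print(class(da))   # the loaded object is a data frame
print(dim(da))     # number of rows and number of columns
print(head(da))    # first rows (all of them in this small example)
```

Writing header = TRUE in full is slightly safer than header = T, because T is an ordinary variable that can be overwritten.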
R Demonstration
> setwd("C:/Users/rst/teaching/bs41202/sp2017") # Set my working directory
> x <- 10 # Assign value, here "x" is a variable.
> x # See the value of x.
[1] 10 # Here [1] signifies the first element.
> 1+2 # Basic operation: addition
[1] 3
> 10/2 # Basic operation: division
[1] 5
# Use * and ^ for multiplication and power, respectively.
# Use log for the natural logarithm.
> da=read.table('d-ibm-0110.txt',header=TRUE) # Load text data with names.
> head(da) # See the first 6 rows
date return
1 20010102 -0.002206
2 20010103 0.115696
....
6 20010109 -0.010688
> dim(da) # Dimension of the data object "da".
[1] 2515 2
> da <- read.csv("d-vix0411.csv",header=TRUE) # Load csv data with names.
> head(da) # See the first 6 rows
Date VIX.Open VIX.High VIX.Low VIX.Close
[Plot: AAPLrtn [2007-01-04/2011-12-02]. Last: 0.00455230136879425.]
Figure 4: Time plot of daily log returns of Apple stock from January 4, 2008 to January 28,
2015.
[Plot: TNX.rtn [2007-01-04/2015-01-28]. Last: -0.11.]
Figure 5: Time plot of daily changes in the yield to maturity for the U.S. 10-year Treasury
notes from January 4, 2007 to January 28, 2015.
[Plot: USEU.rtn [1999-01-05/2015-01-23].]
Figure 6: Time plot of daily log returns of the Dollar-Euro exchange rates from January 5,
1999 to January 23, 2015. The rate is dollars per Euro.
[Plot: DEXUSEU [1999-01-04/2015-01-23].]
Figure 7: Time plot of daily Dollar-Euro exchange rates from January 4, 1999 to January
23, 2015. The rate is dollars per Euro.
R Demonstration
> require(quantmod)
> getSymbols("AAPL",from="2008-01-03",to="2015-01-28") # Specify period
[1] "AAPL"
> AAPL.rtn=diff(log(AAPL$AAPL.Adjusted)) # Compute log returns
> chartSeries(AAPL.rtn,theme="white")
> getSymbols("^TNX",from="2007-01-03",to="2015-01-28")
[1] "TNX"
> chartSeries(USEU.rtn,theme="white")
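The log-return computation diff(log(...)) used in this demonstration can be checked by hand on a plain numeric vector, with no download needed. The sketch below uses the last two adjusted closing prices from the AAPL output shown earlier. The variable names price, log_rtn, and check are illustrative.

```r
# Last two adjusted closing prices from the AAPL output shown earlier.
price <- c(109.14, 115.31)

# Log return: r_t = log(P_t) - log(P_{t-1}), computed with diff(log(.)).
log_rtn <- diff(log(price))

# Equivalent form: the log of the gross return P_t / P_{t-1}.
check <- log(price[2] / price[1])

print(log_rtn)
print(all.equal(log_rtn, check))
```

On a plain vector, diff returns one fewer element than its input. When applied to the time-indexed object returned by getSymbols, the first entry of the result is NA instead, because no prior price exists for the first date.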