Time-series-ARMA-modeling (Unit4b)
ARMA MODELS IN R
1. INTRODUCTION
Import data into R, create a time series object, and construct time series plots
mydatats1<- read.table("E:/Vrontos/mathimata/…/datats.txt")
y <- mydatats1$V1
j=ts(y, frequency=4, start = c(1960,1))
Below, the time series plots of the J&J data, the log of J&J, and the first differences of the log of J&J
are presented together with their corresponding histograms. The log of the J&J series and the
first differences of the log series are computed by:
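A minimal sketch of this step. Because the path to datats.txt is elided, the sketch substitutes R's built-in JohnsonJohnson dataset (quarterly J&J earnings per share, 1960 to 1980); this is an assumption and may not match the course file exactly. The name lj is also an assumption, while dlj is the name used in the rest of these notes.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
j <- JohnsonJohnson

lj  <- log(j)    # log of the series (the name lj is an assumption)
dlj <- diff(lj)  # first differences of the log series

start(dlj)       # differencing drops the first quarter of the sample
length(dlj)      # one observation fewer than j
```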
A nice graph can be produced by using the command par(mfrow=c(rows,col)), which splits the graph
into (rows x col) subplots:
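One way such a figure could be produced is sketched below. The built-in JohnsonJohnson dataset is used as a stand-in for the course file (an assumption), and the titles and the name lj are illustrative choices rather than the original ones.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
j   <- JohnsonJohnson
lj  <- log(j)      # the name lj is an assumption
dlj <- diff(lj)

old_par <- par(mfrow = c(3, 2))  # 3 rows x 2 columns of subplots

plot(j, main = "J&J", ylab = "j")
hist(as.numeric(j), main = "Histogram of J&J", xlab = "j")

plot(lj, main = "log(J&J)", ylab = "lj")
hist(as.numeric(lj), main = "Histogram of log(J&J)", xlab = "lj")

plot(dlj, main = "Differences of log(J&J)", ylab = "dlj")
hist(as.numeric(dlj), main = "Histogram of differences of log(J&J)", xlab = "dlj")

par(old_par)  # restore the previous plotting layout
```

Saving the value returned by par() and restoring it afterwards keeps later plots from inheriting the 3 x 2 layout.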
MSc in Statistics
Financial Analytics Ioannis Vrontos
Figure 1: Time series plots and histograms for the J&J, log of J&J and the differences of log(J&J)
We focus on the stationary time series, i.e. the differences of the logarithms of the J&J series. We can
test for normality of dlj, and create the plot of the histogram together with a density plot, and also the
normal QQplot:
par(mfrow=c(2,1))
hist(dlj, prob=TRUE, 15) # histogram
lines(density(dlj)) # smooth it - ?density for details
qqnorm(dlj,main="Normal QQplot of dlj") # normal Q-Q plot
qqline(dlj) # add a line
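The commands above give graphical checks only. A formal normality test could be added along the following lines; the Shapiro-Wilk test is one possible choice and is an assumption here, since the notes do not say which test was used. The built-in JohnsonJohnson dataset again stands in for the course data.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
dlj <- diff(log(JohnsonJohnson))

# Shapiro-Wilk test; the null hypothesis is that the data are normally distributed
sw <- shapiro.test(as.numeric(dlj))
sw            # prints the test statistic W and the p-value
sw$p.value    # a small p-value is evidence against normality
```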
Model identification is based on the autocorrelation and partial autocorrelation plots. The acf( ) and
pacf( ) commands create an autocorrelation plot and a partial autocorrelation plot, respectively. Below, the
autocorrelation and partial autocorrelation plots of the J&J data, the log of J&J, and the first
differences of the log of J&J are arranged in one figure using the command par(mfrow=c(3,2)).
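A sketch of how these six plots could be drawn, assuming the built-in JohnsonJohnson dataset as a stand-in for the course file; the name lj and the plot titles are illustrative.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
j   <- JohnsonJohnson
lj  <- log(j)      # the name lj is an assumption
dlj <- diff(lj)

old_par <- par(mfrow = c(3, 2))  # one row per series: ACF on the left, PACF on the right

acf(j,    main = "ACF of J&J")
pacf(j,   main = "PACF of J&J")
acf(lj,   main = "ACF of log(J&J)")
pacf(lj,  main = "PACF of log(J&J)")
acf(dlj,  main = "ACF of dlj")
pacf(dlj, main = "PACF of dlj")

par(old_par)  # restore the previous plotting layout

# The same quantities are available without plotting
acf_dlj <- acf(dlj, plot = FALSE)
```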
Note that the lag values on the x-axis are 1, 2, 3, 4, 5, … and correspond to lags 4, 8, 12, 16, 20, …
because we have quarterly data, i.e. the frequency is 4, and R labels the lags in years. Clearer labeling
can be produced by using the following set of commands:
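The original commands are not reproduced here. One possible approach, sketched under the assumption that the built-in JohnsonJohnson dataset stands in for the course data, is to compute the correlations without plotting, rescale the stored lags from years to quarters, and then plot the modified objects.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
dlj <- diff(log(JohnsonJohnson))

# Compute the ACF and PACF without plotting
a <- acf(dlj,  lag.max = 20, plot = FALSE)
p <- pacf(dlj, lag.max = 20, plot = FALSE)

# Lags are stored in years for a quarterly series; convert them to quarters
a$lag <- a$lag * frequency(dlj)
p$lag <- p$lag * frequency(dlj)

old_par <- par(mfrow = c(2, 1))
plot(a, main = "ACF of dlj (lag in quarters)")
plot(p, main = "PACF of dlj (lag in quarters)")
par(old_par)
```

A shorter alternative is acf(as.numeric(dlj)), which drops the time series attributes so that lags are counted in observations.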
The estimation of a specified ARMA model can be done by using the command arima() of R. A frequently
used form of the command is:
arima(y, order = c(p,d,q), seasonal = list(order = c(ps,ds,qs), period = freq), include.mean = TRUE,
fixed = NULL, method = c("CSS-ML", "ML", "CSS"), optim.method = "BFGS")
where
y: is the univariate time series under consideration.
order: specifies the non-seasonal part of the ARIMA model, i.e. p denotes the order of the
Autoregressive part [AR(p)], q denotes the order of the Moving Average part [MA(q)] and d denotes the
order of differencing [I(d)].
seasonal: specifies the seasonal part of the ARIMA model, i.e. ps denotes the order of seasonal AR part,
qs denotes the order of seasonal MA part, ds denotes the order of seasonal differencing and period
refers to the frequency of the analyzed series, i.e. period=4 for quarterly data, period=12 for monthly
data.
include.mean: equals TRUE if the ARMA model includes the mean of the analyzed series, FALSE if the
ARMA model has zero mean.
fixed: is a vector of same length as the number of model parameters to be estimated (the ARMA
parameters and the mean). It takes the value of 0, if the corresponding parameter will not be estimated,
and takes NA if the corresponding parameter will be estimated.
method: denotes the estimation method, i.e. maximum likelihood (ML), minimize conditional sum-of-
squares (CSS), or first conditional-sum-of-squares to find starting values and then maximum likelihood
(CSS-ML).
optim.method: denotes the optimization algorithm used.
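The general form of the ARMA model estimated by the command arima() when include.mean=FALSE
is

$$y_t = \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t ,$$

and

$$y_t - \mu = \phi_1 (y_{t-1} - \mu) + \dots + \phi_p (y_{t-p} - \mu) + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t ,$$

when include.mean=TRUE, which is the default choice of the command. Here $\mu$ is the mean of the series,
$\phi_i$ are the AR coefficients, $\theta_i$ are the MA coefficients and $\varepsilon_t$ is the error term.
Note that for ARIMA models, i.e. when differencing is used, the differenced series follows a zero-mean
ARMA model (see footnote 1).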
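The remark about differencing can be checked with a short experiment. The sketch below assumes the built-in JohnsonJohnson dataset as a stand-in for the course data and uses method = "ML" for all three fits (a choice made here for comparability, not taken from the notes). It compares differencing by hand with differencing inside arima().

```r
# Stand-in for the course data (assumption): log of built-in quarterly J&J earnings
y <- log(JohnsonJohnson)

# Differencing by hand: a mean term is estimated by default
fit_diff <- arima(diff(y), order = c(1, 0, 1), method = "ML")

# Differencing inside arima(): no mean term is estimated when d > 0
fit_int  <- arima(y, order = c(1, 1, 1), method = "ML")

# Differencing by hand with the mean switched off
fit_zero <- arima(diff(y), order = c(1, 0, 1), include.mean = FALSE, method = "ML")

names(coef(fit_diff))  # includes "intercept"
names(coef(fit_int))   # AR and MA coefficients only
coef(fit_zero)         # expected to be close to coef(fit_int)
```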
For illustration purposes, first, we will estimate some moving average (MA) models in R, using the
differences of the logarithm of J&J series (dlj), which is a stationary process (as shown in previous
lectures). The command ma1fit=arima(dlj,order=c(0,0,1)) estimates a simple MA(1) model for the dlj
series and returns the object/list ma1fit. Note that ma1fit is an object/list containing several results. For
example, it contains the coefficients (ma1fit$coef), the estimated residual series (ma1fit$residuals), the
Akaike Information Criterion AIC (ma1fit$aic).
ma1fit=arima(dlj,order=c(0,0,1))
ma1fit
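The components mentioned above can be inspected as in the following sketch, which assumes the built-in JohnsonJohnson dataset as a stand-in for the course data. The standard errors are taken from the var.coef component of the fitted object.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
dlj <- diff(log(JohnsonJohnson))

ma1fit <- arima(dlj, order = c(0, 0, 1))

ma1fit$coef                    # estimated MA coefficient and mean ("intercept")
head(ma1fit$residuals)         # first few estimated residuals
ma1fit$aic                     # Akaike Information Criterion
sqrt(diag(ma1fit$var.coef))    # standard errors of the estimated coefficients
```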
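Footnote 1: In R, the commands arima(diff(y),order=c(1,0,1)) and arima(y,order=c(1,1,1)) provide different
model estimates: by default the first includes a mean term, whereas the second, which differences inside arima(), does not.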
ma4fit=arima(dlj,order=c(0,0,4))
ma4fit
Call: arima(x = dlj, order = c(0, 0, 4))
Coefficients:
          ma1      ma2      ma3     ma4  intercept
      -0.6578  -0.1495  -0.4180  0.7598     0.0366
s.e.   0.0939   0.1296   0.1035  0.0732     0.0068
sigma^2 estimated as 0.01383: log likelihood = 57.65, aic = -103.31
Note that if minimization of the conditional sum of squares (CSS) is used to estimate the model
parameters, we obtain the following results:
ma4afit=arima(dlj,order=c(0,0,4),method=c("CSS"))
ma4afit
Call: arima(x = dlj, order = c(0, 0, 4), method = c("CSS"))
Coefficients:
          ma1      ma2      ma3     ma4  intercept
      -0.6216  -0.1615  -0.3922  0.6940     0.0353
s.e.   0.0800   0.0977   0.0957  0.0778     0.0071
sigma^2 estimated as 0.01645: log likelihood = 52.68, aic = NA
Finally, if we want to estimate a restricted MA model (say with MA coefficients $\theta_1 = 0$, $\theta_2 = 0$, $\theta_3 = 0$,
so that only the fourth parameter is estimated, i.e. $\theta_4 \neq 0$), we use the option fixed described above as follows:
ma4restricted=arima(dlj,order=c(0,0,4),fixed=c(0,0,0,NA,NA))
ma4restricted
Call: arima(x = dlj, order = c(0, 0, 4), fixed = c(0, 0, 0, NA, NA))
Coefficients:
      ma1  ma2  ma3     ma4  intercept
        0    0    0  0.7633     0.0314
s.e.    0    0    0  0.0815     0.0282
sigma^2 estimated as 0.02211: log likelihood = 38.67, aic = -71.34
Here, we will estimate some autoregressive (AR) models in R, using the differences of the logarithm of
J&J series (dlj), which is a stationary process. The command ar1fit=arima(dlj,order=c(1,0,0)) estimates a
simple AR(1) model for the dlj series and returns the object/list ar1fit.
ar1fit=arima(dlj,order=c(1,0,0))
ar1fit
or $y_t = 0.0545 - 0.5226\, y_{t-1} + \varepsilon_t$
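The value that arima() reports as "intercept" is the mean $\mu$ of the series, not the constant of the regression form. Expanding $y_t - \mu = \phi_1 (y_{t-1} - \mu) + \varepsilon_t$ gives $y_t = \mu (1 - \phi_1) + \phi_1 y_{t-1} + \varepsilon_t$, so the constant is $\mu (1 - \phi_1)$. With $\phi_1 = -0.5226$, a constant of 0.0545 would correspond to a mean of roughly 0.0358; this is inferred from the equation above, not read from the original output. The sketch below performs the conversion, assuming the built-in JohnsonJohnson dataset as a stand-in for the course data.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
dlj <- diff(log(JohnsonJohnson))

ar1fit <- arima(dlj, order = c(1, 0, 0))

phi <- unname(coef(ar1fit)["ar1"])        # AR(1) coefficient
mu  <- unname(coef(ar1fit)["intercept"])  # arima() reports the mean under this name

# Constant of the regression form y_t = const + phi * y_{t-1} + e_t
const <- mu * (1 - phi)

c(mean = mu, ar1 = phi, constant = const)
```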
ar4fit=arima(dlj,order=c(4,0,0))
ar4fit
Call: arima(x = dlj, order = c(4, 0, 0))
Coefficients:
          ar1      ar2      ar3     ar4  intercept
      -0.6834  -0.6104  -0.6226  0.2819     0.0384
s.e.   0.1123   0.1181   0.1241  0.1183     0.0037
sigma^2 estimated as 0.007825: log likelihood = 80.62, aic = -149.25
arma41fit=arima(dlj,order=c(4,0,1))
arma41fit
Call: arima(x = dlj, order = c(4, 0, 1))
Coefficients:
          ar1      ar2      ar3     ar4      ma1  intercept
      -0.4497  -0.3944  -0.4237  0.4910  -0.2465     0.0382
s.e.   0.2720   0.2535   0.2344  0.2423   0.2937     0.0042
sigma^2 estimated as 0.007751: log likelihood = 80.99, aic = -147.97
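$$y_t - \mu = \phi_1 (y_{t-1} - \mu) + \phi_2 (y_{t-2} - \mu) + \phi_3 (y_{t-3} - \mu) + \phi_4 (y_{t-4} - \mu) + \theta_1 \varepsilon_{t-1} + \varepsilon_t$$

or

$$y_t - 0.038 = -0.45 (y_{t-1} - 0.038) - 0.39 (y_{t-2} - 0.038) - 0.42 (y_{t-3} - 0.038) + 0.49 (y_{t-4} - 0.038) - 0.25\, \varepsilon_{t-1} + \varepsilon_t$$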
Finally, if we want to estimate a restricted ARMA(4,1) model (say with AR coefficients $\phi_1 = 0$, $\phi_2 = 0$, $\phi_3 = 0$, so that only
the fourth autoregressive parameter $\phi_4$ is estimated together with the moving average parameter
$\theta_1$), we again use the option fixed:
arma41restricted=arima(dlj,order=c(4,0,1),fixed=c(0,0,0,NA,NA,NA))
arma41restricted
Call: arima(x = dlj, order = c(4, 0, 1), fixed = c(0, 0, 0, NA, NA, NA))
Coefficients:
      ar1  ar2  ar3     ar4      ma1  intercept
        0    0    0  0.8603  -0.8140     0.0337
s.e.    0    0    0  0.0599   0.0933     0.0113
sigma^2 estimated as 0.008292: log likelihood = 78.35, aic = -148.7
Now we provide some diagnostic plots for the residuals of the above model. Based on the residuals
of the restricted ARMA(4,1) model, we present the autocorrelation and partial autocorrelation plots
of the estimated residuals (to check that the residuals are not autocorrelated), the autocorrelation
and partial autocorrelation plots of the squared residuals (to check for heteroskedasticity in the
residual series), and some normality plots (to check the assumption of normally distributed
residuals).
arma41residuals=arma41restricted$residuals
arma41residuals
residuals=ts(arma41residuals, frequency=4, start = c(1960,2))
residuals
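The diagnostic plots described above could be produced along the following lines. This is a sketch that assumes the built-in JohnsonJohnson dataset as a stand-in for the course data. The argument transform.pars = FALSE is set explicitly because arima() switches to it anyway, with a warning, when AR parameters are fixed. The name res and the plot titles are illustrative.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
dlj <- diff(log(JohnsonJohnson))

# Restricted ARMA(4,1): ar1, ar2, ar3 fixed at zero; ar4, ma1 and the mean are estimated
arma41restricted <- arima(dlj, order = c(4, 0, 1),
                          fixed = c(0, 0, 0, NA, NA, NA),
                          transform.pars = FALSE)
res <- arma41restricted$residuals

old_par <- par(mfrow = c(3, 2))

acf(res,    main = "ACF of residuals")            # residual autocorrelation
pacf(res,   main = "PACF of residuals")
acf(res^2,  main = "ACF of squared residuals")    # heteroskedasticity check
pacf(res^2, main = "PACF of squared residuals")

hist(as.numeric(res), prob = TRUE, main = "Histogram of residuals", xlab = "residuals")
lines(density(res))                                # smoothed density estimate
qqnorm(res, main = "Normal QQplot of residuals")  # normality check
qqline(res)

par(old_par)  # restore the previous plotting layout
```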
Based on the residual plots presented above, the assumptions on the residuals appear to be
satisfied, so the restricted ARMA(4,1) model is an appropriate candidate for modeling the
stationary series of the differences of the logarithm of J&J. Other model specifications could, of course,
be used as well.
2.4 Predictions
In this paragraph we compute predictions based on an estimated ARMA model using the command
predict() of R. A frequently used form of the command is predict(modelfit, n.ahead), where modelfit is
the estimated time series model, and n.ahead denotes the time steps ahead to predict. For our
estimated restricted ARMA(4,1) model, predictions for 8 quarters ahead (i.e. two years) are obtained by
using the following commands
forecast=predict(arma41restricted,8)
forecast
$pred
$se
A plot of the analyzed time series together with the forecasts plus/minus one standard error is obtained
by using the commands
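The original plotting commands are not reproduced here. A sketch of one way to draw such a plot follows, assuming the built-in JohnsonJohnson dataset as a stand-in for the course data. The names upper and lower, the colours and the use of ts.plot() are choices made for this illustration.

```r
# Stand-in for the course data (assumption): built-in quarterly J&J earnings
dlj <- diff(log(JohnsonJohnson))

# Restricted ARMA(4,1) as estimated above; transform.pars = FALSE avoids a warning
arma41restricted <- arima(dlj, order = c(4, 0, 1),
                          fixed = c(0, 0, 0, NA, NA, NA),
                          transform.pars = FALSE)

# Forecasts for 8 quarters ahead, with their standard errors
forecast <- predict(arma41restricted, n.ahead = 8)

upper <- forecast$pred + forecast$se   # forecast plus one standard error
lower <- forecast$pred - forecast$se   # forecast minus one standard error

# ts.plot() accepts series with different time spans but the same frequency
ts.plot(dlj, forecast$pred, upper, lower,
        col = c("black", "red", "blue", "blue"),
        lty = c(1, 1, 2, 2),
        main = "dlj with forecasts +/- one standard error")
```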