CHAPMAN & HALL/CRC

Pragmatics of Uncertainty
J.B. Kadane

Stochastic Processes: From Applications to Theory
P. Del Moral and S. Penev

Design of Experiments: An Introduction Based on Linear Models
Max Morris

Stochastic Processes: An Introduction, Third Edition
P.W. Jones and P. Smith
Time Series: A Data Analysis Approach Using R
Robert H. Shumway and David S. Stoffer
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources.
Reasonable efforts have been made to publish reliable data and information, but the
author and publisher cannot assume responsibility for the validity of all materials
or the consequences of their use. The authors and publishers have attempted to trace
the copyright holders of all material reproduced in this publication and apologize to
copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged, please write and let us know so we may
rectify it in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted,
reproduced, transmitted, or utilized in any form by any electronic, mechanical, or
other means, now known or hereafter invented, including photocopying, microfilming,
and recording, or in any information storage or retrieval system, without written
permission from the publishers.

For permission to photocopy or use material electronically from this work, please
access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC),
222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit
organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate
system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Contents

Preface

4 ARMA Models
4.1 Autoregressive Moving Average Models
4.2 Correlation Functions
4.3 Estimation
4.4 Forecasting
Problems

5 ARIMA Models
5.1 Integrated Models
5.2 Building ARIMA Models
5.3 Seasonal ARIMA Models
5.4 Regression with Autocorrelated Errors *
Problems

6 Spectral Analysis and Filtering
6.1 Periodicity and Cyclical Behavior
6.2 The Spectral Density
6.3 Linear Filters *
Problems

References
Index
Preface
The goal of this book is to develop an appreciation for the richness and versatility
of modern time series analysis as a tool for analyzing data. A useful feature of
the presentation is the inclusion of nontrivial data sets illustrating the breadth of
potential applications in medicine and in the biological, physical, and social sciences.
We include data analysis in both the text examples and in the problem sets.
The text can be used for a one semester/quarter introductory time series course
where the prerequisites are an understanding of linear regression and basic calculus-
based probability skills (primarily expectation). We assume general math skills at
the high school level (trigonometry, complex numbers, polynomials, calculus, and so
on).
All of the numerical examples use the R statistical package (R Core Team, 2018).
We do not assume the reader has previously used R, so Appendix A has an extensive
presentation of everything that will be needed to get started. In addition, there are
several simple exercises in the appendix that may help first-time users get more
comfortable with the software. We typically require students to do the R exercises as
the first homework assignment, and we have found this requirement to be successful.
Various topics are explained using linear regression analogies, and some estima-
tion procedures require techniques used in nonlinear regression. Consequently, the
reader should have a solid knowledge of linear regression analysis, including multiple
regression and weighted least squares. Some of this material is reviewed in Chapter 3
and Chapter 4.
A calculus-based introductory course on probability is an essential prerequisite.
The basics are covered briefly in Appendix B. It is assumed that students are familiar
with most of the content of that appendix and that it can serve as a refresher.
For readers who are a bit rusty on high school math skills, there are a number of
free books that are available on the internet (search on Wikibooks K-12 Mathematics).
For the chapters on spectral analysis (Chapters 6 and 7), a minimal knowledge of
complex numbers is needed, and we provide this material in Appendix C.
There are a few starred (*) items throughout the text. These sections and examples
are starred because the material covered in the section or example is not needed to
move on to subsequent sections or examples. It does not necessarily mean that the
material is more difficult than the rest; it simply means that the section or example
may be covered at a later time or skipped entirely without disrupting the continuity.
Chapter 8 is starred because the sections of that chapter are independent special
topics that may be covered (or skipped) in any order. In a one-semester course, we
can usually cover Chapter 1 – Chapter 7 and at least one topic from Chapter 8.
Some homework problems have “hints” in the back of the book. The hints vary
in detail: some are nearly complete solutions, while others are small pieces of advice
or code to help start a problem.
The text is informally separated into four parts. The first part, Chapter 1 –
Chapter 3, is a general introduction to the fundamentals, the language, and the
methods of time series analysis. The second part, Chapter 4 – Chapter 5, presents
ARIMA modeling. Some technical details have been moved to Appendix D because,
while the material is not essential, we like to explain the ideas to students who know
mathematical statistics. For example, MLE is covered in Appendix D, but in the main
part of the text, it is only mentioned in passing as being related to unconditional least
squares. The third part, Chapter 6 – Chapter 7, covers spectral analysis and filtering.
We usually spend a small amount of class time going over the material on complex
numbers in Appendix C before covering spectral analysis. In particular, we make sure
that students see Section C.1 – Section C.3. The fourth part of the text consists of the
special topics covered in Chapter 8. Most students want to learn GARCH models, so
if we can only cover one section of that chapter, we choose Section 8.1.
Finally, we mention the similarities and differences between this text and Shumway
and Stoffer (2017), which is a graduate-level text. There are obvious similarities
because the authors are the same and we use the same R package, astsa, and con-
sequently the data sets in that package. The package has been updated for this text
and contains new and updated data sets and some updated scripts. We assume astsa
version 1.8.6 or later has been installed; see Section A.2. The mathematics level of
this text is more suited to undergraduate students and non-majors. In this text, the
chapters are short and a topic may be advanced over multiple chapters. Relative to the
coverage, there are more data analysis examples in this text. Each numerical example
has output and complete R code included, even if the code is mundane like setting up
the margins of a graphic or defining colors with the appearance of transparency. We
will maintain a website for the text at www.stat.pitt.edu/stoffer/tsda. A solutions manual
is available for instructors who adopt the book at www.crcpress.com.
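For readers setting up, a minimal check that the package requirement is met can be
run from the R console (standard R commands; only the version number is assumed
from the paragraph above):

install.packages("astsa")   # install from CRAN (only needed once)
packageVersion("astsa")     # should report 1.8.6 or later
library(astsa)              # load the package in each session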
1 Time Series Elements

1.1 Introduction
The analysis of data observed at different time points leads to unique problems that
are not covered by classical statistics. The dependence introduced by the sampling of
data over time restricts the applicability of many conventional statistical methods that
require random samples. The analysis of such data is commonly referred to as time
series analysis.
To provide a statistical setting for describing the elements of time series data,
the data are represented as a collection of random variables indexed according to
the order they are obtained in time. For example, if we collect data on daily high
temperatures in your city, we may consider the time series as a sequence of random
variables, x_1, x_2, x_3, . . . , where the random variable x_1 denotes the high temperature
on day one, the variable x_2 denotes the value for the second day, x_3 denotes the
value for the third day, and so on. In general, a collection of random variables, {x_t},
indexed by t is referred to as a stochastic process. In this text, t will typically be
discrete and vary over the integers t = 0, ±1, ±2, . . . or some subset of the integers,
or a similar index like months of a year.
Historically, time series methods were applied to problems in the physical and
environmental sciences. This fact accounts for the engineering nomenclature that
permeates the language of time series analysis. The first step in an investigation
of time series data involves careful scrutiny of the recorded data plotted over time.
Before looking more closely at the particular statistical methods, we mention that
two separate, but not mutually exclusive, approaches to time series analysis exist,
commonly identified as the time domain approach (Chapters 4 and 5) and the frequency
domain approach (Chapters 6 and 7).
1.2 Time Series Data

The following examples illustrate some of the common kinds of time series data as
well as some of the statistical questions that might be asked about such data.
Figure 1.1 Johnson & Johnson quarterly earnings per share (QEPS).
Figure 1.2 Yearly average global land surface and ocean surface temperature deviations
(1880–2017) in °C.
Example 1.3. Dow Jones Industrial Average
If x_t is the closing value of the DJIA on day t, the daily return is

r_t = (x_t − x_{t−1}) / x_{t−1}.
Figure 1.3 Dow Jones Industrial Average (DJIA) trading days closings (top) and returns
(bottom) from April 20, 2006 to April 20, 2016.
Recalling the series expansion

log(1 + r) = r − r^2/2 + r^3/3 − · · · ,   −1 < r ≤ 1,

we see that if r is very small, the higher-order terms will be negligible. Consequently,
because for financial data x_t/x_{t−1} ≈ 1, we have

log(1 + r_t) ≈ r_t.
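As a quick numerical illustration of this approximation (a check of our own, not
one of the text's examples):

r = c(-0.05, -0.01, 0.01, 0.05)   # a few small returns
cbind(r, log(1 + r))              # the two columns nearly agree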
Note the financial crisis of 2008 in Figure 1.3. The data shown are typical of
return data. The mean of the series appears to be stable with an average return of
approximately zero, however, the volatility (or variability) of data exhibits clustering;
that is, highly volatile periods tend to be clustered together. A problem in the analysis
of these types of financial data is to forecast the volatility of future returns. Models
have been developed to handle these problems; see Chapter 8. The data set is an xts
data file, so the xts package must be loaded first.
Figure 1.4 US GDP growth rate calculated using logs (–◦–) and actual values (+).
library(xts)                              # djia is an xts object
djia_return = diff(log(djia$Close))[-1]   # log returns; drop the initial NA
par(mfrow=2:1)                            # two stacked panels
plot(djia$Close, col=4)                   # closings (top of Figure 1.3)
plot(djia_return, col=4)                  # returns (bottom of Figure 1.3)
You can see a comparison of r_t and log(1 + r_t) in Figure 1.4, which shows the
seasonally adjusted quarterly growth rate, r_t, of US GDP compared to the version
obtained by calculating the difference of the logged data.
tsplot(diff(log(gdp)), type="o", col=4, ylab="GDP Growth") # diff-log
points(diff(gdp)/lag(gdp,-1), pch=3, col=2) # actual return
It turns out that many time series behave like this, so that logging the data and
then taking successive differences is a standard data transformation in time series
analysis. ♦
Example 1.4. El Niño – Southern Oscillation (ENSO)
The Southern Oscillation Index (SOI) measures changes in air pressure related to sea
surface temperatures in the central Pacific Ocean. The central Pacific warms every
three to seven years due to the ENSO effect, which has been blamed for various global
extreme weather events. During El Niño, pressure over the eastern and western Pacific
reverses, causing the trade winds to diminish and leading to an eastward movement
of warm water along the equator. As a result, the surface waters of the central and
eastern Pacific warm with far-reaching consequences to weather patterns.
Figure 1.5 shows monthly values of the Southern Oscillation Index (SOI) and
associated Recruitment (an index of the number of new fish). Both series are for
a period of 453 months ranging over the years 1950–1987. They both exhibit an
obvious annual cycle (hot in the summer, cold in the winter), and, though difficult to
see, a slower frequency of three to seven years. The study of these kinds of cycles
and their strengths is the subject of Chapters 6 and 7. The two series are also related;
it is easy to imagine that fish population size is dependent on the ocean temperature.

Figure 1.5 Monthly SOI and Recruitment (estimated new fish), 1950–1987.
The following R code will reproduce Figure 1.5:
par(mfrow = c(2,1))
tsplot(soi, ylab="", xlab="", main="Southern Oscillation Index", col=4)
text(1970, .91, "COOL", col="cyan4")
text(1970,-.91, "WARM", col="darkmagenta")
tsplot(rec, ylab="", main="Recruitment", col=4)
♦
Example 1.5. Predator–Prey Interactions
While it is clear that predators influence the numbers of their prey, prey affect the
number of predators because when prey become scarce, predators may die of star-
vation or fail to reproduce. Such relationships are often modeled by the Lotka–
Volterra equations, which are a pair of simple nonlinear differential equations (e.g.,
see Edelstein-Keshet, 2005, Ch. 6).
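For reference, a standard generic form of the Lotka–Volterra system (the notation
here is ours; see the cited text for details) is

dH/dt = αH − βHL,
dL/dt = δHL − γL,

where H and L denote prey (hare) and predator (lynx) abundance, α is the prey growth
rate, β the predation rate, γ the predator death rate, and δ the rate at which consumed
prey convert into new predators.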
One of the classic studies of predator–prey interactions is the snowshoe hare and
lynx pelts purchased by the Hudson’s Bay Company of Canada. While this is an
indirect measure of predation, the assumption is that there is a direct relationship
between the number of pelts collected and the number of hare and lynx in the wild.
These predator–prey interactions often lead to cyclical patterns of predator and prey
abundance seen in Figure 1.6. Notice that the lynx and hare population sizes are
asymmetric in that they tend to increase slowly and decrease quickly.
Figure 1.6 Time series of the predator–prey interactions between the snowshoe hare and lynx
pelts purchased by the Hudson's Bay Company of Canada. It is assumed there is a direct
relationship between the number of pelts collected and the number of hare and lynx in the wild.

The lynx prey varies from small rodents to deer, with the snowshoe hare being
its overwhelmingly favored prey. In fact, lynx are so closely tied to the snowshoe
hare that its population rises and falls with that of the hare, even though other food
sources may be abundant. In this case, it seems reasonable to model the size of the
lynx population in terms of the snowshoe population. This idea is explored further in
Example 5.17.
Figure 1.6 may be reproduced as follows.
culer = c(rgb(.85,.30,.12,.6), rgb(.12,.67,.86,.6))
tsplot(Hare, col = culer[1], lwd=2, type="o", pch=0,
ylab=expression(Number~~~(""%*% 1000)))
lines(Lynx, col=culer[2], lwd=2, type="o", pch=2)
legend("topright", col=culer, lty=1, lwd=2, pch=c(0,2),
legend=c("Hare", "Lynx"), bty="n")
♦
Example 1.6. fMRI Imaging
Often, time series are observed under varying experimental conditions or treatment
configurations. Such a set of series is shown in Figure 1.7, where data are collected
from various locations in the brain via functional magnetic resonance imaging (fMRI).
In fMRI, subjects are put into an MRI scanner and a stimulus is applied for a
period of time, and then stopped. This on-off application of a stimulus is repeated
and recorded by measuring the blood oxygenation-level dependent (BOLD) signal
intensity, which measures areas of activation in the brain. The BOLD contrast results
from changing regional blood concentrations of oxy- and deoxyhemoglobin.
The data displayed in Figure 1.7 are from an experiment that used fMRI to
examine the effects of general anesthesia on pain perception by comparing results
from anesthetized volunteers while a supramaximal shock stimulus was applied. This
stimulus was used to simulate surgical incision without inflicting tissue damage. In
Figure 1.7 fMRI data from two locations in the cortex, the thalamus, and the cerebellum;
n = 128 points, one observation taken every 2 seconds. The boxed line represents the
presence or absence of the stimulus.
this example, the stimulus was applied for 32 seconds and then stopped for 32 seconds,
so that the signal period is 64 seconds. The sampling rate was one observation every
2 seconds for 256 seconds (n = 128).
Notice that the periodicities appear strongly in the motor cortex series but seem to
be missing in the thalamus and perhaps in the cerebellum. In this case, it is of interest
to statistically determine if the areas in the thalamus and cerebellum are actually
responding to the stimulus. Use the following R commands for the graphic:
par(mfrow=c(3,1))
culer = c(rgb(.12,.67,.85,.7), rgb(.67,.12,.85,.7))
u = rep(c(rep(.6,16), rep(-.6,16)), 4) # stimulus signal
tsplot(fmri1[,4], ylab="BOLD", xlab="", main="Cortex", col=culer[1],
ylim=c(-.6,.6), lwd=2)
lines(fmri1[,5], col=culer[2], lwd=2)
lines(u, type="s")
tsplot(fmri1[,6], ylab="BOLD", xlab="", main="Thalamus", col=culer[1],
ylim=c(-.6,.6), lwd=2)
lines(fmri1[,7], col=culer[2], lwd=2)
lines(u, type="s")
tsplot(fmri1[,8], ylab="BOLD", xlab="", main="Cerebellum",
col=culer[1], ylim=c(-.6,.6), lwd=2)
lines(fmri1[,9], col=culer[2], lwd=2)
lines(u, type="s")
mtext("Time (1 pt = 2 sec)", side=1, line=1.75)
♦
1.3 Time Series Models

The primary objective of time series analysis is to develop mathematical models that
provide plausible descriptions for sample data, like that encountered in the previous
section.
The fundamental visual characteristic distinguishing the different series shown in
Example 1.1 – Example 1.6 is their differing degrees of smoothness. A parsimonious
explanation for this smoothness is that adjacent points in time are correlated, so
the value of the series at time t, say, xt , depends in some way on the past values
xt−1 , xt−2 , . . .. This idea expresses a fundamental way in which we might think
about generating realistic looking time series.
Example 1.7. White Noise
A simple kind of generated series might be a collection of uncorrelated random
variables, w_t, with mean 0 and finite variance σ_w^2. The time series generated from
uncorrelated variables is used as a model for noise in engineering applications, where
it is called white noise; we shall sometimes denote this process as w_t ~ wn(0, σ_w^2).
The designation white originates from the analogy with white light (details in
Chapter 6). A special version of white noise that we use is when the variables are
independent and identically distributed normals, written w_t ~ iid N(0, σ_w^2).
The upper panel of Figure 1.8 shows a collection of 500 independent standard
normal random variables (σ_w^2 = 1), plotted in the order in which they were drawn. The
resulting series bears a resemblance to portions of the DJIA returns in Figure 1.3. ♦
If the stochastic behavior of all time series could be explained in terms of the
white noise model, classical statistical methods would suffice. Two ways of intro-
ducing serial correlation and more smoothness into time series models are given in
Example 1.8 and Example 1.9.
Example 1.8. Moving Averages, Smoothing and Filtering
We might replace the white noise series w_t by a moving average that smooths the
series. For example, consider replacing w_t in Example 1.7 by an average of its current
value and its immediate neighbors in the past and future. That is, let

v_t = (w_{t−1} + w_t + w_{t+1}) / 3,    (1.1)

which leads to the series shown in the lower panel of Figure 1.8. This series is much
smoother than the white noise series and has a smaller variance due to averaging.
It should also be apparent that averaging removes some of the high frequency (fast
oscillation) behavior of the noise.
Figure 1.8 Gaussian white noise series (top) and three-point moving average of the Gaussian
white noise series (bottom).
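A minimal sketch that produces a figure like Figure 1.8 (assuming astsa is loaded
for tsplot):

w = rnorm(500)                              # 500 N(0,1) white noise variates
v = filter(w, sides=2, filter=rep(1/3,3))   # the moving average (1.1)
par(mfrow=c(2,1))
tsplot(w, col=4, main="white noise")
tsplot(v, ylim=c(-3,3), col=4, main="moving average")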
Example 1.9. Autoregressions
Suppose we consider the white noise series w_t of Example 1.7 as input and calculate
the output using the second-order equation

x_t = 1.5 x_{t−1} − .75 x_{t−2} + w_t,    (1.2)

successively for t = 1, 2, . . . , 250. The resulting output series is shown in Figure 1.9.
Figure 1.9 The autoregression series generated from model (1.2).

Equation (1.2) represents a regression or prediction of the current value x_t of a
time series as a function of the past two values of the series, and, hence, the term
autoregression is suggested for this model. A problem with startup values exists here
because (1.2) also depends on the initial conditions x_0 and x_{−1}, but for now we set
them to zero. We can then generate data recursively by substituting into (1.2). That
is, given w_1, w_2, . . . , w_{250}, we could set x_{−1} = x_0 = 0 and then start at t = 1:

x_1 = 1.5 x_0 − .75 x_{−1} + w_1 = w_1
x_2 = 1.5 x_1 − .75 x_0 + w_2 = 1.5 w_1 + w_2
x_3 = 1.5 x_2 − .75 x_1 + w_3
x_4 = 1.5 x_3 − .75 x_2 + w_4
and so on. We note the approximate periodic behavior of the series, which is similar
to that displayed by the SOI and Recruitment in Figure 1.5 and some fMRI series
in Figure 1.7. This particular model is chosen so that the data have pseudo-cyclic
behavior of about 1 cycle every 12 points; thus 250 observations should contain
about 20 cycles. This autoregressive model and its generalizations can be used as an
underlying model for many observed series and will be studied in detail in Chapter 4.
One way to simulate and plot data from the model (1.2) in R is to use the following
commands. The initial conditions are set equal to zero so we let the filter run an extra
50 values to avoid startup problems.
set.seed(90210)
w = rnorm(250 + 50) # 50 extra to avoid startup problems
x = filter(w, filter=c(1.5,-.75), method="recursive")[-(1:50)]
tsplot(x, main="autoregression", col=4)
♦
Example 1.10. Random Walk with Drift
A model for analyzing a trend such as seen in the global temperature data in Figure 1.2,
is the random walk with drift model given by
x_t = δ + x_{t−1} + w_t    (1.3)
for t = 1, 2, . . . , with initial condition x_0 = 0, and where w_t is white noise; the
constant δ is called the drift, and when δ = 0 the model is simply a random walk.
Cumulatively summing (1.3) shows that the model may be rewritten as

x_t = δt + Σ_{j=1}^{t} w_j    (1.4)

for t = 1, 2, . . .; either use induction, or plug (1.4) into (1.3) to verify this statement.

Figure 1.10 Random walk, σ_w = 1, with drift δ = .3 (upper jagged line), without drift, δ = 0
(lower jagged line), and dashed lines showing the drifts.
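As a quick check on the role of the drift (a one-line derivation of our own): taking
expectations in (1.4) and using E(w_j) = 0 gives

E(x_t) = δt,

which is exactly the pair of dashed straight lines superimposed in Figure 1.10.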
Figure 1.10 shows 200 observations generated from the model with δ = 0 and .3,
and with standard normal noise. For comparison, we also superimposed the straight
lines δt on the graph. To reproduce Figure 1.10 in R use the following code (notice
the use of multiple commands per line using a semicolon).
set.seed(314159265) # so you can reproduce the results
w = rnorm(200); x = cumsum(w) # random walk
wd = w +.3; xd = cumsum(wd) # random walk with drift
tsplot(xd, ylim=c(-2,80), main="random walk", ylab="", col=4)
abline(a=0, b=.3, lty=2, col=4) # plot drift
lines(x, col="darkred")
abline(h=0, col="darkred", lty=2)
♦
Example 1.11. Signal Plus Noise
Many realistic models for generating time series assume an underlying signal with
some consistent periodic variation contaminated by noise. For example, it is easy to
detect the regular cycle fMRI series displayed on the top of Figure 1.7. Consider the
model
x_t = 2 cos(2π(t + 15)/50) + w_t    (1.5)
for t = 1, 2, . . . , 500, where the first term is regarded as the signal, shown in the
Figure 1.11 Cosine wave with period 50 points (top panel) compared with the cosine wave
contaminated with additive white Gaussian noise, σ_w = 1 (middle panel) and σ_w = 5 (bottom
panel); see (1.5).
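A minimal sketch that generates and plots the three panels of Figure 1.11, consistent
with (1.5) and the caption (again assuming astsa's tsplot; panel titles are ours):

cs = 2*cos(2*pi*(1:500 + 15)/50)   # the signal in (1.5)
w  = rnorm(500)                    # N(0,1) noise
par(mfrow=c(3,1))
tsplot(cs, col=4, main="signal")
tsplot(cs + w, col=4, main="signal plus noise (sd = 1)")
tsplot(cs + 5*w, col=4, main="signal plus noise (sd = 5)")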
upper panel of Figure 1.11. We note that a sinusoidal waveform can be written as
A cos(2πωt + φ), where A is the amplitude, ω is the frequency of oscillation, and φ
is a phase shift.

Problems

1.1. (a) Generate data from the autoregression

x_t = −.9 x_{t−2} + w_t

with σ_w = 1, using the method described in Example 1.9. Next, apply the moving
average filter

v_t = (x_t + x_{t−1} + x_{t−2} + x_{t−3})/4

to x_t, the data you generated. Now plot x_t as a line and superimpose v_t as a
dashed line.
(b) Repeat (a) but with

x_t = 2 cos(2πt/4) + w_t,

where w_t ~ iid N(0, 1).
(c) Repeat (a) but where xt is the log of the Johnson & Johnson data discussed in
Example 1.1.
(d) What is seasonal adjustment (you can do an internet search)?
(e) State your conclusions (in other words, what did you learn from this exercise).
1.2. There are a number of seismic recordings from earthquakes and from mining
explosions in astsa. All of the data are in the dataframe eqexp, but two specific
recordings are in EQ5 and EXP6, the fifth earthquake and the sixth explosion, respec-
tively. The data represent two phases or arrivals along the surface, denoted by P
(t = 1, . . . , 1024) and S (t = 1025, . . . , 2048), at a seismic recording station. The
recording instruments are in Scandinavia and monitor a Russian nuclear testing site.
The general problem of interest is distinguishing between these waveforms in order
to maintain a comprehensive nuclear test ban treaty.
To compare the earthquake and explosion signals,
(a) Plot the two series separately in a multifigure plot with two rows and one column.
(b) Plot the two series on the same graph using different colors or different line types.
(c) In what way are the earthquake and explosion series different?
1.3. In this problem, we explore the difference between random walk and moving
average models.
(a) Generate and (multifigure) plot nine series that are random walks (see Exam-
ple 1.10) of length n = 500 without drift (δ = 0) and σw = 1.
(b) Generate and (multifigure) plot nine series of length n = 500 that are moving
averages of the form (1.1) discussed in Example 1.8.
(c) Comment on the differences between the results of part (a) and part (b).
1.4. The data in gdp are the seasonally adjusted quarterly U.S. GDP from 1947-I to
2018-III. The growth rate is shown in Figure 1.4.
(a) Plot the data and compare it to one of the models discussed in Section 1.3.
(b) Reproduce Figure 1.4 using colors and plot characters (pch) of your own choice.
Then, comment on the difference between the two methods of calculating the
growth rate.
(c) Which of the models discussed in Section 1.3 best describe the behavior of the
growth in U.S. GDP?
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans-
actions on Automatic Control, 19(6):716–723.
Blackman, R. and Tukey, J. (1959). The Measurement of Power Spectra, from the
Point of View of Communications Engineering. Dover, New York.
Bloomfield, P. (2004). Fourier Analysis of Time Series: An Introduction. John
Wiley & Sons.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. J.
Econometrics, 31:307–327.
Bollerslev, T., Engle, R. F., and Nelson, D. B. (1994). ARCH models. Handbook of
Econometrics, 4:2959–3038.
Box, G. and Jenkins, G. (1970). Time Series Analysis, Forecasting, and Control.
Holden–Day.
Brockwell, P. J. and Davis, R. A. (2013). Time Series: Theory and Methods.
Springer Science & Business Media.
Chan, N. H. (2002). Time Series Applications to Finance. John Wiley & Sons, Inc.
Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scat-
terplots. Journal of the American Statistical Association, 74(368):829–836.
Cochrane, D. and Orcutt, G. H. (1949). Application of least squares regression to
relationships containing auto-correlated error terms. Journal of the American
Statistical Association, 44(245):32–61.
Cooley, J. W. and Tukey, J. W. (1965). An algorithm for the machine calculation of
complex Fourier series. Mathematics of Computation, 19(90):297–301.
Durbin, J. (1960). The fitting of time-series models. Revue de l’Institut International
de Statistique, pages 233–244.
Edelstein-Keshet, L. (2005). Mathematical Models in Biology. Society for Industrial
and Applied Mathematics, Philadelphia.
Efron, B. and Tibshirani, R. J. (1994). An Introduction to the Bootstrap. CRC Press.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of
the variance of United Kingdom inflation. Econometrica, 50:987–1007.
Granger, C. W. and Joyeux, R. (1980). An introduction to long-memory time series
models and fractional differencing. Journal of Time Series Analysis, 1(1):15–
29.
Grenander, U. and Rosenblatt, M. (2008). Statistical Analysis of Stationary Time
Series, volume 320. American Mathematical Soc.
Hansen, J. and Lebedeff, S. (1987). Global trends of measured surface air tempera-
ture. Journal of Geophysical Research: Atmospheres, 92(D11):13345–13372.
Hansen, J., Sato, M., Ruedy, R., Lo, K., Lea, D. W., and Medina-Elizade, M. (2006).
Global temperature change. Proceedings of the National Academy of Sciences,
103(39):14288–14293.
Hosking, J. R. (1981). Fractional differencing. Biometrika, 68(1):165–176.
Hurst, H. E. (1951). Long-term storage capacity of reservoirs. Trans. Amer. Soc.
Civil Eng., 116:770–799.
Hurvich, C. M. and Tsai, C.-L. (1989). Regression and time series model selection
in small samples. Biometrika, 76(2):297–307.
Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statistical Analysis.
Prentice Hall.
Kalman, R. E. (1960). A new approach to linear filtering and prediction problems.
Journal of Basic Engineering, 82(1):35–45.
Kalman, R. E. and Bucy, R. S. (1961). New results in linear filtering and prediction
theory. Journal of Basic Engineering, 83(1):95–108.
Kitchin, J. (1923). Cycles and trends in economic factors. The Review of Economic
Statistics, pages 10–16.
Levinson, N. (1947). A heuristic exposition of Wiener’s mathematical theory of
prediction and filtering. Journal of Mathematics and Physics, 26(1-4):110–119.
McLeod, A. I. and Hipel, K. W. (1978). Preservation of the rescaled adjusted
range: 1. A reassessment of the Hurst phenomenon. Water Resources Research,
14(3):491–508.
McQuarrie, A. D. and Tsai, C.-L. (1998). Regression and Time Series Model
Selection. World Scientific.
Parzen, E. (1983). Autoregressive Spectral Estimation. Handbook of Statistics,
3:221–247.
R Core Team (2018). R: A Language and Environment for Statistical Computing.
R Foundation for Statistical Computing, Vienna, Austria.
Schuster, A. (1898). On the investigation of hidden periodicities with application to a
supposed 26 day period of meteorological phenomena. Terrestrial Magnetism,
3(1):13–41.
Schuster, A. (1906). II. On the periodicities of sunspots. Phil. Trans. R. Soc. Lond.
A, 206(402-412):69–100.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics,
6(2):461–464.
Shephard, N. (1996). Statistical aspects of ARCH and stochastic volatility. Monographs
on Statistics and Applied Probability, 65:1–68.
Shewhart, W. A. (1931). Economic Control of Quality of Manufactured Product.
ASQ Quality Press.
Shumway, R., Azari, A., and Pawitan, Y. (1988). Modeling mortality fluctuations
in Los Angeles as functions of pollution and weather effects. Environmental
Research, 45(2):224–241.
Shumway, R. and Stoffer, D. (2017). Time Series Analysis and Its Applications:
With R Examples. Springer, New York, 4th edition.
Shumway, R. H. and Verosub, K. L. (1992). State space modeling of paleoclimatic
time series. In Proc. 5th Int. Meeting Stat. Climatol, pages 22–26.
Sugiura, N. (1978). Further analysts of the data by Akaike's information criterion
and the finite corrections. Communications in Statistics - Theory and Methods,
7(1):13–26.
Tong, H. (1983). Threshold Models in Non-linear Time Series Analysis. Springer-
Verlag, New York.
Tsay, R. S. (2005). Analysis of Financial Time Series, volume 543. John Wiley &
Sons.
Winters, P. R. (1960). Forecasting sales by exponentially weighted moving averages.
Management Science, 6(3):324–342.
Wold, H. (1954). Causality and econometrics. Econometrica: Journal of the
Econometric Society, pages 162–177.