Seasonality and Trend
Lack of stationarity may be caused by the presence of deterministic effects in the quantity
being observed. Monthly sales figures for a company which is expanding rapidly would be
expected to show a steady underlying increase, possibly linear or perhaps even
exponential. A company which sells greetings cards will find that the sales in some months
of the year will be much higher than in others. In both cases there is an underlying
deterministic pattern and some (possibly stationary) random variation on top of that. In
order to predict sales figures in future months it is necessary to extrapolate the
deterministic trends as well as to analyse the stationary random variation.
A further cause of non-stationarity may be that the process observed is the integrated
version of a more fundamental process; in these cases, differencing the observed time
series may produce a series which is more likely to be a realisation of some stationary
process.
It is worth pointing out that this list is not exhaustive. For example, the first two causes are just
specific cases of general deterministic behaviour. In theory, this could take many different forms,
but trends and cycles are the most likely to be met in practice.
The items on the list are not mutually exclusive either. Consider a simple random walk with probability 0.6 of stepping up and 0.4 of stepping down. This can be represented by the equation:

$$X_n = X_{n-1} + Z_n$$

where:

$$Z_n = \begin{cases} +1 & \text{with probability } 0.6 \\ -1 & \text{with probability } 0.4 \end{cases}$$
This process is I(1) since the first difference is stationary, but the process itself is not.
On the other hand, the process also has an increasing deterministic trend:

$$E[X_n] = E[Z_1 + Z_2 + \cdots + Z_n] = E[Z_1] + E[Z_2] + \cdots + E[Z_n] = 0.2n$$
We consider in the following sections how to detect and remove such causes of non-stationarity.
The sample ACF is an estimate of the ACF based on sample data. It is defined in Section 2.1.
Plotting the series will highlight any obvious trends in the mean and will show up any cyclic variation, either of which could form evidence of non-stationarity. This should always be the first step in any practical time series analysis. In R, the command ts.plot(x) plots the time series x.
For example, the code:

ts.plot(log(FTSE100$Close))
points(log(FTSE100$Close), cex = 0.4)

generates Figure 14.1, which shows the time series of the logs of 300 successive closing values of the FTSE100 index.
Figure 14.1: 300 successive closing values of the FTSE100 index, Jan 2017 – Mar 2018;
log-transformed
The corresponding sample ACF and sample PACF are produced using:
par(mfrow=c(1,2))
acf(log(FTSE100$Close))
pacf(log(FTSE100$Close))
Figure 14.2: Sample ACF and sample PACF of the log(FTSE100) data; the dotted lines indicate the cut-offs for significance if the data came from a white noise process.
The sample ACF should, in the case of a stationary time series, ultimately converge towards zero exponentially fast, as for an AR(1) process, where $\rho_s = \alpha^s$ for $s = 1, 2, 3, \ldots$
If the sample ACF decreases slowly but steadily from a value near 1, we would conclude that the data need to be differenced before fitting the model. If the sample ACF exhibits a periodic oscillation, however, it would be reasonable to conclude that there is some underlying periodic cause of the variation.
Figure 14.2 shows the sample ACF of a time series which is clearly non-stationary, as the values decrease roughly linearly rather than exponentially; differencing is therefore required before fitting a stationary model.
See, for example, the change in the ACF and PACF for the differenced data:
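# Sketch (assumed, not the original code): difference the logged series
# and re-examine its ACF and PACF
par(mfrow = c(1, 3))
dlogftse = diff(log(FTSE100$Close))
ts.plot(dlogftse)
acf(dlogftse)
pacf(dlogftse)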
Figure 14.3: Data plot, sample ACF and sample PACF of the differenced series ∇log(FTSE100).
The ACF for a stationary ARMA(p, q) process decays fairly rapidly. For example, we have seen that the ACF of a moving average process cuts off sharply, and the ACF of an AR(1) process decays exponentially: $\rho_k = \alpha^k$. In theory, this could still lead to a slow decay if the values of p and/or q were high (eg the autocorrelation of an MA(100) process wouldn't cut off until lag 100), but in practice the parameter values will be fairly small.
used then the resulting model may give a good fit to the sample data, but it is unlikely to be useful
for forecasting. A fairly slow decay of the sample autocorrelation function is therefore more likely
to be interpreted as an indication that the time series needs to be differenced before being
modelled.
If the sample autocorrelation function oscillates without decaying rapidly, as we will see in the
hotel example in Section 1.4, then we might conclude that there is an underlying deterministic
cycle. This would have to be removed before fitting a model to the residuals.
1.2 Least squares trend removal

A linear trend can be removed by fitting a model of the form:

$$x_t = a + bt + y_t$$

where a and b are constants and y is a zero-mean stationary process. The parameters a and b can be estimated by linear regression prior to fitting a stationary model to the residuals $y_t$.

The formulae for estimating a and b are given on page 24 of the Tables.
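As an illustration, a minimal sketch of least squares trend removal in R, assuming the observed series is stored in a numeric vector x (the variable names here are ours):

time_index = seq_along(x)    # time index 1, 2, ..., n
fit = lm(x ~ time_index)     # estimate a (intercept) and b (slope)
y = residuals(fit)           # detrended series, to be modelled as stationary
ts.plot(y)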
1.3 Differencing
Differencing may well be beneficial if the sample ACF decreases slowly from a value near 1, but it has useful effects in other instances as well. If, for instance, $x_t = a + bt + y_t$, then:

$$\nabla x_t = b + \nabla y_t$$

Differencing a series d times will make an I(d) series stationary. In addition, however, differencing once will also remove any linear trend, as above.
On the other hand, we could remove the linear trend by using linear regression, as in Section 1.2.
However, if the series is actually I(1) with a trend, then least squares regression will only remove
the trend. We will still be left with an I(1) process that is non-stationary.
For example, consider the simple random walk discussed earlier, which has probability 0.6 of
stepping up, and 0.4 of stepping down:
$$X_n = X_{n-1} + Z_n$$

where:

$$Z_n = \begin{cases} +1 & \text{with probability } 0.6 \\ -1 & \text{with probability } 0.4 \end{cases}$$
We have seen that this process has an increasing trend, with $E[X_n] = 0.2n$. If we let $Y_n = X_n - 0.2n$, then $E[Y_n] = 0$, so we have removed the trend. However, since:

$$Y_n = Y_{n-1} + Z_n - 0.2$$

we are still left with an I(1) process that needs to be differenced in order to be stationary.
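A short simulation makes this concrete; this sketch is ours, with an arbitrary seed and sample size:

set.seed(123)
n = 300
Z = sample(c(1, -1), n, replace = TRUE, prob = c(0.6, 0.4))
X = cumsum(Z)          # random walk with upward drift
Y = X - 0.2 * (1:n)    # trend removed, but still I(1)
par(mfrow = c(1, 3))
ts.plot(X)             # drifts upwards
ts.plot(Y)             # no trend, but still wanders: non-stationary
ts.plot(diff(X))       # first difference: stationary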
Example 1
Suppose that the time series x records the monthly average temperature in London. A
model of the form:
$$x_t = \mu + \theta_t + y_t \qquad (14.1)$$

might be applied, where $\mu$ is a constant, $\theta$ is a periodic function with period 12 and y is a stationary series. Then the seasonal difference of x, defined as $\nabla_{12} x_t = x_t - x_{t-12}$, satisfies:

$$\nabla_{12} x_t = y_t - y_{t-12}$$

since $\theta_t = \theta_{t-12}$; this is a stationary process.
We can then model the seasonal difference of x as a stationary process and reconstruct the
original process x itself afterwards.
We can use R to plot the time series data and to remove seasonal variation.
Figure 14.4 below is generated from the following lines in R, using the functions ts.plot, acf and pacf:
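# Sketch of the plotting commands (assumed; we take the temperature series
# to be the manston1$tmax data used with decompose() later in this chapter)
par(mfrow = c(1, 3))
ts.plot(manston1$tmax)
acf(manston1$tmax)
pacf(manston1$tmax)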
Figure 14.4: Data plot, sample ACF and PACF of temperature data.
Seasonal differencing ($\nabla_{12}$) seems to have removed the seasonal behaviour of the data. See Figure 14.5, generated from:
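# Sketch (assumed, as above): seasonal differencing at lag 12
par(mfrow = c(1, 3))
tmax12 = diff(manston1$tmax, lag = 12)
ts.plot(tmax12)
acf(tmax12)
pacf(tmax12)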
Example 2
The monthly inflation figures are obtained by seasonal differencing of the Retail Prices Index. If $x_t$ is the value of the RPI in month $t$, the annual inflation figure reported is:

$$\frac{x_t - x_{t-12}}{x_{t-12}} \times 100\%$$
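By way of illustration, a sketch of this calculation in R, assuming the RPI values are held in a numeric vector rpi in month order:

n = length(rpi)
inflation = 100 * (rpi[13:n] - rpi[1:(n - 12)]) / rpi[1:(n - 12)]
ts.plot(inflation)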
1.5 Method of moving averages

A linear filter is a transformation of a time series x (the input series) to create an output series y that satisfies:

$$y_t = \sum_{k=-\infty}^{\infty} a_k x_{t-k}$$
The collection of weights $\{a_k : k \in \mathbb{Z}\}$ forms a complete description of the filter. The objective of the filtering is to modify the input series to meet particular objectives, or to display specific features of the data. For example, an important problem in the analysis of economic time series is the detection, isolation and removal of deterministic trends.

In practice a filter $\{a_k : k \in \mathbb{Z}\}$ normally contains only a relatively small number of non-zero components.
A very simple example of a linear filter is the difference operator $\nabla = 1 - B$. Using this filter produces:

$$y_t = (1 - B)x_t = x_t - x_{t-1}$$
As a second example, suppose that the input series is a white noise process e, and the filter takes the form:

$$y_t = \sum_{k=0}^{q} \beta_k e_{t-k}$$

Then the output series y is an MA(q) process. As a third example, applying the filter with $a_0 = 1$ and $a_k = -\alpha_k$ for $k = 1, \ldots, p$ to an input series x that is AR(p) recovers the original white noise series:

$$y_t = x_t - \sum_{k=1}^{p} \alpha_k x_{t-k} = e_t$$
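Such filters can be applied in R with the built-in filter function. A minimal sketch with arbitrary illustrative weights (none of this is from the original text):

set.seed(1)
e = rnorm(200)                                    # white noise input
y1 = filter(e, c(1, 0.5, 0.3), sides = 1)         # MA-type filter applied to e
x = arima.sim(model = list(ar = 0.7), n = 200)    # an AR(1) input series
y2 = filter(x, c(1, -0.7), sides = 1)             # recovers the underlying white noise
ts.plot(y2)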
If x is a time series with seasonal effects with even period $d = 2h$, then we define a smoothed process y by:

$$y_t = \frac{1}{2h}\left(\tfrac{1}{2}x_{t-h} + x_{t-h+1} + \cdots + x_{t-1} + x_t + x_{t+1} + \cdots + x_{t+h-1} + \tfrac{1}{2}x_{t+h}\right)$$
This ensures that each period makes an equal contribution to y t .
For example, with quarterly data a yearly period will have $d = 4 = 2h$, so $h = 2$ and we have:

$$y_t = \frac{1}{4}\left(\tfrac{1}{2}x_{t-2} + x_{t-1} + x_t + x_{t+1} + \tfrac{1}{2}x_{t+2}\right)$$

ie the filter has weights:

$$a_k = \begin{cases} \tfrac{1}{8} & \text{for } k = -2, 2 \\ \tfrac{1}{4} & \text{for } k = -1, 0, 1 \\ 0 & \text{otherwise} \end{cases}$$
This is a centred moving average since the average is taken symmetrically around the time t .
Such a centred moving average introduces the practical problem that the average can only be
calculated in retrospect, ie there will be a natural delay.
The same can be done with odd periods $d = 2h + 1$, but here the end terms $x_{t-h}$ and $x_{t+h}$ do not need to be halved.
For example, with data every 4 months, a yearly period will have $d = 3 = 2h + 1$, so $h = 1$ and we have:

$$y_t = \frac{1}{3}\left(x_{t-1} + x_t + x_{t+1}\right)$$

ie:

$$a_k = \begin{cases} \tfrac{1}{3} & \text{for } k = -1, 0, 1 \\ 0 & \text{otherwise} \end{cases}$$
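Both centred moving averages can be computed with the filter function used above; a sketch, for a series x, with the weights derived above:

y4 = filter(x, c(1/8, 1/4, 1/4, 1/4, 1/8), sides = 2)  # even period d = 4
y3 = filter(x, rep(1/3, 3), sides = 2)                 # odd period d = 3

With sides = 2 the weights are applied symmetrically around time t, matching the centred averages above.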
As with most filtering techniques, care must be taken lest the smoothing of the data
obscure the very effects which the procedure is intended to uncover.
The method of taking moving averages is one example of a series of approaches known as filtering
techniques. We lose some of our knowledge about the variation in the data in exchange for
(hopefully) a clearer picture of the underlying process.
1.6 Method of seasonal means

Consider again a model of the form:

$$x_t = \mu + \theta_t + y_t$$

where $\theta$ is a periodic function with period 12 and y is a stationary series. The term $\theta_t$ contains the deviation of the model at time t due to the seasonal effect. So:

$$y_t = x_t - \mu - \theta_t$$
When fitting the model in Equation 14.1 to a monthly time series x extending over 10 years from January 1990, the estimate for $\mu$ is $\bar{x}$ and the estimate for $\theta_{January}$ is:

$$\hat{\theta}_{January} = \frac{1}{10}(x_1 + x_{13} + x_{25} + \cdots + x_{109}) - \hat{\mu}$$

So, in this case, we can remove the seasonal variation by deducting the January average, $\bar{x}_{January} = \frac{1}{10}(x_1 + x_{13} + x_{25} + \cdots + x_{109})$, from all the January values, deducting the February average, $\bar{x}_{February} = \frac{1}{10}(x_2 + x_{14} + x_{26} + \cdots + x_{110})$, from all the February values, etc.
Alternatively, we could deduct $\hat{\theta}_{January}$ from all the January values, $\hat{\theta}_{February}$ from all the February values, etc.
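A sketch of this in R, assuming x is a ts object with frequency 12 (the variable names are ours):

monthly_means = tapply(x, cycle(x), mean)    # average for each calendar month
x_deseason = x - monthly_means[cycle(x)]     # deduct the relevant monthly average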
In R the function decompose can be used to obtain both the moving average and seasonal
means described in Sections 1.5 and 1.6.
decomp=decompose(ts(manston1$tmax,frequency = 12),type="additive")
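Note that the lines commands below add to an existing plot, so the raw data would need to be plotted first, for example with (an assumed command, not shown in the original):

plot(as.vector(manston1$tmax), type = "l", xlab = "Month", ylab = "Maximum temperature")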
The moving average can be added (in red) using the code:
lines(as.vector(decomp$trend),col="red")
The sum of the seasonal component and the moving average trend can be added (in blue) as follows:
lines(as.vector(decomp$seasonal+decomp$trend),col="blue")
The resulting graph is shown below. A colour version is also available online in the tuition
materials for the R part of CS2.
Figure 14.6: Temperature data with its moving average trend (in red) and the sum of the seasonal component and moving average trend (in blue).
Variance-stabilising transformations
Transformations are most commonly used when a dependence is suspected between the variance of the residuals and the size of the fitted values. If, for example, the standard deviation of $X_{t+1} - X_t$ appears to be proportional to $X_t$, then it would be appropriate to apply the logarithmic transformation and work with the time series $Y_t = \ln X_t$.
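A quick visual check might look like the following sketch (assuming the raw series is in a vector x); this is consistent with the FTSE100 closing values being analysed on the log scale earlier:

par(mfrow = c(1, 2))
ts.plot(x)          # variability grows with the level of the series
ts.plot(log(x))     # variability roughly constant after the log transform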