THE EAST AFRICAN UNIVERSITY

SCHOOL OF EDUCATION

TIME SERIES ANALYSIS NOTES


UNIT 1 TREND COMPONENT ANALYSIS


Structure
13.1 Introduction
Objectives
13.2 Introduction to Time Series
Time Series with Trend Effect
Time Series with Seasonal Effect
Time Series with Cyclic Effect
13.3 Components of Time Series
Trend Component (T)
Seasonal Component (S)
Cyclic Component (C)
Irregular Component (I)
13.4 Basic Models of Time Series
Additive Model
Multiplicative Model
13.5 Smoothing or Filtering Time Series
Equal Weight (Simple) Moving Average (MA) Method
Weighted (Unequal) Moving Average Method
Exponential Smoothing
13.6 Estimation of Trends by Curve Fitting
Fitting a Linear Trend Equation
Fitting a Quadratic Trend Equation
Fitting the Exponential Trend Equation
13.7 Measurement of Trend Effect Using Centred Moving Average Method
13.8 Summary
13.9 Solutions/Answers

13.1 INTRODUCTION
In the previous block, we have discussed simple and multiple linear
regression analysis where we have dealt with bivariate as well as
multivariate data. While studying that block, you have learnt how useful
regression analysis is in decision making. If we carefully analyse most
decisions and actions of the Government, an institution, an industrial
organisation or an individual, we find that, to a large extent, these depend on
the situations expected to arise in future. For example, suppose the Delhi
Government wishes to frame a housing development policy for providing
houses to all families of the central government employees in Delhi over the
next five years. Then the Government would like to know: What would the
number of families of government employees in Delhi be in the next five
years? A similar assessment is required while formulating the employment
policy, and so on.
Planning for the future is an essential aspect of managing an organisation. This
requires that we should be able to forecast the future requirements of that
organisation. For example, suppose we are asked to provide quarterly
forecasts of the sales volume for a particular product during the coming one-year period. Our forecasts for sales would also affect production schedules,
raw material requirements, inventory policies, and sales quota. A good
forecast of the future requirements will result in good planning. A poor
forecast results in poor planning and may result in increased cost. In order to
provide such forecasts, we use historical data of the past few years to assess
the average requirement, trend (if any) over the years and seasonal
variations. Based on these features observed from the past data, we try to
understand their role in causing variability and use them for forecasting
requirements.
This exercise is done with the help of time series analysis. A time series is a
collection of observations made sequentially over a period of time. The
main objectives of time series analysis are description, explanation and
forecasting. It has applications in many fields including economics,
engineering, meteorology, etc.
In this unit, we discuss the concept of time series and explain different types
of time series in Sec. 13.2. In Secs. 13.3 and 13.4, we describe different
components and basic models of time series. We explain the methods of
smoothing and filtering the time series data along with the estimation of
trend by the curve fitting and curvilinear methods in Secs. 13.5 and 13.6.
Finally, in Sec. 13.7, we describe the methods of measurement of trend and
cyclic effect in time series data.
In the next unit, we shall discuss some methods for estimating the seasonal
component (S). We shall also discuss the method of estimating the trend
component from deseasonalised time series data.
Objectives
After studying this unit, you should be able to:
• explain the concept of time series;
• describe the components of time series;
• explain the basic models of time series;
• decompose a time series into its different components for further analysis;
• describe the trend component of the time series;
• describe different types of trends;
• explain various methods for smoothing time series and estimation of trends; and
• describe the centred moving average method of measuring the trend effect.

13.2 INTRODUCTION TO TIME SERIES


Generally it is seen that forecasting involves studying the behaviour of a
characteristic over time and examining data for any pattern. The forecasts
are made by assuming that, in future, the characteristic will continue to
behave according to the same pattern. The data gathered could be of sales
per day, units of productions per week, the running cost of a machine per
month, etc.
A time series (TS) is a collection of observations made sequentially over
a period of time. In other words, the data on any characteristic collected
6

Created in Master PDF Editor - Demo Version


Created in Master PDF Editor - Demo Version

with respect to time over a span of time is called a time series. Normally, we Trend Component Analysis
assume that observations are available at equal intervals of time, e.g., on an
hourly, daily, monthly or yearly basis. Some time series cover a period of
several years.
The methods of analysing time series constitute an important area of study
in statistics. But before we discuss time series analysis, we would like to
show you the plots of some time series from different fields. In the next
three sub-sections, we look at the plots of three time series, namely, time
series with trend effect, time series with seasonal effect and time series
with cyclic effect. These plots are called time plots.
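Such time plots are straightforward to produce in software. The following minimal sketch (not part of the original notes; it assumes the pandas and matplotlib libraries, and its data values are purely illustrative) draws a time plot of yearly data:

```python
# A minimal sketch of drawing a time plot; the values are illustrative only.
import pandas as pd
import matplotlib.pyplot as plt

years = list(range(2001, 2013))
values = [556, 662, 327, 494, 398, 704, 624, 473, 750, 343, 484, 545]  # hypothetical data

series = pd.Series(values, index=years)
series.plot(marker="o")            # plot values against time
plt.xlabel("Year")
plt.ylabel("Observed value")
plt.title("Time plot")
plt.show()
```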

13.2.1 Time Series with Trend Effect


A trend is a long term smooth variation (increase or decrease) in the time
series. When values in a time series are plotted in a graph and, on an
average, these values show an increasing or decreasing trend over a long
period of time, the time series is called the time series with trend effect.
You should note that not all time series show an increasing or decreasing
trend. In some cases, the values of the time series fluctuate around a
constant reading and do not show any trend with respect to time. You should
also remember that an increase or decrease may not necessarily be in the
same direction throughout the given period. Time series may show an
upward trend, a downward trend or have no trend at all. Let us explain all
three cases with the help of examples.

Time Series with Upward Trend


When values in a time series are plotted in a graph and these show an
upward trend with respect to time, we call it a time series with upward
trend. For example, the profit of a company plotted for the time period
1981 to 2012 in Fig. 13.1 shows an upward trend.


Fig. 13.1: Profit of a company from 1981 to 2012.

Time Series with Downward Trend


If data in a time series plotted in a graph show a downward trend with
respect to time, it is called the time series with downward trend. For
example, the values of mortality rate of a developing country from 1991 to
2009 show a downward trend in the time plot given in Fig. 13.2.



Fig. 13.2: Mortality rates of a developing country from 1991 to 2009.


Time Series with No Trend
If the data of a time series are plotted on a graph and show no trend, that is,
neither an upward nor a downward trend is reflected in the time plot, the
time series is called the time series with no trend. For example,
Fig. 13.3 shows the time plot of the production of a commodity in a factory
from 1988 to 2012. Notice that the time series shows no trend.


Fig. 13.3: Production of a commodity from 1988 to 2012.

13.2.2 Time Series with Seasonal Effect


If values in a time series reflect seasonal variation with respect to a given
period of time such as a quarter, a month or a year, the time series is called a
time series with seasonal effect. For example, the time plot of the data of
weekly sales of air coolers shows a seasonal effect (Fig. 13.4).

Fig. 13.4: Weekly sales of air coolers in successive weeks.


13.2.3 Time Series with Cyclic Effect
If the time plot of data in a time series exhibits a cyclic trend, the time series
is called a time series with cyclic effect. For example, time series data of
the number of employees in the software industry in different phases, i.e.,
phases of prosperity, recession, depression and recovery, shows a cyclic
pattern, that is, the pattern repeats itself over an almost fixed period of time
(Fig. 13.5).


Fig. 13.5: Number of employees in software industries for the last 25 years.
So far, you have learnt about different types of time series plots which
exhibit different trends in data. These trends arise due to the effect of
various factors on the variations in data. The variations in the values or data
are also described in terms of components of time series. Let us learn about
them.

13.3 COMPONENTS OF TIME SERIES


In Sec. 13.2, you have learnt that the variations in the time series values are
of different types and arise due to a variety of factors. These different types
of variations in the values of the data in a time series are also called
components of the time series.
In the past, time series analysis was mainly concerned with decomposing the
variations in a time series into components representing (i) Trend
(ii) Seasonal (iii) Cyclic and (iv) Remaining variations attributed to
Irregular fluctuations (sometimes referred to as the Random
component). This approach is not necessarily the best one and we shall
discuss the modern approach in later units of this block. Usually, some or all
components may be present in a time series in varying amounts and can be
classified in the above-mentioned four categories. For the sake of
completeness, we discuss these components in some detail.

13.3.1 Trend Component (T)


Usually time series data show random variation, but over a long period of
time, there may be a gradual shift in the mean level to a higher or a lower
level. This gradual shift in the level of time series is known as the trend. In
other words, the general tendency of values of the data to increase or
decrease during a long period of time is called the trend. Some time
series show an upward trend while some show a downward trend as you
have learnt in the previous section. For example, upward trends are seen in
the data of population growth, currency in circulation, etc., while data of
births and deaths, epidemics, etc. show downward trends. It is quite possible

that some time series may not show any trend at all. The shifting in level is
usually the result of changes in the population, demographic characteristics
of the population, technology, consumer preferences, purchasing power of
the population, and so on.
You should clearly understand that a trend is a general, smooth, long term
and average tendency of a time series data. The increase or decrease may
not necessarily be in the same direction throughout the given period. A time
series may show a linear or a nonlinear (curvilinear) trend. If the time series
data are plotted on a graph and the points on the graph cluster more or less
around a straight line, the tendency shown by the data is called linear trend
in time series. But if the points plotted on the graph do not cluster more or
less around a straight line, the tendency shown by the data is called
nonlinear or curvilinear trend. Trend need not always be a straight line. It
can be quadratic, exponential or may not be present at all.
Trend is also known as long term variation. However, do understand that
long term or long period of time is a relative term which cannot be defined
uniformly. In some situations, a period of one week may be fairly long
while in others, a period of 2 years may not be long enough.

13.3.2 Seasonal Component (S)


In a time series, variations which occur due to rhythmic or natural
forces/factors and operate in a regular and periodic manner over a span of
less than or equal to one year are termed as seasonal variations. Although
we generally think of seasonal movement in time series as occurring over
one year, it can also represent any regularly repeating pattern that is less
than one year in duration. For example, daily traffic volume data show
seasonal behaviour within the same day, with peak level occurring during
rush hours, moderate flow during the rest of the day, and light flow from
midnight to early morning. Thus, in a time series, seasonal variations may
exist if data are recorded on a quarterly, monthly, daily or hourly basis.
However, do remember this point: Although the data may be recorded
over a span of three months, one month, a week or a day, the total
period should be one year to assess seasonal variation properly. Note
that the amplitudes of the seasonal variation are different for different spans
of time over which data is recorded (quarterly, monthly, daily, etc.). Most
of the time series data in the fields of economics or business show a
seasonal pattern.
The seasonal pattern in a time series may be either due to natural forces or
manmade conventions. Variations in time series that arise due to changes in
seasons or weather conditions and climatic changes are known as seasonal
variations due to natural forces. For example, the sales of umbrellas, rain
coats and gumboots pick up very fast in the rainy season; the demand for air
conditioners goes up in summers; the sale of woollens goes up in winter. All
these arise due to natural factors. Variations in time series that arise due to
changes in habits, fashions, customs and conventions of people in any
society are termed as seasonal variations from manmade conventions.
For example, in our country, the sale of gold and silver ornaments goes up
in Diwali, Dussehra (Durga Puja) or the marriage season.

13.3.3 Cyclic Component (C)


Apart from seasonal effects, some time series exhibit variation in a fixed
period of time due to other physical causes. For example, economic data are
sometimes thought to be affected by business cycles with a period varying

from three to ten years (see Fig. 13.5). The cycles could be caused by a period
of moderate inflation followed by a period of high inflation. However, the
existence of such business cycles leads to some confusion about cyclic,
trend and seasonal effects. To avoid this confusion, we shall term a pattern
in the time series as cyclic component only when its duration is longer
than one year.
The cyclic variations in a time series are usually called “business cycle” and
comprise four phases of a business, namely, prosperity (boom), recession,
depression and recovery. These are normally over a span of seven to eleven
years. Thus, the oscillatory variations with a period of oscillation of
more than one year are called cyclic variations or the cyclic component
in a time series. One oscillation period is called one cycle.

13.3.4 Irregular Component (I)


The long term variations, i.e., the trend component and short term
variations, i.e., the seasonal and cyclic component are known as regular
variations. Apart from these regular variations, random or irregular
variations, which are not accounted for by trend, seasonal or cyclic
components, exist in almost all time series. In most cases, these irregular
variations are random, irregular and unpredictable and are caused by
short-term, unanticipated and non-recurring factors that affect time series in
some cases. In Unit 16, we shall model the irregular component using
probability models such as auto-regressive (AR) and moving average (MA)
models.
So far, you have learnt about four components of time series. These
components may be present individually or jointly in any time series.
We now take up an example of a time series, which has both trend and
seasonal components. Consider the quarterly sales data of washing machines
for the period 2001-2007 given in Table 1.
Table 1: Quarterly sales of washing machines for the period 2001-2007

Year   Quarter 1   Quarter 2   Quarter 3   Quarter 4
2001   556   662   327   494
2002   398   704   624   473
2003   750   343   484   545
2004   419   798   334   465
2005   468   554   744   443
2006   582   581   437   417
2007   618   571   517   754

Fig. 13.6 shows the plot of this quarterly data. Note that there are 28
quarters from the year 2001 to 2007 and so 28 values are plotted in
Fig. 13.6. These have been numbered from 1 onwards on the horizontal axis.
We have connected the data points by a dotted curve to obtain a time series
plot of the data. From the plot, we note that the values exhibit an upward
linear trend over the long term. We show this trend by a thin straight line in
Fig. 13.6. So the thin straight line in Fig. 13.6 reflects the presence of a
long-term linear trend. We also notice a seasonal variation in the data, which
we show by a smooth thick free-hand curve. This thick curve shows the
approximate movement around the straight trend line.


Fig. 13.6: Trend and seasonal variation in quarterly sales of washing machines.
In Fig. 13.6, in most years, the first quarter is a low point and then there is a
rise in the second quarter, a decline in the third quarter and a rise in the
fourth quarter. This could be due to changes in seasons and festival offers.

13.4 BASIC MODELS OF TIME SERIES


In Sec. 13.3, we have discussed different components of time series with
examples. You have learnt of many factors (natural and manmade) that
affect a time series. We would now like to describe the effect of these
factors and the components of a time series mathematically. In this section,
we discuss two commonly used mathematical models, which explain time
series data reasonably well. While discussing these models, we shall use the
notation yt for the value of the time series at time t. We shall use serial
numbering for time t since all time series are in chronological order. We
now describe the two basic time series models, namely, the additive model
and the multiplicative model.

13.4.1 Additive Model


The additive model is one of the most widely used models. It is based on
the assumption that at any time t, the time series value Yt is the sum of all
the components. According to the additive model, a time series can be
expressed as
Yt = Tt + Ct + St + It
where Tt, Ct, St and It are the trend, cyclic, seasonal and irregular variations,
respectively, at time t. In this model, we make the following assumptions:
• cyclic effects remain constant for all cycles;
• seasonal effects remain constant during any year or the corresponding period; this implies that the term St does not appear in a time series of annual data; and
• It is an i.i.d. normal variable with mean 0, i.e., the effect of irregular variation remains constant throughout.
The additive model implies that seasonal variations in different years, cyclic
variations in different cycles and irregular variations at different times
show equal absolute effects irrespective of the trend value. In the previous
sections, you have learnt that all four components need not necessarily be
exhibited in every time series. For example, the time series of annual

production data of a crop does not have seasonal variations. Similarly, a
time series for the annual rainfall does not contain cyclic variations.
In the additive model, we have assumed that the time series is the sum of the
trend, cyclic, seasonal and irregular components. Generally, the additive
model is appropriate when seasonal variations do not depend on the trend of
the time series. However, there are a number of situations where the
seasonal variations exhibit an increasing or decreasing trend over time. In
such cases, we use the multiplicative model.

13.4.2 Multiplicative Model


When seasonal variations exhibit any change over time in terms of an
increasing or decreasing trend, we can use the multiplicative model to
describe the time series data. The multiplicative model is appropriate if
various components in a time series operate proportionately to the general
level of the series. The multiplicative model is based on the assumption that
the time series value Yt at time t is the product of the trend, cyclic, seasonal
and irregular component of the series:
Yt = Tt × Ct × St × It
where Tt, Ct, St and It denote the trend, cyclic, seasonal and irregular
variations, respectively. The multiplicative model is found to be appropriate
for many business and economic data. Some examples are the time series
for production of electricity, time series for number of passengers opting for
air travel, time series for sales of soft drinks.
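As an aside, the classical decomposition corresponding to these two models is available in standard statistical software. The sketch below (an illustration, not part of the original notes; it assumes the pandas and statsmodels libraries) decomposes the quarterly sales of Table 1 under both models. In this classical procedure the estimated trend absorbs any cyclic movement, so only trend, seasonal and residual parts are returned.

```python
# A sketch of classical decomposition under the additive and
# multiplicative models, using the quarterly sales of Table 1.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

sales = [556, 662, 327, 494, 398, 704, 624, 473, 750, 343, 484, 545,
         419, 798, 334, 465, 468, 554, 744, 443, 582, 581, 437, 417,
         618, 571, 517, 754]                      # 2001 Q1 .. 2007 Q4
index = pd.period_range("2001Q1", periods=len(sales), freq="Q").to_timestamp()
y = pd.Series(sales, index=index)

additive = seasonal_decompose(y, model="additive", period=4)              # Yt = Tt + St + It
multiplicative = seasonal_decompose(y, model="multiplicative", period=4)  # Yt = Tt x St x It
print(additive.seasonal.head(4))          # estimated additive seasonal effects
print(multiplicative.seasonal.head(4))    # estimated seasonal indices
```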
For estimation of any trend in a time series, smoothing (or filtering) the
effect of irregular fluctuations present in it becomes important so that trend
and seasonal effects may be easily estimated. We now discuss the
smoothing or filtering of time series.

13.5 SMOOTHING OR FILTERING TIME SERIES


In this section, we discuss the methods of moving averages and exponential
smoothing for smoothing or filtering the time series data. There are two
methods of moving averages: the equal weight or simple moving average
method and the weighted (unequal) moving average method. We discuss
these two moving average methods in the next two sub-sections.

13.5.1 Equal Weight (Simple) Moving Average (MA) Method
In this method, we find the simple moving averages of time series data over
m periods of time, called m-period moving averages. You can calculate
them in the following way:
1. Calculate the average of the first m values of the time series.
2. Then discard the first value and take the average of the next m values
again.
3. Repeat this process till all data are exhausted.
These steps yield a new time series of m-period moving averages.
Let us explain this method with the help of an example.
Example 1: Compute the 3-year simple moving averages for the time series
of annual output of a factory for the period 1976-1981 given in Table 2.


Table 2: Annual output of a factory from 1976 to 1981

Year 1976 1977 1978 1979 1980 1981

Output (in thousands) 17 22 18 26 16 27

Solution: In this case, m = 3 years. The first value of the moving averages
for m = 3 years is the average of 17, 22 and 18, which is 19. The second
value of moving averages is obtained by discarding the first value, i.e., 17
and taking the average of the next 3 values in the time series, i.e., 22, 18 and
26. So we take the average of 22, 18 and 26, which is 22. Again, we discard
the first value, i.e., 22 and take the average of the next 3 values in the time
series, i.e., 18, 26 and 16. It is 20. We repeat this procedure for calculating
the remaining 3-year moving averages. Table 3 gives the 3-year simple
moving averages for the data given in Table 2. Note that each moving
average is tabulated at the average (centre) of the time period for which it is
computed. This method is, therefore, called the centred moving averages.
Table 3: Centred simple moving averages for time series data of Table 2

Year 1976 1977 1978 1979 1980 1981


Output (in thousands) 17 22 18 26 16 27
Moving Averages - 19 22 20 23 -

Remember that moving averages vary less than the data values from which
they are calculated as they smooth (or filter) out the effect of irregular
component. This helps us in appreciating the effect of trend more clearly.
For the given data the original time series varies between 16 and 27 whereas
the moving averages vary between 19 and 23, which is much smoother than
the original series. Fig. 13.7 shows the output of the original time series and
the 3-year moving averages series. In this figure, you can clearly see the
smoothing property of moving averages.


Fig. 13.7: Three-year centred moving average of the time series.
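The moving average calculation of Example 1 can be reproduced with a few lines of code. Here is a minimal sketch in plain Python (an illustration, not part of the original notes):

```python
# A sketch of the m-period simple moving average of Example 1.
output = [17, 22, 18, 26, 16, 27]       # Table 2: output in thousands, 1976-1981
m = 3

moving_averages = [
    sum(output[i:i + m]) / m            # average of m consecutive values
    for i in range(len(output) - m + 1)
]
print(moving_averages)                  # [19.0, 22.0, 20.0, 23.0], as in Table 3
```

Each average is recorded against the middle year of its window (1977 to 1980), which is exactly the centring described above.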

You may ask: What should the value of m be? If m is increased, the series
becomes much smoother and it may also smooth out the effect of cyclical
and seasonal components, which are our main interest of study. Sometimes
3-year, 5-year or 7-year moving averages are used to expose the combined
trend and cyclical movement of time series. But as we shall see in Sec. 13.7,
four-quarter or 12-month moving averages are more useful for estimating
trend, seasonal and cyclical movements of the series.


13.5.2 Weighted (Unequal) Moving Average Method
The simple moving average method described in Sec.13.5.1 is not generally
recommended for measuring trend although it can be useful for removing
seasonal variation. It may also not lie close to the latest values. Therefore,
the weighted (unequal) moving averages method is used. In this method,
instead of giving equal weights to all values, unequal weights are given in
such a way that all the weights are positive and their sum is equal to 1. If wi
denotes the weight of the ith observation, the weighted moving average
value yt is given by
$$y_t = \sum_{i=-q}^{q} w_i\, x_{t+i}, \qquad w_i \ge 0, \quad \sum_{i=-q}^{q} w_i = 1 \qquad \text{…(1)}$$

where xt is the original series.

The simple moving average becomes a particular case of the weighted moving average for m = 2q + 1 and wi = 1/(2q + 1). In Example 1, q = 1, m = 2q + 1 = 3 and wi = 1/(2q + 1) = 1/3.
Let us consider an example to illustrate this method.
Example 2: Compute the 3-years weighted moving averages for the time
series given in Example 1.
Solution: Generally, the most recent observation receives the largest weight
and the weights decrease for older data values. For the data given in
Example 1, suppose we give 3 times more weight to the most recent
observation than the first observation and 2 times more weight to the second
observation. Then the weights for each year in every 3-year period are:
w1=1/6, w2=2/6, w3=3/6
After assigning the weights as above, we get the smoothened trend values as
yt = (xt-1 + 2xt + 3xt+1) / 6 … (2)
The procedure for getting the smoothened trend values by weighted moving
average method is as follows:
1. Calculate the weighted average of the first m values given in time series
as explained above.
2. Now discard the first value and include the next one and take the average of
the next m values by following the weight structure given in equation (2).
3. Repeat this process till all values are exhausted.
These steps yield a new time series of m-period weighted moving
averages and the weighted moving average values are given in the following
table:
Table 4: Weighted moving averages for time series data of Table 2

Year 1976 1977 1978 1979 1980 1981


Output (in thousands) 17 22 18 26 16 27
Weighted Moving Averages - 19.16 22.66 19.66 23.16 -
The reason for giving larger weight to the latest observation rather than the
earlier observations is that the latest observation is a better predictor of the
trend than the earlier one. However, this may not be true for the cases in
which the data contains a very large irregular component.


Fig. 13.8: Three-year weighted moving average of the time series.
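The weighted version differs only in the weights applied to each window. A minimal sketch in plain Python (illustrative only; the values agree with Table 4 up to rounding):

```python
# A sketch of the 3-year weighted moving average of Example 2,
# with weights (1/6, 2/6, 3/6): the most recent value weighs most.
output = [17, 22, 18, 26, 16, 27]
weights = [1/6, 2/6, 3/6]

weighted_mas = [
    sum(w * x for w, x in zip(weights, output[i:i + 3]))
    for i in range(len(output) - 2)
]
print([round(v, 2) for v in weighted_mas])   # [19.17, 22.67, 19.67, 23.17]
```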

13.5.3 Exponential Smoothing


We now discuss another technique of smoothing the time series, namely, the
exponential smoothing technique, which is very popular. In Secs. 13.5.1
and 13.5.2, you have learnt that in the method of moving averages, weights
(equal or unequal) are attached to each time series value that is considered.
But in this method, the weights assigned to current as well as past time
series values are different fixed positive numbers.
In this smoothing technique, the weights on past observations decrease exponentially (except for the initial value, which takes up the remaining weight), and the smoothed value is given by

$$y'_t = w\,y_t + (1 - w)\,y'_{t-1}, \qquad t = 1, 2, \ldots \qquad \text{…(3)}$$

where 0 < w < 1. The value of w is chosen as per the requirements. Expanding equation (3) shows that y′t is a weighted average of y1, y2, …, yt. You can see that the latest observation yt gets the maximum weight w and the weights on older observations decrease exponentially. Note that the sum of the weights is equal to one. This is a very popular technique for
forecasting purposes. Let us take up an example to explain this method.
Example 3: Find the forecast for the time series given in Example 1.
Solution: On plotting the given data, it appears that there is no trend in the
time series.
This is the same as saying that if T = a + bt is the trend then b = 0.
Therefore, if we fit a straight line to the given data with b = 0, the least
squares estimate of a would be the simple average of the time series
values, i.e.,

$$\hat{a} = \frac{\sum y_t}{6} = \frac{126}{6} = 21$$
In this way, the forecast for output for all t would be â as the model is
T = a + (forecast error)


Fig. 13.9: Original time series values output (in thousands).


The result seems somewhat unreasonable because even though there is no
trend, the time series values are not the same at all times t. The values
seem to be higher in the years 1977, 1979 and 1981 than in the years
1976, 1978 and 1980. Therefore, it is logical to assume that the value of â is
gradually changing over time and should be denoted by ât rather than â.
According to this smoothing technique, we select a single weight w which is
called exponential smoothing constant, where w lies between 0 and 1 and
there is a method of choosing this constant. We can compute an exponential
smoothing series yt as follows:
$$\begin{aligned} y'_1 &= w y_1 + (1 - w)\,y_1 = y_1 \\ y'_2 &= w y_2 + (1 - w)\,y'_1 \\ y'_3 &= w y_3 + (1 - w)\,y'_2 \\ &\;\;\vdots \\ y'_t &= w y_t + (1 - w)\,y'_{t-1} \end{aligned}$$

We need the starting value y′1 to begin the computation of y′2, y′3, …. To start with, let us take w = 0.01. Then from equation (3), we get

y′1 = 0.01 y1 + 0.99 y1 = 0.01(17) + 0.99(17) = 17

Our forecast for y1 at t = 0 is y′1. Therefore, the forecast error is

e1 = y1 − y′1 = 17 − 17 = 0.00

We get the second forecast value as follows:

y′2 = 0.01 y2 + 0.99 y′1 = 0.01(22) + 0.99(17) = 17.05

The forecast error at t = 2 is

e2 = y2 − y′2 = 22 − 17.05 = 4.95
Proceeding in this way, we calculate the forecast value y′t and forecast error
et for all t. These are calculated and given in the following table:
Table 5: Forecast values and forecast errors for time series given in Table 2
Year   Output (in thousands) yt   Forecast y′t   Forecast error et
1976 17 17.00 +0.00
1977 22 17.05 +4.95
1978 18 17.06 +0.94
1979 26 17.15 +8.85
1980 16 17.14 −1.14
1981 27 17.24 +9.76

After calculating the forecasts y′t and errors et for all t, we plot the forecast
values with the original time series values (Fig. 13.10). The graph of time
series shows that there are no peaks in the time series after smoothing.

Fig. 13.10: Original and smoothed time series values of output (in thousands).
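The recursion of equation (3) is equally short in code. The following sketch in plain Python reproduces the forecasts and errors of Table 5, using w = 0.01 and the initialisation y′1 = y1 adopted above:

```python
# A sketch of the exponential smoothing recursion of equation (3).
output = [17, 22, 18, 26, 16, 27]
w = 0.01

smoothed = [output[0]]                   # y'_1 = y_1 by the chosen initialisation
for y in output[1:]:
    smoothed.append(w * y + (1 - w) * smoothed[-1])

errors = [y - s for y, s in zip(output, smoothed)]
print([round(s, 2) for s in smoothed])   # [17, 17.05, 17.06, 17.15, 17.14, 17.24]
print([round(e, 2) for e in errors])     # [0.0, 4.95, 0.94, 8.85, -1.14, 9.76]
```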
You should try to solve a few problems to check your understanding of the
concepts discussed so far.
E1) Calculate the last two values of 3-year moving average for the data
given in Example 1.
E2) Use exponential smoothing to obtain filtered values for the data given
in Example 1 taking w = 0.5 and compare them with simple moving
average values obtained in the same example.
E3) Obtain filtered values for the following data using exponential smoothing:
Year Rainfall (in cm)   Year Rainfall (in cm)   Year Rainfall (in cm)   Year Rainfall (in cm)
1970 664 1981 548 1991 624 2001 468
1971 728 1982 417 1992 473 2002 554
1972 447 1983 387 1993 750 2003 744
1973 663 1984 590 1994 343 2004 943
1974 630 1985 556 1995 484 2005 582
1975 451 1986 292 1996 545 2006 581
1976 617 1987 327 1997 419 2007 437
1977 734 1988 494 1998 798 2008 417
1978 491 1989 448 1999 334 2009 617
1979 520 1990 704 2000 465 2010 571
1980 280


13.6 ESTIMATION OF TRENDS BY CURVE FITTING
An alternative approach to smoothing is to fit a polynomial to the data. This
treats smoothing as a regression problem in which yt is the trend value and
integral powers of time t are the explanatory variables. The resulting smooth
function is a polynomial

$$y_t = \sum_{j=0}^{p} b_j t^j \qquad \text{…(4)}$$

where bj is the jth coefficient of the polynomial of degree p. The coefficients of the polynomial are estimated by the method of least squares by minimising the quantity

$$E = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{p} b_j t_i^j \right)^2 \qquad \text{…(5)}$$

We shall not discuss it here in detail as you have already studied the method
of least squares and curve fitting in Unit 5 of MST-002. In the next section,
we shall discuss the case when p = 1, i.e., the case of a linear trend equation.

13.6.1 Fitting a Linear Trend Equation


The trend equation given in equation (4) would be the special case of linear
equation for p =1 and we get the equation of the straight line as
yt = b0 + b1t …(6)

We can write it as

yt = b0 + b1 t̄ + b1(t − t̄) …(7)

To simplify the calculations, we take xt = t − t̄, where t̄ is the mean of all times t. Equation (6) then becomes

yt = a1 + b1xt, where a1 = b0 + b1 t̄ …(8)

Using the method of least squares for minimising the error term, we obtain the following normal equations:

$$\sum_{t=1}^{n} y_t = n a_1 + b_1 \sum_{t=1}^{n} x_t$$
$$\sum_{t=1}^{n} y_t x_t = a_1 \sum_{t=1}^{n} x_t + b_1 \sum_{t=1}^{n} x_t^2 \qquad \text{…(9a)}$$

On solving the above equations (note that Σxt = 0 by construction), the least squares estimates of a1 and b1 are given by â1 and b̂1 as:

$$\hat{a}_1 = \bar{y} \quad \text{and} \quad \hat{b}_1 = \frac{\sum_{t=1}^{n} y_t x_t}{\sum_{t=1}^{n} x_t^2} \qquad \text{…(9b)}$$

The estimate of b0 is given by b̂0:

$$\hat{b}_0 = \bar{y} - \hat{b}_1 \bar{t} \qquad \text{…(10)}$$

The fitted trend line is given by

$$\hat{y}_t = \hat{b}_0 + \hat{b}_1 t = \bar{y} + \hat{b}_1 (t - \bar{t}) \qquad \text{…(11)}$$
Let us explain this concept further with the help of an example.

Example 6: Fit a straight line trend for the data of annual profit of a
company given below:

Year 2003 2004 2005 2006 2007 2008 2009 2010 2011

Profit (in crores) 93 102.8 126.7 103.5 105.7 133.2 156.7 175.7 161.6

Solution: From the given data, we obtain linear trend values for the annual
profit as follows:
Table 6: Trend values for the given time series

Year   Profit (in crores)   xt = t − t̄   xtyt   xt²   Trend Values

2003 93.0 –4 –372.0 16 89.915


2004 102.8 –3 –308.4 9 99.828
2005 126.7 –2 –253.4 4 109.351
2006 103.5 –1 –103.5 1 119.054
2007 105.7 0 0 0 128.767
2008 133.2 1 133.2 1 138.480
2009 156.7 2 313.4 4 148.193
2010 175.7 3 527.1 9 157.906
2011 161.6 4 646.4 16 167.619
Σyt = 1158.9   Σxtyt = 582.8   Σxt² = 60

From the above table, we calculate the least squares estimates as follows:

$$\hat{a}_1 = \bar{y} = \frac{1158.9}{9} = 128.76 \quad \text{and} \quad \hat{b}_1 = \frac{\sum y_t x_t}{\sum x_t^2} = \frac{582.8}{60} = 9.71$$

The equation of the fitted trend line is given by equation (11) as

ŷt = b̂0 + b̂1t = ȳ + b̂1(t − t̄) = 128.76 + 9.71 (t − 2007)

The trend projection for 2013 is

Y2013 = 128.76 + 9.71 (2013 − 2007) = 187.02

These points are used to plot the trend line along with the data and projected
value for 2013 (Fig. 13.11).



Fig. 13.11: Computed trend line for the data of profit of a company.
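For reference, the least squares fit of Example 6 can be checked with the numpy library (a sketch, not part of the original notes); numpy's polyfit solves the same normal equations as (9a):

```python
# A sketch checking the linear trend fit of Example 6 with numpy.
import numpy as np

years = np.arange(2003, 2012)
profit = np.array([93, 102.8, 126.7, 103.5, 105.7, 133.2, 156.7, 175.7, 161.6])

x = years - years.mean()               # x_t = t - t-bar, so that sum(x_t) = 0
b1, a1 = np.polyfit(x, profit, deg=1)  # slope b1-hat and intercept a1-hat = y-bar
print(round(a1, 2), round(b1, 2))      # 128.77 9.71

forecast_2013 = a1 + b1 * (2013 - years.mean())
print(round(forecast_2013, 2))         # ~187.05 (187.02 with the rounded coefficients)
```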

The projections are forecasts of future trend values, but they do not take into
account the cyclical effect. Sometimes cyclical effects are confused with
trend curves represented by higher degree polynomials.

13.6.2 Fitting a Quadratic Trend Equation


Sometimes trend is not linear and shows some curvature. The simplest
curvilinear form is a second degree polynomial which can be obtained by
taking p = 2 in equation (4).
yt = b0 + b1t + b2t² …(12)

Proceeding in the same way as in Sec. 13.6.1 for the linear trend equation, the normal equations for estimating b0, b1 and b2 are given by

$$\sum y_t = n b_0 + b_1 \sum x_t + b_2 \sum x_t^2$$
$$\sum x_t y_t = b_0 \sum x_t + b_1 \sum x_t^2 + b_2 \sum x_t^3 \qquad \text{…(13)}$$
$$\sum x_t^2 y_t = b_0 \sum x_t^2 + b_1 \sum x_t^3 + b_2 \sum x_t^4$$

The values of Σyt, Σxt, Σxtyt, Σxt²yt, Σxt², Σxt³ and Σxt⁴ are obtained from the given data and the normal equations given in equation (13) can now be solved for the optimum values b̂0, b̂1 and b̂2. With these values, equation (12) gives the desired quadratic trend.
To illustrate this, let us take an example of fitting a quadratic trend for the
data of gross revenue of a company.
Example 7: Fit a quadratic trend equation for the data given below:
Year 1991 1992 1993 1994 1995 1996 1997 1998 1999
Gross Revenue (in Lakhs) 240 167 140 120 124 128 142 176 207
Year 2000 2001 2002 2003 2004 2005 2006 2007
Gross Revenue (in Lakhs) 304 338 397 439 481 577 711 778

Solution: Let us take the simplest curvilinear form of a second degree polynomial given as

yt = b0 + b1t + b2t² … (i)

Proceeding in the same way as in Sec. 13.6.1, the normal equations for
estimating b0, b1 and b2 are given by

$$\sum y_t = n b_0 + b_1 \sum x_t + b_2 \sum x_t^2$$
$$\sum x_t y_t = b_0 \sum x_t + b_1 \sum x_t^2 + b_2 \sum x_t^3 \qquad \text{… (ii)}$$
$$\sum x_t^2 y_t = b_0 \sum x_t^2 + b_1 \sum x_t^3 + b_2 \sum x_t^4$$

The values of Σyt, Σxt, Σxtyt, Σxt²yt, Σxt², Σxt³ and Σxt⁴ are
obtained from the given data as follows:
Table 7: Calculations for fitting of quadratic trend

Year   yt   xt = t−1999   xtyt   xt²yt   xt²   xt³   xt⁴
1991 240 −8 −1920 15360 64 −512 4096
1992 167 −7 −1169 8183 49 −343 2401
1993 140 −6 −840 5040 36 −216 1296
1994 120 −5 −600 3000 25 −125 625
1995 124 −4 −496 1984 16 −64 256
1996 128 −3 −384 1152 9 −27 81
1997 142 −2 −284 568 4 −8 16
1998 176 −1 −176 176 1 −1 1
1999 207 0 0 0 0 0 0
2000 304 1 304 304 1 1 1
2001 338 2 676 1352 4 8 16
2002 397 3 1191 3573 9 27 81
2003 439 4 1756 7024 16 64 256
2004 481 5 2405 12025 25 125 625
2005 577 6 3462 20772 36 216 1296
2006 711 7 4977 34839 49 343 2401
2007 778 8 6224 49792 64 512 4096
Total 5469 0 15126 165144 408 0 17544

Putting the values from the above table in the normal equations given in
equation (ii), we get

17 b0 + 0 b1 + 408 b2 = 5469
0 b0 + 408 b1 + 0 b2 = 15126
408 b0 + 0 b1 + 17544 b2 = 165144

On solving the normal equations, we get the least squares estimates: b̂0 = 216.79, b̂1 = 37.07 and b̂2 = 4.3715.

With these values, we get the desired quadratic trend:

ŷt = 216.79 + 37.07 xt + 4.3715 xt², where xt = t − 1999

Fig. 13.12 gives a plot of the above quadratic equation that fits the data (shown by a continuous line) along with the actual values of the gross revenue of the company (shown by dots). The values of time t in the plot are coded year values (t = 1, 2, ..., 17, i.e., t = xt + 9); in terms of these coded values the fitted model becomes ŷt = 237.22 − 41.614 t + 4.3715 t², which is the form shown in Fig. 13.12.


Fig. 13.12: Plot of the quadratic trend model fitted to the data of Example 7.
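The quadratic fit of Example 7 can be checked the same way (again a numpy sketch, not part of the original notes); polyfit with deg = 2 solves the normal equations (ii) directly:

```python
# A sketch checking the quadratic trend fit of Example 7 with numpy.
import numpy as np

revenue = np.array([240, 167, 140, 120, 124, 128, 142, 176, 207,
                    304, 338, 397, 439, 481, 577, 711, 778])
x = np.arange(-8, 9)                   # x_t = t - 1999 for t = 1991, ..., 2007

b2, b1, b0 = np.polyfit(x, revenue, deg=2)        # coefficients, highest degree first
print(round(b0, 2), round(b1, 2), round(b2, 4))   # 216.79 37.07 4.3715
```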

13.6.3 Fitting the Exponential Trend Equation


Sometimes data show that compound annual growth rate is constant over
time rather than increasing annually as in the case of linear model. This can
be represented by an exponential model
$y_t = \alpha_0 \alpha_1^t$

You can see that (α1 − 1) × 100% is the annual compound growth rate (in %), which remains constant. As far as fitting is concerned, we transform this model to a linear trend model by taking the logarithm of yt:

$\log y_t = \log \alpha_0 + t \log \alpha_1$

or $Y_t = \beta_0 + \beta_1 t$

where Yt = log yt, β0 = log α0 and β1 = log α1.

Now, we can fit the model with Yt and t as described in Sec. 13.6.1. Once we know the estimates of β0 and β1, i.e., β̂0 and β̂1, we can obtain the estimates of α0 and α1, i.e., the values of α̂0 and α̂1, by taking the anti-logarithms of β̂0 and β̂1, respectively. When we fit the exponential model to the data given in Example 7 using common (base 10) logarithms, we get the values β̂0 = 2.57 and β̂1 = 0.0494, and the estimated exponential trend equation is obtained as:

$\hat{y}_t = 371.535\,(1.1205)^t$
The fitted and raw data are plotted in Fig. 13.13.


Fig. 13.13: Plot of fitted exponential trend model to the data of Example 7.

An exponential model can also be represented by

$y_t = \alpha_0 \exp(\beta_1 t)$

As far as fitting is concerned, we transform this model to a linear trend model by taking the natural logarithm of yt:

$\log_e y_t = \log_e \alpha_0 + \beta_1 t$

or $Y_t = \beta_0 + \beta_1 t$

where Yt = loge yt and β0 = loge α0.
On comparing Fig. 13.12 with Fig. 13.13, we find that the quadratic model
is a better choice.
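For completeness, the log-linear fitting described in this sub-section is sketched below with numpy (illustrative only; the fitted coefficients depend on the log base chosen and on how the time variable t is coded, so they need not reproduce the rounded values quoted in the example):

```python
# A sketch of fitting the exponential trend by a straight line in log10(y).
import numpy as np

revenue = np.array([240, 167, 140, 120, 124, 128, 142, 176, 207,
                    304, 338, 397, 439, 481, 577, 711, 778])
t = np.arange(1, 18)                       # coded years t = 1, ..., 17

beta1, beta0 = np.polyfit(t, np.log10(revenue), deg=1)   # Yt = beta0 + beta1*t
alpha0, alpha1 = 10 ** beta0, 10 ** beta1                # anti-logs of the estimates
print(alpha0, alpha1)                      # fitted y_t = alpha0 * alpha1**t
```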
You may now like to solve the following exercise.

E4) Use the data given in Example 1 to fit a linear trend line and obtain
the projection for the year 2012.

13.7 MEASUREMENT OF TREND EFFECT USING CENTRED MOVING AVERAGE METHOD
In Sec. 13.3, you have learnt that time series data usually have four
components: Trend (T), Cyclical (C), Seasonal (S) and Irregular (I).
Suppose a set of data are recorded on a monthly basis and there is a seasonal
effect of one year. This means that after twelve months the data behaves in
the same way as it did twelve months ago. This is called the period of
seasonal effect. If we take a centred moving average with m = 12, then it
will smooth out (eliminate) the effect of season. If data have been recorded
on a quarterly basis and there is a seasonal effect with a period of twelve
months, then a moving average with m = 4 will smooth out the seasonal
effect. Not only this, it will also smooth out the effect of the irregular
component I. Thus, the time series values of centred moving average will be
nearly free from the effects of S and I. When the effects of S and I are
removed, we are left with the effect of trend (T) and the cyclic effect (C).
In the next unit, we shall discuss the method of estimating seasonal effect (S)
by making use of the estimates of trend (T) and cycle (C).


Example 8: Compute the trend values for the given data for quarterly sales
of washing machines by an appliance manufacturer for the period
2001-2009.
Year Quarter 1 Quarter 2 Quarter 3 Quarter 4
2001 935 1215 1045 1455
2002 990 1315 1350 1485
2003 1370 1815 1470 1680
2004 1660 1365 1205 1445
2005 1030 1475 1195 1585
2006 1185 1330 1500 2145
2007 1410 2120 1915 2390
2008 1875 2145 1965 2800
2009 1865 2115 1935 2165

Solution: We shall use the data of four years 2001-2004 from the given data
to explain the calculations of moving averages, which are estimates of T and
C. The given time series data is shown in Fig. 13.14.


Fig. 13.14: Time series plot for quarterly sales of washing machines from 2001-2009.

We can see from the plot (Fig. 13.14) that the series has a very strong
12-monthly (i.e., four-quarter) seasonal effect. Hence, to remove the
seasonal effect, we have to take moving averages with m = 4.
Table 8: Calculation of centred moving averages

Year  Quarter  Sales (in Hundreds)  Centred MA(1)  Centred MA(2)
2001     1        935
2001     2       1215
                                      1162.50
2001     3       1045                               1169.375
                                      1176.25
2001     4       1455                               1188.75
                                      1201.25
2002     1        990                               1239.375
                                      1277.50
2002     2       1315                               1281.25
                                      1285.00
2002     3       1350                               1332.50
                                      1380.00
2002     4       1485                               1442.50
                                      1505.00
2003     1       1370                               1520.00
                                      1535.00
2003     2       1815                               1559.375
                                      1583.75
2003     3       1470                               1620.00
                                      1656.25
2003     4       1680                               1600.00
                                      1543.75
2004     1       1660                               1510.625
                                      1477.50
2004     2       1365                               1448.125
                                      1418.75
2004     3       1205                                  -
2004     4       1445                                  -

The values in MA(1) are the moving averages for m = 4 but they do not
correspond to any of the given four quarters as the average of 1, 2, 3 and 4 is
2.5. Hence, to make it correspond to a quarter, we usually calculate moving
averages with m = 2 on MA(1) so that the centred values correspond to one
of the four quarters. MA(2) gives the moving average of MA(1) series with
m = 2 so that the values correspond to quarters 3, 4, 1, 2, etc.
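The MA(1)/MA(2) construction of Table 8 can be reproduced with the pandas library, as in the sketch below (an illustration, not part of the original notes; it uses the 2001-2004 values of Example 8):

```python
# A sketch of the centred moving average construction of Table 8:
# a 4-quarter moving average MA(1), then a 2-term average MA(2) to centre it.
import pandas as pd

sales = [935, 1215, 1045, 1455, 990, 1315, 1350, 1485,
         1370, 1815, 1470, 1680, 1660, 1365, 1205, 1445]   # 2001-2004, quarterly

s = pd.Series(sales)
ma1 = s.rolling(window=4).mean()     # 4-quarter averages, land between quarters
ma2 = ma1.rolling(window=2).mean()   # average adjacent MA(1) values to centre
centred = ma2.shift(-2)              # align so values sit on actual quarters
print(centred.round(3).head(8))      # first centred value 1169.375 at 2001 Q3
```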
You should try to solve the following exercises.
E5) Compute MA(1), the moving average values for m = 4 and MA(2),
the moving average values for m = 2 of MA(1) for the remaining
years of the period 2005-2009 for the data given in Example 8.
E6) Compute the moving average (MA) values for m = 3 for time series
for the period 2001-2009 for the data given in Example 8.
Let us now summarise the concepts that we have discussed in this unit.

13.8 SUMMARY
1. A good forecast of the future requirements will result in good planning.
A poor forecast results in poor planning and may lead to increased cost.
In order to provide such forecasts, we use historical data of the past few
years to assess the average requirement, trend (if any) over the years and
seasonal variations. Based on these features observed from the past data,
we try to understand their role in causing variability and use them for
forecasting requirements.
2. A time series is a collection of observations made sequentially over a
period of time. The main objectives of time series analysis are
description, explanation and forecasting. It has applications in many
fields including economics, engineering, meteorology, etc.
3. A trend is a long term smooth variation (increase or decrease) in the
time series. When values in a time series are plotted in a graph and, on
an average, these values show an increasing or decreasing trend over a
long period of time, the time series is called the time series with trend
effect.
4. If values in a time series reflect seasonal variation with respect to a
given period of time such as a quarter, a month or a year, the time series
is called a time series with seasonal effect. If the time plot of data in a
time series exhibits a cyclic trend, it is called the time series with cyclic
effect.
5. The long term variations, i.e., the trend component, and short term
variations, i.e., the seasonal and cyclic component, are known as
regular variations. Apart from these regular variations, random or
irregular variations, which are not accounted for by trend, seasonal or
cyclic variations, exist in almost all time series.
6. The additive model is one of the most widely used models. It is based
on the assumption that at any time t, the time series value Yt is the sum
of all the components. According to the additive model, a time series can
be expressed as
Yt = Tt + Ct + St + It

where Tt, Ct, St and It are the trend, cyclic, seasonal and irregular
variations, respectively, at time t.
7. The multiplicative model is based on the assumption that the time series
value Yt at time t is the product of the trend, cyclic, seasonal and
irregular component of the series:
Yt = Tt × Ct × St × It

where Tt, Ct, St and It denote the trend, cyclic, seasonal and irregular
variations, respectively. The multiplicative model is found to be
appropriate for many business and economic data.
8. There are two methods of moving averages: the equal weight or
simple moving averages method and the weighted (unequal)
moving average method. The methods of moving averages and
exponential smoothing are used for smoothing or filtering the time series
data.
9. In the exponential smoothing technique, where the weights on past observations decrease exponentially, the smoothed value is given by

$y'_t = w\,y_t + (1 - w)\,y'_{t-1}$ for t = 1, 2, ....

where 0 < w < 1. The value of w is chosen as per the requirements.
10. An alternative approach to smoothing is to fit a polynomial to the data. This treats smoothing as a regression problem in which yt is the trend value and integral powers of time t are the explanatory variables. The resulting smooth function is a polynomial

$$y_t = \sum_{j=0}^{p} b_j t^j$$

where bj is the jth coefficient of the polynomial of degree p. The coefficients of the polynomial are estimated by the method of least squares.

13.9 SOLUTIONS/ANSWERS
E1) Year MA
1979 (18 + 26 + 16)/3 = 20
1980 (26 + 16 + 27)/3 = 23

E2) Using equation (3), we obtain the exponential smoothing values as follows:
y2 = 0.5x2 +0.5x1 = 0.5 × 22 + 0.5 × 17 = 19.50
y3 = 0.5x3 + 0.25x2 + 0.25x1
= 0.5 × 18 + 0.25 × 22 + 0.25 × 17 = 18.75
y4 = 0.5x4 + 0.25x3 + 0.125x2 + 0.125x1
= 0.5 × 26 + 0.25 × 18 + 0.125 × 22 + 0.125 × 17 = 22.375
y5 = 0.5x5 + 0.25x4 + 0.125x3 + 0.0625x2 + 0.0625x1
= 0.5 × 16 + 0.25 × 26 + 0.125 × 18 + 0.0625 × 22
+ 0.0625 × 17 = 19.1875
y6 = 0.5x6 + 0.25x5 + 0.125x4 + 0.0625x3 +0.03125 x2 + 0.03125x1
= 0.5 × 27 + 0.25 × 16 + 0.125 × 26 + 0.0625 × 18
+ 0.03125 × 22 + 0.03125 × 17 = 23.09375
E3) If we plot the given data in a graph, there seems to be no trend in the
time series.
If we fit a linear regression to this data and if T = a + bt is the
equation, it seems to have b = 0. Therefore, the least squares
estimates of all 41 time series values would be the same. Also our
forecast for rainfall is a for all t because our model is
T = a + (forecast error)
This result of forecast seems somewhat unreasonable because a
particular place cannot have a constant amount of rainfall every year.
From Fig. 13.15, you can see that the values of yt seem to be higher
during the period 1970 to 1980 than during the period 1980 to 1990.
Therefore, it is logical to assume that the value of a is gradually
changing over time and to denote it by at rather than a.


Fig. 13.15: Time series plot for rainfall (in cm) from 1970 to 2009.

According to this smoothing technique, we select a single weight w
called the exponential smoothing constant. Here w lies between 0
and 1 and there is a method of choosing this constant. We can
compute an exponentially smoothed series y′t as follows:


$$\begin{aligned} y'_1 &= w y_1 + (1 - w)\,y_1 = y_1 \\ y'_2 &= w y_2 + (1 - w)\,y'_1 \qquad \text{… (i)} \\ y'_3 &= w y_3 + (1 - w)\,y'_2 \\ &\;\;\vdots \\ y'_t &= w y_t + (1 - w)\,y'_{t-1} \end{aligned}$$

We need an initial value to begin the computation of y′1, y′2, y′3, …. For this example we take the initial value to be y1, the first value of the given time series. Thus,

y′1 = y1

Now let us take w = 0.02, since w can take any value between 0 and 1. From experience it is suggested that the value of w be taken between 0.01 and 0.3. Then from equation (i) given above, we get

y′2 = 0.02 y2 + 0.98 y′1 = 0.02 (728) + 0.98 (664) ≈ 665.24

Our forecast for y1 at time zero is y′1. Therefore, the forecast error is

e1 = y1 − y′1 = 664 − 664 = 0.00

Similarly, the second forecast error is

e2 = y2 − y′2 = 728 − 665.24 = 62.76

Following the same procedure, we get

y′3 = 0.02 y3 + 0.98 y′2 = 0.02 (447) + 0.98 (665.24) = 660.875

and the third forecast error is

e3 = y3 − y′3 = 447 − 660.875 = −213.875
Proceeding in the same way, we calculate y′t and the forecast errors et for all t. The calculated forecast values and errors are given in the following table:

Table 9: Forecast values and forecast errors for the given time series

Year   Rainfall yt   Forecast y′t   Error et = yt − y′t

1970 664 664.00 0.00
1971 728 665.24 +62.76
1972 447 660.875 −213.875
1973 663 660.92 +2.08
1974 630 660.30 −30.30
1975 451 656.11 −205.11
1976 617 655.33 −38.33
1977 734 656.91 +77.09
1978 491 653.58 −162.58
1979 520 650.92 −130.92
1980 280 643.50 −363.50
1981 548 641.59 −93.59
1982 417 637.10 −220.10
1983 387 632.10 −245.10
1984 590 631.25 −41.25
1985 556 629.75 −73.75
1986 292 623.00 −331.00
1987 327 617.07 −290.07
1988 494 614.61 −120.61
1989 448 611.28 −163.28
1990 704 613.13 +90.87
1991 624 613.35 +10.65
1992 473 610.54 −137.54
1993 750 613.33 +136.67
1994 343 607.93 −264.93
1995 484 605.45 −121.45
1996 545 604.24 −59.24
1997 419 600.53 −181.53
1998 798 604.48 +193.52
1999 334 599.07 −265.07
2000 465 596.40 −131.40
2001 468 593.82 −125.82
2002 554 593.03 −39.03
2003 744 596.05 +147.95
2004 943 602.99 +340.01
2005 582 602.56 −20.56
2006 581 602.13 −21.13
2007 437 598.83 −161.83
2008 417 595.20 −178.20
2009 618 595.65 +22.35
2010 571 595.16 −24.16

After calculating the forecast y′t and the error et for all t, we plot the time series values along with the forecast values, which are the outcomes after smoothing the time series values, and observe the change in the time series values before and after the smoothing (Fig. 13.16). We can also see that the peaks barely exist in the graph of the time series after smoothing.

UNIT 2 SEASONAL COMPONENT ANALYSIS
Structure
14.1 Introduction
Objectives
14.2 Estimation of Seasonal Component
Simple Average Method
Ratio to Moving Average Method
Ratio to Trend Method
14.3 Estimation of Trend from Deseasonalised Data
14.4 Forecasting
14.5 Summary
14.6 Solutions/Answers

14.1 INTRODUCTION
In Unit 13, you have learnt that time series can be decomposed into four
components: Trend (T), Cyclic (C ), Seasonal (S) and Irregular
Component (I). Our aim is to estimate T, C and S components and use them
for forecasting. We have already described some methods for smoothing or
filtering the time series, namely, the simple moving average method,
weighted moving average method and exponential smoothing method in
Unit 13. We have also described some methods for estimating Trend and
Cyclic components, i.e., the method of least squares and the moving average
method in Unit 13.
When time series data do not contain any trend and cyclic components but
reflect seasonal variation, we have to estimate the seasonal component S by
removing irregular components. In Sec. 14.2 of this unit, we discuss some
methods for estimating the seasonal component (S), namely, the simple
average method, the ratio to moving average method and the ratio to trend
method. If the effect of seasonal variation is not removed from the time
series data, the trend estimates are also affected. Therefore, we have to
deseasonalise the data by dividing it by the corresponding seasonal
indices. Once the data is free from seasonal effects, we estimate the trend
equation using the method of least squares, as explained in Sec. 14.3. In
Sec. 14.4, we explain how to use data for forecasting purposes once we
have estimated the trend, cyclic and seasonal components of the time
series.
In the next unit, we shall discuss the stationary time series and explain the
stationary processes, i.e., weak and strict stationary processes. We shall also
discuss the autocovariance, autocorrelation function and correlogram of a
stationary process.
Objectives
After studying this unit, you should be able to:
 explain the effect of seasonal variation in time series;
 apply the simple average method for estimating seasonal indices;
 apply the ratio to moving average method and ratio to trend method for
estimation of seasonal indices;


 describe the method of estimation of trend component from
deseasonalised time series data; and
 use trend (T), cyclical (C) and seasonal (S) components for forecasting
purposes.

14.2 ESTIMATION OF SEASONAL COMPONENT


If the seasonal variation is substantial, we can express the variation in yt by
additive or multiplicative models:

Additive Model: Yt = Tt + Ct + St + It … (1)


Multiplicative Model: Yt = Tt × Ct × St × It … (2)

In the additive model, the seasonal indices St are normalised so that their
sum over the months in a year is zero. In the multiplicative model, they are
normalised so that their average is 100 (equivalently, the twelve monthly
indices sum to 1200). In both cases, the forecast of the yearly output is not affected by
the seasonal effect St and we can work on yearly totals to estimate the trend.
In such cases, we can estimate St by working on the annual data. However,
in many cases, we may be interested in forecasting monthly (or quarterly)
figures. This requires monthly (or quarterly) estimate of the seasonal
index St.

In many cases it has been found that seasonal effects increase with increase
in the mean level of time series. Under these circumstances, it may be more
appropriate to use the multiplicative model. If seasonal effects remain
constant, the additive model is more appropriate. The classical approach is
to consider the multiplicative model and estimate seasonal effect (St) for
forecasting purposes. In this unit, we use the classical multiplicative model.
We describe two methods for estimating seasonal indices based on
the ratios of time series observation (Y) and estimated trend and cycle
effects:

Yt / (Tt × Ct) = (Tt × Ct × St × It) / (Tt × Ct) = St × It … (3)

This ratio gives an estimate of St × It. We shall estimate the seasonal


indices (St) by smoothing out the irregular component (It).
We now describe the methods of estimating the seasonal indices St.

14.2.1 Simple Average Method


The method of simple average is the simplest of all methods. It is used to
eliminate the seasonal effect from the given time series data. This method is
based on the assumption that the data do not contain any trend and cyclic
components and consists of eliminating irregular components by averaging
the monthly (or quarterly or yearly) values over years. This assumption may
or may not be true since most economic or business time series exhibit
trends.
This method consists of the following steps:
Step 1: We arrange the data by years, months or quarters if data are
collected on yearly, monthly or quarterly basis.
Step 2: After arranging the time series data, the average y i is calculated for
the ith month of the year.

Step 3: After the monthly averages are calculated, we calculate the average
of the monthly averages, that is,
ȳ = (ȳ1 + ȳ2 + … + ȳ12) / 12
Step 4: After calculating the average ȳ, we express the monthly averages ȳi
as percentages of the average ȳ. These percentages are known as
seasonal indices. Thus,
Seasonal index for the ith month = (ȳi / ȳ) × 100, for i = 1, 2, …, 12. … (4)
Let us consider an example to explain this method.
Example 1: Determine the monthly seasonal indices for the following data
of production of a commodity for the years 2010, 2011 and 2012 using the
method of simple averages.
Years Production in Tonnes
Months 2010 2011 2012
January 120 150 160
February 110 140 150
March 100 130 140
April 140 160 160
May 150 160 150
June 150 150 170
July 160 170 160
August 130 120 130
September 110 130 100
October 100 120 100
November 120 130 110
December 150 140 150

Solution: First of all, we arrange the data as shown in columns 1 to 4 of the


table given below:
Table 1: Seasonal indices for the given time series

Months Production (in Tonnes) Total Monthly Seasonal


2010 2011 2012 Avgs. Index
January 120 150 160 430 143.3 104.886
February 110 140 150 400 133.3 97.566
March 100 130 140 370 123.3 90.247
April 140 160 160 460 153.3 112.205
May 150 160 150 460 153.3 112.205
June 150 150 170 470 156.6 114.620
July 160 170 160 490 163.3 119.524
August 130 120 130 380 126.6 92.662
September 110 130 100 340 113.3 82.928
October 100 120 100 320 106.6 78.024
November 120 130 110 360 120.0 87.832
December 150 140 150 440 146.6 107.301
Total 4920 1639.5 1200
Average 410 136.625 100

We are given the monthly production of a commodity for 3 years. For
calculating the monthly seasonal indices, we first calculate the month-wise
total production for the 3 years. Then we calculate the monthly averages for
all 12 months. Note from Table 1 that for January, it is 143.3, for February,
it is 133.3, …, etc. Next, we calculate the average of all monthly averages,
i.e.
ȳ = (143.3 + 133.3 + … + 146.6) / 12 = 136.625
Now we calculate the seasonal indices by taking the percentage of the
monthly averages ȳi to the combined average ȳ, one at a time, for i = 1, 2, …, 12:
Seasonal Index for January = (143.3 / 136.625) × 100 = 104.886
Seasonal Index for February = (133.3 / 136.625) × 100 = 97.566
Seasonal Index for March = (123.3 / 136.625) × 100 = 90.247
Seasonal Index for April = (153.3 / 136.625) × 100 = 112.205
Seasonal Index for May = (153.3 / 136.625) × 100 = 112.205
Seasonal Index for June = (156.6 / 136.625) × 100 = 114.620
Seasonal Index for July = (163.3 / 136.625) × 100 = 119.524
Seasonal Index for August = (126.6 / 136.625) × 100 = 92.662
Seasonal Index for September = (113.3 / 136.625) × 100 = 82.928
Seasonal Index for October = (106.6 / 136.625) × 100 = 78.024
Seasonal Index for November = (120.0 / 136.625) × 100 = 87.832
Seasonal Index for December = (146.6 / 136.625) × 100 = 107.301
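These computations are easy to automate. The following sketch (in Python; the dictionary layout is our own choice) applies the simple average method of equation (4) to the data of Example 1. Because Table 1 rounds the monthly averages to one decimal place, the indices printed here differ from the table in the second decimal place.

# A minimal sketch of the simple average method (equation (4)),
# using the production data of Example 1 (months as keys, one value per year).
production = {
    "Jan": [120, 150, 160], "Feb": [110, 140, 150], "Mar": [100, 130, 140],
    "Apr": [140, 160, 160], "May": [150, 160, 150], "Jun": [150, 150, 170],
    "Jul": [160, 170, 160], "Aug": [130, 120, 130], "Sep": [110, 130, 100],
    "Oct": [100, 120, 100], "Nov": [120, 130, 110], "Dec": [150, 140, 150],
}
monthly_avg = {m: sum(v) / len(v) for m, v in production.items()}   # y-bar_i
grand_avg = sum(monthly_avg.values()) / 12                          # y-bar
for month, avg in monthly_avg.items():
    print(f"{month}: {100 * avg / grand_avg:.3f}")                  # equation (4)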
14.2.2 Ratio to Moving Average Method
In Sec. 14.2.1, we have discussed the simple average method for calculating
seasonal indices. Now we discuss the most widely used method known as
the ratio to moving average method. It is better because of its accuracy.
Also the seasonal indices calculated using this method are free from all the
three components, namely, trend (T), cyclic (C) and Irregular variations (I).
As you have learnt in Unit 13, the moving average eliminates periodic
variations if the span of period of moving average is equal to the period of
the oscillatory variation sought to be eliminated. Therefore, we have to
choose the span of time for moving average to be equal to one cycle. For
example, if a cycle is completed in 3 months, we calculate the moving
average for 3 months. You may note that for some quarters, three values are
available and for some quarters, four values are available. By taking the
average over three or four values, we get seasonal indices Si for i=1, 2, 3, 4.
We usually normalise them so that their mean is 100 by dividing them by
the mean of Si and multiplying by 100. These normalised Si have mean 100.
Usually, not much difference exists between normalised and non-normalised
seasonal indices Si . When data are monthly, the same procedure will yield
twelve monthly seasonal indices Si . This method of estimating seasonal
indices is known as the ratio to moving average method.
We have explained the ratio to moving average method for monthly time
series data. The same method may be applied for any other periodic data
such as quarterly, weekly data, etc. The steps for obtaining seasonal indices
using this method are as follows:
Step 1: We arrange the data chronologically.
Step 2: If the cycle of oscillation is 1 year, we take the 12-month moving
average of the 1st year, which will give estimates of the combined
effects of trend and cyclic fluctuation. We enter the average value
against the middle position, i.e., between the months of June and
July.
Step 3: We discard the value for the month of January of the first year and
include the value for the month of January of the subsequent year.
Then we calculate the average of these 12 values and enter it
against the middle position, i.e., between July and August. We
repeat the process of taking moving averages MA (1) and entering
the value in the middle position, till all the monthly data are
exhausted.
Step 4: We calculate the centred moving average, i.e., MA(2), of the two
values of the moving averages MA(1) and enter it against the first
value, i.e., the month of July in the first year and subsequent values
against the month of August, September, etc.
Step 5: After calculating the MA(1) and MA(2) values, we treat the
original values (except the first 6 months in the beginning and the
last 6 months at the end) as the percentage of the centred moving
average values. For this we divide the original monthly values by
the corresponding centred moving average, i.e., MA (2) values,
and multiply the result by 100. We have now succeeded in
eliminating the trend and cyclic variations from the original data.
We now have to get rid of the data of irregular variations.
Step 6: We prepare another two-way table consisting of the month-wise
percentage values calculated in Step 5, for all years. The purpose
of this step is to average the percentages and to eliminate the
irregular variations in the process of averaging.
Step 7: We find the median of the percentages or preliminary seasonal
indices calculated in Step 5 month-wise and take the average of the
month-wise median. Then we divide the median of each month by
the average value and multiply it by 100. Generally, the sum of all
medians is not 1200. Therefore, the average of all medians is not
equal to 100. Hence, the seasonal indices are subjected to the same
operation. We multiply the medians by the ratio of expected total
of indices, i.e., 1200 to the actual total as follows:


Seasonal Index = Median × (1200 / Total of Indices)
The seasonal index for each month is given in the last column of
the table. We calculate the sum of indices.
Let us consider an example to understand this method.
Example 2: Apply the ratio to moving average method to ascertain
seasonal indices for the following data:
Year 2009 2010 2011 2012
Month
January 500 550 500 600
February 600 550 600 650
March 650 600 550 650
April 750 650 600 750
May 800 700 650 800
June 800 700 750 900
July 850 750 750 1000
August 900 750 850 1000
September 900 750 900 1050
October 950 800 1000 1100
November 1100 900 1100 1200
December 1100 1000 1200 1250

Solution: As described in Sec. 13.5, we shall first eliminate the effect of St
from the time series observations. Since the data are monthly and the period
of the seasonal effect is one year, taking a 12-month moving average will
eliminate the effect of St. We then calculate the centred moving average MA(2)
by taking the average of two successive moving average values. Table 2 gives
the original data and the centred moving average values MA(2), which give an
estimate of T × C. Then we calculate the seasonal relatives S × I as
percentages, i.e., as the ratio of yt to MA(2):
Seasonal relative = St × It × 100 = (yt / MA(2)) × 100
These are given for the years 2009-2012 in the following table in bold
figures.
Table 2: Calculation of seasonal relatives for the given time series

Year   Month   Sales   Moving Average   Centred Moving   Seasonal Relatives
(1)    (2)     (3)     MA(1) (4)        Average MA(2) (5)  (6) = (3)/(5) × 100
2009 January 500
February 600
March 650
April 750
May 800
June 800
825.0
July 850 827.00 102.78
829.0


August 900 827.00 108.827
825.0
September 900 823.00 109.356
821.0
October 950 816.75 116.314
812.5
November 1100 808.25 136.096
804.0
December 1100 800.00 137.500
796.0
2010 January 550 791.75 69.466
787.5
February 550 781.25 70.4
775.0
March 600 768.75 78.05
762.5
April 650 756.25 85.95
750.0
May 700 741.75 94.371
733.5
June 700 729.25 95.989
725.0
July 750 723.00 103.734
721.0
August 750 723.00 103.734
725.0
September 750 723.00 103.734
721.0
October 800 718.75 111.3
716.5
November 900 714.50 125.96
712.5
December 1000 714.50 139.958
716.5
2011 January 500 716.50 69.78
716.5
February 600 720.75 83.246
725.0
March 550 731.25 75.213
737.5
April 600 745.75 80.456
754.0
May 650 762.50 85.246
771.0
June 750 779.25 96.246
787.5
July 750 791.75 94.727
796.0


August 850 798.00 106.516


800.0
September 900 804.25 111.905
808.5
October 1000 814.75 122.737
821.0
November 1100 827.25 132.97
833.5
December 1200 839.75 142.899
846.0
2012 January 600 856.25 70.073
866.5
February 650 872.75 74.477
879.0
March 650 885.25 73.425
891.5
April 750 895.75 83.728
900.0
May 800 904.25 88.471
908.5
June 900 910.50 98.846
912.5
July 1000
August 1000
September 1050
October 1100
November 1200
December 1250

Once we have obtained S × I, we can take the average to eliminate the effect
of irregularity I. This gives seasonal indices, Si. Now we prepare a two way
table which will include the percentage value of column (6) of Table 2
month-wise for every year as follows:
Table 3: Seasonal indices for the given time series

Months   2009   2010   2011   2012   Median   Seasonal Indices
January -- 69.466 69.78 70.073 69.78 70.025
February 70.4 83.246 74.477 74.477 74.738
March 78.05 75.213 73.425 75.213 75.477
April 85.95 80.456 83.728 83.728 84.02
May 94.37 85.246 88.471 88.471 88.78
June 95.989 96.246 98.846 96.246 96.584
July 102.78 103.734 94.727 102.78 103.14
August 108.827 103.734 106.516 106.516 106.89
September 109.356 103.734 111.905 109.356 109.74
October 116.314 116.314 122.737 116.314 116.72
November 136.096 125.962 132.97 132.971 133.437
December 137.5 139.958 142.899 139.958 140.446
Total 1195.81 1200
Average 99.65 100
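The moving average arithmetic underlying Table 2 (Steps 2-5 above) can be sketched in a few lines of Python (our assumption; the helper names are ours). Here `series` is assumed to hold the 48 monthly sales of Example 2 in chronological order; month-wise medians of the resulting relatives, rescaled so that the twelve indices total 1200, reproduce the last column of Table 3.

# A minimal sketch of the ratio to moving average computation for
# monthly data: 12-term moving average MA(1), centred average MA(2),
# and seasonal relatives = 100 * y_t / MA(2).
def centred_moving_average(series, period=12):
    ma1 = [sum(series[i:i + period]) / period
           for i in range(len(series) - period + 1)]
    return [(a + b) / 2 for a, b in zip(ma1, ma1[1:])]   # MA(2)

def seasonal_relatives(series, period=12):
    ma2 = centred_moving_average(series, period)
    offset = period // 2           # first MA(2) value aligns with month 7
    return [(100 * series[i + offset] / m, (i + offset) % period)
            for i, m in enumerate(ma2)]    # (relative, month index 0-11)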

14.2.3 Ratio to Trend Method
This method provides seasonal indices free from trend and is an improved
version of the simple average method as it assumes that seasonal variation
for a given period is a constant fraction of the trend. The measurement of
the seasonal indices by this method consists of the following steps:
Step 1: We obtain the trend values by the method of least squares for each
period by establishing the trend by fitting a straight line or second
degree parabola or a polynomial.
Step 2: To express the original time series values as percentages of the
trend value, we divide each original value by the corresponding
trend value and multiply it by 100. The indices so obtained are free
from the trend.
Step 3: To obtain the seasonal indices free from the cyclic and irregular
variations, we find the average (mean or median) of the ratio to trend
values (or percentage values) for each season over the years. It is
suggested that the median be preferred to the mean if some extreme
values are present which are not primarily due to seasonal effects;
in this way, the irregular variation is removed. If no such abnormal
values are present among the percentages, the mean may be used.
Step 4: If the seasonal periods are quarters, the sum of the seasonal indices
in the multiplicative model should be 400, and if the periods are
months, it should be 1200. Most of the time, the sum of the calculated
seasonal indices is not exactly equal to this expected total. In that
case, each index is adjusted by multiplying it by the ratio of the
expected total to the actual total of the indices.
By following the above step-wise procedure, we calculate the seasonal
indices using this method. This method is based on sound and logical
footing and utilises complete information.
Example 3: Compute the seasonal indices for the following time
series of sales (in thousand `) of a commodity by the ratio to trend
method:
Year I II III IV
Quarter Quarter Quarter Quarter
2008 800 920 880 820
2009 540 760 680 620
2010 400 580 540 480
2011 340 520 500 440
2012 300 400 360 340

Solution: We are given the time series data for 5 years of quarterly
sales of a commodity. To compute the seasonal indices, we first determine
the trend value for the yearly averages (Y) by fitting a linear trend by the
method of least squares. The following table is constructed for fitting the
straight line:
Y = a + b (X − X̄) = a + bU, where U = (X − X̄)

Table 4: Trend values for the yearly averages

Year X   Yearly Total   Yearly Average (Y)   U = X − 2010   U × Y   U²   Trend Value
2008 3400 850 −2 −1700 4 800
2009 2600 650 −1 −650 1 680
2010 2000 500 0 0 0 560
2011 1800 450 1 450 1 440
2012 1400 350 2 700 4 320
Total 2800 0 −1200 10 2800

For the straight line Y = a + bU, the normal equations for estimating a and
b are:
ΣY = na + bΣU
ΣUY = aΣU + bΣU²
Now we put the values of ΣY, ΣU, ΣUY and ΣU² in the normal equations:
5a = 2800, so a = 560
10b = −1200, so b = −120
On putting these values of a and b in the equation of the straight line, we
get the trend line for the given time series data as:
Y = 560 − 120 U
Thus, the trend values for each value of U are obtained as follows:
U = –2, Y = 560 – 120 (–2) = 800
U = –1, Y = 560 – 120 (–1) = 680
U = 0, Y = 560 – 120 ( 0 ) = 560
U = 1, Y = 560 – 120 ( 1 ) = 440
U = 2, Y = 560 – 120 ( 2 ) = 320
Since the yearly decline in the trend value is −120, the quarterly increment
would be
Quarterly increment = −120 / 4 = −30

Now we determine the quarterly trend values as follows:

For 2008, the trend value for the middle quarter, i.e., half of the second
quarter and half of the third quarter is 800. Since the quarterly increment is
−30, we obtain the trend value for the 2nd quarter as 800 − (−15) and for the
3rd quarter as 800 + (−15). Thus, these are 815 and 785, respectively.
Consequently, the trend value for the first quarter is 815 − (− 30) = 845 and
for the 4 th quarter, it is 785 + (−30) = 755. Similarly, we can get the trend
values for other years as we have obtained for all the quarters of the year
2008. After calculating the trend values, we also calculate the seasonal
indices for each quarter of every year, which eliminates the trend component
from the data. This is shown in Table 5.


Table 5: Calculations for seasonal indices
Trend Values Seasonal Indices Total
(Given value as % of trend values)

Year 1st Qtr 2nd Qtr 3rd Qtr 4th Qtr 1st Qtr 2nd Qtr 3rd Qtr 4th Qtr
2008 845 815 785 755 94.67 112.88 112.1 108.61 428.26
2009 725 695 665 635 74.48 109.35 102.25 97.64 383.72

2010 605 575 545 515 66.11 100.87 99.08 93.2 359.26
2011 485 455 425 395 70.1 114.28 117.64 111.39 413.41
2012 365 335 305 275 82.19 119.4 118.03 123.63 443.25

Total 387.55 556.78 549.1 534.47 2027.9


Average (A. Mean) 77.51 111.356 109.82 106.892 405.578
Adjusted Seasonal Index 76.445 109.824 108.301 105.42 399.99

The average yearly seasonal indices obtained above are adjusted to a total of
400 because the total of the quarterly averages is 405.578, which is greater
than 400. So we multiply each quarterly index by the ratio
K = 400 / Total of Indices = 400 / 405.578 = 0.986
The adjusted seasonal indices for each quarter are given in the last row of
the table.
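The whole ratio to trend calculation of Example 3 can be reproduced with a short script. The sketch below (Python is our assumption) takes the yearly averages of Table 4 as given, fits Y = a + bU by least squares, interpolates the quarterly trend values and prints each row of seasonal indices of Table 5.

# A minimal sketch of the ratio to trend method for Example 3.
quarterly = [[800, 920, 880, 820], [540, 760, 680, 620],
             [400, 580, 540, 480], [340, 520, 500, 440],
             [300, 400, 360, 340]]
averages = [850, 650, 500, 450, 350]       # yearly averages (Table 4)
u = [-2, -1, 0, 1, 2]                      # U = X - 2010

a = sum(averages) / len(u)                                  # 560
b = (sum(ui * yi for ui, yi in zip(u, averages))
     / sum(ui * ui for ui in u))                            # -120
inc = b / 4                                                 # -30 per quarter

for ui, quarters in zip(u, quarterly):
    centre = a + b * ui                    # trend at the middle of the year
    trend = [centre + (k - 1.5) * inc for k in range(4)]    # 845, 815, ...
    print([round(100 * y / t, 2) for y, t in zip(quarters, trend)])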
You may now like to solve the following problems to check your
understanding of the three methods explained so far.

E1) Determine the seasonal indices for the data given below for the
average quarterly prices of a commodity for four years:
Years Quarter I Quarter II Quarter III Quarter IV
2009 554 590 616 653
2010 472 501 521 552
2011 501 531 553 595
2012 403 448 460 480

E2) Calculate the seasonal indices for the following data of production of
a commodity (in hundred tons) of a firm using the ratio to trend
method:
Years Quarter I Quarter II Quarter III Quarter IV
2001 470 531 500 480
2002 340 450 410 380
2003 270 360 340 310
2004 240 330 320 290
2005 220 270 250 240

E3) Apply the ratio to moving average method for calculating the
seasonal indices for the time series data given in Example 8 of
Unit 13.

14.3 ESTIMATION OF TREND FROM DESEASONALISED DATA
In Unit 13, you have learnt the estimation of trend. However, when a
substantial seasonal component is present in the data, it is advisable to first
remove the seasonal effect from the data. Otherwise, the trend estimates are
also affected by seasonal effects, which makes the estimation unreliable.
Hence, after estimating the seasonal indices, we deseasonalise the data
values by dividing them by the corresponding seasonal indices (St). Thus,
the deseasonalised values are given by
Deseasonalised Zt = yt / St = Tt × Ct × It
Once the data are made free from the seasonal effect, we estimate the trend
line by the method of least squares as explained in Sec.13.4 of Unit 13.
Thus, we have a reasonably good estimate of Tt , Ct and St.
We have not described the estimation of cyclic effect (Ct) separately. A
cycle in the time series means a business cycle, which normally exceeds one
year in duration. Note that hardly any time series possess strict cycles
because cycles are never regular in periodicity and amplitude. This is why
the business cycles are the most difficult types of economic fluctuation to
measure. It is impossible to construct meaningful typical cycle indices or
curves similar to those that have been developed for trends and seasons. The
successive cycles vary widely in timing, amplitude and pattern and are
inextricably mixed with irregular factors. Since it is very difficult to
distinguish cyclic effects from the long-term trend effect, in most cases we
assume that either they are not present or they are estimated along with the
trend, and we estimate Tt × Ct jointly by the least squares method.
Once we have estimated all components, we shall use them for forecasting
purposes as described in the next section.
Example 4: Use the following data and calculate the deseasonalised values
T × C × I. Use these values to estimate the trend line:
Years Quarter I Quarter II Quarter III Quarter IV
2003 289 241 273 232
2004 336 294 363 274
2005 297 270 263 198
2006 291 209 243 187

Solution: For calculating the deseasonalised values, we first calculate the


seasonal indices for the given data. The seasonal indices are given in
Table 6. Deseasonalised values are also calculated by dividing the time
series values by the corresponding indices St.
Table 6: Calculations for obtaining trend values

Year/Quarter Xt | Yt | St | Deseasonalised values Zt = (Yt/St) × 100 | (Xt − X̄t) | (Xt − X̄t)² | Zt (Xt − X̄t)
2003 Q1 289 113.13 255.46 1.875 3.515625 478.9875
Q2 241 94.82 254.17 1.625 2.640625 413.02625
Q3 273 107.43 254.12 1.375 1.890625 349.415
Q4 232 84.62 274.17 1.125 1.265625 308.44125

2004 Q1 336 113.13 297.00 0.875 0.765625 259.875
Q2 294 94.82 310.06 0.625 0.390625 193.7875
Q3 363 107.43 337.89 0.375 0.140625 126.70875
Q4 274 84.62 323.80 0.125 0.015625 040.475
2005 Q1 297 113.13 262.53 –0.125 0.015625 –032.81625
Q2 270 94.82 284.75 –0.375 0.140625 –106.78125
Q3 263 107.43 244.81 –0.625 0.390625 –153.00625
Q4 198 84.62 233.99 –0.875 0.765625 –204.74125
2006 Q1 291 113.13 257.23 –1.125 1.265625 –289.38375
Q2 209 94.82 220.42 –1.375 1.890625 –303.0775
Q3 243 107.43 226.19 –1.625 2.640625 –367.55875
Q4 187 84.62 220.99 –1.875 3.515625 –414.35625
Total 4257.58 21.25 298.995

Once the data are deseasonalised, we apply the method of least squares to estimate
the trend equation. The following values are calculated (as given in the above table):
X̄t = 2004.5,  Σ(Xt − X̄t)² = 21.25,  ΣZt (Xt − X̄t) = 298.995
â = Z̄ = 266.10,  b̂ = ΣZt (Xt − X̄t) / Σ(Xt − X̄t)² = 298.995 / 21.25 = 14.07

The fitted trend equation is:
Ŷt = 266.10 + 14.07 (Xt − X̄t)
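The two steps of this section, deseasonalising and fitting a straight line, can be sketched as follows (Python assumed; the helper names are ours). Applied to the 16 quarterly values of Example 4 with the quarterly indices St of Table 6, deseasonalise() reproduces the Zt column, and fit_trend() returns the intercept â = Z̄ = 266.10 (note that fit_trend() measures time in quarters, while Example 4 measures it in years).

# A minimal sketch: deseasonalise a series and fit a least squares line.
def deseasonalise(y, seasonal_index):
    # Z_t = 100 * y_t / S_t, cycling through the seasonal indices
    p = len(seasonal_index)
    return [100 * v / seasonal_index[t % p] for t, v in enumerate(y)]

def fit_trend(z):
    # least squares fit of z = a + b*(t - t_bar), t = 0, 1, 2, ...
    n = len(z)
    t_bar = (n - 1) / 2
    dev = [t - t_bar for t in range(n)]
    b = sum(d * v for d, v in zip(dev, z)) / sum(d * d for d in dev)
    a = sum(z) / n
    return a, b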

14.4 FORECASTING
Forecasting is one of the main purposes of time series analysis. It is always
a risky task as one has to assume that the process will remain the same in
future as in the past, at least for the period for which we are forecasting.
This assumption is not very realistic and we shall assume that at least for
short term forecasting, the process remains the same as in the past.
If a time series plot shows that there is no seasonal effect, or on theoretical
basis there is no reason for having a seasonal component (St), we can
estimate the trend component (Tt) and use the trend equation for forecasting.
If it is empirically observed that there is a significant seasonal effect (S) and
on theoretical ground also there is a valid reason for the presence of such a
component, we have to take the seasonal effect into account while
estimating and forecasting. If data are collected on monthly basis and the
period of seasonality is one year, we estimate twelve seasonal indices, one
for each month. If data is quarterly, we estimate four seasonal indices. After
deseasonalising the data, we fit the trend equation. Then we project the trend
for the period for which we have to forecast. Next we adjust it for seasonal
effect by multiplying it by the corresponding seasonal index. This gives the
final forecast value which has been corrected for the seasonal effect. These
steps can be described as follows:
Step 1: Calculate the moving average of suitable order. The order of
moving average is taken as the period of the seasonal effect.
Step 2: Calculate the ratio of data to moving average values so that this ratio
contains the seasonal component (St) and the irregular component (It),
i.e., St × It.


Step 3: Determine the seasonal indices by averaging St × It for respective
seasons. This gives the estimates of the seasonal component (St).
Step 4: Obtain deseasonalised values by dividing data values by the
corresponding seasonal component (St).
Step 5: Fit the trend equation to deseasonalised data. Compute
deseasonalised forecast value from the trend equation.
Step 6: Adjust the forecast value for seasonal effect by multiplying it by the
corresponding St. If the additive model is used, instead of
multiplying or dividing we add or subtract the corresponding values.
Seasonal effects are normalised so that their sum is equal to zero.
Example 5: Using estimated seasonal indices and the fitted trend equation
of Example 4, forecast the value for the first quarter (Q1) of 2007.
Solution: We have fitted the trend equation in Example 4 as:
Ŷt = 266.10 + 14.07 (Xt − X̄t)
The projected trend value for Q1 of 2007, with X̄t = 2004.5, is:
Ŷt = 266.10 + 14.07 (2007.25 − 2004.5) = 266.10 + 14.07 × 2.75 = 304.79
The season-corrected forecast = (Ŷt × St) / 100 = (304.79 × 84.62) / 100 = 257.9
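The forecasting step itself is one line of arithmetic, as the following sketch shows (Python assumed), reproducing the numbers of Example 5.

# A minimal sketch of the season-corrected forecast of Example 5.
a, b = 266.10, 14.07       # fitted trend coefficients from Example 4
x_bar = 2004.5             # mean of the time variable

def forecast(x, seasonal_index):
    trend = a + b * (x - x_bar)              # projected trend value
    return trend * seasonal_index / 100      # multiplicative adjustment

print(forecast(2007.25, 84.62))              # Q1 of 2007: about 257.9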
You may now like to solve the following exercises to assess your
understanding of forecasting.
E4) The following table gives the sales figures (in thousands) of
television sets for 16 quarters over four years, coded as 1, 2, 3 and 4:
Quarter Q1 Q2 Q3 Q4
Year
1 480 410 600 650
2 580 520 680 740
3 600 560 750 780
4 630 590 800 840

i) Calculate the four quarters centred moving average


values.
ii) Compute the seasonal indices for the four quarters.
iii) Obtain deseasonalised values and estimate the trend line.
iv) Obtain the season adjusted forecast value (Q3) of the fifth year.

E5) The following data give production of a certain brand of motor


vehicles. Determine indices using the ratio to moving average
method for August and September, after calculating the centred
moving average for twelve months.

Production (in thousand units)
Year Jan Feb Mar Apr May Jun July Aug Sep Oct Nov Dec

1985 7.92 7.81 7.91 7.03 7.25 7.17 5.01 3.90 4.64 7.03 6.88 6.14

1986 4.86 4.48 5.26 5.48 6.42 6.82 4.98 2.45 4.51 6.38 7.55 7.59

E6) Given below are the data of production of a company (in lakhs of
units) for the years 1973 to 1979:
Year 1973 1974 1975 1976 1977 1978 1979
Production 15 14 18 20 17 24 27
i) Compute the linear trend by the method of least squares.
ii) Compute the trend values for each year.

Let us now summarise the concepts that we have discussed in this unit.

14.5 SUMMARY
1. When time series data do not contain any trend and cyclic components
but reflect only seasonal variation, we have to estimate the seasonal
component S by removing irregular components.
2. If the effect of seasonal variation is not removed from the time series
data, the trend estimates are also affected. Therefore, we have to
deseasonalise the data by dividing it by the corresponding seasonal
indices. Once the data is free from the seasonal effects, we estimate the
trend equation using the method of least squares.
3. If a cycle is completed in 3 months, we calculate the moving average for
3 months. By taking the average over all available values, we get
seasonal indices Si for i =1, 2, 3, 4.
4. We usually normalise the seasonal indices so that their mean is 100 by
dividing them by mean of Si for all i and multiplying by 100. The
normalised seasonal indices have mean 100. Usually, there is not
much difference between normalised and non-normalised seasonal
indices.
5. The ratio to trend method provides seasonal indices free from trend
and is an improved version of the simple average method as it assumes
that seasonal variation for a given period is a constant fraction of the
trend.
6. When substantial seasonal component is present in the data, it is
advisable to first remove the seasonal effect from the data.
Otherwise, the trend estimates are also affected by seasonal effects,
which make the estimation unreliable. Hence, after estimating the
seasonal indices, we deseasonalise the data values by dividing it by
corresponding seasonal indices (St). Thus, the deseasonalised values are
given by
yt
Deseasonalised Z t   Tt  C t  I t
St
7. A cycle in the time series means a business cycle, which normally
   exceeds one year in duration. Note that hardly any time series possesses
   strict cycles because cycles are never regular in periodicity and amplitude.


UNIT 3 STATIONARY PROCESSES


Structure
15.1 Introduction
Objectives
15.2 Stationary Processes
Stationary Process
Strict Stationary Process
Weak Stationary Process
15.3 Autocovariance and Autocorrelation
Autocovariance and Autocorrelation Coefficients
Estimation of Autocovariance and Autocorrelation Coefficients
15.4 Correlogram of Stationary Processes
Interpretation of Correlogram
15.5 Summary
15.6 Solutions / Answers

15.1 INTRODUCTION
In Units 13 and 14, you have learnt that a time series can be decomposed
into four components, i.e., Trend (T), Cyclic (C ), Seasonal (S) and Irregular
(I) components. We have discussed methods for smoothing or filtering the
time series and for estimating Trend, Seasonal and Cyclic components. We
have also explained how to use them for forecasting.

In this unit, we describe a very important class of time series, called the
stationary time series. In Sec. 15.2, we explain the concept of stationary
process and define weak and strict stationary processes. We discuss
autocovariance, autocorrelation function and correlogram of a stationary
process in Secs. 15.3 and 15.4. If a time series is stationary, we can model it
and draw further inferences and make forecasts. If a time series is not
stationary, we cannot do any further analysis and hence cannot make
reliable forecasts. If a time series shows a particular type of non-stationarity
and some simple transformations can make it stationary, then we can
model it.

In the next unit, we shall discuss certain stationary linear models such as
Auto Regressive (AR), Moving Average (MA) and mixed Autoregressive
Moving Average (ARMA) processes. We shall also discuss how to deal
with models with trend by considering an integrated model called
Autoregressive Integrated Moving Average (ARIMA) model.

Objectives
After studying this unit, you should be able to:
 describe stationary processes;
 define weak and strict stationary processes;
 define autocovariance and autocorrelation coefficients;
 estimate autocovariance and autocorrelation coefficients;
 plot the correlogram and interpret it; and
 make proper choice of probability models for further studies.


15.2 STATIONARY PROCESSES
In the course MST-004, you have studied random variables and their
properties. Recall that a random variable Y is a function defined on a sample
space. A family of random variables defined on the same sample space
taking values over time is known as a random process. Most physical
processes in real life situations involve random components or variables and
a random process may be described as a statistical phenomenon that evolves
in time. A random process may be defined as a family of random variables
defined on a given probability space indexed by the parameter t. Here we
denote a stochastic variable by the capital letter Y and assume that it is
observable at discrete time points t1, t2, ....
A random process is a statistical phenomenon that evolves in time according
to some laws of probability. The length of queue in a system, the number of
accidents in a particular city in successive weeks, etc. are examples of a
random process. Mathematically, a random process is defined as the family
of random variables which are ordered in time, i.e., random process {Y(t); t
belongs to T} is a collection of random variables, where T is a set for which
all random variables Yt are defined on the same sample space. If T takes
continuous range of values, the random process is said to be a continuous
parameter process. On the other hand, if T takes discrete set of values, the
process is said to be a discrete parameter process. We use the notation Yt
for a random process when we deal with discrete parameter processes.
When T represents time, the random process is referred to as a time series.
In Units 13 and 14, we have dealt with one set of observations recorded at
different times. Thus, we had only a single outcome of the process and a
single observation on the random variable at time t. This sample may be
regarded as one time series out of the infinite set of time series, which might
have been observed. This infinite set of time series is called an Ensemble.
Every member of the ensemble can be taken as a possible realisation of the
stochastic process and the observed time series can be considered as one
particular realisation.

15.2.1 Stationary Process


Broadly speaking, a time series is said to be stationary if there is no
systematic change in mean, variance and covariance of the observations
over a period of time. This means that the properties of one section of the
time series are similar to the properties of the other sections of the time
series. In other words, a process is said to be stationary if it is in a state of
statistical equilibrium.
A random process is said to be stationary if the joint distribution of
Yt1, Yt2, Yt3, …, Ytk is the same as the joint distribution of Yt1+J, Yt2+J, Yt3+J,
..., Ytk+J, for all t1, t2, ..., tk and J. In other words, shifting the origin of time by
an amount J has no effect on the joint distribution. This means that it
depends only on the interval between t1, t2, …, tk. This definition holds for
any value of k.
For k =1,
E(Yt) = µ and V(Yt) = σ2
This implies that the mean and variance of Yt are constant and do not
depend on time.
For k = 2, the joint distribution of Yt1 and Yt2 depends only on the time
difference (t2 − t1) = J, say, which is called the lag.


Thus, the covariance term depends only on the lag J = t2 − t1, i.e.,

γ (t1, t2) = E[(Yt – µ) (Yt +J – µ)]

= Cov (Yt, Yt +J)

The variance function is a special case of the covariance function when


t1= t2, i.e., J = 0.
There are two types of stationary processes: Strict stationary processes and
weak stationary processes. Let us discuss them one at a time.

15.2.2 Strict Stationary Process


Strict stationary process imposes the strong condition that the joint
probability distribution remains the same and depends only on the time
interval. If all the finite dimensional distributions of a random process are
invariant under translation of time, then it is called a Strict Sense
Stationary process or SSS process. In other words, if the joint distribution
of Yt1, Yt2, …, Ytk is the same as the joint distribution of Yt1+J, Yt2+J,..., Ytk+J,
for all t1, t2, ..., tk and J (> 0) for all k ≥ 1, the random process Yt is called a
Strict Sense Stationary process.
The strict stationary process requires that the distributions of Yti for all
i = 1, 2, …, n must be the same. Also, the bivariate distributions of the
pairs [Yti, Yti+J] and [Ytj, Ytj+J] must be the same for all i, j = 1, 2, …, n
and all lags J.
Note that the requirement of strict stationarity is a severe one. Usually it is
difficult to establish it mathematically and in most cases, the distributions of
[Yti, Yti+J] for all i = 1, 2, …, n and J are not known. That is why, the less
restrictive notions of stationarity called weak stationary processes have
been developed.

15.2.3 Weak Stationary Process


A stationary process is said to have weak stationarity of order m, if the
moments up to order m depend only on time lag J. If m = 2, the stationarity
(or weak stationarity) of order 2 implies that moments up to the second
order depend only on time lag J. For a weak stationary process:

E[Yt] = µ
and Cov[Yt, Yt+J] = γ(J)

No requirements are placed on the moments of higher order. The definition


also implies that the mean and variance remain constant and finite. Thus, a
random process Yt with finite first and second order moments is called a
weak stationary process, if the means are constant and the covariance
depends only on the time lag.

In the subsequent discussion in this unit, we shall assume weak stationarity


as many properties of a stationary process depend only on the first and
second order moments. One important class of processes is the normal
process, where joint distributions of Yt1, Yt2, ..., Ytk are multivariate normal
for all t1, t1, ..., tk. This multivariate normal distribution is completely
characterised by its first and second moments, i.e., µt and γ(t1, t2).


15.3 AUTOCOVARIANCE AND AUTOCORRELATION
In this section, we discuss the autocovariance and autocorrelation function
for a stationary process. Of particular interest in the analysis of a time series
are the covariance and correlation between Yt1 and Yt2. Since these are the
covariance and correlation within the same time series, they are called
autocovariance and autocorrelation. Some important properties of time
series can be studied with the help of autocovariance and autocorrelation.
They measure the linear relationship between observations at different time
lags. They provide useful descriptive properties of the time series being
studied and are important tools for guessing a suitable model for the time
series data. Let us define these parameters.
15.3.1 Autocovariance and Autocorrelation Coefficients
Suppose a weak stationary time series process is denoted by Y1, Y2, …, Yt,
Yt+1, …, Yt+k, …. We are interested in finding the linear relationship
between two consecutive observations, Yt, Yt+1. We are also interested in
the relationship between observations that are apart by a time lag k, e.g., Yt
and Yt+k . We shall study the linear relationship by studying covariances and
correlations of observations at different time lags.

In the course MST-002, you have learnt how to calculate the covariance and
correlation between two variables for given N pairs of observations on two
variables X and Y, say {(x1, y1), (x2, y2), …, (xN, yN)}. Recall that the
formulas for computation of covariance and correlation coefficient are given
as:
Cov(X, Y) = E[(X − µX)(Y − µY)]
ρ(X, Y) = E[(X − µX)(Y − µY)] / √[E(X − µX)² E(Y − µY)²] = Cov(X, Y) / (σX σY)
Here we apply analogous formulas to the stationary time series data to
measure whether successive observations are correlated.
The autocovariance between Yt and Yt+k, separated by the time interval k,
for a stationary process must be the same for all t and is defined as:
γk = Cov(Yt, Yt+k) = E[(Yt − µ)(Yt+k − µ)] … (1)
Similarly, the autocorrelation at lag k is
ρk = E[(Yt − µ)(Yt+k − µ)] / √[E(Yt − µ)² E(Yt+k − µ)²] = Cov(Yt, Yt+k) / σ²Y … (2)
From equation (1), we note that
γ0 = σ²Y … (3)
Therefore,
ρk = γk / γ0 and ρ0 = 1 … (4)
15.3.2 Estimation of Autocovariance and Autocorrelation
Coefficients
So far, we have defined the autocovariance and autocorrelation coefficients
for a random process. You would now like to estimate them for a finite time
series for which N observations y1, y2, ..., yN are available. We shall denote
a realisation of the random process Y1, Y2, …, YN by small letters y1, y2, ...,
yN. The mean µ can be estimated by
ȳ = (y1 + y2 + … + yN) / N … (5)

and the autocovariance γk at lag k can be estimated by the autocovariance
coefficient ck as follows:
ck = [(y1 − ȳ)(y1+k − ȳ) + … + (yN−k − ȳ)(yN − ȳ)] / (N − k), for all k … (6)
The sample variance is a special case of the autocovariance when k = 0, i.e.,
c0 = [(y1 − ȳ)² + … + (yN − ȳ)²] / N = σ̂²Y

The autocorrelation coefficients (rk) are usually calculated from the series
of autocovariance coefficients (ck) as follows:
rk = ck / c0 … (7)
In practice, at least 50 observations are required for the estimation of
correlations. It is also advisable that for calculations of rk, the lag k should
not exceed N/4.
Let us explain these concepts with the help of an example.
Example 1: A series of 10 consecutive yields from a batch chemical process
are given as follows:
47, 64, 23, 71, 38, 64, 55, 41, 59, 48
Calculate the mean, autocovariance c1 and autocorrelation coefficient r1 for
the given time series.
Solution: We first construct the following table:

S. No. (t)   Y   Y²   (Yt − Ȳ)   (Yt − Ȳ)(Yt+1 − Ȳ)
1 47 2209 −4
2 64 4096 13 −52
3 23 529 −28 −364
4 71 5041 20 −560
5 38 1444 −13 −260
6 64 4096 13 −169
7 55 3025 4 52
8 41 1681 −10 −40
9 59 3481 8 −80
10 48 2304 −3 −24
Total   Σyi = 510   Σyi² = 27906   Σ(Yt − Ȳ)(Yt+1 − Ȳ) = −1497

From equation (5), we get
ȳ = Σyi / N = 510 / 10 = 51.0
From equation (6), for k = 0, the autocovariance coefficient is
c0 = Σ(yi − ȳ)² / N = (Σyi² − Nȳ²) / N = (27906 − 26010) / 10 = 189.6
For k = 1,
c1 = [sum of (yt − ȳ)(yt+1 − ȳ) over t = 1 to 9] / 9 = −1497 / 9 = −166.33
From equation (7),
r1 = c1 / c0 = −166.33 / 189.6 = −0.88
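Equations (5)-(7) translate directly into code. The sketch below (Python assumed; the function names are ours) reproduces the results of Example 1, using the divisor N − k of equation (6).

# A minimal sketch of equations (5)-(7): sample mean, autocovariance
# coefficients c_k and autocorrelations r_k = c_k / c_0.
def autocovariance(y, k):
    n = len(y)
    m = sum(y) / n                                  # equation (5)
    if k == 0:
        return sum((v - m) ** 2 for v in y) / n     # c_0
    return sum((y[t] - m) * (y[t + k] - m)
               for t in range(n - k)) / (n - k)     # equation (6)

def autocorrelation(y, k):
    return autocovariance(y, k) / autocovariance(y, 0)   # equation (7)

yields = [47, 64, 23, 71, 38, 64, 55, 41, 59, 48]
print(round(autocovariance(yields, 0), 2))    # 189.6
print(round(autocovariance(yields, 1), 2))    # -166.33
print(round(autocorrelation(yields, 1), 2))   # -0.88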
You may now like to solve a problem to assess your understanding.
E1) Ten successive observations on a stationary time series are as
follows:
1.6, 0.8, 1.2, 0.5, 0.6, 1.5, 0.8, 1.2, 0.5, 1.3.
Plot the observations and calculate r1.
E2) Fifteen successive observations on a stationary time series are as
follows:
34, 24, 23, 31, 38, 34, 35, 31, 29,
28, 25, 27, 32, 33, 30.
Plot the observations and calculate r1.

15.4 CORRELOGRAM OF STATIONARY PROCESSES
A useful plot for interpreting a set of autocorrelation coefficients is called
a correlogram, in which the sample autocorrelation coefficients rk are plotted
against the lag k, for k = 1, 2, 3, …. It helps us in examining the nature of
the time series and is a very important diagnostic tool for the selection of a
suitable model for the process which generates the data. The correlogram is
alternatively known as the sample autocorrelation function (acf).
The largest lag plotted is usually much less than N. For example, for a time
series of length N = 200, Fig. 15.1a shows the plot of the time series and
Fig. 15.1b shows its correlogram for lags up to order 17.
The relatively smooth nature of the time series plot indicates that
observations which are close to each other (at smaller lags) are positively
correlated. The correlogram suggests that observations with smaller lag are
positively correlated and autocorrelation decreases as lag k increases.

In most time series, it is noticed that the absolute value of rk, i.e., |rk|,
decreases as k increases. This is because observations which are located far
away are not related to each other, whereas observations lying closer to each
other may be positively (or negatively) correlated.

Fig. 15.1: a) Plot of a time series for N = 200; b) correlogram for lag k = 1, 2, .., 17.

15.4.1 Interpretation of Correlogram


Sometimes it is possible to recognise the nature of a time series from its
correlogram, though it is not always easy. We shall describe certain types of
time series processes and the nature of their correlograms. If the correlogram
of a time series is one of these types, it is possible to get a good idea of the
process which gave rise to that time series.
Random Series
In case all observations are completely random, i.e., they contain only
independent observations, the series is called a random series.
This means that rk ≈ 0 for all non-zero k. The correlogram of such a random
series will oscillate around the axis (zero line). In fact, for such a series,
for large N, the values of rk approximately follow the N(0,1/N) distribution.
Thus, in about 95% cases, the values of rk lie in a range of ±2/√N. If the
correlogram shows such a behaviour, it is a good indication that the time
series is random. However, this behaviour may not always confirm that the
time series is random and it may need further examination.
Short-Term Correlation
Stationary time series usually has a few large autocorrelations in absolute
value for small lag k. They tend to zero very rapidly with increase in lag k
(see Fig. 15.1b). When the first few autocorrelations are positive, the time
series is smooth in nature, i.e., if an observation is above mean, it is likely to
be followed by an observation above mean and if an observation is below
mean, it is likely to be followed by an observation below mean. This gives
61

Created in Master PDF Editor - Demo Version


Created in Master PDF Editor - Demo Version

an indication of a stationary time series with most of the non-zero
autocorrelations being either positive or negative.
Alternating Series
If a time series behaves in a very rough and zig-zag manner, alternating
between values above and below mean, it is indicated by a negative r1 and
positive r2. An alternating time series with its correlogram is shown in
Fig.15.2.

Fig. 15.2: a) Plot of alternating time series; b) correlogram for an alternating series
with lag up to 15.
Non-Stationary Time Series
If a time series contains trend, it is said to be non-stationary. Such a series is
usually very smooth in nature and its autocorrelations go to zero very slowly
as the observations are dominated by trend. Due to the presence of trend, the
autocorrelations move towards zero very slowly (see Fig. 15.3). One should
remove trend from such a time series before doing any further analysis.

Fig. 15.3: a) Plot of a non-stationary time series; b) correlogram of a non-stationary series.


Seasonal Time Series

If a time series has a dominant seasonal pattern, the time plot will show a
cyclical behaviour with a periodicity of the seasonal effect. If data have
been recorded on monthly basis and the seasonal effect is of twelve months,
i.e., s = 12, we would expect a highly negative autocorrelation at lag 6 (r6)
and highly positive correlation at lag 12 (r12). In case of quarterly data, we
expect to find a large negative r2 and large positive r4. This behaviour will
be repeated at r6, r8 and so on. This pattern of cyclical behaviour of
correlogram will be similar to the time plot of the data.


Fig. 15.4: Time plot of the average rainfall at a certain place, in successive months
from 2010 to 2012.
Therefore, in this case the correlogram may not contain any more
information than what is given by the time plot of the time series.

Fig. 15.5: a) Smoothed plot of the average rainfall at a certain place, in successive
months from 2010 to 2012; b) correlogram of monthly observations of
seasonal time series.
Fig. 15.5a shows a time plot of monthly rainfall and Fig. 15.5b shows the
correlogram. Both show a cyclical pattern and the presence of a strong 12
monthly seasonal effect. However, it is doubtful that in such cases the
correlogram gives any more information about the presence of seasonal
effect as compared to the time plot shown in Fig 15.4.
In general, the interpretation of correlogram is not easy and requires a lot of
experience and insight. Estimated autocorrelations (rk) are subject to
sampling fluctuations and if N is small, their variances are large. We shall
discuss this in more detail when we consider a particular process. When all
the population autocorrelations ρk (k ≠ 0) are zero, as happens in a random
series, then the values of rk are approximately distributed as N(0,1/N). This
is a very good guide for testing whether the population correlations are all
zeros or not, i.e., the process is completely random or not.
Example 2: For the time series given in Example 1, calculate r1, r2, r3, r4 and
r5 and plot a correlogram.
Solution: From Example 1 and its results we have the following:
ȳ = 51.0, c0 = 189.6, c1 = −166.33 and r1 = −0.88
Now we form the table for the calculations as follows:
S. No. | Y | Y² | (Yt − Ȳ) | (Yt − Ȳ)(Yt+2 − Ȳ) | (Yt − Ȳ)(Yt+3 − Ȳ) | (Yt − Ȳ)(Yt+4 − Ȳ) | (Yt − Ȳ)(Yt+5 − Ȳ)
1 47 2209 −4
2 64 4096 13
3 23 529 −28 112

4 71 5041 20 260 −80


5 38 1444 −13 364 −169 52
6 64 4096 13 260 −364 169 −52

7 55 3025 4 −52 80 −112 52


8 41 1681 −10 −130 130 −200 280
9 59 3481 8 32 104 −104 160

10 48 2304 −3 30 −12 −39 39


Total 510 27906 876 −311 −234 479

We now calculate the autocorrelation coefficients r2, r3, r4 and r5 as follows:
For k = 2, we get
c2 = [sum of (yt − ȳ)(yt+2 − ȳ) over t = 1 to 8] / 8 = 876 / 8 = 109.5
r2 = c2 / c0 = 109.5 / 189.6 = 0.58
For k = 3, we get
c3 = [sum of (yt − ȳ)(yt+3 − ȳ) over t = 1 to 7] / 7 = −311 / 7 = −44.43
r3 = c3 / c0 = −44.43 / 189.6 = −0.2343
For k = 4, we get
c4 = [sum of (yt − ȳ)(yt+4 − ȳ) over t = 1 to 6] / 6 = −234 / 6 = −39
r4 = c4 / c0 = −39 / 189.6 = −0.2057
For k = 5, we get
c5 = [sum of (yt − ȳ)(yt+5 − ȳ) over t = 1 to 5] / 5 = 479 / 5 = 95.8
r5 = c5 / c0 = 95.8 / 189.6 = 0.5052
c0
Thus, we have obtained the autocorrelation coefficients as r1 = −0.88,
r2 = 0.58, r3 = −0.2343, r4 = −0.2057 and r5 = 0.5052.

Now we plot the correlogram for the given time series by plotting the values
of the autocorrelation coefficients versus the lag k for k = 1, 2, …, 5. The
correlogram is shown in Fig. 15.6.

Fig. 15.6: Correlogram for the given time series.
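Such a correlogram is easy to draw with standard plotting tools. The sketch below (assuming Python with the matplotlib library available) plots the five coefficients just computed together with the approximate 95% bounds ±2/√N used in Sec. 15.4.1 to judge randomness.

# A minimal sketch of a correlogram plot with +/- 2/sqrt(N) bounds.
import math
import matplotlib.pyplot as plt

r = [-0.88, 0.58, -0.2343, -0.2057, 0.5052]   # r_1 ... r_5 from Example 2
N = 10                                        # length of the series

plt.stem(range(1, len(r) + 1), r)
bound = 2 / math.sqrt(N)
plt.axhline(0, color="black", linewidth=0.8)
plt.axhline(bound, linestyle="--")            # approximate 95% limits
plt.axhline(-bound, linestyle="--")
plt.xlabel("Lag k")
plt.ylabel("r_k")
plt.title("Correlogram")
plt.show()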

Example 3: A computer generates a series of 200 observations that are


supposed to be random. The first 10 sample autocorrelation coefficients of
the series are:
r1 = 0.02, r2 = 0.05, r3 = –0.09, r4 = 0.08, r5 = –0.02, r6 = –0.07,
r7 = 0.12, r8 = 0.06, r9= 0.02, r10 = –0.08
Plot the correlogram.
Solution: The correlogram for the given values of autocorrelation
coefficients is shown in Fig. 15.7.

Fig. 15.7: Correlogram for 10 sample autocorrelation coefficients of the series of 200
observations.
Example 4: A random walk (St, t = 0, 1, 2, …) starting at zero is obtained
by cumulative sum of independently and identically distributed (i.i.d)
random variables. Check whether the series is stationary or non-stationary.
Solution: Since we have a random walk (St, t = 0, 1, 2, …) starting at zero
obtained from cumulative sum of independently and identically distributed
(i.i.d) random variables, a random walk with zero mean is obtained by
defining S0 = 0 and

St = Y1 + Y2 +….+Yt, for t = 1, 2, …
where {Yt} is i.i.d. noise with mean zero and variance σ². Then we have
E(St) = 0, E(St²) = tσ² < ∞ for all t
Cov(St, St+h) = Cov(St, St + Yt+1 + … + Yt+h) = Cov(St, St) = tσ²
This depends upon t, and hence the series St is non-stationary.
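A small simulation confirms this conclusion: the sample variance of St grows roughly linearly with t. The sketch below (Python assumed) estimates Var(St) at two time points over many simulated walks.

# A minimal sketch illustrating that Var(S_t) = t * sigma^2 for a
# random walk built from i.i.d. N(0, 1) noise.
import random

def random_walk(t_max):
    s, path = 0.0, []
    for _ in range(t_max):
        s += random.gauss(0, 1)     # i.i.d. noise, sigma = 1
        path.append(s)
    return path

walks = [random_walk(100) for _ in range(2000)]
for t in (10, 100):
    values = [w[t - 1] for w in walks]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    print(f"t = {t:3d}: sample Var(S_t) = {var:.1f} (theory: {t})")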
You may now like to solve the following exercises to assess your
understanding about correlogram and stationary processes.

E3) Calculate r2, r3, r4 and r5 for the time series given in Exercise 1 and plot
a correlogram.
E4) Calculate r2, r3, r4 and r5 for the time series given in Exercise 2 and plot
a correlogram.
E5) A computer generates a series of 500 observations that are supposed to
be random. The first 10 sample autocorrelation coefficients of the
series are:
r1 = 0.09, r2 = –0.08, r3 = 0.07, r4 = –0.06, r5 = –0.05, r6 = 0.04,
r7 = –0.3, r8 = 0.02, r9= –0.02, r10 = –0.01
Plot the correlogram.

Let us now summarise the concepts that we have discussed in this unit.

15.5 SUMMARY
1. A time series is said to be stationary if there is no systematic change in
mean, variance and covariance of the observations over a period of time.
If a time series is stationary, we can model it and draw further inferences
and make forecasts. If a time series is not stationary, we cannot do any
further analysis and make reliable forecasts. If a time series shows a
particular type of non-stationarity and some simple transformations can
make it stationary, then we can model it.

2. A random process is a statistical phenomenon that evolves in time according to some laws of probability. Mathematically, a random process is defined as a family of random variables ordered in time, i.e., a random process {Y(t); t ∈ T} is a collection of random variables defined on the same sample space, where T is an index set.
3. A random process is said to be stationary if the joint distribution of
Yt1, Yt2, Yt3,…, Ytk is the same as the joint distribution of Yt1+J, Yt2+J,
Yt3+J, ..., Ytk+J, for all t1, t2, ..., tk and J. In other words, shifting the origin
of time by an amount J has no effect on the joint distribution. This
means that it depends only on the interval between t1, t2, …, tk.
4. If all the finite dimensional distributions of a random process are
invariant under the translation of time, it is called a Strict Sense
Stationary process or SSS process. In other words, if the joint
distribution of Yt1, Yt2, …, Ytk is the same as the joint distribution of
Yt1+J, Yt2+J,..., Ytk+J, for all t1, t2, ..., tk and J (> 0) for all k ≥ 1, the random
process Yt is called a Strict Sense Stationary process.
5. A stationary process is said to have weak stationarity of order m, if
the moments up to order m depend only on the time lag J. If m = 2, the
stationarity (or weak stationarity) of order 2 implies that moments up to
the second order depend only on the time lag J.
6. Of particular interest in the analysis of a time series are the covariance and
correlation between Yt1 and Yt2. Since these are the covariance and
correlation within the same time series, they are called autocovariance and
autocorrelation. Some important properties of time series can be studied
with the help of autocovariance and autocorrelation. They measure the
linear relationship between observations at different time lags. They
provide useful descriptive properties of the time series being studied and
are important tools for guessing a suitable model for the time series data.
7. A useful plot for interpreting a set of autocorrelation coefficients is called a correlogram, in which the sample autocorrelation coefficients rk are plotted versus the lag k for k = 1, 2, 3, …. This helps us in examining the
nature of time series. It is also a very important diagnostic tool for selection
of a suitable model for the process which generates the data. The
correlogram is also known as the sample autocorrelation function (acf).
8. In most time series, it is noticed that the absolute value of rk, i.e., | rk|
decreases as k increases. This is because observations which are located
far away are not related to each other, whereas observations lying closer
to each other may be positively (or negatively) correlated.
UNIT 4 TIME SERIES MODELS


Structure
16.1 Introduction
Objectives
16.2 Linear Stationary Processes
Moving Average (MA) Process
Autoregressive (AR) Process
Fitting an Autoregressive Process
Determining the Order of an Autoregressive Model
Partial Autocorrelation Function (pacf)
16.3 Autoregressive Moving Average (ARMA) Models
16.4 Autoregressive Integrated Moving Average (ARIMA) Models
16.5 Summary
16.6 Solutions /Answers

16.1 INTRODUCTION
In Unit 15, you have learnt that there are two types of stationary processes:
strict stationary and weak stationary processes. You have also learnt how to
determine the values of autocovariance and autocorrelation coefficients, and
to plot a correlogram for a stationary process. In this unit, we discuss
various time series models.
In Sec. 16.2 of this unit, we introduce an important class of linear stationary
processes, known as Moving Average (MA) and Autoregressive (AR)
processes and describe their key properties. We discuss Autoregressive
Moving Average (ARMA) models in Sec. 16.3. We also discuss their
properties in the form of autocorrelations and the fitting of suitable models
to the given data. We discuss how to deal with models with trend by
considering integrated models, called the Autoregressive Integrated Moving
Average (ARIMA) models in Sec. 16.4.
Objectives
After studying this unit, you should be able to:
 describe a linear stationary process;
 explain autoregressive and moving average processes;
 fit autoregressive moving average models;
 describe and use the ARIMA models; and
 explore the properties of AR, MA, ARMA and ARIMA models.

16.2 LINEAR STATIONARY PROCESSES


In Unit 15, we have considered discrete time stationary processes and their properties. Consider sequences of random variables {Yi} that are mutually independent and identically distributed. If a discrete stationary process consists of such sequences of i.i.d. variables, it is called a purely random process. Sometimes it is also called white noise.
Recall that the random variables are normally distributed with mean zero and variance σ². Similarly, a purely random process has constant mean and variance, i.e.,

γk = Cov(Xt, Xt+k) = σY² for k = 0, and γk = 0 for k = ±1, ±2, ±3, …    … (1)

In this section, we consider some particular cases of a linear process. Let Yt be a stochastic process with mean µ. We can express it as a weighted sum of previous random noises (shocks). Thus, we have
Yt = µ + at + ψ1 at−1 + ψ2 at−2 + …    … (2)
Here at (t = 0, 1, 2, …) represent white noises with mean zero and variance σa², and ψi (i = 1, 2, …) represent weights. For the linear process to be stationary, the weights are required to satisfy
Σi ψi² < ∞    … (3)
Then, the autocovariance of the noises is given by
Cov(at, at+k) = 0 for k ≠ 0    … (4)
For simplicity, we denote the mean-corrected process by Xt:
Xt = Yt − µ    … (5)
Therefore, the process Xt has mean zero and we can write it as:
Xt = Yt − µ = at + ψ1 at−1 + ψ2 at−2 + …    … (6)
Under the above-mentioned conditions on the weights ψi, the model can also be expressed in autoregressive form, as a weighted sum of its own past values plus the current noise:
Xt = at + π1 Xt−1 + π2 Xt−2 + …    … (7)
Let us now consider two particular cases of the linear stationary processes.

16.2.1 Moving Average (MA) Process


The moving average processes have often been used in econometrics. For example, economic indicators are affected by many random events such as government decisions, strikes and shortages of raw materials. These events have immediate effects as well as effects of lower magnitude in subsequent periods. Such situations have been successfully modelled by moving average processes.
Suppose we write the linear process as
Xt = β0 at + β1 at−1 + … + βq at−q    … (8)
where βi (i = 0, 1, 2, …, q) are constants. This process is known as the moving average process of order q and is abbreviated as the MA(q) process. The white noises at are scaled so that β0 = 1. The mean and variance of Xt are given by
E(Xt) = 0 and V(Xt) = σa² (1 + β1² + … + βq²)    … (9)
and the autocovariance is given as
γk = Cov(Xt, Xt+k)    … (10)
γk = Cov(at + β1 at−1 + … + βq at−q, at+k + β1 at+k−1 + … + βq at+k−q)
   = 0 for k > q
   = σa² (βk + β1 βk+1 + … + βq−k βq) for k = 1, 2, …, q    … (11)
The autocorrelation function (acf) of the MA(q) process is given by

ρk = (βk + β1 βk+1 + … + βq−k βq)/(1 + β1² + … + βq²),  k = 1, 2, …, q    … (12)
Note that the autocorrelation function (acf) becomes zero if the lag k is greater than the order q of the process. This is a very important feature of moving average (MA) processes.
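This cut-off property is easy to see in simulation. The sketch below (an illustration with coefficients of our choosing, not from the text) generates a long MA(2) series and computes sample autocorrelations: lags 1 and 2 are clearly non-zero, while lags beyond q = 2 stay near zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# MA(2): X_t = a_t + b1*a_{t-1} + b2*a_{t-2}, illustrative coefficients.
b1, b2, n = 0.7, -0.3, 50_000
a = rng.normal(size=n + 2)
x = a[2:] + b1 * a[1:-1] + b2 * a[:-2]

def acf(x, k):
    d = x - x.mean()
    return (d[:-k] * d[k:]).sum() / (d ** 2).sum()

for k in range(1, 6):
    print(f"lag {k}: r = {acf(x, k):+.3f}")
# Theory: rho1 = b1(1+b2)/(1+b1^2+b2^2) ~ +0.31, rho2 = b2/(...) ~ -0.19,
# and rho_k = 0 for every k > 2.
```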
First and Second Order Moving Average (MA) Processes
For the first order moving average {MA(1)} process, we have
Xt = at + β1 at−1    … (13)
The mean and variance are obtained for q = 1 as
E(Xt) = 0, V(Xt) = σa² (1 + β1²)    … (14)
and the autocorrelation coefficient is obtained for q = 1 as
ρ1 = β1/(1 + β1²)    … (15)
Similarly, for the second order moving average {MA(2)} process, we have
Xt = at + β1 at−1 + β2 at−2    … (16)
For q = 2, the mean and variance are given as
E(Xt) = 0, V(Xt) = σa² (1 + β1² + β2²)    … (17)
The autocorrelation coefficients are given as
ρ1 = β1 (1 + β2)/(1 + β1² + β2²), ρ2 = β2/(1 + β1² + β2²)    … (18)
There is no requirement on the constants β1 and β2 for stationarity. However, for a unique representation of the model, the parameters should satisfy the condition of invertibility, which holds when the roots of
θ(B) = 1 + β1 B + β2 B² + … + βq B^q = 0    … (19a)
lie outside the unit circle, i.e., |B| > 1.
For the MA(1) process, we have
θ(B) = 1 + β1 B = 0  ⟹  B = −1/β1    … (19b)
Therefore, |B| > 1 implies that |β1| < 1. Hence, for invertibility
|β1| < 1    … (20)
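In practice the invertibility check amounts to finding the roots of θ(B) numerically. A small sketch (ours; note that np.roots expects coefficients ordered from the highest power of B downwards):

```python
import numpy as np

def is_invertible(betas):
    """True if all roots of theta(B) = 1 + b1*B + ... + bq*B^q
    lie outside the unit circle."""
    coeffs = [1.0] + list(betas)          # constant term first ...
    roots = np.roots(coeffs[::-1])        # ... so reverse for np.roots
    return bool(np.all(np.abs(roots) > 1.0))

print(is_invertible([0.5]))           # True:  root B = -2
print(is_invertible([1.2]))           # False: root B = -1/1.2 lies inside
print(is_invertible([0.74, -0.19]))   # True:  the MA(2) of Exercise E1 below
```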
Let us consider an example of the moving average process.
Example 1: Consider a time series consisting of 60 consecutive daily overshoots from an underground gasoline tank at a filling station. The sample mean and the estimate of σa², together with some sample autocorrelations, are given as:
Sample mean = 4.0; σ̂a² = 4515.46
r1 = −0.5, r2 = 0.1252, r3 = −0.2251, r4 = 0.012, r5 = 0.0053
Check whether a moving average MA(1) process can be fitted to the data and obtain preliminary estimates of the parameters.
Solution: We are given the sample mean of the 60 observations as 4.0 and the estimate of σa² as σ̂a² = 4515.46.
The MA(1) model is written as
Yt = µ + at + β1 at−1
Xt = Yt − µ = at + β1 at−1
If the process were purely random, all the autocorrelations rk should lie in the range ±2/√N. In this case, ±2/√N = ±2/√60 = ±0.258.
Here we see that, of the given autocorrelations, only r1 lies outside the range ±0.258. This suggests that a moving average MA(1) model could be suitable, since only ρ1 is significantly different from zero while rk for k > 1 lie within ±0.258.
Equating r1 to ρ1 given by equation (15) and using the method of moments, we get
r1 = β1/(1 + β1²) = −0.5
On simplifying, this gives β1² + 2β1 + 1 = (β1 + 1)² = 0, so that
β̂1 = −1
which lies on the boundary of the invertibility condition |β1| < 1.
Hence, the fitted MA(1) model becomes Xt = at − at−1.
Thus, Yt = 4.0 + at − at−1
where at is white noise with estimated variance 4515.46.
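The moment equation r1 = β1/(1 + β1²) is a quadratic in β1, so the estimate can be read off from its roots. A small sketch (our helper, not from the text): for |r1| > 0.5 the roots are complex and no real MA(1) fits, while |r1| = 0.5 gives the repeated boundary root seen above.

```python
import numpy as np

def ma1_moment_estimate(r1):
    """Solve r1*b^2 - b + r1 = 0 and keep the root of smaller modulus,
    which is the invertible choice when |r1| < 0.5."""
    roots = np.roots([r1, -1.0, r1])
    return roots[np.argmin(np.abs(roots))]

print(ma1_moment_estimate(-0.5))   # -1.0 (repeated root, boundary case)
print(ma1_moment_estimate(-0.4))   # -0.5
```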


You may now like to solve the following exercise to check your
understanding about MA processes.

E1) Show that the autocorrelation function of the MA(2) process
Xt = at + 0.74 at−1 − 0.19 at−2
is given by
ρk = 0.3675 for k = 1, −0.1289 for k = 2, and 0 otherwise.

In Sec. 16.2.1, we have considered estimation of the parameters β1, β2, … by the method of moments, i.e., by equating autocorrelations to their expected values. This is not a very efficient method of estimating the parameters. For moving average processes, the maximum likelihood method is usually used, which gives more efficient estimates when N is large. We do not discuss it here as it is beyond the scope of this course.

16.2.2 Autoregressive (AR) Process

A stationary process Yt is said to be an autoregressive process of order p, abbreviated as AR(p), if
Yt − µ = α1 (Yt−1 − µ) + α2 (Yt−2 − µ) + … + αp (Yt−p − µ) + at    … (21)
which is written as
Xt = α1 Xt−1 + α2 Xt−2 + … + αp Xt−p + at    … (22)
where Xt = Yt − µ and at is white noise. It is similar to a multiple regression model, in which we regress Xt on its own past values; that is why it is called an autoregressive process.
A linear stationary process can always be expressed as an autoregressive process of suitable order. Unlike the moving average (MA) process, which puts no restrictions on the parameters for stationarity, the autoregressive (AR) process requires certain restrictions on the parameters αi for stationarity. An autoregressive (AR) process can also be written as
(1 − α1 B − α2 B² − … − αp B^p) Xt = at    … (23)
or Φ(B) Xt = at
where B is the backward shift operator, defined as
B Xt = Xt−1, B² Xt = Xt−2, …, B^p Xt = Xt−p    … (24)
For an AR(p) process to be stationary, the roots of
Φ(B) = 1 − α1 B − α2 B² − … − αp B^p = 0    … (25)
must lie outside the unit circle.


First Order Autoregressive {AR(1)} (Markov) Process
Suppose we write the linear model as
Xt = α1 Xt−1 + at    … (26)
By repeatedly using this equation, you can see that Xt can be expressed as a weighted sum of an infinite number of past noises, i.e.,
Xt = at + α1 at−1 + α1² at−2 + …    … (27)
The autocorrelations ρk are obtained by multiplying equation (26) by Xt−k and taking expectations of the result. Then we get
ρk = α1 ρk−1 = … = α1^k    … (28)
Thus, ρ0 = 1 and ρ1 = α1.
From equation (26), we also have
σx² = α1² σx² + σa²    … (29)
which gives
σx² = σa²/(1 − α1²)    … (30)
and σx² is positive and finite if |α1| < 1. Thus, for stationarity
|α1| < 1    … (31)
When α1 is positive and large, the time plot becomes smooth and shows a slowly changing trend. When α1 is large and negative, the time plot shows a very rapid zig-zag movement because of the negative autocorrelations: if one value is above the mean, the next value is very likely to be below the mean, and so on.
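These properties are easy to confirm by simulation. The sketch below (ours, with an arbitrary α1 = 0.8) generates a long AR(1) series and compares the sample acf with the theoretical ρk = α1^k:

```python
import numpy as np

rng = np.random.default_rng(7)

# AR(1): X_t = alpha*X_{t-1} + a_t with alpha = 0.8 (stationary).
alpha, n = 0.8, 100_000
a = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = alpha * x[t - 1] + a[t]

d = x - x.mean()
for k in range(1, 5):
    r_k = (d[:-k] * d[k:]).sum() / (d ** 2).sum()
    print(f"k = {k}: sample {r_k:+.3f}   theory {alpha ** k:+.3f}")
```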

Second Order Autoregressive {AR(2)} Process
This process is obtained by taking p = 2 and the model is
Xt = α1 Xt−1 + α2 Xt−2 + at    … (32)
For stationarity, the following restrictions are placed on the coefficients:
α2 + α1 < 1; α2 − α1 < 1; and −1 < α2 < 1    … (33)
For the autoregressive AR(2) model, the first two autocorrelations ρ1 and ρ2 are obtained as follows: on multiplying equation (32) by Xt−1 and Xt−2, taking expectations and dividing the results by σx², we get
ρ1 = α1 + α2 ρ1    … (34a)
ρ2 = α1 ρ1 + α2    … (34b)
On solving these equations for α1 and α2, we obtain
α1 = ρ1 (1 − ρ2)/(1 − ρ1²), α2 = (ρ2 − ρ1²)/(1 − ρ1²)    … (35)
Similarly, ρ1 and ρ2 can be expressed in terms of α1 and α2 as
ρ1 = α1/(1 − α2), ρ2 = α2 + α1²/(1 − α2)    … (36a)
σx² = σa²/(1 − α1 ρ1 − α2 ρ2)    … (36b)
Multiplying equation (32) by Xt−k, taking expectations and dividing by σx² gives the autocorrelation function of the AR(2) process as
ρk = α1 ρk−1 + α2 ρk−2, k > 0    … (37)
We can obtain ρk for different values of k by using equation (37) for k = 1, 2, ….
Let us consider an example of an AR(2) process.
Example 3: Consider the autoregressive AR(2) model
Xt = 0.80 Xt−1 − 0.60 Xt−2 + at
Verify whether the series is stationary.
(i) Obtain ρk for k = 1, 2, …, 5, and (ii) plot the correlogram.
Solution: We have the autoregressive AR(2) model
Xt = 0.80 Xt−1 − 0.60 Xt−2 + at    … (i)
with α1 = 0.80 and α2 = −0.60. The stationarity conditions (33) are satisfied, since α2 + α1 = 0.20 < 1, α2 − α1 = −1.40 < 1 and −1 < α2 = −0.60 < 1; hence the series is stationary.
Now, from equation (36a), the autocorrelations ρ1 and ρ2 are given as
ρ1 = α1/(1 − α2), ρ2 = α2 + α1²/(1 − α2)
so that ρ0 = 1, ρ1 = 0.80/1.60 = 0.50 and ρ2 = −0.60 + (0.80)²/1.60 = −0.20
We obtain the values of the autocorrelations ρ3, ρ4 and ρ5 using equation (37) and get

ρ3 = α1 ρ2 + α2 ρ1 = (0.80)(−0.20) + (−0.60)(0.50) = −0.46
ρ4 = (0.80)(−0.46) + (−0.60)(−0.20) = −0.25
ρ5 = (0.80)(−0.25) + (−0.60)(−0.46) = 0.076
The correlogram of the given AR process is shown in Fig. 16.1.

Fig 16.1: Correlogram of the model.
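The whole correlogram of Example 3 follows from the recursion (37) once ρ0 and ρ1 are known; a few lines of Python (ours) reproduce the values above:

```python
# Example 3: X_t = 0.8 X_{t-1} - 0.6 X_{t-2} + a_t
a1, a2 = 0.80, -0.60

rho = [1.0, a1 / (1.0 - a2)]                       # rho_0, rho_1 from (36a)
for k in range(2, 6):
    rho.append(a1 * rho[k - 1] + a2 * rho[k - 2])  # recursion (37)

print([round(r, 3) for r in rho])
# [1.0, 0.5, -0.2, -0.46, -0.248, 0.078]  (the text rounds to -0.25, 0.076)
```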


You may now like to solve the following exercises to check your understanding of AR processes.
E2) Consider an AR(2) process given by
Xt = Xt−1 − 0.5 Xt−2 + at
Verify whether the series is stationary or not.
a) Obtain ρk for k = 1, 2, …, 5, and b) plot the correlogram.
E3) For each of the following processes, write the model using B notation and then determine whether the process is stationary or not:
a) Xt = 0.3 Xt−1 + at
b) Xt = Xt−1 − (1/2) Xt−2 + at
16.2.3 Fitting an Autoregressive Process


Suppose N observations y1, y2, …, yN on a time series are available and we wish to fit an autoregressive (AR) process of suitable order. Therefore, we need to know the order p of the autoregressive (AR) process. Suppose we know the order p. Then we have to estimate the parameters µ, α1, α2, …, αp, σx², etc. We calculate the autocorrelations from the data. Usually µ is estimated by the sample mean ȳ. Hence, subtracting ȳ from Yt, we calculate
Xt = Yt − ȳ    … (38a)
For the given rk, we have to calculate the parameters α1, α2, …, αp of the model:
Xt = α1 Xt−1 + α2 Xt−2 + … + αp Xt−p + at    … (38b)
For an autoregressive (AR) process, the least squares estimates of the parameters α1, α2, …, αp are obtained by minimising
S = Σ (Xt − α1 Xt−1 − α2 Xt−2 − … − αp Xt−p)², summed over t = 1, 2, …, N    … (39)

with respect to α1, …, αp, i.e., by differentiating S and equating the derivatives to zero. This method provides good estimates.
If Yt, t = 1, 2, …, N, is the observed series, Xt = Yt − ȳ are used in equation (39). This looks very similar to multiple regression estimation, and differentiating S with respect to α1, α2, …, αp and equating the results to zero gives a set of p equations
R α̂ = r    … (40)
where R is the matrix of autocorrelations given by

    | 1      r1     …   rp−1 |
R = | r1     1      …   rp−2 |
    | …      …      …   …    |
    | rp−1   rp−2   …   1    |    … (41)

and r = (r1, r2, …, rp)′ is the column vector of the first p sample autocorrelations. Thus, α̂ is obtained by solving the simultaneous equations (40) using the inverse of the matrix R, denoted by R⁻¹, as
α̂ = R⁻¹ r    … (42)
16.2.4 Determining the Order of an Autoregressive Model
For fitting the model, we have to estimate the order of the autoregressive model for the data at hand. For the first order autoregressive model, the autocorrelation function (acf) decays exponentially:
ρk = α1^k, with |α1| < 1.
Hence, for an autoregressive AR(1) process, exponential decay of the autocorrelation function (acf) is a good indication that the autoregressive process is of order 1. However, this is not so for correlograms of higher order processes. For second and higher order autoregressive models, the autocorrelation function (acf) can be a combination of damped exponential or cyclical functions and may be difficult to identify.
One way is to start fitting the model by taking p = 1, then p = 2, and so on. As soon as the contribution of the last fitted αp is not significant, which can be judged from the reduction in the value of the residual sum of squares, we stop and take the order as p − 1. An alternative method is to calculate what is called the partial autocorrelation function.
16.2.5 Partial Autocorrelation Function (pacf)
For an autoregressive AR(p) process, the partial autocorrelation function (pacf) at lag p is defined as the value of the last coefficient αp. We start with p = 1 and calculate the pacf. Hence, for the AR(1) process, pacf(1) is
α1 = ρ1    … (43a)
For AR(2), the pacf is given by
pacf(2) = α2 = (ρ2 − ρ1²)/(1 − ρ1²)    … (43b)
as described earlier. In this way, we can go on calculating pacf(3) as α3 and pacf(p) as αp for p = 4, 5, …. We can estimate these partial autocorrelation functions by

substituting the estimated autocorrelations rk in place of ρk and then testing their significance. When the partial autocorrelation function (pacf) is zero, its asymptotic standard error is 1/√N. Hence, we calculate the partial autocorrelation functions (pacf), increasing the order by one each time. As soon as a pacf lies within the range ±2/√N, we stop and take the order as that of the last significant pacf, i.e., the last one lying outside the range ±2/√N. Below we give the partial autocorrelation functions (pacf) up to the autoregressive AR(3) process:

pacf(1) = r1 = α̂1;   pacf(2) = (r2 − r1²)/(1 − r1²) = α̂2    … (44a)

and

             | 1    r1   r1 |
             | r1   1    r2 |
             | r2   r1   r3 |
pacf(3) = --------------------    … (44b)
             | 1    r1   r2 |
             | r1   1    r1 |
             | r2   r1   1  |

where |…| denotes the determinant of the matrix. The numerator in (44b) is the denominator matrix with its last column replaced by (r1, r2, r3)′, as given by Cramer's rule applied to the Yule-Walker equations.


Let us now calculate partial autocorrelation functions for stationary
processes.
Example 4: Find the pacf of the AR(2) process:
Xt = 0.333 Xt−1 + 0.222 Xt−2 + at
Solution: For this process, α1 = 0.333 and α2 = 0.222. We use the expressions for ρ1 and ρ2 given in equation (36a) and get
ρ1 = α1/(1 − α2) = 0.333/0.778 = 0.428
ρ2 = α2 + α1²/(1 − α2) = 0.222 + 0.111/0.778 = 0.365
Now, from equations (43a) and (43b),
pacf(1) = α1 = ρ1 = 0.428
pacf(2) = (ρ2 − ρ1²)/(1 − ρ1²) = (0.365 − 0.183)/(1 − 0.183) = 0.222
Also, pacf(k) = 0 for k ≥ 3.

Example 5: Suppose that for a time series of length N = 100, the first three autocorrelation coefficients are r1 = 0.806, r2 = 0.428, r3 = 0.070. Calculate the pacfs and estimate the order of the autoregressive model to be fitted.
Solution: Equating rk to ρk (k = 1, 2, 3), from equations (44a) and (44b), we have
pacf(1) = r1 = 0.806

pacf(2) = (r2 − r1²)/(1 − r1²) = (0.428 − 0.650)/0.350 = −0.634

             | 1      0.806  0.806 |
             | 0.806  1      0.428 |
             | 0.428  0.806  0.070 |
pacf(3) = --------------------------- = 0.077
             | 1      0.806  0.428 |
             | 0.806  1      0.806 |
             | 0.428  0.806  1     |

and the significance range is ±2/√N = ±2/10 = ±0.2.
The partial autocorrelation functions pacf(1) and pacf(2) lie outside this range, while pacf(3) lies inside it. Since the last significant pacf is pacf(2), the order of the model is 2 and the autoregressive model AR(2) is suggested for this process.
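The pacf values of Example 5 can be checked directly from equations (44a) and (44b); the sketch below (ours) evaluates the determinants with numpy. Full precision gives pacf(2) ≈ −0.633 and pacf(3) ≈ 0.079; the unit's two-decimal intermediates yield −0.634 and 0.077.

```python
import numpy as np

r1, r2, r3, N = 0.806, 0.428, 0.070, 100

pacf1 = r1
pacf2 = (r2 - r1 ** 2) / (1.0 - r1 ** 2)
num = np.array([[1, r1, r1], [r1, 1, r2], [r2, r1, r3]])
den = np.array([[1, r1, r2], [r1, 1, r1], [r2, r1, 1]])
pacf3 = np.linalg.det(num) / np.linalg.det(den)

print(round(pacf1, 3), round(pacf2, 3), round(pacf3, 3))  # 0.806 -0.633 0.079
print("significance range:", 2 / np.sqrt(N))              # 0.2
```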
You may now like to solve the following exercises to check your understanding of AR processes.
E4) For the AR(2) process
Xt = 1.0 Xt−1 − 0.5 Xt−2 + at
calculate ρ1 and ρ2. State whether the model is stationary. Also calculate pacf(1) and pacf(2).
E5) For the model
Xt = 1.5 Xt−1 − 0.6 Xt−2 + at
obtain ρ1 and ρ2. Is the process stationary?
E6) Find the autocorrelation function (acf) of the process
Xt = Xt−1 − 0.25 Xt−2 + at
and obtain ρ1 and ρ2.
E7) The following table gives the number of workers trained during 1981–2010.

(t)  1981 1982 1983 1984 1985 1986 1987 1988 1989 1990
yt   4737 5117 5091 3468 4320 3825 3673 3694 3708 3333
(t)  1991 1992 1993 1994 1995 1996 1997 1998 1999 2000
yt   3367 3614 3362 3655 3963 4405 4595 5045 5700 5716
(t)  2001 2002 2003 2004 2005 2006 2007 2008 2009 2010
yt   5138 5010 5353 6074 5031 5648 5506 4230 4827 3885

Some autocorrelations and summary statistics are given below:
r1 = 0.732, r2 = 0.661, r3 = 0.557, r4 = 0.385, r5 = 0.272, r6 = 0.119, r7 = 0.019, r8 = −0.139, r9 = −0.268, r10 = −0.375; ȳ = 4503.00 and σy = 836.74
i) Draw the time plot.
ii) Plot the correlogram.
iii) Calculate pacf(1) and pacf(2) and test their significance.
iv) Which one of the models, AR(1) or AR(2), will be more suitable for this data?
v) Fit the suitable model.

16.3 AUTOREGRESSIVE MOVING AVERAGE (ARMA) MODELS
A finite order moving average process can be written as an infinite order autoregressive process. Similarly, a finite order autoregressive process can be written as an infinite order moving average process. We would like to fit a model which has the smallest number of parameters. This property is called parsimony (being most economical). Hence, a combination of the autoregressive (AR) and moving average (MA) models may turn out to be the most parsimonious. We represent a combination of the AR(p) and MA(q) models as ARMA(p, q) and write
Xt = α1 Xt−1 + α2 Xt−2 + … + αp Xt−p + at + β1 at−1 + β2 at−2 + … + βq at−q    … (45)
Using the backward shift operator B, we can write equation (45) as
Φ(B) Xt = θ(B) at    … (46)
where
Φ(B) = 1 − α1 B − α2 B² − … − αp B^p  (AR part)    … (47a)
θ(B) = 1 + β1 B + β2 B² + … + βq B^q  (MA part)    … (47b)
The conditions of stationarity and invertibility are the same as for the autoregressive (AR) and moving average (MA) processes, respectively, i.e., the roots of
Φ(B) = 0 and θ(B) = 0    … (47c)
must lie outside the unit circle, so the modulus of every root B must be greater than one.
An ARMA(1, 1) model can be written as
Xt = α Xt−1 + at + β at−1    … (48a)
which can be written using the backward operator B as
(1 − αB) Xt = (1 + βB) at    … (48b)
For a stationary and invertible ARMA(1, 1) process,
|α| < 1 and |β| < 1
On multiplying equation (48a) by Xt, Xt−1 and Xt−k in turn and taking expectations, we obtain
γ0 = σa² (1 + β² + 2αβ)/(1 − α²)    … (49a)
γ1 = α γ0 + β σa²    … (49b)
γk = α γk−1, k ≥ 2    … (49c)
We also obtain
ρ1 = (1 + αβ)(α + β)/(1 + β² + 2αβ)    … (49d)
ρk = α ρk−1, k ≥ 2    … (49e)
Thus, the autocorrelation function decays exponentially from the starting value ρ1, which depends on α and β.
Let us take up an example of the ARMA model.

Example 6: Write the following ARMA(1, 1) model
Xt = 0.5 Xt−1 + at − 0.3 at−1
using the backward operator B. Is the process stationary and invertible? Calculate ρ1 and ρ2 for the process.
Solution: Since α = 0.5 and β = −0.3, from equation (48b) the model is written using the backward operator B as:
(1 − 0.5B) Xt = (1 − 0.3B) at
In this case, from equations (47a) and (47b), we have
Φ(B) = 1 − 0.5B and θ(B) = 1 − 0.3B
Therefore, for stationarity and invertibility, from equation (47c), the roots of 1 − 0.5B = 0 and 1 − 0.3B = 0 must lie outside the unit circle. The roots of these equations are:
B = 1/0.5 = 2.0 and B = 1/0.3 = 3.33
Since both roots lie outside the unit circle, the process is stationary and invertible. From equations (49d) and (49e),
ρ1 = (1 + αβ)(α + β)/(1 + β² + 2αβ) = 0.215, and
ρ2 = α ρ1 = 0.107
You may now like to try out an exercise.
E8) Show that the ARMA(1, 1) model
Xt = 0.5 Xt−1 + at − 0.5 at−1
can be equivalently written as Xt = at, which is a white noise model.

16.4 AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODELS
In Units 13 and 14, we have discussed that actual time series often contain trend and seasonal components. In that sense, most of the time series we come across are non-stationary, as their mean changes with time. In those units, we took moving averages to remove the seasonal component and then estimated the trend. In this section, we incorporate trend and seasonal effects in the model and then, by making suitable operations on the series, transform it to a stationary series. We then apply the methods for stationary models discussed so far.
If a time series is non-stationary because of changes in mean, we can take
the difference of successive observations. The modified series is more likely
to be stationary. Sometimes more than one difference of successive
observations is required to get a modified stationary model. Such a model is
called an integrated model because the stationary model that is fitted to the
modified series has to be summed or integrated to provide a model for the
original non-stationary series. The first difference of series Xt is defined
as Wt:
Wt = ∇Xt = (1 − B) Xt = Xt − Xt−1    … (50)
where ∇ is the difference operator. This is called the difference of order 1. We may define a modified series of order d as

Wt = ∇^d Xt = (1 − B)^d Xt    … (51)
where d takes the values 1, 2, ….
For d = 2, this operation takes differences twice:
Wt = ∇² Xt = (1 − B)(1 − B) Xt = (1 − B)(Xt − Xt−1)
   = (Xt − Xt−1) − (Xt−1 − Xt−2) = Xt − 2Xt−1 + Xt−2    … (52)
In general, the ARIMA model can be written as:
Wt = α1 Wt−1 + α2 Wt−2 + … + αp Wt−p + at + β1 at−1 + β2 at−2 + … + βq at−q    … (53a)
or, using the backward operator B, it can be written as:
Φ(B) Wt = θ(B) at    … (53b)
or Φ(B)(1 − B)^d Xt = θ(B) at    … (53c)
It is denoted by ARIMA(p, d, q). The operator Φ(B)(1 − B)^d has d roots of B equal to 1. For d = 0, the series is an ARMA process. In practice, the first or second difference makes the process stationary. A random walk model is an example of an ARIMA model.
Consider the time series
Xt = Xt−1 + at    … (54a)
which can be written as
(1 − B) Xt = at    … (54b)
It is clearly non-stationary, as one root of
Φ(B) = 1 − B = 0    … (54c)
lies on the unit circle. To make it stationary, we take one difference of Xt, as
Wt = Xt − Xt−1 = at
So the time series can be written as ARIMA(0, 1, 0); Wt is a white noise process and stationary.
A plot of the first difference looks like a plot of a stationary process without any trend. The plots of the autocorrelations and partial autocorrelations provide an idea of the process.
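The effect of differencing is easy to demonstrate. The sketch below (ours) builds a random walk, which is ARIMA(0, 1, 0), and shows that its first difference behaves like white noise, with lag-1 sample autocorrelation near zero:

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.normal(size=5000).cumsum()    # random walk: non-stationary level
w = np.diff(x)                        # first difference, d = 1

d = w - w.mean()
r1 = (d[:-1] * d[1:]).sum() / (d ** 2).sum()
print(f"r1 of differenced series: {r1:+.3f}")   # close to 0, as white noise
# np.diff(x, n=2) would give the d = 2 difference X_t - 2X_{t-1} + X_{t-2}.
```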
Example 7: For the model
(1 − 0.2B)(1 − B) Xt = (1 − 0.5B) at
find p, d and q, and express it as ARIMA(p, d, q). Determine whether the process is stationary and invertible.
Solution: We are given the model
(1 − 0.2B)(1 − B) Xt = (1 − 0.5B) at
a) In this case, from equations (53b) and (53c), we can write the given model as
(1 − 0.2B)(1 − B)¹ Xt = (1 − 0.5B) at
which implies that Wt = (1 − B) Xt, i.e., d = 1, and from equation (53a)
Wt = 0.2 Wt−1 + at − 0.5 at−1

This implies that p = 1 and q = 1. Hence, the process is ARIMA(1, 1, 1).
b) Φ(B) = (1 − B)(1 − 0.2B) = 0 ⟹ B = 1 and B = 5, and
θ(B) = 1 − 0.5B = 0 ⟹ B = 1/0.5 = 2.0
One of the roots of Φ(B) = (1 − B)(1 − 0.2B) = 0 is B = 1, which lies on the unit circle. Hence, the process is non-stationary. However, the root of θ(B) = 0 lies outside the unit circle; hence, the process is invertible. For the first difference Wt = (1 − B) Xt, the process is stationary and invertible.
You may now like to try some more exercises for practice.
E9) Consider the time series
Xt = β1 + β2 t + at
where β1 and β2 are known constants and at is white noise with variance σ².
Determine whether Xt is stationary. If Xt is not stationary, find a transformation that produces a stationary process.
E10) Suppose that the correlogram of a time series consisting of 100
observations has
r1=0.31, r2=0.37, r3= – 0.05, r4= 0.06, r5= – 0.21, r6=0.11, r7=0.08,
r8 = 0.05, r9=0.12, r10= – 0.01
Suggest an ARIMA model which may be appropriate for this case.
Let us now summarise the concepts that we have discussed in this unit.

16.5 SUMMARY
1. The sequences of random variables {Yi} are mutually independent and
identically distributed. If a discrete stationary process consists of such
sequences of i.i.d. variables, it is called a purely random process.
Sometimes it is called white noise.
2. The moving average processes are used successfully to model stationary time series in econometrics. The MA(q) process of order q is given as
Xt = β0 at + β1 at−1 + … + βq at−q
where βi (i = 0, 1, 2, …, q) are constants.
3. The autocorrelation function (acf) of the MA(q) process is given by
ρk = (βk + β1 βk+1 + … + βq−k βq)/(1 + β1² + … + βq²),  k = 1, 2, …, q
It becomes zero if the lag k is greater than the order q of the process. This is a very important feature of moving average (MA) processes.
4. A linear stationary process can always be expressed as an
autoregressive process of suitable order. Unlike moving average (MA)
process, which puts no restrictions on parameters for stationarity,
autoregressive (AR) process requires certain restrictions on the
parameter α for stationarity.