BLOCK 4
FORECASTING METHODS

UNIT 13 BUSINESS FORECASTING
Objectives
After completion of this unit, you should be able to:
• realise that forecasting is a scientific discipline unlike ad hoc predictions
• appreciate that forecasting is essential for a variety of planning decisions
• become aware of forecasting methods for long, medium and short term
decisions
• use Moving Averages and Exponential smoothing for demand
forecasting
• understand the concept of forecast control
• use the moving range chart to monitor a forecasting system.
Structure
13.1 Introduction
13.2 Forecasting for Long Term Decisions
13.3 Forecasting for Medium and Short Term Decisions
13.4 Forecast Control
13.5 Summary
13.6 Self-assessment Exercises
13.7 Key Words
13.8 Further Readings
13.1 INTRODUCTION
Data on demands of the market may be needed for a number of purposes to
assist an organisation in its long term, medium and short term decisions.
Forecasting is essential for a number of planning decisions and often
provides a valuable input on which future operations of the business
enterprise depend. Some of the areas where forecasts of future product
demand would be useful are indicated below:
1) Specification of production targets as functions of time.
2) Planning equipment and manpower usage, as well as additional
procurement.
3) Budget allocation depending on the level of production and sales.
4) Determination of the best inventory policy.
5) Decisions on expansion and major changes in production processes and
methods.
6) Future trends of product development, diversification, scrapping etc.
7) Design of suitable pricing policy.
8) Planning the methods of distribution and sales promotion.
It is thus clear that the forecast of demand of a product serves as a vital input
for a number of important decisions and it is, therefore, necessary to adopt a
systematic and rational methodology for generating reliable forecasts.
The Uncertain Future
The future is inherently uncertain and since time immemorial man has made
attempts to unravel the mystery of the future. In the past it was the crystal
gazer or a person allegedly in possession of some supernatural powers who
would make predictions about things to come: major events or the rise and
fall of kings. In today's world, predictions are being made daily in the realm
of business, industry and politics. Since the operation of any capital
enterprise has a large lead time (1-5 years is typical), it is clear that a factory
conceived today is for some future demand and the whole operation is
dependent on the actual demand coming up to the level projected much
earlier. During this period many circumstances, which might not even have
been imagined, could come up. For instance, there could be development of
other industries, or a major technological breakthrough that may render the
originally conceived product obsolete; or a social upheaval and change of
government may redefine priorities of growth and development; or an
unusual weather condition like drought or floods may alter completely the
buying potential of the originally conceived market. This is only a partial list
to suggest how uncertainties from a variety of sources can enter to make the
task of prediction of the future extremely difficult.
It is proper at this stage to emphasise the distinction between prediction and
forecasting. Forecasting generally refers to the scientific methodology that
often uses past data along with some well-defined assumptions or 'model' to
come up with a 'forecast' of future demand. In that sense, forecasting is
objective. A prediction is a subjective estimate made by an individual by
using his intuitive 'hunch' which may in fact come out true. But the fact that it
is subjective (A's prediction may be different from B's and C's) and non-
realisable as a well-documented computer programme (which could be used
by anyone) deprives it of much value. This is not to discount the role of
intuition or subjectivity in practical decision-making. In fact, for complex
long term decisions, intuitive methods such as the Delphi technique are most
popular. The opinion of a well informed, educated person is likely to be
reliable, reflecting the well-considered contribution of a host of complex
factors in a relationship that may be difficult to explicitly quantify. Often
forecasts are modified based on subjective judgment and experience to obtain
predictions used in planning and decision making.
The future is inherently uncertain and any forecast at best is an educated
guess with no guarantee of coming true. In certain purely deterministic
systems (as for example in classical physics the laws governing the motion of
celestial bodies are fairly well developed) an unequivocal relationship
between cause and effect has been clearly established and it is possible to
predict very accurately the course of events in the future, once the future
patterns of causes are inferred from past behaviour. Economic systems,
however, are more complex because (i) there is a large number of governing
factors in a complex structural framework which may not be possible to
identify and (ii) the individual factors themselves have a high degree of
variability and uncertainty. The demand for a particular product (say
umbrellas) would depend on competitor's prices, advertising campaigns,
weather conditions, population and a number of factors which might even be
difficult to identify. In spite of these complexities, a forecast has to be made
so that the manufacturers of umbrellas (a product which exhibits a seasonal
demand) can plan for the next season.
Forecasting for Planning Decisions
The primary purpose of forecasting is to provide valuable information for
planning the design and operation of the enterprise. Planning decisions may
be classified as long term, medium term and short term.
Long term decisions include decisions like plant expansion or new product
introduction which may require new technologies or a complete
transformation in the social or moral fabric of society. Such decisions are
generally characterised by lack of quantitative information and absence of
historical data on which to base the forecast of future events. Intuition and
the collected opinion of experts in the field generally play a significant role
in developing forecasts for such decisions. Some methods used in forecasting
for long term decisions are discussed in Section 13.2.
Medium term decisions involve such decisions as planning the production
levels in a manufacturing plant over the next year, determination of
manpower requirements or inventory policy for the firm. Short term
decisions include daily production planning and scheduling decisions. For
both medium and short term forecasting, many methods and techniques exist.
These methods can broadly be classified as follows:
a) Subjective or intuitive methods.
b) Methods based on averaging of past data, including simple, weighted and
moving averages.
c) Regression models on historical data.
d) Causal or Econometric models.
e) Time series analysis or stochastic models.
These methods are briefly reviewed in Section 13.3. A more detailed
discussion of correlation, regression and time series models is taken up in the
next three units.
The choice of an appropriate forecasting method is also discussed in Section
13.3. The aspect of forecast control, which tells whether a particular method
in use is acceptable, is discussed in Section 13.4. Finally, a summary is given
in Section 13.5.
13.2 FORECASTING FOR LONG TERM
DECISIONS
Technological Forecasting
Technological growth is often haphazard, especially in developing countries
like India. This is because technology seldom evolves gradually there; frequent
technology transfers due to imports of know-how result in a leap-frogging
phenomenon. In spite of this, it is generally seen that the logarithms of many
technological variables show linear trends with time, indicating exponential
growth (a small extrapolation sketch follows the figure captions below). Some
extrapolations reported by Rohatgi et al. are
• Passenger kms carried by Indian Airlines (Figure I)
• Fertilizer applied per hectare of cropped area (Figure II)
• Demand and supply of petroleum crude (Figure III)
• Installed capacity of electricity generation in millions of KW (figure IV).
Figure I: Passenger Km Carried by Indian Airlines
Figure II: Fertilizer Applied per Hectare of Cropped Area
Figure III: Demand and Supply of Petroleum Crude
Figure V: Hydroelectric Power Generation Using Gompertz Growth Curve
Figure VI: Number of Villages Electrified Using a Pearl Type Growth Curve
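Fitting such a log-linear trend is ordinary least squares on the logarithm of the variable. The following Python sketch illustrates the idea on a purely hypothetical capacity series; the numbers are illustrative and not taken from Rohatgi et al.

```python
import numpy as np

# Hypothetical technological variable (e.g. installed capacity, millions of KW)
# observed over eight consecutive years; values are illustrative only.
years = np.arange(1995, 2003)
capacity = np.array([4.2, 5.0, 6.1, 7.3, 8.6, 10.4, 12.5, 15.1])

# Fit a straight line to log(capacity) versus time: log y = a + b*t.
b, a = np.polyfit(years, np.log(capacity), 1)
print(f"implied annual growth rate ~ {np.exp(b) - 1:.1%}")

# Extrapolate the exponential trend five years ahead.
future = np.arange(2003, 2008)
for year, value in zip(future, np.exp(a + b * future)):
    print(year, round(value, 1))
```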
Apart from the above extrapolative techniques which are based on the
projection of historical data into the future (such models are called regression
models and you will learn more about them in Unit 15), technological
forecasting often implies prediction of future scenarios or likely possible
futures. As an example, suppose there are three events $E_1$, $E_2$ and
$E_3$, where each one may or may not happen in the future. Thus, eight
possible scenarios

$$E_1E_2E_3,\; E_1E_2\bar{E}_3,\; E_1\bar{E}_2E_3,\; \bar{E}_1E_2E_3,\; \bar{E}_1\bar{E}_2E_3,\; \bar{E}_1E_2\bar{E}_3,\; E_1\bar{E}_2\bar{E}_3,\; \bar{E}_1\bar{E}_2\bar{E}_3$$

show the range of possible futures (a line above an event indicates that the
event does not take place). Moreover, these events may not be independent.
The breakout of war ($E_1$) is likely to lead to increased spending on defence
($E_2$) and reduced emphasis on rural uplift and social development ($E_3$).
Such interactions can be investigated using the Cross-impact Technique.
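Enumerating the $2^3 = 8$ scenarios is mechanical; in the minimal Python sketch below an apostrophe stands in for the overbar (event does not occur):

```python
from itertools import product

events = ["E1", "E2", "E3"]

# Each scenario assigns occurs/does-not-occur to every event: 2**3 = 8 in all.
for outcome in product([True, False], repeat=len(events)):
    print(" ".join(e if occurs else e + "'"
                   for e, occurs in zip(events, outcome)))
```

The Cross-impact Technique then attaches conditional likelihoods to such scenarios rather than treating the events as independent.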
Delphi
This is a subjective method relying on the opinion of experts designed to
minimise bias and error of judgment. A Delphi panel consists of a number of
experts with an impartial leader or coordinator who organises the questions.
Specific questions (rather than general opinions) with yes-no or multiple type
answers or specific dates/events are sought from the experts. For instance,
questions could be of the following kind:
• When do you think the petroleum reserves of the country would be
exhausted? (2020, 2040, 2060)
• When would the level of pollution in Delhi exceed the danger limit (as
defined by a particular agency)?
• What would the population of India be in 2020, 2040 and 2060?
• When would fibre optics become commercially viable for
communication?
A summary of the responses of the participants is sent to each expert
participating in the Delphi panel after a statistical analysis. For a forecast of
when an event is likely to happen, the most optimistic and pessimistic
estimates, along with a distribution of the other responses, are given to the
their earlier estimates and give revised estimates to the coordinator. It may be
mentioned that the identities of the experts are not revealed to each other so
that bias or influence by reputation is kept to a minimum. Also the feedback
response is statistical in nature without revealing who made which forecast.
The Delphi method is an iterative procedure in which revisions are carried
out by the experts till the coordinator gets a stable response.
The method is very efficient, if properly conducted, as it provides a
systematic framework for collecting expert opinion. By virtue of anonymity,
statistical analysis and feedback of results and provision for forecast revision,
results obtained are free of bias and generally reliable. Obviously, the
background of the experts and their knowledge of the field is crucial. This is
where the role of the coordinator in identifying the proper experts is
important.
Opinion Polls
Opinion polls are a very common method of gaining knowledge about
consumer tastes, responses to a new product, popularity of a person or leader,
reactions to an election result or the likely future prime minister after the
impending polls. In any opinion poll two things are of primary importance.
First, the information that is sought and secondly the target population from
whom the information is sought. Both these factors must be kept in mind
while designing the appropriate mechanism for conducting the opinion poll.
Opinion polls may be conducted through
• Personal interviews.
• Circulation of questionnaires.
• Meetings in groups.
• Conferences, seminars and symposia.
The method adopted depends largely on the population, the kind of
information desired and the budget available. For instance, if information
from a very large number of people is to be collected a suitably designed
questionnaire could be mailed to the people concerned. Designing a proper
questionnaire is itself a major task. Care should be taken to avoid ambiguous
questions. Preferably, the responses should be short one word answers or
ticking an appropriate reply from a set of multiple choices. This makes the
questionnaire easy for the respondent to fill and also easy for the analyst to
analyse. For example, the final analysis could be summarised by saying
• 80% of the population expressed opinion A,
• 10% expressed opinion B,
• 5% expressed opinion C,
• 5% expressed no opinion.
Similarly, in the context of forecasting of product demand, it is common to
arrive at the sales forecast by aggregating the opinion of area salesmen. The
forecast could be modified based on some kind of rating for each salesman or
an adjustment for environmental uncertainties.
Decisions in the area of future R&D or new technologies too are based on the
opinions of experts. The Delphi method treated in this Section is just an
example of a systematic gathering of opinion of experts in the concerned
field.
The major advantage of opinion polls lies in the fact that a well formed
opinion considers the multifarious subjective and objective factors which
may not even be possible to enumerate explicitly, and yet they may have a
bearing on the concerned forecast or question. Moreover the aggregation of
opinion polls tends to eliminate the bias that is bound to be present in any
subjective, human evaluation. In fact, for long term decisions, polls of the
opinions of experts constitute a very reliable method for forecasting and
planning.
The average of the sales for January, February and March is
(199+202+199)/3 = 200, which constitutes the 3 months moving average
calculated at the end of March and may thus be used as a forecast for April.
Actual sales in April turn out to be 208 and so the 3 months moving average
forecast for May is (202+199+208)/3 =203. Notice that a convenient method
of updating the moving average is
New moving average = Old moving average + (Added period demand − Dropped period demand) / (Number of periods in moving average)
At the end of May, the actual demand for May is 212, while the demand for
February which is to be dropped from the last moving average is 202. Thus,
New moving average = 203 + 10/3 = 206.33 which is the forecast for June.
Both the 3 period and 6 period moving average are shown in Table 1.
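The updating logic translates directly into code. The sketch below assumes the demand figures of Table 1 are the monthly sales reproduced in Table 2 further on:

```python
# Monthly sales of the item, January to December (Table 1/Table 2).
demand = [199, 202, 199, 208, 212, 194, 214, 220, 219, 234, 219, 233]

def moving_average_forecasts(data, n):
    """The n-period moving average computed at the end of period t
    serves as the one-period-ahead forecast for period t+1."""
    return [sum(data[t - n:t]) / n for t in range(n, len(data) + 1)]

# 3-month moving averages: 200.0 (forecast for Apr), 203.0 (May),
# 206.33 (Jun), and so on.
print([round(f, 2) for f in moving_average_forecasts(demand, 3)])
```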
It is characteristic of moving averages to
a) Lag a trend (that is, give a lower value for an upward trend and a higher
value for a downward trend) as shown in Figure VII (a).
b) Be out of phase (that is, lagging) when the data is cyclic, as in seasonal
demand. This is depicted in Figure VII (b).
c) Flatten the peaks of the demand pattern as shown in Figure VII (c).
Figure VII: (a) Moving Averages Lag a Trend (b) Moving Averages are Out of Phase for Cyclic Demand (c) Moving Averages Flatten Peaks
Some correction factors to rectify the lags can be incorporated. For details,
you may refer to Brown (3).
Exponential smoothing is an averaging technique where the weightage given
to the past data declines (at an exponential rate) as the data recedes into the
past. Thus all the values are taken into consideration, unlike in moving
averages, where all data points prior to the period of the moving average are
ignored.
If $F_t$ is the one-period-ahead forecast made at time t and $D_t$ is the
demand for period t, then

$$F_t = F_{t-1} + \alpha(D_t - F_{t-1}) = \alpha D_t + (1 - \alpha)F_{t-1}$$

where $\alpha$ is a smoothing constant that lies between 0 and 1; generally
chosen values lie between 0.01 and 0.30. A higher value of $\alpha$ places more
emphasis on recent data. To initiate smoothing, a starting forecast is
needed, which is generally taken as the first demand value or some average
of the available demand values. Corrections for trend effects may be made by
using double exponential smoothing and other factors. For details, you may
consult the references at the end.
A computation of the smoothed values of demand for the example considered
earlier in Table 1 is shown in Table 2 for values of $\alpha$ equal to 0.1 and 0.3. In
these computations, exponential smoothing is initiated from June with a
starting forecast equal to the average demand for the first five months. Thus
the error for June is (194 − 204), that is −10, which when multiplied by
$\alpha$ (0.1 or 0.3 as the case may be) and added to the previous forecast of 204
yields 203 or 201 (depending on whether $\alpha$ is 0.1 or 0.3) as shown in
Table 2.
Table 2: Monthly Sales of an Item and Forecasts Using Exponential Smoothing

Month   Demand   Smoothed forecast (α = 0.1)   Smoothed forecast (α = 0.3)
Jan     199
Feb     202
Mar     199
Apr     208
May     212
Jun     194      204.0                         204.0
Jul     214      203.0                         201.0
Aug     220      204.1                         204.9
Sep     219      205.7                         209.4
Oct     234      207.0                         212.3
Nov     219      209.7                         218.8
Dec     233      210.6                         218.9
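The smoothed columns of Table 2 can be reproduced in a few lines; the starting forecast of 204 (the average of the first five months' demand) is taken from the text:

```python
def exponential_smoothing(demand, alpha, initial_forecast):
    """One-period-ahead forecasts: F_t = F_{t-1} + alpha * (D_t - F_{t-1})."""
    forecasts = [initial_forecast]
    for d in demand:
        forecasts.append(forecasts[-1] + alpha * (d - forecasts[-1]))
    return forecasts

# Demands from June onwards; starting forecast 204 = (199+202+199+208+212)/5.
june_onwards = [194, 214, 220, 219, 234, 219]
for alpha in (0.1, 0.3):
    print(alpha, [round(f, 1) for f in
                  exponential_smoothing(june_onwards, alpha, 204.0)])
# 0.1 -> 204.0, 203.0, 204.1, 205.7, 207.0, 209.7, 210.6
# 0.3 -> 204.0, 201.0, 204.9, 209.4, 212.3, 218.8, 218.9
```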
Both moving averages and smoothing methods are essentially short term
forecasting techniques where one or a few period-ahead forecasts are
obtained.
Regression Models on Historical Data
The demand of any product or service when plotted as a function of time
yields a time series whose behaviour may be conceived of as following a
certain pattern with random fluctuations. Some commonly observed demand
patterns are shown in Figure VIII.
Figure VIII: Some Commonly Observed Demand Patterns
The moving range chart of forecast errors can be used to check:
a) whether the past demand is statistically stable,
b) whether the present demand is following the past pattern,
c) if the demand pattern has changed, the control chart tells how to revise
the forecasting method.
As long as the plotted error points keep falling within the control limits, it
shows that the variations are due to chance causes and the underlying system
of forecast generation is acceptable. When a point goes out of control there is
reason to suspect the validity of the forecast generation system, which should
be revised to reflect these changes.
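The pages detailing the construction of the moving range chart are not reproduced in this extract, so the sketch below follows a common convention: plot the forecast errors and place the control limits at ±2.66 times the mean moving range of successive errors. The demands and forecasts are hypothetical.

```python
def moving_range_check(demand, forecast):
    """Flag forecast errors that fall outside +/- 2.66 * (mean moving range)."""
    errors = [d - f for d, f in zip(demand, forecast)]
    ranges = [abs(e2 - e1) for e1, e2 in zip(errors, errors[1:])]
    limit = 2.66 * sum(ranges) / len(ranges)
    return [(e, "in control" if -limit <= e <= limit else "OUT OF CONTROL")
            for e in errors]

demand   = [102, 98, 110, 95, 104, 99, 140]     # hypothetical actuals
forecast = [100, 100, 101, 100, 100, 101, 100]  # hypothetical forecasts
for error, status in moving_range_check(demand, forecast):
    print(error, status)   # the last error (40) falls outside the limits
```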
13.5 SUMMARY
The unit has emphasised the importance of forecasting in all planning
decisions-be they long term, medium term or short term. For long term
planning decisions, techniques like Technological Forecasting, collecting
opinions of experts as in Delphi or opinion polls using personal interviews or
questionnaires have been surveyed. For medium and short term decisions,
apart from subjective and intuitive methods there is a greater variety of
mathematical models and statistical techniques that could be profitably
employed. There are methods like Moving averages or exponential
smoothing that are based on averaging of past data. Any suitable
mathematical function or curve could be fitted to the demand history by using
least squares regression. Regression is also used in estimation of parameters
of causal or econometric models. Stochastic models using Box-Jenkins
methodology are a statistically advanced set of tools capable of more accurate
forecasting. Finally, forecast control is very necessary to check whether the
forecasting system is consistent and effective. The moving range chart has
been suggested for its simplicity and ease of operation in this regard.
Period (Monthly)

Period:  1    2    3    4    5    6    7    8    9    10   11   12
(i)      80   100  79   98   95   104  80   98   102  96   115  88
(ii)     67   53   601  79   102  118  135  162  70   53   68   63
(iii)    117  124  95   228  274  248  220  130  109  128  125  134
a) Plot the data on a graph and suggest an appropriate model that could be
used for forecasting.
b) Plot a 3 and 5 period moving average and show on the graph in (a).
c) Initiate exponential smoothing from the first period demand for
smoothing constant (α) values of 0.1 and 0.3. Show the plots.
6) What do you understand by forecast control? What could be the various
methods to ensure that the forecasting system is appropriate?
13.8 FURTHER READINGS
UNIT 14 CORRELATION
Objectives
After completion of this unit, you should be able to:
• understand the meaning of correlation
• compute the correlation coefficient between two variables from sample
observations
• test for the significance of the correlation coefficient
• identify confidence limits for the population correlation coefficient from
the observed sample correlation coefficient
• compute the rank correlation coefficient when rankings rather than actual
values for variables are known
• appreciate some practical applications of correlation
• become aware of the concept of auto-correlation and its application in
time series analysis.
Structure
14.1 Introduction
14.2 The Correlation Coefficient
14.3 Testing for the Significance of the Correlation Coefficient
14.4 Rank Correlation
14.5 Practical Applications of Correlation
14.6 Auto-correlation and Time Series Analysis
14.7 Summary
14.8 Self-assessment Exercises
14.9 Key Words
14.10 Further Readings
14.1 INTRODUCTION
We often encounter situations where data appears as pairs of figures relating
to two variables. A correlation problem considers the joint variation of two
measurements neither of which is restricted by the experimenter. The
regression problem, which is treated in Unit 15, considers the frequency
distributions of one variable (called the dependent variable) when another
(independent variable) is held fixed at each of several levels.
Examples of correlation problems are found in the study of the relationship
between IQ and aggregate percentage marks obtained by a person in SSC
examination, blood pressure and metabolism or the relation between height
and weight of individuals. In these examples both variables are observed as
they vary; neither is controlled by the experimenter. Consider, for example,
the following ten-year data on the advertisement expenditure (X) and
sales (Y) of a company:

Table 1: Advertisement Expenditure (X) and Sales (Y) of a Company

Year    X     Y
1988    50    700
1987    50    650
1986    50    600
1985    40    500
1984    30    450
1983    20    400
1982    20    300
1981    15    250
1980    10    210
1979    5     200
Figure I: Scatter Diagram
The scatter diagram may exhibit different kinds of patterns. Some typical
patterns indicating different correlations between two variables are shown in
Figure II.
What we shall study next is a precise and quantitative measure of the degree
of association between two variables: the correlation coefficient.
Figure II: Different Types of Association Between Variables
where
$x = X - \bar{X}$ = deviation of a particular X value from the mean $\bar{X}$
$y = Y - \bar{Y}$ = deviation of a particular Y value from the mean $\bar{Y}$

Equation (14.2) can be derived from equation (14.1) by substituting for
$\sigma_x$ and $\sigma_y$ as follows:

$$\sigma_x = \sqrt{\frac{1}{n}\Sigma(X - \bar{X})^2} \quad \text{and} \quad \sigma_y = \sqrt{\frac{1}{n}\Sigma(Y - \bar{Y})^2} \qquad (14.3)$$
Activity A
Suggest five pairs of variables which you expect to be positively correlated.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
Activity B
Suggest five pairs of variables which you expect to be negatively correlated.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
A Sample Calculation: Taking as an illustration the data of advertisement
expenditure (X) and sales (Y) of a company for the 10-year period shown in
Table 1, we proceed to determine the correlation coefficient between these
variables.
Computations are conveniently carried out as shown in Table 2.

Table 2: Calculation of Correlation Coefficient

Sl.No   X     Y      x = X − X̄   y = Y − Ȳ   x²     y²       xy
1       50    700    21           274          441    75076    5754
2       50    650    21           224          441    50176    4704
3       50    600    21           174          441    30276    3654
4       40    500    11           74           121    5476     814
5       30    450    1            24           1      576      24
6       20    400    −9           −26          81     676      234
7       20    300    −9           −126         81     15876    1134
8       15    250    −14          −176         196    30976    2464
9       10    210    −19          −216         361    46656    4104
10      5     200    −24          −226         576    51076    5424
Total   290   4260   0            0            2740   306840   28310

$$\bar{X} = \frac{290}{10} = 29 \qquad \bar{Y} = \frac{4260}{10} = 426$$

$$\therefore r = \frac{\Sigma xy}{\sqrt{\Sigma x^2\,\Sigma y^2}} = \frac{28310}{\sqrt{2740 \times 306840}} = 0.976$$
This value of r (= 0.976) indicates a high degree of association between the
variables X and Y. For this particular problem, it indicates that an increase in
advertisement expenditure is likely to yield higher sales.
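The hand computation can be verified directly from the raw columns of Table 1:

```python
from math import sqrt

# Advertisement expenditure X and sales Y for the ten years of Table 1.
X = [50, 50, 50, 40, 30, 20, 20, 15, 10, 5]
Y = [700, 650, 600, 500, 450, 400, 300, 250, 210, 200]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))   # 28310
sxx = sum((x - x_bar) ** 2 for x in X)                       # 2740
syy = sum((y - y_bar) ** 2 for y in Y)                       # 306840

print(round(sxy / sqrt(sxx * syy), 3))   # 0.976, as computed above
```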
You may have noticed that in carrying out calculations for the correlation
coefficient in Table 2, large values for $x^2$ and $y^2$ resulted in a great
computational burden. Computations can be simplified by calculating the
deviations of the observations from an assumed average rather than the
actual average, and also scaling these deviations conveniently. To illustrate
this short cut procedure, let us compute the correlation coefficient for the
same data. We shall take U to be the deviation of X values from the assumed
mean of 30, divided by 5. Similarly, V represents the deviation of Y values
from the assumed mean of 400, divided by 10.
The computations are shown in Table 3.
Table 3: Short cut Procedure for Calculation of Correlation Coefficient

Sl.No   X     Y     U     V     UV    U²    V²
1       50    700   4     30    120   16    900
2       50    650   4     25    100   16    625
3       50    600   4     20    80    16    400
4       40    500   2     10    20    4     100
5       30    450   0     5     0     0     25
6       20    400   −2    0     0     4     0
7       20    300   −2    −10   20    4     100
8       15    250   −3    −15   45    9     225
9       10    210   −4    −19   76    16    361
10      5     200   −5    −20   100   25    400
Total               −2    26    561   110   3136
$$r = \frac{\Sigma UV - \frac{\Sigma U\,\Sigma V}{n}}{\sqrt{\Sigma U^2 - \frac{(\Sigma U)^2}{n}}\;\sqrt{\Sigma V^2 - \frac{(\Sigma V)^2}{n}}} = \frac{561 - \frac{(-2)(26)}{10}}{\sqrt{110 - \frac{(-2)^2}{10}}\;\sqrt{3136 - \frac{(26)^2}{10}}} = \frac{566.2}{10.47 \times 55.39} = 0.976$$
We thus obtain the same result as before.
Activity C
Use the short cut procedure to obtain the value of correlation coefficient in
the above example using scaling factor 10 and 100 for X and Y respectively.
(That is, the deviation from the assumed mean is to be divided by 10 for X
values and by 100 for Y values.)
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
Once r has been calculated, the chart can be used to determine the upper and
lower values of the confidence interval for the sample size used. In this chart
the range of unknown values of ρ (the population correlation coefficient) is
shown on the vertical scale, while the sample r values are shown on the
horizontal axis, with a number of curves for selected sample sizes. Notice
that for every sample size there are two curves. To read the 95% confidence
limits for an observed sample correlation coefficient of 0.8 for a sample of
size 10, we simply look along the horizontal axis for a value of 0.8 (the
sample correlation coefficient) and construct a vertical line from there till it
intersects the first curve for n = 10. This happens at ρ = 0.2, which is the
lower limit of the confidence interval. Extending the vertical line upwards, it
again intersects the second n = 10 curve at ρ = 0.92, which represents the
upper confidence limit. Thus the 95% confidence interval for the population
correlation coefficient is 0.2 to 0.92.
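When the chart is not at hand, an approximate analytical alternative (not used in this unit) is Fisher's z-transformation; the sketch below applies it to the same case of r = 0.8 and n = 10. Being a large-sample approximation, it gives somewhat different limits from the chart for so small a sample.

```python
from math import atanh, tanh, sqrt

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for the population
    correlation coefficient, using Fisher's z = arctanh(r)."""
    half_width = z_crit / sqrt(n - 3)
    return tanh(atanh(r) - half_width), tanh(atanh(r) + half_width)

# Roughly (0.34, 0.95); the chart, read graphically, gives about (0.2, 0.92).
print(fisher_ci(0.8, 10))
```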
Rank for variable X:   1   2   3   4   5   6   7   8   9   10
Rank for variable Y:   3   1   4   2   6   9   8   10  5   7
$$t = r_s\sqrt{\frac{n-2}{1-r_s^2}} = 0.697\sqrt{\frac{10-2}{1-(0.697)^2}} = 2.75$$
Referring to the table of the t-distribution for n-2 = 8 degrees of freedom, the
critical value for t at a 5% level of significance is 2.306. Since the calculated
value of t is higher than the table value, we reject the null hypothesis
concluding that the performances in Mathematics and Physics are closely
associated.
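The value of $r_s$ and the associated t statistic can be verified as follows, using the rankings tabled above:

```python
from math import sqrt

# Ranks of the ten individuals on the two variables.
rank_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rank_y = [3, 1, 4, 2, 6, 9, 8, 10, 5, 7]

n = len(rank_x)
d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))   # sum of d_i^2 = 50
rs = 1 - 6 * d2 / (n * (n ** 2 - 1))                     # Spearman's coefficient

t = rs * sqrt((n - 2) / (1 - rs ** 2))                   # t with n-2 = 8 df
print(round(rs, 3), round(t, 2))                         # 0.697 and 2.75
```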
When two or more items have the same rank, a correction has to be applied to
$\Sigma d_i^2$. For example, if the ranks of X are 1, 2, 3, 3, 5, ..., showing that there are
two items with the same 3rd rank, then instead of writing 3 we write 3½ for
each, so that the sum of these items is 7 and the mean of the ranks is
unaffected. But in such cases the standard deviation is affected, and therefore
a correction is required. For this, $\Sigma d_i^2$ is increased by $(t^3 - t)/12$ for each
tie, where t is the number of items in each tie.
Activity D
Suppose the ranks in Table 4 were tied as follows: Individuals 3 and 4 both
ranked 3rd in Maths and individuals 6, 7 and 8 ranked 8th in Physics.
Assuming that other rankings remain unaltered, compute the value of
Spearman's rank correlation.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
Correlation analysis is used as a starting point for selecting useful
independent variables for regression analysis. For instance a construction
company could identify factors like
company could identify factors like
• population
• construction employment
• building permits issued last year which it feels would affect its sales for
the current year.
These and other factors that may be identified could be checked for mutual
correlation by computing the correlation coefficient of each pair of variables
from the given historical data (this kind of analysis is easily done by using an
appropriate routine on a computer). Only variables having a high correlation
with the yearly sales could be singled out for inclusion in a regression model.
Correlation is also used in factor analysis wherein attempts are made to
resolve a large set of measured variables in terms of relatively few new
categories, known as factors. The results could be useful in the following
three ways :
1) to reveal the underlying or latent factors that determine the relationship
between the observed data,
2) to make evident relationships between data that had been obscured before
such analysis, and
3) to provide a classification scheme when data scored on various rating
scales have to be grouped together.
Another major application of correlation is in forecasting with the help of
time series models. In using past data (which is often a time series of the
variable of interest available at equal time intervals) one has to identify the
trend, seasonality and random pattern in the data before an appropriate
forecasting model can be built. The notion of auto-correlation and plots of
auto-correlation for various time lags help one to identify the nature of the
underlying process. Details of time series analysis are discussed in Unit 16.
However, some fundamental concepts of auto-correlation and its use for time
series analysis are outlined below.
One could construct from one variable another time-lagged variable which is
twelve periods removed. If the data consists of monthly figures, a twelve-
month time lag will show how values of the same month but of different
years correlate with each other. If the auto-correlation coefficient is positive,
it implies that there is a seasonal pattern of twelve months duration. On the
other hand, a near zero auto-correlation indicates the absence of a seasonal
pattern. Similarly, if there is a trend in the data, values next to each other will
relate, in the sense that if one increases, the other too will tend to increase in
order to maintain the trend. Finally, in case of completely random data, all
auto-correlations will tend to zero (or not significantly different from zero).
The formula for the auto-correlation coefficient at time lag k is:

$$r_k = \frac{\sum_{t=k+1}^{n}(X_t - \bar{X})(X_{t-k} - \bar{X})}{\sum_{t=1}^{n}(X_t - \bar{X})^2}$$

where
$r_k$ denotes the auto-correlation coefficient for time lag k,
k denotes the length of the time lag,
n is the number of observations,
$X_t$ is the value of the variable at time t, and
$\bar{X}$ is the mean of all the data.

Using the data of Figure IV the calculations can be illustrated.
$$\bar{X} = \frac{13 + 8 + 15 + \dots + 12}{10} = \frac{100}{10} = 10$$

$$r_1 = \frac{(13-10)(8-10) + (8-10)(15-10) + \dots + (14-10)(12-10)}{(13-10)^2 + (8-10)^2 + \dots + (14-10)^2 + (12-10)^2} = \frac{-27}{144} = -0.188$$
For k = 2, the calculation is as follows:

$$r_2 = \frac{\sum_{t=3}^{10}(X_t - 10)(X_{t-2} - 10)}{\sum_{t=1}^{10}(X_t - 10)^2}$$
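The definition translates directly into code. The full ten observations behind Figure IV are not reproduced in this extract, so the series below is hypothetical, constructed only to agree with the quoted quantities (mean 10, denominator 144 and lag-1 numerator −27), and therefore reproduces $r_1 = -0.188$:

```python
def autocorrelation(x, k):
    """Auto-correlation coefficient at lag k, as defined above."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# Hypothetical series consistent with the worked figures above.
series = [13, 8, 15, 4, 4, 12, 11, 7, 14, 12]
print(round(autocorrelation(series, 1), 3))   # -0.188
print(round(autocorrelation(series, 2), 3))   # lag-2 coefficient
```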
14.7 SUMMARY
In this unit the concept of correlation or the association between two
variables has been discussed. A scatter plot of the variables may suggest that
the two variables are related but the value of the Pearson correlation
coefficient r quantifies this association. The correlation coefficient r may
assume values between -1 and 1. The sign indicates whether the association
is direct (+ve) or inverse (-ve). A numerical value of r equal to unity indicates
perfect association while a value of zero indicates no association.
Tests for significance of the correlation coefficient have been described.
Spearman's rank correlation for data with ranks is outlined. Applications of
correlation in identifying relevant variables for regression, factor analysis and
in forecasting using time series have been highlighted. Finally the concept of
auto-correlation is defined and illustrated for use in time series analysis.
c) What are the 95% confidence limits for the population correlation
coefficient?
d) Test the significance of the correlation coefficient using a t-test at a
significance level of 5%.
3) The following data pertains to length of service (in years) and the annual
income for a sample of ten employees of an industry:
Length of service in years (X) Annual income in thousand
rupees (Y)
6 14
8 17
9 15
10 18
11 16
12 22
14 26
16 25
18 30
20 34
Compute the correlation coefficient between X and Y and test its
significance at levels of 0.01 and 0.05.
4) Twelve salesmen are ranked for efficiency and the length of service as
below:
Salesman Efficiency (X) Length of Service (Y)
A 1 2
B 2 1
C 3 5
D 5 3
E 5 9
F 5 7
G 7 7
H 8 6
I 9 4
J 10 11
K 11 10
L 12 11
$r_s = 1 - \frac{6\Sigma d_i^2}{n(n^2 - 1)}$
for n data points.
Scatter Diagram: An ungrouped plot of two variables, on the X and Y axes.
Time Lag: The length between two time periods, generally used in time
series where one may test, for instance, how values of periods 1, 2, 3, 4
correlate with values of periods 4, 5, 6, 7 (time lag 3 periods).
Time-Series: Set of observations at equal time intervals which may form the
basis of future forecasting.
UNIT 15 REGRESSION
Objectives
After successful completion of this unit, you should be able to:
• understand the role of regression in establishing mathematical
relationships between dependent and independent variables from given
data
• use the least squares criterion to estimate the model parameters
• determine the standard errors of estimate of the forecast and estimated
parameters
• establish confidence intervals for the forecast values and estimates of
parameters
• make meaningful forecasts from given data by fitting any function, linear
in unknown parameters.
Structure
15.1 Introduction
15.2 Fitting A Straight Line
15.3 Examining the Fitted Straight Line
15.4 An Example of the Calculations
15.5 Variety of Regression Models
15.6 Summary
15.7 Self-assessment Exercises
15.8 Key Words
15.9 Further Readings
15.1 INTRODUCTION
In industry and business today, large amounts of data are continuously being
generated. This may be data pertaining, for instance, to a company's annual
production, annual sales, capacity utilisation, turnover, profits, manpower
levels, absenteeism or some other variable of direct interest to management.
Or there might be technical data regarding a process such as temperature or
pressure at certain crucial points, concentration of a certain chemical in the
product or the breaking strength of the sample produced or one of a large
number of quality attributes.
The accumulated data may be used to gain information about the system (as
for instance what happens to the output of the plant when temperature is
reduced by half) or to visually depict the past pattern of behaviour (as often
happens in a company's annual meetings where records of company progress
are projected) or simply used for control purposes to check if the process or
system is operating as designed (as for instance in quality control). Our
interest in regression is primarily for the first purpose, mainly to extract the
main features of the relationships hidden in or implied by the mass of data.
The Need for Statistical Analysis
For the system under study there may be many variables and it is of interest
to examine the effects that some variables exert (or appear to exert) on others.
The exact functional relationship between variables may be too complex but
we may wish to approximate to this functional relationship by some simple
mathematical function such as straight line or a polynomial which
approximates to the true function over certain limited ranges of the variables
involved.
There could be many variables of interest in the system. In a chemical plant
for instance, the monthly consumption of water or other raw materials, the
temperature and pressure maintained in the reacting vessel, the number of
operating days per month, the monthly production of the final product and any
by-products could all be variables of interest. We are, however, interested in
some key performance variable (which in our case may be monthly
production of final product) and would like to see how this key variable
(called the response variable or dependent variable) is affected by the other
variables (often called independent variables). By independent variables we
shall usually mean variables that can either be set to a desired value or else
take values that can be observed but not controlled. As a result of changes
that are deliberately made, or simply take place in the independent variables,
an effect is transmitted to the response variables. In general we shall be
interested in finding out how changes in the independent variables affect the
values of the response variables. Sometimes the distinction between
independent and dependent variables is not clear, but a choice may be made
depending on convenience or objectives.
Broadly speaking we would have to undergo the following sequence of steps
in determining the relationship between variables, assuming we have data
points already.
1) Identify the independent and response variables.
2) Make a guess of the form of the relation (linear, quadratic, cyclic etc.)
between the dependent and independent variables. This can be facilitated
by a graphical plot of the data (for two variables) or a systematic
tabulation (for more than two variables) which may suggest some trends
or patterns.
3) Estimate the parameters of the tentatively entertained model in step 2
above. For instance if a straight line was to be fitted, what is the slope and
intercept of this line?
4) Having obtained the mathematical model, conduct an error analysis to see
how well the model fits the actual data.
5) Stop if satisfied with the model; otherwise repeat steps 2 to 4 for another
choice of the model form in step 2.
What is Regression?
Suppose we consider the height and weight of adult males for some given
population. If we plot the pair $(X_1, X_2)$ = (height, weight), a diagram like
Figure I will result. Such a diagram, you would recall from the previous
unit, is conventionally called a scatter diagram.
Note that for any given height there is a range of observed weights and vice-
versa. This variation will be partially due to measurement errors but primarily
due to variations between individuals. Thus no unique relationship between
actual height and weight can be expected. But we can note that average
observed weight for a given observed height increases as height increases.
The locus of average observed weight for given observed height (as height
varies) is called the regression curve of weight on height. Let us denote it by
$X_2 = f(X_1)$. There also exists a regression curve of height on weight,
similarly defined, which we can denote by $X_1 = g(X_2)$. Let us assume that
these two "curves" are both straight lines (which in general they may not be).
In general these two curves are not the same as indicated by the two lines in
Figure I.
Figure I: Height and Weight of Thirty Adult Males
are called the parameters of the model whose values have been obtained from
the actual data.
When we say that a model is linear or non-linear, we are referring to linearity
or non-linearity in the parameters. The value of the highest power of the
independent variable in the model is called the order of the model. For
example:

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon$$

is a second order (in X) linear (in the $\beta$'s) regression model.
Now in the model of equation (15.1), $\beta_0$, $\beta_1$ and $\epsilon$ are unknown, and in fact
$\epsilon$ would be difficult to discover since it changes from observation to
observation. However, $\beta_0$ and $\beta_1$ remain fixed and, although we cannot find
them exactly without examining all possible occurrences of Y and X, we can
use the information provided by the actual data to give us estimates $b_0$ and $b_1$
of $\beta_0$ and $\beta_1$. Thus we can write

$$\hat{Y} = b_0 + b_1 X \qquad (15.2)$$

where $\hat{Y}$ denotes the predicted value of Y for a given X, when $b_0$ and $b_1$
are determined. Equation (15.2) could then be used as a predictive equation;
substitution of a value of X would provide a prediction of the true mean
value of Y for that X.
This is, however, not the only criterion available. One may, for instance,
minimise the sum of absolute deviations, which is equivalent to minimising
the mean absolute deviation (MAD). The least squares criterion, however,
has the following main advantages:
1) It is simple and intuitively appealing.
2) It results in linear equations (called normal equations) for solution of
parameters which are easy to solve.
3) It results in estimates of quality of fit and intervals of confidence of
predicted values rather easily.
In the context of the straight line model of equation (15.1), suppose there are
n data points $(X_1, Y_1), (X_2, Y_2), \dots, (X_n, Y_n)$; then we can write from equation
(15.1)

$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \quad i = 1, \dots, n \qquad (15.3)$$

so that the sum of squares of the deviations from the true line is

$$S = \sum_{i=1}^{n}\epsilon_i^2 = \sum_{i=1}^{n}(Y_i - \beta_0 - \beta_1 X_i)^2 \qquad (15.4)$$
We shall choose our estimates $b_0$ and $b_1$ to be the values which, when
substituted for $\beta_0$ and $\beta_1$ in equation (15.4), produce the least possible value
of S. We can determine $b_0$ and $b_1$ by differentiating equation (15.4) first with
respect to $\beta_0$ and then with respect to $\beta_1$ and setting the results equal to zero.
Notice that $X_i$, $Y_i$ are fixed pairs of numbers from our data set for i varying
between 1 and n. Therefore,

$$\frac{\partial S}{\partial \beta_0} = -2\sum_{i=1}^{n}(Y_i - \beta_0 - \beta_1 X_i)$$

$$\frac{\partial S}{\partial \beta_1} = -2\sum_{i=1}^{n}X_i(Y_i - \beta_0 - \beta_1 X_i)$$

Setting these derivatives to zero, and writing $b_0$, $b_1$ for the minimising
values of $\beta_0$, $\beta_1$, gives the normal equations

$$\sum_{i=1}^{n}(Y_i - b_0 - b_1 X_i) = 0$$

$$\sum_{i=1}^{n}X_i(Y_i - b_0 - b_1 X_i) = 0 \qquad (15.5)$$
Solving these two equations simultaneously yields

$$b_0 = \frac{\sum Y_i \sum X_i^2 - \sum X_i \sum X_i Y_i}{n\sum X_i^2 - (\sum X_i)^2} \qquad (15.6)$$

$$b_1 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - (\sum X_i)^2} \qquad (15.7)$$
Thus (15.6) and (15.7) may be used to determine the estimates of the
parameters, and the predictive equation (15.2) may be used to obtain the
predicted value of Y (called $\hat{Y}$) for any desired value of X.
Rather than use the above procedure, a slightly modified (though equivalent)
method is to use the solution of the first normal equation in (15.5) to obtain
$b_0$ as

$$b_0 = \bar{Y} - b_1\bar{X} \qquad (15.8)$$

where $\bar{X}$ and $\bar{Y}$ are $(X_1 + X_2 + \dots + X_n)/n$ and $(Y_1 + Y_2 + \dots + Y_n)/n$
respectively. Substituting (15.8) in (15.2) yields the following estimated
regression equation

$$\hat{Y} = \bar{Y} + b_1(X - \bar{X}) \qquad (15.9)$$

where $b_1$ is computed by

$$b_1 = \frac{\sum X_i Y_i - (\sum X_i \sum Y_i)/n}{\sum X_i^2 - (\sum X_i)^2/n} = \frac{\Sigma(X_i - \bar{X})(Y_i - \bar{Y})}{\Sigma(X_i - \bar{X})^2} \qquad (15.10)$$
(15.10)
This equation, as you can easily see, is derived from the last expression in
(15.7) by simply dividing the numerator and denominator by n. It is written
in the form above as it has an interpretation suitable for analysis of variance
later.
Activity A
You can see that the last form of equation (15.10) is expressed in terms of
sums of squares or products of deviations of individual points from their
corresponding means. Show that in fact
$$\Sigma(X_i - \bar{X})(Y_i - \bar{Y}) = \Sigma X_i Y_i - (\Sigma X_i\,\Sigma Y_i)/n$$
and
$$\Sigma(X_i - \bar{X})^2 = \Sigma X_i^2 - (\Sigma X_i)^2/n$$
Hence verify equation (15.10).
The quantity $\Sigma X_i^2$ is called the uncorrected sum of squares of the X's, and
$(\Sigma X_i)^2/n$ is the correction for the mean of the X's. The difference is called
the corrected sum of squares of the X's. Similarly, $\Sigma X_i Y_i$ is called the
uncorrected sum of products, and $(\Sigma X_i\,\Sigma Y_i)/n$ is the correction for the means
of X and Y. The difference is called the corrected sum of products of X and
Y. In terms of these definitions we can see that the estimate of the slope of
the fitted straight line, $b_1$ from equation (15.10), is simply the ratio of the
corrected sum of products of X and Y to the corrected sum of squares of the X's.
How good is the Regression?
Analysis of Variance (ANOVA): Once the regression line is obtained we
would like to find out how good the fit is. This can be ascertained by an
examination of errors. If $Y_i$ is the ith data point and $\hat{Y}_i$ its predicted value
from the regression equation, then we can write

$$Y_i - \hat{Y}_i = (Y_i - \bar{Y}) - (\hat{Y}_i - \bar{Y})$$

If we square both sides and add the equations for i = 1 to n, we obtain

$$\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n}\left[(Y_i - \bar{Y}) - (\hat{Y}_i - \bar{Y})\right]^2 = \Sigma(Y_i - \bar{Y})^2 + \Sigma(\hat{Y}_i - \bar{Y})^2 - 2\Sigma(Y_i - \bar{Y})(\hat{Y}_i - \bar{Y})$$

The cross-product term can be shown to equal $2\Sigma(\hat{Y}_i - \bar{Y})^2$, so that

$$\Sigma(Y_i - \bar{Y})^2 = \Sigma(Y_i - \hat{Y}_i)^2 + \Sigma(\hat{Y}_i - \bar{Y})^2 \qquad (15.11)$$
Now $Y_i - \bar{Y}$ is the deviation of the ith observation from the overall mean and
so the left hand side of equation (15.11) is the sum of squares of the
deviations of the observations from the mean; this is shortened to SS (SS:
sum of squares) about the mean, and is also the corrected sum of squares of
the Y's. Since $Y_i - \hat{Y}_i$ is the deviation of the ith observation from its predicted
or fitted value, and $\hat{Y}_i - \bar{Y}$ is the deviation of the predicted value of the ith
observation from the mean, we can express equation (15.11) in words as
follows:
Sum of squares about the mean = Sum of squares about regression + Sum of squares due to regression
This shows that, of the variation in the Y's about their mean, some of the
variation can be ascribed to the regression line and some, $\Sigma(Y_i - \hat{Y}_i)^2$, to the
fact that the actual observations do not all lie on the regression line. If they all
did, the sum of squares about the regression would be zero. From this
procedure, we can see that a way of assessing how useful the regression line
will be as a predictor is to see how much of the SS about the mean has fallen
into the SS due to regression. We shall be pleased if the SS due to regression
is much greater than the SS about regression, or, what amounts to the same
thing, if the ratio of the SS due to regression to the total (corrected) SS is
close to one.
An Example: Data on the annual sales of a company in lakhs of Rupees over
the past eleven years is shown in the table below. Determine a suitable
straight line regression model, $Y = \beta_0 + \beta_1 X + \epsilon$, for the data in the table.
Solution: The independent variable in this problem is the year whereas the
response variable is the annual sales. Although we could take the actual year
as the independent variable itself, a judicious choice of the origin at the
middle year of 2003, with the corresponding X values for other years as
−5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, simplifies the calculations. From equation
(15.10) we see that to estimate the parameter $b_1$ we require the four
summations $\Sigma X_i$, $\Sigma Y_i$, $\Sigma X_i^2$ and $\Sigma X_i Y_i$.
Thus, calculations can be organised as shown below, where the totals of the
four columns yield the four desired summations:
We find that

n = 11
$\Sigma X_i = 0$, so $\bar{X} = 0/11 = 0$
$\Sigma Y_i = 102$, so $\bar{Y} = 102/11 = 9.27$
$\Sigma X_i^2 = 110$
$\Sigma X_i Y_i = 158$

$$b_1 = \frac{\Sigma X_i Y_i - (\Sigma X_i\,\Sigma Y_i)/n}{\Sigma X_i^2 - (\Sigma X_i)^2/n} = \frac{158}{110} = 1.44$$
The fitted equation is thus

$$\hat{Y} = \bar{Y} + b_1(X - \bar{X}) \quad \text{or} \quad \hat{Y} = 9.27 + 1.44X$$

Thus the parameters $\beta_0$ and $\beta_1$ of the model $Y = \beta_0 + \beta_1 X + \epsilon$ are estimated
by $b_0$ and $b_1$, which in this case are 9.27 and 1.44 respectively. Now that the
model is completely specified we can obtain the predicted values $\hat{Y}_i$ and the
errors or residuals $Y_i - \hat{Y}_i$ corresponding to the eleven observations. These
are shown in the table below:
i     Xi    Yi    Ŷi      Yi − Ŷi
1     −5    1     2.07    −1.07
2     −4    5     3.51    1.49
3     −3    4     4.95    −0.95
4     −2    7     6.39    0.61
5     −1    10    7.83    2.17
6     0     8     9.27    −1.27
7     1     9     10.71   −1.71
8     2     13    12.15   0.85
9     3     14    13.59   0.41
10    4     13    15.03   −2.03
11    5     18    16.47   1.53
To determine whether the fit is good enough, the ANOVA table can be
constructed.

SS due to regression (associated degrees of freedom = 1):

$$b_1\left[\Sigma X_i Y_i - \frac{\Sigma X_i\,\Sigma Y_i}{n}\right] = \frac{\left[\Sigma X_i Y_i - (\Sigma X_i\,\Sigma Y_i)/n\right]^2}{\Sigma X_i^2 - (\Sigma X_i)^2/n} = \frac{(158)^2}{110} = 226.95$$
The total (corrected) SS (associated degrees of freedom = 11 − 1 = 10):

$$\Sigma Y_i^2 - (\Sigma Y_i)^2/n = 1194 - (102)^2/11 = 1194 - 945.82 = 248.18$$

The value

$$R^2 = \frac{\text{SS due to regression}}{\text{SS about mean}} = \frac{226.95}{248.18} = 0.9145$$

indicating that the regression line explains 91.45% of the total variation about
the mean.
ANOVA Table

Source              SS       df    MS
Regression (b1)     226.95   1     MSR = 226.95
Residual            21.23    9     s² = 2.36
Total (corrected)   248.18   10
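All the quantities above can be reproduced with a short script; the sketch below uses NumPy and the coded X and Y values of the example:

```python
import numpy as np

# Coded years X and annual sales Y (Rs. lakhs) from the example.
X = np.arange(-5, 6)
Y = np.array([1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18])
n = len(X)

# Least squares estimates, equations (15.8) and (15.10).
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Y_hat = b0 + b1 * X
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")    # 9.27 and 1.44

# ANOVA decomposition, equation (15.11).
ss_total = np.sum((Y - Y.mean()) ** 2)    # 248.18, SS about the mean
ss_reg = np.sum((Y_hat - Y.mean()) ** 2)  # 226.95, SS due to regression
ss_res = np.sum((Y - Y_hat) ** 2)         # 21.23, SS about regression
print(f"R^2 = {ss_reg / ss_total:.4f}, s^2 = {ss_res / (n - 2):.2f}")
# R^2 ~ 0.9144 and s^2 = 2.36, matching the table up to rounding
```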
distribution as the number of components increases, by the Central Limit
Theorem.
The standard error (s.e.) of $b_1$ is the square root of the variance, that is

$$\text{s.e.}(b_1) = \frac{\sigma}{\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right]^{1/2}} \qquad (15.14)$$

If $\sigma$ is unknown, we may use the estimate s in its place and obtain the
estimated standard error of $b_1$ as

$$\text{est. s.e.}(b_1) = \frac{s}{\left[\Sigma(X_i - \bar{X})^2\right]^{1/2}} \qquad (15.15)$$
If we assume that the variations of the observations about the line are normal,
that is, that the errors $\epsilon_i$ are all from the same normal distribution $N(0, \sigma^2)$, it
can be shown that we can assign $100(1-\alpha)\%$ confidence limits for $\beta_1$ by
calculating

$$b_1 \pm \frac{t(n-2,\,1-\alpha/2)\; s}{\left[\Sigma(X_i - \bar{X})^2\right]^{1/2}} \qquad (15.16)$$

where $t(n-2,\,1-\alpha/2)$ is the $(1-\alpha/2)$ percentage point of a t-distribution
with n − 2 degrees of freedom (the number of degrees of freedom on which the
estimate $s^2$ is based) (see Figure III).
Figure III: The t-distribution
Standard Error of the Intercept and Confidence Interval for its Estimate

We may recall from equation (15.8) that

$$b_0 = \bar{Y} - b_1\bar{X}$$

In computing the variance of $b_0$ we require the variance of $\bar{Y}$ (which is
$\frac{1}{n^2}\sum_{i=1}^{n}\text{Var}(Y_i) = \frac{\sigma^2}{n}$, since $\text{Var}(Y_i) = \sigma^2$ by assumption (2) stated at the
beginning of Section 15.3) and the variance of $b_1$ (which is available from
equation (15.13) above). Since $\bar{X}$ may be treated as a constant we may write

$$V(b_0) = V(\bar{Y}) + (\bar{X})^2\,V(b_1) = \sigma^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\Sigma(X_i - \bar{X})^2}\right] = \frac{\sigma^2\,\Sigma X_i^2}{n\,\Sigma(X_i - \bar{X})^2} \qquad (15.17)$$
This uses the fact that $\bar{Y}$ and $b_1$, both of which are subject to error
influencing $b_0$, are uncorrelated. To see this, note that if $a_i$ and $c_i$ are
constants, and

$$a = a_1 Y_1 + a_2 Y_2 + \dots + a_n Y_n$$
$$c = c_1 Y_1 + c_2 Y_2 + \dots + c_n Y_n$$

then, provided that $Y_i$ and $Y_j$ are uncorrelated when $i \neq j$ and $V(Y_i) = \sigma^2$
for all i,

$$\text{Cov}(a, c) = (a_1 c_1 + a_2 c_2 + \dots + a_n c_n)\,\sigma^2$$

It follows, by setting $a = \bar{Y}$ (i.e. $a_i = 1/n$) and $c = b_1$ (i.e.
$c_i = (X_i - \bar{X})/\sum_{i=1}^{n}(X_i - \bar{X})^2$), that $\text{Cov}(\bar{Y}, b_1) = 0$; that is, $\bar{Y}$ and $b_1$ are
uncorrelated random variables. Thus the variance of the predicted mean value
of Y, $\hat{Y}_k$, at a specific value $X_k$ of X is

$$V(\hat{Y}_k) = V(\bar{Y}) + (X_k - \bar{X})^2\,V(b_1) = \sigma^2\left[\frac{1}{n} + \frac{(X_k - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right] \qquad (15.19)$$

where the expression in equation (15.13) for $V(b_1)$ has been utilised.
Hence the estimated standard error of the predicted mean value of Y for a
given $X_k$ is

$$\text{est. s.e.}(\hat{Y}_k) = s\left[\frac{1}{n} + \frac{(X_k - \bar{X})^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right]^{1/2} \qquad (15.20)$$

and $100(1-\alpha)\%$ confidence limits for this mean value may be set by
calculating $\hat{Y}_k \pm t(n-2,\,1-\alpha/2) \times \text{est. s.e.}(\hat{Y}_k)$, where $t(n-2,\,1-\alpha/2)$
corresponds to the $(1-\alpha/2)$ percentage point of a t-distribution with (n − 2)
degrees of freedom (recall Figure III).
F-test for Significance of Regression

Since the $Y_i$ are random variables, any function of them is also a random
variable; two particular functions are $MS_R$, the mean square due to regression,
and $s^2$, the mean square due to residual variation, which arise in the analysis
of variance table shown in Section 15.2.

In the case of fitting a straight line, it can be shown that if $\beta_1 = 0$ (i.e. the
true slope of the line is zero) the variable $MS_R$ multiplied by its degrees of
freedom (here one) and divided by $\sigma^2$ follows a $\chi^2$ (chi-square) distribution
with the same (1) number of degrees of freedom. In addition, $(n-2)s^2/\sigma^2$
follows a $\chi^2$ distribution with (n − 2) degrees of freedom. And since these
two variables are independent, a statistical theorem tells us that the ratio

$$F = \frac{MS_R}{s^2} \qquad (15.23)$$
$$b_0 \pm \frac{t(9,\,0.975)\;s\,(\Sigma X_i^2)^{1/2}}{\left[n\,\Sigma(X_i - \bar{X})^2\right]^{1/2}} = 9.27 \pm (2.262)(0.4637) = 9.27 \pm 1.0489, \text{ that is, 10.3189 and 8.2211}$$

Standard error of the forecast

$$\text{Estimate of } V(\hat{Y}_k) = s^2\left[\frac{1}{n} + \frac{(X_k - \bar{X})^2}{\Sigma(X_i - \bar{X})^2}\right] = 2.36\left[\frac{1}{11} + \frac{(X_k - 0)^2}{110}\right] = 2.36\left[\frac{1}{11} + \frac{X_k^2}{110}\right]$$
(The accompanying figure plots the data, the fitted line and the confidence
limits against the years 1997 to 2008, coded as X = −6 to 6.)
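The limits above can be reproduced from the summary quantities alone; in the sketch below the percentage point t(9, 0.975) = 2.262 is simply taken from tables:

```python
from math import sqrt

# Summary quantities from the worked example.
n, s2, sxx, x_bar = 11, 2.36, 110.0, 0.0
b0, b1, t_crit = 9.27, 1.44, 2.262        # t(9, 0.975) from tables

se_b1 = sqrt(s2 / sxx)                    # est. s.e.(b1), equation (15.15)
se_b0 = sqrt(s2 * (1 / n + x_bar ** 2 / sxx))
print(f"b1: {b1} +/- {t_crit * se_b1:.3f}")
print(f"b0: {b0} +/- {t_crit * se_b0:.4f}")   # ~9.27 +/- 1.05, as above

# 95% limits for the predicted mean value of Y at a given X_k (15.20).
for x_k in (0, 6):
    se_mean = sqrt(s2 * (1 / n + (x_k - x_bar) ** 2 / sxx))
    print(f"X = {x_k}: {b0 + b1 * x_k:.2f} +/- {t_crit * se_mean:.2f}")
```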
Activity B
For the example problem of Section 15.2 being considered above, determine
the 95% and 99% confidence limits for an individual observation for a given
$X_k$. Compute these limits for the year 2003 and the year 2009 (i.e. X = 0 and
X = 6 respectively). How do these limits compare with those found for the
mean value of Y above?
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
F-test for Significance of Regression
From the ANOVA table constructed for the example in Section 15.2
$MS_R = 226.95$
$s^2 = 2.36$

$$F = \frac{MS_R}{s^2} = \frac{226.95}{2.36} = 96.17$$

If we look up percentage points of the F(1, 9) distribution we see that the
95% point F(1, 9, 0.95) = 5.12. Since the calculated F exceeds the critical F
value in the table, that is F = 96.17 > 5.12, we reject the hypothesis $H_0$,
running a risk of less than 5% of being wrong.
Percentage Variation Explained
For the example problem $R^2 = \frac{226.95}{248.18} = 0.9145$.
This indicates that the regression line explains 91.45% of the total variation
about the mean.
15.5 VARIETY OF REGRESSION MODELS
The methods of regression analysis have been illustrated in this unit for the
case of fitting a straight line to a given set of data points. However, the same
principles are applicable to the fitting of a variety of other functions which
may be relevant in certain situations, highlighted below.
Seasonal Model

The monthly sales of items like woollens or desert coolers are expected to be
seasonal, and a sinusoidal model would be appropriate for such a case. If $F_t$
is the forecast for period t,

$$F_t = a + u\cos\frac{2\pi t}{N} + v\sin\frac{2\pi t}{N} \qquad (15.24)$$

where a, u and v are constants, t is the time period and N is the number of
time periods in a complete cycle (12 months if the cycle is 1 year). An
example of such a cyclic forecaster is given in Figure V.

Figure V: Cyclic Demand and a Cyclic Forecaster
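Because (15.24) is linear in the unknowns a, u and v, the parameters can be estimated by ordinary least squares with cosine and sine regressors. The sketch below fits the model to a hypothetical two-year monthly series:

```python
import numpy as np

# Hypothetical monthly demand with a 12-month cycle (N = 12).
N = 12
t = np.arange(1, 25)
rng = np.random.default_rng(1)
demand = 100 + 20 * np.cos(2 * np.pi * t / N) + rng.normal(0, 3, t.size)

# Regressor columns: [1, cos(2*pi*t/N), sin(2*pi*t/N)].
A = np.column_stack([np.ones(t.size),
                     np.cos(2 * np.pi * t / N),
                     np.sin(2 * np.pi * t / N)])
(a, u, v), *_ = np.linalg.lstsq(A, demand, rcond=None)
print(f"a = {a:.1f}, u = {u:.1f}, v = {v:.1f}")   # close to 100, 20 and 0

# One-period-ahead forecast for t = 25.
print(round(a + u * np.cos(2 * np.pi * 25 / N) + v * np.sin(2 * np.pi * 25 / N), 1))
```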
Polynomials of Various Order
We have considered a simple model of the first order with one independent
variable, namely

$$Y = \beta_0 + \beta_1 X + \epsilon$$

We may have k independent variables $X_1, X_2, \dots, X_k$ and obtain a first order
model with k independent variables as

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \epsilon \qquad (15.26)$$

In a forecasting context, for instance, the demand for tyres in a certain month
(Y) may be related to the sales of petrol three months ago ($X_1$), the number of
new registrations of vehicles six months ago ($X_2$) and the current month's
target production of vehicles ($X_3$). A second order model with one
independent variable would be

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon \qquad (15.27)$$
The most general type of linear model in variables $X_1, X_2, \dots, X_k$ is of the
form

$$Y = \beta_1 Z_1 + \beta_2 Z_2 + \dots + \beta_p Z_p + \epsilon \qquad (15.28)$$

where $Z_j = f_j(X_1, X_2, \dots, X_k)$ can take any form. In many cases, each $Z_j$ may
involve only one X variable.
Multiplicative Models

Often, by a simple transformation, a non-linear model may be handled by the
methods of linear regression. For instance, in the multiplicative model

$$Y = a\,X_1^b\,X_2^c\,X_3^d\,\epsilon \qquad (15.29)$$

a, b, c, d are unknown parameters and $\epsilon$ is the multiplicative random error.
Taking logarithms in equation (15.29) converts the model to the linear form

$$\ln Y = \ln a + b\ln X_1 + c\ln X_2 + d\ln X_3 + \ln\epsilon \qquad (15.30)$$

This model is of the form (15.28), with the parameters being ln a, b, c and d
and the independent variables being $\ln X_1$, $\ln X_2$, $\ln X_3$, while the dependent
variable is ln Y.
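A sketch of this linearisation, on hypothetical data with known parameters; recovering estimates close to the true a, b, c and d confirms that ordinary least squares on the logs estimates the multiplicative model:

```python
import numpy as np

# Hypothetical data generated from a multiplicative model with known
# parameters a = 2.0, b = 0.5, c = 1.5, d = -0.7.
rng = np.random.default_rng(0)
X1, X2, X3 = rng.uniform(1, 10, (3, 50))
Y = 2.0 * X1**0.5 * X2**1.5 * X3**-0.7 * rng.lognormal(0, 0.05, 50)

# Fit ln Y = ln a + b ln X1 + c ln X2 + d ln X3 by least squares (15.30).
A = np.column_stack([np.ones(50), np.log(X1), np.log(X2), np.log(X3)])
(ln_a, b, c, d), *_ = np.linalg.lstsq(A, np.log(Y), rcond=None)
print(f"a = {np.exp(ln_a):.2f}, b = {b:.2f}, c = {c:.2f}, d = {d:.2f}")
```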
Linear and Non-linear Regression
We have seen above that many non-linear models can be transformed to
linear models by simple transformations. It is to be noted that we are
referring to linearity in the unknown parameters, so that any model which can
be expressed in the form of equation (15.28) is called linear. For such a model the
parameters can be obtained by the method of least squares as the solution to a
set of linear equations (known as the normal equations). Non-linear models
which can be transformed to yield linear models are called intrinsically
linear. Some models are intrinsically non-linear. Examples are:
$$Y = a\,X_1^b\,X_2^c\,X_3^d + \epsilon \qquad (15.31)$$
$$Y = \beta_0 + \beta_1 e^{\beta_2 X} + \epsilon \qquad (15.32)$$
$$Y = \beta_0 + \beta_1 X + \beta_2(\beta_3)^X + \epsilon \qquad (15.33)$$

Some kind of iterative method has to be employed for estimating the
parameters of a non-linear system.
15.6 SUMMARY
In this unit fundamentals of linear regression have been highlighted. Broadly
speaking, the fitting of any chosen mathematical function to given data is
termed as regression analysis. The estimation of the parameters of this model
is accomplished by the least squares criterion which tries to minimise the sum
of squares of the errors for all the data points.
How the parameters of a fitted straight line model are estimated has been
illustrated through an example.
After the model is fitted to the data, the next logical question is how good
the quality of the fit is. This question can best be answered by conducting
statistical tests and determining the standard errors of estimate. This
information permits us to make quantitative statements regarding confidence
limits for estimates of the parameters as well as the forecast values. An
overall percentage of variation explained can also be computed, and it serves
to give a score to the regression. It thus also serves to compare alternative
regression models that may have been hypothesised. The various computations
involved in practice have been illustrated on an example problem.
Finally, it has been emphasised that the method of least squares used in linear
regression is applicable to a wide class of models. In each case the model
parameters are obtained by the solution of the so-called "normal equations".
These are simultaneous linear equations, equal in number to the number of
parameters to be estimated, obtained by partially differentiating the sum of
squares of errors with respect to the individual parameters.
Regression is thus a potent device for establishing relationships between
variables from the given data. The discovered relationship can be used for
predictive purposes. Some of the models used in forecasting of demand rely
heavily on regression analysis. One such class of models, called time-series
models, is explored in Unit 16.
15.7 SELF-ASSESSMENT EXERCISES

c) Y = a + b e^(−cX) + ε

d) Y = a + b cos(2πt/N) + c sin(2πt/N) + ε
3) The demand for two products A and B over twelve periods is given below:

Period               1     2     3     4     5     6     7     8     9    10    11    12
Demand of A         80   100    79    98    95   104    80    98   102    96   115    88
Demand of B        199   202   199   208   212   194   214   220   219   234   219   233
15.8 KEYWORDS
Dependent variable: The variable of interest or focus which is influenced by
one or more independent variable(s).
Estimate: A value obtained from data for a certain parameter of the assumed
model or a forecast value obtained from the model.
Independent variable: A variable that can either be set to a desired value or
takes values that can be observed but not controlled.
Least squares: The criterion by which parameters of the model are estimated,
by minimising the sum of squares of errors (the discrepancies between fitted
and actual values).
Linear regression: Fitting of any chosen mathematical model, linear in the
unknown parameters, to given data.
Model: A general mathematical relationship relating a dependent (or
response) variable Y to independent variables X_1, X_2, …, X_k by a function
of the form
Y = f(X_1, X_2, …, X_k)
Non-linear regression: Fitting of any chosen mathematical model, non-linear
in the unknown parameters, to given data.
UNIT 16 TIME SERIES ANALYSIS
Objectives
After completion of this unit, you should be able to :
• appreciate the role of time series analysis in short term forecasting
• decompose a time series into its various components
• understand auto-correlations to help identify the underlying patterns of a
time series
• become aware of stochastic models developed by Box and Jenkins for
time series analysis
• make forecasts from historical data using a suitable choice from
available methods.
Structure
16.1 Introduction
16.2 Decomposition Methods
16.3 Example of Forecasting using Decomposition
16.4 Use of Auto-correlations in Identifying Time Series
16.5 An Outline of Box-Jenkins Models for Time Series
16.6 Summary
16.7 Self-assessment Exercises
16.8 Key Words
16.9 Further Readings
16.1 INTRODUCTION
Time series analysis is one of the most powerful methods in use, especially
for short term forecasting purposes. From the historical data one attempts to
obtain the underlying pattern so that a suitable model of the process can be
developed, which is then used for purposes of forecasting or studying the
internal structure of the process as a whole. We have already seen in earlier
units that a variety of methods such as subjective methods, moving averages
and exponential smoothing, regression methods, causal models and time-
series analysis are available for forecasting. Time series analysis looks for the
dependence between values in a time series (a set of values recorded at equal
time intervals) with a view to accurately identifying the underlying pattern of
the data.
In the case of quantitative methods of forecasting, each technique makes
explicit assumptions about the underlying pattern. For instance, in using
regression models we had first to make a guess on whether a linear or
parabolic model should be chosen and only then could we proceed with the
estimation of parameters and model development. We could rely on mere
visual inspection of the data or its graphical plot to make the best choice of
the underlying model. However, such guesswork, though not uncommon, is
unlikely to yield very accurate or reliable results. In time series analysis, a
systematic attempt is made to identify and isolate different kinds of patterns
in the data. The four kinds of patterns that are most frequently encountered
are horizontal, non-stationary (trend or growth), seasonal and cyclical.
Generally, a random or noise component is also superimposed.
We shall first examine the method of decomposition wherein a model of the
time-series in terms of these patterns can be developed. This can then be used
for forecasting purposes as illustrated through an example.
A more accurate and statistically sound procedure to identify the patterns in a
time-series is through the use of auto-correlations. Auto-correlation refers to
the correlation between the same variable at different time lags and was
discussed in an earlier unit. Auto-correlations can be used to identify the patterns in
a time series and suggest appropriate stochastic models for the underlying
process. A brief outline of common processes and the Box-Jenkins
methodology is then given.
Finally the question of the choice of a forecasting method is taken up.
Characteristics of various methods are summarised along with likely
situations where these may be applied. Of course, considerations of cost and
accuracy desired in the forecast play a very important role in the choice.
Figure III: Exponential Trend
Finally, randomness can be eliminated by averaging the different values of
equation (16.5). The averaging is done on the same months or seasons of
different years (for example, the average of all Januaries, all Februaries, …,
all Decembers). The result is a set of seasonal values free of randomness,
called seasonal indices, which are widely used in practice.
In order to forecast, one must reconstruct each of the components of equation
(16.1). The seasonality is known through averaging the values in equation
(16.5) and the trend through (16.3). The cycle of equation (16.4) must be
estimated by the user and the randomness cannot be predicted.
To illustrate the application of this procedure to actual forecasting of a time
series, an example will now be considered.
Table 1: Quarterly Sales Data (Rs. lakh)

Year       I      II     III    IV
1983      5.5    5.4    7.2    6.0
1984      4.8    5.6    6.3    5.6
1985      4.0    6.3    7.0    6.5
1986      5.2    6.5    7.5    7.2
1987      6.0    7.0    8.4    7.7
Table 2: Computation of moving averages M_t and the ratios X_t/M_t
It should be noticed that the 4-quarter moving totals pertain to the middle of
two successive periods. Thus the value 24.1, computed at the end of Quarter
IV, 1983, refers to the middle of Quarters II and III, 1983, and the next
moving total of 23.4 refers to the middle of Quarters III and IV, 1983. Thus,
by taking their average we obtain the centred moving total of
(24.1 + 23.4)/2 = 23.75 ≅ 23.8, to be placed against Quarter III, 1983.
Similarly for the other values. In case the number of periods in the moving
total or average is odd, centering will not be required.
The seasonal indices for the quarterly sales data can now be computed by
taking averages of the X_t/M_t ratios of the respective quarters for different
years, as shown in Table 3.
Table 3: Computation of Seasonal Indices

Year               I        II       III       IV
1983               -        -       1.200    1.017
1984             0.828    1.000    1.145    1.018
1985             0.702    1.068    1.148    1.032
1986             0.813    1.000    1.119    1.043
1987             0.845    0.972      -        -
Mean             0.797    1.010    1.153    1.028
Seasonal Index   0.799    1.013    1.156    1.032
The seasonal indices are computed from the quarter means by adjusting these
means so that their average over the year is unity. Thus the sum of the means
in Table 3 is 3.988 and, since there are four quarters, each mean is adjusted
by multiplying it by the constant factor 4/3.988 to obtain the indicated
seasonal indices. These seasonal indices can now be used to obtain the
deseasonalised sales of the firm by dividing the actual sales by the
corresponding index, as shown in Table 4.
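The computations of Tables 2 and 3 may be verified with a short Python
sketch; small differences from the tabulated figures arise from rounding at
intermediate steps in the text.

import numpy as np

# Sketch of the computations behind Tables 2 and 3 (quarterly sales, Table 1).
sales = np.array([5.5, 5.4, 7.2, 6.0,    # 1983
                  4.8, 5.6, 6.3, 5.6,    # 1984
                  4.0, 6.3, 7.0, 6.5,    # 1985
                  5.2, 6.5, 7.5, 7.2,    # 1986
                  6.0, 7.0, 8.4, 7.7])   # 1987

totals = np.convolve(sales, np.ones(4), mode="valid")   # 4-quarter moving totals
M = (totals[:-1] + totals[1:]) / 2 / 4                  # centred moving averages M_t
ratios = sales[2:-2] / M                   # X_t/M_t, Quarter III 1983 .. Quarter II 1987

# Quarter-wise means of the ratios (the first ratio belongs to Quarter III)
means = np.array([ratios[2::4].mean(),    # Quarter I
                  ratios[3::4].mean(),    # Quarter II
                  ratios[0::4].mean(),    # Quarter III
                  ratios[1::4].mean()])   # Quarter IV

seasonal_index = means * 4 / means.sum()  # rescaled so the average is unity
print(seasonal_index.round(3))            # close to 0.799, 1.013, 1.156, 1.032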
Table 4: Deseasonalised Sales

Year   Quarter   Actual Sales   Seasonal Index   Deseasonalised Sales
1983     I           5.5            0.799               6.9
         II          5.4            1.013               5.3
         III         7.2            1.156               6.2
         IV          6.0            1.032               5.8
1984     I           4.8            0.799               6.0
         II          5.6            1.013               5.5
         III         6.3            1.156               5.4
         IV          5.6            1.032               5.4
1985     I           4.0            0.799               5.0
         II          6.3            1.013               6.2
         III         7.0            1.156               6.0
         IV          6.5            1.032               6.3
1986     I           5.2            0.799               6.5
         II          6.5            1.013               6.4
         III         7.5            1.156               6.5
         IV          7.2            1.032               7.0
1987     I           6.0            0.799               7.5
         II          7.0            1.013               6.9
         III         8.4            1.156               7.3
         IV          7.7            1.032               7.5
Fitting a Trend Line
The next step after deseasonalising the data is to develop the trend line. We
shall here use the method of least squares that you have already studied in
Unit 15 on regression. Choice of the origin in the middle of the data, with a
suitable scaling, simplifies computations considerably. To fit a straight line of
the form Y = a + bX to the deseasonalised sales, we proceed as shown in
Table 5.
Table 5: Computation of Trend

Taking X = −19, −17, …, 17, 19 for the 20 quarters (so that ΣX = 0), the least
squares estimates reduce to

a = ΣY/n = 125.6/20 = 6.3

b = ΣXY/ΣX^2 = 114.2/2660 ≅ 0.04

∴ the trend line is Y = 6.3 + 0.04X
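A short Python sketch of the computation in Table 5 follows; the symmetric
coding of X is as implied by the tabulated trend values.

import numpy as np

# Sketch of Table 5: fit Y = a + bX to the deseasonalised sales with
# X coded symmetrically about the middle of the data so that sum(X) = 0.
Y = np.array([6.9, 5.3, 6.2, 5.8, 6.0, 5.5, 5.4, 5.4, 5.0, 6.2,
              6.0, 6.3, 6.5, 6.4, 6.5, 7.0, 7.5, 6.9, 7.3, 7.5])
X = np.arange(-19, 21, 2)         # -19, -17, ..., 17, 19 (two units per quarter)

a = Y.sum() / Y.size              # 125.6 / 20 = 6.28, rounded to 6.3 in the text
b = (X * Y).sum() / (X**2).sum()  # 114.2 / 2660 = 0.043, rounded to 0.04
print(round(a, 2), round(b, 3))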
Identifying Cyclical Variation
The cyclical component is identified by measuring deseasonalised variation
around the trend line, as the ratio of the actual deseasonalised sales to the
value predicted by the trend line. The computations are shown in Table 6.
Table 6: Computation of Cyclical Variation

Year   Quarter   Deseasonalised Sales (Y)   Trend (a + bX)   Y/(a + bX)
1983     I                6.9                    5.54           1.245
         II               5.3                    5.62           0.943
         III              6.2                    5.70           1.088
         IV               5.8                    5.78           1.003
1984     I                6.0                    5.86           1.024
         II               5.5                    5.94           0.926
         III              5.4                    6.02           0.897
         IV               5.4                    6.10           0.885
1985     I                5.0                    6.18           0.809
         II               6.2                    6.26           0.990
         III              6.0                    6.34           0.946
         IV               6.3                    6.42           0.981
1986     I                6.5                    6.50           1.000
         II               6.4                    6.58           0.973
         III              6.5                    6.66           0.976
         IV               7.0                    6.74           1.039
1987     I                7.5                    6.82           1.110
         II               6.9                    6.90           1.000
         III              7.3                    6.98           1.046
         IV               7.5                    7.06           1.062
Forecasting with the Decomposed Components of the Time Series
Suppose that the management of the engineering firm is interested in
estimating the sales for the second and third quarters of 1988. The estimates
of the deseasonalised sales can be obtained by using the trend line:

Y = 6.3 + 0.04(23) = 7.22 (2nd Quarter, 1988)
Y = 6.3 + 0.04(25) = 7.30 (3rd Quarter, 1988)

These estimates will now have to be seasonalised for the second and third
quarters respectively. This can be done as follows:

For 1988 2nd quarter: seasonalised sales estimate = 7.22 × 1.013 = 7.31
For 1988 3rd quarter: seasonalised sales estimate = 7.30 × 1.156 = 8.44
Thus, on the basis of the above analysis, the sales estimates of the
Engineering firm for the second and third quarters of 1988 are Rs. 7.31 lakh
and Rs. 8.44 lakh respectively.
These estimates have been obtained by taking the trend and seasonal
variations into account. Cyclical and irregular components have not been
taken into account. The procedure for cyclical variations only helps to study
past behaviour and does not help in predicting the future behaviour.
Moreover, random or irregular variations are difficult to quantify.
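The forecast arithmetic above can be summarised in a few lines of Python;
the constants are those of the worked example.

# Sketch of the forecast computation for 1988 Quarters II and III.
a, b = 6.3, 0.04
seasonal_index = {"I": 0.799, "II": 1.013, "III": 1.156, "IV": 1.032}

# X continues past the last observation (Quarter IV 1987 at X = 19) in steps of 2
for quarter, X in (("II", 23), ("III", 25)):
    trend_estimate = a + b * X                          # deseasonalised sales
    forecast = trend_estimate * seasonal_index[quarter]
    print(quarter, round(trend_estimate, 2), round(forecast, 2))
# prints: II 7.22 7.31  and  III 7.3 8.44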
A process of the form

x_t = φ_1 x_(t−1) + φ_2 x_(t−2) + … + φ_p x_(t−p) + a_t     … (16.6)

where x_t = X_t − m denotes the deviation of the series from its mean m and
a_t is a random (white noise) disturbance with variance σ_a^2, is called an
auto-regressive (AR) process of order p. The reason for this name is that
equation (16.6) represents a regression of the variable x_t on successive
values of itself. The model contains p + 2 unknown parameters m,
φ_1, φ_2, …, φ_p, σ_a^2 which in practice have to be estimated from the data.
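As a rough illustration of this idea, the following Python sketch simulates a
zero-mean AR(2) process (so that m = 0) and recovers φ_1 and φ_2 by
regressing the series on its own lagged values; the parameter values are
illustrative.

import numpy as np

# Sketch of the "auto-regression" in (16.6): simulate AR(2), then estimate
# phi_1 and phi_2 by least squares on the lagged values.
rng = np.random.default_rng(4)
phi1, phi2, n = 0.6, 0.3, 500
x = np.zeros(n)
for t in range(2, n):
    x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + rng.normal()

lags = np.column_stack([x[1:-1], x[:-2]])     # X_{t-1}, X_{t-2}
coef, *_ = np.linalg.lstsq(lags, x[2:], rcond=None)
print(coef.round(2))                          # estimates close to (0.6, 0.3)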
Similarly, a process of the form

x_t = a_t − θ_1 a_(t−1) − θ_2 a_(t−2) − … − θ_q a_(t−q)     … (16.7)

is called a moving average (MA) process of order q. The name "moving
average" is somewhat misleading, because the weights 1, −θ_1, −θ_2, …, −θ_q
which multiply the a's need not total unity, nor need they be positive.
However, this nomenclature is in common use and therefore we employ it.
The model (16.7) contains q + 2 unknown parameters m, θ_1, θ_2, …, θ_q, σ_a^2
which in practice have to be estimated from the data.
Mixed auto-regressive-moving average models:
It is sometimes advantageous to include both auto-regressive and moving
average terms in the model. This leads to the mixed auto-regressive-moving
average (ARMA) model

x_t = φ_1 x_(t−1) + … + φ_p x_(t−p) + a_t − θ_1 a_(t−1) − … − θ_q a_(t−q)     … (16.8)

In using such models in practice, p and q are usually not greater than 2.
For non-stationary processes the most general model used is an auto-
regressive integrated moving average (ARIMA) process of order (p, d, q)
where d represents the degree of differencing to achieve stationarity in the
process.
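As an illustration, and assuming the statsmodels package is available, an
ARIMA model may be fitted to a series as sketched below; the series is
synthetic and the order (1, 1, 1) is chosen only for the example.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Sketch: fit an ARIMA(p, d, q) model and forecast ahead. The synthetic
# series drifts, so d = 1 differencing handles its non-stationarity.
rng = np.random.default_rng(5)
series = np.cumsum(rng.normal(0.5, 1.0, 200))   # a drifting, non-stationary series

result = ARIMA(series, order=(1, 1, 1)).fit()   # p = 1, d = 1, q = 1
print(result.params)                            # estimated AR and MA coefficients
print(result.forecast(4))                       # forecasts for the next four periods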
The main contribution of Box and Jenkins is the development of procedures
for identifying the ARMA model that best fits a set of data and for testing the
adequacy of that model. The various stages identified by Box and Jenkins in
their iterative approach to model building are shown in Figure VI. For
details on how such models are developed, refer to Box and Jenkins.
Figure VI: The Box-Jenkins Methodology (identify a model to be tentatively
entertained; estimate the parameters in the tentative model; carry out
diagnostic checking to ask whether the model is adequate; if not, return to
the identification stage, and if yes, use the model)
16.6 SUMMARY
Some procedures for time series analysis have been described in this unit
with a view to making more accurate and reliable forecasts of the future.
Quite often the question that puzzles a person is how to select an appropriate
forecasting method. Many times the problem context or time horizon
involved would decide the method or limit the choice of methods. For
instance, in new areas of technology forecasting where historical information
is scanty, one would resort to some subjective method like an opinion poll or
a DELPHI study. In situations where one is trying to control or manipulate a
factor, a causal model might be appropriate for identifying the key variables
and their effect on the dependent variable.
In this particular unit, however, time series models, that is, models based on
historical data of demand or the variable of interest, have been discussed.
Thus we are dealing with projecting into the future from the past. Such
models are short term forecasting models.
The decomposition method has been discussed. Here the time series is broken
up into seasonal, trend, cycle and random components from the given data
and reconstructed for forecasting purposes. A detailed example to illustrate
the procedure is also given.
Finally, the framework of stochastic models used by Box and Jenkins for time
series analysis has been outlined. The AR, MA, ARMA and ARIMA
processes in Box-Jenkins models are briefly described so that the interested
reader can pursue a detailed study on his own.
16.7 SELF-ASSESSMENT EXERCISES

7) A survey of used car sales in a city for the 10-year period 1976-85 has
been made. A linear trend was fitted to the sales per month for each year
and the equation was found to be

Y = 400 + 18t

where t = 0 on January 1, 1981 and t is measured in half-year (6-monthly)
units.

a) Use this trend to predict sales for June, 1990.

b) If the actual sales in June, 1987 are 600 and the relative seasonal index
for June sales is 1.20, what would be the relative cyclical-irregular index
for June, 1987?
8) The monthly sales of a product for the last one year, in thousands of
units, are given below:
Month 1 2 3 4 5 6 7 8 9 10 11 12
Sales 0.5 1.5 2.2 3.0 3.2 3.5 3.5 3.5 3.8 4.0 4.7 5.5
Compute the auto-correlation coefficients up to lag 4. What conclusion
can be derived from these values regarding the presence of a trend in the
data?
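One way of carrying out this computation is sketched below in Python.

import numpy as np

# Sketch for exercise 8: the lag-k auto-correlation r_k is the correlation
# of the series with itself shifted by k periods.
sales = np.array([0.5, 1.5, 2.2, 3.0, 3.2, 3.5, 3.5, 3.5, 3.8, 4.0, 4.7, 5.5])
dev = sales - sales.mean()
c0 = (dev**2).sum()
for k in range(1, 5):
    r_k = (dev[:-k] * dev[k:]).sum() / c0
    print(k, round(r_k, 3))
# Large positive values at low lags that decline slowly indicate a trend.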
16.8 KEY WORDS

Decomposition: Identifying the trend, seasonality, cycle and randomness in
a time series.
Forecasting: Predicting the future values of a variable based on the historical
values of the same or other variable(s). If the forecast is based simply on past
values of the variable itself, it is called time series forecasting; otherwise it is
causal forecasting.
Seasonal Index : A number with a base of 1.00 that indicates the seasonality
for a given period in relation to other periods.
Time Series Model : A model that predicts the future by expressing it as a
function of the past.
Trend : A growth or decline in the mean value of a variable over the relevant
time span.