
BLOCK 4
FORECASTING METHODS

UNIT 13 BUSINESS FORECASTING

Objectives
After completion of this unit, you should be able to :
• realise that forecasting is a scientific discipline unlike ad hoc predictions
• appreciate that forecasting is essential for a variety of planning decisions
• become aware of forecasting methods for long, medium and short term
decisions
• use Moving Averages and Exponential smoothing for demand
forecasting
• understand the concept of forecast control
• use the moving range chart to monitor a forecasting system.
Structure
13.1 Introduction
13.2 Forecasting for Long Term Decisions
13.3 Forecasting for Medium and Short Term Decisions
13.4 Forecast Control
13.5 Summary
13.6 Self-assessment Exercises
13.7 Key Words
13.8 Further Readings

13.1 INTRODUCTION
Data on demands of the market may be needed for a number of purposes to
assist an organisation in its long term, medium and short term decisions.
Forecasting is essential for a number of planning decisions and often
provides a valuable input on which future operations of the business
enterprise depend. Some of the areas where forecasts of future product
demand would be useful are indicated below :
1) Specification of production targets as functions of time.
2) Planning equipment and manpower usage, as well as additional
procurement.
3) Budget allocation depending on the level of production and sales.
4) Determination of the best inventory policy.
5) Decisions on expansion and major changes in production processes and
methods.
6) Future trends of product development, diversification, scrapping etc.
7) Design of suitable pricing policy.
8) Planning the methods of distribution and sales promotion.
It is thus clear that the forecast of demand of a product serves as a vital input
for a number of important decisions and it is, therefore, necessary, to adopt a
systematic and rational methodology for generating reliable forecasts.
The Uncertain Future
The future is inherently uncertain and since time immemorial man has made
attempts to unravel the mystery of the future. In the past it was the crystal
gazer or a person allegedly in possession of some supernatural powers who
would make predictions about things to come, such as major events or the rise and
fall of kings. In today's world, predictions are being made daily in the realm
of business, industry and politics. Since the operation of any capital
enterprise has a large lead time (1-5 years is typical), it is clear that a factory
conceived today is for some future demand and the whole operation is
dependent on the actual demand coming up to the level projected much
earlier. During this period many circumstances, which might not even have
been imagined, could come up. For instance, there could be development of
other industries, or a major technological breakthrough that may render the
originally conceived product obsolete; or a social upheaval and change of
government may redefine priorities of growth and development; or an
unusual weather condition like drought or floods may alter completely the
buying potential of the originally conceived market. This is only a partial list
to suggest how uncertainties from a variety of sources can enter to make the
task of prediction of the future extremely difficult.
It is proper at this stage to emphasise the distinction between prediction and
forecasting. Forecasting generally refers to the scientific methodology that
often uses past data along with some well-defined assumptions or 'model' to
come up with a 'forecast' of future demand. In that sense, forecasting is
objective. A prediction is a subjective estimate made by an individual by
using his intuitive 'hunch' which may in fact come out true. But the fact that it
is subjective (A's prediction may be different from B's and C's) and non-
realisable as a well-documented computer programme (which could be used
by anyone) deprives it of much value. This is not to discount the role of
intuition or subjectivity in practical decision-making. In fact, for complex
long term decisions, intuitive methods such as the Delphi technique are most
popular. The opinion of a well informed, educated person is likely to be
reliable, reflecting the well-considered contribution of a host of complex
factors in a relationship that may be difficult to explicitly quantify. Often
forecasts are modified based on subjective judgment and experience to obtain
predictions used in planning and decision making.
The future is inherently uncertain and any forecast at best is an educated
guess with no guarantee of coming true. In certain purely deterministic
systems (as for example in classical physics the laws governing the motion of
celestial bodies are fairly well developed) an unequivocal relationship
between cause and effect has been clearly established and it is possible to
predict very accurately the course of events in the future, once the future
patterns of causes are inferred from past behaviour. Economic systems,
however, are more complex because (i) there is a large number of governing
factors in a complex structural framework which may not be possible to
identify and (ii) the individual factors themselves have a high degree of
variability and uncertainty. The demand for a particular product (say
umbrellas) would depend on competitor's prices, advertising campaigns,
weather conditions, population and a number of factors which might even be
difficult to identify. In spite of these complexities, a forecast has to be made
so that the manufacturers of umbrellas (a product which exhibits a seasonal
demand) can plan for the next season.
Forecasting for Planning Decisions
The primary purpose of forecasting. is to provide valuable information for
planning the design and operation of the enterprise. Planning decisions may
be classified as long term, medium term and short term.
Long term decisions include decisions like plant expansion or new product
introduction which may require new technologies or a complete
transformation in the social or moral fabric of society. Such decisions are
generally characterised by lack of quantitative information and absence of
historical data on which to base the forecast of future events. Intuition and
the collected opinion of experts in the field generally play a significant role
in developing forecasts for such decisions. Some methods used in forecasting
for long term decisions are discussed in Section 13.2.
Medium term decisions involve such decisions as planning the production
levels in a manufacturing plant over the next year, determination of
manpower requirements or inventory policy for the firm. Short term
decisions include daily production planning and scheduling decisions. For
both medium and short term forecasting, many methods and techniques exist.
These methods can broadly be classified as follows:
a) Subjective or intuitive methods.
b) Methods based on averaging of past data, including simple, weighted and
moving averages.
c) Regression models on historical data.
d) Causal or Econometric models.
e) Time series analysis or stochastic models.
These methods are briefly reviewed in Section 13.3. A more detailed
discussion of correlation, regression and time series models is taken up in the
next three units.
The choice of an appropriate forecasting method depends on the planning
horizon and the data available. The aspect of forecast control, which tells
whether a particular method in use is acceptable, is discussed in Section 13.4.
And finally a summary is given in Section 13.5.

13.2 FORECASTING FOR LONG TERM
DECISIONS
Technological Forecasting
Technological growth is often haphazard, especially in developing countries
like India. This is because technology seldom evolves and there are frequent
technology transfers due to imports of knowhow, resulting in a leap-frogging
phenomenon. In spite of this, it is generally seen that logarithms of many
technological variables show linear trends with time, showing exponential
growth. Some extrapolations reported by Rohatgi et al. are
• Passenger kms carried by Indian Airlines (Figure I)
• Fertilizer applied per hectare of cropped area (Figure II)
• Demand and supply of petroleum crude (Figure III)
• Installed capacity of electricity generation in millions of KW (Figure IV).

Figure I: Passenger Km Carried by Indian Airlines
Figure II: Fertilizer Applied per Hectare of Cropped Area
Figure III: Demand and Supply of Petroleum Crude
Figure IV: Installed Capacity of Electricity Generation in Million KW

The use of S curves in forecasting technological growth is also common.


Rather than implying unchecked growth, the S curve implies that there is a limit to
growth. Thus the growth rate of technology is slow to begin with (owing to initial problems), it
reaches a maximum (when the technology becomes stable and popular) and
finally declines till the technology becomes obsolete and is replaced by a
newer alternative. Some examples of the use of S curves as reported by
Rohatgi et al. (1979) are
• Hydroelectric power generation using a Gompertz growth curve (Figure V)
• Number of villages electrified using a Pearl type growth curve (Figure VI).

Figure V: Hydroelectric Power Generation Using Gompertz Growth Curve
Figure VI: Number of Villages Electrified Using a Pearl Type Growth Curve

Apart from the above extrapolative techniques which are based on the
projection of historical data into the future (such models are called regression
models and you will learn more about them in Unit 15), technological
forecasting often implies prediction of future scenarios or likely possible
futures. As an example, suppose there are three events E1, E2 and E3, where
each one may or may not happen in the future (a bar above an event indicates
that the event does not take place). Thus, the eight possible scenarios
E1E2E3, E1E2Ē3, E1Ē2E3, Ē1E2E3, E1Ē2Ē3, Ē1E2Ē3, Ē1Ē2E3 and Ē1Ē2Ē3
show the range of possible futures. Moreover, these events may not
be independent. The breakout of war (E1) is likely to lead to increased
spending on defence (E2) and reduced emphasis on rural uplift and social
development (E3). Such interactions can be investigated using the Cross-
impact Technique.
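The number of such scenarios grows as 2 raised to the number of events, and they are easy to generate mechanically. The short Python sketch below simply lists the eight possible futures for three events; the names E1, E2, E3 and the prime notation for non-occurrence are illustrative choices, not part of the original text.

from itertools import product

# Three events that may (True) or may not (False) occur in the future.
events = ["E1", "E2", "E3"]

# Enumerate all 2**3 = 8 possible scenarios.
for outcome in product([True, False], repeat=len(events)):
    scenario = " ".join(
        name if happens else name + "'"   # E1' denotes that E1 does not occur
        for name, happens in zip(events, outcome)
    )
    print(scenario)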
Delphi
This is a subjective method relying on the opinion of experts designed to
minimise bias and error of judgment. A Delphi panel consists of a number of
experts with an impartial leader or coordinator who organises the questions.
Specific questions (rather than general opinions) with yes-no or multiple type
answers or specific dates/events are sought from the experts. For instance,
questions could be of the following kind :
• When do you think the petroleum reserves of the country would be
exhausted? (2020,2040, 2060)
• When would the level of pollution in Delhi exceed danger limit? (as
defined by a particular agency)?
• What would the population of India be in 2020, 2040 and 2060?
• When would fibre optics become a commercial viability for
communication?
A summary of the responses of the participants is sent to each expert
participating in the Delphi panel after a statistical analysis. For a forecast of
when an event is likely to happen, the most optimistic and pessimistic
estimates along with a distribution of other responses is given to the
participant. On the basis of this information the experts may like to revise
their earlier estimates and give revised estimates to the coordinator. It may be
mentioned that the identities of the experts are not revealed to each other so
that bias or influence by reputation is kept to a minimum. Also the feedback
response is statistical in nature without revealing who made which forecast.
The Delphi method is an iterative procedure in which revisions are carried
out by the experts till the coordinator gets a stable response.
The method is very efficient, if properly conducted, as it provides a
systematic framework for collecting expert opinion. By virtue of anonymity,
statistical analysis and feedback of results and provision for forecast revision,
results obtained are free of bias and generally reliable. Obviously, the
background of the experts and their knowledge of the field is crucial. This is
where the role of the coordinator in identifying the proper experts is
important.
Opinion Polls
Opinion polls are a very common method of gaining knowledge about
consumer tastes, responses to a new product, popularity of a person or leader,
reactions to an election result or the likely future prime minister after the
impending polls. In any opinion poll two things are of primary importance.
First, the information that is sought and secondly the target population from
whom the information is sought. Both these factors must be kept in mind
while designing the appropriate mechanism for conducting the opinion poll.
Opinion polls may be conducted through
• Personal interviews.
• Circulation of questionnaires.
• Meetings in groups.
• Conferences, seminars and symposia.
The method adopted depends largely on the population, the kind of
information desired and the budget available. For instance, if information
from a very large number of people is to be collected a suitably designed
questionnaire could be mailed to the people concerned. Designing a proper
questionnaire is itself a major task. Care should be taken to avoid ambiguous
questions. Preferably, the responses should be short one word answers or
ticking an appropriate reply from a set of multiple choices. This makes the
questionnaire easy for the respondent to fill and also easy for the analyst to
analyse. For example, the final analysis could be summarised by saying
• 80% of the population expressed opinion A,
• 10% expressed opinion B,
• 5% expressed opinion C,
• 5% expressed no opinion.
Similarly, in the context of forecasting of product demand, it is common to
arrive at the sales forecast by aggregating the opinions of area salesmen. The
forecast could be modified based on some kind of rating for each salesman or
an adjustment for environmental uncertainties.
Decisions in the area of future R&D or new technologies too are based on the
opinions of experts. The Delphi method treated in this Section is just an
example of a systematic gathering of opinion of experts in the concerned
field.
The major advantage of opinion polls lies in the fact that a well formed
opinion considers the multifarious subjective and objective factors which
may not even be possible to enumerate explicitly, and yet they may have a
bearing on the concerned forecast or question. Moreover the aggregation of
opinion polls tends to eliminate the bias that is bound to be present in any
subjective, human evaluation. In fact, for long term decisions, opinion polls
of experts constitute a very reliable method for forecasting and planning.

13.3 FORECASTING FOR MEDIUM AND SHORT TERM DECISIONS
Forecasting for the medium and short term horizons from one to six months
ahead is commonly employed for production planning, scheduling and
financial planning decisions in an organisation. These methods are generally
better structured as compared to the models for long term forecasting treated
in Section 13.2, as the variables to be forecast are well known and often
historical data is available to guide in the making of a more reliable forecast.
Broadly speaking we can classify these methods into five categories:
1) Subjective or intuitive methods.
2) Methods based on an averaging of past data (moving average and
exponential smoothing).
3) Regression models on historical data.
4) Causal or econometric models.
5) Stochastic models, with Time Series analysis and Box-Jenkins models.
Subjective or Intuitive Methods
These methods rely on the opinion of the concerned people and are quite
popular in practice. Top executives, salesmen, distributors, and consumers
could all be approached to give an estimate of the future demand of a
product. And a judicious aggregation/adjustment of these opinions could be
used to arrive at the forecast of future demand. How such opinion polls could
be systematically conducted has already been discussed in Section 13.2.
Committees or even a Delphi panel could be constituted for the purpose.
However, all such methods suffer from individual bias and subjectivity.
Moreover the underlying logic of forecast generation remains mysterious for
it relies entirely on the intuitive judgment and experience of the forecaster. It
cannot be documented and programmed for use on a computer so that no
matter whether A or B or C makes the forecast, the result is the same. The
other categories of methods discussed in this section are characterised by well
laid procedures so that documentation and computerisation can be easily
done.
However, subjective and intuitive methods have their own advantages. The
opinion of an expert or an experienced salesman carries with it the
accumulated wisdom of experience and maturity which may be difficult to
incorporate in any explicit mathematical relationship developed for purposes
of forecasting. Moreover in some instances where no historical data is
available (e.g. forecasting the sales of a completely new product or new
technology) reliance on opinions of persons in Research and Development,
Marketing or other functional areas may be the only method available to
forecast and plan future operations.
Methods Based on Averaging of Past Data (Moving Averages and
Exponential Smoothing)
In many instances, it may be reasonable to forecast the demand for the next
period by taking the average demand till date. Similarly when the next period
demand actually becomes known, it would be used in making the forecast of
the next future period. However, rather than use the entire past history in
determining the average, only the recent data for the past 3 or 6 months may
be used. This is the idea behind the 'Moving Average', where only the
demand of the most recent few periods (the number of periods being
specified) is used in making a forecast. Consider, for illustration, the monthly
sales figures of an item, shown in Table 1.
Table 1: Monthly Sales of an Item and Forecasts Using Moving Averages

Month   Demand   3-period moving average   6-period moving average
Jan     199      -                         -
Feb     202      -                         -
Mar     199      200.00                    -
Apr     208      203.00                    -
May     212      206.33                    -
Jun     194      203.66                    202.33
July    214      205.66                    207.83
Aug     220      208.33                    210.83
Sept    219      216.66                    213.13
Oct     234      223.33                    217.46
Nov     219      223.00                    218.63
Dec     233      227.66                    225.13

The average of the sales for January, February and March is
(199+202+199)/3 = 200, which constitutes the 3 months moving average
calculated at the end of March and may thus be used as a forecast for April.
Actual sales in April turn out to be 208 and so the 3 months moving average
forecast for May is (202+199+208)/3 = 203. Notice that a convenient method
of updating the moving average is

New moving average = Old moving average
    + (Added period demand − Dropped period demand) / (Number of periods in the moving average)

At the end of May, the actual demand for May is 212, while the demand for
February which is to be dropped from the last moving average is 202. Thus,
New moving average = 203 + 10/3 = 206.33 which is the forecast for June.
Both the 3 period and 6 period moving average are shown in Table 1.
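As a quick illustration of this procedure, the following minimal Python sketch computes k-period moving averages for the Table 1 data; the function name and layout are only one possible way of writing it.

def moving_average_forecasts(demand, k):
    """Return the k-period moving averages; the average computed at the end of
    period t is used as the forecast for period t + 1."""
    return [sum(demand[t - k:t]) / k for t in range(k, len(demand) + 1)]

# Monthly sales from Table 1 (January to December)
demand = [199, 202, 199, 208, 212, 194, 214, 220, 219, 234, 219, 233]
print(moving_average_forecasts(demand, 3))   # starts 200.0, 203.0, 206.33, ... as in the text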
It is characteristic of moving averages to
a) Lag a trend (that is, give a lower value for an upward trend and a higher
value for a downward trend) as shown in Figure VII (a).
b) Be out of phase (that is, lagging) when the data is cyclic, as in seasonal
demand. This is depicted in Figure VII (b).
c) Flatten the peaks of the demand pattern as shown in Figure VII (c).

Figure VII: (a) Moving Averages Lag a Trend; (b) Moving Averages Are Out of
Phase for Cyclic Demand; (c) Moving Averages Flatten Peaks

Some correction factors to rectify the lags can be incorporated. For details,
you may refer to Brown (3).
Exponential smoothing is an averaging technique where the weightage given
to the past data declines (at an exponential rate) as the data recedes into the
past. Thus all the values are taken into consideration, unlike in moving
averages, where all data points prior to the period of the moving average are
ignored.
If F_t is the one-period ahead forecast made at time t and D_t is the demand for
period t, then

    F_t = F_{t-1} + α (D_t − F_{t-1})
        = α D_t + (1 − α) F_{t-1}

where α is a smoothing constant that lies between 0 and 1, though generally the
chosen values lie between 0.01 and 0.30. A higher value of α places more
emphasis on recent data. To initiate smoothing, a starting value is needed,
which is generally taken as the first or some average demand value
available. Corrections for trend effects may be made by using double
exponential smoothing and other factors. For details, you may consult the
references at the end.
A computation of the smoothed values of demand for the example considered
earlier in Table 1 is shown in Table 2 for values of α equal to 0.1 and 0.3. In
these computations, exponential smoothing is initiated from June with a
starting forecast equal to the average demand for the first five months. Thus the
error for June is (194 − 204), that is −10, which when multiplied by α (0.1 or
0.3 as the case may be) and added to the previous forecast of 204 yields 203
or 201 (depending on whether α is 0.1 or 0.3) respectively, as shown in Table
2.
Table 2: Monthly Sales of an Item and Forecasts Using Exponential Smoothing

Month   Demand   Smoothed forecast (α = 0.1)   Smoothed forecast (α = 0.3)
Jan     199      -                             -
Feb     202      -                             -
Mar     199      -                             -
Apr     208      -                             -
May     212      -                             -
Jun     194      204.0                         204.0
July    214      203.0                         201.0
Aug     220      204.1                         204.9
Sept    219      205.7                         209.4
Oct     234      207.0                         212.3
Nov     219      209.7                         218.8
Dec     233      210.6                         218.9

Both moving averages and smoothing methods are essentially short term
forecasting techniques where one or a few period-ahead forecasts are
obtained.
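A minimal Python sketch of the smoothing recursion is given below; it reproduces the α = 0.1 column of Table 2 when started from the January-May average of 204. The function name and the decision to return one forecast per month are illustrative choices, not part of the original text.

def exponential_smoothing(demand, alpha, initial_forecast):
    """One-period-ahead forecasts: F_t = F_{t-1} + alpha * (D_{t-1} - F_{t-1})."""
    forecasts = [initial_forecast]
    for d in demand[:-1]:
        forecasts.append(forecasts[-1] + alpha * (d - forecasts[-1]))
    return forecasts

# Demands for June to December (see Table 2); smoothing starts from 204,
# the average demand of the first five months.
demand = [194, 214, 220, 219, 234, 219, 233]
print(exponential_smoothing(demand, 0.1, 204.0))
# [204.0, 203.0, 204.1, 205.69, 207.02, 209.72, 210.65] -- Table 2 values after rounding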

Regression Models on Historical Data
The demand of any product or service when plotted as a function of time
yields a time series whose behaviour may be conceived of as following a
certain pattern with random fluctuations. Some commonly observed demand
patterns are shown in Figure VIII.
Figure VIII: Some Commonly Observed Demand Patterns

The basic approach in this method is to identify an underlying pattern and to
fit a regression line to the demand history by available statistical methods. The
method of least squares is commonly used to determine the parameters of the
fitted model.
Forecasting by this technique assumes that the underlying system of chance
causes which was operating in the past would continue to operate in the
future as well. The forecast would thus not be valid under abnormal
conditions like wars, earthquakes, depression or other natural calamities like
floods or drought which might drastically affect the variable of interest.
For the demand history considered previously in Tables 1 and 2, the linear
regression line is

    F_t = 193 + 3t

where t = 1 refers to January, t = 2 to February, and so on. The forecast for any
month t can be found by substituting the appropriate value of t. Thus, the
expected demand for next January (t = 13) = 193 + (3 × 13) = 232.
You will study details of this regression procedure in Unit 15. We may only
add here that the procedure can be used to fit any type of function, be it
linear, parabolic or other, and that some very useful statements of confidence
and precision can also be made.
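The least-squares fit itself is a one-liner in most numerical libraries. The sketch below fits a straight line to the Table 1 demands with numpy; the coefficients come out close to the rounded line F_t = 193 + 3t quoted above, with small differences due to rounding in the original.

import numpy as np

# Monthly demand from Table 1; t = 1 corresponds to January
demand = np.array([199, 202, 199, 208, 212, 194, 214, 220, 219, 234, 219, 233])
t = np.arange(1, len(demand) + 1)

# Least-squares fit of the straight line F_t = a + b*t
b, a = np.polyfit(t, demand, 1)
print(round(a, 1), round(b, 2))   # intercept and slope, roughly 194.6 and 3.05
print(round(a + b * 13))          # forecast for next January (t = 13); ~234 vs. 232 from the rounded line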
Causal or Econometric Models
In causal models, an attempt is made to consider the cause effect
relationships and the variable of interest (e.g. demand) is modelled as a
function of these causal variables. For instance, in trying to forecast the
demand of tyres of a particular kind in a certain month (say DTM), it would
be reasonable to assume that this is influenced by the targeted production of
new vehicles for that month (TPVM) and the total road mileage of existing
vehicles in the past 6 months (say) which could be assumed to be
proportional to sales of petrol in the last 6 months (SPL6M). Thus, one
possible model to forecast the monthly demand of tyres is

    DTM = a × (TPVM) + b × (SPL6M) + c

where a, b and c are constants to be determined from the data.
The above model has value for forecasting only if TPVM and SPL6M (the
two causal variables) are known at the time the forecast is desired. This
requirement is expressed by saying that these variables should be leading. Also the
quality of the fit is determined by the correlation between the predictor and the
predicted variables. Commonly used indicators of the economic climate, such
as consumers price index, wholesale price index, gross national product,
population and per capita income are often used in econometric models
because these are easily available from published records.
Model parameters are estimated by usual regression procedures, similar to
the ones described above under Regression Models on Historical Data.
Construction of these structural and econometric models is generally difficult
and more time-consuming as compared to simple time-series regression
models. Nevertheless, they possess the advantage of portraying the inner
mechanics of the demand so that when changes in a certain pertinent factor
occur, the effect can be predicted.
The main difficulty in causal models is the selection or identification of
proper variables which should exhibit high correlation and be leading for
effective forecasting.
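Once data on the leading variables is in hand, the constants of such a model are obtained by ordinary least squares, exactly as for a trend line. The sketch below fits the tyre-demand model to a few made-up observations; the numbers (and the variable names DTM, TPVM and SPL6M reused from the text) are purely illustrative.

import numpy as np

# Hypothetical past observations: tyre demand, vehicle production target,
# and petrol sales over the previous six months (illustrative numbers only).
DTM   = np.array([1200, 1350, 1280, 1500, 1420, 1610])
TPVM  = np.array([ 300,  340,  320,  380,  360,  410])
SPL6M = np.array([ 900,  950,  940, 1010,  990, 1060])

# Design matrix with a constant term: DTM = a*TPVM + b*SPL6M + c
X = np.column_stack([TPVM, SPL6M, np.ones(len(DTM))])
(a, b, c), *_ = np.linalg.lstsq(X, DTM, rcond=None)

# A forecast is possible only when the leading variables are known in advance.
print(a * 400 + b * 1050 + c)   # forecast for a month with TPVM = 400, SPL6M = 1050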
Time Series Analysis or Stochastic Models
The demand or variable of interest when plotted as a function of time yields
what is commonly called a 'time-series'. This plot of demand at equal time
intervals may show random patterns of behaviour and our objective in
Models on Historical Data was to identify the basic underlying pattern that
should be used to explain the data. After hypothesising a model (linear,
parabolic or other) regression was used to estimate the model parameters,
using the criterion of minimising the sum of squares of errors.
Another method often used in time series analysis is to identify the following
four major components in a time series.
1) Secular trend (e.g. long term growth in market)
2) Cyclical fluctuation (e.g. due to business cycles)
3) Seasonal variation (e.g. Woollens, where demand is seasonal)
4) Random or irregular variation.
The observed value of the time series could then be expressed as a product
(or some other function) of the above factors.
Another treatment that may be given to a time series is to use the framework
developed by Box and Jenkins (1976) in which a stochastic model of the
autoregressive (AR) variety, moving average (MA) variety, mixed
autoregressive- moving average variety (ARMA) or an integrated
autoregressive-moving average variety (ARIMA) model may be chosen. An
introductory discussion of these models is included in Unit 16. Stochastic
models are inherently complicated and require greater efforts to construct.
However, the quality of forecasting generally improves. Computer codes are
available to implement the procedures [see for instance Box and Jenkins
(1976)].

13.4 FORECAST CONTROL


Whatever be the system of forecast generation, it is desirable to monitor the
output of such a system to ensure that the discrepancy between the forecast
and actual values of demand lies within some permissible range of random
variations.
A system of forecast generation is shown in Figure IX.
From past data, the system generates a forecast which is subject to
modification through managerial judgment and experience. The forecast is
compared with the current data when it becomes available and the error is
watched or monitored to assess the adequacy of the forecast generation
system.
Figure IX: System of Forecast Generation

The Moving Range Chart is a useful statistical device to monitor and verify the
accuracy of a forecasting system.
The control chart is easy to construct and maintain. Suppose data for n
periods is available. If F_t is the forecast for period t and D_t is the actual
demand for period t, then the moving range is

    MR_t = |(F_t − D_t) − (F_{t-1} − D_{t-1})|

and the average moving range is

    Average MR = Σ MR_t / (n − 1)

(there are n − 1 moving ranges for n periods). Then

    Upper Control Limit (UCL) = +2.66 × (Average MR)
    Lower Control Limit (LCL) = −2.66 × (Average MR)
The variable to be plotted on the chart is the error (F_t − D_t) in each period. A
sample control chart is shown in Figure X. Such a control chart tells three
important things about a demand pattern:
a) whether the past demand is statistically stable,
b) whether the present demand is following the past pattern,
c) if the demand pattern has changed, the control chart tells how to revise
the forecasting method.
As long as the plotted error points keep falling within the control limits, it
shows that the variations are due to chance causes and the underlying system
of forecast generation is acceptable. When a point goes out of control there is
reason to suspect the validity of the forecast generation system, which should
be revised to reflect these changes.
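A minimal sketch of the moving range computation is given below; the function name and the made-up forecast and demand series are illustrative only.

def moving_range_limits(forecasts, demands):
    """Forecast errors, and the +/- 2.66 * (average MR) control limits."""
    errors = [f - d for f, d in zip(forecasts, demands)]
    moving_ranges = [abs(errors[t] - errors[t - 1]) for t in range(1, len(errors))]
    avg_mr = sum(moving_ranges) / len(moving_ranges)   # n - 1 moving ranges for n periods
    return errors, 2.66 * avg_mr, -2.66 * avg_mr

# Hypothetical forecasts and actual demands for eight periods
forecasts = [200, 204, 207, 210, 214, 218, 221, 225]
demands   = [199, 208, 203, 212, 215, 214, 226, 222]
errors, ucl, lcl = moving_range_limits(forecasts, demands)
out_of_control = [t + 1 for t, e in enumerate(errors) if not (lcl <= e <= ucl)]
print(ucl, lcl, out_of_control)   # an empty list means the system is in control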

13.5 SUMMARY
The unit has emphasised the importance of forecasting in all planning
decisions-be they long term, medium term or short term. For long term
planning decisions, techniques like Technological Forecasting, collecting
opinions of experts as in Delphi or opinion polls using personal interviews or
questionnaires have been surveyed. For medium and short term decisions,
apart from subjective and intuitive methods there is a greater variety of
mathematical models and statistical techniques that could be profitably
employed. There are methods like Moving averages or exponential
smoothing that are based on averaging of past data. Any suitable
mathematical function or curve could be fitted to the demand history by using
least squares regression. Regression is also used in estimation of parameters
of causal or econometric models. Stochastic models using Box-Jenkins
methodology are a statistically advanced set of tools capable of more accurate
forecasting. Finally, forecast control is very necessary to check whether the
forecasting system is consistent and effective. The moving range chart has
been suggested for its simplicity and ease of operation in this regard.

13.6 SELF-ASSESSMENT EXERCISES


1) Why is forecasting so important in business? Identify applications of
forecasting for
• Long term decisions.
• Medium term decisions.
• Short term decisions.
2) How would you conduct an opinion poll to determine student reading
habits and preferences towards daily newspapers and weekly magazines?
3), 4), 5) For the demand data of a product, the following figures for last year's
monthly sales are given (one series of twelve observations for each exercise):

Period (monthly): 1    2    3    4    5    6    7    8    9    10   11   12
                  80   100  79   98   95   104  80   98   102  96   115  88
                  67   53   601  79   102  118  135  162  70   53   68   63
                  117  124  95   228  274  248  220  130  109  128  125  134

a) Plot the data on a graph and suggest an appropriate model that could be
used for forecasting.
b) Plot a 3 and 5 period moving average and show on the graph in (a)
c) Initiate exponential smoothing from the first period demand for
smoothing constant (α) values of 0.1 and 0.3. Show the plots.
6) What do you understand by forecast control? What could be the various
methods to ensure that the forecasting system is appropriate?

13.7 KEY WORDS


Causal Models: Forecasting models wherein the demand or variable of
interest is related to underlying causes or causal variables.
Delphi: A method of collecting information from experts, useful for long
term forecasting. It is iterative in nature and maintains anonymity to reduce
subjective bias.
Exponential Smoothing: A short term forecasting method based on
weighted averages of past data so that the weightage declines exponentially
as the data recedes into the past, with the highest weightage being given to
the most recent data.
Forecasting: A systematic procedure to determine the future value of a
variable of interest.
Moving Average: An average computed by considering the K most recent
(for a K- period moving average) demand points, commonly used for short
term forecasting.
Prediction: A term to denote the estimate or guess of a future variable that
may be arrived at by subjective hunches or intuition.
Regression: The procedure of establishing, from a given demand history, a relation
between the dependent variable (such as demand) and independent variable(s). Such
relations prove very useful for forecasting purposes.
Time Series: Any data on demand, sales or consumption taken at regular
intervals of time constitutes a time series. Analysis of this time series to
discover patterns of growth, decay, seasonalities or random fluctuations is
known as Time Series analysis.

13.8 FURTHER READINGS

Biegel, J.E., Production Control: A Quantitative Approach, Prentice Hall of
India: New Delhi.
Box, G.E.P. and G.M. Jenkins. Time Series Analysis: Forecasting and
Control, Holden-Day: San Francisco.
Brown, R.G., Smoothing, Forecasting and Prediction of Discrete Time
Series, Prentice Hall: Englewood-Cliffs.
Chambers, J.C., S.K. Mullick and D.D. Smith. An Executive's Guide to
Forecasting, John Wiley: New York.
Chatterjee, S., & Simonoff, J.S. Handbook of regression analysis (Vol. 5).
John Wiley & Sons.
Firth, M., Forecasting Methods in Business and Management, Edward
Arnold: London.
Jarrett, Al., Forecasting for Business Decisions, Basil Blackwell: London.
Makridakis, S. and S. Wheelwright. Forecasting: Methods and Applications,
John Wiley: New York.
Martino, J.P., Technological Forecasting for Decision Making, American
Elsevier: New York.
Montgomery D.C. and L.A. Johnson, Forecasting and Time Series Analysis,
McGraw Hill: New York.
Rohatgi, P.K., K. Rohatgi and B. Bowonder. Technological Forecasting,
Tata McGraw Hill: New Delhi.

UNIT 14 CORRELATION

Objectives
After completion of this unit, you should be able to :
• understand the meaning of correlation
• compute the correlation coefficient between two variables from sample
observations
• test for the significance of the correlation coefficient
• identify confidence limits for the population correlation coefficient from
the observed sample correlation coefficient
• compute the rank correlation coefficient when rankings rather than actual
values for variables are known
• appreciate some practical applications of correlation
• become aware of the concept of auto-correlation and its application in
time series analysis.
Structure
14.1 Introduction
14.2 The Correlation Coefficient
14.3 Testing for the Significance of the Correlation Coefficient
14.4 Rank Correlation
14.5 Practical Applications of Correlation
14.6 Auto-correlation and Time Series Analysis
14.7 Summary
14.8 Self-assessment Exercises
14.9 Key Words
14.10 Further Readings

14.1 INTRODUCTION
We often encounter situations where data appears as pairs of figures relating
to two variables. A correlation problem considers the joint variation of two
measurements neither of which is restricted by the experimenter. The
regression problem, which is treated in Unit 15, considers the frequency
distributions of one variable (called the dependent variable) when another
(independent variable) is held fixed at each of several levels.
Examples of correlation problems are found in the study of the relationship
between IQ and aggregate percentage marks obtained by a person in SSC
examination, blood pressure and metabolism or the relation between height

and weight of individuals. In these examples both variables are observed as
they naturally occur, since neither variable is fixed at predetermined levels.


Examples of regression problems can be found in the study of the yields of
crops grown with different amounts of fertiliser, the length of life of certain
animals exposed to different amounts of radiation, the hardness of plastics
which are heat-treated for different periods of time, and so on. In these
problems the variation in one measurement is studied for particular levels of
the other variable selected by the experimenter. Thus the factors or
independent variables in regression analysis are not assumed to be random
variables, though the dependent variable is modelled as a random variable for
which intervals of given precision and confidence are often worked out. In
correlation analysis, all variables are assumed to be random variables. For
example, we may have figures on advertisement expenditure (X) and Sales
(Y) of a firm for the last ten years, as shown in Table I. When this data is
plotted on a graph as in Figure I we obtain a scatter diagram. A scatter
diagram gives two very useful types of information. First, we can observe
patterns between variables that indicate whether the variables are related.
Secondly, if the variables are related we can get an idea of what kind of
relationship (linear or non-linear) would describe the relationship.
Table 1: Yearwise Data on Advertisement Expenditure and Sales

Year    Advertisement Expenditure     Sales in thousand Rs.
        in thousand Rs. (X)           (Y)
1988    50                            700
1987    50                            650
1986    50                            600
1985    40                            500
1984    30                            450
1983    20                            400
1982    20                            300
1981    15                            250
1980    10                            210
1979     5                            200

Correlation examines the first question of determining whether an association
exists between the two variables, and if it does, to what extent. Regression
examines the second question of establishing an appropriate relation between
the variables.

Figure I: Scatter Diagram

The scatter diagram may exhibit different kinds of patterns. Some typical
patterns indicating different correlations between two variables are shown in
Figure II.
What we shall study next is a precise and quantitative measure of the degree
of association between two variables, namely the correlation coefficient.

14.2 THE CORRELATION COEFFICIENT


Definition and Interpretation
The correlation coefficient measures the degree of association between two
variables X and Y. Pearson's formula for the correlation coefficient is given as

    r = Σ(X − X̄)(Y − Ȳ) / (n σx σy)                                   (14.1)

where r is the correlation coefficient between X and Y, σx and σy are the
standard deviations of X and Y respectively and n is the number of values of
the pair of variables X and Y in the given data. The expression
Σ(X − X̄)(Y − Ȳ)/n is known as the covariance between X and Y. Here r is also
that r is a dimensionless number whose numerical value lies between +1 and
-1. Positive values of r indicate positive (or direct) correlation between the
two variables X and Y i.e. as X increases Y will also increase or as X
decreases Y will also decrease. Negative values of r indicate negative (or
inverse) correlation, thereby meaning that an increase in one variable results
in a decrease in the value of the other variable. A zero correlation means that
there is no association between the two variables. Figure II shows a number
of scatter plots with corresponding values for the correlation coefficient r.

Figure II: Different Types of Association Between Variables

The following form for carrying out computations of the correlation
coefficient is perhaps more convenient:

    r = Σxy / √(Σx² Σy²)                                              (14.2)

where
    x = X − X̄ = deviation of a particular X value from the mean X̄
    y = Y − Ȳ = deviation of a particular Y value from the mean Ȳ

Equation (14.2) can be derived from equation (14.1) by substituting for σx
and σy as follows:

    σx = √[Σ(X − X̄)²/n]   and   σy = √[Σ(Y − Ȳ)²/n]                   (14.3)

Activity A
Suggest five pairs of variables which you expect to be positively correlated.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
Activity B
Suggest five pairs of variables which you expect to be negatively correlated.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
A Sample Calculation: Taking as an illustration the data of advertisement
expenditure (X) and Sales (Y) of a company for the 10-year period shown in
Table 1, we proceed to determine the correlation coefficient between these
variables.
Computations are conveniently carried out as shown in Table 2.
Table 2: Calculation of Correlation Coefficient

Sl. No.   X     Y      x = X − X̄   y = Y − Ȳ     x²       y²         xy
1         50    700    21           274          441      75,076     5,754
2         50    650    21           224          441      50,176     4,704
3         50    600    21           174          441      30,276     3,654
4         40    500    11            74          121       5,476       814
5         30    450     1            24            1         576        24
6         20    400    -9           -26           81         676       234
7         20    300    -9          -126           81      15,876     1,134
8         15    250   -14          -176          196      30,976     2,464
9         10    210   -19          -216          361      46,656     4,104
10         5    200   -24          -226          576      51,076     5,424
Total    290  4,260     0             0        2,740    3,06,840    28,310

X̄ = 290/10 = 29

Ȳ = 4260/10 = 426

∴ r = Σxy / √(Σx² Σy²) = 28310 / √(2740 × 306840) = 0.976
This value of r (= 0.976) indicates a high degree of association between the
variables X and Y. For this particular problem, it indicates that an increase in
advertisement expenditure is likely to yield higher sales.
You may have noticed that in carrying out calculations for the correlation
coefficient in Table 2, the large values of the squared deviations and their
products resulted in a great computational burden. Simplification in computations
can be adopted by calculating the deviations of the observations from an assumed
average rather than the actual average, and also scaling these deviations conveniently. To
illustrate this short cut procedure, let us compute the correlation coefficient
for the same data. We shall take U to be the deviation of X values from the
assumed mean of 30 divided by 5. Similarly, V represents the deviation of Y
values from the assumed mean of 400 divided by 10.
The computations are shown in Table 3.

Table 3: Short-cut Procedure for Calculation of Correlation Coefficient

S.No.   X     Y      U     V      UV     U²     V²
1       50    700    4     30     120    16     900
2       50    650    4     25     100    16     625
3       50    600    4     20      80    16     400
4       40    500    2     10      20     4     100
5       30    450    0      5       0     0      25
6       20    400   -2      0       0     4       0
7       20    300   -2    -10      20     4     100
8       15    250   -3    -15      45     9     225
9       10    210   -4    -19      76    16     361
10       5    200   -5    -20     100    25     400
Total               -2     26     561   110   3,136

    r = [ΣUV − (ΣU)(ΣV)/n] / [ √(ΣU² − (ΣU)²/n) × √(ΣV² − (ΣV)²/n) ]

      = [561 − (−2)(26)/10] / [ √(110 − (−2)²/10) × √(3136 − (26)²/10) ]

      = 566.2 / (10.47 × 55.39)

      = 0.976
We thus obtain the same result as before.
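For completeness, here is a short Python sketch that computes the same coefficient directly from the definition; the variable names are arbitrary.

import math

# Advertisement expenditure (X) and sales (Y) from Table 1
X = [50, 50, 50, 40, 30, 20, 20, 15, 10, 5]
Y = [700, 650, 600, 500, 450, 400, 300, 250, 210, 200]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
sxx = sum((x - x_bar) ** 2 for x in X)
syy = sum((y - y_bar) ** 2 for y in Y)

r = sxy / math.sqrt(sxx * syy)
print(round(r, 3))   # 0.976, as obtained in Tables 2 and 3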
Activity C
Use the short cut procedure to obtain the value of correlation coefficient in
the above example using scaling factor 10 and 100 for X and Y respectively.
(That is, the deviation from the assumed mean is to be divided by 10 for X
values and by 100 for Y values.)
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………

14.3 TESTING FOR THE SIGNIFICANCE OF THE CORRELATION COEFFICIENT

Once the correlation coefficient has been calculated from sample data one is
normally interested in asking the question: Is there an association between
the variables? Or with what confidence can we make a statement about the
association between the variables?
Such questions are best answered statistically by using one of the following
two commonly used procedures :
1) Providing confidence limits for the population correlation coefficient
from the sample size n and the sample correlation coefficient r. If this
confidence interval includes the value zero, then we say that r is not
significant, implying thereby that the population correlation coefficient
may be zero and the value of r may be due to sampling variability.
2) Testing the null hypothesis that population correlation coefficient equals
zero vs. the alternative hypothesis that it does not, by using the t-statistic.
The use of both these procedures is now illustrated.
The value of the sample correlation coefficient is used as an estimate of the
true population correlation ρ. It is desirable to include a confidence interval
for the true value along with the sample statistics. There are several methods
for obtaining the confidence interval for ρ. However, the most straightforward
method is to use a chart such as that shown in Figure III.

Figure III: Confidence Bands for the Population Correlation
(horizontal axis: scale of r, the sample correlation coefficient)

Once r has been calculated, the chart can be used to determine the upper and
lower values of the interval for the sample size used. In this chart the range of
unknown values of ρ is shown on the vertical scale, while the sample r values
are shown on the horizontal axis, with a number of curves for selected sample
sizes. Notice that for every sample size there are two curves. To read the 95%
confidence limits for an observed sample correlation coefficient of 0.8 for a
sample of size 10, we simply look along the horizontal axis for a value of 0.8
(the sample correlation coefficient) and construct a vertical line from there till
it intersects the first curve for n = 10. This happens for ρ = 0.2. This is the
lower limit of the confidence interval. Extending the vertical line upwards, it
again intersects the second n = 10 curve at ρ = 0.92, which represents the upper
confidence limit. Thus the 95% confidence interval for the population
correlation coefficient becomes

    0.2 ≤ ρ ≤ 0.92

If a confidence interval for ρ includes the value zero, then r is not considered
significant since that value of r may be due to nothing more than sampling
variability.
This method of using charts to determine the confidence intervals is
convenient, though of course we must use a different chart for different
confidence limits (e.g. 90%, 95%, 99%).
The alternative approach for testing the significance of r is to use the formula

    t = r / √[(1 − r²)/(n − 2)]                                       (14.4)

Referring to the table of t-distribution for (n-2) degrees of freedom, we can


find the critical value for t at any desired level of significance (5% level of
significance is commonly used). If the calculated value of t (as obtained by
equation 14.4) is less than or equal to the table value, we accept the
hypothesis (H� : the correlation coefficient equals zero), meaning that the
correlation between the variables is not significantly different from zero:
Suppose we obtain a correlation coefficient of 0.2 for a sample of size 10. Then

    t = 0.2 / √[(1 − 0.04)/8] ≅ 0.577
And from the t-distribution with 8 degrees of freedom for a 5% level of
significance, the table value = 2.306. Thus we conclude that this r of 0.2 for n
= 10 is not significantly different from zero.
It should be mentioned here that in case the same value of the correlation
coefficient of 0.2 was obtained on a sample of size 100, then

    t = 0.2 / √[(1 − 0.04)/98] ≅ 2.02
And the tabled value for a t-distribution with 98 degrees of freedom and a 5%
level of significance = 1.99. Since the calculated t exceeds this figure of 1.99,
we can conclude that this correlation coefficient of 0.2 on a sample of size
100 could be considered significantly different from zero, or alternatively that
there is statistically significant association between the variables.
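The two numerical checks above can be reproduced with a few lines of Python; scipy is used here only to look up the critical t value, and the function name is an arbitrary choice.

import math
from scipy import stats

def t_test_correlation(r, n, alpha=0.05):
    """Two-sided t-test of H0: population correlation is zero (equation 14.4)."""
    t = r / math.sqrt((1 - r ** 2) / (n - 2))
    t_critical = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return t, t_critical, abs(t) > t_critical

print(t_test_correlation(0.2, 10))    # t ~ 0.58 < 2.306 -> not significant
print(t_test_correlation(0.2, 100))   # t ~ 2.02 > 1.98  -> significant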

14.4 RANK CORRELATION


Quite often data is available in the form of some ranking for different
variables. It is common to resort to rankings on a preferential basis in areas
such as food testing, competitive events (e.g. games, fashion shows, or
beauty contests) and attitudinal surveys. The primary purpose of computing a
correlation coefficient in such situations is to determine the extent to which
the two sets of rankings are in agreement. The coefficient that is determined
from these ranks is known as Spearman's rank correlation coefficient, r_s.
This is given by the following formula:

    r_s = 1 − [6 Σ d_i²] / [n(n² − 1)]                                (14.5)

Here n is the number of pairs of observations and d_i is the difference in ranks
for the ith observation set.
Suppose the ranks obtained by a set of ten students in a Mathematics test
(variable X) and a Physics test (variable Y) are as shown below :

Rank for variable X:   1   2   3   4   5   6   7   8   9   10
Rank for variable Y:   3   1   4   2   6   9   8   10  5   7

To determine the rank correlation r_s, we can organise the computations as shown
in Table 4:
Table 4: Determination of Spearman's Rank Correlation

Individual Rank in Rank in d =Y -X ��


Maths(X) Physics(Y)
1 1 3 +2 4
2 2 1 -I 1
3 3 4 +1 1
4 4 2 -2 4
5 5 6 +1 1
6 6 9 +3 9
7 7 8 +1 1
8 8 10 +2 4
9 9 5 -4 16
10 10 7 -3 9
Total 50

Using the formula (14.5) we obtain

    r_s = 1 − (6 × 50) / [10(100 − 1)] = 1 − 0.303 = 0.697
We can thus say that there is a high degree of correlation between the
performance in Mathematics and Physics.
We can also test the significance of the value obtained. The null hypothesis is
that the two variables are not associated, i.e. r_s = 0. That is, we are interested
to test the null hypothesis H0 that the two variables are not associated in the
population and that the observed value of r_s differs from zero only by chance.
The t-statistic that is used to test this is

    t = r_s √[(n − 2) / (1 − r_s²)]

      = 0.697 √[(10 − 2) / (1 − (0.697)²)]

      = 2.75
Referring to the table of the t-distribution for n-2 = 8 degrees of freedom, the
critical value for t at a 5% level of significance is 2.306. Since the calculated
value of t is higher than the table value, we reject the null hypothesis
concluding that the performances in Mathematics and Physics are closely
associated.
When two or more items have the same rank, a correction has to be applied to
Σd_i². For example, if the ranks of X are 1, 2, 3, 3, 5, ... showing that there are
two items with the same 3rd rank, then instead of writing 3, we write 3.5 for
each so that the sum of these items is 7 and the mean of the ranks is
unaffected. But in such cases the standard deviation is affected, and therefore,
a correction is required. For this, Σd_i² is increased by (t³ − t)/12 for each
tie, where t is the number of items in each tie.
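The formula is easy to apply directly, as the minimal Python sketch below shows for the Table 4 ranks (no correction for ties is included; the function name is illustrative).

def spearman_rank_correlation(rank_x, rank_y):
    """Spearman's coefficient r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Ranks of ten students in Mathematics and Physics (Table 4)
maths   = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
physics = [3, 1, 4, 2, 6, 9, 8, 10, 5, 7]
print(round(spearman_rank_correlation(maths, physics), 3))   # 0.697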
Activity D
Suppose the ranks in Table 4 were tied as follows: Individuals 3 and 4 both
ranked 3rd in Maths and individuals 6, 7 and 8 ranked 8th in Physics.
Assuming that other rankings remain unaltered, compute the value of
Spearman's rank correlation.
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………

14.5 PRACTICAL APPLICATIONS OF CORRELATION
The primary purpose of correlation is to establish an association between any
two random variables. The presence of association does not imply causation,
but the existence of causation certainly implies association. Statistical
evidence can only establish the presence or absence of association between
variables. Whether causation exists or not depends merely on reasoning. For
example, there is reason to believe that higher income causes higher
expenditure on superior quality cloth. However, one must be on the guard
against spurious or nonsense correlation that may be observed between
totally unrelated variables purely by chance.

Correlation analysis is used as a starting point for selecting useful
independent variables for regression analysis. For instance, a construction
company could identify factors like
• population
• construction employment
• building permits issued last year which it feels would affect its sales for
the current year.
These and other factors that may be identified could be checked for mutual
correlation by computing the correlation coefficient of each pair of variables
from the given historical data (this kind of analysis is easily done by using an
appropriate routine on a computer). Only variables having a high correlation
with the yearly sales could be singled out for inclusion in a regression model.
Correlation is also used in factor analysis wherein attempts are made to
resolve a large set of measured variables in terms of relatively few new
categories, known as factors. The results could be useful in the following
three ways :
1) to reveal the underlying or latent factors that determine the relationship
between the observed data,
2) to make evident relationships between data that had been obscured before
such analysis, and
3) to provide a classification scheme when data scored on various rating
scales have to be grouped together.
Another major application of correlation is in forecasting with the help of
time series models. In using past data (which is often a time series of the
variable of interest available at equal time intervals) one has to identify the
trend, seasonality and random pattern in the data before an appropriate
forecasting model can be built. The notion of auto-correlation and plots of
auto-correlation for various time lags help one to identify the nature of the
underlying process. Details of time series analysis are discussed in Unit 16.
However, some fundamental concepts of auto-correlation and its use for time
series analysis-are outlined below.

14.6 AUTO-CORRELATION AND TIME SERIES ANALYSIS
The concept of auto-correlation is similar to that of correlation but applies to
values of the same variable at different time lags. Figure IV shows how a
single variable such as income (X) can be used to construct another variable
(X1) whose only difference from the first is that its values are lagging by one
time period. Then, X and X1 can be treated as two variables and their
correlation found. Such a correlation is referred to as auto-correlation and
shows how a variable relates to itself for a specified time lag. Similarly, one
can construct X2 and find its correlation with X. This correlation will indicate
how values of the same variable that are two periods apart relate to each
other.
Figure IV: Example of the Same Variable with Different Time Lags

Time     X (original     X1 (one time lag variable    X2 (two time lags variable
         variable)       constructed from X)          constructed from X)
t = 1    13              -                            -
2        8               8                            -
3        15              15                           15
4        4               4                            4
5        4               4                            4
6        12              12                           12
7        11              11                           11
8        7               7                            7
9        14              14                           14
10       12              12                           12

One could construct from one variable another time-lagged variable which is
twelve periods removed. If the data consists of monthly figures, a twelve-
month time lag will show how values of the same month but of different
years correlate with each other. If the auto-correlation coefficient is positive,
it implies that there is a seasonal pattern of twelve months duration. On the
other hand, a near zero auto-correlation indicates the absence of a seasonal
pattern. Similarly, if there is a trend in the data, values next to each other will
relate, in the sense that if one increases, the other too will tend to increase in
order to maintain the trend. Finally, in case of completely random data, all
auto-correlations will tend to zero (or not significantly different from zero).
The formula for the auto-correlation coefficient at time lag k is:

    r_k = Σ (t = 1 to n−k) (X_t − X̄)(X_{t+k} − X̄)  /  Σ (t = 1 to n) (X_t − X̄)²

where
    r_k denotes the auto-correlation coefficient for time lag k,
    k denotes the length of the time lag,
    n is the number of observations,
    X_t is the value of the variable at time t, and
    X̄ is the mean of all the data.
Using the data of Figure IV the calculations can be illustrated.

    X̄ = (13 + 8 + 15 + ... + 12) / 10 = 100/10 = 10

    r_1 = [(13 − 10)(8 − 10) + (8 − 10)(15 − 10) + ... + (14 − 10)(12 − 10)]
          / [(13 − 10)² + (8 − 10)² + ... + (14 − 10)² + (12 − 10)²]

        = −27/144 = −0.188
For k = 2, the calculation is as follows:

    r_2 = Σ (t = 1 to 8) (X_t − 10)(X_{t+2} − 10)  /  Σ (t = 1 to 10) (X_t − 10)²

        = [(13 − 10)(15 − 10) + (8 − 10)(4 − 10) + ... + (7 − 10)(12 − 10)]
          / [(13 − 10)² + (8 − 10)² + ... + (14 − 10)² + (12 − 10)²]

        = −29/144 = −0.201
A plot of the auto-correlations for various lags is often made to identify the
nature of the underlying time series. We, however, reserve the detailed
discussion on such plots and their use for time series analysis for Unit 16.
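The same computation for any lag k can be expressed compactly in Python; the sketch below reproduces the two coefficients obtained above (the function name is an arbitrary choice).

def autocorrelation(series, k):
    """Auto-correlation coefficient r_k for time lag k."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k))
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator

data = [13, 8, 15, 4, 4, 12, 11, 7, 14, 12]   # series of Figure IV
print(round(autocorrelation(data, 1), 3))     # -0.188
print(round(autocorrelation(data, 2), 3))     # -0.201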

14.7 SUMMARY
In this unit the concept of correlation or the association between two
variables has been discussed. A scatter plot of the variables may suggest that
the two variables are related but the value of the Pearson correlation
coefficient r quantifies this association. The correlation coefficient r may
assume values between -1 and 1. The sign indicates whether the association
is direct (+ve) or inverse (-ve). A numerical value of r equal to unity indicates
perfect association while a value of zero indicates no association.
Tests for significance of the correlation coefficient have been described.
Spearman's rank correlation for data with ranks is outlined. Applications of
correlation in identifying relevant variables for regression, factor analysis and
in forecasting using time series have been highlighted. Finally the concept of
auto-correlation is defined and illustrated for use in time series analysis.

14.8 SELF-ASSESSMENT EXERCISES


1) What do you understand by the term correlation? Explain how the study
of correlation helps in forecasting demand of a product.
2) A company wants to study the relation between R&D expenditure (X)
and annual profit (Y). The following table presents the information for
the last eight years:
Year    R&D Expense (X)    Annual Profit (Y)
        (Rs. in thousands)
1988    9                  45
1987    7                  42
1986    5                  41
1985    10                 60
1984    4                  30
1983    5                  34
1982    3                  25
1981                       20

a) Plot the data on a scatter diagram.


b) Estimate the sample correlation coefficient.

c) What are the 95% confidence limits for the population correlation
coefficient?
d) Test the significance of the correlation coefficient using a t-test at a
significance level of 5%.
3) The following data pertains to length of service (in years) and the annual
income for a sample of ten employees of an industry:
Length of service in years (X) Annual income in thousand
rupees (Y)
6 14
8 17
9 15
10 18
11 16
12 22
14 26
16 25
18 30
20 34
Compute the correlation coefficient between X and Y and test its
significance at levels of 0.01 and 0.05.
4) Twelve salesmen are ranked for efficiency and the length of service as
below:
Salesman Efficiency (X) Length of Service (Y)
A 1 2
B 2 1
C 3 5
D 5 3
E 5 9
F 5 7
G 7 7
H 8 6
I 9 4
J 10 11
K 11 10
L 12 11

a) Find the value of Spearman's rank correlation coefficient, ��


b) Test for the Significance of ��
5) An alternative definition of the correlation coefficient between a two-dimensional
random variable (X, Y) is
ρ = E[(X - E(X))(Y - E(Y))] / √(V(X)V(Y))
where E(.) represents expectation and V(.) the variance of the random
variable. Show that the above expression can be simplified as follows:
ρ = [E(XY) - E(X)E(Y)] / √(V(X)V(Y))
(Notice here that the numerator is called the covariance of X and Y).
6) In studying the relationship between the index of industrial production
and index of security prices the following data from the Economic Survey
1980-81 (Government of India Publication) was collected.
Year                              70-71  71-72  72-73  73-74  74-75  75-76  76-77  77-78  78-79
Index of Industrial Production    101.3  114.8  119.6  122.1  125.2  122.2  135.3  140.1  150.1
(1970-71 = 100)
Index of Security Prices          100.0   95.1   96.7  116.0  113.2   96.9  102.9  107.4  130.4
(1970-71 = 100)

a) Find the correlation between the two indices.


b) Test the significance of correlation coefficient at 0.01 level of
significance.
7) Compute and plot the first five auto-correlations (i.e. up-to time lag 5
periods) for the time series given below :
t     1   2   3   4   5   6   7   8   9   10
dₜ    13  8   15  4   4   12  11  7   14  12

14.9 KEY WORDS


Auto-correlation: Similar to correlation in that it describes the association
or mutual dependence between values of the same variable but at different
time periods. Auto-correlation coefficients provide important information
about the structure of a data set.
Correlation: Degree of association between two variables.
Correlation Coefficient : A number lying between -1 (Perfect negative
correlation) and + 1 (perfect positive correlation) to quantify the association
between two variables.
Covariance: This is the joint variation between the variables X and Y.
Mathematically defined as
Σ(Xᵢ - X̄)(Yᵢ - Ȳ)/n
for n data points.
Scatter Diagram: An ungrouped plot of two variables, on the X and Y axes.
Time Lag: The length between two time periods, generally used in time
series where one may test, for instance, how values of periods 1, 2, 3, 4
correlate with values of periods 4, 5, 6, 7 (time lag 3 periods).
Time-Series: Set of observations at equal time intervals which may form the
basis of future forecasting.

14.10 FURTHER READINGS


Box, G.E.P., and G.M. Jenkins. Time Series Analysis, Forecasting and
Control, Holden-Day: San Francisco.
Chatterjee, S., & Simonoff, J.S. Handbook of regression analysis (Vol.5).
John Wiley & Sons.
Draper, N. and H. Smith. Applied Regression Analysis, John Wiley: New
York.
Edwards, B., The Readable Maths and Statistics Book, George Allen and
Unwin: London.
Makridakis, S. and S. Wheelwright. Interactive Forecasting: Univariate and
Multivariate Methods, Holden-Day: San Francisco.
Peters, W.S. and G.W: Summers. Statistical Analysis for Business Decisions,
Prentice Hall: Englewood-Cliffs.
Srivastava, U.K., G.V. Shenoy and S.C. Sharma. Quantitative Techniques for
Managerial Decision Making,Wiley Eastern: New Delhi.
Stevenson, W.J. Business Statistics-Concepts and Applications, Harper and
Row: New York.

UNIT 15 REGRESSION

Objectives
After successful completion of this unit, you should be able to:
• understand the role of regression in establishing mathematical
relationships between dependent and independent variables from given
data
• use the least squares criterion to estimate the model parameters
• determine the standard errors of estimate of the forecast and estimated
parameters
• establish confidence intervals for the forecast values and estimates of
parameters
• make meaningful forecasts from given data by fitting any function, linear
in unknown parameters.
Structure
15.1 Introduction
15.2 Fitting A Straight Line
15.3 Examining the Fitted Straight Line
15.4 An Example of the Calculations
15.5 Variety of Regression Models
15.6 Summary
15.7 Self-assessment Exercises
15.8 Key Words
15.9 Further Readings

15.1 INTRODUCTION
In industry and business today, large amounts of data are continuously being
generated. This may be data pertaining, for instance, to a company's annual
production, annual sales, capacity utilisation, turnover, profits, manpower
levels, absenteeism or some other variable of direct interest to management.
Or there might be technical data regarding a process such as temperature or
pressure at certain crucial points, concentration of a certain chemical in the
product or the breaking strength of the sample produced or one of a large
number of quality attributes.
The accumulated data may be used to gain information about the system (as
for instance what happens to the output of the plant when temperature is
reduced by half) or to visually depict the past pattern of behaviour (as often
happens in company's annual meetings where records of company progress
are projected) or simply used for control purposes to check if the process or
system is operating as designed (as for instance in quality control). Our

interest in regression is primarily for the first purpose, mainly to extract the
main features of the relationships hidden in or implied by the mass of data.
The Need for Statistical Analysis
For the system under study there may be many variables and it is of interest
to examine the effects that some variables exert (or appear to exert) on others.
The exact functional relationship between variables may be too complex but
we may wish to approximate to this functional relationship by some simple
mathematical function such as a straight line or a polynomial which
approximates to the true function over certain limited ranges of the variables
involved.
There could be many variables of interest in the system. In a chemical plant
for instance, the monthly consumption of water or other raw materials, the
temperature and pressure maintained in the reacting vessel, the number of
operating days per month, the monthly production of the final product and any
by-products could all be variables of interest. We are, however, interested in
some key performance variable (which in our case may be monthly
production of final product) and would like to see how this key variable
(called the response variable or dependent variable) is affected by the other
variables (often called independent variables). By independent variables we
shall usually mean variables that can either be set to a desired value or else
take values that can be observed but not controlled. As a result of changes
that are deliberately made, or simply take place in the independent variables,
an effect is transmitted to the response variables. In general we shall be
interested in finding out how changes in the independent variables affect the
values of the response variables. Sometimes the distinction between
independent and dependent variables is not clear, but a choice may be made
depending on convenience or objectives.
Broadly speaking we would have to undergo the following sequence of steps
in determining the relationship between variables, assuming we have data
points already.
1) Identify the independent and response variables.
2) Make a guess of the form of the relation (linear, quadratic, cyclic etc.)
between the dependent and independent variables. This can be facilitated
by a graphical plot of the data (for two variables) on a systematic
tabulation (for more than two variables) which may suggest some trends
or patterns.
3) Estimate the parameters of the tentatively entertained model in step 2
above. For instance if a straight line was to be fitted, what is the slope and
intercept of this line?
4) Having obtained the mathematical model, conduct an error analysis to see
how good the model fits into the actual data.
5) Stop, if satisfied with model otherwise repeat steps 2 to 4 for another
choice of the model form in step 2.
What is Regression?
Suppose we consider the height and weight of adult males for some given
population. If we plot the pair (X₁, X₂) = (height, weight), a diagram like
Figure I will result. Such a diagram, you would recall from the previous
chapter, is conventionally called a scatter diagram.
Note that for any given height there is a range of observed weights and vice-
versa. This variation will be partially due to measurement errors but primarily
due to variations between individuals. Thus no unique relationship between
actual height and weight can be expected. But we can note that average
observed weight for a given observed height increases as height increases.
The locus of average observed weight for given observed height (as height
varies) is called the regression curve of weight on height. Let us denote it by
X₂ = f(X₁). There also exists a regression curve of height on weight,
similarly defined, which we can denote by X₁ = g(X₂). Let us assume that
these two "curves" are both straight lines (which in general they may not be).
In general these two curves are not the same as indicated by the two lines in
Figure I.
Figure I: Height and Weight of Thirty Adult Males

A pair of random variables such as (height, weight) follows some sort of


bivariate probability distribution. When we are concerned with the
dependence of a random variable Y on quantity X, which is variable but not a
random variable, an equation that relates Y to X is usually called a
regression equation. Similarly when more than one independent variable is
involved, we may wish to examine the way in which a response Y depends
on variables X₁, X₂, ..., Xₖ. We determine a regression equation from data which
cover certain areas of the X-space as Y = f(X₁, X₂, ..., Xₖ).
Linear Regression
The simplest and most commonly used relationship between two variables is
that of a straight line. We may write the linear, first order model as
Y = β₀ + β₁X + ε   (15.1)
That is, for a given X, a corresponding observation Y consists of the value
β₀ + β₁X plus an amount ε, the increment by which an individual Y may fall
off the regression line. Equation (15.1) is the model of what we believe; β₀ and β₁
are called the parameters of the model, whose values are to be obtained from
the actual data.
When we say that a model is linear or non-linear, we are referring to linearity
or non-linearity in the parameters. The value of the highest power of the
independent variable in the model is called the order of the model. For
example,
Y = β₀ + β₁X + β₂X² + ε
is a second order (in X) linear (in the β's) regression model.
Now in the model of equation (15.1), β₀, β₁ and ε are unknown and in fact ε
would be difficult to discover since it changes from observation to
observation. However, β₀ and β₁ remain fixed, and although we cannot find
them exactly without examining all possible occurrences of Y and X, we can
use the information provided by the actual data to give us estimates b₀ and b₁
of β₀ and β₁. Thus we can write
Ŷ = b₀ + b₁X   (15.2)
where Ŷ denotes the predicted value of Y for a given X, when b₀ and b₁
are determined. Equation (15.2) could then be used as a predictive equation;
substitution of a value of X would provide a prediction of the true mean
value of Y for that X.

15.2 FITTING A STRAIGHT LINE


Least Squares Criterion
In fitting a straight line (or any other function) to a set of data points we
would expect some points to fall above or below the line resulting in both
positive and negative error terms (see Figure II). It is true that we would like
the overall error to be as small as possible. The most common criterion in the
determination of model parameters is to minimise the sum of squares of
errors, or residuals as they are often called. This is known as the least squares
criterion, and is the one most commonly used in regression analysis.
Figure II: The Least Squares Criterion

This is, however, not the only criterion available. One may, for instance,
the mean absolute deviation (MAD). The least squares criterion, however,
has the following main advantages:
1) It is simple and intuitively appealing.
2) It results in linear equations (called normal equations) for solution of
parameters which are easy to solve.
3) It results in estimates of quality of fit and intervals of confidence of
predicted values rather easily.
In the context of the straight line model of equation (15.1), suppose there are
n data points (X₁, Y₁), (X₂, Y₂), ..., (Xₙ, Yₙ). Then we can write from equation (15.1)
Yᵢ = β₀ + β₁Xᵢ + εᵢ,   i = 1, ..., n   (15.3)
so that the sum of squares of the deviations from the true line is
S = Σᵢ₌₁ⁿ εᵢ² = Σᵢ₌₁ⁿ (Yᵢ - β₀ - β₁Xᵢ)²   (15.4)
We shall choose our estimates b₀ and b₁ to be the values which, when
substituted for β₀ and β₁ in equation (15.4), produce the least possible value
of S. We can determine b₀ and b₁ by differentiating equation (15.4) first with
respect to β₀ and then with respect to β₁ and setting the results equal to zero.
Notice that Xᵢ, Yᵢ are fixed pairs of numbers from our data set for i varying
between 1 and n. Therefore,
∂S/∂β₀ = -2 Σᵢ₌₁ⁿ (Yᵢ - β₀ - β₁Xᵢ)
∂S/∂β₁ = -2 Σᵢ₌₁ⁿ Xᵢ(Yᵢ - β₀ - β₁Xᵢ)
so that the estimates b₀ and b₁ are given by
Σᵢ₌₁ⁿ (Yᵢ - b₀ - b₁Xᵢ) = 0
Σᵢ₌₁ⁿ Xᵢ(Yᵢ - b₀ - b₁Xᵢ) = 0
where we substitute (b₀, b₁) for (β₀, β₁) when we set the derivatives to zero.
We thus obtain two linear equations in the two unknown parameters (b₀, b₁).
These equations are known as normal equations and for this case they can be
written as
b₀n + b₁ΣXᵢ = ΣYᵢ
b₀ΣXᵢ + b₁ΣXᵢ² = ΣXᵢYᵢ   (15.5)
The solution to these equations is easily written as follows:

b₀ = [ΣYᵢ ΣXᵢ² - ΣXᵢ ΣXᵢYᵢ] / [nΣXᵢ² - (ΣXᵢ)²]   (15.6)
b₁ = [nΣXᵢYᵢ - ΣXᵢ ΣYᵢ] / [nΣXᵢ² - (ΣXᵢ)²]   (15.7)
Thus (15.6) and (15.7) may be used to determine the estimates of the
parameters, and the predictive equation (15.2) may be used to obtain the
predicted value of Y (called Ŷ) for any desired value of X.
Rather than use the above procedure, a slightly modified (though equivalent)
method is to use the solution of the first normal equation in (15.5) to obtain
b₀ as
b₀ = Ȳ - b₁X̄   (15.8)
where Ȳ and X̄ are (Y₁ + Y₂ + ... + Yₙ)/n and (X₁ + X₂ + ... + Xₙ)/n
respectively. Substituting (15.8) in (15.2) yields the following estimated
regression equation
Ŷ = Ȳ + b₁(X - X̄)   (15.9)
where b₁ is computed by
b₁ = [ΣXᵢYᵢ - (ΣXᵢ ΣYᵢ)/n] / [ΣXᵢ² - (ΣXᵢ)²/n] = Σ(Xᵢ - X̄)(Yᵢ - Ȳ) / Σ(Xᵢ - X̄)²   (15.10)

This equation, as you can easily see, is derived from the last expression in
(15.7) by simply dividing the numerator and denominator by n. It is written
in the form above as it has an interpretation suitable for analysis of variance
later.
Activity A
You can see that the last form of equation (15.10) is expressed in terms of
sums of squares or products of deviations of individual points from their
corresponding means. Show that in fact
Σ(Xᵢ - X̄)(Yᵢ - Ȳ) = ΣXᵢYᵢ - (ΣXᵢ ΣYᵢ)/n
and Σ(Xᵢ - X̄)² = ΣXᵢ² - (ΣXᵢ)²/n
Hence verify equation (15.10).
The quantity ΣXᵢ² is called the uncorrected sum of squares of the X's, and
(ΣXᵢ)²/n is the correction for the mean of the X's. The difference is called
the corrected sum of squares of the X's. Similarly, ΣXᵢYᵢ is called the
uncorrected sum of products, and (ΣXᵢ ΣYᵢ)/n is the correction for the means
of X and Y. The difference is called the corrected sum of products of X and
Y. In terms of these definitions we can see that the estimate of the slope of
the fitted straight line, b₁ from equation (15.10), is simply the ratio of the
corrected sum of products of X and Y to the corrected sum of squares of the X's.

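The arithmetic of equations (15.8) and (15.10) can be captured in a few lines of code. The following Python sketch is only an illustration of the formulas above; the function name is arbitrary and no statistical library is assumed.

def fit_straight_line(x, y):
    # Least squares estimates of the straight line Y = b0 + b1*X,
    # via the corrected sums of squares and products of eqn. (15.10)
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n   # corrected sum of products
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n                  # corrected sum of squares of X
    b1 = sxy / sxx            # eqn. (15.10)
    b0 = y_bar - b1 * x_bar   # eqn. (15.8)
    return b0, b1

Applied to the worked example that follows, such a routine should return approximately b₀ = 9.27 and b₁ = 1.44.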
How good is the Regression?
Analysis of Variance (ANOVA)
Once the regression line is obtained we would like to find out how good the
fit is. This can be ascertained by the examination of errors. If Yᵢ is the ith
observation and Ŷᵢ its predicted value from the regression equation, then we
can write
Yᵢ - Ŷᵢ = (Yᵢ - Ȳ) - (Ŷᵢ - Ȳ)
If we square both sides and add the equations for i = 1 to n, we obtain
Σᵢ₌₁ⁿ (Yᵢ - Ŷᵢ)² = Σᵢ₌₁ⁿ [(Yᵢ - Ȳ) - (Ŷᵢ - Ȳ)]²
= Σ(Yᵢ - Ȳ)² + Σ(Ŷᵢ - Ȳ)² - 2Σ(Yᵢ - Ȳ)(Ŷᵢ - Ȳ)
The third term can be rewritten as
-2Σ(Yᵢ - Ȳ)(Ŷᵢ - Ȳ) = -2Σ(Yᵢ - Ȳ)b₁(Xᵢ - X̄)   (using eqn. 15.9, since Ŷᵢ - Ȳ = b₁(Xᵢ - X̄))
= -2b₁Σ(Yᵢ - Ȳ)(Xᵢ - X̄)
= -2b₁²Σ(Xᵢ - X̄)²   (using eqn. 15.10)
= -2Σ(Ŷᵢ - Ȳ)²
Thus
Σ(Yᵢ - Ŷᵢ)² = Σ(Yᵢ - Ȳ)² - Σ(Ŷᵢ - Ȳ)²
which may be written as
Σ(Yᵢ - Ȳ)² = Σ(Yᵢ - Ŷᵢ)² + Σ(Ŷᵢ - Ȳ)²   (15.11)

Now Y� − � Y is the deviation of the ith observation from the overall mean and
so the left hand side of equation (15.11) is the sum of squares of the
deviations of the observations from the mean; this is shortened to SS (SS:
Sum of squares) about the mean, and is also the corrected sum of squares of
the Y's. Since �� − ��� is the deviation of the ith observation from its predicted
or fitted value, and ��� − �� is the deviation of the predicted value of the ith
observation from the mean, we can express equation (15.11) in words as
follows :
Sum of squares Sum of squares Sum of squares
� �=� �+� �
about the mean about regression due to regression
This shows that, of the variation in the Y's about their mean, some of the
variation can be ascribed to the regression line and some, Σ(Yᵢ - Ŷᵢ)², to the
fact that the actual observations do not all lie on the regression line. If they all
did, the sum of squares about the regression would be zero. From this
procedure, we can see that a way of assessing how useful the regression line
will be as a predictor is to see how much of the SS about the mean has fallen
into the SS about regression. We shall be pleased if the SS due to regression

is much greater than the SS about regression, or what amounts to the same

thing if the ratio


R² = SS due to regression / SS about mean
is not too far from unity.
Any sum of squares has associated with it a number called its degrees of
freedom. This number indicates how many independent pieces of information
involving the n independent numbers Y₁, Y₂, ..., Yₙ are needed to compile the
sum of squares. For example, the SS about the mean needs (n - 1) independent
pieces (for the numbers Y₁ - Ȳ, Y₂ - Ȳ, ..., Yₙ - Ȳ only (n - 1) are
independent since all the n numbers sum to zero, by definition of the mean).
We can compute the SS due to regression from a single function of
Y₁, Y₂, ..., Yₙ, namely b₁ (since Σ(Ŷᵢ - Ȳ)² = b₁²Σ(Xᵢ - X̄)²), and so this
sum of squares has one degree of freedom.
By subtraction, the SS about regression has (n - 2) degrees of freedom. Thus,
corresponding to equation (15.11), we can show the split of degrees of
freedom as (n - 1) = (n - 2) + 1   (15.12)
Using equations (15.11) and (15.12) and employing alternative computational
forms for the expression of equation (15.11) we can construct an analysis of
variance (ANOVA) table in the following form :
ANOVA TABLE

Source                        Sum of Squares                                      Degrees of Freedom    Mean Square
Regression                    b₁[ΣXᵢYᵢ - (ΣXᵢΣYᵢ)/n]                              1                     MS_R
                              = [ΣXᵢYᵢ - (ΣXᵢΣYᵢ)/n]² / [ΣXᵢ² - (ΣXᵢ)²/n]
About regression (residual)   by subtraction                                      n - 2                 s² = SS/(n - 2)
About mean (total,            Σ(Yᵢ - Ȳ)² = ΣYᵢ² - (ΣYᵢ)²/n                        n - 1
corrected for mean)

The Mean Square column is obtained by dividing each sum of squares entry
by its corresponding degrees of freedom. The mean square about regression,
s², will provide an estimate, based on (n - 2) degrees of freedom, of the
variance about the regression, a quantity we shall call σ². If the
regression equation were estimated from an indefinitely large number of
observations, the variance about regression would represent a measure of the
error with which any observed value of Y could be predicted from a given
value of X using the determined equation.

An Example: Data on the annual sales of a company in lakhs of Rupees over
the past eleven years is shown in the table below. Determine a suitable
straight line regression model, Y = β₀ + β₁X + ε, for the data in the table.

Year Annual Sales in lakhs of Rupees


1998 1
1999 5
2000 4
2001 7
2002 10
2003 8
2004 9
2005 13
2006 14
2007 13
2008 18

Solution: The independent variable in this problem is the year whereas the
response variable is the annual sales. Although we could take the actual year
as the independent variable itself, a judicious choice of the origin at the
middle year of 2003 with the corresponding X values for other years as -5, -4,
-3, -2, -1, 0, 1, 2, 3, 4, 5 would simplify calculations. From equation (15.10)
we see that to estimate the parameter b₁ we require the four summations
ΣXᵢ, ΣYᵢ, ΣXᵢ² and ΣXᵢYᵢ.
Thus, calculations can be organised as shown below where the totals of the
four columns yield the four desired summations:

Year    Xᵢ    Annual Sales Yᵢ    Xᵢ²    XᵢYᵢ
1998    -5    1                  25     -5
1999    -4    5                  16     -20
2000    -3    4                  9      -12
2001    -2    7                  4      -14
2002    -1    10                 1      -10
2003    0     8                  0      0
2004    1     9                  1      9
2005    2     13                 4      26
2006    3     14                 9      42
2007    4     13                 16     52
2008    5     18                 25     90
Total   0     102                110    158

We find that
n = 11
ΣXᵢ = 0, so X̄ = 0/11 = 0
ΣYᵢ = 102, so Ȳ = 102/11 = 9.27
ΣXᵢ² = 110
ΣXᵢYᵢ = 158
b₁ = [ΣXᵢYᵢ - (ΣXᵢ ΣYᵢ)/n] / [ΣXᵢ² - (ΣXᵢ)²/n] = 158/110 = 1.44
The fitted equation is thus
Ŷ = Ȳ + b₁(X - X̄)
or Ŷ = 9.27 + 1.44X
Thus the parameters β₀ and β₁ of the model Y = β₀ + β₁X + ε are estimated
by b₀ and b₁ which in this case are 9.27 and 1.44 respectively. Now that the
model is completely specified we can obtain the predicted values Ŷᵢ and the
errors or residuals Yᵢ - Ŷᵢ corresponding to the eleven observations. These
are shown in the table below:

i     Xᵢ    Yᵢ    Ŷᵢ       Yᵢ - Ŷᵢ
1     -5    1     2.07     -1.07
2     -4    5     3.51     1.49
3     -3    4     4.95     -0.95
4     -2    7     6.39     0.61
5     -1    10    7.83     2.17
6     0     8     9.27     -1.27
7     1     9     10.71    -1.71
8     2     13    12.15    0.85
9     3     14    13.59    0.41
10    4     13    15.03    -2.03
11    5     18    16.47    1.53

To determine whether the fit is good enough, the ANOVA table can be
constructed.
SS due to regression = b₁[ΣXᵢYᵢ - (ΣXᵢΣYᵢ)/n]
= [ΣXᵢYᵢ - (ΣXᵢΣYᵢ)/n]² / [ΣXᵢ² - (ΣXᵢ)²/n]
= (158)²/110 = 226.95
(associated degrees of freedom = 1)
The total (corrected) SS = ΣYᵢ² - (ΣYᵢ)²/n
= 1194 - (102)²/11
= 1194 - 945.82 = 248.18
(associated degrees of freedom = 11 - 1 = 10)
The value R² = SS due to regression / SS about mean = 226.95/248.18 = 0.9145

indicating that the regression line explains 91.45% of the total variation about
the mean.
ANOVA TABLE

Source               SS        df    MS
Regression (b₁)      226.95    1     MS_R = 226.95
Residual             21.23     9     s² = 2.36
Total (corrected)    248.18    10
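The entire calculation for this example can be verified with a short script. The sketch below is an illustration only, assuming plain Python and the coded X values -5 to 5; it reproduces the slope, intercept, sums of squares and R² reported above, up to rounding.

x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]
n = len(x)

sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n   # 158
sxx = sum(a * a for a in x) - sum(x) ** 2 / n                  # 110
b1 = sxy / sxx                                                 # about 1.44
b0 = sum(y) / n - b1 * sum(x) / n                              # about 9.27

ss_total = sum(v * v for v in y) - sum(y) ** 2 / n   # 248.18, total (corrected) SS, 10 d.f.
ss_reg = sxy ** 2 / sxx                              # 226.95, SS due to regression, 1 d.f.
ss_res = ss_total - ss_reg                           # 21.23, residual SS, 9 d.f.
s2 = ss_res / (n - 2)                                # 2.36, mean square about regression
r_squared = ss_reg / ss_total                        # 0.9145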

15.3 EXAMINING THE FITTED STRAIGHT LINE
In fitting the linear model Y = β₀ + β₁X + ε using the least squares criterion
as indicated above in Section 15.2, no assumptions were made about
probability distributions. The method of estimating the parameters β₀ and β₁
tried only to minimise the sum of squares of the errors or residuals, and that
simply involved the solution of simultaneous linear equations. However, in
order to be able to evaluate the precision of the estimated parameters and
provide confidence intervals for forecasted values, it is necessary to make the
following assumptions in the basic model Yᵢ = β₀ + β₁Xᵢ + εᵢ, i = 1, 2, ..., n:
1) εᵢ is a random variable with mean zero and variance σ² (unknown), that
is, E(εᵢ) = 0, V(εᵢ) = σ².
2) εᵢ and εⱼ are uncorrelated, i ≠ j, so that Cov(εᵢ, εⱼ) = 0.
Thus E(Yᵢ) = β₀ + β₁Xᵢ, V(Yᵢ) = σ², and Yᵢ and Yⱼ, i ≠ j, are
uncorrelated. A further assumption, which is not immediately necessary
and will be recalled when used, is that
3) εᵢ is a normally distributed random variable, with mean zero and variance
σ² by assumption (1), that is
εᵢ ~ N(0, σ²)
Under this additional assumption εᵢ, εⱼ are not only uncorrelated but
necessarily independent.
It may be mentioned here that errors that occur in many real life situations
tend to be normally distributed due to the Central Limit Theorem. In practice
an error term such as ε is a sum of errors from several sources. Then no
matter what the probability distribution of the separate errors may be, their
sum will have a distribution that will tend more and more to the normal
distribution as the number of components increases, by the Central Limit
Theorem. Using the above assumptions, we can determine the following:


1) Standard error of the slope b₁ and a confidence interval for β₁
2) Standard error of the intercept b₀ and a confidence interval for β₀
3) Standard error of Ŷ, the predicted value
4) Significance of regression
5) Percentage variation explained
Standard Error of the Slope and Confidence Interval for its Estimate
From equation (15.10)
b₁ = Σᵢ₌₁ⁿ (Xᵢ - X̄)(Yᵢ - Ȳ) / Σᵢ₌₁ⁿ (Xᵢ - X̄)²
= Σᵢ₌₁ⁿ (Xᵢ - X̄)Yᵢ / Σᵢ₌₁ⁿ (Xᵢ - X̄)²
(since the other term removed from the numerator is Σᵢ₌₁ⁿ (Xᵢ - X̄)Ȳ =
Ȳ Σᵢ₌₁ⁿ (Xᵢ - X̄) = 0)
= [(X₁ - X̄)Y₁ + ... + (Xₙ - X̄)Yₙ] / Σᵢ₌₁ⁿ (Xᵢ - X̄)²
Now the variance of a function
a = a₁Y₁ + a₂Y₂ + ... + aₙYₙ is
V(a) = a₁²V(Y₁) + a₂²V(Y₂) + ... + aₙ²V(Yₙ)
if the Yᵢ are pairwise uncorrelated and the aᵢ are constants; furthermore,
if V(Yᵢ) = σ²,
V(a) = (a₁² + a₂² + ... + aₙ²)σ²
In the expression for b₁ the coefficients are aᵢ = (Xᵢ - X̄)/Σᵢ₌₁ⁿ (Xᵢ - X̄)²,
since the Xᵢ can be regarded as constants.
Hence after reduction
V(b₁) = σ² / Σᵢ₌₁ⁿ (Xᵢ - X̄)²   (15.13)
The standard error (s.e.) of b₁ is the square root of the variance, that is
s.e.(b₁) = σ / [Σᵢ₌₁ⁿ (Xᵢ - X̄)²]^½   (15.14)
If σ is unknown, we may use the estimate s in its place and obtain the
estimated standard error of b₁ as
est. s.e.(b₁) = s / [Σᵢ₌₁ⁿ (Xᵢ - X̄)²]^½   (15.15)
If we assume that the variations of the observations about the line are normal,
that is, that the errors εᵢ are all from the same normal distribution, N(0, σ²), it
can be shown that we can assign 100(1 - α)% confidence limits for β₁ by
calculating
b₁ ± t(n - 2, 1 - α/2) · s / [Σ(Xᵢ - X̄)²]^½   (15.16)
where t(n - 2, 1 - α/2) is the (1 - α/2) percentage point of a t-distribution
with n - 2 degrees of freedom (the number of degrees of freedom on which the
estimate s² is based) (see Figure III).
Figure III: The t-distribution

Standard Error of the Intercept and Confidence Interval for its Estimate
We may recall from equation (15.8) that
b₀ = Ȳ - b₁X̄
In computing the variance of b₀ we require the variance of Ȳ (which is
ΣYᵢ/n and thus has variance (1/n²)Σᵢ₌₁ⁿ Var(Yᵢ) = σ²/n, since Var(Yᵢ) = σ²,
by assumption (2) stated at the beginning of Section 15.3) and the variance
of b₁ (which is available from equation (15.13) above). Since X̄ may be
treated as a constant we may write
V(b₀) = V(Ȳ) + (X̄)²V(b₁)
Substituting for V(Ȳ) and V(b₁) as indicated above, we obtain
V(b₀) = σ²[1/n + X̄²/Σ(Xᵢ - X̄)²] = σ² ΣXᵢ² / [n Σᵢ₌₁ⁿ (Xᵢ - X̄)²]   (15.17)
In like manner, if σ² is unknown, s² may be used to determine the estimated
variance and standard error of b₀ (square root of the variance). Thus the 100
(1 - α)% confidence limits for β₀ are given by
b₀ ± t(n - 2, 1 - α/2) · s [ΣXᵢ² / (n Σᵢ₌₁ⁿ (Xᵢ - X̄)²)]^½   (15.18)
where, as before, t(n - 2, 1 - α/2) corresponds to the (1 - α/2) percentage point
of a t-distribution with (n - 2) degrees of freedom (see Figure III once again).
Standard Error of the Forecast
The forecast or predicted value of the dependent variable Y can be expressed
in terms of averages, by using equation (15.9), as
Ŷ = Ȳ + b₁(X - X̄)
where both Ȳ and b₁ are subject to error which will influence Ŷ. Now if aᵢ
and cᵢ are constants, and
a = a₁Y₁ + a₂Y₂ + ... + aₙYₙ
c = c₁Y₁ + c₂Y₂ + ... + cₙYₙ
then, provided that Yᵢ and Yⱼ are uncorrelated when i ≠ j and V(Yᵢ) = σ² for
all i,
Cov(a, c) = (a₁c₁ + a₂c₂ + ... + aₙcₙ)σ²
It follows, by setting a = Ȳ (i.e. aᵢ = 1/n) and c = b₁ (i.e. cᵢ = (Xᵢ - X̄)/
Σᵢ₌₁ⁿ (Xᵢ - X̄)²), that Cov(Ȳ, b₁) = 0, that is, Ȳ and b₁ are uncorrelated
random variables. Thus the variance of the predicted mean value of Y, Ŷₖ, at a
specific value Xₖ of X is
V(Ŷₖ) = V(Ȳ) + (Xₖ - X̄)²V(b₁)
= σ²[1/n + (Xₖ - X̄)²/Σᵢ₌₁ⁿ (Xᵢ - X̄)²]   (15.19)
where the expression in equation (15.13) for V(b₁) has been utilised.
Hence the estimated standard error for the predicted mean value of Y for a
given Xₖ is
est. s.e.(Ŷₖ) = s [1/n + (Xₖ - X̄)²/Σᵢ₌₁ⁿ (Xᵢ - X̄)²]^½   (15.20)
This is a minimum when Xₖ = X̄ and increases as we move Xₖ away from X̄
in either direction. In other words, the greater the distance of Xₖ (in either
direction) from X̄, the larger is the error we may expect to make when
predicting, from the regression line, the mean value of Y at Xₖ (that is, Ŷₖ).
This is intuitively meaningful since we expect the best predictions in the
middle of our observed range of X, with predictions becoming worse as we
move away from the range of observed X values.
The variance and standard error in equations (15.19) and (15.20) above apply
to the predicted mean value of Y for a given Xₖ. Since the actual observed
value of Y varies about the true mean value with variance σ² (independently of
V(Ŷₖ)), a predicted value of an individual observation will still be given by
Ŷₖ but will have a variance
σ² + V(Ŷₖ) = σ²[1 + 1/n + (Xₖ - X̄)²/Σᵢ₌₁ⁿ (Xᵢ - X̄)²]   (15.21)
If σ² is unknown the corresponding value may be obtained by inserting s²
for σ². In a similar fashion, the 100(1 - α)% confidence limits for a new
observation, which will be centered on Ŷₖ, are
Ŷₖ ± t(n - 2, 1 - α/2) · s [1 + 1/n + (Xₖ - X̄)²/Σᵢ₌₁ⁿ (Xᵢ - X̄)²]^½   (15.22)
where t(n - 2, 1 - α/2) corresponds to the (1 - α/2) percentage point of a t-
distribution with (n - 2) degrees of freedom (recall Figure III).
F-test for Significance of Regression
Since the Yᵢ are random variables, any function of them is also a random
variable; two particular functions are MS_R, the mean square due to regression,
and s², the mean square due to residual variation, which arise in the analysis
of variance table shown in Section 15.2.
In the case of fitting a straight line, it can be shown that if β₁ = 0 (i.e. the
slope of the fitted line is zero) the variable MS_R, multiplied by its degrees of
freedom (here one) and divided by σ², follows a χ² (chi-square) distribution
with the same (1) number of degrees of freedom. In addition, (n - 2)s²/σ²
follows a χ² distribution with (n - 2) degrees of freedom. And since these
two variables are independent, a statistical theorem tells us that the ratio
F = MS_R / s²   (15.23)
follows an F distribution with 1 and (n - 2) degrees of freedom (provided
β₁ = 0). This fact can thus be used as a test of β₁ = 0. We compare the ratio
F = MS_R/s² with the 100(1 - α)% point of the tabulated F(1, n - 2)
distribution in order to determine whether β₁ can be considered non-zero on
the basis of the observed data.
Percentage Variation Explained
The quantity R² defined earlier in Section 15.2 as the ratio of the SS due to
regression to SS about the mean measures the "proportion of total variation
about the mean Y explained by the regression". It is often expressed as a
percentage by multiplying it by 100.

15.4 AN EXAMPLE OF THE CALCULATIONS


The various computations outlined in the case of a straight line regression
situation in Section 15.3 will now be illustrated for the example of annual
sales data for a company that was considered earlier in Section 15.2. Recall
that the fitted regression equation was

Ŷ = 9.27 + 1.44X.
By choosing any value for X the corresponding prediction Ŷ could be made
by using this equation. However, the parameters of this model have been
estimated from the given data under certain assumptions, and these estimates
may be subject to error. Consequently the forecast obtained is subject to
chance errors. It is now our objective to
1) Quantify the errors of the estimates of the parameters b₀ and b₁
2) Establish reasonable confidence intervals for the parameter values
3) Quantify the error of the forecast Ŷₖ made at some point Xₖ
4) Provide confidence intervals for the forecasted values at some Xₖ
5) Test for the significance of regression, and
6) To obtain an overall measure of quality of fit.
These computations for the example at hand are performed below:
Standard error of the slope b₁
V(b₁) = σ²/Σ(Xᵢ - X̄)² = σ²/110
estimate of V(b₁) = s²/110 = 2.36/110 = 0.0215
estimate of standard error (b₁) = √(est. V(b₁)) = 0.1465
Suppose α = 0.05, so that t(n - 2, 1 - α/2) = t(9, 0.975) = 2.262 (from the
tables of the t-distribution).
Then 95% confidence limits for β₁ are
b₁ ± t(9, 0.975) · s / [Σ(Xᵢ - X̄)²]^½
= 1.44 ± (2.262)(0.1465)
= 1.44 ± 0.3314, that is 1.7714 and 1.1086
Standard error of the intercept b₀
V(b₀) = σ² ΣXᵢ² / [n Σ(Xᵢ - X̄)²]
= σ² (110) / (11 × 110) = σ²/11
estimate of V(b₀) = s²/11 = 2.36/11 = 0.215
estimate of standard error (b₀) = √(est. V(b₀)) = 0.4637
Then 95% confidence limits for β₀ are
b₀ ± t(9, 0.975) · s [ΣXᵢ² / (n Σ(Xᵢ - X̄)²)]^½
= 9.27 ± (2.262)(0.4637)
= 9.27 ± 1.0489, that is 10.3189 and 8.2211
Standard error of the forecast
Estimate of V(Ŷₖ) = s²[1/n + (Xₖ - X̄)²/Σ(Xᵢ - X̄)²]
= 2.36[1/11 + (Xₖ - 0)²/110]
= 2.36[1/11 + Xₖ²/110]
When, for instance, Xₖ = X̄ = 0, then
est. V(Ŷₖ) = 2.36 × 1/11 = 0.214545
∴ estimate of standard error of Ŷₖ = √0.214545 = 0.4632
If a prediction of sales for 2009 were to be made using the regression
equation
Ŷₖ = 9.27 + 1.44Xₖ
one would obtain the value 9.27 + 1.44(6)
= 17.91 (since the year 2009 corresponds to an X value of 6 on the
transformed scale).
The 95% confidence limits for the true mean value of Y for a given Xₖ are
then given by Ŷₖ ± t(9, 0.975) (est. s.e. Ŷₖ)
or Ŷₖ ± 2.262 × √(2.36[1/11 + Xₖ²/110])
We shall calculate these limits for Xₖ = 0 (year 2003) and Xₖ = 6 (year 2009).
For Xₖ = 0, Ŷₖ = 9.27 and the estimate of standard error of Ŷₖ = 0.4632
∴ 95% confidence limits are 9.27 ± (2.262 × 0.4632)
or 9.27 ± 1.0478
or 10.3178 and 8.2222
Notice that the limits become wider as we move away from the centre, X̄.
Figure IV illustrates the 95% confidence limits and the regression line for the
example under consideration and shows how these limits change as the
position of Xₖ changes. These curves are hyperbolae. The variance and
standard error of individual values may be computed by using equation
(15.21), while the confidence limits for a new observation may be obtained
from expression (15.22).
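A sketch of these interval calculations in Python is given below. It assumes SciPy is available for the t percentage point (scipy.stats.t.ppf); the figures s² = 2.36, n = 11, X̄ = 0 and Σ(Xᵢ - X̄)² = 110 are those of the example, and the function name is arbitrary.

from math import sqrt
from scipy.stats import t

s2, n, x_bar, sxx = 2.36, 11, 0.0, 110.0
b0, b1 = 9.27, 1.44

def mean_value_limits(x_k, alpha=0.05):
    # 100(1 - alpha)% limits for the true mean value of Y at X = x_k, eqn. (15.20)
    y_hat = b0 + b1 * (x_k - x_bar)
    se = sqrt(s2 * (1 / n + (x_k - x_bar) ** 2 / sxx))
    margin = t.ppf(1 - alpha / 2, n - 2) * se
    return y_hat - margin, y_hat + margin

print(mean_value_limits(0))   # roughly (8.22, 10.32), as computed above
print(mean_value_limits(6))   # wider limits centred on 17.91 (the year 2009)

Replacing 1/n + (x_k - x_bar)**2/sxx by 1 + 1/n + (x_k - x_bar)**2/sxx gives the corresponding limits for an individual observation, as in expression (15.22).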
Figure IV: Confidence Limits about the Regression Line

Activity B
For the example problem of Section 15.2 being considered above, determine
the 95% and 99% confidence limits for an individual observation for a given
Xₖ. Compute these limits for the year 2003 and the year 2009 (i.e. X = 0 and
X = 6 respectively). How do these limits compare with those found for the
mean value of Y above?
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
…………………………………………………………………………………
F-test for Significance of Regression
From the ANOVA table constructed for the example in Section 15.2,
MS_R = 226.95
s² = 2.36
F = MS_R/s² = 226.95/2.36 = 96.17
If we look up percentage points of the F(1, 9) distribution we see that the
95% point, F(1, 9, 0.95), is 5.12. Since the calculated F exceeds the critical F
value in the table, that is F = 96.17 > 5.12, we reject the hypothesis H₀: β₁ = 0,
running a risk of less than 5% of being wrong.
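For completeness, the same test can be carried out in a couple of lines; the sketch below assumes SciPy for the F percentage point instead of printed tables and simply reuses the figures above.

from scipy.stats import f

ms_r, s2 = 226.95, 2.36
F = ms_r / s2                 # about 96.2
F_crit = f.ppf(0.95, 1, 9)    # 95% point of F(1, 9), about 5.12
print(F > F_crit)             # True, so the hypothesis beta_1 = 0 is rejected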
Percentage Variation Explained
For the example problem, R² = 226.95/248.18 = 0.9145

This indicates that the regression line explains 91.45% of the total variation
about the mean.
15.5 VARIETY OF REGRESSION MODELS
The methods of regression analysis have been illustrated in this unit for the
case of fitting a straight line to a giver set of data points. However the same
principles are applicable to the fitting of a variety of other functions which
may be relevant in certain situations highlighted below.
Seasonal Model
The monthly sales for items like woollens or desert coolers are expected to be
seasonal and a sinusoidal model would be appropriate for such a case. If Fₜ
is the forecast for period t,
Fₜ = a + u cos(2πt/N) + v sin(2πt/N)   … (15.24)

where a, u and v are constants, t is the time period and N is the number of
time periods in a complete cycle (12 months if the cycle is 1 year). An
example of such a cyclic forecaster is given in Figure V.
Figure V: Cyclic Demand and a Cyclic Forecaster

Seasonal Models with Trend


When in addition to a cyclic component, a growth or decline over time of the
demand is expected, a cyclic trend model of the following kind may be more
suitable:
Fₜ = a + bt + u cos(2πt/N) + v sin(2πt/N)   … (15.25)
which is similar to equation (15.24) except for the growth term bt. Thus,
there are now four parameters, a, b, u, v to be estimated. An example of such
a cyclic-trend forecaster is given in Figure VI.
Figure VI: Revenue Miles Flown and Linear-Cyclic Forecaster

Polynomials of Various Order

We have considered a simple model of the first order with one independent
variable, namely
Y = β₀ + β₁X + ε
We may have k independent variables X₁, X₂, ..., Xₖ and obtain a first order
model with k independent variables as
Y = β₀ + β₁X₁ + β₂X₂ + ... + βₖXₖ + ε   … (15.26)
In a forecasting context, for instance, the demand for tyres in a certain month
(Y) may be related to the sales of petrol three months ago (X₁), the number of
new registrations of vehicles six months ago (X₂) and the current month's
target production of vehicles (X₃). A second order model with one
independent variable would be
Y = β₀ + β₁X + β₂X² + ε   … (15.27)
The most general type of linear model in variables X₁, X₂, ..., Xₖ is of the
form
Y = β₁Z₁ + β₂Z₂ + ... + βₚZₚ + ε   … (15.28)
where each
Zⱼ = fⱼ(X₁, X₂, ..., Xₖ)
can take any form. In many cases, each Zⱼ may involve only one X variable.
Multiplicative Models
Often by a simple transformation a non-linear model may be handled by the
methods of linear regression. For instance, in the multiplicative model
Y = aX₁ᵇX₂ᶜX₃ᵈ ε   (15.29)
a, b, c, d are unknown parameters and ε is the multiplicative random error.
Taking logarithms to the base e in equation (15.29) converts the model to the
linear form
ln Y = ln a + b ln X₁ + c ln X₂ + d ln X₃ + ln ε   … (15.30)
This model is of the form (15.28) with the parameters being ln a, b, c and d
and the independent variables being ln X₁, ln X₂, ln X₃, while the dependent
variable is ln Y.
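A brief sketch of this device follows. It assumes NumPy and generates artificial data purely for illustration; the point is that, after taking logarithms, the parameters ln a, b, c and d are obtained by the ordinary (linear) least squares machinery.

import numpy as np

rng = np.random.default_rng(0)
X1, X2, X3 = rng.uniform(1.0, 10.0, size=(3, 50))      # illustrative regressors
Y = 2.0 * X1**0.5 * X2**1.2 * X3**-0.3 * rng.lognormal(0.0, 0.05, 50)

# linear model of eqn. (15.30): ln Y = ln a + b ln X1 + c ln X2 + d ln X3
A = np.column_stack([np.ones(50), np.log(X1), np.log(X2), np.log(X3)])
ln_a, b, c, d = np.linalg.lstsq(A, np.log(Y), rcond=None)[0]
a = np.exp(ln_a)                                        # back to the multiplicative form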
Linear and Non-linear Regression
We have seen above that many non-linear models can be transformed to
linear models by simple transformations. It is to be noted that we are
referring to linearity in the unknown parameters so that, any model which can
be expressed as equation (15.28) is called linear. For such a model the
parameters can be obtained by the method of least squares as the solution to a
set of linear equations (known as the normal equations). Non-linear models
which can be transformed to yield linear models are called intrinsically
linear. Some models are intrinsically non-linear. Examples are:
Y = aX₁ᵇX₂ᶜX₃ᵈ + ε   (15.31)
Y = β₀ + β₁e^(β₂X) + ε   (15.32)
Y = β₀ + β₁X + β₂(β₃)ˣ + ε   (15.33)
Some kind of iterative method has to be employed for estimating the
parameters of such an intrinsically non-linear system.

15.6 SUMMARY
In this unit fundamentals of linear regression have been highlighted. Broadly
speaking, the fitting of any chosen mathematical function to given data is
termed as regression analysis. The estimation of the parameters of this model
is accomplished by the least squares criterion which tries to minimise the sum
of squares of the errors for all the data points.
How the parameters of a fitted straight line model are estimated, has been
illustrated through an example.
After the model is fitted to data the next logical question is to find out how
good the quality of fit is. This question can best be answered by conducting
statistical tests and determining the standard errors of estimate. This
information permits us to make quantitative statements regarding confidence
limits for estimates of the parameters as well as the forecast values. An
overall percentage variation can also be computed and it serves to give a
score to the regression. Thus it also serves to compare alternative regression
models that may have been hypothesised. The various computations involved
in practice have been illustrated on an example problem.
Finally, it has been emphasised that the method of least squares used in linear
regression is applicable to a wide class of models. In each case the model
parameters are obtained by the solution of the so called "normal equations
.These are simultaneous linear equations equal in number to the number of
parameters to be estimated, obtained by partially differentiating the sum of
squares of errors with respect to the individual parameters.
Regression is thus a potent device for establishing relationships between
variables from the given data. The discovered relationship can be used for
predictive purposes. Some of the models used in forecasting of demand rely
heavily on regression analysis. One such class of models, called time-series
models, is explored in Unit 16.

15.7 SELF-ASSESSMENT EXERCISES


1) What are the basic steps in establishing a relationship between variables
from a given data?
2) What is linear regression?
In this context classify the following models as linear or non-linear.
a) Y = a + bX + ε
b) Y = a + bX + cX² + ε
c) Y = a + b·e^(cX) + ε
d) Y = a + u cos(2πt/N) + v sin(2πt/N) + ε

3) The demand of two products A and B for twelve periods is given below:

Period                 1    2    3    4    5    6    7    8    9    10   11   12
Demand of Product A    80   100  79   98   95   104  80   98   102  96   115  88
Demand of Product B    199  202  199  208  212  194  214  220  219  234  219  233

assuming a linear forecaster of the type Y = β₀ + β₁t + ε, where Y is the
demand, t the time period, β₀, β₁ parameters and ε a random error
component, establish the forecasting function for products A and B.
Obtain 95% confidence intervals for the parameters and the 95% confidence
interval for the true mean value of Y at any given value of t, say tₖ.
4) A test was run on a given process for the purpose of determining the
effect of an independent variable X (such as process temperature) on a
certain characteristic property of the finished product Y (such as density).
Twenty observations were taken and the following results were obtained
X̄ = 5.0,  Σ(Xᵢ - X̄)² = 160,  Σ(Xᵢ - X̄)(Yᵢ - Ȳ) = 80
Ȳ = 3.0,  Σ(Yᵢ - Ȳ)² = 83.2
Assume a model of the type Y = β₀ + β₁X + ε and
a) calculate the fitted regression equation
b) prepare the analysis of variance table
c) determine 95% confidence limits for the true mean value of Y when
1) X = 5.0
2) X = 9.0
5) The cost of maintenance of tractors seems to increase with the age of the
tractor. The following data was collected:
Age(yr) Monthly Cost (Rs)
4.5 619
4.5 1049
4.5 1033
4.0 495
4.0 723
4.0 681
5.0 890
5.0 1522
5.5 987
5.0 1194
0.5 163
0.5 182
6.0 764
6.0 1373
1.0 978
1.0 466
1.0 549

Determine if a straight line relationship is sensible (use α, the significance


level = 0.10).
6) It is thought that the number of cans damaged in a box car shipment of
cans is a function of the speed of the box car at impact. Thirteen box cars
selected at random were used to examine whether this was true. The data
collected is as follows:
Speed of car at impact    4   3   5   8    4   3    3   4   3   5   7    3   8
No. of cans damaged       27  54  86  136  65  109  28  75  53  33  168  47  52
What are your conclusions? (Use α = 0.05)

15.8 KEYWORDS
Dependent variable: The variable of interest or focus which is influenced by
one or more independent variable(s).
Estimate: A value obtained from data for a certain parameter of the assumed
model or a forecast value obtained from the model.
Independent variable: A variable that can be set either to a desirable value
or takes values that can be observed but not controlled.
Least squares criterion: The parameters of the model are estimated by
minimising the sum of squares of errors (discrepancy between fitted and
actual values).
Linear regression: Fitting of any chosen mathematical model, linear in
unknown parameters, to a given data.
Model: A general mathematical relationship relating a dependent (or
response) variable Y to independent variables X₁, X₂, ..., Xₖ through a
relationship of the form Y = f(X₁, X₂, ..., Xₖ).
Non-linear regression: Fitting of any chosen mathematical model, non-linear
in unknown parameters, to given data.


Parameters: The constant terms of the chosen model that have to be
estimated before the model is completely specified.
Regression: Relating of a dependent (or response) variable to a number of
independent variables, based on a given set of data.
Response variable: Same as a "Dependent variable".

15.9 FURTHER READINGS


Biegel, J.E. Production Control -A Quantitative Approach, Prentice Hall of
India: Delhi.
Chambers, J.C., S.K. Mullick and D.D. Smith, An Executive's Guide to
Forecasting, John Wiley: New York.
Draper, N.R. and N. Smith, Applied Regression Analysis, John Wiley: New
York.
Firth, M., 1977. Forecasting Methods in Business and Management, Edward
Arnold: London.
Jarrett, J., 1987. Business Forecasting Methods, Basil Blackwell: London.
Makridakis, S. and S.C. Wheelwright, Interactive Forecasting, Holden-Day:
San Francisco.
Makridakis, S., S.C. Wheelwright and V.E. McGee, Forecasting: Methods
and Applications, John Wiley: New York.
Montgomery, D.C. and L.A. Johnson, Forecasting and Time Series Analysis,
McGraw Hill: New York.

UNIT 16 TIME SERIES ANALYSIS

Objectives
After completion of this unit, you should be able to :
• appreciate the role of time series analysis in short term forecasting
• decompose a time series into its various components
• understand auto-correlations to help identify the underlying patterns of a
time series
• become aware of stochastic models developed by Box and Jenkins for
time series analysis
• make forecasts from historical data using a suitable choice from
available methods.
Structure
16.1 Introduction
16.2 Decomposition Methods
16.3 Example of Forecasting using Decomposition
16.4 Use of Auto-correlations in Identifying Time Series
16.5 An Outline of Box-Jenkins Models for Time Series
16.6 Summary
16.7 Self-assessment Exercises
16.8 Key Words
16.9 Further Readings

16.1 INTRODUCTION
Time series analysis is one of the most powerful methods in use, especially
for short term forecasting purposes. From the historical data one attempts to
obtain the underlying pattern so that a suitable model of the process can be
developed, which is then used for purposes of forecasting or studying the
internal structure of the process as a whole. We have already seen in earlier
units that a variety of methods such as subjective methods, moving averages
and exponential smoothing, regression methods, causal models and time-
series analysis are available for forecasting. Time series analysis looks for the
dependence between values in a time series (a set of values recorded at equal
time intervals) with a view to accurately identify the underlying pattern of the
data.
In the case of quantitative methods of forecasting, each technique makes
explicit assumptions about the underlying pattern. For instance, in using
regression models we had first to make a guess on whether a linear or
parabolic model should be chosen and only then could we proceed with the
estimation of parameters and model-development. We could rely on mere
visual inspection of the data or its graphical plot to make the best choice of
the underlying model. However, such guess work, though not uncommon, is
unlikely to yield very accurate or reliable results. In time series analysis, a
systematic attempt is made to identify and isolate different kinds of patterns
in the data. The four kinds of patterns that are most frequently encountered
are horizontal, non-stationary (trend or growth), seasonal and cyclical.
Generally, a random or noise component is also superimposed.
We shall first examine the method of decomposition wherein a model of the
time-series in terms of these patterns can be developed. This can then be used
for forecasting purposes as illustrated through an example.
A more accurate and statistically sound procedure to identify the patterns in a
time-series is through the use of auto-correlations. Auto-correlation refers to
the correlation between the same variable at different time lags and was
discussed in Unit 14. Auto-correlations can be used to identify the patterns in
a time series and suggest appropriate stochastic models for the underlying
process. A brief outline of common processes and the Box-Jenkins
methodology is then given.
Finally the question of the choice of a forecasting method is taken up.
Characteristics of various methods are summarised along with likely
situations where these may be applied. Of course, considerations of cost and
accuracy desired in the forecast play a very important role in the choice.

16.2 DECOMPOSITION METHODS


Economic or business oriented time series are made up of four components --
trend. seasonality, cycle and randomness. Further, it is usually assumed that
the relationship between these four components is multiplicative as shown in
equation 16.1.
Xₜ = Tₜ · Sₜ · Cₜ · Rₜ   … (16.1)
where
Xₜ is the observed value of the time series
Tₜ denotes trend
Sₜ denotes seasonality
Cₜ denotes cycle
and
Rₜ denotes randomness.
Alternatively, one could assume an additive relationship of the form
Xₜ = Tₜ + Sₜ + Cₜ + Rₜ
But additive models are not commonly encountered in practice. We shall,
therefore, be working with a model of the form (16.1) and shall
systematically try to identify the individual components.
You are already familiar with the concept of moving averages. If the time
series represents a seasonal pattern of L periods, then by taking a moving
average of L periods, we would get the mean value for the year. Such a value
will obviously be free of seasonal effects, since high months will be offset by
low ones. If Mₜ denotes the moving average of equation (16.1), it will be free
of seasonality and will contain little randomness (owing to the averaging
effect). Thus we can write
Mₜ = Tₜ · Cₜ   … (16.2)
The trend and cycle components in equation (16.2) can be further
decomposed by assuming some form of trend.
One could assume different kinds of trends, such as
• linear trend, which implies a constant rate of change (Figure I)
• parabolic trend, which implies a varying rate of change (Figure II)
• exponential or logarithmic trend, which implies a constant percentage
rate of change (Figure III).
• an S curve, which implies slow initial growth, with increasing rate of
growth followed by a declining growth rate and eventual saturation
(Figure IV).
Figure I: Linear Trend

Figure II: Parabolic Trend

Figure III: Exponential Trend

Figure IV: A typical S Curve

Knowing the pattern of the trend, the appropriate mathematical function


could be determined from the data by using the methods of regression, as
outlined in an earlier unit. This would establish the values of the parameters of the
chosen trend model. For example, assuming a linear trend gives
Tₜ = a + bt   (16.3)
The cycle component Cₜ can now be isolated from the trend Tₜ in equation
(16.2) by the use of equation (16.3) as follows:
Mₜ/Tₜ = (Tₜ · Cₜ)/Tₜ = Cₜ   (16.4)

As already indicated, if a linear trend is not adequate, one may wish to


specify a non-linear one. Any pattern for the trend can be used to separate it
from the cycle. In practice, however, it is often difficult to separate the two,
and one may prefer to work with the trend cycle figures of equation (16.2).
The isolation of the trend will add little to the overall ability to forecast. This
will become clear when we take up an example problem for solution.
To isolate seasonality one could simply divide the original series (equation
16.1) by the moving average (equation 16.2) to obtain
Xₜ/Mₜ = (Tₜ · Sₜ · Cₜ · Rₜ)/(Tₜ · Cₜ) = Sₜ · Rₜ   (16.5)

Finally, randomness can be eliminated by averaging the different values of
equation (16.5). The averaging is done on the same months or seasons of
different years (for example the average of all Januaries, all Februaries,.... all
Decembers). The result is a set of seasonal values free of randomness, called
seasonal indices, which are widely used in practice.
In order to forecast, one must reconstruct each of the components of equation
(16.1). The seasonality is known through averaging the values in equation
(16.5) and the trend through (16.3). The cycle of equation (16.4) must be
estimated by the user and the randomness cannot be predicted.
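The mechanics of this ratio-to-moving-average procedure are easy to express in code. The sketch below assumes NumPy; the quarterly figures are those of Table 1 in the next section, used here purely as an illustration. It computes the centred four-quarter moving average Mₜ, the ratios Xₜ/Mₜ and the seasonal indices.

import numpy as np

x = np.array([5.5, 5.4, 7.2, 6.0, 4.8, 5.6, 6.3, 5.6, 4.0, 6.3, 7.0, 6.5,
              5.2, 6.5, 7.5, 7.2, 6.0, 7.0, 8.4, 7.7])   # quarterly sales

totals = np.convolve(x, np.ones(4), mode="valid")        # 4-quarter moving totals
m = (totals[:-1] + totals[1:]) / 2 / 4                   # centred moving average M_t
ratios = x[2:-2] / m                                     # X_t / M_t, i.e. S_t * R_t

quarters = np.arange(2, len(x) - 2) % 4                  # quarter (0..3) of each ratio
indices = np.array([ratios[quarters == q].mean() for q in range(4)])
indices *= 4 / indices.sum()                             # adjust so the indices average 1
deseasonalised = x / np.tile(indices, len(x) // 4)       # removes the seasonal component

A trend line can then be fitted to the deseasonalised series by the least squares method of Unit 15, which is exactly what the worked example below does.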
To illustrate the application of this procedure to actual forecasting of a time
series, an example will now be considered.

16.3 EXAMPLE OF FORECASTING USING DECOMPOSITION
An Engineering firm producing farm equipment wants to predict future sales
based on the analysis of its past sales pattern. The sales of the company for
the last five years are given in Table 1.
Table 1: Quarterly Sales of an Engineering Firm during 1983-87 (Rs. in
lakhs)

Year Quarters
I II III IV
1983 5.5 5.4 7.2 6.0
1984 4.8 5.6 6.3 5.6
1985 4.0 6.3 7.0 6.5
1986 5.2 6.5 7.5 7.2
1987 6.0 7.0 8.4 7.7

The procedure involved in the study consists of


a) deseasonalising the time series which is done by constructing a moving

average Mₜ and taking the ratio Xₜ/Mₜ, which we know from equation (16.5)

represents the seasonality and randomness.
b) fitting a trend line of the type Tₜ = a + bt to the deseasonalised time
series
c) identifying the cyclical variation around the trend line
d) use the above information for forecasting sales for the next year
Deseasonalising the Time Series
The moving averages and the ratios of the original variable to the moving
average have first to the computed.
This is done in Table 2

Table 2: Computation of moving averages Mₜ and the ratios Xₜ/Mₜ
Analysis

Year  Quarter  Actual Sales  4-Quarter Moving Total  Centred Moving Total  Centred Moving Average (Mₜ)  Xₜ/Mₜ
1983 I 5.5
II 5.4
III 7.2 23.8 6.0 1.200
IV 6.0 24.1 23.5 5.9 1.017
1984 I 4.8 23.4 23.2 5.8 0.828
II 5.6 23.6 22.5 5.6 1.000
III 6.3 22.7 21.9 5.5 1.145
IV 5.6 22.3 21.9 5.5 1.018
1985 I 4.0 21.5 22.6 5.7 0.702
II 6.3 22.2 23.4 5.9 1.068
III 7.0 22.9 24.4 6.1 1.148
IV 6.5 23.8 25.1 6.3 1.032
1986 I 5.2 25.0 25.5 6.4 0.813
II 6.5 25.2 26.1 6.5 1.000
III 7.5 25.7 26.8 6.7 1.119
IV 7.2 26.4 27.5 6.9 1.043
1987 I 6.0 27.2 28.2 7.1 0.845
II 7.0 27.7 28.9 7.2 0.972
III 8.4 28.6
IV 7.7 29.1

It should be noticed that the 4-quarter moving totals pertain to the middle of
two successive periods. Thus the value 24.1 computed at the end of Quarter
IV, 1983 refers to the middle of Quarters II and III, 1983, and the next moving
total of 23.4 refers to the middle of Quarters III and IV, 1983. Thus, by taking
their average we obtain the centred moving total of (24.1 + 23.4)/2 = 23.75 ≅
23.8 to be placed against Quarter III, 1983, and similarly for the other values.
In case the number of periods in the moving total or average is odd, centering
will not be required.
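The centred moving averages and the X_t/M_t ratios of Table 2 can also be
generated with a short computer program. The following is only a minimal
sketch in Python, assuming the pandas library is available; the variable names
are illustrative, and the last digits may differ slightly from Table 2 because
the table rounds the intermediate totals.

    import pandas as pd

    # Quarterly sales 1983 I to 1987 IV (Rs. lakh), taken from Table 1
    sales = pd.Series([5.5, 5.4, 7.2, 6.0,
                       4.8, 5.6, 6.3, 5.6,
                       4.0, 6.3, 7.0, 6.5,
                       5.2, 6.5, 7.5, 7.2,
                       6.0, 7.0, 8.4, 7.7])

    # 4-quarter moving totals; each total falls between two quarters, so two
    # successive totals are averaged to centre the value on a quarter (Table 2)
    total = sales.rolling(4).sum()
    centred_ma = ((total + total.shift(-1)) / 2 / 4).shift(-1)   # M_t
    ratio = sales / centred_ma                                   # X_t/M_t = S_t x R_t

    print(pd.DataFrame({"X": sales, "M": centred_ma.round(1),
                        "X/M": ratio.round(3)}))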
The seasonal indices for the quarterly sales data can now be computed by
taking averages of the X_t/M_t ratios of the respective quarters for different
years as shown in Table 3.

Table 3: Computation of Seasonal Indices
Year Quarters
I II III IV
1983 - - 1.200 1.017
1984 0.828 1.000 1.145 1.018
1985 0.702 1.068 1.148 1.032
1986 0.813 1.000 1.119 1.043
1987 0.845 0.972 - -
Mean 0.797 1.010 1.153 1.028
Seasonal Index   0.799   1.013   1.156   1.032

The seasonal indices are computed from the quarter means by adjusting these
values of means so that the average over the year is unity. Thus the sum of
means in Table 3 is 3.988 and since there are four Quarters, each mean is
adjusted by multiplying it with the constant figure of 4/3.988 to obtain the
indicated seasonal indices. These seasonal indices can now be used to obtain
the deseasonalised sales of the firm by dividing the actual sales by the
corresponding index as shown in Table 4.
Table 4: Deseasonalised Sales

Year   Quarter   Actual Sales   Seasonal Index   Deseasonalised Sales
1983   I         5.5            0.799            6.9
       II        5.4            1.013            5.3
       III       7.2            1.156            6.2
       IV        6.0            1.032            5.8
1984   I         4.8            0.799            6.0
       II        5.6            1.013            5.5
       III       6.3            1.156            5.4
       IV        5.6            1.032            5.4
1985   I         4.0            0.799            5.0
       II        6.3            1.013            6.2
       III       7.0            1.156            6.0
       IV        6.5            1.032            6.3
1986   I         5.2            0.799            6.5
       II        6.5            1.013            6.4
       III       7.5            1.156            6.5
       IV        7.2            1.032            7.0
1987   I         6.0            0.799            7.5
       II        7.0            1.013            6.9
       III       8.4            1.156            7.3
       IV        7.7            1.032            7.5
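Continuing the sketch given after Table 2 (it reuses the sales and ratio series
defined there), the seasonal indices of Table 3 and the deseasonalised sales of
Table 4 could be computed as follows. The names are again illustrative and the
last digits may differ slightly from the tables because of rounding.

    import numpy as np

    quarters = np.arange(len(sales)) % 4         # 0 = Quarter I, 1 = II, ...
    means = ratio.groupby(quarters).mean()       # quarter means of X/M (Table 3)
    seasonal_index = means * 4 / means.sum()     # rescale so indices average to 1

    deseasonalised = sales / seasonal_index.loc[quarters].values   # Table 4
    print(seasonal_index.round(3))
    print(deseasonalised.round(1))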
Fitting a Trend Line
The next step after deseasonalising the data is to develop the trend line. We
shall here use the method of least squares that you have already studied in the
unit on regression. Choice of the origin in the middle of the data with a
suitable scaling simplifies computations considerably. To fit a straight line of
the form Y = a + bX to the deseasonalised sales, we proceed as shown in
Table 5.
Table 5: Computation of Trend

Year   Quarter   Deseasonalised Sales (Y)   X     X²      XY
1983 I 6.9 -19 361 -131.1
II 5.3 -17 289 -90.1
III 6.2 -15 225 -93.0
IV 5.8 -13 169 -75.4
1984 I 6.0 -11 121 -66.0
II 5.5 -9 81 -49.5
III 5.4 -7 49 -37.8
IV 5.4 -5 25 -27.0
1985 I 5.0 -3 9 -15.0
II 6.2 -1 1 -6.2
III 6.0 1 1 6.0
IV 6.3 3 9 18.9
1986 I 6.5 5 25 32.5
II 6.4 7 49 44.8
III 6.5 9 81 58.5
IV 7.0 11 121 77.0
1987 I 7.5 13 169 97.5
II 6.9 15 225 103.5
III 7.3 17 289 124.1
IV 7.5 19 361 142.5
Total 125.6 0 2660 114.2

a = ΣY / n = 125.6 / 20 = 6.3
b = ΣXY / ΣX² = 114.2 / 2660 = 0.043 ≈ 0.04
∴ the trend line is Y = 6.3 + 0.04X
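The same calculation can be checked with a few lines of code. The sketch below
simply reuses the deseasonalised sales of Table 4 and the coded X values of
Table 5; the names are illustrative only.

    import numpy as np

    # Deseasonalised sales (Table 4) and coded time X = -19, -17, ..., 19 (Table 5)
    Y = np.array([6.9, 5.3, 6.2, 5.8, 6.0, 5.5, 5.4, 5.4, 5.0, 6.2,
                  6.0, 6.3, 6.5, 6.4, 6.5, 7.0, 7.5, 6.9, 7.3, 7.5])
    X = np.arange(-19, 21, 2)

    a = Y.sum() / len(Y)                # 125.6 / 20   = 6.3 (approximately)
    b = (X * Y).sum() / (X ** 2).sum()  # 114.2 / 2660 = 0.043, rounded to 0.04
    print(f"Trend line: Y = {a:.2f} + {b:.3f} X")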
Identifying Cyclical Variation
The cyclical component is identified by measuring deseasonalised variation
around the trend line, as the ratio of the actual deseasonalised sales to the
value predicted by the trend line. The computations are shown in Table 6.

Table 6: Computation of Cyclical Variation

Year   Quarter   Deseasonalised Sales (Y)   Trend (a + bX)   Y / (a + bX)
1983   I         6.9                        5.54             1.245
       II        5.3                        5.62             0.943
       III       6.2                        5.70             1.088
       IV        5.8                        5.78             1.003
1984   I         6.0                        5.86             1.024
       II        5.5                        5.94             0.926
       III       5.4                        6.02             0.897
       IV        5.4                        6.10             0.885
1985   I         5.0                        6.18             0.809
       II        6.2                        6.26             0.990
       III       6.0                        6.34             0.946
       IV        6.3                        6.42             0.981
1986   I         6.5                        6.50             1.000
       II        6.4                        6.58             0.973
       III       6.5                        6.66             0.976
       IV        7.0                        6.74             1.039
1987   I         7.5                        6.82             1.110
       II        6.9                        6.90             1.000
       III       7.3                        6.98             1.046
       IV        7.5                        7.06             1.062
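The last column of Table 6 is obtained by dividing each deseasonalised value by
its fitted trend value. A two-line continuation of the previous sketch (it
reuses X, Y, a and b defined there) is:

    trend = a + b * X       # fitted trend value for each quarter
    cycle = Y / trend       # cyclical relative, last column of Table 6
    print(np.round(cycle, 3))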

The random or irregular variation is assumed to be relatively insignificant.


We have thus described the time series in this problem using the trend,
cyclical and seasonal components. Figure V represents the original time
series, its four quarter moving average (containing the trend and cycle
components) and the trend line.
Figure V: Time Series with Trend and Moving Averages

Forecasting with the Decomposed Components of the Time Series
Suppose that the management of the Engineering firm is interested in
estimating the sales for the second and third quarters of 1988. The estimates
of the deseasonalised sales can be obtained by using the trend line
Y = 6.3 + 0.04(23)
= 7.22 (2nd Quarter 1988)
and Y = 6.3 + 0.04 (25)
= 7.30 (3rd Quarter 1988)
These estimates will now have to be seasonalised for the second and third
quarters respectively. This can be done as follows :
For 1988 2nd quarter
seasonalised sales estimate = 7.22 x 1.013 = 7.31
For 1988 3rd quarter
seasonalised sales estimate = 7.30 x 1.156 = 8.44
Thus, on the basis of the above analysis, the sales estimates of the
Engineering firm for the second and third quarters of 1988 are Rs. 7.31 lakh
and Rs. 8.44 lakh respectively.
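These two estimates can also be reproduced programmatically. The sketch below
is only illustrative: it uses the rounded trend coefficients and seasonal
indices derived above and merely repeats the arithmetic just shown.

    # Trend coefficients and seasonal indices as rounded in the text
    a, b = 6.3, 0.04
    seasonal_index = {"I": 0.799, "II": 1.013, "III": 1.156, "IV": 1.032}

    # 1988 Quarter II corresponds to X = 23 and Quarter III to X = 25
    for x, quarter in [(23, "II"), (25, "III")]:
        trend_value = a + b * x                            # deseasonalised estimate
        forecast = trend_value * seasonal_index[quarter]   # reseasonalised (Rs. lakh)
        print(f"1988 Quarter {quarter}: {forecast:.2f}")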
These estimates have been obtained by taking the trend and seasonal
variations into account. Cyclical and irregular components have not been
taken into account. The procedure for cyclical variations only helps to study
past behaviour and does not help in predicting the future behaviour.
Moreover, random or irregular variations are difficult to quantify.

16.4 USE OF AUTO-CORRELATIONS IN IDENTIFYING TIME SERIES
While studying correlation in Unit 14, auto-correlation was defined as the
correlation of a variable with itself, but with a time lag. The study of auto-
correlation provides very valuable clues to the underlying pattern of a time
series. It can also be used to estimate the length of the season for seasonality.
(Recall that in the example problem considered in the previous section, we
assumed that a complete season consisted of four quarters.)
When the underlying time series represents completely random data, then the
graph of auto-correlations for various time lags stays close to zero with
values fluctuating both on the +ve and -ve side but staying within the control
limits. This in fact represents a very convenient method of identifying
randomness in the data.
If the auto-correlations drop slowly to zero, and more than two or three differ
significantly from zero, it indicates the presence of a trend in the data. This
trend can be removed by differencing (that is, taking differences between
consecutive values and constructing a new series).
A seasonal pattern in the data would result in the auto-correlations oscillating
around zero with some values differing significantly from zero. The length of
seasonality can be determined either from the number of periods it takes for
the auto-correlations to make a complete cycle or by the time lag giving the
largest auto-correlation.
For any given data, the plot of auto-correlation for various time lags is
diagnosed to identify which of the above basic patterns (or a combination of
these patterns) it follows. This is broadly how auto-correlations are used to
identify the structure of the underlying model to be chosen. The underlying
mathematics and computational burden tend to be heavy and involved.
Computer routines for carrying out computations are available. The interested
reader may refer to Makridakis and Wheelwright for further details.
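As an illustration of the computations involved, the sketch below calculates
lag-k auto-correlation coefficients for the quarterly sales of Table 1, using
one common form of the coefficient (deviations taken from the overall mean,
denominator equal to the total sum of squares). For this series the lag-4
coefficient comes out largest (about 0.6), which is consistent with the
four-quarter season assumed in the example; the function name is illustrative.

    import numpy as np

    def autocorrelation(series, lag):
        """Lag-k auto-correlation coefficient of a series."""
        x = np.asarray(series, dtype=float)
        d = x - x.mean()                                 # deviations from the mean
        return np.sum(d[:-lag] * d[lag:]) / np.sum(d ** 2)

    sales = [5.5, 5.4, 7.2, 6.0, 4.8, 5.6, 6.3, 5.6, 4.0, 6.3,
             7.0, 6.5, 5.2, 6.5, 7.5, 7.2, 6.0, 7.0, 8.4, 7.7]
    for k in range(1, 9):
        print(f"lag {k}: r = {autocorrelation(sales, k):.3f}")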

16.5 AN OUTLINE OF BOX-JENKINS MODELS FOR TIME SERIES
Box and Jenkins (1976) have proposed a sophisticated methodology for
stochastic model building and forecasting using time series. The purpose of
this section is merely to acquaint you with some of the terms, models and
methodology developed by Box and Jenkins.
A time series may be classified as stationary (in equilibrium about a constant
mean value) or non-stationary (when the process has no natural or stable
mean). In stochastic model building, a non-stationary process is often
converted to a stationary one by differencing. The two major classes of
models used popularly in time series analysis are Auto-regressive and
Moving Average models.
Auto-regressive Models
In such models, the current value of the process is expressed as a finite, linear
aggregate of previous values of the process and a random shock or error a_t.
Let us denote the values of a process at equally spaced times t, t-1, t-2, ...
by Z_t, Z_{t-1}, Z_{t-2}, ...; also let z_t, z_{t-1}, z_{t-2}, ... be the
deviations from the process mean m, that is, z_t = Z_t - m. Then

z_t = φ_1 z_{t-1} + φ_2 z_{t-2} + ... + φ_p z_{t-p} + a_t     … (16.6)

is called an auto-regressive (AR) process of order p. The reason for this name
is that equation (16.6) represents a regression of the variable z_t on
successive values of itself. The model contains p + 2 unknown parameters
m, φ_1, φ_2, ..., φ_p, σ_a², which in practice have to be estimated from the
data. The additional parameter σ_a² is the variance of the random error
component.

Moving Average models
Another kind of model of great importance is the moving average model, where
z_t is made linearly dependent on a finite number q of previous a's (error
terms). Thus

z_t = a_t - θ_1 a_{t-1} - θ_2 a_{t-2} - ... - θ_q a_{t-q}     … (16.7)

is called a moving average (MA) process of order q. The name "moving average"
is somewhat misleading because the weights 1, -θ_1, -θ_2, ..., -θ_q which
multiply the a's need not total unity nor need they be positive. However, this
nomenclature is in common use and therefore we employ it. The model (16.7)
contains q + 2 unknown parameters m, θ_1, θ_2, ..., θ_q, σ_a², which in
practice have to be estimated from the data.
Mixed Auto-regressive-moving average models :
It is sometimes advantageous to include both auto-regressive and moving
average terms in the model. This leads to the mixed auto-regressive-moving
average (ARMA) model.
z_t = φ_1 z_{t-1} + ... + φ_p z_{t-p} + a_t - θ_1 a_{t-1} - ... - θ_q a_{t-q}     … (16.8)
In using such models in practice, p and q are usually not greater than 2.
For non-stationary processes the most general model used is an auto-
regressive integrated moving average (ARIMA) process of order (p, d, q)
where d represents the degree of differencing to achieve stationarity in the
process.
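In practice such models are rarely fitted by hand. Purely as an illustration,
and not as part of the identification procedure described by Box and Jenkins,
the sketch below fits an ARIMA model of an arbitrarily chosen order to the
quarterly sales of Table 1; it assumes the Python statsmodels library is
available.

    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA   # assumes statsmodels is installed

    # Quarterly sales of Table 1 (Rs. lakh)
    sales = pd.Series([5.5, 5.4, 7.2, 6.0, 4.8, 5.6, 6.3, 5.6, 4.0, 6.3,
                       7.0, 6.5, 5.2, 6.5, 7.5, 7.2, 6.0, 7.0, 8.4, 7.7])

    # order = (p, d, q): AR order 1, one level of differencing, MA order 1,
    # chosen here only for illustration, not by formal identification
    result = ARIMA(sales, order=(1, 1, 1)).fit()
    print(result.summary())
    print(result.forecast(steps=4))   # forecasts for the next four quarters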
The main contribution of Box and Jenkins is the development of procedures
for identifying the ARMA model that best fits a set of data and for testing the
adequacy of that model. The various stages identified by Box and Jenkins in
their iterative approach to model building are shown in Figure VI. For
details on how such models are developed refer to Box and Jenkins.
Figure VI: The Box-Jenkins Methodology

(The figure shows the stages as a loop: postulate a general class of models;
identify a model to be tentatively entertained; estimate the parameters in the
tentative model; carry out diagnostic checking to ask whether the model is
adequate; if not, return to the identification stage; if yes, use the model for
forecasting or control.)
16.6 SUMMARY
Some procedures for time series analysis have been described in this unit
with a view to making more accurate and reliable forecasts of the future.
Quite often the question that puzzles a person is how to select an appropriate
forecasting method. Many times the problem context or time horizon
involved would decide the method or limit the choice of methods. For
instance, in new areas of technology forecasting where historical information
is scanty, one would resort to some subjective method like an opinion poll or a
DELPHI study. In situations where one is trying to control or manipulate a
factor, a causal model might be appropriate in identifying the key variables
and their effect on the dependent variable.
In this particular unit, however, we have discussed time series models, that is,
models for which historical data on demand or on the variable of interest are
available. Thus we are dealing with projecting into the future from the past.
Such models are short term forecasting models.
The decomposition method has been discussed. Here the time series is broken
up into seasonal, trend, cycle and random components from the given data
and reconstructed for forecasting purposes. A detailed example to illustrate
the procedure is also given.
Finally the framework of stochastic models used by Box and Jenkins for time
series analysis has been outlined. The AR, MA, ARMA and ARIMA
processes in Box-Jenkins models are briefly described so that the interested
reader can pursue a detailed study on his own.

16.7 SELF-ASSESSMENT EXERCISES


1) What do you understand by time series analysis? How would you go
about conducting such an analysis for forecasting the sales of a product in
your firm?
2) Compare time series analysis with other methods of forecasting, briefly
summarising the strengths and weaknesses of various methods.
3) What would be the considerations in the choice of a forecasting method?
4) Find the 4-quarter moving average of the following time series
representing the quarterly production of coffee in an Indian State.
Production (in Tonnes)
Year Quarter I Quarter II Quarter III Quarter IV
1983 5 1 10 17
1984 7 1 10 16
1985 9 3 8 18
1986 5 2 15 19
1987 8 4 14 21
5) Given below is the data of production of a certain company in lakhs of
units
Year        1981  1982  1983  1984  1985  1986  1987
Production  15    14    18    20    17    24    27

a) Compute the linear trend by the method of least squares.


b) Compute the trend values of each of the years.
6) Given the following data on factory production of a certain brand of
motor vehicles, determine the seasonal indices by the ratio to moving
average method for August and September, 1985.

Production (in thousand units)


Year Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.
1985 7.92 7.81 7.91 7.03 7.25 7.17 5.01 3.90 4.64 7.03 6.88 6.14
1986 4.86 4.48 5.26 5.48 6.42 6.82 4.98 2.45 4.51 6.38 6.38 7.59

7) A survey of used car sales in a city for the 10-year period 1976-85 has
been made. A linear trend was fitted to the monthly sales for each year
and the equation was found to be
Y = 400 + 18 t

where t = 0 on January 1, 1981 and t is measured in half-year (6-monthly) units
a) Use this trend to predict sales for June, 1990.
b) If the actual sales in June, 1987 are 600 and the relative seasonal index
for June sales is 1.20, what would be the relative cyclical-irregular index for
June, 1987?
8) The monthly sales for the last one year of a product in thousands of units
are given below :
Month 1 2 3 4 5 6 7 8 9 10 11 12
Sales 0.5 1.5 2.2 3.0 3.2 3.5 3.5 3.5 3.8 4.0 4.7 5.5
Compute the auto-correlation coefficients up to lag 4. What conclusion
can be derived from these values regarding the presence of a trend in the
data?

16.8 KEY WORDS


Auto-correlation : Similar to correlation in that it describes the association
between values of the same variable but at different time periods. Auto-
correlation coefficients provide important information about the underlying
patterns in the data.
Auto-regressive/Moving Average (ARMA) Models : Auto-regressive(AR)
models assume that future values are linear combinations of past values.
Moving Average (MA) models, on the other hand, assume that future values
are linear combinations of past errors. A combination of the two is called an
"Auto-regressive/Moving Average (ARMA) model".

Decomposition : Identifying the trend, seasonality, cycle and randomness in
a time series.
Forecasting : Predicting the future values of a variable based on historical
values of the same or other variable(s). If the forecast is based simply on past
values of the variable itself, it is called time series forecasting, otherwise it is
a causal type forecasting.
Seasonal Index : A number with a base of 1.00 that indicates the seasonality
for a given period in relation to other periods.
Time Series Model : A model that predicts the future by expressing it as a
function of the past.
Trend : A growth or decline in the mean value of a variable over the relevant
time span.

16.9 FURTHER READINGS


Box, G.E.P. and G.M. Jenkins. Time Series Analysis, Forecasting and
Control, Holden-Day: San Francisco.
Chambers, J.C., S.K. Mullick and D.D. Smith. An Executive's Guide to
Forecasting, John Wiley: New York.
Makridakis, S. and S. Wheelwright. Interactive Forecasting: Univariate and
Multivariate Methods, Holden-Day: San Francisco.
Makridakis, S. and S. Wheelwright. Forecasting: Methods and Applications,
John Wiley, New York.
Montgomery, D.C. and L.A. Johnson. Forecasting and Time Series Analysis,
McGraw Hill: New York.
Nelson, C.R. Applied Time Series Analysis for Managerial Forecasting,
Holden-Day: San Francisco.

