Unit - 5 Forecasting


Forecasting

 Forecasting is a basic tool that supports managerial decision making. Without it, a manager would be keeping inventory without knowing what sales will be, or buying equipment and machinery with no knowledge of the demand for the product.
 Forecasts can be obtained using a variety of techniques, which fall into two categories: qualitative techniques (models) and quantitative techniques (models). Qualitative models rely on personal judgment, drawing on qualities such as intuition and experience as the basis of the forecast, and are subjective by nature. Quantitative models, on the other hand, are objective in nature and use numerical information as the basis of forecasting.
 Quantitative models include time series models and causal models. Time series models attempt to predict future values from historical data, while causal models are used where one variable is related to, and therefore dependent upon, the value of some other variable.

Time series model of forecasting


 Time series models use time-based data. A time series is a collection of readings belonging to different time periods, usually equally spaced, which may be weeks, months, years, etc. It is a series in which the data are arranged in chronological order. Forecasting time series data implies that predictions about future values are made only from past data. Analysing a time series means decomposing the past data into components and then projecting them forward. A time series typically has four components:
1. Secular trend
 Over a long period, a time series will have an overall tendency either to move upwards or downwards though
the actual movements will not be regular. This tendency of the time series is known as secular trend. For
example, a glance at the sales of a popular soft drinks manufacturer is likely to reveal an increasing trend.
2. Seasonal variation
 Fluctuations that occur periodically, with movements recurring within a definite period (every week, month or quarter) and with a reasonably high degree of predictability, are called seasonal variations. They can be man-made or natural. Natural seasonal variations include sales or profit that depend on seasons such as summer, winter and monsoon. Man-made variations include sales or profit that depend on festivals, fashion, wedding seasons, particular days, etc. The period of a seasonal variation is always within one year.
3. Cyclical movements
 These are caused by business cycles or trade cycles. They are oscillatory movements: a series of repeated sequences superimposed on the original data. These movements last longer than a year. For example, the sales of a company may be high because the general level of economic activity is high.
4. Random movements
 Random movements are residual movements that do not follow any set pattern and are usually caused by unpredictable events such as war, fire, strikes and the like.

Method of measuring time series – trend projection


 In this method a trend line is fitted to the given time series data and projections are made into the future using this line. The trend line may be linear or curvilinear. To obtain it, the historical data are first plotted on a graph with time on the X-axis. A line is then drawn through the points in such a way that the sum of the deviations above the line equals the sum of the deviations below it, so that the deviations sum to 0 and the sum of the squares of the deviations is a minimum.
 The trend line is drawn on the principle of least squares. A line is given by Y = a + bX, where Y is the dependent variable to be predicted, X is the independent variable, a is the constant (intercept) and b is the slope. The values of a and b are obtained from the normal equations below.
 Y = a + bX
 ∑Y = na + b∑X
 ∑XY = a∑X + b∑X²
 $a = \bar{Y} - b\bar{X}$
 $b = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$
Example
The sales of a company, in millions of rupees, are given below. Using the principle of least squares, fit a straight-line trend equation to the data. Show that the sum of the deviations is 0, and determine the sum of the squares of the deviations. Forecast the sales for the years 2007 and 2008.

Year(X) Sales(Y)
1999 82
2000 80
2001 90
2002 92
2003 83
2004 94
2005 99
2006 92

Year Sales(Y) X XY X² Ye Y-Ye (Y-Ye)²
1999 82 0 0 0 82 0 0
2000 80 1 80 1 84 -4 16
2001 90 2 180 4 86 4 16
2002 92 3 276 9 88 4 16
2003 83 4 332 16 90 -7 49
2004 94 5 470 25 92 2 4
2005 99 6 594 36 94 5 25
2006 92 7 644 49 96 -4 16
Total 712 28 2576 140 - 0 142
n = 8
$\bar{Y} = \sum Y / n = 89$
$\overline{XY} = \sum XY / n = 322$
∑Y = na + b∑X
∑XY = a∑X + b∑X2
712 = 8a + 28b …1
2576 = 28a + 140b …2
Multiply equation 1 by 5 and subtract equation 2 from it.
3560 = 40a + 140b
2576 = 28a + 140b
------------------------
984 = 12a
a = 82
Put a = 82 in equation 1: 712 = 8(82) + 28b, so 28b = 56
b = 2
Y = 82 + 2X (origin: 1999, X in units of one year)
Y2007 = 82 + 2(8) = 98
Y2008 = 82 + 2(9) = 100
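
The hand calculation above can be checked with a short script. This is a minimal sketch in plain Python and is not part of the original notes. The helper name `fit_line` is hypothetical. It solves the two normal equations directly for the coded X values 0 to 7.

```python
def fit_line(xs, ys):
    """Solve the normal equations for Y = a + bX and return (a, b)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminate a from the two normal equations to get b, then back-substitute.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


years = list(range(1999, 2007))
sales = [82, 80, 90, 92, 83, 94, 99, 92]
coded_x = [year - 1999 for year in years]  # origin at 1999, as in the table

a, b = fit_line(coded_x, sales)
trend = [a + b * x for x in coded_x]
deviations = [y - ye for y, ye in zip(sales, trend)]
sum_dev = sum(deviations)
sum_sq_dev = sum(d * d for d in deviations)
forecast_2007 = a + b * (2007 - 1999)
forecast_2008 = a + b * (2008 - 1999)

print(a, b)
print(sum_dev, sum_sq_dev)
print(forecast_2007, forecast_2008)
```

The script should reproduce a = 82, b = 2, a deviation sum of 0, a squared-deviation sum of 142, and the forecasts 98 and 100.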

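To simplify the calculation of a and b, we can change the origin of X so that ∑X = 0. If n is odd, the transformation is
$X = \dfrac{\text{year} - \text{middle year}}{\text{interval}}$
and if n is even, the transformation is
$X = \dfrac{\text{year} - \text{mean of the two middle years}}{\tfrac{1}{2}(\text{interval})}$
With ∑X = 0 the normal equations reduce to a = ∑Y / n and b = ∑XY / ∑X².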
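
The change of origin can be sketched in Python. The function name `code_years` is hypothetical, and the sketch assumes the usual coding rule for sorted, equally spaced periods: divide by the interval when n is odd and by half the interval when n is even.

```python
def code_years(years):
    """Return coded X values that sum to zero for sorted, equally spaced years."""
    n = len(years)
    interval = years[1] - years[0]
    if n % 2 == 1:
        # Odd n: the origin is the middle year, the unit is one interval.
        middle = years[n // 2]
        return [(year - middle) / interval for year in years]
    # Even n: the origin is the mean of the two middle years, the unit is half an interval.
    middle = (years[n // 2 - 1] + years[n // 2]) / 2
    return [(year - middle) / (interval / 2) for year in years]


odd_case = code_years([1990, 1992, 1994, 1996, 1998])
even_case = code_years(list(range(1994, 2002)))
print(odd_case)
print(even_case)
```

The odd case gives the values -2 to 2 and the even case gives -7, -5, ..., 7, which are the X columns used in the two examples that follow.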
Example
Fit a linear trend to the following data by the least squares method. Verify that ∑(Y - Ye) = 0. Also estimate the production for the year 1999.
Year(X) Y
1990 18
1992 21
1994 23
1996 27
1998 16

Year Y X XY X² Ye Y-Ye
1990 18 -2 -36 4 20.6 -2.6
1992 21 -1 -21 1 20.8 0.2
1994 23 0 0 0 21 2
1996 27 1 27 1 21.2 5.8
1998 16 2 32 4 21.4 -5.4
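Total 105 0 2 10 - 0
∑Y = na + b∑X
105 = 5a + 0b
5a = 105
a = 21

∑XY = a∑X + b∑X²
2 = 0a + 10b
10b = 2
b = 0.2

Y = 21 + 0.2X (origin: 1994, X in units of two years)
For 1999, X = (1999 - 1994) / 2 = 2.5
Y1999 = 21 + 0.2(2.5) = 21.5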
Example
From the following data, find the linear trend equation. Estimate the sales for the year 1993, find the sum of the squares of the deviations, and show that the sum of the deviations is 0.
Year(X) Y
1994 550
1995 560
1996 555
1997 585
1998 540
1999 525
2000 545
2001 585

Year Y X XY X² Ye Y-Ye (Y-Ye)²
1994 550 -7 -3850 49 554.155 -4.155 17.264
1995 560 -5 -2800 25 554.575 5.425 29.431
1996 555 -3 -1665 9 554.995 0.005 0.000025
1997 585 -1 -585 1 555.415 29.585 875.272
1998 540 1 540 1 555.835 -15.835 250.747
1999 525 3 1575 9 556.255 -31.255 976.875
2000 545 5 2725 25 556.675 -11.675 136.306
2001 585 7 4095 49 557.095 27.905 778.689
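Total 4445 0 35 168 - 0 3064.5838
∑Y = na + b∑X
4445 = 8a + 0b
8a = 4445
a = 555.625

∑XY = a∑X + b∑X²
35 = 0a + 168b
168b = 35
b = 0.2083, rounded to 0.21

Y = 555.625 + 0.21X (origin: midpoint of 1997 and 1998, X in units of half a year)
For 1993, X = (1993 - 1997.5) / 0.5 = -9
Y1993 = 555.625 + 0.21(-9) = 553.735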
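
When the coded X values sum to zero, the two normal equations decouple, which is what makes the calculation above so short. The following minimal Python sketch (variable names are illustrative only) repeats it without rounding b, so its 1993 estimate differs slightly from the hand-worked 553.735.

```python
years = list(range(1994, 2002))
y = [550, 560, 555, 585, 540, 525, 545, 585]

# n is even, so code X in half-year units around the midpoint 1997.5.
midpoint = (years[0] + years[-1]) / 2
x = [(year - midpoint) / 0.5 for year in years]

# With sum(X) = 0 the normal equations give a and b independently.
a = sum(y) / len(y)
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

x_1993 = (1993 - midpoint) / 0.5
forecast_1993 = a + b * x_1993
print(a, round(b, 4), round(forecast_1993, 2))
```

With b kept at 35/168 the estimate for 1993 is 553.75. The small gap from 553.735 comes only from rounding b to 0.21 in the table.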
Goodness of fit
 To determine how well the fitted regression line describes the given data points, we consider three components of variation. The sum of the squares of the deviations of the observations from their arithmetic mean is called the total variation. It is given by $\sum (Y - \bar{Y})^2$ and shows the extent to which the arithmetic mean fails to describe the data. It is made up of two parts: (1) the variation explained by the line of relationship, called the explained variation, given by $\sum (Y_c - \bar{Y})^2$, and (2) the variation not explained by the line, called the unexplained variation, given by $\sum (Y - Y_c)^2$. Here $Y_c$ is the value computed from the line. The former is the sum of the squares of the deviations of the points on the line from the arithmetic mean, whereas the latter is found by adding the squares of the deviations of the observations from the corresponding points on the line. Thus
Total variation = $\sum (Y - \bar{Y})^2$ = total sum of squares (SST)
Explained variation = $\sum (Y_c - \bar{Y})^2$ = regression sum of squares (SSR)
Unexplained variation = $\sum (Y - Y_c)^2$ = error sum of squares (SSE)
 A fit is good if SSE is small and SSR is large. The ratio of explained variation to total variation is called the coefficient of determination, given by $R^2 = \dfrac{SSR}{SST}$. The ratio of unexplained variation to total variation is called the coefficient of non-determination, given by $1 - R^2 = \dfrac{SSE}{SST}$. It tells what proportion of the total variation in Y is not explained by X.
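
As an illustration of these three sums of squares, the sketch below applies them to the first trend example (sales 1999 to 2006 with the fitted line Y = 82 + 2X). The function name `goodness_of_fit` is hypothetical, and the code is a plain-Python sketch rather than a prescribed procedure.

```python
def goodness_of_fit(observed, fitted):
    """Return (SST, SSR, SSE, R squared) for observed values and values on the line."""
    mean_y = sum(observed) / len(observed)
    sst = sum((y - mean_y) ** 2 for y in observed)               # total variation
    ssr = sum((yc - mean_y) ** 2 for yc in fitted)               # explained variation
    sse = sum((y - yc) ** 2 for y, yc in zip(observed, fitted))  # unexplained variation
    return sst, ssr, sse, ssr / sst


sales = [82, 80, 90, 92, 83, 94, 99, 92]
trend = [82 + 2 * x for x in range(8)]  # fitted values for X = 0..7

sst, ssr, sse, r2 = goodness_of_fit(sales, trend)
print(sst, ssr, sse, round(r2, 2))
```

For this data SST = 310, SSR = 168 and SSE = 142, so SSR + SSE = SST and about 54% of the variation in sales is explained by the trend line.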

Multiple regression analysis


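 Multiple regression allows us to build a model with more than one independent variable. The regression equation with two independent variables is given by
 $Y = a + b_1 X_1 + b_2 X_2$
 The parameters a, b1 and b2 are obtained from the normal equations
 $\sum Y = na + b_1 \sum X_1 + b_2 \sum X_2$ …1
 $\sum X_1 Y = a \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2$ …2
 $\sum X_2 Y = a \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2$ …3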
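
The three normal equations form a 3×3 linear system in a, b1 and b2. The sketch below builds that system from raw data and solves it by Gaussian elimination in plain Python. The function names and the small data set are hypothetical and chosen so that the exact answer (a = 1, b1 = 2, b2 = 3) is known in advance.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_two_variable_regression(x1, x2, y):
    """Return (a, b1, b2) for Y = a + b1*X1 + b2*X2 from the normal equations."""
    n = len(y)
    s1, s2 = sum(x1), sum(x2)
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    matrix = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
    rhs = [sum(y),
           sum(u * w for u, w in zip(x1, y)),
           sum(v * w for v, w in zip(x2, y))]
    a, b1, b2 = solve_linear_system(matrix, rhs)
    return a, b1, b2


# Made-up data generated exactly from Y = 1 + 2*X1 + 3*X2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]
a, b1, b2 = fit_two_variable_regression(x1, x2, y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Solving the system numerically avoids the rounding that accumulates when the equations are eliminated by hand.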
Example
A car manufacturer recently held three-day roadside exhibitions to introduce a new car model. The number of salesmen employed at each of a sample of 10 exhibitions and the number of cars booked are given below. Using these data, obtain the regression equation and estimate the number of cars booked if 10 salesmen are employed at an exhibition.
No. of salesmen (X) No. of cars booked (Y)
5 132
8 160
6 148
8 156
9 168
3 102
5 142
4 98
6 152
6 142

X Y X² XY YC (Y-YC)² (Y-Ȳ)² (YC-Ȳ)²


5 132 25 660 128.75 10.5625 64 126.5625
8 160 64 1280 162.5 6.25 400 506.25
6 148 36 888 140 64 64 0
8 156 64 1248 162.5 42.25 256 506.25
9 168 81 1512 173.75 33.0625 784 1139.0625
3 102 9 306 106.25 18.0625 1444 1139.0625
5 142 25 710 128.75 175.5625 4 126.5625
4 98 16 392 117.5 380.25 1764 506.25
6 152 36 912 140 144 144 0
6 142 36 852 140 4 4 0
60 1400 392 8760 1400 878 4928 4050

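$\bar{Y} = 140$
$\bar{X} = 6$
$\bar{X}^2 = 36$

$b = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \dfrac{8760 - 10(6)(140)}{392 - 10(36)} = 11.25$
$a = \bar{Y} - b\bar{X} = 140 - 11.25(6) = 72.5$
YC = 72.5 + 11.25X
Y10 = 72.5 + 11.25(10) = 185
SST = 4928
SSR = 4050
SSE = 878
R² = SSR / SST = 0.82
1 - R² = 0.18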
This implies that the number of salesmen employed explains about 82% of the variation in the number of cars booked. All other factors combined can explain at most 18% of the variation, so we may say that the linear relationship between X and Y is strong. The forecast of 185 bookings when 10 salespersons are employed is known as a point estimate of the variable Y. Such a point estimate is actually the expected value, or mean, of the distribution of the possible values of Y. To measure the reliability of the regression estimates, we compute the standard error of estimate:
$S_{yx} = \sqrt{\dfrac{\sum (Y - Y_c)^2}{n - 2}}$ or $\sqrt{\dfrac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}}$, which here gives $\sqrt{878 / 8} = 10.48$
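
The figures in this example can be reproduced with a few lines of Python. This is a sketch using only the standard library. The variable names are illustrative, and the divisor n - 2 is the one implied by 878 / 8 above.

```python
import math

salesmen = [5, 8, 6, 8, 9, 3, 5, 4, 6, 6]
bookings = [132, 160, 148, 156, 168, 102, 142, 98, 152, 142]
n = len(salesmen)

mean_x = sum(salesmen) / n
mean_y = sum(bookings) / n
sum_xy = sum(x * y for x, y in zip(salesmen, bookings))
sum_x2 = sum(x * x for x in salesmen)

# Slope and intercept from the mean-based form of the normal equations.
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
a = mean_y - b * mean_x

fitted = [a + b * x for x in salesmen]
sse = sum((y - yc) ** 2 for y, yc in zip(bookings, fitted))
sst = sum((y - mean_y) ** 2 for y in bookings)
r_squared = 1 - sse / sst
std_error = math.sqrt(sse / (n - 2))
forecast_10 = a + b * 10

print(a, b, forecast_10)
print(round(r_squared, 2), round(std_error, 2))
```

The output should agree with the table: a = 72.5, b = 11.25, a forecast of 185 cars, R² of about 0.82 and a standard error of about 10.48.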
Example
Given the following data, develop the estimating equation that best describes the data. If an employee scores 83 on the aptitude test and has a prior experience of 7, what performance would be expected? Also test the goodness of fit.
Performance Evaluation (Y) Aptitude Test Score (X1) Prior Experience (X2)
28 74 5
33 87 11
21 69 4
40 69 9
38 81 7
46 97 10

Y X1 X2 X1² X2² X1Y X2Y X1X2 YC (Y-YC)² (Y-Ȳ)² (YC-Ȳ)²


28 74 5 5476 25 2072 140 370 28.4 0.16 40.07 35.16
33 87 11 7569 121 2871 363 957 41.8 77.44 1.77 55.8
21 69 4 4761 16 1449 84 276 25.6 21.16 177.69 76.21
40 69 9 4761 81 2760 360 621 34.6 29.16 32.15 0.07
38 81 7 6561 49 3078 266 567 33.4 21.16 13.47 0.86
46 97 10 9409 100 4462 460 970 42 16 136.19 58.83
206 477 46 38537 392 16692 1673 3761 205.8 165.08 401.34 226.93
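$\bar{Y} = 34.33$
$Y = a + b_1 X_1 + b_2 X_2$
$\sum Y = na + b_1 \sum X_1 + b_2 \sum X_2$ …1
$\sum X_1 Y = a \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2$ …2
$\sum X_2 Y = a \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2$ …3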

206 = 6a + 477b1 + 46b2 …1


16692 = 477a + 38537b1 + 3761b2 …2
1673 = 46a + 3761b1 + 392b2 …3

Multiply equation 1 by 46 and subtract it from equation 3 multiplied by 6.

9476 = 276a + 21942b1 + 2116b2
10038 = 276a + 22566b1 + 2352b2
562 = 624b1 + 236b2 …4

Multiply equation 1 by 477 and subtract it from equation 2 multiplied by 6.

98262 = 2862a + 227529b1 + 21942b2
100152 = 2862a + 231222b1 + 22566b2
1890 = 3693b1 + 624b2 …5

Multiply equation 4 by 624 and subtract it from equation 5 multiplied by 236.
350688 = 389376b1 + 147264b2
446040 = 871548b1 + 147264b2
95352 = 482172b1
b1 = 0.198 = 0.2 (approx.)
b2 = 1.8 (approx.)
a = 4.6 (approx.)
Y = 4.6 + 0.2X1 + 1.8X2

When X1 = 83 and X2 = 7, Y = 4.6 + 0.2(83) + 1.8(7) = 33.8

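SST = 401.34
SSR = 226.93
SSE = 165.08
R² = SSR / SST = 0.57
1 - R² = 0.43
Standard error of estimate

$S_{y \cdot x_1 x_2} = \sqrt{\dfrac{\sum (Y - Y_c)^2}{n - k - 1}}$, where k = number of independent variables

Here $S_{y \cdot x_1 x_2} = \sqrt{165.08 / 3} = 7.42$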

Example
Using the following data, fit the required regression equation. Estimate the sales Y if X1 = 13000 and X2 = 7000 (the worked solution takes the tabulated values to be in thousands, so X1 = 13 and X2 = 7). Also check the goodness of fit.
Y X1 X2
72 12 5
76 11 8
78 15 6
70 10 5
68 11 3
80 16 9
82 14 12
65 8 4
62 8 3
90 18 10

Y X1 X2 X1² X2² X1Y X2Y X1X2 YC (Y-YC)² (Y-Ȳ)² (YC-Ȳ)²


72 12 5 144 25 864 360 60 72.1 0.01 5.29 4.84
76 11 8 121 64 836 608 88 73.95 4.2025 2.89 0.12
78 15 6 225 36 1170 468 90 78.05 0.0025 13.69 14.06
70 10 5 100 25 700 350 50 68.9 1.21 18.49 29.16
68 11 3 121 9 748 204 33 68.2 0.04 39.69 37.21
80 16 9 256 81 1280 720 144 83.1 9.61 32.49 77.44
82 14 12 196 144 1148 984 168 83.35 1.8225 59.29 81.9
65 8 4 64 16 520 260 32 64.55 0.2025 86.49 95.06
62 8 3 64 9 496 186 24 63.4 1.96 151.29 118.81
90 18 10 324 100 1620 900 180 87.45 6.5025 246.49 172.92
743 123 65 1615 509 9382 5040 869 743.05 25.5625 656.1 631.52
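$\bar{Y} = 74.3$
$Y = a + b_1 X_1 + b_2 X_2$
$\sum Y = na + b_1 \sum X_1 + b_2 \sum X_2$ …1
$\sum X_1 Y = a \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2$ …2
$\sum X_2 Y = a \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2$ …3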

743 = 10a + 123b1 + 65b2 …1


9382 = 123a + 1615b1 + 869b2 …2
5040 = 65a + 869b1 + 509b2 …3

Multiply equation 1 by 65 and subtract it from equation 3 multiplied by 10.

48295 = 650a + 7995b1 + 4225b2
50400 = 650a + 8690b1 + 5090b2
2105 = 695b1 + 865b2 …4

Multiply equation 1 by 123 and subtract it from equation 2 multiplied by 10.
91389 = 1230a + 15129b1 + 7995b2
93820 = 1230a + 16150b1 + 8690b2
2431 = 1021b1 + 695b2 …5

Multiply equation 4 by 695 and subtract it from equation 5 multiplied by 865.
1462975 = 483025b1 + 601175b2
2102815 = 883165b1 + 601175b2
639840 = 400140b1
b1 = 1.6
b2 = 1.15
a = 47.15
Y = 47.15 + 1.6X1 + 1.15X2

When X1 = 13 and X2 = 7, Y = 47.15 + 1.6(13) + 1.15(7) = 76

SST = 656.1
SSR = 631.5325
SSE = 25.5625
R² = SSR / SST = 0.96
1 - R² = 0.04
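
The elimination above can be cross-checked by solving the three normal equations directly. The sketch below uses Cramer's rule on the sums taken from the table. This is a different technique from the step-by-step elimination in the notes, and the function names `det3` and `solve_by_cramer` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_by_cramer(matrix, rhs):
    """Solve a 3x3 system: replace each column by rhs and take determinant ratios."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


# Coefficients of a, b1, b2 in normal equations 1, 2 and 3 (sums from the table).
normal_matrix = [[10, 123, 65],
                 [123, 1615, 869],
                 [65, 869, 509]]
normal_rhs = [743, 9382, 5040]

a, b1, b2 = solve_by_cramer(normal_matrix, normal_rhs)
forecast = a + b1 * 13 + b2 * 7
print(round(a, 2), round(b1, 2), round(b2, 2), round(forecast, 1))
```

The unrounded solution is close to the hand-worked values (a near 47.15, b1 near 1.6, b2 near 1.15), and the forecast for X1 = 13, X2 = 7 is close to 76.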

Fitting of second degree parabola


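$Y = a + bX + cX^2$
$\sum Y = na + b \sum X + c \sum X^2$
$\sum XY = a \sum X + b \sum X^2 + c \sum X^3$
$\sum X^2 Y = a \sum X^2 + b \sum X^3 + c \sum X^4$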

Example
The prices of a commodity during the years 2001-06 are given below. Fit a second-degree parabola and estimate the price for the year 2007.
Year Price
2001 100
2002 107
2003 128
2004 140
2005 181
2006 192

Year(t) Price(Y) X X² X³ X⁴ XY X²Y


2001 100 -5 25 -125 625 -500 2500
2002 107 -3 9 -27 81 -321 963
2003 128 -1 1 -1 1 -128 128
2004 140 1 1 1 1 140 140
2005 181 3 9 27 81 543 1629
2006 192 5 25 125 625 960 4800
Total 848 0 70 0 1414 694 10160
848 = 6a + 70c …1
694 = 70b …2
10160 = 70a + 1414c …3
b = 694 / 70 = 9.91

Multiply equation 1 by 70 and subtract it from equation 3 multiplied by 6.

59360 = 420a + 4900c
60960 = 420a + 8484c
1600 = 3584c
c = 0.45
From equation 1, a = (848 - 70(0.45)) / 6 = 136.08

Y = 136.08 + 9.91X + 0.45X²

For 2007, X = 7
Y2007 = 136.08 + 9.91(7) + 0.45(49) = 227.5
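
Because the coded X values are symmetric about zero, ∑X and ∑X³ vanish and the three normal equations split into one equation for b and a 2×2 system for a and c. The following plain-Python sketch (illustrative variable names) repeats the calculation without intermediate rounding.

```python
price = [100, 107, 128, 140, 181, 192]
x = [-5, -3, -1, 1, 3, 5]  # coded years 2001..2006, origin at 2003.5

n = len(x)
sum_y = sum(price)
sum_x2 = sum(v ** 2 for v in x)
sum_x4 = sum(v ** 4 for v in x)
sum_xy = sum(v * p for v, p in zip(x, price))
sum_x2y = sum(v * v * p for v, p in zip(x, price))

# sum(X) = sum(X^3) = 0, so equation 2 gives b on its own,
# and equations 1 and 3 form a 2x2 system in a and c.
b = sum_xy / sum_x2
c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
a = (sum_y - c * sum_x2) / n

x_2007 = 7
forecast_2007 = a + b * x_2007 + c * x_2007 ** 2
print(round(a, 3), round(b, 3), round(c, 3), round(forecast_2007, 1))
```

Without rounding, the estimate for 2007 is 227.4. The hand-worked 227.5 differs only because b and c were rounded to two decimals.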
Example
Fit a parabola to the following data and estimate the values for the years 2010 and 2000. Also find the trend.
Year Y
2001 2
2002 6
2003 7
2004 8
2005 10
2006 11
2007 11
2008 10
2009 9

Year(t) Y X X² X³ X⁴ XY X²Y
2001 2 -4 16 -64 256 -8 32
2002 6 -3 9 -27 81 -18 54
2003 7 -2 4 -8 16 -14 28
2004 8 -1 1 -1 1 -8 8
2005 10 0 0 0 0 0 0
2006 11 1 1 1 1 11 11
2007 11 2 4 8 16 22 44
2008 10 3 9 27 81 30 90
2009 9 4 16 64 256 36 144
Total 74 0 60 0 708 51 411
74 = 9a + 60c …1
51 = 60b …2
411 = 60a + 708c …3
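b = 51 / 60 = 0.85

Multiply equation 1 by 60 and subtract it from equation 3 multiplied by 9.

4440 = 540a + 3600c
3699 = 540a + 6372c
-741 = 2772c
c = -0.27
From equation 1, a = (74 + 60(0.27)) / 9 = 10.02

Y = 10.02 + 0.85X - 0.27X² (origin: 2005)

For 2010, X = 5, and for 2000, X = -5
Y2010 = 10.02 + 0.85(5) - 0.27(25) = 7.52
Y2000 = 10.02 + 0.85(-5) - 0.27(25) = -0.98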
Example
From the following data, compute the linear trend and the second-degree parabola.
Year Y
2001 238.3
2011 252
2021 251.2
2031 278.9
2041 318.5
2051 361
2061 439.1
2071 547.9

Year Y X X² X³ X⁴ XY X²Y
2001 238.3 -7 49 -343 2401 -1668.1 11676.7
2011 252 -5 25 -125 625 -1260 6300
2021 251.2 -3 9 -27 81 -753.6 2260.8
2031 278.9 -1 1 -1 1 -278.9 278.9
2041 318.5 1 1 1 1 318.5 318.5
2051 361 3 9 27 81 1083 3249
2061 439.1 5 25 125 625 2195.5 10977.5
2071 547.9 7 49 343 2401 3835.3 26847.1
Total 2686.9 0 168 0 6216 3471.7 61908.5
2686.9 = 8a + 168c …1
3471.7 = 168b …2
61908.5 = 168a + 6216c …3
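b = 3471.7 / 168 = 20.66

Multiply equation 1 by 168 and subtract it from equation 3 multiplied by 8.

451399.2 = 1344a + 28224c
495268 = 1344a + 49728c
43868.8 = 21504c
c = 2.04
a = 293.0225

Second-degree parabola: Y = 293.0225 + 20.66X + 2.04X²

Now fit the linear trend:

$Y = a + bX$
$\sum Y = na + b \sum X$
$\sum XY = a \sum X + b \sum X^2$

2686.9 = 8a …1
3471.7 = 168b …2
a = 335.8625
b = 20.66

Linear trend: Y = 335.8625 + 20.66X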
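
One way to compare the two fitted curves is to compute the unexplained variation (SSE) of each on the same data. The sketch below does this in plain Python. The comparison itself is an addition for illustration and is not part of the original exercise, and the helper name `sum_squared_error` is hypothetical.

```python
y = [238.3, 252, 251.2, 278.9, 318.5, 361, 439.1, 547.9]
x = [-7, -5, -3, -1, 1, 3, 5, 7]  # coded periods from the table

n = len(y)
sum_y = sum(y)
sum_x2 = sum(v ** 2 for v in x)
sum_x4 = sum(v ** 4 for v in x)
sum_xy = sum(v * w for v, w in zip(x, y))
sum_x2y = sum(v * v * w for v, w in zip(x, y))

# Linear trend: with sum(X) = 0 the normal equations give a and b directly.
a_lin = sum_y / n
b = sum_xy / sum_x2

# Parabola: same b, while a and c come from equations 1 and 3.
c = (n * sum_x2y - sum_x2 * sum_y) / (n * sum_x4 - sum_x2 ** 2)
a_par = (sum_y - c * sum_x2) / n


def sum_squared_error(predict):
    """Sum of squared deviations of the observations from a fitted curve."""
    return sum((w - predict(v)) ** 2 for v, w in zip(x, y))


sse_linear = sum_squared_error(lambda v: a_lin + b * v)
sse_parabola = sum_squared_error(lambda v: a_par + b * v + c * v * v)
print(round(a_lin, 4), round(b, 2), round(a_par, 2), round(c, 2))
print(sse_parabola < sse_linear)
```

The parabola leaves less unexplained variation than the straight line here, which matches the visibly accelerating growth in the data.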
