
VAR Project

For the current project, I investigate the revenues and promotion dollars for a fast food retail store in a Midwestern college town. The data is collected at a weekly interval and spans 223 weeks, from 9/29/2002 to 12/31/2006.


Time series analysis of store level sales and promotion for a fast food restaurant: A vector autoregression model

Sunil Singh

This version: 8th December 2011


Introduction

In real life, statisticians and marketing scholars frequently need to analyze multiple time series at the same time, which calls for methodologies that accommodate several series at once. One technique that modelers have used extensively for such data is the vector autoregressive (VAR) model. VAR models let us model multiple time series jointly and understand how the different series affect each other. For the current project, I investigate the revenues and promotion dollars for a fast food retail store located in a Midwestern college town. The data is collected at a weekly interval and spans 223 weeks, from 9/29/2002 to 12/31/2006. I use the first 200 weeks (9/29/2002 to 7/23/2006) to estimate the VAR model and the remaining 23 weeks (7/30/2006 to 12/31/2006) to assess the forecasting accuracy of the VAR model.

Since the early 1970s, price promotions have emerged as an important part of the marketing mix. Increasingly, they represent the main share of the marketing budget for most companies. An extensive body of academic research has established that temporary price reductions substantially increase short-term sales, which may explain their intensity of use by retailers and manufacturers alike. However, the long-term effects of price promotions have been found to be much weaker. Extant research has found that short-term promotion effects die out in subsequent weeks or months (a period also called dust settling), leaving few, if any, permanent gains to the promoting brand. From a strategic perspective, these findings imply that promotions generally do not generate long-term benefits to the promoting brand beyond those accrued during the dust-settling period. By the same token, brands do not suffer permanent damage to their market position from competitive promotions either. Therefore, to be economically viable, promotional actions should be held accountable for positive financial results during the dust-settling period. This raises the relevant empirical question of whether or not promotions are attractive in financial terms.

Thus, in the current project, I investigate the effects of price promotions on a store's revenues. Apart from predicting future sales and promotion spending, another goal of this project is to become familiar with the model building techniques and diagnostic checks that are predominantly used in VAR modeling. The rest of this report is organized as follows: I first compute descriptive statistics to identify any sporadic behavior of sales and promotion. I then fit a VAR model on differences or levels, as outlined by the guiding framework below. I also employ several diagnostic checks at each stage to validate the models and to establish the reliability and accuracy of the chosen models.

Guiding principle for VAR modeling

Step 1: Test for unit roots.
Step 2: If there is no unit root, build the VAR model on levels and go to step 5. If there are unit roots, go to step 3.
Step 3: Test for cointegration.
Step 4: If there is no cointegration, build the VAR model on differences and go to step 5. If there is cointegration, build a vector error correction model and go to step 5.
Step 5: Derive impulse-response models.
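The branching in the guiding framework can be written down as a small decision function. The sketch below is only an illustration in Python (the analysis itself was carried out in R); the function name choose_model and its two boolean inputs are hypothetical stand-ins for the outcomes of the unit root and cointegration tests.

```python
def choose_model(has_unit_root: bool, is_cointegrated: bool = False) -> str:
    """Return the model class suggested by the five-step framework.

    has_unit_root   -- outcome of Step 1 (unit root test)
    is_cointegrated -- outcome of Step 3 (only relevant when a unit root is found)
    """
    if not has_unit_root:
        return "VAR in levels"        # Step 2, no unit root
    if is_cointegrated:
        return "VECM"                 # Step 4, cointegration found
    return "VAR in differences"       # Step 4, no cointegration


# Walk through the three possible paths; Step 5 (impulse responses)
# follows whichever model is chosen.
for unit_root, coint in [(False, False), (True, False), (True, True)]:
    print(unit_root, coint, "->", choose_model(unit_root, coint))
```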

Descriptive statistics

Descriptive statistics of the sales and promotion data (displayed in Figure 1) reveal (a) a strong trend and (b) seasonality. To stabilize the variation, I take a log transformation of both the sales and the promotion data. The log-transformed data retain the trend, which suggests that the series are integrated and need a first difference. Both variables appear approximately stationary after this transformation. A VAR model is therefore a reasonable way to proceed.

Figure 1: Descriptive statistics
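The log transformation followed by a first difference can be sketched in a few lines. This is a minimal Python illustration, not the code used for the report; the helper log_diff and the weekly_sales numbers are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def log_diff(series):
    """Log-transform a positive series, then take first differences."""
    logs = [math.log(x) for x in series]
    # Each difference log(x_t) - log(x_{t-1}) approximates the weekly growth rate.
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


# Hypothetical weekly revenues growing by 10% per week (not the store data).
weekly_sales = [100.0, 110.0, 121.0, 133.1]
growth = log_diff(weekly_sales)
print(growth)
```

A series that grows at a constant rate trends upward in logs but is flat after differencing, which is the pattern the plots are inspected for.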

Step 1: Test for unit root

To test for unit roots, I used the augmented Dickey-Fuller (ADF) test available through the library vars in R. Table 1 summarizes the ADF test results. All four test statistics lie below their 5% critical values, so the unit-root null is rejected for both transformed series at the 5% level. It can therefore be concluded that both time series are integrated of at most order one, which is consistent with the plots above. In the ensuing step, I find the optimal lag length for the unrestricted VAR model.

Variable    Deterministic terms   Lags   Test value   Critical values
                                                      1%      5%      10%
Sales       Constant, trend       2      -4.72        -3.99   -3.43   -3.13
Sales       Constant              1      -3.57        -3.46   -2.88   -2.57
Promotion   Constant, trend       2      -4.13        -3.99   -3.43   -3.13
Promotion   Constant              1      -3.42        -3.46   -2.88   -2.57

Table 1: ADF results
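The ADF decision rule is a left-tailed comparison: the unit-root null is rejected when the test value is below the critical value. The sketch below applies that rule to the numbers reported in Table 1. It is an illustration in Python only; the names ADF_ROWS and strictest_rejection are hypothetical.

```python
# (variable, deterministic terms, test value, critical values) as reported in Table 1.
ADF_ROWS = [
    ("Sales", "constant, trend", -4.72, {"1%": -3.99, "5%": -3.43, "10%": -3.13}),
    ("Sales", "constant", -3.57, {"1%": -3.46, "5%": -2.88, "10%": -2.57}),
    ("Promotion", "constant, trend", -4.13, {"1%": -3.99, "5%": -3.43, "10%": -3.13}),
    ("Promotion", "constant", -3.42, {"1%": -3.46, "5%": -2.88, "10%": -2.57}),
]


def strictest_rejection(test_value, critical_values):
    """Return the strictest level at which the unit-root null is rejected, or None."""
    for level in ("1%", "5%", "10%"):
        if test_value < critical_values[level]:  # left-tailed test
            return level
    return None


results = [
    (variable, terms, strictest_rejection(value, crit))
    for variable, terms, value, crit in ADF_ROWS
]
for variable, terms, level in results:
    print(f"{variable} ({terms}): unit-root null rejected at {level}")
```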

Lag length estimation

To determine the optimal lag length, I use the VARselect command with AIC, HQ, SC and FPE as selection criteria. The results for a maximum lag of 8 are summarized in Table 2 and Table 3. According to AIC and FPE the optimal lag order is p = 3, whereas the HQ criterion indicates p = 2 and the SC criterion indicates p = 1. Since I have 200 observations, I favor the HQ criterion on grounds of parsimony, as HQ has a high probability of selecting the true order of the AR process when the number of observations is large (at least 200). However, I still estimate the VAR model for all three lags, p = 1, 2 and 3. Table 4 below shows the estimation results for each lag length.

AIC(n)   HQ(n)   SC(n)   FPE(n)
3        2       1       3

Table 2: Selection criteria for lag length (1)

Lag   AIC(n)   HQ(n)    SC(n)    FPE(n)
1     -13.03   -12.98   -12.90   2.17E-06
2     -13.08   -12.99   -12.87   2.08E-06
3     -13.09   -12.98   -12.82   2.06E-06
4     -13.05   -12.92   -12.71   2.13E-06
5     -13.05   -12.88   -12.64   2.15E-06
6     -13.03   -12.84   -12.55   2.19E-06
7     -13.01   -12.79   -12.47   2.22E-06
8     -12.99   -12.74   -12.38   2.28E-06

Table 3: Selection criteria for lag length (2)

p   Log Likelihood   Standard error   Adjusted R-square
1   741.11           .07              .67
2   745.88           .07              .68
3   747.69           .07              .68

Table 4: Estimation for each lag length
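Each criterion selects the lag at which its value is smallest. The short Python sketch below reproduces that selection from the values reported in Table 3; it only re-reads the table and does not re-estimate anything. The names CRITERIA and best_lag are hypothetical.

```python
# Criterion values by lag order, copied from Table 3.
CRITERIA = {
    1: {"AIC": -13.03, "HQ": -12.98, "SC": -12.90, "FPE": 2.17e-06},
    2: {"AIC": -13.08, "HQ": -12.99, "SC": -12.87, "FPE": 2.08e-06},
    3: {"AIC": -13.09, "HQ": -12.98, "SC": -12.82, "FPE": 2.06e-06},
    4: {"AIC": -13.05, "HQ": -12.92, "SC": -12.71, "FPE": 2.13e-06},
    5: {"AIC": -13.05, "HQ": -12.88, "SC": -12.64, "FPE": 2.15e-06},
    6: {"AIC": -13.03, "HQ": -12.84, "SC": -12.55, "FPE": 2.19e-06},
    7: {"AIC": -13.01, "HQ": -12.79, "SC": -12.47, "FPE": 2.22e-06},
    8: {"AIC": -12.99, "HQ": -12.74, "SC": -12.38, "FPE": 2.28e-06},
}


def best_lag(criteria, name):
    """Return the lag order that minimizes the named information criterion."""
    return min(criteria, key=lambda lag: criteria[lag][name])


selected = {name: best_lag(CRITERIA, name) for name in ("AIC", "HQ", "SC", "FPE")}
print(selected)
```

The selected lags agree with the summary in Table 2.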

The adjusted R-squared values and standard errors are almost the same across the three models, so the simplest model, p = 1, has a case on grounds of parsimony even though it has the smallest log likelihood. The log likelihoods for p = 2 and p = 3 are only slightly larger, but a close inspection of the VAR equations suggests that p = 2 has the most significant coefficients. Thus, based on the selection criteria (AIC, HQ, SC and FPE) and a closer look at the VAR models estimated at lags 1, 2 and 3, I proceed to fit the data with lag 2. Before I settle on the model, however, I check the time series for the presence of cointegration.

Step 3: Cointegration test

To avoid (i) running spurious regressions on non-stationary raw time series and (ii) losing important long-run information by taking log first differences of the series, I first investigate whether a cointegration relationship exists between the log-scaled sales and promotion series. Table 5 shows the results of the Johansen trace test. Every test statistic exceeds its 1% critical value, so both null hypotheses (r = 0 and r ≤ 1) are rejected at both lag orders. With two variables, this full-rank outcome points to series that are themselves stationary rather than to a single cointegrating relation between non-stationary series. I therefore conclude that there is no cointegration between the two time series Sales and Promotion at either lag. Thus we can proceed to compute the VAR model in differences with lag 2.

H0      Test statistic      Critical values
        p = 1     p = 2     10%     5%      1%
r ≤ 1   27.44     17.66     10.49   12.25   16.26
r = 0   78.37     61.36     22.76   25.32   30.45

Table 5: Johansen cointegration tests

Step 4: VAR model estimation

The VAR equations for sales and promotion are estimated at lag 2. The results are tabulated in Tables 6a and 6b. The VAR(2) model suggests that sales are explained well by their own value in the previous week and by promotion spending at lag 2. Promotion follows a similar pattern, i.e., it is explained well by sales in the previous week and by promotion at lag 2. The Granger non-causality test results given in Table 7 are broadly consistent with this. The null hypothesis that promotion does not Granger-cause sales cannot be rejected at the 5% level (p-value of 0.07805), although it is rejected at the 10% level, whereas the null hypothesis that sales do not Granger-cause promotion is rejected at the 5% level (p-value of 0.01345). Therefore, according to the definition of Granger causality, lagged sales help forecast promotion, while lagged promotion helps forecast sales only at the weaker 10% level. In addition, the null hypothesis of no instantaneous causality between sales and promotion is rejected in both tests (p-value of 0.0001) at the usual 5% significance level.
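The Granger decisions rest on a simple rule: the null of non-causality is rejected when the p-value falls below the chosen significance level. The Python sketch below applies that rule to the two p-values quoted in the text; the dictionary GRANGER_P and the helper reject_null are hypothetical names used only for illustration.

```python
# p-values of the Granger non-causality tests as quoted in the text.
GRANGER_P = {
    "sales -> promotion": 0.01345,
    "promotion -> sales": 0.07805,
}


def reject_null(p_value, alpha=0.05):
    """Reject the null of non-causality when the p-value is below alpha."""
    return p_value < alpha


for direction, p_value in GRANGER_P.items():
    for alpha in (0.05, 0.10):
        verdict = "rejected" if reject_null(p_value, alpha) else "not rejected"
        print(f"{direction}: non-causality {verdict} at alpha = {alpha}")
```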

Coefficients (S)   Estimate   Std. Error   t value   Pr(>|t|)
S.l1               .6305      .1621        3.890     .0001 ***
P.l1               -.3252     .2281        -1.382    .1686
S.l2               -.1004     .1614        -.622     .5346
P.l2               .4608      .2195        2.099     .0371 *
const              3.9511     .9439        4.186     .00004 ***
trend              .0005      .0001        3.443     .0007 ***

Residual standard error: 0.0669 on 192 degrees of freedom
Multiple R-Squared: 0.6932, Adjusted R-squared: 0.6852
F-statistic: 86.76 on 5 and 192 DF, p-value: < 2.2e-16
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Table 6a: Estimated coefficients for p = 2 for sales

Coefficients (P)   Estimate   Std. Error   t value   Pr(>|t|)
S.l1               .3219      .1152        2.795     .0057 **
P.l1               .0186      .1621        .115      .9085
S.l2               -.1871     .1147        -1.632    .1044
P.l2               .4476      .1560        2.869     .00458 **
const              3.659      .6708        5.456     .00001 ***
trend              .0004      .0001        4.168     .0001 ***

Residual standard error: 0.0475 on 192 degrees of freedom
Multiple R-Squared: 0.7422, Adjusted R-squared: 0.7354
F-statistic: 110.5 on 5 and 192 DF, p-value: < 2.2e-16
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Table 6b: Estimated coefficients for p = 2 for promotions

Granger causality test   Statistic   p-value
Cause = Sales
  Granger                4.3574      .0134
  Instant                88.4603     .0001
Cause = Promotion
  Granger                2.5674      .0780
  Instant                88.4603     .0001

Table 7: Granger causality result for VAR (2) model

The VAR(2) model is also subjected to several diagnostic tests to assess its adequacy. To test for autocorrelation in the residuals, I use the Portmanteau test. The Ljung-Box statistic serves the same purpose, but it is limited to univariate time series, whereas the Portmanteau test is well established for multivariate time series. In total I ran three tests: the Portmanteau test for autocorrelation of the error terms, the Jarque-Bera test for normality, and the ARCH test for heteroskedasticity. The results of these tests are summarized in Table 8. The graphical representation of the model fit and the ACF and PACF plots are in Figure 2. The results in Table 8 and Figure 2 suggest that the error terms are not autocorrelated (p-value of 0.2468). The nulls of normality (p-value of 0.0142) and of no ARCH effects (p-value of 0.0411) are rejected at the 5% level but not at the 1% level, so the residuals can be regarded as normal and homoskedastic only at the stricter 1% level.

Diagnostic test (lag 2)   Statistic   p-value
Q16                       62.845      .2468
JB                        12.467      .0142
M-ARCH                    46.467      .0411

Table 8: Diagnostic test results
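To make the normality check more concrete, the sketch below computes the univariate Jarque-Bera statistic, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis. This is a simplified Python illustration: Table 8 reports the multivariate version of the test, and the residuals used here are hypothetical toy values, not the model residuals. The p-value uses the fact that the upper tail of a chi-square distribution with 2 degrees of freedom is exp(-x/2).

```python
import math


def jarque_bera(residuals):
    """Univariate Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    m2 = sum(c ** 2 for c in centered) / n   # second central moment
    m3 = sum(c ** 3 for c in centered) / n   # third central moment
    m4 = sum(c ** 4 for c in centered) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    stat = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-stat / 2.0)          # chi-square survival function, 2 df
    return stat, p_value


# Hypothetical toy residuals (symmetric, so the skewness is zero).
toy_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
jb_stat, jb_p = jarque_bera(toy_residuals)
print(round(jb_stat, 4), round(jb_p, 4))
```

A small p-value, such as the .0142 reported for JB in Table 8, counts against normality at the corresponding significance level.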

Figure 2: Diagram of fit and residual for S (sales)

The diagnostic plot of the varcheck object arch1 for log(sales) (S) and the OLS-CUSUM tests for the VAR(2) model are shown in Figures 4 and 3, respectively. The residuals of log(sales) are evenly distributed around a mean of zero, look approximately normal, and show no significant lags in either the ACF or the PACF plot. All curves in the OLS-CUSUM tests for sales and promotion show only small fluctuations and stay between -1 and 1.

Figure 3: OLS-CUSUM test of VAR (2) for equation S (log (sales)) and P (log (promotion))

Figure 4: Diagnostic plot of VAR (2) for equation S (log(sales))

Step 5: Impulse response function

The Granger non-causality test results indicated that sales are affected by lagged sales and promotion. To see how large the impact of a shock to either variable is on the other, I compute the impulse response function (IRF). The graphical results for the IRF are displayed in Figure 5. The results from the IRF are: 1) the response of promotion to a sales shock shows that, after a small change in sales, promotion reverts to its mean level within the next 10 periods, and 2) the response of sales to a promotion shock shows a magnification phase up to period 5, in the form of a small hump, i.e., a promotion shock does increase sales in the short term.

Last but not least, I compute the forecast error variance decomposition (FEVD), an alternative method to the impulse response functions for examining the effects of shocks to the dependent variables. Having seen that sales respond positively to a promotion shock, I use the FEVD to determine how much of the forecast error variance is explained by innovations to each variable over a series of time horizons. Table 9 summarizes the FEVD of promotion and shows that its forecast error variance is mostly accounted for by sales innovations (about 81% at period 1 and about 85% from period 8 onward). This points to a close link between the two series and is in line with the IRF results, which show that promotion shocks increase sales in the near term.

Figure 5: Impulse responses from P (log (promotion)) to S (log (sales)) (top panel) and S (log (sales)) to P (log (promotion)) (bottom panel)
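The mechanics behind an impulse response can be sketched with the recursion Phi_0 = I, Phi_h = A1 * Phi_(h-1) + A2 * Phi_(h-2) for a VAR(2). The Python example below plugs in the point estimates from Tables 6a and 6b, with the variables ordered as (S, P). It is a rough illustration under stated assumptions: it ignores the constant and trend, produces non-orthogonalized responses to unit shocks, and carries no confidence bands, so it need not match the curves in Figure 5. The helper names matmul, matadd and impulse_responses are hypothetical.

```python
# Lag-1 and lag-2 coefficient matrices; rows are equations (S, P),
# columns are the lagged variables (S, P), taken from Tables 6a and 6b.
A1 = [[0.6305, -0.3252],
      [0.3219, 0.0186]]
A2 = [[-0.1004, 0.4608],
      [-0.1871, 0.4476]]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(x, y):
    """Element-wise sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def impulse_responses(a1, a2, horizon):
    """Non-orthogonalized impulse responses Phi_0 .. Phi_horizon of a VAR(2)."""
    phis = [[[1.0, 0.0], [0.0, 1.0]]]          # Phi_0 is the identity
    for h in range(1, horizon + 1):
        term = matmul(a1, phis[h - 1])
        if h >= 2:
            term = matadd(term, matmul(a2, phis[h - 2]))
        phis.append(term)
    return phis


phis = impulse_responses(A1, A2, horizon=10)
# Entry [0][1] is the response of sales (S) to a unit shock in promotion (P).
sales_to_promotion = [phi[0][1] for phi in phis]
print([round(v, 4) for v in sales_to_promotion])
```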

Period   S      P
1        .807   .192
4        .846   .154
8        .849   .151
15       .849   .151
20       .849   .151
30       .849   .151

Table 9: Forecast error variance decomposition of the P (log(promotion)) equation

Conclusion

To sum up, the analysis of the relationship between sales and promotion yields the following insights:
1) There is no cointegration relationship between sales and promotion.
2) A VAR(2) model based on differences suggests that the sales and promotion series are associated with each other, i.e., the current week's sales or promotion is positively related to the previous week's sales and to promotion at lag 2. The Granger non-causality tests point towards a similar conclusion.
3) Diagnostic tests applied to the VAR(2) model suggest a reasonable fit of the data: the error terms are not autocorrelated, and normality and homoskedasticity are rejected at the 5% level but not at the 1% level.
4) The impulse response function and the forecast error variance decomposition illustrate that, after a promotion shock, sales improve in the short term (about 5 weeks).
