Modeling Time Series With Calendar Variation
To cite this article: W. R. Bell & S. C. Hillmer (1983) Modeling Time Series with Calendar Variation, Journal of the American
Statistical Association, 78:383, 526-534
Modeling Time Series With Calendar Variation
W. R. BELL and S. C. HILLMER*
The modeling of time series data that include calendar variation is considered. Autocorrelation, trends, and seasonality are modeled by ARIMA models. Trading day variation and Easter holiday variation are modeled by regression-type models. The overall model is a sum of ARIMA and regression models. Methods of identification, estimation, inference, and diagnostic checking are discussed. The ideas are illustrated through actual examples.

KEY WORDS: Calendar variation; Trading day variation; Easter holiday variation; ARIMA models; Monthly time series.

* W. R. Bell is Mathematical Statistician, U.S. Census Bureau, Washington, DC 20233. S. C. Hillmer is Assistant Professor, School of Business, University of Kansas, Summerfield Hall, Lawrence, KS 66045. This research was partially supported while the authors were participants in the ASA-Census Research Fellowship Program. This program was funded by the U.S. Census Bureau and the National Science Foundation, and their support was instrumental in completing this research. The authors also acknowledge the many stimulating discussions with Census Bureau employees that helped shape the ideas of the research, and they wish to thank the referees and an associate editor for a number of helpful suggestions. The second author was partially supported by the Department of Commerce (Census Bureau) through JSA 81-2.

© Journal of the American Statistical Association, September 1983, Volume 78, Number 383, Applications Section

1. INTRODUCTION

Suppose we observe a time series $Z_t$ that follows the model (perhaps after transformation)

$$Z_t = f(X_t; \xi) + N_t. \tag{1.1}$$

Here $f$ is a function of $\xi$, a vector of parameters, and of $X_t$, a vector of fixed independent variables observed at time $t$, and $N_t$ is a noise series. If $N_t$ is white noise, then (1.1) is the familiar linear or nonlinear regression model. However, when one deals with time series, $N_t$ will generally be autocorrelated and frequently nonstationary. Numerous authors have warned against the consequences of using standard regression theory when $N_t$ is autocorrelated, the problem being well established as long ago as Anderson (1954).

In this article we are concerned with the converse problem, that of the effects of ignoring $f(X_t; \xi)$ when analyzing a time series. In the particular case we consider, $f(X_t; \xi)$ represents trading day and holiday effects. For this case we illustrate the important points that (a) pure ARIMA models should not be applied blindly to all time series, (b) to ignore known, relevant independent variables is to invite difficulties, and (c) substantial improvements in models can be obtained when relevant independent variables are incorporated in the model.

2. MODEL-BUILDING PROCEDURES

In developing models of the form (1.1) for a specific set of data we follow the three-stage model-building procedure of identification, estimation, and diagnostic checking presented in Box and Jenkins (1976). In (1.1) we assume that $N_t$ follows the ARIMA model

$$\phi(B)\,\delta(B)\,N_t = \theta(B)\,a_t, \tag{2.1}$$

where $B$ is the backshift operator ($BN_t = N_{t-1}$), $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$ have all their zeros outside the unit circle, $\phi(B)$ and $\theta(B)$ have no common zeros, $\delta(B)$ is a differencing operator (all zeros on the unit circle) such as $(1 - B)$ or $(1 - B)(1 - B^{12})$, and $\{a_t\}$ is a sequence of independent, identically distributed (iid) random variables with mean 0 and variance $\sigma^2$. Some of the $\phi$'s and $\theta$'s may be 0 or otherwise constrained, so that (2.1) could be a multiplicative seasonal model.

2.1 Model Identification

The regression portion of the model, $f(X_t; \xi)$, can be identified by consideration of the nature of the independent variables, which in our case describe the trading day or holiday variation. To identify the noise model (2.1) we first examine the sample autocorrelation function (SACF) of the time series $Z_t$. In our experience with series containing trading day or holiday variation, examination of the SACF of $Z_t$ is useful for determining the degree of differencing, $\delta(B)$, in $N_t$. We believe this is so because the effect of the nonstationary $N_t$ on the computed sample autocorrelations dominates the effect of the trading day or holiday variation. In contrast, after $Z_t$ (and thus $N_t$) has been appropriately differenced, the effect of the differenced $N_t$ on the computed sample autocorrelations no longer dominates the effect of the differenced $f(X_t; \xi)$. The SACF and sample partial autocorrelation function (SPACF) of the differenced $Z_t$ series are usually confused. At this stage we must at least approximately remove the effects of $f(X_t; \xi)$ from $Z_t$. To do this we fit the model

$$\delta(B) Z_t = \delta(B) f(X_t; \xi) + e_t \tag{2.2}$$

by least squares regression (linear or nonlinear) and examine the SACF and SPACF of the residuals from this regression in order to tentatively identify the noise model. A justification for this procedure is that the sample autocorrelations and hence the sample partial autocorrelations of the residuals from the least squares fit of (2.2) differ from those of $\delta(B)N_t$ by an amount that converges in probability to zero (see Fuller 1976, p. 399). This procedure is illustrated by two examples later in this article.

2.2 Model Estimation

Combining (1.1) and (2.1), we can write our model as

$$\delta(B) Z_t = \delta(B) f(X_t; \xi) + \frac{\theta(B)}{\phi(B)}\, a_t. \tag{2.3}$$

We can then estimate $\xi$, $\phi$, and $\theta$ in (2.3) by maximum likelihood methods assuming normality of the $a_t$'s. We estimate $\sigma^2$ by $\hat{\sigma}^2 = (n - r)^{-1} \sum \hat{a}_t^2$, where $n$ is the number of observations less the degree of $\delta(B)\phi(B)$, $r$ is the number of parameters in (2.3), and

$$\hat{a}_t = \hat{\theta}(B)^{-1}\, \hat{\phi}(B)\, \delta(B)\, [Z_t - f(X_t; \hat{\xi})].$$

Since the model for $N_t$ is invertible this is asymptotically equivalent to nonlinear least squares.

Pierce (1971) discusses inference for the model (1.1) for the case in which $f(X_t; \xi)$ is linear in $\xi$. He shows that, under some conditions on the $a_t$'s and the $X_t$'s, the least squares estimates $\hat{\nu} = (\hat{\xi}, \hat{\phi}, \hat{\theta})$ are consistent and asymptotically normal, $\hat{\xi}$ is asymptotically independent of $(\hat{\phi}, \hat{\theta})$, and $\hat{\sigma}^2$ is a consistent estimator of $\sigma^2$. Also, the $(i, j)$th element of the inverse of the asymptotic covariance matrix of $\hat{\nu}$ can be approximated numerically by $-(\partial^2 L / \partial \nu_i\, \partial \nu_j)\,|_{\hat{\nu}}$, where $L$ is the log-likelihood. Hannan (1971) and Gallant and Goebel (1976) obtain results analogous to those of Pierce for the case in which $f(X_t; \xi)$ is nonlinear in $\xi$, although they do not explicitly consider the asymptotic properties of $\hat{\phi}$ and $\hat{\theta}$. They require the additional assumptions of continuity of $f(X_t; \xi)$ for the consistency of $\hat{\xi}$ (Hannan 1971) and twice differentiability for the asymptotic normality.

2.3 Diagnostic Checking

In general, the adequacy of both the assumed formulation of $f(X_t; \xi)$ and the assumed noise model $\phi(B)\delta(B)N_t = \theta(B)a_t$ should be checked. To check the form of $f(X_t; \xi)$ the residuals, $\hat{a}_t$, can be plotted against the $X_{jt}$ and any other possible independent variables. The $\hat{a}_t$ should be plotted against time to check for outliers, constancy of variance, and trends. The SACF of the residuals should be examined for any large autocorrelations. Ljung and Box (1978) show that under the hypothesis that the model is correct, for large $n$ the statistic

$$Q = n(n + 2) \sum_{k=1}^{L} r_k(\hat{a})^2 / (n - k)$$

3. TRADING DAY AND HOLIDAY VARIATION

The variation in a monthly time series that is due to the changing number of times each day of the week occurs in a month is called trading day variation. Trading day variation occurs when the activity of a business or industry varies with the days of the week so that the activity for a particular month partially depends on which days of the week occur five times. In addition, Young (1965) notes that accounting and reporting practices can create trading day effects in a time series. For example, stores that perform their bookkeeping activities on Fridays tend to report higher sales in months with five Fridays than in months with four Fridays. Holiday variation refers to fluctuations in economic activity due to changes from year to year in the composition of the calendar with respect to holidays. The primary example of this for U.S. economic series is the increased buying that takes place in some retail sales series just before Easter. This is a holiday effect since Easter falls on various dates in March and April. Holiday effects must be distinguished from seasonal effects, which are attributable to the same month every year. For instance, the increase in retail sales in December prior to Christmas each year is a seasonal effect and not a holiday effect.

Almost all of the previous research on trading day and holiday effects has dealt with their relation to seasonal adjustment. Young (1965) describes the procedures that are used in the Census X-11 seasonal adjustment program to adjust time series for trading day variation, and briefly discusses the adjustments made for holiday variation. Cleveland and Devlin (1980, 1982) have reported on methods to identify times in which trading day effects are present in a time series and on methods to remove these effects. Pfefferman and Fisher (1980) discuss adjustments for both trading day and holiday variation. All of these authors use a two-stage approach in which a regression model is fitted to data that have been preprocessed to remove the trend and seasonality. We prefer to postulate a model of the form (1.1) and ARIMA noise structure and simultaneously estimate the regression and ARIMA parameters. Once a model of the form (1.1) has been developed, it can be used for a variety of purposes including forecasting and seasonal adjustment.

4. MODELING TRADING DAY VARIATION IN TIME SERIES

Trading day variation arises in part because the activity for a monthly time series varies with the days of the week. We assume that trading day effects can be approximated by a deterministic model. We deal only with flow series for which the data are the accumulation of the daily values
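The quantity behind trading day variation, the number of times each day of the week occurs in a given month, is easy to compute. The following is a minimal sketch using only the Python standard library; the function name `weekday_counts` is hypothetical and not part of the article.

```python
import calendar

def weekday_counts(year, month):
    """Return [Mon, Tue, ..., Sun]: how often each weekday occurs in the month."""
    # monthrange gives the weekday of the 1st (Monday == 0) and the month length
    first_weekday, n_days = calendar.monthrange(year, month)
    counts = [4] * 7  # the first 28 days contain every weekday exactly four times
    for extra in range(n_days - 28):
        # days 29, 30, 31 repeat the weekdays of days 1, 2, 3
        counts[(first_weekday + extra) % 7] += 1
    return counts

if __name__ == "__main__":
    for year, month in [(1969, 3), (1969, 4)]:
        print(year, month, weekday_counts(year, month))
```

The weekdays that receive a fifth occurrence change from month to month and from year to year, which is why the effect cannot be absorbed by a fixed seasonal pattern.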
6, and let $T_{7t} = \sum_{i=1}^{7} X_{it}$ denote the length of month $t$. Then we can write (4.1) as

$$TD_t = \sum_{i=1}^{7} (\xi_i - \bar{\xi})(X_{it} - X_{7t}) + \cdots$$

We get the same estimate for $TD_t$ whether we use the parameterization (4.1) or (4.3); however, we have observed that estimates of the $\xi_i$'s tend to be highly correlated while estimates of $\beta_1, \ldots, \beta_6$ are less so and are not highly correlated with the estimate of $\beta_7$. The parameters $\beta_i = \xi_i - \bar{\xi}$, $i = 1, \ldots, 6$, measure the differences between the Monday, Tuesday, ..., Saturday effects and the average of the daily effects, $\beta_7 = \bar{\xi}$. The difference between the Sunday effect and the average of the daily effects is then

$$\xi_7 - \bar{\xi} = \sum_{i=1}^{7} \xi_i - \bar{\xi} - \sum_{i=1}^{6} \xi_i = 6\bar{\xi} - \sum_{i=1}^{6} (\beta_i + \bar{\xi}) = -\sum_{i=1}^{6} \beta_i,$$

and one may solve for the Sunday effect, $\xi_7$, using $\beta_7 - \sum_{i=1}^{6} \beta_i$.

4.1 An Example

As an example, consider the series retail sales of lumber and building materials from January 1967 to September 1979 (the data for which may be obtained from the U.S. Census Bureau). Examination of a plot of the series reveals that the amplitude of the seasonality increases with the level. Therefore, we have determined that it is appropriate to model the natural logarithms, which we

so we examine the SACF of the residuals from the regression of $(1 - B)(1 - B^{12})Z_t$ on $(1 - B)(1 - B^{12})T_{it}$ for $i = 1, \ldots, 7$. From this SACF, Figure 2, the presence of the large negative value at lag 12 suggests the noise

[Plot: autocorrelations (vertical axis, -1 to 1) against lag (horizontal axis).]
Figure 2. SACF of Regression Residuals.
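The identification step behind Figure 2 (difference the series and the regressors, fit (2.2) by least squares, then inspect the SACF of the residuals) can be sketched as follows. This is an illustration under stated assumptions, not the authors' computation: the data are simulated, a single made-up regressor stands in for the seven $T_{it}$ variables, and the function names are hypothetical.

```python
import random

def difference(x, lags=(1, 12)):
    """Apply (1 - B^lag) for each lag in turn; lags=(1, 12) gives (1 - B)(1 - B^12)."""
    for lag in lags:
        x = [x[t] - x[t - lag] for t in range(lag, len(x))]
    return x

def ls_fit(y, x):
    """Least squares regression of y on one regressor x (no intercept)."""
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    residuals = [b - slope * a for a, b in zip(x, y)]
    return slope, residuals

def sacf(e, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of the sequence e."""
    n = len(e)
    mean = sum(e) / n
    d = [v - mean for v in e]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

if __name__ == "__main__":
    random.seed(0)
    n = 153
    x = [random.choice([4, 5]) for _ in range(n)]      # hypothetical regressor
    a = [random.gauss(0.0, 1.0) for _ in range(n)]     # white noise a_t
    # Simulated noise N_t with (1 - B)(1 - B^12) N_t = a_t - 0.8 a_{t-12}
    w = [a[t] - (0.8 * a[t - 12] if t >= 12 else 0.0) for t in range(n)]
    noise = []
    for t in range(n):
        past = [noise[t - k] if t - k >= 0 else 0.0 for k in (1, 12, 13)]
        noise.append(past[0] + past[1] - past[2] + w[t])
    z = [0.5 * xi + ni for xi, ni in zip(x, noise)]    # Z_t = f(X_t) + N_t
    slope, resid = ls_fit(difference(z), difference(x))
    r = sacf(resid, 24)
    print("estimated coefficient:", round(slope, 3))
    print("residual autocorrelation at lag 12:", round(r[11], 3))
```

In this simulated setting the differenced noise is a seasonal moving average, so a clearly negative residual autocorrelation at lag 12 is what one would expect to see, in the same spirit as the pattern the text reads off Figure 2.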
$\beta_1$    1.
$\beta_2$   -.54    1.
$\beta_3$   -.12   -.50    1.
$\beta_4$    .14   -.09   -.53    1.
$\beta_5$    .07    .14   -.12   -.51    1.
$\beta_6$   -.04    .05    .15   -.07   -.55    1.
$\beta_7$    .12   -.11    .13   -.14    .05    .09    1.

The parameter estimates for (4.9) are $\hat{\theta}_1 = .40$, $\hat{\theta}_{12} = .88$, and $\hat{\sigma}^2 = .0017$. The residual autocorrelations are plotted in Figure 3. By comparing the results of the fit for (4.5) with those of the fit for (4.9), we can judge the impact of the trading day effects upon this data set. While the residual autocorrelations from model (4.9) did not reveal any specific pattern, there are a number of moderately large sample autocorrelations. Furthermore, the value of
530 Journal of the American Statistical Association, September 1983
lems involved with modeling a time series affected by the varying placement of the Chinese New Year.

The earliest and latest dates on which Easter can fall are March 22 and April 25. Thus, for series in which increased buying takes place before Easter we expect the March and April values in any particular year to depend on the date of Easter.

Specifying a functional form for the effect of Easter is not as simple as doing so for trading day effects. To be rather general, let $\alpha_i$ denote the effect on the series being modeled on the $i$th day before Easter; let $h(i, t)$ be 1 when the $i$th day before Easter falls in the month corresponding to time point $t$, and 0 otherwise. Then the Easter effect at $t$, $E_t$, is

$$E_t = \sum_{i=1}^{K} \alpha_i\, h(i, t), \tag{5.1}$$

where $K$ denotes some suitable upper bound on the length of the effect in days. Since many time series that contain Easter variation also contain trading day variation, we consider the model

$$Z_t = TD_t + E_t + N_t, \tag{5.2}$$

where $TD_t$ is given by (4.3), $N_t$ by (2.1), and $E_t$ by (5.1), although we will need to simplify $E_t$.

The relationship (5.1) was derived by consideration of the daily impact of Easter on the level of the series. Unfortunately, in most situations the only data available are monthly values of the series; as a consequence, in practice we cannot estimate effects as general as (5.1). To illustrate, consider the placement of Easter for the years 1967 to 1979. We chose these particular years because they correspond to the time frame of an actual set of data that is considered later; however, conclusions similar to those that we draw for these years are relevant for other time periods. Figure 4 shows the Easter dates for these years and constitutes the experimental design for determining the effect of Easter. From the diagram it is evident

$H(\tau, t)$ can be defined as the proportion of the time period $\tau$ days before Easter that falls in the month corresponding to time point $t$. With this definition $H(\tau, t)$ can be defined for any $\tau > 0$. For fixed $t$, $H(\tau, t)$ is in general a continuous but nondifferentiable function of $\tau$. Figure 5 shows $H(\tau, t)$ for $t$ corresponding to March 1969 and April 1969, Easter having been on April 6 that year.

Patterns other than that leading to (5.3) are possible. However, because Easter seldom occurred in early April from 1967 to 1979 (see Figure 4), it is unlikely that complex patterns can be detected from the data. This situation may change as additional data covering different Easter dates become available. We illustrate an approach to checking the adequacy of our assumed pattern in Section 5.3.

5.1 Noise Model Identification

It is of interest to consider the effect of $E_t$ on the ACF of the original series and its differences. Figure 6 shows the SACF of $(1 - B)(1 - B^{12})H(14, t)$ (using January 1967 through September 1975 data), its most unusual features being the spikes at and near lags 36 and 48. Patterns in the SACF's for $H(\tau, t)$ for other $\tau$ and other time periods are similar. The degree to which these characteristics are transmitted to the original series depends on the magnitude of the Easter effect relative to $TD_t$ and $N_t$. However, spikes at these lags can be taken as a possible indication of Easter effects in a series, especially when they show up in the SACF of a residual series from a model that has no terms to account for Easter effects.

To illustrate noise model identification, we consider the example of monthly retail sales of shoe stores (U.S.) from January 1967 through September 1979, which is available from the Census Bureau. (The observation for January 1970 ($t = 37$) was found to be an outlier and was modified from 243 to 270.3 (millions of dollars). The effect of the outlier was estimated by fitting the model with an indicator variable at $t = 37$.) We found it appropriate to ana-
$= -\frac{n}{2} \ln \hat{\sigma}^2(\tau) + \text{constant}$ (where $\hat{\sigma}^2(\tau)$ is the estimate of $\sigma^2$ for fixed $\tau$) and maximizing this over $\tau$. Table 2 gives $\hat{\sigma}^2(\tau)$ for the shoe store

[Plot: $H(\tau, t)$ (vertical axis, 0.0 to 1.0) against $\tau$ (horizontal axis, 0 to 25); curve labeled April 1969.]
Figure 5. $H(\tau, t)$.

[Plot: autocorrelations (vertical axis, -1 to 1) against lag (horizontal axis).]
Figure 7. SACF of $(1 - B)(1 - B^{12})Z_t$.
Table 2. Estimation of $\tau$

$\tau$    $100\hat{\sigma}^2(\tau)$    $\hat{\alpha}(\tau)$    $\hat{\sigma}(\hat{\alpha}(\tau))$
 1    .212    .13    .0129
 2    .182    .15    .0127
 3    .177    .15    .0125
 4    .176    .15    .0124
 5    .175    .15    .0123
 6    .167    .16    .0123
 7    .163    .16    .0123
 8    .162    .16    .0122
 9    .161    .16    .0122
10    .160    .17    .0123
11    .161    .17    .0126
12    .164    .17    .0128
13    .167    .17    .0130
14    .171    .17    .0135
15    .171    .18    .0141
16    .173    .18    .0146
17    .175    .19    .0151
18    .176    .19    .0157
19    .178    .20    .0164
20    .180    .21    .0169
21    .183    .21    .0176
22    .183    .22    .0183
23    .183    .23    .0192
24    .183    .24    .0200
25    .183    .25    .0209
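Because the log-likelihood for fixed $\tau$ is a decreasing function of $\hat{\sigma}^2(\tau)$, maximizing it over $\tau$ amounts to picking the $\tau$ with the smallest $\hat{\sigma}^2(\tau)$. A minimal sketch of that grid search, using the $100\hat{\sigma}^2(\tau)$ row of Table 2 (the variable names are hypothetical):

```python
# 100 * sigma_hat^2(tau) for tau = 1, ..., 25, copied from Table 2
sigma2_x100 = [
    .212, .182, .177, .176, .175, .167, .163, .162, .161, .160, .161, .164, .167,
    .171, .171, .173, .175, .176, .178, .180, .183, .183, .183, .183, .183,
]

# Map each tau to its residual variance estimate and take the minimizer
by_tau = {tau: value for tau, value in enumerate(sigma2_x100, start=1)}
tau_hat = min(by_tau, key=by_tau.get)

if __name__ == "__main__":
    print("tau_hat =", tau_hat, "with 100*sigma_hat^2 =", by_tau[tau_hat])
```

The values between roughly $\tau = 8$ and $\tau = 11$ are nearly equal, so the data determine $\tau$ only loosely around the minimizer.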
$H_0'$: $\alpha_1 = \cdots = \alpha_j$, $\alpha_{j+1} = \cdots = \alpha_6 = 0$

against $H_1$. Table 3 presents (asymptotic) likelihood ratio test statistics for the shoe stores example computed as

$$\frac{[\mathrm{RSS}(H_0) - \mathrm{RSS}(H_1)] / \nu_1}{\mathrm{RSS}(H_1) / \nu_2}$$

and similarly for $H_0'$, where RSS denotes the residual sum of squares. The numerator degrees of freedom, $\nu_1$, is 1 for testing $H_0$ and $j - 1$ for testing $H_0'$. The denominator degrees of freedom, $\nu_2$, is 153 - 13 (for differencing) - 1 (outlier) - 7 (TD parameters) - 2 ($\theta_1$ and $\theta_{12}$) - $j$ = 130 - $j$. The 5% and 1% critical values for the $F(\nu_1, \nu_2)$ distribution are also reported. The test statistics do not indicate that $\alpha_j \neq 0$ for $j > 3$. Also, there is no reason to reject the assumption that $\alpha_1 = \alpha_2 = \alpha_3$. We conclude that for this example the data give no evidence that the simplified Easter effect given by (5.3) is inadequate.

In addition to checking the Easter effect, we also should check the adequacy of the noise model (5.4). The sample autocorrelations of the residuals for the shoe stores series (using the model (5.5) with $\tau = \hat{\tau} = 10$) are all within plus or minus two standard errors of zero with the exception of $r_5(\hat{a})$, which is 2.7 standard errors below zero. The Ljung-Box $Q$ statistic for 36 lags is 46.3. Since this is less than 48.6, the $\chi^2_{.05}(34)$ critical value, we conclude that the residuals appear to be white noise.

Table 3. Investigating $\alpha_1 = \ldots$ $\alpha_6 = 0$

$j$    F-statistic for $H_0$    F-statistic for $H_0'$    $F_{.05}(j - 1, 130)$    $F_{.01}(j - 1, 130)$
1    90.3
2    12.1    .9    3.9    6.8
3     5.7    .6    3.0    4.8
4      .0
5     2.7
6      .3

$F_{.05}(1, 130) = 3.9$    $F_{.01}(1, 130) = 6.8$

NOTE: The $F(\nu_1, 130 - j)$ critical values are very close to the $F(\nu_1, 130)$ critical values.

5.4 Ignoring Trading Day and Easter Effects

As in the example of Section 4 it is of interest to investigate the influence of the trading day and Easter holiday terms in model (5.5) by fitting the model without these terms, which is

$$(1 - B)(1 - B^{12}) Z_t = (1 - \theta_1 B)(1 - \theta_{12} B^{12})\, a_t. \tag{5.7}$$

The parameter estimates for model (5.7) are $\hat{\theta}_1 = .68$, $\hat{\theta}_{12} = .93$, and $\hat{\sigma}^2 = .00333$. The residual autocorrelations are plotted in Figure 9. From Figure 9 there appear to be a number of moderately large $r_k(\hat{a})$'s at low lags, but there is not a recognizable pattern that would suggest a modification if a pure ARIMA model is to be used. Also, the behavior of the $r_k(\hat{a})$'s near lags 36 and 48 resembles that in Figure 6, indicating the presence of the Easter effect. The Ljung-Box $Q$ statistic based upon 36 lags is 79.5, which is larger than $\chi^2_{.01}(34) = 56.1$. Thus, we would reject the hypothesis that the residuals from model (5.7) were white noise. The model (5.7) has obvious inadequacies, and there is about a 50 percent reduction in the residual sum of squares when the trading day and Easter influences are appropriately modeled.

6. CONCLUSIONS

In the time series literature the model (1.1) has been considered from a theoretical viewpoint; however, in many applications there has been an apparent tendency either to consider pure regression models or to consider pure ARIMA time series models. We have argued that there are situations in which a combination of these two models is superior. As particular examples we considered in detail the cases of time series that include trading day variation and Easter holiday variation. These two particular examples are important because there are many time
series that contain one or both of these effects. The actual time series we considered indicate that substantial improvements over pure ARIMA models can be achieved if trading day and Easter effects are appropriately modeled. We hope that from this research more model builders will become aware of trading day and Easter variations and, as a result, will be in a better position to handle them.

[Received March 1981. Revised February 1983.]

REFERENCES

ANDERSON, R.L. (1954), "The Problem of Autocorrelation in Regression Analysis," Journal of the American Statistical Association, 49, 113-129.
BOX, G.E.P., and JENKINS, G.M. (1976), Time Series Analysis: Forecasting and Control, San Francisco: Holden Day.
CLEVELAND, W.P., and GRUPE, M.R. (1982), "Modeling Time Series When Calendar Effects are Present," to appear in Proceedings of the Conference on Applied Time Series Analysis of Economic Data, ed. Arnold Zellner.
CLEVELAND, W.S., and DEVLIN, S.J. (1980), "Calendar Effects in Monthly Time Series: Detection by Spectrum Analysis and Graphical Methods," Journal of the American Statistical Association, 75, 487-496.
CLEVELAND, W.S., and DEVLIN, S.J. (1982), "Calendar Effects in Monthly Time Series: Modeling and Adjustment," Journal of the American Statistical Association, 77, 520-528.
FULLER, W. (1976), Introduction to Statistical Time Series, New York: John Wiley and Sons.
GALLANT, A.R., and GOEBEL, J.J. (1976), "Non-linear Regression with Autocorrelation Errors," Journal of the American Statistical Association, 71, 961-967.
HANNAN, E.J. (1971), "Non-linear Time Series Regression," Journal of Applied Probability, 8, 767-780.
LIU, L.M. (1979), "User's Manual for BMDQ2T Time Series Analysis," BMDP Technical Report.
LIU, L.M. (1980), "Analysis of Time Series with Calendar Effects," Management Science, 26, 106-112.
LJUNG, G.M., and BOX, G.E.P. (1978), "On a Measure of Lack of Fit in Time Series Models," Biometrika, 65, 297-303.
PFEFFERMAN, D., and FISHER, J. (1980), "Festival and Working Days Prior Adjustments in Economic Time Series," in Time Series, ed. O.D. Anderson, New York: North-Holland.
PIERCE, D.A. (1971), "Least Squares Estimation in the Regression Model With Autoregressive-moving Average Errors," Biometrika, 58, 299-312.
YOUNG, A.H. (1965), "Estimating Trading-Day Variation in Monthly Economic Time Series," Technical Paper 12, Bureau of the Census.