The Unobservable Components Model
Notation of UCM
1 This presentation relies heavily on the material contained in the SAS HELP file under the keyword "Proc UCM".
y_t = \mu_t + \gamma_t + \psi_t + r_t + \sum_{i=1}^{p} \phi_i\, y_{t-i} + \sum_{j=1}^{m} \beta_j\, x_{jt} + \varepsilon_t .     (1)
In equation (1), y_t represents the time series to be modeled and forecast, μ_t the trend component, γ_t the seasonal component, ψ_t the cyclical component, r_t the autoregressive component, and ε_t the irregular component. All of these components are assumed to be unobserved and must be estimated given the time series data on y_t and x_jt, hence the name unobserved components model. In addition, (1) allows the inclusion of the autoregressive regression terms Σ_{i=1}^{p} φ_i y_{t-i} and the explanatory regression terms Σ_{j=1}^{m} β_j x_{jt}, the former representing the "momentum" of the time series as it relates to its past observations and the latter representing the causal factors that one is willing to suppose affect the time series in question.
In traditional time series analysis it is often assumed that a time series y_t can be additively decomposed into four components, namely, trend, season, cycle, and irregular components, as in

y_t = T_t + S_t + C_t + I_t     (2)

where T_t represents the trend in y_t at time t, S_t the seasonal effect at time t, C_t the cyclical effect at time t, and I_t the irregular effect at time t. Obviously, the UCM model (1) employs this decomposition but, in addition, allows unobserved autoregressive effects and explanatory regression effects, making it a very powerful model indeed. One of the major advantages of the UCM (1) is its interpretability and, as we will see, PROC UCM in SAS provides some very nice graphical representations of this decomposition.
To illustrate, consider a simulated monthly series generated according to

y_t = T_t + C_t + S_t + I_t = 100 + 4.0\,t + 50\cos(0.31416\,t) + \sum_{i=1}^{12} \delta_i\,\mathrm{dum}_i + \varepsilon_t ,

where the monthly dummy coefficients δ_1, ..., δ_12 have magnitudes 50, 25, 25, 25, 50, 50, 75, 50, 5, 25, 50, and 20, with δ_1 = -50 and δ_2 = -25 (see the intercept calculations below).
The seasonal dummies above are defined as dum_i = 1 if the observation falls in the i-th month and zero otherwise. Thus the intercept varies by month. For example, the intercept for all of the January observations is (100 - 50 = 50), the intercept for the February observations is (100 - 25 = 75), etc. The irregular component is represented by the unobserved error ε_t, which is normally and independently distributed with mean zero and variance 100.
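A data step along the following lines could generate such a series; this is only a sketch (the data set name, random seed, series length, and the handling of the remaining monthly dummy coefficients are illustrative rather than taken from the original program).

data sim;                                   /* illustrative data set name, seed, and length */
array dum{12} dum1-dum12;
do t = 1 to 120;
   month = mod(t-1, 12) + 1;                /* month of observation t */
   do i = 1 to 12;
      dum{i} = (month = i);                 /* dum_i = 1 in month i and 0 otherwise */
   end;
   e = 10*rannor(4321);                     /* irregular term: N(0, 100) */
   y = 100 + 4.0*t + 50*cos(0.31416*t)
       - 50*dum1 - 25*dum2                  /* the remaining monthly dummy terms enter  */
       + e;                                 /* with their own coefficients the same way */
   output;
end;
drop i e;
run;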
Figure 1
Figure 2
Figure 3
Figure 4
Figure 6
Figure 7
And in Figure 8 below we have the representation of the fitted model obtained by
Proc Nlin (a Nonlinear Least Squares procedure).
Figure 8
which closely matches the population parameters. For example, the January intercept is estimated as 51.74, the February intercept is 51.74 + 21.36 = 73.10, etc. The slope of the trend is estimated as 3.98, the amplitude of the cycle as 50.76, the phase as -0.0314, and the period as p = 2π/ω = 2(3.1416)/0.3140 = 20.01 months.
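A sketch of the kind of PROC NLIN step that could produce such estimates is given below; the data set and variable names carry over from the simulation sketch above and are assumptions, and only the first two monthly dummy terms are written out.

proc nlin data=sim;
   /* starting values chosen near the population values of the simulated series */
   parms b0=100 b1=4 amp=50 w=0.3 phase=0 d1=-50 d2=-25;
   model y = b0 + b1*t + amp*cos(w*t + phase)
             + d1*dum1 + d2*dum2;           /* the remaining terms d3*dum3, ..., d12*dum12
                                               are added in the same fashion */
run;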
In the spirit of the above empirical model, we now move to cover the parts of the
UCM (1) in more detail.
II. Modeling the Trend
\mu_t = \mu_{t-1} + \eta_t , \qquad \eta_t \sim \mathrm{NIID}(0, \sigma^2_\eta)     (3)

This model, where "NIID(0, σ²_η)" reads "distributed as a normal, independently and identically distributed random variable with mean zero and variance σ²_η," is referred to as the Random Walk (RW) trend model in the UCM. This model is especially appropriate for time series data that are flat and slow-turning. Notice that if σ²_η = 0 then μ_t = μ_{t-1} for all t and therefore μ_t = μ_0 = constant. In this special case, the data are expected to revert back to the mean μ_0 in fairly short order. A simulation of the Random Walk trend model is presented in the following graph as produced by the SAS program Stochastic Level Model.sas with μ_0 = 0 and σ²_η = 1.0.
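A minimal sketch of the kind of data step such a program might contain (the data set name, seed, and series length are illustrative):

data rw_level;                     /* a sketch of what Stochastic Level Model.sas might contain */
mu = 0;                            /* mu_0 = 0 */
do t = 1 to 200;
   mu = mu + rannor(12345);        /* eta_t ~ NIID(0, 1), i.e., sigma^2_eta = 1.0 */
   y = mu;                         /* the plotted series is the level itself (no irregular term assumed) */
   output;
end;
run;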
Figure 9
Notice that the data, as expected, are flat and slow-turning. Just as a point of note, we will see later that when a time series has a random walk level (3), it can be thought of as following an ARIMA(0,1,1) Box-Jenkins model, hence the "loose" relationship between Box-Jenkins models and this simple UCM model we previously alluded to.
Figure 10
In this model the trend is characterized by the following level and slope equations:

\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t , \qquad \eta_t \sim \mathrm{NIID}(0, \sigma^2_\eta)     (4)
\beta_t = \beta_{t-1} + \xi_t , \qquad \xi_t \sim \mathrm{NIID}(0, \sigma^2_\xi)     (5)

Here μ_t represents the stochastic level of the trend while β_t represents the stochastic slope of the trend. Furthermore, it is assumed that η_t and ξ_t are independent of each other. In the presence of both a stochastic level and a stochastic slope the data need not have a linear trend but can have a trend with the curvature (slope) of the data slowly evolving as well. See Figure 11 below, which has been generated by the SAS program Stochastic Level _ Stochastic Slope.sas with μ_0 = 0, σ²_η = 1, and σ²_ξ = 4. Again, as a point of note, we will see later that when a time series has a stochastic level and stochastic slope making up the trend as in (4) and (5), it can be thought of as following an ARIMA(0,2,2) Box-Jenkins model and hence, again, the "loose" relationship between Box-Jenkins models and this UCM model we previously alluded to.
If σ²_ξ = 0 then essentially you have a random walk with fixed drift, β_0, which would be appropriate for data that have a linear-appearing trend. See Figure 12 below, which has been generated by the SAS program Stochastic Level _ Fixed Slope.sas: with σ²_η = σ²_ε = 16, by setting the slope error variance to zero (σ²_ξ = 0) and setting β_0 = 2, we get the fixed slope of 2.0. Notice the data are now slowly turning around a fixed trend (drift) of 2.0. If, in addition, β_0 = 0 then we are back to the Random Walk without drift model of trend in (3) above and depicted in Figure 9 above. If both error variances are equal to zero (σ²_η = σ²_ξ = 0), the resulting model becomes the deterministic linear time trend model: y_t = μ_0 + β_0 t + ε_t. See Figure 13 below, which depicts a deterministic trend time series generated by the SAS program Deterministic Trend.sas where the trend equation is given by y_t = 100 + 4t + ε_t with ε_t ~ NIID(0, 400). In this latter case the data are expected to revert fairly quickly back to the deterministic trend. In the example to be examined later, Proc UCM in SAS provides some very useful diagnostics for determining whether components are stochastic or non-stochastic, the significance of the components, and the goodness-of-fit of proposed UCM models.
Figure 11
Figure 12
Figure 13
Thus we can see that the Stochastic Level _ Stochastic Slope model can be specialized in
several different ways as represented by Figures 9 – 13 simply by setting some of the
variances of the level and slope components (σ²_η and σ²_ξ) equal to zero. In fact, Proc
UCM in SAS provides t-tests of the significance of these error variances and based upon
them the level and slope components can be modeled as being fixed as compared to being
stochastic. See the below example for an illustration of these tests.
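In Proc UCM these restrictions are imposed with the VAR= and NOEST options of the level and slope statements, the same options used in the airline example later in this document; a sketch with placeholder data set and variable names:

proc ucm data = mydata;            /* placeholder data set and variable names */
model y;
irregular;
level;                             /* stochastic level */
slope var = 0 noest;               /* slope error variance fixed at zero: random walk with fixed drift */
/* adding "var = 0 noest" to the level statement as well gives the deterministic linear trend */
estimate;
run;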
Proc UCM provides two basic ways of modeling the unobserved cyclical component ψ_t: a deterministic (non-stochastic) trigonometric cycle (or cycles) and a stochastic trigonometric cycle. We will discuss these two representations in turn.
The deterministic trigonometric cycle with frequency λ (0 < λ < π) can be written as

\psi_t = a \cos(\lambda t) + b \sin(\lambda t) .     (6)

The stochastic trigonometric cycle generalizes (6) by allowing the cycle to evolve over time:

\begin{bmatrix} \psi_t \\ \psi^*_t \end{bmatrix}
= \rho \begin{bmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{bmatrix}
\begin{bmatrix} \psi_{t-1} \\ \psi^*_{t-1} \end{bmatrix}
+ \begin{bmatrix} \nu_t \\ \nu^*_t \end{bmatrix}     (7)

where 0 < ρ ≤ 1 is a damping factor and the disturbances ν_t and ν*_t are independently distributed as N(0, σ²_ν) random variables. This model can capture quite complex cyclical patterns in economic and business time series without introducing an abundance of parameters. If ρ < 1, ψ_t has a stationary distribution with mean zero and variance σ²_ν/(1 - ρ²). If ρ = 1, ψ_t is non-stationary. Of course, if σ²_ν = 0 we revert to the deterministic cyclical model (6).
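A quick check of the stationary-variance claim: assuming ρ < 1 and a stationary solution with Var(ψ_t) = Var(ψ*_t) = σ²_ψ and Cov(ψ_{t-1}, ψ*_{t-1}) = 0, taking the variance of the first row of (7) gives

\operatorname{Var}(\psi_t) = \rho^2\left[\cos^2\lambda\,\operatorname{Var}(\psi_{t-1}) + \sin^2\lambda\,\operatorname{Var}(\psi^*_{t-1})\right] + \sigma^2_\nu
\;\Longrightarrow\; \sigma^2_\psi = \rho^2\,\sigma^2_\psi + \sigma^2_\nu
\;\Longrightarrow\; \sigma^2_\psi = \frac{\sigma^2_\nu}{1-\rho^2}.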
Most economic and business time series exhibit seasonality. Assume that we are
analyzing monthly time series data. A rough definition of seasonality can be expressed
as follows: For any given month, deviations from trend tend to be of the same sign from
one year to the next. For example, December toy sales tend to be above the secular trend
of toy sales because of the holiday buying habits of consumers. In contrast, January toy
sales tend to be below the secular trend because of the lack of a child-oriented holiday in
January. The same can be said for quarterly data and its similar year-over-year variation relative to the secular trend. One model for such
seasonal variation is called the Stochastic Dummy Variable Seasonal model discussed
below.
Let there be s seasons during the year: s = 12 for monthly data, s = 4 for quarterly data, and s = 2 for bi-annual data. Consider the following model for the seasonal effect γ_t at time t:

\sum_{i=0}^{s-1} \gamma_{t-i} = \omega_t , \qquad \omega_t \sim \mathrm{NIID}(0, \sigma^2_\omega) .     (8)

In this model the seasonal effects over any full year sum to a zero-mean disturbance, and their stochastic nature allows them to evolve either slowly over time (when σ²_ω is small) or quickly over time (when σ²_ω is large).
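For example, with quarterly data (s = 4), equation (8) says that the four most recent seasonal effects sum to a zero-mean disturbance, so the current effect is minus the sum of the previous three effects plus noise:

\gamma_t = -\left(\gamma_{t-1} + \gamma_{t-2} + \gamma_{t-3}\right) + \omega_t , \qquad \omega_t \sim \mathrm{NIID}(0, \sigma^2_\omega).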
In the special case where σ²_ω = 0 in (8), we have the so-called Deterministic Dummy Variable Seasonal model. In this model the seasonal effects γ_1, γ_2, ..., γ_s are fixed and do not vary over time, in contrast to the stochastic specification in (8). In this case a test of the absence of seasonality in the time series data being analyzed amounts to testing the null hypothesis H_0: γ_1 = γ_2 = ⋯ = γ_{s-1} = 0, where the sum constraint \sum_{i=1}^{s} \gamma_i = 0 implies that γ_s = 0 as well.
A trigonometric form of the seasonal component is also available in Proc UCM; we do not develop that specification further here. For more information on this specification, one can consult Harvey (1989) or the documentation of Proc UCM in SAS HELP.
Rather than modeling the cyclical nature of a time series via either the deterministic cyclical model (6) or the stochastic cyclical model (7), one can use the rather straightforward first-order autoregressive specification

r_t = \phi\, r_{t-1} + \upsilon_t , \qquad \upsilon_t \sim \mathrm{NIID}(0, \sigma^2_\upsilon) .     (9)
Moving beyond univariate time series modeling, one can specify regression terms for adding additional explanatory power: \sum_{j=1}^{m} \beta_j x_{jt} and \sum_{i=1}^{p} \phi_i y_{t-i}. The inputs x_jt are intended to provide the economic and business "causes and effects" that might help one in deriving more accurate forecasts of y_t. We will study the possibility of such causal variables when we turn to modeling multivariate time series later in this course.
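In Proc UCM, explanatory regressors are listed directly on the model statement, and lags of the dependent variable can be requested with the deplag statement; a sketch with placeholder data set and variable names:

proc ucm data = mydata;            /* placeholder data set and variable names */
id date interval = month;
model y = x1 x2;                   /* explanatory regression terms beta_1*x1(t) + beta_2*x2(t) */
deplag lags = 2;                   /* autoregressive terms phi_1*y(t-1) + phi_2*y(t-2) */
irregular;
level;
estimate;
run;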
VI. Obtaining the Estimates of the Unobservable Components and Other Statistics –
the Kalman Filter
Proc UCM estimates the unknown parameters of the model (the error variances) by maximizing the likelihood and then uses the Kalman filter and the associated smoother to produce estimates of the unobserved components. The observed series is thereby decomposed into the estimated trend component μ̂_t, estimated seasonal component γ̂_t, estimated cyclical component ψ̂_t, and estimated irregular component ε̂_t, represented in the graphs produced by Proc UCM. Therefore, at time t an observation on y can be additively decomposed as in y_t = μ̂_t + γ̂_t + ψ̂_t + ε̂_t, assuming the presence of unobserved
components for trend, season, cycle, and irregular effects. These estimated unobserved
component graphs will be presented in our example below. Other statistics such as
estimated error variances of the unobserved components, t-statistics of the significance of
the error variances, Chi-square statistics for gauging the significance of the various
components, and goodness-of-fit statistics are also produced by Proc UCM using the
Kalman filter and are useful for building a UCM model that fits the data well.
One approach for building a good UCM for a given time series is to build a
“Basic” structural model of the time series and then add to the Basic model as necessary.
In this spirit, Harvey (1989) has defined the following UCM as the Basic Structural
model (BSM):
y_t = \mu_t + \gamma_t + \varepsilon_t     (10)

where

\mu_t = \mu_{t-1} + \beta_{t-1} + \eta_t
\beta_t = \beta_{t-1} + \xi_t .
Thus, the BSM consists of the locally linear trend (LLT) model for trend and a seasonal
component of either the stochastic dummy variable form or the trigonometric form (not
discussed here).
Proc UCM;
model y;
irregular;
level;
slope;
season length = s type = dummy;
run;
The first line of the SAS code indicates the move to a procedure step in the SAS program
using the procedure UCM. The dependent variable to be modeled is y and the
Unobservable Components model is to have an irregular component, a level and slope
component in the trend, and the stochastic dummy variable seasonal specification (8) is
chosen for the seasonal component. Of course, in the above, s is replaced with 12 if the
data is monthly, 4 if the data is quarterly, and 2 if the data is bi-annual in nature. If this
model fits the time series data at hand well, then additional components can be added by way of cyclical and autoregressive unobservable components and by adding regression terms. In contrast, if the data do not contain a stochastic trend and/or do not have a seasonal component, the BSM can be correspondingly simplified.
A very frequently used time series for demonstrating the nature of a series with a linear trend and seasonality is the so-called Airline Passenger data originally published in Box and Jenkins (1970). This monthly series gives the number of airline passengers traveling per month over the period January 1949 through December 1960. As can be seen from the SAS graph below, the data have a linear trend and recurring seasonal deviations from that trend. These SAS graphs have been generated by a SAS program called BSM.sas that can be found in the appendix to this document.
In order to stabilize the variability of the data around the trend in the later years, Box and Jenkins (1970) recommended taking natural logarithms of the data and analyzing the logged series instead of the original series. This transformation of the data is plotted in the graph below.

Later in this course we will study statistical methods for determining the desirability of this transformation; these methods are contained in the SAS macro called %logtest. At any rate, we take the logarithmic form to be the preferred form in which to analyze these data.
Now let us turn to the analysis of the Airline data vis-à-vis the SAS program
BSM.sas contained in the appendix. First consider the output produced by the following
SAS code that fits a BSM to the Airline data.
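A sketch of that step, reconstructed from the description in the next paragraph (the exact title wording is assumed):

title 'BSM: stochastic level, stochastic slope, stochastic dummy seasonal';  /* title wording assumed */
proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope plot=smooth;
season length = 12 type=dummy plot=smooth;
estimate plot=(residual normal acf);
run;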
The first line of this SAS Procedure step simply provides SAS with a title for the output
to be produced by this part of the BSM.sas program. The ID for the plots is “date” and
the frequency of the data is monthly. The model statement tells SAS that “logpass” is the
time series to be modeled. The program then specifies the trend to have both the level
and slope unobservable components and a stochastic dummy variable seasonal. When
plotting the level, slope, and seasonal components we want the smoothed versions.
Finally, when estimating the model the program asks SAS to provide us with a plot of the
residuals, a histogram of the residuals with an overlaid plot of the normal distribution
implied by the variance of the residuals, and a plot of the autocorrelation function (acf) of
the residuals with 95% confidence intervals of the various autocorrelations. We will
discuss the details of the autocorrelation function later in this course.
So the data set we read in resides in the SAS WORK library and the time ID variable is "date". SAS also recognizes that "logpass" is the dependent variable to be analyzed by PROC UCM, and its time span is given along with minimum and maximum values and the like.
Skipping through some of the output that Proc UCM produces, the next output of
interest is the likelihood-based goodness-of-fit measures of the fitted model.
Bayesian Information Criterion    -350.35397
The full log likelihood is the function that Procedure UCM maximizes in getting the parameters (error variances) and unobserved components of the model. Its maximized value here is
217.42040. As the next two items, “Diffuse Part of Log-Likelihood” and “Normalized
Residual Sum of Squares” are not necessary for our discussion, we will leave their
definitions for the reader to read in SAS HELP and its detailed description of Proc UCM.
The next two goodness-of-fit statistics, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), are defined as follows:

\mathrm{AIC} = -2\ln(L) + 2k
\mathrm{BIC} = -2\ln(L) + k\ln(T)
where L denotes the full likelihood value of the fitted model, k is the number of free
parameters that are estimated in the chosen model, and T is the number of observations
used to estimate the candidate model. These goodness-of-fit criteria are useful for
discriminating among various competing UCM models. The specification that
minimizes these two measures is to be recommended over its competitors.
Now let’s consider some additional output produced by the above SAS program.
Final Estimates of the Free Parameters

Component    Parameter         Estimate       Approx Std Error    t Value    Approx Pr > |t|
Irregular    Error Variance    0.00012951     0.0001294           1.00       0.3167
Level        Error Variance    0.00069945     0.0001903           3.67       0.0002
Slope        Error Variance    2.64778E-12    1.24107E-9          0.00       0.9983
Season       Error Variance    0.00006413     0.00004383          1.46       0.1435
Significance Analysis of Components

Component    DF    Chi-Square    Pr > ChiSq
Season       11    772.21        <.0001

Summary of Seasons
In the BSM you will recall that the error variances of the irregular, level, slope, and season components are, respectively, σ²_ε, σ²_η, σ²_ξ, and σ²_ω. These are the "free parameters" of the model; their estimates are reported in the table labeled "Final Estimates of the Free Parameters" and are determined by considering all of the data that
is available on the dependent variable logpass. These estimates and their corresponding
t-values allow one to infer whether the corresponding component is non-stochastic (the
null hypothesis) or is stochastic (the alternative hypothesis). The results reported here
indicate that the slope component of the trend may be suitably modeled as being non-stochastic (fixed) rather than stochastic. This seems logical given the linear shape (as compared to having some curvature) of the logpass data. In the subsequent model we examine here, labeled BSM2, we will take this suggestion to heart. That is, we will compare the BSM model with a stochastic slope to BSM2 with a fixed slope. The other error variances range from middling to high significance, so we assume, for now, that these components are best modeled as being stochastic.
proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope var = 0 noest plot=smooth;
season length = 12 type=dummy plot=smooth;
estimate plot=(residual normal acf);
run;
This procedure step is just like the previous one except that the slope statement now includes "var = 0 noest", which specifies that the error variance for the slope component is to be set to zero and not estimated, implying that the slope coefficient in the trend is to be
estimated as a fixed parameter. The major diagnostic tables for this model are
reproduced below.
Final Estimates of the Free Parameters

Component    Parameter         Estimate      Approx Std Error    t Value    Approx Pr > |t|
Irregular    Error Variance    0.00012951    0.0001294           1.00       0.3167
Level        Error Variance    0.00069945    0.0001903           3.67       0.0002
Season       Error Variance    0.00006413    0.00004383          1.46       0.1435
Again all three of the components (level, fixed slope, and season) are statistically
significant. Moreover the fit of the BSM2 model is better than the fit of the initial BSM
model because the BSM2 model’s goodness-of-fit measures (AIC = -402.84 and BIC =
-355.32) are smaller than the corresponding goodness-of-fit measures for the BSM model
(AIC = -400.84 and BIC = -350.35). We have achieved an improvement in our original
specification in going from a stochastic slope specification to a fixed slope specification
in the trend.
In the BSM2 Table labeled “Final Estimates of the Free Parameters” you will
notice that the “season” error component is not highly significant (p = 0.1435). Thus one
might question whether or not we should specify, not only the slope coefficient, but also
the seasonal coefficients (there are 11 of them) to be non-stochastic (fixed). We consider
this issue by entertaining the BSM3 model which is estimated by the following Proc
UCM statements:
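A sketch of this step, assuming it is the BSM2 step with "var = 0 noest" added to the season statement:

proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope var = 0 noest plot=smooth;
season length = 12 type=dummy var = 0 noest plot=smooth;
estimate plot=(residual normal acf);
run;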
Here you will notice that the “season” statement now has the additional specification “var
= 0 noest” which specifies that the dummy variable seasonal component should now be
treated as fixed and estimated accordingly. The only detail we provide from the output
produced by this model is the comparison of its goodness-of-fit measures given next.
In comparing the AIC = -394.93 and BIC = -350.38 produced by this model with the AIC
and BIC measures when the seasonal component was treated as being stochastic (AIC =
-402.84 and BIC = -355.32) we see that the stochastic specification is to be preferred.
Evidently, the seasonal component in the Airline passenger data evolved slightly over
time (as we will see in a graph of the seasonal component presented in the final model we
choose later).
proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope var = 0 noest plot=smooth;
season length = 12 type=dummy plot=smooth;
cycle plot=smooth;                 /* the cycle statement added to the BSM2 step (options assumed to match) */
estimate plot=(residual normal acf);
run;
Notice that the “cycle” statement has been added to the previous code resulting in a
model we label as BSM4. The essential diagnostic tables for this model are presented
below.
Significance Analysis of Components

Component    DF    Chi-Square    Pr > ChiSq
Slope         1    46.93         <.0001
Cycle         2    38.12         <.0001
Season       11    356.74        <.0001
From these tables we see that including the cycle in our model has helped in that we have a better fit of the data (AIC = -418.59 and BIC = -362.166). Moreover, the seasonal component is highly significant, although from the t-statistic on the cycle error variance it is not clear whether we should be modeling the cycle as stochastic as in (7) or non-stochastic as in (6).
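A sketch of that code, assuming it is the BSM4 step with a second cycle statement added:

proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope var = 0 noest plot=smooth;
season length = 12 type=dummy plot=smooth;
cycle plot=smooth;
cycle plot=smooth;                 /* second stochastic cycle */
estimate plot=(residual normal acf);
run;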
Here a second “cycle” statement has been added to the SAS code. This model is referred
to as the BSM5 model. The relevant output for considering the second cycle is provided
below.
Bayesian Information Criterion    -348.59113
As we can see from the significance analysis of the components, the second cycle
is not significant at conventional levels (p=0.1389). Moreover, the goodness-of-fit of the
model has deteriorated somewhat (AIC = -413.93, BIC = -348.59) with the addition of
the second cycle.
Is the Cyclical Component Stochastic or Non-Stochastic?
Now that we have settled on one cycle for the cyclical component instead of two,
let us turn to the issue of whether that one cycle should be modeled as being stochastic or
non-stochastic. The SAS code that addresses this issue is presented below. The model
with the non-stochastic cycle is labeled the BSM6 model.
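A sketch of this step, using the cycle options that appear in the appendix listing:

proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope var = 0 noest plot=smooth;
season length = 12 type=dummy plot=smooth;
cycle noest=variance var = 0 plot=smooth;   /* non-stochastic (deterministic) cycle */
estimate plot=(residual normal acf);
run;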
To make the cycle non-stochastic, the options "noest=variance var = 0" have been added to the "cycle" statement. The major result is that the non-stochastic specification of the cycle does not result in an improvement in the fit of the data vis-à-vis the AIC and BIC measures (AIC = -398.84 and BIC = -345.38). Therefore we adopt the stochastic specification (7) of the cycle for these data.
It has been mentioned that the Autoregressive Unobserved Component (9) might
be considered as an alternative to the Fourier type of specification for the cycle in time
series data. The SAS code that allows us to try out this specification (labeled BSM7) is
listed below.
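A sketch of this step, assuming the cycle statement of BSM4 is replaced by Proc UCM's autoreg statement:

proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope var = 0 noest plot=smooth;
season length = 12 type=dummy plot=smooth;
autoreg;                           /* first-order autoregressive component in place of the cycle */
estimate plot=(residual normal acf);
run;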
Although the output of this procedure indicates that the autoregressive component
is statistically significant, it does not improve on the fit of the data (AIC = -403.11, BIC =
-349.66) vis-à-vis the previous BSM4 model specification. Therefore, we select the
BSM4 specification of a stochastic level and fixed slope for the trend, a stochastic
dummy variable seasonal, and a single stochastic cycle along with the irregular
component. A summary of the Goodness-of-Fit values for the various models is
contained in the following table.
Goodness-of-Fit for Various UCMs

BSM:  stochastic level (σ²_η > 0), stochastic slope (σ²_ξ > 0), stochastic dummy season (σ²_ω > 0); AIC = -400.84, BIC = -350.35; residuals w.n.* = 2

BSM2: stochastic level (σ²_η > 0), fixed slope (σ²_ξ = 0), stochastic dummy season (σ²_ω > 0); AIC = -402.84, BIC = -355.32; residuals w.n.* = 2

BSM3: stochastic level (σ²_η > 0), fixed slope (σ²_ξ = 0), non-stochastic dummy season (σ²_ω = 0); AIC = -394.93, BIC = -350.38; residuals w.n.* = 4

BSM4: stochastic level (σ²_η > 0), fixed slope (σ²_ξ = 0), stochastic dummy season (σ²_ω > 0), one stochastic cycle (σ²_ν > 0); AIC = -418.59, BIC = -362.166; residuals w.n.* = 1

BSM5: stochastic level (σ²_η > 0), fixed slope (σ²_ξ = 0), stochastic dummy season (σ²_ω > 0), two stochastic cycles (σ²_ν1 > 0, σ²_ν2 > 0); AIC = -413.93, BIC = -348.59; residuals w.n.* = 2

BSM6: stochastic level (σ²_η > 0), fixed slope (σ²_ξ = 0), stochastic dummy season (σ²_ω > 0), one non-stochastic cycle (σ²_ν = 0); AIC = -398.84, BIC = -345.38; residuals w.n.* = 2

BSM7: stochastic level (σ²_η > 0), fixed slope (σ²_ξ = 0), stochastic dummy season (σ²_ω > 0), autoregressive component (σ²_υ > 0); AIC = -403.11, BIC = -349.66; residuals w.n.* = 2
* Remember, don't count the autocorrelation at lag 0, because it equals one by definition. Any count above 3 would indicate a lack of white-noise residuals.
Further Verification of the BSM4 model
In terms of the “Prediction Errors for logpass” graph, the residuals of the model
(prediction errors) appear to be varying in an unsystematic way. That is, there do not
appear to be “stretches” in the data whereby the residuals have positive runs for a little
while and then negative runs for a little while, etc. They seem to vary quite unsystematically, which is indicative of the independence of the residuals over time and suggests that there is "nothing systematic left in the data" to be described by additional components and parameters that we might entertain adding to our model.
This is verified in the third graph labeled “Prediction Error Autocorrelations for logpass”
where all of the autocorrelations at the various lags are inside the blue-shaded 95%
confidence intervals of zero autocorrelation. (More about autocorrelations and the
autocorrelation function later. The autocorrelation at lag zero is by definition 1.0 so its
value is not of importance in the graph.) Finally, the graph labeled “Prediction Errors
Histogram for logpass” shows the residual (prediction error) histogram pretty closely
resembling a normal distribution which is also a desirable trait of a good UCM model.
Therefore, all three graphs support the previous contention that model BSM4 is adequate
for explaining the variation in the Airline passenger data we set out to model.
In this section we are going to report the component graphs for the BSM4 model.
They are obtained by applying the Kalman filter and constructing the “smoothed”
estimates as described previously. The first graph is the smoothed level component graph, representing a time plot of μ̂_t and the forecasts of it 12 periods ahead along with its 95% confidence interval. The second graph is the smoothed slope component graph (β̂_t), which happens to be fixed and estimated to be approximately 0.01.
The third and fourth graphs below depict the smoothed seasonal component (γ̂_t) and the smoothed cyclical component (ψ̂_t), respectively, as a function of time. In all of
these graphs the blue shading indicates the 95% confidence intervals of the components.
Also notice that each graph is extended 12 months beyond the end of the data (December 1960), representing the forecasted values of the components. Of course these forecasted components, once added together, form the forecasted values of logpass reported below.
The following three graphs depict the “adding” up of the components of trend, season
and cycle into the fitted values of the Airline series. Notice the extension of these series
into the forecast period.
Predicting with model BSM4
The 12 out-of-sample forecasts and their 95% confidence intervals are depicted in
the graph below in the last UCM procedure step in the BSM.sas program.
The numerical values of the forecasts are produced by the last statements in BSM.sas. Since the interest of the user is more likely to involve obtaining forecasts of the number of passengers per month in the subsequent 12 months, as compared to the log of the number of passengers, the SAS statements at the end properly transform the log forecasts into the "level" forecasts of passengers (in thousands). As we will see in later discussion, an unbiased forecast of passengers (pass) is obtained by the anti-log transformation

passf = exp(forecast + 0.5·std²)     (13)

and the upper and lower 95% confidence limits of passengers forecasted are obtained by the transformations

passlcl = exp(lcl)     (14)
passucl = exp(ucl)     (15)

In (13), "passf" denotes the desired passenger forecast, exp(·) is the exponential (anti-log) function, "forecast" is the log-scale forecast that is to be transformed, and "std" is the standard deviation of the log forecast. In (14), "passlcl" denotes the desired lower confidence limit for passengers forecasted and "lcl" is the lower confidence limit for the log forecast. In (15), "passucl" denotes the desired upper confidence limit for passengers forecasted and "ucl" denotes the upper confidence limit for the log forecast.
IX. Conclusion
Of course the UCM model is not the only time series model that we might
consider for modeling and forecasting time series data like the Box-Jenkins Airline data
that we have just analyzed. We will consider the Deterministic Trend / Deterministic
Seasonal model next and then go on to investigate exponential smoothing models and
Box-Jenkins models in turn.
APPENDIX
BSM.sas Program
That Fits Various UCMs
To the Box-Jenkins Airline Data
data airline;
input date:monyy5. pass @@;
datalines;
jan49 112 feb49 118 mar49 132 apr49 129 may49 121 jun49 135
jul49 148 aug49 148 sep49 136 oct49 119 nov49 104 dec49 118
jan50 115 feb50 126 mar50 141 apr50 135 may50 125 jun50 149
jul50 170 aug50 170 sep50 158 oct50 133 nov50 114 dec50 140
jan51 145 feb51 150 mar51 178 apr51 163 may51 172 jun51 178
jul51 199 aug51 199 sep51 184 oct51 162 nov51 146 dec51 166
jan52 171 feb52 180 mar52 193 apr52 181 may52 183 jun52 218
jul52 230 aug52 242 sep52 209 oct52 191 nov52 172 dec52 194
jan53 196 feb53 196 mar53 236 apr53 235 may53 229 jun53 243
jul53 264 aug53 272 sep53 237 oct53 211 nov53 180 dec53 201
jan54 204 feb54 188 mar54 235 apr54 227 may54 234 jun54 264
jul54 302 aug54 293 sep54 259 oct54 229 nov54 203 dec54 229
jan55 242 feb55 233 mar55 267 apr55 269 may55 270 jun55 315
jul55 364 aug55 347 sep55 312 oct55 274 nov55 237 dec55 278
jan56 284 feb56 277 mar56 317 apr56 313 may56 318 jun56 374
jul56 413 aug56 405 sep56 355 oct56 306 nov56 271 dec56 306
jan57 315 feb57 301 mar57 356 apr57 348 may57 355 jun57 422
jul57 465 aug57 467 sep57 404 oct57 347 nov57 305 dec57 336
jan58 340 feb58 318 mar58 362 apr58 348 may58 363 jun58 435
jul58 491 aug58 505 sep58 404 oct58 359 nov58 310 dec58 337
jan59 360 feb59 342 mar59 406 apr59 396 may59 420 jun59 472
jul59 548 aug59 559 sep59 463 oct59 407 nov59 362 dec59 405
jan60 417 feb60 391 mar60 419 apr60 461 may60 472 jun60 535
jul60 622 aug60 606 sep60 508 oct60 461 nov60 390 dec60 432
;
data airline;
set airline;
logpass=log(pass);
axis1 label=('Year');
axis2 label=(angle=90 'Passengers');             /* axis2 definition assumed (referenced by vaxis=axis2 below) */
proc gplot data=airline;                         /* proc gplot statement assumed; implied by the plot/symbol syntax */
plot pass*date / haxis=axis1 vaxis=axis2;
symbol1 i=join;
format date year4.;
run;
axis1 label=('Year');
axis2 order=(4 to 7 by .25)
      label=(angle=90 'Log of Passengers');
proc gplot data=airline;                         /* proc gplot and plot statements assumed */
plot logpass*date / haxis=axis1 vaxis=axis2;
run;
ods html;
ods graphics on;
title 'BSM3: stochastic level, fixed slope, non-stochastic dummy
seasonal';
proc ucm data = airline;
id date interval = month;
model logpass;
irregular;
level plot=smooth;
slope var = 0 noest plot=smooth;
season length = 12 type=dummy plot=smooth;
cycle noest=variance var = 0 plot=smooth;
estimate plot=(residual normal acf);
run;
data results;
set results;
passf = exp(forecast + 0.5*std**2);
passlcl = exp(lcl);
passucl = exp(ucl);
if _n_ > 144;
keep obs forecast std lcl ucl passf passlcl passucl;
run;