Temporal and Spatial Models
Anol Bhattacherjee, Ph.D., University of South Florida
Temporal and spatial data are observations that are related to each other in time or space: this quarter’s sales is in proximity to last quarter’s sales.

Examples:
This quarter’s sales is close to (in time) last quarter’s sales – temporal association.
A home’s value is close to (in space) home values nearby – spatial association.

Linear trend model coefficients (quarterly sales regressed on time t):

             Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  1530.181   105.041     14.57    <2e-16 ***
t            66.828     3.206       20.84    <2e-16 ***

If neighboring observations carry important information, then you should consider spatial and/or temporal model components, in addition to classical regression components.
The Statistical Challenge
How to incorporate the effect of neighboring (or close) observations?
Unlike multivariate regression, we are not predicting future values of Y based on independent
variables X, but based on previous values of the same variable Y.
Temporal dependency:
If this quarter’s sales depend on last quarter’s sales, then Sales_t = f(Sales_{t-1})
Sales can be modeled as a lag-variable, using Time (t) as a predictor.
Spatial dependency:
If a home’s value depends on home values nearby, then the average value of all nearby homes
can be used as a predictor.
We can also incorporate the variance (volatility), the maximum value, the minimum value, etc., of nearby home values.
Other Challenges of Spatial Models
Spatial models are usually more challenging than temporal models, because:
We have to define an appropriate distance metric, such as Euclidean distance.
Example: Home A is 5 miles from home B and 10 miles from home C.
Or more generally, define a similarity metric.
Example: Product A is closer to product B in the associated feature space.
We must also specify the reach of the spatial dependency:
Should we only include homes no further than 5 miles away? Or no further than 10 miles away?
Incorporating temporal and spatial dependencies is not hard, but it requires some intuition and thought, as the sketch below illustrates.
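A minimal sketch of these ideas in R, assuming a hypothetical data frame homes with columns value, x, and y, a Euclidean distance metric, and a 5-mile reach:

# Pairwise Euclidean distances between homes (homes is a hypothetical data frame)
d <- as.matrix(dist(homes[, c("x", "y")]))

radius <- 5   # reach of the spatial dependency (e.g., 5 miles)
n <- nrow(homes)
neighbor_mean <- sapply(1:n, function(i) {
  nbr <- which(d[i, ] <= radius & (1:n) != i)  # neighbors within the radius, excluding home i
  if (length(nbr) == 0) NA else mean(homes$value[nbr])
})

# neighbor_mean (and likewise the variance, max, or min of nearby values)
# can then enter a regression as an additional predictor.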
Additive Seasonality Model
How to incorporate seasonality:
Seasonality is a categorical variable.
Add seasonality to a trend model using dummy variables.
The four quarters in the Coca-Cola data imply three dummy variables.
Seasonal model: Y_t = β_0 + β_t t + β_1 D_1 + β_2 D_2 + β_3 D_3 + ε

Quarter  Sales    t  D  D_1  D_2  D_3
Q1-86    1734.83  1  1  1    0    0
Q2-86    2244.96  2  2  0    1    0
Q3-86    2533.80  3  3  0    0    1
Q4-86    2154.96  4  4  0    0    0
Q1-87    1547.82  5  1  1    0    0
Q2-87    2104.41  6  2  0    1    0
Q3-87    2014.36  7  3  0    0    1
Q4-87    1991.75  8  4  0    0    0
Q1-88    1869.05  9  1  1    0    0

Interpretation:
Multiple R-squared: 0.9438, Adjusted R-squared: 0.9394
F-statistic: 214 on 4 and 51 DF, p-value: < 2.2e-16
Which is the better model? Why?
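A minimal sketch of fitting this model in R, assuming a hypothetical data frame coke holding the columns shown above:

# Additive seasonality model: linear trend plus quarterly dummies
mod_seasonal <- lm(Sales ~ t + D_1 + D_2 + D_3, data = coke)
summary(mod_seasonal)

# An equivalent fit, letting R build the dummies from the quarter indicator D
# (same fitted values, different baseline quarter)
mod_seasonal2 <- lm(Sales ~ t + factor(D), data = coke)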
[Figure: Residual vs. Time plots for the additive seasonality model and the linear trend model.]
Which Model Fits the Data Better?
[Figure: Residual distribution histograms for the two models.]
Predictive Accuracy on Holdout Set

How to measure predictive performance:
Create a training data set and a holdout (test) data set.
Training: Q1-86 to Q4-99; holdout: the remaining quarters.
Estimate the model using the training data set, then use the estimated model to predict sales for the holdout data.
Compute the root mean square error (RMSE) for the holdout data:
RMSE (Linear Trend) = $729 million.
RMSE (Additive Seasonality) = $498 million.
[Figure: Actual vs. predicted sales (trend and seasonal models) over the holdout quarters.]
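A minimal sketch of the holdout computation, assuming hypothetical training and test data frames train_df and test_df split as above:

# Fit on the training quarters, predict the holdout quarters
mod  <- lm(Sales ~ t + D_1 + D_2 + D_3, data = train_df)
pred <- predict(mod, newdata = test_df)

# Root mean square error on the holdout set
sqrt(mean((test_df$Sales - pred)^2))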
Autocorrelation
Patterns in the residual ACF (exponential decay, positive/negative swings, etc.) are bad.
Two ways of modeling autocorrelation:
Explicitly, as a lag variable.
By modeling the response variable as a function of lagged error terms: moving average (MA) models.
Other problems:
Non-zero mean.
[Figure: ACF plots of the residual series (Series res).]
Model with Lag Variables
Model specification
Let Y_t denote sales at time t.
Create a “new” variable Y_{t-1} (the lag-1 variable).
Model: Y_t = β_0 + β_1 Y_{t-1} + Σ β_i x_i + ε
Lags may span more than one time unit:
Different lag models (lag-1, lag-2, lag-3, etc.) can be compared to determine optimum lag duration.
How many lags should we include?
“Lag” Variable
Quarter Sales Lag.Sales t D D_1 D_2 D_3
Q2-86 2244.96 1734.83 2 2 0 1 0
Q3-86 2533.8 2244.96 3 3 0 0 1
Q4-86 2154.96 2533.8 4 4 0 0 0
Q1-87 1547.82 2154.96 5 1 1 0 0
Q2-87 2104.41 1547.82 6 2 0 1 0
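A minimal sketch of building the lag variable and fitting the lag model in R, again assuming the hypothetical coke data frame:

# Lag-1 sales: shift the series by one quarter (the first row has no lag and drops out)
coke$Lag.Sales <- c(NA, head(coke$Sales, -1))

# Trend + lag + seasonal model
mod_lag <- lm(Sales ~ Lag.Sales + t + D_1 + D_2 + D_3, data = coke)
summary(mod_lag)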
Interpreting Lag Model Results

Lag model (Multiple R-squared: 0.9761, Adjusted R-squared: 0.9736):

             Estimate   Std. Error  Pr(>|t|)
(Intercept)  147.3157   164.4734    0.3748
Lag.Sales    0.7373     0.0925      0.0000
t            18.0255    6.4156      0.0071
D_1          21.0050    77.0229     0.7862
D_2          879.0750   84.2993     0.0000
D_3          207.9091   71.9296     0.0057

Seasonal (trend + dummy) model, for comparison (Multiple R-squared: 0.945, Adjusted R-squared: 0.9406):

             Estimate   Std. Error  Pr(>|t|)
(Intercept)  1339.6120  102.8780    0.0000
t            67.6190    2.3680      0.0000
D_1          -207.6500  107.2790    0.0586
D_2          506.9240   105.3540    0.0000
D_3          334.6560   105.2740    0.0025

Questions:
The coefficient 0.74 implies that:
A. Last quarter’s sales have no impact.
B. Sales decrease by 0.74 every quarter.
C. Seasonally adjusted and detrended sales increase by 0.74 every quarter.
D. None of the above.
How do the results compare against the linear trend and seasonal models?
Trend + Lag + Seasonal
[Figure: Residual vs. Time and Actual vs. Fitted plots for the Trend + Lag + Seasonal model.]
RMSE (Trend) = $729 million.
RMSE (Trend + Seasonal) = $498 million.
[Figure: Actual vs. predicted sales over the holdout quarters.]
ACF Plots
[Figure: ACF and PACF plots of the residual series (Series res).]
Diagnostic checking:
Compare model statistics (AIC, BIC, SBIC) to choose the best model.
Plot residual ACF: should be random (no pattern), i.e., white noise.
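For example, with hypothetical fitted models mod_trend, mod_seasonal, and mod_lag from earlier, the diagnostics might look like:

# Compare information criteria across candidate models (lower is better;
# note the lag model uses one fewer observation)
AIC(mod_trend, mod_seasonal, mod_lag)
BIC(mod_trend, mod_seasonal, mod_lag)

# Residual ACF and PACF: white noise should show no significant spikes
res <- residuals(mod_lag)
acf(res)
pacf(res)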
ACF and PACF
Autocorrelation Function (ACF):
Measures the correlation between observation Y_t and observation Y_{t-p}, located p periods apart:
ρ_p = Corr(Y_t, Y_{t-p}) = Cov(Y_t, Y_{t-p}) / (√Var(Y_t) · √Var(Y_{t-p})) = γ_p / γ_0
Used to estimate the order q in MA(q) models: the ACF of an MA(q) process cuts off after lag q.
Partial Autocorrelation Function (PACF):
The autocorrelation of a signal with itself at different points in time, with the linear dependency of the signal at shorter lags removed.
Used to estimate the order p in AR(p) models: the PACF of an AR(p) process cuts off after lag p.
AirPassengers Data
Monthly ticket sales counts (in thousands) for 1949-1960.
We wish to predict ticket sales for the next 5 years.
data(AirPassengers)        # monthly passenger counts, 1949-1960
ts <- AirPassengers        # note: this variable shadows the name of the ts() constructor
ts
start(ts)                  # first observation: January 1949
end(ts)                    # last observation: December 1960
class(ts)                  # "ts" (time series object)
frequency(ts)              # 12 observations per year
cycle(ts)                  # position of each observation within the year
plot(ts)
abline(lm(ts ~ time(ts)), col="red")   # overlay the linear trend
boxplot(ts ~ cycle(ts))    # distribution of counts by month
Questions:
What are we trying to depict in this boxplot?
What inferences can we draw from this boxplot?
AirPassengers: Stationarity
plot(ts)
abline(lm(ts ~ time(ts)), col="red")
plot(log(ts))
plot(diff(log(ts)))

Questions:
What did the log transform do?
What did the differencing do?
Do we have a stationary time series?
[Figure: ACF and PACF of diff(log(ts)), annotated at lag 2 (p = 2) and lag 1 (q = 1).]
AirPassengers: ARIMA with Cross-Validation

model <- arima(log(ts), c(2,1,1), seasonal=list(order=c(2,1,1), period=12))
predicted <- predict(model, n.ahead=5*12)   # forecast the next 5 years
predicted <- exp(predicted$pred)            # undo the log transform
predicted <- round(predicted, 0)
predicted
ts.plot(ts, predicted, lty=c(1,3))

# Cross-validation
# Training data: 1949-1958; test data: 1959-1960
train <- ts(ts, frequency=12, start=c(1949,1), end=c(1958,12))
model <- arima(log(train), c(2,1,1),
               seasonal=list(order=c(2,1,1), period=12))
predicted <- predict(model, n.ahead=2*12)
predicted <- exp(predicted$pred)
predicted <- round(predicted, 0)
original <- tail(ts, 24)
original - predicted                        # forecast errors on the test months
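To compare the ARIMA forecasts against the earlier models on the same footing, a holdout RMSE can be computed from the objects above:

# Root mean square error over the 24 test months
sqrt(mean((original - predicted)^2))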
Home price data (first eight rows):

ID  PRICE  NBROOM  DWELL  NBATH  PATIO  FIREPL  AC  X    Y
1   47     4       0      1      0      0       0   907  534
2   113    7       1      2.5    1      1       1   922  574
3   165    7       1      2.5    1      1       0   920  581
4   104.3  7       1      2.5    1      1       1   923  578
5   62.5   7       1      1.5    1      1       0   918  574
6   70     6       1      2.5    1      1       0   900  577
7   127.5  6       1      2.5    1      1       1   918  576
8   53     8       1      1.5    1      0       0   907  576

OLS estimates:

             Estimate  Std. Error  P-value
(Intercept)  15.4510   5.5480      0.0059
NBROOM       1.1510    1.2430      0.3556
NBATH        8.3310    2.2010      0.0002
PATIO        17.2780   3.4680      0.0000
FIREPL       17.1310   3.0270      0.0000
AC           12.7670   2.8330      0.0000
[Figure: Home locations plotted by Longitude and Latitude.]
The colors and diamond size are proportional to the price of a house.
What can we learn from this graph?
How can we model this effect?
How to Measure ‘Space’?
We must define space in order to measure its effects.
Naive method: Regional dummy variables, e.g., for zip codes.
Weight matrix: n x n neighborhood structure, where: 0 = not neighbor, 1 = neighbor.
Sample region and units (a 3 × 3 grid):

1 2 3
4 5 6
7 8 9

Simple neighborhood matrix (0 = not neighbor, 1 = neighbor):

    1 2 3 4 5 6 7 8 9
1   0 1 0 1 0 0 0 0 0
2   1 0 1 0 1 0 0 0 0
3   0 1 0 0 0 1 0 0 0
4   1 0 0 0 1 0 1 0 0
5   0 1 0 1 0 1 0 1 0
6   0 0 1 0 1 0 0 0 1
7   0 0 0 1 0 0 0 1 0
8   0 0 0 0 1 0 1 0 1
9   0 0 0 0 0 1 0 1 0
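A minimal sketch of building this neighborhood structure in R with the spdep package (assuming it is installed):

library(spdep)

nb <- cell2nb(3, 3, type = "rook")  # edge-sharing contiguity on a 3 x 3 grid
W  <- nb2mat(nb, style = "B")       # binary 0/1 neighbor matrix, as in the table above
W

# Row-standardized weights (each row sums to 1), the usual input to spatial models
lw <- nb2listw(nb, style = "W")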
Spatial Lag Model

Spatial autocorrelation in the response variable:
Y = ρWY + βX + ε
W: spatial weight matrix; ρ: spatial coefficient.
Incorporates spatial effects by including a spatially lagged dependent variable as an additional predictor.

Initial OLS model (AIC = 1793):

             Estimate  Std. Error  P-value
(Intercept)  15.4510   5.5480      0.0059
NBROOM       1.1510    1.2430      0.3556
NBATH        8.3310    2.2010      0.0002
PATIO        17.2780   3.4680      0.0000
FIREPL       17.1310   3.0270      0.0000
AC           12.7670   2.8330      0.0000

Spatial lag model (AIC = 1739):

             Estimate  Std. Error  P-value
(Intercept)  -2.6764   4.8670      0.5824
NBROOM       1.2673    1.0487      0.2269
NBATH        7.6529    1.8711      0.0000
PATIO        11.9579   2.9539      0.0001
FIREPL       11.1740   2.6072      0.0000
AC           8.4183    2.4014      0.0005
Spatial coefficient: Rho = 0.49961; p-value = 6.6502e-14

OLS vs. spatial lag results:
Some of the estimates are smaller in the lag model.
The intercept term switched signs and is no longer significant.
What happened? Which model is better?
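A minimal sketch of fitting such a spatial lag model in R with the spdep and spatialreg packages, assuming a hypothetical data frame homes with the columns above:

library(spdep)       # neighbor construction
library(spatialreg)  # lagsarlm()

coords <- cbind(homes$X, homes$Y)
nb <- knn2nb(knearneigh(coords, k = 5))  # e.g., 5 nearest neighbors as the reach
lw <- nb2listw(nb, style = "W")          # row-standardized weight matrix W

ols  <- lm(PRICE ~ NBROOM + NBATH + PATIO + FIREPL + AC, data = homes)
slag <- lagsarlm(PRICE ~ NBROOM + NBATH + PATIO + FIREPL + AC,
                 data = homes, listw = lw)

AIC(ols); AIC(slag)  # compare fit, as in the AIC values above
summary(slag)        # rho is the spatial coefficient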
OLS vs. Spatial Lag
Certain predictors (e.g., presence of a patio or fireplace) lost much of their importance in predicting home prices once neighboring homes were included (via the spatial lag ρWY).
Why?
Houses located in the same area tend to have similar features, e.g., fireplaces and patios in
wealthy neighborhoods, no central AC in poorer neighborhoods.
Hence, prices of neighboring houses already factor in the price effect of these “expected” features.
Lack of these features may change the price, but not by much.
Implication:
Better to buy a low-end house in an expensive neighborhood rather than a high-end house in an
inexpensive neighborhood?
Key Takeaways
Modeling temporal and spatial dependencies in data presents unique challenges such as
autocorrelation and location correlation.
Statistical models are available to account for these dependencies:
Additive seasonality model.
Lag model.
AR, MA, ARMA, ARIMA models.
Spatial lag model.
Assessing model quality:
Use estimates from the training set to predict values in the test set (predictive accuracy).
Measures of model fit such as AIC, along with visual examination of residuals, are also needed.
Such analysis provide better insight into relationship among data not available from OLS
models.