
Time Series Final Review


Final Time Series

Noted by Mỹ Dung K20 Logistics

CHAPTER 6: MULTIPLE REGRESSION


1. Introduction
Population: 𝑌 = 𝛽0 + 𝛽1 𝑋1 + 𝛽2 𝑋2 + ⋯ + 𝛽𝑘 𝑋𝑘 + 𝜀
Sample: 𝑌 = 𝑏0 + 𝑏1 𝑋1 + 𝑏2 𝑋2 + ⋯ + 𝑏𝑘 𝑋𝑘 + 𝑒

Independent variables (predictors): 𝑋1 , 𝑋2 , … , 𝑋𝑘

Dependent variable: Y

2. Standard error, ANOVA & R-square


Standard error of the estimate: $s_{y \cdot x's} = \sqrt{MSE}$

Source       Sum sq   Df          Mean sq
Regression   SSR      k           MSR = SSR/k
Error        SSE      n – k – 1   MSE = SSE/(n – k – 1)
Total        SST      n – 1

k: no. of independent variables


n: sample size

SST = SSR + SSE

SST: Total variability    SSR: Explained variability    SSE: Unexplained variability

R-sq: coefficient of determination, the percentage of the variability in Y that can be explained through knowledge of the variability in the X variables.
R-sq = Explained var / Total var = SSR / SST
     = 1 – Unexplained var / Total var = 1 – SSE/SST
R-sq = correlation^2 (the squared correlation between Y and the fitted values; with a single predictor, the squared correlation between X and Y)

Adjusted R-square: $\bar{R}^2 = 1 - (1 - R^2)\,\dfrac{n-1}{n-k-1}$
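
The ANOVA quantities in this section can be computed in a few lines. The following is a minimal sketch, not part of the original notes: the function name `anova_summary` and the input numbers are hypothetical, chosen only to illustrate the formulas.

```python
import math

def anova_summary(ssr, sse, n, k):
    """Fill in the ANOVA table for a regression with k predictors and n observations."""
    sst = ssr + sse                      # SST = SSR + SSE
    msr = ssr / k                        # regression mean square, df = k
    mse = sse / (n - k - 1)              # error mean square, df = n - k - 1
    r_sq = ssr / sst                     # R-sq = SSR / SST
    adj_r_sq = 1 - (1 - r_sq) * (n - 1) / (n - k - 1)
    return {"SST": sst, "MSR": msr, "MSE": mse,
            "s": math.sqrt(mse), "R_sq": r_sq, "adj_R_sq": adj_r_sq}

# Hypothetical sums of squares, chosen only to illustrate the formulas.
summary = anova_summary(ssr=80.0, sse=20.0, n=23, k=2)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

With these inputs R-sq is 0.80 while the adjusted R-square is 0.78, which shows how the adjustment penalizes the number of predictors.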

3. Hypothesis testing
i. F-test for significance of regression (tests the whole model)
𝐻0 : 𝛽1 = ⋯ = 𝛽𝑘 = 0
𝐻1 : at least one 𝛽𝑗 ≠ 0
F-ratio: MSR/MSE
Critical value: 𝐹(𝛼, 𝑑𝑓1 : 𝑘, 𝑑𝑓2 : 𝑛 − 𝑘 − 1)
Make decision: Reject 𝐻0 if F-ratio > critical value => the model is fine (at least one predictor matters)

ii. 2-tailed t-test for individual 𝜷𝒋 (tests each variable)

𝐻0 : 𝛽𝑗 = 0
𝐻1 : 𝛽𝑗 ≠ 0

Test-statistic: $t = b_j / s_{b_j}$; with a single predictor, $s_{b_1} = \sqrt{MSE / \sum (x - \bar{x})^2}$

Critical value: 𝑡𝛼/2, 𝑛−𝑘−1

Make decision: Reject 𝐻0 if t-stat > upper critical value or t-stat < lower critical value => that variable is significant

Confidence interval: 𝑏𝑗 ± 𝑡(𝛼/2 , 𝑛−𝑘−1) (𝑠𝑏𝑗)

Excel: quick check using the p-value

If p-value < alpha => Reject 𝐻0 => Keep X
If p-value > alpha => Cannot reject 𝐻0 => Drop X
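
The two decision rules can be written out as small functions. This is a sketch under stated assumptions: the function names and all inputs are hypothetical, and the critical values (3.49 and 2.086) are assumed to have been read from F and t tables for alpha = 5% with k = 2 and n = 23.

```python
def f_test(msr, mse, f_crit):
    """Return the F-ratio and whether H0 (all slopes are zero) is rejected."""
    f_ratio = msr / mse
    return f_ratio, f_ratio > f_crit

def t_test(b_j, s_bj, t_crit):
    """Two-tailed t-test for one coefficient plus its confidence interval."""
    t_stat = b_j / s_bj
    reject = abs(t_stat) > t_crit          # t-stat beyond either critical value
    interval = (b_j - t_crit * s_bj, b_j + t_crit * s_bj)
    return t_stat, reject, interval

# Hypothetical inputs; critical values assumed to come from tables
# for alpha = 5%, k = 2, n = 23 (error df = 20).
f_ratio, reject_model = f_test(msr=40.0, mse=1.0, f_crit=3.49)
t_stat, reject_coef, ci = t_test(b_j=1.5, s_bj=0.5, t_crit=2.086)
print(f_ratio, reject_model)
print(t_stat, reject_coef, ci)
```

Here both nulls are rejected, and the confidence interval for the coefficient does not contain 0, which agrees with the t-test decision.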

4. Multicollinearity
Check whether the independent variables X are linearly related to one another. If they are, the model is affected and the problem needs to be handled (in a multiple regression model, the independent variables should be independent of each other!).
The relationship is measured by the VIF; it can also be seen in the correlations among the X's (if one X is highly correlated with the remaining X's => handle/remove that X).

$VIF_j = 1/(1 - R_j^2)$, where $R_j^2$ is the R-sq from regressing $X_j$ on the remaining predictors

VIF near 1 → weakly related to the remaining predictors

VIF much larger than 1 → this variable is linearly related to the remaining predictors.
VIF > 10 → the linear relationship between this variable and the other predictors is significant and needs to be handled

Example:
There are 4 variables.
For each variable, set it up as the dependent variable (Y), with the other variables as the independent variables (X):
Var 1 = 𝛽0 + 𝛽1 Var 2 + 𝛽2 Var 3 + 𝛽3 Var 4 => find R_sq
Var 2 = 𝛽0 + 𝛽1 Var 3 + 𝛽2 Var 4 + 𝛽3 Var 1 => find R_sq
Var 3 = 𝛽0 + 𝛽1 Var 4 + 𝛽2 Var 1 + 𝛽3 Var 2 => find R_sq
Var 4 = 𝛽0 + 𝛽1 Var 1 + 𝛽2 Var 2 + 𝛽3 Var 3 => find R_sq

Dependent var (regressed on)   Var 1 (2, 3, 4)   Var 2 (1, 3, 4)   Var 3 (1, 2, 4)   Var 4 (1, 2, 3)
R_sq                           0.75              0.6               0.35              0.1

$VIF_1 = \dfrac{1}{1 - 0.75} = 4$
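
The same calculation can be repeated for the other three variables. A minimal sketch using the R_sq values from the example; the function name `vif` and the dictionary layout are illustrative choices, not from the notes.

```python
def vif(r_sq_j):
    """Variance inflation factor from the R-sq of X_j regressed on the other predictors."""
    return 1.0 / (1.0 - r_sq_j)

# R-sq values from the example above.
r_sq = {"Var 1": 0.75, "Var 2": 0.6, "Var 3": 0.35, "Var 4": 0.1}
vifs = {name: vif(value) for name, value in r_sq.items()}
flagged = [name for name, value in vifs.items() if value > 10]   # VIF > 10 rule

for name, value in vifs.items():
    print(f"{name}: VIF = {value:.2f}")
print("Needs handling:", flagged)
```

None of the four VIF values exceeds 10, so under the rule above no variable would need to be removed.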

CHAPTER 7: REGRESSION WITH TIME SERIES DATA


1. Durbin-Watson
Tests whether the errors of different time periods are related to each other (checks the lag-1 autocorrelation of the errors, also called first-order serial correlation).
If they are (positively) related, 𝜌 > 0; if not, 𝜌 = 0 (no correlation).

𝑦𝑡 = 𝛽0 + 𝛽1 𝑥𝑡 + 𝜀𝑡
𝜀𝑡 = 𝜌𝜀𝑡−1 + 𝜐𝑡
Hypothesis
𝐻0 : 𝜌 = 0
𝐻1 : 𝜌 > 0
Test-statistic
$DW = \dfrac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
Or (approximately)
$DW \approx 2(1 - r_1(e))$

$r_1(e) = \dfrac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}$
Critical value:
Appendix, Table 6. Determine 𝛼, n and k (n is the sample size, k is the number of independent variables X)
Make decision:
If 𝐷𝑊 > 𝑑𝑈 , conclude 𝐻0 : 𝜌 = 0 => errors are independent / random
If 𝐷𝑊 < 𝑑𝐿 , conclude 𝐻1 : 𝜌 > 0 => errors are not independent / random
If 𝑑𝐿 ≤ 𝐷𝑊 ≤ 𝑑𝑈 : the test is inconclusive

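Example: Given 𝑟1 (𝑒) = −0.67

𝐻0 : 𝜌 = 0
𝐻1 : 𝜌 > 0
𝐷𝑊 = 2(1 − 𝑟1 (𝑒)) = 2(1 − (−0.67)) = 3.34
Critical value: alpha = 5%, n = 6, k = 1
𝑑𝐿 = 0.61 𝑑𝑈 = 1.40
Make decision: 𝐷𝑊 > 𝑑𝑈 => conclude 𝐻0 : 𝜌 = 0
(No significant positive first-order autocorrelation) => errors are treated as independent / random
Note: DW is well above 2, which points toward negative autocorrelation; this one-sided test (𝐻1 : 𝜌 > 0) does not check for it.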
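
The statistic and the decision rule can be sketched as follows. The function names are hypothetical, and the residual list is made up purely to show that the form 2(1 − r1(e)) is an approximation of the exact DW statistic, which can differ noticeably in small samples.

```python
def durbin_watson(e):
    """Exact DW statistic: sum of squared successive differences over sum of squares."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(x * x for x in e)
    return num / den

def lag1_autocorr(e):
    """r1(e) as defined in the notes (residuals are not mean-centred)."""
    num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    den = sum(x * x for x in e)
    return num / den

def dw_decision(dw, d_l, d_u):
    """One-sided test of H0: rho = 0 against H1: rho > 0."""
    if dw > d_u:
        return "do not reject H0"
    if dw < d_l:
        return "reject H0"
    return "inconclusive"

# The example from the notes: r1(e) = -0.67, dL = 0.61, dU = 1.40.
dw_from_r1 = 2 * (1 - (-0.67))
print(round(dw_from_r1, 2), dw_decision(dw_from_r1, 0.61, 1.40))

# Hypothetical residuals, used only to compare the exact and approximate DW.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(durbin_watson(residuals), 2 * (1 - lag1_autocorr(residuals)))
```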

CHAPTER 8: ARIMA
1. Introduction (p, d, q) & cut off, die out
AR p: 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜙2 𝑌𝑡−2 + ⋯ + 𝜙𝑝 𝑌𝑡−𝑝 + 𝜀𝑡

Example:

AR (1): 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜀𝑡

AR (2): 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜙2 𝑌𝑡−2 + 𝜀𝑡

AR (3): 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜙2 𝑌𝑡−2 + 𝜙3 𝑌𝑡−3 + 𝜀𝑡

MA q: 𝑌𝑡 = 𝜇 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2 − ⋯ − 𝜔𝑞 𝜀𝑡−𝑞

Example:

MA (1): 𝑌𝑡 = 𝜇 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1

MA (2): 𝑌𝑡 = 𝜇 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2

MA (3): 𝑌𝑡 = 𝜇 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2 − 𝜔3 𝜀𝑡−3

ARMA (p,q)

𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜙2 𝑌𝑡−2 + ⋯ + 𝜙𝑝 𝑌𝑡−𝑝 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2 − ⋯ − 𝜔𝑞 𝜀𝑡−𝑞

Example:

ARMA (1, 1): 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1

ARMA (1, 2): 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2

ARMA (2, 1): 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜙2 𝑌𝑡−2 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1

ARMA (2, 2): 𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜙2 𝑌𝑡−2 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2

𝜙0 : intercept        𝜇: constant mean

𝜙1 , 𝜙2 , … , 𝜙𝑝 & 𝜔1 , 𝜔2 , … , 𝜔𝑞 : coefficients to be estimated
𝑌𝑡−1 , 𝑌𝑡−2 , … , 𝑌𝑡−𝑝 : Y-values at time lags 1, 2, ..., p (independent variables)
𝜀𝑡−1 , 𝜀𝑡−2 , … , 𝜀𝑡−𝑞 : errors in previous periods (independent variables)
𝜀𝑡 : error in the current period

Cut off: the bars (in the ACF/PACF plot) drop off abruptly

[Figure panels: Cut off 1, Cut off 2]

Die out: the bars shrink gradually

2. Define model
If data is nonstationary => differencing / take logarithm
❖ Non-seasonal diff: 𝑌𝑡′ = 𝑌𝑡 − 𝑌𝑡−1
𝑌𝑡     𝑌𝑡−1    𝑌𝑡′
15     -       -
18     15      3
17     18      -1
16     17      -1

Example: d = 1, p = 1, q = 1
Δ𝑌𝑡 = 𝜙1 Δ𝑌𝑡−1 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1
(𝑌𝑡 − 𝑌𝑡−1 ) = 𝜙1 (𝑌𝑡−1 − 𝑌𝑡−2 ) + 𝜀𝑡 − 𝜔1 𝜀𝑡−1

❖ Seasonal diff 𝑌𝑡′ = 𝑌𝑡 − 𝑌𝑡−𝑠
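
Both kinds of differencing are the same operation with a different lag. A minimal sketch: the function name `difference` and the quarterly series are hypothetical; the first series is the one from the table above.

```python
def difference(series, lag=1):
    """Return Y_t - Y_{t-lag}; the first `lag` observations are lost."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

y = [15, 18, 17, 16]                 # the series from the table above
print(difference(y))                 # [3, -1, -1]

# Hypothetical quarterly series (s = 4) to illustrate seasonal differencing.
quarterly = [10, 20, 30, 40, 12, 23, 31, 44]
print(difference(quarterly, lag=4))  # [2, 3, 1, 4]
```

Applying `difference` twice with `lag=1` would correspond to d = 2, and applying it once with `lag=1` and once with `lag=s` to d = 1, D = 1.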

Sum up:

AR + MA + stationary data = ARMA

AR + MA + nonstationary data = ARIMA



By counting the number of significant sample autocorrelations and partial autocorrelations, the orders of the
MA and AR parts can be determined.

d = number of non-seasonal diff


D = number of seasonal diff

Determine d and D from the titles of the ACF and PACF plots:
diff1 => d = 1        diff2 => d = 2
diff12 or diff4 => d = 0, D = 1
diff1diff12 => d = 1, D = 1

             ACF                                        PACF
MA(q)        Cut off after the order q of the process   Die out
AR(p)        Die out                                    Cut off after the order p of the process
ARMA(p, q)   Die out                                    Die out
             q = number of significant time lags (on the ACF)
             p = number of significant time lags (on the PACF)
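
The identification table can be encoded as a small lookup. This is only a sketch of the rules in the table: the function name, the pattern strings, and the argument names are hypothetical, and reading "cut off" versus "die out" from a plot remains a judgment call.

```python
def identify_order(acf, pacf, acf_sig_lags=0, pacf_sig_lags=0):
    """Return (p, q) from the ACF/PACF patterns, each 'cut off' or 'die out'."""
    if acf == "die out" and pacf == "cut off":
        return (pacf_sig_lags, 0)               # AR(p): p read from the PACF
    if acf == "cut off" and pacf == "die out":
        return (0, acf_sig_lags)                # MA(q): q read from the ACF
    if acf == "die out" and pacf == "die out":
        return (pacf_sig_lags, acf_sig_lags)    # ARMA(p, q)
    raise ValueError("pattern not covered by the table")

print(identify_order("die out", "cut off", pacf_sig_lags=2))   # (2, 0), i.e. AR(2)
print(identify_order("cut off", "die out", acf_sig_lags=1))    # (0, 1), i.e. MA(1)
```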

Example: no differencing

ACF die out, PACF cut off 2 => AR(2)



ACF die out, PACF cut off 1 => AR(1)


Example: differencing

ARIMA (p, d, q)(P, D, Q)s


Seasonal
Titles of the two plots: diff12 => d = 0, D = 1
P and Q: look at the time lags that are multiples of 12 (because this is diff12)
ACF cut off 1, PACF die out => Q = 1, P = 0
p & q: read as usual, from the non-seasonal lags
ACF and PACF show neither a cut-off nor a die-out pattern => q = 0, p = 0
=> ARIMA (0, 0, 0) (0, 1, 1)12

PRACTICE

ACF cut off 1, PACF die out => MA (1)

ACF cut off 2, PACF die out => MA (2)

ACF die out, PACF cut off 1 => AR (1)

ACF and PACF die out, ACF has 2 time lags with significant autocorrelation (q = 2), PACF has 3 time lags with
significant partial autocorrelation (p = 3) => ARMA (p, q) = ARMA (3, 2)

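3. Write equation
General form: [𝒑][𝑷][𝒅][𝑫] 𝒚𝒕 = 𝒄 + [𝒒][𝑸] 𝜺𝒕 (each bracket stands for the factor in B for that order, as listed below)
Backshift operator B:
𝐵𝑦𝑡 = 𝑦𝑡−1
𝐵𝐵𝑦𝑡 = 𝐵²𝑦𝑡 = 𝑦𝑡−2
𝐵⁴𝑦𝑡 = 𝑦𝑡−4
❖ Nonseasonal: ARIMA (p, d, q)
𝑝 = 1 → (1 − 𝜙1 𝐵) ; 𝑝 = 2 → (1 − 𝜙1 𝐵 − 𝜙2 𝐵²)
𝑞 = 1 → (1 − 𝜔1 𝐵) ; 𝑞 = 2 → (1 − 𝜔1 𝐵 − 𝜔2 𝐵²)
𝑑 = 1 → (1 − 𝐵) ; 𝑑 = 2 → (1 − 𝐵)²

What if p/d/q = 0? → the corresponding factor is (1 − 0) = 1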
ARIMA (1, 0, 2)

(1 − 𝜙1 𝐵)𝑦𝑡 = 𝑐 + (1 − 𝜔1 𝐵 − 𝜔2 𝐵²)𝜀𝑡

𝑦𝑡 − 𝜙1 𝑦𝑡−1 = 𝑐 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2

𝑦𝑡 = 𝑐 + 𝜙1 𝑦𝑡−1 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − 𝜔2 𝜀𝑡−2

ARIMA (1, 1, 0)

(1 − 𝜙1 𝐵)(1 − 𝐵)𝑦𝑡 = 𝑐 + 𝜀𝑡

(1 + 𝜙1 𝐵² − 𝜙1 𝐵 − 𝐵)𝑦𝑡 = 𝑐 + 𝜀𝑡

𝑦𝑡 + 𝜙1 𝑦𝑡−2 − 𝜙1 𝑦𝑡−1 − 𝑦𝑡−1 = 𝑐 + 𝜀𝑡

ARIMA (0, 2, 1)

(1 − 𝐵)²𝑦𝑡 = 𝑐 + (1 − 𝜔1 𝐵)𝜀𝑡

(1 − 2𝐵 + 𝐵²)𝑦𝑡 = 𝑐 + (1 − 𝜔1 𝐵)𝜀𝑡

𝑦𝑡 − 2𝑦𝑡−1 + 𝑦𝑡−2 = 𝑐 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1



❖ Seasonal: ARIMA (p, d, q) (P, D, Q)s


P, D, Q: same as above, with 𝐵ˢ in place of 𝐵
𝑃 = 1 → (1 − Φ1 𝐵ˢ) ; 𝑃 = 2 → (1 − Φ1 𝐵ˢ − Φ2 𝐵²ˢ)
𝑄 = 1 → (1 − Θ1 𝐵ˢ) ; 𝑄 = 2 → (1 − Θ1 𝐵ˢ − Θ2 𝐵²ˢ)
𝐷 = 1 → (1 − 𝐵ˢ) ; 𝐷 = 2 → (1 − 𝐵ˢ)²

What if P/D/Q = 0? → (1 − 0) = 1

ARIMA (1,1,1)(1,1,1)4

(1 − 𝜙1 𝐵)(1 − Φ1 𝐵⁴)(1 − 𝐵)(1 − 𝐵⁴)𝑦𝑡 = 𝑐 + (1 − 𝜔1 𝐵)(1 − Θ1 𝐵⁴)𝜀𝑡

ARIMA (1,0,0)(0,1,2)12

(1 − 𝜙1 𝐵)(1 − 𝐵¹²)𝑦𝑡 = 𝑐 + (1 − Θ1 𝐵¹² − Θ2 𝐵²⁴)𝜀𝑡
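
Expanding the factors by hand is just polynomial multiplication in B, so it can be checked mechanically. A minimal sketch, assuming each factor is stored as a list whose index i holds the coefficient of B^i; the helper name `poly_mul` and the value of `phi1` are illustrative assumptions.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B; index i holds the coefficient of B**i."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

phi1 = 0.28  # illustrative value, assumed for this sketch

# ARIMA(1, 1, 0): (1 - phi1*B)(1 - B) gives coefficients of 1, B, B^2
print(poly_mul([1, -phi1], [1, -1]))

# Non-seasonal and seasonal differencing with s = 4: (1 - B)(1 - B^4)
print(poly_mul([1, -1], [1, 0, 0, 0, -1]))   # [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]
```

The second result reads as 1 − B − B⁴ + B⁵, so the left-hand side becomes y_t − y_{t−1} − y_{t−4} + y_{t−5}.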



4. Model checking
In model AR (1)
𝜙1 = 0 → white noise
𝜙1 = 1 → random walk

Concept:
Check whether the errors (residuals) are strongly autocorrelated. If the autocorrelation is high (significant), the model has a problem (inadequate). If the autocorrelation is low, close to 0 (insignificant), meaning the errors of different time periods are unrelated, then the model is fine (adequate).
There are 2 types of exercise:
+ Look directly at the ACF of the errors (given in the question): if the ACF is insignificant at most time lags => adequate, and vice versa.
+ Use the p-value of the Ljung-Box test to conclude (this test groups the time lags together so they can be checked at once; the procedure is given below).
An overall check of model adequacy is provided by a chi-square test based on the Ljung-Box Q statistic.
This test looks at the sizes of the residual autocorrelations as a group:

$Q = n(n+2) \sum_{k=1}^{m} \dfrac{r_k^2(e)}{n-k}$

which is approximately distributed as a chi-square random variable with 𝑚 − 𝑟 degrees of freedom, where 𝑟 is the total number of parameters estimated in the ARIMA model.
𝑟𝑘 (𝑒) = the residual autocorrelation at lag 𝑘
𝑛 = the number of residuals
𝑘 = the time lag
𝑚 = the number of time lags to be tested
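
The Q statistic is straightforward to compute from the residuals. The sketch below is illustrative only: the function names and the residual values are hypothetical, and the autocorrelations are mean-centred, which is an assumption of this sketch. The resulting Q would then be compared with a chi-square critical value (or converted to a p-value) for m − r degrees of freedom.

```python
def residual_autocorr(e, k):
    """Lag-k autocorrelation of the residuals (mean-centred; an assumption here)."""
    mean = sum(e) / len(e)
    den = sum((x - mean) ** 2 for x in e)
    num = sum((e[t] - mean) * (e[t - k] - mean) for t in range(k, len(e)))
    return num / den

def ljung_box_q(e, m):
    """Q = n(n+2) * sum over k = 1..m of r_k(e)^2 / (n - k)."""
    n = len(e)
    return n * (n + 2) * sum(
        residual_autocorr(e, k) ** 2 / (n - k) for k in range(1, m + 1)
    )

# Hypothetical residuals and settings: m lags tested, r estimated parameters.
residuals = [0.5, -0.3, 0.1, 0.4, -0.6, 0.2, -0.1, -0.2]
m, r = 4, 1
q = ljung_box_q(residuals, m)
df = m - r
print(f"Q = {q:.3f} with {df} degrees of freedom")
```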

Check model adequacy by Ljung-Box


In brief: compare the p-value in the Ljung-Box output with the significance level
If p-value < significance level → significant → the model is not adequate
If p-value > significance level → not significant → the model is adequate

*p-value > alpha


The Ljung-Box Q statistics computed for groups of lags m= 12, 24, 36, and 48 are not significant, as
indicated by the large p-values. Therefore, this model is adequate.

*p-value < alpha


The Ljung-Box Q statistics computed for groups of lags m= 12, 24, 36, and 48 are significant, as indicated
by the small p-values. Therefore, this model is not adequate.

Example: Write equation and check adequacy

Equation
𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜀𝑡 = 984.94 + 0.9045 𝑌𝑡−1 + 𝜀𝑡

Model adequacy checking

Let alpha = 5%
The p-values of the Ljung-Box Q statistics computed for groups of lags m = 12, 24, 36, and 48 are 0.991, 0.872, 0.954, and 0.977, respectively.
The Q statistics are not significant, as indicated by the large p-values (all values are larger than 5%). Therefore, this model is adequate.

ARIMA (1, 1, 0)

(1 − 𝜙1 𝐵)(1 − 𝐵)𝑦𝑡 = 𝑐 + 𝑒𝑡

𝑦𝑡 + 𝜙1 𝑦𝑡−2 − 𝑦𝑡−1 − 𝜙1 𝑦𝑡−1 = 𝑐 + 𝑒𝑡

𝑦𝑡 = 𝑐 + 𝑦𝑡−1 + 𝜙1 𝑦𝑡−1 − 𝜙1 𝑦𝑡−2 + 𝑒𝑡

𝑦𝑡 = 0.74 + 𝑦𝑡−1 + 0.28𝑦𝑡−1 − 0.28𝑦𝑡−2 + 𝑒𝑡
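
A one-step forecast from this fitted equation sets the future error term to 0. The sketch below uses the coefficients 0.74 and 0.28 from the equation above; the function names and the two past observations are hypothetical. It also shows that the expanded form and the differenced form give the same number.

```python
def arima110_forecast(c, phi1, y_prev, y_prev2):
    """One-step forecast from the expanded ARIMA(1, 1, 0) equation (error set to 0)."""
    return c + y_prev + phi1 * y_prev - phi1 * y_prev2

def arima110_forecast_diff(c, phi1, y_prev, y_prev2):
    """Same forecast via the differenced form: dY_t = c + phi1 * dY_{t-1}."""
    return y_prev + c + phi1 * (y_prev - y_prev2)

# Coefficients from the fitted equation; the two past values are hypothetical.
a = arima110_forecast(0.74, 0.28, y_prev=10.0, y_prev2=9.0)
b = arima110_forecast_diff(0.74, 0.28, y_prev=10.0, y_prev2=9.0)
print(a, b)
```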



𝑌𝑡 = 𝜙0 + 𝜙1 𝑌𝑡−1 + 𝜙2 𝑌𝑡−2 + 𝜀𝑡

𝑌𝑡 = 20.7642 + 0.2682𝑌𝑡−1 + 0.4212𝑌𝑡−2 + 𝜀𝑡

(1 − 𝐵)(1 − 𝐵⁴)𝑦𝑡 = 𝑐 + (1 − 𝜔1 𝐵)(1 − Θ1 𝐵⁴)𝜀𝑡

(1 − 𝐵 − 𝐵⁴ + 𝐵⁵)𝑦𝑡 = 𝑐 + (1 − 𝜔1 𝐵 − Θ1 𝐵⁴ + 𝜔1 Θ1 𝐵⁵)𝜀𝑡

𝑦𝑡 − 𝑦𝑡−1 − 𝑦𝑡−4 + 𝑦𝑡−5 = 𝑐 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − Θ1 𝜀𝑡−4 + 𝜔1 Θ1 𝜀𝑡−5

𝑦𝑡 = 𝑦𝑡−1 + 𝑦𝑡−4 − 𝑦𝑡−5 + 𝑐 + 𝜀𝑡 − 𝜔1 𝜀𝑡−1 − Θ1 𝜀𝑡−4 + 𝜔1 Θ1 𝜀𝑡−5

𝑦𝑡 = 𝑦𝑡−1 + 𝑦𝑡−4 − 𝑦𝑡−5 + 𝜀𝑡 − 0.7626𝜀𝑡−1 − 0.5080𝜀𝑡−4 + 0.7626 × 0.5080𝜀𝑡−5



Additional:

a. The equation is already given, so just plug the numbers into it. To find Y at time t, substitute the value of Y at time t − 1.

𝑌61 = 50 + 0.45 × 𝑌60 = 50 + 0.45 × 57.00 = 75.65

𝑌62 = 50 + 0.45 × 𝑌61 = 50 + 0.45 × 75.65 = 84.04

𝑌63 = 50 + 0.45 × 𝑌62 = 50 + 0.45 × 84.04 = 87.82

b. Same as part a, but with the value of 𝑌61 changed (𝑌61 = 59.00)

𝑌62 = 50 + 0.45 × 𝑌61 = 50 + 0.45 × 59.00 = 76.55

𝑌63 = 50 + 0.45 × 𝑌62 = 50 + 0.45 × 76.55 = 84.45

c. Prediction interval 𝑦̂ ± 2𝑠

𝑦̂61 = 75.65

Prediction interval = 75.65 ± 2 × √3.2 = (72.07, 79.23)
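
The three parts can be reproduced with a short recursion. A minimal sketch: the function names are hypothetical, the equation Y_t = 50 + 0.45 Y_{t−1} and the starting values come from the worked example, and 3.2 is the value under the square root in part c.

```python
import math

def ar1_forecasts(phi0, phi1, last_value, steps):
    """Iterate Y_t = phi0 + phi1 * Y_{t-1}, feeding each forecast into the next."""
    forecasts = []
    y = last_value
    for _ in range(steps):
        y = phi0 + phi1 * y
        forecasts.append(y)
    return forecasts

def prediction_interval(forecast, variance, width=2):
    """Approximate interval: forecast +/- width * sqrt(variance)."""
    half = width * math.sqrt(variance)
    return forecast - half, forecast + half

part_a = ar1_forecasts(50, 0.45, 57.00, steps=3)   # Y61, Y62, Y63 from Y60 = 57.00
part_b = ar1_forecasts(50, 0.45, 59.00, steps=2)   # Y62, Y63 from the revised Y61 = 59.00
low, high = prediction_interval(part_a[0], 3.2)    # part c, around the forecast of Y61

print([f"{v:.2f}" for v in part_a])
print([f"{v:.2f}" for v in part_b])
print(f"({low:.2f}, {high:.2f})")
```

Because each forecast is fed into the next step, changing Y61 in part b changes every later forecast as well.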
