Completed Time Series Analysis!
# Display an image
display(Image(filename=r"C:\Users\hi\OneDrive\Desktop\my file\documents\time.jpg
Time series analysis studies data that changes over time. The objective
is to identify meaningful characteristics, such as trends, seasonality, and
cyclical behavior, and use these insights for prediction or understanding
temporal dynamics.
Seasonality: Regular, periodic fluctuations within a fixed period (e.g., daily, monthly, yearly).
Cyclical Component: Irregular, non-periodic fluctuations due to business or economic cycles.
Noise
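To make the components above concrete, here is a minimal sketch of a classical additive decomposition written with NumPy only. The function name `classical_decompose` and the synthetic series are assumptions made for illustration; they are not part of the notebook.

```python
import numpy as np

def classical_decompose(y, period):
    """Additive decomposition: y = trend + seasonal + residual (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    # Centered moving average; an even period needs half weights at both ends
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(len(y), np.nan)          # edges stay NaN
    trend[half:len(y) - half] = np.convolve(y, weights, mode="valid")
    detrended = y - trend
    # Average the detrended values at each position within the cycle
    seasonal_means = np.array(
        [np.nanmean(detrended[i::period]) for i in range(period)]
    )
    seasonal_means -= seasonal_means.mean()  # seasonal effects sum to zero
    seasonal = np.tile(seasonal_means, len(y) // period + 1)[:len(y)]
    resid = y - trend - seasonal
    return trend, seasonal, resid

# Synthetic monthly series (assumption): linear trend + yearly cycle + noise
rng = np.random.default_rng(0)
t = np.arange(120)
y = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 120)

trend, seasonal, resid = classical_decompose(y, period=12)
print("Seasonal amplitude:", round(float(seasonal.max()), 2))
```

The residual is what remains after the trend and the repeating pattern are removed, which corresponds to the noise component.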
3. ARIMA Model
# Train-test split
train = data[:100]
test = data[100:]
# Forecast
forecast = fitted_model.forecast(steps=len(test))
test['Forecast'] = forecast
# Plot results
plt.figure(figsize=(10, 5))
plt.plot(train, label='Training Data')
plt.plot(test['Passengers'], label='Actual Data', color='blue')
plt.plot(test['Forecast'], label='Forecasted Data', color='orange')
plt.legend()
plt.show()
# Calculate error
error = mean_squared_error(test['Passengers'], forecast)
print(f"Mean Squared Error: {error}")
C:\Users\hi\anaconda3\Lib\site-packages\statsmodels\tsa\base\tsa_model.py:473: ValueWarning: No frequency information was provided, so inferred frequency MS will be used.
  self._init_dates(dates, freq)
C:\Users\hi\AppData\Local\Temp\ipykernel_10776\1960587366.py:14: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
4. Advanced Techniques
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
look_back = 3
X_train, y_train = create_dataset(train, look_back)
X_test, y_test = create_dataset(test, look_back)
# Forecast
lstm_predictions = model.predict(X_test)
plt.figure(figsize=(10, 5))
plt.plot(test[look_back+1:], label='Actual Data')
plt.plot(lstm_predictions, label='LSTM Predictions')
plt.legend()
plt.show()
C:\Users\hi\anaconda3\Lib\site-packages\keras\src\layers\rnn\rnn.py:204: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
Epoch 1/20
111/111 - 6s - 53ms/step - loss: 64586.8008
Epoch 2/20
111/111 - 1s - 5ms/step - loss: 61537.0078
Epoch 3/20
111/111 - 1s - 5ms/step - loss: 58685.2891
Epoch 4/20
111/111 - 1s - 5ms/step - loss: 56694.9336
Epoch 5/20
111/111 - 1s - 5ms/step - loss: 54899.6445
Epoch 6/20
111/111 - 1s - 6ms/step - loss: 53251.5664
Epoch 7/20
111/111 - 1s - 5ms/step - loss: 51684.4922
Epoch 8/20
111/111 - 1s - 5ms/step - loss: 50181.5898
Epoch 9/20
111/111 - 1s - 6ms/step - loss: 48733.3164
Epoch 10/20
111/111 - 1s - 5ms/step - loss: 47326.9453
Epoch 11/20
111/111 - 1s - 5ms/step - loss: 45963.5234
Epoch 12/20
111/111 - 1s - 5ms/step - loss: 44643.4883
Epoch 13/20
111/111 - 1s - 6ms/step - loss: 43356.7266
Epoch 14/20
111/111 - 1s - 5ms/step - loss: 42111.0039
Epoch 15/20
111/111 - 1s - 5ms/step - loss: 40900.3594
Epoch 16/20
111/111 - 1s - 5ms/step - loss: 39707.4375
Epoch 17/20
111/111 - 1s - 5ms/step - loss: 38348.6875
Epoch 18/20
111/111 - 0s - 4ms/step - loss: 37123.3203
Epoch 19/20
111/111 - 0s - 4ms/step - loss: 35981.4023
Epoch 20/20
111/111 - 1s - 5ms/step - loss: 34882.6484
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 497ms/step
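The `create_dataset` helper called in the LSTM cell is not shown in this export. As a rough sketch of what such a helper typically does, the hypothetical `make_windows` function below turns a series into sliding windows and next-step targets, then reshapes the inputs to the (samples, timesteps, features) layout that Keras LSTM layers expect. The name and the toy series are assumptions.

```python
import numpy as np

def make_windows(series, look_back):
    """Build (samples, look_back) inputs and next-step targets (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])  # look_back consecutive values
        y.append(series[i + look_back])    # the value that follows them
    return np.array(X), np.array(y)

series = np.arange(10, dtype=float)        # toy data: 0, 1, ..., 9
X, y = make_windows(series, look_back=3)

# LSTM layers expect 3-D input: (samples, timesteps, features)
X_lstm = X.reshape((X.shape[0], X.shape[1], 1))
print(X.shape, y.shape, X_lstm.shape)
```

With this construction the targets line up with `series[look_back:]`, so predictions should be plotted against that slice of the actual data.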
1. Descriptive Analysis
# Load dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passe
data = pd.read_csv(url, parse_dates=['Month'], index_col='Month')
# Descriptive Analysis
print(data.describe()) # Summary statistics
data.plot(title='Airline Passengers Over Time', figsize=(10, 5))
plt.show()
       Passengers
count  144.000000
mean   280.298611
std    119.966317
min    104.000000
25%    180.000000
50%    265.500000
75%    360.500000
max    622.000000
2. Exploratory Analysis
# Autocorrelation Plot
plot_acf(data, lags=20)
plt.show()
plot_pacf(data, lags=20)
plt.show()
3. Forecasting
# Plot Forecast
data.plot(label='Historical Data')
forecast.plot(label='Forecast', color='orange')
plt.legend()
plt.show()
4. Causal Analysis
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[8], line 4
1 from statsmodels.tsa.stattools import grangercausalitytests
3 # Granger causality test (example with two time series)
----> 4 grangercausalitytests(data[['Passengers', 'Another_Series']], maxlag=4)
5. Frequency Analysis
# Fourier Transform
fft_result = fft(data['Passengers'])
plt.plot(np.abs(fft_result))
plt.title('Frequency Spectrum')
plt.show()
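A raw FFT magnitude plot is hard to read because the trend dominates the low frequencies and the x-axis is in bin numbers. The following sketch, on a synthetic series (an assumption), removes a linear trend first and converts the strongest frequency bin into a period in months.

```python
import numpy as np

# Synthetic monthly series (assumption): trend + 12-month cycle + noise
rng = np.random.default_rng(0)
n = 144
t = np.arange(n)
y = 200 + 2 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, n)

# Remove the linear trend so it does not swamp the spectrum
slope, intercept = np.polyfit(t, y, 1)
detrended = y - (slope * t + intercept)

# One-sided spectrum and the matching frequencies in cycles per month
spectrum = np.abs(np.fft.rfft(detrended))
freqs = np.fft.rfftfreq(n, d=1.0)

# Skip bin 0 (the mean) and pick the strongest remaining frequency
peak = int(np.argmax(spectrum[1:])) + 1
dominant_period = 1.0 / freqs[peak]
print(f"Dominant period: {dominant_period:.1f} months")
```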
6. Anomaly Detection
# Anomaly Detection
model = IsolationForest(contamination=0.05)
data['Anomaly'] = model.fit_predict(data)
# Plot anomalies
data['Passengers'].plot(label='Data')
data[data['Anomaly'] == -1]['Passengers'].plot(style='ro', label='Anomalies')
plt.legend()
plt.show()
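As a self-contained check of the approach above, this sketch plants three obvious outliers in a synthetic series and confirms that IsolationForest flags them. The data, the spike positions, and `random_state=0` are assumptions added for reproducibility.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic series (assumption) with three planted outliers
rng = np.random.default_rng(0)
values = rng.normal(100, 5, 200)
spike_idx = [50, 120, 180]
values[spike_idx] = [200, 10, 220]

# contamination is the expected share of anomalies, as in the cell above
model = IsolationForest(contamination=0.05, random_state=0)
labels = model.fit_predict(values.reshape(-1, 1))  # -1 = anomaly, 1 = normal

anomaly_idx = np.flatnonzero(labels == -1)
print("Flagged positions:", anomaly_idx)
```

Because `contamination=0.05` forces roughly 5% of the points to be labelled anomalous, some ordinary points are flagged alongside the planted spikes.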
7. Structural Analysis
Examines how the underlying structure of the data evolves over time.
Purpose: Test for stationarity or structural breaks.
Methods: ADF (Augmented Dickey-Fuller) Test, KPSS (Kwiatkowski–Phillips–Schmidt–Shin) Test.
# Stationarity Test
result = adfuller(data['Passengers'])
print("ADF Statistic:", result[0])
print("p-value:", result[1])
Identifies points where the statistical properties of the time series change.
Purpose: Detect regime shifts or structural breaks.
Methods: PELT, BOCPD (Bayesian Online Change Point Detection).
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[13], line 1
----> 1 import ruptures as rpt
3 # Change Point Detection
4 algo = rpt.Pelt(model="rbf").fit(data['Passengers'].values)
ERROR: Could not find a version that satisfies the requirement reptures (from versions: none)
ERROR: No matching distribution found for reptures
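Since the `ruptures` import fails in this environment, here is a dependency-free sketch of the underlying idea. It is not PELT: it finds a single change in mean by trying every split point and keeping the one with the lowest squared error. The function name `single_change_point` and the synthetic signal are assumptions.

```python
import numpy as np

def single_change_point(y):
    """Return the split index minimising the squared error of two constant segments."""
    y = np.asarray(y, dtype=float)
    best_idx, best_cost = None, np.inf
    for k in range(2, len(y) - 1):
        left, right = y[:k], y[k:]
        # Cost of describing each segment by its own mean
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_idx, best_cost = k, cost
    return best_idx

# Synthetic signal (assumption): the mean jumps from 0 to 5 at index 100
rng = np.random.default_rng(1)
signal = np.r_[rng.normal(0, 1, 100), rng.normal(5, 1, 100)]

cp = single_change_point(signal)
print("Estimated change point:", cp)
```

PELT generalises this to an unknown number of change points by adding a penalty per change and pruning candidate splits.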
AR Model
Formula: $Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \epsilon_t$
Key Parameter: p (number of lags).
Use Case: Suitable for stationary data with autocorrelation.
# Fit AR model
model = AutoReg(data['Passengers'], lags=5)
ar_model = model.fit()
print(ar_model.summary())
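To show what fitting an AR model amounts to, the sketch below regresses each value on its previous p values with ordinary least squares. This illustrates the idea and is not a description of how `AutoReg` is implemented. The helper name `fit_ar_ols` and the simulated AR(2) coefficients are assumptions.

```python
import numpy as np

def fit_ar_ols(y, p):
    """Estimate [intercept, phi_1, ..., phi_p] by least squares (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    # Column k holds y[t-k] for every t from p to len(y) - 1
    lags = np.column_stack([y[p - k:len(y) - k] for k in range(1, p + 1)])
    X = np.column_stack([np.ones(len(lags)), lags])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef

# Simulate a stationary AR(2) process with known coefficients (assumption)
rng = np.random.default_rng(0)
phi1, phi2 = 0.6, -0.3
n = 3000
y = np.zeros(n)
for t in range(2, n):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal()

coef = fit_ar_ols(y, p=2)
print("Intercept, phi_1, phi_2:", np.round(coef, 3))
```

The estimates should land close to the true values 0.6 and -0.3, with an intercept near zero.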
MA Model
Formula: $Y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$
Key Parameter: q (number of error lags).
Use Case: Suitable for data with short-term noise.
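A useful property of an MA(q) process is that its autocorrelation is zero beyond lag q, which is how q is usually read off an ACF plot. The sketch below simulates an MA(1) process with an assumed coefficient of 0.6 and compares the sample autocorrelations with the theoretical lag-1 value $\theta_1 / (1 + \theta_1^2)$.

```python
import numpy as np

def sample_acf(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# Simulate MA(1): Y_t = eps_t + theta * eps_{t-1}  (theta is an assumption)
rng = np.random.default_rng(0)
theta = 0.6
eps = rng.normal(size=20001)
y = eps[1:] + theta * eps[:-1]

r1 = sample_acf(y, 1)
r2 = sample_acf(y, 2)
theory_r1 = theta / (1 + theta ** 2)

print(f"lag 1: sample={r1:.3f}, theory={theory_r1:.3f}")
print(f"lag 2: sample={r2:.3f}, theory=0.000")
```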
C:\Users\hi\anaconda3\Lib\site-packages\statsmodels\tsa\statespace\sarimax.py:978: UserWarning: Non-invertible starting MA parameters found. Using zeros as starting parameters.
  warn('Non-invertible starting MA parameters found.'
C:\Users\hi\anaconda3\Lib\site-packages\statsmodels\base\model.py:607: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  warnings.warn("Maximum Likelihood optimization failed to "
                               SARIMAX Results
==============================================================================
Dep. Variable:              Passengers   No. Observations:                 144
Model:                  ARIMA(0, 0, 5)   Log Likelihood               -733.783
Date:                 Mon, 06 Jan 2025   AIC                          1481.566
Time:                         20:31:26   BIC                          1502.355
Sample:                     01-01-1949   HQIC                         1490.013
                          - 12-01-1960
Covariance Type:                   opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const        280.3154     16.302     17.195      0.000     248.364     312.266
ma.L1          1.1217     78.179      0.014      0.989    -152.106     154.349
ma.L2          0.3926      9.582      0.041      0.967     -18.387      19.172
ma.L3          0.3872     21.146      0.018      0.985     -41.059      41.833
ma.L4          1.1083      9.207      0.120      0.904     -16.937      19.154
ma.L5          0.9920     77.646      0.013      0.990    -151.191     153.175
sigma2      1386.4913   1.08e+05      0.013      0.990   -2.11e+05    2.14e+05
==============================================================================
Ljung-Box (L1) (Q):              26.15   Jarque-Bera (JB):               12.36
Prob(Q):                          0.00   Prob(JB):                        0.00
Heteroskedasticity (H):           2.79   Skew:                            0.71
Prob(H) (two-sided):              0.00   Kurtosis:                        3.16
==============================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
ARMA Model
Formula: $Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$
C:\Users\hi\anaconda3\Lib\site-packages\statsmodels\tsa\statespace\sarimax.py:966: UserWarning: Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.
  warn('Non-stationary starting autoregressive parameters'
                               SARIMAX Results
==============================================================================
Dep. Variable:              Passengers   No. Observations:                 144
Model:                  ARIMA(2, 0, 2)   Log Likelihood               -698.172
Date:                 Mon, 06 Jan 2025   AIC                          1408.344
Time:                         20:33:11   BIC                          1426.162
Sample:                     01-01-1949   HQIC                         1415.584
                          - 12-01-1960
Covariance Type:                   opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const        280.3016     60.094      4.664      0.000     162.519     398.084
ar.L1          0.2540      0.223      1.137      0.256      -0.184       0.692
ar.L2          0.6510      0.192      3.397      0.001       0.275       1.027
ma.L1          1.1366      0.237      4.794      0.000       0.672       1.601
ma.L2          0.2127      0.173      1.232      0.218      -0.126       0.551
sigma2       930.1711    107.379      8.663      0.000     719.713    1140.630
==============================================================================
Ljung-Box (L1) (Q):               0.01   Jarque-Bera (JB):                1.35
Prob(Q):                          0.94   Prob(JB):                        0.51
Heteroskedasticity (H):           6.38   Skew:                            0.20
Prob(H) (two-sided):              0.00   Kurtosis:                        3.24
==============================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
ARIMA Model
Formula: $W_t = \phi_1 W_{t-1} + \cdots + \phi_p W_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$, where $W_t = \Delta^d Y_t$ is the series differenced $d$ times.
Key Parameters:
p: AR order.
d: Degree of differencing.
q: MA order.
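The differencing parameter d is the one piece of ARIMA that is easy to verify by hand. The sketch below applies first and second differences to a short illustrative array (the values are an assumption) and shows that a first difference can be undone with a cumulative sum, which is how forecasts of the differenced series are mapped back to the original scale.

```python
import numpy as np

# Short illustrative series (assumption)
y = np.array([112.0, 118.0, 132.0, 129.0, 121.0, 135.0])

# d = 1: Delta Y_t = Y_t - Y_{t-1}
d1 = np.diff(y)
# d = 2: difference the differenced series once more
d2 = np.diff(y, n=2)

# Undo the first difference: start value plus the running sum of changes
restored = np.r_[y[0], y[0] + np.cumsum(d1)]

print("d=1:", d1)
print("d=2:", d2)
print("restored equals original:", np.allclose(restored, y))
```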
                               SARIMAX Results
==============================================================================
Dep. Variable:              Passengers   No. Observations:                 144
Model:                  ARIMA(2, 1, 2)   Log Likelihood               -671.673
Date:                 Mon, 06 Jan 2025   AIC                          1353.347
Time:                         20:34:17   BIC                          1368.161
Sample:                     01-01-1949   HQIC                         1359.366
                          - 12-01-1960
Covariance Type:                   opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1          1.6850      0.020     83.059      0.000       1.645       1.725
ar.L2         -0.9548      0.017    -55.420      0.000      -0.989      -0.921
ma.L1         -1.8432      0.125    -14.795      0.000      -2.087      -1.599
ma.L2          0.9953      0.135      7.373      0.000       0.731       1.260
sigma2       665.9568    114.115      5.836      0.000     442.295     889.619
==============================================================================
Ljung-Box (L1) (Q):               0.30   Jarque-Bera (JB):                1.84
Prob(Q):                          0.59   Prob(JB):                        0.40
Heteroskedasticity (H):           7.38   Skew:                            0.27
Prob(H) (two-sided):              0.00   Kurtosis:                        3.14
==============================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
SARIMA Model
Definition: Extends ARIMA to handle seasonal data by adding seasonal terms.
Formula: $\mathrm{SARIMA}(p, d, q)(P, D, Q, s)$, where $s$ is the seasonal period.
Use Case: Seasonal data.
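The seasonal difference controlled by D and s can be sketched in a few lines. On a noise-free synthetic series with a linear trend and a 12-step cycle (an assumption), subtracting the value from s steps earlier removes the cycle and leaves a constant, and one further regular difference removes that constant too.

```python
import numpy as np

# Noise-free synthetic series (assumption): linear trend + 12-step cycle
t = np.arange(48)
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)

s = 12
# Seasonal difference (D = 1): Y_t - Y_{t-s}
seasonal_diff = y[s:] - y[:-s]
# Regular difference (d = 1) applied afterwards
both_diff = np.diff(seasonal_diff)

print("After seasonal differencing:", np.round(seasonal_diff[:3], 6))
print("After both differences:", np.round(both_diff[:3], 6))
```

The seasonal difference equals the trend slope times s, here 0.5 * 12 = 6, at every step.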
                                     SARIMAX Results
==========================================================================================
Dep. Variable:                          Passengers   No. Observations:                 144
Model:            SARIMAX(2, 1, 2)x(1, 1, [1], 12)   Log Likelihood               -503.024
Date:                             Mon, 06 Jan 2025   AIC                          1020.048
Time:                                     20:35:24   BIC                          1040.174
Sample:                                 01-01-1949   HQIC                         1028.226
                                      - 12-01-1960
Covariance Type:                               opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1          0.4441      0.388      1.145      0.252      -0.316       1.204
ar.L2          0.3287      0.303      1.086      0.278      -0.265       0.922
ma.L1         -0.8352      0.402     -2.079      0.038      -1.623      -0.048
ma.L2         -0.1385      0.385     -0.359      0.719      -0.894       0.617
ar.S.L12      -0.8799      0.274     -3.213      0.001      -1.417      -0.343
ma.S.L12       0.7843      0.359      2.183      0.029       0.080       1.489
sigma2       124.5105     14.050      8.862      0.000      96.974     152.047
==============================================================================
Ljung-Box (L1) (Q):               0.03   Jarque-Bera (JB):               12.42
Prob(Q):                          0.86   Prob(JB):                        0.00
Heteroskedasticity (H):           2.62   Skew:                            0.14
Prob(H) (two-sided):              0.00   Kurtosis:                        4.48
==============================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
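The four summaries above can be compared through their information criteria, which follow from the log-likelihood and the number of estimated parameters: AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The sketch below recomputes both from the printed values. Here k counts the coefficient rows including sigma2, and n is assumed to be 144 minus the observations consumed by differencing; that assumption is what reproduces the printed BIC values.

```python
import math

# (label, log-likelihood, k, assumed effective n, reported AIC, reported BIC)
models = [
    ("ARIMA(0,0,5)", -733.783, 7, 144, 1481.566, 1502.355),
    ("ARIMA(2,0,2)", -698.172, 6, 144, 1408.344, 1426.162),
    ("ARIMA(2,1,2)", -671.673, 5, 143, 1353.347, 1368.161),
    ("SARIMAX(2,1,2)x(1,1,1,12)", -503.024, 7, 131, 1020.048, 1040.174),
]

rows = []
for label, llf, k, n_eff, aic_reported, bic_reported in models:
    aic = 2 * k - 2 * llf                 # penalises each parameter by 2
    bic = k * math.log(n_eff) - 2 * llf   # penalises each parameter by ln(n)
    rows.append((label, aic, bic, aic_reported, bic_reported))
    print(f"{label:28s} AIC={aic:9.3f}  BIC={bic:9.3f}")

# Lower is better for both criteria
best = min(rows, key=lambda row: row[1])[0]
print("Lowest AIC:", best)
```

On these numbers the seasonal model has by far the lowest AIC and BIC. Because differencing changes the data each likelihood is computed on, comparisons across different d and D are indicative rather than strict.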
X = data[['Lag1', 'Lag2']]
y = data['Passengers']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_
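The cell that creates the `Lag1` and `Lag2` columns is not shown. A common way to build such features is `shift`, sketched below on a toy DataFrame (the data is an assumption). The split here is chronological, because `train_test_split` shuffles rows by default unless `shuffle=False` is passed, and shuffling lets later observations leak into the training set.

```python
import numpy as np
import pandas as pd

# Toy data (assumption): 20 consecutive values
df = pd.DataFrame({"Passengers": np.arange(100, 120, dtype=float)})

# Lag features: the value one and two steps earlier
df["Lag1"] = df["Passengers"].shift(1)
df["Lag2"] = df["Passengers"].shift(2)
df = df.dropna()  # the first two rows have no complete lag history

X = df[["Lag1", "Lag2"]]
y = df["Passengers"]

# Chronological 80/20 split: train on the past, test on the future
split = int(len(df) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]
print(len(X_train), len(X_test))
```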
# Create sequences
def create_sequences(data, seq_length):
X, y = [], []
for i in range(len(data) - seq_length):
X.append(data[i:i+seq_length])
y.append(data[i+seq_length])
return np.array(X), np.array(y)
seq_length = 3
X_train, y_train = create_sequences(train, seq_length)
X_test, y_test = create_sequences(test, seq_length)
# Predict
lstm_predictions = model.predict(X_test)
Epoch 1/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 6s 11ms/step - loss: 53723.1602
Epoch 2/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 49023.90627
Epoch 3/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 41025.78121
Epoch 4/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 34404.5312
Epoch 5/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 24417.7988
Epoch 6/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 12097.3975
Epoch 7/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 2751.4854
Epoch 8/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 2470.1809
Epoch 9/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 1776.7296
Epoch 10/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 1331.9918
Epoch 11/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 1232.1517
Epoch 12/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 1411.8165
Epoch 13/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 1300.9708
Epoch 14/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 1088.6345
Epoch 15/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 1309.1160
Epoch 16/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - loss: 1091.2700
Epoch 17/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 1050.2249
Epoch 18/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 1128.1163
Epoch 19/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 851.8344
Epoch 20/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 740.9769
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 587ms/step