PPP Models - ARIMA & NARNN - Ipynb - Colaboratory

Uploaded by

sethantanah

1/3/24, 7:22 PM PPP Models - ARIMA & NARNN.ipynb - Colaboratory

Petroleum Price Prediction Models

The following hybrid models will be developed:

GARCH with NARNN, and vice versa

ARIMA with NARNN, and vice versa

SARIMA with NARNN, and vice versa

Exponential Smoothing with NARNN, and vice versa

NB: In each hybrid, the residuals from the base model are used to train the second model.

The existing standalone models (GARCH, ARIMA, SARIMA and Exponential Smoothing) serve as
benchmarks.
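
The hybrid idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the notebook's implementation: a naive last-value forecast stands in for the base model (ARIMA, GARCH, etc.), and a one-lag least-squares fit stands in for the NARNN. The helper names `naive_base_fit`, `fit_ar1` and `hybrid_next_forecast` are hypothetical.

```python
# Hybrid forecast sketch: final forecast = base forecast + predicted residual.
# All helpers below are hypothetical stand-ins, not notebook code.

def naive_base_fit(series):
    """In-sample one-step forecasts from a last-value (random walk) base model."""
    return [series[i - 1] for i in range(1, len(series))]

def fit_ar1(values):
    """Least-squares slope for values[t] ~ phi * values[t-1] (no intercept)."""
    num = sum(a * b for a, b in zip(values[1:], values[:-1]))
    den = sum(b * b for b in values[:-1])
    return num / den if den else 0.0

def hybrid_next_forecast(series):
    base_fitted = naive_base_fit(series)
    # Residuals of the base model are the training data for the second model.
    residuals = [y - f for y, f in zip(series[1:], base_fitted)]
    phi = fit_ar1(residuals)
    base_next = series[-1]            # base model's next-step forecast
    resid_next = phi * residuals[-1]  # second model's forecast of the next residual
    return base_next + resid_next, base_next, resid_next

# First five Gasoline prices from the dataset used in this notebook
prices = [4.89, 4.90, 5.15, 5.19, 5.20]
forecast, base_part, resid_part = hybrid_next_forecast(prices)
print(round(forecast, 4), round(base_part, 4), round(resid_part, 4))
```

The point of the design is that the second model only has to explain what the base model left unexplained; the two forecasts are then added together.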

# Load libraries
import numpy as np                # Mathematical computations
import pandas as pd               # Data manipulation
import matplotlib.pyplot as plt   # Data visualization
import seaborn as sns             # Advanced data visualizations
import plotly.express as px       # Advanced and interactive data visualizations

Load Dataset and Preprocessing

# Pick the data from the local machine
from google.colab import files
file_path = files.upload()

petroleum_products.xlsx (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) - 10200 bytes, last modified: 12/21/2023 - 100% done
Saving petroleum_products.xlsx to petroleum_products.xlsx

# Load the dataset into a usable DataFrame, indexed by date
df = pd.read_excel('petroleum_products.xlsx', index_col='Date')
df.head()

Date        GASOLINE  DIESEL   LPG
2019-01-01      4.89    4.90  4.99
2019-02-01      4.90    4.91  4.94
2019-03-01      5.15    5.16  5.01
2019-04-01      5.19    5.19  5.39
2019-05-01      5.20    5.21  4.97

# Rename columns for convenience
df = df.rename(columns={'GASOLINE': 'Gasoline', 'DIESEL': 'Diesel'})
df.columns

Index(['Gasoline', 'Diesel', 'LPG'], dtype='object')

Data Selection

# Select a product and model type to analyze
product = "Gasoline" # @param ["Gasoline", "Diesel
print(f"Selected Product: {product}")

Selected Product: Gasoline

ARIMA - NARNN Model

Estimate the Mean Equation

!pip install pmdarima
import pmdarima as pm
import statsmodels.api as sm
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

Collecting pmdarima
Downloading pmdarima-2.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manyl
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 8.6 MB/s eta 0:00:00
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.10/dist-packages (f
Requirement already satisfied: Cython!=0.29.18,!=0.29.31,>=0.29 in /usr/local/lib/python3
Requirement already satisfied: numpy>=1.21.2 in /usr/local/lib/python3.10/dist-packages (
Requirement already satisfied: pandas>=0.19 in /usr/local/lib/python3.10/dist-packages (f
Requirement already satisfied: scikit-learn>=0.22 in /usr/local/lib/python3.10/dist-packa
Requirement already satisfied: scipy>=1.3.2 in /usr/local/lib/python3.10/dist-packages (f
Requirement already satisfied: statsmodels>=0.13.2 in /usr/local/lib/python3.10/dist-pack
Requirement already satisfied: urllib3 in /usr/local/lib/python3.10/dist-packages (from p
Requirement already satisfied: setuptools!=50.0.0,>=38.6.0 in /usr/local/lib/python3.10/d
Requirement already satisfied: packaging>=17.1 in /usr/local/lib/python3.10/dist-packages
Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-p
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (f
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.10/dist-pac
Requirement already satisfied: patsy>=0.5.4 in /usr/local/lib/python3.10/dist-packages (f
Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from patsy
Installing collected packages: pmdarima
Successfully installed pmdarima-2.0.4

from statsmodels.tsa.stattools import adfuller

# Perform the ADF test (null hypothesis: the series has a unit root)
def perform_adf_test(ds):
    result = adfuller(ds)
    # Extract and print test statistics
    adf_statistic = result[0]
    p_value = result[1]
    critical_values = result[4]

    print(f'ADF Statistic: {adf_statistic}')
    print(f'p-value: {p_value}')
    print(f'Critical Values: {critical_values}')

    # Check the p-value against the significance level (e.g., 0.05)
    if p_value <= 0.05:
        print("Reject the null hypothesis. The time series is likely stationary.")
    else:
        print("Fail to reject the null hypothesis. The time series may not be stationary.")

perform_adf_test(df[product])

ADF Statistic: -0.4714985407350731


p-value: 0.8974593895763701
Critical Values: {'1%': -3.5506699942762414, '5%': -2.913766394626147, '10%': -2.59462404
Fail to reject the null hypothesis. The time series may not be stationary.

# Difference the data once and re-test for stationarity
differenced_ds = df[product].diff(1).dropna()
perform_adf_test(differenced_ds)

ADF Statistic: -6.54915636272955


p-value: 8.941192716390255e-09
Critical Values: {'1%': -3.5506699942762414, '5%': -2.913766394626147, '10%': -2.59462404
Reject the null hypothesis. The time series is likely stationary.
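
Because the series becomes stationary only after one difference, any forecast made on the differenced scale has to be cumulated to get back to price levels. A minimal pure-Python sketch of that round trip follows; the helper names `difference` and `undifference` are hypothetical, and the numbers are the first five Gasoline prices from the dataset preview.

```python
def difference(series):
    """First difference: d[t] = y[t] - y[t-1]; one observation is lost."""
    return [series[i] - series[i - 1] for i in range(1, len(series))]

def undifference(start_level, diffs):
    """Rebuild levels by cumulatively adding differences to a known starting level."""
    levels = []
    current = start_level
    for d in diffs:
        current += d
        levels.append(current)
    return levels

prices = [4.89, 4.90, 5.15, 5.19, 5.20]
diffs = difference(prices)                # 4 values from 5 prices
rebuilt = undifference(prices[0], diffs)  # recovers prices[1:] up to float rounding
print(len(diffs), [round(v, 2) for v in rebuilt])
```

The lost observation is why the differenced series in this notebook has 59 observations while the original series has 60.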

def find_arima_lags(data, max_ar=5, max_ma=5, figsize=(12, 3)):
    """Plot the ACF and PACF side by side to help choose AR and MA lags."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=figsize)

    plot_acf(data, ax=ax1)
    plot_pacf(data, ax=ax2)

    ax1.set_title('Autocorrelation Function (ACF)')
    ax2.set_title('Partial Autocorrelation Function (PACF)')

    plt.show()

find_arima_lags(differenced_ds)


# Automatic, non-seasonal order search on the differenced series
ar_model = pm.auto_arima(differenced_ds, seasonal=False)
print(ar_model.summary())

SARIMAX Results
==============================================================================
Dep. Variable: y No. Observations: 59
Model: SARIMAX(0, 0, 2) Log Likelihood -71.805
Date: Wed, 03 Jan 2024 AIC 149.611
Time: 17:19:11 BIC 155.844
Sample: 02-01-2019 HQIC 152.044
- 12-01-2023
Covariance Type: opg
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ma.L1 0.2512 0.068 3.721 0.000 0.119 0.384
ma.L2 -0.2579 0.151 -1.705 0.088 -0.554 0.039
sigma2 0.6648 0.042 15.793 0.000 0.582 0.747
===================================================================================
Ljung-Box (L1) (Q): 0.05 Jarque-Bera (JB): 1182.36
Prob(Q): 0.82 Prob(JB): 0.00
Heteroskedasticity (H): 23.10 Skew: 3.51
Prob(H) (two-sided): 0.00 Kurtosis: 23.77
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

from statsmodels.tsa.arima.model import ARIMA

# Fit an ARIMA(0, 1, 2) model on the original (undifferenced) series
order = (0, 1, 2)  # (p, d, q) order
arima_model = ARIMA(df[product], order=order)
arima_results = arima_model.fit()

# Display model summary
print(arima_results.summary())

SARIMAX Results
==============================================================================
Dep. Variable: Gasoline No. Observations: 60
Model: ARIMA(0, 1, 2) Log Likelihood -71.805
Date: Wed, 03 Jan 2024 AIC 149.611
Time: 19:06:35 BIC 155.844
Sample: 01-01-2019 HQIC 152.044
- 12-01-2023
Covariance Type: opg
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ma.L1 0.2512 0.068 3.721 0.000 0.119 0.384
ma.L2 -0.2579 0.151 -1.705 0.088 -0.554 0.039
sigma2 0.6648 0.042 15.793 0.000 0.582 0.747
===================================================================================

Ljung-Box (L1) (Q): 0.05 Jarque-Bera (JB): 1182.36
Prob(Q): 0.82 Prob(JB): 0.00
Heteroskedasticity (H): 23.10 Skew: 3.51
Prob(H) (two-sided): 0.00 Kurtosis: 23.77
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
/usr/local/lib/python3.10/dist-packages/statsmodels/tsa/base/tsa_model.py:473: ValueWarni
self._init_dates(dates, freq)
/usr/local/lib/python3.10/dist-packages/statsmodels/tsa/base/tsa_model.py:473: ValueWarni
self._init_dates(dates, freq)
/usr/local/lib/python3.10/dist-packages/statsmodels/tsa/base/tsa_model.py:473: ValueWarni
self._init_dates(dates, freq)
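
Written out, the fitted ARIMA(0, 1, 2) model corresponds to the equation below. This assumes the statsmodels convention in which moving-average terms enter with a plus sign, and reflects that no constant appears in the summary. The coefficients are the rounded values from the table; \varepsilon_t denotes the innovation at time t.

```latex
\Delta y_t = y_t - y_{t-1}
           = \varepsilon_t + 0.2512\,\varepsilon_{t-1} - 0.2579\,\varepsilon_{t-2},
\qquad \varepsilon_t \sim (0,\ \sigma^2),\quad \hat{\sigma}^2 \approx 0.6648
```

Note that the second MA coefficient has a p-value of 0.088 in the summary, so it is not significant at the 5% level.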

from statsmodels.tsa.stattools import kpss

# KPSS test on the ARIMA residuals (null hypothesis: the series is stationary)
kpss(arima_results.resid)

<ipython-input-32-5ae90626ffc6>:2: InterpolationWarning: The test statistic is outside of


look-up table. The actual p-value is greater than the p-value returned.

kpss(arima_results.resid)
(0.10495312679330766,
0.1,
10,
{'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739})
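
The KPSS result is read the opposite way from the ADF test: the KPSS null hypothesis is stationarity, so a statistic below the critical value means stationarity is not rejected. A small sketch of that decision rule, using the statistic and critical values printed above; the helper name `interpret_kpss` is hypothetical.

```python
def interpret_kpss(statistic, critical_values, level="5%"):
    """KPSS null is stationarity: reject only if the statistic exceeds the critical value."""
    crit = critical_values[level]
    if statistic > crit:
        return "reject stationarity"
    return "fail to reject stationarity"

# Values copied from the KPSS output for the ARIMA residuals
kpss_stat = 0.10495312679330766
kpss_crit = {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}

verdict = interpret_kpss(kpss_stat, kpss_crit)
print(verdict)  # fail to reject stationarity
```

Here the statistic is below even the 10% critical value, which is consistent with stationary residuals.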

NARNN Model

from tensorflow import keras
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.optimizers import Adam
from sklearn.preprocessing import MinMaxScaler

# Extract residuals from the ARIMA model
residuals = arima_results.resid

# Scale the residuals between 0 and 1 (scaling is currently disabled)
#scaler = MinMaxScaler(feature_range=(0, 1))
residuals_scaled = residuals.values.reshape(-1, 1) #scaler.fit_transform(residuals.values.reshape(-1, 1))

# Create a NARNN model
model = Sequential()
model.add(LSTM(units=100, activation='relu', return_sequences=True, input_shape=(residuals_scaled.shape[1
model.add(LSTM(units=100, activation='relu'))
model.add(Dense(units=1))

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error')

# Train the model
model.fit(residuals_scaled, residuals_scaled, epochs=100, batch_size=32)

Epoch 74/100
2/2 [==============================] - 0s 10ms/step - loss: 0.0034
Epoch 75/100
2/2 [==============================] - 0s 8ms/step - loss: 0.0034
Epoch 76/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0033
Epoch 77/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0033
Epoch 78/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0032
Epoch 79/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0031
Epoch 80/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0031
Epoch 81/100
2/2 [==============================] - 0s 10ms/step - loss: 0.0030
Epoch 82/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0030
Epoch 83/100
2/2 [==============================] - 0s 11ms/step - loss: 0.0029
Epoch 84/100
2/2 [==============================] - 0s 10ms/step - loss: 0.0029
Epoch 85/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0028
Epoch 86/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0028
Epoch 87/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0027
Epoch 88/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0027
Epoch 89/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0026
Epoch 90/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0026
Epoch 91/100
2/2 [==============================] - 0s 10ms/step - loss: 0.0026
Epoch 92/100
2/2 [==============================] - 0s 11ms/step - loss: 0.0025
Epoch 93/100
2/2 [==============================] - 0s 13ms/step - loss: 0.0025
Epoch 94/100
2/2 [==============================] - 0s 11ms/step - loss: 0.0024
Epoch 95/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0024
Epoch 96/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0024
Epoch 97/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0023
Epoch 98/100
2/2 [==============================] - 0s 12ms/step - loss: 0.0023
Epoch 99/100
2/2 [==============================] - 0s 11ms/step - loss: 0.0022
Epoch 100/100
2/2 [==============================] - 0s 9ms/step - loss: 0.0022
<keras.src.callbacks.History at 0x7adb4f36d6c0>

# Evaluate the model
model.evaluate(residuals_scaled, residuals_scaled)

2/2 [==============================] - 0s 11ms/step - loss: 0.0022


0.0021930031944066286
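
In the training and evaluation cells above, the same array is passed as both input and target, so the low loss may largely reflect the network learning to reproduce its input rather than forecasting ability. A nonlinear autoregressive setup would normally predict the residual at time t from earlier lags. A minimal sketch of building such lagged input/target pairs follows, assuming a hypothetical helper `make_lagged_pairs` and a lag count of 3 chosen only for illustration.

```python
def make_lagged_pairs(values, n_lags):
    """Return (X, y) where each row of X holds the n_lags values preceding y."""
    X, y = [], []
    for t in range(n_lags, len(values)):
        X.append(list(values[t - n_lags:t]))  # inputs: values at t-n_lags .. t-1
        y.append(values[t])                   # target: value at t
    return X, y

# Illustrative numbers only, not notebook output
resid = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0]
X, y = make_lagged_pairs(resid, n_lags=3)

for inputs, target in zip(X, y):
    print(inputs, "->", target)
```

For a Keras LSTM, `X` would then be converted to an array of shape (samples, n_lags, 1), and `y` to an array of shape (samples, 1).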

# Make predictions
predictions = model.predict(residuals_scaled)

# Rescale the predictions back to the original range
#predictions = scaler.inverse_transform(predictions)

# Work on a copy so the original price DataFrame is not modified
df2 = df.copy()
df2['Predictions'] = predictions.ravel()  # flatten the (n, 1) model output to 1-D
#df2 = pd.DataFrame({'Predictions': predictions.flatten()})

# Plot the actual and predicted residuals
plt.figure(figsize=(12, 6))
plt.plot(residuals, label='Actual Residuals (ARIMA)')
plt.plot(df2['Predictions'], label='Predicted Residuals (NARNN)')
plt.title('Actual vs Predicted Residuals')
plt.xlabel('Time')
plt.ylabel('Residuals')
plt.legend()
plt.show()

2/2 [==============================] - 0s 7ms/step

from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score, mean_absolute_percentage_error

# Calculate MSE
mse = mean_squared_error(residuals, df2['Predictions'])
print(f'MSE: {mse}')

# Calculate MAE
mae = mean_absolute_error(residuals, df2['Predictions'])
print(f'MAE: {mae}')

# Calculate MAPE (returned as a fraction, not a percentage)
mape = mean_absolute_percentage_error(residuals, df2['Predictions'])
print(f'MAPE: {mape}')

# Calculate R2
r2 = r2_score(residuals, df2['Predictions'])
print(f'R2: {r2}')


MSE: 0.0021930036840889827
MAE: 0.021088656323592293
MAPE: 0.07120554192303562
R2: 0.9978264000278362
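
For reference, the four metrics above can be written out by hand. The sketch below follows the usual definitions; the function name `regression_metrics` and the `eps` guard value are assumptions made for illustration, and the input numbers are not notebook output. MAPE is returned as a fraction, so a value of 0.0712 corresponds to about 7.12%.

```python
def regression_metrics(actual, predicted, eps=1e-12):
    """Compute MSE, MAE, MAPE (as a fraction) and R2 for two equal-length lists."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # eps guards against division by zero when an actual value is exactly 0
    mape = sum(abs(e) / max(abs(a), eps) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "MAE": mae, "MAPE": mape, "R2": r2}

# Illustrative values only
actual = [1.0, 2.0, 4.0]
predicted = [1.0, 2.5, 3.5]
m = regression_metrics(actual, predicted)
print(m)
```

Because MAPE divides by the actual values, it can become unstable when those values are close to zero, which is often the case for model residuals.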
