TSF Guided Project Sample Business Report

The document outlines a project report focused on forecasting gold prices using time series analysis. It includes sections on data overview, exploratory data analysis, model building, and performance evaluation of various forecasting models. The objective is to develop a reliable forecasting tool to aid investors and stakeholders in navigating the volatile gold market.

Uploaded by

srigill89
Copyright
© © All Rights Reserved

Time Series Forecasting Guided Project Business Report

This file is meant for personal use by [email protected] only.


Sharing or publishing the contents in part or full is liable for legal action.
Content
S.no  Topics
1     Problem - Gold Price Forecasting
1.1   Data Overview
1.2   Exploratory Data Analysis
1.3   Decomposition
1.4   Data Pre-processing
1.5   Model Building - Original Data
1.6   Stationarity Check
1.7   Model Building - Stationary Data
1.8   Comparison of Model Performance & Forecasting
1.9   Insights & Recommendations

List of Tables
No  Name of the Table
1   Top 5 rows of the dataset
2   Bottom 5 rows of the dataset
3   Basic Information of dataset
4   Statistical Summary

List of Figures
No  Name of Figure
1   Trend of Price day wise
2   Trend of Price year wise
3   yearly boxplot
4   month boxplots

Gold Price Forecasting

Context
In the dynamic landscape of the financial markets, accurate predictions of gold prices are
crucial for investors, traders, and stakeholders. Predicting the future values of precious
metals is a significant challenge: the inherent volatility and complex dynamics that influence
gold and silver prices call for a reliable forecasting tool.

The goal is to provide a valuable asset for individuals navigating the financial markets, so that
they can approach the unpredictability of gold and silver prices with confidence and strategic
insight.

Objective
Predicting the future value of gold is important for investors, traders, and stakeholders, and
the challenge lies in the inherent volatility and complex dynamics that influence gold prices.
The objective is therefore to develop an accurate time series forecasting model that predicts
future gold prices from historical gold price data. The anticipated outcome is a robust
forecasting tool that empowers stakeholders to make informed decisions and navigate the
gold market with confidence.

Data Overview
The data frame has 2 columns and 2539 rows. Data in each row corresponds to the price of
gold on the specific date.

Table 1: Top 5 rows of the dataset


Table 2: Bottom 5 rows of the dataset


Table 3: Basic information of the dataset



The Date column is in datetime format, as required, and the Price column is in float format.

Missing Value Treatment



We noticed that there are 79 missing values in the price column.

Let's use forward filling to handle the missing values in the gold prices. Carrying the last observed
price forward keeps the series aligned in time and preserves its trend, which is vital for accurate forecasting.
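
Forward filling can be sketched as follows. This is a minimal illustration in plain Python, not the report's own code: the function name `forward_fill` and the sample prices are assumptions made here. In pandas, the equivalent operation is `Series.ffill()`.

```python
from typing import List, Optional


def forward_fill(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value (None) with the most recent observed value.

    Leading missing values stay None because there is nothing to carry forward.
    """
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical daily prices with gaps (for example, market holidays).
prices = [1250.0, None, 1262.5, None, None, 1270.0]
print(forward_fill(prices))
```

Forward filling uses only past information, so it does not leak future prices into earlier dates.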

Plot the Time Series to understand the behaviour of the data
● Plot the trend of price considering days

Fig 1: Trend of Price day wise

● Plot the trend of price considering years

Fig 2: Trend of Price year wise

Check the basic measures of descriptive statistics of the Time Series


Before examining the overall data statistics, let’s compress the data from daily to monthly prices,
focusing solely on month-end prices.

Now, let’s check the statistics on month level data.



Table 4: Statistical summary

● After compressing, we noticed that there are a total of 121 entries.

● The minimum price is 1068 dollars and the maximum price is 2011 dollars during 2013-2023.
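
The daily-to-monthly compression described above can be sketched as keeping the last available observation of each calendar month. The function name `month_end_prices` and the sample values below are illustrative assumptions; pandas offers `resample(...).last()` for the same purpose.

```python
from datetime import date
from typing import Dict, List, Tuple


def month_end_prices(daily: List[Tuple[date, float]]) -> List[Tuple[date, float]]:
    """Keep only the last available observation of each calendar month."""
    last_in_month: Dict[Tuple[int, int], Tuple[date, float]] = {}
    for day, price in sorted(daily):
        # Later dates overwrite earlier ones within the same (year, month).
        last_in_month[(day.year, day.month)] = (day, price)
    return [last_in_month[key] for key in sorted(last_in_month)]


# Hypothetical daily observations spanning two months.
daily = [
    (date(2013, 8, 29), 1407.0),
    (date(2013, 8, 30), 1396.1),
    (date(2013, 9, 2), 1391.9),
    (date(2013, 9, 30), 1327.0),
]
monthly = month_end_prices(daily)
print(monthly)
print(min(p for _, p in monthly), max(p for _, p in monthly))
```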
Exploratory Data Analysis

● Yearly Boxplot

Fig 3: yearly boxplot
We can see that the gold prices grew rapidly in the years 2019 & 2020.

● Monthly Boxplot

Fig 4: month boxplots

We can see that July and August show more growth than the rest of the months.
● Monthly price across years

Fig 5: Price trend in months

● Plot the Empirical Cumulative Distribution.



Fig 6: ECD Plot
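
For readers who want to reproduce the points behind such a plot, here is a minimal sketch of an empirical cumulative distribution. The function name `ecdf` and the sample prices are assumptions made for illustration; plotting is left out so the block runs without extra libraries.

```python
from typing import List, Tuple


def ecdf(values: List[float]) -> List[Tuple[float, float]]:
    """Pair each sorted value with its cumulative proportion i / n (i = 1..n)."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


# Hypothetical month-end prices.
points = ecdf([1200.0, 1100.0, 1300.0, 1250.0])
for price, share in points:
    print(price, share)
```

Each pair reads as "this share of observations is at or below this price", which is what the step curve in an ECDF plot traces.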

Decomposition

Additive Decomposition

An additive model suggests that the components are added together.


An additive model is linear: changes over time are consistently made by the same amount. The
seasonal correction is added to the trend.
A linear seasonality has the same frequency (width of the cycles) and amplitude (height of the cycles).


Fig 7: Additive decomposition

We observe that the trend and seasonality are clearly separated.

Multiplicative Decomposition

A multiplicative model suggests that the components are multiplied together.


A multiplicative model is non-linear. The seasonal correction is multiplied with the trend.
A non-linear seasonality has an increasing or decreasing frequency (width of the cycles) and / or
amplitude (height of the cycles) over time.
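
In symbols, the additive and multiplicative decompositions can be written as below. The notation is an assumption added here rather than the report's own: y_t is the observed price, T_t the trend, S_t the seasonal component and R_t the residual.

```latex
\begin{aligned}
\text{Additive:} \quad & y_t = T_t + S_t + R_t \\
\text{Multiplicative:} \quad & y_t = T_t \times S_t \times R_t
\end{aligned}
```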


Fig 8: Multiplicative decomposition



We observe that the trend and seasonality are clearly separated.

Data Pre-processing

Split the data into train and test

The data before the year 2021 is considered for training and the data from 2021 is considered for testing.

The size of the training dataset is (89,1)


The size of the testing dataset is (32,1)
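
The year-based split can be sketched as follows. The report only states that the data covers 2013-2023, so the August 2013 start used below is an assumption, chosen because 121 monthly points from that month reproduce the reported sizes of 89 and 32. The helper names and placeholder prices are hypothetical.

```python
from datetime import date
from typing import List, Tuple

Series = List[Tuple[date, float]]


def split_by_year(series: Series, first_test_year: int) -> Tuple[Series, Series]:
    """Put observations before `first_test_year` in train, the rest in test."""
    train = [(d, p) for d, p in series if d.year < first_test_year]
    test = [(d, p) for d, p in series if d.year >= first_test_year]
    return train, test


def monthly_index(start_year: int, start_month: int, periods: int) -> List[date]:
    """Generate `periods` consecutive months (day fixed to 1 for simplicity)."""
    months = []
    year, month = start_year, start_month
    for _ in range(periods):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


# Assumed span: 121 months starting August 2013, with placeholder prices.
series = [(d, 0.0) for d in monthly_index(2013, 8, 121)]
train, test = split_by_year(series, first_test_year=2021)
print(len(train), len(test))
```

Splitting by date rather than at random keeps the test period strictly after the training period, which is what a forecasting evaluation needs.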

First few rows of training data


Last few rows of training data


First few rows of testing data

Last few rows of testing data


Joint plot for training and testing data



Fig 9: Plot for training and testing data


Model Building - Original Data

Linear Regression
For this linear regression, we regress the ‘Price’ variable against the order of occurrence, that is, a
time index (1, 2, 3, ...). To do this, we add the time index to the training data before fitting the regression.

Linear Regression model is built on train data and tested on test data.
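
A minimal sketch of regressing price on a time index, using ordinary least squares written out by hand. The function names and sample prices are assumptions for illustration; the report does not show which library it used.

```python
from typing import List, Tuple


def fit_time_trend(prices: List[float]) -> Tuple[float, float]:
    """Ordinary least squares of price on the time index 1, 2, ..., n.

    Returns (intercept, slope).
    """
    n = len(prices)
    times = list(range(1, n + 1))
    mean_t = sum(times) / n
    mean_p = sum(prices) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, prices))
    var = sum((t - mean_t) ** 2 for t in times)
    slope = cov / var
    intercept = mean_p - slope * mean_t
    return intercept, slope


def predict_time_trend(intercept: float, slope: float, start: int, steps: int) -> List[float]:
    """Forecast `steps` values for time indices start, start + 1, ..."""
    return [intercept + slope * t for t in range(start, start + steps)]


# Hypothetical training prices that lie exactly on a line.
train = [100.0, 110.0, 120.0, 130.0]
intercept, slope = fit_time_trend(train)
# The test period continues the index where training stopped.
forecast = predict_time_trend(intercept, slope, start=len(train) + 1, steps=2)
print(intercept, slope, forecast)
```

The test-period index must continue from the end of the training index; restarting it at 1 would place the forecasts at the wrong point on the fitted line.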

Plot Linear Regression


Fig 10: Linear Regression



Evaluation - Linear Regression

For the linear regression forecast on the testing data, RMSE is 207.87
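
RMSE (root mean squared error) is the metric used to compare every model in this report. A minimal sketch of how it is computed follows; the function name and sample numbers are illustrative assumptions.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical test prices and forecasts.
print(rmse([1800.0, 1820.0, 1790.0], [1790.0, 1830.0, 1790.0]))
```

RMSE is expressed in the same unit as the series (dollars here), and squaring the errors penalises large misses more heavily than small ones.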

Moving Average (MA)
For the moving average model, we calculate rolling means (or moving averages) over different
intervals. The best interval is the one with the maximum accuracy (that is, the minimum error)
on the test data.

For Moving Average, we compute the rolling means over the entire data.


Evaluation - Moving Average

For 2 point Moving Average Model forecast on the Testing Data, RMSE is 27.945
For 4 point Moving Average Model forecast on the Testing Data, RMSE is 54.619
For 6 point Moving Average Model forecast on the Testing Data, RMSE is 70.894
For 9 point Moving Average Model forecast on the Testing Data, RMSE is 85.550
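
The rolling means compared above can be sketched as follows. The report does not show its code, so this assumes a trailing window that includes the current observation, which is how a rolling mean is commonly computed; the function name and sample values are hypothetical.

```python
from typing import List, Optional


def trailing_moving_average(values: List[float], window: int) -> List[Optional[float]]:
    """Rolling mean over the current point and the previous window - 1 points.

    The first window - 1 positions have no full window and are returned as None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    result: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            result.append(sum(chunk) / window)
    return result


prices = [10.0, 20.0, 30.0, 40.0]  # hypothetical monthly prices
print(trailing_moving_average(prices, window=2))
print(trailing_moving_average(prices, window=4))
```

If the report's averages were computed this way, each value is built partly from the actual price of the same period, which would help explain why shorter windows score a lower RMSE.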
Considering the 2-point moving average as the best moving average model,
let us plot all the models built so far and compare the time series plots.

Simple Exponential Smoothing model
Simple or single exponential smoothing (SES) is a method of time series forecasting used with
univariate data with no trend and no seasonal pattern. It needs a single parameter called alpha (α), also
known as the smoothing factor.
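
The SES recursion can be sketched in a few lines. Initialising the level with the first observation is a simplifying assumption made here; libraries may estimate the initial level instead. The function name and sample data are hypothetical.

```python
from typing import List


def simple_exponential_smoothing(values: List[float], alpha: float) -> List[float]:
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}, with level_0 = y_0.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    levels = [values[0]]
    for y in values[1:]:
        levels.append(alpha * y + (1 - alpha) * levels[-1])
    return levels


train = [100.0, 110.0, 105.0]  # hypothetical prices
levels = simple_exponential_smoothing(train, alpha=0.5)
# SES forecasts are flat: every future step equals the last smoothed level.
forecast = [levels[-1]] * 3
print(levels, forecast)
```

An alpha close to 1 makes the level follow the latest observation almost exactly, so the flat forecast sits near the last training value.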
Evaluation - Simple Exponential Smoothing

For Alpha = 0.995, the Simple Exponential Smoothing Model forecast on the Test Data has an RMSE of 89.652

Double Exponential Smoothing model

This method is known as Holt's trend model or second-order exponential smoothing. Double exponential
smoothing is used in time-series forecasting when the data has a linear trend but no seasonal pattern.
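
Holt's method can be sketched as two coupled recursions, one for the level and one for the trend. The initialisation below (first value as level, first difference as trend) is an assumption for illustration, as are the function names and sample data.

```python
from typing import List, Tuple


def holt_linear(values: List[float], alpha: float, beta: float) -> Tuple[float, float]:
    """Run Holt's linear trend recursions and return the final (level, trend).

    level_t = alpha * y_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
    trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Assumed initialisation: first value as level, first difference as trend.
    level, trend = values[0], values[1] - values[0]
    for y in values[1:]:
        previous_level = level
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level, trend


def holt_forecast(level: float, trend: float, steps: int) -> List[float]:
    """h-step-ahead forecasts: level + h * trend for h = 1..steps."""
    return [level + h * trend for h in range(1, steps + 1)]


train = [100.0, 110.0, 120.0, 130.0]  # hypothetical, perfectly linear prices
level, trend = holt_linear(train, alpha=0.9, beta=0.3)
forecast = holt_forecast(level, trend, steps=2)
print(forecast)
```

Unlike SES, the forecast is not flat: it extends the last estimated trend in a straight line.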

Evaluation - Double Exponential Smoothing

For alpha = 0.9 and beta = 0.3, the Double Exponential Smoothing Model forecast on the Test Data has an RMSE of 90.07

Plot for the predictions on the test set


Triple Exponential Smoothing
This method is the most advanced variation of exponential smoothing and is
used for time series forecasting when the data has both a linear trend and a seasonal pattern. The technique
applies exponential smoothing three times: level smoothing, trend smoothing, and seasonal smoothing.
A new smoothing parameter called gamma (γ) is added to control the influence of the seasonal
component.

Based on the decomposition done earlier, we consider multiplicative trend and additive seasonality to
build the triple exponential smoothing model.

Auto-Fit Parameters

{'smoothing_level': 0.9999999226306073,
 'smoothing_trend': 0.06316180190957783,
 'smoothing_seasonal': 7.73681700533732e-09,
 'damping_trend': nan,
 'initial_level': 1349.0144562281534,
 'initial_trend': 0.9948502864119877,
 'initial_seasons': array([ 48.55200673, 25.08694757, 1.13540709, -37.84535803,
        -57.35393271, -16.35187943, 12.77061125, 6.29862206,
        18.39548953, 6.72091278, 12.42347022, 24.86253388]),
 'use_boxcox': False,
 'lamda': None,
 'remove_bias': False}

Prediction on the test data


Plot for the predictions on the test set

Evaluation - Triple Exponential Smoothing
For Alpha=0.676, Beta=0.088, Gamma=0.323, Triple Exponential Smoothing Model forecast on the Test
Data, RMSE is 664.959

After fine-tuning the alpha, beta and gamma values, RMSE is lowest when alpha = 0.8, beta = 0.5 and
gamma = 0.5

Plot for the predictions on the test set

Check for stationarity of the whole time series data

Null and Alternate Hypothesis for the Augmented Dickey Fuller Test.
H0: The series is not stationary.
H1: The series is stationary

We see that the series in its original form is not stationary at alpha = 0.05 (p-value 0.878 > 0.05, so we
fail to reject H0).

Results of Dickey-Fuller Test:

Test Statistic                   -0.569598
p-value                           0.877705
#Lags Used                        1.000000
Number of Observations Used     119.000000
Critical Value (1%)              -3.486535
Critical Value (5%)              -2.886151
Critical Value (10%)             -2.579896
dtype: float64

Let us check for stationarity after taking first-order differencing.

The series is stationary after first-order differencing at alpha = 0.05 (p-value 8.87e-10 < 0.05, so we
reject H0).



Results of Dickey-Fuller Test:

Test Statistic                 -6.966787e+00
p-value                         8.872747e-10
#Lags Used                      0.000000e+00
Number of Observations Used     8.700000e+01
Critical Value (1%)            -3.507853e+00
Critical Value (5%)            -2.895382e+00
Critical Value (10%)           -2.584824e+00
dtype: float64
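
First-order differencing itself is a one-line transformation. The sketch below uses a hypothetical function name and made-up prices; it only illustrates the operation and does not run the Dickey-Fuller test.

```python
from typing import List


def difference(values: List[float], lag: int = 1) -> List[float]:
    """Differencing: y_t - y_{t-lag}. The result is `lag` items shorter."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


prices = [1300.0, 1310.0, 1305.0, 1320.0]  # hypothetical monthly prices
diffs = difference(prices)
print(diffs)
```

An 89-point series yields 88 differences, and a test with 0 lags then uses 87 of them. That matches the "Number of Observations Used" above, which suggests the differenced test was run on the training series rather than on all 121 points.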

ACF/PACF Plots
ACF Plot

PACF Plot

Seasonality is not visible.

Model Building - Stationary Data

Auto-ARIMA

The series is not stationary, and hence differencing would be required. For an Auto-ARIMA, we select the
best p and q parameters by looking for the lowest corresponding Akaike Information Criterion (AIC)
value.
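
For reference, the criterion being minimised has the standard definition below (not quoted from the report), where k is the number of estimated parameters and \hat{L} is the maximised likelihood of the fitted model:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

A lower AIC means a better trade-off between goodness of fit and model complexity, so adding parameters only helps if the likelihood improves enough to offset the 2k penalty.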


ARIMA (2,0,0) has the lowest AIC


Summary of ARIMA (2,0,0)


Plot for the predictions on the test set



For Auto ARIMA forecast on the test data, RMSE is 172.99


ARIMA
For an ARIMA, we calculate the best p and q parameters by looking at the ACF & PACF
plots.

p = 12
d = 1
q = 1


We can see that order (1,1,0) has the lowest AIC value
Summary of ARIMA(1,1,0) model

For this ARIMA(1,1,0) model, we difference the series once (d = 1) and regress the differenced series on
its own value at lag 1 (p = 1). Because q = 0, the model includes no moving-average term on past forecast
errors. The values of p and q are chosen by looking at the ACF and the PACF plots.
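
Written out, an ARIMA(1,1,0) model takes the form below. The notation is an assumption added here: y_t is the price, y'_t its first difference, c a constant, \phi_1 the autoregressive coefficient and \varepsilon_t the error term.

```latex
\begin{aligned}
y'_t &= y_t - y_{t-1} \\
y'_t &= c + \phi_1\, y'_{t-1} + \varepsilon_t
\end{aligned}
```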

Plot for the predictions on test set



For ARIMA forecast on the test data, RMSE is 89.18


Comparison of model performance

In this exercise we built several models, which gave us an idea of which model yields the least error on our
test set for this data. But in Time Series Forecasting, we need to be vigilant about one fact: once this
exercise is done, we need to build the model on the whole data. Remember, the training data that we used
to build the model stops well before the data ends. In order to forecast using any of the models built, we
need to build the model again (this time on the complete data) with the same parameters.

The model to be built on the whole data is Triple Exponential Smoothing, which shows
clear signs of a realistic trend compared to the other models. Even though the moving average models
give us the lowest RMSE, they are usually used for exploratory data analysis and are not suitable
for model building.

Plotting the Forecasted Price using Triple Exponential Smoothing Model

Insights and Recommendations

Insights:

● Among the models evaluated, Triple Exponential Smoothing emerges as the most suitable
choice for forecasting gold prices. Its ability to capture the underlying trends and patterns in the
data, as evidenced by the alpha, beta and gamma values, positions it as a reliable tool for
financial forecasting tasks.

● While the 2-point moving average model yields the lowest RMSE, it's crucial to consider its
limitations in terms of model complexity and predictive power. Despite its simplicity, ARIMA(1,1,0)
showcases competitive performance with an RMSE of 89.18. However, the triple exponential
smoothing model forecasts a realistic trend, making it a preferable choice for stakeholders
seeking accurate and robust forecasts.

● By prioritizing the triple exponential smoothing model, stakeholders demonstrate a
commitment to adopting sophisticated forecasting techniques tailored to the complexities of the
gold market. This strategic approach not only enhances decision-making processes but also
equips stakeholders with actionable insights to navigate market volatility effectively.

Recommendations:
● Adoption of Triple Exponential Smoothing Model: Based on the insights gathered, it is
recommended that the company adopts the Triple Exponential Smoothing model for forecasting
gold prices. This model has demonstrated superior performance in capturing underlying trends
and patterns in the data, making it a reliable tool for predicting future gold prices with accuracy
and precision.

● Continuous Monitoring and Evaluation: Implement a robust system for continuous monitoring and
evaluation of the forecasting models. Regularly assess the performance metrics, such as RMSE,
MAE, and MAPE, to ensure that the chosen model remains effective and reliable in predicting gold
prices amidst changing market dynamics and conditions.

● Integration of External Factors: Consider integrating external factors and market indicators into
the forecasting model to enhance its predictive power and accuracy. Factors such as economic
indicators, geopolitical events, and global market trends can significantly impact gold prices. By
incorporating these variables into the model, stakeholders can gain deeper insights into the
factors driving gold price movements and make more informed decisions accordingly.
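
The monitoring metrics named in the recommendations, MAE and MAPE, can be sketched alongside RMSE as follows. The function names and sample numbers are illustrative assumptions, not values from the report.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [2000.0, 1600.0]     # hypothetical observed prices
predicted = [1900.0, 1680.0]  # hypothetical forecasts
print(mae(actual, predicted), mape(actual, predicted))
```

MAE reports the typical miss in dollars, while MAPE expresses it relative to the price level, which makes it easier to compare accuracy across periods with very different prices.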
