Project - Time Series Forecasting (Sparkling - CSV) & (Rose - CSV)
(Sparkling.csv) & (Rose.csv)
Problem:
For this assignment, sales data for different types of wine in the 20th century are to be
analysed. Both datasets come from the same company but cover different wines. As an analyst
at ABC Estate Wines, you are tasked with analysing and forecasting wine sales in the 20th century.
1. Read the data as an appropriate Time series data and plot the data
A time series is a sequence of observations recorded at regular time intervals.
Sparkling Data
Import libraries
Read the data
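A minimal sketch of the read step, assuming Sparkling.csv and Rose.csv sit in the working directory:

import pandas as pd

# Read both monthly sales series; YearMonth holds strings such as "1980-01"
sparkling_df = pd.read_csv('Sparkling.csv')
rose_df = pd.read_csv('Rose.csv')

print(sparkling_df.head())
print(sparkling_df.tail())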
Head and Tail of the data
Head of the data:
   YearMonth  Sparkling
0    1980-01       1686
1    1980-02       1591
2    1980-03       2304
3    1980-04       1712
4    1980-05       1471

Tail of the data:
     YearMonth  Sparkling
182    1995-03       1897
183    1995-04       1862
184    1995-05       1670
185    1995-06       1688
186    1995-07       2031
2. Perform appropriate Exploratory Data Analysis to understand the data and also
perform decomposition.
Plot graph
Timestamp
   YearMonth  Sparkling  Time_Stamp
0    1980-01       1686  2013-01-31
1    1980-02       1591  2013-02-28
2    1980-03       2304  2013-03-31
3    1980-04       1712  2013-04-30
4    1980-05       1471  2013-05-31
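A minimal sketch of attaching a month-end timestamp index with pd.date_range (assuming the index should start at the first observation, January 1980; the printed output above shows a differently specified start date):

# Attach a month-end timestamp to each monthly observation
sparkling_df['Time_Stamp'] = pd.date_range(start='1980-01-31',
                                           periods=len(sparkling_df), freq='M')
sparkling_df = sparkling_df.set_index('Time_Stamp')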
Yearly plot
Monthly Plot
2) Irregular remainder (random): the residual left in the series after removal of the trend and seasonal
components. For an additive decomposition, the remainder is calculated as R_t = Y_t − T_t − S_t
(and as R_t = Y_t / (T_t × S_t) for a multiplicative decomposition).
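A minimal sketch of the decomposition using statsmodels (assuming the monthly series sits in sparkling_df['Sparkling'] with a datetime index, as built above):

from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt

# Additive decomposition: remainder = observed - trend - seasonal
decomposition = seasonal_decompose(sparkling_df['Sparkling'],
                                   model='additive', period=12)
decomposition.plot()
plt.show()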
3. Split the data into training and test. The test data should start in 1991.
The test data for both Sparkling and Rose wine sales starts in 1991; a sketch of the split follows.
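A minimal sketch of the date-based split (assuming the DatetimeIndex built above):

# Everything before 1991 is training data; 1991 onwards is test data
train = sparkling_df[sparkling_df.index < '1991-01-01']
test = sparkling_df[sparkling_df.index >= '1991-01-01']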
4. Build various exponential smoothing models on the training data and evaluate the
models using RMSE on the test data. Other models such as regression, naïve forecast
models, simple average models, etc. should also be built on the training data, and their
performance checked on the test data using RMSE.
Please do try to build as many models as possible and as many iterations of models as
possible with different parameters.
Exponential smoothing methods form a family of models described by the ETS (Error,
Trend, Seasonality) notation, where each component can be none (N), additive (A), additive
damped (Ad), multiplicative (M) or multiplicative damped (Md).
Simple exponential smoothing is suitable for forecasting data with no clear trend or seasonal pattern.
In Single Exponential Smoothing, the forecast at time (t + 1) is given by (Winters, 1960):
F_{t+1} = α Y_t + (1 − α) F_t
The parameter α is called the smoothing constant and its value lies between 0 and 1. Since the
model uses only one smoothing constant, it is called Single Exponential Smoothing.
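For example, with α = 0.3, the latest observation Y_t = 100 and the previous forecast F_t = 110, the next forecast is F_{t+1} = 0.3 × 100 + 0.7 × 110 = 107.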
The Rose and Sparkling datasets give monthly wine sales for the 20th century, from 1980 to 1995.
Training and test time instances (identical for the Sparkling and Rose sets):

Training time instances:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56,
 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108,
 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122,
 123, 124, 125, 126, 127, 128, 129, 130]

Test time instances:
[43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78,
 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
 97, 98, 99]

We see that we have successfully generated the numerical time instance order for both the
training and test sets. Now we will add these values to the training and test sets.

Sparkling - First few rows of Training Data:
Time_Stamp  YearMonth  Sparkling  time
2013-01-31    1980-01       1686     1
2013-02-28    1980-02       1591     2
2013-03-31    1980-03       2304     3
2013-04-30    1980-04       1712     4
2013-05-31    1980-05       1471     5

Sparkling - Last few rows of Training Data:
Time_Stamp  YearMonth  Sparkling  time
NaT           1990-06       1457   126
NaT           1990-07       1899   127
NaT           1990-08       1605   128
NaT           1990-09       2424   129
NaT           1990-10       3116   130

Sparkling - First few rows of Test Data:
Time_Stamp  YearMonth  Sparkling  time
NaT           1990-11       4286    43
NaT           1990-12       6047    44
NaT           1991-01       1902    45
NaT           1991-02       2049    46
NaT           1991-03       1874    47

Rose - First few rows of Training Data:
Time_Stamp   Rose  time
2013-01-31  112.0     1
2013-02-28  118.0     2
2013-03-31  129.0     3
2013-04-30   99.0     4
2013-05-31  116.0     5

Rose - Last few rows of Training Data:
Time_Stamp   Rose  time
NaT          76.0   126
NaT          78.0   127
NaT          70.0   128
NaT          83.0   129
NaT          65.0   130

Rose - First few rows of Test Data:
Time_Stamp   Rose  time
NaT         110.0    43
NaT         132.0    44
NaT          54.0    45
NaT          55.0    46
NaT          66.0    47

Rose - Last few rows of Test Data:
Time_Stamp   Rose  time
NaT          45.0    95
NaT          52.0    96
NaT          28.0    97
NaT          40.0    98
NaT          62.0    99

Now that our training and test data have been modified, let us go ahead and use
LinearRegression to build the model on the training data and test the model on the test data.
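A minimal sketch of the regression-on-time model (the 'time' and 'Sparkling' column names follow the tables above; the forecast column name RegOnTime is illustrative):

from sklearn.linear_model import LinearRegression

# Fit a linear trend on the numerical time instances
lr = LinearRegression()
lr.fit(train[['time']], train['Sparkling'])

# Predict over the test-period time instances
test['RegOnTime'] = lr.predict(test[['time']])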
We can calculate the RMSE using the helper function mean_squared_error() from the
scikit-learn library, which calculates the mean squared error between a list of expected
values (the test set) and the list of predictions. We can then take the square root of this
value to give us an RMSE score.
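A minimal sketch of that calculation (column names follow the regression sketch above):

import numpy as np
from sklearn.metrics import mean_squared_error

# RMSE = square root of the mean squared error between actuals and forecasts
rmse = np.sqrt(mean_squared_error(test['Sparkling'], test['RegOnTime']))
print('RegressionOnTime RMSE: %.6f' % rmse)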
## Test Data - RMSE and MAPE
Test RMSE
RegressionOnTime 1374.550202
For this particular naive model, we say that the prediction for tomorrow is the same as the value
for today, and the prediction for the day after tomorrow is the same as for tomorrow; since the
prediction for tomorrow equals today's value, the prediction for the day after tomorrow is also today's value.
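A minimal sketch of the naive forecast (the last training observation, 3116 for Sparkling and 65.0 for Rose, is repeated over the whole test horizon):

# Every test-period prediction equals the final training value
naive_forecast = pd.Series(train['Sparkling'].iloc[-1],
                           index=test.index, name='naive')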
Timestamps (Sparkling):
Time_Stamp
NaT    3116
NaT    3116
NaT    3116
NaT    3116
NaT    3116
Name: naive, dtype: int64

Timestamps (Rose):
Time_Stamp
NaT    65.0
NaT    65.0
NaT    65.0
NaT    65.0
NaT    65.0
Name: naive, dtype: float64
Test RMSE
NaiveModel 1496.444629
For the Moving Average models, we compute trailing averages over the entire data; see the sketch below.
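A minimal sketch of the trailing moving averages (computed on the full series; the test-period rows of each rolling mean are what get scored below):

# Trailing moving averages over windows of 2, 4, 6 and 9 months
for window in [2, 4, 6, 9]:
    sparkling_df['%dpointMA' % window] = (
        sparkling_df['Sparkling'].rolling(window).mean()
    )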
Model evaluation (done only on the test data):
For 2 point Moving Average Model forecast, RMSE is 811.179
For 4 point Moving Average Model forecast, RMSE is 1184.213
For 6 point Moving Average Model forecast, RMSE is 1337.201
For 9 point Moving Average Model forecast, RMSE is 1422.653
Test RMSE
RegressionOnTime 1374.550202
NaiveModel       1496.444629

Test RMSE
SimpleAverageModel          1368.746717
2pointTrailingMovingAverage  811.178937
4pointTrailingMovingAverage 1184.213295
6pointTrailingMovingAverage 1337.200524
9pointTrailingMovingAverage 1422.653281
We have the Sparkling wine sales data from Jan 1980 to Jul 1995.
Split the data into train and test in the ratio 70:30. Use the Single Exponential Smoothing method to
forecast sales on the test data. Calculate the values of RMSE and MAPE. Plot the forecasted values
along with the original values.
Forecasts are calculated using weighted averages, where the weights decrease exponentially as
observations come from further in the past; the smallest weights are associated with the oldest
observations.
So essentially we have a weighted moving average with two weights: α and 1 − α.
As we can see, 1 − α is multiplied by the previous forecast F_t, which makes the expression
recursive, and this is why the method is called exponential. The forecast at time t + 1 is a
weighted average between the most recent observation Y_t and the most recent forecast F_t.
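A minimal sketch of fitting Single Exponential Smoothing with statsmodels (the parameter dictionaries below are in this library's format):

from statsmodels.tsa.api import SimpleExpSmoothing

# Fit SES on the training series; statsmodels optimises alpha by default
ses_model = SimpleExpSmoothing(train['Sparkling']).fit()
print(ses_model.params)  # includes the fitted smoothing_level (alpha)

# Flat forecast over the test horizon
ses_forecast = ses_model.forecast(steps=len(test))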
Sparkling Data:
Parameters
{'smoothing_level': 0.0,
'smoothing_slope': nan,
'smoothing_seasonal': nan,
'damping_slope': nan,
'initial_level': 2361.2692421307943,
'initial_slope': nan,
'initial_seasons': array([], dtype=float64),
'use_boxcox': False,
'lamda': None,
'remove_bias': False}
Test data:
YearMonth Sparkling predict
Time_Stamp
NaT 1990-11 4286 NaN
NaT 1990-12 6047 NaN
NaT 1991-01 1902 NaN
NaT 1991-02 2049 NaN
NaT 1991-03 1874 NaN
Rose data:
Parameters
{'smoothing_level': 0.10272096910328038,
'smoothing_slope': nan,
'smoothing_seasonal': nan,
'damping_slope': nan,
'initial_level': 134.26283319141947,
'initial_slope': nan,
'initial_seasons': array([], dtype=float64),
'use_boxcox': False,
'lamda': None,
'remove_bias': False}
Test data
Rose predict
Time_Stamp
NaT 110.0 NaN
NaT 132.0 NaN
NaT 54.0 NaN
NaT 55.0 NaN
NaT 66.0 NaN
HOLT METHOD
Holt extended simple exponential smoothing to allow forecasting of data with a trend. It is nothing
more than exponential smoothing applied to both the level (the average value in the series) and the
trend. To express this in mathematical notation we now need three equations: one for the level, one
for the trend, and one to combine the level and trend to get the expected forecast ŷ.
Using the Holt-Winters method will be the best option among the rest of the models because of the
seasonality factor. The Holt-Winters seasonal method comprises the forecast equation and three
smoothing equations: one for the level ℓ_t, one for the trend b_t and one for the seasonal component
denoted by s_t, with smoothing parameters α, β and γ.
ADDITIVE
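A minimal sketch of the additive Holt-Winters model with statsmodels (seasonal_periods=12 for monthly data):

from statsmodels.tsa.api import ExponentialSmoothing

# Additive trend + additive seasonality, 12-month seasonal period
hw_add = ExponentialSmoothing(train['Sparkling'], trend='add',
                              seasonal='add',
                              seasonal_periods=12).fit()
hw_forecast = hw_add.forecast(steps=len(test))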
5. Check for the stationarity of the data on which the model is being built using
appropriate statistical tests, and also mention the hypothesis for the statistical test. If the
data is found to be non-stationary, take appropriate steps to make it stationary. Check the
new data for stationarity and comment. Note: stationarity should be checked at alpha =
0.05.
Augmented Dickey-Fuller test
Statistical tests make strong assumptions about the data. They can only be used to inform the
degree to which a null hypothesis can be rejected or fail to be rejected. The result must be
interpreted for a given problem to be meaningful. They can provide a quick check and
confirmatory evidence that your time series is stationary or non-stationary. The Augmented
Dickey-Fuller test is a type of statistical test called a unit root test. The intuition behind a unit root
test is that it determines how strongly a time series is defined by a trend. There are a number of
unit root tests, and the Augmented Dickey-Fuller test may be one of the more widely used. It uses an
autoregressive model and optimizes an information criterion across multiple different lag values.
The null hypothesis of the test is that the time series can be represented by a unit root, i.e. that it is
not stationary (has some time-dependent structure). The alternate hypothesis (rejecting the null
hypothesis) is that the time series is stationary.
• Null Hypothesis (H0): If it fails to be rejected, it suggests the time series has a unit root,
meaning it is non-stationary. It has some time-dependent structure.
• Alternate Hypothesis (H1): The null hypothesis is rejected; it suggests the time series does not
have a unit root, meaning it is stationary. It does not have time-dependent structure.
We interpret this result using the p-value from the test. A p-value below a threshold (such as 5%
or 1%) suggests we reject the null hypothesis (stationary); otherwise a p-value above the
threshold suggests we fail to reject the null hypothesis (non-stationary).
• p-value > 0.05: Fail to reject the null hypothesis (H0); the data has a unit root and is non-stationary.
• p-value <= 0.05: Reject the null hypothesis (H0); the data does not have a unit root and is stationary.
Applying the ADF test at alpha = 0.05 to our two series:
• Rose: p-value > 0.05, so we fail to reject the null hypothesis (H0); the data has a unit root and
is non-stationary.
• Sparkling: p-value <= 0.05, so we reject the null hypothesis (H0); the data does not have a unit
root and is stationary.
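A minimal sketch of running the test with statsmodels (adfuller returns the test statistic first and the p-value second):

from statsmodels.tsa.stattools import adfuller

# ADF test on the raw series at alpha = 0.05
result = adfuller(sparkling_df['Sparkling'].dropna())
print('ADF statistic: %f' % result[0])
print('p-value: %f' % result[1])

# If non-stationary, difference once and re-test
result_diff = adfuller(sparkling_df['Sparkling'].diff().dropna())
print('p-value after differencing: %f' % result_diff[1])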
6. Build an automated version of the ARIMA/SARIMA model in which the parameters are
selected using the lowest Akaike Information Criteria (AIC) on the training data and
evaluate this model on the test data using RMSE.
ARIMA is a very popular statistical method for time series forecasting. ARIMA stands for Auto-
Regressive Integrated Moving Average. ARIMA models work on the following assumptions:
• The data series is stationary, which means that the mean and variance should not vary with time.
A series can be made stationary by using log transformation or by differencing the series.
• The data provided as input must be a univariate series, since ARIMA uses the past values to
predict the future values.
ARIMA has three components: AR (autoregressive term), I (differencing term) and MA (moving
average term). Let us understand each of these components:
• The AR term refers to the past values used for forecasting the next value. The AR term is defined
by the parameter 'p' in ARIMA. The value of 'p' is determined using the PACF plot.
• The MA term defines the number of past forecast errors used to predict the future values. The
parameter 'q' in ARIMA represents the MA term. The ACF plot is used to identify the correct 'q' value.
• The order of differencing 'd' specifies the number of times the differencing operation is performed
on the series to make it stationary. Tests like ADF and KPSS can be used to determine whether the
series is stationary and help in identifying the d value.
Auto ARIMA takes into account the AIC and BIC values generated by the candidate fits. Akaike's
information criterion (AIC) compares the quality of a set of statistical models to each other.
Hyperparameter tuning for ARIMA: in order to choose the best combination of the above
parameters, we'll use a grid search, as sketched below. The best combination of parameters will give
the lowest Akaike information criterion (AIC) score. AIC tells us the quality of statistical models for
a given set of data.
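A minimal sketch of such a grid search with statsmodels SARIMAX (the parameter ranges are illustrative and kept small):

import itertools
import statsmodels.api as sm

p = d = q = range(0, 2)
pdq = list(itertools.product(p, d, q))
seasonal_pdq = [(P, D, Q, 12) for P, D, Q in itertools.product(p, d, q)]

# Keep the (order, seasonal_order) pair with the lowest AIC
best_aic, best_order = float('inf'), None
for order in pdq:
    for seasonal_order in seasonal_pdq:
        try:
            fit = sm.tsa.statespace.SARIMAX(
                train['Sparkling'], order=order,
                seasonal_order=seasonal_order,
                enforce_stationarity=False,
                enforce_invertibility=False).fit(disp=False)
            if fit.aic < best_aic:
                best_aic, best_order = fit.aic, (order, seasonal_order)
        except Exception:
            continue

print(best_order, best_aic)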
The above plot shows that our predicted values catch up to the observed values in the dataset.
Our forecasts seem to align with the ground truth very well, and the spike up through 1995-96
appears as expected. RMSE is also low in this case. So the final ARIMA model can be represented
as SARIMAX. This is the best we can do with ARIMA, so let's try another model to see whether we
can decrease the RMSE.
In this case ARIMA performed the best.
7. Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the
training data and evaluate this model on the test data using RMSE.
The seasonal part of an AR or MA model will be seen in the seasonal lags of the PACF and ACF.
For example, a seasonal MA model will show:
• a spike at the seasonal lag in the ACF but no other significant spikes;
• exponential decay in the seasonal lags of the PACF.
Similarly, a seasonal AR model will show:
• exponential decay in the seasonal lags of the ACF;
• a single significant spike at the seasonal lag in the PACF.
In considering the appropriate seasonal orders for a seasonal ARIMA model, restrict attention to the
seasonal lags. The modelling procedure is almost the same as for non-seasonal data, except that we
need to select seasonal AR and MA terms as well as the non-seasonal components of the model. To
determine a proper model for a given time series, it is necessary to carry out ACF and PACF
analysis. These statistical measures reflect how the observations in a time series are related to each
other. For modeling and forecasting purposes it is often useful to plot the ACF and PACF against
consecutive time lags. These plots help in determining the order of the AR and MA terms. Below we
give their mathematical definitions: for a time series { x(t), t = 0, 1, 2, ... }, the autocovariance
[21, 23] at lag k is defined as:
γ_k = Cov(x_t, x_{t+k}) = E[(x_t − μ)(x_{t+k} − μ)]
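A minimal sketch of the ACF/PACF plots used to read off q and p (differencing first, since the plots should be drawn on a stationary series):

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# ACF suggests the MA order q; PACF suggests the AR order p
plot_acf(train['Sparkling'].diff().dropna(), lags=30)
plot_pacf(train['Sparkling'].diff().dropna(), lags=30)
plt.show()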