ISYE6501 Homework 4
2024-02-06
rm(list=ls())
setwd("~/Georgia Tech - OMSA/ISYE 6501")
if(!require(pacman)) install.packages("pacman")
library(pacman)
p_load(tinytex, tidyverse)
Question 7.1
Describe a situation or problem from your job, everyday life, current events, etc.,
for which exponential smoothing would be appropriate. What data would you need?
Would you expect the value of α (the first smoothing parameter) to be closer to 0 or 1, and why?
As a senior webmaster, my role revolves around ensuring the smooth functioning and optimal performance
of websites. One crucial aspect of my responsibilities involves predicting website traffic trends to anticipate
user demands and allocate resources effectively. To achieve this, I often turn to exponential smoothing, a
powerful forecasting technique that leverages historical data to generate accurate predictions. By applying
exponential smoothing, I can discern underlying patterns and trends in website traffic, allowing me to make
informed decisions regarding server capacity, content management, and marketing strategies.
Exponential smoothing works by assigning exponentially decreasing weights to past observations, giving more
prominence to recent data while gradually diminishing the influence of older observations. This approach
enables me to capture short-term fluctuations and seasonality in website traffic while maintaining stability
and adaptability in the forecast. However, the effectiveness of exponential smoothing hinges on selecting
an appropriate smoothing parameter, which dictates the balance between responsiveness and smoothness in
the forecast. Because daily website traffic contains a great deal of day-to-day noise, I would expect the value
of α to be closer to 0, so that the forecast leans on the smoothed history rather than overreacting to any
single day's observation. By carefully tuning this parameter, I can tailor the forecasting model to the specific
characteristics of the website's traffic patterns and business objectives.
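To make the weighting concrete: simple exponential smoothing computes S_t = α·x_t + (1 − α)·S_{t−1}, so a small α keeps the forecast stable against noisy days. Below is a minimal sketch in R on made-up daily visit counts; the visits series is purely illustrative, not real traffic data.

set.seed(42)
#Hypothetical daily visit counts: a slow drift plus heavy day-to-day noise
visits <- ts(1000 + cumsum(rnorm(60, 0, 5)) + rnorm(60, 0, 50))
#beta = FALSE and gamma = FALSE disable trend and seasonality, leaving
#simple exponential smoothing; alpha = NULL lets HoltWinters estimate it
visits_HW <- HoltWinters(visits, alpha = NULL, beta = FALSE, gamma = FALSE)
visits_HW$alpha  #for a noisy series like this, expect a value closer to 0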
Question 7.2
Using the 20 years of daily high temperature data for Atlanta (July through October) from
Question 6.2 (file temps.txt), build and use an exponential smoothing model to help make a
judgment of whether the unofficial end of summer has gotten later over the 20 years. (Part of the
point of this assignment is for you to think about how you might use exponential smoothing to
answer this question. Feel free to combine it with other models if you’d like to. There’s certainly
more than one reasonable approach.)
Note: in R, you can use either HoltWinters (simpler to use) or the smooth package’s es function
(harder to use, but more general). If you use es, the Holt-Winters model uses model=”AAM” in
the function call (the first and second constants are used “A”dditively, and the third (seasonality)
is used “M”ultiplicatively; the documentation doesn’t make that clear).
set.seed(1)
#Load Atlanta temperature data
file_path <- "~/Georgia Tech - OMSA/ISYE 6501/hw4/data 7.2/temps.txt"
temps <- read.table(file_path, stringsAsFactors = FALSE, header=TRUE)
#head(temps)
#summary(temps)
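The step that builds the temps_ts time series used below appears to be missing from this extract; a minimal reconstruction, assuming the first column of temps.txt holds the day labels and columns 2 through 21 hold the 20 yearly series (1996–2015) of 123 daily highs each:

#Reconstructed (assumed) step: flatten the 20 yearly columns into one long
#vector and declare it a time series with 123 observations per year
temps_vector <- as.vector(unlist(temps[, 2:21]))
temps_ts <- ts(temps_vector, start = 1996, frequency = 123)
plot(temps_ts)
plot(decompose(temps_ts))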
[Figure: time series plot of the Atlanta daily high temperatures over Time]
[Figure: "Decomposition of additive time series" — observed, trend, seasonal, and random panels plotted over Time]
First observation: There is a trend, but it is fairly flat. The seasonal pattern is obvious,
since we observe the temperature swinging from hot to cold repeatedly. There is also clearly
a lot of randomness in the data. Because of this, we would like to use the Holt-Winters
method with triple exponential smoothing. We set alpha, beta, and gamma to NULL to allow the
function to find the optimal values that minimize the error across all data points, and we
set seasonal = "multiplicative" to treat the seasonal component as an index.
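For reference, the recursions that HoltWinters fits in this multiplicative-seasonal setup (with period m = 123 days here) take the standard textbook form:

$$L_t = \alpha \frac{x_t}{S_{t-m}} + (1-\alpha)(L_{t-1} + T_{t-1}), \quad T_t = \beta (L_t - L_{t-1}) + (1-\beta) T_{t-1}, \quad S_t = \gamma \frac{x_t}{L_t} + (1-\gamma) S_{t-m}$$

where L, T, and S are the level, trend, and seasonal components, and the fitted value is (L_{t-1} + T_{t-1}) · S_{t-m}.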
temps_HW <- HoltWinters(temps_ts, alpha = NULL, beta = NULL, gamma = NULL, seasonal = "multiplicative")
#temps_HW$fitted
temps_HW$SSE
## [1] 68904.57
plot(temps_HW)
[Figure: "Holt-Winters filtering" — Observed (black) and Fitted (red) values plotted over Time]
Second observation: Since we are asking the model to do triple exponential smoothing, the
sum of squared errors is actually quite high at 68904.57. We can also clearly see the gap
between the red (fitted) line and the black (observed) line near the beginning of the plot,
but the fit improves as the model works through each cycle of 123 data points. Note also
that the model does not start producing fitted values until after the first full cycle. In
summary, the more exponential smoothing we ask the model to perform on fewer data points,
the more error it will produce. Now let's look at the fitted plot, which looks very similar
to the decomposed time series above.
plot(temps_HW$fitted)
[Figure: plot of temps_HW$fitted — xhat, level, trend, and season components plotted over Time]
Third observation: Looking at the graph above, we are now most interested in the seasonal
component and the xhat component, since the trend is essentially flat and the level component
shows a lot of randomness. Next, we will apply the CUSUM approach to the seasonally smoothed
data to see if we can draw a conclusion.
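The step that extracts seasonal_factor from the fitted model is missing from this extract; a minimal reconstruction, assuming it is the season column of temps_HW$fitted reshaped into one column per fitted year (19 years of 123 days, since the first cycle produces no fitted values):

#Reconstructed (assumed) step: pull the seasonal component (4th column of
#temps_HW$fitted) and reshape it into a 123-day by 19-year matrix
seasonal_factor <- matrix(temps_HW$fitted[, 4], nrow = 123)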
#Add the row and column names back to the seasonal factor matrix
colnames(seasonal_factor) <- colnames(temps[, 3:21])
rownames(seasonal_factor) <- temps[, 1]
#head(seasonal_factor)
#Compute the mean seasonal factor for each year
yearly_mean_sf <- vector()
for (i in 1:ncol(seasonal_factor)){
  yearly_mean_sf[i] <- mean(seasonal_factor[, i])
}
yearly_mean_sf
Notice an interesting thing here: the first year's mean is 1, which makes sense since it is
the baseline. Now let's define the CUSUM function, similar to the one from the last homework.
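The function definition itself did not survive in this extract; a plausible reconstruction of a decrease-detecting CUSUM, assuming it returns the index of the first observation whose cumulative statistic crosses the threshold T (the exact version from the previous homework may differ):

#Assumed reconstruction: CUSUM for detecting a decrease below `mean`.
#Accumulate s_t = max(0, s_{t-1} + (mean - x_t - C)) and flag the first
#observation where s_t crosses the threshold T.
cusum_decrease <- function(data, mean, T, C) {
  s <- 0
  for (t in 1:length(data)) {
    s <- max(0, s + (mean - data[t] - C))
    if (s >= T) return(t)  #index of the first detected decrease
  }
  return(NA)  #no decrease detected
}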
Next, we will set the C variable to 0.5 standard deviations of the first year's seasonal
factors and the T variable to 3 standard deviations. Then we will run the CUSUM function
above on each year's data and display the resulting data frame result_df.
C_var = sd(seasonal_factor[,1])*0.5
T_var = sd(seasonal_factor[,1])*3
result_vector = vector()
for (col in 1:ncol(seasonal_factor)){
  result_vector[col] = cusum_decrease(data = as.matrix(seasonal_factor[,col]), mean = 1, T = T_var, C = C_var)
}
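The code that assembles result_df is also missing here; a plausible reconstruction that maps each detected index back to its calendar day using the day-label column of temps:

#Assumed reconstruction: pair each year with the day its decrease was detected
result_df <- data.frame(Year = colnames(seasonal_factor),
                        Day = temps[result_vector, 1])
result_df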
## Year Day
## 1 X1997 30-Sep
## 2 X1998 1-Oct
## 3 X1999 1-Oct
## 4 X2000 1-Oct
## 5 X2001 2-Oct
## 6 X2002 2-Oct
## 7 X2003 3-Oct
## 8 X2004 3-Oct
## 9 X2005 4-Oct
## 10 X2006 4-Oct
## 11 X2007 5-Oct
## 12 X2008 5-Oct
## 13 X2009 5-Oct
## 14 X2010 5-Oct
## 15 X2011 3-Oct
## 16 X2012 3-Oct
## 17 X2013 3-Oct
## 18 X2014 4-Oct
## 19 X2015 4-Oct
In conclusion, even though we applied the CUSUM approach to the seasonally smoothed data, it
is still not clear that the unofficial end of summer has gotten later over the 20 years. The
detected dates drift from 30-Sep in 1997 out to 5-Oct around 2008–2010 and then back to
3–4 Oct, so we might need more years of data to make a better determination.