
ISYE6501-Homework4

2024-02-06

Load dependency packages using pacman

rm(list=ls())
setwd("~/Georgia Tech - OMSA/ISYE 6501")
if(!require(pacman)) install.packages("pacman")

## Loading required package: pacman

library(pacman)
p_load(tinytex, tidyverse)

Question 7.1

Describe a situation or problem from your job, everyday life, current events, etc.,
for which exponential smoothing would be appropriate. What data would you need?
Would you expect the value of α (the first smoothing parameter) to be closer to 0 or
1, and why?

As a senior webmaster, my role revolves around ensuring the smooth functioning and optimal performance
of websites. One crucial aspect of my responsibilities involves predicting website traffic trends to anticipate
user demands and allocate resources effectively. To achieve this, I often turn to exponential smoothing, a
powerful forecasting technique that leverages historical data to generate accurate predictions. By applying
exponential smoothing, I can discern underlying patterns and trends in website traffic, allowing me to make
informed decisions regarding server capacity, content management, and marketing strategies.
Exponential smoothing works by assigning exponentially decreasing weights to past observations, giving more
prominence to recent data while gradually diminishing the influence of older observations. This approach
enables me to capture short-term fluctuations and seasonality in website traffic while maintaining stability
and adaptability in the forecast. However, the effectiveness of exponential smoothing hinges on selecting
an appropriate smoothing parameter, which dictates the balance between responsiveness and smoothness in
the forecast. By carefully adjusting this parameter, I can tailor the forecasting model to suit the specific
characteristics of the website’s traffic patterns and business objectives.
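To make the mechanics concrete, here is a minimal sketch in R (the traffic numbers below are made up purely for illustration): simple exponential smoothing keeps a running estimate S_t = alpha * x_t + (1 - alpha) * S_{t-1}, and HoltWinters() with the trend and seasonal components turned off fits exactly this one-parameter model.

set.seed(42)
# Hypothetical daily page-view counts with a weekly pattern plus noise (illustrative only)
traffic <- ts(round(5000 + 300 * sin(2 * pi * (1:90) / 7) + rnorm(90, 0, 200)))

# Simple exponential smoothing: beta = FALSE and gamma = FALSE disable the
# trend and seasonal components, so only alpha is estimated from the data
traffic_ses <- HoltWinters(traffic, beta = FALSE, gamma = FALSE)
traffic_ses$alpha        # estimated smoothing parameter
predict(traffic_ses, 7)  # forecast the next 7 days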

Question 7.2

Using the 20 years of daily high temperature data for Atlanta (July through October) from
Question 6.2 (file temps.txt), build and use an exponential smoothing model to help make a
judgment of whether the unofficial end of summer has gotten later over the 20 years. (Part of the
point of this assignment is for you to think about how you might use exponential smoothing to
answer this question. Feel free to combine it with other models if you’d like to. There’s certainly
more than one reasonable approach.)

Note: in R, you can use either HoltWinters (simpler to use) or the smooth package’s es function
(harder to use, but more general). If you use es, the Holt-Winters model uses model=”AAM” in
the function call (the first and second constants are used “A”dditively, and the third (seasonality)
is used “M”ultiplicatively; the documentation doesn’t make that clear).

set.seed(1)
#Load Atlanta temperature data
file_path <- "~/Georgia Tech - OMSA/ISYE 6501/hw4/data 7.2/temps.txt"
temps <- read.table(file_path, stringsAsFactors = FALSE, header=TRUE)
#head(temps)
#summary(temps)

# Unlist the data table into one long vector


temps_vector <- as.vector(unlist(temps[,2:21]))
#plot(temps_vector)

#Convert temps_vector to time series then plot it


temps_ts <- ts(temps_vector, start=1996, frequency = 123)
plot(temps_ts)
[Figure: time series plot of temps_ts, daily high temperatures for Atlanta, 1996 to 2015]

#Plot the decomposed temperature time series for exploration


plot(decompose(temps_ts))

[Figure: decomposition of the additive time series into observed, trend, seasonal, and random components, 1996 to 2015]

First observation: there is a trend, but it is fairly flat. The seasonal pattern is obvious,
since we observe the temperature cycling from hot to cold year after year. There is also
clearly a lot of randomness in the data. Because of this, we will use the Holt-Winters
method (triple exponential smoothing) to smooth the data. We set alpha, beta, and gamma to
NULL so that the function finds the optimal values that minimize the error across all data
points, and we set seasonal to "multiplicative" so that the seasonal component acts as an
index.

temps_HW <- HoltWinters(temps_ts, alpha = NULL, beta = NULL, gamma = NULL, seasonal = "multiplicative")

#temps_HW$fitted
temps_HW$SSE

## [1] 68904.57
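To see which parameter values the optimizer actually chose (a quick check, not something the assignment requires), we can inspect the fitted smoothing parameters directly:

# Optimized smoothing parameters selected by HoltWinters()
c(alpha = temps_HW$alpha, beta = temps_HW$beta, gamma = temps_HW$gamma)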

plot(temps_HW)

[Figure: Holt-Winters filtering, observed (black) versus fitted (red) values, 1996 to 2015]

Second observation: since we are asking the model to perform triple exponential smoothing, the
sum of squared errors is quite high at 68,904.57. We can also clearly see the gap between the
fitted values (red line) and the observed values (black line) near the beginning of the plot,
but the fit improves as the model works through each 123-day cycle. Note also that the model
does not produce fitted values until after the first full cycle. In summary, the more smoothing
components we ask the model to estimate from a limited amount of data, the more error it will
produce. Now let's look at the fitted plot, which looks very similar to the decomposed time
series above.
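As a quick sanity check of that last point (a sketch only, not required for the analysis), we can confirm that the fitted values begin only after the first full 123-day cycle:

# The fitted matrix covers 19 seasons (1997-2015) rather than all 20
nrow(temps_HW$fitted) / 123   # 19
length(temps_ts) / 123        # 20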

plot(temps_HW$fitted)

[Figure: temps_HW$fitted, showing the xhat, level, trend, and season components, 1996 to 2015]

Third observation: looking at the graph above, we are now most interested in the seasonal
component and the xhat component, since the trend is essentially flat and the level component
shows a lot of randomness. Next, we will apply the CUSUM approach to the seasonally smoothed
data to see whether we can draw a conclusion.

seasonal_factor <- matrix(temps_HW$fitted[,4], nrow=123)

#Add row and column names back to the seasonal factor matrix
colnames(seasonal_factor) <- colnames(temps[,3:21])
rownames(seasonal_factor) <- temps[,1]
#head(seasonal_factor)

#Calculate the mean of each year from seasonal smoothed data


yearly_mean_sf <- vector()

for (i in 1:ncol(seasonal_factor)){
  yearly_mean_sf[i] = mean(seasonal_factor[,i])
}

yearly_mean_sf

## [1] 1.0000000 0.9981882 0.9980547 0.9975095 0.9971271 0.9961954 0.9955108
## [8] 0.9957393 0.9956360 0.9949667 0.9948162 0.9946495 0.9943300 0.9941211
## [15] 0.9940949 0.9933907 0.9932476 0.9935215 0.9928824

Notice an interesting thing here: the first year's mean is exactly 1, which makes sense since
it serves as the baseline. Now let's define the CUSUM function, similar to the one from the
last homework.

cusum_decrease = function(data, mean, T, C){
  # One-sided CUSUM for detecting a decrease: accumulate how far each
  # observation falls below the mean (less the slack C) and report the
  # first row index where the cumulative sum crosses the threshold T.
  cusum = 0
  for (rowCounter in 1:nrow(data)){
    current = data[rowCounter,]
    cusum = max(0, cusum + (mean - current - C))
    if (cusum >= T) {
      return(rowCounter)   # change detected at this row
    }
  }
  return(NA)               # no change detected
}
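For reference, the recursion implemented above is the standard one-sided CUSUM for a decrease: S_t = max(0, S_{t-1} + (mu - x_t - C)), and a change is flagged as soon as S_t >= T.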

Next, we set C to 0.5 standard deviations and T to 3 standard deviations (both computed from
the first year's seasonal factors). Then we run the CUSUM function above on each year's data
and display the resulting data frame result_df.

C_var = sd(seasonal_factor[,1])*0.5
T_var = sd(seasonal_factor[,1])*3

result_vector = vector()
for (col in 1:ncol(seasonal_factor)){
  result_vector[col] = cusum_decrease(data = as.matrix(seasonal_factor[,col]), mean = 1, T = T_var, C = C_var)
}

result_df = data.frame(Year = colnames(seasonal_factor),Day = temps[result_vector,1])


result_df

## Year Day
## 1 X1997 30-Sep
## 2 X1998 1-Oct
## 3 X1999 1-Oct
## 4 X2000 1-Oct
## 5 X2001 2-Oct
## 6 X2002 2-Oct
## 7 X2003 3-Oct
## 8 X2004 3-Oct
## 9 X2005 4-Oct
## 10 X2006 4-Oct
## 11 X2007 5-Oct
## 12 X2008 5-Oct
## 13 X2009 5-Oct
## 14 X2010 5-Oct
## 15 X2011 3-Oct
## 16 X2012 3-Oct
## 17 X2013 3-Oct
## 18 X2014 4-Oct
## 19 X2015 4-Oct

Conclusion, even though we used the CUSUM approach on the seasonal smoothed data, it is
still not clear to us that the unofficial end of summer has gotten later over the 20 years. We
might need more yearly data to have a better determination.
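If we wanted to put a number on this (a sketch only; the trend_fit object below is hypothetical and not part of the graded analysis), one option would be to regress the detected day-of-season index on the year and check whether the slope is meaningfully positive:

# result_vector holds the row (day-of-season index, 1-123) at which CUSUM fired for each year
trend_fit <- lm(result_vector ~ seq_along(result_vector))
summary(trend_fit)$coefficients  # a slope near zero would support the inconclusive reading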
