
Assignment 3

ucla msba time series class assignment

Uploaded by

jacksui181
Copyright
© © All Rights Reserved

UCLA Anderson School of Management

Time Series Analysis and Forecasting


Dr. William Yu

Assignment 3

Part A: Predicting Car Sales

• In this assignment, we will build a forecasting model from scratch that predicts future car
sales from historical data. Follow the guidelines below.
• Use the quantmod library in R (or the fredapi library in Python) to import the following series
from FRED. Total Vehicle Sales will be the dependent variable; the remaining variables
are potential predictors. Feel free to add more variables if you think they will help.
• getSymbols("TOTALSA", src="FRED") # Total Vehicle Sales
• getSymbols("PAYEMS", src="FRED") # US payroll jobs
• getSymbols("DFF", src="FRED") # Federal funds rate
• getSymbols("UNRATE", src="FRED") # Unemployment rate
• getSymbols("NASDAQCOM", src="FRED") # Stock market prices
• getSymbols("USSTHPI", src="FRED") # Housing market prices
• getSymbols("CUSR0000SETB01", src="FRED") # Gasoline price in CPI

• Visualize these variables.
• Convert the data frequency (there are various frequencies) from daily or monthly to
quarterly. (Think about why I suggested this. If you have time, you can try monthly series as
a comparison.)
Note: for example, you can use the “apply.quarterly” function.
• Use the sample period from 1976Q2 to 2023Q2 (June) as the trainset (in-sample).
• Are these time series stationary? If not, transform the non-stationary series into stationary
ones. Think about why I suggested doing this.
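The two preparation steps above (quarterly conversion and a stationarity transform) can be sketched in Python using only the standard library. This is a minimal illustration, not the required method: averaging within each quarter and taking log differences are assumptions (you might prefer the last value of the quarter, or plain differences), the helper names `to_quarterly` and `log_diff` are made up for this sketch, and the numbers are invented rather than taken from FRED.

```python
from collections import defaultdict
from datetime import date
from math import log


def to_quarterly(observations):
    """Average (date, value) pairs within each calendar quarter.

    Returns ((year, quarter), mean) tuples in chronological order.
    """
    buckets = defaultdict(list)
    for day, value in observations:
        quarter = (day.month - 1) // 3 + 1
        buckets[(day.year, quarter)].append(value)
    return [(key, sum(vals) / len(vals)) for key, vals in sorted(buckets.items())]


def log_diff(values):
    """Period-over-period log change, one common way to remove a trend."""
    return [log(b) - log(a) for a, b in zip(values, values[1:])]


# Hypothetical monthly observations (made-up numbers, not FRED data).
monthly = [
    (date(2023, 1, 1), 100.0), (date(2023, 2, 1), 102.0), (date(2023, 3, 1), 104.0),
    (date(2023, 4, 1), 106.0), (date(2023, 5, 1), 108.0), (date(2023, 6, 1), 110.0),
]

quarterly = to_quarterly(monthly)
growth = log_diff([value for _, value in quarterly])
print(quarterly)
print(growth)
```

The same `to_quarterly` helper works for daily observations, since it only looks at the month of each date. Whether the transformed series is actually stationary still has to be checked, for example with a unit root test.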

(1) A Structural Model


• Put all these series together in a data frame and build/train a structural model using
a simple OLS regression on the trainset data.
Note: you might need to use the “ts” function to combine these time series from different
sources.
• Once you find a model you like, you are ready to forecast car sales for 2023Q3 to 2024Q2
(testset; out-of-sample).
• Remember what we said in class about using a structural model to forecast in real time:
you need to forecast all of your explanatory variables first. That is exactly what I
want you to do. But how?

• Here I suggest a simple way: use ARMA models to forecast each of the significant predictors
you selected individually. Then you can forecast car sales for 2023Q3 to 2024Q2.
Note: In this case, we in fact know the values of the explanatory variables in the testset. In most
real-world situations, we do not know them in real time and therefore need
to forecast them. Here we just pretend we do not know the testset values of the explanatory
variables.
In some situations, we can treat the testset values of the explanatory variables as known.
That case is covered in Part B of this assignment.
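The two-step idea above (forecast each predictor first, then plug the forecasts into the structural equation) can be sketched as follows. To keep it self-contained, this sketch swaps in an AR(1) fitted by least squares as the simplest ARMA case and uses a single predictor; in practice you would select the ARMA order properly and use several predictors. The function names and the data are hypothetical.

```python
def ols_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def ar1_forecast(series, steps):
    """Fit x(t) = c + phi * x(t-1) by least squares and iterate it forward."""
    c, phi = ols_line(series[:-1], series[1:])
    path, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        path.append(last)
    return path


# Made-up stationary series standing in for a transformed predictor and car sales.
# car_sales is exactly 0.1 + 2 * predictor so the fit is easy to check by eye.
predictor = [0.5, -0.2, 0.3, -0.1, 0.4, 0.0, 0.2, -0.3, 0.1, 0.3]
car_sales = [1.1, -0.3, 0.7, -0.1, 0.9, 0.1, 0.5, -0.5, 0.3, 0.7]

a, b = ols_line(predictor, car_sales)          # structural equation on the trainset
future_x = ar1_forecast(predictor, steps=4)    # step 1: forecast the predictor
future_sales = [a + b * x for x in future_x]   # step 2: plug into the structural model
print(future_sales)
```

Note that any error in the predictor forecasts carries over into the car sales forecast, which is one reason to compare this approach with the simpler models that follow.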

(2) A Reduced-form Model


• Second, try a simple reduced-form model: fit an ARMA model directly to total
vehicle sales and forecast it.
• Show both models' car sales forecasts for 2023Q3 to 2024Q2 and compare them with the
actual values (calculate the forecast errors in terms of number of cars sold).
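If you modeled a transformed series, the forecasts must be converted back to levels before you can state errors in cars sold. The sketch below assumes the model was fitted on log differences and that the sales series is measured in millions of units (check the units in the FRED series notes); the numbers and function names are made up for illustration.

```python
from math import exp, sqrt


def levels_from_log_diffs(last_level, log_diffs):
    """Rebuild level forecasts from forecasted log changes."""
    levels, level = [], last_level
    for change in log_diffs:
        level = level * exp(change)
        levels.append(level)
    return levels


def forecast_errors(forecast, actual):
    """Forecast minus actual for each period, in the units of the series."""
    return [f - a for f, a in zip(forecast, actual)]


def rmse(errors):
    """Root mean squared error of a list of forecast errors."""
    return sqrt(sum(e * e for e in errors) / len(errors))


# Made-up numbers in millions of vehicles; these are not real TOTALSA values.
last_observed = 16.0
forecast_changes = [0.0, 0.0, 0.0, 0.0]   # e.g. a model that predicts no change
actual = [16.5, 16.0, 15.5, 16.0]

forecast = levels_from_log_diffs(last_observed, forecast_changes)
errors = forecast_errors(forecast, actual)
errors_in_cars = [e * 1_000_000 for e in errors]   # millions of units -> units
print(errors_in_cars)
print(rmse(errors))
```

Running the same error calculation on both the structural and the reduced-form forecasts gives a like-for-like comparison.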

(3) A Dynamic Regression Model (Autoregressive Distributed Lag Model; Mixed-form Model)
• Third, fit the following model on the trainset and calculate its forecast error on the testset.
• Car Sales (t) = a + b1*Car Sales (t-1) + b2*X1 + b3*X2 + …
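A minimal sketch of this model with one lag of sales and a single predictor X1, estimated by ordinary least squares through the normal equations. The helper names are hypothetical and the data are generated from known coefficients so the estimate can be checked; it is an illustration of the equation above, not a full solution.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def fit_adl(sales, x):
    """Fit sales(t) = a + b1 * sales(t-1) + b2 * x(t); the first observation is lost to the lag."""
    rows = [[1.0, sales[t - 1], x[t]] for t in range(1, len(sales))]
    return ols(rows, sales[1:])


def forecast_adl(coefs, last_sales, future_x):
    """Iterate the fitted equation forward, feeding each forecast back in as the lag."""
    a, b1, b2 = coefs
    path = []
    for x_t in future_x:
        last_sales = a + b1 * last_sales + b2 * x_t
        path.append(last_sales)
    return path


# Made-up data generated from sales(t) = 1 + 0.5 * sales(t-1) + 2 * x(t).
x = [0.0, 1.0, 0.0, 2.0, 1.0, 3.0, 0.0, 1.0]
sales = [2.0]
for t in range(1, len(x)):
    sales.append(1.0 + 0.5 * sales[-1] + 2.0 * x[t])

coefs = fit_adl(sales, x)
print(coefs)
print(forecast_adl(coefs, sales[-1], future_x=[1.0, 1.0]))
```

Beyond one step ahead, the lagged sales value is itself a forecast, so `forecast_adl` feeds each prediction back in. The values passed as `future_x` must be either known or forecasted separately.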

Part B: Predicting Sales and Understanding Which Marketing Channel is Effective

• In TVlift.xlsx, you will find daily time series data with three marketing spending variables
(sb, snb, tv) and product sales (sales). See the Description tab for more details. Assume
that the trainset runs from 4/1/2016 to 2/28/2017 and the testset from 3/1/2017 to
3/19/2017.
• Write a script that builds several dynamic models on the trainset data to understand which
marketing channel(s) are effective at selling the product. Then use those models to predict
product sales in the testset and calculate the RMSEs.
• Assignment submission:
(1) An R script (or a Python/Jupyter notebook if you prefer).
(2) A Word/PDF file that summarizes the models and compares their trainset performance
and their testset RMSEs.

• Hint: The following are some example models and a simple way to train
a model in-sample and forecast out-of-sample (testset) with time series data. I provide 5
models. You should add more model variations.
library(readxl)    # read_excel
library(forecast)  # auto.arima
library(dplyr)     # lag() that shifts a plain vector; stats::lag() does not

full = read_excel('TVlift.xlsx')
train = full[1:334,]   # 4/1/2016 to 2/28/2017

fit01 = lm(sales ~ sb + snb + tv, data=train)
full$predict1 = predict(fit01, newdata=full)   # fitted values (train rows) and forecasts (test rows)
fit02 = lm(sales ~ sb + snb + tv + factor(dow), data=train) # dow is day of the week seasonality
fit03 = lm(sales ~ dplyr::lag(sales,1) + sb + snb + tv, data=train)
fit04 = lm(sales ~ dplyr::lag(sales,1) + sb + snb + tv + factor(dow), data=train)
fit05 = auto.arima(train$sales, xreg = cbind(train$sb, train$snb, train$tv))

test = full[335:353,]   # 3/1/2017 to 3/19/2017
rmse01 = sqrt(mean((test$predict1-test$sales)^2))
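The row ranges 1:334 and 335:353 in the hint match the stated trainset and testset dates only if the sheet has exactly one row per calendar day with no gaps. The Python sketch below checks that count and shows a date-based split plus an RMSE helper. It assumes one row per day; the function names are made up for illustration, and no data from TVlift.xlsx is used.

```python
from datetime import date, timedelta
from math import sqrt


def daily_dates(start, end):
    """All calendar dates from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def split_by_date(dates, train_end):
    """Row indices on or before train_end, and row indices after it."""
    train = [i for i, d in enumerate(dates) if d <= train_end]
    test = [i for i, d in enumerate(dates) if d > train_end]
    return train, test


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Assumption: the data have one row per day from 4/1/2016 through 3/19/2017.
dates = daily_dates(date(2016, 4, 1), date(2017, 3, 19))
train_idx, test_idx = split_by_date(dates, train_end=date(2017, 2, 28))
print(len(train_idx), len(test_idx))   # 334 trainset rows and 19 testset rows

# Made-up predictions and actuals, only to show the RMSE call.
print(rmse([2.0, 4.0], [1.0, 3.0]))
```

Splitting by date rather than by hard-coded row numbers keeps the split correct if rows are added or missing. Python indices start at 0, so the trainset rows here are 0 to 333.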
