Lecture 2 Regression Approach

The document discusses regression analysis in the context of time series, illustrating how to predict new values using training data. It covers simple and multiple linear regression methods, providing examples with fuel consumption and advertisement expenditure. Additionally, it introduces trend models and causal regression models for forecasting future values.

Uploaded by

Eason Lau
Lecture 2 Regression Approach of Time Series Analysis


Revision on Regression Analysis

• Example: Fuel consumption vs # driving licenses and other factors.


Revision on Regression Analysis
Year    X1 = % Driving License    X2 = Income    X3 = Tax    Y = Fuel consumption
Training data
2016    52.5                      3.571          9           541
2017    57.2                      4.092          9           524
2018    58                        3.865          9           561
2019    52.9                      4.87           7.5         414

New data
2020    60                        3.2            5.5         ???
Revision on Regression Analysis
• Goal: Predict Y for the new data.

• n = number of rows in the training data, p = number of columns in X (the predictors)

• Regression courses: The new value of X is used to predict the new value of Y.

• Time series courses: Time (Year), previous values of Y, and new and previous values of X can also be used to predict the new value of Y.
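As a small illustration of the notation, the fuel consumption table can be entered in R and n and p read off from it. This is only a sketch: the column names `license`, `income`, `tax` and `fuel` are assumptions made here for illustration, not names used in the lecture.

```r
# Training data from the fuel consumption table (column names are assumed)
fuel <- data.frame(
  year    = c(2016, 2017, 2018, 2019),
  license = c(52.5, 57.2, 58, 52.9),      # X1 = % driving license
  income  = c(3.571, 4.092, 3.865, 4.87), # X2 = income
  tax     = c(9, 9, 9, 7.5),              # X3 = tax
  fuel    = c(541, 524, 561, 414)         # Y  = fuel consumption
)

# New data: the predictors are known, Y is unknown
fuel_new <- data.frame(year = 2020, license = 60, income = 3.2, tax = 5.5)

n <- nrow(fuel)  # number of rows in the training data
p <- 3           # number of predictor columns (X1, X2, X3)
c(n = n, p = p)
```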
Revision on Simple Linear Regression
• Regression method that you learned in the regression course:

• New X predicts new Y

• Naïve method for prediction when p = 1:

• Plot the training data
• Draw a straight line that “fits the data well”
• Make a prediction from the straight line
Revision on Simple Linear Regression
Year    X = Advertisement Expenditure    Y = Sales
Training data:
2015    172                              60
2016    156                              43
2017    167                              52
2018    145                              36
2019    150                              40
2020    160                              50

New data:
2021    180                              ???
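The naïve plot-and-draw method can be tried on this table. The sketch below assumes, purely for illustration, that the line drawn by eye passes through the two points (145, 36) and (172, 60); any other line that looks reasonable would give a different prediction, which is the weakness of the naïve method.

```r
salesdata <- data.frame(
  year = 2015:2020,
  advertisement = c(172, 156, 167, 145, 150, 160),
  sales = c(60, 43, 52, 36, 40, 50)
)

# Step 1: plot the training data
plot(salesdata$advertisement, salesdata$sales,
     xlab = "Advertisement expenditure", ylab = "Sales")

# Step 2: a line drawn "by eye" (assumed to pass through (145, 36) and (172, 60))
b_naive <- (60 - 36) / (172 - 145)  # slope
a_naive <- 36 - b_naive * 145       # intercept
abline(a = a_naive, b = b_naive, lty = 2)

# Step 3: read the prediction for X = 180 off the line
pred_naive <- a_naive + b_naive * 180
pred_naive
```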
Revision on Simple Linear Regression
• Linear model: $Y_t = a + bX_t + \text{error}$ (a = intercept, b = slope)

• Regression: Statistical methods for estimating a and b from the training data. For the new data, where $X_{n+1}$ is known but $Y_{n+1}$ is unknown, predict $Y_{n+1}$ as $a + bX_{n+1}$.

• Computer: We use the statistical software R.


Revision on Simple Linear Regression
• # Input the training data
• salesdata = data.frame(
year = c(2015,2016,2017,2018,2019,2020),
advertisement = c(172,156,167,145,150,160),
sales = c(60,43,52,36,40,50)
)
• salesdata
Revision on Simple Linear Regression
• # Input the testing data
• newdata = data.frame(
year = 2021, advertisement = 180
)
• newdata
Revision on Simple Linear Regression
• In R, the function lm() fits linear regression models.

• # Perform linear regression


result = lm(sales~advertisement,salesdata)
summary(result)

• # Making predictions
predict(result, newdata)
predict(result, newdata, interval="prediction")
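To connect predict() with the linear model, the sketch below refits the sales example, extracts the estimated intercept and slope with coef(), and checks that a + b × 180 agrees with predict(). The names `a_hat`, `b_hat`, `manual`, `auto` and `pred_int` are introduced here for illustration only.

```r
salesdata <- data.frame(
  year = c(2015, 2016, 2017, 2018, 2019, 2020),
  advertisement = c(172, 156, 167, 145, 150, 160),
  sales = c(60, 43, 52, 36, 40, 50)
)
newdata <- data.frame(year = 2021, advertisement = 180)

result <- lm(sales ~ advertisement, data = salesdata)

# Estimated intercept and slope
a_hat <- coef(result)[["(Intercept)"]]
b_hat <- coef(result)[["advertisement"]]

# Prediction by hand versus predict()
manual <- a_hat + b_hat * newdata$advertisement
auto   <- predict(result, newdata)
all.equal(manual, unname(auto))

# Point prediction with a 95% prediction interval (columns fit, lwr, upr)
pred_int <- predict(result, newdata, interval = "prediction")
pred_int
```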
Revision on Simple Linear Regression
• Simple linear regression: Finding $\hat{a}$ and $\hat{b}$ so that the sum of squared errors is the smallest

$$SS(a, b) = \sum_{t=1}^{n} \left( Y_t - a - bX_t \right)^2$$
Revision on Simple Linear Regression
• Mathematical details:

• Set $\frac{d}{da} SS(a, b) = \frac{d}{db} SS(a, b) = 0$

• After some algebraic manipulations, one gets $b = \dfrac{\sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_{t=1}^{n} (X_t - \bar{X})^2}$ and $a = \bar{Y} - b\bar{X}$, where $\bar{X}$ and $\bar{Y}$ are the sample averages.

• To avoid confusion, usually the symbols $\hat{a}$ and $\hat{b}$ are used for the above solution.
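The closed-form solution can be checked numerically. The sketch below applies the slope and intercept formulas to the advertisement and sales data and compares the result with lm(); the variable names are chosen here for illustration.

```r
X <- c(172, 156, 167, 145, 150, 160)  # advertisement expenditure
Y <- c(60, 43, 52, 36, 40, 50)        # sales

# Slope and intercept from the closed-form least squares solution
b_hat <- sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
a_hat <- mean(Y) - b_hat * mean(X)
c(a_hat = a_hat, b_hat = b_hat)

# The same estimates from lm()
fit <- lm(Y ~ X)
all.equal(unname(coef(fit)), c(a_hat, b_hat))
```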
Revision on Multiple Linear Regression
• Problem: Predict Y from X1, X2, X3, …, Xp

• $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \text{error}$

• Multiple linear regression is a statistical method for p > 1 that estimates $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_p)$ from the data and uses the estimate for prediction.
Revision on Multiple Linear Regression
• Example: What affects the saving rate?

• Data:
• savings.csv
• Data of 50 countries collected over the period 1960-1970
• Belsley, Kuh, and Welsch (1980) Regression Diagnostics. Wiley.

• Response: Y = sr = personal saving divided by disposable income

• Predictors
• X1 = pop15 = percent population under age of 15
• X2 = pop75 = percent population over age of 75
• X3 = dpi = per capita disposable income in US dollars
• X4 = ddpi = percent growth rate of dpi
Revision on Multiple Linear Regression
• savings = read.csv("savings.csv",header=T)

• result = lm(sr~pop15+pop75+dpi+ddpi, savings)

• newdata = data.frame(country="Erehwon",
pop15=30, pop75=3, dpi=2000, ddpi=5)

• predict(result, newdata)
Revision on Multiple Linear Regression
• newdata = data.frame(
pop15=c(30,25), pop75=c(3,4), dpi=c(2000,2000),
ddpi=c(5,5))

• predict(result, newdata, interval="prediction")
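For readers without savings.csv, the steps can be run end to end on R's built-in `LifeCycleSavings` data set, which has 50 countries and the same five columns (sr, pop15, pop75, dpi, ddpi). Treating it as a stand-in for savings.csv is an assumption made here; the file used in the lecture may differ in layout or values.

```r
# Built-in data set, assumed here to be a stand-in for savings.csv
savings <- datasets::LifeCycleSavings

# Fit the multiple linear regression and keep the fitted model
result <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)
summary(result)

# Two hypothetical new countries
newdata <- data.frame(
  pop15 = c(30, 25), pop75 = c(3, 4),
  dpi = c(2000, 2000), ddpi = c(5, 5)
)

# Point predictions with 95% prediction intervals, one row per new country
pred <- predict(result, newdata, interval = "prediction")
pred
```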


Revision on Multiple Linear Regression
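• R can find $\hat{\boldsymbol{\beta}}$ automatically to minimize the sum of squared errors

$$SS(\boldsymbol{\beta}) = \sum_{t=1}^{n} \left( Y_t - \beta_0 - \beta_1 X_{1t} - \beta_2 X_{2t} - \cdots - \beta_p X_{pt} \right)^2$$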

• The mathematical details are NOT the focus of this course and are
omitted. Interested students may consult textbooks on regression.
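Without going into the mathematics, one can at least check numerically that the estimate returned by lm() makes the sum of squared errors small. The sketch below evaluates SS at the fitted coefficients and again after changing each coefficient by 1%. It uses R's built-in `LifeCycleSavings` data as an assumed stand-in for savings.csv, and the helper `ss()` is written here for illustration.

```r
savings <- datasets::LifeCycleSavings  # assumed stand-in for savings.csv
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)

# Design matrix (with a column of ones for the intercept) and response
X <- model.matrix(~ pop15 + pop75 + dpi + ddpi, data = savings)
y <- savings$sr

# Sum of squared errors as a function of beta
ss <- function(beta) sum((y - X %*% beta)^2)

beta_hat <- coef(fit)
ss_hat <- ss(beta_hat)

# SS after changing one coefficient at a time by 1%
ss_perturbed <- sapply(seq_along(beta_hat), function(j) {
  beta <- beta_hat
  beta[j] <- beta[j] * 1.01
  ss(beta)
})

ss_hat
ss_perturbed  # each value should exceed ss_hat
```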
Time Series Analysis – Regression Approach
• Trend models:

• Linear: $m_t = a + bt$

• Quadratic: $m_t = a + bt + ct^2$

• Exponential: $m_t = ab^t$ or $\log m_t = \alpha + \beta t$ (with $\alpha = \log a$ and $\beta = \log b$)
Time Series Analysis – Regression Approach
• In R, one can use dim(gtemp)[1] and dim(gtemp)[2] to retrieve the number of rows and the number of columns of the dataset.

• gtemp = read.csv("gtemp.csv",header=T)
• n = dim(gtemp)[1]
• year = 1:n
• gtemp = data.frame(gtemp,
time=year, time2=year^2)
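Once the time and time2 columns exist, the linear and quadratic trend models are ordinary regressions on them. Since gtemp.csv is not reproduced here, the sketch below uses a simulated series in its place; the simulated values and the column name `temp` are assumptions for illustration only and say nothing about the real data.

```r
set.seed(1)
n <- 30

# Simulated stand-in for the gtemp data (NOT the real series)
sim <- data.frame(temp = 0.02 * (1:n) + rnorm(n, sd = 0.1))
year <- 1:n
sim <- data.frame(sim, time = year, time2 = year^2)

# Linear trend: m_t = a + b t
fit_lin <- lm(temp ~ time, data = sim)

# Quadratic trend: m_t = a + b t + c t^2
fit_quad <- lm(temp ~ time + time2, data = sim)

# Forecast one step ahead, at t = n + 1
newdata <- data.frame(time = n + 1, time2 = (n + 1)^2)
predict(fit_lin, newdata)
predict(fit_quad, newdata)
```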
Time Series Analysis – Regression Approach
• Exercise: Annual revenues of a company
Year    2007     2008     2009     2010     2011     2012     2013
Rev     375.2    414.2    458.5    511.9    542      588.1    656.4

Year    2014     2015     2016     2017     2018     2019     2020
Rev     722.5    804.2    896.8    957.4    1038.7   1069.3   1167.9

• Log-linear: $\log m_t = a + bt$
• Log-quadratic: $\log m_t = a + bt + ct^2$
• Try to predict the revenue in Year 2023
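One possible way to set up the exercise is sketched below. It assumes the time index t = 1 for 2007, so that 2023 corresponds to t = 17, and it back-transforms the log-scale prediction with exp(); other indexing or back-transformation choices are possible. The names `revdata`, `fit_ll` and `fit_lq` are chosen here for illustration.

```r
revdata <- data.frame(
  year = 2007:2020,
  revenue = c(375.2, 414.2, 458.5, 511.9, 542, 588.1, 656.4,
              722.5, 804.2, 896.8, 957.4, 1038.7, 1069.3, 1167.9)
)
revdata$time  <- 1:nrow(revdata)  # t = 1 for 2007 (assumed indexing)
revdata$time2 <- revdata$time^2

# Log-linear trend: log m_t = a + b t
fit_ll <- lm(log(revenue) ~ time, data = revdata)

# Log-quadratic trend: log m_t = a + b t + c t^2
fit_lq <- lm(log(revenue) ~ time + time2, data = revdata)

# Year 2023 on the same time scale
t_new <- 2023 - 2007 + 1
newdata <- data.frame(time = t_new, time2 = t_new^2)

# Predictions are on the log scale; exp() returns them to revenue units
pred_ll <- exp(predict(fit_ll, newdata))
pred_lq <- exp(predict(fit_lq, newdata))
c(log_linear = unname(pred_ll), log_quadratic = unname(pred_lq))
```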
Time Series Analysis – Regression Approach
• Causal regression models:

• Lag 1 model: $Y_t = a + bX_{t-1} + \epsilon_t$

• Lag 2 model: $Y_t = a + bX_{t-2} + \epsilon_t$


Time Series Analysis – Regression Approach
• year = c(2015,2016,2017,2018,2019,2020)
advertisement = c(172,156,167,145,150,160)
sales = c(60,43,52,36,40,50)

• salesdata = data.frame(
year = year[2:6],
adv.lag = advertisement[1:5],
sales = sales[2:6]
)

• newdata = data.frame(
year = 2021, adv.lag = advertisement[6]
)
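With the lagged data frame in place, the lag 1 model is fitted like any simple linear regression. The sketch below repeats the data preparation so that it runs on its own, then fits the model and predicts sales for 2021 from the 2020 advertisement expenditure; the name `pred_2021` is introduced here for illustration.

```r
year <- c(2015, 2016, 2017, 2018, 2019, 2020)
advertisement <- c(172, 156, 167, 145, 150, 160)
sales <- c(60, 43, 52, 36, 40, 50)

# Pair each year's sales with the previous year's advertisement
salesdata <- data.frame(
  year = year[2:6],
  adv.lag = advertisement[1:5],
  sales = sales[2:6]
)

# For 2021, the lagged predictor is the 2020 advertisement expenditure
newdata <- data.frame(year = 2021, adv.lag = advertisement[6])

# Lag 1 model: Y_t = a + b X_{t-1} + error
result <- lm(sales ~ adv.lag, data = salesdata)
summary(result)

pred_2021 <- predict(result, newdata, interval = "prediction")
pred_2021
```

Note that lagging costs one observation: only five training rows remain, so the prediction interval is likely to be wide.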
