
Business Forecasting

Introduction to Regression Models


Generating Process
For the time series Yt, a basic representation of the generating process is:

Yt = f(systematic component, random component)

Yt = mt + et

The modeller needs to determine the relevant functional form for mt. This will depend on the type of patterns observed in the time series and the type of model (time series or causal).
Regression
In regression modelling the systematic component mt is f(X1, X2, X3, ..., Xk), where the Xj are explanatory variables:

Yt = f(X1, X2, X3, ..., Xk) + et

The exact functional form and the particular independent variables (Xj) to be included in the model are a matter of judgement.

A regression model is a causal model since the prediction of the target time series is linked to other time series.
Why Use Regression?

Advantages:
1. Regression allows the forecaster to incorporate theoretical knowledge of the time series, the independent variables and the functional form.

2. Regression provides a "causal" explanation of why the prediction may be appropriate.

3. Regression models can be used to provide strategy and scenario-based prediction. In particular, regression can be used to analyse best and worst case scenarios. This provides some indication of the range of likely values for the target time series and helps management identify sources of risk.
Why use Regression? (cont)
Disadvantages:
1. Regression requires much more data than other forecast methods. Data is required on the target time series and the independent variables. In addition, more theoretical knowledge of the time series generating process is required.

2. Regression analysis requires more resources (time, money, skill) to produce forecasts than the previously examined time series methods. This time and effort may not necessarily result in greater predictive accuracy. Regression may prove to be an expensive, time consuming way of producing inferior forecasts.
When to use Regression?

Regression is a relatively time consuming and expensive way of generating forecasts.

Typically, regression should be used for forecasts of some importance to the firm or organisation, or when strategic options and/or scenario analysis are required.

The benefits to the firm (improved understanding, scenarios) must outweigh the considerable costs.

Since regression is resource hungry it should only be undertaken when there are sufficient resources (time, money, data etc.) to enable a proper regression analysis.
Regression Forecasting
There are basically three tasks involved:

1. Choose an appropriate model. This includes independent variable selection and choosing the specific functional form of the model.

2. Use a joint sample of observations on the dependent and independent variables to derive estimates of the regression coefficients.

3. Use the estimated model and predicted values of the independent variables to generate forecasts of the dependent variable.
Choosing an Appropriate Model
Choosing appropriate independent variables relies on economic theory, logic, the observed time series and the experience of the modeller.

Typically, the modeller considers the above and selects a candidate group of variables which may appear as independent variables in a final regression model.

Functional form is another issue. Once again use logic, theory, experience and the observed time series (Linear vs Non-Linear, Statics vs Dynamics, Levels vs Changes).

How the predictive model will be used, and what information and/or forecasts are required, may also influence the functional form.
Estimation
Basic Functional Form (population model):

Yt = β0 + β1 X1t + β2 X2t + ... + βk Xkt + et

A joint sample on Yt and X1t, X2t, ..., Xkt is collected.

Estimation of the regression model coefficients (b0, b1, b2, ..., bk) is typically via Ordinary Least Squares (OLS) in EXCEL or Minitab. This generates the sample regression model:

E(Yt) = b0 + b1 X1t + b2 X2t + ... + bk Xkt
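The notes use EXCEL or Minitab for estimation. Purely as an illustrative alternative, the sketch below fits a model of this form by OLS in Python with statsmodels; the two explanatory variables and the small simulated data set are hypothetical, not taken from the notes.

```python
# Minimal OLS sketch (hypothetical data, illustrative only).
import numpy as np
import statsmodels.api as sm

# Hypothetical joint sample: Yt and two explanatory variables X1t, X2t.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                 # columns play the role of X1t, X2t
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=30)

X_design = sm.add_constant(X)                # adds the intercept column for b0
results = sm.OLS(y, X_design).fit()          # Ordinary Least Squares estimates

print(results.params)                        # b0, b1, b2
print(results.summary())                     # output similar to EXCEL's SUMMARY OUTPUT
```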


Why Use OLS?
OLS estimates have good forecast properties.

Under the conditions that the model is correctly specified and the random error at each observation is independently drawn with zero mean and constant variance, the OLS estimates and forecasts will be:

1. Unbiased: on average the OLS estimates and forecasts will equal the true values.

2. Efficient: the OLS estimators and forecasts will be the most precise of any linear unbiased estimators or forecasts.
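As a rough illustration of the unbiasedness property (a hypothetical Python simulation, not part of the notes), repeatedly drawing samples from a known linear model and estimating the slope by OLS gives estimates that average out close to the true value.

```python
# Hypothetical simulation illustrating unbiasedness of the OLS slope estimate.
import numpy as np

rng = np.random.default_rng(1)
true_b0, true_b1 = 2.0, 0.9        # assumed "true" population coefficients
slopes = []

for _ in range(2000):
    x = rng.uniform(0, 20, size=25)
    e = rng.normal(0, 2, size=25)              # zero mean, constant variance
    y = true_b0 + true_b1 * x + e
    b1, b0 = np.polyfit(x, y, 1)               # OLS for a single regressor
    slopes.append(b1)

print(np.mean(slopes))   # close to 0.9: on average the estimates equal the true slope
```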
Estimation (cont)
Before the model can be used for forecasts it needs to be checked for adequacy and for violations of the assumptions underpinning OLS.

Assumptions of OLS include a correctly specified functional form and error term (et) behaviour.

Residuals, other diagnostics and associated statistical tests are used to determine the adequacy of the model.

Only after examination of the above diagnostics and determination of adequacy should the estimated model be used for forecasts.
Regression: Example 1

Week   Sales (Y) (1,000s)   Advertising (X) ($100s)
1      10                   9
2      6                    7
3      5                    5
4      12                   14
5      10                   15
6      15                   12
7      5                    6
8      12                   10
9      17                   15
10     20                   21

[Figure: Advertising vs Sales over time - scatter plot of Advertising ($100s) against Sales (1,000s), with the estimated regression line/equation (OLS) St = b0 + b1*At, fitted as y = 0.9137x + 0.7842]
Excel: Regression Output

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.891
R Square            0.795
Adjusted R Square   0.769
Standard Error      2.448
Observations        10

ANOVA
             df    SS        MS        F        Sig F
Regression   1     185.658   185.658   30.980   0.001
Residual     8     47.942    5.993
Total        9     233.6

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     0.784          2.025            0.387    0.709     -3.886      5.454
Advertising   0.914          0.164            5.566    0.001     0.535       1.292

[Figure: Advertising vs Sales over time - scatter plot of Advertising ($100s) against Sales (1,000s) with the fitted line y = 0.7842 + 0.9137*x]

Regression: Example 1 (cont.)
The sample estimated equation is:

Sales = 0.7842 + 0.9137 * A

(Estimated through Excel or MINITAB)

Intercept: 0.7842 is the estimated Sales (000s) when A = 0.

Slope: 0.9137 is the estimated constant increase in Sales (000s) when Advertising increases by 1 unit ($100).
Forecasts:
When A = 10: S = 0.7842 + 0.9137 * 10 = 9.921 (000s)

When A = 15: S = 0.7842 + 0.9137 * 15 = 14.490 (000s)

Measures of Estimated Model Performance
R2 - Coefficient of Determination:
• R2 is the % of dependent variable variation (in the sample) explained by the estimated regression
• R2 is between 0 and 1, and the closer R2 is to 1 the better the estimated model fits the sample data
• EXCEL calculates R2 as part of the standard regression estimation routine
• In practice, many unskilled modellers place too much emphasis on R2 or its close counterpart, adjusted R2
• R2 has many flaws and can be easily manipulated. Don't place too much reliance on it; use it as one tool of many in deciding the suitability of models
Performance Measures (cont.)
Standard Error

The standard error is approximately the "average" residual of the regression.

It is similar, although not identical, to RMSE.

It is provided in the standard regression output in EXCEL and Minitab.

The standard error can be used to compare competing regression models.
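Both measures can be recomputed from the ANOVA table in the Example 1 output. The short Python sketch below shows the arithmetic, using the sums of squares from that table.

```python
# R-squared and standard error from the Example 1 ANOVA table.
import math

SSR = 185.658       # regression sum of squares
SSE = 47.942        # residual sum of squares
SST = 233.6         # total sum of squares (= SSR + SSE)
n, k = 10, 1        # observations, number of explanatory variables

r_squared = SSR / SST                             # approx 0.795
standard_error = math.sqrt(SSE / (n - k - 1))     # sqrt(5.993) approx 2.448

print(r_squared, standard_error)
```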
More on Performance Measures

Out of sample forecast performance:

R2 and the standard error are indicators of the in-sample predictive ability of the model.

In-sample prediction is easier since both dependent and independent variable data are available and used to obtain the "best" (most accurate) model.

For out-of-sample prediction, the dependent variable is not available (that's why we are predicting it!). The values of the independent variables may also need to be estimated.

The predictive ability of the model will be different out of sample. The forecaster should test the out-of-sample predictive ability by leaving aside a portion of the most recent observations as a test set. The usual error criteria can be used.
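As a rough sketch of this hold-out idea, the snippet below (Python/statsmodels assumed; the 7/3 split is purely illustrative given only 10 observations) fits the Example 1 model on the earlier weeks and computes RMSE and MAE on the most recent three.

```python
# Hold-out (out-of-sample) check: fit on early weeks, test on the most recent ones.
import numpy as np
import statsmodels.api as sm

advertising = np.array([9, 7, 5, 14, 15, 12, 6, 10, 15, 21], dtype=float)
sales = np.array([10, 6, 5, 12, 10, 15, 5, 12, 17, 20], dtype=float)

train, test = slice(0, 7), slice(7, 10)          # last 3 weeks held back (illustrative split)
fit = sm.OLS(sales[train], sm.add_constant(advertising[train])).fit()

pred = fit.predict(sm.add_constant(advertising[test], has_constant='add'))
errors = sales[test] - pred

rmse = np.sqrt(np.mean(errors ** 2))             # usual error criteria on the test set
mae = np.mean(np.abs(errors))
print(rmse, mae)
```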
Excel: Regression Output

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.891
R Square            0.795
Adjusted R Square   0.769
Standard Error      2.448
Observations        10

ANOVA
             df    SS        MS        F        Sig F
Regression   1     185.658   185.658   30.980   0.001
Residual     8     47.942    5.993
Total        9     233.6

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     0.784          2.025            0.387    0.709     -3.886      5.454
Advertising   0.914          0.164            5.566    0.001     0.535       1.292

Annotations: R-sq = SSR/SST = 185.7/233.6 = 0.795; Standard Error = 2.448; MSE (Mean Square Error) = SSE/residual df = 47.942/8 = 5.993


Coefficient of Determination (R2)

[Figure: Advertising vs Sales over time - scatter plot of Advertising ($100s) against Sales (1,000s) with fitted line y = 0.914x + 0.784 and R² = 0.795]

79.5% of the sample variation in Sales is explained by the variation in Advertising expenditure.
Statistical Testing

• Test for overall model significance – F test (a worked computation is sketched below)
• Test for individual variable significance – t tests
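As a worked illustration of the F test, the sketch below (scipy assumed; not part of the notes) reproduces the overall-significance test from the Example 1 ANOVA table, where F = MSR/MSE.

```python
# Overall significance (F test) from the Example 1 ANOVA table.
from scipy import stats

MSR, MSE = 185.658, 5.993      # regression and residual mean squares
df1, df2 = 1, 8                # regression df, residual df

F = MSR / MSE                          # approx 30.98, as in the output
p_value = stats.f.sf(F, df1, df2)      # upper-tail probability, approx 0.0005
print(F, p_value)                      # rounds to the Sig F of 0.001 shown in EXCEL
```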
Testing Individual Variables
Testing individual coefficients:

Separate tests of the population slope coefficients (βj) being zero (null hypothesis).

If a slope coefficient is zero it suggests the independent variable being examined does not influence the dependent variable.

Further, the independent variable being examined may be an irrelevant variable and could possibly be dropped from the model specification.

t test: check the p-value (< 0.05 then Reject H0).

Individual Coefficient Tests
Single Coefficient Tests:

Hypothesis tests can be applied to the coefficients of all variables separately. For a model given by

Y = β0 + β1*X1 + β2*X2 + ... + βk*Xk + e

the relevant test (each coefficient separately) is

H0: βj = 0 vs H1: βj ≠ 0

The test statistic has a t distribution, with the p-value indicating support for H0 or H1.
Are Individual Variables
Significant?
Use t tests of the individual variable slopes. These show whether there is a relationship between the variable Xj and Y.
Hypotheses:
H0: βj = 0 (no linear relationship exists between Xj and Y)
H1: βj ≠ 0 (a linear relationship does exist between Xj and Y)
Are Individual Variables
Significant? - (2)
H0: βj = 0 (no relationship)
H1: βj ≠ 0 (relationship does exist between Xj and Y)

Test statistic:

t = (bj − 0) / s(bj),  with df = n − k − 1

where s(bj) is the standard error of bj.

Check the p-value in the output (< 0.05: Reject H0)
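The sketch below (scipy assumed; not part of the notes) reproduces the t test for the Advertising slope in Example 1 from the coefficient and standard error in the output.

```python
# t test for the Advertising slope in Example 1.
from scipy import stats

b1, se_b1 = 0.914, 0.164      # coefficient and its standard error from the output
n, k = 10, 1                  # observations and number of explanatory variables

t = (b1 - 0) / se_b1                        # approx 5.57 (5.566 in the output, unrounded)
df = n - k - 1                              # 8 degrees of freedom
p_value = 2 * stats.t.sf(abs(t), df)        # two-tailed; small, rounds to the 0.001 shown
print(t, df, p_value)

# p_value < 0.05, so H0: beta_1 = 0 is rejected: Advertising is significant.
```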


Excel: Example 1 - t tests from Regression Output

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.891
R Square            0.795
Adjusted R Square   0.769
Standard Error      2.448
Observations        10

ANOVA
             df    SS        MS        F        Sig F
Regression   1     185.658   185.658   30.980   0.001
Residual     8     47.942    5.993
Total        9     233.6

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     0.784          2.025            0.387    0.709     -3.886      5.454
Advertising   0.914          0.164            5.566    0.001     0.535       1.292

Note: the P-value is the probability of obtaining a sample t value as extreme as, or more extreme than, the one observed, given that β is zero.

Check the Residuals
As in time series models, a necessary condition for the adequacy of a forecast model is non-systematic errors.

Check residuals for randomness: visual inspection of residual plots (vs. time and vs. all explanatory variables separately).

Examine the ACF and PACF of the residuals.

Systematic residuals may indicate a violation of the regression assumptions.

Other objective tests can be used (more on this next week).

Forecasting with Regression

The diagnostic tests are used as a tool to check specified models and to suggest potential improvements to the model specification.

Models may be modified (according to the diagnostic information) and the process of estimation and diagnostic testing repeated.

Once a final model is determined (with acceptable diagnostics) it is used for forecasts.

Forecasts use the estimated equation and estimates of future Xj values to forecast Yf.
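Tying back to the scenario analysis mentioned earlier, the sketch below (Python/statsmodels assumed; the future advertising levels are hypothetical) forecasts Yf from assumed future values of the explanatory variable.

```python
# Forecast Yf for assumed future advertising levels (worst/expected/best scenarios).
import numpy as np
import statsmodels.api as sm

advertising = np.array([9, 7, 5, 14, 15, 12, 6, 10, 15, 21], dtype=float)
sales = np.array([10, 6, 5, 12, 10, 15, 5, 12, 17, 20], dtype=float)
fit = sm.OLS(sales, sm.add_constant(advertising)).fit()

# Hypothetical future advertising scenarios ($100s): worst, expected, best case.
future_adv = np.array([8.0, 12.0, 18.0])
forecasts = fit.predict(sm.add_constant(future_adv, has_constant='add'))

for a, f in zip(future_adv, forecasts):
    print(f"Advertising = {a:.0f} ($100s) -> forecast Sales = {f:.2f} (000s)")
```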
