
Econometrics for Business I

BSE3703
Topic 3
Multivariate Regression Model
Learning Outcomes

At the end of the lesson, students must be able to:


1. understand the specifications of multivariate linear regression.
2. interpret the regression output table of multivariate regression model.
3. explain the normality assumption of the regression model.
4. understand the sampling distribution of OLS estimators in the regression model.
Multivariate Regression Model
We hope for a linear relationship between the variables.

Population Regression Model (population size N; i = 1, 2, …, N; j = 1, 2, …, k):

    yi = β0 + β1 x1i + β2 x2i + ⋯ + βk xki + ui = β0 + Σⱼ₌₁ᵏ βj xji + ui

The model should include all possible determinants of y.

Sample Regression Function (sample size n; i = 1, 2, …, n; j = 1, 2, …, k):

    yi = β̂0 + β̂1 x1i + β̂2 x2i + ⋯ + β̂k xki + ûi = β̂0 + Σⱼ₌₁ᵏ β̂j xji + ûi

Example data (k = 4 explanatory variables):

    i      y     x1  x2  x3    x4
    1      6756  29  17  2.83  21
    2      7500  38  23  3.75  22
    3      7440  32  19  3.15  14
    4      7740  31  23  4.12  19
    5      7836  38  21  3.57  16
    …      7416  28  17  2.83  16
    …      7596  37  20  3.37  21
    …      7860  42  23  4.67  17
    …      7716  30  22  3.68  19
    …      7476  34  20  3.32  23
    n−1    7536  35  21  3.42  18
    n      7356  30  19  3.13  22
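The sample regression function above can be sketched numerically. The following is a minimal illustration (simulated data with assumed coefficients, not the notes' dataset) that computes the OLS estimates β̂ = (X′X)⁻¹X′y directly:

```python
import numpy as np

# Illustrative sketch: OLS for y_i = b0 + b1*x1i + b2*x2i + u_i.
# All numbers below are assumptions for the simulation, not from the notes.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(35, 4, n)          # an age-like regressor
x2 = rng.normal(20, 2, n)          # an education-like regressor
u = rng.normal(0, 50, n)           # disturbance term
y = 4000 + 40 * x1 + 30 * x2 + u   # population model with known betas

X = np.column_stack([np.ones(n), x1, x2])   # add intercept column
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # (X'X)^{-1} X'y
resid = y - X @ beta_hat                      # u-hat, the residuals

print(beta_hat)  # close to (4000, 40, 30) for this sample size
```

With the intercept included, the residuals average exactly zero by construction, and the estimates land near the true coefficients because the sample is reasonably large.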
Sample Size for Regression Model

▪ Sample size for multivariate regression model should be ‘sufficiently large’ in order
to build an adequate regression model for inference and forecasting.
➢ There is no 'hard and fast rule' to derive the appropriate sample size.
➢ In the real world, the best practice is to configure your sample size according to your context.

▪ A ‘sufficiently large’ sample size will help to


1. increase the accuracy of the OLS estimators.
2. ensure the validity of hypothesis tests on the OLS estimators.
3. satisfy the normality assumption in the regression model.
4. mitigate the risks of ‘common problems’ often associated with the regression model.
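Point 1 above can be seen in a short simulation (illustrative only; the model, distributions, and sample sizes are assumptions, not from the notes): the spread of the OLS slope estimator shrinks as n grows.

```python
import numpy as np

# Sketch: the OLS slope estimator becomes more accurate (smaller spread)
# as the sample size grows. Simulated model: y = 2 + 3x + u.
rng = np.random.default_rng(1)

def slope_sd(n, reps=500):
    """Standard deviation of the fitted slope over `reps` samples of size n."""
    est = []
    for _ in range(reps):
        x = rng.normal(0, 1, n)
        y = 2 + 3 * x + rng.normal(0, 1, n)
        est.append(np.polyfit(x, y, 1)[0])  # np.polyfit returns [slope, intercept]
    return np.std(est)

sd_small, sd_large = slope_sd(30), slope_sd(300)
print(sd_small, sd_large)  # the larger sample gives a much tighter estimator
```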
Outliers
▪ Outliers will distort the computation of OLS estimators in the regression model.
▪ Therefore, the best practice is to clean up (or remove) outliers in the sample data
before moving forward to build the regression model.
▪ Outliers can be identified (and subsequently omitted) using a histogram or a box plot.
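One common way to operationalise the box-plot approach is the IQR rule: flag observations below Q1 − 1.5·IQR or above Q3 + 1.5·IQR. A small sketch (the data below are made up, with one planted outlier):

```python
import numpy as np

# Sketch of the box-plot (IQR) rule for flagging outliers.
data = np.array([52, 48, 50, 47, 53, 49, 51, 46, 54, 200])  # 200 is a planted outlier

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr          # box-plot "whisker" fences
outliers = data[(data < lo) | (data > hi)]        # candidates for removal
clean = data[(data >= lo) & (data <= hi)]         # sample with outliers omitted

print(outliers)  # -> [200]
```

In practice, flagged points should be inspected before deletion: an extreme value may be a data-entry error, or a genuine observation.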
Types of Survey Error
▪ Surveys are often used to collect samples.

▪ But surveys are subjected to potential errors.


▪ There are four types of survey errors.
1. Coverage error occurs if certain groups of items are excluded so that they have no chance of being
selected in the sample.
2. Nonresponse error arises from the failure to collect data on all items in the sample (even one missing answer on a survey form counts) and results in a nonresponse bias.
3. Sampling error reflects the “chance differences” from sample to sample, based on the probability of
particular individuals or items being selected from sample to sample.
4. Measurement error occurs because of a weakness in question wording, attributed to the fact that
the process of measurement is often governed by what is convenient, not what is needed.
Multivariate Regression Model : Example
Variable of Interest: Salary of Singapore Degree Holder (SALARY)

Determinants (Explanatory Variables):

    #  Determinant                    Representation  Data Type
    1  Age (in number of years)       AGE             Numerical
    2  Years of Education             EDU             Numerical
    3  Gender                         GEN             Categorical
    4  GPA Score                      GPA             Numerical
    5  Professional Certification(s)  PRO             Categorical
    6  Number of Siblings             SIB             Numerical

Note: in Topic 3, only numerical data are analysed.

Sample data (n = 237):

    i      SALARY  AGE  EDU  GPA   SIB
    1      6756    29   17   2.83  0
    2      7500    38   23   3.75  1
    3      7440    32   19   3.15  2
    4      7740    31   23   4.12  0
    5      7836    38   21   3.57  1
    …      7416    28   17   2.83  1
    …      7596    37   20   3.37  0
    …      7860    42   23   4.67  3
    …      7716    30   22   3.68  0
    …      7476    34   20   3.32  2
    236    7536    35   21   3.42  0
    237    7356    30   19   3.13  2

Population Regression Model:

    yi = β0 + β1 AGEi + β2 EDUi + β3 GPAi + β4 SIBi + ui

Sample Regression Model:

    yi = β̂0 + β̂1 AGEi + β̂2 EDUi + β̂3 GPAi + β̂4 SIBi + ûi
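The sample regression model can be estimated in one line of NumPy. Since the notes' actual dataset of 237 observations is not reproduced here, the sketch below simulates SALARY-like data with assumed coefficients and recovers them by OLS:

```python
import numpy as np

# Hedged sketch: simulated stand-in for the notes' n = 237 sample.
# The coefficients below are assumptions chosen near the notes' later output.
rng = np.random.default_rng(42)
n = 237
AGE = rng.integers(25, 45, n).astype(float)
EDU = rng.integers(15, 25, n).astype(float)
GPA = rng.uniform(2.5, 5.0, n)
SIB = rng.integers(0, 4, n).astype(float)
u = rng.normal(0, 100, n)
SALARY = 4350 + 37 * AGE + 36 * EDU + 378 * GPA - 5 * SIB + u

X = np.column_stack([np.ones(n), AGE, EDU, GPA, SIB])
beta_hat, *_ = np.linalg.lstsq(X, SALARY, rcond=None)  # OLS estimates
print(np.round(beta_hat, 2))  # intercept, then AGE, EDU, GPA, SIB slopes
```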
Regression Output Table : Interpretation

    yi = β̂0 + β̂1 AGEi + β̂2 EDUi + β̂3 GPAi + β̂4 SIBi + ûi

Coefficient of Determination:

    R² = 1 − RSS/TSS = 1 − Σ ûi² / Σ (yi − ȳ)²

where RSS (Residual Sum of Squares, Σ ûi²) has n − 1 − k degrees of freedom (n observations, minus k slope coefficients, minus 1 for β̂0), the explained part has k degrees of freedom, and TSS (Total Sum of Squares, Σ (yi − ȳ)²) has n − 1 degrees of freedom. In general, Mean Sum of Squares = Sum of Squares / Degrees of Freedom, and TSS = ESS + RSS.

▪ A problem with the R² statistic is that adding explanatory variables to the model will always increase it, even if those variables have no explanatory power: as k increases, Σ ûi² falls, so R² rises.

Adjusted R²:

    R̄² = 1 − [RSS / (n − 1 − k)] / [TSS / (n − 1)] = 1 − [Σ ûi² / (n − 1 − k)] / [Σ (yi − ȳ)² / (n − 1)]

▪ R̄² increases only if the added variable is relevant, i.e. the fall in the numerator outweighs the loss of a degree of freedom; R̄² is always no larger than R².
▪ R̄² ≤ 1, but it can be negative. As more explanatory variables are added to the model, R̄² only increases if the extra variables contribute significantly to the model's explanatory power.
▪ Since (n − 1) / (n − 1 − k) ≥ 1, it follows that R̄² ≤ R².

Standard Error of the Regression:

    SER = √[RSS / (n − 1 − k)] = √[Σ (yi − ŷi)² / (n − 1 − k)] = √[Σ ûi² / (n − 1 − k)]
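The goodness-of-fit formulas above are straightforward to compute from the residuals. A minimal sketch on simulated data (the model and sample size are assumptions for illustration):

```python
import numpy as np

# Sketch of R^2 = 1 - RSS/TSS, adjusted R^2, and SER = sqrt(RSS/(n-1-k)).
rng = np.random.default_rng(7)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
RSS = np.sum(resid**2)                 # residual sum of squares
TSS = np.sum((y - y.mean())**2)        # total sum of squares

R2 = 1 - RSS / TSS
R2_adj = 1 - (RSS / (n - 1 - k)) / (TSS / (n - 1))
SER = np.sqrt(RSS / (n - 1 - k))
print(R2, R2_adj, SER)  # R2_adj is slightly below R2, as the notes state
```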
Regression Output Table : Interpretation

    yi = 4348.772 + 37.3697 AGEi + 35.70034 EDUi + 377.8493 GPAi − 5.317322 SIBi + ûi

▪ The OLS estimators for a multivariate regression model can be easily computed using statistical software (including Stata and MS Excel).

▪ Apart from the OLS estimates (β̂0, β̂1, β̂2, β̂3, β̂4, one per explanatory variable plus the intercept), the regression output table contains useful information for checking whether the constructed regression model is adequate, including the statistics needed for hypothesis testing.
Residuals : Normality Assumption

    yi = 4348.772 + 37.3697 AGEi + 35.70034 EDUi + 377.8493 GPAi − 5.317322 SIBi + ûi

Normality Assumption (of residuals)
▪ The residuals ûi of the estimated regression equation should be distributed symmetrically around zero; that is, the residual is a random and independent variable that follows an approximately normal distribution with zero mean.
▪ The normality assumption of the residuals implies that
1. the estimated regression equation captures the main patterns and sources of variation between the response variable and all explanatory variables.
2. the OLS estimators are approximately normally distributed (which can be proven mathematically). This ensures that hypothesis tests on the OLS estimators can be performed with accuracy; such tests are valid only if the β̂'s are (approximately) normally distributed.

▪ Causes of a skewed distribution of residuals include outliers and a sample that is not 'sufficiently large'.
Residuals : Tests for Normality

    yi = 4348.772 + 37.3697 AGEi + 35.70034 EDUi + 377.8493 GPAi − 5.317322 SIBi + ûi

[Figure: histogram of residuals overlaid with normal and kernel density plots; normal quantile plot with 45-degree reference line]

1. The Normal Density Plot depicts the probability density function of the data. Unlike the histogram, the curve represents the proportion of the data in each range, rather than the frequency.
2. The Kernel Density Plot applies a kernel smoothing effect to the probability density estimate of the data.
3. Normal Quantile Plot: if the residuals fall along the 45-degree reference line, then the residuals are approximately normally distributed.

* Other statistical tests for normality include the Kolmogorov-Smirnov test, the Shapiro-Wilk test, the Jarque-Bera test, and the Anderson-Darling test.
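Several of the formal tests listed above are available in SciPy. The sketch below applies them to two simulated samples (stand-ins, since the notes' residuals are not reproduced here): one normal, one heavily skewed; a small p-value rejects normality.

```python
import numpy as np
from scipy import stats

# Sketch: formal normality tests on simulated "residuals".
rng = np.random.default_rng(3)
normal_resid = rng.normal(0, 1, 500)    # stand-in for well-behaved residuals
skewed_resid = rng.exponential(1, 500)  # stand-in for skewed residuals

sw_stat, sw_p = stats.shapiro(normal_resid)         # Shapiro-Wilk
jb_stat, jb_p = stats.jarque_bera(skewed_resid)     # Jarque-Bera
ks_stat, ks_p = stats.kstest(normal_resid, 'norm')  # Kolmogorov-Smirnov vs N(0,1)

# The skewed sample should be decisively rejected by the Jarque-Bera test.
print(sw_p, jb_p, ks_p)
```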
Sampling Distribution of OLS Estimators

    β̂ ≈ N( E[β̂], Var(β̂) )

    E[β̂] = β
    Var(β̂) = σ² (X′X)⁻¹,   where σ² = Var(u)

In practice σ² is unknown, so the variance is estimated by

    V̂ar(β̂) = σ̂² (X′X)⁻¹,   where σ̂² = (1/n) Σᵢ₌₁ⁿ ûi²

• β̂ is an unbiased estimator of β.
• Standard error of β̂: se(β̂) = √V̂ar(β̂)
• Standardising β̂ by its standard error gives a t-distribution with n − 1 − k degrees of freedom (df):

    ( β̂ − E[β̂] ) / se(β̂) ≈ t(n − 1 − k)
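The variance formulas above translate directly into code. A minimal sketch on simulated data (model and sizes are assumptions), using the notes' estimator σ̂² = (1/n) Σ ûi² — note that many texts divide by n − 1 − k instead:

```python
import numpy as np

# Sketch: Var-hat(beta-hat) = sigma-hat^2 (X'X)^{-1}, standard errors, t-ratios.
rng = np.random.default_rng(5)
n, k = 150, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = (resid @ resid) / n                # (1/n) * sum of squared residuals
var_beta = sigma2_hat * np.linalg.inv(X.T @ X)  # estimated covariance matrix
se = np.sqrt(np.diag(var_beta))                 # standard error of each beta-hat
t_stats = beta_hat / se                         # t-ratios against H0: beta_j = 0
print(se, t_stats)
```

Each t-ratio is then compared against the t(n − 1 − k) distribution to test whether the corresponding coefficient differs from zero.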
For a normally distributed variable X, standardising gives the Z (standard normal) distribution; for the OLS estimator β̂, standardising by the estimated standard error gives a t-distribution:

    X ~ N( E[X], Var(X) )        →   Z = (X − E[X]) / sd(X) ~ N(0, 1)
    β̂ ≈ N( E[β̂], Var(β̂) )      →   (β̂ − E[β̂]) / se(β̂) ≈ t(n − 1 − k),  with df = n − 1 − k
t Distribution

[Figure: t-distribution densities for df = 10, df = 60, and df ≥ 120 plotted against the Standard Normal (Z) Distribution = N(0, 1); as df grows, the t-distribution approaches the Z distribution]

▪ When n (the sample size) increases:
→ the degrees of freedom (df) of the t-distribution increase;
→ the t-distribution approaches the Z distribution.

▪ As n increases, se(β̂) decreases, so accuracy increases.

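The convergence shown in the figure can be checked numerically: t critical values (here at the two-sided 5% level, an illustrative choice) shrink toward the standard normal critical value as the degrees of freedom grow.

```python
from scipy import stats

# Sketch: t critical values approach the Z critical value as df increases.
z_crit = stats.norm.ppf(0.975)                              # ~1.96
t_crits = {df: stats.t.ppf(0.975, df) for df in (10, 60, 120)}
print(round(z_crit, 3), {df: round(c, 3) for df, c in t_crits.items()})
```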

Prepared by

Daniel SOH
