CH 14 Handout

Multiple regression models involve predicting an outcome variable based on two or more predictor variables. The model makes assumptions about the distributions of the error terms and relationships between predictors. Key outputs include the standard error of the errors, coefficient of determination (R2), and regression coefficients. R2 indicates how much variation in the outcome is explained by the predictors but can increase spuriously with more predictors. Adjusted R2 accounts for this. Coefficients represent the size and direction of change in the outcome per unit change in each predictor while holding others constant. Their significance can be tested to determine if a predictor meaningfully influences the outcome.

Uploaded by

Jnt

Chapter 14: Multiple Regression

A regression model that includes two or more independent variables is called a multiple regression model. The population
model is given by the equation:

$$y = A + B_1 x_1 + B_2 x_2 + B_3 x_3 + \dots + B_k x_k + \varepsilon$$

If every x variable enters the model raised to the power of one, the model is a first-order multiple regression model. If a
model has k independent variables and is estimated from a sample of size n, then the degrees of freedom are df = n - k - 1.

Multiple Regression Model Assumptions

Assumption 1: The mean of the probability distribution of ε is zero, that is, E(ε) = 0.

Assumption 2: The errors associated with different sets of values of the independent variables are independent.
Furthermore, these errors are normally distributed and have a constant standard deviation, which is denoted by σε.

Assumption 3: The independent variables are not linearly related. However, they can have a nonlinear relationship.
When independent variables are highly linearly correlated, it is referred to as multicollinearity.

Assumption 4: There is no linear association between the random error term ε and each independent variable xi.

Standard Deviation of Sample Errors:

The population standard deviation of errors is denoted by the symbol σε. The sample standard deviation of errors is denoted
by the symbol se:

$$s_e = \sqrt{\frac{SSE}{n - k - 1}}, \quad \text{where } SSE = \sum (y - \hat{y})^2$$

This value can be generated using Stata. In the results, se is usually termed Root MSE, where MSE means Mean Square
Error.
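
As a quick check of the formula, the following Python sketch recomputes se from the residual sum of squares that Stata reports for Example 1 in this handout (SSE = 327.39726, n = 5, k = 1). The function name `std_error_of_estimate` is an illustrative choice, not part of Stata or of the handout.

```python
import math


def std_error_of_estimate(sse: float, n: int, k: int) -> float:
    """Return s_e = sqrt(SSE / (n - k - 1)) for a model with k predictors."""
    df = n - k - 1  # residual degrees of freedom
    if df <= 0:
        raise ValueError("need n > k + 1 to estimate the error variance")
    return math.sqrt(sse / df)


# Residual SS, sample size and number of predictors from the Example 1 output.
s_e = std_error_of_estimate(327.39726, n=5, k=1)
print(f"s_e = {s_e:.3f}")  # should agree with Stata's Root MSE for Example 1
```

The result should match the Root MSE line of the Example 1 regression table, up to rounding.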

Coefficient of Multiple Determination

Similar to Simple Regression models, the coefficient of determination in a Multiple Regression Model, R2, measures the
proportion of Total Sum of Squares (SST) that is explained by the regression model. It measures how well the
independent variables explain the dependent variable. Recall that SSR is the part of SST that is explained by the
regression model. (SSE is the part of SST that is not explained by the regression model).

$$R^2 = \frac{SSR}{SST}$$

In multiple regression models, the value of R2 never decreases, and usually increases, when more independent variables
are added, even if the additional variables have no significant influence on the dependent variable. A rising R2 can
therefore mislead us into thinking that the model is improving with each additional independent variable.
To overcome this shortcoming of R2 we use an alternative measure, the adjusted coefficient of determination, denoted $\bar{R}^2$.
The formula is

$$\bar{R}^2 = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right) \quad \text{or} \quad \bar{R}^2 = 1 - \frac{SSE/(n - k - 1)}{SST/(n - 1)}$$

One thing to note is that while R2 can only take a value between 0 and 1, $\bar{R}^2$ can be negative.
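
The two measures can be compared numerically with a short Python sketch. It uses the Model and Total sums of squares reported for Example 2 in this handout (SSR = 17961.2895, SST = 19289, n = 12, k = 2); the function names are illustrative assumptions, and the last call uses made-up inputs purely to show that the adjusted value can fall below zero.

```python
def r_squared(ssr: float, sst: float) -> float:
    """Coefficient of multiple determination: share of SST explained by the model."""
    return ssr / sst


def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Sums of squares from the Example 2 output (12 drivers, 2 predictors).
r2 = r_squared(17961.2895, 19289.0)
adj_r2 = adjusted_r_squared(r2, n=12, k=2)
print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")

# Hypothetical weak model: small R2, few observations, several predictors.
weak_adj_r2 = adjusted_r_squared(0.10, n=5, k=3)
print(f"adjusted R2 for the weak model = {weak_adj_r2:.2f}")  # negative
```

The penalty factor (n - 1)/(n - k - 1) grows with k, which is why adding a predictor that barely raises R2 can lower the adjusted value.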

Interpreting Regression Results in Stata:

Example 1: Simple Linear Regression Model

Last year five randomly selected students took a math aptitude test before they began their statistics course. Their
results are presented in the table below:

Student Serial    Aptitude Test Score    Grade in Statistics Class

      1                   95                        85
      2                   85                        95
      3                   80                        70
      4                   70                        65
      5                   60                        70

Q1. Present descriptive statistics of the dependent and the independent variables.

. sum statistics

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
  statistics |         5          77     12.5499         65         95

. sum aptitude

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    aptitude |         5          78    13.50926         60         95

Q2. What linear regression equation best predicts statistics performance, based on math aptitude scores?

. reg statistics aptitude

      Source |       SS       df       MS              Number of obs =       5
-------------+------------------------------           F(  1,     3) =    2.77
       Model |   302.60274     1   302.60274           Prob > F      =  0.1945
    Residual |   327.39726     3   109.13242           R-squared     =  0.4803
-------------+------------------------------           Adj R-squared =  0.3071
       Total |         630     4       157.5           Root MSE      =  10.447

------------------------------------------------------------------------------
  statistics |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    aptitude |   .6438356   .3866477     1.67   0.194      -.58665    1.874321
       _cons |   26.78082   30.51824     0.88   0.445    -70.34184    123.9035
------------------------------------------------------------------------------

ŷ = 26.78 + 0.644 aptitude
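
The coefficients in the table can be reproduced outside Stata from the five observations. The Python sketch below uses the textbook closed-form least-squares formulas for one predictor (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x); it is an independent check, not a description of how Stata computes the fit, and the function name `fit_simple_ols` is an illustrative choice.

```python
# The five observations from the table in Example 1.
aptitude = [95, 85, 80, 70, 60]
statistics_grade = [85, 95, 70, 65, 70]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


a, b = fit_simple_ols(aptitude, statistics_grade)
print(f"intercept = {a:.5f}, slope = {b:.7f}")

# Predicted statistics grade for an aptitude score of 80, using unrounded coefficients.
predicted_80 = a + b * 80
print(f"prediction at aptitude 80 = {predicted_80:.2f}")
```

Because the sketch keeps the unrounded coefficients, its prediction can differ in the second decimal place from one computed with the rounded equation.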

Q3. Fit the linear regression line through a scatter plot of the observations.

. graph twoway (lfit statistics aptitude ) (scatter statistics aptitude)

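[Figure: scatter plot of statistics grade against aptitude score with the fitted regression line. x-axis: aptitude (60 to 100); y-axis: 60 to 100; legend: Fitted values, statistics]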

Q4. If a student made an 80 on the aptitude test, what grade would we expect her to make in statistics?

ŷ = 26.78 + 0.644 aptitude
ŷ = 26.78 + 0.644 (80)
ŷ = 78.3

Q5. How well does the regression equation fit the data?

R-squared = 0.4803

48.03% of the variation in statistics grades is explained by the scores on the aptitude test.

Example 2: Multiple Regression Model

A researcher wanted to find the effect of driving experience and the number of driving violations on auto insurance
premiums. A sample of 12 drivers was selected.

y = the monthly auto insurance premium (in dollars) paid by a driver

x1 = the driving experience (in years) of a driver

x2 = the number of driving violations committed by a driver during the past three years

We are to estimate the regression model: y =A +B1x1 + B2x2 + ε

. reg premium experience violations

      Source |       SS       df       MS              Number of obs =      12
-------------+------------------------------           F(  2,     9) =   60.88
       Model |  17961.2895     2  8980.64477           Prob > F      =  0.0000
    Residual |  1327.71045     9  147.523383           R-squared     =  0.9312
-------------+------------------------------           Adj R-squared =  0.9159
       Total |       19289    11  1753.54545           Root MSE      =  12.146

------------------------------------------------------------------------------
     premium |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  experience |  -2.747272   .9770062    -2.81   0.020    -4.957414   -.5371308
  violations |   16.10612    2.61332     6.16   0.000     10.19438    22.01786
       _cons |   110.2761   14.61865     7.54   0.000     77.20639    143.3458
------------------------------------------------------------------------------

Using the software output, we are going to answer the following questions:
(a) Explain the meaning of the estimated regression coefficients.

From the Stata output above, we can form the regression equation as follows:

ŷ = 110.2761 − 2.747 experience + 16.106 violations

The value of a = 110.2761. This means that a driver with 0 years of driving experience and 0 violations in the past three
years is predicted to pay a premium of $110.28 per month.
The value of b1 = −2.747 indicates that with each additional year of experience, the predicted premium falls by $2.747
per month, holding the number of violations in the past three years fixed.
The value of b2 = 16.106 indicates that with each additional violation committed in the past three years, the predicted
premium rises by $16.106 per month, holding the number of years of driving experience fixed.

(b) Comment on the statistical significance of the coefficients.

By intuition, we expect years of driving experience and monthly insurance premiums to be negatively related. In our
sample, the coefficient of driving experience in the regression model is b1 = −2.747 < 0. To determine the statistical
significance of the coefficient, we need to evaluate whether there is enough evidence to conclude that the relationship
between monthly premiums and years of driving experience is negative in the population. (Stata runs a two-tailed test
by default, so the hypotheses below test whether the coefficient differs from zero.)

H0: B1 = 0

H1: B1 ≠ 0

The test statistic follows a t distribution with n − k − 1 = 9 degrees of freedom, and its value is −2.81. The p-value for
this two-tailed test is 0.020. If we select α = 5%, then p-value < α and we reject the null hypothesis, concluding that
there is sufficient evidence that the coefficient is different from zero in the population (the negative t-statistic
indicates that the relationship is negative). However, if we choose a 1% level of significance, then p-value > α and we do
not reject the null hypothesis. The evidence that the coefficient differs from 0 in the population is therefore moderately
strong: driving experience is statistically significant at the 5% level but not at the 1% level.

By intuition, we expect the number of driving violations (in the past three years) and monthly insurance premiums to be
positively related. In our sample, the coefficient of the number of violations (in the past three years) in the regression
model is b2 = 16.106 > 0. To determine the statistical significance of the coefficient, we need to evaluate whether there
is enough evidence to conclude that the relationship between monthly premiums and the number of violations in the past
three years is positive in the population.

H0: B2 = 0

H1: B2 ≠ 0

The test statistic follows a t distribution with n − k − 1 = 9 degrees of freedom, and its value is 6.16. The p-value for
this two-tailed test is reported as 0.000, that is, less than 0.0005. Even if we select α = 1%, p-value < α and we reject
the null hypothesis, concluding that there is sufficient evidence that the coefficient is different from zero in the
population (the positive t-statistic indicates that the relationship is positive). Hence, there is overwhelming evidence
that the coefficient differs from 0, and the variable, number of violations in the past three years, is statistically
significant in explaining variations in the dependent variable.
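
The t-statistics in the Stata table are the coefficient divided by its standard error, and the same decisions can be reached by comparing |t| with a critical value instead of comparing the p-value with α. The Python sketch below does this with two-tailed critical values for 9 degrees of freedom taken from a standard t table (2.262 at 5% and 3.250 at 1%); those table values and the function names are assumptions brought in for illustration, not part of the Stata output.

```python
# Two-tailed critical values of the t distribution with 9 degrees of freedom,
# taken from a standard t table (assumed inputs, not part of the Stata output).
T_CRITICAL_9DF = {0.05: 2.262, 0.01: 3.250}


def t_statistic(coef: float, std_err: float) -> float:
    """t = estimated coefficient / its standard error (test of H0: B = 0)."""
    return coef / std_err


def is_significant(t: float, alpha: float) -> bool:
    """Reject H0 in a two-tailed test when |t| exceeds the critical value."""
    return abs(t) > T_CRITICAL_9DF[alpha]


# Coefficients and standard errors from the Example 2 output.
t_experience = t_statistic(-2.747272, 0.9770062)
t_violations = t_statistic(16.10612, 2.61332)

for name, t in [("experience", t_experience), ("violations", t_violations)]:
    print(
        f"{name}: t = {t:.2f}, "
        f"significant at 5%: {is_significant(t, 0.05)}, "
        f"at 1%: {is_significant(t, 0.01)}"
    )
```

The critical-value comparison and the p-value comparison are equivalent ways of carrying out the same two-tailed test.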

(c) What are the values of the coefficient of multiple determination, and the adjusted coefficient of multiple
determination?

R2 = 0.9312 and Adjusted R2 = 0.9159

(d) What is the predicted auto insurance premium paid per month by a driver with seven years of driving
experience and three driving violations committed in the past three years?

ŷ = 110.2761 − 2.747 experience + 16.106 violations
ŷ = 110.2761 − 2.747 (7) + 16.106 (3)
ŷ = 139.37

A driver with seven years of driving experience and three violations in the past three years is predicted to pay a
premium of $139.37 per month.

(e) What is the point estimate of the expected (or mean) auto insurance premium paid per month by all drivers
with 12 years of driving experience and 4 driving violations committed in the past three years?

ŷ = 110.2761 − 2.747 experience + 16.106 violations
ŷ = 110.2761 − 2.747 (12) + 16.106 (4)
ŷ = 141.74
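
The arithmetic in parts (d) and (e) can be checked with a small Python sketch that evaluates the estimated equation. The coefficients are the rounded values used in the handout's equation; the function name `predict_premium` is an illustrative choice.

```python
def predict_premium(
    experience: float,
    violations: float,
    a: float = 110.2761,   # intercept from the Stata output
    b1: float = -2.747,    # coefficient of experience (rounded as in the handout)
    b2: float = 16.106,    # coefficient of violations (rounded as in the handout)
) -> float:
    """Predicted monthly premium in dollars: a + b1*experience + b2*violations."""
    return a + b1 * experience + b2 * violations


premium_d = predict_premium(experience=7, violations=3)    # part (d)
premium_e = predict_premium(experience=12, violations=4)   # part (e)
print(f"(d) {premium_d:.2f}  (e) {premium_e:.2f}")
```

The same number serves as the predicted premium for one driver and as the point estimate of the mean premium for all drivers with those values; only the interval around it would differ.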

(f) Determine a 95% confidence interval for B1 (the coefficient of experience) for the multiple regression of
auto insurance premium on driving experience and the number of driving violations.

According to the Stata output, the 95% confidence interval for the coefficient of experience runs from -4.957 to -0.537.
This means that we are 95% confident that the interval from -4.957 to -0.537 contains the true population coefficient of
experience, B1.
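
The interval that Stata reports can be rebuilt as the coefficient plus or minus a critical t value times the standard error. The Python sketch below assumes the critical value 2.262157 for 97.5% cumulative probability with 9 degrees of freedom, taken from a t table rather than from the handout; the function name is illustrative.

```python
# Critical value t(0.975, df = 9) from a t table; an assumed input, not in the Stata output.
T_975_9DF = 2.262157


def confidence_interval(coef: float, std_err: float, t_crit: float):
    """Return (lower, upper) = coef -/+ t_crit * std_err."""
    margin = t_crit * std_err
    return coef - margin, coef + margin


# Coefficient and standard error of experience from the Example 2 output.
lower, upper = confidence_interval(-2.747272, 0.9770062, T_975_9DF)
print(f"95% CI for B1: ({lower:.3f}, {upper:.3f})")
```

Since the whole interval lies below zero, it agrees with rejecting H0: B1 = 0 at the 5% level.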
