
Session Notes on Multiple Linear Regression

Let us look at the situation considered for the discussion. We have a consortium of US firms
that produce raw materials used in Singapore. They are interested in the following:
1. Predicting the level of exports from the US.
2. Understanding the relationship between US exports to Singapore and certain
variables affecting the economy of that country.
What are the advantages of doing the above?
1. Understanding the relationship will allow the consortium members to time their
marketing efforts to coincide with favourable conditions in the Singapore economy.
2. Understanding the relationship will also allow the exporters to determine
whether expansion of exports to Singapore is feasible.
3. It will also help identify the significant variables that act as the main drivers of exports
to Singapore.
Variables considered in the study
 US exports to Singapore in billions of Singapore dollars (the dependent variable, Exports)
 Money supply figures in billions of Singapore dollars (variable M1)
 Minimum Singapore bank lending rate in percentages (variable Lend)
 An index of local prices with 1974 as the base year (variable Price)
 The exchange rate of Singapore dollars per U.S. dollar (variable Exchange)
Why should regression be used as the method for analysing these data? Given the
objectives, regression is the appropriate method, since it gives one the opportunity to
1. Measure the change in exports associated with changes in the levels of the other
drivers considered.
2. Test the significance of each driver or variable that contributes to the change in
exports.
3. Help the US consortium identify favourable conditions.
4. Build a model that connects exports to the significant drivers of exports and use it to
make predictions.
Assumptions associated with linear regression analysis
1. The response and the regressor variables are linearly related
2. The residuals have zero mean
3. The residuals have constant variance
4. The residuals are uncorrelated
5. The residuals are normally distributed
6. The regressors are independent of one another (no multicollinearity)
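Once a model has been fitted with lm() (as done in the codes below), several of these
assumptions can be checked visually with R's built-in diagnostic plots. A minimal sketch,
assuming the fitted model object exp_lm2 built later in these notes:

par(mfrow=c(2,2)) # Arrange the four diagnostic plots in a 2 x 2 grid
plot(exp_lm2) # Residuals vs fitted, normal Q-Q, scale-location and residuals vs leverage plots
par(mfrow=c(1,1)) # Reset the plotting layout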
Discussion on R codes
In order to adopt R as a tool for running the regression analysis, we need to install a few
packages available in R. These packages are developed by researchers and come with
various built-in functions that are used to run the analysis. For running regression analysis in
R, we install the following packages:
car - Companion to Applied Regression
https://www.rdocumentation.org/packages/car/versions/3.0-8

psych - procedures for psychological, psychometric and personality research
https://www.rdocumentation.org/packages/psych/versions/1.9.12.31

Hmisc - Harrell Miscellaneous
https://www.rdocumentation.org/packages/Hmisc/versions/4.4-0

lmtest - Testing Linear Regression Models
https://www.rdocumentation.org/packages/lmtest/versions/0.9-37

lm.beta - adds standardized regression coefficients to lm objects
https://www.rdocumentation.org/packages/lm.beta/versions/1.5-1/topics/lm.beta
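As a convenience (an alternative to the separate install.packages() calls in the codes below),
all the packages used in these notes can be installed in a single call and then loaded one by
one with library():

install.packages(c("readxl","psych","Hmisc","lmtest","lm.beta","car")) # One-time installation of all packages used in these notes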

R-codes
setwd("F:/07.PGDM 2020/03.DAR/09.R-Codes") # This is used to set the working directory
getwd() # Used to check the current working directory

install.packages("readxl") # Used to install the package for importing the excel files to R
library(readxl) # Used to call the package readxl
install.packages("psych")
library(psych)
install.packages("Hmisc")
library(Hmisc)
install.packages("lmtest")
library(lmtest)
install.packages("lm.beta")
library(lm.beta)
install.packages("car")
library(car)
exports=read_excel(file.choose()) # Import the Excel file and store it as the data frame "exports"
attach(exports) # Attach the data frame so that its variables can be referred to by name
fix(exports) # Open the data file in the R data editor
View(exports) # Open the data file to view the data
#Summary Statistics
# Before building the model, it is very important to understand the variables better.
# For this, one can obtain the summary statistics and describe each variable using
# measures such as the minimum, maximum, mean, median and quartiles.
summary(exports)
Exports M1 Lend Price
Min. :2.600 Min. :4.900 Min. : 7.80 Min. :114.0
1st Qu.:4.200 1st Qu.:6.000 1st Qu.: 9.00 1st Qu.:146.0
Median :4.800 Median :7.000 Median :10.00 Median :151.0
Mean :4.528 Mean :6.909 Mean :10.52 Mean :147.3
3rd Qu.:5.100 3rd Qu.:8.100 3rd Qu.:11.60 3rd Qu.:154.0
Max. :5.600 Max. :8.800 Max. :15.00 Max. :162.0

Exchange
Min. :2.040
1st Qu.:2.100
Median :2.130
Mean :2.133
3rd Qu.:2.160
Max. :2.240
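Since the psych package is loaded above, a more detailed numerical summary (standard
deviation, skewness, kurtosis, etc.) can also be obtained. This is an optional sketch, not part
of the original output:

describe(exports) # From the psych package: n, mean, sd, median, min, max, skew and kurtosis for each variable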

#scatter plots
pairs(~exports$Exports+exports$M1+exports$Lend+exports$Price+exports$Exchange)
# This is used to get the scatter plots for all the variables considered in the study
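The scatter plots can be complemented with the pairwise correlation coefficients. A sketch
using the Hmisc package loaded earlier (not part of the original output):

rcorr(as.matrix(exports)) # Correlation matrix for all variables, together with the p-value of each pairwise correlation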
#Building the model
exp_lm=lm(Exports~M1+Lend+Price+Exchange, exports)
# "lm" stands for "linear model" and is used to build the model. The symbol ~ links the
# response (dependent variable) to the regressor variables (independent variables). All the
# regressor variables are included in the formula using the "+" sign.
exp_lm # This gives the coefficient values of the model.
Coefficients:
(Intercept) M1 Lend Price
-4.015461 0.368456 0.004702 0.036511
Exchange
0.267896

summary(exp_lm) # This gives the detailed results of the model, including the significance test for each coefficient


Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.015461 2.766401 -1.452 0.151679
M1 0.368456 0.063848 5.771 2.71e-07 ***
Lend 0.004702 0.049222 0.096 0.924201
Price 0.036511 0.009326 3.915 0.000228 ***
Exchange 0.267896 1.175440 0.228 0.820465
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3358 on 62 degrees of freedom


Multiple R-squared: 0.825, Adjusted R-squared: 0.8137
F-statistic: 73.06 on 4 and 62 DF, p-value: < 2.2e-16

# Rebuilding the model after dropping the insignificant variables


exp_lm2=lm(Exports~M1+Price,exports)
# Here, we rebuild the model after dropping the variables found to be insignificant above (Lend and Exchange).
exp_lm2
summary(exp_lm2)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.422957 0.540853 -6.329 2.75e-08 ***
M1 0.361417 0.039246 9.209 2.45e-13 ***
Price 0.037033 0.004094 9.046 4.70e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3306 on 64 degrees of freedom


Multiple R-squared: 0.8248, Adjusted R-squared: 0.8193
F-statistic: 150.7 on 2 and 64 DF, p-value: < 2.2e-16
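The lm.beta package loaded earlier can be used to compare the relative importance of the
drivers on a common scale. A sketch applied to the final model exp_lm2 (the object name
exp_lm2_beta is chosen here for illustration; standardized coefficients were not part of the
original output):

exp_lm2_beta=lm.beta(exp_lm2) # Adds standardized (beta) coefficients to the fitted lm object
summary(exp_lm2_beta) # Coefficient table with an additional column of standardized estimates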

#Testing the assumptions of the model


#1. The average residual is zero. This indicates that the contribution from the unknown
# (omitted) variables is, on average, negligible.
mean(exp_lm2$residuals) # The mean of the residuals should be numerically zero
[1] -2.531693e-18

#2. Variance of residual is constant


bptest(exp_lm2) # The Breusch-Pagan test is used to check this assumption; a p-value above 0.05 means constant variance is not rejected
studentized Breusch-Pagan test

data: exp_lm2
BP = 3.0888, df = 2, p-value = 0.2134

#3. Errors are uncorrelated


durbinWatsonTest(exp_lm2) # The Durbin-Watson test (from the car package) is used to check this assumption
lag Autocorrelation D-W Statistic p-value
1 -0.3188038 2.576484 0.024
Alternative hypothesis: rho != 0
#4. Normality of the residuals
shapiro.test(exp_lm2$residuals) # The Shapiro-Wilk test is used to check this assumption
Shapiro-Wilk normality test

data: exp_lm2$residuals
W = 0.96227, p-value = 0.03998

#5. All the regressors are independent


vif(exp_lm2) # Variance inflation factor is used to check this assumption
M1 Price
1.249779 1.249779
# If a VIF value is more than 10, then we conclude that there is a problem of multicollinearity;
# here both values are close to 1, so multicollinearity is not a concern
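For this two-regressor model, the VIF can also be reproduced by hand as 1/(1 - R^2), where
R^2 is obtained by regressing one regressor on the other. A sketch whose result should
match the vif() output above:

r2=summary(lm(M1~Price, exports))$r.squared # R-squared from regressing M1 on Price
1/(1-r2) # Should be close to the value of about 1.25 reported by vif() above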
# Point estimates and confidence intervals for the regression coefficients
exp_lm2$coefficients # The estimated regression coefficients
(Intercept) M1 Price
-3.42295723 0.36141732 0.03703264
# Confidence Intervals
confint(exp_lm2, level = 0.95)

2.5 % 97.5 %
(Intercept) -4.50343606 -2.34247841
M1 0.28301385 0.43982079
Price 0.02885435 0.04521092
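Each interval can also be reproduced by hand as estimate ± (t critical value) x (standard
error), using the coefficient table of summary(exp_lm2) and the t distribution with 64
residual degrees of freedom. A sketch for the M1 coefficient:

0.361417 + c(-1,1)*qt(0.975, df=64)*0.039246 # Reproduces the 95% interval (0.283, 0.440) for M1 shown above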

predict(exp_lm2, interval = "confidence") # Fitted values for the observed data with confidence intervals for the mean response


# Prediction Intervals
new1=data.frame(M1=c(5.3,5.4,5.5), Price=c(118,119,120)) # New observations for which predictions are required
new1
help("predict") # Documentation for the predict function
exp_lm2
predict(exp_lm2,new1) # Point predictions for the new observations
predict(exp_lm2,new1, interval = "prediction") # Point predictions with prediction intervals
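For a single new observation, the point prediction can also be written out directly from the
fitted equation of exp_lm2. A sketch for the first new observation (M1 = 5.3, Price = 118),
using the estimated coefficients reported above:

-3.422957 + 0.361417*5.3 + 0.037033*118 # Approximately 2.86, which should match the first value of predict(exp_lm2,new1)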
detach(exports)
