0% found this document useful (0 votes)
11 views

Expt3.ipynb - JupyterLab

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views

Expt3.ipynb - JupyterLab

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

10/20/24, 12:23 AM expt3

In [1]: # EXPERIMENT -3 :: Multiple Linear Regression

# Using multiple linear regression perform the following tasks on the Autodata s
# (a) Produce a scatter plot matrix which includes all of the variable sin the d
# (b) Compute the matrix of correlations between the variables using the f
# You will need to exclude the name variable, cor() which is qualitat
# (c) Use the lm() function to perform a multiple linear regression with m
# and all other variables except name as the predictors. Use the summ
# print the results.
# Comment on the output, That is
# i. Is there a relationship between the predictors and the response?
# ii. Which predictors appear to have a statistically
# significant relationship to the response?
# iii. What does the coefficient for the year variable suggest?
# (d) Use the plot() function to produce diagnostic plots of the linear regres
# Comment on any problems you see with the fit.
# Do the residual plots suggest any unusually large outliers?
# Does the leverage plot identify any observations with unusually high leverage?
# (e) Use the * and : symbols to fit linear regression models with interaction
# any interactions appear to be statistically significant?
# (f) Try a few different transformations of the variables, such aslog(X),√ X,
# Comment on your findings

#This question involves the use of multiple linear regression on the Auto data s

library(ISLR)
library(MASS)
data("Auto")
typeof(data)
head(Auto)

#Produce a scatterplot matrix which includes all of the variables in the data se

pairs(Auto)

#Compute the matrix of correlations between the variables using the function cor
Auto$name<-NULL
cor(Auto,method = c("pearson"))

lm.fit<-lm(mpg~.,data=Auto)
summary(lm.fit)

which.max(hatvalues(lm.fit))
## 14
## 14
par(mfrow = c(2,2))
plot(lm.fit)

localhost:8888/lab/tree/expt3/expt3.ipynb 1/5
10/20/24, 12:23 AM expt3

#Use the plot() function to produce diagnostic plots of the linear regression fi
#Comment on any problems you see with the fit.
lm.fit = lm(mpg ~.-name+displacement:weight, data = Auto)
summary(lm.fit)

#Use the * and : symbols to fit linear regression models with interaction effect
#Do any interactions appear to be statistically significant?
lm.fit = lm(mpg ~.-name+I((displacement)^2)+log(displacement)+displacement:weigh
summary(lm.fit)

#Try a few different transformations of the variables, such as log(X),√X, X2. Co


lm.fit = lm(mpg ~.-name+I((displacement)^2)+log(displacement)+displacement:weigh
summary(lm.fit)

Warning message:
"package 'ISLR' was built under R version 4.4.1"
'closure'
A data.frame: 6 × 9

mpg cylinders displacement horsepower weight acceleration year origin

<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>

che
1 18 8 307 130 3504 12.0 70 1 ch
m

2 15 8 350 165 3693 11.5 70 1 s

plym
3 18 8 318 150 3436 11.0 70 1
sa

4 16 8 304 150 3433 12.0 70 1


re

5 17 8 302 140 3449 10.5 70 1

6 15 8 429 198 4341 10.0 70 1 g

localhost:8888/lab/tree/expt3/expt3.ipynb 2/5
10/20/24, 12:23 AM expt3

A matrix: 8 × 8 of type dbl

mpg cylinders displacement horsepower weight acceleration

mpg 1.0000000 -0.7776175 -0.8051269 -0.7784268 -0.8322442 0.4233285

cylinders -0.7776175 1.0000000 0.9508233 0.8429834 0.8975273 -0.5046834

displacement -0.8051269 0.9508233 1.0000000 0.8972570 0.9329944 -0.5438005

horsepower -0.7784268 0.8429834 0.8972570 1.0000000 0.8645377 -0.6891955

weight -0.8322442 0.8975273 0.9329944 0.8645377 1.0000000 -0.4168392

acceleration 0.4233285 -0.5046834 -0.5438005 -0.6891955 -0.4168392 1.0000000

year 0.5805410 -0.3456474 -0.3698552 -0.4163615 -0.3091199 0.2903161

origin 0.5652088 -0.5689316 -0.6145351 -0.4551715 -0.5850054 0.2127458

Call:
lm(formula = mpg ~ ., data = Auto)

Residuals:
Min 1Q Median 3Q Max
-9.5903 -2.1565 -0.1169 1.8690 13.0604

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -17.218435 4.644294 -3.707 0.00024 ***
cylinders -0.493376 0.323282 -1.526 0.12780
displacement 0.019896 0.007515 2.647 0.00844 **
horsepower -0.016951 0.013787 -1.230 0.21963
weight -0.006474 0.000652 -9.929 < 2e-16 ***
acceleration 0.080576 0.098845 0.815 0.41548
year 0.750773 0.050973 14.729 < 2e-16 ***
origin 1.426141 0.278136 5.127 4.67e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.328 on 384 degrees of freedom


Multiple R-squared: 0.8215, Adjusted R-squared: 0.8182
F-statistic: 252.4 on 7 and 384 DF, p-value: < 2.2e-16
14: 14

localhost:8888/lab/tree/expt3/expt3.ipynb 3/5
10/20/24, 12:23 AM expt3

Warning message in terms.formula(formula, data = data):


"'varlist' has changed (from nvar=8) to new 9 after EncodeVars() -- should no lon
ger happen!"
Error in eval(predvars, data, env): object 'name' not found
Traceback:

1. lm(mpg ~ . - name + displacement:weight, data = Auto)


2. eval(mf, parent.frame())
3. eval(mf, parent.frame())
4. stats::model.frame(formula = mpg ~ . - name + displacement:weight,
. data = Auto, drop.unused.levels = TRUE)
5. model.frame.default(formula = mpg ~ . - name + displacement:weight,
. data = Auto, drop.unused.levels = TRUE)
6. eval(predvars, data, env)
7. eval(predvars, data, env)

localhost:8888/lab/tree/expt3/expt3.ipynb 4/5
10/20/24, 12:23 AM expt3

In [ ]:

localhost:8888/lab/tree/expt3/expt3.ipynb 5/5

You might also like