Lab3 Report Revathy
revak
2024-02-04
#Setup
# Set the CRAN mirror to Cloudflare
options(repos = structure(c(CRAN = "https://cloud.r-project.org/")))
library(car)   # provides qqPlot()
library(ISLR)  # Auto data set (assumed to come from the ISLR package)
#Lab Steps
# Step 1: Create a linear model with response = mpg, and explanatory variable horsepower
# Code for creating the linear model
mpg_model <- lm(mpg ~ horsepower, data = Auto)
summary(mpg_model)
##
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.5710 -3.2592 -0.3435 2.7630 16.9240
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.935861 0.717499 55.66 <2e-16 ***
## horsepower -0.157845 0.006446 -24.49 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.906 on 390 degrees of freedom
## Multiple R-squared: 0.6059, Adjusted R-squared: 0.6049
## F-statistic: 599.7 on 1 and 390 DF, p-value: < 2.2e-16
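As a small worked example of reading the coefficient table, the sketch below plugs the reported estimates into the fitted line and reproduces the slope's t value. The horsepower value of 100 is a hypothetical input chosen only for illustration; the other numbers are copied from the output above.

```r
# Coefficients copied from the summary output above
b0 <- 39.935861   # intercept
b1 <- -0.157845   # slope for horsepower
se_b1 <- 0.006446 # standard error of the slope

# Hypothetical horsepower value, chosen only for illustration
hp <- 100

# Predicted mpg from the fitted line: mpg = b0 + b1 * horsepower
pred_mpg <- b0 + b1 * hp
pred_mpg

# The t value is the estimate divided by its standard error
t_slope <- b1 / se_b1
round(t_slope, 2)
```

The negative slope means each additional unit of horsepower is associated with about 0.158 fewer miles per gallon on average.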
# Step 2: Plot a scatter plot using the residuals and fitted values
# Code for plotting residuals vs fitted values
plot(mpg_model$fitted.values, mpg_model$residuals,
     main = "Residuals vs Fitted",
     xlab = "Fitted Values", ylab = "Residuals")
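A pattern in a residual plot is often easier to judge with a zero reference line and a smoothed trend drawn on top. The sketch below shows one way to do that in base R. It uses the built-in `mtcars` data (`mpg` and `hp`) as a stand-in for `Auto` so that it runs without extra packages; that substitution and the name `fit` are assumptions for illustration, not part of the lab.

```r
# mtcars ships with base R; it stands in for Auto here (assumption)
fit <- lm(mpg ~ hp, data = mtcars)

res <- residuals(fit)  # observed minus fitted values
fv  <- fitted(fit)     # fitted values

plot(fv, res, main = "Residuals vs Fitted",
     xlab = "Fitted Values", ylab = "Residuals")
abline(h = 0, lty = 2)               # reference line at zero
lines(lowess(fv, res), col = "red")  # smoothed trend of the residuals
```

If the linearity assumption holds, the smoothed line should stay close to the zero line; a clear bend suggests a non-linear relationship.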
# Step 3: Does the scatter plot provide evidence that a linear regression assumption is violated? Explain.
# Print summary results for reference
model_summary <- summary(mpg_model)
print(model_summary)
##
## Call:
## lm(formula = mpg ~ horsepower, data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.5710 -3.2592 -0.3435 2.7630 16.9240
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.935861 0.717499 55.66 <2e-16 ***
## horsepower -0.157845 0.006446 -24.49 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.906 on 390 degrees of freedom
## Multiple R-squared: 0.6059, Adjusted R-squared: 0.6049
## F-statistic: 599.7 on 1 and 390 DF, p-value: < 2.2e-16
##
## Residual Standard Error: 4.905757
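##Answer: The Residual Standard Error of 4.906 and Multiple R-squared of 0.6059
describe how closely the line fits the data; they do not test the regression
assumptions. That judgement rests on the scatter plot itself. For mpg regressed
on horsepower in the Auto data, the residuals follow a U-shaped curve: they are
mostly positive at low and high fitted values and mostly negative in between.
This pattern is evidence that the linearity assumption is violated, and the
spread of the residuals should also be checked for non-constant variance.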
# Step 4: Create a linear model with response = horsepower, and explanatory variables weight, origin, cylinders, and acceleration
# Code for creating the linear model
horsepower_model <- lm(horsepower ~ weight + origin + cylinders + acceleration, data = Auto)
summary(horsepower_model)
##
## Call:
## lm(formula = horsepower ~ weight + origin + cylinders + acceleration,
## data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.683 -8.221 -1.129 5.292 83.857
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.837488 7.203703 11.222 < 2e-16 ***
## weight 0.029489 0.001836 16.065 < 2e-16 ***
## origin 3.154415 1.038882 3.036 0.00256 **
## cylinders 2.375640 0.953508 2.491 0.01314 *
## acceleration -5.285656 0.284090 -18.606 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.27 on 387 degrees of freedom
## Multiple R-squared: 0.8824, Adjusted R-squared: 0.8812
## F-statistic: 725.9 on 4 and 387 DF, p-value: < 2.2e-16
# Print summary results for reference
model_summary_new <- summary(horsepower_model)
print(model_summary_new)
##
## Call:
## lm(formula = horsepower ~ weight + origin + cylinders + acceleration,
## data = Auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.683 -8.221 -1.129 5.292 83.857
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.837488 7.203703 11.222 < 2e-16 ***
## weight 0.029489 0.001836 16.065 < 2e-16 ***
## origin 3.154415 1.038882 3.036 0.00256 **
## cylinders 2.375640 0.953508 2.491 0.01314 *
## acceleration -5.285656 0.284090 -18.606 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.27 on 387 degrees of freedom
## Multiple R-squared: 0.8824, Adjusted R-squared: 0.8812
## F-statistic: 725.9 on 4 and 387 DF, p-value: < 2.2e-16
# Extract relevant values for commenting
residual_std_error_new <- model_summary_new$sigma
r_squared_new <- model_summary_new$r.squared
##
## Residual Standard Error (New Model): 13.26792
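The `sigma` and `r.squared` fields of a model summary can be reproduced by hand, which makes clear what they measure: `sigma` is the square root of the residual sum of squares divided by the residual degrees of freedom, and `r.squared` is one minus the ratio of residual to total sum of squares. The sketch below checks both. It uses the built-in `mtcars` data as a stand-in for `Auto` (an assumption so the block runs without extra packages), and the names `fit`, `rse_manual`, and `r2_manual` are illustrative.

```r
# mtcars stands in for Auto here (assumption); hp is the response
fit <- lm(hp ~ wt + cyl + qsec, data = mtcars)
s   <- summary(fit)

# Residual sum of squares and total sum of squares
rss <- sum(residuals(fit)^2)
tss <- sum((mtcars$hp - mean(mtcars$hp))^2)

# Residual standard error: sqrt(RSS / residual degrees of freedom)
rse_manual <- sqrt(rss / df.residual(fit))

# Multiple R-squared: 1 - RSS / TSS
r2_manual <- 1 - rss / tss

# Both comparisons should agree with the values stored in the summary
all.equal(rse_manual, s$sigma)
all.equal(r2_manual, s$r.squared)
```

Both quantities summarize how closely the model fits the data. Neither one says whether the residuals are random, evenly spread, or normally distributed.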
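#Answer: The new model fits the data well, with a Residual Standard Error of
13.26792 and a Multiple R-squared of 0.8823972, but these statistics measure
fit and do not by themselves confirm the regression assumptions. The residual
summary is right-skewed (minimum -27.683, median -1.129, maximum 83.857), which
points to at least one large positive residual, so the scatter plot should be
checked for curvature and uneven spread before concluding that the assumptions
hold.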
# QQ plot of residuals
qqPlot(horsepower_model$residuals,
       main = "QQ Plot of Residuals (Horsepower Model)")
## 14 117
## 14 116
# Step 5: Analysis of plot for deviations. The observations labelled 14 and 117 (rows 14 and 116 of the data) deviate most from the reference line.
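The same kind of check can be sketched in base R without `car`: draw a normal QQ plot of the standardized residuals and list the two observations that are most extreme. This is a rough counterpart to the `qqPlot()` call above, not a reproduction of how `car` selects its labelled points. The built-in `mtcars` data stands in for `Auto` (an assumption so the block runs without extra packages), and the names `fit`, `rs`, and `extreme` are illustrative.

```r
# mtcars stands in for Auto here (assumption); hp is the response
fit <- lm(hp ~ wt + cyl + qsec, data = mtcars)

# Standardized residuals put every point on a comparable scale
rs <- rstandard(fit)

qqnorm(rs, main = "QQ Plot of Standardized Residuals")
qqline(rs)  # reference line through the first and third quartiles

# Row positions of the two largest absolute standardized residuals
extreme <- head(order(abs(rs), decreasing = TRUE), 2)
data.frame(row = extreme,
           name = names(rs)[extreme],
           rstandard = unname(rs[extreme]))
```

Reporting both the row position and the row name helps when the two differ, as they do for observation 117 in the output above.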