Lab 4

Uploaded by

thulasi.v
Lab-4 Residual Analysis

Thulasi-2348152

2023-11-30

Introduction:
The objective of this study is to establish a simple linear regression model for a given
dataset and analyze the linear relationship between the variables. The dataset, represented
by the variables y and x, is subjected to a comprehensive analysis to evaluate the adequacy
of the model. Simple linear regression aims to model the relationship between a dependent
variable (y) and an independent variable (x) through a linear equation. Assessing the
model’s performance involves examining various diagnostic tools, particularly residual
plots, to validate the assumptions and ensure the reliability of the regression results.
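For reference, the model and the residuals examined in this lab can be written as follows. This is the standard textbook formulation of simple linear regression, stated here as an assumption about what is being fitted; the symbols are not taken from the dataset itself.

```latex
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad
\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2), \qquad i = 1, \dots, n

\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i, \qquad
e_i = y_i - \hat{y}_i
```

The residual plots and the normality test below check the assumptions placed on the errors: zero mean, constant variance and normality.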
# Read the data and fit the simple linear regression of y on x
library(readxl)
engine <- read_excel("D:/R-MST171/engine.xlsx")
model <- lm(engine$y ~ engine$x)
summary(model)

##
## Call:
## lm(formula = engine$y ~ engine$x)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -6.517 -1.730 -0.294  1.641  5.903
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.294621   1.774067  18.767 2.88e-13 ***
## engine$x    -0.046267   0.005891  -7.855 3.18e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.136 on 18 degrees of freedom
## Multiple R-squared: 0.7741, Adjusted R-squared: 0.7616
## F-statistic: 61.69 on 1 and 18 DF, p-value: 3.18e-07

# Fitted values from the model
fit1 <- fitted.values(model)
fit1

##        1        2        3        4        5        6        7        8
## 17.10106 17.10106 21.72779 17.05479 22.88448 12.93700 22.60687 21.17258
##        9       10       11       12       13       14       15       16
## 29.14444 28.81132 17.10106 30.59724 25.38291 21.35765 26.81720 19.32189
##       17       18       19       20
## 10.16096 12.93700 17.10106 18.58162

Interpretation:
The fitted regression equation is Y = 33.294621 − 0.046267 × Engine Displacement. Both the
intercept and the coefficient for Engine Displacement are statistically significant
(p < 0.001). The model explains about 77.4% of the variance in the dependent variable
(R-squared = 0.7741), and the F-statistic (61.69 on 1 and 18 degrees of freedom,
p = 3.18e-07) supports the overall significance of the model. The negative coefficient
indicates a negative relationship: each one-unit increase in Engine Displacement is
associated with an estimated decrease of about 0.046 in the dependent variable. Judged by
these summary measures, the model fits the data reasonably well.
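The figures in the summary output are linked by simple identities, which can be checked by hand. The sketch below uses only the rounded values printed above, so the results should agree with the reported t value, F-statistic and adjusted R-squared up to rounding. The variable names are illustrative.

```r
# Values copied from the summary(model) output above
b1    <- -0.046267   # slope estimate
se_b1 <-  0.005891   # standard error of the slope
r2    <-  0.7741     # multiple R-squared
p     <-  2          # estimated coefficients: intercept and slope
n     <-  18 + p     # 18 residual degrees of freedom + 2 coefficients = 20

t_slope <- b1 / se_b1                          # t statistic for the slope
f_stat  <- t_slope^2                           # with one predictor, F = t^2
adj_r2  <- 1 - (1 - r2) * (n - 1) / (n - p)    # adjusted R-squared
p_value <- 2 * pt(-abs(t_slope), df = n - p)   # two-sided p-value for the slope

round(c(t = t_slope, F = f_stat, adj_R2 = adj_r2), 4)
p_value
```

Because the inputs are rounded, small differences in the last printed digit are expected.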
# Raw residuals against fitted values, with a horizontal reference line at zero
plot(fit1, resid(model))
abline(h = 0)

# Standardized residuals against fitted values
plot(fit1, rstandard(model))
r <- rstandard(model)
r

##           1           2           3           4           5           6
##  0.59449098 -0.03339746 -0.56599236  0.39508748 -0.92534724 -0.59869761
##           7           8           9          10          11          12
## -0.15989936  0.09733788  1.94832014  0.55403436 -0.19863125  2.12869487
##          13          14          15          16          17          18
## -1.29746339 -0.54265038 -2.21009625 -0.49848104  1.53277613  0.67314487
##          19          20
##  0.23097662 -0.71270880

abline(0,0)

All the standardized residuals (rstandard) are less than 3 in absolute value; the largest
is 2.21, for observation 15. This suggests that there are no outliers in the dataset.
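To show what rstandard computes, the sketch below rebuilds the standardized residuals from the raw residuals, the leverages and the residual standard error. Since engine.xlsx is not reproduced here, it uses R's built-in mtcars data (mpg on disp) purely as a stand-in, so the numbers will differ from those in this lab.

```r
# Stand-in data: built-in mtcars (mpg vs. disp), NOT the engine.xlsx data
m <- lm(mpg ~ disp, data = mtcars)

e <- resid(m)       # raw residuals e_i
h <- hatvalues(m)   # leverages h_ii
s <- sigma(m)       # residual standard error

# Standardized residual: e_i / (s * sqrt(1 - h_ii))
r_manual <- e / (s * sqrt(1 - h))

all.equal(r_manual, rstandard(m))   # compare with the built-in function
which(abs(r_manual) > 3)            # observations beyond the cutoff of 3
```

The same three lines applied to `model` would reproduce the values of `r` printed above.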
# Studentized (deleted) residuals against fitted values
plot(fit1, rstudent(model))
abline(h = 0)
r1 <- rstudent(model)
r1

##           1           2           3           4           5           6
##  0.58349811 -0.03245750 -0.55500662  0.38563173 -0.92146004 -0.58771054
##           7           8           9          10          11          12
## -0.15550473  0.09462032  2.13146745  0.54307506 -0.19324679  2.39152931
##          13          14          15          16          17          18
## -1.32435645 -0.53172870 -2.51619123 -0.48781525  1.59748846  0.66257203
##          19          20
##  0.22480232 -0.70261316
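The studentized residuals can be obtained from the standardized ones without refitting the model, using the identity t_i = r_i * sqrt((n - p - 1) / (n - p - r_i^2)). The sketch below applies it to the standardized residuals printed earlier (copied by hand, so the vector is an input assumption rather than a fresh computation) and should reproduce the rstudent values above up to rounding.

```r
# Standardized residuals copied from the rstandard(model) output above
r <- c( 0.59449098, -0.03339746, -0.56599236,  0.39508748, -0.92534724,
       -0.59869761, -0.15989936,  0.09733788,  1.94832014,  0.55403436,
       -0.19863125,  2.12869487, -1.29746339, -0.54265038, -2.21009625,
       -0.49848104,  1.53277613,  0.67314487,  0.23097662, -0.71270880)

n <- length(r)   # 20 observations
p <- 2           # estimated coefficients: intercept and slope

# Externally studentized residuals derived from the standardized ones
t_manual <- r * sqrt((n - p - 1) / (n - p - r^2))

round(t_manual, 5)
```

The adjustment inflates large residuals more than small ones, which is why observations 9, 12 and 15 move further from zero than the rest.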

# Default diagnostic plots, then a Shapiro-Wilk test on the standardized residuals
plot(model)
shapiro.test(r)

##
## Shapiro-Wilk normality test
##
## data: r
## W = 0.96599, p-value = 0.669

From the Shapiro-Wilk test, W = 0.96599 is close to 1, and p = 0.669 is greater than
0.05. Hence we do not reject the null hypothesis of normality: there is no evidence that
the residuals depart from a normal distribution.
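The decision rule used here can be written out explicitly and paired with a normal Q-Q plot as a visual check. This is a minimal sketch on R's built-in mtcars data (mpg on disp) as a stand-in for engine.xlsx, so its result says nothing about the engine data. The 0.05 level is the one used above.

```r
# Stand-in data: built-in mtcars (mpg vs. disp), NOT the engine.xlsx data
m  <- lm(mpg ~ disp, data = mtcars)
rs <- rstandard(m)

sw    <- shapiro.test(rs)   # H0: the residuals come from a normal distribution
alpha <- 0.05
decision <- if (sw$p.value > alpha) "do not reject H0" else "reject H0"
decision

# Visual companion to the test: points close to the line support normality
qqnorm(rs)
qqline(rs)
```

A non-significant result only means that no departure from normality was detected. With small samples the test has limited power, so the Q-Q plot is worth reading alongside it.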

Conclusion:
The simple linear regression model demonstrates a significant association between the
variables y and x. The negative coefficient for x indicates that y decreases as x
increases. The R-squared value of 0.7741 indicates that the model explains most of the
variation in y. The residual plots exhibit no apparent patterns, supporting the assumptions
of linearity and homoscedasticity. No standardized residual exceeds 3 in absolute value,
and the Shapiro-Wilk test gives no evidence against normality of the residuals.
