Lab 4
Thulasi-2348152
2023-11-30
Introduction:
The objective of this study is to fit a simple linear regression model to a given dataset
and to analyze the linear relationship between its two variables, y and x. Simple linear
regression models the relationship between a dependent variable (y) and an independent
variable (x) through a linear equation. The adequacy of the fitted model is then assessed
with diagnostic tools, particularly residual plots and a normality test on the residuals,
to check the model assumptions and the reliability of the regression results.
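# Load the engine dataset from the Excel workbook
library(readxl)
engine <- read_excel("D:/R-MST171/engine.xlsx")
# Fit the simple linear regression of y on x and summarise it
model <- lm(engine$y ~ engine$x)
summary(model)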
##
## Call:
## lm(formula = engine$y ~ engine$x)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -6.517 -1.730 -0.294  1.641  5.903
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.294621   1.774067  18.767 2.88e-13 ***
## engine$x    -0.046267   0.005891  -7.855 3.18e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.136 on 18 degrees of freedom
## Multiple R-squared: 0.7741, Adjusted R-squared: 0.7616
## F-statistic: 61.69 on 1 and 18 DF, p-value: 3.18e-07
# Fitted values for the 20 observations
fit1 <- fitted.values(model)
fit1
##        1        2        3        4        5        6        7        8
## 17.10106 17.10106 21.72779 17.05479 22.88448 12.93700 22.60687 21.17258
##        9       10       11       12       13       14       15       16
## 29.14444 28.81132 17.10106 30.59724 25.38291 21.35765 26.81720 19.32189
##       17       18       19       20
## 10.16096 12.93700 17.10106 18.58162
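Each fitted value is the estimated intercept plus the estimated slope times the corresponding x. The sketch below uses only the rounded coefficients and fitted values printed above, because the original spreadsheet is not reproduced in this report. The helper name `predict_y` is hypothetical, and the recovered x values are implied by the rounded output rather than read from the data.

```r
# Rounded estimates copied from the summary(model) output above.
b0 <- 33.294621   # intercept
b1 <- -0.046267   # slope for x

# Hypothetical helper (not part of the original script): fitted value at x.
predict_y <- function(x) b0 + b1 * x

# Invert the fitted line to recover the x behind the first fitted value.
fit_1 <- 17.10106
x_1 <- (fit_1 - b0) / b1

# Plugging that x back in returns the same fitted value, by construction.
back <- predict_y(x_1)

# A smaller fitted value corresponds to a larger x because the slope is negative.
x_6 <- (12.93700 - b0) / b1
print(c(x_1 = x_1, x_6 = x_6))
```

With the rounded coefficients, observations 1 and 6 come out at x of roughly 350 and 440, so the lower fitted value belongs to the larger x.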
Interpretation:
The fitted regression equation is Y = 33.294621 − 0.046267 × Engine Displacement. Both the
intercept and the slope for Engine Displacement are statistically significant (p < 0.001
for each). The model explains about 77% of the variance in the dependent variable
(R-squared = 0.7741, adjusted R-squared = 0.7616), and the F-statistic (61.69 on 1 and 18
degrees of freedom, p = 3.18e-07) confirms the overall significance of the model. The
negative slope means that each one-unit increase in Engine Displacement is associated with
an estimated decrease of about 0.046 units in the dependent variable.
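The quantities quoted in the interpretation are linked by simple arithmetic, which can be checked from the printed summary alone. The following is a minimal sketch that assumes only the rounded numbers shown in the output, so the results agree with the report up to rounding. The variable names are illustrative and not part of the original script.

```r
# Values copied from the summary(model) output above.
est <- -0.046267      # slope estimate
se <- 0.005891        # its standard error
r2 <- 0.7741          # multiple R-squared
n <- 20               # observations (18 residual df + 2 coefficients)
p <- 2                # estimated coefficients (intercept and slope)

t_stat <- est / se                              # slope t value
f_from_t <- t_stat^2                            # equals F when there is one predictor
f_from_r2 <- (r2 / (1 - r2)) * (n - p)          # F from R-squared, on 1 and n - p df
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p)      # adjusted R-squared
p_value <- 2 * pt(-abs(t_stat), df = n - p)     # two-sided p-value for the slope

print(round(c(t = t_stat, F_t = f_from_t, F_r2 = f_from_r2, adj_r2 = adj_r2), 4))
print(signif(p_value, 3))
```

With a single predictor the F test and the slope t test are the same test, which is why both report the same p-value in the summary.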
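# Raw residuals against fitted values, with a reference line at zero
plot(fit1, resid(model))
abline(0, 0)
# Standardized residuals against fitted values
plot(fit1, rstandard(model))
r <- rstandard(model)
r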
##           1           2           3           4           5           6
##  0.59449098 -0.03339746 -0.56599236  0.39508748 -0.92534724 -0.59869761
##           7           8           9          10          11          12
## -0.15989936  0.09733788  1.94832014  0.55403436 -0.19863125  2.12869487
##          13          14          15          16          17          18
## -1.29746339 -0.54265038 -2.21009625 -0.49848104  1.53277613  0.67314487
##          19          20
##  0.23097662 -0.71270880
abline(0,0)
##           1           2           3           4           5           6
##  0.58349811 -0.03245750 -0.55500662  0.38563173 -0.92146004 -0.58771054
##           7           8           9          10          11          12
## -0.15550473  0.09462032  2.13146745  0.54307506 -0.19324679  2.39152931
##          13          14          15          16          17          18
## -1.32435645 -0.53172870 -2.51619123 -0.48781525  1.59748846  0.66257203
##          19          20
##  0.22480232 -0.70261316
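The call that produced the second block of values is not shown in the report. The numbers are consistent with externally studentized residuals, which base R returns from `rstudent()`, so reading them that way is an assumption. The sketch below checks it with the standard identity t_i = r_i * sqrt((n - p - 1) / (n - p - r_i^2)), applied to the standardized residuals printed above, with n = 20 and p = 2.

```r
# Standardized residuals copied from the rstandard(model) output above.
r <- c(0.59449098, -0.03339746, -0.56599236, 0.39508748, -0.92534724,
       -0.59869761, -0.15989936, 0.09733788, 1.94832014, 0.55403436,
       -0.19863125, 2.12869487, -1.29746339, -0.54265038, -2.21009625,
       -0.49848104, 1.53277613, 0.67314487, 0.23097662, -0.71270880)

n <- length(r)  # 20 observations
p <- 2          # intercept and slope

# Convert internally studentized residuals to externally studentized ones.
t_ext <- r * sqrt((n - p - 1) / (n - p - r^2))

# Observations whose residual exceeds 2 in absolute value under each scaling.
print(which(abs(r) > 2))
print(which(abs(t_ext) > 2))
```

The converted values agree with the second printed block. Under this scaling, observation 9 joins observations 12 and 15 beyond 2 in absolute value, so those three points are the ones worth a closer look in the residual plots.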
plot(model)
shapiro.test(r)
##
## Shapiro-Wilk normality test
##
## data: r
## W = 0.96599, p-value = 0.669
From the Shapiro-Wilk test, W = 0.96599 is close to 1, which indicates that the
distribution of the standardized residuals resembles a normal distribution. The p-value of
0.669 is greater than 0.05, so we do not reject the null hypothesis of normality. The
residuals can therefore be treated as approximately normally distributed.
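The test decision can be reproduced without the original spreadsheet by rerunning `shapiro.test()` on the standardized residuals printed earlier. This is a minimal sketch: the vector is copied from the report's output, and the names `sw` and `not_rejected` are illustrative.

```r
# Standardized residuals copied from the rstandard(model) output above.
r <- c(0.59449098, -0.03339746, -0.56599236, 0.39508748, -0.92534724,
       -0.59869761, -0.15989936, 0.09733788, 1.94832014, 0.55403436,
       -0.19863125, 2.12869487, -1.29746339, -0.54265038, -2.21009625,
       -0.49848104, 1.53277613, 0.67314487, 0.23097662, -0.71270880)

# Shapiro-Wilk test from base R's stats package; H0: the sample is normal.
sw <- shapiro.test(r)
print(sw)

# Decision at the 5% level: TRUE means normality is not rejected.
not_rejected <- sw$p.value > 0.05
print(not_rejected)
```

A large p-value means only that the test found no evidence against normality. With 20 observations the test has limited power, so the Q-Q panel from `plot(model)` remains a useful visual check.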
Conclusion:
The simple linear regression model shows a statistically significant association between
the variables y and x. The negative coefficient for x indicates that y tends to decrease as
x increases. The R-squared value of 0.7741 means that the model explains about 77% of the
variation in y. The residual plots exhibit no apparent patterns, which supports the
assumptions of linearity and homoscedasticity, and the Shapiro-Wilk test gives no evidence
against normality of the residuals.