Lab 5
Thulasi-2348152
2023-12-09
OBJECTIVE:
To fit a suitable linear regression model; to construct a normal probability plot of the
residuals and check whether there is any problem with the normality and constant-variance
assumptions; to construct and interpret a plot of the residuals and determine whether the
residuals are correlated; and to find outliers in the data and a way to handle them.
INTRODUCTION:
Multicollinearity exists whenever an independent variable is highly correlated with one or
more of the other independent variables in a multiple regression equation. An outlier is an
observation that lies an abnormal distance from other values in a random sample from a
population. We will analyze the data and see how we can handle the outliers.
library(readxl)
inventer <- read_excel("C:/Users/Admin/Downloads/Inverter data.xlsx")
View(inventer)
model <- lm(inventer$y ~ inventer$x1 + inventer$x2 + inventer$x3 + inventer$x4 + inventer$x5)
summary(model)
##
## Call:
## lm(formula = inventer$y ~ inventer$x1 + inventer$x2 + inventer$x3 +
## inventer$x4 + inventer$x5)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.2915 -1.0794 -0.5519 1.2685 3.5009
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.85473 1.86922 1.527 0.1432
## inventer$x1 -0.29047 0.11742 -2.474 0.0230 *
## inventer$x2 0.20572 0.07506 2.741 0.0130 *
## inventer$x3 0.45444 0.18768 2.421 0.0256 *
## inventer$x4 -0.59419 0.21253 -2.796 0.0115 *
## inventer$x5 0.00464 0.01817 0.255 0.8012
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.196 on 19 degrees of freedom
## Multiple R-squared: 0.5584, Adjusted R-squared: 0.4422
## F-statistic: 4.805 on 5 and 19 DF, p-value: 0.005239
Here the R^2 value is 0.5584, suggesting that the model explains a moderate amount of
the variance in the data. The overall p-value is 0.005239, and all predictors except
inventer$x5 are individually significant in predicting the response variable.
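As a cross-check, R^2 can be recomputed by hand as 1 - SSE/SST from the model residuals; a minimal sketch:
#R^2 = 1 - SSE/SST, computed directly from the residuals
sse <- sum(resid(model)^2)
sst <- sum((inventer$y - mean(inventer$y))^2)
1 - sse / sst   #should reproduce the Multiple R-squared of 0.5584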
#Normal probability plot of the residuals
qqnorm(resid(model))
qqline(resid(model))
res <- resid(model)
#To test normality using the Shapiro-Wilk test
shapiro.test(rstandard(model))
##
## Shapiro-Wilk normality test
##
## data: rstandard(model)
## W = 0.91451, p-value = 0.03847
The normal probability plot shows deviations from the reference line, and from the
Shapiro-Wilk test the p-value is 0.038, which rejects the null hypothesis of normality;
hence the normality assumption does not hold.
#To test constant variance (homoscedasticity)
library(lmtest)
##
## Attaching package: 'zoo'
bptest(model)
##
## studentized Breusch-Pagan test
##
## data: model
## BP = 14.072, df = 5, p-value = 0.01516
The p-value of the Breusch-Pagan test is 0.015, which is less than 0.05. Hence we reject the
null hypothesis at the 0.05 significance level; there is evidence of heteroscedasticity.
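The studentized (Koenker) form of the Breusch-Pagan statistic can be reproduced by hand: regress the squared residuals on the predictors and take n times the R^2 of that auxiliary regression; a minimal sketch:
#Auxiliary regression of squared residuals on the predictors
aux <- lm(res^2 ~ inventer$x1 + inventer$x2 + inventer$x3 + inventer$x4 + inventer$x5)
length(res) * summary(aux)$r.squared   #should match BP = 14.072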
#To find the fitted values
fit <- fitted.values(model)
fit
##          1          2          3          4          5          6          7
##  2.1812243  5.5844677  2.3791193 -2.7968027  1.7266932  3.3656279  1.4620291
##          8          9         10         11         12         13         14
##  5.6060986  6.9425257  2.0506003  3.0384708  0.8448812  3.0303516  5.7139819
##         15         16         17         18         19         20         21
##  0.6348032  1.7663056  2.0435146  2.8485294 -0.9096868 -0.7936971  0.7016434
##         22         23         24         25
##  3.6975241  3.0440700  3.3092863  1.8304384
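The conclusion refers to a residuals-versus-fitted-values plot; a minimal sketch of that plot using the values above:
#Plot residuals against fitted values to inspect the constant-variance assumption
plot(fit, res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0)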
#To check whether the residuals are correlated using the ACF (autocorrelation function)
acf(resid(model))
There is no significant correlation between the residuals, since all the lags lie within the
threshold lines.
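As a complementary check, the Durbin-Watson test from the already-loaded lmtest package tests for first-order autocorrelation of the residuals; a minimal sketch:
#Durbin-Watson test; the null hypothesis is that the residuals are not autocorrelated
dwtest(model)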
#To find the variance inflation factor
library(car)
vif(model)
Since all the predictors have a low VIF of approximately 2 (less than 5), there is no
multicollinearity among them.
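For reference, the VIF of a single predictor can be reproduced by hand: regress that predictor on the others and take 1/(1 - R^2); a minimal sketch for x1 (the remaining predictors follow the same pattern):
#VIF of x1 via an auxiliary regression on the remaining predictors
aux_x1 <- lm(x1 ~ x2 + x3 + x4 + x5, data = inventer)
1 / (1 - summary(aux_x1)$r.squared)   #should match the first entry of vif(model)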
#Checking for outliers
res1 <- rstandard(model)
res1
## 1 2 3 4 5 6
## -0.75521781 -3.29890047 -0.33472058 2.11852178 -0.46081838 0.68417952
## 7 8 9 10 11 12
## -0.41108251 1.96135788 1.36586687 -0.39905377 0.74517729 -0.27006283
## 13 14 15 16 17 18
## -0.37810817 1.74654110 0.03007476 -0.68093441 -0.77999639 0.24478579
## 19 20 21 22 23 24
## 0.58836599 0.54315134 -0.18405269 0.66681923 -0.82171407 -0.93401389
## 25
## -0.58855319
#The 2nd standardized residual exceeds 3 in absolute value, so it is an outlier. We can
#remove that observation from the data.
outliers<-which(abs(res1)>3)
outliers
## 2
## 2
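The flagged row is dropped and the model refitted on the reduced data; a minimal sketch consistent with the output below (Inverter1 appears in the Call; the name model1 is assumed):
#Remove the outlier row and refit the model on the reduced data
Inverter1 <- inventer[-outliers, ]
Inverter1
model1 <- lm(y ~ ., data = Inverter1)
summary(model1)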
## # A tibble: 24 × 6
## x1 x2 x3 x4 x5 y
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 3 3 3 3 0 0.787
## 2 3 6 6 6 0 1.71
## 3 4 4 4 12 0 0.203
## 4 8 7 6 5 0 0.806
## 5 10 20 5 5 0 4.71
## 6 8 6 3 3 25 0.607
## 7 6 24 4 4 25 9.11
## 8 4 10 12 4 25 9.21
## 9 16 12 8 4 25 1.36
## 10 3 10 8 8 25 4.55
## # ℹ 14 more rows
##
## Call:
## lm(formula = y ~ ., data = Inverter1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.6019 -0.8676 -0.5544 0.6388 3.3297
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.125843 1.303651 0.864 0.39917
## x1 -0.306737 0.078924 -3.886 0.00108 **
## x2 0.383627 0.062069 6.181 7.8e-06 ***
## x3 0.448299 0.126041 3.557 0.00225 **
## x4 -0.445042 0.145916 -3.050 0.00689 **
## x5 0.006228 0.012209 0.510 0.61619
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.474 on 18 degrees of freedom
## Multiple R-squared: 0.8072, Adjusted R-squared: 0.7536
## F-statistic: 15.07 on 5 and 18 DF, p-value: 6.837e-06
Here the R^2 value is 0.8072, suggesting that the refitted model explains a substantial
amount of the variance in the data. The overall p-value is 6.837e-06, and all predictors
except x5 are individually significant in predicting the response variable. Since removing
this observation changes the fitted coefficients noticeably, the outlier is an influential
point that affects the model's predictions.
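The influence claim can be checked with standard diagnostics on the original model; a minimal sketch, using common rule-of-thumb cutoffs (conventions, not fixed rules):
#Cook's distance and leverage values for the original model
cd <- cooks.distance(model)
lev <- hatvalues(model)
which(cd > 4 / length(cd))      #observations with large influence (rule of thumb)
which(lev > 2 * mean(lev))      #observations with high leverage (rule of thumb)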
CONCLUSION:
For the given data, a suitable regression model is fitted. The residual vs fitted-value plot
is constructed. Autocorrelation among the residuals is checked using the ACF plot.
Multicollinearity is checked using vif(model). The presence of outliers is detected using
standardized residuals. The observation whose standardized residual exceeds 3 in absolute
value (an outlier) is removed, and further analysis is done.