3-Applying multiple linear Regression
Diagnostic plots
Area    Y   X1   X2
   1  110   30   11
   2   80   40   10
   3   70   20    7
   4  120   50   15
   5  150   60   19
   6   90   40   12
   7   70   20    8
   8  120   60   14
Code:-
> Y=c(110,80,70,120,150,90,70,120)
> X1=c(30,40,20,50,60,40,20,60)
> X2=c(11,10,7,15,19,12,8,14)
> input_data=data.frame(Y,X1,X2)
> input_data
    Y X1 X2
1 110 30 11
2  80 40 10
3  70 20  7
4 120 50 15
5 150 60 19
6  90 40 12
7  70 20  8
8 120 60 14
> RegModel <- lm(Y~X1+X2, data=input_data)
> RegModel
Call:
lm(formula = Y ~ X1 + X2, data = input_data)

Coefficients:
(Intercept)           X1           X2
    16.8314      -0.2442       7.8488
> summary(RegModel)
Call:
lm(formula = Y ~ X1 + X2, data = input_data)

Residuals:
      1       2       3       4       5       6       7       8
 14.157  -5.552   3.110  -2.355  -1.308 -11.250  -4.738   7.936

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.8314    11.8290   1.423   0.2140
X1           -0.2442     0.5375  -0.454   0.6687
X2            7.8488     2.1945   3.577   0.0159 *
---
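The printed summary stops after the coefficient table, so R-squared and the overall F-test are not visible in the transcript. The following is a minimal sketch of how these quantities could be read from the fitted object. It re-creates the same data and model; the helper names (s, r2, adj_r2, f, f_pvalue, sse, sst) are illustrative and not part of the original exercise.

```r
# Re-create the data and the model from the transcript above
Y  <- c(110, 80, 70, 120, 150, 90, 70, 120)
X1 <- c(30, 40, 20, 50, 60, 40, 20, 60)
X2 <- c(11, 10, 7, 15, 19, 12, 8, 14)
input_data <- data.frame(Y, X1, X2)
RegModel <- lm(Y ~ X1 + X2, data = input_data)

s <- summary(RegModel)

# R-squared and adjusted R-squared stored in the summary object
r2     <- s$r.squared
adj_r2 <- s$adj.r.squared

# Overall F statistic: a named vector with elements value, numdf, dendf
f <- s$fstatistic
f_pvalue <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)

# The same R-squared computed by hand as 1 - SSE / SST
sse <- sum(residuals(RegModel)^2)
sst <- sum((Y - mean(Y))^2)

print(c(R2 = r2, adjR2 = adj_r2, R2_by_hand = 1 - sse / sst))
print(f)
print(f_pvalue)
anova(RegModel)   # sequential ANOVA table: X1 first, then X2
```

For these eight observations R-squared comes out at about 0.92, and the overall F-test is significant at the 0.01 level.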
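Interpretation:
The fitted regression model is
Y = 16.8314 - 0.2442*X1 + 7.8488*X2
From the residuals above, the residual sum of squares is about 460.2 against a total
sum of squares of 5687.5, so R2 is about 0.919. The ANOVA shows that the F-ratio is
significant, so this model can be taken as a good fit in explaining the sales in terms
of the other two variables. Individually, only the coefficient of X2 is significant at
the 0.05 level (p = 0.0159); the coefficient of X1 is not (p = 0.6687).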
In the BMR model, R2 is 0.8701, that is, about 87% of the variation in BMR can be
explained by the age, HT, WT and BMI of a person through this linear model. All the
explanatory variables have a positive relationship with BMR. However, these regression
coefficients are not statistically significant, except that of age, although the F-test
in the ANOVA shows that the overall regression is significant at the 0.01 level (the
p-value is almost zero). The meaning of a regression coefficient can be understood as
follows: if age increases by one unit, BMR increases by 4.021 at fixed values of the
other factors HT, WT and BMI.
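The "at fixed values of the other factors" reading of a coefficient can be checked numerically. The sketch below uses the small Y, X1, X2 sales data set from the first example, because the BMR data is not reproduced in this handout. The two cases in new_cases are hypothetical values chosen only for illustration.

```r
# Sales data and model from the first example
Y  <- c(110, 80, 70, 120, 150, 90, 70, 120)
X1 <- c(30, 40, 20, 50, 60, 40, 20, 60)
X2 <- c(11, 10, 7, 15, 19, 12, 8, 14)
input_data <- data.frame(Y, X1, X2)
RegModel <- lm(Y ~ X1 + X2, data = input_data)

# Two hypothetical cases that differ only in X2, by one unit; X1 is held fixed
new_cases <- data.frame(X1 = c(40, 40), X2 = c(12, 13))
pred <- predict(RegModel, newdata = new_cases)

# The change in predicted Y equals the estimated coefficient of X2
change_in_Y <- unname(pred[2] - pred[1])
print(change_in_Y)
print(coef(RegModel)["X2"])
```

The same check works for any explanatory variable: change that variable by one unit, leave the others unchanged, and the difference in the predictions is its coefficient.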
Problem 5: (Agriculturedata.csv)
Write the model and interpret it for the following code:
R code:-
> input_data <- read.csv('C:/Users/10526/Desktop/Moksha_New/Agriculturedata.csv')
> input_data
> summary(input_data)
> cor(input_data[, c("Net_Agricultural_Output", "Population_Active_in_Agriculture",
+                    "Fertilizer_Consumption", "Number_of_Tractors_in_Agriculture")],
+     use = "complete.obs")
> RegModel.2 <- lm(Net_Agricultural_Output ~ Population_Active_in_Agriculture +
+                  Fertilizer_Consumption, data = input_data)
> summary(RegModel.2)
> plot(RegModel.2)
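Calling plot() on a fitted lm object produces the standard diagnostic plots one after another. The sketch below shows one possible way to place them on a single page and to list the numbers behind them. It uses the small Y, X1, X2 data set from earlier in this section, since Agriculturedata.csv is not included here; the names old_par and diag_table are illustrative.

```r
# Small data set and model from the first example
Y  <- c(110, 80, 70, 120, 150, 90, 70, 120)
X1 <- c(30, 40, 20, 50, 60, 40, 20, 60)
X2 <- c(11, 10, 7, 15, 19, 12, 8, 14)
input_data <- data.frame(Y, X1, X2)
RegModel <- lm(Y ~ X1 + X2, data = input_data)

# Draw the four default diagnostic plots in a 2 x 2 grid:
# residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage
old_par <- par(mfrow = c(2, 2))
plot(RegModel)
par(old_par)   # restore the previous plotting layout

# Numeric counterparts of the quantities shown in the plots
diag_table <- data.frame(
  fitted    = fitted(RegModel),
  residual  = residuals(RegModel),
  std_resid = rstandard(RegModel),
  leverage  = hatvalues(RegModel),
  cooks_d   = cooks.distance(RegModel)
)
print(round(diag_table, 3))
```

With only eight observations the plots are rough guides at best. The same calls apply unchanged to RegModel.2 once the agriculture data has been read in.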
Practice problems :-
1. For the given sector-wise data on Number of Factories, Productive
Capital, No. of Employees, Total Output and Net Value Added, fit a
multiple regression and interpret your result. Choose the dependent and
independent variables according to your requirement/description. File
Name: Ex 3 data file.
2. Use the Life Satisfaction dataset to fit the regression equation. File Name:
Ex 1 and 4 data file.