
Nrupathunga University

Department of Computer Science


V Sem BCA (NEP)
Statistical Computing and R Programming Language
Unit 05
Simple linear regression, Multiple linear regression

1. Simple Linear Regression


1. What is Linear Regression? (2 marks)
Linear regression fits a straight line or surface that minimizes the discrepancies
between predicted and actual output values.
2. Explain the Linear Regression concept with an example. (8 marks)
Simple linear regression is the simplest form of linear regression: it involves only one independent
variable and one dependent variable. The equation for simple linear regression is:
Y = α + βX
where:
• Y is the dependent variable
• X is the independent variable
• α is the intercept
• β is the slope

Consider an example of salary (in $1000s) against years of experience:

Experience (X)   Salary (Y)
      3              30
      8              57
      9              64
     13              72
      3              36
      6              43
     11              59
     21              90
      1              20
     16              83

x̄ = 9.1,  ȳ = 55.4
The dataset contains two columns, "Experience" and "Salary". The model will use Experience to
predict Salary; hence Experience is the independent variable and Salary is the dependent variable.

We can now model the relationship between salary and years of work experience with the equation

Y = α + βX

α and β are given by the following equations:

β = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)²
α = ȳ - β*x̄                                  ------------ Equation 1

For the data above, the mean experience is x̄ = 9.1 and the mean salary is ȳ = 55.4. Substituting
these values into Equation 1 we get (after rounding):

β ≈ 3.5

α = ȳ - β*x̄ = 55.4 - 3.5*9.1 ≈ 23.6

Now substitute α and β into the equation

Y = α + βX

Y = 23.6 + 3.5*X

This is the mathematical model for the above problem. It means that for a given experience of
10 years the expected salary is

Y = 23.6 + 3.5*10
Y = 58.6

That is, for 10 years of experience the expected salary is about $58.6k.
Note: Learn the R code from the Lab Manual (very important).
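For reference, a minimal R sketch of the same calculation (the Lab Manual code may differ): it
computes β and α from Equation 1 and cross-checks them against lm().

x <- c(3, 8, 9, 13, 3, 6, 11, 21, 1, 16)        # Experience
y <- c(30, 57, 64, 72, 36, 43, 59, 90, 20, 83)  # Salary (in $1000s)

# Equation 1: slope and intercept by hand
beta  <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
alpha <- mean(y) - beta * mean(x)
beta    # about 3.54 (rounded to 3.5 above)
alpha   # about 23.2 (23.6 above, because the rounded slope was used)

# Cross-check with lm() and predict the salary for 10 years of experience
model <- lm(y ~ x)
print(model)
predict(model, data.frame(x = 10))   # about 58.6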

3. What is Multiple Linear Regression? (2 marks)


Multiple linear regression is an extension of linear regression involving more than one predictor
variable. It allows a response variable Y to be modeled as a linear function of a multidimensional
feature vector. A multiple regression model based on two predictor attributes X1 and X2 is

Y=α + β1*X1 + β2*X2

Here,
β1 is the coefficient of X1
β2 is the coefficient of X2

4. Demonstrate Multiple Linear Regression using mtcars (8 marks)

The first five rows of the mtcars dataset (the subset used in this example) are:

                    mpg  cyl   disp
Mazda RX4          21.0    6  160.0
Mazda RX4 Wag      21.0    6  160.0
Datsun 710         22.8    4  108.0
Hornet 4 Drive     21.4    6  258.0
Hornet Sportabout  18.7    8  360.0

In this dataset we want to predict mpg from cyl and disp. Here mpg is the dependent (response)
variable, while cyl and disp are the independent (predictor) variables.

So mpg is represented as Y, and cyl (X1) and disp (X2) are the two independent variables.

x̄1 = (6 + 6 + 4 + 6 + 8) / 5 = 6

x̄2 = (160.0 + 160.0 + 108.0 + 258.0 + 360.0) / 5 = 209.2

ȳ = (21.0 + 21.0 + 22.8 + 21.4 + 18.7) / 5 = 20.98
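These means can be checked quickly in R (a small sketch; mtcars1 holds the first five rows, as in
the code at the end of this question):

mtcars1 <- head(mtcars, 5)
colMeans(mtcars1[, c("cyl", "disp", "mpg")])
#   cyl   disp    mpg
#  6.00 209.20  20.98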
Y=α + β1*X1 + β2*X2

The coefficients β1 and β2 cannot be computed one variable at a time; in multiple linear regression
they are obtained by solving the least-squares normal equations for both predictors together (this
is what R's lm() does internally):

S11*β1 + S12*β2 = S1y
S12*β1 + S22*β2 = S2y

where, for the five rows above,

S11 = Σ(x1ᵢ - x̄1)² = 8
S22 = Σ(x2ᵢ - x̄2)² = 40204.8
S12 = Σ(x1ᵢ - x̄1)(x2ᵢ - x̄2) = 504
S1y = Σ(x1ᵢ - x̄1)(yᵢ - ȳ) = -8.2
S2y = Σ(x2ᵢ - x̄2)(yᵢ - ȳ) = -509.48

Solving these two equations gives

β1 = -1.0780664

β2 = 0.0008423

and the intercept is

α = ȳ - β1*x̄1 - β2*x̄2 ≈ 27.2721844
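The same system can also be solved directly in R with matrix algebra, which verifies the hand
calculation (a short sketch):

X <- cbind(1, mtcars1$cyl, mtcars1$disp)   # design matrix with an intercept column
y <- mtcars1$mpg
solve(t(X) %*% X, t(X) %*% y)              # alpha, beta1, beta2
# about 27.2722, -1.0781, 0.0008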

Therefore the equation of the model is

Y = α + β1*X1 + β2*X2

Substituting the values of α, β1 and β2 we get

Y = 27.2721844 + (-1.0780664)*X1 + (0.0008423)*X2

Now take the values of the first row in the dataset, X1 (cyl = 6) and X2 (disp = 160):

Y = 27.2721844 + (-1.0780664)*6 + (0.0008423)*160
Y ≈ 20.93855

So Y is the predicted value, while the actual value of mpg for this row is 21.00. The prediction
error is

Error = Predicted Value - Actual Value
      = 20.93855 - 21.00
      = -0.06145

The error is very small, so the model

Y = 27.2721844 + (-1.0780664)*X1 + (0.0008423)*X2

can be used for prediction with more than one predictor variable.

R code for the above problem (very important):

data(mtcars)
View(mtcars)

mtcars1 <- head(mtcars, 5)   # use only the first five rows
View(mtcars1)

# Create the relationship model.
model <- lm(mpg ~ cyl + disp, data = mtcars1)

# Show the model.
print(model)

# Get the intercept and coefficients as vector elements.
cat("# # # # The Coefficient Values # # # ", "\n")

a <- coef(model)[1]
print(a)
# (Intercept) 27.27218

Xcyl <- coef(model)[2]
Xcyl
# cyl -1.078066

Xdisp <- coef(model)[3]
Xdisp
# disp 0.0008423

# Value predicted for the 1st row (cyl = 6, disp = 160)
Y <- 27.2721844 + (-1.0780664)*6 + (0.0008423)*160
Y
# 20.93855

# Prediction error
# Error = Predicted Value - Actual Value
error <- 20.93855 - 21.00
error
# -0.06145
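The prediction and the error for the first row can also be obtained directly from the fitted model
instead of hard-coding the coefficients (a short sketch; the printed values are approximate):

pred <- predict(model, newdata = mtcars1[1, ])   # predicted mpg for the first row
pred - mtcars1$mpg[1]                            # Predicted - Actual, about -0.0614
residuals(model)[1]                              # Actual - Predicted (opposite sign), about 0.0614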
