Linear Regression

What is Linear Regression?

Linear Regression is a supervised learning algorithm in machine learning, which is widely used for solving regression problems. Regression is a type of machine learning problem where the goal is to predict a continuous output variable based on one or more input variables.

In Linear Regression, the goal is to find the best-fitting linear equation to describe the relationship between the input variables (also known as predictors or features) and the output variable (also known as the response variable).
The equation for a simple linear regression model can be written as follows:

y = b0 + b1 * x

Here, y is the dependent variable (the variable we are trying to predict), x is the independent variable (the predictor or feature), b0 is the intercept term (the value of y when x is zero), and b1 is the slope coefficient (the change in y for a unit change in x).

The goal of Linear Regression is to find the best values for b0 and b1 such that the line best fits the data points, minimizing the errors, i.e. the difference between the predicted values and the actual values.
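For instance, here is a minimal sketch in Python (the data points are made up for illustration) that estimates b0 and b1 from a handful of points with numpy's polyfit and then predicts y for a new x:

import numpy as np

# Made-up data points for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

# Fit a degree-1 polynomial; coefficients come back highest power first: [b1, b0]
b1, b0 = np.polyfit(x, y, 1)
print(b0, b1)

# Predict y for a new input x = 6 using y = b0 + b1 * x
print(b0 + b1 * 6.0)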

Types of Linear Regression

There are two main types of Linear Regression models: Simple Linear Regression and Multiple Linear Regression.

Simple Linear Regression: In simple linear regression, there is only one independent variable (also known as the predictor or feature) and one dependent variable (also known as the response variable). The goal of simple linear regression is to find the best-fitting line to describe the relationship between the independent and dependent variables. The equation for a simple linear regression model can be written as:

Y = b0 + b1 * X

Here, Y is the dependent variable, X is the independent variable, b0 is the intercept term, and b1 is the slope coefficient.

Multiple Linear Regression: In multiple linear regression, there are multiple independent variables and one dependent variable. The goal of multiple linear regression is to find the best-fitting line to describe the relationship between the independent variables and the dependent variable. The equation for a multiple linear regression model can be written as:

Y = b0 + b1 * X1 + b2 * X2 + … + bn * Xn

Here, Y is the dependent variable, X1, X2, …, Xn are the independent variables, b0 is the intercept term, and b1, b2, …, bn are the slope coefficients.
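To make the multiple-variable case concrete, here is a minimal sketch (the data is made up for illustration) that estimates b0, b1 and b2 with numpy's least-squares solver:

import numpy as np

# Made-up data: two independent variables, one dependent variable
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y = np.array([6.1, 7.9, 13.2, 14.8, 20.1])

# Design matrix with a leading column of ones for the intercept b0
A = np.column_stack([np.ones_like(X1), X1, X2])

# Solve for [b0, b1, b2] in the least-squares sense
coeffs, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)
b0, b1, b2 = coeffs
print(b0, b1, b2)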
In both types of linear regression, the goal is to find the best values for the intercept and slope coefficients that minimize the difference between the predicted values and the actual values. Linear regression is widely used in many real-world applications, such as finance, marketing, and healthcare, for predicting outcomes such as stock prices, customer behavior, and patient outcomes.

Linear Regression Line

In machine learning, a regression line can show two types of relationships between the input variables (also known as predictors or features) and the output variable (also known as the response variable) in a linear regression model.

● Positive Relationship: A positive relationship exists between the input variables and the output variable when the slope of the regression line is positive. In other words, as the values of the input variables increase, the value of the output variable also increases. This can be seen as an upward slope on a scatter plot of the data.

● Negative Relationship: A negative relationship exists between the input variables and the output variable when the slope of the regression line is negative. In other words, as the values of the input variables increase, the value of the output variable decreases. This can be seen as a downward slope on a scatter plot of the data.
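As a quick visual check, here is a minimal sketch (with made-up data) that plots one positively and one negatively sloped relationship side by side:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 30)
rng = np.random.default_rng(0)
y_pos = 2 * x + rng.normal(0, 1, x.size)  # upward slope: positive relationship
y_neg = -2 * x + rng.normal(0, 1, x.size)  # downward slope: negative relationship

plt.scatter(x, y_pos, label='positive relationship')
plt.scatter(x, y_neg, label='negative relationship')
plt.legend()
plt.show()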


Finding the best fit line

In machine learning, finding the best-fitting line is crucial in linear regression, as it determines the accuracy of the predictions made by the model. The best-fitting line is the line that has the smallest difference between the predicted values and the actual values.

To find the best-fitting line in a linear regression model, we use a process called “ordinary least squares (OLS) regression”. This process involves calculating the sum of the squared differences between the predicted values and the actual values for each data point, and then finding the line that minimizes this sum of squared errors.


The best-fitting line is found by minimizing the residual sum of squares (RSS), which is the sum of the squared differences between the predicted values and the actual values. This is achieved by adjusting the values of the intercept and slope coefficients, also known as c and m, respectively.

Once the values of c and m are determined, we can use the linear regression equation to make predictions for new data points. The equation for a simple linear regression model can be written as:

y = c + m * x

Here, y is the dependent variable (the variable we are trying to predict), x is the independent variable (the predictor or feature), c is the intercept term (the value of y when x is zero), and m is the slope coefficient (the change in y for a unit change in x).

In multiple linear regression, the equation would have more independent variables, and the slope coefficients for each variable would be included in the equation.

Overall, finding the best-fitting line in a linear regression model is critical for accurate predictions and is achieved by minimizing the residual sum of squares using the OLS regression method.
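For simple linear regression, the OLS solution even has a closed form: m = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)² and c = ȳ - m·x̄, where x̄ and ȳ are the sample means. Here is a minimal sketch of these formulas (the data is made up for illustration):

import numpy as np

# Made-up data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Closed-form OLS estimates for y = c + m * x
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
c = y.mean() - m * x.mean()
print(m, c)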

Gradient Descent: Linear Regression

In this tutorial you can learn how the gradient descent algorithm works and implement it from scratch in Python. First we look at what linear regression is, then we define the loss function. We learn how the gradient descent algorithm works, and finally we will implement it on a given data set and make predictions.


Linear Regression

In statistics, linear regression is a linear approach to modeling the relationship between a dependent variable and one or more independent variables. Let X be the independent variable and Y be the dependent variable. We will define a linear relationship between these two variables as follows:

Y = m * X + c

This is the equation for a line that you studied in high school. m is the slope of the line and c is the y-intercept. Today we will use this equation to train our model with a given dataset and predict the value of Y for any given value of X. Our challenge today is to determine the values of m and c such that the line corresponding to those values is the best-fitting line, or gives the minimum error.

Loss Function

The loss is the error in our predicted values of m and c. Our goal is to minimize this error to obtain the most accurate values of m and c.

We will use the Mean Squared Error function to calculate the loss. There are three steps in this function:

1. Find the difference between the actual y and the predicted y value (ȳ = mx + c) for a given x.
2. Square this difference.
3. Find the mean of the squares for every value in X.

The Mean Squared Error equation is:

E = (1/n) Σ (yᵢ - ȳᵢ)²

Here yᵢ is the actual value, ȳᵢ is the predicted value, and the sum runs over all n data points. Let's substitute the value of ȳᵢ = mxᵢ + c:

E = (1/n) Σ (yᵢ - (mxᵢ + c))²

So we square the error and find the mean, hence the name Mean Squared Error. Now that we have defined the loss function, let's get into the interesting part: minimizing it and finding m and c.
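As a minimal sketch (assuming numpy arrays of equal length), the loss can be computed like this:

import numpy as np

def mse(y_actual, y_pred):
    # Mean of the squared differences between actual and predicted values
    return np.mean((y_actual - y_pred) ** 2)

# Example with made-up data and a made-up line ȳ = 2x + 1
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.2, 4.8, 7.1])
print(mse(y, 2 * x + 1))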
The Gradient Descent Algorithm

Gradient descent is an iterative optimization algorithm to find the minimum of a function. Here that function is our loss function.

Understanding Gradient Descent

Imagine a valley and a person with no sense of direction who wants to get to the bottom of the valley. He goes down the slope and takes large steps when the slope is steep and small steps when the slope is less steep. He decides his next position based on his current position and stops when he gets to the bottom of the valley, which was his goal.

Let's try applying gradient descent to m and c and approach it step by step:

1. Initially let m = 0 and c = 0. Let L be our learning rate. This controls how much the value of m changes with each step. L could be a small value like 0.0001 for good accuracy.

2. Calculate the partial derivative of the loss function with respect to m, and plug in the current values of x, y, m and c to obtain the derivative value Dₘ:

Dₘ = (-2/n) Σ xᵢ (yᵢ - ȳᵢ)

Dₘ is the value of the partial derivative with respect to m. Similarly, let's find the partial derivative with respect to c, Dc:

Dc = (-2/n) Σ (yᵢ - ȳᵢ)

3. Now we update the current values of m and c using the following equations:

m = m - L × Dₘ
c = c - L × Dc

4. We repeat this process until our loss function is a very small value or ideally 0 (which means 0 error or 100% accuracy). The values of m and c that we are left with now will be the optimum values.

Now going back to our analogy, m can be considered the current position of the person. D is equivalent to the steepness of the slope and L can be the speed with which he moves. The new value of m that we calculate using the above equation will be his next position, and L×D will be the size of the steps he takes. When the slope is steeper (D is larger) he takes longer steps, and when it is less steep (D is smaller) he takes smaller steps. Finally he arrives at the bottom of the valley, which corresponds to our loss = 0.

Now with the optimum values of m and c, our model is ready to make predictions!

Implementing the Model

Now let's convert everything above into code and see our model in action!

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (12.0, 9.0)

# Preprocessing input data
data = pd.read_csv('data.csv')
X = data.iloc[:, 0]
Y = data.iloc[:, 1]
plt.scatter(X, Y)
plt.show()
# Building the model
m = 0
c = 0

L = 0.0001  # The learning rate
epochs = 1000  # The number of gradient descent iterations to perform

n = float(len(X))  # Number of elements in X

# Performing gradient descent
for i in range(epochs):
    Y_pred = m * X + c  # The current predicted value of Y
    D_m = (-2/n) * sum(X * (Y - Y_pred))  # Partial derivative with respect to m
    D_c = (-2/n) * sum(Y - Y_pred)  # Partial derivative with respect to c
    m = m - L * D_m  # Update m
    c = c - L * D_c  # Update c

print(m, c)

1.4796491688889395 0.10148121494753726

# Making predictions
Y_pred = m * X + c

plt.scatter(X, Y)
plt.plot([min(X), max(X)], [min(Y_pred), max(Y_pred)], color='red')  # regression line
plt.show()
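With m and c learned, a prediction for any new input follows directly from the line equation (the input value below is made up for illustration):

# Predicting Y for a single new input
x_new = 55.0
y_new = m * x_new + c
print(y_new)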
Gradient descent is one of the simplest and most widely used algorithms in machine learning, mainly because it can be applied to optimize almost any differentiable function. Learning it lays the foundation for mastering machine learning.
