
Linear Regression in Machine Learning

Linear regression is one of the simplest and most popular Machine Learning algorithms. It is a
statistical method used for predictive analysis. Linear regression makes predictions for
continuous/real or numeric variables such as sales, salary, age, product price, etc.
The linear regression algorithm models a linear relationship between a dependent variable (y) and one
or more independent variables (x), hence the name linear regression. Because the relationship is
linear, the model describes how the value of the dependent variable changes with the value of the
independent variable.
The linear regression model provides a sloped straight line representing the relationship between the
variables, as shown in the image below:

Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error
The x and y values make up the training dataset used to fit the linear regression model.
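As a quick numerical illustration of the model equation (the coefficient values below are made up, and the random error term ε is ignored):

```python
# Illustrative coefficients, not from the text: intercept a0 = 2.0, slope a1 = 0.5
a0, a1 = 2.0, 0.5

def predict(x):
    """Predicted y for input x: y = a0 + a1*x (error term omitted)."""
    return a0 + a1 * x

print(predict(10))  # 2.0 + 0.5 * 10 = 7.0
```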
Types of Linear Regression
Linear regression can be further divided into two types:
o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable,
then such a Linear Regression algorithm is called Simple Linear Regression.
o Multiple Linear regression:
If more than one independent variable is used to predict the value of a numerical dependent
variable, then such a Linear Regression algorithm is called Multiple Linear Regression.
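Multiple linear regression can be sketched with NumPy by solving the least-squares problem directly; the data below is made up so that the true coefficients are exactly a0 = 1, a1 = 2, a2 = 1:

```python
import numpy as np

# Toy data (made up): two independent variables, one target,
# generated from y = 1 + 2*x1 + 1*x2
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0]])
y = np.array([5.0, 5.0, 8.0, 12.0])

# Prepend a column of ones so the intercept a0 is estimated too
X1 = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of X1 @ coef = y
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
a0, a1, a2 = coef
print(a0, a1, a2)  # recovers 1.0, 2.0, 1.0
```

Simple linear regression is just the special case where X has a single column.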
Linear Regression Line
A straight line showing the relationship between the dependent and independent variables is called
a regression line. A regression line can show two types of relationship:
o Positive Linear Relationship:
If the dependent variable increases on the Y-axis as the independent variable increases on the X-
axis, the relationship is termed a positive linear relationship.

o Negative Linear Relationship:

If the dependent variable decreases on the Y-axis as the independent variable increases on the X-axis,
the relationship is called a negative linear relationship.
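The sign of the fitted slope distinguishes the two cases; a small sketch with made-up data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y_up = np.array([2.0, 3.5, 5.0, 6.5])    # increases with x
y_down = np.array([9.0, 7.0, 5.0, 3.0])  # decreases with x

def slope(x, y):
    """Slope of the least-squares line: cov(x, y) / var(x)."""
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

print(slope(x, y_up))    # positive -> positive linear relationship
print(slope(x, y_down))  # negative -> negative linear relationship
```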
Finding the best fit line:
When working with linear regression, the main goal is to find the best-fit line, meaning the error
between predicted values and actual values should be minimized. The best-fit line has the least
error. Different values of the weights or line coefficients (a0, a1) give different regression lines,
so we need to calculate the best values of a0 and a1 to find the best-fit line; to calculate these,
we use a cost function.
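The cost function referred to here is typically the mean squared error (MSE); a minimal sketch, with made-up data:

```python
import numpy as np

def mse_cost(a0, a1, x, y):
    """Mean squared error between predictions a0 + a1*x and actual y."""
    y_pred = a0 + a1 * x
    return np.mean((y - y_pred) ** 2)

# Toy data following y = 2x exactly
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

print(mse_cost(0.0, 2.0, x, y))  # perfect fit -> cost 0.0
print(mse_cost(0.0, 1.0, x, y))  # worse fit -> larger cost
```

The best values of a0 and a1 are the ones that minimize this cost over the training data.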
Applications of Multiple Linear Regression:
There are mainly two applications of Multiple Linear Regression:
o Effectiveness of Independent variable on prediction: The regression coefficients indicate how
strongly each independent variable influences the predicted value.

o Predicting the impact of changes: The fitted model can estimate how much the dependent variable
changes when an independent variable is changed.

Python implementation of simple linear regression:

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vectors
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y * x) - n * m_y * m_x
    SS_xx = np.sum(x * x) - n * m_x * m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1 * m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as a scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)
    # predicted response vector
    y_pred = b[0] + b[1] * x
    # plotting the regression line
    plt.plot(x, y_pred, color="g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

def main():
    # sample observations (illustrative data)
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting the regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

Output: the estimated coefficients are printed, and a scatter plot of the points with the fitted
regression line is displayed.

Applications of Linear Regression


Trend lines: A trend line represents the variation in quantitative data with the passage of time (like
GDP, oil prices, etc.). These trends usually follow a linear relationship. Hence, linear regression can
be applied to predict future values. However, this method suffers from a lack of scientific validity in
cases where other potential changes can affect the data.
Economics: Linear regression is the predominant empirical tool in economics. For example, it is used
to predict consumer spending, fixed investment spending, inventory investment, purchases of a
country’s exports, spending on imports, the demand to hold liquid assets, labor demand, and labor
supply.
Finance: The capital asset pricing model (CAPM) uses linear regression to analyze and quantify the
systematic risk of an investment.
Biology: Linear regression is used to model causal relationships between parameters in biological
systems.
Advantages of Linear Regression
Easy to interpret: The coefficients of a linear regression model represent the change in the dependent
variable for a one-unit change in the independent variable, making it simple to comprehend the
relationship between the variables.
Robust variants available: ordinary least squares itself is sensitive to extreme values, but robust
variants of linear regression (for example, using the Huber loss) are less affected by outliers while
retaining the model's interpretability.
Can handle both linear and nonlinear relationships: Linear regression can be used to model both linear
and nonlinear relationships between variables. This is because the independent variable can be
transformed before it is used in the model.
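A common transformation is adding polynomial terms: the model stays linear in its coefficients even though the fitted curve is nonlinear in x. A minimal sketch with made-up data following y = x²:

```python
import numpy as np

# Made-up data following y = x^2 exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = x ** 2

# Transform the input: fit y against [1, x, x^2].
# The model is still linear in the coefficients, so ordinary
# least squares applies unchanged.
X = np.column_stack([np.ones_like(x), x, x ** 2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # ~[0, 0, 1]: the quadratic term is recovered
```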
No need for feature scaling or transformation: Unlike some machine learning algorithms, linear
regression does not require feature scaling or transformation. This can be a significant advantage,
especially when dealing with large datasets.
Disadvantages of Linear Regression
Assumes linearity: Linear regression assumes that the relationship between the independent variable
and the dependent variable is linear. This assumption may not be valid for all data sets. In cases where
the relationship is nonlinear, linear regression may not be a good choice.
Sensitive to multicollinearity: Linear regression is sensitive to multicollinearity. This occurs when
there is a high correlation between the independent variables. Multicollinearity can make it difficult to
interpret the coefficients of the model and can lead to overfitting.
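Multicollinearity can be detected numerically, for example via the correlation between predictors or the condition number of the design matrix; a sketch with synthetic data where one predictor nearly duplicates another:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)  # nearly identical to x1

# Correlation close to 1 signals multicollinearity
corr = np.corrcoef(x1, x2)[0, 1]
print(corr)

# The condition number of the design matrix grows large
# when columns are nearly collinear, making the estimated
# coefficients numerically unstable
X = np.column_stack([x1, x2])
print(np.linalg.cond(X))
```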
May not be suitable for highly complex relationships: Linear regression may not be suitable for
modeling highly complex relationships between variables. For example, it cannot capture interactions
between independent variables unless interaction (product) terms are explicitly added as features.
Not suitable for classification tasks: Linear regression is a regression algorithm and is not suitable for
classification tasks, which involve predicting a categorical variable rather than a continuous variable.