Linear Regression


Machine Learning Techniques (AI)
Types of Machine Learning

01 Supervised Learning
- Classification
- Regression

02 Unsupervised Learning
- Clustering
- Dimensionality Reduction

03 Semi-supervised Learning
- Self-Training
- Co-Training

04 Reinforcement Learning
- Q-Learning
- Deep Reinforcement Learning
Supervised ML Techniques
- Linear Models
- Tree-Based Models
- Nearest Neighbor
- Naïve Bayes
- Neural Networks


Our Focus
Linear Models:
- Linear Regression
- Logistic Regression
- Support Vector Machine (SVM)
Linear Regression
Supervised Machine Learning Technique
Linear Regression
1) Linear regression is a supervised machine learning algorithm that
models the linear relationship between a dependent variable and one or
more independent features.

2) When there is a single independent feature, it is known as univariate
(simple) linear regression; with more than one feature, it is known as
multivariate linear regression (see the sketch after the examples below).

- Salary Prediction
- Age Prediction
- House Price Prediction
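To make the distinction concrete, here is a minimal sketch using scikit-learn;
the library choice and the toy numbers are assumptions, not from the slides.

```python
# Univariate vs. multivariate linear regression with scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression

# Univariate: one feature (years of experience) -> salary.
years = np.array([[1], [2], [3], [4], [5]])
salary = np.array([40, 45, 52, 58, 63])            # e.g. in $1000s
uni_model = LinearRegression().fit(years, salary)
print(uni_model.predict([[6]]))                    # salary estimate for 6 years

# Multivariate: two features (size in m^2, rooms) -> house price.
features = np.array([[50, 1], [80, 2], [120, 3], [160, 4]])
price = np.array([100, 150, 220, 280])
multi_model = LinearRegression().fit(features, price)
print(multi_model.predict([[100, 2]]))             # price estimate
```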
Linear Regression
Simple Linear Regression
One-Variable Regression = Univariate Regression

Simple linear regression is a statistical method that allows us to summarize
and study relationships between two continuous (quantitative) variables.

- One variable, denoted x, is regarded as the predictor, explanatory, or
independent variable.
- The other variable, denoted y, is regarded as the response, outcome, or
dependent variable.
How to represent our function?
Consider f(x) as a linear equation: f(x) = bx + a,
where b is the slope and a is the y-intercept.

Slope of a line: b = (y2 − y1) / (x2 − x1) = Δy / Δx
Y-Intercept

The y-intercept is the point where the Best Fit Line meets the y-axis (x = 0).

Here the y-intercept equals 5, hence a = 5.

ŷ = bx + a

b is positive and a is 5,

so ŷ = bx + 5
Formula

Function: f_{w,b}(x) = wx + b

Parameters: w, b
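
As a quick illustration, the hypothesis is just a line; a minimal Python sketch
(the sample values are assumptions):

```python
# The linear hypothesis f_{w,b}(x) = w*x + b as a plain function.
def f(x, w, b):
    """Predict y for input x given slope w and intercept b."""
    return w * x + b

print(f(3, w=2, b=5))  # with w = 2, b = 5, input x = 3 predicts y = 11
```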
Cost Function
The cost function is our way to measure how good our model is:
it is the tool to determine which line is the Best Fit Line (BFL).
Suppose we have this data, visualized with a scatter plot.

Model equation: ŷ = f_{w,b}(x) = wx + b

Model equation for each point: ŷ^(i) = f_{w,b}(x^(i)) = w·x^(i) + b

Where: (x^(i), y^(i)) is the i-th training example and ŷ^(i) is its prediction.
Our Goal

GOAL 1: Find optimal w, b.

GOAL 2: Make the prediction ŷ as close as possible to the actual y for all
(x^(i), y^(i)) points, or at least most of them.
Cost Function Formula

Cost function = actual value − predicted value

which is also represented as

cost function (d) = Y − Y′

where Y = actual value
and Y′ = predicted value
Cost Function Formula

Similarly, if we have to find this cost function for all data points, then we
can write it like this:
Cost function 0 (d0) = Y0 − Y0′
Cost function 1 (d1) = Y1 − Y1′
Cost function 2 (d2) = Y2 − Y2′
...
Cost function n (dn) = Yn − Yn′
Cost Function Formula
So, the overall cost function is defined as follows:

J(w, b) = (1 / 2m) · Σ_{i=1..m} (ŷ^(i) − y^(i))²

Where: m is the number of training examples, y^(i) is the actual value, and
ŷ^(i) = f_{w,b}(x^(i)) is the predicted value.
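
A minimal sketch of this cost function in Python, using the 1/(2m) convention
discussed below (the convention is an inference that matches the J values
quoted later in these slides):

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Squared-error cost J(w, b) with the 1/(2m) convention."""
    m = len(x)
    predictions = w * x + b               # y-hat for every training example
    squared_errors = (predictions - y) ** 2
    return squared_errors.sum() / (2 * m)
```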
Why do we use the squared difference and not just the difference?
1) Sensitivity to Outliers:

 - A plain (signed) difference is a poor measure on its own: positive and
negative errors cancel out, so a bad fit could still sum to zero. Squaring
makes every error count.

 - The squared difference also penalizes larger errors more heavily than the
absolute difference.

 - Imagine a data point with a very high or low target value compared to the
rest. The squared term in (yi - ŷi)^2 amplifies the error for this point,
forcing the model to pay more attention to fitting it during training; the
squared loss is therefore more sensitive to outliers than the absolute loss
(see the numeric illustration after this list).

2) It facilitates the use of gradient descent optimization: the squared error
is differentiable everywhere (unlike the absolute difference at zero) and
yields a smooth, bowl-shaped cost with a single minimum.
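
A small numeric illustration of point 1 (the toy errors are assumptions):
squaring makes a single large error dominate the total.

```python
import numpy as np

# Per-point errors (actual - predicted); the last point is an outlier.
errors = np.array([1.0, -1.0, 0.5, 10.0])

print(errors.sum())          # plain sum: 10.5   -> signed errors partly cancel
print(np.abs(errors).sum())  # absolute total: 12.5   -> outlier is 80% of it
print((errors ** 2).sum())   # squared total: 102.25  -> outlier is ~98% of it
```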


Why do we take the average (mean) of this sum by dividing by the total number
of data points (m)?

(1) 1/m (commonly used):

 - Dividing by the total number of data points (m) directly calculates the
average squared error. This is the most common normalization approach.
 - It provides a clear interpretation of the cost function value, representing
the average squared difference between predicted and actual values.

(2) 1/(2m) (alternative):

 - Dividing by 2m achieves the same goal of finding the minimum, just with a
constant factor of ½.
 - This factor doesn't affect the optimization process, since minimizing f(x)
is equivalent to minimizing f(x)/2 as long as we're only comparing relative
values (which is the case during optimization).
 - The derivative of the squared error term with 1/(2m) leads to simpler, more
elegant mathematical expressions during the derivation, as sketched below.
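
Concretely, the chain rule brings down a factor of 2 that cancels the ½ (a
standard derivation, not shown in the slides):

```latex
J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)^{2}

% The chain rule brings down a factor of 2, which cancels the 1/2:
\frac{\partial J}{\partial w}
  = \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)\,x^{(i)}
```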
Formula

Function: f_{w,b}(x) = wx + b

Parameters: w, b

Cost Function: J(w, b) = (1 / 2m) · Σ_{i=1..m} (f_{w,b}(x^(i)) − y^(i))²
Cost Function For One Parameter

Function: f_w(x) = wx (setting b = 0)

Parameters: w

Cost Function: J(w) = (1 / 2m) · Σ_{i=1..m} (f_w(x^(i)) − y^(i))²

Goal: minimize J(w) over w
Cost Function For One Parameter
For b = 0 and a fixed w:

- For w = 1: J(w) = 0
- For w = 0.5: J(w) ≈ 0.583
- For w = 0: J(w) ≈ 2.3
- In general, for w = n: evaluating J(w) at each candidate value and plotting
the results traces a bowl-shaped (convex) curve whose minimum marks the
optimal w.
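
These values are consistent with the three-point training set (1, 1), (2, 2),
(3, 3) under the 1/(2m) convention; the data set itself is an assumption
inferred from the quoted J values.

```python
import numpy as np

# Training set assumed from the quoted J values: y = x exactly.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

def cost(w, b=0.0):
    """J(w, b) with the 1/(2m) convention."""
    m = len(x)
    return ((w * x + b - y) ** 2).sum() / (2 * m)

print(cost(1.0))  # 0.0
print(cost(0.5))  # 0.5833...
print(cost(0.0))  # 2.333... (~2.3)
```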
Cost Function For Two Parameters

Function: f_{w,b}(x) = wx + b

Parameters: w, b

Cost Function: J(w, b) = (1 / 2m) · Σ_{i=1..m} (f_{w,b}(x^(i)) − y^(i))²
Cost Function For Two Parameters

The cost function is more complex to visualize in this case, as there are two
parameters affecting the J function. Hence, we get a 3D surface plot:

- Any point on the surface has three values: (J, w, b).
- Horizontal axes: w, b
- Vertical axis: J(w, b)

Contour Plot
The same surface can also be viewed from above as a contour plot, where each
contour line connects (w, b) pairs that share the same value of J(w, b).
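
A minimal plotting sketch of both views (matplotlib is an assumption; the
slides only show the resulting figures):

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy training set (an assumption) to give J(w, b) something to measure.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

# Evaluate J on a grid of (w, b) values.
w_grid, b_grid = np.meshgrid(np.linspace(-1, 3, 100), np.linspace(-2, 2, 100))
m = len(x)
J = sum((w_grid * xi + b_grid - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

fig = plt.figure(figsize=(10, 4))

# 3D surface: the bowl shape of J(w, b).
ax1 = fig.add_subplot(1, 2, 1, projection="3d")
ax1.plot_surface(w_grid, b_grid, J, cmap="viridis")
ax1.set_xlabel("w"); ax1.set_ylabel("b"); ax1.set_zlabel("J(w, b)")

# Contour plot: the same surface seen from above.
ax2 = fig.add_subplot(1, 2, 2)
ax2.contour(w_grid, b_grid, J, levels=20)
ax2.set_xlabel("w"); ax2.set_ylabel("b")

plt.show()
```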
Key benefits of linear regression
- Easy implementation: The linear regression model is computationally simple to
implement, as it does not demand a lot of engineering overhead, either before
the model launch or during its maintenance.
- Interpretability: The coefficients of the model directly represent the impact
of each independent variable on the dependent variable. This makes it easy to
understand how changes in one factor will affect the predicted outcome.

- Scalability: Linear regression is not computationally heavy and therefore
fits well in cases where scaling is essential; for example, the model scales
well with increased data volume (big data).
Key benefits of linear regression
- Optimal for online settings: The model can be trained and retrained with each
new example to generate predictions in real time, unlike neural networks or
support vector machines, which are computationally heavy and require plenty of
computing resources and substantial waiting time to retrain on a new dataset
(see the sketch after this list).

- Robustness to noise: Linear regression can be relatively robust to noise in
the data, especially when compared to some other models that are more sensitive
to outliers.

- Wide range of applications: Linear regression is a versatile tool used in
various fields, including finance (stock price prediction), healthcare (patient
diagnosis), and marketing (customer churn prediction). Its interpretability and
efficiency make it a popular choice for many data analysis tasks.
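
To illustrate the online setting mentioned above, here is a minimal sketch of a
single-example (stochastic) gradient descent update for linear regression; the
learning rate and the simulated stream are assumptions.

```python
def sgd_update(w, b, x, y, lr=0.01):
    """One online update of (w, b) from a single new example (x, y)."""
    error = (w * x + b) - y   # prediction error on the new example
    w -= lr * error * x       # gradient of the (1/2) squared error w.r.t. w
    b -= lr * error           # gradient of the (1/2) squared error w.r.t. b
    return w, b

# Stream examples one at a time; the model improves as data arrives.
w, b = 0.0, 0.0
for x, y in [(1, 1), (2, 2), (3, 3)] * 200:   # simulated stream
    w, b = sgd_update(w, b, x, y)
print(w, b)  # approaches w = 1, b = 0 for this stream
```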
Thank You!
