Methodology to Minimize the Squared Error in Linear Regression
K. S. N. Vikrant

Introduction
Linear regression is a statistical method for modeling the relationship between a dependent variable and one or more independent variables. The objective is to minimize the squared error, defined as the sum of the squared differences between the observed and predicted values. This document presents the methodology for minimizing the squared error in two cases:

1. $y = mx$
2. $y = mx + c$

Case 1: Linear Regression with $y = mx$


Error Definition
For each data point $(x_i, y_i)$, the predicted value is:

\[ \hat{y}_i = m x_i \]

The error for each point is:

\[ e_i = y_i - \hat{y}_i = y_i - m x_i \]

The total squared error is given by:

\[ E = \sum_{i=1}^{n} (y_i - m x_i)^2 \]

Minimizing the Error


To find the optimal value of $m$, we take the derivative of $E$ with respect to $m$:

\[ \frac{\partial E}{\partial m} = \frac{\partial}{\partial m} \sum_{i=1}^{n} (y_i - m x_i)^2 = -2 \sum_{i=1}^{n} x_i (y_i - m x_i) \]

Setting this derivative to zero gives:

\[ \sum_{i=1}^{n} x_i y_i = m \sum_{i=1}^{n} x_i^2 \]

Solving for $m$:

\[ m = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \]
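
A minimal Python sketch of this closed-form slope follows; the function name slope_through_origin is a hypothetical choice for illustration, not part of the original document:

\begin{verbatim}
def slope_through_origin(xs, ys):
    # Least-squares slope for the no-intercept model y = m x:
    # m = sum(x_i * y_i) / sum(x_i^2)
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
\end{verbatim}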

Example with 4 Data Points


Consider the following data points: $(x_1, y_1) = (1, 2)$, $(x_2, y_2) = (2, 4)$, $(x_3, y_3) = (3, 6)$, $(x_4, y_4) = (4, 8)$. The required sums are:

\[ \sum x_i = 10, \quad \sum y_i = 20, \quad \sum x_i^2 = 30, \quad \sum x_i y_i = 60 \]

Using the formula for $m$:

\[ m = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{60}{30} = 2 \]

Thus, the best-fit line is:

\[ y = 2x \]
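
This result can be checked numerically with the hypothetical slope_through_origin sketch shown earlier:

\begin{verbatim}
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
print(slope_through_origin(xs, ys))  # 2.0, matching the hand computation
\end{verbatim}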

Case 2: Linear Regression with $y = mx + c$


Error Definition
For each data point $(x_i, y_i)$, the predicted value is:

\[ \hat{y}_i = m x_i + c \]

The error for each point is:

\[ e_i = y_i - \hat{y}_i = y_i - (m x_i + c) \]

The total squared error is given by:

\[ E = \sum_{i=1}^{n} \bigl(y_i - (m x_i + c)\bigr)^2 \]

Minimizing the Error


To find the optimal values of $m$ and $c$, we take the partial derivatives of $E$ with respect to $m$ and $c$, and set them to zero.

1. Derivative with respect to $m$:

\[ \frac{\partial E}{\partial m} = -2 \sum_{i=1}^{n} x_i \bigl(y_i - (m x_i + c)\bigr) \]

Setting this to zero gives the first normal equation:

\[ m \sum x_i^2 + c \sum x_i = \sum x_i y_i \]

2. Derivative with respect to $c$:

\[ \frac{\partial E}{\partial c} = -2 \sum_{i=1}^{n} \bigl(y_i - (m x_i + c)\bigr) \]

Setting this to zero gives the second normal equation:

\[ m \sum x_i + n c = \sum y_i \]

These two equations form a system of linear equations that can be solved for $m$ and $c$.

Example with 4 Data Points


Using the same data points as in Case 1: $(x_1, y_1) = (1, 2)$, $(x_2, y_2) = (2, 4)$, $(x_3, y_3) = (3, 6)$, $(x_4, y_4) = (4, 8)$. The required sums are:

\[ \sum x_i = 10, \quad \sum y_i = 20, \quad \sum x_i^2 = 30, \quad \sum x_i y_i = 60 \]

The normal equations are:

\[ 30m + 10c = 60 \]
\[ 10m + 4c = 20 \]

Solving these equations:

\[ m = 2, \quad c = 0 \]

Thus, the best-fit line is:

\[ y = 2x \]
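
As an illustration, the same $2 \times 2$ system can be solved numerically. This is a minimal sketch assuming NumPy is available:

\begin{verbatim}
import numpy as np

# Normal equations from the example:
#   30m + 10c = 60
#   10m +  4c = 20
A = np.array([[30.0, 10.0],
              [10.0,  4.0]])
b = np.array([60.0, 20.0])
m, c = np.linalg.solve(A, b)  # solves A @ [m, c] = b
print(m, c)  # 2.0 0.0, i.e. y = 2x
\end{verbatim}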

Generalized Expressions
For $y = mx + c$, the normal equations are:

\[ m \sum x_i^2 + c \sum x_i = \sum x_i y_i \]
\[ m \sum x_i + n c = \sum y_i \]

The solutions for $m$ and $c$ are:

\[ m = \frac{\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}}{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}} \]

\[ c = \frac{\sum y_i}{n} - m \, \frac{\sum x_i}{n} \]
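
A minimal Python sketch of these closed-form expressions follows; the function name fit_line is a hypothetical choice for illustration:

\begin{verbatim}
def fit_line(xs, ys):
    # Closed-form least-squares fit for y = m x + c.
    n = len(xs)
    sx = sum(xs)                              # sum of x_i
    sy = sum(ys)                              # sum of y_i
    sxx = sum(x * x for x in xs)              # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))  # sum of x_i * y_i
    m = (sxy - sx * sy / n) / (sxx - sx * sx / n)
    c = sy / n - m * sx / n
    return m, c

print(fit_line([1, 2, 3, 4], [2, 4, 6, 8]))  # (2.0, 0.0), i.e. y = 2x
\end{verbatim}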
