Numerical Computation - 7 - Linear Regression

This document discusses linear regression techniques for fitting a linear model to data. It covers determining the best fit line using least squares regression and the normal equation. It also discusses quantifying the error in linear regression models and linearizing non-linear relationships.


Numerical Computation (2 Credits)

2022-2023, 2nd term


Linear Regression

Naufan Raharya, Ph.D.


Linear Regression

[Figure: force diagram (FU, FD, F = ma) beside a scatter plot of force versus velocity v]

First, the points indicate that the force FU increases as velocity increases. Second, the points do not increase smoothly, but exhibit rather significant scatter, particularly at the higher velocities. Finally, although it may not be obvious, the relationship between force and velocity may not be linear. This conclusion becomes more apparent if we assume that force is zero for zero velocity.
14.1 STATISTICS REVIEW

Read in the textbook.
LINEAR LEAST-SQUARES REGRESSION

The best curve-fitting strategy is to derive an approximating function that fits the shape or general trend of the data without necessarily matching the individual points. One approach is to visually inspect the plotted data and then sketch a "best" line through the points.

The mathematical expression for the straight line is

y = a0 + a1x + e    (14.8)

where a0 and a1 are coefficients representing the intercept and the slope, respectively, and e is the error, or residual, between the model and the observations, which can be represented by rearranging Eq. (14.8) as

e = y − a0 − a1x    (14.9)

Thus, the residual is the discrepancy between the true value of y and the approximate value, a0 + a1x, predicted by the linear equation.
Criteria for a "Best" Fit

One strategy for fitting a "best" line through the data would be to minimize the sum of the residual errors for all the available data. However, any straight line passing through the midpoint of the connecting line (except a perfectly vertical line) results in a minimum value equal to zero, so this criterion does not yield a unique best fit.

A second strategy is to minimize the sum of the absolute values of the residuals (14.12). For the four points shown, any straight line falling within the dashed lines will minimize the sum of the absolute values of the residuals. Thus, this criterion also does not yield a unique best fit.

A third strategy for fitting a best line is the minimax criterion. In this technique, the line is chosen that minimizes the maximum distance that an individual point falls from the line.

FIGURE 14.7 Examples of some criteria for "best fit" that are inadequate for regression: (a) minimizes the sum of the residuals, (b) minimizes the sum of the absolute values of the residuals, and (c) minimizes the maximum error of any individual point.
Least-Squares Fit of a Straight Line

The minimum is found by differentiation: the derivatives of the sum of squared residuals with respect to a0 and a1 are set to zero.
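The slide's derivation steps are shown as images; as a sketch of the result, the standard closed-form formulas for the slope and intercept that follow from that differentiation can be coded directly (the helper name `linfit` is my own choice, not from the slides):

```python
def linfit(x, y):
    """Least-squares fit y ~ a0 + a1*x.

    The formulas follow from setting the derivatives of the sum of
    squared residuals Sr = sum((yi - a0 - a1*xi)^2) with respect to
    a0 and a1 equal to zero.
    """
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    a0 = (sy - a1 * sx) / n                         # intercept
    return a0, a1

# Points lying exactly on y = 1 + 2x recover a0 = 1, a1 = 2
a0, a1 = linfit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```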
Linear Problem

The GDP per Capita and Life Satisfaction per country [3]:

Country         GDP per Capita (USD)   Life Satisfaction
Hungary         12240                  4.9
Korea           27195                  5.8
France          37675                  6.5
Australia       50962                  7.3
United States   55805                  7.2
Linear Problem

• If we plot the previous table, we get the following figure (note that the blue dots are from other countries).

[Figure: The GDP per Capita and Life Satisfaction per country [3]]
Linear Problem

• We can see a trend in the figure. Therefore, we can make a linear regression model prediction using the following equation:

ŷ = θ0 + θ1 x1 + θ2 x2 + … + θn xn

• or, in vector form,

ŷ = θ · x
θ = [θ0, …, θn]
x = [x0, …, xn],  x0 = 1
Linear Problem

• Because we only have one input, the GDP per capita, we can write the linear
regression as follows

life satisfaction = θ 0 + θ1 × GDP per capita
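As an illustration (not the book's worked example, which uses many more countries), the five-country table above can be run through the ordinary least-squares formulas; the exact θ values depend on which countries are included, so treat these numbers as indicative only:

```python
gdp = [12240, 27195, 37675, 50962, 55805]   # GDP per capita (USD)
life = [4.9, 5.8, 6.5, 7.3, 7.2]            # life satisfaction

n = len(gdp)
sx, sy = sum(gdp), sum(life)
sxy = sum(x * y for x, y in zip(gdp, life))
sxx = sum(x * x for x in gdp)
theta1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
theta0 = (sy - theta1 * sx) / n                     # intercept

# theta1 > 0: higher GDP per capita predicts higher life satisfaction
```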


Linear Problem

[Figure: θ values [3]]

Linear Problem

[Figure: θ values obtained from least squares [3]]

Linear Regression

• We will now learn how to use linear regression.

• In linear regression, we have the normal equation, which is an analytical approach with a least-squares cost function. The normal equation is given by

θ = (XᵀX)⁻¹(Xᵀy)

• Suppose that

y*m = xθ,   y*m ∈ y* = [y0, …, yM]ᵀ
x = [x0, …, xN];   θ = [θ̂0, …, θ̂N]ᵀ

y* = Xθ,   X = [ x0,0 … x0,N ]
               [  ⋮   ⋱   ⋮  ]
               [ xM,0 … xM,N ]
Linear Regression

• We can get the error equation as follows:

e = y − y*,   y = [y0, …, yM]ᵀ
e = y − Xθ

• Then, we square the error as follows:

eᵀe = (y − Xθ)ᵀ(y − Xθ)
|e|² = yᵀy − (Xθ)ᵀy − yᵀ(Xθ) + (Xθ)ᵀXθ
Linear Regression

• We then take the derivative with respect to θ and set it equal to zero:

∇θ(eᵀe) = −Xᵀy − Xᵀy + 2XᵀXθ = 0
2Xᵀy = 2XᵀXθ
(XᵀX)⁻¹(Xᵀy) = (XᵀX)⁻¹(XᵀX)θ

• Remember, θ = (XᵀX)⁻¹(Xᵀy)

• Useful identity: ∇w(wᵀa) = ∇w(aᵀw) = a
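The derivation above can be checked numerically. A minimal sketch with NumPy, solving the normal equations with `np.linalg.solve` rather than forming the explicit inverse (which is numerically safer but algebraically identical to θ = (XᵀX)⁻¹Xᵀy):

```python
import numpy as np

# Data generated from y = 4 + 3*x1 with no noise, so the fit is exact
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 4.0 + 3.0 * x1

# Design matrix X with a leading column of ones (x0 = 1, the intercept term)
X = np.column_stack([np.ones_like(x1), x1])

# Normal equation: (X^T X) theta = X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)
# theta[0] ~ 4 (intercept), theta[1] ~ 3 (slope)
```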
Quantification of Error of Linear Regression

sy/x = √(Sr /(n − 2))

where sy/x is called the standard error of the estimate. The subscript notation y/x designates that the error is for a predicted value of y corresponding to a particular value of x. Also, notice that we now divide by n − 2 because two data-derived estimates, a0 and a1, were used to compute Sr.

FIGURE 14.9 The residual in linear regression represents the vertical distance between a data point and the straight line.
FIGURE 14.10 Regression data showing (a) the spread of the data around the mean of the dependent variable and (b) the spread of the data around the best-fit line. The reduction in the spread in going from (a) to (b), as indicated by the bell-shaped curves at the right, represents the improvement due to linear regression.

FIGURE 14.11 Examples of linear regression with (a) small and (b) large residual errors.

The difference between the two quantities, St − Sr, quantifies the improvement or error reduction due to describing the data in terms of a straight line rather than as an average value. Because the magnitude of this quantity is scale-dependent, the difference is normalized to St to yield the coefficient of determination,

r² = (St − Sr)/St

An alternative formulation for r that is more convenient for computer implementation is given in the textbook.
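Putting the quantities together (St, Sr, the standard error sy/x, and r²), a minimal sketch, assuming a line a0 + a1·x has already been fitted:

```python
import math

def regression_quality(x, y, a0, a1):
    """Standard error of the estimate and coefficient of determination."""
    n = len(y)
    ybar = sum(y) / n
    # St: spread of the data around the mean of the dependent variable
    St = sum((yi - ybar) ** 2 for yi in y)
    # Sr: spread of the data around the fitted line (residual sum of squares)
    Sr = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
    syx = math.sqrt(Sr / (n - 2))   # standard error of the estimate s_{y/x}
    r2 = (St - Sr) / St             # r^2 = (St - Sr) / St
    return syx, r2

# A perfect fit leaves no residual: s_{y/x} = 0 and r^2 = 1
syx, r2 = regression_quality([0, 1, 2, 3], [1, 3, 5, 7], a0=1.0, a1=2.0)
```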
Linearization of Non-linear Relationships

• Linear regression provides a powerful technique for fitting a best line to data.
• However, it is predicated on the fact that the relationship between the dependent and
independent variables is linear.
• This is not always the case, and the first step in any regression analysis should be to
plot and visually inspect the data to ascertain whether a linear model applies.
• In some cases, techniques such as polynomial regression, which is described in
Chap. 15 (General Linear Least Squares and Non-Linear Regression), are
appropriate.
