Numerical Computation - 7 - Linear Regression
[Figure: measured force F versus velocity v, with a free-body diagram showing the upward force FU, the downward force FD, and F = ma]
First, the points indicate that the force increases as velocity increases. Second, the points do not increase smoothly, but exhibit rather significant scatter, particularly at the higher velocities. Finally, although it may not be obvious, the relationship between force and velocity may not be linear. This conclusion becomes more apparent if we assume that force is zero for zero velocity.
14.1 STATISTICS REVIEW
Read in the textbook.
LINEAR LEAST-SQUARES REGRESSION
The best curve-fitting strategy is to derive an approximating function that fits the shape or
general trend of the data without necessarily matching the individual points.
One approach to do this is to visually inspect the plotted data and then sketch a “best” line
through the points.
Criteria for a “Best” Fit
One strategy for fitting a “best” line through the data would be to
minimize the sum of the residual errors for all the available data.
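To see why the plain (signed) sum of residuals is a poor criterion, consider a small numerical sketch (the data and the two candidate lines are made up for illustration): very different fits can both drive the signed sum to zero, while the sum of squared residuals still tells them apart.

```python
# Illustration (data made up): why the signed sum of residuals is a poor
# fit criterion. Both candidate lines below have a zero signed sum, but
# the sum of squared residuals distinguishes the good fit from the bad one.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 3.0, 4.0])       # points lie exactly on y = x

fits = {
    "y = x":   x,                        # perfect fit
    "y = 2.5": np.full_like(x, 2.5),     # horizontal line through the mean
}
for name, pred in fits.items():
    r = y - pred                         # residuals
    print(name, "signed sum:", r.sum(), "sum of squares:", (r ** 2).sum())
```

Both lines report a signed sum of zero, but only the squared-error criterion exposes the bad fit; this motivates least squares.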
Linear Problem
• If we plot the previous table, we can get the following figure (note the blue dots are
from other countries)
• We can see a trend in the figure. Therefore, we can make a linear regression model
prediction using the following equation
ŷ = θ₀ + θ₁x₁ + θ₂x₂ + ⋯ + θₙxₙ
• or, in vector form,
ŷ = θ · x
where θ = [θ₀, …, θₙ] and x = [x₀, …, xₙ], with x₀ = 1
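A minimal sketch of this prediction equation in NumPy; the θ values and the GDP-per-capita input below are hypothetical, chosen only to illustrate the dot product with the bias term x₀ = 1.

```python
# Sketch of y_hat = theta . x with a bias term x_0 = 1.
# The theta values and the GDP-per-capita input are hypothetical.
import numpy as np

theta = np.array([4.85, 4.91e-5])    # [theta_0 (intercept), theta_1]
x = np.array([1.0, 22587.0])         # [x_0 = 1, x_1 = GDP per capita]

y_hat = theta @ x                    # equivalent to theta_0 + theta_1 * x_1
print(y_hat)
```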
Linear Problem
• Because we only have one input, the GDP per capita, we can write the linear
regression as follows:
ŷ = θ₀ + θ₁ × GDP_per_capita
[Figure: data and fitted line, with the θ values [3]]
Linear Problem
• Suppose that θ = (XᵀX)⁻¹(Xᵀy). For a single sample, the model prediction is
y*ₘ = x θ, where x = [x₀, …, x_N] and θ = [θ̂₀, …, θ̂_N]ᵀ
• Stacking all the samples, y*ₘ ∈ y* = [y*₀, …, y*_M]ᵀ, and
y* = X θ
where X is the matrix whose row m is [x_{m,0}, …, x_{m,N}]:
⎡ x_{0,0} ⋯ x_{0,N} ⎤
⎢    ⋮         ⋮    ⎥
⎣ x_{M,0} ⋯ x_{M,N} ⎦
Linear Regression
• The error is e = y − y*, where y = [y₀, …, y_M]ᵀ, so
e = y − Xθ
• Then, we square the error as follows:
eᵀe = (y − Xθ)ᵀ(y − Xθ)
|e|² = yᵀy − (Xθ)ᵀy − yᵀ(Xθ) + (Xθ)ᵀXθ
Linear Regression
• We then differentiate with respect to θ and set the derivative equal to zero:
∇θ(eᵀe) = −Xᵀy − Xᵀy + 2XᵀXθ = 0
2Xᵀy = 2XᵀXθ
(XᵀX)⁻¹(Xᵀy) = (XᵀX)⁻¹(XᵀX)θ
• Remember, θ = (XᵀX)⁻¹(Xᵀy)
• Here we used the gradient identities ∇_w(wᵀa) = ∇_w(aᵀw) = a
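The closed-form solution just derived can be checked numerically. A minimal sketch with made-up, noise-free data follows; np.linalg.solve is applied to the normal equations XᵀXθ = Xᵀy rather than forming the inverse explicitly, which is the numerically preferable route.

```python
# Verify theta = (X^T X)^{-1} (X^T y) on noise-free data from y = 1 + 2x.
import numpy as np

x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x1                              # exact line, no noise

X = np.column_stack([np.ones_like(x1), x1])     # column of ones is x_0 = 1
theta = np.linalg.solve(X.T @ X, X.T @ y)       # solve X^T X theta = X^T y
print(theta)                                    # recovers [1.0, 2.0]
```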
Quantification of Error of Linear Regression
FIGURE 14.9 The residual in linear regression represents the vertical distance between a data point and the straight line.

FIGURE 14.10 Regression data showing (a) the spread of the data around the mean of the dependent variable and (b) the spread of the data around the best-fit line. The reduction in the spread in going from (a) to (b), as indicated by the bell-shaped curves at the right, represents the improvement due to linear regression.

FIGURE 14.11 Examples of linear regression with (a) small and (b) large residual errors.
The difference between the two quantities, St − Sr, quantifies the improvement or error
reduction due to describing the data in terms of a straight line rather than as an average
value. Because the magnitude of this quantity is scale-dependent, the difference is
normalized to St to yield
r² = (St − Sr) / St
where r² is called the coefficient of determination.
An alternative formulation for r that is more convenient for computer implementation is
r = [n Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)] / [√(n Σxᵢ² − (Σxᵢ)²) · √(n Σyᵢ² − (Σyᵢ)²)]
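As a sketch with a small made-up data set, St (the spread about the mean), Sr (the spread about the fitted line), and the coefficient of determination r² = (St − Sr)/St can be computed directly from their definitions:

```python
# Compute S_t (spread about the mean), S_r (spread about the fitted line),
# and r^2 = (S_t - S_r) / S_t for a small made-up data set.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.5, 2.5, 2.0, 4.0, 3.5])

# least-squares straight line via the normal equations
X = np.column_stack([np.ones_like(x), x])
a0, a1 = np.linalg.solve(X.T @ X, X.T @ y)

St = np.sum((y - y.mean()) ** 2)        # spread around the mean
Sr = np.sum((y - (a0 + a1 * x)) ** 2)   # spread around the regression line
r2 = (St - Sr) / St
print(r2)                               # 0.75 for this data set
```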
Linearization of Non-linear Relationships
• Linear regression provides a powerful technique for fitting a best line to data.
• However, it is predicated on the assumption that the relationship between the dependent and
independent variables is linear.
• This is not always the case, and the first step in any regression analysis should be to
plot and visually inspect the data to ascertain whether a linear model applies.
• In some cases, techniques such as polynomial regression, which is described in
Chap. 15 (General Linear Least Squares and Non-Linear Regression), are
appropriate.
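As a preview of the linearization idea, here is a sketch with made-up, noise-free data: an exponential model y = α·e^(βx) is nonlinear in x, but taking logarithms gives ln y = ln α + βx, which ordinary linear regression can fit.

```python
# Linearize y = alpha * exp(beta * x) by regressing ln(y) on x.
import numpy as np

alpha_true, beta_true = 2.0, 0.5
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = alpha_true * np.exp(beta_true * x)              # noise-free synthetic data

X = np.column_stack([np.ones_like(x), x])
b0, b1 = np.linalg.solve(X.T @ X, X.T @ np.log(y))  # fit ln y = b0 + b1 x

alpha, beta = np.exp(b0), b1                        # transform back
print(alpha, beta)                                  # recovers 2.0 and 0.5
```

Note that this fits least squares in log space, which weights errors differently from a direct nonlinear fit in the original variables.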