Module 5
Linear Least Squares Regression
The least-squares criterion chooses the line that minimizes the sum of the squared residuals between the measured values $Y_i$ and the values predicted by the line:

$$S = \sum_{i=1}^{N} (Y_i - y_i)^2 = \sum_{i=1}^{N} (Y_i - a_1 x_i - a_0)^2$$
Sample Problem on Linear Least Squares Regression
1. In a physics laboratory, a group of students conducted an experiment to determine the effects of temperature on resistance. They recorded the temperature and resistance measurements shown below.

n      Ti       Ri      Ti^2        Ri*Ti
1      20.5     765     420.25      15682.50
2      32.7     826     1069.29     27010.20
3      51.0     873     2601.00     44523.00
4      73.2     942     5358.24     68954.40
5      95.7     1032    9158.49     98762.40
sum    273.1    4438    18607.27    254932.50

Therefore, we have
18607.27a1 + 273.1a0 = 254932.5   (eq. 1)
273.1a1 + 5a0 = 4438              (eq. 2)

Solving for a1 and a0,
a1 = 3.3949
a0 = 702.1721

so the fitted line is R = 3.3949T + 702.1721.
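As a quick check, the two normal equations can be solved numerically. Below is a minimal MATLAB sketch (variable names are illustrative) that assembles the 2x2 system from the tabulated sums and solves it with the backslash operator:

% Normal equations for Problem 1, using the column sums from the table above
A = [18607.27 273.1; 273.1 5];   % [sum(Ti^2) sum(Ti); sum(Ti) n]
b = [254932.5; 4438];            % [sum(Ri*Ti); sum(Ri)]
coef = A \ b                     % coef(1) = a1 = 3.3949, coef(2) = a0 = 702.1721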
Excel Output for Linear Least Squares Regression
(Excel screenshot of the Problem 1 regression output.)
2. Find the least-squares line that fits the following data, assuming that the x-values are free of
error.
x 1 2 3 4 5 6
y 5.04 8.12 10.64 13.18 16.20 20.04
n      xi    yi      xi^2     xi*yi
1      1     5.04    1.00     5.04
2      2     8.12    4.00     16.24
3      3     10.64   9.00     31.92
4      4     13.18   16.00    52.72
5      5     16.20   25.00    81.00
6      6     20.04   36.00    120.24
sum    21    73.22   91.00    307.16

Therefore, we have
91a1 + 21a0 = 307.16   (eq. 1)
21a1 + 6a0 = 73.22     (eq. 2)

Solving for a1 and a0,
a1 = 2.9080
a0 = 2.0253

so the fitted line is y = 2.908x + 2.0253.
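Rather than tabulating the sums by hand, they can be computed directly from the data vectors. A minimal MATLAB sketch (variable names are illustrative):

% Build the normal equations from the raw data and solve for a1 and a0
x = [1 2 3 4 5 6];
y = [5.04 8.12 10.64 13.18 16.20 20.04];
n = length(x);
A = [sum(x.^2) sum(x); sum(x) n];
b = [sum(x.*y); sum(y)];
coef = A \ b                     % coef(1) = a1 = 2.9080, coef(2) = a0 = 2.0253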
MATLAB Implementation
Polyfit and Polyval Command/Function

>> T = [20.5 32.7 51 73.2 95.7];
>> R = [765 826 873 942 1032];
>> a = polyfit(T,R,1)
a =
    3.3949  702.1721
>> R = polyval(a,10)
R =
  736.1208
>>
MATLAB Implementation

x    y
1    5.04
2    8.12
3    10.64
4    13.18
5    16.20
6    20.04

>> x = [1 2 3 4 5 6];
>> y = [5.04 8.12 10.64 13.18 16.20 20.04];
>> a = polyfit(x,y,1)
a =
    2.9080    2.0253
>> y = polyval(a,10)
y =
   31.1053
>>
Linear Regression Analysis
• A regression model is a mathematical equation that describes the relationship between two or more variables.
• A simple regression model includes only two variables: one independent and one dependent. The relationship between the two variables in a regression analysis is expressed by a mathematical equation called a regression equation or model. A regression equation that gives a straight-line relationship between the two variables is called a linear regression model; otherwise, it is called a nonlinear regression model.
Quantification of Error
of Linear Regression
The standard deviation of errors measures the spread of the errors around the regression line. The standard deviation of errors, $s_e$ (also called the standard error of the estimate), is calculated using

$$s_e = \sqrt{\frac{SSE}{n-2}}$$

where SSE is the sum of the squares of the errors (residuals),

$$SSE = \sum (y_i - a_1 x_i - a_0)^2$$
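These quantities are easy to compute once the line has been fitted. Below is a minimal MATLAB sketch using the temperature-resistance data of Problem 1 (variable names are illustrative); it reproduces the Excel values reported further below:

% Error quantification for the linear fit R = a1*T + a0
T = [20.5 32.7 51 73.2 95.7];
R = [765 826 873 942 1032];
n = length(T);
a = polyfit(T, R, 1);         % a(1) = a1, a(2) = a0
Rhat = polyval(a, T);         % predicted resistances
SSE = sum((R - Rhat).^2);     % error sum of squares
Se  = sqrt(SSE / (n - 2));    % standard deviation of errors
SST = sum((R - mean(R)).^2);  % total sum of squares
SSR = SST - SSE;              % regression sum of squares
r2  = SSR / SST;              % coefficient of determination
r   = sqrt(r2);               % Pearson's r (positive, since the slope is positive)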
Sample Problem on Quantification
of Error of Linear Regression
1. For the temperature-resistance data of Problem 1, determine:
a. standard deviation of errors, Se
b. error sum of squares, SSE
c. total sum of squares, SST
d. regression sum of squares, SSR
e. the coefficient of determination, r2
f. Pearson’s correlation coefficient
Excel Output on Quantification
of Error of Linear Regression
1. In a physics laboratory, a group of students conducted an experiment to determine the effects of temperature on resistance. They recorded the temperature and resistance measurements given in Problem 1.
a. Se  = 10.2477
b. SSE = 315.0459
c. SST = 42849.2
d. SSR = 42534.15
e. r^2 = 0.992648
f. r   = 0.996317
Sample Problem on Quantification
of Error of Linear Regression
2. Given the following data:
x 1 2 3 4 5 6
y 5.04 8.12 10.64 13.18 16.20 20.04
Determine:
a. standard deviation of errors, Se
b. error sum of squares, SSE
c. total sum of squares, SST
d. regression sum of squares, SSR
e. the coefficient of determination, r2
f. Pearson’s correlation coefficient
Excel Output on Quantification
of Error of Linear Regression
2. Given the following data:
x 1 2 3 4 5 6
y 5.04 8.12 10.64 13.18 16.20 20.04
a. Se  = 0.442553
b. SSE = 0.783413
c. SST = 148.7715
d. SSR = 147.9881
e. r^2 = 0.994734
f. r   = 0.997364
Polynomial Regression
The least squares procedure can be readily extended to fit the data to a higher-order polynomial. For example, suppose that we fit a second-order polynomial, or quadratic:

$$y = a_2 x^2 + a_1 x + a_0 + e$$

Minimizing the sum of the squared residuals leads to the normal equations

$$a_0 N + a_1 \sum x_i + a_2 \sum x_i^2 = \sum y_i$$
$$a_0 \sum x_i + a_1 \sum x_i^2 + a_2 \sum x_i^3 = \sum x_i y_i$$
$$a_0 \sum x_i^2 + a_1 \sum x_i^3 + a_2 \sum x_i^4 = \sum x_i^2 y_i$$
Fit a second-order polynomial to the following data:

x    0     1     2     3     4     5
y    2.1   7.7   13.6  27.2  40.9  61.1
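Following the normal equations above, here is a minimal MATLAB sketch that assembles and solves the 3x3 system for this data set (variable names are illustrative; polyfit, shown later, gives the same coefficients):

% Second-order polynomial fit via the normal equations
x = [0 1 2 3 4 5];
y = [2.1 7.7 13.6 27.2 40.9 61.1];
N = length(x);
A = [N          sum(x)     sum(x.^2);
     sum(x)     sum(x.^2)  sum(x.^3);
     sum(x.^2)  sum(x.^3)  sum(x.^4)];
b = [sum(y); sum(x.*y); sum(x.^2.*y)];
coef = A \ b     % coef = [a0; a1; a2] = [2.4786; 2.3593; 1.8607]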
Excel Output for Polynomial Regression
Fit a second-order polynomial to the data in the first two columns of the given table.
x 0 1 2 3 4 5
y 2.1 7.7 13.6 27.2 40.9 61.1
MATLAB Implementation

x    y
0    2.1
1    7.7
2    13.6
3    27.2
4    40.9
5    61.1

>> x = [0 1 2 3 4 5];
>> y = [2.1 7.7 13.6 27.2 40.9 61.1];
>> a = polyfit(x,y,2)
a =
    1.8607    2.3593    2.4786
>> y = polyval(a,10)
y =
  212.1429
>>
Multiple Linear Regression
A useful extension of linear regression is the case where y is a linear
function of two or more independent variables.
Consider a function y which is a linear function of x1 and x2, as in

$$y = a_0 + a_1 x_1 + a_2 x_2 + e$$

Such an equation is quite useful in fitting experimental data where the variable being studied is often a function of two other variables. For this two-dimensional case, the regression line becomes a plane. The best values of the coefficients are obtained by formulating the sum of the squares of the residuals:

$$S = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i})^2$$

At the minimum of S, the partial derivatives $\partial S/\partial a_0$, $\partial S/\partial a_1$, and $\partial S/\partial a_2$ are equal to zero; expressing the result in matrix form,

$$\begin{bmatrix} n & \sum x_{1,i} & \sum x_{2,i} \\ \sum x_{1,i} & \sum x_{1,i}^2 & \sum x_{1,i} x_{2,i} \\ \sum x_{2,i} & \sum x_{1,i} x_{2,i} & \sum x_{2,i}^2 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_{1,i} y_i \\ \sum x_{2,i} y_i \end{bmatrix}$$
Sample Problems on
Multiple Linear Regression
1. Use multiple linear regression to fit the following data:
x1 0 1 1 2 2 3 3 4 4
x2 0 1 2 1 2 1 2 1 2
y 15 18 12.8 25.7 20.4 35 30 45.3 40.1
2. Use multiple linear regression to fit the following data:
x1   0   0   1   2   1   1.5   3    3    -1
x2   0   1   0   1   2   1     2    3    -1
y    1   6   4   -4  -2  -1.5  -12  -15  17
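A minimal MATLAB sketch for Problem 1, building the 3x3 system from the matrix form above (variable names are illustrative, and the printed coefficients are approximate):

% Multiple linear regression: y = a0 + a1*x1 + a2*x2
x1 = [0 1 1 2 2 3 3 4 4];
x2 = [0 1 2 1 2 1 2 1 2];
y  = [15 18 12.8 25.7 20.4 35 30 45.3 40.1];
n  = length(y);
A = [n         sum(x1)       sum(x2);
     sum(x1)   sum(x1.^2)    sum(x1.*x2);
     sum(x2)   sum(x1.*x2)   sum(x2.^2)];
b = [sum(y); sum(x1.*y); sum(x2.*y)];
coef = A \ b     % coef = [a0; a1; a2], roughly [14.42; 8.99; -5.61]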