Lecture 4 - Systems of Linear Equations (DONE!!)
Learning
from numpy.linalg import inv, matrix_rank
Lecture 4
Semester 2
2021/2022
Acknowledgement:
EE2211 development team
(Kar-Ann Toh, Thomas Yeo, Chen Khong, Helen Zhou, Vincent Tan, Robby Tan and
Haizhou Li)
© Copyright EE, NUS. All Rights Reserved.
Systems of Linear Equations
Module II Contents
• Operations on Vectors and Matrices
• Systems of Linear Equations
• Set and Functions
• Derivative and Gradient
• Least Squares, Linear Regression
• Linear Regression with Multiple Outputs
• Linear Regression for Classification
• Ridge Regression
• Polynomial Regression
Fundamental ML Algorithms:
Linear Regression
References for Lectures 4-6:
Main
• [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019.
(read first, buy later: http://themlbook.com/wiki/doku.php)
• [Book2] Andreas C. Müller and Sarah Guido, “Introduction to Machine
Learning with Python: A Guide for Data Scientists”, O’Reilly Media, Inc., 2017
Supplementary
• [Book3] Jeff Leek, “The Elements of Data Analytic Style: A guide for people
who want to analyze data”, Lean Publishing, 2015.
• [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied
Linear Algebra”, Cambridge University Press, 2018 (available online)
http://vmls-book.stanford.edu/
• [Ref 5] Professor Vincent Tan’s notes (chapters 4-6): (useful)
https://vyftan.github.io/papers/ee2211book.pdf
Recap on Notations, Vectors, Matrices
Scalar | Numerical value | 15, -3.5
Variable | Takes scalar values | x or a
Capital Sigma | $\sum_{i=1}^{m} x_i = x_1 + x_2 + \dots + x_{m-1} + x_m$
Capital Pi | $\prod_{i=1}^{m} x_i = x_1 \cdot x_2 \cdot \ldots \cdot x_{m-1} \cdot x_m$
Operations on Vectors and Matrices
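$a\mathbf{x} = a\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a x_1 \\ a x_2 \end{bmatrix}$
$\frac{1}{a}\mathbf{x} = \frac{1}{a}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{a} x_1 \\ \frac{1}{a} x_2 \end{bmatrix}$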
Compute Rank of Matrix
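A minimal sketch of a rank computation with NumPy, using `numpy.linalg.matrix_rank`. The two matrices below are illustrative assumptions, not values taken from the slides.

```python
import numpy as np
from numpy.linalg import matrix_rank

# Illustrative matrices (assumed for this sketch, not from the lecture).
full_rank = np.array([[1, 2], [3, 4]])       # rows are linearly independent
rank_deficient = np.array([[1, 2], [2, 4]])  # second row = 2 * first row

rank_full = matrix_rank(full_rank)
rank_def = matrix_rank(rank_deficient)
print(rank_full, rank_def)
```

A square matrix whose rank is smaller than its dimension has linearly dependent rows and columns, so it has no inverse.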
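Dot Product or Inner Product of Vectors:
$\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^T\mathbf{y} = \begin{bmatrix} x_1 & x_2 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = x_1 y_1 + x_2 y_2$ (scalar), with dimensions $(1 \times 2)(2 \times 1)$
(Handwritten note: in NumPy, with a and b created by np.array, c = a.dot(b).)
Geometric definition:
$\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\| \, \|\mathbf{y}\| \cos\theta$
Matrix-Vector Product
$\mathbf{W}\mathbf{x} = \begin{bmatrix} w_{1,1} & w_{1,2} & w_{1,3} \\ w_{2,1} & w_{2,2} & w_{2,3} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$, with dimensions $(2 \times 3)(3 \times 1)$
(Handwritten note: c = W.dot(x).)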
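The dot product and the matrix-vector product can be sketched in NumPy as follows. All numeric values are assumptions chosen for illustration; the angle is computed from the polar angles of the two 2-D vectors so that the geometric definition is checked independently of the algebraic one.

```python
import numpy as np

# Illustrative values (assumed for this sketch, not from the slides).
x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

# Algebraic definition: x1*y1 + x2*y2, a scalar.
c = x.dot(y)

# Geometric definition: ||x|| ||y|| cos(theta), where theta is the angle
# between x and y, obtained here from the polar angles of the two vectors.
theta = np.arctan2(y[1], y[0]) - np.arctan2(x[1], x[0])
c_geometric = np.linalg.norm(x) * np.linalg.norm(y) * np.cos(theta)

# Matrix-vector product: a (2 x 3) matrix times a length-3 vector
# gives a vector of length 2.
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
v = np.array([1.0, 0.0, -1.0])
Wv = W.dot(v)

print(c, c_geometric, Wv)
```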
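Vector-Matrix Product
$\mathbf{x}^T\mathbf{W} = \begin{bmatrix} x_1 & x_2 \end{bmatrix}\begin{bmatrix} w_{1,1} & w_{1,2} & w_{1,3} \\ w_{2,1} & w_{2,2} & w_{2,3} \end{bmatrix}$
$= \begin{bmatrix} (x_1 w_{1,1} + x_2 w_{2,1}) & (x_1 w_{1,2} + x_2 w_{2,2}) & (x_1 w_{1,3} + x_2 w_{2,3}) \end{bmatrix}$, with dimensions $(1 \times 2)(2 \times 3) \to (1 \times 3)$
(Handwritten note: c = x.dot(W).)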
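Matrix-Matrix Product
$\mathbf{X}\mathbf{W} = \begin{bmatrix} x_{1,1} & \dots & x_{1,d} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \dots & x_{m,d} \end{bmatrix}\begin{bmatrix} w_{1,1} & \dots & w_{1,h} \\ \vdots & \ddots & \vdots \\ w_{d,1} & \dots & w_{d,h} \end{bmatrix}$
(Handwritten note: C = X.dot(W).)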
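A short sketch of the vector-matrix and matrix-matrix products in NumPy, with attention to the resulting shapes. The numeric values are illustrative assumptions.

```python
import numpy as np

# Illustrative values (assumed for this sketch, not from the slides).
x = np.array([1.0, 2.0])                 # treated as a 1 x 2 row vector
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])          # 2 x 3

# Vector-matrix product: (1 x 2)(2 x 3) -> 3 entries.
xW = x.dot(W)

# Matrix-matrix product: (3 x 2)(2 x 3) -> (3 x 3).
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
XW = X.dot(W)

print(xW)
print(XW.shape)
```

The inner dimensions must agree: the number of columns of the left operand equals the number of rows of the right operand.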
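Matrix inverse
Definition:
A d-by-d square matrix $\mathbf{A}$ is invertible (also nonsingular)
if there exists a d-by-d square matrix $\mathbf{B}$ such that
$\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = \mathbf{I}$ (identity matrix)
$\mathbf{B}$ is an inverse of $\mathbf{A}$: $\mathbf{B} = \mathbf{A}^{-1}$
$\mathbf{I} = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix}$ (d-by-d dimension)
(Handwritten notes: a square matrix $\mathbf{A}$ is invertible if $\mathbf{A}^{-1}$ exists; this is the case when its rows/columns are linearly independent, i.e. the determinant of $\mathbf{A}$ is not 0.)
Ref: https://en.wikipedia.org/wiki/Invertible_matrix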
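Matrix inverse computation
$\mathbf{A}^{-1} = \frac{1}{\det \mathbf{A}} \operatorname{adj}(\mathbf{A})$
• $\det \mathbf{A}$ is the determinant of $\mathbf{A}$ (a scalar)
• $\operatorname{adj}(\mathbf{A})$ is the adjugate or adjoint of $\mathbf{A}$ (a matrix)
Determinant computation
Example: 2x2 matrix
$\mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$
$\det \mathbf{A} = |\mathbf{A}| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$
Ref: https://en.wikipedia.org/wiki/Invertible_matrix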
• $\operatorname{adj}(\mathbf{A})$ is the adjugate or adjoint of $\mathbf{A}$
• $\operatorname{adj}(\mathbf{A})$ is the transpose of the cofactor matrix $\mathbf{C}$ of $\mathbf{A}$, so $\operatorname{adj}(\mathbf{A}) = \mathbf{C}^T$
• Minor of an element in a matrix $\mathbf{A}$ is defined as the determinant
obtained by deleting the row and column in which that element lies
$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$, Minor of $a_{12}$ is $M_{12} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$
• The $(i, j)$ entry of the cofactor matrix $\mathbf{C}$ is the minor of the $(i, j)$
element times a sign factor
Cofactor $C_{ij} = (-1)^{i+j} M_{ij}$
Ref: https://en.wikipedia.org/wiki/Invertible_matrix
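Minor of $a_{12}$ is $M_{12} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$, $\operatorname{adj}(\mathbf{A}) = \mathbf{C}^T$
Cofactor $C_{ij} = (-1)^{i+j} M_{ij}$, $\det(\mathbf{A}) = \sum_{j=1}^{k} a_{ij} (-1)^{i+j} M_{ij}$
• E.g. $\mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$
$\mathbf{C} = \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}$
(Handwritten note: $\det \mathbf{A} = ad + b(-c) = ad - bc$)
• $\operatorname{adj}(\mathbf{A}) = \mathbf{C}^T = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$, $\det \mathbf{A} = |\mathbf{A}| = ad - bc$
$\mathbf{A}^{-1} = \frac{1}{\det \mathbf{A}} \operatorname{adj}(\mathbf{A}) = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$
Ref: https://en.wikipedia.org/wiki/Invertible_matrix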
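The 2x2 formula above can be checked numerically. This is a sketch only: `inverse_2x2` is a hypothetical helper written for this illustration, and the matrix entries are assumed values; `numpy.linalg.inv` serves as the reference.

```python
import numpy as np
from numpy.linalg import inv

def inverse_2x2(A):
    """Invert a 2x2 matrix via A^{-1} = adj(A) / det(A) (hypothetical helper)."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (det = 0), no inverse exists")
    adj = np.array([[d, -b],
                    [-c, a]])  # transpose of the cofactor matrix
    return adj / det

# Illustrative matrix (assumed values); det = 4*6 - 7*2 = 10.
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = inverse_2x2(A)
print(A_inv)
```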
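Determinant of a 3x3 matrix:
$|\mathbf{A}| = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg)$
(Handwritten note: after import numpy as np and from numpy.linalg import inv, print(inv(x)) prints the inverse of a matrix x created with np.array.)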
The minor of $a_{11}$ = $\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$
Ref: https://en.wikipedia.org/wiki/Determinant
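Example
Find the cofactor matrix of $\mathbf{A}$ given that $\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 1 & 0 & 6 \end{bmatrix}$.
Solution:
$a_{11} \Rightarrow \begin{vmatrix} 4 & 5 \\ 0 & 6 \end{vmatrix} = 24$, $a_{12} \Rightarrow -\begin{vmatrix} 0 & 5 \\ 1 & 6 \end{vmatrix} = 5$, $a_{13} \Rightarrow \begin{vmatrix} 0 & 4 \\ 1 & 0 \end{vmatrix} = -4$,
$a_{21} \Rightarrow -\begin{vmatrix} 2 & 3 \\ 0 & 6 \end{vmatrix} = -12$, $a_{22} \Rightarrow \begin{vmatrix} 1 & 3 \\ 1 & 6 \end{vmatrix} = 3$, $a_{23} \Rightarrow -\begin{vmatrix} 1 & 2 \\ 1 & 0 \end{vmatrix} = 2$,
$a_{31} \Rightarrow \begin{vmatrix} 2 & 3 \\ 4 & 5 \end{vmatrix} = -2$, $a_{32} \Rightarrow -\begin{vmatrix} 1 & 3 \\ 0 & 5 \end{vmatrix} = -5$, $a_{33} \Rightarrow \begin{vmatrix} 1 & 2 \\ 0 & 4 \end{vmatrix} = 4$,
The cofactor matrix $\mathbf{C}$ is thus $\begin{bmatrix} 24 & 5 & -4 \\ -12 & 3 & 2 \\ -2 & -5 & 4 \end{bmatrix}$.
(Handwritten note: $\det \mathbf{A} = 1(24) + 2(5) + 3(-4) = 22$, expanding along the first row.)
Ref: https://www.mathwords.com/c/cofactor_matrix.htm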
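The cofactor example can be reproduced with a few lines of NumPy. This is a sketch under stated assumptions: `cofactor_matrix` is a hypothetical helper written for this illustration (NumPy has no built-in cofactor function), and it evaluates each minor with `numpy.linalg.det`.

```python
import numpy as np

def cofactor_matrix(A):
    """Cofactor matrix C with C[i, j] = (-1)^(i+j) * M_ij (hypothetical helper)."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Minor M_ij: determinant after deleting row i and column j.
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(sub)
    return C

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])
C = cofactor_matrix(A)

# Cofactor expansion along the first row gives the determinant.
det_A = A[0] @ C[0]

# Inverse from the adjugate: A^{-1} = C^T / det(A).
A_inv = C.T / det_A
print(np.round(C))
```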
Systems of Linear Equations
These equations can be written compactly in matrix-vector
notation:
$\mathbf{X}\mathbf{w} = \mathbf{y}$
Note:
• The data matrix $\mathbf{X} \in \mathbb{R}^{m \times d}$ and the target vector $\mathbf{y} \in \mathbb{R}^{m}$ are given
• The unknown vector of parameters $\mathbf{w} \in \mathbb{R}^{d}$ is to be learnt
(Handwritten notes: supervised ML; $\mathbf{X}$ is the input data; $\mathbf{X}$ is invertible if $\det \mathbf{X} \neq 0$.)
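A set of linear equations can have no solution, one
solution, or multiple solutions:
$\mathbf{X}\mathbf{w} = \mathbf{y}$
where
$\mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & \dots & x_{1,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \dots & x_{m,d} \end{bmatrix}$, $\mathbf{w} = \begin{bmatrix} w_1 \\ \vdots \\ w_d \end{bmatrix}$, $\mathbf{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}$.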
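$\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathbb{R}^{m \times d}$, $\mathbf{w} \in \mathbb{R}^{d \times 1}$, $\mathbf{y} \in \mathbb{R}^{m \times 1}$
1. Square or even-determined system: $m = d$
- Equal number of equations and unknowns, i.e., $\mathbf{X} \in \mathbb{R}^{d \times d}$
- One unique solution if $\mathbf{X}$ is invertible or all rows/columns of $\mathbf{X}$ are
linearly independent
- If all rows or columns of $\mathbf{X}$ are linearly independent, then $\mathbf{X}$ is
invertible.
Solution:
If $\mathbf{X}$ is invertible (or $\mathbf{X}^{-1}\mathbf{X} = \mathbf{I}$), then pre-multiply both sides by $\mathbf{X}^{-1}$
$\mathbf{X}^{-1}\mathbf{X}\mathbf{w} = \mathbf{X}^{-1}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^{-1}\mathbf{y}$
(Note: we use a hat on top of $\mathbf{w}$ to indicate that it is a specific point in the space of $\mathbf{w}$)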
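Example 1
$w_1 + w_2 = 4$ (1)
$w_1 - 2w_2 = 1$ (2)
Two unknowns, two equations
$\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 4 \\ 1 \end{bmatrix}}_{\mathbf{y}}$
$\hat{\mathbf{w}} = \mathbf{X}^{-1}\mathbf{y} = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}^{-1}\begin{bmatrix} 4 \\ 1 \end{bmatrix} = \frac{-1}{3}\begin{bmatrix} -2 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$
Recall: $\mathbf{A}^{-1} = \frac{1}{\det \mathbf{A}} \operatorname{adj}(\mathbf{A})$, $\operatorname{adj}(\mathbf{A}) = \mathbf{C}^T = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$, $\det \mathbf{A} = ad - bc$
Handwritten Python notes (Python demo 3):
    import numpy as np
    from numpy.linalg import inv
    import matplotlib.pyplot as plt
    X = np.array([[1, 1], [1, -2]])
    y = np.array([4, 1])
    w = inv(X).dot(y)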
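Example 1 can be solved numerically in two ways. The sketch below uses the explicit inverse, as in the derivation, and `numpy.linalg.solve`, which solves the square system without forming the inverse; the latter is an added suggestion and not part of the slides.

```python
import numpy as np

# Example 1: w1 + w2 = 4, w1 - 2*w2 = 1.
X = np.array([[1.0, 1.0],
              [1.0, -2.0]])
y = np.array([4.0, 1.0])

# Following the slide: w_hat = X^{-1} y.
w_inv = np.linalg.inv(X) @ y

# Alternative (not from the slides): solve the system directly.
w_solve = np.linalg.solve(X, y)

print(w_inv, w_solve)
```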
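$\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathbb{R}^{m \times d}$, $\mathbf{w} \in \mathbb{R}^{d \times 1}$, $\mathbf{y} \in \mathbb{R}^{m \times 1}$
2. Over-determined system: $m > d$
– More equations than unknowns
– $\mathbf{X}$ is non-square (tall) and hence not invertible
– Has no exact solution in general *
– An approximated solution is available using the left inverse
If the left-inverse of $\mathbf{X}$ exists such that $\mathbf{X}^{\dagger}\mathbf{X} = \mathbf{I}$, then pre-multiplying both
sides by $\mathbf{X}^{\dagger}$ results in
$\mathbf{X}^{\dagger}\mathbf{X}\mathbf{w} = \mathbf{X}^{\dagger}\mathbf{y}$
$\Rightarrow \hat{\mathbf{w}} = \mathbf{X}^{\dagger}\mathbf{y}$
Definition:
A matrix $\mathbf{B}$ that satisfies $\mathbf{B}_{d \times m}\mathbf{A}_{m \times d} = \mathbf{I}$ is called a left-inverse of $\mathbf{A}$.
The left-inverse of $\mathbf{X}$: $\mathbf{X}^{\dagger} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$, given $\mathbf{X}^T\mathbf{X}$ is invertible.
Note: * exception: when rank($\mathbf{X}$) = rank([$\mathbf{X}$, $\mathbf{y}$]), there is a solution.
(Handwritten note: an exact solution exists when rank($\mathbf{X}$) = rank([$\mathbf{X}$, $\mathbf{y}$]); otherwise only an approximate solution is available.)
Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied Linear Algebra”, Cambridge University Press, 2018
(Chp11.1-11.2, 11.5)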
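Example 2
$w_1 + w_2 = 1$ (1)
$w_1 - w_2 = 0$ (2)
$w_1 = 2$ (3)
Two unknowns, three equations
$\underbrace{\begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 1 & 0 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}}_{\mathbf{y}}$
No exact solution
Approximated solution: $\hat{\mathbf{w}} = \mathbf{X}^{\dagger}\mathbf{y} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y} = \begin{bmatrix} 1 \\ 0.5 \end{bmatrix}$
Handwritten Python notes:
    import numpy as np
    from numpy.linalg import inv
    X = np.array([[1, 1], [1, -1], [1, 0]])
    y = np.array([1, 0, 2])
    w = inv(X.T @ X) @ X.T @ y
(Handwritten note: use np.vstack with X.T and y to form [X, y].)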
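The rank test and the left-inverse solution for Example 2 can be sketched as follows. The comparison against `numpy.linalg.lstsq` is an added cross-check, not something the slides state; for a tall matrix with linearly independent columns the two are expected to agree.

```python
import numpy as np
from numpy.linalg import inv, matrix_rank

# Example 2: three equations, two unknowns.
X = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [1.0, 0.0]])
y = np.array([1.0, 0.0, 2.0])

# Rank test: an exact solution exists only if rank(X) == rank([X, y]).
rank_X = matrix_rank(X)
rank_Xy = matrix_rank(np.column_stack([X, y]))

# Left-inverse (least-squares) solution: w_hat = (X^T X)^{-1} X^T y.
w_hat = inv(X.T @ X) @ X.T @ y

# Cross-check with NumPy's least-squares solver (an added comparison).
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# The residual is non-zero because no exact solution exists.
residual = X @ w_hat - y
print(rank_X, rank_Xy, w_hat, residual)
```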
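3. Under-determined system: $m < d$
$\mathbf{X}\mathbf{w} = \mathbf{y}$, $\mathbf{X} \in \mathbb{R}^{m \times d}$, $\mathbf{w} \in \mathbb{R}^{d \times 1}$, $\mathbf{y} \in \mathbb{R}^{m \times 1}$
– Fewer equations than unknowns
– $\mathbf{X}$ is non-square (wide) and hence not invertible
– Infinite number of solutions in general
– A unique constrained solution is available using the right-inverse
Derivation:
Constrain $\mathbf{w}$ to the form $\mathbf{w} = \mathbf{X}^T\mathbf{a}$ for some $\mathbf{a} \in \mathbb{R}^{m \times 1}$. Then $\mathbf{X}\mathbf{X}^T\mathbf{a} = \mathbf{y}$, so $\hat{\mathbf{a}} = (\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y}$ given $\mathbf{X}\mathbf{X}^T$ is invertible, and
$\hat{\mathbf{w}} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y} = \mathbf{X}^{\dagger}\mathbf{y}$
where $\mathbf{X}^{\dagger} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}$ is the right-inverse of $\mathbf{X}$, i.e. $\mathbf{X}\mathbf{X}^{\dagger} = \mathbf{I}$.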
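Example 3
$w_1 + 2w_2 + 3w_3 = 2$ (1)
$w_1 - 2w_2 + 3w_3 = 1$ (2)
Three unknowns, two equations
$\underbrace{\begin{bmatrix} 1 & 2 & 3 \\ 1 & -2 & 3 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 2 \\ 1 \end{bmatrix}}_{\mathbf{y}}$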
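A sketch of the right-inverse solution for Example 3, using the formula from the Summary, $\hat{\mathbf{w}} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y}$. The comparison with `numpy.linalg.pinv` is an added cross-check and not part of the slides.

```python
import numpy as np
from numpy.linalg import inv

# Example 3: two equations, three unknowns.
X = np.array([[1.0, 2.0, 3.0],
              [1.0, -2.0, 3.0]])
y = np.array([2.0, 1.0])

# Right-inverse solution: w_hat = X^T (X X^T)^{-1} y.
w_hat = X.T @ inv(X @ X.T) @ y

# Cross-check with the pseudo-inverse (an added comparison).
w_pinv = np.linalg.pinv(X) @ y

print(w_hat)
```

Unlike the over-determined case, this solution satisfies both equations exactly; it is one particular choice among infinitely many solutions.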
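$\underbrace{\begin{bmatrix} 1 & 2 & 3 \\ 3 & 6 & 9 \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}}_{\mathbf{w}} = \underbrace{\begin{bmatrix} 2 \\ 1 \end{bmatrix}}_{\mathbf{y}}$
Here the second row of $\mathbf{X}$ is 3 times the first row, but $1 \neq 3 \cdot 2$, so the two equations are inconsistent and the system has no solution; $\mathbf{X}\mathbf{X}^T$ is not invertible.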
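A short numerical check of this system, as a sketch: the rank test from the over-determined slide is applied here too, and the determinant of $\mathbf{X}\mathbf{X}^T$ is compared against zero with a small tolerance (the tolerance value is an assumption).

```python
import numpy as np
from numpy.linalg import matrix_rank

X = np.array([[1.0, 2.0, 3.0],
              [3.0, 6.0, 9.0]])  # second row = 3 * first row
y = np.array([2.0, 1.0])

rank_X = matrix_rank(X)
rank_Xy = matrix_rank(np.column_stack([X, y]))

# X X^T is singular, so the right-inverse formula cannot be applied.
det_XXt = np.linalg.det(X @ X.T)
is_singular = abs(det_XXt) < 1e-9  # tolerance chosen for this sketch

print(rank_X, rank_Xy, is_singular)
```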
Notations: Set
• A set is an unordered collection of unique elements
– Denoted as a calligraphic capital character e.g., $\mathcal{S}$, $\mathbb{R}$, $\mathbb{N}$ etc
– When an element $x$ belongs to a set $\mathcal{S}$, we write $x \in \mathcal{S}$
• A set of numbers can be finite - include a fixed amount of values
– Denoted using accolades, e.g. {1, 3, 18, 23, 235} or $\{x_1, x_2, x_3, x_4, \dots, x_d\}$
• A set can be infinite and include all values in some interval
– If a set of real numbers includes all values between a and b, including a and
b, it is denoted using square brackets as [a, b]
– If the set does not include the values a and b, it is denoted using
parentheses as (a, b)
• Examples:
– The special set denoted by $\mathbb{R}$ includes all real numbers from minus infinity
to plus infinity
– The set [0, 1] includes values like 0, 0.0001, 0.25, 0.9995, and 1.0
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019 (p4 of chp2).
Notations: Set operations
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019 (p4 of chp2).
Functions
• A function is a relation that associates each element $x$ of a set $\mathcal{X}$,
the domain of the function, to a single element $y$ of another set $\mathcal{Y}$,
the codomain of the function
• If the function is called $f$, this relation is denoted $y = f(x)$
– The element $x$ is the argument or input of the function
– $y$ is the value of the function or the output
• The symbol used for representing the input is the variable of the
function
– $f(x)$: $f$ is a function of the variable $x$; $f(x, w)$: $f$ is a function of the variables $x$ and $w$
[Figure: arrows from the domain $\mathcal{X}$ = {1,2,3,4} to the codomain $\mathcal{Y}$ = {1,2,3,4,5,6}; the range (or image) is {3,4,5,6}.]
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019 (p6 of chp2).
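The distinction between codomain and range can be illustrated with plain Python sets. The mapping f(x) = x + 2 is an assumption made for this sketch; it is merely one function consistent with the domain, codomain and range listed above.

```python
# Domain and codomain as listed on the slide.
domain = {1, 2, 3, 4}
codomain = {1, 2, 3, 4, 5, 6}

def f(x):
    """Assumed mapping for illustration: each x goes to x + 2."""
    return x + 2

# The range (image) is the set of values actually reached by f.
image = {f(x) for x in domain}
print(image)
```

The range is a subset of the codomain: the values 1 and 2 belong to the codomain but are never produced by this mapping.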
Functions
Ref: [Book1] Andriy Burkov, “The Hundred-Page Machine Learning Book”, 2019 (p7 of chp2).
Functions
• The notation $f: \mathbb{R}^d \to \mathbb{R}$ means that $f$ is a function that
maps real $d$-vectors (the input vectors) to real numbers (a real scalar)
Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied Linear Algebra”, 2018 (Ch 2, p29)
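The inner product function
• Suppose $\mathbf{a}$ is a d-vector. We can define a scalar valued function $f$ of d-
vectors, given by
$f(\mathbf{x}) = \mathbf{a}^T\mathbf{x} = a_1 x_1 + a_2 x_2 + \dots + a_d x_d$ (1)
for any d-vector $\mathbf{x}$
• The inner product of its d-vector argument $\mathbf{x}$ with some (fixed) d-vector $\mathbf{a}$
• We can also think of $f$ as forming a weighted sum of the elements of $\mathbf{x}$;
the elements of $\mathbf{a}$ give the weights
(Handwritten note: in ML, $\mathbf{a}$ is treated as the weight vector and $\mathbf{x}$ is the data.)
[Figure: vectors $\mathbf{a}$ and $\mathbf{x}$ separated by the angle $\theta$.]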
Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied Linear Algebra”, Cambridge University Press, 2018 (p30)
Linear Functions
• Homogeneity
• For any d-vector $\mathbf{x}$ and any scalar $\alpha$, $f(\alpha\mathbf{x}) = \alpha f(\mathbf{x})$
• Scaling the (vector) argument is the same as scaling the
function value
• Additivity
• For any d-vectors $\mathbf{x}$ and $\mathbf{y}$, $f(\mathbf{x} + \mathbf{y}) = f(\mathbf{x}) + f(\mathbf{y})$
• Adding (vector) arguments is the same as adding the function
values
Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied Linear Algebra”, Cambridge University Press, 2018 (p31)
Superposition and linearity
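• The inner product function $f(\mathbf{x}) = \mathbf{a}^T\mathbf{x}$ defined in equation (1)
satisfies the property
$f(\alpha\mathbf{x} + \beta\mathbf{y}) = \mathbf{a}^T(\alpha\mathbf{x} + \beta\mathbf{y})$
$= \mathbf{a}^T(\alpha\mathbf{x}) + \mathbf{a}^T(\beta\mathbf{y})$
$= \alpha(\mathbf{a}^T\mathbf{x}) + \beta(\mathbf{a}^T\mathbf{y})$
$= \alpha f(\mathbf{x}) + \beta f(\mathbf{y})$
for all d-vectors $\mathbf{x}$, $\mathbf{y}$, and all scalars $\alpha$, $\beta$.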
Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied Linear Algebra”, Cambridge University Press, 2018 (p30)
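The superposition property of the inner product function can be checked numerically. This is a sketch only: the dimension, the random seed and the scalars are assumptions chosen for illustration, and a numerical check on one pair of vectors is not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
d = 4                           # assumed dimension

a = rng.normal(size=d)          # fixed weight vector

def f(x):
    """Inner product function f(x) = a^T x."""
    return a @ x

x = rng.normal(size=d)
y = rng.normal(size=d)
alpha, beta = 2.5, -1.5         # arbitrary scalars

lhs = f(alpha * x + beta * y)
rhs = alpha * f(x) + beta * f(y)
print(lhs, rhs)
```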
Example:
$f(\mathbf{x}) = 2.3 - 2x_1 + 1.3x_2 - x_3$
Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied Linear Algebra”, Cambridge University Press, 2018 (p32)
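The example function has a constant offset of 2.3, so it can be written as $\mathbf{a}^T\mathbf{x} + b$ with $\mathbf{a} = (-2, 1.3, -1)$ and $b = 2.3$; [Book4] calls such a function affine. The sketch below, with test vectors assumed for illustration, shows that additivity fails because of the offset, while superposition still holds when the two scalars sum to 1.

```python
import numpy as np

a = np.array([-2.0, 1.3, -1.0])
b = 2.3

def f(x):
    """f(x) = 2.3 - 2*x1 + 1.3*x2 - x3, written as a^T x + b."""
    return a @ x + b

# Illustrative test vectors (assumed values).
x = np.array([1.0, 0.0, 2.0])
y = np.array([0.0, 1.0, 1.0])

# Additivity fails: f(x + y) differs from f(x) + f(y) by the offset b.
additivity_gap = f(x + y) - (f(x) + f(y))

# Restricted superposition: holds when alpha + beta = 1.
alpha, beta = 0.3, 0.7
lhs = f(alpha * x + beta * y)
rhs = alpha * f(x) + beta * f(y)
print(additivity_gap, lhs, rhs)
```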
Functions
Ref: [Book4] Stephen Boyd and Lieven Vandenberghe, “Introduction to Applied Linear Algebra”, Cambridge University Press, 2018 (p33)
Summary
• Operations on Vectors and Matrices
• Dot-product, matrix inverse
• Systems of Linear Equations $\mathbf{X}\mathbf{w} = \mathbf{y}$
• Matrix-vector notation, linear dependency, invertible
• Even-, over-, under-determined linear systems
• Set and Functions
Assignment 1 (week 6)
Tutorial 4
$\mathbf{X}$ is | System | Size | Solutions | Formula
Square | Even-determined | $m = d$ | One unique solution in general | $\hat{\mathbf{w}} = \mathbf{X}^{-1}\mathbf{y}$
Tall | Over-determined | $m > d$ | No exact solution in general; an approximated solution | $\hat{\mathbf{w}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ (left-inverse)
Wide | Under-determined | $m < d$ | Infinite number of solutions in general; unique constrained solution | $\hat{\mathbf{w}} = \mathbf{X}^T(\mathbf{X}\mathbf{X}^T)^{-1}\mathbf{y}$ (right-inverse)