
Financial Mathematics and Principles (4FNCE001C): lecture notes


Alessandro Saccal∗

Westminster International University in Tashkent

Original version: Spring 2025


Current version: January 30, 2025

∗ [email protected]. ©2025 Copyright Alessandro Saccal.
Foreword

The module team comprises: Alessandro Saccal (module co-leader, [email protected]); Rustam Suleymanov (module co-leader, [email protected]); Nilufar Rashitova ([email protected]); Otajon Otabekov ([email protected]); Muhammad Ateeq-Ur-Rehman ([email protected]).
The grade is split between the midterm exam mark and the final exam mark, worth 40% and 60% respectively; both are closed-book examinations. There are no prescribed textbooks for this module, as the present lecture notes are self-contained; useful references are nevertheless "Fundamental Methods of Mathematical Economics" by Alpha Chiang and Kevin Wainwright and "An Undergraduate Introduction to Financial Mathematics" by Robert Buchanan. All content is accompanied by seminar questions stressing financial applications, which should be worked through, together with these lecture notes, before the respective teaching week in order to achieve proficiency and perfect exam technique.

Timeline

Week 1. Mathematical logic and set theory
Week 2. Vectors and matrices
Week 3. Lengths, systems and eigenvalues
Week 4. Calculus of variations
Week 5. Optimisation
Week 6. Probability theory
Week 7. Midterm exam (no classes)
Week 8. Difference equations
Week 9. Ordinary differential equations
Week 10. Partial differential equations
Week 11. Exponents and logarithms
Week 12. Class revision
Week 13. Individual revision (no classes)
Weeks 14-15. Final exam (no classes)
1. Mathematical logic and set theory

The following symbols and concepts are employed hereunder. Universal quantifier (FOR ALL, FOR ANY): ∀. Existential quantifier (THERE EXISTS): ∃. Exclusive existential quantifier (THERE ONLY EXISTS): ∃!. Naturals: N = {0, 1, 2, . . .}. Positive naturals: N+ = {1, 2, . . .}. Integers: Z = {0, 1, −1, 2, −2, . . .}. Rationals: Q = {x ∈ Q : x = a/b, ∀a, b ∈ Z, b ≠ 0}. Irrationals: Q′ = {x ∈ Q′ : x ≠ a/b, ∀a, b ∈ Z, b ≠ 0}. Reals: R = {x ∈ R : n(R) > n(N+), ∀R = Q ∪ Q′}. Complex numbers: C = {z ∈ C : z = a ± bi, ∀a, b ∈ R, i = √−1}.
Definition 1.1 (Element) x is an element of set X. Formally:

x ↔ X.
Definition 1.2 (Negation, complement) Not x is the negation of x ↔ X such that not
not x is x. Formally:

¬x : ¬¬x = x ↔ X.
Complement X ↭ is the complement set of set X such that complement complement X is
X. Formally:
! "↭
X↭ : X↭ = X.
Definition 1.3 (Singleton) Singleton S is a set containing only one element. Formally:

S = {x}.
Notice that reflexive sets are singletons which contain themselves: x = {x}.
Definition 1.4 (Intervals) Number x is contained in a closed interval such that it is no smaller than a ∈ R and no greater than b ∈ R; number x is contained in an open interval such that it is greater than a ∈ R and smaller than b ∈ R; number x is contained in a semi-closed interval such that it is no smaller than a ∈ R and smaller than b ∈ R; number x is contained in a semi-open interval such that it is greater than a ∈ R and no greater than b ∈ R. Formally:

x ∈ [a, b] ⊆ R : a ≤ x ≤ b;
x ∈ (a, b) ⊆ R : a < x < b;
x ∈ [a, b) ⊆ R : a ≤ x < b;
x ∈ (a, b] ⊆ R : a < x ≤ b.

Definition 1.5 (Proper subset) Set Y is a proper subset of set X such that Y contains some elements of X. Formally:

Y ⊂ X : Y = {x1, . . . , xm}, X = {x1, . . . , xn}, ∀n > m ∈ N+.

Notice that set X is a proper superset of set Y. Notice this: N ⊂ Z ⊂ Q ⊂ R ⊂ C.
Definition 1.6 (Subset) Set Y is a subset of set X such that Y contains at least one element of X. Formally:

Y ⊆ X : Y = {x1, . . . , xm}, X = {x1, . . . , xn}, ∀n ≥ m ∈ N+.

Notice that set X is a superset of set Y. Notice this: X ⊆ X.

Definition 1.7 (Set difference) The difference between set X and set Y is such that their common elements are suitably subtracted. Formally:

X\Y = {. . . , xn} : Y = {x1, . . . , xm}, X = {x1, . . . , xn}, ∀n > m ∈ N+.

Definition 1.8 (Cardinality) Cardinality n ∈ N+ of set X is the number of elements contained in X. Formally:

n(X) = n, ∀X = {x1, . . . , xn}, n ∈ N+.

Notice that cardinality n(X) = 0 for empty set X = {}.

Definition 1.9 (Power set) Power set P(X) is the set of all subsets of set X whereby X is countable. Formally:

n[P(X)] = 2^n, ∀X = {x1, . . . , xn}, n ∈ [1, ∞] ⊂ N+.

Notice this: ∀X = {x1, x2}, P(X) = {{x1}, {x2}, {x1, x2}, {}} = {{x1}, {x2}, X, ∅}. Notice this: ∀X = {}, n[P(X)] = 2^0 = 1 such that P(X) = {{}} = {∅}.

Definition 1.10 (Empty set) Empty set ∅ is a set containing no elements. Formally:

∅ = {}.

Notice that empty set ∅ is a subset of any set X, as both contain no elements at least: ∅ ⊆ X. Disjoint sets are sets whose intersection equals the empty set: ⋂_{i=1}^n Xi = ∅.
Definition 1.11 (Ur-element) u ∈ U is an ur-element such that it cannot be a set. Formally:

u ∈ U : u ≠ {x1, . . . , xn}, ∀n ∈ N+.

Definition 1.12 (Conjunction, intersection) The conjunction (AND) of n elements ei ∈ X, ∀i = 1, . . . , n, is such that all elements are true. Formally: ∀i ∈ [1, n] ⊂ N+,

⋀_{i=1}^n ei = T, ∀ei = T.

The intersection of n sets Xi, ∀i = 1, . . . , n, is such that any common element x ∈ Xi, ∀i = 1, . . . , n, is contained therein. Formally:

x ∈ ⋂_{i=1}^n Xi : x ∈ Xi, ∀i = 1, . . . , n.
Notice this: A ⊂ B ⊢ A ∩ B = A, whereby A ⊂ B = A → B.

Definition 1.13 (Inclusive disjunction, inclusive union) The inclusive disjunction (OR) of n elements ei ∈ X, ∀i = 1, . . . , n, is such that at least one element is true. Formally: ∀i ∈ [1, n] ⊂ N+,

⋁_{i=1}^n ei = T, ∃ei = T.

The inclusive union of n sets Xi, ∀i = 1, . . . , n, is such that any common or non-common element x ∈ Xi, ∃i = 1, . . . , n, is contained therein. Formally:

x ∈ ⋃_{i=1}^n Xi : x ∈ Xi, ∃i = 1, . . . , n.

Notice this: A ⊂ B ⊢ A ∪ B = B, whereby A ⊂ B = A → B.


Definition 1.14 (Exclusive disjunction, exclusive union) The exclusive disjunction (XOR) of n elements ei ∈ X, ∀i = 1, . . . , n, is such that only one element is true. Formally: ∀i ∈ [1, n] ⊂ N+,

⋁_{i=1}^n ei = T, ∃!ei = T.

The exclusive union of n sets Xi, ∀i = 1, . . . , n, is such that any non-common element y ∈ Xi, ∃!i = 1, . . . , n, is contained therein. Formally:

y ∈ ⋃_{i=1}^n Xi : y ∈ Xi, ∃!i = 1, . . . , n.

Proposition 1.15 (De Morgan's laws) De Morgan's first law is such that the negation of A AND B equals ¬A OR ¬B; De Morgan's second law is such that the negation of A OR B equals ¬A AND ¬B. Formally:

¬(A ∧ B) = ¬A ∨ ¬B;
¬(A ∨ B) = ¬A ∧ ¬B.

Proof. Notice that: AB | ¬(A ∧ B) | ¬A ∨ ¬B = TT|F|F, TF|T|T, FT|T|T, FF|T|T; AB | ¬(A ∨ B) | ¬A ∧ ¬B = TT|F|F, TF|F|F, FT|F|F, FF|T|T. QED

In general, by mathematical induction¹, for n elements ei ∈ X, ∀i = 1, . . . , n, De Morgan's laws are ¬(⋀_{i=1}^n ei) = ⋁_{i=1}^n ¬ei and ¬(⋁_{i=1}^n ei) = ⋀_{i=1}^n ¬ei. For sets A and B De Morgan's laws are (A ∩ B)′ = A′ ∪ B′ and (A ∪ B)′ = A′ ∩ B′; in general, by mathematical induction, for n sets Xi, ∀i = 1, . . . , n, they are (⋂_{i=1}^n Xi)′ = ⋃_{i=1}^n Xi′ and (⋃_{i=1}^n Xi)′ = ⋂_{i=1}^n Xi′.

¹ Mathematical induction is such that, ∀n ∈ N, {P(0) ∧ [P(n) → P(n + 1)]} → P(N).
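These laws can also be verified mechanically. The following minimal Python sketch (an illustration added here, not part of the original notes) enumerates every truth assignment of n propositions and checks the generalised laws.

from itertools import product

def de_morgan_holds(n):
    """Check that NOT(e1 AND ... AND en) = (NOT e1) OR ... OR (NOT en), and its dual."""
    for values in product([True, False], repeat=n):
        first = (not all(values)) == any(not v for v in values)
        second = (not any(values)) == all(not v for v in values)
        if not (first and second):
            return False
    return True

print(all(de_morgan_holds(n) for n in range(1, 6)))  # True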
Definition 1.16 (Syntactic implication) A syntactic implication is such that: A is sufficient for B (IF A THEN B); B is necessary for A (A ONLY IF B); A AND ¬B are not; ¬A OR B are. Formally:

A → B = ¬(A ∧ ¬B) = ¬A ∨ B.
A is called antecedent, premiss, hypothesis or protasis; B is called consequent, conclusion, thesis or apodosis. Notice that AB | ¬A ∨ B = TT|T, TF|F, FT|T, FF|T: it is true (valid, dialectical, logical) even if A is false (vacuous proof) or as long as B may be true (trivial proof); for true A and B it is called a semantic implication (sound, apodictic, ontological), formalised as A ⇒ B. A direct proof is such that A syntactically entails B: A ⊢ B = A → B; for a true A a direct proof is such that A semantically entails B, formalised as A ⊨ B = A ⇒ B.
Definition 1.17 (Contraposition) A contraposition is such that: ¬B is sufficient for ¬A (IF ¬B THEN ¬A); ¬A is necessary for ¬B (¬B ONLY IF ¬A); ¬B AND A are not; B OR ¬A are. Formally:

¬B → ¬A = ¬(¬B ∧ ¬¬A) = B ∨ ¬A.

Notice this: ¬B → ¬A = B ∨ ¬A = ¬A ∨ B = A → B. An indirect proof is such that ¬B syntactically entails ¬A: ¬B ⊢ ¬A = ¬B → ¬A; for a true ¬B an indirect proof is such that ¬B semantically entails ¬A, formalised as ¬B ⊨ ¬A = ¬B ⇒ ¬A.
Definition 1.18 (Converse) A converse is such that: B is sufficient for A (IF B THEN A); A is necessary for B (B ONLY IF A); B AND ¬A are not; ¬B OR A are. Formally:

B → A = ¬(B ∧ ¬A) = ¬B ∨ A.

Notice this: B → A = ¬B ∨ A ≠ ¬A ∨ B = A → B.
Definition 1.19 (Inverse) An inverse is such that: ¬A is sufficient for ¬B (IF ¬A THEN ¬B); ¬B is necessary for ¬A (¬A ONLY IF ¬B); ¬A AND B are not; A OR ¬B are. Formally:

¬A → ¬B = ¬(¬A ∧ ¬¬B) = A ∨ ¬B.

Notice this: ¬A → ¬B = A ∨ ¬B = ¬B ∨ A = B → A ≠ A → B, namely the inverse is equivalent to the converse, not to the original implication.
Definition 1.20 (Syntactic denial) A denial is such that: A is not sufficient for B (NOT: IF A THEN B); B is not necessary for A (NOT: A ONLY IF B); A AND ¬B are; ¬A OR B are not. Formally:

A ↛ B = A ∧ ¬B = ¬(¬A ∨ B).

A proof by contradiction is such that syntactic denial A ∧ ¬B syntactically entails A or not B: A ∧ ¬B ⊢ ¬A ∨ B. The denial of a semantic implication is a semantic denial: A ⇏ B.

Definition 1.21 (Double implication) A double implication is such that: A is necessary and sufficient for B (A IF AND ONLY IF B); ¬A OR B AND ¬B OR A are (NXOR, exclusive non-disjunction). Formally:

A ⟺ B = (A → B) ∧ (B → A) = (¬A ∨ B) ∧ (¬B ∨ A).

Notice that for true A and B it is called a semantic double implication, formalised as A ⇔ B. Notice this: A ⟺ B = ¬(A ⊕ B) ≠ A ∧ B, as AB | A ⟺ B | ¬(A ⊕ B) = TT|T|T, TF|F|F, FT|F|F, FF|T|T.
Definition 1.22 (Function) Function f is such that domain X is transformed into co-domain Y. Formally:

f : X → Y, f(x) = y ∈ Y.

A function is also called a transformation, map, mapping, operator, functional or morphism. An injection is such that for any y ∈ Y there exists at most one x ∈ X: ∀y ∈ Y, ∃ at most one x ∈ X. A surjection is such that for any y ∈ Y there exists at least one x ∈ X: ∀y ∈ Y, ∃x ∈ X. A bijection is such that for any y ∈ Y there exists only one x ∈ X, whereby it is invertible: ∀y ∈ Y, ∃!x ∈ X, whereby f^{−1}(y) = x ∈ X.

A function composition is such that function g ∘ f : X → Z, whereby function g ∘ f(x) = g[f(x)] = g(y) = z ∈ Z, ∀f : X → Y and g : Y → Z. A function is partially commutative, associative, partially distributive, inverse and identitarian: ∀f, g, h,

f + g = g + f, f ∘ g ≠ g ∘ f;
(f + g) + h = f + (g + h), (f ∘ g) ∘ h = f ∘ (g ∘ h);
f ∘ (g + h) ≠ f ∘ g + f ∘ h, (g + h) ∘ f = g ∘ f + h ∘ f;
f + (−f) = 0, f ∘ f^{−1} = 1, ∀f^{−1};
f + 0 = f, f ∘ 1 = f.

Notice that the function f : N+ → R is injective, as real numbers are uncountable and the function is not such that for any real number there exists at least one positive natural number.
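As a numerical aside (not from the notes; the functions f, g, h are arbitrary illustrations), the sketch below checks the definition of g ∘ f, its non-commutativity and the one-sided distributivity over pointwise addition at a sample point.

def f(x): return x + 1
def g(x): return 2 * x
def h(x): return x ** 2

compose = lambda outer, inner: (lambda x: outer(inner(x)))   # g ∘ f

x = 3.0
print(compose(g, f)(x) == g(f(x)))                 # True: definition of g ∘ f
print(compose(f, g)(x) == compose(g, f)(x))        # False: composition does not commute
print(compose(lambda t: g(t) + h(t), f)(x)
      == compose(g, f)(x) + compose(h, f)(x))      # True: (g + h) ∘ f = g ∘ f + h ∘ f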
Definition 1.23 (Ordered pair, Cartesian product) An ordered pair (x, y) = (x′, y′) is such that x = x′ and y = y′; a Cartesian product contains all combinations of ordered pairs. Formally:

(x, y) = (x′, y′) : x = x′, y = y′;
X × Y = {(x, y) : x ∈ X, y ∈ Y}.

Exercises (optional)
1. Show that A ⊂ B ⊢ A ∩ B = A.
2. Show that A ⊂ B ⊢ A ∪ B = B.
3. Prove De Morgan's laws such that ¬(⋀_{i=1}^n ei) = ⋁_{i=1}^n ¬ei and ¬(⋁_{i=1}^n ei) = ⋀_{i=1}^n ¬ei.
4. Prove De Morgan's laws such that (A ∩ B)′ = A′ ∪ B′ and (A ∪ B)′ = A′ ∩ B′.
5. Prove De Morgan's laws such that (⋂_{i=1}^n Xi)′ = ⋃_{i=1}^n Xi′ and (⋃_{i=1}^n Xi)′ = ⋂_{i=1}^n Xi′.
6. Prove that ∀n ∈ N, {P(0) ∧ [P(n) → P(n + 1)]} → P(N).
7. Prove the properties of a function.
8. Show that f : N+ → R is injective.
2. Vectors and matrices

Definition 2.1 (Vector) A row vector x is a function f : C^n → C^n collecting elements in columns; a column vector x^⊤ is its transpose. Formally: ∀n ∈ N+, x ∈ C^{1×n}, x^⊤ ∈ C^{n×1},

x = [x1 · · · xn]; x^⊤ = [x1 · · · xn]^⊤.

Notice that vectors are sometimes represented as underlined or in bold. Notice this: (x^⊤)^⊤ = x.
Definition 2.2 (Scalar) A scalar is a number k ∈ C which multiplies each element of vector x or x^⊤. Formally: k ∈ C such that, ∀n ∈ N+, x ∈ C^{1×n}, x^⊤ ∈ C^{n×1},

kx = k[x1 · · · xn] = [kx1 · · · kxn];
kx^⊤ = k[x1 · · · xn]^⊤ = [kx1 · · · kxn]^⊤.

Notice this: kx = xk; kx^⊤ = x^⊤k.


Definition 2.3 (Matrix) Matrix A is a function f : C^{m×n} → C^{m×n} collecting column or row vectors. Formally: ∀m, n ∈ N+, A ∈ C^{m×n} such that

A = [a11 · · · a1n; . . . ; am1 · · · amn].

Notice that the element at the ith row and jth column is Aij = aij, ∀i, j ≤ m, n ∈ N+; the matrix is thus also denoted as A = (Aij). Matrix A is square if and only if rows and columns m = n; it is tall rectangular if and only if m > n; it is long rectangular if and only if m < n. The vector operator f : C^{m×n} → C^{mn×1} for matrix A = (Aij) is such that vec(A) = [A11 · · · Amn]^⊤.

Definition 2.4 (Matrix transpose) Matrix transpose A^⊤ is the interchange of its rows and columns. Formally: ∀m, n ∈ N+, A = (Aij) ∈ C^{m×n}, A^⊤ = (Aji) ∈ C^{n×m} such that

A^⊤ = [a11 · · · am1; . . . ; a1n · · · amn].

The transpose function is f : C^{m×n} → C^{n×m}. For matrices A, B ∈ C^{m×n} and scalar k ∈ C notice the following properties:

(A^⊤)^⊤ = A (double transpose);
(AB)^⊤ = B^⊤A^⊤ (transpose multiplication);
(kA)^⊤ = kA^⊤ (transpose scalar multiplication).

Definition 2.5 (Null matrix) Null matrix N is a matrix composed of zero elements alone. Formally: ∀m, n ∈ N+, N ∈ C^{m×n} such that

N = [0 · · · 0; . . . ; 0 · · · 0].

Definition 2.6 (Identity matrix) Identity matrix In is a square matrix composed of ones along its main diagonal and of zeros in its off diagonals. Formally: ∀n ∈ N+, In ∈ C^{n×n} such that

In = [1 · · · 0; . . . ; 0 · · · 1].

Notice this: ∀n ∈ N+, A, In ∈ C^{n×n}, AIn = InA = A; ∀m, n ∈ N+, A ∈ C^{m×n}, Im ∈ C^{m×m}, In ∈ C^{n×n}, AIn = ImA = A.
Definition 2.7 (Diagonal matrix) Diagonal matrix D is a square matrix composed of complex numbers along its main diagonal and of zeros in its off diagonals. Formally: ∀n ∈ N+, D ∈ C^{n×n}, dii ∈ C and dij = 0 for i ≠ j such that

D = [d11 · · · 0; . . . ; 0 · · · dnn].

Notice that main diagonal element diag(D)i = dii. Notice that diagonal matrix D = D^⊤ (symmetric). Notice that a square null matrix is diagonal, but not the converse.

Definition 2.8 (Triangular matrix) Lower triangular matrix L is a square matrix composed of complex numbers below or on its main diagonal and of zeros above it; upper triangular matrix U is a square matrix composed of complex numbers above or on its main diagonal and of zeros below it. Formally: ∀n ∈ N+, L, U ∈ C^{n×n}, l_{i≥j} ∈ C, l_{i<j} = 0, u_{i>j} = 0 and u_{i≤j} ∈ C such that

L = [l11 · · · 0; . . . ; ln1 · · · lnn]; U = [u11 · · · u1n; . . . ; 0 · · · unn].
Definition 2.9 (Jacobian matrix) Jacobian matrix J is a matrix whose elements are the partial derivatives of multivariate functions fi : R^n → R, once continuously differentiable, fi ∈ C¹, with respect to independent variable xj, ∀i = 1, . . . , m, j = 1, . . . , n. Formally: ∀i ∈ [1, m] ⊂ N+, j ∈ [1, n] ⊂ N+, fi : R^n → R, fi ∈ C¹, J ∈ R^{m×n} such that

J = [∂f1/∂x1 · · · ∂f1/∂xn; . . . ; ∂fm/∂x1 · · · ∂fm/∂xn].

Notice that element Jij = ∂fi/∂xj.
Definition 2.10 (Hessian matrix) Hessian matrix H is a square matrix whose elements are the second partial derivatives of multivariate function f : R^n → R, twice continuously differentiable, f ∈ C², with respect to independent variables xi and xj, ∀i, j = 1, . . . , n. Formally: ∀i, j ∈ [1, n] ⊂ N+, f : R^n → R, H ∈ R^{n×n} such that

H = [∂²f/∂x1∂x1 · · · ∂²f/∂x1∂xn; . . . ; ∂²f/∂xn∂x1 · · · ∂²f/∂xn∂xn].

Notice that element Hij = ∂²f/∂xi∂xj and that row vector Hi = ∂Ji/∂xj = ∂(∇f)/∂xj for twice continuously differentiable function f ∈ C² and gradient operator ∇f = [fx1 · · · fxn].
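A minimal symbolic sketch (not part of the notes; the example functions are arbitrary) computing a Jacobian and a Hessian with sympy:

import sympy as sp

x1, x2 = sp.symbols('x1 x2')
F = sp.Matrix([x1**2 * x2, sp.exp(x1) + x2])   # F : R^2 -> R^2
f = x1**3 + x1 * x2**2                          # f : R^2 -> R

J = F.jacobian([x1, x2])       # 2x2 matrix of first partial derivatives
H = sp.hessian(f, (x1, x2))    # 2x2 matrix of second partial derivatives

print(J)   # Matrix([[2*x1*x2, x1**2], [exp(x1), 1]])
print(H)   # Matrix([[6*x1, 2*x2], [2*x2, 2*x1]])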

Definition 2.11 (Symmetric matrix) Symmetric matrix S is a square real matrix which equals its transpose. Formally: ∀n ∈ N+, S = (Sij) ∈ R^{n×n} and S^⊤ = (Sji) ∈ R^{n×n} such that

S = S^⊤.

Notice that matrix S is skew symmetric such that S = −S^⊤. Notice this:

diag(S)i ∈ R (symmetric diagonal);
λi ∈ R for S(λ) = S − λIn (symmetric eigenvalues);
⟨vi, vj⟩ = Σ_{k=1}^n vik vjk = 0 for distinct eigenvectors vi, vj, ||vi||_2 = √⟨vi, vi⟩ = 1 after normalisation and bs(S) = {vi}_{i=1}^n ⊂ R^n (symmetric eigenvectors).
Definition 2.12 (Hermitian matrix) Hermitian matrix M is a square complex matrix whose elements are the complex conjugates of its transpose. Formally: ∀n ∈ N+, M = (Mij) ∈ C^{n×n} and conj(M) = (conj(Mij)) ∈ C^{n×n} such that

M^H ≡ conj(M)^⊤ = M.

Notice this: (M^H)^H = M. Notice this:

diag(M)i ∈ R (Hermitian diagonal);
λi ∈ R for M(λ) = M − λIn (Hermitian eigenvalues);
⟨vi, vj⟩ = Σ_{k=1}^n vik conj(vjk) = 0 for distinct eigenvectors vi, vj, ||vi||_2 = √⟨vi, vi⟩ = 1 after normalisation and bs(M) = {vi}_{i=1}^n ⊂ C^n (Hermitian eigenvectors).

Notice that matrix M is anti-Hermitian such that M^H = −M. For matrices A, B ∈ C^{m×n} notice the following properties:

(A + B)^H = A^H + B^H (Hermitian addition);
(AB)^H = B^H A^H (Hermitian multiplication).

Definition 2.13 (Idempotent matrix) Idempotent matrix P is a square matrix which equals its square. Formally: ∀n ∈ N+, P = (Pij) ∈ C^{n×n} such that

P = P².

Definition 2.14 (Positive definite matrix) Positive definite matrix M is a Hermitian matrix such that its quadratic form with any non-zero vector v ∈ C^{n×1}\{0} is positive. Formally: ∀n ∈ N+, M = M^H ∈ C^{n×n}, v ∈ C^{n×1}\{0}, v^H = (conj(vj)) ∈ C^{1×n}\{0},

v^H M v ∈ R++.

Notice that a positive semi-definite (non-negative definite) matrix M is a Hermitian matrix such that its quadratic form with any vector v ∈ C^{n×1} is non-negative: ceteris paribus, v^H M v ∈ R+. Notice that a positive definite matrix is invertible, but not the converse.
Definition 2.15 (Matrix addition and subtraction) p ∈ N+\{1} matrices are summed or subtracted such that they are of the same dimension, matching the respective summands or minuends and subtrahends. Formally: ∀m, n, p ∈ N+, {Ak}_{k=1}^p ⊂ C^{m×n},

Σ_{k=1}^p (±Ak) = A1 ± . . . ± Ap = [a1,11 ± . . . ± ap,11 · · · a1,1n ± . . . ± ap,1n; . . . ; a1,m1 ± . . . ± ap,m1 · · · a1,mn ± . . . ± ap,mn].

Notice this: S = (Sij) = Σ_{k=1}^p Ak = (Σ_{k=1}^p Ak,ij). For matrices A, B, C ∈ C^{m×n} notice the following properties:

A + B = B + A (additive commutativity);
A + (B + C) = (A + B) + C (additive associativity).

Definition 2.16 (Matrix multiplication) Two matrices are multiplied such that they are conformable, matching the columns of the first product matrix with the rows of the second product matrix and multiplying the terms therein. Formally: ∀{m, n, p, q} ⊂ N+, A ∈ C^{m×n}, B ∈ C^{p×q}, n = p,

A × B = A · B = AB = (Σ_{j=1}^n aij bjk) ∈ C^{m×q}, i = 1, . . . , m, k = 1, . . . , q,

namely the (i, k)th element of AB is the sum of the products of the ith row of A with the kth column of B.

Notice this: AB ≠ BA (multiplicative non-commutativity) despite q = m (square)².

² Matrix product AB = BA for (i) diagonal matrices A, B ∈ C^{n×n}, (ii) matrices A = kIn, B = hIn and scalars k, h ∈ C or (iii) matrices A, B as polynomials of the same given matrix.

However, for matrices A, B, C ∈ C^{m×n} and scalars k, h ∈ C notice the following properties:

A(BC) = (AB)C (multiplicative associativity);
A(B + C) = AB + AC (pre-distributivity);
(B + C)A = BA + CA (post-distributivity);
k(hA) = (kh)A and k(AB) = (kA)B = A(kB) (scalar associativity);
k(A + B) = kA + kB and (k + h)A = kA + hA (scalar distributivity).

The commutator of matrices A, B ∈ C^{n×n} is [A, B] = AB − BA. The Hadamard product is such that ∀m, n ∈ N+, A, B, C, 0, 1 ∈ C^{m×n}, k ∈ C, A ⊙ B = (Aij Bij) ∈ C^{m×n}, featuring the following properties:

A ⊙ B = B ⊙ A (commutativity);
A ⊙ (B ⊙ C) = (A ⊙ B) ⊙ C (associativity);
A ⊙ (B + C) = A ⊙ B + A ⊙ C (distributivity);
A ⊙ 1 = 1 ⊙ A = A (identity);
(kA) ⊙ B = A ⊙ (kB) = k(A ⊙ B) (scalar associativity);
A ⊙ 0 = 0 ⊙ A = 0 (nullity).

The Kronecker product is such that, ∀{m, n, p, q} ⊂ N+, A ∈ C^{m×n}, B ∈ C^{p×q}, A ⊗ B = (Aij B) ∈ C^{mp×nq}, featuring the following properties:

A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C (associativity);
A ⊗ (B + C) = A ⊗ B + A ⊗ C (pre-distributivity);
(B + C) ⊗ A = B ⊗ A + C ⊗ A (post-distributivity);
(kA) ⊗ B = A ⊗ (kB) = k(A ⊗ B) (scalar associativity);
A ⊗ 0 = 0 ⊗ A = 0 (nullity);
(A ⊗ B)(C ⊗ D) = AC ⊗ BD (mixed multiplication);
(A ⊗ B)vec(C) = vec(BCA^⊤) (mixed vectorisation);
(A ⊗ B) ⊙ (C ⊗ D) = (A ⊙ C) ⊗ (B ⊙ D) (Hadamard multiplication);
(A ⊗ B)^{−1} = A^{−1} ⊗ B^{−1} and (A ⊗ B)^+ = A^+ ⊗ B^+ (inversion);
(A ⊗ B)^⊤ = A^⊤ ⊗ B^⊤ and conj(A ⊗ B) = conj(A) ⊗ conj(B) (transposition);
det(A ⊗ B) = det(A)^n det(B)^m for A ∈ C^{m×m}, B ∈ C^{n×n} (determination);
A ⊕ B = A ⊗ In + Im ⊗ B for A ∈ C^{m×m}, B ∈ C^{n×n} (addition);
e^{A ⊕ B} = e^A ⊗ e^B (exponentiation);
vec(A ⊗ B) = (In ⊗ Kqm ⊗ Ip)[vec(A) ⊗ vec(B)] for Kqm vec(A) = vec(A^⊤) (vectorisation);
x ⊗ y = vec(xy^⊤) for x ∈ C^m, y ∈ C^n (outer product).

Notice that matrix exponentiation is such that, ∀A ∈ C^{n×n}, e^A = Σ_{k=0}^∞ (1/k!) A^k, whereby matrix B = e^A features matrix logarithm A, which need not be unique, as the matrix exponential is not injective; a unique principal matrix logarithm ln B can nevertheless be defined such that B = e^{ln B}.
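A minimal numerical sketch (not from the notes; the matrices are arbitrary) contrasting the ordinary, Hadamard and Kronecker products and checking the exponentiation identity e^(A ⊕ B) = e^A ⊗ e^B with numpy and scipy:

import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

print(np.allclose(A @ B, B @ A))     # False: the matrix product does not commute
print(np.allclose(A * B, B * A))     # True: the Hadamard product commutes
K = np.kron(A, B)                    # Kronecker product, here 4x4

m, n = A.shape[0], B.shape[0]
kron_sum = np.kron(A, np.eye(n)) + np.kron(np.eye(m), B)            # A ⊕ B
print(np.allclose(expm(kron_sum), np.kron(expm(A), expm(B))))       # True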
Definition 2.17 (Matrix trace) The trace of a square matrix A is the sum of its main diagonal elements. Formally: ∀n ∈ N+, A ∈ C^{n×n},

tr(A) = Σ_{i=1}^n aii.

For matrices A, B ∈ C^{n×n} and scalar k ∈ C notice the following properties:

tr(kA) = k tr(A) (trace scalar distributivity);
tr(AB) = tr(BA) (trace commutativity);
tr(A + B) = tr(A) + tr(B) (trace distributivity).

Definition 2.18 (Matrix determinant) The determinant of a square matrix A equals the sum, across all rows or columns, of the products of its elements in the first column or row and of the respective co-factors; the co-factor of A is the product of (−1)^{i+j} and of the determinant of the minor of A, which is the matrix generated by the exclusion of row i and column j. Formally: ∀n ∈ N+, A ∈ C^{n×n}, cij = (−1)^{i+j} det(mij),

det(A) = Σ_{i=1}^n ai1 ci1 = Σ_{j=1}^n a1j c1j.

Notice this: ∀A = [a b; c d], det(A) = ad − bc, as

a(−1)^{1+1} det(d) + c(−1)^{2+1} det(b) = a(−1)^{1+1} det(d) + b(−1)^{1+2} det(c) = ad − cb = ad − bc;

∀A = [a b c; d e f; g h i], det(A) = (aei + dhc + bfg) − (ceg + bdi + fha), as

a(−1)^{1+1} det([e f; h i]) + d(−1)^{2+1} det([b c; h i]) + g(−1)^{3+1} det([b c; e f]) =
a(−1)^{1+1} det([e f; h i]) + b(−1)^{1+2} det([d f; g i]) + c(−1)^{1+3} det([d e; g h]) =
a(ei − fh) − d(bi − ch) + g(bf − ce) = a(ei − fh) − b(di − fg) + c(dh − eg) =
= (aei + dhc + bfg) − (ceg + bdi + fha).

For matrices A, B ∈ C^{n×n} and scalar k ∈ C notice the following properties:

det(AB) = det(A)det(B) (determinant product);
det(A) = 0 for Ai,j = 0 (zero determinant);
det(A) = Π_{i=1}^n aii for A ∈ {L, U} (triangular determinant);
det(A) = k det(B) for A = [kBi,j B¬i,¬j] (scalar determinant).
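A minimal numerical check (not from the notes; the matrix is arbitrary) of the co-factor (Laplace) expansion against numpy's determinant:

import numpy as np

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 1.0],
              [5.0, 2.0, 6.0]])

def cofactor_det(M):
    """Laplace expansion along the first row (recursive)."""
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        total += (-1) ** j * M[0, j] * cofactor_det(minor)
    return total

print(np.isclose(cofactor_det(A), np.linalg.det(A)))   # True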

Definition 2.19 (Matrix inverse) The inverse of a square matrix A equals the product of the inverse of its determinant and of its adjunct (classical adjoint, adjugate), which is the transpose of its co-factor matrix. Formally: ∀n ∈ N+, A ∈ C^{n×n}, C = (cij),

A^{−1} = (1/det(A)) adj(A) = (1/det(A)) C^⊤.

An invertible matrix is called non-singular or non-degenerate. Notice that an invertible matrix is square, but not the converse. For square matrices A, B ∈ C^{n×n} and scalar k ∈ C notice the following properties:

(A^{−1})^{−1} = A (double inverse);
(kA)^{−1} = k^{−1} A^{−1} (scalar inverse);
(A^{−1})^⊤ = (A^⊤)^{−1} (transpose inverse);
(AB)^{−1} = B^{−1} A^{−1} (product inverse).

Notice this: A^{−1} = (1/det(A)) adj(A) → AA^{−1} = A^{−1}A = In = (1/det(A)) adj(A) A = (1/det(A)) A adj(A) → det(A) In = adj(A) A = A adj(A).
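A minimal numerical sketch (not from the notes; the matrix is arbitrary) of the inverse via the adjugate, A^{−1} = adj(A)/det(A), for the 2 × 2 case:

import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]           # ad - bc
adj = np.array([[ A[1, 1], -A[0, 1]],
                [-A[1, 0],  A[0, 0]]])                 # transpose of the co-factor matrix

print(np.allclose(adj / det, np.linalg.inv(A)))        # True
print(np.allclose(A @ (adj / det), np.eye(2)))         # True: A A^{-1} = I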
Exercises (optional)
1. Prove that (x^⊤)^⊤ = x.
2. Prove that kx = xk; kx^⊤ = x^⊤k.
3. Prove that (A^⊤)^⊤ = A, (AB)^⊤ = B^⊤A^⊤ and (kA)^⊤ = kA^⊤.
4. Prove that AIn = InA = A and AIn = ImA = A.
5. Prove that diagonal D = D^⊤.
6. Prove that a square null matrix is diagonal, but not the converse.
7. Prove that diag(M)i ∈ R for M = M^H.
8. Prove that a positive definite matrix is invertible, but not the converse.
9. Prove that (A + B)^H = A^H + B^H and (AB)^H = B^H A^H.
10. Prove that A + B = B + A and A + (B + C) = (A + B) + C.
11. Prove that AB ≠ BA despite q = m.
12. Prove that A(BC) = (AB)C, A(B + C) = AB + AC, (B + C)A = BA + CA, k(hA) = (kh)A, k(AB) = (kA)B = A(kB), k(A + B) = kA + kB and (k + h)A = kA + hA.
13. Prove that tr(kA) = k tr(A), tr(AB) = tr(BA) and tr(A + B) = tr(A) + tr(B).
14. Prove that det(AB) = det(A)det(B), det(A) = 0 for Ai,j = 0, det(A) = Π_{i=1}^n aii for A ∈ {L, U} and det(A) = k det(B) for A = [kBi,j B¬i,¬j].
15. Prove the properties of the Hadamard product.
16. Prove the properties of the Kronecker product.
17. Derive the matrix exponential and the matrix logarithm; prove that a unique matrix logarithm is such that A = e^{ln A}.
18. Prove that an invertible matrix is square, but not the converse.
19. Prove that (A^{−1})^{−1} = A, (kA)^{−1} = k^{−1}A^{−1}, (A^{−1})^⊤ = (A^⊤)^{−1} and (AB)^{−1} = B^{−1}A^{−1}.
3. Lengths, systems and eigenvalues

Definition 3.1 (Linear space) A linear space X is such that it is closed under addition and scalar multiplication. Formally: ∀n ∈ N+, x, y, z ∈ X ⊂ C^n, α ∈ C,

z = x + y;
αx ∈ X.

Linear spaces and subspaces are also called vector spaces and subspaces. For linear elements x, y, z ∈ X ⊂ C^n and 0, 1 ∈ C^n and scalars α, β ∈ C notice the following properties:

x + y = y + x (additive commutativity);
x + (y + z) = (x + y) + z (additive associativity);
α(βx) = (αβ)x (scalar multiplicative associativity);
α(x + y) = αx + αy (scalar distributivity);
x + (−x) = −x + x = 0 (additive inversion);
x + 0 = 0 + x = x and x · 1 = 1 · x = x (identity).

Definition 3.2 (Linear subspace) A linear subspace Y ⊆ X is such that Y is also a linear space, whereby, ∀x, y ∈ Y and α, β ∈ C,

αx + βy ∈ Y and 0 ∈ Y.

Notice that linear subspaces {0} ⊂ X and X ⊆ X are trivial linear subspaces of linear space X.
Definition 3.3 (Linear independence) Vectors are linearly independent such that each one cannot be written as a linear combination of the others; linear dependence is its negation. Formally: ∀m ∈ N+, X = {xi}_{i=1}^n ⊂ C^m,

Σ_{i=1}^n ki xi = 0 : {ki}_{i=1}^n = 0;
Σ_{i=1}^n ki xi = 0 : {ki}_{i=1}^n ⊂ C\{0}.

Notice that linear dependence is such that xi = −Σ_{¬i} (k¬i/ki) x¬i → Σ_{i=1}^n ki xi = 0 for {k1, . . . , kn} ⊂ C\{0}. Linear independence is checked by grouping vectors {xi}_{i=1}^n into some matrix A ∈ C^{m×n} and expressing it in reduced row echelon form through Gauss Jordan elimination or row echelon form through Gaussian elimination such that the absence of pivots (presence of zero row vectors) verifies linear dependence.
Reduced row echelon form is such that: (i) the non-zero element (pivot) of each row is
one and is to the right of that in the row above it; (ii) all elements above and below the

17
pivot are zero; (iii) any zero row is at the bottom of the matrix. Row echelon form is such
that: (i) the non-zero element (pivot) of each row is to the right of that in the row above it;
(ii) any zero row is at the bottom of the matrix.
Gauss Jordan elimination proceeds: (i) by dividing the row of the pivot candidate by
the pivot candidate; (ii) adding the product of the pivot row and of the negative of some
element above or below the pivot to that row above or below it; (iii) repeating the first two
steps across rows to yield the reduced row echelon form.
Gaussian elimination can proceed: (i) by dividing the row of the pivot candidate by
the pivot candidate; (ii) adding the product of the pivot row and of the negative of some
element below the pivot to that row below it; (iii) repeating the first two steps across rows
to yield the row echelon form.
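A minimal Gauss Jordan sketch (not part of the notes) reducing a matrix to reduced row echelon form; partial pivoting is added for numerical stability, and a zero row in the output signals linear dependence:

import numpy as np

def rref(A, tol=1e-12):
    M = A.astype(float).copy()
    rows, cols = M.shape
    pivot_row = 0
    for j in range(cols):
        if pivot_row >= rows:
            break
        p = pivot_row + np.argmax(np.abs(M[pivot_row:, j]))   # pivot candidate
        if abs(M[p, j]) < tol:
            continue                                           # no pivot in this column
        M[[pivot_row, p]] = M[[p, pivot_row]]
        M[pivot_row] /= M[pivot_row, j]          # step (i): scale the pivot row
        for i in range(rows):                    # step (ii): eliminate above and below
            if i != pivot_row:
                M[i] -= M[i, j] * M[pivot_row]
        pivot_row += 1
    return M

A = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0]])
print(rref(A))   # one row reduces to zeros: the rows are linearly dependent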
Definition 3.4 (Matrix rank) The rank of a matrix is the maximum number of its linearly independent vectors. Formally: ∀m, n ∈ N+, A ∈ C^{m×n},

rk(A) = maximum number of linearly independent vectors among {Ai,j}_{i,j=1}^{m,n}.

The rank of a matrix is computed by expressing matrix A in reduced row echelon form through Gauss Jordan elimination or row echelon form through Gaussian elimination and counting the non-zero vectors. Notice that the row rank and column rank of matrix A are equal: ∀m, n ∈ N+, A ∈ C^{m×n}, rk_r(A) = rk_c(A).

The row rank and column rank of matrix A are full such that rk_r(A) = m and rk_c(A) = n, respectively, whereby full row rank and full column rank need not coincide and the full rank is rk(A) = min{rk_r(A), rk_c(A)} = min{m, n}; notice that full row rank and full column rank coincide, rk_r(A) = rk_c(A) = m = n, for A ∈ C^{m×m} = C^{n×n}.
Definition 3.5 (Span and basis) The span of matrix A, originating from some linear subspace X = {xi}_{i=1}^n ⊂ C^m, is a set of vectors Y = {yh}_{h=1}^p ⊂ C^m such that they are linear combinations of each matrix vector Ai,j; the basis for A is its span Y = {yh}_{h=1}^p ⊂ C^m such that its elements are linearly independent. Formally: ∀i, j ≤ m, n ∈ N+, p ∈ N+, X = {xi}_{i=1}^n ⊂ C^m, A ∈ C^{m×n}, Y = {yh}_{h=1}^p ⊂ C^m,

sp(A) = sp(X) = {yh ∈ Y : Σ_{h=1}^p kh yh = Ai,j, ∀{kh}_{h=1}^p ⊂ C\{0}};
bs(A) = bs(X) = {sp(A) : Σ_{h=1}^p ch yh = 0 : {ch}_{h=1}^p = 0}.

Notice that basis bs(A) = bs(X) : rk(A) = rk(X) = n for A = (Ai) = X = (xi).

Definition 3.6 (Completeness) For basis X = {xi}_{i=1}^n ⊂ C^m and length m ∈ N+ completeness is such that

Im = Σ_{i=1}^n xi xi^⊤.

Notice that complete vectors X = {xi}_{i=1}^n ⊂ C^m form an orthonormal basis for space C^m such that identity matrix Im = X^H X for matrix X = (xi).
Definition 3.7 (Linear space dimension) The dimension of a linear space is the number of vectors (cardinality) contained in one given basis thereof. Formally: ∀n ∈ N+, X ⊂ C^n,

dim(X) = n[bs(X)].
Definition 3.8 (Row and column space) The row or column space of matrix A, originating from some linear subspace X = {xi}_{i=1}^n ⊂ C^m, is the set of all linear combinations of all matrix rows or columns Ai,j, namely, it is the span of all of its rows or columns. Formally: ∀m, n ∈ N+, X = {xi}_{i=1}^n ⊂ C^m, A ∈ C^{m×n},

rw(A) = rw(X) = sp({Ai}_{i=1}^m) = sp({xi}_{i=1}^m) and cl(A) = cl(X) = sp({Aj}_{j=1}^n).

Notice that the row or column span of matrix A equals its row or column space: ∀m, n ∈ N+, A ∈ C^{m×n}, sp_r(A) = rw(A) and sp_c(A) = cl(A) such that rw(A) = cl(A^⊤), whereby the row space is computed by finding the row rank vectors and the column space is computed by selecting the corresponding A columns of the pivot elements in its reduced row echelon form.

Notice that the rank of matrix A equals the dimension of the row space and column space of A: ∀m, n ∈ N+, A ∈ C^{m×n}, rk(A) = dim[rw(A)] = dim[cl(A)] = n{bs[rw(A)]} = n{bs[cl(A)]} = n[bs(A)], whereby the basis of A is computed by finding the rank row vectors.
Definition 3.9 (Null space) The null space of matrix A is the solution vector x in the system of linear equations Ax = 0, ∀m, n ∈ N+, A ∈ C^{m×n}, x ∈ C^{n×1}, 0 ∈ C^{m×1}:

Ax = 0 : nl(A) = x.

Notice this: ∀i, j ≤ m, n ∈ N+, A ∈ C^{m×n}, x ∈ C^{n×1}, Ax = 0 such that nl(A) = x and ⟨Ai^⊤, x⟩ = 0. The null space of matrix A is also called the kernel of A and it is such that the sum of A's rank and A's kernel dimension equals the number of A's columns (rank nullity theorem); therefore, for a full rank the kernel dimension of matrix A equals zero and the kernel is trivial: ∀m, n ∈ N+, A ∈ C^{m×n}, rk(A) + dim[ker(A)] = n such that rk(A) = n → dim[ker(A)] = n{bs[ker(A)]} = n[bs(0)] = n(0) = 0.

Notice that the kernel of matrix A is computed such that the system of linear equations Ax = 0 is rendered as the augmented matrix [A|0] and transformed into row echelon form through Gaussian elimination, solving the resulting system for free variables.

The transformation of the augmented matrix [A|0] into augmented matrix [In|v] for matrix A ∈ C^{n×n} and vector v ∈ C^{n×1} is such that there exists no non-trivial kernel. The co-kernel of matrix A is the kernel of matrix transpose A^⊤ such that the system of linear equations x^⊤A = 0 → (x^⊤A)^⊤ = A^⊤x = 0.
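A minimal sketch (not from the notes; the matrix is arbitrary) illustrating the rank nullity theorem rk(A) + dim[ker(A)] = n with sympy's exact arithmetic:

from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 0, 1]])              # 3 columns, one linearly dependent row

rank = A.rank()                      # 2
kernel = A.nullspace()               # basis vectors of ker(A)

print(rank, len(kernel))                          # 2 1
print(rank + len(kernel) == A.cols)               # True: rank nullity
print(all((A * v).norm() == 0 for v in kernel))   # True: A v = 0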
Claim 3.10 (Unique solution) In the system of linear equations Ax = b, ∀n ∈ N+, A ∈ C^{n×n}, x ∈ C^{n×1}, b ∈ C^{n×1}, the following statements are equivalent: there exists a unique solution vector x; matrix A is invertible; matrix A's determinant is non-zero; matrix A's rank is full. Formally: ∀n ∈ N+, A ∈ C^{n×n}, x ∈ C^{n×1}, b ∈ C^{n×1},

Ax = b : ∃!x ⟺ ∃A^{−1} ⟺ det(A) ≠ 0 ⟺ rk(A) = n.

Definition 3.11 (Cramer's rule) In the system of linear equations Ax = b, ∀n ∈ N+, A ∈ C^{n×n}, x ∈ C^{n×1}, b ∈ C^{n×1}, an invertible matrix A is such that

xi = det(Â)/det(A),

in which Â's ith column is b and all other columns are those of A.

Notice that matrix A is invertible and has rank rk(A) = n and that solution vector x is unique because determinant det(A) ≠ 0.
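A minimal sketch (not from the notes; the system is arbitrary) of Cramer's rule xi = det(Â)/det(A), checked against numpy's direct solver:

import numpy as np

A = np.array([[2.0, 1.0], [5.0, 3.0]])
b = np.array([4.0, 7.0])

x = np.empty_like(b)
for i in range(A.shape[1]):
    Ai = A.copy()
    Ai[:, i] = b                      # replace the ith column by b
    x[i] = np.linalg.det(Ai) / np.linalg.det(A)

print(x)                              # [ 5. -6.]
print(np.allclose(x, np.linalg.solve(A, b)))   # True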
Claim 3.12 (Linear system solution) In the system of linearly independent equations Ax = b, ∀n ∈ N+, A ∈ C^{n×n}, x ∈ C^{n×1}, b ∈ C^{n×1}, the unique solution vector x is computed by expressing the augmented matrix [A|b] in reduced row echelon (identity sub-matrix) form through Gauss Jordan elimination, which proceeds: (i) by dividing the row of the pivot candidate by the pivot candidate; (ii) adding the product of the pivot row and of the negative of some element above or below the pivot to that row above or below it; (iii) repeating the first two steps across rows to yield the reduced row echelon form (identity sub-matrix). Formally: ∀{ki}_{i=1}^n, {xi}_{i=1}^n ⊂ C,

k11 x1 = k10 + . . . + k1n xn
. . .
knn xn = kn0 + . . . + kn,n−1 xn−1

such that

[k11 . . . k1n | k10; . . . ; kn1 . . . knn | kn0]

→ (r̃1 = r1/k11)

[1 . . . k1n/k11 | k10/k11; . . . ; kn1 . . . knn | kn0]

→ (r̃n = rn − kn1 r1)

[1 . . . k1n/k11 | k10/k11; . . . ; 0 . . . knn − (k1n/k11)kn1 | kn0 − (k10/k11)kn1]

→ (r̃n = rn/[knn − (k1n/k11)kn1])

[1 . . . k1n/k11 | k10/k11; . . . ; 0 . . . 1 | (kn0 − (k10/k11)kn1)/(knn − (k1n/k11)kn1)]

→ (r̃1 = r1 − (k1n/k11) rn)

[1 . . . 0 | k10/k11 − (k1n/k11)(kn0 − (k10/k11)kn1)/(knn − (k1n/k11)kn1); . . . ; 0 . . . 1 | (kn0 − (k10/k11)kn1)/(knn − (k1n/k11)kn1)],

whereby

[x1; . . . ; xn] : x1 = k10/k11 − (k1n/k11)(kn0 − (k10/k11)kn1)/(knn − (k1n/k11)kn1), . . . , xn = (kn0 − (k10/k11)kn1)/(knn − (k1n/k11)kn1).

For equations m = n the system Ax = b, ∀m, n ∈ N+, A ∈ C^{m×n}, x ∈ C^{n×1}, b ∈ C^{m×1}, is determined and the unique solution vector x is expressed as above, provided there are no linearly dependent equations and no contradictory equations; for unknowns n > m it is underdetermined and x is expressed as a function of b for {xi}_{i=1}^{n−1}, but not xn (indeterminate), and x = {y + x0 : Ay = b, x0 ∈ ker(A)}, as Ax = A(y + x0) = b + 0 = b and A(x − y) = b − b = 0; for m > n it is overdetermined and solution vector x does not exist (inconsistent).

Contradictory equations are such that identical sides equal different corresponding sides; redundant equations are such that some must be dropped in order to achieve system determination. For no linear dependence and no contradictory equations the solutions of underdetermined systems are indeterminate and those of overdetermined systems are inconsistent, whereby inconsistent overdetermined systems can be reduced to determined systems through the suitable elimination of redundant equations.
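A minimal sketch (not from the notes; the systems are arbitrary) of the three cases with numpy: an exact solve for the determined system, and least-squares solutions for the underdetermined and overdetermined ones:

import numpy as np

# determined (m = n, full rank): unique solution
A = np.array([[2.0, 1.0], [1.0, 3.0]]); b = np.array([3.0, 5.0])
print(np.linalg.solve(A, b))

# underdetermined (n > m): infinitely many solutions; lstsq returns one of them
A_u = np.array([[1.0, 1.0, 1.0]]); b_u = np.array([3.0])
x_u, *_ = np.linalg.lstsq(A_u, b_u, rcond=None)
print(x_u, A_u @ x_u)              # any x whose components sum to 3 solves the system

# overdetermined (m > n), inconsistent: no exact solution, only a least-squares fit
A_o = np.array([[1.0], [1.0]]); b_o = np.array([1.0, 2.0])
x_o, residual, *_ = np.linalg.lstsq(A_o, b_o, rcond=None)
print(x_o, residual)               # x ≈ 1.5 with a non-zero residual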
Definition 3.13 (Inner product) The inner product of two vectors is the sum of the products of their respective elements such that those of the second are the complex conjugates. Formally: ∀n ∈ N+, x, y ∈ C^n,

⟨x, y⟩ = Σ_{i=1}^n xi conj(yi).

For vectors x, y, z ∈ C^n and scalars α, β ∈ C notice the following properties:

⟨x, x⟩ ∈ R+ such that ⟨x, x⟩ = 0 for x = 0 and ⟨x, 0⟩ = ⟨0, y⟩ = 0 (positive semi-definition);
⟨αx + βy, z⟩ = α⟨x, z⟩ + β⟨y, z⟩, ⟨x, αy + βz⟩ = conj(α)⟨x, y⟩ + conj(β)⟨x, z⟩ and ⟨αx + βy, αx + βy⟩ = |α|^2⟨x, x⟩ + α conj(β)⟨x, y⟩ + β conj(α)⟨y, x⟩ + |β|^2⟨y, y⟩ (linearity in the first argument, conjugate linearity in the second);
⟨x, y⟩ = conj⟨y, x⟩ (conjugate symmetry).

For matrices A, B ∈ C^{m×n} with rows and columns m, n ∈ N+ notice that inner product ⟨A, B⟩ = tr(B^⊤A). In a Euclidean space, although it may not be unique, the inner product is also called dot, scalar or projection product.
Definition 3.14 (ℓp norm) The ℓp norm of vector x equals the pth root of the sum of the moduli of its elements raised to exponent p ∈ N+. Formally: ∀n, p ∈ N+, x ∈ C^n,

||x||_p = (Σ_{i=1}^n |xi|^p)^{1/p}.

For vectors x, y ∈ C^n and scalar α ∈ C notice the following properties:

||x||_p ≥ 0 such that ||x||_p = 0 for x = 0 (positive semi-definition);
||αx||_p = |α| ||x||_p (scalar multiplication);
||x + y||_p ≤ ||x||_p + ||y||_p → | ||x||_p − ||y||_p | ≤ ||x − y||_p (triangle inequality).

The ℓ2 norm is called the Euclidean norm, whereby it equals the square root of its inner product: ∀x ∈ C^n,

||x||_2 = √⟨x, x⟩ = (Σ_{i=1}^n |xi|^2)^{1/2} = (Σ_{i=1}^n xi conj(xi))^{1/2},

as, ∀a, b ∈ R, i = √−1, xi conj(xi) = (a + ib)(a − ib) = a^2 − (ib)^2 = a^2 + b^2 = |xi|^2. Notice that, ∀x, y ∈ C^n, Euclidean norm

||x ± y||_2^2 = ⟨x ± y, x ± y⟩ = ||x||_2^2 + ||y||_2^2 ± 2Re⟨x, y⟩ = Σ_{i=1}^n |xi|^2 + Σ_{i=1}^n |yi|^2 ± 2Re Σ_{i=1}^n xi conj(yi).

For the Euclidean norm the parallelogram rule is such that

||x + y||_2^2 + ||x − y||_2^2 = 2(||x||_2^2 + ||y||_2^2)

and the polarisation identity is such that

4⟨x, y⟩ = ||x + y||_2^2 − ||x − y||_2^2 + i||x + iy||_2^2 − i||x − iy||_2^2.

The ℓ∞ norm is called the maximum norm:

lim_{p→∞} ||x||_p = ||x||_∞ = max{|x1|, . . . , |xn|}.
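A minimal numerical sketch (not from the notes; the vector is arbitrary) of ℓp norms with numpy, including the Euclidean norm as the square root of the inner product and the maximum norm:

import numpy as np

x = np.array([3.0, -4.0, 1.0])

print(np.linalg.norm(x, 1))                        # ℓ1 norm: 8.0
print(np.linalg.norm(x, 2), np.sqrt(x @ x))        # ℓ2 norm two ways: 5.099...
print(np.linalg.norm(x, np.inf), np.abs(x).max())  # maximum norm: 4.0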

Definition 3.15 (Cauchy Schwartz inequality) For Euclidean vectors x, y ∈ C^n and length n ∈ N+ the Cauchy Schwartz inequality is such that

|⟨x, y⟩| ≤ ||x||_2 ||y||_2.

Notice that for non-Euclidean vectors (ℓp norms) the Cauchy Schwartz inequality is generalised by Hölder's inequality: ∀n, p, q ∈ N+, x, y ∈ C^n,

Σ_{i=1}^n |xi conj(yi)| ≤ (Σ_{i=1}^n |xi|^p)^{1/p} (Σ_{i=1}^n |yi|^q)^{1/q} such that 1/p + 1/q = 1

and

Σ_{i=1}^n |xi conj(yi)| ≤ (sup_i |xi|) Σ_{i=1}^n |yi| = ||x||_∞ Σ_{i=1}^n |yi| such that p → ∞ and q → 1.

Definition 3.16 (Triangle inequality) For vectors x, y ∈ C^n and length n ∈ N+ the triangle inequality is such that, ∀p ∈ [1, ∞] = N̄+,

||x + y||_p ≤ ||x||_p + ||y||_p.

Notice that for the Euclidean norm the triangle inequality follows from the Cauchy Schwartz inequality:

||x + y||_2^2 = ⟨x + y, x + y⟩ = ||x||_2^2 + ⟨x, y⟩ + ⟨y, x⟩ + ||y||_2^2 =
= ||x||_2^2 + 2Re⟨x, y⟩ + ||y||_2^2 ≤ ||x||_2^2 + 2|⟨x, y⟩| + ||y||_2^2 ≤
≤ ||x||_2^2 + 2||x||_2 ||y||_2 + ||y||_2^2 = (||x||_2 + ||y||_2)^2
→ ||x + y||_2 ≤ ||x||_2 + ||y||_2.

Proposition 3.17 (Angle of two vectors) Angle θ ∈ R between Euclidean vectors x, y ∈ C^n, ∀n ∈ N+, is such that

cos θ = ⟨x, y⟩/(||x||_2 ||y||_2) ∈ [−1, 1] ⊂ R.

Proof. By the law of cosines c^2 = a^2 + b^2 − 2ab cos C angle θ between Euclidean vectors x and y is computed as follows: ∀x, y ∈ C^n,

||x||_2^2 + ||y||_2^2 − 2||x||_2 ||y||_2 cos θ =
= ||x − y||_2^2 = ||x||_2^2 + ||y||_2^2 − 2⟨x, y⟩ →
−2||x||_2 ||y||_2 cos θ = −2⟨x, y⟩ →
cos θ = ⟨x, y⟩/(||x||_2 ||y||_2) →
θ = cos^{−1}[⟨x, y⟩/(||x||_2 ||y||_2)].

QED

Notice that orthogonal vectors x, y ∈ C^n, ∀n ∈ N+, are such that inner product ⟨x, y⟩ = 0 and angle

θ = cos^{−1}[⟨x, y⟩/(||x||_2 ||y||_2)] = cos^{−1}[0/(||x||_2 ||y||_2)] = 90° = π/2.
Notice that orthonormal vectors x, y ∈ C^n are such that they are orthogonal and Euclidean norm ||x||_2 = ||y||_2 = 1. The Gram Schmidt algorithm is such that for span sp(X) of linearly independent vectors X = {xi}_{i=1}^n ⊂ C^m, ∀m ∈ N+, there exists an orthonormal basis bs(X) = {ei}_{i=1}^n ⊂ C^m by vector e_{j+1} = b_{j+1}/||b_{j+1}||_2 for vector b_{j+1} = x_{j+1} − Σ_{i=1}^j ⟨x_{j+1}, ei⟩ ei. Orthogonal matrix A ∈ C^{n×n} for rows and columns n ∈ N+ is such that its rows and columns form an orthonormal basis.
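A minimal Gram Schmidt sketch (not part of the notes; the columns are arbitrary) orthonormalising linearly independent columns, with b_{j+1} = x_{j+1} − Σ_{i=1}^j ⟨x_{j+1}, ei⟩ ei and e_{j+1} = b_{j+1}/||b_{j+1}||_2:

import numpy as np

def gram_schmidt(X):
    """Columns of X are assumed linearly independent; returns orthonormal columns."""
    E = np.zeros_like(X, dtype=complex)
    for j in range(X.shape[1]):
        b = X[:, j].astype(complex)
        for i in range(j):
            b = b - np.vdot(E[:, i], X[:, j]) * E[:, i]   # subtract projections ⟨x, e_i⟩ e_i
        E[:, j] = b / np.linalg.norm(b)
    return E

X = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
E = gram_schmidt(X)
print(np.allclose(E.conj().T @ E, np.eye(2)))   # True: the columns are orthonormal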
Definition 3.18 (Metric space) (C^n, d) is a metric space such that, for vectors x, y ∈ C^n, length n ∈ N+ and any norm definition, distance

d(x, y) = ||x − y|| ∈ R.

Notice that, ∀x, y ∈ C, Euclidean distance and norm

d(x, y) = ||x − y||_2 = √⟨x − y, x − y⟩ = (|x − y|^2)^{1/2} = |x − y|.
Definition 3.19 (Characteristic polynomial) The characteristic polynomial of a square matrix A is the equation generated by the zero determinant of the difference between the matrix and the product of an eigenvalue scalar λ and the identity matrix of a corresponding size In. Formally: ∀A ∈ C^{n×n}, λ ∈ C,

A(λ) = A − λIn = [a11 · · · a1n; . . . ; an1 · · · ann] − λ[1 · · · 0; . . . ; 0 · · · 1] = [a11 − λ · · · a1n; . . . ; an1 · · · ann − λ]

→ det[A(λ)] = det([a11 − λ · · · a1n; . . . ; an1 · · · ann − λ]) = 0.

Notice that the characteristic polynomial of matrix A is an nth order polynomial with roots {λi}_{i=1}^n ⊂ C such that determinant det[A(λ)] = 0 and λi is an eigenvalue. By the Fundamental theorem of algebra, eigenvalue λi is the inverse of root zi for the characteristic equation generated by determinant det[A(z)] = 0 such that matrix A(z) = In − Az and roots {zi}_{i=1}^n ⊂ C.
Definition 3.20 (Eigenvalue problem) The eigenvalue problem is such that for square matrix A ∈ C^{n×n}, eigenvalue λ ∈ C and eigenvector v ∈ C^{n×1}

A(λ)v = (A − λIn)v = 0 ⟺ Av = λv.

Notice that eigenvector vi associated with eigenvalue λi is linearly independent of v¬i, for eigenvalues {λi}_{i=1}^n ⊂ C and eigenvectors {vi}_{i=1}^n ⊂ C^{n×1}. Eigenvector vi is computed by solving the system A(λi)vi = 0 for a non-trivial vi, allowing for normalisation vi^H vi = 1 wherever dictated by scientific theory. Normalised eigenvectors vi of matrix A (eigenspace of A) form a basis for space C^{n×1}.

The repetition of eigenvalue λi is called algebraic multiplicity and is a positive natural number: μ_A(λi) ∈ N+. The repetition of eigenvector vi associated with eigenvalue λi is called geometric multiplicity and is a positive natural number no greater than λi's algebraic multiplicity: γ_A(λi) ∈ [1, μ_A(λi)] ⊂ N+.

The degree of degeneracy m ∈ [0, n] ⊂ N+ is the number of eigenvectors {vi}_{i=1}^n featuring the same eigenvalue λi, whereby it is degenerate. Notice that a real symmetric matrix A = (aij) = (aji) ∈ R^{n×n} features n linearly independent eigenvectors, namely, it features algebraic and geometric multiplicity μ_A(λi) = γ_A(λi) = 1.

Eigenvectors {vi}_{i=1}^n are an orthogonal set, ⟨vi, v¬i⟩ = 0, for eigenvalues {λi}_{i=1}^n of some matrix A ∈ C^{n×n} with algebraic multiplicity μ_A(λi) = 1. Eigenvalues {λi}_{i=1}^n endowed with complex conjugate pairs (λi, conj(λi)) feature eigenvectors {vi}_{i=1}^n endowed with complex conjugate pairs (vi, conj(vi)) such that system A(λi)vi = 0 features complex conjugate system A(conj(λi))conj(vi) = 0.

Notice that the trace of a square matrix A equals the sum of its eigenvalues {λi}_{i=1}^n and that its determinant equals their product: ∀A ∈ C^{n×n}, {λi}_{i=1}^n ⊂ C, tr(A) = Σ_{i=1}^n λi and det(A) = Π_{i=1}^n λi.

Notice that similar matrices A, B ∈ C^{n×n} are such that B = S^{−1}AS for similarity matrix S ∈ C^{n×n}, whereby A and B feature the same eigenvalues {λi}_{i=1}^n. It follows that for similar matrices D, A ∈ C^{n×n} in which D is diagonal A = P^{−1}DP such that diagonal element diag(D)i = λi(A) if matrix P = (vi(A)), whereby eigenvectors {vi(A)}_{i=1}^n form a basis for space C^{n×n} given A : C^{n×n} → C^{n×n}.

Unitary matrix U ∈ C^{n×n} is such that UU^H = U^H U = In → U^H = U^{−1}. Unitary matrices are complex generalisations of orthogonal matrices. Unitary matrices have eigenvalues which are complex numbers with modulus 1. The eigenvectors of a Hermitian matrix can be used to construct a unitary matrix which transforms the Hermitian matrix into a diagonal one. A unitary matrix can also be constructed to perform a change of basis.
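A minimal numerical sketch (not from the notes; the matrix is arbitrary, and numpy stores the eigenvectors as the columns of P, so the factorisation reads A = P D P^{−1}) of the eigenvalue problem, the trace and determinant identities and diagonalisation:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # real symmetric, so the eigenvalues are real

lam, P = np.linalg.eig(A)                  # eigenvalues and eigenvector columns
D = np.diag(lam)

print(np.allclose(A @ P, P @ D))                     # True: A v_i = λ_i v_i
print(np.allclose(A, P @ D @ np.linalg.inv(P)))      # True: diagonalisation
print(np.isclose(np.trace(A), lam.sum()))            # True: tr(A) = Σ λ_i
print(np.isclose(np.linalg.det(A), lam.prod()))      # True: det(A) = Π λ_i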
Exercises (optional)
1. Prove the reason for which the absence of pivots in the row echelon form of the given linear space suffices to deduce the linear dependence of its vectors.
2. Prove that the row and column ranks of a matrix are equal; prove it for the full ranks of square matrices as well and disprove it for the full ranks of rectangular matrices.
3. Prove that X = {xi}_{i=1}^n ⊂ C^m form an orthonormal basis for C^m such that Im = X^H X for X = (xi).
4. Prove that dim(X) = n[bs(X)].
5. Prove that the rank of a matrix equals the dimension of the row space and column space.
6. Prove that the row or column span of a matrix equals its row or column space; prove that the row space is computed by finding the row rank vectors and the column space is computed by selecting the corresponding matrix columns of the pivot elements in its reduced row echelon form.
7. Prove the rank nullity theorem: rk(A) + dim[ker(A)] = n.
8. Prove that ∃!x ⟺ ∃A^{−1} ⟺ det(A) ≠ 0 ⟺ rk(A) = n.
9. Prove that ||x + y||_p ≤ ||x||_p + ||y||_p → | ||x||_p − ||y||_p | ≤ ||x − y||_p.
10. Derive the parallelogram rule, the polarisation identity and the maximum norm.
11. Derive the Cauchy Schwartz inequality and Hölder's inequality for p ≤ ∞.
12. Derive the triangle inequality for non-Euclidean vectors.
13. Prove that similar matrices feature the same eigenvalues.
14. Prove that for similar D, A ∈ C^{n×n} in which D is diagonal A = P^{−1}DP such that diag(D)i = λi(A) if P = (vi(A)), whereby {vi(A)}_{i=1}^n form a basis for C^{n×n}.
