Solving linear systems
Numerical Analysis
1 Resolution of linear systems
Kirchhoff’s laws are fundamental tools in electrical circuit analysis. They allow us to derive
linear systems of equations that describe the currents and voltages in a circuit. In this exam-
ple, we will analyze a simple circuit with two loops and three resistors to demonstrate how
Kirchhoff’s laws lead to a system of linear equations.
Consider the following circuit (figure: a voltage source V1 driving two loops through resistors R1, R2, and R3). Applying Kirchhoff's current law at the node joining the two loops gives:
I1 = I2 + I3
where:
• I1 is the current through R1,
• I2 is the current through R2,
• I3 is the current through R3.
Applying Kirchhoff's voltage law to each loop:
• Loop 1: V1 → R1 → R3
Starting from V1 and moving clockwise:
V1 − I1 R1 − I3 R3 = 0
• Loop 2: R2 → R3
Starting from R2 and moving clockwise:
−I2 R2 + I3 R3 = 0
Combining the equations from Kirchhoff’s laws, we obtain the following system of linear
equations:
I1 − I2 − I3 = 0,
R1 I1 + R3 I3 = V1 ,
−R2 I2 + R3 I3 = 0.
Assume the values V1 = 10 V, R1 = 2 Ω, R2 = 3 Ω and R3 = 5 Ω.
This example demonstrates how Kirchhoff’s laws can be used to derive a system of linear
equations from an electrical circuit. Solving this system allows us to determine the unknown
currents I1 , I2 , and I3 in the circuit.
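With the values above, the 3 × 3 system can be solved numerically. A minimal sketch using NumPy (an assumed dependency):

```python
import numpy as np

# Kirchhoff system in the unknowns (I1, I2, I3), with
# V1 = 10 V, R1 = 2, R2 = 3, R3 = 5 ohms as in the example.
A = np.array([
    [1.0, -1.0, -1.0],   # I1 - I2 - I3 = 0
    [2.0,  0.0,  5.0],   # R1*I1 + R3*I3 = V1
    [0.0, -3.0,  5.0],   # -R2*I2 + R3*I3 = 0
])
b = np.array([0.0, 10.0, 0.0])

I = np.linalg.solve(A, b)
print(I)  # currents I1, I2, I3 in amperes
```

The computed currents are I1 = 80/31 ≈ 2.58 A, I2 = 50/31 ≈ 1.61 A and I3 = 30/31 ≈ 0.97 A.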
Linear systems are extensively used across various domains, including engineering, physics, eco-
nomics, computer science, operations research, biology, and finance, among others. They play
a crucial role in modeling, analyzing, and solving real-world problems, ranging from simulating
physical systems to predicting financial trends. Their versatility makes them essential tools in
addressing complex challenges across multiple fields. In this chapter, we will focus on solving
linear systems using both direct and iterative methods. Direct methods yield exact solutions
in a finite number of steps, while iterative methods are particularly useful for large systems
where direct methods may be computationally expensive or infeasible. By exploring these ap-
proaches, we aim to enhance our understanding of how to effectively tackle linear systems in
diverse contexts.
• The space of real (resp. complex) m × n matrices is indicated by Rm×n (resp. Cm×n ).
where aij represents the element in the ith row and jth column.
• Let A ∈ Cm×n and B ∈ Cn×p . Their product AB ∈ Cm×p is defined component-wise by:
(AB)ij = Σ_{k=1}^{n} aik bkj ,  1 ≤ i ≤ m, 1 ≤ j ≤ p.
– (AB)∗ = B ∗ A∗ .
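The component-wise product and the identity (AB)∗ = B∗A∗ can be checked numerically; a small sketch with NumPy, on arbitrarily chosen complex matrices:

```python
import numpy as np

# A is 2x3 and B is 3x2, so the product AB is 2x2.
A = np.array([[1+2j, 0, 3], [4, 5-1j, 6]])
B = np.array([[1, 2j], [3, 4], [5-2j, 6]])

AB = A @ B  # (AB)_ij = sum over k of a_ik * b_kj

# Conjugate-transpose identity: (AB)* = B* A*
assert np.allclose(AB.conj().T, B.conj().T @ A.conj().T)
```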
• A square matrix is a matrix where the number of rows equals the number of columns
(m = n).
• The identity matrix: a square diagonal matrix with all diagonal elements equal to 1. For size n:
In = diag(1, 1, . . . , 1).
• The rank of a matrix A, denoted rank(A), is the dimension of the vector space spanned
by its rows or columns. It is equal to the number of linearly independent rows or columns.
• The null space of a matrix A ∈ Cm×n is the set of all vectors x ∈ Cn such that:
Ax = 0
• An n-by-n square matrix A is invertible if there exists an n-by-n square matrix B such that
AB = BA = In .
Such a B is unique; it is called the inverse of A and is denoted A−1 , so that
AA−1 = A−1 A = In .
– (A−1 )−1 = A.
• A square matrix is called:
– Upper triangular if all entries below the main diagonal are zero (aij = 0 for i > j).
– Lower triangular if all entries above the main diagonal are zero (aij = 0 for i < j).
– The inverse of a lower triangular matrix is a lower triangular matrix, and the
inverse of an upper triangular matrix is an upper triangular matrix.
• The determinant of a square matrix A ∈ Cn×n , denoted det(A) (or |A|), is a scalar
value that can be computed recursively or using properties like row reduction.
– det(In ) = 1.
– det(A^T ) = det(A).
– det(A∗ ) = det(A).
• The trace of a square matrix A, denoted tr(A), is the sum of its diagonal elements:
tr(A) = Σ_{i=1}^{n} aii .
– tr(In ) = n.
– tr(A^T ) = tr(A).
– tr(A∗ ) = tr(A).
• For a square matrix A ∈ Cn×n , a scalar λ is called an eigenvalue if there exists a non-zero vector v (called an eigenvector) such that:
Av = λv.
If A has n linearly independent eigenvectors, it is diagonalizable:
A = P DP −1 ,
where the columns of P are the eigenvectors and D is the diagonal matrix of the corresponding eigenvalues.
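Eigenvalues, eigenvectors, and the factorization A = P DP −1 can be computed numerically; a sketch with NumPy on an arbitrarily chosen symmetric 2 × 2 matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # symmetric, hence diagonalizable

# Columns of P are eigenvectors; w holds the eigenvalues.
w, P = np.linalg.eig(A)
D = np.diag(w)

# Check the diagonalization A = P D P^{-1}.
assert np.allclose(A, P @ D @ np.linalg.inv(P))
print(w)  # eigenvalues: 3 and 1, in some order
```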
Schur decomposition
• For any square matrix A ∈ Cn×n , there exists a unitary matrix U (i.e., U ∗ U = I)
and an upper triangular matrix T such that: A = U T U ∗ .
• For any square normal matrix A ∈ Cn×n , there exists a unitary matrix U (i.e.,
U ∗ U = I) and a diagonal matrix D such that: A = U DU ∗ .
• For any real symmetric matrix A ∈ Rn×n , there exists an orthogonal matrix O
(i.e., O^T O = I) and a real diagonal matrix D such that: A = ODO^T .
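The Schur form can be computed with SciPy (an assumed dependency); a sketch on a rotation matrix, whose eigenvalues are ±i, so the complex form is used:

```python
import numpy as np
from scipy.linalg import schur

# A non-symmetric matrix: the complex Schur form gives A = U T U*
# with T upper triangular and U unitary.
A = np.array([[0.0, -1.0], [1.0, 0.0]])  # 90-degree rotation
T, U = schur(A, output='complex')

assert np.allclose(U @ T @ U.conj().T, A)        # A = U T U*
assert np.allclose(U.conj().T @ U, np.eye(2))    # U is unitary
assert np.allclose(np.tril(T, -1), 0)            # T is upper triangular
```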
• Let V be a vector space over a field K (typically R or C). A norm is a function ‖·‖ : V → R that satisfies the following properties for all vectors u, v ∈ V and all scalars α ∈ K:
1. Positive Definiteness: ‖v‖ ≥ 0, and ‖v‖ = 0 if and only if v = 0.
2. Absolute Homogeneity: ‖αv‖ = |α| ‖v‖.
3. Triangle Inequality: ‖u + v‖ ≤ ‖u‖ + ‖v‖.
Common vector norms:
– ‖v‖1 = |v1 | + |v2 | + · · · + |vn | = Σ_{i=1}^{n} |vi |.
– ‖v‖2 = (|v1 |² + |v2 |² + · · · + |vn |²)^{1/2} = (Σ_{i=1}^{n} |vi |²)^{1/2}.
– ‖v‖p = (|v1 |^p + |v2 |^p + · · · + |vn |^p)^{1/p} = (Σ_{i=1}^{n} |vi |^p)^{1/p}.
Common matrix norms:
– Operator norm: ‖A‖p = sup_{x≠0} ‖Ax‖p / ‖x‖p .
– ‖A‖1 = sup_{v∈Cn, v≠0} ‖Av‖1 / ‖v‖1 = max_{1≤j≤n} Σ_{i=1}^{n} |aij |.
The condition number of an invertible matrix A is defined as κ(A) = ‖A‖ ‖A−1 ‖.
If Ax = b and A(x + ∆x) = b + ∆b, the relative error in the solution satisfies
‖∆x‖ / ‖x‖ ≤ κ(A) ‖∆b‖ / ‖b‖,
and for a perturbation ∆A of the matrix, with (A + ∆A)(x + ∆x) = b,
‖∆x‖ / ‖x + ∆x‖ ≤ κ(A) ‖∆A‖ / ‖A‖.
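The role of κ(A) in these bounds can be illustrated numerically; a sketch with NumPy on an arbitrarily chosen, nearly singular 2 × 2 matrix:

```python
import numpy as np

# An ill-conditioned matrix: the two rows are nearly parallel.
A = np.array([[1.0, 1.0], [1.0, 1.0001]])
b = np.array([2.0, 2.0001])
x = np.linalg.solve(A, b)        # exact solution is (1, 1)

kappa = np.linalg.cond(A)        # kappa(A) = ||A|| ||A^-1|| in the 2-norm
print(kappa)                     # large: roughly 4e4

# A tiny perturbation of b produces a large relative change in x.
db = np.array([0.0, 1e-6])
dx = np.linalg.solve(A, b + db) - x
rel_x = np.linalg.norm(dx) / np.linalg.norm(x)
rel_b = np.linalg.norm(db) / np.linalg.norm(b)
assert rel_x <= kappa * rel_b    # the first bound above
print(rel_x / rel_b)             # amplification factor, in the thousands
```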
Ax = b (1.1)
where A ∈ Cn×n is the coefficient matrix, x ∈ Cn is the vector of unknowns, and b ∈ Cn is the right-hand side.
• Size Matters: For large systems (n > 1000), direct inversion becomes computationally expensive and memory-intensive.
• Error Amplification: Small input errors can cause large solution errors in sensitive (ill-conditioned) systems.
1. Compute multiplier: mik = aik^(k−1) / akk^(k−1) .
Stage 1 Transformations
R2 ← R2 − (1/2) R1
R3 ← R3 − (3/2) R1
Resulting system:
2 4 6 | 4
0 1 −1 | 1
0 −5 −4 | −4
Stage 2 Transformations
R3 ← R3 + 5R2
Resulting (upper triangular) system:
2 4 6 | 4
0 1 −1 | 1
0 0 −9 | 1
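The elimination stages above can be automated; a minimal sketch of Gaussian elimination without pivoting, followed by back substitution (the starting system is the one whose reduced forms appear above):

```python
import numpy as np

def gauss_eliminate(A, b):
    """Forward elimination (no pivoting) followed by back substitution."""
    A = A.astype(float)
    b = b.astype(float)
    n = len(b)
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]          # multiplier m_ik
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):          # back substitution
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

A = np.array([[2, 4, 6], [1, 3, 2], [3, 1, 5]])
b = np.array([4, 3, 2])
x = gauss_eliminate(A, b)
assert np.allclose(A @ x, b)   # residual check against the original system
```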
Consider the system
(S)  10⁻⁴ x + y = 1,
     x + y = 2.
Exact Solution
The exact solution of (S) is:
x = 10000/9999 ≈ 1,   y = 9998/9999 ≈ 1.
Naive Elimination (small pivot)
Using 10⁻⁴ as the pivot to eliminate x produces the huge multiplier 10⁴, and in finite-precision arithmetic the computed value of y rounds to 1. Substituting y = 1 into the first equation: 10⁻⁴ x + 1 = 1 ⇒ x = 0. This yields the inaccurate result:
x = 0, y = 1.
Elimination with Row Interchange
Swapping the two equations makes the pivot equal to 1. Multiply the first equation by 10⁻⁴ and subtract from the second:
x + y = 2
10⁻⁴ x + y − 10⁻⁴ (x + y) = 1 − 2 × 10⁻⁴
so (1 − 10⁻⁴) y = 1 − 2 × 10⁻⁴, giving y = 9998/9999 ≈ 1 and then x = 2 − y ≈ 1, in agreement with the exact solution.
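The failure of a tiny pivot can be reproduced in code; a sketch that uses the even smaller pivot ε = 10⁻²⁰ so that the breakdown is visible in double precision (with 10⁻⁴ the loss of accuracy only shows up at lower precision):

```python
import numpy as np

eps = 1e-20  # tiny pivot; exact solution of the system below is near x = y = 1

# System: eps*x + y = 1,  x + y = 2.
# Naive elimination with the tiny pivot eps:
m = 1.0 / eps                      # huge multiplier
y = (2.0 - m * 1.0) / (1.0 - m)    # rounds to exactly 1.0 in float64
x = (1.0 - y) / eps                # catastrophic cancellation
print(x, y)                        # 0.0 1.0 -- x is completely wrong

# A solver that pivots (row interchange) recovers the correct answer:
A = np.array([[eps, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0])
x_piv, y_piv = np.linalg.solve(A, b)
assert abs(x_piv - 1.0) < 1e-10 and abs(y_piv - 1.0) < 1e-10
```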
Partial Pivoting
In partial pivoting, one examines the entries in the current column (from the diagonal entry
downwards) and selects as pivot the entry with the largest absolute value. That is, at the kth
step one chooses the pivot from the set
{ |aik^(k)| : i = k, . . . , n }.
If akk^(k) is zero or nearly zero, a row with a larger entry is swapped into the kth position.
Example. Consider the system
x1 + 2x2 + x3 = 5
3x1 + x2 + 4x3 = 6
2x1 + 3x2 + 2x3 = 7
• Swap R1 ↔ R2:
3 1 4 | 6
1 2 1 | 5
2 3 2 | 7
Elimination:
R2 ← R2 − (1/3) R1
R3 ← R3 − (2/3) R1
3 1 4 | 6
0 5/3 −1/3 | 3
0 7/3 −2/3 | 3
Stage 2 (k=2):
• Pivot: max(|5/3|, |7/3|) = 7/3 at row 3
• Swap R2 ↔ R3:
3 1 4 | 6
0 7/3 −2/3 | 3
0 5/3 −1/3 | 3
Final Elimination:
R3 ← R3 − (5/7) R2
3 1 4 | 6
0 7/3 −2/3 | 3
0 0 1/7 | 6/7
Back Substitution
x3 = 6
x2 = 3
x1 = −7
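The whole procedure (pivot search, row swap, elimination, back substitution) can be written compactly; a minimal sketch applied to the example above (gauss_partial_pivot is an illustrative name):

```python
import numpy as np

def gauss_partial_pivot(A, b):
    """Gaussian elimination with partial pivoting, then back substitution."""
    A = A.astype(float)
    b = b.astype(float)
    n = len(b)
    for k in range(n - 1):
        # Pivot: entry of largest absolute value in column k, rows k..n-1.
        p = k + np.argmax(np.abs(A[k:, k]))
        if p != k:                          # row interchange
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):          # back substitution
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

A = np.array([[1, 2, 1], [3, 1, 4], [2, 3, 2]])
b = np.array([5, 6, 7])
x = gauss_partial_pivot(A, b)
print(x)  # approximately [-7, 3, 6], as found above
```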
Complete Pivoting
In complete (or total) pivoting, the pivot at each stage is chosen as the entry of largest absolute value in the entire remaining submatrix, which may require both row and column interchanges. Because column interchanges may be required, it is necessary to keep track of the permutation of the variables.
Example. Consider the system
2x1 + 4x2 + x3 = 8
x1 + 3x2 + 5x3 = 10
3x1 + x2 + 2x3 = 5
Initial Matrix:
2 4 1 | 8
1 3 5 | 10
3 1 2 | 5
Stage 1 (k = 1):
• Global pivot: 5 at position (2, 3)
• Swap R1 ↔ R2, C1 ↔ C3
5 3 1 | 10
1 4 2 | 8
2 1 3 | 5
Elimination:
R2 ← R2 − (1/5) R1
R3 ← R3 − (2/5) R1
5 3 1 | 10
0 3.4 1.8 | 6
0 −0.2 2.6 | 1
Stage 2 (k = 2):
• No swaps needed
Final elimination:
R3 ← R3 − (−0.2/3.4) R2
5 3 1 | 10
0 3.4 1.8 | 6
0 0 46/17 | 23/17
Solution (after back substitution and undoing the column interchange C1 ↔ C3):
x1 = 0.5, x2 = 1.5, x3 = 1
In Gaussian elimination, besides performing row interchanges, one may also swap columns
especially when using complete (or total) pivoting. Each column exchange introduces a
sign change in the determinant, much like a row swap.
Suppose we perform:
• p row swaps,
• q column swaps.
Each swap changes the sign of the determinant. If the row operations and interchanges are recorded as elementary matrices,
Ek · · · E2 E1 A = U,
then det(A) = (−1)^(p+q) u11 · · · unn .
Example
Let A be the 2 × 2 matrix
1 2
3 4
1. Step 1: Eliminate the entry 3 in row 2, column 1. Use Eadd with c = −3:
E1 =
1 0
−3 1
so that
E1 A =
1 2
0 −2
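The effect of E1 is easy to verify numerically; a small NumPy check, which also shows that the inverse of this elementary matrix is obtained by flipping the sign of c:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# Elementary matrix adding c = -3 times row 1 to row 2.
E1 = np.array([[1.0, 0.0], [-3.0, 1.0]])
assert np.allclose(E1 @ A, [[1, 2], [0, -2]])

# Its inverse undoes the operation: same matrix with c = +3.
E1_inv = np.array([[1.0, 0.0], [3.0, 1.0]])
assert np.allclose(E1 @ E1_inv, np.eye(2))
```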
1. Pivot Selection and Normalization: Choose a nonzero pivot in the current row and
scale the row so that the pivot becomes 1.
2. Elimination in All Directions: Use the normalized pivot row to eliminate all other
nonzero entries in the pivot’s column (both below and above the pivot).
The final matrix is in RREF if every pivot is 1 and is the sole nonzero entry in its column.
Consider an augmented matrix for a system:
[A | b] =
a11^(1) a12^(1) · · · a1n^(1) | b1^(1)
a21^(1) a22^(1) · · · a2n^(1) | b2^(1)
 ···     ···    · · ·  ···    |  ···
an1^(1) an2^(1) · · · ann^(1) | bn^(1)
Step 1: Normalize the first row:
L1^(2) = (1/a11^(1)) L1^(1) .
Step 2: Normalize the second row:
L2^(3) = (1/(−3)) L2^(2) .
Then, eliminate the second column entries in the other rows:
L1^(3) = L1^(2) − 1 · L2^(3) ,   L3^(3) = L3^(2) − (−2) · L2^(3) .
Step 3: Finally, normalize the third row and eliminate the third column entries in the rows
above. At this point, the augmented matrix becomes:
1 0 0 | x1
0 1 0 | x2 .
0 0 1 | x3
x1 = 0, x2 = −1, x3 = 1.
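The two rules above translate directly into code; a minimal Gauss–Jordan sketch (the function name and the 3 × 3 test system are illustrative, chosen so that the solution is x1 = 0, x2 = −1, x3 = 1 as above):

```python
import numpy as np

def gauss_jordan(A, b):
    """Reduce [A | b] to reduced row echelon form and read off the solution.

    Assumes A is square and invertible; a partial-pivoting row swap
    guarantees a nonzero pivot at each step.
    """
    n = len(b)
    M = np.hstack([A.astype(float), b.astype(float).reshape(-1, 1)])
    for k in range(n):
        p = k + np.argmax(np.abs(M[k:, k]))   # choose a nonzero pivot
        M[[k, p]] = M[[p, k]]
        M[k] /= M[k, k]                       # normalize pivot row to 1
        for i in range(n):                    # eliminate above AND below
            if i != k:
                M[i] -= M[i, k] * M[k]
    return M[:, -1]                           # final form is [I | x]

# Hypothetical system with solution (0, -1, 1):
A = np.array([[1, 1, 1], [2, 1, 3], [1, 2, 2]])
b = np.array([0, 2, 0])
x = gauss_jordan(A, b)
assert np.allclose(A @ x, b)
```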
1.2.4 LU-Factorisation
Building on Gaussian elimination, we now formalize its matrix factorization interpretation.
Definition 1.2.1. For a square matrix A ∈ Cn×n , an LU decomposition is a factorization:
A = LU
where:
• L is lower triangular with unit diagonal entries
• U is upper triangular
Definition 1.2.2 (Leading Principal Submatrix). For a square matrix A ∈ Cn×n , the k-th
leading principal submatrix Ak is the k × k submatrix formed from the first k rows and columns
of A:
Ak :=
a11 · · · a1k
 ··  · ·  ··
ak1 · · · akk
for 1 ≤ k ≤ n.
Definition 1.2.3 (Leading Principal Minor). The k-th leading principal minor is the determi-
nant of the k-th leading principal submatrix:
∆k := det(Ak )
Let A ∈ Cn×n . Then A admits an LU decomposition if and only if all leading principal minors are nonzero (equivalently, all leading principal submatrices are nonsingular). When it exists, the decomposition is unique.
(⇒) LU Decomposition =⇒ Non-zero Leading Minors:
If A = LU, then each leading principal submatrix factors as
Ak = Lk Uk ,
where Lk and Uk are the corresponding k × k leading blocks. Thus:
det(Ak) = det(Lk) det(Uk) = ∏_{i=1}^{k} uii ≠ 0
because the diagonal entries uii are non-zero (they are the Gaussian elimination pivots).
(⇐) Non-zero Leading Minors =⇒ LU Decomposition:
Gaussian Elimination without Permutations:
Step 1: The first minor
det(A1) = a11 ≠ 0.
The first pivot is u11 = a11 .
Step k: Assume the first k − 1 pivots are non-zero. The minor
det(Ak) ≠ 0
then forces the kth pivot ukk to be non-zero as well, so the elimination proceeds without row interchanges.
Uniqueness: Suppose A = L1 U1 = L2 U2 are two such decompositions. Since the factors are invertible, L2−1 L1 = U2 U1−1 . This equality implies that L1 = L2 and U1 = U2 , as a lower triangular matrix with 1s on the diagonal can only equal an upper triangular matrix if both are the identity matrix.
Let A be an invertible square matrix of order n. Then, there exists a permutation matrix
P such that all the pivots of P A are nonzero. Consequently, we can factorize P A as
P A = LU,
where L is a lower triangular matrix with ones on its diagonal and U is an upper triangular
matrix.
The linear system Ax = b can therefore be reformulated as
P Ax = P b.
To solve the system:
1. Factorize P A = LU.
2. Compute P b.
3. Solve Ly = P b by forward substitution.
4. Solve U x = y by back substitution.
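These steps can be carried out with SciPy's LU routines (an assumed dependency); note that scipy.linalg.lu returns factors with A = P L U, so P^T plays the role of the permutation applied to b. The 3 × 3 system is the one from the pivoting example above:

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

A = np.array([[2.0, 4.0, 1.0], [1.0, 3.0, 5.0], [3.0, 1.0, 2.0]])
b = np.array([8.0, 10.0, 5.0])

P, L, U = lu(A)                               # A = P L U
Pb = P.T @ b                                  # step 2: permute b
y = solve_triangular(L, Pb, lower=True)       # step 3: forward substitution
x = solve_triangular(U, y, lower=False)       # step 4: back substitution

assert np.allclose(A @ x, b)
print(x)  # approximately [0.5, 1.5, 1.0], as in the earlier example
```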
We wish to solve this system using LU factorization, where the coefficient matrix A is decom-
posed as:
A = LU,
with
L =
1 0 0
l21 1 0
l31 l32 1
and U =
u11 u12 u13
0 u22 u23
0 0 u33
LU Decomposition
Multiplying L and U gives:
LU =
u11 u12 u13
l21 u11 l21 u12 + u22 l21 u13 + u23
l31 u11 l31 u12 + l32 u22 l31 u13 + l32 u23 + u33
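Matching the entries of this product against A (a row of U, then a column of L, at each step) gives the classical Doolittle algorithm; a minimal sketch for a general n × n matrix, assuming all pivots ukk are nonzero (doolittle is an illustrative name):

```python
import numpy as np

def doolittle(A):
    """Compute A = L U with unit lower triangular L, by matching entries."""
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    for k in range(n):
        # Row k of U:  u_kj = a_kj - sum_{s<k} l_ks u_sj
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]
        # Column k of L:  l_ik = (a_ik - sum_{s<k} l_is u_sk) / u_kk
        L[k+1:, k] = (A[k+1:, k] - L[k+1:, :k] @ U[:k, k]) / U[k, k]
    return L, U

A = np.array([[2.0, 4.0, 6.0], [1.0, 3.0, 2.0], [3.0, 1.0, 5.0]])
L, U = doolittle(A)
assert np.allclose(L @ U, A)
assert np.allclose(np.triu(L, 1), 0) and np.allclose(np.tril(U, -1), 0)
```

On this matrix the computed multipliers (1/2, 3/2, −5) agree with the elimination stages worked out earlier in the chapter.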