Chapter 1
equations
1.1.1 Definitions
Definition 1.1.1. We call a linear equation in the variables (or unknowns) $x_1, \dots, x_p$ any relationship of the form
\[ a_1 x_1 + \cdots + a_p x_p = b, \tag{1.1} \]
where $a_1, \dots, a_p$ and $b$ are given real numbers.
Remark 1.1.2. It is important to stress here that these linear equations are implicit, that is to say, they describe relationships between variables, but do not directly give the values that the variables can take.
Solving an equation therefore means making it explicit, i.e. making more apparent the values that the variables can take.
Definition 1.1.3. A system of n linear equations with p unknowns is a list of n linear equations.
Such systems are usually written in n lines placed one below the other.
1.1. THEORY OF LINEAR SYSTEMS 2
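\[
\begin{cases}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1p}x_p = b_1 & (\leftarrow \text{equation } 1)\\
\qquad\vdots \\
a_{i1}x_1 + a_{i2}x_2 + a_{i3}x_3 + \cdots + a_{ip}x_p = b_i & (\leftarrow \text{equation } i)\\
\qquad\vdots \\
a_{n1}x_1 + a_{n2}x_2 + a_{n3}x_3 + \cdots + a_{np}x_p = b_n & (\leftarrow \text{equation } n)
\end{cases}
\]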
Definition 1.1.5. A solution of the linear system is a list of p real numbers $(s_1, s_2, \dots, s_p)$ (a p-tuple) such that if we substitute $s_1$ for $x_1$, $s_2$ for $x_2$, etc., in the linear system, we obtain an equality in every equation. The set of solutions of the system is the set of all these p-tuples.
x1 = −18 , x2 = −6 , x3 = 1 .
On the other hand, (7, 2, 0) only satisfies the first equation. It is therefore not a solution to the system.
As a general rule, the aim is to determine the set of solutions of a linear system. This is called solving the linear system. This leads to the following definition.
Definition 1.1.6. Two linear systems are said to be equivalent if they have the same set of solutions.
From there, solving a given linear system consists in transforming it into an equivalent system that is simpler to solve than the original one. We will see later how to do this systematically.
Theorem 1.1.7. A system of linear equations has either no solution, a single solution, or an infinite number of solutions.
In particular, if you find 2 different solutions to a linear system, then you can find an infinite number of them.
equation Lj .
These three elementary operations do not change the solutions of a linear system; in other words, they transform a system into an equivalent system.
Example. Let's use these elementary operations to solve the following system.
\[
\begin{cases}
x + y + 7z = -1 & (L_1)\\
2x - y + 5z = -5 & (L_2)\\
-x - 3y - 9z = -5 & (L_3)
\end{cases}
\]
Let's start with the operation $L_2 \leftarrow L_2 - 2L_1$: subtract twice the first equation from the second.
\[
\begin{cases}
x + y + 7z = -1 \\
\phantom{x} - 3y - 9z = -3 & L_2 \leftarrow L_2 - 2L_1\\
-x - 3y - 9z = -5
\end{cases}
\]
Then $L_3 \leftarrow L_3 + L_1$:
\[
\begin{cases}
x + y + 7z = -1 \\
\phantom{x} - 3y - 9z = -3 \\
\phantom{x} - 2y - 2z = -6 & L_3 \leftarrow L_3 + L_1
\end{cases}
\]
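Continue so that a coefficient 1 appears at the start of the second line; to do this, divide the line $L_2$ by $-3$:
\[
\begin{cases}
x + y + 7z = -1 \\
\phantom{x +{}} y + 3z = 1 & L_2 \leftarrow -\tfrac{1}{3} L_2\\
\phantom{x} - 2y - 2z = -6
\end{cases}
\]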
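And so we continue:
\[
\begin{cases}
x + y + 7z = -1 \\
y + 3z = 1 \\
4z = -4 & L_3 \leftarrow L_3 + 2L_2
\end{cases}
\iff
\begin{cases}
x + y + 7z = -1 \\
y + 3z = 1 \\
z = -1 & L_3 \leftarrow \tfrac{1}{4} L_3
\end{cases}
\]
\[
\iff
\begin{cases}
x + y + 7z = -1 \\
y = 4 & L_2 \leftarrow L_2 - 3L_3\\
z = -1
\end{cases}
\iff
\begin{cases}
x + y = 6 & L_1 \leftarrow L_1 - 7L_3\\
y = 4 \\
z = -1
\end{cases}
\]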
We thus obtain x = 2, y = 4 and z = −1 and the only solution to the system is (2, 4, −1).
The method used for this example is repeated and generalised in the next paragraph.
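The sequence of operations in this example can be replayed mechanically. The following is a minimal sketch, not part of the original notes: the helper names `scale` and `add_multiple` are hypothetical, exact fractions are used to avoid rounding, and a last step $L_1 \leftarrow L_1 - L_2$ is added to isolate x, which the text leaves implicit.

```python
from fractions import Fraction

def scale(rows, i, c):
    """L_i <- c * L_i (c must be non-zero)."""
    rows[i] = [c * v for v in rows[i]]

def add_multiple(rows, i, j, c):
    """L_i <- L_i + c * L_j."""
    rows[i] = [a + c * b for a, b in zip(rows[i], rows[j])]

# Augmented rows [x, y, z | right-hand side] of the example system.
rows = [[Fraction(v) for v in r] for r in
        [[1, 1, 7, -1], [2, -1, 5, -5], [-1, -3, -9, -5]]]

add_multiple(rows, 1, 0, -2)       # L2 <- L2 - 2 L1
add_multiple(rows, 2, 0, 1)        # L3 <- L3 + L1
scale(rows, 1, Fraction(-1, 3))    # L2 <- -(1/3) L2
add_multiple(rows, 2, 1, 2)        # L3 <- L3 + 2 L2
scale(rows, 2, Fraction(1, 4))     # L3 <- (1/4) L3
add_multiple(rows, 1, 2, -3)       # L2 <- L2 - 3 L3
add_multiple(rows, 0, 2, -7)       # L1 <- L1 - 7 L3
add_multiple(rows, 0, 1, -1)       # L1 <- L1 - L2 (implicit in the text)

# The last column now holds the values of x, y and z.
solution = [int(r[-1]) for r in rows]
```

Each call modifies one row only, so every intermediate state of `rows` is an equivalent system in the sense of Definition 1.1.6.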
In each non-zero row of a matrix, the leftmost non-zero entry is called the leading coefficient (or pivot) of that row. So if two leading coefficients are in the same column, then a row operation of type 3 could be used to make one of those coefficients zero. Then by using the row swapping operation, one can always order the rows so that for every non-zero row, the leading coefficient is to the right of the leading coefficient of the row above. If this is the case, then the matrix is said to be in row echelon form. So the lower left part of the matrix contains only zeros, and all of the zero rows are below the non-zero rows. The word echelon is used here because one can roughly think of the rows being ranked by their size, with the largest being at the top and the smallest being at the bottom.
So, as a definition:
Example 1
For example, the following matrix is in row echelon form, and its leading coefficients are 2 (first row) and 3 (second row):
\[
\begin{bmatrix}
0 & 2 & 1 & -1\\
0 & 0 & 3 & 1\\
0 & 0 & 0 & 0
\end{bmatrix}.
\]
It is in echelon form because the zero row is at the bottom, and the leading coefficient of the second row (in the third column) is to the right of the leading coefficient of the first row (in the second column).
A matrix is said to be in reduced row echelon form if furthermore all of the leading coefficients are equal to 1 (which can be achieved by using the elementary row operation of type 2), and in every column containing a leading coefficient, all of the other entries in that column are zero.
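Example 2
The system
\[
\begin{cases}
2x_1 + 3x_2 + 2x_3 - x_4 = 5\\
\phantom{2x_1} - x_2 - 2x_3 = 4\\
\phantom{2x_1 + 3x_2 + 2x_3 +{}} 3x_4 = 1
\end{cases}
\]
is echelon (but not reduced), whereas
\[
\begin{cases}
2x_1 + 3x_2 + 2x_3 - x_4 = 5\\
\phantom{2x_1 + 3x_2} - 2x_3 = 4\\
\phantom{2x_1 + 3x_2 +{}} x_3 + x_4 = 1
\end{cases}
\]
is not echelon (the last line starts with the same variable as the line above).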
It turns out that linear systems in reduced echelon form are particularly easy to solve.
Example. The following linear system with 3 equations and 4 unknowns is echelon and reduced.
\[
\begin{cases}
x_1 + 2x_3 = 25\\
x_2 - 2x_3 = 16\\
x_4 = 1
\end{cases}
\]
Solving each equation for its leading unknown gives
\[
\begin{cases}
x_1 = 25 - 2x_3\\
x_2 = 16 + 2x_3\\
x_4 = 1.
\end{cases}
\]
In other words, for any real value of $x_3$, the values of $x_1$, $x_2$ and $x_4$ calculated above provide a solution to the system, and we have thus obtained them all. We can therefore fully describe the set of solutions:
\[ S = \{ (25 - 2x_3,\ 16 + 2x_3,\ x_3,\ 1) \mid x_3 \in \mathbb{R} \}. \]
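As a quick sanity check of this description of S, one can substitute a few members of the family back into the three equations. The sketch below is only illustrative: the function names `residuals` and `solution` and the sample values of $x_3$ are arbitrary choices, not part of the original text.

```python
def residuals(x1, x2, x3, x4):
    """Left-hand side minus right-hand side for each of the three equations."""
    return (x1 + 2 * x3 - 25, x2 - 2 * x3 - 16, x4 - 1)

def solution(t):
    """Member of S obtained by giving the free unknown x3 the value t."""
    return (25 - 2 * t, 16 + 2 * t, t, 1)

# Every residual should vanish, whatever value is chosen for x3.
checks = [residuals(*solution(t)) for t in (-3, 0, 1, 10)]
```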
equation. Gaussian elimination is also known as the row reduction technique. In this method, for a system of linear equations with n unknowns, a matrix with n rows and n+1 columns is formed. This matrix is also known as the augmented matrix. The n × (n+1) matrix is then transformed into an upper triangular matrix by row operations. Finally, the result is obtained by back substitution.
1. Start
6. Display Result.
7. Stop
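The procedure described above (augmented matrix, reduction to upper triangular form, back substitution) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the algorithm of the notes: the function name `gauss_solve` is hypothetical, the matrix is assumed square and non-singular, and a row is swapped only when a pivot is exactly zero (production code would normally use partial pivoting instead).

```python
def gauss_solve(A, b):
    """Solve A x = b for a square, non-singular A given as lists of numbers."""
    n = len(A)
    # Build the n x (n+1) augmented matrix [A | b] with float entries.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        # If the pivot is zero, swap with a lower row that has a non-zero entry.
        if M[k][k] == 0.0:
            for r in range(k + 1, n):
                if M[r][k] != 0.0:
                    M[k], M[r] = M[r], M[k]
                    break
            else:
                raise ValueError("matrix is singular")
        # Eliminate the k-th unknown from every row below the pivot row.
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            M[r] = [vr - factor * vk for vr, vk in zip(M[r], M[k])]
    # Back substitution on the resulting upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# The 3 x 3 system solved by hand earlier in this section.
x = gauss_solve([[1, 1, 7], [2, -1, 5], [-1, -3, -9]], [-1, -5, -5])
```

Applied to the system solved by hand earlier, the sketch should return values close to (2, 4, −1), up to floating-point rounding.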
Suppose the goal is to find and describe the set of solutions to the following system of linear equations:
\[ 2x + y - z = 8 \quad (L_1) \tag{1.2} \]
The table below is the row reduction process applied simultaneously to the system of equations and its associated augmented matrix. In practice, one does not usually deal with the systems in terms of equations, but instead makes use of the augmented matrix, which is more suitable for computer manipulations. The row reduction procedure may be summarized as follows: eliminate x from all equations below $L_1$, and then eliminate y from all equations below $L_2$. This will put the system into triangular form. Then, using back-substitution, each unknown can be solved for.
Example 2. Solve:
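\[
A = \begin{bmatrix}
2 & 4 & -2 & -2\\
1 & 2 & 4 & -3\\
-3 & -3 & 8 & -2\\
-1 & 1 & 6 & -3
\end{bmatrix},
\qquad
b = \begin{bmatrix} -4\\ 5\\ 7\\ 7 \end{bmatrix}
\]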
Note that there is nothing wrong with this system: A is full rank. The solution exists and is unique.
Subtract 1/2 times the first row from the second row, add 3/2 times the first row to the third row, and add 1/2 times the first row to the fourth row. The result of these operations is:
The next stage of Gaussian elimination will not work because there is a zero in the pivot location, $\tilde{a}_{22}$.
Another zero has appeared in the pivot position. Swap row 3 and row 4.
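For reference, the stages described above can be worked out on the augmented matrix $[A \mid b]$ of this example. The intermediate matrices below are recomputed here and are not taken from the original notes. In particular, the choice of swapping rows 2 and 4 at the first zero pivot is an assumption, made because it reproduces the second zero pivot mentioned in the text.

```latex
% Stage 1: R2 <- R2 - (1/2) R1,  R3 <- R3 + (3/2) R1,  R4 <- R4 + (1/2) R1
\[
\left[\begin{array}{rrrr|r}
2 & 4 & -2 & -2 & -4\\
0 & 0 & 5 & -2 & 7\\
0 & 3 & 5 & -5 & 1\\
0 & 3 & 5 & -4 & 5
\end{array}\right]
\]
% Zero pivot in position (2,2): swap rows 2 and 4 (assumed choice), then R3 <- R3 - R2
\[
\left[\begin{array}{rrrr|r}
2 & 4 & -2 & -2 & -4\\
0 & 3 & 5 & -4 & 5\\
0 & 0 & 0 & -1 & -4\\
0 & 0 & 5 & -2 & 7
\end{array}\right]
\]
% Zero pivot in position (3,3): swap rows 3 and 4
\[
\left[\begin{array}{rrrr|r}
2 & 4 & -2 & -2 & -4\\
0 & 3 & 5 & -4 & 5\\
0 & 0 & 5 & -2 & 7\\
0 & 0 & 0 & -1 & -4
\end{array}\right]
\]
% Back substitution
\[
x_4 = 4, \qquad
x_3 = \frac{7 + 2 \cdot 4}{5} = 3, \qquad
x_2 = \frac{5 - 5 \cdot 3 + 4 \cdot 4}{3} = 2, \qquad
x_1 = \frac{-4 - 4 \cdot 2 + 2 \cdot 3 + 2 \cdot 4}{2} = 1 .
\]
```

Substituting $(1, 2, 3, 4)$ into the four original equations confirms the result.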
Example 3
b and c for which there are no solutions, one solution or more than one solution. To
An analysis of the last row tells us everything: if $b - a \neq 0$, then there is exactly one solution. If $b - a = 0$ and $c - 3 \neq 0$, then there are no solutions. Otherwise (when $b = a$ and $c = 3$) there is more than one solution (in fact, infinitely many).
Exercise
Ax = b
where the n×n matrix A has a nonzero determinant, and the vector $x = (x_1, \dots, x_n)^T$ is the column vector of the variables. Then the theorem states that in this case the system has a unique solution, whose individual values for the unknowns are given by:
\[ x_i = \frac{\det(A_i)}{\det(A)}, \qquad i = 1, \dots, n \]
where $A_i$ is the matrix formed by replacing the i-th column of A by the column vector b.
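The formula $x_i = \det(A_i)/\det(A)$ translates almost literally into code. The sketch below is an illustration only: the function names `det` and `cramer` are hypothetical, the determinant is computed by cofactor expansion (adequate for small matrices but far too slow for large ones), and exact fractions are used so that integer systems give exact answers.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor obtained by deleting row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def cramer(A, b):
    """Solve A x = b by Cramer's rule; requires det(A) != 0."""
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0: Cramer's rule does not apply")
    xs = []
    for i in range(len(A)):
        # A_i: copy of A whose i-th column is replaced by b.
        A_i = [row[:i] + [b_k] + row[i + 1:] for row, b_k in zip(A, b)]
        xs.append(Fraction(det(A_i)) / Fraction(d))
    return xs

# Small illustrative system: x + 2y = 5, 3x + 4y = 11.
xy = cramer([[1, 2], [3, 4]], [5, 11])
```

For the illustrative system in the last line, the determinants are −2, −2 and −4, so the result is x = 1, y = 2.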
AX = B
where the n×n matrix A has a nonzero determinant, and X, B are n×m matrices. Given sequences $1 \le i_1 < i_2 < \cdots < i_k \le n$ and $1 \le j_1 < j_2 < \cdots < j_k \le m$, let $X_{I,J}$ be the k×k submatrix of X with rows in $I := (i_1, \dots, i_k)$ and columns in $J := (j_1, \dots, j_k)$. Let $A_B(I, J)$
be the n×n matrix formed by replacing the $i_s$-th column of A by the $j_s$-th column of B, for all $s = 1, \dots, k$. Then
\[ \det X_{I,J} = \frac{\det(A_B(I, J))}{\det(A)}. \]
In the case k = 1, this reduces to the normal Cramer's rule. The rule holds for systems of equations with coefficients and unknowns in any field, not just in the real numbers.
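Consider the linear system:
\[
\begin{cases}
a_1 x + b_1 y = c_1\\
a_2 x + b_2 y = c_2
\end{cases}
\]
which in matrix form is:
\[
\begin{bmatrix} a_1 & b_1\\ a_2 & b_2 \end{bmatrix}
\begin{bmatrix} x\\ y \end{bmatrix}
=
\begin{bmatrix} c_1\\ c_2 \end{bmatrix}.
\]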
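Assume $a_1 b_2 - b_1 a_2$ is nonzero. Then, with the help of determinants, x and y can be found:
\[
x = \frac{\begin{vmatrix} c_1 & b_1\\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1\\ a_2 & b_2 \end{vmatrix}}
  = \frac{c_1 b_2 - b_1 c_2}{a_1 b_2 - b_1 a_2},
\qquad
y = \frac{\begin{vmatrix} a_1 & c_1\\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1\\ a_2 & b_2 \end{vmatrix}}
  = \frac{a_1 c_2 - c_1 a_2}{a_1 b_2 - b_1 a_2}.
\tag{1.5}
\]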
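Similarly, for a system of three equations $a_i x + b_i y + c_i z = d_i$, $i = 1, 2, 3$, whose coefficient determinant is nonzero:
\[
x = \frac{\begin{vmatrix} d_1 & b_1 & c_1\\ d_2 & b_2 & c_2\\ d_3 & b_3 & c_3 \end{vmatrix}}
         {\begin{vmatrix} a_1 & b_1 & c_1\\ a_2 & b_2 & c_2\\ a_3 & b_3 & c_3 \end{vmatrix}},
\]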
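\[
y = \frac{\begin{vmatrix} a_1 & d_1 & c_1\\ a_2 & d_2 & c_2\\ a_3 & d_3 & c_3 \end{vmatrix}}
         {\begin{vmatrix} a_1 & b_1 & c_1\\ a_2 & b_2 & c_2\\ a_3 & b_3 & c_3 \end{vmatrix}},
\]
and
\[
z = \frac{\begin{vmatrix} a_1 & b_1 & d_1\\ a_2 & b_2 & d_2\\ a_3 & b_3 & d_3 \end{vmatrix}}
         {\begin{vmatrix} a_1 & b_1 & c_1\\ a_2 & b_2 & c_2\\ a_3 & b_3 & c_3 \end{vmatrix}}
\]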
Examples
Example 1: Using Cramer's Rule to Solve a 2×2 System
12x + 3y = 15
2x − 3y = 13
Solution
Solve for x
Solve for y
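The two determinant computations can be written out as follows. The notation $D$, $D_x$, $D_y$ for the three determinants is introduced here for convenience and is not used elsewhere in these notes.

```latex
\[
D = \begin{vmatrix} 12 & 3\\ 2 & -3 \end{vmatrix} = 12 \cdot (-3) - 3 \cdot 2 = -42
\]
\[
D_x = \begin{vmatrix} 15 & 3\\ 13 & -3 \end{vmatrix} = 15 \cdot (-3) - 3 \cdot 13 = -84,
\qquad
x = \frac{D_x}{D} = \frac{-84}{-42} = 2
\]
\[
D_y = \begin{vmatrix} 12 & 15\\ 2 & 13 \end{vmatrix} = 12 \cdot 13 - 15 \cdot 2 = 126,
\qquad
y = \frac{D_y}{D} = \frac{126}{-42} = -3
\]
```

Check: $12 \cdot 2 + 3 \cdot (-3) = 15$ and $2 \cdot 2 - 3 \cdot (-3) = 13$.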
1.2. THEORY OF NON LINEAR EQUATIONS 13
Find the solution to the given 3×3 system using Cramer's Rule.
x+y−z =6
3x − 2y + z = −5
x + 3y − 2z = 14
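A worked version of this 3×3 example is sketched below. The determinants are recomputed here by expansion along the first row, and the names $D$, $D_x$, $D_y$, $D_z$ are introduced only for this illustration.

```latex
\[
D = \begin{vmatrix} 1 & 1 & -1\\ 3 & -2 & 1\\ 1 & 3 & -2 \end{vmatrix}
  = 1\,(4 - 3) - 1\,(-6 - 1) - 1\,(9 + 2) = -3
\]
\[
D_x = \begin{vmatrix} 6 & 1 & -1\\ -5 & -2 & 1\\ 14 & 3 & -2 \end{vmatrix}
    = 6\,(4 - 3) - 1\,(10 - 14) - 1\,(-15 + 28) = -3
\]
\[
D_y = \begin{vmatrix} 1 & 6 & -1\\ 3 & -5 & 1\\ 1 & 14 & -2 \end{vmatrix}
    = 1\,(10 - 14) - 6\,(-6 - 1) - 1\,(42 + 5) = -9
\]
\[
D_z = \begin{vmatrix} 1 & 1 & 6\\ 3 & -2 & -5\\ 1 & 3 & 14 \end{vmatrix}
    = 1\,(-28 + 15) - 1\,(42 + 5) + 6\,(9 + 2) = 6
\]
\[
x = \frac{D_x}{D} = 1, \qquad y = \frac{D_y}{D} = 3, \qquad z = \frac{D_z}{D} = -2
\]
```

Check: $1 + 3 + 2 = 6$, $3 - 6 - 2 = -5$ and $1 + 9 + 4 = 14$.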
not assumed to be linear, it could have any number of solutions, from 0 to ∞. In one dimension,
if f(x) is continuous, we can make use of the Intermediate Value Theorem (IVT) to bracket a root; i.e., we can find numbers a and b such that f(a) and f(b) have different signs. Then the IVT tells us that there is at least one magical value $x^* \in (a, b)$ such that $f(x^*) = 0$. The number $x^*$ is called a root or zero of f(x). Solving nonlinear equations is also called root-finding.
To bisect means to divide in half. Once we have an interval (a, b) in which we know $x^*$ lies, we evaluate f at the midpoint $(a+b)/2$. If $f((a+b)/2) = 0$, the midpoint is a root and we are done. Otherwise, the sign of $f((a+b)/2)$ will either agree with the sign of f(a), or it will agree with the sign of f(b). Suppose the signs of $f((a+b)/2)$ and f(a) agree. Then we are no longer guaranteed that $x^* \in (a, (a+b)/2)$, but we are still guaranteed that $x^* \in ((a+b)/2, b)$. So we have narrowed down the region where $x^*$ lies. Moreover, we can repeat the process by replacing a (or b, as applicable) with $(a+b)/2$, until the interval is as small as desired.
Algorithm
Objective
Given a non-linear function f(x), find the solution x0 to the equation f(x) = 0 in an interval [a, b] for which f(a)f(b) < 0.
Inputs
a and b (the endpoints of the initial interval), ε (the desired precision), N0 (the maximum number of iterations).
Output
go to 9)
4. Set c = (a + b)/2.
5. If f(c) = 0 or (b − a)/2 < ε, then print c and stop.
6. n = n + 1
8. Print: after N0 iterations, the approximation obtained is x0 = c and the maximum error is (b − a)/2.
9. end.
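The algorithm above can be sketched in a few lines. This is a minimal illustration, not the author's program: the function name `bisection` and its return convention are hypothetical, the interval update follows the sign rule described earlier in this section, and the test function $g(x) = \exp(5x) - 1.5$ with ε = 10⁻³ is an assumed setting chosen to match the worked example that follows.

```python
import math

def bisection(f, a, b, eps, n_max):
    """Bisection on [a, b]; returns (approximation, error bound, iterations)."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for n in range(1, n_max + 1):
        c = (a + b) / 2                       # midpoint of the current interval
        if f(c) == 0 or (b - a) / 2 < eps:    # stopping test
            return c, (b - a) / 2, n
        if f(a) * f(c) > 0:                   # f(a), f(c) agree: root is in (c, b)
            a = c
        else:                                 # otherwise the root is in (a, c)
            b = c
    return c, (b - a) / 2, n_max

def g(x):
    # f(x) = 0.5 with f(x) = 2 - exp(5x) is equivalent to g(x) = 0.
    return math.exp(5 * x) - 1.5

root, err, n = bisection(g, 0.0, 0.1, 1e-3, 50)
```

With these assumed settings the loop stops at the seventh midpoint, 0.08046875. The exact root is ln(1.5)/5 ≈ 0.08109, which lies within the returned error bound.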
Notes
finding a zero of f(x), where f(x) is a real-valued function of a single real variable.
2. We assume that we know an interval [a,b] on which a continuous function f(x) changes
sign.
However, there is likely no floating-point number (or even rational number!) where f(x) is exactly 0.
So our goal is:
3. If f(x) is continuous and we have a starting interval on which f(x) changes sign, then
that bracket x∗ .
Example
f (x) = 2 − exp(5x)
Find the value x0 that corresponds to f (x) = 0.5 using the bisection method
Solution
On the other hand, we see that $g'(x) = 5\exp(5x)$ is strictly positive everywhere on the set of real numbers, so the interval in which we can look for the solution x0 is [0, 0.1], so a = 0 and b = 0.1.
The results of the bisection algorithm are shown in the table below :
The solution using the bisection method is x0 = c7 = 0.08046875. This result converges with the
modification to the bisection method. In step 4 of the bisection algorithm, instead of choosing $c_i$
as the midpoint of the interval $[a_i, b_i]$, we will take as the value of $c_i$ the point where the straight line joining the points $(a_i, f(a_i))$ and $(b_i, f(b_i))$ intersects the x-axis. The relationship to be used is
\[ c_i = b_i - \frac{f(b_i)\,[b_i - a_i]}{f(b_i) - f(a_i)} \]
Updating the endpoints of the interval [a, b] is done in exactly the same way as with the bisection method.
Algorithm
Objective
Given a non-linear function f(x), find the solution x0 to the equation f(x) = 0 in an interval [a, b] for which f(a)f(b) < 0.
Inputs
a and b (the endpoints of the initial interval), ε (the desired precision), N0 (the maximum number of iterations).
Output
go to 8)
4. If f(c) = 0 or (b − a)/2 < ε, then print c and stop.
5. n = n + 1
7. Print: after N0 iterations, the approximation obtained is x0 = c and the maximum error is (b − a)/2.
8. end.
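A possible implementation of the false position method is sketched below. It is not a transcription of the algorithm above. The function name `false_position` is hypothetical. The stopping test is also deliberately swapped: because one bound usually stays fixed, the half-width (b − a)/2 need not shrink to zero, so this sketch stops when two successive values of c differ by less than ε. The test function and tolerance are assumptions.

```python
import math

def false_position(f, a, b, eps, n_max):
    """False position on [a, b]; returns (approximation, iterations)."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c_old = a
    c = a
    for n in range(1, n_max + 1):
        # Intersection of the chord through (a, f(a)) and (b, f(b)) with the x-axis.
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if fc == 0 or abs(c - c_old) < eps:   # assumed stopping test (see lead-in)
            return c, n
        if fa * fc > 0:                       # same update rule as for bisection
            a, fa = c, fc
        else:
            b, fb = c, fc
        c_old = c
    return c, n_max

def g(x):
    return math.exp(5 * x) - 1.5

root, n = false_position(g, 0.0, 0.1, 1e-6, 100)
```

On this convex test function the right endpoint b = 0.1 stays fixed throughout, which is why the half-width criterion is replaced here.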
Note
1. If the curvature of the function does not change sign on the interval [a, b], i.e. the curve remains convex throughout the interval or remains concave throughout the interval, then the interval update step of the false position algorithm will always keep one bound, a or b, fixed along the iterations. If the function is concave on [a, b] ($f''(x) < 0$) and f(a) < 0, then a is the fixed bound; if f(a) > 0, then b is the fixed bound. The fixed bound will be the bound whose image has a negative sign. If the function is convex on [a, b] ($f''(x) > 0$) and f(a) < 0, then b is the fixed bound; if f(a) > 0, then the fixed bound will be a. The fixed bound will be the bound whose image has a positive sign. In conclusion, if the curvature of f does not change sign on the interval [a, b], then the fixed bound will be the bound whose image has the same sign as the curvature of f on the interval.
Example
Let's take the same example from the previous section, with $g(x) = \exp(5x) - 1.5$ and [0, 0.1] as the starting interval, and apply the false position method. The results converged with the
So we can see that the false position method needed at least half as many iterations to achieve
Algorithm
Objective
Given a continuous non-linear function f(x), find the solution x0 to the equation f(x) = 0.
Inputs
Output
3. Set x = x0 − f(x0)/f′(x0)
5. Set n = n + 1
6. Set x0 = x
8. end.
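The surviving steps of the algorithm suggest the following sketch of Newton's method. The function name `newton` is hypothetical, and the stopping test |x − x0| < ε and the iteration cap are assumptions, since the corresponding steps are not shown above. The test function is the $g(x) = \exp(5x) - 1.5$ of the bisection example.

```python
import math

def newton(f, df, x0, eps, n_max):
    """Newton iteration from x0; returns (approximation, iterations)."""
    for n in range(1, n_max + 1):
        d = df(x0)
        if d == 0:
            raise ZeroDivisionError("f'(x0) = 0: choose another initial guess")
        x = x0 - f(x0) / d          # Newton update
        if abs(x - x0) < eps:       # assumed stopping test
            return x, n
        x0 = x
    return x0, n_max

def g(x):
    return math.exp(5 * x) - 1.5

def dg(x):
    return 5 * math.exp(5 * x)

first, _ = newton(g, dg, 0.0, 1e-12, 1)     # one Newton step from x0 = 0
root, n = newton(g, dg, 0.0, 1e-12, 50)     # iterate until the step is tiny
```

Starting from 0, the first iterate is 0.1 and the iteration settles near ln(1.5)/5 ≈ 0.08109 after a handful of steps.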
Notes
In fact, the generalization of the above description of Newton's method is the only viable general-
But, as a general-purpose algorithm for finding zeros of functions, it has 3 serious drawbacks.
If f(x) is not smooth, then f′(x) does not exist, and Newton's method is not defined.
Example
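Consider the following continuous function:
f(x) = 2 − exp(5x)
Find the value x0 that corresponds to f(x) = 0.5 using the Newton method.
Writing the equation as $g(x) = \exp(5x) - 1.5 = 0$, the Newton iteration is
\[ x_{i+1} = x_i - \frac{\exp(5x_i) - 1.5}{5\exp(5x_i)} \]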
The following table shows the results with the initial solution x0 = 0.
Iteration number Approximation xi
0 0
1 0.1
2 0.0820
3 0.0811
4 0.08109
first derivative of f(x). The second disadvantage is that the method breaks down if the initial solution is chosen at a point where the first derivative is zero. This second disadvantage can be resolved by choosing another initial solution. The first disadvantage suggests replacing, in the Newton method, the first derivative by the approximation
\[ f'(x_i) \approx \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}} \]
Otherwise
Algorithm
Objective
Given a continuous non-linear function f(x), find the solution x0 to the equation f(x) = 0.
Inputs
Output
3. Set x = x1 − f(x1)(x1 − x0)/(f(x1) − f(x0))
5. Set n = n + 1
8. end.
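A compact sketch of the secant iteration follows. As with the other sketches, the function name `secant`, the stopping test |x − x1| < ε and the test problem are assumptions added for illustration. Only the update formula is taken from the algorithm above.

```python
import math

def secant(f, x0, x1, eps, n_max):
    """Secant iteration from x0, x1; returns (approximation, iterations)."""
    for n in range(1, n_max + 1):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:
            raise ZeroDivisionError("f(x1) = f(x0): the secant line is horizontal")
        # Newton update with f'(x1) replaced by the finite-difference slope.
        x = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x - x1) < eps:       # assumed stopping test
            return x, n
        x0, x1 = x1, x              # keep the two most recent iterates
    return x1, n_max

def g(x):
    return math.exp(5 * x) - 1.5

root, n = secant(g, 0.0, 0.1, 1e-10, 50)
```

No derivative is needed, at the price of requiring two starting values instead of one.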
It is the most popular and powerful method for solving systems of nonlinear equations.
Before discussing it, we will first need to introduce the concept of the Jacobian matrix (also
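Let
\[
f(x) = \begin{bmatrix}
f_1(x_1, x_2, x_3, \dots, x_m)\\
f_2(x_1, x_2, x_3, \dots, x_m)\\
f_3(x_1, x_2, x_3, \dots, x_m)\\
\vdots\\
f_m(x_1, x_2, x_3, \dots, x_m)
\end{bmatrix}
\]
Then the Jacobian matrix $J_f(x) = \partial f(x)/\partial x$ is an m×m matrix whose (i, j) element is given by
\[ [J_f(x)]_{ij} = \frac{\partial f_i}{\partial x_j} \]
e.g., when m = 2,
\[
J_f(x) = \begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2}\\[2ex]
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2}
\end{bmatrix}
\]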
Let
\[
f(x) = \begin{bmatrix}
x_1^2 - x_2 + \sigma\\
-x_1 + x_2^2 + \sigma
\end{bmatrix}
\]
1.3. THEORY OF NON LINEAR SYSTEMS 23
Then,
\[
J_f(x) = \begin{bmatrix}
2x_1 & -1\\
-1 & 2x_2
\end{bmatrix}
\]
The Newton iteration for systems of nonlinear equations takes the form: solve the linear system
\[ J_f(x_n)\, s_n = -f(x_n) \]
for the correction $s_n$, then update
\[ x_{n+1} = x_n + s_n \]
Newton's method reduces the solution of a nonlinear system to the solution of a sequence of
linear equations.
Example
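Solve the nonlinear system
\[
f(x) = \begin{bmatrix}
x_1 + 2x_2 - 2\\
x_1^2 + 4x_2^2 - 4
\end{bmatrix} = 0
\]
First compute the Jacobian:
\[
J_f(x) = \begin{bmatrix}
1 & 2\\
2x_1 & 8x_2
\end{bmatrix}
\]
Then, taking $x_0 = [1, 2]^T$ as the starting point,
\[
f(x_0) = \begin{bmatrix} 3\\ 13 \end{bmatrix},
\qquad
J_f(x_0) = \begin{bmatrix} 1 & 2\\ 2 & 16 \end{bmatrix}
\]
Solve for the correction $s_0$ from
\[
\begin{bmatrix} 1 & 2\\ 2 & 16 \end{bmatrix} s_0 = -\begin{bmatrix} 3\\ 13 \end{bmatrix}
\quad\Rightarrow\quad
s_0 = \begin{bmatrix} -1.83\\ -0.58 \end{bmatrix}
\]
etc.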
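The remaining iterations of this example can be carried out with a short program. The sketch below is illustrative only: the names `jacobian`, `solve2` and `newton_system` are hypothetical, the starting point (1, 2) is inferred from the values of $f(x_0)$ and $J_f(x_0)$ given above, the 2×2 linear system is solved with Cramer's rule for simplicity, and the tolerance and iteration cap are assumptions.

```python
def f(x):
    x1, x2 = x
    return [x1 + 2 * x2 - 2, x1 ** 2 + 4 * x2 ** 2 - 4]

def jacobian(x):
    x1, x2 = x
    return [[1.0, 2.0], [2 * x1, 8 * x2]]

def solve2(J, r):
    """Solve the 2 x 2 linear system J s = r by Cramer's rule."""
    (a, b), (c, d) = J
    det = a * d - b * c
    return [(r[0] * d - b * r[1]) / det, (a * r[1] - r[0] * c) / det]

def newton_system(x, tol=1e-12, n_max=50):
    """Newton iteration for the 2 x 2 system f(x) = 0 starting from x."""
    for _ in range(n_max):
        fx = f(x)
        s = solve2(jacobian(x), [-fx[0], -fx[1]])   # J_f(x_n) s_n = -f(x_n)
        x = [x[0] + s[0], x[1] + s[1]]              # x_{n+1} = x_n + s_n
        if max(abs(s[0]), abs(s[1])) < tol:         # assumed stopping test
            break
    return x

s0 = solve2(jacobian([1.0, 2.0]), [-3.0, -13.0])    # first correction
root = newton_system([1.0, 2.0])
```

The first correction is (−22/12, −7/12) ≈ (−1.83, −0.58), and from this starting point the iterates approach the solution (0, 1). The system also has the solution (2, 0), which other starting points may reach.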