Lecture Week04 PDF
Matrix Norms
Now we turn to associating a number to each matrix. We could choose our norms analogously to the way we did for vector norms; e.g., we could associate the number max_{i,j} |a_ij|. However, this is actually not very useful because remember our goal is to study linear systems A~x = ~b.
The general definition of a matrix norm is a map from all m × n matrices to the real numbers which satisfies certain properties. However, the most useful matrix norms are those that are generated by a vector norm; again the reason for this is that we want to solve A~x = ~b, so if we take the norm of both sides of the equation the right hand side is a vector norm, and on the left hand side we have the norm of a matrix times a vector.
We will define an induced matrix norm as the largest amount any vector is magnified when multiplied by that matrix, i.e.,

    ‖A‖ = max_{~x ∈ IR^n, ~x ≠ ~0} ‖A~x‖ / ‖~x‖
Note that all norms on the right hand side are vector norms. We will denote a vector and
matrix norm using the same notation; the difference should be clear from the argument.
We say that the vector norm on the right hand side induces the matrix norm on the left.
Note that sometimes the definition is written in an equivalent way as

    ‖A‖ = sup_{~x ∈ IR^n, ~x ≠ ~0} ‖A~x‖ / ‖~x‖
    ‖A‖∞ = max_{1≤i≤m} Σ_{j=1}^n |a_ij|      (max absolute row sum)

    ‖A‖1 = max_{1≤j≤n} Σ_{i=1}^m |a_ij|      (max absolute column sum)
Example  Determine ‖A‖∞ and ‖A‖1 where

    A = [  1    2   −4 ]
        [  3    0   12 ]
        [ −20  −1    2 ]

We have ‖A‖∞ = max{1+2+4, 3+0+12, 20+1+2} = max{7, 15, 23} = 23 and ‖A‖1 = max{1+3+20, 2+0+1, 4+12+2} = max{24, 3, 18} = 24.
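These hand computations can be checked with NumPy, whose `np.linalg.norm` returns the induced matrix norms for `ord=1` (max absolute column sum) and `ord=np.inf` (max absolute row sum):

```python
import numpy as np

A = np.array([[1, 2, -4],
              [3, 0, 12],
              [-20, -1, 2]])

assert np.linalg.norm(A, np.inf) == 23   # max absolute row sum
assert np.linalg.norm(A, 1) == 24        # max absolute column sum
```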
Proof  We will prove that ‖A‖∞ is the maximum row sum (in absolute value). We will do this by proving that

    ‖A‖∞ ≤ max_{1≤i≤m} Σ_{j=1}^n |a_ij|    and then showing    ‖A‖∞ ≥ max_{1≤i≤m} Σ_{j=1}^n |a_ij| .
By definition

    ‖A‖∞ = max_{~x ∈ IR^n, ~x ≠ ~0} ‖A~x‖∞ / ‖~x‖∞ .

For any ~x ≠ ~0 we have

    ‖A~x‖∞ = max_i | Σ_{j=1}^n a_ij x_j | ≤ max_i Σ_{j=1}^n |a_ij| |x_j| ≤ [ max_i Σ_{j=1}^n |a_ij| ] ‖~x‖∞

and hence

    ‖A‖∞ = max_{~x ≠ ~0} ‖A~x‖∞ / ‖~x‖∞ ≤ max_i Σ_{j=1}^n |a_ij| .
For the second inequality, note that

    ‖A‖∞ ≥ ‖A~y‖∞ / ‖~y‖∞

for any nonzero ~y ∈ IR^n, because ‖A‖∞ is the maximum of this ratio over all such ~y.
So now we will choose a particular ~y, and we will construct it so that it has ‖~y‖∞ = 1. First let p be a row where A attains its maximum row sum (if there is more than one such row, take the first), i.e.,

    max_i Σ_{j=1}^n |a_ij| = Σ_{j=1}^n |a_pj| .
Now we will take the entries of ~y to be ±1 so its infinity norm is one. Specifically we choose

    y_j = 1 if a_pj ≥ 0,    y_j = −1 if a_pj < 0.

Defining ~y in this way means that a_pj y_j = |a_pj| for each j. Using this and the fact that ‖~y‖∞ = 1 we have

    ‖A‖∞ ≥ ‖A~y‖∞ / ‖~y‖∞ = max_i | Σ_{j=1}^n a_ij y_j | ≥ | Σ_{j=1}^n a_pj y_j | = Σ_{j=1}^n |a_pj|
but the last quantity on the right is just the maximum row sum and the proof is complete.
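The two halves of this proof can be illustrated numerically (an illustrative NumPy sketch, not part of the notes, using the example matrix from above): sampled ratios ‖A~x‖∞/‖~x‖∞ never exceed the maximum row sum, and the ±1 vector constructed in the proof attains it exactly.

```python
import numpy as np

A = np.array([[1., 2., -4.],
              [3., 0., 12.],
              [-20., -1., 2.]])

# The induced infinity norm equals the maximum absolute row sum (= 23 here).
norm_inf = np.abs(A).sum(axis=1).max()

# First half: sampled ratios ||Ax||_inf / ||x||_inf never exceed ||A||_inf.
rng = np.random.default_rng(1)
for _ in range(2000):
    x = rng.standard_normal(3)
    ratio = np.linalg.norm(A @ x, np.inf) / np.linalg.norm(x, np.inf)
    assert ratio <= norm_inf + 1e-12

# Second half: the vector y of +/-1 entries built in the proof attains it.
p = np.abs(A).sum(axis=1).argmax()      # row with the maximum row sum
y = np.where(A[p] >= 0, 1.0, -1.0)      # y_j = sign pattern of row p
assert np.linalg.norm(A @ y, np.inf) == norm_inf
```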
What are the properties that any matrix norm (and thus an induced norm) must satisfy? You will recognize most of them as being analogous to the properties of a vector norm.

(i) ‖A‖ ≥ 0, and ‖A‖ = 0 only if a_ij = 0 for all i, j

(ii) ‖kA‖ = |k| ‖A‖ for scalars k

(iii) ‖AB‖ ≤ ‖A‖ ‖B‖

(iv) ‖A + B‖ ≤ ‖A‖ + ‖B‖
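As a quick numerical spot-check (an illustration, not a proof), properties (iii) and (iv) can be verified for random matrices with NumPy's induced 1- and infinity-norms:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(100):
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    for p in (1, np.inf):                  # induced 1- and infinity-norms
        nA = np.linalg.norm(A, p)
        nB = np.linalg.norm(B, p)
        # (iii) submultiplicativity and (iv) triangle inequality
        assert np.linalg.norm(A @ B, p) <= nA * nB + 1e-12
        assert np.linalg.norm(A + B, p) <= nA + nB + 1e-12
```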
A very useful inequality is

    ‖A‖ = max_{~x ≠ ~0} ‖A~x‖ / ‖~x‖ ≥ ‖A~x‖ / ‖~x‖    ⇒    ‖A~x‖ ≤ ‖A‖ ‖~x‖

for any ~x ≠ ~0.
Suppose we solve A~x = ~b but the right hand side is perturbed, so that we actually solve A~y = ~b + δ~b; that is, we have perturbed the right hand side by a small amount δ~b. We assume that A is invertible, i.e., A⁻¹ exists. For simplicity, we have not perturbed the coefficient matrix A. What we want to see is how much ~y differs from ~x. Let's write ~y as ~x + δ~x, and so our change in the solution will be δ~x. The two systems are

    A~x = ~b,    A(~x + δ~x) = ~b + δ~b .
What we would like to get is an estimate for the relative change in the solution, i.e.,

    ‖δ~x‖ / ‖~x‖ ,

in terms of the relative change in ~b, where ‖·‖ denotes any vector norm and the matrix norm it induces. Subtracting these two equations gives

    A δ~x = δ~b    which implies    δ~x = A⁻¹ δ~b .
Now we take the (vector) norm of both sides of the equation and then use our favorite inequality above:

    ‖δ~x‖ = ‖A⁻¹ δ~b‖ ≤ ‖A⁻¹‖ ‖δ~b‖ .
Remember our goal is to get an estimate for the relative change in the solution, but so far we have a bound for the absolute change. What we need is a bound for the relative change. Because A~x = ~b we have

    ‖A~x‖ = ‖~b‖    ⇒    ‖~b‖ ≤ ‖A‖ ‖~x‖    ⇒    1/‖~b‖ ≥ 1/(‖A‖ ‖~x‖) .

Now we see that if we divide our previous result for ‖δ~x‖ by ‖A‖ ‖~x‖ > 0, we can use this result to introduce ‖~b‖ in the denominator. We have
    ‖δ~x‖ / (‖A‖ ‖~x‖) ≤ ‖A⁻¹‖ ‖δ~b‖ / (‖A‖ ‖~x‖) ≤ ‖A⁻¹‖ ‖δ~b‖ / ‖~b‖

and multiplying through by ‖A‖ gives

    ‖δ~x‖ / ‖~x‖ ≤ ‖A‖ ‖A⁻¹‖ · ‖δ~b‖ / ‖~b‖ .
If the quantity kAkkA−1 k is small, then this means small relative changes in ~b result in
small relative changes in the solution but if it is large, we could have a large relative change
in the solution.
The quantity ‖A‖ ‖A⁻¹‖ is called the condition number of A:

    K(A) ≡ ‖A‖ ‖A⁻¹‖ .
Note that the condition number depends on what norm you are using. We say that a
matrix is well-conditioned if K(A) is “small” and ill-conditioned otherwise.
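The bound derived above can be demonstrated numerically (an illustrative sketch; the nearly singular matrix here is my own choice for the experiment): the actual relative change in the solution never exceeds K(A) · ‖δ~b‖/‖~b‖.

```python
import numpy as np

# Compare the actual relative change in the solution of A x = b
# with the bound K(A) * ||db|| / ||b|| derived above.
A = np.array([[1.0, 2.0],
              [-0.998, -2.0]])       # nearly singular, so K(A) is large
b = np.array([1.0, 1.0])
db = np.array([1e-6, -1e-6])         # small perturbation of the right hand side

x = np.linalg.solve(A, b)
dx = np.linalg.solve(A, b + db) - x

rel_change = np.linalg.norm(dx, np.inf) / np.linalg.norm(x, np.inf)
bound = np.linalg.cond(A, np.inf) * np.linalg.norm(db, np.inf) / np.linalg.norm(b, np.inf)

assert rel_change <= bound * (1 + 1e-10)   # the derived inequality holds
```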
Example Find the condition number of the identity matrix using the infinity norm.
Clearly it is one because the inverse of the identity matrix is itself.
Example  Find the condition number for each of the following matrices using the infinity norm. Explain geometrically why you think this is happening.

    A1 = [ 1  2 ]        A2 = [  1       2 ]
         [ 4  3 ]             [ −0.998  −2 ]

First we need to find the inverse of each matrix and then take the norms. Note the following "trick" for taking the inverse of a 2 × 2 matrix:

    [ a  b ]⁻¹ = (1/(ad − bc)) [  d  −b ]
    [ c  d ]                   [ −c   a ]

This gives

    A1⁻¹ = −(1/5) [  3  −2 ]        A2⁻¹ = [  500     500 ]
                  [ −4   1 ]               [ −249.5  −250 ]
Now

    K∞(A1) = ‖A1‖∞ ‖A1⁻¹‖∞ = (7)(1) = 7,    K∞(A2) = ‖A2‖∞ ‖A2⁻¹‖∞ = (3)(1000) = 3000 .
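The worked results can be confirmed with NumPy's built-in `np.linalg.cond`, which with `p=np.inf` uses the induced infinity norm:

```python
import numpy as np

A1 = np.array([[1.0, 2.0], [4.0, 3.0]])
A2 = np.array([[1.0, 2.0], [-0.998, -2.0]])

# cond(A, np.inf) computes ||A||_inf * ||A^-1||_inf
assert np.isclose(np.linalg.cond(A1, np.inf), 7.0)
assert np.isclose(np.linalg.cond(A2, np.inf), 3000.0)
```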
    n    approximate K2(A)
    2    19
    3    524
    4    15,514
    5    476,607
    6    1.495 × 10^7
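This excerpt does not show which matrix family the table refers to, but the values match the 2-norm condition numbers of the n × n Hilbert matrix H_ij = 1/(i + j − 1); under that assumption they can be reproduced as follows.

```python
import numpy as np

def hilbert(n):
    """n x n Hilbert matrix H_ij = 1/(i + j - 1) (assumed matrix family)."""
    i, j = np.indices((n, n))      # zero-based indices, hence the +1 below
    return 1.0 / (i + j + 1)

# 2-norm condition numbers for n = 2, ..., 6
for n in range(2, 7):
    print(n, np.linalg.cond(hilbert(n), 2))
```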
Vector (or Linear) Spaces
What we want to do now is take the concept of vectors and the properties of Euclidean
space IRn and generalize them to a collection of objects with two operations defined, ad-
dition and scalar multiplication. We want to define these operations so that the usual
properties hold, i.e., x + y = y + x, k(x + y) = kx + ky for objects other than vectors, such
as matrices, continuous functions, etc. We want to be able to add two elements of our set
and get another element of the set.
Definition A vector or linear space V is a set of objects, which we will call vectors, for
which addition and scalar multiplication are defined and satisfy the following properties.
(i) x + y = y + x
(ii) x + (y + z) = (x + y) + z
(iii) there exists a zero element 0 ∈ V such that x + 0 = 0 + x = x
(iv) for each x ∈ V there exists −x ∈ V such that x + (−x) = 0
(v) 1x = x
(vi) (α + β)x = αx + βx, for scalars α, β
(vii) α(x + y) = αx + αy
(viii) (αβ)x = α(βx)
Example IRn is a vector space with the usual operation of addition and scalar multipli-
cation.
Example The set of all m × n matrices with the usual definition of addition and scalar
multiplication forms a vector space.
Example  All polynomials of degree less than or equal to two on [−1, 1] form a vector space with the usual definition of addition and scalar multiplication. Here p(x) = a_0 + a_1 x + a_2 x^2.
Example  A line through the origin is a subspace of IR2 because if we add two points on the line, the result is still a point on the line, and if we multiply any point on the line by a constant then it is still on the line.
Example All lower triangular matrices form a subspace of the vector space of all square
matrices.
Example Is the set of all points on the given lines a subspace of IR2 ?
y=x y = 2x + 1
The first forms a subspace but the second does not. It is not closed under addition or scalar multiplication; e.g., (0, 1) and (1, 3) lie on y = 2x + 1 but their sum (1, 4) does not.
For example, the standard unit vectors ~v1 = (1, 0)^T and ~v2 = (0, 1)^T in IR2 satisfy

    C1 ~v1 + C2 ~v2 = ~0    ⇒    C1 = C2 = 0 .

Of course there are many other sets of two vectors in IR2 that have the same properties.
Definition  Let ~v1, ~v2, . . . , ~vn be vectors in a vector space V. We say the vectors

(i) span V if every ~w ∈ V can be written as

    ~w = Σ_{j=1}^n Cj ~vj ;

(ii) are linearly independent if

    Σ_{j=1}^n Cj ~vj = ~0    ⇒    Ci = 0, ∀i,

otherwise they are linearly dependent. Note that this says the only way we can combine the vectors and get the zero vector is if all coefficients are zero.

(iii) form a basis for V if they are linearly independent and span V.
Example  Let V be the vector space of all polynomials of degree less than or equal to two on [0, 1]. Then the set of vectors (i.e., polynomials) {1, x, x^2, 2 + 6x} in V span V but they are NOT linearly independent because

    2(1) + 6(x) + 0(x^2) − (2 + 6x) = 0

is a nontrivial combination giving the zero polynomial, and thus they do not form a basis for V. The set of vectors {1, x} are linearly independent but they do not span V because, e.g., x^2 ∈ V and it can't be written as a linear combination of {1, x}.
Definition The number of elements in the basis for V is called the dimension of V . It
is the smallest number of vectors needed to completely characterize V , i.e., that span V .
Example Find a basis for (i) the vector space of all 2 × 2 matrices and (ii) the vector
space of all 2 × 2 symmetric matrices. Give the dimension.
A basis for all 2 × 2 matrices consists of the four matrices

    [ 1  0 ]   [ 0  1 ]   [ 0  0 ]   [ 0  0 ]
    [ 0  0 ]   [ 0  0 ]   [ 1  0 ]   [ 0  1 ]

and so its dimension is 4. A basis for all 2 × 2 symmetric matrices consists of the three matrices

    [ 1  0 ]   [ 0  0 ]   [ 0  1 ]
    [ 0  0 ]   [ 0  1 ]   [ 1  0 ]

and so its dimension is 3.
Four Important Spaces
1. The column space (or equivalently the range) of A, where A is an m × n matrix, is the set of all linear combinations of the columns of A. We denote this by R(A). By definition it is a subspace of IRm.
This says that an equivalent statement to A~x = ~b being solvable is that ~b is in the range
or column space of A.
2. The null space of A, denoted N (A), where A is an m × n matrix, is the set of all vectors ~z ∈ IRn such that A~z = ~0.
For A1 we have that N (A1 ) = ~0 because the matrix is invertible. To see this we could
take the determinant or perform GE and get the result.
For A2 we see that N (A2) is the set of all vectors α(−2, 1)^T, i.e., all points on the line y = −0.5x through the origin. To see this, consider GE for the system:

    [ 1  2 ]  →  [ 1  2 ]    ⇒    0 · x2 = 0,  x1 + 2x2 = 0,
    [ 2  4 ]     [ 0  0 ]

so x2 is arbitrary and x1 = −2x2.
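A quick NumPy check of this null space (a sketch using the matrix A2 = [[1, 2], [2, 4]] from the GE step above; the SVD approach is a standard numerical way to extract a null space, not something the notes prescribe):

```python
import numpy as np

A2 = np.array([[1.0, 2.0],
               [2.0, 4.0]])

# Any multiple of (-2, 1)^T should be mapped to the zero vector.
z = np.array([-2.0, 1.0])
assert np.allclose(A2 @ z, 0.0)

# The SVD exposes the null space: right singular vectors whose singular
# value is (numerically) zero span N(A2).
U, s, Vt = np.linalg.svd(A2)
null_vec = Vt[s < 1e-12][0]        # exactly one zero singular value here
assert np.allclose(A2 @ null_vec, 0.0)
```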