
Lectures - Week 4

Matrix Norms, Conditioning, Vector Spaces, Linear Independence,
Spanning Sets and Basis, Null Space and Range of a Matrix

Matrix Norms
Now we turn to associating a number to each matrix. We could choose our norms analogously
to the way we did for vector norms; e.g., we could associate with $A$ the number $\max_{i,j} |a_{ij}|$.
However, this is actually not very useful because remember that our goal is to study linear
systems $A\vec{x} = \vec{b}$.
The general definition of a matrix norm is a map from all $m \times n$ matrices to $\mathbb{R}$ which
satisfies certain properties. However, the most useful matrix norms are those that are
generated by a vector norm; again the reason for this is that we want to solve $A\vec{x} = \vec{b}$, so
if we take the norm of both sides of the equation, the right hand side is a vector norm and on
the left hand side we have the norm of a matrix times a vector.
We will define an induced matrix norm as the largest amount any vector is magnified
when multiplied by that matrix, i.e.,

$$\|A\| = \max_{\substack{\vec{x} \in \mathbb{R}^n \\ \vec{x} \neq \vec{0}}} \frac{\|A\vec{x}\|}{\|\vec{x}\|}$$

Note that all norms on the right hand side are vector norms. We will denote a vector and
matrix norm using the same notation; the difference should be clear from the argument.
We say that the vector norm on the right hand side induces the matrix norm on the left.
Note that sometimes the definition is written in an equivalent way as

$$\|A\| = \sup_{\substack{\vec{x} \in \mathbb{R}^n \\ \vec{x} \neq \vec{0}}} \frac{\|A\vec{x}\|}{\|\vec{x}\|}$$

Example What is $\|I\|$? Clearly it is just one, because $I\vec{x} = \vec{x}$ for every $\vec{x}$.
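To make the definition concrete, here is a minimal NumPy sketch (an illustration added here, not part of the original notes, with an arbitrarily chosen matrix): it samples the ratio $\|A\vec{x}\|_\infty / \|\vec{x}\|_\infty$ for random nonzero vectors. Each sampled ratio is a lower bound on the induced norm, and the maximum over many samples approaches it.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # an arbitrary matrix for illustration

# Sample the ratio ||A x||_inf / ||x||_inf over many random nonzero vectors.
# Each ratio is a lower bound on the induced norm; the max approaches it.
ratios = [np.linalg.norm(A @ x, np.inf) / np.linalg.norm(x, np.inf)
          for x in rng.standard_normal((100_000, 2))]

print(max(ratios))                # just below 7, never above it
print(np.linalg.norm(A, np.inf))  # exact induced infinity norm: 7.0
```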


The problem with the definition is that it doesn't tell us how to compute a matrix
norm for a general matrix $A$. The following theorem gives us a way to calculate the matrix
norms induced by the $\ell_\infty$ and $\ell_1$ norms; the matrix norm induced by the $\ell_2$ norm will be
addressed later after we have introduced eigenvalues.

Theorem Let A be an m × n matrix. Then

$$\|A\|_\infty = \max_{1 \le i \le m} \Big[ \sum_{j=1}^{n} |a_{ij}| \Big] \qquad \text{(max absolute row sum)}$$

$$\|A\|_1 = \max_{1 \le j \le n} \Big[ \sum_{i=1}^{m} |a_{ij}| \Big] \qquad \text{(max absolute column sum)}$$

Example Determine $\|A\|_\infty$ and $\|A\|_1$ where

$$A = \begin{pmatrix} 1 & 2 & -4 \\ 3 & 0 & 12 \\ -20 & -1 & 2 \end{pmatrix}$$

We have

$$\|A\|_1 = \max\{(1 + 3 + 20), \ (2 + 0 + 1), \ (4 + 12 + 2)\} = \max\{24, 3, 18\} = 24$$

$$\|A\|_\infty = \max\{(1 + 2 + 4), \ (3 + 0 + 12), \ (20 + 1 + 2)\} = \max\{7, 15, 23\} = 23$$
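As a cross-check, a short NumPy sketch (added here, not from the notes) computes both norms directly from the row and column sums; `np.linalg.norm` with `ord=np.inf` and `ord=1` returns exactly these induced norms.

```python
import numpy as np

A = np.array([[1, 2, -4],
              [3, 0, 12],
              [-20, -1, 2]], dtype=float)

norm_inf = np.abs(A).sum(axis=1).max()   # max absolute row sum -> 23.0
norm_one = np.abs(A).sum(axis=0).max()   # max absolute column sum -> 24.0

assert norm_inf == np.linalg.norm(A, np.inf)
assert norm_one == np.linalg.norm(A, 1)
print(norm_inf, norm_one)                # 23.0 24.0
```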

Proof We will prove that $\|A\|_\infty$ is the maximum row sum (in absolute value). We will
do this by proving that

$$\|A\|_\infty \le \max_{1 \le i \le m} \Big[ \sum_{j=1}^{n} |a_{ij}| \Big] \qquad \text{and then showing} \qquad \|A\|_\infty \ge \max_{1 \le i \le m} \Big[ \sum_{j=1}^{n} |a_{ij}| \Big]$$

First recall that if $A\vec{x} = \vec{b}$ then

$$b_i = \sum_{j=1}^{n} a_{ij} x_j \quad \Rightarrow \quad \|\vec{b}\|_\infty = \max_i |b_i| = \max_i \Big| \sum_{j=1}^{n} a_{ij} x_j \Big|$$

For the first inequality we know that by definition

$$\|A\|_\infty = \max_{\substack{\vec{x} \in \mathbb{R}^n \\ \vec{x} \neq \vec{0}}} \frac{\|A\vec{x}\|_\infty}{\|\vec{x}\|_\infty}$$

Now let's bound the numerator:

$$\|A\vec{x}\|_\infty = \max_i \Big| \sum_{j=1}^{n} a_{ij} x_j \Big| \le \max_i \sum_{j=1}^{n} |a_{ij}| \, |x_j| \le \|\vec{x}\|_\infty \max_i \sum_{j=1}^{n} |a_{ij}|$$

Thus the ratio reduces to

$$\frac{\|A\vec{x}\|_\infty}{\|\vec{x}\|_\infty} \le \frac{\|\vec{x}\|_\infty \max_i \sum_{j=1}^{n} |a_{ij}|}{\|\vec{x}\|_\infty} = \max_i \sum_{j=1}^{n} |a_{ij}|$$

and hence

$$\|A\|_\infty = \max_{\substack{\vec{x} \in \mathbb{R}^n \\ \vec{x} \neq \vec{0}}} \frac{\|A\vec{x}\|_\infty}{\|\vec{x}\|_\infty} \le \max_i \sum_{j=1}^{n} |a_{ij}| \, .$$

Now for the second inequality we know that

$$\|A\|_\infty \ge \frac{\|A\vec{y}\|_\infty}{\|\vec{y}\|_\infty}$$

for any $\vec{y} \in \mathbb{R}^n$, $\vec{y} \neq \vec{0}$, because $\|A\|_\infty$ is the maximum of this ratio over all nonzero vectors.
So now we will choose a particular $\vec{y}$, and we will construct it so that $\|\vec{y}\|_\infty = 1$.
First let $p$ be the row where $A$ has its maximum absolute row sum (or, if there are two such
rows, take the first), i.e.,

$$\max_i \sum_{j=1}^{n} |a_{ij}| = \sum_{j=1}^{n} |a_{pj}|$$

Now we will take the entries of $\vec{y}$ to be $\pm 1$ so its infinity norm is one. Specifically we choose

$$y_j = \begin{cases} 1 & \text{if } a_{pj} \ge 0 \\ -1 & \text{if } a_{pj} < 0 \end{cases}$$

Defining $\vec{y}$ in this way means that $a_{pj} y_j = |a_{pj}|$. Using this and the fact that $\|\vec{y}\|_\infty = 1$ we
have

$$\|A\|_\infty \ge \frac{\|A\vec{y}\|_\infty}{\|\vec{y}\|_\infty} = \max_i \Big| \sum_{j=1}^{n} a_{ij} y_j \Big| \ge \Big| \sum_{j=1}^{n} a_{pj} y_j \Big| = \sum_{j=1}^{n} |a_{pj}|$$

but the last quantity on the right is just the maximum absolute row sum, and the proof is complete.
What properties must any matrix norm (and thus an induced norm) satisfy?
You will recognize most of them as being analogous to the properties of a vector norm.
(i) $\|A\| \ge 0$, and $\|A\| = 0$ only if $a_{ij} = 0$ for all $i, j$
(ii) $\|kA\| = |k| \, \|A\|$ for scalars $k$
(iii) $\|AB\| \le \|A\| \, \|B\|$
(iv) $\|A + B\| \le \|A\| + \|B\|$
A very useful inequality is

$$\|A\vec{x}\| \le \|A\| \, \|\vec{x}\| \qquad \text{for any induced norm}$$

Why is this true?

$$\|A\| = \max_{\substack{\vec{x} \in \mathbb{R}^n \\ \vec{x} \neq \vec{0}}} \frac{\|A\vec{x}\|}{\|\vec{x}\|} \ge \frac{\|A\vec{x}\|}{\|\vec{x}\|} \quad \Rightarrow \quad \|A\vec{x}\| \le \|A\| \, \|\vec{x}\|$$

The Condition Number of a Matrix

We said one of our goals was to determine whether small changes in the data of a linear
system produce small changes in the solution. Now let's assume we want to solve $A\vec{x} = \vec{b}$,
$\vec{b} \neq \vec{0}$, but instead we solve

$$A\vec{y} = \vec{b} + \vec{\delta b}$$

that is, we have perturbed the right hand side by a small amount $\vec{\delta b}$. We assume that $A$
is invertible, i.e., $A^{-1}$ exists. For simplicity, we have not perturbed the coefficient matrix
$A$. What we want to see is how much $\vec{y}$ differs from $\vec{x}$. Let's write $\vec{y}$ as $\vec{x} + \vec{\delta x}$, and so our
change in the solution will be $\vec{\delta x}$. The two systems are

$$A\vec{x} = \vec{b} \qquad \qquad A(\vec{x} + \vec{\delta x}) = \vec{b} + \vec{\delta b}$$

What we would like to get is an estimate for the relative change in the solution, i.e.,

$$\frac{\|\vec{\delta x}\|}{\|\vec{x}\|}$$

in terms of the relative change in $\vec{b}$, where $\|\cdot\|$ denotes any vector norm and its induced
matrix norm. Subtracting these two equations gives

$$A \vec{\delta x} = \vec{\delta b} \quad \text{which implies} \quad \vec{\delta x} = A^{-1} \vec{\delta b}$$

Now we take the (vector) norm of both sides of the equation and then use our favorite
inequality above:

$$\|\vec{\delta x}\| = \|A^{-1} \vec{\delta b}\| \le \|A^{-1}\| \, \|\vec{\delta b}\|$$

Remember our goal is to get an estimate for the relative change in the solution; so far we
have a bound for the absolute change. What we need is a bound for the relative change.
Because $A\vec{x} = \vec{b}$ we have

$$\|A\vec{x}\| = \|\vec{b}\| \quad \Rightarrow \quad \|\vec{b}\| \le \|A\| \, \|\vec{x}\| \quad \Rightarrow \quad \frac{1}{\|\vec{b}\|} \ge \frac{1}{\|A\| \, \|\vec{x}\|}$$

Now we see that if we divide our previous result for $\|\vec{\delta x}\|$ by $\|A\| \, \|\vec{x}\| > 0$ we can use this
result to introduce $\|\vec{b}\|$ in the denominator. We have

$$\frac{\|\vec{\delta x}\|}{\|A\| \, \|\vec{x}\|} \le \frac{\|A^{-1}\| \, \|\vec{\delta b}\|}{\|A\| \, \|\vec{x}\|} \le \frac{\|A^{-1}\| \, \|\vec{\delta b}\|}{\|\vec{b}\|}$$

Multiplying by $\|A\|$ gives the desired result

$$\frac{\|\vec{\delta x}\|}{\|\vec{x}\|} \le \|A\| \, \|A^{-1}\| \, \frac{\|\vec{\delta b}\|}{\|\vec{b}\|}$$

If the quantity $\|A\| \, \|A^{-1}\|$ is small, then small relative changes in $\vec{b}$ result in
small relative changes in the solution; but if it is large, we could have a large relative change
in the solution.
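Here is a minimal numerical illustration of this bound (a sketch with an arbitrarily chosen nearly singular matrix, not an example from the notes): a tiny relative perturbation of $\vec{b}$ produces an $O(1)$ relative change in the solution, and the change stays below the derived bound $\|A\| \, \|A^{-1}\| \, \|\vec{\delta b}\| / \|\vec{b}\|$.

```python
import numpy as np

# A nearly singular matrix, chosen only for illustration
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
b = np.array([2.0, 2.0001])              # exact solution is x = (1, 1)
db = np.array([0.0, 1e-4])               # tiny perturbation of b

x = np.linalg.solve(A, b)
y = np.linalg.solve(A, b + db)

rel_x = np.linalg.norm(y - x, np.inf) / np.linalg.norm(x, np.inf)
rel_b = np.linalg.norm(db, np.inf) / np.linalg.norm(b, np.inf)
bound = (np.linalg.norm(A, np.inf)
         * np.linalg.norm(np.linalg.inv(A), np.inf) * rel_b)

print(rel_x)    # ~1.0: a 5e-5 relative change in b changed x completely
print(bound)    # ~2.0: the derived bound indeed dominates rel_x
```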

Definition The condition number of a square matrix $A$ is defined as

$$K(A) \equiv \|A\| \, \|A^{-1}\|$$

Note that the condition number depends on what norm you are using. We say that a
matrix is well-conditioned if $K(A)$ is "small" and ill-conditioned otherwise.

Example Find the condition number of the identity matrix using the infinity norm.
Clearly it is one because the inverse of the identity matrix is itself.

Example Find the condition number for each of the following matrices using the infinity
norm. Explain geometrically why you think this is happening.

$$A_1 = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} \qquad A_2 = \begin{pmatrix} 1 & 2 \\ -0.998 & -2 \end{pmatrix}$$

First we need to find the inverse of each matrix and then take the norms. Note the following
"trick" for taking the inverse of a 2 × 2 matrix:

$$A_1^{-1} = -\frac{1}{5} \begin{pmatrix} 3 & -2 \\ -4 & 1 \end{pmatrix} \qquad A_2^{-1} = \begin{pmatrix} 500 & 500 \\ -249.5 & -250 \end{pmatrix}$$

Now

$$K_\infty(A_1) = (7)(1) = 7 \qquad K_\infty(A_2) = (3)(1000) = 3000$$
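A quick cross-check (a sketch added here, not part of the notes): `np.linalg.cond` with `p=np.inf` computes exactly $\|A\|_\infty \|A^{-1}\|_\infty$ and reproduces both values.

```python
import numpy as np

A1 = np.array([[1.0, 2.0], [4.0, 3.0]])
A2 = np.array([[1.0, 2.0], [-0.998, -2.0]])

# np.linalg.cond(A, np.inf) computes ||A||_inf * ||A^{-1}||_inf
print(np.linalg.cond(A1, np.inf))   # 7.0
print(np.linalg.cond(A2, np.inf))   # 3000.0 (up to roundoff)
```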

Example Calculate $K_\infty(A)$ where

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

and comment on when the condition number will be large.

$$A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

So

$$K_\infty(A) = \max\{|a| + |b|, \ |c| + |d|\} \cdot \frac{1}{|ad - bc|} \max\{|d| + |b|, \ |c| + |a|\}$$

Consequently if the determinant $ad - bc \approx 0$ then the condition number can be quite large;
i.e., when the matrix is almost not invertible, i.e., almost singular.
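The closed form is easy to wrap in a small helper and check against NumPy; the function name and the nearly singular test entries below are chosen for this sketch, not taken from the notes.

```python
import numpy as np

def cond_inf_2x2(a, b, c, d):
    """K_inf of [[a, b], [c, d]] from the closed form; assumes ad - bc != 0."""
    det = abs(a * d - b * c)
    return max(abs(a) + abs(b), abs(c) + abs(d)) * \
           max(abs(d) + abs(b), abs(c) + abs(a)) / det

# det = 1e-4, so the condition number is large (about 9e4)
print(cond_inf_2x2(1.0, 2.0, 0.5, 1.0001))
print(np.linalg.cond(np.array([[1.0, 2.0], [0.5, 1.0001]]), np.inf))  # agrees
```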
A classic example of an ill-conditioned matrix is the Hilbert matrix, which we have
already encountered. Here are some of the condition numbers (using the matrix norm
induced by the $\ell_2$ vector norm).

n    approximate K_2(A)
2    19
3    524
4    15,514
5    476,607
6    1.495 × 10^7
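These values can be reproduced with SciPy's Hilbert matrix constructor (a sketch added here, not part of the notes):

```python
import numpy as np
from scipy.linalg import hilbert

# 2-norm condition numbers of the n x n Hilbert matrix, H_ij = 1/(i + j - 1)
for n in range(2, 7):
    print(n, np.linalg.cond(hilbert(n), 2))
```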

Vector (or Linear) Spaces
What we want to do now is take the concept of vectors and the properties of Euclidean
space $\mathbb{R}^n$ and generalize them to a collection of objects with two operations defined,
addition and scalar multiplication. We want to define these operations so that the usual
properties hold, i.e., $x + y = y + x$, $k(x + y) = kx + ky$, for objects other than vectors, such
as matrices, continuous functions, etc. We want to be able to add two elements of our set
and get another element of the set.

Definition A vector or linear space V is a set of objects, which we will call vectors, for
which addition and scalar multiplication are defined and satisfy the following properties.
(i) x + y = y + x
(ii) x + (y + z) = (x + y) + z
(iii) there exists a zero element 0 ∈ V such that x + 0 = 0 + x = x
(iv) for each x ∈ V there exists −x ∈ V such that x + (−x) = 0
(v) 1x = x
(vi) (α + β)x = αx + βx, for scalars α, β
(vii) α(x + y) = αx + αy
(viii) (αβ)x = α(βx)

Example $\mathbb{R}^n$ is a vector space with the usual operations of addition and scalar multiplication.

Example The set of all m × n matrices with the usual definition of addition and scalar
multiplication forms a vector space.

Example All polynomials of degree less than or equal to two on [−1, 1] form a vector
space with the usual definition of addition and scalar multiplication. Here $p(x) = a_0 + a_1 x + a_2 x^2$.

Example All continuous functions defined on [0, 1] form a vector space.


Note that if $v, w \in V$ then $v + w \in V$; we say that V is closed under addition. Also if
$v \in V$ and $k$ is a scalar, then $kv \in V$; we say V is closed under scalar multiplication.

Definition A subspace S of a vector space V is a nonempty subset of V such that any
linear combination of vectors in S is also in S.

Example A line through the origin is a subspace of $\mathbb{R}^2$ because if we add two points on
the line, the result is still a point on the line, and if we multiply any point on the line by a
constant then it is still on the line.

Example All lower triangular matrices form a subspace of the vector space of all square
matrices.

Example Is the set of all points on the given lines a subspace of $\mathbb{R}^2$?

$$y = x \qquad \qquad y = 2x + 1$$

The first forms a subspace but the second does not. It is not closed under addition or
scalar multiplication, e.g.,

$$2(x, 2x + 1) = (2x, 4x + 2) \quad \text{which is not on the line } y = 2x + 1$$

Linear independence, spanning sets and basis

We now want to investigate how to characterize a vector space, if possible. To do this, we
need the concepts of
(i) linear independence/dependence of vectors
(ii) a spanning set
(iii) the dimension of a vector space
(iv) the basis of a finite dimensional vector space
In $\mathbb{R}^2$ the vectors $\vec{v}_1 = (1, 0)^T$ and $\vec{v}_2 = (0, 1)^T$ are special because
(i) every other vector in $\mathbb{R}^2$ can be written as a linear combination of these two vectors,
e.g., if $\vec{x} = (x_1, x_2)^T$ then $\vec{x} = x_1 \vec{v}_1 + x_2 \vec{v}_2$.
(ii) the vectors $\vec{v}_1, \vec{v}_2$ are different in the sense that the only way we can combine them
and get the zero vector is with zero coefficients, i.e.,

$$C_1 \vec{v}_1 + C_2 \vec{v}_2 = \vec{0} \quad \Rightarrow \quad C_1 = C_2 = 0$$

Of course there are many other sets of two vectors in $\mathbb{R}^2$ that have the same properties.

Definition Let V be a vector space. Then the set of vectors $\{\vec{v}_i\}$, $i = 1, \ldots, n$
(i) span V if any $\vec{w} \in V$ can be written as a linear combination of the $\vec{v}_i$, i.e.,

$$\vec{w} = \sum_{j=1}^{n} C_j \vec{v}_j$$

(ii) are linearly independent if

$$\sum_{j=1}^{n} C_j \vec{v}_j = \vec{0} \quad \Rightarrow \quad C_i = 0, \ \forall i$$

otherwise they are linearly dependent. Note that this says the only way we can
combine the vectors and get the zero vector is if all coefficients are zero.
(iii) form a basis for V if they are linearly independent and span V.

Example Let V be the vector space of all polynomials of degree less than or equal to
two on [0, 1]. Then the set of vectors (i.e., polynomials) in V $\{1, x, x^2, 2 + 6x\}$ spans V but
they are NOT linearly independent because

$$2(1) + 6(x) + 0(x^2) - (2 + 6x) = 0$$

and thus they do not form a basis for V. The set of vectors $\{1, x\}$ is linearly independent
but does not span V because, e.g., $x^2 \in V$ and it can't be written as a linear combination
of $\{1, x\}$.
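This kind of dependence check can be mechanized: write each polynomial's coefficients in the monomial basis $\{1, x, x^2\}$ as a column of a matrix and compare the rank to the number of columns. A small sketch (added here, not from the notes):

```python
import numpy as np

# Columns: coefficient vectors of 1, x, x^2, 2 + 6x in the basis {1, x, x^2}
V = np.array([[1, 0, 0, 2],
              [0, 1, 0, 6],
              [0, 0, 1, 0]], dtype=float)

rank = np.linalg.matrix_rank(V)
print(rank)                 # 3: the columns span all of P_2 ...
print(rank == V.shape[1])   # False: rank < 4 columns, so linearly dependent
```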

Definition The number of elements in a basis for V is called the dimension of V. It
is the smallest number of vectors needed to completely characterize V, i.e., that span V.

Example Find a basis for (i) the vector space of all 2 × 2 matrices and (ii) the vector
space of all 2 × 2 symmetric matrices. Give the dimension of each.
A basis for all 2 × 2 matrices consists of the four matrices

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

and so its dimension is 4. A basis for all 2 × 2 symmetric matrices consists of the three
matrices

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

and so its dimension is 3.
Four Important Spaces
1. The column space (or equivalently the range) of A, where A is an m × n matrix, is the set
of all linear combinations of the columns of A. We denote this by R(A).
By definition it is a subspace of $\mathbb{R}^m$.

Example Consider the overdetermined system

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$$

We know that this is solvable if there exist x, y such that

$$x \begin{pmatrix} 1 \\ 3 \\ 5 \end{pmatrix} + y \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$$

which says that $\vec{b}$ has to be in the column space (or range) of A.

This says that an equivalent statement to $A\vec{x} = \vec{b}$ being solvable is that $\vec{b}$ is in the range
or column space of A.
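In code this membership test amounts to a rank comparison: $\vec{b}$ is in R(A) exactly when appending $\vec{b}$ as an extra column does not increase the rank. A sketch (the helper name is hypothetical, added here, not from the notes):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]], dtype=float)

def in_range(A, b):
    """True iff b is in the column space of A, i.e. rank([A | b]) == rank(A)."""
    return (np.linalg.matrix_rank(np.column_stack([A, b]))
            == np.linalg.matrix_rank(A))

print(in_range(A, np.array([3.0, 7.0, 11.0])))  # True: b = col1 + col2
print(in_range(A, np.array([1.0, 0.0, 0.0])))   # False: no such x, y exist
```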

2. The null space of A, denoted N(A), where A is an m × n matrix, is the set of all vectors
$\vec{z} \in \mathbb{R}^n$ such that $A\vec{z} = \vec{0}$.

Example Find the null space of each matrix

$$A_1 = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \qquad A_2 = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} \qquad A_3 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

For $A_1$ we have that $N(A_1) = \{\vec{0}\}$ because the matrix is invertible. To see this we could
take the determinant or perform GE and get the result.
For $A_2$ we see that $N(A_2)$ is the set of all vectors $(-2x_2, x_2)^T$, i.e., all points on the line
through the origin $y = -0.5x$. To see this consider GE for the system

$$\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} \to \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix} \quad \Rightarrow \quad 0 \cdot x_2 = 0, \quad x_1 + 2x_2 = 0$$

This says that $x_2$ is arbitrary and $x_1 = -2x_2$.

For $A_3$ we see that $N(A_3)$ is all of $\mathbb{R}^2$.
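SciPy can produce an orthonormal basis for the null space directly, confirming all three cases (a sketch added here, not part of the notes; `scipy.linalg.null_space` requires SciPy 1.1 or later):

```python
import numpy as np
from scipy.linalg import null_space

A1 = np.array([[1.0, 2.0], [3.0, 4.0]])
A2 = np.array([[1.0, 2.0], [2.0, 4.0]])
A3 = np.zeros((2, 2))

# null_space returns an orthonormal basis for N(A) as columns
print(null_space(A1).shape[1])   # 0: only the zero vector
print(null_space(A2).ravel())    # one column, proportional to (-2, 1)^T (up to sign)
print(null_space(A3).shape[1])   # 2: the null space is all of R^2
```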

Example What are the possible null spaces of a 3 × 3 matrix?

For an invertible matrix it is (i) just the zero vector in $\mathbb{R}^3$; otherwise we could have
(ii) a line through the origin, (iii) a plane through the origin, or (iv) all of $\mathbb{R}^3$.
