Solving Linear Systems:

Iterative Methods and Sparse Systems

COS 323
Direct vs. Iterative Methods

• So far, have looked at direct methods for solving linear systems
– Predictable number of steps
– No answer until the very end
• Alternative: iterative methods
– Start with approximate answer
– Each iteration improves accuracy
– Stop once estimated error below tolerance
Benefits of Iterative Algorithms

• Some iterative algorithms designed for accuracy:
– Direct methods subject to roundoff error
– Iterate to reduce error to O(machine ε)
• Some algorithms produce answer faster
– Most important class: sparse matrix solvers
– Speed depends on # of nonzero elements, not total # of elements
• Today: iterative improvement of accuracy, solving sparse systems (not necessarily iteratively)
Iterative Improvement

• Suppose you’ve solved (or think you’ve solved) some system Ax=b
• Can check answer by computing residual:
r = b – Ax_computed
• If r is small (compared to b), x is accurate
• What if it’s not?
Iterative Improvement

• Large residual caused by error in x:
e = x_correct – x_computed
• If we knew the error, could try to improve x:
x_correct = x_computed + e
• Solve for error:
Ax_computed = A(x_correct – e) = b – r
Ax_correct – Ae = b – r
Ae = r
Iterative Improvement

• So, compute residual, solve for e, and apply correction to estimate of x
• If original system solved using LU, this is relatively fast (relative to O(n³), that is):
– O(n²) matrix/vector multiplication + O(n) vector subtraction to compute r
– O(n²) forward/backsubstitution to solve for e
– O(n) vector addition to correct estimate of x
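A minimal C sketch of this loop (my own illustration, not from the slides): solve2 stands in for re-using an existing LU factorization, and is just Cramer's rule on a 2×2 system so the example stays self-contained.

#include <stdio.h>

/* Solve the 2x2 system A s = rhs by Cramer's rule (stand-in for an LU solve). */
static void solve2(const double A[2][2], const double rhs[2], double s[2]) {
    double det = A[0][0]*A[1][1] - A[0][1]*A[1][0];
    s[0] = (rhs[0]*A[1][1] - A[0][1]*rhs[1]) / det;
    s[1] = (A[0][0]*rhs[1] - rhs[0]*A[1][0]) / det;
}

int main(void) {
    double A[2][2] = {{4.0, 1.0}, {1.0, 3.0}};
    double b[2]    = {1.0, 2.0};
    double x[2]    = {0.09, 0.63};  /* pretend this came from a roundoff-tainted solve */

    for (int iter = 0; iter < 3; iter++) {
        /* residual r = b - Ax  (O(n^2)) */
        double r[2];
        for (int i = 0; i < 2; i++)
            r[i] = b[i] - (A[i][0]*x[0] + A[i][1]*x[1]);

        /* solve Ae = r for the correction (cheap when LU factors are reused) */
        double e[2];
        solve2(A, r, e);

        /* apply correction: x <- x + e  (O(n)) */
        for (int i = 0; i < 2; i++)
            x[i] += e[i];

        printf("iter %d: x = (%g, %g)\n", iter, x[0], x[1]);
    }
    return 0;
}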
Sparse Systems

• Many applications require solution of large linear systems (n = thousands to millions)
– Local constraints or interactions: most entries are 0
– Wasteful to store all n² entries
– Difficult or impossible to use O(n³) algorithms
• Goal: solve system with:
– Storage proportional to # of nonzero elements
– Running time << n³
Special Case: Band Diagonal

• Last time: tridiagonal (or band diagonal) systems
– Storage O(n): only relevant diagonals
– Time O(n): Gauss-Jordan with bookkeeping
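For concreteness, here is a sketch of that O(n) tridiagonal solver (the Thomas algorithm: one forward-elimination pass and one back-substitution pass over the three diagonals). The code and names are my own, and it assumes n ≤ 64 and no pivoting.

#include <stdio.h>

/* Solve a tridiagonal system in O(n): sub = subdiagonal (sub[0] unused),
   d_in = diagonal, sup = superdiagonal; rhs is modified in place. */
void tridiag_solve(int n, const double *sub, const double *d_in,
                   const double *sup, double *rhs, double *x) {
    double d[64];  /* scratch copy of the diagonal (assume n <= 64 here) */
    for (int i = 0; i < n; i++) d[i] = d_in[i];

    /* forward elimination */
    for (int i = 1; i < n; i++) {
        double w = sub[i] / d[i-1];
        d[i]   -= w * sup[i-1];
        rhs[i] -= w * rhs[i-1];
    }
    /* back substitution */
    x[n-1] = rhs[n-1] / d[n-1];
    for (int i = n - 2; i >= 0; i--)
        x[i] = (rhs[i] - sup[i] * x[i+1]) / d[i];
}

int main(void) {
    /* -x[i-1] + 2 x[i] - x[i+1] system on 4 unknowns; solution is all ones */
    double sub[4] = {0, -1, -1, -1};
    double d[4]   = {2, 2, 2, 2};
    double sup[3] = {-1, -1, -1};
    double b[4]   = {1, 0, 0, 1};
    double x[4];
    tridiag_solve(4, sub, d, sup, b, x);
    for (int i = 0; i < 4; i++) printf("x[%d] = %g\n", i, x[i]);
    return 0;
}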
Cyclic Tridiagonal

• Interesting extension: cyclic tridiagonal

  [ a11  a12                      a16 ]
  [ a21  a22  a23                     ]
  [      a32  a33  a34                ]   x = b
  [           a43  a44  a45           ]
  [                a54  a55  a56      ]
  [ a61                 a65  a66      ]

• Could derive yet another special-case algorithm, but there’s a better way
Updating Inverse

• Suppose we have some fast way of finding A⁻¹ for some matrix A
• Now A changes in a special way:
A* = A + uvᵀ
for some n×1 vectors u and v
• Goal: find a fast way of computing (A*)⁻¹
– Eventually, a fast way of solving (A*) x = b
Sherman-Morrison Formula

A*  A  uv T  A(I  A 1uv T )
A 
* 1
 (I  A 1uv T ) 1 A 1

Let x  A 1uv T
Note that x 2  A 1u v T A 1u v T
Scalar! Call it 
x 2  A 1uv T   A 1uv T   x
Sherman-Morrison Formula

x2   x
x  I  x   x 1   
x
x  I  x  0
1 
x
Ix  I  x  I
1 
 x 
I   I  x   I
 1  
 x 
   I  x
1
 I 
 1  
A 1uv T A 1
A   * 1
A  1

1  v T A 1u
Sherman-Morrison Formula

1 1
 
T
* 1 A u v A b
x A b  A 1b 
1  v T A 1u
 
So, to solve A* x  b,
z vT y
solve Ay  b, Az  u , x  y 
1  vT z
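A small self-contained check of this recipe (my own sketch, not course code): A is taken diagonal so the two auxiliary solves are trivial divisions, and the final loop verifies that (A + uvᵀ)x reproduces b.

#include <stdio.h>
#define N 4

int main(void) {
    double diag[N] = {2, 3, 4, 5};   /* A = diag(2, 3, 4, 5) */
    double u[N] = {1, 0, 0, 2};
    double v[N] = {3, 0, 1, 0};
    double b[N] = {1, 1, 1, 1};

    /* solve Ay = b and Az = u (here: divide by the diagonal) */
    double y[N], z[N];
    for (int i = 0; i < N; i++) { y[i] = b[i] / diag[i]; z[i] = u[i] / diag[i]; }

    /* x = y - z (v^T y) / (1 + v^T z) */
    double vy = 0, vz = 0;
    for (int i = 0; i < N; i++) { vy += v[i] * y[i]; vz += v[i] * z[i]; }
    double x[N];
    for (int i = 0; i < N; i++) x[i] = y[i] - z[i] * vy / (1 + vz);

    /* check: (A + u v^T) x should equal b */
    double vx = 0;
    for (int i = 0; i < N; i++) vx += v[i] * x[i];
    for (int i = 0; i < N; i++)
        printf("row %d: %g (should be %g)\n", i, diag[i]*x[i] + u[i]*vx, b[i]);
    return 0;
}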
Applying Sherman-Morrison

• Let’s consider cyclic tridiagonal again:

  [ a11  a12                      a16 ]
  [ a21  a22  a23                     ]
  [      a32  a33  a34                ]   x = b
  [           a43  a44  a45           ]
  [                a54  a55  a56      ]
  [ a61                 a65  a66      ]

• Take

  A = [ a11–γ  a12                              ]       u = ( γ, 0, 0, 0, 0, a61 )ᵀ
      [ a21    a22  a23                         ]       v = ( 1, 0, 0, 0, 0, a16/γ )ᵀ
      [        a32  a33  a34                    ]
      [             a43  a44  a45               ]
      [                  a54  a55  a56          ]
      [                       a65  a66–a61a16/γ ]

  so that A + uvᵀ is exactly the cyclic matrix above, for any γ ≠ 0 (a common choice is γ = –a11)

Applying Sherman-Morrison

• Solve Ay=b, Az=u using special fast algorithm
• Applying Sherman-Morrison takes a couple of dot products
• Total: O(n) time
• Generalization for several corrections: Woodbury

A* = A + UVᵀ
(A*)⁻¹ = A⁻¹ – A⁻¹U (I + VᵀA⁻¹U)⁻¹ VᵀA⁻¹
More General Sparse Matrices

• More generally, we can represent sparse matrices by noting which elements are nonzero
• Critical for Ax and Aᵀx to be efficient: proportional to # of nonzero elements
– We’ll see an algorithm for solving Ax=b using only these two operations!
Compressed Sparse Row Format

• Three arrays
– Values: actual numbers in the matrix
– Cols: column of corresponding entry in values
– Rows: index of first entry in each row
– Example (zero-based):

  [ 0 3 2 3 ]      values = 3 2 3 2 5 1 2 3
  [ 2 0 0 5 ]      cols   = 1 2 3 0 3 1 2 3
  [ 0 0 0 0 ]      rows   = 0 3 5 5 8
  [ 0 1 2 3 ]
Compressed Sparse Row Format

0 3 2 3
  values 32325123
2 0 0 5
0 0 0 0 cols 12303123
 
0 1 2 3 rows 03558
• Multiplying Ax:

for (i = 0; i < n; i++) {


out[i] = 0;
for (j = rows[i]; j < rows[i+1]; j++)
out[i] += values[j] * x[ cols[j] ];
}
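The previous slide noted that Aᵀx must also be efficient. The same three arrays support it; here is a sketch (my own, assuming the same n, values, cols, rows as above) that scatters into the output instead of gathering:

/* out = A^T x: entry (i, cols[j]) of A contributes to row cols[j] of A^T */
void atx(int n, const double *values, const int *cols, const int *rows,
         const double *x, double *out) {
    for (int i = 0; i < n; i++)
        out[i] = 0;
    for (int i = 0; i < n; i++)
        for (int j = rows[i]; j < rows[i+1]; j++)
            out[cols[j]] += values[j] * x[i];
}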
Solving Sparse Systems

• Transform problem to a function minimization!

Solve Ax=b
⇔ Minimize f(x) = xᵀAx – 2bᵀx

• To motivate this, consider 1D:
f(x) = ax² – 2bx
df/dx = 2ax – 2b = 0
⟹ ax = b
Solving Sparse Systems

• Preferred method: conjugate gradients
• Recall: plain gradient descent has a problem…
Solving Sparse Systems

• … that’s solved by conjugate gradients
• Walk along direction
d_(k+1) = –g_(k+1) + β_k d_k
• Polak and Ribière formula:
β_k = g_(k+1)ᵀ (g_(k+1) – g_k) / (g_kᵀ g_k)
Solving Sparse Systems

• Easiest to think about A = symmetric
• First ingredient: need to evaluate gradient

f(x) = xᵀAx – 2bᵀx
∇f(x) = 2(Ax – b)

• As advertised, this only involves A multiplied by a vector
Solving Sparse Systems

• Second ingredient: given point x_i, direction d_i, minimize function in that direction

Define m_i(t) = f(x_i + t d_i)
Minimize m_i(t): want dm_i(t)/dt = 0
dm_i(t)/dt = 2 d_iᵀ(Ax_i – b) + 2t d_iᵀA d_i = 0
t_min = – d_iᵀ(Ax_i – b) / (d_iᵀA d_i)
x_(i+1) = x_i + t_min d_i
Solving Sparse Systems

• So, each iteration just requires a few sparse matrix-vector multiplies (plus some dot products, etc.)
• If matrix is n×n and has m nonzero entries, each iteration is O(max(m, n))
• Conjugate gradients may need n iterations for “perfect” convergence, but often get a decent answer well before then
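A compact C sketch of the whole method on a 3×3 CSR matrix (my own illustration, not the course code). In exact arithmetic the Polak-Ribière β for linear CG reduces to r_newᵀr_new / rᵀr, which is what the update below uses.

#include <stdio.h>
#include <math.h>
#define N 3

/* A = [4 1 0; 1 3 1; 0 1 2], symmetric positive definite, in CSR form */
static const double values[] = {4, 1, 1, 3, 1, 1, 2};
static const int    cols[]   = {0, 1, 0, 1, 2, 1, 2};
static const int    rows[]   = {0, 2, 5, 7};

static void matvec(const double *x, double *out) {
    for (int i = 0; i < N; i++) {
        out[i] = 0;
        for (int j = rows[i]; j < rows[i+1]; j++)
            out[i] += values[j] * x[cols[j]];
    }
}

static double dot(const double *a, const double *b) {
    double s = 0;
    for (int i = 0; i < N; i++) s += a[i] * b[i];
    return s;
}

int main(void) {
    double b[N] = {1, 2, 3};
    double x[N] = {0, 0, 0};
    double r[N], d[N], Ad[N];

    /* r = b - Ax, initial direction d = r */
    matvec(x, Ad);
    for (int i = 0; i < N; i++) { r[i] = b[i] - Ad[i]; d[i] = r[i]; }

    double rr = dot(r, r);
    for (int k = 0; k < N && sqrt(rr) > 1e-12; k++) {
        matvec(d, Ad);                   /* one sparse matvec per iteration */
        double t = rr / dot(d, Ad);      /* exact minimizer along d (t_min above) */
        for (int i = 0; i < N; i++) { x[i] += t * d[i]; r[i] -= t * Ad[i]; }
        double rr_new = dot(r, r);
        double beta = rr_new / rr;       /* Polak-Ribiere for linear CG */
        for (int i = 0; i < N; i++) d[i] = r[i] + beta * d[i];
        rr = rr_new;
    }
    printf("x = (%g, %g, %g)\n", x[0], x[1], x[2]);
    return 0;
}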
