0% found this document useful (0 votes)
2K views212 pages

Applied Numerical Linear-Demmel

Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
2K views212 pages

Applied Numerical Linear-Demmel

Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 212
Preface to the 1996 Edition ‘The following ase natural goals for an aspiring textbook writer of « book ke this one: 1. Te should be attractive to fst year graduate students fom a vatety of ngineeng sa seintie dsellins 2 ‘The text should be selontaine, assuming only a good undergraduate backgroud in near ara ‘The students should learn the mathematical bass ofthe ld, as well x how to bull old good merical software 4. Students should segue practical knowledge for solving real problems clients, In partiuler, they should know what the stateot-theart, teciniqoes are in enh an, oF when 0 lon or them nd where tof 5. I should all tin ove semester, slace that Is what most students have vlable for this aubject ‘The fth goal perhaps the hardest to manage, ‘The fst editlon ofthese notes was 215 pages, which did ft into one ambitious semester. This elton fas more than doubled in length, whic is ectsinly too much material for ven heroic semester "Tho rew material retlets the growth reser in the eld inthe lst few years, ining my own, It nso reflects requests from calles for sections on erin topis that mere teat ight oF noe al in the ist einen. Notnble mln ince + a class homepage with Matlab sourwe code for examples and homework problems in the text, ae pointrs to other ortine voftare and text Fook, + more pototers inthe text to software fr all problems, inhi su nary of available sofware for solving sparse Inar systems us det metho +» now chapter on Krylo subspice method for egersale problems section on domain decomposition, leading both oeelapping sal nonover Tapping methods si Preface + sections on relative perturbation theory” an eceresponding high-secunacy algctims for egenpeablens, Ike Jacobi ad ads, + more dual! peeformance comparisons of competing lest sures sd symmetee geval lgorithns and + new homowork problems, Inluding many contributed by Zao Ba A reasonable one-semester curriculum cou const of the falling chap- oor al seta «all of Chaper 1 + Chapter 2, exelading setions 22.1, 243, 2 58, ond 26-45 + Chapter 8, exluding setion 35: # Chaptor 4, up to end including setion 4.45 1 Chapter 5, exluding sections 5.2.1, 5.5, 5.4 and 5.5; and + Chapter 6, exluing sections 63.3, 65.5, 65.6, 666, 67.2, 673, 67.4, 5,62, od 610. omer problems ce marked Easy; Meim or Hard, noon to their deity. Problems involving sgntennt amounts of programing sr ark “progaing” Many’ people have helped contribute to this text, aotably Zaojun Bs, ‘Alba Elan, Veil Kalin, Richard Laban, Berend Past, sa way ‘nonymious refres, all of whom made detail connents on varous parts of the text. Table 2.2 taken from the PhD thesis of my stident Xlanye i Alan Felon st MIT ond Martin Guskneeht st ETH Zurich provided fospitable surroundings at thelr inatutions while this ical edition was being prepared Many stents at Courant, Berek, Kentucky and AIT have fisted to sd helped debus tis material over the years and deserve thanks. Fins, Kathy Yelk bas eontibuted sete comments asx consulting, ard moral support ‘over more sours than either of exp thik projet to take James Denim MIT September, 1996 Contents Introduction Li Dale Notation Standant Problems of Numerical Liner Algebea 1) General Techgues TALL Matrix Petorations 182 Perturbation Theory and Condition Numbers YR Effects of Roundoff Error on Algorihs RA. Analyzing the Speed of Algorithms 1.85. Baglorering Null Software 4 Example: Polynenial {Floating Plot Arithmetic Palynomia! 
Evaluation Revsted Vector al Mateix Noes Referees and Other Topes for Caper 1 (Questions for Chapter Linear Equation Solving 21" Intrdetion 22 Peecbation Theory 221. Relative Perturbation hoy 25 Gaussian Bimination 2.6 Brrr Analyst 21 The New! for Pivoting 2.42 Formal Eroe Analysis of Gausian Blisation 2.43 Enimating Condition Numbers 2.44 Practical Error Rous Improving the Accuracy of Soltion 25:1 Single Preion Herve Refinement 252 Equllbeation 26 Blocking Algrithns for Maher Performance 261 Borie Linear Algebea Subroutines (BLAS) 262 How to Optimie Matix Multiplication 263 Reorgalang Gausian Elimination to use Level 8 BLAS 2064 More About Parallel and Otber Performance ess Special Linens Syste 271 Real Syametre Posive Dtate Matexs rt e 6 a 76 76 22:2 Symmetric Indefinite Maries 273 Band Matrices 274 Goperal Sparse Matrices 21.5 Dense Matrices Depending on Fever Than O(n?) Po 28 References and Otier Topics for Chapter 2 29 Questions foe Chapter 2 inear Least Squares Problems {1 Introduction 32 Matix Faetriations Tliat Solve te Liner Least Squares Peo lem 321” Normal Bantons 322 QR Decomposition 32.3 Singular Valve Decomposition 33 Perturbation Theory for the Least Squates Problem 34 Orthogonsl Matis SLL Housebolder Tstsortations 342 Gives Rotations 3.6 Roundot Error Anaivis for Orthogonal Matrices Bt Why Orthogonal Matrices? 3.5, RankeDefcint Lest Squares Problems 351 Solving Rank-Deicent Laat Squares Problems Using the sv 252 Solving Rank-Deiciet Last Siuares Problems Using QR wih Pivoing 36 Peformance Compass of Methods for Solving Lest Squares Problems {LT References and Other Tops for Chapter 3 8 Questions foe Chapter 8 Nonsymmetrie Eigenvalue Problems 4.1 Inteoietion 42 Carona! Forms 42.1 Computing Bigenvectors frm the Schur Form 43. Perturbation Theory AA Algorithms forthe Nonsyinmeie Higenprbiern {1 Power Method 442 averse eration 443 Onhononal eration 144 QRtheration AA5 Making QR eration Practical 4.46 Hessenberg Reduction ALT Tiidagonel and Bidiagonal Reduetion 101 101 06, 106, um 09 7 us 35 134 15 1 10 122 131 184 139 199 10 8 uss 188 it 16 19, 1 16 448 QR Meration with Lplt Stites wr 4.5. Other Nonsymmetre Bgervalve Problems 1 ASA Reolar Matrix Penis aod Weierstrass Canonical Form 173 452 Singular Matrix Penis aod the Kronecker Canonical Form 1 453 Nonlinear Eigenvalue Problems 1s 46 Summary it AT Reforences and Other Topics for Caper 4 ro 418 Questions for Chapter & 17 ‘The Symmetric Bigenproblem and Singular Value Decompo- sition 195 1 Introduction 195 52 Perurbatlon Theory wm 52.1 Relative Perturbation Theory oo 53 Algorithms forthe Symmetic Eigenprobera 210 ‘Tridiagonal QR Teraton a2 Rayicigh Quotient Iteration ais Divide-and-Conguer IT Bisetion and Inverse Keraton ms “iobis Method oe 5:36 Performance Cormperion 20 54 Algeitims forthe Singular Valve Decomposition an 541 QR Renation and Kes Variations forthe Bidiagonal SVD 212 5.42 Computing the Bisgoual SVD co High Relative Aecursey2i6 5.43. cobs Method for the SVD 219 5.5. Ditlrental Bavatons sod Bigervale Problems 2H 551. ‘The Ted Lattice 255 552 The Conection to Partial Diferenial Pquations «5: 250 56 References and Other Toples for Chapter 5 20 537 Questions for Chapter & 2m Iterative Methods for Linear Systems 265, 1 Introduction 208 62 Online Help for Taraiive Methods 2 83 Polesn’s Equation an 681 Poisson's Equation in One Dimension an {53.2 Poisson's Equation in Two Dirensons 20 533 Expressing Posson's Bavtion with Kronecker Prodvets 274 664 Sommary of Methods for Solving Poiwon"s Pquntion 27 155. 
Basic Iterative Methods 279 651 cobs Method ai 652 Gauss Sede! Method BS 653 Suosesive Overrlanation 2 66 os i) 55.4 Convergence of Jocai's, Gauss Seidel, ad SOR(a) Methods an the Model Probes 285 65.5 Detaled Convergence Criteria for Jacobs, ‘Gauss-Seidel, and SOR() Methods 286 656. Chebystey Acceleration and Syanaetse SOR (SOR) | 204 Kyov Subspace Methods a0 6.5.1 Extracting Information stot A vie Matris-Veevr Mul: Upleation ao. 662 Solving Ax ~ b Using the Krylov Subspace Ke 306 553 Conjugate Grodiet Meshod 07 {56.4 Convergence Analysis of the Conjugate Greens Method 312 665 Preconditioning si 666 Other Krylov Subspaco Algoitms for Solving Az ~ 6 320 Fast Foutir Transform = 7.1 ‘The Diseret Fourier Transform 3 67.2. Solving the Contioys Model Problem Using Fourier Seres a 673 Convolutions 2 67.4 Computing the Fist Fer Transforms 2 Block Cyele Reduction ws Multigrid 3 69.1 Overview of Multigrid on Two-Dimensional Poisons Equa tion ‘8 692 Detaled Deseviption of Multignd on One-Dimensional Poison's Equation Domain Decomposition 6.10.1 Nopoverlpping Metiads 6.102. Overlapping Methods G11 References and Otter Tops for Chapter 6612 Questions for Chapter 6 7 Tterative Algorithms for Bigenvalue Problems 08 Ta Intredetion 3 22 The Raykigh Fiz Method 6 7 The Lanczos Algorithm in Exact Arithmetic Sis TA The Laneaos Algorithm la Floating Point Arithmetic a7 75 The Lanczos Algorithm with Seletive Orhogonalization | 32 7.6 yond Selective Orthagonalization 8 ET Iterative Algorithms for the Norsynmetse Bigenproblom 386 TR References and Otter Tops for Choptet 7 aS 79 Questions foe Chapter 7 — Bibliography son dex. Introduction 1.1. Basic Notation tn his course we wil refer roquenty to matrizes vectors and sealers. A ites wil be dei by an Upper ese hater such a, ads (th slement wil be denoted by ay Ue mat sien by a expresion sb SAB, we wil ete (4-1 B)y- In detailed alrite dcr we wil sometime write A(,) oro be Mata [2] tation A=: 1) 10 este the subaati o A Ig in ows oh Ja ns Ug A wero str ie will deote nectar, nd Hs th ment be wtten Vectors wl aunt alae be clu vector which are Ue seine mates with one clan. Lomeree Grek ters (nd oceiusly lowercase hte) wl dene scl, R wll dente the et of ral uber 2, the set af mela rel etre asl =P", the st fry eal miriam C, Cand C=" denote complex taney, wets, a ats, repectively Oceana’ we wl ue the sbrthand "toate tht is tnm-by-n mart A willenote the onpasee tbe mat A (AP) = 9 Fer complex tices we wl alo see conigte tense A Cy a 3 od wl dole te teal ted Hnasary pts the copie tab 5 rspostivay, IFA ts mye, thn [4 the tym mare of abecate vals of ets oA: (Ay — a ete ie A | ace mat Componenti: inl < [foal adj. We wil also se ib ava va notation for vector (ah ~ ls. Ed of prot wl be male by 0, ad tn of exp ty. Other ation wil be traced a nd 1.2, Standard Problems of Numerical Linear Algebra We wil consider the ‘olowing standard problems + Linear mptems of erations: Solve At =. Here A isa given nbn nonsingular eel or ennplex mates, bis glen elim vector with ‘tre, and 2 isa colunun vector with nents tat we wish fo compute, 2 Applied Noserical Linear Algebra 1 Least squares problems: Compute the # hat minimizes Ae —blp. Here Ae mbyen, is maby, 2 8 moby and ia = VST elle the two-nerm of the veto y- Tm > so tat we have more quations than unkrowns, the system is caled overdterminad. In tls ce we ‘eannot georallysove Az — 6 exsetly. IF m vy thon we can reduce the eas freer to OG) by using band Chokesks. 
IP we Sy quite explicitly at we are tying to solve Poisson's esti on 8 Squnee| ‘sng 9 S-point diference approximation, whieh determines the matrix nearly ‘uniquely then by using the multigedalgosthm we enn edie the cost to (0), whic i ner fast 8 posible, In the sease that we use just a constant ‘mount of work per sluton component (scetion 6), 1.3. General Techniques "Tere are several geval conceal tec that we wl use epee 1, mati eorizations 2, perturbation theory and edition numbers ‘8. eflets of round! error on alent, including properties of Noting ola aithieti; Inston a {4 analysis ofthe sped of an lenin 5. engimoring numeri! sot.ware ‘We discuss eich ofthese billy below. 13.1, Matric Factorizations A factorization of the matsix A sa representation of A a 8 pret of several Simple” trees, whieh make the problem at hind eee to solve. We give to exarapes, BXAMPLE 1.1, Suppose that we want to solve Ar = b IFA Ts a lower tran- slo nti, on a) [a oar) a] _|& am oa = ton | Lae | Lon Js easy to solve using forwand substation: fort 10m Ube Deh aate/e nd fo ‘An anslogous idea, Sack substitution, works if Ais upper langue. To se this to solve a general system Ar ~b we nocd the following matrix factor lati, whieh is Just restatement of Gaussian elimination, ‘Tuvonuan 1.1. If the n-byn matrie A Sa nonlagulr, there eile a perm tation matree P the identity mate wih ts sows permated), a nonsingular lower triangular matrérL, end a nonsingular upper triangular metriz Usuch that A= Pe kot, Tastee a ~ b we salve the euisatent stem PLU & 0 follows: ie = P-l— PT (perinate entries of 8), Ue EMT) forward substitution), B-U-HE'PTH) (hack substitution) We will powe this theorem in set 28. 6 BxAMPLe 1.2, ‘The Jordan canonical factorization A= VJV~! exhibits the igonvalves and eigenvectors of A. Here V is @ nonsingular matrix, whose falonns incl the eigenwetors, and J i he Joan ennonieal form of A, 4 special triangular matric with the eigemalis of Aon its diagonal. We Wil earn hat it & sumeselly superior to compute the Sour fecarzaion A=UTU", where U is unitary mati (2. U's ealumns sr orthonormal), ‘and T's upper triangular with 4's eigenvalues oni diagonal. ‘The Seu fom, 7 can be computed faster sal more accurately tha the Jorda form J. We seuss the Jordan and Schur fuctoraations in section 42. © 4 Applied Noserical Linear Algebra 1.3.2, Perturbation Theory and Condition Numbers "The snsmers proce by nomereal algorithms are seldom exactly correct ‘These are two sees of ero. Firat, ere may be errors inthe input data to the algorithm, cause by prio calculations or perhaps ineasreent ert. Second, thee are errors caused by the algorithm sel ue to approximations ‘made within the algoritin. In onder 1 estimate the erors in the computed fnawers fom bot these sours, we need to undersea! How much the soltion fof «problem is change (or peta) ithe input data i sight porturbed EXAMPLE 1.8. Let J(2) be real-valued ontinnos fneton fa eel via {= We want to compute f(a), but wed not know x exacts. Suppose insted that we are given 2 and bound on 82. ‘The best that we an do (without snore information) i 1 compute f(z 4 Bx) and 10 ty to hound the alte ‘ror {a1 82) ~ f(a). We may use a simple nese approximation to f to get the eer bound fl | 82) fa) | bxf(2), and so the eve is [fa $32) — (0) ~ 6a f(a). We call L'a) the absolute condition numberof fst F172) is large enough, ten the error may be large even If Sr ssl In this ease we ell filbconditioned wt 2. 
© Wo say absolute condition number becuse i provides 8 bound on the solute errr [f(a +) — f(z) ven bound on the absolut change [in {he input. We wil als ote use the following essentially equivalent expresion tobound the err: wey a We "This expession hounds the slave errr [f(x §&e) ~ fla) /(2] a6 9 ml plea the velairechange || inthe input. The mutdpber, (2) -[l/a, [Sealed the relative eanation nantes, oe oftn jst condition amber for shoe. “The cadiklon number sal that we need to understand how error in the Input date affects the computed answer: we simply multiply the condition rombee by © boon on the inp eror to bound the error in the compte selution or each problem me consider, ill derive its corresponding condition 1.3.3. Effects of Roundoff Error on Algorithms ‘Te continue our analysis ofthe errr caused by the algorithm itll, we ned to sty the effect of roundof error In the atithmetic, or simply roundel foe short. We wil do so by using & property posse by mos goed slgorithis: Sackurd sabe. We define I as follows alg(2) ou algorithm for (2), including te ees of round, ve all lg) fackuand stable algorithm or f(2) fo all» thre Inston 4 is a Small Gr such that alge) = fe + 62). Bis called the leckwar error. Informally, we say Uh we get the exc snswer (ice 62) fora sighly wong problem (2 18x) "This mpl that we may bound the errr as error = alas) — f(a] = [72+ 82) — Fa) ~ P-L, the product of the absolute condition number [f(a and the magnitude of the backward error ||. ‘Thus, alg) is backward sab, [bx is always smal so the eror wil smal nls the absolte condition naar i la Thus, backward stability i a desirable property for an algorithm, and most ofthe algoritins that we proent willbe backward! stable, Combined wth the corresponding eondtion numbers, we will have error bounds fr all our computed solutions Proving that a algorithm is backward stable requires knowledge of the rouusol error of the bse Hoting pot operations of the machine ad how these errors propagate trough an lao. This i sessed in Section 13. 1.3.4, Analyzing the Speed of Algorithms In choosing an algorithm to solve a problem, ope must of course consider its sped (ith is ko called performance) well i backwarelstabilty "These are several ways to estimate speed. Given 9 particule problem istanes, ‘ potienla implementation of an algorithm, and» particular computer, one ‘ean ofcourse simpy ron the agli and see ow long it takes. This may bediflenlt ote caasuming, so we often wast simpler estimates. lad, we typleally want to estimate bow Jong particular algo would tke before Implementing. "The traditional way to estimate the tne an algorithm takes isto count the flops, o floating point operations, that ic performs. We will do this for all the algorthins we present. Howeve, this I often ensleading te e= timate on modem computer architectures, because I ean take sigafcatly ‘ore tine to move the dats Inde the computer to the place where I to to multipbed, say, than it does to setelly perform the multipieation. ‘This is espeilly true on parallel computers but also is true on conventional me- chines such as workstations and PCs. For example, matrix multiplication on the TBM RS600/580 workstation ean be sped up ror 65 flops (milions of eating point: operations pe scene) to 210 Mos, rear four Himes faster, lye iuisnsly reordering the operations ofthe stavar sgorithm (and ing Ue eoreet compiler optimizations). 
We discuss this ford in set 265, Ian algorithm i teratie, L, provdes a series of approximations con- verging tothe answer rather chan stopping after a fe! number of stops, then Wwe mst ask how many steps are needed to decree the eeror to 8 tle able evel. To do this, we aed to decide i the eonvergence Is Kear (Le, 6 Applied Noserical Linear Algebra the err dereases by # constant factor <6 < 1 at each step 50 that letron 2th aid ~ (an #49) t= lama) Few Paa 0 then / there «0 8 el */ eve Pt Pgs <0 then /* here i 08 [hl *Y/ ele 7 a 18 01008 */ eu nd whe 000 = (iw ha)/2 1.5. Floating Point Arithmetic “The number —8.1116 may be exprese in sient notation 2 follows i 5 31416 x 10 sign fraction base ‘exponent Computers use similar representation called floating point, but gee aly the base i 2 (with exceptions, such as 16 for BMC 37D and 10 for some spreadsheets and most eaeulsor). For example, 101013 x2" ~ 6.256 A floating point nuaber is called normalized i the leading digit of the fretion sneer. Fo example, -10101,%2° is aoemalze, but 010101) 28 ot. Floating pot numbers ee usualy normale, which has wo advantages: ‘ach nonaer floating point value has a unique representation as Bit sting, ‘and in isa the esi 1 in the faction need not be stored explicitly (because Tels always 1), lewing on extn it fra longer, move accurate faction, "The mos important parameters deserting fating point numbers are the Ie; che nuber of ss (it) in te feaetion, which dternines the prison: ‘nd the tumber of digits (its) in the exponent, which determines the expr ent range and this the largest and smallest rpresetable numbers. Difereat » Applied Noserical Linear Algebra ‘oat pot athetis aso die in how ey ound eorpated esl, what ty dlo about numbers that are too near zero (underflow) of too big (over flow), whether 200 is allowed, nd whether useful sonnunbers are provided (ometimes called NaNs,indefntes, oF eserves operand) are provided. We tomy te An Nas ‘urn yay operate with wldne ae rn rs sha cones Bs fv SaN Oa, te ‘Whose an aithmetie operation invalid nl prods an NAN, vero ve sro ope =, ourtown, an epi fg is sean alter be etd by the ts progam, hse tars pet oa to write bith mae reblog (ese tn prsrtm ean tet a core torn exepinn, ise of apy aborts extn) and fet ronan br tng mo posrarming with ota rae {© vid pombe but uk actin). For example, se Qin 1.19, ‘the comments following Lemma 5-3, and [80]. ‘The mod expat ero knoe fo hav btn chased by 40 inoropty andl Noting oi exon Te th cash ofthe Ara 5 rake ofthe rope Space Agecy ode 1996 Sew HOME/wianep heal or feral ov ll machines wo TEBE artnet or outer, hough nly sido. Th mot important mer xeon ar thse rachis poet by Cray ere ltioushfture generations of Cry ties ma TELE rthmeti Sin the ifloece beeen Ea) sompated on Cry tod a8) computed ons TEED machine nly fs the 1th dein pln o bron th er may wonder eter the irene pot Toned most sith in sumorea! near ltr re neste to cecal inthe way roudll ade But tr on Ut ome sgh ore cess 10 dig, or nore bl, wen roving is dow rote. Hee ar ‘wocamploe number qual 0 the singe precision ane revit the mane way We alae the Cony and Ta, wee a "a amputee bail en DEC Rip rceuam, wich we IEE thet ey ly {gern te Be oo ee eek) uy esc wu puch by Sm raphe in 10, Inston 18 When the Cray C90 sbiracs from the next smaller Roting pint m= tes, ib gets ~2-", which is twie the correct answer, ~2. 
Ging even tiny dilerenes to high elativesecuracy i essential forthe cores ofthe dlvde-and-cooquee goat for Hing eigenvalues and eigenvectors of yt race mateees, cure the foster salable for the rable. This Algorithm requires rather noninuive modiestion to guarantee eorecess fn Cray machines (0 soeton 5.3). ‘The Cray may also yield an ertor when computing arcon(2/ 777) breause excessive round causes the argument of areas to be lege th “This exnnot happen in IEEE arithmetie (ote Question 117) "To accommodate erorsoalsis ons Cray CW0 or other Cray machines we ray insted use the model ab) — (1461) 21 +43), OC0eB)~ (Web 143), and fe) — (9/8) +43), with [|< where e 8 small multiple of the Iasi relative representation ere ils wc sy Ut eoerect rong nd exer features of IBER arith mete ae designed to peserve as may mathematial relationships wed to Os some ertor measure, and J > 0s seme comparison value (ce seetion 1:5 for an example). By negligible we mean “ise [pel So a. ‘Too tha here fag tice top, chore slan(as!), © =n2) Elael Te ddan gh aan, Mate pa pb de we wn atte ae ction ng ty ae niet Tsamp nehate mpc Tones sno snd i vl of aro oct 43)” Rn yo (Sate pn eh th ur et dg nh ten ttn ithe th eof dite ot in he abn he stn, ec mp tf ed ne oh to 8 Applied Noserical Linear Algebra d (e299 Jeeves Fig. 13. Plt of ero bound om he sal of y= (2 —2) ental an Horners “Thin simple ipl rations betwen contion number ad distance tothe nearat pred pois very common anal mala i we ‘Nike belaning of the nroducton we sil that we would we ental forms of matics to el salve nar atv problens. For example, knowing Ue expt Jordan cazolal frm mnie computing exact elevalen tv ‘There an analogous earonkal frm for pom, which makes scurt polyoma evaluation eat: te) ~ auf (2 — In other words, we Fp eon he pote by te edn eect a and Hs rt ry Pus valle ps) we ws obs lait fort tod parler) ed foe Te is easy to show the computed p = p(x} (14 8), whore [6] < 2a; iy, wo always get pz} with high relative acurses, ut we ned he roots of the polynomial todo this 1.7. Vector and Mat Norms Nomis are wed to measure eres in max computations, 50 we aed to Understand how vo compute a manipute thn. Missing proofs ae lft 8 probleas atthe ea of the chapter Duraxrri0x 1.8. Lat B be a ral (comples) near space BY (or C°}. It armed if there isa fonction |] BR, which we eal a noe, satisfying al of te folowing 1) fl 20, oro ~0 Fond only i ~ 0 (postive deinitenes 2) col a Jor ony et (or comple) wl (homoge igh 3) fol f+ (he ange equal) BXAMPLE 1, ‘The most common norms are [ally — (Slo?) for 1 < p< which we all p-narms as wells [0 — Maxi whieh we ell the ce-marm oe infty norm. Also, if [| any ner and Cis ary nonsinglae mati, then Cz 8 also nor. Ma wo thn there ae many noms that out neo mensure eon i imports to chon an aprons ete For exc es) 2 in intel [01,201 20} mtrs. Ps pod ppraon toc Bnet restive evr Els = po ae a [10,201,200 is «bd approximation becuse 24-2 — 3, Bt suppose the ist component 2» Applied Noserical Linear Algebra 's measured in Klee instead of meters. "Than inthis norm y ad y lok om oot 01 a] | anf ton], ans mse 4 a This "To compare #1 and 2s, wo sould use 7 ‘to make tho units the same oro that equally Important errors make tho neta scaly tare. ‘Now we define inner products, which aro generalization ofthe standard dot preduct Sy, and arse frequent Kncar algebra. Derininion 14 Let Be a ral (compler)tinear space (1): B x B — R(C) ‘son inner proviral of the following apy 1 fe) = (a2) (or aD), 2) (rae 3) = (esa) ea 3) (az) ~ a(a,y) for any ra! 
(or comple) selar a, 4} (x2) 20, and (22) — 0 and only Fx ~ 0, Hel ExAMrU 1.5. Over Ry (asa) = ae = Ty zum, and over C, (0) = ve Eesha ae prods (clea = 9 th ot erp we Derininion 1.5. and y are orthogonal if 9) = 0 ‘The most important property of an inner producti that it sates the Cast Sty Th cn be wa itr ow ht a Lemsta 1.1, Coueby Schwartz inequality. |(2,9)] < Va) Tiv)- Lesa 1.2. VERT «norm, ‘There isa one-to-one correspondence between innae-prodvets and syne te (Hermitian) postive defnte matrices, as dfived below. ‘These matries Arise frequent in applet ins Derinition 1.6. A rat symmetric (compler Hermitian) matriz A is positive definite 2" Ax > 0 (2dr > 0) for all 2 ~ 0. We abbreviate symmetric bostee define to sp, and Heritian positive (0 pd Inston a Lean 1.8. Let B= R® (or C9) and (4) bean inner product. Then there Sen nbyn spd (Aged) mate A such tht (zy) —y Ae (yz). Cone ersey, AS spd fp), then yi! Ae (yA) 6 an Somer product. “The folowing two lemnns aro useful in converting eor boul in terme ‘of one norn ta error bound in terns of eer rannen 14. Le lla al be t80 norms on R (or C8). There are fematants¢1,04 > 0 ech thet, for at, cilele © Holy $ alta. We aso Sy that norms jl end yar equivlen with epee to constonts ey and Lowa 1.5, Ih < lah < vale, tle = lal = Viel, tle = lah = mize In alton to vector norms, we will lo need matrie norms to reasire Duvixrriow 1.7. [6 mate worm on m-byn matrices if i vector ‘orm on mm dimensional space 1) JA] > 0 and [Al =0 gf ond only A =O, 2) adil — lal Al, 8) [4+ By S [Al 1 EXAMPLE 1.6. may lo alle the mae norm, and (Slo P)" = Lal inal the oben norm. ‘The following defiition is wseful for bounding the nora of a product of ‘mates, something we often ned todo wha delving eror Bounds DDueintT10N 1.8, Let Iwan 6€ a matric norm on m-by-n matrices, lp tea matre norm on n-byp matrices, a | ny be 0 matrix norm on ep matrizes. These norms ere called val consistent fA Bllnxp [Ales [Blosps where A is m-bpn ond B is ny Duvisiti0N 1.9. Let A be m-ty-, lye @wetor norm on RO, ado tea sector norm on B®. The oe ltl, ln = as {scaled an opeesior net oF induce norm o subordinate inte norm, 2 Applied Noserical Linear Algebra ‘The net lem provides large source of mate norms, ons that ew ‘efor bounding errs, Leanna 1.6, An operator norm is 0 mats norm. Orthogonal sn unitary matrices, dei reat, ee eseatal ingredients of nae ou lgoritas for last squares robles sa eigen problems DEMATION 1.10. A ral square mats @ i orthogonal FQ"! = QT. A ‘eompler square mari i unltary if Q~* = Q" Allows (or class) of orthogonal (or uritary) matvees have unt 2-nomns andar orthogonal to one another, siace QQ" ~ QTQ—1 (QQ — @°Q~ I) “Tho next kana summarizes the eseatlal properts of the noes and maize we have Intoicd sofa, We wil use these properties later in the book Lista 1.7. A 0 since if ne, say A, wore negative, wo would take as its eigenvector andl got the contraieion O'< [AgI2 — @? TAG — aX Nigl <0. Therefore [Arp (at atAgy? (Qagray Mb = 23s "Sch —— 2 Ich nay CRAG"? eo Qala mux Tomy Sa Von ie which Is ttalnable by chosing y to be the appropriate column ofthe Kenly uate, 1.8. References and Other Topics for Chapter 1 AL the ead of eoch chapter we will Ist the references most relevant 10 that upter. They azo aso Usted alphahetially i the biblogeapy a Ube ed. 
1 ‘allion we wl give pointes to elated topes not dseussod in the sain txt ‘The most modern comprehensive work in this aris by G, Golub an C, Van Loan [19], which alo has wn extensive bibliograph A reent undergrad tte level o binning graduate tet in his materia by D. Watkin (250) ‘Another good grade tox by L. Trefethan ard D. Bia (21). A ease ork that i somewhat dated but sill an exceleat eetereae by J. Wikis [250 An older but sill exon book atthe sume level as Wats i by Stewart (2) ‘More detailed information on eror analysis ean be found in the recent hook boy N- Higham [147 Older but sil good general references are by, Wilkinson [250] and W. Kata [135 “What every computer sist shoul! koow about floating point arith metic” by D. Goldberg i good recent survey [117 IEEE. arithmetic de- seribed formally in [11 12,187) a8 well sin the reference manuals public tyrcomputer manufacturers, Discussion of eror analyse with IEEE arithmetle nay bo four in (58, 0, 1ST, 150] ad the eorenes ete beret, "0 more general discussion of condition numbers ad the distance to the rarest ilposed problem given by te author in (7) 8 wel asin serie of papers by S. Stale ad M Shub 7, 218, 2, 29). Vector aul mate oma are dseused at length in 19, sets 2.2, 2) a Applied Noserical Linear Algebra 1.9. Questions for Chapter 1 QuasTios 1. (Baty; 2 Bai) Let A be an orthogonal matrix. Show that ddet{A) ~ 1. Show tat i B algo orthogonal aa det{A) = de}, then A Bis snguler QuesTION 1.2. (Rasy: Z Bai) The rank ofa matrix the dimension ofthe space spann by tseaumns. Stow that Aas rank one fame only if A = ab {esome cluan vetoes db Questiox 1.8, (Basys 2 Bei) Show that fw matrix orthogonal a tran- ular, thn it ingot What are is agonal elements? Question 14. (acy; 2 Bei) A mote Is strictly upper triangular I I per telangular with zoo diagonal erent. Show that i A is strictly uppor ‘viangulae and n-by-n, then A* 0. Questios 1.5, (asys % Bai) Lat bea vector norm on Rand assume that C¢ R°™", Show that if enka) =n, then [aio = [Cz isa wetae a YON 1.6. (Basy: Z Bai) Stow tht FD = 9 €R and BER, thea H(-8) Queso 1.7. (Hasys 2 Bai) Verity that ny" lr ~ bzila = lalla for ny 139 ee ee — QussTiON 1. (Medi) One can Went the degree d polyaasaep) ~ Sef au! with Re" vn the vcr of oficons. Lae # be fel. et Sy be {i et of pyominis wth an infiie reltve condition number with epee to crloting them a2 (Ly they are mr 3). nfo words, dserin S feormtrealy ws suet oF RE. Lat Ss) be the set f pgm wn ‘nie coon number sor grater. Deerbe Se) Rental In 9 few words: Deseribe how Si) changes foetal = 0 Quesmiox 1.9, (Motiumy from the 1905 final enum) Consider the figure be tow. It pots the function y ~ lg(1 + 2)/s computed in two diferent was, Mathematically y i & smock Sortion of = ree” ean I a Bat Ste compte ssn this form, wget te plots on the et (shown in the anges 2 € 1,1] an the top lst aad © [-10"", 10°] on the bottom let) "Tis form i lor unstable near # = 0. On the otbee hand i we ase te slgoethen Inston B a=tte a1 thea is loa d—1) ad we gt che two plots on the right, which are eoreet ear = 0. Explain this Phenomenon, proving thatthe sco algorithm mast compte an nourate Answer in lating point arithmetic. Assume thatthe lg feetionrearas a ‘accurate anywee foray argument. (This tre of any cease implemen: tation of lognrithm.) 
Assume TEER doting pont aithmetie i tnt rinks your argument csr, (Both algcthas can alfunetion ona Gray wachine) rebate ebsant retawane ove Quug08 1.10, (Medium) Stow that, baring etow or unertow, SSE yn) = Bet gd + 4), where [ke Use thn to prove the folowing nc Let 4% and 5" be mats, sl compute thir padi ithe inl wa Baring averow oF unertow sow tat fA 8) AB = ftve-[Al- [Bl Her the absolute value ofa mats [4] means the mai wth ets (4) = Ina the noua s meant componente. “The reult of hs question wil be vein eto 2: where we nae ‘he round rors In Gaus lint 26 Applied Noserical Linear Algebra Questios 1.11. (Medion) Let J bea lower teangular ati aed save Lr Dy forward substitution Show that basting ovriow or underw, the com- ule solution # sais (L-+ 41}2 — 6, where |fls| < ney], whewe «isthe Inschine precsion. ‘This means tht forward substitution i backward sable “Argue tat bockward substitution for solving upper triangular systems sts the same bound ‘The result ofthis question wil be used in section 2.4.2, where we analyze ‘he rose errs in Gas bation, QuesTION 1.12, (Median) In oer to nnalyze the effets of rousing eros ‘have used the flowing med Sc equation (11): food) =(@oHi 48, whore © sone ofthe fur baste operations, ‘hat our analyses also work for compler data, we need fo prow an analogous formals for the four base omplex operations. Now 6 wil be tay compen ‘umber bounded in absolute value by w small mukiple ofc rove tha this 5 true for comple addition, subtraction, mukipiaton, and eivsion. Your lgocthm for complex cvision shook! sixcessilly compute e/' ~ 1, whore [oli either wey Inge (larger than the square root ofthe overflow threshold) fr wee smal (smaller than the square root of the undertow tes) 1s {rue that out the rea ane imaginary parts of ce eomple prt are alas puted to high slative acenrey? QueSTION 1.18, (Median) Peove Lem 13 a ON 1-14 (Medi) Peowe Lemans 1.5. ON 1.15. (Medi) Peowe Lemna 1.6 QueSTION 1.16. (Medium) Prove all parts except 7 of Lemma LT. Hint oe prt 8: Use the fact thf and ¥ ape both mby-n then XY aad YX hae the same eigenvalues. Hint for poet 9: Use the fact that mates normal if snd only iit has compete set of erthonaeaa elgeavetors {Question 1.17. (Hard; W. Kahan) We meationed that on a Cray machine the expression arecs(//2¥ |g) caused an eror,breause roundel caused (ei V2"1 #)toexeeed 1 Show that this imposible using IEEE eithinetie ‘beng overtow or underiow. Hint: You wil nce to wse more than the simple model fe) = (a 6K +8) with |) soa, ‘Think about evaluating v=, tnd sho that, barring oveetlow or undeeiow (2) ~ x enact in numerical tsperiments dane by A. Li, this fled about 5% of thetic of Cray YMP. Your night try some nomerical experiments and explain them. stra credit: Prove the sane result sing eoretly rove dsimal athe. (The root ‘elilereat) This question & doe to W. Kahan, who was inspred by a bia ay prose of J. Sethian Inston a QuESHON 1.18. (Hort) Suppose «nd b ce normed IEEE double pre: sion Mating point numbers, ad consider the flowing lgoritin, running With IEEE arthmetes Ir (la) 8 “This routine is important enough that i bas been stasarized as 0 Basic Linear Algebra Srowtine, or BLAS, which shouldbe wailabieon all chines [N67] We disens the BLAS at length in section 2061, ad documentation fn sample implewentations may be found at NETLIB/blas. In particular, see NETLIB e-bin/ epg pl/as/sarm2.f fora sample implementation {hat Is propertles 1) and) but not 2). 
These sauple implementations are Intended to be sarting points for Implementations specaled to particular sehitetures (an ease problem than prodoing completely porte on, 38 quested In this problem), Ths, wen writing your own numerical software, sou should think f competing aa building block that should bo avilable Jina numeial brary oa each machine. For another earful implementation of 2 0 (4 You ean extract test code fon NETLIB/blos/slatt co se f your imple mentation Is cereal implementetios tuned in must be thoroughly tested swell timed, with times eomnpared to the obvious algorithm above on those ass whore both ru. See how close to satisfying the thre conditions you ean cexne; the frequent use of the word “nearly” In eondtions (1), (2) aed (8) ‘hows where you may compromise n attaining one condition i onde to more rary attain another. In pater, yoo might want t so how mach ease the problem if you mit yourelf to machines running IEEE arthmote Tint: Assume that the values ofthe over and undertow thresholds are salable for your algorithm. Portabe software fr computing these vals ailable (eo NETLIB ein netbget.pl/Inpoc tl sameh.), QUESTION 1.20, (Basy: Mein) We will us » Matlab program to illustrate how sensitive the rots of polosnial ean be to stall perturbations in the ceicients. "The program i wile” HOMEPAGE, Matis polyp Inston » Palyplot takes an input polynomial specie by its roots a then ads random perturbations t the polynalal eoalieats, ampules tbe peered roots, nd pls sae. The tps re 1 vector of oats of the polyeial {¢— masiminseative pertnrbation to make 10 each coteent of the polyol, ‘m— numberof random polynomials vo generate, whose rots are pote 1. (ay) he fist part of your assignment isto rn this program forthe falowing inputs. In al eases coo high oh Ut ou ge 8 fly ‘dense pl ut don't hve co wait oo long. m — Te hare pecs 1000 ero. You may want to cisnge the aes of te lot if the graph 1s to stall oto large oC}; = Ted 104, 165, 166, 107, 108, # £(1:20};¢ = 169, el, 1013, Les, 2 F-24816... 1021 eto, 162, 108, Lot (la this ease, use ‘xi fet-) and srlloes(reale1) mag). ) [Aloo Uy sour ova exampie with comple conjugate rots. Which foots 2 (Aedium) The soond part of your ssignment sto modify the program to compute the condition number e() foreach root. Tn other words, relative petirbation of «in ach coeicient should change root) by ‘most about et), Modify the program to plot circles centered ar), with od ee), and confirm that those cireks encloe the peeturbed roots (at last wen is stall enough tat the Fneaization used to erive the coniton ner acer). Yu shal tra i 8 few pls with ices sad portrbeeigenvalies, and some explanation of what sou observe 1. (Afedin) Inthe lst pt, atic that your foetal fre) “bows up if ‘v(e) — 0. Thisconaltion mans tate) is a mail tof pla) ~ 0, ‘We ean sll expect some accuracy in the commuted valve of multiple root, however, el inthis part of tho question, we wil ak how sensitive ‘multiple root can be: Fist, write pla) — a) (2 — n) where eG) = 0 and m sche mulkpicty ofthe root 1). Then compute the ‘m roots nearest (of the slightly perturbed polsnomil pz) ae, land show that they die frm ri) by fe. So that if m — 2, for instance, the rot ri) perturbed bye, whieh is much larger than Cif kL Higher vale of m yield even larger perturbations. Ie is ‘roandrchineeyilon ane presents rowing etor in compu the root, this menus an mtyple rote low al but 1th os Sgt is 0 Applied Noserical Linear Algebra Question 121. 
(Median) Apply Algorithm 1.1, Beton, to fad the rots af x2) — (2 2)" = 0, wher p(2) is evaluated using Horner’ rule. Use the Matlab implementation ia HOMEPAGE/Matlab/bisce.m, or ese welte yout ‘own, Confirm thst changing the Input Interval lightly changes the eormpted ‘oot drastically. Movify the sgorthm to uso tbe eror bound dseussed inthe text to sop biceting when the roundoff etor Inthe computed value of p(2) es 0 largo hat its sign cannot be determined Linear Equation Solving 2.1, Introduction “This chapter discusses perturbation theory, lgoithns, and errr analysis for solving the linear equation Az —b, "Tho algrthms are all variations on Gaussian elimination. ‘They ar ell diet methods, beans inthe absence of roundoff exor they would give the exact solution of Ar = b afer nite rumber of steps. In entra, Chapter 8 dnesses iterate mets, which ‘amp a saqence 91,42, of eve better approsimateslitions of A Tone stop iterating (eompating the ext 2.) whea is accurate enough Depersdingon dhe mate A and the speed with which x eonverges to ~ A", ‘det method or an erative nuethod taay be faster or ture accurate. We wil dsciss the relative merits of diet and Reraive methods at length in ‘Chapter 6. For now, we will just say that det methods are the methods at sie when the user as no spectal krowhge about the source? of matrix A fr when a solution Is requlved with guseanted stability nd In a guerantes ‘mount of te. "The rst ofthis chapter is ongantze a follows, Section 2.2 dscusss per- turbation theory for az —b; it forms the bas fr the practical error bain in section 2, Section 2:3 derives the Gaussian elimination algorithm for dense ratios, Section 2:4 analyzes the erors in Gaussian elimination nnd presets practical ertor bounls. Section 2.5 shows how to improve the securaey of 8 ‘olution computed by Guise elminstion, sing a simple sind inexpensive iterative method, To get high speed from Gaussian elimination and other Tinene algebra algorithms on contemporary comptes, eae mast be taken to ‘nyse the computation to reper the computer memory organizations this ie diseussed in section 25. Finally, section 2-7 discusses faster variations of ‘Gaussian elimination for natiiees with spell popertes coamnoaly aes in practice, such as symmetzy (A~ AP) or sparsity (when many ents of Aare 20) a 2 Applied Noserical Linear Algebra Sections 22.1 and 2.5.1 discus ecent innovations upon which the sare In the LAPACK trary depends. There are 9 sacely of opm problems, which we shall mention 8 we gp slong. 2.2. Perturbation Theory Suppose Az = b and (A | 6A)2 — 0-4 Ab; our goal Isto ound the norm of fies — sr, Wosimply subtract thes two equalities and rlve for A one way to do this sto eke (Ay sAyeey 82) = 04 a - lar = 4 Tae TAT Ear Bb ind rearrange to ge fe AN -BAB 48) ey ‘aking norms aad using pat 1 of Lemna 1.735 well the Lange inequality far vector norms, we get [6 < FA (IGA + 1. 2) (We have assumed that the vector norm and matrix noetn are consistent, af defined in sesion 17. 
For example, any wetor orm and its induced maleix ‘oem wil do} We ean farther rearrange this inequality 10 get Wel 4-1 i ee aii) ta “The quantity nA) = 4°] tbe conton mambo?” of the ate A, because eases be eltve eange Ae inthe answer os «lie ofthe rave cha Ht in he dt, (Toe repro, we nd Yo show ‘tat iequaly (22) Ian equal for som sonar coke fA an 8; Ctiewist (4) woul ony be a per bound ete coon uma, See veion 23) The quant mukipyng nA) wil be sal 54 aS re ttl yelling sll poe bur on the ete err eh “the upper bond depts on Se va) ch mak i sem hd 0 interpret satay te sn prvi, ane we nwt compte ston and cn scien te und. Ween le derive * hey mae atte hound tnt dos pedo 8 flaw "Gas pt, sion mb with pct pon mai inves inn te r,t wt Linear Bavation Sosing 8 cua 21 Ul aay JABS UAL BL Then LX 1 ne tha FOX ie inentibe, X= SEX", and XPS oe Proof. ‘The sum 32%y Ni sid to converge i and only ft eons in ‘each component. Wert the fact (fre applving Lemma 1.4 to Example 1.6) that fr any acer, there is constant sh that [gl & e- IX]. We then a0t (Xl Se-| 83] Se- [so each component ofS: X" is doainted by ‘convergent goometrie series SANS and must converge. ‘Therefore Sy Sty AT converges to same $ asim — oc, and (I = X)Sp = (F = XY + KEKE eee) XH OT as no cy since [XI < JAP ‘Theeetore (/—X)8— J and 8 = (UX) ah Ngee s Pees EE i © Solving out fst equation Baz + (4 + 3Atx = 4b for br yields fr = (AY 6A) *(—6Ar +80) = [AU + AA) ae 480) = AMAA be + 80, “Taking norms, ving both sides by 2 using ect 1 of Lemma 1.7 aod the triangle inequality, a asin that 8 8 small enough so that JABAL = 4% fb <1, we get te desire bound Mak gr acinar ay gat (reay HO Wa [Salas [4-4 Lerma 2.1 impies thot 1} A-H8A invertible, ae so A+ is invertible "a sow the minimum equals pa, me constrict & A of oe ge rch that As Ae singulr. Note that sine [4H ~ mayaa A, Aber xs az seh ct fle = Land LAH = 4a > 0. Now ke a= hs sols 1. Let 6A hr. Ten in ala = may Malla = meg it wore the maximum i attained when =i ony nonzero multiple y, and A 8A singular because (by ay 0 ho We have now seen thatthe distance to the nares ikposed problem equals the reeiproeal of te eoudtion number fortwo problems: plyonia evalation fant Tneer equation solving. This repeal relationship ls quite common In ‘numerical sls (70) ere sa slightly diferent vay to do perturbation theory for Az = by we wil need Ie to derive practieal erro bounds later in section 2.44. IF s any ‘eetor, wo can bound the difrence 8z = 2 ~~~ ~1b m folows. We let 5 = Abbe the residual of 2 the residual rs zo If = x. This lets us wwite Sr = Ar, yielding the bound [2 = 1A < LAD es) "This simple bounds atesetive vo use in pete, sine Fs easy to compute, sven an apprasinate slstion & Furthermore, thee is no apparent etd 10 fstlmate 6A nd 6. ta fet on two approaches ar very eloslyroated, 96 Shown by the next thor. ‘Tunoneat 2.2. Let r= Ab—0. Then there exits «6A such that Al — fh and (At BA\E = be No 6A of saler norm and sting (A + BA} 6 ois. Ths, 6 i the smallest possible nclward exer (measure norm) ‘Tis is tre Jor any ceton nora end induced nor (or =p Jor wears ‘end Ie for matrices) Linear Bavation Sosing 3 Prof, (A+ 6A = bi acd only 54-3 =b~ AB = —r, 0 l= IAB S 1S, implying Isl >. Wecomplte the proof ony for he we- nom and is duced matrix norm. Choowe 5A = 347. We can ely verity that 54-8 ——+ nd la = et. “Tos the ale [htt cul! yi an siping (A-44}8~ Band = Abbie gen by Theorem 22, Appling ere nd (22) ith 8 0) ie vs art (EE) = 14-40 te ae ou (28. 
‘All cu bounds depend te aby 10 etna te calito sander ai EA“, We tar to this probe a sction 2.43. Coon aber ‘stats ae comput by LAPACK rons sith a goer, 2.2.1. Relative Perturbation Theory In the list section we showed! how to bound the aor ofthe enor x = — 2 inthe approximate solution of Az ~&, Our bound on [zl ws proprdional to the conition number (A) = AAI mes tenors and [8], where sisis (A 6A)2 6:48 Tn many cose this bound ie quite satisfactory, but not always. Our gol in this section isto show when i is tow pessimistic nnd to derive an alternative perturbation theory that provides tighter bounds. We wll use this perturb on theory late in seeton 2.1 to joni the error bounds eaapaby the LAPACK subronties like gests This seetion maybe skipped on a frst rvding Here isan example whore the eeror bound ofthe last section is much to pessimist, EXAMPLE 2.1. Lat A= diag) (a dlgonal mates with entries any — 9 fad e22 = 1) and 6 ~ [91], whan > 1, Thon — A" = [1]. Any roxsonabledret method will solve Az ~ 6 very seeuraely (sing two divisions ‘fag to gee ye the eonition number mA) — 7 may bo arbiter lange Therefore cur error bound (2.3) may be arbitral ‘he retson that the condition number x(A) leads us to overestimate the ‘roe is that hound (2.2, from which ie comes, sume that 84 is bounded in morn, but és otherise erktrry this ned to prow that bond (22) i attainable i Question 23. In contest, the 64 eoresponding 1 Ue asl rounding eerrs isnot arbiteary but has a special structure not expired by is norm alone. We ean determine the salest 6A coresponting to for ‘ur prablem a follows: simple rounding erzor analysis shows that — (Oia) (E+), wheee |] Se. Thus (oat au —b We may rowste this % Applied Noserical Linear Algebra as (AF 6A} ~ b whese 6A ~ ding yay, S403). Then [A] can be as lege tax; [eag| ~£7- Applying eror bound (23) with 3 0 yields eee ws <7() 9 In contest, the setual error stifies Wel = e—ahe [Lei 3 fore DL = [Bei Ha) - (ie Saga ae sete, shih iv about 7 ies stale. © For this example, we can dscibe the structure ofthe actual A a flows loo < dla, whore cba ny number. We wete this more suesnety es [fal saa] e8) (eset 11 for notation). We abo say that 6 i «small eomponentise ‘elaine pertrtnton in A. Since A ean often be me to satisfy bowl (2.65) in pret, along with [6 < [l(a setion 2.51), we wil derive portrbation {heey using thse bounds 0a 5 a 8. ‘We begin with equation (21: Be Asa 4 8b) [Now take absolute values, and vepetedly usw the tangle Inequality to got lia] = |a-l(—sae +40) AM" a+) [Nel al +e) ALD ‘Now using ny wotor norm (ke the ini TTI [ll we go the bound one. of Frobenius norm), where [oe < ellAHCLA = + en Linear Bavation Sosing eo Asuming forthe moment that 8b = 0, we can weaken his bound to I< ela ANE 18 Wel ya te “This eal us to die kom A) = 14-4 -[Al 99 the componente relate canton numberof ors relative condition nub shor. I me tes alo elle ue Rau coulon rrr 8) or Skea eon tunes [aa 224 25, Fora pot that boul 21) and (2) ate ata soe ueston 24 Teel at Toren 2.1 elated the conton number xl othe dance fron othe nearest singular tate For ase Interpeaton of wont), see 7, 206) EXAMPLE 2.2. Consider our curler example with A ~ dingy 1) and 8 = [ps1 Tes easy to conti that went A) — 1, sine [AMY] [A] 1. Indeed, ‘ken(A) = for any dlagoaat mat’ 4, eapturing ou intuition that a digoaal ‘tem of aquttions should be slble quite accurately. © ill ea) More generally suppone Di ny nonsingular diagonal matrix and Bis an arhitraryronsinglae matrix. 
Then kealDB) = | KDB) (DBI [1-Day 1B" 101 = mont Bh ‘This means that 1 DB 1s badly Sela, i. Bis welLenniionsd but DB 's badly conditioned (because D has widely varying diagonal ects), then we should hope to get an aoeurate solution of (Di}2 ~b despite DB's ie ‘conditioning. This discus further In sections 244,251, and 2.2. Finally inthe last sation we prove an error bound using only the residual 7 AB U6el|= LAME SHA“ es) Whore we avo use che tingle Inequallty. In secon 24.4 we wil sce that {his bound ean sometimes be mich smaller than the sinular bound (2.5), in particular when Ais badly sealed. There i aso an analogue to Theerem 2.2 hist “Tupone 2.3. ‘The smallest «> 0 such tha there exit 8A < eA ond a = setifping (A5A2 ~ 86 sella the components relative backward ferots mny be expressed in terms ofthe residual r~ Az ~1 as follows compel Ta * Applied Noserical Linear Algebra or prof, see Question 25 LAPACK routines lke agesr: compute the componentwise backward ee stlveeroe¢ (the LAPACK favable nin foe ¢ is BER). 2.3. Gaussian Elimination "The base algorithm fr saving Ax = bls Gaussian elimination. To state it, we fist aed to dene a permutation metre DDeriutioy 2.1. A permutation mats P 4am identity matric with permuted ‘The most important properties of a permutation mtr are given by the folloing emma. Ln 2.2. Let P, Pa, and Py be nly permutation matrices and X be an retyn matriz, Phen 1, PX is the seme 06 X with ts rows permuted. XP isthe same as X with, ‘ts clans permated 2 Per, 3, deur) = 1 A, Ph Pe i also a permutation matric Por a prof, see Question 26. [Now we ean state our overall algorithm for solving Az =. Auconrrin 2.1 Solving Ax = using Ceusionelininetion 1, Paelorice A nla A= PLU, where P= permutation mates, 1 = init liver tlangular matri (with ones on the agonal, U = nonsinguar per triangular wai 2, Solve PLUS ~b for LU by prmting the entries of bs LU = Pb = Pp, 8, Solve Le = P% for Us by forward substitution: Ux = L-*(P-). A, Solve U2 = LH P-*8) for by back substation: 2 = 0-H" P%, Wi wil derive the algorithm fo factorizing A = PLU i sever ways, We begin by showing why the permutation mate P is newssay. Linear Bavation Sosing 0 Darisrni0x 2.2. ‘The lealing by principal submatrix af A i ACL: j 19) ‘Tuwons 2. The folowing two sttements ore epeicalent 1. There rise @ unique unit lower triangular Land nonsinguer upper triangular U such that A EU, 2, AU ading principal submatrices of A are nonsingular. Proof. We first show (1) implies (2) A= LU may also be writen Po Pee An da Lyn Ente Tan Hain t Lai where Ay is jby-jlening principal submatrix ane Lay and Ui. There fore det Ag ~ esl) det Pan dey — 1° TE, Udon = 0 ince Ls ni eongblar ad 1 i eianghl. ‘We prose that (2) imps (1) by induction on. Te is easy for by] Imatioes @~ 1+. To prow it fr mby-n mates A, we me (ond unique (0 T}sbye(a— 1) tingle matrices Feand U otigve (a I)by-1 vectors Ad, ad unique nomena sear 7 sch that oe ee [By indvetion, unique nd U exist such Usa A= LI. Now let w= 1, = CIV, sud ~8~ Ey, all of which are unique. The diagonal ens af age nonateo by invetion, and = 0 sine 0 det) = de) -n- "Ths LU factorization withowtpiving ean slo (welkenined) ron singular matics such a the permutation mate oro por roo the I-by-t and 2by-2 eading pelapal minors of P are slagula. So we nod to iatroducr permutations into Gaussian elinination ° ‘Tuvons 2.5. If As nonsingular, thn thre exit permatetions Py and Py ‘ami lower triangular matrix Ly ond 0 nonsingular upper trimgular mati 1 such that PAP, ~ LL. 
Only one of Py and Py is necessary. (Note: PA reanders che rows of A, APY seoders the ents, end PAB, reorders bth ry Applied Noserical Linear Algebra Proof As with many mates factolations, sles to understand block 2by.2 matics. More formally, we use induction on the dimersion n. Tt eas) for I-by-l matrices Py = L— 1 and U— A. Assume that It {tue for dipeasion » — 1. IFA is nonsingular, then It has 8 nowzero entry: ‘cone permtatlons Pf snd P% so thatthe (1-1) entry of PEAPE Is nonzero. (Werte only one of Pf and PY sine nasingularky ples that each om and ach column of has « nonzero entrs:) [Now ae write the desired factorization and solve for the unknown eompo- Tmt = [dele 309 [ad sot a | am winre Azs nd Alas ae (n= I-bs4n =D Solving forthe components ofthis 2-5-2 Bock factorization wo got wi = f= 0, Ue = Aaa and Lay = Aa Snoe wry~ an. = 0, we ea soe for Lay = Ba. Pilly anUia + Aaa — Aas itplics Aaa = Aza ~ Lal We want 0 apply indvetion to Aga, but to do so we nat to heck that det Any ~0: Since det PRAPS Edel AO and abo 1 arses 9) da [U2] 1-0 ees ee “rh ent ex eates Aan tt Fhe Satin nt ein es a nae = [2 9][S wider] [at] ote] [%! ote] [A aral[S [6 a] [alfa EEE NLS a] sme ge the desired fctoriaton of nan = (3 8le)aCal2 8) aia f]C3F) » Linear Bavation Sosing 41 "The next two corals state simple ways to choose Fy ad Po uarantce that Guusianelanation wil succetd on a tonsngular tates Conon 2.1. We cam chase PL 1 and Ps that ay i the largest entry ‘mabe value ct, lich imple = 38 has eis Sowded By {i absolute ets. Moe general atte fof Ganon elimination, whe te are computing te th cat of we rede the rows 30 that he largest nur i he colt som the ogona ‘This ellod “Cause elimination {Sih portal pst,” or GEPP for ator. GBPP guarantees that ll entree Of are toned by me in aboie wave. GEPP is the most common way to implement Gaussian elimination in practice, We discus its numerical stability inthe nest section, Another more ‘expensive way to choose and Pe is given by the next corollary: Its slmost owe use in praetic, although tere are race examples where GEPP fas but the next method seen in computing an acute mse fee Question 21). Wo dseas rity i in the next eon ml. ‘ConouaRy 2.2. We eam chose PY and Pl so that ayy he larest entry ‘absolute value ithe whole tatr More generally, a! step of Gasian lamination, were we ore computing the ith column of Ly we render the rows ‘and colar stat the lyst entry in the atric ison the agonal. Ts fall "Goasian elsnination with complete pivoting,” or GECP for shar. ‘The following algorithm embodies Theorem 25, perforng permutations, ‘computing the fst colunn of Za the frst row of, and updating Aza to get Arg — Aza—IayUa. Wo wrote algorithm fst in conventional programming Innguage notation snd then using Matlab notstion, Aconrmit 2.2, LU fctrisation with pivoting fort ton-1 ‘apply permutation soa =O permute Land U 100) 7 Jor example, for CEPP, soap rows } andi of A and of L tere [al the largest entry om (AU: m8 for CECP, swap rows j and tof A end of L, tnd colamns kari ofA and of tare laa the lagest entry im [AG sm) 17 /* compte colon sof (Lan 8 (210)) *7 fos Nt y= anlaa end for 7 eampute ow of U Win 60 240)) */ for sito 2 Applied Noserical Linear Algebra fend for 7 wplate Aug (0 9c Aaa = Aaa ~ Lata (210)) */ frit tion fork =i 1100 = Hs nd for end jor nu or ‘Note tht onc column fof 4 is used to compute cluma if it is never ‘ve gin. 
Similarly, cow i of A i never used agin afer commuting row fof U, This bts ws overwrite Land U on top of Aas they are computed, 0 we ree no exten spice to store thems; L oeties the (ret) lamer trang of A (the ores on the cago of Leave not store expt), ad L aocuies the Upper telnale of This simples the slgortha 10 Aucontmns 2.3. LU factorization with pivoting, oveneriting Land U on A Jori=1ton=1 any permutations (re Algrthn 22 for details) fojeiiiion fy = ay/04 endfor frp si.ion fork ition endfor endfor nd for ‘Using Matlab notation this further reduces to the following algorithm, Auconernt 24. LU factorization with pivoting, oveeriting Land U on A Jori=t ont ‘apply permatations (ste Agorithm 22 for details) AUCH) = AU ET my /ALA) AW Limé tin) = AGH Lin pen) AGF Lim spe ait ten) end for Tn the bast tine of he algorithm, AGC 18) +AQi E41) he produce ofan (0 byet sme (Ly) by aby) mate (iz), Whi yl (oOybysa 4) mat Linear Bavation Sosing as ‘We now rere th lgoritn fom scratch starting fom perhaps the mont familie deserption of Gansian eliaiaation: Take each row and subizact multiples of i from Ince rows to aro out te ents below the disgot.” ‘Translating this dicetly into nage yids foct=1eon—1 {for ca row £*/ for j= F4 1 tom fF subkract a multiple of ose | fren ow = */ fock=tton ft eolomansf ibeough m . */ fn — ayn Sow /* to aco out colon t ‘below the dingonal */ nd for nd for nd for We wil now rake some improvercns o this algorithm, modifying unt it becomes identical to Algorithm 2.3 (except for pivoting, which we omit). First, we roomie that we ned not eampite the tr entre bel he ig ‘onal, beau re ow thy are ero. This shortens th loop to vied foci=1t0n—1 frst 110m fock=i 4 Tton en = On~ Sou cd for end for for "The next performance improvement i 10 ompute $ outside the nee Joop, sine tf constant within te Inner loop fort=1ton-1 forj=i4ttom he end for forj 4 tion foc k=i+ Tton = Hel cd for ead for od foe Finaly, we store the multiplier ie the subdiagonal ents ay that we crginally serod out; they are notre for anything es. This yes Algo then 2-8 (exept for posing), “4 Applied Noserical Linear Algebra ‘The operation count of L1 is dane by eepacng ops by sumations over the same range, ad nor loops by tei operation counts: a(S-25)) Fon .4ne—o% = 2a 00 The forward and back substitutions with L and U to compote the solution cof Az — b ont Ofr2}, so overall solving Az ~ b with Gasian elnination fants $n? O(n) operations. ere we have wed the fact at S22 sme 2) + Ofm). This Formula enough to get Ue high ont ten in the operation cou. There is moee to implementing Gaussian elimination than writing the nested loops of Algorthin 22. Tadeed,dopendig on the computer, progean- ‘ning language, and matrix san, merely interchanging the Int two loops on j ne can change the execution tne by ores of magnitude, We diss this fat length in section 26 2.4. Error Analysis Recall our two-step paradigm for obesining error bounds for the solution of Arh 1. Analy roof errors to show thatthe rt of solving A = Bis the ‘eet soliton # of the pertrbed liner system (1+ 54) — 6 8h, where [And bee sna. This a example of backward eor analy, a A and ae eal the tached ero 2. Apply tho peeturbation theory of setlon 22 10 bound the error, fe ‘example by using bound (23) o (25) We tne wo gl in is eto, Th fst i to stow how implement Gonsin nonin rt ot token se sal Tn portal, we nul! He ohnep 4h and anal ws Of). 
This is as small an error as we can reasonably hope for, since merely rounding the entries of A (or b) to fit them into the floating point format already makes errors of this size. (We will see, however, that unless we are careful about pivoting, δA need not be small.) We take this up in the next two subsections.

The second goal is to derive practical error bounds which are simultaneously cheap to compute and "tight," i.e., close to the true error. It turns out that the bounds on ||δA|| that we can formally prove are generally much larger than the errors encountered in practice; therefore, the practical error bounds (in section 2.4.4) rely on the computed residual r = Ax̂ - b and on bound (2.5), instead of bound (2.3). We also need to be able to estimate κ(A) inexpensively; this is discussed in section 2.4.3.

Unfortunately, we do not have error bounds that always satisfy our twin goals of cheapness and tightness, that is, that simultaneously

1. cost a negligible amount compared to solving Ax = b in the first place (for example, that cost O(n^2) flops versus Gaussian elimination's O(n^3) flops) and

2. provide an error bound that is always at least as large as the true error and never more than a constant factor larger (10 times larger, say).

The practical bounds in section 2.4.4 will cost O(n^2) but will on very rare occasions provide error bounds that are much too small or much too large. The probability of getting a bad error bound is so small that these bounds are widely used in practice. The only truly guaranteed bounds use either interval arithmetic, very high precision arithmetic, or both, and are several times more expensive than just solving Ax = b (see Chapter 1). It has in fact been conjectured that no bound satisfying our twin goals of cheapness and tightness exists, but this remains an open problem.

2.4.1. The Need for Pivoting

Let us apply LU factorization without pivoting to

    A = [ .0001, 1 ; 1, 1 ],

using three-decimal-digit floating point arithmetic, and see why we get the wrong answer. Note that κ(A) = ||A||·||A^{-1}|| ≈ 4, so A is well-conditioned, and thus we should expect to be able to solve Ax = b accurately. The computed factors are

    L = [ 1, 0 ; fl(1/.0001), 1 ] = [ 1, 0 ; 10^4, 1 ]

and

    U = [ .0001, 1 ; 0, fl(1 - 10^4 · 1) ] = [ .0001, 1 ; 0, -10^4 ],

so that

    L·U = [ .0001, 1 ; 1, 0 ]    but    A = [ .0001, 1 ; 1, 1 ].

Note that the original a22 = 1 has been entirely "lost" from the computation by subtracting 10^4 from it. We would have gotten the same L and U whether a22 had been 1, 0, -2, or any other number t with fl(t - 10^4) = -10^4. Since the algorithm proceeds to work only with L and U, it will get the same answer for all these different a22, which correspond to completely different A and so completely different x = A^{-1}b; there is no way it can guarantee an accurate answer. This is called numerical instability, since L and U are not the exact factors of a matrix close to A. (Another way to say this is that ||A - L·U|| is about as large as ||A||, rather than ε||A||.)

Let us see what happens when we go on to solve Ax = [1, 2]^T for x using this LU factorization. The correct answer is x ≈ [1, 1]^T. Instead we get the following. Solving Ly = [1, 2]^T gives y1 = fl(1/1) = 1 and y2 = fl(2 - 10^4 · 1) = -10^4; note that the value 2 has been "lost" by subtracting 10^4 from it. Solving Ux = y then yields x2 = fl((-10^4)/(-10^4)) = 1 and x1 = fl((1 - 1)/.0001) = 0, a completely erroneous solution.

Another warning of the loss of accuracy comes from comparing the condition number of A to the condition numbers of L and U. Recall that we transform the problem of solving Ax = b into solving two other systems with L and U, so we do not want the condition numbers of L or U to be much larger than that of A. But while the condition number of A is about 4, the condition numbers of L and U are about 10^8.

In the next section we will show that doing GEPP nearly always eliminates the instability just illustrated. In the above example, GEPP would have reversed the order of the two equations before proceeding. The reader is invited to confirm that in this case we would get

    L = [ 1, 0 ; .0001, 1 ]    and    U = [ 1, 1 ; 0, fl(1 - .0001·1) ] ≈ [ 1, 1 ; 0, 1 ],

so that L·U equals the row-permuted A to working precision, and both L and U are very well conditioned. (The same failure and cure are easy to reproduce in double precision; a short sketch is given below.)
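The following short Matlab experiment (ours, not from the text) reproduces this behavior in IEEE double precision by shrinking the pivot from 10^-4 to 10^-20, so that fl(1 - 10^20) = -10^20 in double precision just as fl(1 - 10^4) = -10^4 in three-digit arithmetic. The factorization without pivoting is written out explicitly; Matlab's built-in lu, which does partial pivoting, is used for comparison.

    % Effect of a tiny pivot, in double precision (illustrative sketch).
    A = [1e-20 1; 1 1];   b = [1; 2];      % true solution is very nearly [1; 1]

    % LU without pivoting, written out by hand:
    L = [1 0; 1e20 1];                     % multiplier a21/a11 = 1/1e-20 = 1e20
    U = [1e-20 1; 0 1-1e20];               % fl(1 - 1e20) = -1e20, so a22 = 1 is lost
    x_nopivot = U \ (L \ b);               % forward and back substitution

    % LU with partial pivoting (GEPP), as done by the built-in lu:
    [Lp, Up, P] = lu(A);
    x_gepp = Up \ (Lp \ (P*b));

    disp([x_nopivot x_gepp])               % first column, [0; 1], is wrong; second is accurate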
Tho conte wlton vet bss qe aur 2 er i the intstion behind our error anabss of LU deompoiion. If n= tected quantities arsing inthe pede =U are very large compared 10 TAL, the information in eats of A wll gt “ot” wn thie large valves tre subtest fom then. This it what Hnppened 9 az i the expe in fection 21. If te latermdiate quantities in the product L-U were stead comparable to tose ofA, we would expect a tiny backard ezor ALU inthe {sctorization. Therefore, we want to bound the largest laterite quantiles In the product £1. We will do ths by bounding the entries of tho matrix |) [U1 (40 setion 1.1 for nattion). (Our analysis analogous to the one we used for polynomial evaluation In section 1.6. There we considered p = Sojait" and showed hat if | were mmparable to the sum of absolote ales 5, Ja", then p woud! be earnputd secortel ‘Mier prseating « general analysis of Gaussian elimination, we will use 51 to show that GEPP (or, more expeasively, GECP) will ep the eas of |1)-|U|eomporuble co [A tn almost all praia! eveunstances, Formal Error Analysis of Gaussian Elimination Linear Bavation Sosing Unfortunately, the best bounds on 6] that we can prove in genera are still nh larger than Ue errors encountered in pence. ‘Thereor, the error Funds tt we use ia peaetice wil be based a the ened sil rad rund (2.8) (or bound (29) iste of the sigorous ut peste bound! in this section, Now supe that matrix A has alway been pivoted, 50 the notation is siopler. Wersimplify Algoritha 22 to two equations, one for je with j = & land one for jk, Let fist trace what Algrithin 2.2 does to ae when JS K this clement i repeatedly updated by subtrseting Fue foe t— 1 10 JL and is aly assigned to us that When j > ky as ain ts Lye subtract for = 1 eo k~ 1 ad then the rest sum is divided by nye dase 0 ie SE ie ty BE "Todo the roundoff eror analysis of these to formulas, we use the rst from Question 110 that a dot prodet computed in Dating point axthneie satis 8( Sam) Yan 16) wi se ‘We opply this to the Formula For wp, sling a= (om —F twat + ‘) are wit fi SGD ad 6. Song raw i = phpuctis + SET jwa(l +). snee by = 1 Shastawa + Fh stam with al SG — te and Tg = Shalima + Ba where we ean bound x by el = [QO berm 8) al Inala = net WD ie fren ae Bat the alba st Spend ca th era una 8 Applied Noserical Linear Algebra Dain the see analysis forthe formula for fy yds satay (02a 2) with 6 < (=H, [6 Se, ad [6 Se, We salve for ay to 5 Seale + byte +4) oe TT ri = Stine +S tsads wi 4 rar with [6 ns, and so [Bl $ n(| [Ul a8 before ‘Altogether, we ean summarize this errr aalyle with the simple formula A= 10 4 £ where |B < ne -|U]. Taking noes we get | < ne} (| [10h tthe norm des nt depend on the signs ofthe matrix entries (tre for the Frobenin,infins and one-norms but no te two-narm), we en simpy this to [ES nell IU. "Now we consider solving the rest of the problem: Lr — 8 via by — tnd Us ~ y-_The rel of Question 1.11 shows that solving Ly = b by forward stitution sels capt sltion jsatiniing (1 82.) — 8 with [aE] < nel E). Similarly when sling Uz fj wo gee saisying (+ 6U}2 — with 60] < nO Combining these yes b= Hane = +60) +0 \2 = (LU 1A + 61 4 LUE (A= Bau LU 4 S180 ye (AY SA\E where 6A = —B 5 LAU SY ¢ 6180. 
‘Now we combine our bounds on BSL, and JU and we the tangle inoquelty to bound [bal = | cau yaw 4 s1au [B+ (280+ BE) + ara JL 12 -[aU] + a4) - 12 + an 60 lb) + mel IO + ne O83 IU ‘nel enn Linear Bavation Sosing “9 ‘Taking norms and ensuing I 1X] [= [I (rv as before forthe Frobe- sia, infty, and acorns but nat the twos) we get [6A < Snel Wwe ‘hus, to sae when Gaussian oinination jy backward stable, we must ask when Sei [UI OCIA then the BSE nthe perturbation theory bounds wll be Of) me wo dese (ate that A= 0). "The main empire observation usted by decides of experience, stat GEPP almost aways kes ILI IU ~ lll. GEPP guarantees tha each entey fs bounded by Vin abolute ale, 0 weed consider only IU. We define the pivot growth factor for GPP os fro ~ [he Alm [Ae imax | 30 stability i equivalent to gor being smal or growing slow 089 fonetion of pati, gr sent alg no ks. To aero tieor soemns to ben" or perhaps ven jut [20 (Soe igure 2.1.) This makes ‘GEDP the algorithm of cir for many probiens. Unfortunatly, ter nee rare examples in which ger can be a large as 2! PROPOSITION 2.1, GEPP quorantes that gee <2°-!, ‘This bound is tan ble Proof. The frst sep of GEPP updates de ~ aye fy tay where [1 and Ia = [ul e502 mi se So eo of the n= 1 ‘ajr segs of GEPP eat double the sae of the ealing trix eee, and wwe ge 2" as che overall bound. Se the example ia Question 2.14 vo se that {i ataloabe. Potting ll thee bounds ogee, we get 6A < Sterne Aly eu singe [jc 1, mening that all pecson Is potently st. Example 2-3 graphs ‘SaypnPs along with ce true backward error to show how it ean be pesmistie, UA] 6 usually O(2) Al, so we ean say that GEPP Is tackward stable in practic, eventhough we ean construct examples wher it fal. Section 2A presents pretical error bounds forthe eomputed solution of Az = # that are ‘much salle thon what we go. rom sing [x < 3g) A can be shown that GECP is even more sable than GEPP, with its pivot pot gor sisting the worstease bound (20, . 218] max a soy = Matall < VgB BIO TOT wg "ii dton cis nt fo he lo in he erty bt ely ssn [Bh 0 Applied Noserical Linear Algebra "This upper bound is ako ech too lage in practice. The avecage bee of dor m2. IU was an odd open conjecture thst gop fla) + oft): (y 2) efla) es). To compe ‘of we ase all ys) 01m f(a) ~ 3 by (bis almost alas Ui) Let @ — alan by), 906 ~ 21-and Pte) — 5,556. Then Be Shu and of oct (BPE in amma, to compute VAC) take the steps w = Bs, ¢ = sete, aod Gf". "Theale mwah ante that ppsinatn [A the tied sca ed intensity caer tan expt cept Linear Bavation Sosing 8 Auconmiis 2.5. Hager condition estimator returns a lover brand el om eh choose ony z such that | = 1 Pegma hy pent = Bre, ¢= sigan), == BEE nt aot eloe © =F then return ete a= 6 where [|= Hew endif nd repeat ‘Tuwonsn 2.6. 1. When fail retuned, lh = Boy 4 lel mass mam of Beh 2 Otherwise, Bey (al end of lop) > QR} ft tart), 80 the arth as made progress in maxing $2) Proof 1. In this ome, [ote © #7, Near sf) = Bal = S35, Gh ger In 9 Ja) = ft0) + 950) Ky 2) ~ fa) +372), where 2 gfe). Tosbow sa local axitum we want 27(y~2) 2 hen ivi. We compute Fy-2) = Bytes Nelo 2 In this eae lan > 2%. Choe # =e) sgn), where j is ebasen so that 5) = Behe: Then £0) 2 felt os-(@—= sa) "e—9) Slo) + 27B=2F 2 f(a) +527 > Se), =e <0 as dese where the las inequality is true by constretion, igham (14, 140) tested slightly improved version of this algorithm by trying many random matries of sees 10,25,50 apd condition numbers ‘5 = 10,10, 108,10; inthe worst ease the computed underestimate the trae my fietoe 1. 
The algorithm i saab in LAPACK a subroutine ‘laces. LAPACK rontine in egeeyx cll elacon itera and reir the ‘timated eonalition number. (They actually retuen the eiproeal of te est ‘nated condition numb, 0 sid overfo’ om exsetlysinglar mattis) A diferent condition estar is avelable in Matin as zeand. The Matlab ro tine cond computes the exact condition nuraber [All usng lgoihuns ddseused in scton 4; Is mich more expensive than cond, 5 Applied Noserical Linear Algebra Estimating the Relative Condition Number We can alo use the algoidhm from the las section co estinate the elatve ecniion number nex) ~ [AT] Alo fom bound (28) o° 10 evaluate the bound | |4-"|-[ fs fom (29). We ean reduce both tothe same probien, that of estimating || |-9ja, where 9 Is @Yetar of nonnegatie entries. To See wy, lec eb the eetr ofall ones. Pom pt 5 of Lami 17, we so that [Xie = Xela ithe matrix 2 has nonnegative ents. Then HA -[Alae = AH [Ales =A" -aliey where 9 = Ale Here is how we extinate IA] -aio- Let G = diagts---ga}s then = Ge. Ths HAH alle = WAM] -Gel = HA“ “Cas = Ghee ‘The Lt equaty s true Buea [ae ~ Io for any matrix Yh, sulle: to estate the infiatynoem of tho matrix A~'G. We ean do ths bs ‘pplsng Hager’ algorithm, Algorithm 2:5, 1 the matri (A*G)" = GAT, tocstimate AG)" ~ AWG (see part Gof Lamm 1.7). This equlee 1 to multiply by the matrix GAM and its transpose A'G, Multiplying by is easy sinc ts diagonal, and we mukiply by A and AW using the LU feetoriation ofA, as weed inthe lst ston. 2.4.8, Practical Error Bounds ‘We present two practical error bounds foe our approximate solution of Ar For the fst bound we use equa (25) 10 ge arg Hts, Bp state es) whore r= Ab— bis the resdual, We estimate [Ake by applying Algo: rithm 25 w0 B= AMT estimating |2h1 — JA~PTy = [AM (se pats 5 fd 6 of Lerma 1.7). ‘Our second error bound comes fromthe tighter Inequality (2.9) Be=slle ¢ HAL Te < Te roe ey Wo estimate [|4"!)-lrJoo using te algorithm based on equation (2.12) "Bre ound (2.1) (modi as deserb bes inthe subetion “What as 9 wrong") Is computed by LAPACK routines like agesws. ‘The LAPACK ‘lable name for the evor bound is FERR, for Forward ERRor. Linear Bavation Sosing a Pig. 2. Hor Bound (212) plottd seve ru ray o = GPP, + ~ GECP. Exqurne 2.4, We have computed the fist errr bound (2.18) ond the erve ror forthe some se of examples as in Figures 2.1 and 22, plating the result In Figure 23. For euch problon Ar ~ solved with GEPP we plot «2 at the point (tue eto eeror bound), and for each problem Ax — Usoved with ‘GCP we plot a + a the pot (true ere, ere hound). I ube eeor boar were aqua 0 the true eror the © or + would He on the sold lgonal line, Since the error hou always exces tho tre err, the oF eds above this tiagonal. When the error bound ss han 10 ines ager thn the tre eron, the © or + appears between the sll agonal ine and the frst supeedingonal shed line. When the exror bound is between 10 nol 100 times anger thn the tre err, the 2a | appencs teen te fit tw superna dashed Fines. Mont etror bounds inthis range, with fw err bows a large ‘9 100 times the cu eae. Thus, our computes err bound underestimates {he numberof coeet decimal digs inthe answer by on tO ad raze ‘ies by a8 much ws thee. The Matlab cde for podoring these grap i the Sane as belore, HOMEPAGE/Matlab/pivotn. "© Bxaurce 2.5. We present an example on to ilastate the difrenc be- tweet the «wo error hounds (2.13) and (211). "This example wil lo show 56 Applied Noserical Linear Algebra that GECP can sometimes be mote accurate than GEDP. 
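As an illustration of how such a practical bound is assembled, the Matlab sketch below solves a random system, forms the residual, and multiplies its norm by an inexpensive estimate of ||A^{-1}||, in the spirit of bound (2.13). It uses Matlab's condest, a 1-norm condition estimator of the same Hager type as Algorithm 2.5, so the bound is computed in the 1-norm rather than the infinity norm used by LAPACK; the matrix and its size are arbitrary choices for the demonstration.

    % Practical forward error bound (sketch, in the spirit of bound (2.13)).
    n = 200;
    A = randn(n);   x_true = ones(n,1);   b = A*x_true;

    x = A \ b;                          % computed solution (GEPP inside backslash)
    r = A*x - b;                        % residual

    % condest(A) estimates norm(A,1)*norm(inv(A),1) by a Hager-type estimator,
    % so norm(inv(A),1) is approximately condest(A)/norm(A,1).
    ainv_norm  = condest(A) / norm(A,1);
    err_bound  = ainv_norm * norm(r,1) / norm(x,1);    % estimated relative error
    true_error = norm(x - x_true,1) / norm(x,1);

    fprintf('true error %.1e, error bound %.1e\n', true_error, err_bound)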
We choose a st of ‘oly sealed examples constructed ols. Bach test matt fof the feta A= DB, with the dinension eunning fom 5 to 100. 8 i equl to an Wen ‘ity matt plus very small random ofélagonal entries, around 10-T, 50 It ‘wey welleondtioned. Dis agonal mats with enres sealed geometrically feom I up to 100. (ln othe words, di ns1/ds Is tho sare fr all.) The A mates have condition numbers n(A) [Aj [Alas nearly equal 10 102, whieh is ery illeonlitiond, although their relative condition numbers sont A) = [1A™"Ale = |B] [Blas ae all nearly 1. AS before, r- ‘ine prion fe 2 — 2" ~ 10-1 The examples were compute uring the ‘same Matlab code HOMBPAGE Matlab pivot. “Tae pivot goth factors yy ad oy Weee newer large than aout 1.8 foe sy example, sal the bcenardeeor [om Theorem 22 never exrle 10-!° Inany ease. Hagee’s estimator was very aceurate ln all ests, returning the ‘tue condition number 10! to many decal pce, Figure 2:1 pots the eror bounds (2.13) and (2.14) fr these examples, along with the eomporentise relative bec errr, at given by the formula In ‘Theorem 23. The cluster of plus sigs in the upper kt comer ofthe top Jet graph shows that while GECP computes the answor with e tiny ere near 10-", the error bound (218) is usually loser to 10-4, which is very pessimistic. This s beense the confition numbor is 10, and so unkes the Sbrkaard erorf much smaller than e ~ 1", which i unkly, the enor ‘oun wil be else to 10-110!" = 10-2, ‘The elaner of cvs in the mile top of the sae geal shows that GEPP gets a langer error of about 10" while che ere ond (2:13) is agai usally nee 10-2 Tn contrast, the error bound (2.14) is nearly pefostyaceurste, as se trated by the pluses and circles on the diagonal in the top eight graph of ‘Figure 2.4. This graph again lusrates that GECP is nary pert see 1c, whereas GEDP tose about hal the seeurcy. his diferenee in seeuracy 5 explained by the bottom grep, which shows the componcntwise rlative bckward errr for GEPP and CECP. This graph makes it cee that GECP as neatly perfoet backward error in the componente relative sens, so since the coresponding eomponentwise relative contion number i, the accuracy fs perfec GEPP on the other bie i aot completely stable in this see, Jeg rn 5 to 10 decal digs Tn section 2.5 we sow how to itraivly improve the computed solution 2 Ove step ofthis ratod will make the solution caput by GEPP as accurate 1 the solution fom GECP. Since GECP i sigalenntly ore expensive than GEDP in patie Is very rarely ved, © ‘What Can Go Wrong. Unfortunate 8 meatonad in Ue begining of seton 2.4, ero boul (218) snd (2.16) ae no guranteed to provide ight bounds nal eases when p= Linear Bavation Sosing a » Fig. 2 (a) plots the cor downd (218) versus the true eror (b) plots the or Sowa (14) sere th re ro, 58 Applied Noserical Linear Algebra Fig, 24. Continue (6) plot he componente relative backward oor from Te rem 23. ‘mented in poetic. In this setion we deserbe Ue ene!) 
ways thy ca fall fn the parla remedies used in peace First, a8 desribed in section 2:63, the estimate of [AW] from Algo- rithm 25 (or similar algorithms) provides only lower bound, elthough the probability very fow tht it s more than 10 times too smal Second, there isa small but nonneatisbe probability’ that roundo in the caluation of r= AB might make | asifially smal, in fact zero, and ‘or also make our computed err hour too seal To tke this pomsblty Tino seeonat, one ean add sll quatity top| to semua for ite Fra Question 1:10 we know chat the roudof in evaluating rs bounded by [Ab 0) (AB ~ 6) < (m4 Det Le (215) 0 we ean roplce |r with [+ (2+ 1e(LA =) in Boupe (218) (tis ‘done in the LAPACK code sgesex) orf with r-+-(0-+ DeCLAL- I) Fn hoand (218). The fetor m1 unually mich too lage and ean be oni i desire. Thin, rountol in peforming Gousian elimination on very l-oadiione snatsios can yield stl insceurate Lara Uthat bound (214) tic 0 lo. xaurui 2.6. We present an example, dicowred by W. Khan, that ils ‘rates the dilutes in gong teuly guaran error bounds. In this example Linear Bavation Sosing 0 {he atric A wll be emely sigur. Therefore the campsite eror bo TEA sould be ove or lrger to icate that no digits he compute acl Ut ae corer, sin the te soliton dos ot ex Rounda ere during Gausan elimination wil yd nonslagular but very theonitiond fetes Land 7. With this example, computing using Matlab with IEE double pression arth, the compte esi r tos out 10 be ezuty wero because of round, 20 both eror bounds (213) and (2.14) return ze Te repair bound (2.13) by adding (Ha-Ha) wll be Tanger than 1 sed nfortunstely ou scond, “ihe” ere bound (2.14) about 10-7, e roneousyindicaing that sven cits ofthe computed solution ae eee. Tire how te example is construct. Let \ 8/2", C2, [s x ‘| Ae ean ee Oooect ct 1559-10-03 108 8-108 = | cis Gias-0 0 S105 10-> -S41N6-10-% ens: 10-* and b= A= [1,142,117 A can be computed without any roundoff eror, but bas abit of roundoff, which means that It not exactly In the space spanned by the columns of, s0 Ax ~ bas no solution, Peering Gaussian sliminaton, we get eee ux| .ooces 1 0 ‘06.0000 1 and 0g 10! 1.0021: 108 o 0 LsI90-10- ‘elding compte vane of 20180-10 —son6-10!" 5.076 a ar Wo 1.6384. 10" 6384-108 us : 20180. 10° —5.4976- 10! — 5076 10" 20180: 10 —5.4976- 10! 5076 10" This means the computed valve of [AW [A has all ents appreximately ‘sual 1067109: 107,30 nex) is computed to be O10"). In other words, the ‘error bound indents tht about 16 9 digits of the comptes slation ‘are aceite, wheres none. Brsing lage pivot growth, one ean prow ha boul (218) (ih fe sppropritey ners) enol be made aialy stall by the phenomenon lastrated bere Sinlary, Kahan bas found faaly of m-ty-n singular matics, where lunging on tay’ entey (about 2°") to 220 lowes won) 10,040). © @ Applied Noserical Linear Algebra 2.5. Improving the Accuracy of a Solution ‘Wo have jot son tha to roe in slving Ax — 8 my be lng ws Ae 1th ero sto ae, win can we do One possi wo rer et computation ta higher peso, tk isnt experi en Soncr. Foctinatly a lng nA) not age her are mich cee ‘sods abe forgetting» more neriate slo "ose aay equi f(z) =O We ean Uy Uo ne Newtons ath 1 improv an approximate lation 4, got ys — 5 ~ EH. Appin this to J(2)— Az—b iso stp of erative refement rae —b tolve Ai — fo maaan wo could compute r = Ar ~ b exactly and solve Ad = + exactly, we won be done in ome ste, which is what we expect thom Newon applied to ‘tea? problem. Roundoll eror preven this tnmadite covergente. 
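A minimal Matlab sketch of this iteration follows. Matlab has no built-in quadruple precision, so the residual below is formed only in the working (double) precision; this corresponds to the single precision refinement of section 2.5.1 rather than to the mixed-precision setting of the theorem that follows, and the function name and fixed step count are our own choices.

    function x = iter_refine(A, b, nsteps)
    % Iterative refinement for A*x = b (sketch; residual in working precision).
    [L, U, P] = lu(A);            % factor A once, with partial pivoting
    x = U \ (L \ (P*b));          % initial solution
    for k = 1:nsteps
        r = A*x - b;              % residual (ideally computed in higher precision)
        d = U \ (L \ (P*r));      % reuse the factorization to solve A*d = r
        x = x - d;                % Newton step for f(x) = A*x - b
    end
    end

As discussed in section 2.5.1, even with the residual in working precision a single step usually drives the componentwise relative backward error of Theorem 2.3 down to O(ε).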
The gost sinteresting ad of use prsly whea is so l-couditoned that solving Ad ~s (and Azy ~0) 8 eather naceuate, "Twronp 2.7. Suppase that is computed in double precision and (A) < ee priny <1, here isthe denn of a 9 he ptt got act ‘then efentel eave fnement omega B-Ab 640 Ts Note tha the condition number does not appear in the final eror town ‘Ths means that we compute the anaer accurately ldependent of the cond fiom number, provided that n( A is suficientes thaw 1. (In practice, © 8 too conservative an upper bund, aa the algorithm often suceeds even when ‘(Ae ie greater than.) ‘Shetch of Proof. In onder to ke the prot transparent, we wll tke only ‘the most important rounding errs into account Fe brevity, we abbreviate [lls bv I - Our gals to show that te at sO, ot clea By assumption, ¢ < 1, 0 this inequality implies that the error lias2 ~ 2 creme monotonically 10 zo. (presi wil not deere all he way to nero been of rounding ero in the astgtiment 1 — dy which we se ignoring) We bein by estimating th exor in the computed eeidual r We get = Maz —8) — Any —O+ J, where bythe result of Question 1.10 Ugh Linear Bavation Sosing 6 ‘ncaa + JM) 4 ely 8] Any We "Thee term come from the Soule preston compatstion of, snd We ¢ tera games from rounding Uae ‘oubie prcsion result back to single proion. Sine ©? ¢, we wll neglet the 2 eer in he Bound ox Nest wo ge (44 J4}d~ 7, where from hound (211) we know that 6A] < ‘3-2 lA, whese y— Sn°9, although this I usually mich too large. As ‘menoned eavlor we simpy matters by asuming 2:4) ~~ dexsetly, Continuing to amore all = terms, we got = (As bar 4 ABA = aye 64 1) = U4 AMA) ee A'S) SLAM +) ye AA 2) AY "Theectore aya —2 25 —d—2 = A-'SA—2) — AMP and so a %6ates~ 2+ AY [AH bya JAM oe a [A bse — a AH 2 EAG@s — 20 [AoA Lae — a HA lle Bes — a WM TAI 2-6-4 Death om 6 =A Halle e(-41) = Adeje, then we have convergence. © Teeratve ellzement or otter variations of Newion’s method) can be wed to improve aceursey for many ether problems of Linear alge 28 wel 2.5.1, Single Precision Iterative Refinement "This section may be skipped on 8 first reed Sometimes double precision is not availble to run iterative refinement For example, ifthe inp ean is lrendy in double precision, we would ned to compute the esa rin quadruple precision, which may not be available. On ‘Some machines, like the Intel Pent, doubted pneison i lab, Which provides 1 more bits of veto than double preiion (see setin 1.5). ‘This not as accurate as quadruple precision (whieh would need at lest 2-59 ~ 106 fraction its) but stl improves the accursey ntieably But if none of these options are salable, one could stl un iertive refinement while computing the eidual in single prelion (Le, the sane e Applied Noserical Linear Algebra prison a¢ the input dat) In this ease, dhe Theorem 2.7 does wot bold ny more. On the other land, the following theorem shows that under cetaln {echnial assumptions one stp of iterative refinement in single precision itl worth doing baeuse I eves the eomaponentwise relative Dek ward eror 35 defined in Theorem 2.3 O(c), I the corresponding relive condition number teon(l) ~ [LAM Al ls frm section 22.1 I lglennly smaller than the ‘tal contin nimber (4) — [A ]-n ln then th answer wil aso be “Tunomea 2.8, Suppose that ri computed in single precision and sma(lAl-lelh imin(Al-iDs ‘Then one step of tertve refinement yields 2 such that (A 8A = b4 8b seth 0) O(a) ond [| ~ Of) In other words, the componentse relative backward err is as smal as possible. 
or example, this means tha A ad bore spor, then 8A and 8b have the sme sparsity structures A sul, reece. Me «tie <1 or prof, se [LT] ne well [14 23,224,225 for more det Single precision iterative relineent nd he ereor bond (214) are iple- mented in LAPACK rovtines ike genes EXAMPLE 2.7. We consider the sine mations asin Example 2.5 and pee: form one step of erative refinement inthe sme prison the rest of the cenaputation(e~ 10°), For these example, the usual coition nuraber ‘x(A) 10", wheres ncn(A) 5 1,9 we expecta large accuracy iaprovetent Indes, the components reatve ero fe GEPP is dven belo 10", and the coresponding error from (2.1) Is delve below 10°! as wel "The Matlab cade for this examples HOMEPAGE/Matlsb/pivatan. 25.2. Equilbration "Theres ane mar common technique for improving the erin solving nee system: oquldratin. This refers to chosing an appropeate diagonal maix D asd solving Date — Db usted of lz ~ bs D is coven to ty to tke te ‘endition numberof DA smaller than that of A In Example 27 for instance, tcosing tobe the melprocal of the two-torin of row fof A would make Dal early oq to the identity matrix, reducing its conition number frm 10! to 1. Its posible to show that choosing D this way redvees the condition umber of DA to within a factor of 7 of its smallest posible vai fr any agonal D [242]. In penton me minal choose to agonal matrices Doe bid Da ad slee (DroyAD ua}? Dron 3 Daa ‘The techniques of trative element and equilibration are implemented in the LAPACK subroutines like agerts ocd ageoge, respectively. ‘Thee are Jn tu used by dever routines ke genes Linear Bavation Sosing 6 2.6. Blocking Algorithms for Higher Performance Ath end of ston 2 wes at changing the ode ofthe the nested lope inthe mplonentation of Gasman elimination in Algorithm 22 cok change the excition spe! yonder of maga, depending onthe compet {nd the problem being sored. In ths eton we wil explore why thi the ane‘ describe soe carefily wit Baer gers satware which tle Une males ino secon. Thee inplenentation te scala Hock ale ‘thn, becaune they operate on square or etanglar subbloks of aces io Ut inert bop ae tha on ale rows elias. Tae codes are lable In plledomain stare bates suc as LAPACK (in Fortra, st NECTLIB/lapck™ aod SeaLAPACK (et NETIIB/szalapok). LAPACK (an its versions nol lngungs) are sukable for PCs, woraatons, elo cot puters nd stare-netory pra computers. These ele the SUN SPAR {Center 2000 (236, SCI Power Challenge 21), DEC Alphaserver 840 [0 ‘nd Cray C0) 25,259, SeaLAPACK fstab or dsr parallel computer, nich =the IBM SP-2 254 ItlPargon (25, Cry 13 ‘ere Sa actor of wckstaions (The Ura ae swale on NETL, including» eompreensive mana 10) ‘A moro comprehensive discussion of algrthns for high performance (es pial parallel) machines may be found oa the World Wie Web at PARAL HOMEPAGE, \CK es originally mative by the poor performance of is pre ‘xsors LINPACK sad EISPACK (also silsle on NETLIB) oa some highs Peformance machines. For example, ease the table below, which presents the speed i Mops of LINPACK's Cholesky routine spofa.oa a Cray YMP, 0 supereompaver ofthe late 10805. Cesky variant of Gaui elation Sultsble fr syrametiie postive definite matrices. It dlcussed in depen Section 2.7 eve it sulle to know that itis very sila to Algeithin 22. Te table ko includes the spat of sevrat other linear algebra operations. 
The Gray YMP is parallel computer with up to 8 procesors that ean be used imultaously, 20 we include one column of dats for 1 procesor atl another ‘column where all 8 process re ue 1 falas of CAPACK, cae CLAPACK (at ETLIB/lapa) ie a vada Larne (a nities pe) at APA 8 NETH we 6 Applied Numerical TProe_8 Pros a Motrixnatri multiply (n=500) S12 2405 Matricwetor multiply (w= 50) SIL 25 Solve 7X — B (n 500) 500255 Solve Te = 0 (n~ 500) om Sst LINPACK (Choisy n = 300) mn OR TAPACK (Cholesky, ~ 500) 20 att LAPACK (Cholesky. = 1000) 012115 ‘The top line, the maximum speed of the machine, i an upper bound on ‘he nurmbers that folow. "The basi near ages operations on the next four ins have been measured using subroutines especially designed for high sped fn the Cray YMP. The all got reasonably close to the maximum posible ‘ed except for solving x —h single rinngular system of near equations, shih doesnot se § processors efletively. Solving TX —B reles to solving {Wiangular systems with many eghtchod sides (08a square mates), These umber ae for lege mates aad vertors (1 50) “The Cholesky route trom LINPACK inthe sth eof the able excentes Sloneatly more stow than these cther operatins even though ts Working on lage matrix a the previous operations and doing mathemati si= ‘ae operations. This poor performance leads us to Uy to reoeeaniae Cholesky fn! other lines algbre routines to go as fas as ther simpler counterparts ike matrx-matris multiplication. Tho speeds of theo reorganized codes from LLAPACK ae given in the as two lines ofthe table. ti apparent that the TEAPACK routines come mich closer to the mixin sped of the machine We enphsive that the LAPACK anal LINPACK Cholesky routines perfor the sane Boating operations, but in a diferent ode To understand how these speedup: were staid, we must understand ow the tne ks pent by the computer while executing. This in turn equles s to understand how computer memories operate. Tt turns out that all computer ‘memories from the cheapest personal computer tothe bist supercomputer fe bull ox hierarchies, with a sere of diferent kinds of memosis ranging frm wey fast, expensive, ae therfore stall memory at the tp ofthe hierarehy down to sow, eheap, and very large merry a the bottom Fast, small, expensive Reginers cache Memory Disk Stow, lage, cheap, Tope Linear Bavation Sosing 6 Forexample msner form the fstst memory then cache, main memory, ks, sel eal Lape 98 the sees, lagen, tad eepest, Usefl sith ‘and logis operations can be done oly on daa tthe top ofthe hierar, in the registes, Data atone level ofthe memory hierarchy can move 1 adseet levels for ezample, moving between main meory aud disk ‘The speed at Which data moves high aor the top of the hiezarey (between registers and cache) aod siow near the botiom (between and disk and main metay). In Particular, the speed at which arithmetic done x much faster than the speed {which data I transfered Beeween lower levels fn the memory Rica, by Factors of 105 or even 1000, depending on the level. This teens that an = signed algorithm may spend most of its time moving date fom th bottom ‘ofthe metory hierarchy to the esters inorder operfet uefa work rather than atu doing the work ‘ere fan example of impo algorithm whieh unfortunately cannot avoid spending most of is time moving data rather thea doing sel erithmete Suppose that we want to add two large n-by- marin, large enough <0 that they Rion ina large, slo level of the memory hierarchy. 
To add then, they rst be be transfered m pies at ine upto the resters to do the adeitions, ‘nd the sos re trate! ble doen. ‘Ths, there ace exactly 3 memory transers betwen fst ad low memory (eng? summands nto fast memory ‘and weting 1s back to slow memory) for every addition peor IC ae time to do Reating point operation fay Sends ad the tine to move & word of data ecwen temory eel yan 08, Whee foe fa He the excentin te ofthis goin is 17 Cany | lyn, WH sl lara than than the tne nf rele forthe arithmetic sige, ‘This means that ‘nates addition Is doomed to run a te sped ofthe slowest level of meme In which the matrices reside, rather than the much higher speed of alton In contrat, wo wil seo Inter that other operations, such as matex-matix multiplication, can be made to un at the sped of the fastest level ofthe memory even ifthe deta are originally stored in the slowest TLINPACK's Cholesky routine runt so alomiy because It was not designed to minimize memory movement on machines sich we the Cray YM. In con- least, maticimatric mulpbeation end tho Urea other base Linear algebra algorithms measured inthe tabi were specialized to niniize data movereat foun Ceay YP. 2.6.1. Basic Linear Algebra Subroutines (BLAS) Since It fs not costaffective to write a speclal version of every’ route ke ‘Cholesky for every now computer, wo nee a more spstemate approach, Since ‘operations lke matrc-strix multiplication sre so common, enmputer man freturrs have standard thems the Basie Lanear Aigebra Subrontines, oF im deg alse hint ery on fs awe % Applied Noserical Linear Algebra BLAS 167, 87,85, sa optimized then for die machines. In other woes, ‘library of subroutines fr entrianarix muliplention,entrx-vector mul lation, snd other snllar operations fs available wth «standard Fortran ce interface on high performance machines (and many others), but uderaesth they have been opted foreach machine. Our gol is to take advantage of| ‘hese optimized BLAS by reorganizing algorithms like Cholesky so that they call the BLAS to perform most of thet work In this seton we will dius the BLAS in gnera. In setion 26.2, we wil desecibe how to optimize matrix wultpieton in parila. Pinal in Seetion 2.6, we show how to rearganine Gaui elimination so ht most of ts work is perfor ing mate liplienion Tot us exaine the BLAS more ceil. Table 21 counts the nuabe of memory references and floating points operations performed by three elated TBLAS. For example, the number of memory references needed to implement the aaxpy operation inline 1 ofthe table n+, because we ned t read ‘values of 2, n yluts of y, and 1 value ofa fom slow memory to reser, fn then writen valves of ty bck to ska menor. The lst column gives the ratio g of flops to memory references (ts highes-ordr orm in eal). ‘The signleance of qs that it tells us roughly how many flop that we ean perforin per memory rfrene o hoe mach suf work me can co compare the time moving dats. This tells us how fst the algorithm ean potently un Foreximpie sippe that an algorithm performs floating pots ones, eh of which takes yy sco, and me memory relerence, ech of which {les seconds. Then the total rating teas lage 8 Stee +t = Fac (oe suming that the arithmetic and memory referees re not performed in fpr. Therefore, the lege the vale of, Uh leer the rnin ime 20 the best posible running time f-tygay whith how long the algorithm would {he if ll data were i registers. This means that ages the larger ‘values are better building backs fr other agoethns. 
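The effect of q is easy to observe from Matlab. The fragment below (our own illustration; the timings are machine dependent) performs the same 2n^3 flops twice: once with the built-in matrix multiply, which calls an optimized Level 3 BLAS, and once as n^2 separate inner products, which are Level 1 BLAS operations.

    % Level 3 BLAS versus Level 1 BLAS for the same 2n^3 flops (illustration only).
    n = 500;
    A = randn(n);   B = randn(n);

    tic;  C1 = A*B;  t3 = toc;             % optimized Level 3 BLAS

    C2 = zeros(n);
    tic;
    for i = 1:n
        for j = 1:n
            C2(i,j) = A(i,:) * B(:,j);     % one inner product per entry of C
        end
    end
    t1 = toc;

    fprintf('BLAS3: %.3f s, inner products: %.3f s, ratio %.1f\n', t3, t1, t1/t3)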
Tab 21 elects a hierarchy of operations: Operations such as exspy performs O(0!) fps on vectors and oer the west 4 alias thse are call Teyel 1 BLAS, of BLAS! [167], aod inclode inner products, multiplying a sealar ines yetor and ether smpe operations. Operations suchas mates ‘weir multiplication peor O(n?) Hops on matrices and vectors and offer slighty better values; these are ealled Level 2 BLAS, or BLAS? [S7, 86), fn ined solving tlangular systems of equations and rank-1 upeiates of roatrces (xy, and y comin vetoes). Operations seh as matrix ‘matric mltphication perform O(%) fps on pairs of maties, and fe the best ¢ values; these at called Level $ BLAS, Oe BLASS 85,8, sd clude solving teiangulae systems of equations with many right-hand sides “The diwetry NETLIB/blas includes documentation aod (unoptmizad) Linear Eavation Solving o Operation Detinition mm [a= Jin aaxpy yaa yor wpa) Be (BLAS) | eon hm FS vam Matrieveetor mit [y= A-a + yor Ba ayaa fa cotase) |= S808) +e Matriemateix mu |O= ABs Cor an" | an? | — wi (BLASS) | ay = SE yale ts fF lavnea “Tnble2.1. Counting ting point operations and memory referees forthe BLAS. J ‘she nud of tng pot epeations nd ate tamer of memory ween Implemertations ofall the BLAS. Fora quick summary of all the BLAS, te NETLIB/bias/basgrps. This summary also appears In 10, App. (or NETL /lapec/lug/lapeck. ugha) Since the Level 3 BLAS have the highest valves, we endeavor to reorganize four algothms in terms of operations sich as matrxnatris makiplcaion rathor than saxpy or matri-vector mukiplicaion, (LINPACK’s Cholesky is ‘onstrates in ers of ells to apy) We erphasize that sich earns ‘algorithms wll oly be faster when sing BLAS that hve been optimized 2.6.2. How to Optimize Matrix Multiplication [Lotus examin in detail how to implement mates multiplication C= A-B 4c 1o minimize the aumber of memory moves and 50 optimize is performares. ‘We wll sce thatthe performance Is sensitive to the Implementation deta. To simplify our discussion, we will se the following machine model, We assume that mates are stored clurmowis, asin Fortran. (Iti esy to modify the ‘examples below if matrices are stored rowwise asin C.} We assume that there fare two knels of memory hiomrchy, fast and sow, where tho slow memory is large enough to contain the thre nxn enarices a, Bnd C, but the fast memory contains ony M words whect 2m < AF 92 this means that the fast mewory is large enough to bok! wo matr columns oF rows but ot a whe mates. We forter asm tat the data movement Is unde programmer contol. (In practic, dita movement ry be done automaticaly ty hardware, such a the ence controller. Novetheless, the base optimization seberme renaing the same) "The smplest matrix-nulipieatlon gem that ove might try const of vce nested ops, which we have notated 10 diate the data movements. Auconrrins 2.6, Unblock mate mltpention(ennotaed to indicate mem: ory ett) 6 Applied Numerical Jori=tion { Read row i of A nto fost memory } forj= 100m { He Cy ito fst memory } {Read clan j of Bio fost emery) frk—tton Gy Cy + Aa Bas nd fr (Write Cs tat sow memory } end or cod for "The innermost loop doing 9 dot pradvet of row fof A and calunan j of B 10 wmpute C98 shown in the following igure: al cap Bes) ne can aio describe the two Inermost loops (on and) a8 doing setenatrix multiplication ofthe throw ofA ines the mai to get the ‘th row of C. This sa hae tat we will ot perfor any better than thse TBLASI and BLAS? 
operations since they are within te Ineemos loops ere sche dese count of memory references for reading B n ines {one foreach val of 5}; 1 for eeding Aone row at ie ae ping iin fiat memory unt o longer nee; sal 2n? oe wading ane etey of © ta time, heping iia fast merry unlit i completely computed, and then ‘moving it back 1 slow memory. This comes to n+ Sn? memory moves, ce 428° (08 Sn!) = 2, whieh sno beter tha the Level 2 BLAS and fr frm ‘the masizu posible n/2 (sce Table 21). 1 M =m, 0 tat we cannot keep ‘fll row of in fast memory, ¢ further decreases to 1 sinc the algoethin reduce to Soquence ofiner products, which are Level | BLAS, For every permutation of the three loops on i jak oa gets another slaehm with bout the same vr prefer algorithm uses tock, where Ci broken nto an > LN block mates with n/N n/N blacks CY, and A aod 3 ae slaty Linear Bavation Sosing o prtitioned, as shawn below for N=. 'The alge bosoms oy a ad a8 Auconrvins 2.7. Blocked matr malipintion (annotetad to sndiote mem ory ct) fori 10 forj= lon {Tend int fast memory} “ork to 8 { Hoa 4 no fost memory} { Hou Bint fst memory) Chr ap endfor {Write tack to st memory} tnd for end for ur memory reference count iso follomst 2n? for sending and writing cach block of C one, Ni for reading A times (reding each nf X-by-nfN submatrix A ties), and Ni for rung BN times (reading each n/N tem /N submatsix BY N° times), for «total of (24+ 2)! = 2Nr? memory references, So we want to choose 1’ as small as posible to minimige the nme bor of memory references. But JV i subject tothe constraint M > 3{n/N), hich means hae one block ech fam A, B, nd © roust A in fst memory ‘simultaneously. ‘This yields N my/3/M, anal so 9 % (2n?)/(2Nn2) ~ M73, Which i mel better chs che previous slaneitin. Te poeticulae geows tn. dependently of 88 M grows, which means that we expect the agri to bre fast for any matrix size and co go faster if the fst memory sie Ys Ineresed. These are both attractive properties. Tin fct,itean be shown that Algorithm 27 Is asymptsiealy opts! [10 In other words, no reorganization of matrx-atex mulipleaion (that per forms the same 2 arthmetle operations) ean have a9 larger than OCT) ‘On the other hand, this ble analysis ignores « numberof proctcal Issues: 1. A real code will ave to dal with nousauare mattis, fr whieh the ‘optimal block ss may’ not be square 7» Applied Noserical Linear Algebra speedinMeptons a 00 sees Fig. 25, BLAS aon the BI RS e000 750 2. The cache and rmgsterstrvctare of « machine will strongly ect the best shapes of submaties, ‘8. ‘Thero mey be spel hndware instructions that peer both amily tu! an ation in one eyele. It may also be posible to execute several ‘mltilyadd operations simultane i they donot intertere For deal! dsension of tse ses for one high performance workstation, ‘he IBM RSH000/590, se [1], PARALLEL. HOMEPAGE, oe tp ww austin. sb. coa/teef ‘igure 2.6.2 shons the speeds ofthe three baie BLAS fr this machine. ‘The Iocapntal ass mati le, and the vertical suse sped in Mops. peak smschine speed is 26 Mos. Tho top eure (peaking ae 250 Miloz) is sane Imatrivmatrix utiplcation, The middle eurve (peaking near 100 Mops) i Square matrix-vector mutipieation, and tho bottom curve (peaking near 75 ‘llops) saxpy. Note thatthe sp Ineeases for leger mares. This Is © common phenomenon and rans that we wil iy to develop algorithms whose Jiterna matesemultpieations ue as large maties ns rensonable, Bott the abowe matrxsmaeix mtpliation algorithms peor 20° arth smote operons. 
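Before turning to algorithms that perform fewer than 2n^3 operations, here is a Matlab sketch of the blocked multiplication of Algorithm 2.7. It only illustrates the loop structure (in Matlab the block products are themselves executed by optimized BLAS); the block size nb is an arbitrary choice, and n is assumed to be a multiple of nb for simplicity.

    function C = blocked_matmul(A, B, nb)
    % C = A*B computed by nb-by-nb blocks, in the style of Algorithm 2.7 (sketch).
    % Assumes A and B are n-by-n with n a multiple of nb.
    n = size(A, 1);
    C = zeros(n);
    for i = 1:nb:n
        I = i:i+nb-1;                        % row indices of the current block of C
        for j = 1:nb:n
            J = j:j+nb-1;                    % column indices of the current block of C
            for k = 1:nb:n
                K = k:k+nb-1;
                C(I,J) = C(I,J) + A(I,K) * B(K,J);   % C^{ij} = C^{ij} + A^{ik}*B^{kj}
            end
        end
    end
    end

When nb is chosen so that three nb-by-nb blocks fit in the fast memory, this loop ordering attains q ≈ sqrt(M/3), as derived above.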
Hts ot that there ae other pene of mates ‘mates mulipintion that use fr ewer operations. Sessa’ mashod [3 8 the fist of these algoithins to be discovered aa he simplest 10 explain "This slgorthmn multples matrices ecusvely by dividing them Into 22 block Linear Bavation Sosing a ratios and multiplying the subocks using seven mate maipicntions (r= crsivly) and 18 matrie aditons of tall Ue sie his eds t an asyznptote ceamplety of 817 12 instead of AUGoRIID 2.8. Strusen’ mater multiplication apron © = Sirasen(A,B.n) 7 Heton ©” Ae B, where A and B ere nbyns Assume mis a power of */ wast tur C= AB /* salar mulipliation */ ee Ay An By By Perttion a= [4% 4 ) ana p= [ Bn Bo tohere the slacks Ay and By are nf2-byen 2 = Sasent dig ~ Ana, Bay + Baa n/2) y= Strassen Ay + Asa, Bhi + Ba. 2 ) P= Simson day — Any, Bri + Ba, m2) P, = Strassen Ais + Avs, Baa, m/2) Py Strassen As Bi m2) Pa — Strassen Aaa, Bay ~ Bir m/2 ) Ps — Strassen Aan + Asa By n/2 ) Cn Pee Pa PP Ca PP C= Pee CaP RA mame=[ oh eh] af leis tedious but straightforward to confi by deli that this algorithm ‘multiplies matics eoretly (se Question 2.21). To show that is eomplesity 1s OG"), welt T(n) be the number of additions, subtractions, and mul lies perorted by the algorithm, Since the algorithm performs seven recursive ‘al on matrices of sae n/2, and 18 addons ofn/2-byn/2 mateces, we ex ‘write dwn the recurrence T(n) = TF(n/2) + 18(n/2). Changing variables From nto m = logan, we gota new reeurenen on) = THI) +182"—", whore T(m) ~'7(2), We ean confirm that this finer recurrence for T has a solution Tn) = O17) = O(n), "The valve of Strssen’s algorithm isnot jost this asymptotic complexity bout nation ofthe peublem to staleesubpeablems which eventually ft in fast memory; nce the subproblens i in fast memory, standard mitre ‘multiplication may be ued. This approach hs led to speadups on zlatively Inege matrices on some mchines [2A drawack s the eed for signenat workspace and somewhat lower numerical taal, although Ie is adequate for 2 Applied Noserical Linear Algebra many purposes (7). Thee ae a number of other even faster rant eal cation algorithms; the current rcord is about O(%22), due to Winograd ad CCoppersnith 261) Buc thse algrthats only perform Tower operations than Strasen for Impractialy large values of m. Fora survey seo [193 2.6.3. Reorganizing Gaussian Elimination to use Level 3 BLAS Wo will organize Gaussian clinination to use, fist, the Level 2 BLAS an, ‘hen, the Level 3 BLAS. Fr simple, weassume tha no pivoting newer. Indes, Algpeitm 2.48 alteady Level 2 BLAS algorithm, because mest ofthe work is dove in the second ine, AG $F nim) = AH mitts m—AUG Lena) eAQid | 1m), whieh i a nek apdate of the submatrix AQ 1nd +e). The other althnatie in the algorithm, AGL mi) = AGEL: mD/AQGA, is aetlly done by multiplying the swetor A(¢# Ti) by the salar 1/A( 9), ince mulkpietion is much aster than division; this is also e Level | BLAS operation. We need to mody ‘Algorithm 2 sight borane we will set within the Level 3 version Auconrrint 2.9. Level? BLAS implementation of LU factorization without pivoting for en m-dyn matrie A, where m > ns Overerite A by the m-byn Inatrer Land icbyse matrirU. We have numbered the portent lines for later reference. Jor = wo mint ~ 1.9) (1) AGH) = AW EL 9/46) wien 0) AGG Lemay tem) = AG eLem eb tem) AGT mi) AtE HL) a for ‘The let side of Figure 26 strates Algorithm 2.9 applied to a squere imatris. 
At step # ofthe gorithm columns 1to§—1 of Land rows L071 Of U are alredy dane, column #of Lana row i of Ure to be computed and the traling submatrit of isco be updated by arank-l update. On tbe et side f Figure 2.5, the suibmotrioe are Inbeed by the tine ofthe algorithm (Ci) oF 2) thse update them. The ran updte i ine (2) to subtract the Drodet ofthe shed column and the shaded ow fro te sbmst fbeled @, “The Lavel § BLAS algorithm wll organize this computation by delaying ‘he apt of subaatre (2) for Bates, where b is smal inte called te ‘block se, nd Inter applying Beanket updates all at once in sage mats ‘matrix multpleation, ‘To ase how to do this, suppowe that we have aleady Linear Bauation Solving a i »b (done) U (done) 1 b | z @ z = = ® ‘Step iof Level 2 BLAS ‘Step i of Level 3 BLAS Innplementation of LU Implementation of LU Fig. 208, Lal ond Lote BLAS implantation of fatrzation. ‘computed the rat 4 — 1 eolums of Zsa rows of UY ylding <1 ob wob-ted in Au Aa Aw Aso dn Aa As robin Ay Ae Ase In 00) [Un Ux tu ty to]-| a de dy |. mot] bo ae ay sor all he mates re patiiod he same ay. ‘This shown othe right side of Figure 2.6. Now apply Algorithm 2.9 to the submatrix [ 42 | to [ae] [23 = [re] [e#]-[eee Ae ass Tain Ass ty 0) [Ua Thais (i. 9] (0 ant ia | : [E) (sau | a Applied Noserical Linear Algebra [el ee] Altogether, we get an updated facorastion with b more columas of Land tows of U complete: Au Aw du) [tn 0 0) [Un Ua Un dn da da] =| ta ta o|-| 0 Ua Ua an de 4a] [tn tt] Lo 0 aw ‘This defines an algorithm sith the following thn steps which are ils ‘eaten the right of Figure 2.6: (1) Use Algorithan 2.9 t0 fuctorize | 42) =| [2 | Ua (2) Form U5 = 1.4. This means solving «langue lnae systom with many Figh-hand sks (4g) slagle Level 3 BLAS operation (8) Porm Ass = As ~ fr Uas, 9 matrxcmatrix makiplcation, More formally, we bave the following algorithms. Auconrrins 2.10. Level BLAS tmplementetion of EL factorization without ‘iveting for en n-byn mats A, Overunite Land UT om a The tines of {he arto are numbered ax above and to correspond tothe right port of Figure 26, fori = 191 aonb (1) Use Algorithm 2.9 to factorize AUG: més 646—1) =| [2 (2) A s64b— 16 bam) = Bgl ASK HLL bem) "7 form a */ ) Ae} Dine tbe) AG bem bbem) SAGE Din gstt b= De AUB OnE bem) /* frm ea 7 Un ou for Wo till ed to choose the block size bin onder to maxinie the speed of the algrthm, On the ope hand, we wou lke to make 6 lage beenuse we have seen that sped increases when multpiying lager matrices, On the othe hand, we ean verify thatthe number of Boating point operations performed by the slower Lawl 2 al Level 1 BLAS in fine (1) of the alr abou 0/2 or small b which goons ab grows, so we do rt want to pick b 190 lange. The opts vl of bs machine depedeat nd ea be ton fo enc ‘machine. Values of b= 2 or b— Glare commonly usd Linear Bavation Sosing 6 "To see detail implementations of Algorithms 2.9 a 210, see stron Lines aget £2 an agetef, respectively in LAPACK (NETLIB lac). For ‘aoe information oa block slgoithns,inliing detailed performance nu. ber on 8 varity of machines, ste als [10] oF Ue course notes at PARAL EELHOMEPAGE, 2.6.8. More About Parallelism and Other Performance Issues le thissetion we bly seve’ othe ses involved ia implementing Gaussian liminaton (and othe near algebra routines) a eiientty as possible A paral computer contains p > 1 process capable of siultanousty working an the same probin. One may hope to solve any glvea probs Dimes fister on sich a machine chan on a conventional uniprocessor. 
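Returning for a moment to the blocked factorization just described, the Matlab sketch below shows the same three-step structure: panel factorization, triangular solve with many right-hand sides, and a single matrix-matrix update. As in the text, pivoting is omitted, so the sketch assumes every pivot encountered is nonzero; the block size b and the function name are our own choices.

    function A = block_lu(A, b)
    % Blocked LU factorization without pivoting (sketch of the Level 3 BLAS algorithm).
    % L and U overwrite A; assumes every pivot encountered is nonzero.
    n = size(A, 1);
    for i = 1:b:n
        ib = min(i+b-1, n);                   % last column of the current panel
        % (1) unblocked (Level 2 BLAS) LU of the panel A(i:n, i:ib)
        for j = i:ib
            A(j+1:n, j) = A(j+1:n, j) / A(j,j);
            A(j+1:n, j+1:ib) = A(j+1:n, j+1:ib) - A(j+1:n, j) * A(j, j+1:ib);
        end
        if ib < n
            L11 = tril(A(i:ib, i:ib), -1) + eye(ib-i+1);
            % (2) triangular solve with many right-hand sides: U12 = L11 \ A12
            A(i:ib, ib+1:n) = L11 \ A(i:ib, ib+1:n);
            % (3) Schur complement update, a single matrix-matrix multiply
            A(ib+1:n, ib+1:n) = A(ib+1:n, ib+1:n) - A(ib+1:n, i:ib) * A(i:ib, ib+1:n);
        end
    end
    end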
But Such *pereet ficiency” i arly achieve, even If there ae always at least p independent casks avaliable to do, beeause ofthe overbeed of coordinating » proceso and the cost of soning data from tho processor that may store Fe to the procesor that ned it. ‘This last problem is another example of| ‘8 memory hierarchy: fom the pint of view of processor i its ont memory is fat, but geting data from the memory owned by processor js slower, sometimes those of ts seme ‘Gausinn elimination offers mary opportunites for parallin, sine exch entry ofthe tating subtrixriy” be wat ieepenenty al in pore ‘teach step. Hut some care is needed to be a efficent os posible. Tw stat an pers of software are silale. The LAPACK routine agetsé describes In the last seein [10] runs on shared-memory parle! machines, peovked that ane has available implementations ofthe BLAS at rua ia pall. A related Libary called SeaLAPACK, for Sealable LAPACK 82, le designed foe Aisiributed-memory poral machines, ebro that rule spell opeeations tomove date Breween diferent proceso. All sofware ssallable on NETLIB inthe LAPACK snd SesLAPACK suircories, Seal APACK i desribed in ‘more detail inthe notes at PARALLEL. HOMEPAGE. Extensive performance ‘ata fr linear equation solvers are saab as the LINPACK Benchmark with an uptodate version available at NETLAB/benehmack/performance.ps, ‘rin the Perforce Ditabise Serve. As of Angst 1986, the fastest it ‘any linear system hd boen seve esing Gaussian elimination wns one with ‘2 = 12860) gaa Tate Paragon XP/S MP with p — 6768 processors: Use ober ran at Just ver 281 Glos (aialops), of maxizaum 38 Gps "There aze some matees 10 lege to tin te enn memory’ of any aval able machine, These matics are stored on disk and ust be read into mala ‘memory ples by plese in order to perform Gaussan elimination. ‘The ors zation of such routs largely sll to the technique curently used In Seal APACK, and they will oon be Included in SeaLAPACK. Ti perme i paonmanc/tin/P Stop hl 6 Applied Noserical Linear Algebra Fall, one sight hope that compilers would became sufiently cover to take the siplest Implementation of Gausslan elimination ising three nested oops ad automaticaly “opting” the code ook ke the bie! algrithn 0 for sll —0. In this section we wll show how to solve Az —b inh the tine ‘axl bal the space of Gausan elimination when Asp. PROPOSITION 2.2. 1. IFX is nonsingulr, then A is sp. if and only if NTAN apd 2 Ais spd. and H is any principal submatrix of A (H = AU: ky 2) for some) <8), then His spd. Linear Bavation Sosing a 5. Ate spd if and only f A= AT ond all ts eigenvalues are pase 4. A is sped tho all > 0, on mesa] = mas 045 > 0 5. A is pad, if and only f there isa wniue lower triangular nonsingular Imatrie L, with positive diogonal entries, such that A= LEE. A= LLP 1s called the Cholesky factorization of und 1 called the Cholesky Factor of 4 Proof 1. X nonsingular imples Xe = 0 forall x = 0, 29 27XTAXe > 0 for all 2-0. SoA spd. implies X7AX spd. Uso X~! to deduce the other Implcetion, 2, Suppose fst thae H_— A(L:m,1 an). Thon gvon any meveetor , te vector [yf OP satis yTHly ~ 2 Aa. So if 27 Ax > 0 forall ronzero then y Hy > O forall nonzero yan 0 Hiss fH does ot lien the upper left commer of, let P bea permuttion so hat Hf oes le in the upper kt corner of PYAP ane apply Par 1 ‘8. Lav X bo tho roa, orthogonal egonveetor matrix of Aso that NTAX = Is the dlagonal matric of real eigenvalues Since 7 Az ~ YB, A Isspa if end only ifeach 3; > 0. Now apply Part 4 Let; be the ith eolumn of the Menty mates. 
Then ef Aes = a > 0 forall Ilo) = maxi abut k =f, ehoose 2 ~ ey — senoe "Then 2 Ar = axe ay ~2ay| <0, coteadeting postive deans 5. Suppose A= LAT with onsngule, ‘The 2x = (27 (LPs) = Felt > 0 or all 0,04 spd. ICA i pd, we show that 1 tats by lndution onthe denon n. we hn each > 8 fenstrton wil determine L uniquely: tm ~ 1 ese fn — hid exits sine ayy > 0. As with Gaui eliination, sufi a dria the ck 29-2 cave Wee += (44) [2 i]b Al[F #] [2h a has J: 0 the (~1}94o0— 1) mate ey = An — A pm 8 Applied Noserical Linear Algebra By Pare above, [3 2, | spd, so by Part 2 nis pd. Thus by induction there exists an L such that Aga = LE and vax 0) [1 0) | van + (Ebal[* 4] at lee eee frj=toon Y= (ass — DEV G fori 3+ Vion as = (aig ~ SA balsa is end or ul for A not postive defirite, thn (in exact arithmetic) eis alent wil fall by tempting to compute tho square root of a negative nuber or by Aividing by zero: this is the cheapest way to est ifm symmetric matrix postive define. ‘As with Gowan ebmination, Lean overait the mer bal of A. Only the fone ha of is referred toby the algorithm, son fet only mn 1)/2 Storage aed sted of 2. The number of ops yen ¥ 2n= bot cow, cr js the ops of Gausin lato, Juss wth Gaus ination, Chelny tay be teorpanind to peroe mon ots fotng pl operations sng Lee 8 BLAS, ee LAPACK conte apotet. Pong ot nesmsy for Cholesky to be ner stabe (ue Jey, ou so sy ny pivot onder ser stable). We show this ‘lows, Those sols sor Gaesnn elimination inection 212 shows “hatte competed sion #satsin (4-82 —bwith a = e| ut by the Coy Sct nanity and Pat of Propston 22 UEL-IETDig = So Mil -Ibal = yVUavyER = aves mas ah (216) Linear Bavation Sosing ” 50 |E1-|0"| Jac J-4b, oF F<) — by! on fides ° A o Oe oa ‘Bnd matress arse often in practeo (ww give an example Iter) and are wef to recive beeaise their Lene! Tactors are aso “esl bande ranking them cheaper to compute and store. We explain what we mean by sentially fanded” below. But fist, we consider LU factorization witout pivoting ar show that Land ace banded ithe wal sense with te se Pats a8 PRoPOsrmON 2.8. Let Abe banded with ler bandh by, ad upper ad wah y= Let ALU be computed witout pivoting Then L hes lower ends end has upper bandh by and oan e compute ‘ont 2b -by erthnetic operations wen by and by, ee all compared to fn. The space needed is N(by. by + 1). The fall cot of sling Ar — 06 2rd by Dab + Pd, Sketch of Proof. 1 sulos to lok atone ste; soe Figure 27. At step § of ‘Gaussian liination, the shaded region i modified by suburacingthe prods » Applied Normeriel Linwar Alba bu out Fig, 27, Band LU fctoraton without pian. of the fst column and fest sof the shaded rego Hote Ut this dows ot large the bandwidth, © Pnorosrioy 2.4. fet A be banded with ler bath bad upper ba swt by. Then after Gaussian elimination with partial pivoting, U te banded ‘oun opper banda a most bh, ed Ls “eserilly and wh ower bendindih by. THs mens Bhat Dat et mest by, + 1 nonzero each clam, (and 99 an Be stored in the same space aa tba ath Ure bens b, Sketch of Proof Again w pete of the reglon ebanged ty one step ofthe ‘gorithm illstrtes the prof, As ilastrated in Figure 28, pivoting can in reso the upper bandwidth by at most br, Later permutations ean reorder the etres of earer colarns so that entries of Zmay le belew subdiagonal b by per cour. 
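Stepping back to the dense case for a moment, the Cholesky algorithm given earlier in this section is short enough to state as runnable Matlab. This sketch proceeds column by column and, as noted in the text, running to completion is itself the cheapest test of positive definiteness; the function name is ours, and in practice one calls the built-in chol.

    function L = chol_lower(A)
    % Cholesky factorization A = L*L' for symmetric positive definite A (sketch).
    n = size(A, 1);
    L = zeros(n);
    for j = 1:n
        d = A(j,j) - L(j,1:j-1) * L(j,1:j-1)';    % remaining part of the (j,j) entry
        if d <= 0
            error('A is not positive definite')   % failure here is the definiteness test
        end
        L(j,j) = sqrt(d);
        L(j+1:n, j) = (A(j+1:n, j) - L(j+1:n, 1:j-1) * L(j, 1:j-1)') / L(j,j);
    end
    end

Only the lower triangle of A is referenced, and the cost is n^3/3 + O(n^2) flops, half that of Gaussian elimination.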
‘Gaussian elimination and Cholesky fr bane matrices ave avaiable in LA PACK routes lite asbev and seper ‘Band matrices often are from dseretiaing physical problems with nearest ighbor interactions an a mesh (provided the unknowns are ore owe er colunawise see abo Example 29 and scion 63) BXAMPLE 2.8, Consider the ordinary eifeential equation (ODE) yf) = Blalw(2) —ale}a(s) ~ 112) on the inter [a,b] with boundary conditions {0} ~ a, x8) = 8. Wo als assume q(2) > ¢> 0, This equation may’ be ws to made! te eat fli a kn, thin od for example. Tosole the dieretil ‘near Bauntion Solving 8 byt bn, bu but Fig. 28, Band 10 factorization wth parti pong. ‘eqation numerially, we dieretize it by seeking its solution only atthe evenly Speed mesh pnts 25 a4 ah OyanyN 1, whee = (b= a)/(N- 1) is the mesh spacing. Deine mph) 11 — lays aed aay). We ned to de rive uations to sale far our deste approximations y= ys), whee py = a tial pve ~ 3. To drive these equations, we sprain the derivative (2) by the flowing fate dierenceapprinaion vest vo vey BS (Note that as A gots smaller, the right-hand! side appraimates (2) more and ‘more aeruratel.) We can sniasly approximate the seco derivative by hat 20 tes ea) Bae het (S1 ato 03.1 in Chapter 6 ora mae eta! derision) Inserting these approximations into the difereatial equation yells w= %H tween er 9, a Rewriting ths a8 linear system we get Ay ~ 0, where nm n (tre 1 v= ae 2 ow wd Lato Isic, 2 Applied Noserical Linear Algebra wa Ea a-|% = desta eval og = dda ty ay ‘Note that a: > 0, nd aio > O and > Of hs small enough “This fs 0 nonsymmetrc ‘rdiggonal system to salve for y. Wo will show how to changeit toa symmetric postive deinite triiagonal sytem, so that swe may use dnd Chea to solve Chon Dat By SEES. Then we may hangs Ay = bo (DAD™*)(Dy)~ Dh or A —b, where a vam vom Vale A vai ie Vina “oy is easy to ee that 4 is symmetse, and i has the same eigenvalues 98 A because A and A= DAD" are ssn (Seo setion 42 in Chapter 4 for detail.) We will se the next theorem to how It also postive dete, ‘Tunowe 2.9. Gerslgorin. Let B be an arbitrary muir, Then the eigen wet h of Bove lcted inthe union of the mls (A~ baal 0 a Prof. Given and 2 = 0 puch tha Be = Az, 1 = he = 24 by sellg 2 secs: Then Sy yey ~ Mey = 050 A—by SEN yy, soli bale Diels Tiel 8 Now it his osm tat foal, Hp <1 then ivri$(0e4a) oC ‘Therefore al eigenvalues of Ato inside the disks centred 1+ Aq /2 > 14 g/2 with redus 1; in particular, they rust all have positive real pets Since Ai symmeti, isegenvalus are eal sa hence positive, so ds postive defo, Ts smallest geval bounded below by gt /2 ‘Thus, can be Sy) rere Byer Be Soe Fests aaa Linear Bavation Sosing s solve by Cholesky: The LAPACK sibratine for solving symnetre positive elite Lingo syste i apt In section 4.3 wo wll usin so Gershgorn’s theorem to compute pert bution bounds for eigenvales of motrces. © 2.7.4, General Sparse Matrices AA spmrse mari is defined to be a matrix with lage numberof zo entre. Tn pret, thi means a mri with enon zero enti tat ts worth sing ‘an ngorithen that avo Morn or operating nth reo entries Chapter Ie devoted 10 mettods for solving spre Linen systems oe than Gatssan limination and is variants. There are a lange numb of sparse metho ‘hoorng the est one often raqules substan! 
knowledge about the mate [24 tn cis setion we wil only skit the base sues In sparse Ganesan ‘limination an give polaters tothe Iterature apd avalable software "To ive a very simple example, comsider the following mati, whieh ‘onder 80 that GEDP does aot permate any rows 1 1 1 1 4 1 afew ial au 1 1 1 - 1 ae 1 a diac 6 A salad an arrow metre because ofthe patern of ts nowzro entries. Note ‘that none of the zero entries of A wor fled in by GHP so that Land U ‘ogecher ean be stored in the same space as the nonzero entries of A, Alo, it ‘we count the numberof essential arithmetic operations, i, noe mukiplcaton by oro oe adding zero, thee are only 12 of them (4 disons to compute the Ins row of Land mltiptientions and aon toute the (5.5) entry), Instead of $n = 8A More generally, A weee a meyer arrow mates, it wouk! take only Sn — 2 locations to store i insted of and 3n 8 Boating point operations to peefrm Gausianelninstion instead of 31° Wha ns Taege, both the space a! operation count become tay compared toa dense rats, Suppose thet instead of A we were given A’, which Is A with the ordee fits rows ard colurans reversed. This amounts to reversing the order of the ‘equation and of the unknowns i the linear spstem Az ~ 6. GEPP applied 10 st Applied Noserical Linear Algebra AY sg peau no rots, and co two dell places we get eee vu 1 1 1 drat e ie 9 <0 3 lk 4 Show that Ais nonsingular, Hint: Use Gersgorin’s theorem, 4 Stow that Gaussian elimination with partial pivoting does ot setualy Dermuteany rows Le, that is dential 19 Gann elsinatin wit Dwotng. Hint” Show that after one step of Gaussian elimination, the trailing (n—1}-by-(n—1) submatee, the Sear ecmplement of ey A, 1s sll diagonally dominant. (See Question 2.18 for more discussion of the Schur comploment.) Quesniox 2.20. (Easy: Z, Bel) Given sn n-byn nonsingular mates A how do you eticealy solve the flowing problens, using Gaussian elimination with partial pivoting? (@) Solve the tinea system Ab where 6 positive integer (©) Compute a = <4, Linear Bavation Sosing * (6) Solve the mtr equation AN = B, where B is n-by-m. ‘You stould (1) desribe your algosithns, (2) peeseat them in pseudocode (using 4+ Madablik language; you sould not wate down the algoritn for GEPP), ‘and (3) mo he ele ops. Quesrios 2.21. (Aedivm) Prove that Strassn's algorithm (Algorithm 28) correctly multiplies mby-n matriees, where i.3 power of 2. Linear Least Squares Problems 3.1. Introdu n Given an meby-n matele A and sn m-by-1 veto B the linear least squares ‘rode isto Bad an teby-t vector minimizing [cr ~ la. I'm -—n and “A's nonsingular, che answer is simply — AD. But ifm > m so that wo hve more equations than unknowns, the problem i elle ovndeterminad and goverally no x stses Ar ~ b exaty. One oceasonally encounters the tderdetermined proba, where m =n, bit we will contre on the more ‘omamon overdetermines as. ‘This chapter Is organo olows,‘The rest ofthis introduction deseribes three appbentins of ast squaces problems, to eurve fing, to statistical ma ling of noisy data, andl to grate meting. Section 8.2 seuss thee stan ‘ar wags to sl the eat squares problems the normal equations, th QFE tecomposiion, sal he sngalar enue donation (SVD). We wil eqns tse the we SVD fs ool in ater eliptes 50 we deriv several ots properties {although algrithus forthe SVD are lef to Chapter 5). 
Setion 83 ncsses Drrtebation they for least squares problems, ad section $4 diseases Ue Implementation details and roundoff errr analysis of our sain method, QR dcomnpenition, The roundoff analysis applies to many algorithas vsing Or ‘ogo mates, inluding maay algoethmns for elgenlies und the SVD in (Chapters 4nd 5, Seti 8.5 discus the pateuasy i-conditone situs tio ofrankefielnt last squares problem and how to solve them accurately Section 3.7 and the questions a! the end ofthe chapter give pointers to other Kinds of least squares problems and to software for sparse probes EXAMPLE 3.1. A pial apleation oes quae score ting. Suppose at we he mp fuer nb) = mB) A we ant oie the “bet eieplyaominl tty a 2 func o Th men ding polyno! codcetsr,—2 0 that the ply f) 32-24 Daas the esl r= pi) ~B oes = tom. Wen owe hs ao me Apolial Numerical Linear Algebra na] [am] po ea [|] mm |] re) Lote | Loe tw gl) pay fo tm wal fal | & 1 te va] Led Lon = Ae-t, wine a ae my, A. sy, a is by To iin, te euk eyo sch fe FP Ti st ones whieh oveporisominiiing tne Sum of thos ols ety, near tout sper role ‘aur! nws an exp, wine wef polonials of tresng degre vothesmooi faecon b= sig) +t the 28 pintsy = 5,15, 4, 155, 6 The lt de of Fgiwe pt th da pot x rc and out dite spproxisting poston of eer 1 3,6 an 1. The i side oF Fire 3 ps the idl norm fs versus dee for des em 10 Sn, Note htm th dere nee rom 101 the eel decrees We eset this bio xis ners th ppl dese ft data Bate Bat when we ich dare 18, he rsd gon seen incre de matic We ea ee how erat the pt ofthe degre 1 pla on the ke ho ine). This det ction as we wi ters “yp one dos polyoma Ming nly with eave ow degre vomit, song heeding (0). Paleo ftir vale the fineton psy in Mah Ter a llomatite to paloma fing. More gel, on as at of independent faetions fifo fom Beto a tof pints Cobb) with © by € Ry and one wishes tof ext Iie these pint tthe form b— Sf) In other words one wants tooo # = ayaa t mime te eda n= 435th) foe 14 m. Leng ay ~ fl), we can wee thn ar Ab where Aisne lent aad Und ae mA gd dio of tm Tieton f() ean edo ete Hs an lose Heconfond sens tan ting ppm 3, 92,1 EXAMPLE $.2. In statisti! modeling, one often wishes to estimate eran parameters 2) bsed on sone observations, where the observations are ea Taminated by noise. For example, supose that ove when o pret the cllege grade point sverage (GPA) (0) of fesman applicants based on thee Linear Least Sqacos Problems 10s, Fig. 5.1. Paloma acne b= lany/5)+ 9/5, al eka norm high sehool GPA (a) ad two Selasie Aptitude Test scores, verbal) and ‘quantitative (0), part of the college sisions prog. Bast on pst ‘data from adoitta freshmen one can eonsttnet Enear model of the fort 1S az. The observations area, a, i, aly, ne st for each of the m-Sdents nthe database, Thus, one mts to malian rat One be which we can do a8 lest squares problem. Here fn statistics jostifention for least squares, wich is called linen ragresion by sttistcins: assume that theo; are knowa ext co that only has nose ini, ane that the ois in each yi independent and normally lstribited with O mean and the same standard deviation a. Let x be the so lution of the east sures probe ane be the tre vale of the parameters, Then 2 calla! marian ed estimate ofp, a Ue etary normaly distributed, with sero ean in each component ad coveriance me tar 024A)". We wil seo the mate (44}~ again below wh we sole the least squares problem using the normal equations. For more details on the ‘connection to statisti, see, for example, (8,257). 
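Both of the examples above lead to the same computation: assemble A and b and then minimize ||Ax - b||_2. As a concrete illustration of Example 3.1, the following Matlab fragment builds the polynomial basis matrix column by column and solves the least squares problem with backslash. It is only a sketch: the test function and the 23 sample points are our reconstruction of the ones used for Figure 3.1 and may differ from the exact values; the built-in function polyfit mentioned above packages the same computation.

    x = (-5 : 0.5 : 6)';             % 23 sample points (an assumed reconstruction of Figure 3.1)
    b = sin(pi*x/5) + x/5;           % smooth function being fit
    d = 3;                           % degree of the fitting polynomial
    A = ones(length(x), d+1);
    for j = 1:d
        A(:, j+1) = x .* A(:, j);    % column j+1 holds x.^j
    end
    c = A \ b;                       % coefficients minimizing || A*c - b ||_2
    fit_residual = norm(A*c - b)     % decreases as d increases, as in Figure 3.1

Increasing d drives fit_residual down, but, as noted above, fits of high degree eventually become erratic between the sample points.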
© Tyan siaon aie dil tom iw agra: atte wee AS Ite Ae mt poled Numerieal Linear Algebra EXAMPLE 8.8. ‘The last squares problem was ft posed and formals by Gans 10 solve » practeal problem forthe Germann government. ‘There are important enone ad legal reasons to know exactly wheze the bouates le beeween plo a land owt by diferent people. Surveyors would go out sad ‘uy to establish these boundsies, weasoring eoraln angles and dstanoes ad then trianqulating from known landaarks. As population density increased, ie became necesaty to improve the accuracy to which the locations of the Inndmarks were known. So the surveyors of the day went out apd measured many angles and dstanees between Indmarks, and It fall vo Gauss to figure ‘ut bow to take hese more seeuate measurements aed update the government database of locations. For this be invened last squares, as we wil explain story ‘he problen that Gauss solve did not xo away nnd must be periodically revisited. In 1974 the US National Grete Survey undertook to update the US arodesie database, which conse of abot 70000 points. The motive tis id! grown to ice sbpplsing accurate enough dat fr evil eninees fan resional planners to pln constretion projets and for sph (0 ‘Study’ the motion of tectonte plates i the earths erst (whieh ean move up £0 Sem per yea). The coresponding lst squarsproble was the lrgst ever Solved at the tne bout 25 millon equations In 400,000 unknowns. 1 wis ‘so very sparse, which aude i tractable on the computers waluble i 1978, ‘when the computation was done [162 "Now we ify elicits the formulation ofthis problem. It scl non- Toward solved by approximating it by a sequence of eae ones, each of ‘which isa near least squares problem. ‘The data bas consists of list of Points (landmarks), each labeled by location: att, longitude, aed possibly ‘Seaton. For simplicity of expenition, we assume that the earth Bat sd ‘ppone that each point # i abo ty iene eoodinates =) — (x). Ror ‘each pint we wish to eampte a ormeton 85 = (%,64)* 9 tat the eak- rect location 2 — (2)? — 54 more nearly maces the nee, ore secorate measurements. These measurements incade both distances betwen Selected pet of points sa sings between the line see Tom pois # to J al #10 k (oe Figure 6.1). To sce how to turn these new measurements Inco constraints, consider the angle i Figur 3.1. The eoeans ae labeled by thelr (coercted) lations, and the angles 8nd edge lengths Fare also Slow. From this data, tf aay to write dawn contents based on sine Tegonomnetele inties. For example, an sceurate measurement of 8 ads to the constraint WPA 080 a LL Where we have expressed 080, in terms of dot peoets of certain sides of the teangie. If we assume that 8 I sll compara to 5, then we can Noeare this contin as follows! multiply through by the denominator of Linear Least Squares Probes 103, xpen ty wich) Fig, 82. Contras mating «goatee datatoe the fraction, multiply out al the terms to gets quark polyno in all the ‘arnbls” (Ike 473), ane chow evay all terms containing more thn one ‘arable a efector. This yields an equation in which al #aeables appear Tne If we colt all thee iar estan from all the new a aa distance measurements together, we get an overdetermid linear system of ‘equations fo al the d-arihles. We wish to fire! the smallest corrections the smallest wali of 8, te that most nearly sti Use contents. his tsa least squares problem. © Late, after we introduce more machinery, we will alo show how image compression canbe interpreted os lest squares problem see Example 3.) 3.2. 
Matrix Factorizations That Solve the Linear Least ‘Squares Problem “The linear least squares problem hss several explicit solutions that we now dies 2. QR decomposition, 5 sv, 44 transformation to. rar system, (ee Question 3.3). "The first method i te fastest but hast accurate; i is adequate wh the cmmdition number i anal, The second method isthe standard one an ects up to twig sen a te st mato. The thd metid sof most we on eopdiiowed problem, 1, when 4s wot of fl nk: cs several Les mee 106 Apolial Numerical Linear Algebra expensive agin. The last method lets 1d iterative refinement ae prone the solution wlan the problem is Ukeoaditions. All mods but the third cen be adapted to deal eicisuy with sparse mates [33 We will sess ‘ach solution in urn. We assume iil for methods 1 and 2 that A has fll column rank 3.2.1, Normal Equations "To derive the normal uations we lok forthe wheee the gradient of Ar — big = (Ae = 0)" —D) vases. So we want DF (Ate + 6) =8) Uae bar = 6) o Tele 26T(AT Aw — ATE) 5 TATA The cod erm ETA AL — 42a aprons on pst O,sottetatr A” ArT incor tem mas ao bem, oe AT Ar = AT ‘his system of limar eqns Inn unknowns, te tora ations. Why ts 2 = (ATA)"1AT the mininlzce of [Az — 03? We ean note that the Hasan Als pocv defi whlch nus hat the fein ety Conver an any cite pont ha goel minimum. Ot we can compl the Svar ty tng #1 and ping (As 0)" (Ae 8) = (Ay 4 Ae OF ay + Ae) (ay (Ay) + ae = 8 (Ae = 8) (Ay) (Ae — 8) nid + ae 013 4-297 (a7 A — a7) (4g + Lae. ‘This clearly miniaiaed by y ~0. This jst the Pythagorena thane, sce the residual t Ae — 0 rtagonal co the space spanned by the eons of Avie, 0 Ar —ATAr Ab as stated below (che plane shows i te Sin of te column wetors ofA so that Ar, Ay, and Aa? — A(z +p) al ie tn the pane): Linear Least Squares Probes or FeAx-b Axb = Alxty pb AX = Atty) Since ATA is symmetric and positive definite, we ean use the Cholesky dcompesition to salve the normal equations. ‘The total ast of computing 7A, 1b, and the Cholesky dewompostion i x2m-+ n+ OU) Raps Sinoe ‘m2, the Pm costo faring ATA dominates the cos 3.2.2. QR Decomposition ‘Turonss 8.1. QR doampostion, Let A Be m-hyn with m > n. Suppose thet A has fell conn rant. Then there exists « unique tym orthogonal rmatris Q(Q7Q = la) and a unique n-byn-wpper trengular mati I with ostve diagonals vg > 0 such that A ~ QF Prof. Wogive wo pron ofths thoran. Fist, here sa restates! the Gram Schmidt orthogonalistion proces [17]. If me apply Gran Sehmdeto the columns a, of ~ [ev04 een fom He to ht, we st a sequcace of ethanol vectors teh ge snning the same spe: thes erthogona! wrt ar clus of @. Gra Smit ab empats content= #0 expresing ech eum a «en combi og hroagh 2 a — 3554s The ra jst the eatin aR Auconmii 3:1. The elasical Gram Schmidt (CGS) and modifod Grom Seine (MGS) Algorithms fo factoring A ~ QR. Jor tt 9m / compe th columns of and B/ wna Borst to4—1 /* sutiset components 95 dation fom a, */ (ogee tiodia MGS end for a= Wal ra =0/* 0 leary dependent om 35-06 */ emf a= alra 10s Apolial Numerical Linear Algebra efor We kv it san exercise to show that the to forms for ri the algo- thon ace mattematieally equivalent (see Question 81). IFA a fll cloran rank, 4 will not be zso. The following Hgure strates Gram- Seki when Aig 2by.2 ‘he sco! proof ofthis thea will use Algorithm 3:2, which we prone Tinortunately, CGS is rumerilly wnstable in Roating point arithmetic when theealunay of 4 ae nearly Lncaely dependent. 
MGS is moee stable aad wil be used in algoitans later inthis book bt my stil esi in Q beg ae from orthogonal (|G — being far larger than =) when A scanned [192,38 147). Algorithm 3.2 in section 31 i stablealernatve agora foe factoring A~ QR. See Question 82. ‘We wil rive the formula forthe that mlnimizes Az — bl ung the decomposition A QJ in thee slightly diferent ways. Fist, we an alays ‘hoore mn ore orthonormal vectors Q 0 that (Q,Q] fea square orthogonal nati (for example, we ean choose any m ~ nm moe independent yeiors X that we went and then apply Algorithm 8.1 to the r-by-n nonsingular matrix IQ.XD. Phen | Wax — O13 MQ, QI (Az - HIE > ‘by part 4 of Lemma 1.7 [F}om-of on ]=-[ $8 \| "ae" If [Re — QPOIE + 1QT OE 2 NG". Linear Least Squtes Problens Py We can solve Re — QT = 0 for 2, singe A and have the samme rank, ‘a, end So 0s nonsingular. Then 2 — R-1QD, and the minimum value of [ae oly i [QO ‘ere sa second, gly diferent devivation that doos not use Uke mat . Rewrite Ae — bas Ab = Qitr~6=QRe~(QQ" +1 QQ" = Q{Re— 9") + 9". Note that the vetoes Q(Rer — QF) and (I — QQ") aro orthogonal, be enuse (QR — QP) (I= QQ") = (Ra = Q7NTQ"U = QQ" = (Re = Q"5)"!96 —0. Thoreor, by the Pythagorean theorem, [Ae-0lF = JQ(e—Q798 + 1-99" 013 = (Re Q" 08 + Mer QQ HI, ‘bore we hve td part 4 of Lemma 17 in the form Qld — yl. This sm (oF squares is inimiznd when the fst term is zoo, ie, © — AO*Q* inl, here is third derivation that starts from Ue neral qtions solotio: 2 = taytaty REQ ORT = (REM TQN = RR TRG HQT, Loter we wll how thatthe ost ofthis decomposition and subsequent last squares solution i 28m ~ fn", about twice the cost of tho natal equstions ifm: mand bout the sare itm —m. 3.2.3. Singular Value Decomposition ‘he $VD isa very important decomposition whichis usd for many purposes other than saving least squares problems, "Tuwowsst 32. SVD. Let A bean ebony m-by-n mairiz wth m > n. ‘Then tee can write A= USEVT, where Ua mb ad ates UPU ~ 1, V inte and satisfies VV ~ 1, and © ~dliag(ay..y04),atere oy 2 o> 05 20. The columns uss of U ere called lett Sigular vectors. The colons Sjyscite of V ae elle right singular vectors. The o, ere ella singular vals! (ifm 0. Such, ‘exist by the dfinition of a= miss lata Lat which is muni vector. Choose (and V0 dnt U — [wis an m-by-n orthogonal mates, nd V [eV fan wg ethognal atx, Now write ry [| i fae wav vray = [ir] ate = [By Bae s sie cgeas tee ted: ee ages a Tee Tavs snd OT Av ~ OA — 0. We chim wAV— 010 bene othernise @ = ||Alla = ||V7AV la = 1,0)... 007 AV|la = [elu AV|ll2 > 0, contra ie. (We hae ed part Lama 12) SOUTAV =[ 5 gry ]= 1. b We may now apply the induction hypothesis to A to get A= UiSaVi", where Uy is (m — b-by-(ne — 1), So is (nin = 1, a 8 (9 Deen 1) So vr -[5 owe }-[o a] [5 81 [0 4] cE CE "The SUD fins large imberof important algebraic and geometric nron- erties, the most important of which State ere ‘Tuwonsn 8.8. let A= UV" be the SVD of te m-byn matric A, where m2, (Thee are analogs resus for m anja =s-+ aq =0. Them the rank of A i ‘Te mall space of A, ce, the subspace of wetorsv such that Av = 0, is the space spaniel by coluans r+ 1 tough of Vs spanks) ‘The range space of A, the subspce of vectors ofthe frm Av fora, fs the space sparse by clans 1 through» of U= spans) 8, Let S°-1 be the unit sphere in RY: SY! — (@ ERY : ely ~ 1h. Let AS be the mage of S™-! under Ar A-S™) = (Ap ae Re and [ala = 1). Then A= S°-? iawn ellipoid centered atthe origin OF R™, with principal aes 0, 2B. 
Write V = feyvyosty| and 0 = fasts 90 A= US = Eliecwol (a sum of ron matrices). Then a mats of nk k tens to spaces mus overlap. Let ‘bea unit ecto Int nection, Then JAB > Nca~ ayn18 = Vanfh = WEV"AIE ev ANE ob lV TAS By0 m4 poled Numerieal Linear Algebra EXAMPLE 5.4. Weillstrate thelist part of Theorem 33 by using it for image temmorevion. nv paticula, we wil stcate it with w-raak approximations ‘fe clown. An tbyen image i just an bye mate, where entry (é$) incepta he brightness of pet (,) In other words, matey entre ange Ing fom 0 101 (sa) ae interpeted as pels angi from Hack (0) trout various shades of gray to white (=1). (Colors also are posible.) Rather than ‘toring or eansuting all m-n matic entries to represent the lage, we often, ‘refer co empress the Image by storing many fewer numbers, om whieh we fan sul approximately reconstruct tbe original image. We may use Part 9 of "Theorem 33 10 do this, as we now ilustrate ‘Consier the image st the top lt of Figure 3k ‘This $2by-200 piel image corresponss to a 320-05-200 matrix A. Let A = UBVF be the SVD of A. Part 8 of Theorem 8.3 tells ws tht Ae — FF sue isthe best mank-k ‘pproximation of A, i te see of minimizing [A Ally —aeya- Note that ieonly takes ms (m+) weds to store ay thrngh ‘0 trough one, fron wie we ets reestrucl Ay La eos, takes ‘nv-n words to sore A (or Ax explicitly), which iv much lager when fi Small. $0 we wil use Ay a our compress image, stoned using, (mm) = words. The other image in Figur 33 show thewe apprsizatins foe vasious ‘ales off, tong with the relative errors ayy4/ay aod eorpresion ration (om) -Rjton=m) = 520-/61000 ~ 8/128 IE Ratative erar = oxox [ Comprostion ratio = S200 /6%00 7 1s Er 0 on ost » ‘00 16 ‘These images wore prodved by the following commands (th clown and ther images are evilable in Matlab smpong the visualization demonstration fle; eee your loa installation for fenton} + (U,8,¥I-avd(8); cotormep(gray'): eBCL2e, LDV: LK) ‘There are ao many other, cheaper inageeompresion techniques available than te SVD [18,150 | Laer wil sc tht the cost of solving a east squares problem with the 'SVD js bout the samo as with QR whon m >> m, and about tm — $n (O(n) for smaller m. precise comparson of th costs of QR andthe SVD aso depends onthe machine being use, Sc setion 3.6 for detail Durestr.0x $.1, Suppse chat Ais by. with m > 0 and has fl rank with A= QI= UDV" being A's QR decomposition aud SVD, respectively. Then AMS UMAR = Rg! = vatuT Linear Least Squares Probes ur ds ello the (Moor Pere) rendouvere of A. fm and as full rank. Sup pose that mises 2b]. Letr dA be te residual Let since A+ 64)2~ (04 8s Aswone c= rane Hl) < hy ~ 2 Dien comes 1002), ono ce (Baa a sania} 048 tere sind = fe. tn other word, 0 the gle between the vectors b and Az and measees whether che residual orm Ila lrg (ex) oF sna ine 0) is 4th condition number forte lat oguars pole. Sketch of Proof. xpand # — ((A + BAJA BAYA 4 4A) + 88) rowers of SA al, ed throw svn al bat the Bee terms in 8A aa 8b. We have assumed that ¢-ny(A) < 1 forthe se reason asin the derivation ‘of bound (2.4) forthe perturbed solution of the square iene system Aa — Teguarantes Ut AY 3 has fll rank so that 2 Is uniquely deter ‘We may Interpret this bound a falls. 1f@ iO or vey stall, then the resdyol small and the etietive condition number i about 2634}, nue tke ‘einary linear equation solving. If # isnot small but nt close to =/2, the resol fs moderately large, aed then the efetivecontion number ean be ruth Inger: R§(A). 
If 0 else 10 9/2, so the tre solution Is eae 20, then the eferive condition nim becomes unbonde ewe 28) sal ‘host three cass an ihntrated below. The rightmost pictare mak easy to se why the codon number i nite wien O ~ 9/2 in this ese the sclution + — 0, and alost any arbitnry srall change in Aor b wl yd ona Solution , an “Lately” lage eave change us Apolial Numerical Linear Algebra An alternative form forthe bound in Theovern 24th lininates the O(2) tec isa follows (256, 147] (hee Fis the perturbed residual F— (B48) — (hb sane ele 5 oA (5 Gacy + 1p lle ae $ ty (0+ 000+ oe) [fra Weare tema Tr = (+24) We will sce that, properly implemented, both the QR decomposition sad SVD are numerically sable; Le, thy yield solution minimiing |(4 + sae 64 ob wh ax (AL LH) — og ( Al va) a We may combine this with th abowe perturbation bounds to get ero bounds forth olution ofthe least square problem, ch ws we did friar equation valving The normal equations aw not a8 seeurte. Since they involve solving (ATA}e "A, the aceamsey depen on the condition number (A) ‘B(A). Ths the enor is alvays bounded by on WA}e, newer just mae ‘Therelore we expect tnt the normal equations ean lose tio a nny igs of securacy as mathads based on tbe QH decompoiion ad SVD. Furthermore, solving dhe normal qquations i not nocrsarly stable, Le, the computed solution does ot generally mimi (A+ 64)2 — (6+ 0a for small 8A and 6b SU when the condition nuber is sll, we expect the normal equstions to be about as accurate ae the QR decomposition of SSVD. Slace the normal equations a the fastest wa’ to solve the let ques problem, they are the method of choice when the matrix i wol-coditione ‘We return tothe problem of solving very -ondtions least sures probe lems in section 5, 3.4. Orthogonal Matrices As we sil in section 8:22, Gram Schade orthagonalization (Algorithm 3.1) nay not compute an orthogonal matrix Q whea the vectors big orthogonl Linear Least Squtes Problens ny lade aelyacaly dependent, so we cannot use i 10 compute the QR composition stably. Instead, we base our algoithns on cata esl computable eethozoaal ‘mattces called Houscholder relations and Givens rotations, whieh wo can hoose to introduce ares ito vectors that they multiply. Later we wil show thot ang algorithm that uses these orthogonal mares to intrxdvee eros is automaticaly stable. "This eror analysts will apply © our algorithms for the QR decoration a well a4 mary SVD and eigenolue agocithms in Chapters and 5 Despite the possibility of namarthogonal Q, the MGS algorithm has ln porta uses in tumerieal near algeben. (There file use fries less stable ‘vorsion, CGS.) These uses include ding elgonvetors of symmetle rdagonal ‘atrees using bsetion nnd inverse eration (section 53.4) and the Arno land Lanezosalgritims fo reducing a matrix to certain “condease” forms (setions 66:1, 6656, and 7.1). Aol an Lanes ar used as the basis of algorithms for slving spare linear systems ae ding eigeavalues of sparse ratios, MGS can also be medifed to solve the lst squares probe ably, Tut Q may il be fr from orthogonal 35 3.4.1, Householder Transformations ‘A Householder transformation (or reletion) fs matte of the form P= 12a whee Ila — 1. Tis easy tp see that P= PP aad PPT = (0 2eal)(1—2uul) 1 — tua + dual wal —1, $0 P ia symm, ethogoaal niatex, It is elle eellection because Ps reflction of In the plane trough 0 perpendinlar to w Given» wetor 28 easy to ad 2 Houser rection P= 1— 24a to aro out ut the Bt enry of Pa [0 ~ ee. 
We do ths a flows, Wie Pz =~ 2a) = e-e1 20 tal w= schy(e— eh io iar combination of ade Sinn [la ~ [Pala emt be prec th wetor ty al 0 ~ ie One vey Mat ether cose oan yes a wating Py, a8 ng 0. We Wel sea = signe, snc ths mane ta hee anelaton i 10 Apolied Numerical ‘computing the rst component of In summary, we gee 21 tsgntay) ll 2 with w= ‘We write this ssw = House(2). (In practic, we en store instead of w to save the work of computing vs and use the fori P= 1 — (2/2) instead P12) EXAMPLE 8.5. Wi show how to compute the QR decompoiton of 2. Feby= ‘(roatsie A using Housbolder transformations. ‘This example will uke the ‘atten foe genecal m-by-n matrices evdeat- In the matrices below, P, is ‘by. orthogonal mati, denotes 8 gener noasero ene, and 9 denotes 8 eo etry. 2 Choose Pe — {119 Ar = PoAi ios ee ee 3. Choose Py = 1 |? | 0 A=RAQ=] 0 0 8 ‘ sins 4 Choo =] 1 |° | m Asrd=|o oz 2 or, we have clanen 8 Howsaholder matric PE to xem at the subdingo- al enees in elu this docs nt dst the ats lredy fated fa previous enn Tet us al the faa Sby-4 upper tiangulae matrix R= As. Then A PEPLPE PER = Qh, wheee Q isthe fest foe colutans of PEP] PL PE = PLPLPAP, (sine all Pare sjecre) and Fis the fst four rows of. © Linear Least Squtes Problens 1 ‘ere isthe general algoritha for QR decomposition using Householder teansfrmatons Axconsrine 8.2. QR factorization nsing Householder reflections fori-lion ty —Howe( As m0) PT 2a Amin) — Rate iméen) ed for Hero arw yome more impkanentaton details We never ned to form P explicitly bot just multiply (1 2eyu AGE mim) = AGE: mye) —D(uP AGE mE EA), which costs less. To store P, we neo only ti, oF and These ean be stared In columa § of in fet Ht a note changed! Thus QR ean be overwritin” oa 4, whore @ is stored in fetorel form Ps: Pa, and is stored as i blow the diagonal in column sof. (We need an extra areay of Fength mfr the top eney oft, ine the diagonal entry is occupied by Fs) Recah to solve theless quares problem in | Ar—bla using A~ QR, ‘we ed to compute QT, This done as flows: QUA PaPhoas-- Pi 50 ‘we eed only kaep multiplying By Ph Py .o-s Pa fori Vion, yaad paeent end for "The est dot produets 7 — ~2 0B apd Ssvxpys” btu. The cost of exputing A QU this way is 2n%m ~ 3n°, and te subsequent cost of Solving the Hast squares problem given QW is jit an editions! O(mn) "The LAPACK routine for solving the lest squares problem using QR is ‘agete, Just as Gaussian elimination ean be reorganized to we ntrin-mntix nukipbcation and other Lavel 3 BLAS (se setion 26), the same ean be done forthe QR dewompostion; se Question 3:17. In Mati, i he by. mastic 4s ore rows Uh columns sd bs by 1, ANB sles te ks soar problem. ‘The QR desompostion isl ls available va [QF =ar(A. 3.4.2. Givens Rotations A ir ton M0) = 1 S59 ESP |e ay tor? counter lacie by 0: i, poled Numerieal Linear Algebra Rex We also nod to define the Given rotation by 0 In coordinates ¥ atl ‘ Z ‘ esd sind i sind eas = 115 )-[4"] oF cond — rie and sin ~ 8, ‘The Qf algoita using Givens rotations i analogous to using Householder ‘ellen, but when zeroing out column 3, wo 20 It out one entry’ at tne (bottom v9, 53). com) sind EXAMPLE 3.6, We llustrate two Intermediate steps n computing the QR d= composition of» S-by-f matrix using Givens rotations, To progres from ‘ve mine 1 zea2) [erre 1 ores| jozze 1 cors|-loore Linear Least Squtes Problens ns snd 1 aeea) [pene 1 ores| jorre ent |looez|-looral.o ve cors| jooor "The cost ofthe QR deeompesition using Givens rotations Is tw the east fusing Houschoder elections. 
We will eed Givens rotations for ote ap Pllestons le. Tere are sme implementatlon detall. Just as we overwrote A with Q ‘and R when using Houscholder refetions, we can do the same with Giveas rotations. We use the same trek, storing the information deserbing the tans foraton inthe entries zeroed out. Sino a iver ration zeros out jut ce fenty, we mist ste the information abet the rotation there Wed this flloms. Late —sin0 and ¢~ cond. Is < [lyse 8 sige) and ther wise store SI.‘ recone al from the stored salve (al tp) we do the following” if pl <1, then s = p and ¢ = VT=# otherwise © = 3 an 9 = yI=@, The reason wo do not just stoves and compute ¢ = v= Fis that when # is cloe to 1, would be Inccuratelyreconstcucted, Note also that we may recover other # and ee ~s end —e; tls equate In pectin. "Ther is also way o apply @soquence of Givens rotations whe perfor Jing fewer fleatng pont operations than desribed above, These ae called Fast Givens rotations (7,8, 33). Since they ae stil shower than Hovscholder elec Yio forthe purpose of canting the QR faxariztion, we wil not eosier tho frthee 3.4.3. Roundoff Error Analysis for Orthogonal Matrices, "This analysis prow backward stability forthe QR decomposition ad for many ofthe north for igenales and singular vals ht me wil cen Leas 3.1. Let P be an enact Honsehnder (or Givens) trensformation, and P be is lating pont appracmation. Then, (PA) = PAB) le =O) [Ale AAP) (AEP Pp =O0)-lAla ‘Sketch of Prof. Apply the usual formula (a 8) ~ (20 8 formas foe computing and applying P. Se Question 3.16. 0 Tn words, this says that spplsing # single ortieganal mate backward stable, 14 poled Numerieal Linear Algebra ‘Tunone 3.5. Consider epnying © semence of orthoponl trnsformations (Ay. Dhen the compute product 16 a eet orthogonal ansformation of Ay + BA, where [BAl» ~ OC). fn other wordy, the enti computation ts bend stb: BPP Pada Q1 a3) = Bye Pio + BQ sth [Ea = 3 Ole) [Alas Hor, ax iw femme 3.1, Pond Qe fnting int erthopoal matrices end Pond Qe ext erkoyoel maces lat By = Ryo Phar Qs = Qsoe-Qy. We wish to show tht 1(P;A, 13) ~ Py(A + B)Q; for some [la = JOC» We ine Uma 31 recor. ‘The ret & newly ttf} ~ 0. Now ame hat th res raf = Than we compte DB WPA) Py Aja +BY by Lemma 8 PsA + Bn1Qas 4 by tion + PLEO 1 = PIA PQ, where BBM = WB y-1+ PEEP all < Esa + WB OF ata = Bib t ie JOLALS Siowe [25-1 = = OCI Als and "la = O()LAla: Postrukipbction bby Qs handled in the same way, 3.4.4, Why Orthogonal Matrices? Let ws consider how theeror would grow ifwe were to mulkiny by a senna ttnorthegonal ates in Theorera 5 istead of orthogonal nace. Lee X be the exact nonorthogonal tratsorotion and X be te flostng plat apc nation. ‘Thea the sual Hating point eror analysis of mete multpiation tells us ehat Xa) = NAP B= NAP X ase), sare a S OC IN Ta-BAl ands [Pls IN“Ma- I$ Of) -naX) Wal So the eo Fs magi by the condition number aX) > 1 In lnegerprodet Xy-- X,Y" -¥y the err would be eageel by [], aX) sei). This ate stand ad only Hal; nd 3 are orthogonal Saar multples of orhogonal eis), In which ase ltr fone Linear Least Squtes Problens 135 3.5. Rank-Defi it Least Squares Problems So far we have assumed that A fas fll ranke when minimising [An — lo ‘What happens when A is rank deficent oe else” to rank defient? 
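Before turning to the rank-deficient case, it may be useful to see the Householder QR factorization of section 3.4.1 written out in a few lines of Matlab. This is only an illustrative sketch, not the production algorithm: it forms Q explicitly, whereas library codes such as Matlab's qr and the LAPACK routines keep Q in the factored form described above, and the function name house_qr is ours, not the text's.

    function [Q, R] = house_qr(A)
    % Sketch of QR factorization via Householder reflections (cf. Algorithm 3.2).
    % Q is formed explicitly for clarity; library codes store the Householder vectors instead.
    [m, n] = size(A);
    Q = eye(m);  R = A;
    for i = 1 : min(n, m-1)
        x = R(i:m, i);
        v = x;
        s = 1;  if x(1) < 0, s = -1; end
        v(1) = v(1) + s*norm(x);             % sign chosen to avoid cancellation
        if norm(v) > 0
            v = v / norm(v);
            R(i:m, i:n) = R(i:m, i:n) - 2*v*(v'*R(i:m, i:n));   % apply the reflection to R
            Q(:, i:m)   = Q(:, i:m)   - 2*(Q(:, i:m)*v)*v';     % accumulate the reflection into Q
        end
    end

Even when the columns of A are nearly linearly dependent, norm(A - Q*R) and norm(Q'*Q - eye(m)) both remain of order machine epsilon (times ||A||), in contrast to classical Gram-Schmidt; this is the behavior the backward stability results above predict, and it is also what Question 3.2 at the end of the chapter asks the reader to observe experimentally.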
Soh problems arse in practice many ways, suchas exacting signals rom noisy ate, solution of some inegrl equations, dial image restoration, comput ing inverse Lapin transforms, and so on (139, 10} These problens are ‘ery ilbcenditioned, so we wil need to impe extra codons oa thi 0- lutions to male them wellgontiioned. Making an ihenadiioned problem ‘wel-conditone by imposing cx conditions an Ue soliton i alle regular [Ention audi also done in other feds of urberial says when i-onditoned problems are Por example, th next proposition shows that if As exactly rank deficient, then the lest squares solaton sot even oie PROPOSITION 8.1. Let A be m-byn with m > m and rank A 0. The 1. if minimizes [Ar ~ Ba, then [a> Jub eins tery the Last ‘olan of U in A UST. 2 changing b to 6+ 66 can change # to. + 6, ahere [ila is a8 large as [6b /omn In other words, iF A és nearty rank dicen (in 8 sal), then the sole tion i itheonitionad ni possby very lage Proof For pert, ¢ = Ath = VEIT, 2 lol = BE [O10 = [ull /omin. Foe part 2, choose 8 prall to ty. © ‘Wo bain our diension of regularization by showing how to rogaine an ent rankedeient least square problems Suppene A fe m-by-n with rank r 135 poled Numerieal Linear Algebra PRorosriox 3.8. When A i emetly singular, the x that minnie Ax — Ha ta be characterised as follies. Let A= URV? have rk r-, '0F A + [Vas] and this sini id by = 0. 5. Changing bby AB changes by at most [V7 UF Sa < Fahl — [bbiafe. 2 Proposition 8. tls us tht the minima aon solution 2 i ue ad auay be well conditioned ithe smallest nonzero singular value i wot too stall ‘This kay to prcteal algorithm, dseussed Inthe pet section, Linear Least Squtes Problens Pa ExAMPLt 8.7. Suppose that we ore doing mail researc on the eet of 2 carta drug on blood suger level. We calet data fom each pet (ou bred fom 1 — 1 tom) by eecoeding iso ber iii! blood sugar level (a) fand fol blood sugar evel (), the amount of drug administered (a2), sed ‘ther medical quits, Ineing body welghts on each day of « weeklong ‘weatment (243 through e9). In toa, thre aren toad 3, O otherwise. Let Y= dlag(g)- We eall UEV™ the tracted SVD of A, becuse we have se ‘singular values smaller than tol to ero, Now we soe the east squares problem sing the truncated SVD instead of the orginal SVD. This is justified sinew JV" OSVIy ~ 10 S)UF a < tl i, the change in A eased by ‘hanging each 6 10; is less than the usec’sinerent uneeetainty in the data "The mativtion for sing istend of 3th of ll matrices within stone tol of 3 maximizes th smallest roe singular vale a. Ie thor word ‘minis both the orm ofthe mimen nor kes quae solution dis ‘ondition number. ‘The plese below ilsteates the geometric relationships faong the input mate A, A= O37, and A — ORV, wre ne we hie ofeach mat 8 « point in Bveldean spoce B°". In this spt, the rank ‘defilent matrices form a surface, a8 shown below Linear Least Squtes Problens Fy Randeicient matrices EXAMPLE 8.8, Weilustrate the above procedure 0 tw0 20-10 rank-dfient| rnties Ay (of rank ry ~5) and Ag (of rank r= ~ 7). We write the SVDs of cither Ay oF Apes Ay ~ U,84,7, where tho corimon dimension of Us, 3, and Vis the rank rf Ai this isthe same notation a in Proposition 8.3. Tie onze singular wlis of Ay (singular eles of) ane shown a re plses ta Figure 3. (or) ad Figure 3.5 (lor Ay) Note that Ay ta Figure 3 has five lange onzero singular valves (all lightly exceeding 1 and 50 plated on top of one another, onthe ight edge the graph), wheveas the seven now Sligula values of Ay ln Figure 3.5 range down to 1.2 10°? 
= a Wo then choot an r-imensonal vector a, and lat ~ Via and Aus, ~ Uda, so. i the eet mien norm ston minimising At — bys Then we considera sequence of perturbed problens +34, where the peeturbation is ebosen randomly to have a eange of aoe, and salve the Feast squares problems (A+ 3), ~ busing the tenet lease squares ‘procedure with tol — 10%, ‘The blue lnes in Figures 4 and 25 plot the computed rank of A, +A (aumber of computed slagune values exceeding tol = 10-%) versus [4a (in the top graphs), and the error ye ~ asla/Eza {in ho bottom graphs). The Matlab cade for producing these Ages Is HOMEPAGE /Matinh/RankDefeiet.m ‘The simplest ease is in Fight 3.1, 50 we cousier i fst. Av | A will ‘hao Sve sagular values nar or slahtiy exceeding 1 and che other ve esl 1 [lls oF Sess. For [3a < tl th computed rank of 3A stays the Some a5 that of 4}, sumdly, 5. The eer alo increases slowly from near rackine epsilon (= 10°) t about 10-® meer bla ~ to, and then both the rank ad the errr jump, to 10 and 1, respectively, for larger [8]. This Iseonsistent with our analisin Proposition 33, which says thatthe condition number ithe reciprocal ofthe smallest norizro singular al, i, the smallest Sula val exceting tol. Por [aja < to, this stalest roar singular ‘ale i near to, oF slightly exerts, 1. Therefore Proposition 33 pris an ror of [6ala/O(1) ~ [ball This wel-conditonad situation is coafiad ty the small ero plotted 10 the let of [lly — tol inthe boom graph of Figure 3. On the other hand, when [64a > to, thea the smallest nox 130 Applied Numeral Lincar Algeben Fig. 8. Crop of tant lat quae ato of iy YA 4 An bil ming 1210-8, Thesiger ses of At et shown ceed aces, ‘The nor BA the honcona aa The Lp graph pl he rank of A, 84, Lethe mares of str alter exocing tl. The btm graph pate ~s3hy/tals whee y ‘he wisi ath SA singule vel is Of |S), which i quite sal, causing the ero to jmp |sAls/O(I6A\) ~ O(1), a8 shown wo te Fight of [Ale — 0 the batt sgsph of Figure 3 Ta Figure 3.5, the nonzero sigula values of As ae ao shown a8 rd pluses the smallest one, 1.2107 just lager than ta. So the predicted ero whe sala < tl is [6Aa/10-®, which grows to O(1) when Als ~ tal. ‘This is ‘confirmed by the bottom graph In Fgure 3.5. © 3.5.2. Solving Rank-Doficiont Least Squares Probloms Using QR with Pivoting [A cheaper but sometimes less accurate alternative to the SVD QR with poting. In exsetsrithme, I had rank r m then the iii QH decomposition donates the the cost ofthe sobeswvent operations on the ‘yn mai al che algorithms ost abou the same. ‘The fastest version of rankrevealing QR was that of (9, 194). On Type 1 mates, this slgpethen ranged frou $2 tines slower than QR without pivot or =m = 20 just. ties slower for n= = 1600. Oa Type 2 atl, i ranged from 2.8 tines slower (or = m = 20) to 12 times shower (Gorn — m ~ 1600). In contest, the current LAPACK algorithm, 4geqpe, ‘ns 2 tiesto 2.5 tes slower for both matrix types "The fastest version of the SVD ws the one in (37) although one based on Aivit-and-conque (se setion 5.88) was bout qual fst for nm ~ 1600, (Tho one hse em dvido-ar-coner nko sed rch es memory) Roe Type 1 matrices, the SVD algocthin was 7. tins sloner (for n ~~ 20) 10.33 lines shows foe — m= 180). Fe Type 2 mars, the SVD alge was ‘85 Limes slower (for n — m ~ 20 10 80 ties slower (orn ma 1600). 
Ta ‘contact, the current LAPACK slzoits, dgetae, ranged from ttn slower (Gor Type 2 mates with n — mn ~ 20) t0 97 times slowte(oe Type | mates with nm ~ 1600), "This enormous showdown I apparently di to mernory Dorrehy eft Ths, we se that there tof betwen celinblity and sped in so ing rankelefient ent snes problems: QR witht pivoting ie fstest but least reliabie, the SVD is slowest but ost rfinble, and raleceweling QR in-between. If > a all algorithms cost aboot the same. Th eboice of lgosthen depends the eative inpoctace of spied a relay 10 the Future LAPACK releases ll contain impeoved versions of both rank revealing QI and SVD slgoelthns fr the last squares problem, 1st Apolial Numerical Linear Algebra 3.7. References and Other Topics for Chapter 3 "The best recent reference on kat ares problems [3 whieh ako dserses variations on the base problem dlacused here (uch a constalned, weight, and updating last squares), diferent ways to eegularze ranked. probs lems, snd software for sparse last squares problens. See also chapter 5 of| [119} stl (166, Pereuation ory andl ertor bounds forthe lst squares Selution are discussed In dtal ln [U7 Rank-eveaing QR decompositions fare discussed in 28, 0, 47,49, 124, 48, 19, 20, 284) In perteuar, these Papers examine the tradeofbecwren cost and ancurscy in rank determination, ‘and in 204 there sa comnprobensive performance comparison of the available Inet for rankeeficent lest squntes problems 3.8. Questions for Chapter 3 Quasrios 3.1. (Hany) Show that the two variations of Algorithm 8.1, CGS ‘and MCS, are athemstialyequlalet by showing thatthe two formulas for 1, Yel the same results in exer athe Quesri0y 8.2. (Baxy) ‘This question wil llstrate the diferenee in nie nical stabi among thr algorithms for oxputing Ye QR foetorea- ion of mates: Householder QR (Algo $2), CGS (Algor 8.1), fand MGS (Algorithm 1). Obtain the Matlab prograa QRStablky.n from HOMEPAGE Matlab QRStabity.n. This program generates random ite ‘35 with userspeifed dimensions mand a atd condition numbe end, computes thelr QR decomposition using the tre slgcithins, and weasures the accuracy ofthe resus. Tt dors this with the residual A~ QL AL, whieh shoud be ‘round machine eplloa © Tora stable lgoeltin, and the orthagonality of Q IGT @~11, whieh shoud also be around =, Run this program for small me- tsi dimensions (such m= Gand m4), mosest numbers of random matrices (samples 20), and condition norbers ranging fm end yp to en 10" Deserb what you see, Which algorithms are more stable than others? Sce if yor can describe how lage [QQ Tl ean be ms onetion of ehoice of lgrithm, ena an © QUESTION 8.8. (Medium; Herd) Let A be m-by-n,m >, and have fll nk 1 (iam) Stow tht (fe J [E18 hs 9 ston were = tiniins Az ~ a. Ove reson fr tis formalin that wo can nse (sto scton 25). 2 (Modiam) What i the coediion umber of the eoetent mates, tn terms of the single vals of A? Hint: Use the SVD of A. Linear Least Squtes Problens 135 (Medium) Give on explicit expression fr the inverse of the oeicent Istria Hock 2-by-2 rai, Unt: Use 2-by-2 Block Gaussian ei Inston, Where have we previously sen the (21) boc entry? 44 (Merd) Show bow to use the QR decomposition of Ato implement an erative refinement algorithm vo improve the acureey of x QuisTION 3.4. (Medium) Weighted lent spun I some components of Ar— Dane move important than othtes, we can weigh chen wi 8 ele factor ‘and solve the weight est squazes pobles min (Ax O)2 sted, whece Dios agonal ete. 
More geuealy, reall tht I's symmetric postive tne, then [ajc = (24C2)/ is a norm, nd we ean consider miniming ||Ar— jc. Derive the normal equations for this problem, as well as the oemulation corresponding to the previous question, Quistios 3.5. (Medium: 2 Bai) Let A. R™" ba positive deinte, Two ‘wectrs wand np ae ealled Acorthogonal if uf Ana = 0. 117 € RO and UTAU “1, then the elumns of U ate sid 0 be ALorthonoral. Show that ‘very subapace hs an A-orthnormal ass, Questios 8.6. (Banye . Bai) Le A hve the form " (2) i yap io al es De Spa ana nate et {ule form Your gorithm should not "Elin" te eros in Rand this equre fewer operations than wouk! Algorithm 32 app to A QUESTION 8.7. (Medium: Z. Bes) A= ns, whee 8 an uppee trian slo matrix, ad and ae column vectors, describe nein algorithm To campute the QR decomposition of A. Hint: Using Gives rotations, your gost should take O(o!) opeatins. tn coaras, Algorithm 3.2 would take O(n") operations QUESTION 3.8. (Medium: Z Bai) Lot 2 € RY and lt P be & Householder slatri such that Pa zfaei- Let Chay-v-yCy-ty be Givens rotations, and Jet Q = Gia: Guna Suppose Qe — fallen. Must P equal Q? (You ed to ive a proof of counterexample.) QUESTION 3.9. (Paxye Z Bai) Let A be m-bs-n, with SYD A — UDV Compute the SVDs of the folowing matrices in terms of U3, a Lara, 2 ahapat, 136, rear Alger 8. alarayt, 4 aatay tar Quustiox 3.10, (Medium; R. Sehreiter Let A be bast unk-k approxi ton ofthe mati A, ss defined In Part 9 of Theorem 33. Leto be the ith Singular yloo of A. Show that Ay is unique i 24 > Qurstioy 8.11. (Basys Z Bas) Let A be meby-n. Show that X = Ab (the Moore-Ponroe pwdoinverse} minimizes AX =p one all m-by-m matrioes 1X. Whos is the wale ofthis iim? Quesmiox 8.12. (Mediums 2. Bat) Let A, B, sad C be mations with di merous such thatthe product ATC is wall deine. Let 2° be the et of matress X mininlang [AN ~ Cl, and let Xo be the unique member of ‘minimiing |p. Show that Xy —A'CB. Hint: Use the SVDs of and 2. Questi0s 8.18, (Medium; Z, Bai) Show thatthe Moore-Pearose pseudoin- vere of satsfes the folowing entities AAA = A, ataat = a, ata = (Atay, aat = (aatyr Quasnos $14 (Medium) Prove pat 4 of Tleoren $3: Let n= (8 8, whee 4 i square and A = UV ies SVD, Lat ingle os Unt ad V = yon Prove he 2a cigoovalies of seo, wth cormsponing unit eigmnvetors Js xe ted to the em of rxangulr A QuEsTOs 8.15. (Medium) Let A be m-byn, m 1, the wstem is underdomped, sud thee are to compe eaejgate elgnvacs with el part his ace the saliton oats wile decaying to mt. In bath cases th ster it ‘gonalzale since the digecvalrs are distinc. When 42" ~ 1 the system is erty damped, there ae (no real eigervaies equal to —2, and has 1 single 2by-2 Jordan lok with this elgevale. Th ther wi, the note 1, he, he elgenenve te iLcondtioned, then the uper ad on the distance in VET If, th reciprocal of the condition number Prof. Fist we show tha we cn sue witht kof gory that As upper triangular in Seu form), with ayy A This Beaune puting A tn Sst or is equivalent to epeing A by 7 Q*AQ. where @Q i nay. I {and ya slgeetos of 4, then Q°z and Q'y ar egrets of T- Sica (G0) G"=) — 17 Q's ~ 92, changing to Schur for doer ot ehange the condion number a. (Another way’ to sa this shat the codon number Ws the secant of the angle O{z,9) between aly, aid hanging = to QE aay to Q"y jst rotates x andy the sae wy without changing the angle Beever hem) So without loss of generality we can asume that A= [3 42 |. 
‘Then =e and y is pclel 0 = [1 ALe(AF— Aa), 08 y= fila: Thos 1 bile : ayia a tt baat = ay VERT = lan Am)“ < ala: MAL~ An) < —\h Tai Nonsymmetse igenvalve Problems 15s By delaition of tse smallest singular valu, there sa By where [6A taut =) outa cag AY gl Lee ran ate Ot A+ 6h Thas| 3, Mg, [basa double egal a, where Al Aza S omn(M ~ Ax) < A 8 deste, Finally, we relate the eonition numbers of the eigervales to the sralest possible condition number [SS~"I of any salty sch that diagnaliaes 4p S"1AS — A~ dagQy,..2y). The theorem sys that If sy elgenle has large conditon number, then S bas to have an approximately equally large canton number. In other words te condition numbers for finding he (worst) eigenvalue and for reducing the matrix to diagonal fem ao nary the "Turonsst 4.7. Let A be dagonalicable with eigenvalues and right end lft eigenvectors end ys, respectively, normalied so 2 ~ [ya — 1 Sup one that S satisfs S1AS — N= dlag(Ny,-n)- Then [Sa IS Ia > rays 1/ly) Ife ohoose 8 — ra...) then [S|] !la < nsmax fit fey the condition mumer ofS 8 within a factor of of ts smallest vale Fora proof, se 6 Foran overview f condition numbers forthe eigenproble, including ge ‘cory ivnrantsubspnees, and the eigenales oerespocting to. invariant subspace te chapter 4of the LAPACK mana [10], a8 well [159,235 Al rth for computing Hse corlition umibers aro seinble in subroatnes {Ferera nd sereen of LAPACK tor by cling the driver rates sgeeus sed ‘goons. 4.4. Algorithms for the Nonsymmetric Eigenproblem \We wil buildup to our wlimate algorithm, the shied Hessenberg QR algo- thm, by starting with simpler ones. Foe simplicity of expasition, we assure Ais rl. (Our fst and simplest algorithm isthe power method seton 4.41), wih can find oly the lars eigenvalue ofA in absolute vl and the correspon ing eigenvector. To ind the other eigonalves and eigenvectors we opply the rower method to (Aol)! fa sone lift oath ale iter serae tom (setion 11.2}; note tht the larg eigen of (A~ ef)" 81/40), hae A ste closest egeralve 1 0, $0 we can howe which egenvales 10 Tal by choosing 0- Our net improvement to the power method ets een Pte a enirlvariant subspace a ime rather in jus snl genset, 4 Apolial Numerical Linear Algebra ‘we call this orthogonal eration (section 4.4.8). Finals, we organize ontbon- ‘nal iteration wo make i convenient Lo apply {0 (Aa) ested of; (his Seale! QR Heration (section 4.4), Mathematically speaking, QR Keration (with «shift a) is our ulate lgosthm. But several problems remain tobe salve co mabe i sliealy fost and reliable fr practel use (ection 45). Seetln 4.46 discusses the Ist teansfrmatio designed to make QR iteration fast: reduelng A trom dense to upper Heasenterg form (aonzero only on aed sbove the frst subdiagonal). Subsequent setions describe how to implament QR iteration ent on Upper Hessenberg matrices, (Stetion 41.7 shows how upper Hessenberg form "Smpliies in the esses ofthe symmetric cigenvale problem and SVD.) 11. Power Method ALcORITIME 4.1. Power methad: Gien zy, we derate repeat tie = An Zin =thaflvesla —— (apprasimateeigenoctr) Raa atadeiss fepracsmateeienvalu) anil eoncergence Let us it aply this algorih in the very simple case when A = day(s, hs with [Ma] > [Mal & o> > [le In this ease the eigenvectors are {ist the eolums'eof te Wdeatity mates. Note that ean alo be wetien = Atza/ Abul ser the factors 1a eal sete #1 to be unit Veto snd do nat elnge Its direction. Then we get a] fad a2 |_| am (ay Aya aX : a(e) ‘where we have assumed ay = 0. 
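In Matlab, Algorithm 4.1 is only a few lines. The following sketch is illustrative (the test matrix, starting vector, and fixed iteration count are arbitrary choices, and a symmetric test matrix is used so that the dominant eigenvalue is real).

    n = 100;
    A = randn(n);  A = A + A';        % a symmetric test matrix, so the dominant eigenvalue is real
    x = randn(n, 1);  x = x / norm(x);
    for i = 1:200
        y = A*x;
        x = y / norm(y);              % approximate eigenvector
    end
    lambda = x' * (A*x);              % approximate eigenvalue (a Rayleigh quotient)
    residual = norm(A*x - lambda*x)   % small once the iteration has converged

A check such as the residual above confirms convergence; the analysis that follows explains how fast that convergence is.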
Sine all the fractions Ay/2y are Fess than in absolute value, A'r becomes more and more realy paral toe, 9 4'za/| Aue beoaes cleser nnd closer to tr, the eigenvector comesponding to the largest eigen Ay. The rate of convergence depends cn tw rah Samal than the ration [Ay/X 2 =~ 2 P/ha re, the sale ce taste Shee converses to), Ss 2 Ax converges to Ay he arses eigenvalue Tn showing that the power method converses, we have made several 8 sumptions, most notably that Ais dagonsl. To sale 2 more geal se, We now assume chet A SAS~ ig dlsgonsllable, with A ~ diagy,--.%y) HE Nonsymmetrie lgenvalive Problems 155 and the eigenvalues sorte so tats] > [Aa] 2 aoe & Pale Wete [si---rl, Mere tbe columns are the eoresponding elgeneetors and lo Sati ja — Hy the last paragraph we had — This ats us wete = (Sag) = S(lay---,M9i") Alo, since A ~ SAS", we ean write A= (SAS) (sas!) = sais since al the $1 pais canoe. This lly let vs write a ay a aX Bay = (SNS) S 5 =anis a nN As before, the vector in beets converges toe, $0 Ary gets lovee a ler to a mutinle of Sey ~ sy the eigenvector corresponding 10 My. Therefore, A= a? Ary converses to sf As, ~ sf so1~ As ‘Aino deswback ofthis metho ithe assumption that ay ~ 0; this trve with very high peobabiiyi yi chosen at random. A major dewwtack tat Ie converges othe egervaue/eigenvectr par aly forthe elgenlueof gest bvolute magaitnde, and its eanvergence rave depends on [¥g/2)| quanlty ‘whieh may be ease to snd thus esuso very slow eouvergence. Td, fA real and che Inegostegenvalve is eompex, there are wo complex conjugate ‘igenvalues of largest absolute value [Ay) ~ [Aland so the above analysis ‘does not work at all, In the extreme cace of an orthogonal moti, al the ‘genvales have the same absolute valve, ness, 4.4.2. Inverse Iteration We will meroame the deuwhacks ofthe powee method jst esrb by ap ping the power method to (A~ of)" iste of 4, whe ocala ‘This wil let us converge to the eigenvalue closest to 9, rather tha jst 2 ‘This method i aed laverseKerstion or the Inverse power inethod Auconsrine 4.2. Inverse iteration: Given x, we erate ino peat win (Anal'2, 23 —Maflveall — (appraiate eigencertor) (aypracinate eiencatu) 16 poled Numerieal Linear Algebra “To ale the emnvergenc, rte that A = SAS“! implies Aa = S{A— ai)S"" nd 90 (At) (Af) 18-1 Ths (Ao) hs te se ‘ignetes a8. wih eresponing devas (A01)"y— Oma)! ‘The same amas before tls sto expe to eaege to th eigenvector corresponding tothe largest geval In absaute valve. Mare spell ‘sume that [yo is all than all he ote [Xa 90 that Oy =a)! isthe largest eigen abolite vale. Also, wit ~ Sync a8 before, aed asume ay =. Then @) fewer! (A= atm = (SA=o1'S-HS s a ant eo} a(t)’ ice || a (tes) whee en ey & Sel te ats (y—2)/(—9) re as han Shon abate ali, the wet in bakes aptahes tsa ay fs cee and car tog mulipl of Se, other coepenng {Das Au lor aA alo conver oA “Ti enn of nes train ent i poner metod i the ability to converge to oy si ga te on tars the si) iy choosing ter ches te iw ciel, ween conver very ly and mate Im y he proximity wares storia poner metho ‘The antes prtelry etre when hve © goo oppo fo ta egerlo td watt ony is coromeding eget (er exemple Seton 5) later esl exn tow to doe sch 0 wit Hoong thedgemelus, ith at we se ig to compute ae toc 4.4.3. Orthogonal Iteration (Our next improvement il pee us to converge 10 8 (p > 1)-dinensionl Invariant subspace, rather than one eigenvector ata time, It sell orthogonal Iteration (a sosesines subspace eration oe Simultaneous Testin}. Auconimine 4.8. 
Orthogonal iteration: Let Z be an np orthogonal mate. ‘Then we tere Nonsymmetrie lgenvalive Problems 19 Yio AZ, Factor Yess = Zip isn wing Algorithm 82 (QR decomposition) ‘ncariont subspace) inant sil convergence ‘Here an informal sus of this method. Assume [3 > [psi IER ty this method sn its analysis are dental othe power meth. When p > 1 we write span{ Ze} ~ span(Yya} = span{AZs}, so span{Zs} ~ span(A‘Ze} = span{SA'S™Z}. Note that SNS = S dlagldf..... Sy va-[ 2} ‘where ¥4 approaches zero ko (ye), an Xs does not approach se Teed, if Xp hs ful ak (o generlization ofthe rssumpon in section 44.1 that a} =0}, thon Xs will hae ful rank too. Write the matrix of eigenvectors S= [bieresba] = [SPP M, ey Sp ~ rece Then SA'S“! ASL I! 1 = AUS-X+ 8%). Thus spnn( 74) converges to span(Z:) — span(SA'S™1Z) ~ span(S,X + $14) > span($,X:) = spanis;), the invariant subspace spanned by the fst pegenvetors, as desired The use of the QR decomposition keeps the’ vectors panting span{4°Z) 0 fll rank despite round "Note that if we foo only the fist

j +1) (00 section 46). Then we wll apply a stop of QR iteration impiety, i, without computing Q oF maliplying by it explietly (xesetion 4.48), This wil ree the cost of one QR iteration from Otn!) to O(n) and the overall eos from Otn*) to Ofn') as desi When Ais syrmetsic we will educe i to tridiagonal frm instead, ducing the cost of singe QR version further to O(n). This isdseussd In section 4.47 and Chapter & 2 Since complex eigenvalues of rel musics occur in conplex conjugate ‘is, we ean sift by 0; and simultane it tues out tht this wal perait us to maintain real arithmetic (oe section 448). 1A is Symmetee, al egenvals are ral, and this not an iso '% Convergence oars when subiagonal entries of Agar “smal enough ‘To ap choose a precical ths, we use the notion of backward st hay: Since pelea to Ay a silty transformation by an oe ‘ogo mates, we expect A, to have roundol ener af sie OA 14 poled Numerieal Linear Algebra ‘ty: Tetra signal ety f Asmar han OAL inv ayn tl beat, so we tito ne Who oper Heber ting yy ao wl abe Aa lok uot tangle ace A ~ [ASA | whee Ay i yp ad Ay and dla ne both Hosen. hehe sivas of Any and Aza may be itu indopenenty tt te sieve of Wh altho aga Socks ae st or ty the ath he fd 4.4.6. Hessenberg Reduction Given a rol matrix 4, wo sok an orthogonal Q so that QUQ™ is upper Hessenberg. ‘Tho algorithm i a simple variation of the iden se forthe QI ecerponitin, EXAMPLE 4.8, Woillstrat th gneral patter of Hessenberg reduction with 1 S:by-5 example, Rach Q, blow fm :by-5 Householder reflection, chose to ‘er out eatees #42 though min olor F nnd leaving entries 1 through hn 1, Choose Qs 50 Qd-lorre ornare Qj Jones the st row of Qi unchanged, and QF leaves the fist column 9f AGT unchanged, incuding the wero 2 Choe Qs 0 Q@ahi-|o F222] md e@agt-lor zee Qs changes only the last cree ams of Ars ard QF loaves the rst t90 eolumns of QoQ unehanar neng the 9r Ti wom och oe tino cath, racine wit he mr | sc a a nw yh {hp apayet tt ace bg! sti ev sgh eu the TRAC at theo dea Nonsymmetrie lgenvalive Problems 165 2 Choose Q 50 Qar=]o zz zz] wd A-Qagh-|o 2227 whic s upper Hlessenbeng. Altogether As ~ (4201): 1Q30203} Qaghe “The general slgorithm for Hesenberg reduction isa los, AuconrTine 4.6. Ration to upper Hessenbeny form: FQ in desir, set QF fori=tin=2 a= Hlose( AQ) PT 2uul 7° Q,= diag, P) °/ AGH Tetin) = Proales Limien) Aen CF =m) = AM ess en) Py FQ is desires QUE Tm sen) = PF-QUEtemsem — #Q=Q-0/ ent f ed for ‘As with the QR decomposition, one doesnot form P,explitly ut instead mules by #—2uuf via metee-vector operations. The ws wetars ea ao be stored below the subdgonal, inlar tothe QR decomposition. They ean bo applid using Lael 3 BLAS, es deseibed in Question 3.17. This algorithm iewalnble a the Matlab common! besa ofthe LAPACK rout sgabra. "The nur of floating point operations Is easly counted to be Hse (O(a), 0 Hs O(a! ifthe product @ = Qu-1-~-Q1 i eamputed as wl, "The advantage of Hessenberg form under QR iteration feta costs only 6x? O(a) ps instead of O(n), snd ts oes preserved 80 that che mate ‘toms upper Hessenberg Prorostniox 4.5. Hessenberg form is presero by QR eration Proof. Itiseay toconfiem that the QR decomposition ofan upper Hessenberg atic like Ayo yes an wpper Hessenberg @ (ine the jth eon of Q ‘61 linear combination of the ling jeolumns of A, ~ of). Then it is easy Toone that RQ remaies upper Hsseabirg aud sing docs not change this, © 106 poled Numerieal Linear Algebra DerinrnioN 4.5. 
A Hessenberg matric H is woredaced sf all subdiagonal re Tseng to ae that His ede because yy — 0, then ts egealues are thse of is leading -b4 Hesenborgsubmates aad is tralia (n—-b: (m8) Hessenberg submatrix, s0 we nted consider onl unredvced tates. 4.4.7. Tridiagonal and Bidiagonal Reduction IFAs symmetle, the Hessonborg redvtion proces leaves A symmeiie at each step, 50 zor are erated in symmetie postions. ‘This means wo noed ‘work on only half the matrix, eucing the operation count to {n° O(n?) or Fre + Of) to form Qu-ts.-Qi as wll Wo cll this algorthrn eridiaponat ‘eduction. We wil us this elzocthmn in Chapter 5, ‘This routine is avaible ss LAPACK rostine ssytra Looking shed it to our dscusion of computing the SVD in section 5:4 ‘we real from scetion 3.2: tht the eigenvalues ofthe symmetric matric ATA tar the squares ofthe singular values of A Our eventual SUD algihin. will, tse this fact, so wo woul like to ind form for A wiih implies that APA is tegiagol. We wll choose Ao be upper bidiaponal, or nonzero only on the Aingonal and firs soperngonal. This, we want to compte orthogonal r= Islets anl Vsueh that QAY is bidiagonal. The agorthn, cll itingmat retion, is very sail Hessenberg and eiiagolrevetion, EXAMPLE LD, eve i 8 bys example of diagonal redetion, wih ills teats the general pater: 1 Gian Qys0 aa=|2 22 2|aivionmasqun=|2 222 1 2 Householder relleton, and Vs a Housholdereeletion that leaves the fit column of Qh unchanged 2. Chom Q 0 Gea] 2522) andvisorime Ar =a] 2 2 22 Nonsymmetse igenvalve Problems 16 Qa's. Houseolderseflection that eaves the fst row of Ay ochanged Ug Houser retin that verte st to ols of Qa change 3. Choose Qu s0 eda =] 9 FE 8) aad Vs = 1s at As = Qe Qs's 4 Householder reflection that leaves the fist two sows of Ay ut hanged. © In genera fA sry then. oe got artogona mations Q fand VV" Vang sich that QV — A upper bingo Nove that aA VEATQFQAV = VIATAY, so a7. hs the some ciggnvalves as ATA Le. Abas the same singular vals 9 A "The cont of this bidagonal eedveton i $x? | O(e!) Hops pus enather 42 O{n®) flops to compute Q and V. This routine salable as LAPACK routiveagebsa, 1eQe 44. QR Iteration with Implicit Shifts le this setion we stow how to implement QR iteration dheply on an upper Hessesherg mtr. The implementation will be mpi the sense that we do ot expltlyeompuce the QR factorization ofa mati H but thee cousteuct @ implicy as n product of Givens rotations and olber simple rthogonal tines, ‘The impli Q theonan described beso shows thst thi impiety constructed Qs the @ we want. Then woshow how 1 incorporates sale shit 1, which is necessary to seeclerate convergence, To retain real arithmetic in the presence of complex esate, we then show how todo. double shi, combine two conscitive QR iterations with complex conjugate sits ¢ are the rest ter this double she is mgin ea, Finally, we seuss strategies or loosing sits @ nd 7 to prove rebae quadratic convergence. However, there hae been recent scart of are situation where cornergenee dors not occur (25, Gt, 0 Bading completely eelible ad fas implementation of QR eration renin an open probles, Implicit Q Theorem (Our eventual implementation of QR eration will depend oa the folowing theorem ‘Turowsst 4.9. lmplit Q theorem. Suppose that QTAQ =H ie warned upper Hessendeg. ‘hen eons 2 through of Q ere determined wuigely (ap 20 sins) Oy the fet ean of Q. 1s Apolial Numerical Linear Algebra "This dhorem implies that to compute Aygn = QFAQ from A, in the QR gost, wll ed oly 10 1. 
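Before seeing how the implicit Q theorem just stated is exploited, it may help to have the explicit version of one shifted QR step in front of us; the implicit bulge-chasing algorithm developed below produces the same iterate (up to signs) without ever forming the shifted matrix or its QR factorization. The following Matlab fragment is only an illustrative sketch: the dimension, the number of steps, and the simple choice of shift are arbitrary, it does not exploit the Hessenberg structure inside the QR factorization and so costs more per step than the implicit algorithm, and the production implementation is the LAPACK Hessenberg QR routine shseqr.

    n = 6;
    H = triu(randn(n), -1);           % a random unreduced upper Hessenberg matrix
    for iter = 1:30
        sigma = H(n,n);               % the simplest shift: the bottom right entry
        [Q, R] = qr(H - sigma*eye(n));
        H = R*Q + sigma*eye(n);       % upper Hessenberg form is preserved (cf. Proposition 4.5)
    end
    last_subdiagonal = abs(H(n, n-1)) % typically tends to zero, with H(n,n) approaching an eigenvalue

If the eigenvalue being approached is one of a complex conjugate pair, this real single shift stalls; that is exactly the situation the double (Francis) shift described later in this section is designed to handle.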
compute the fst column of Q (hie is parallel to the First eturnn of ‘vf al so ean be poten jst by nortan thin enon set) 2, chm other columns of Qs $0 Qe is orthogonal and As i unredoeed Hesenbers Ten by sho implicit Q tore, we know that we wil ive compte ‘Ay corey bast Qs unig up i, which do nae mater (Sets dont mater beeine changing te ign of i ote of , he sane fs caning Ay — ou! — Qf wo (QS), where 8, ~ Hag H- ‘Then Aer (SURMQLSD +o — (RQ, + of), whl ma artiognal start a ot change the sg of clans nd ows of 441) Prof ofthe inp Q teoren. Sapp that Q°AQ Mal VEAV — Coe tnretsend upper Tester, Ql ae orgoa, a! he Bs cols aad V are oq Let (1X), denote the heli of X. We wish stow (0), 2) tora > J,orequvaenty that W'= VQ ~ dag tl, Since W— V7Q, we aot GW’ — GUTQ = VEAQ — VIQH — WHT Now GHW = WI inpia GW), = (CW), — WWM, — TH hgAW)y aaa = GUN Eh bal, Sitoe Wa = U0) and Cs pp Heswen ber, we cn i inveton on show cat (Wao in entre 1 ttn te, W is per nanglt Sine Wf ab tonal, iedagonal = cag(tt-- 21). QR Algorithm "Tosco ow to use the implet @ teorem to eompute fom Ay =A, we we a bty-5 example EXAMPLE AIO. 1. Choose a 1 50 a1 =OFAQ, We discus tow to ose ¢; and below; for now they may be any Givens rotation. The in position (81) felled & bulge sind moat to be gotten Hd of to restore Hesaners fom, Nonsymmetrie lgenvalive Problems 2. Choose =| a 3 Choowe nd Ads wo QA = oa , "The bulge has how chased fom (42) 10 ( 4, Choone 1 1 a= zt and Ae QaaQe + 3) 10 Apolied Numerical rear Alger sow are back 10 uppee Hessenberg frm Altogeties QFAG is upper Hessen, where pans 0 the frst elu of Q is fn, 03- 01%, which bythe impli @ theorem has uniquely termine the other ens of @ (up to signs). We noe chore the first eolumn of Q to be proportional to the fist ealunn of A~ of, [ax 240), This meas Q i the same asin the QR decomposition of Ano, 0s desi. © ‘he cst for an by. matic 6? + O(n}. Implicit Double Shift QR. Algorithm. ‘This section desertbes how to alntan real arithmetic by shiing by @ and tthe se ie, This essential for an eficient practical implementation but ‘ot for 9 mathamstial understanding of te algorithm and may be skipped ‘on fist reins "The results of shifting by o and @ in sucession are Ayal = Qumt, Ay = RiQh bel oA, = QTAQr, Aral = Qift, Ay = MaQr4at soa, = QTAQs = QFOP AQ IQ Leanna 4.5. We ean choose Q, and Qe s0 (2) Q1Q ie rat (2) Aus therefore reat, (8) the first column of Qua i sy to compute Proof See Quits = A, ~BE = RiQ + (2-2), we get QiQehakey = QUA Hoo) Ri = QiRiQuhes + (0 —2}Q.Re (a= ody = 01) + (6 ~7(Ao = of) A} —20Ra)Ao + oP = Mt. ‘Thus (@sQa) lta) i the QRD ofthe rel mateix M and 90 Qa, 8 well 1 Raft, can be chose real. This snes thay — (Qs 3)” (Qi) also re Nonsymmetrie lgenvalive Problems im "The fest enluan of Q,Qo 8 proportional to the st column of 432A + lo, whieh s 4h + aye) —2(Re)an + [oP taylan +en2~2(R0)) o "The rest ofthe columns of QQ are computed imply using the implelt thorens, The process i ll ealled “bulge cheng,” but now the bulge ie by? Instead by, EXAMPLE LIL, Here fs 8 6-y-6 example of bulge easing. 1. Choow QF —[ | where the fist colunn of QF is given ws above, aa PEE tit|amtasatan [1 PE Ets ‘We see that there is 2 2by-2 blue nite by pls sig. 
2, Choowe «Householder relleetion QZ, whieh afets only rows 23, and 4 fF, zeroing out enti (1) and (11) of Ay (ths means chat QE is che entity matrix ouside rows and ents 2 throogh 4) Qt and As and the 2-2 bulge hasbeen “chased one column Choose a Homer mitection QF, whieh afots only rows Sand 5 of QE Ap, zeroing out enteles (4,2) and (5,2) of Ay (this means chat QE im poled Numerieal Linear Algebra 5s the identity ouside rows ad alumns 8 through 3 A= 4. Chon Howsholder releton Qf whieh aets only wows 4,5, rd 8 of QF Ay, zeroing ot ent (53) and (0,3) oy (hs ear tat QE Js th entity mtr out rows and columns ehough An = QPAQ GAs ‘Choosing a Shift for the QR Algorithas ‘To completly specify one Herston of ether snl shift o double shift Hes- senbeeg QR iteration, we nerd to chore the sito (and). eal from the fd of ection 4A tat a resonable choice of sae si, one that resulted In agmpeotc quadrate ccvergece to a rel cleat, Was @ un, the Boom Fight entry of A, ‘The generalization for double shifting fs to ie the ‘reins wich that means and are che eigervalins ofthe ota 2-by- Deorer of i [6 | This wll leu conver to eter two real cigs nth bom 29-2 corer o ingl 2-b-2 blk wth complex ‘onuate eigenvalues. When we are ne to convergence, we expect ty. {sa posh an) obs 0 tht the esas ofthis 2-y-2 are ano approximations for elgnvales of. tne one ea sw tha tis choice kas to quali convergeoe anyimptticall” ‘his means tat once Gerad (ad ponly days) mall enough, fs magaltde wil square at Nonsymmetse igenvalve Problems is ‘ach sep and quickly approch zero, In practic, this works so well hat sverag only two QM iterations per cggavalue ate needed for convergence for ‘not all ates, In practice, the QR iteration with the Frans shift ean fall 1 converge lied, ess oot 100 010 bchngy). So the practical lgorith in use for deeds ol an esexpiona SInI” every 10 shifts if convergence had nt accord Stl, Un ets of tatios where that aor dl not converge were discovered only eecently [8,6 mates in small aighborbood of bono bn od bolo ‘wher is few thousand times machine epsilon, fom such ase. So another osceptioal shit" was recently ned to the algorithm to pach this cae But i still an open problon 0 find shit strategy that guncantos fast. ‘eonvergece forall mato 4.5. Other Nonsymmetric Eigenvalue Problems 5.1. Regular Matrix Pencils and Weierstrass Canonical Form ‘he stead eigewalve problem asks for which sans = the matris A ~ =1 Is Singur; thes sears are the eigenvalues. his notion generalizes in several DevINITION 16, A— AB, ahere A and Bore m-byn matries i called @ rate pene, or just pci More i om indeterminae, not particule, rnerical wae Derixrriow 4.7. IfA end B ave squere and det(A ~ AB) isnot identically 2470, the pncit AB i ated regu Otherwise 1 allot singular. When ANB is repulr, (A) = det(A XB) ig called the earactrsle polyno of A~AB ond th eienvatus of A~ Et are defied to be (1) he vont of 18) =0, (2) 2 (ath mtptity » — dep) $f dtp) AB)e =O if ond only if Py(A~ AB)PaPig'2 = (A= ABy'y = vitae only if PRA ~ AB) PEFR y = 0. ‘The following theorern generalizes the Joedan eAnosieal form to regular runt pene ‘Tunonent 4.10. Welestras canonical form, Let AB te regular. Then there are nonsingular Pad Py, such that P(A ~ AB) = élag(dns@s) Mos fog Mn) — Mags Nan a 16 poled Numerieal Linear Algebra sche Ju) 9 ar m-hyns Jordon lok with eigenelue 2, aL Jot) . 
N fd Ng 8 @ ordan Bock for \ 20 with mutipiity ms," rar ee leas = In, Ad) Por a pron, se [108 Application of Jordan and Weierstrass Forms to Differential Baus ener th ina deel equnsion 0) = A) +020) ~ a. An cele slution sven by 20) — Map + Gea" jteide IC we keow the Jordan form A SJ", wo may change variable nthe iret expotion fo y()— S10) to gett) Jot} + JU with stony) egy | (ee*-S-f (aude. "There I an expleit formula to compute e" oF Saye ineton f(1) a mati lord fon (We sould nse ths fora numeri For the bese 9 betier again, sce Question 1.4) Sopot fis given ys Taso sere fl) = Seg BE aes 8 Single edn lock" A-+N, ware has oes on st supertagnal fed src ewe. Then DOLLY s £0 5» ( : ) ASIN? by the binomial theorem, $220 (4) 0 eng ra ton AOC Jon, were nth at equally we reve the onder of unsmton nd wed ht Ine that NY Dor} > 1. Nowe hat hs ner onthe superdago- tal and ener, Floly, note that 32%, 231) he Toor Nonsymmetrie lgenvalive Problems in expansion for JO%A)/jh Ths a x00) n= LF 7 JO) £0) SB. A : oo 40) Loy 10) so that (J) 8 upper tangular wich /0)/j! on te js superdiagan ‘Tosolwe the more general problam Bit = Ar 0), A—AB regular, we use the Weierstrass form: let P:(4~ AB)Pp be in Weierstrass frm, and rewrite the equation as PB PaPg't — PLAPaP ye | PLf(O, Let Py — 9 aod PLA ~ alt) Now the problem has bata decomposed into subprobiens: In nf) In) An) In) i Bach subprobiem j= JQ) +9(0 = J9-+ 0) sa standard near ODE ‘8 above with solution 0 = Woes [ ePgee ‘The olution of Ju()9 ~ 9-490) is gen by back substitution starting fron the ast equation: write Jy( O19 — 0) a 18 poled Numerieal Linear Algebra on a) fo] fo o} Lim} Lome) Lon ‘The h(a) exquntion $9 0— Yn + Se O thy = —B- Th equation says Bn Oe 8 90 ye ~ Heys = a ts ‘herefore the soliton depends on derivatives of g, nt an integral of 9 8 in the wit! ODE. hos a cotinions 9 which isnot eierentiable can ene discontinuity inthe slain hiss semetimes calle a impulse respons, od ‘occurs nly i thre ae afin eigenvalues. Furthermore, to have continuo Seloton y mst satis certain eorisency conditans at & = 0: Age wo 32 -fermo uneral mets bse 0 inet ang ch ier suet guts DD wh sant Seid Generalized Schur Form for Regular Pencils Jost ae me cannot compute the Jondanform stably we cannot compute its generalization by Weierstrass stably. Inston, we compute the generalized See or, “Tueonew 4.11, Generation Sur frm. Lat A AB be rola, Ten there exist unitary Qe and Qn s0 thet QrAQn = Ts and QLBQre ~ To ave both upper triangular. The exenvaes of ANB are then T/T the ratios of the dingonal entries of Ps and Ti Proof: ‘Tho proof ks very mich ke that for tbe usual Schur form, et X be ‘an egenvlye ad 2 unt right igavector: za — 1. Sineo Ar — ¥'Bx ~0, both ar aad Beare mukiples of the same unit yeetory (even if one of Ar 0 Be is zero). Now lt X'~ [eX] and ¥ = [y,¥ bo unitary matrices with Aust colomas 2 and y, espetivey. Thea YAN = (8! 3°) and YB [5 Bi | by construction, Apply ths process inductively to day — AB. Nonsymmetse igenvalve Problems v9 IFA aul B ace eal, ther general val Schur fee too: real a thagonal ane Qx, where Qu.AQn quasi upper triangular and Qr BQ Is oper rangle "The QR slgorithm and alts einements generalize to compute the ges le (ral) Schur form; seal the QZ algorithm and aalable in LAPACK Subroutine eggee. In Matlab one uses the command #1g(R, 5). Definite Pencils | simple special eae that often arise in proline i the peaeil A~AB, where A= AT, B~ BT, and positive defiaie. Such pens are elle define pencils ‘Tuwone 4.12. 
Let A= AT, and let B= BT be positive define. ‘Then therein rad nonsingular mais X so that XTAX dingo rst) and X™BX ~ diog(ih-ss3h)- In particle, oll te eigenvales 0/8 ee real tnd ine, Proof. ‘The root that we give is ntualy the algorithm ws to solve the problem: (0) Let LL. = B be the Choesky decomposition. (2) Let W = L-1AL-T; ne that I symmetric (3) Let H = QTAQ, with @ orthogonal, wal and dlagosl Then X = 1°79 sates XTAX = QTENAI-TQ = A and X7BX Qiipitg=1e ote that tho theorem is als tre aA + 3B is postive deft for some scalars vad 3 Soft for this proble is alae 9 LAPACK routine 18% EXAMPLE 4.15. Consider the penell K — AM from Example 4.14. This i 9 etn perl snge the stiles mtx Ki ymmetrte and the mass matric AMF is symmetrie and postive definite. In fet, Kis trigonal and AP is Aiagonl inthis very simple example, so M's Chokes factor i lso diagonal, find H— EK" i als synmetie sd tridiagonal. In Chapter 5 xe wil Consider a variety ofngorithms forthe synimctre triagonal egenproblem Now me onsir singular pencils A AB. Recall that A ~ AB is sngola citer A and Bare norsquare or they ore squee and det{ — XB) = 0 for fl lies of A The net empl shows dat azo ated in extending the Setaion of eigenvalues to this ease, 180 Apolial Numerical Linear Algebra sauce 416, tae A= [3 8] and B= 1 $e Then by making ‘early small changes to get A'=( J and B= [1 | che elgenvalves hecome c/s and ex/ee which ean be arbitrary comple numbers. So the geval are infil esto. © Despite this extreme sensitivity, singulae pencils are usd in modeling er- tain physical sytem, me describe below ‘We continue by showing how o generalize tho Jordan and Welastrass foes to singular pencil, In eddton to Jordan snd “infinite Jordan” Blacks, we get two new “singular blocks” inthe eanonial fom, “Tuvonew 4.13. Kronecker eoniea form. Let A and I be arbitrary rita: aguler chy matrices. Then there are square nonsingular matrices Pan ps that PAP APLBPq is Hck diagonal with four kinds of bol Noa Jn(X)-AP 1 mbyem Jordan block waa 1 neby-m Joedan block Nee ; for dap 1 ia sncby-(m 1) sight Singular block ty = 1 nl (om 1) et Um il Singular Block % Wecal Lm at guar ck bess has ght al etn A, foal LE has an saga tl veto. For poo, 108 Jost as Shur form genre o regular mati en in th ast etn, it canbe goneratized to rotrary singular penis wall For the ean form, prtarbtion theory and softwar, [2,75 24 Sngule peas ae ied to model syste ring in systems nd cat Weve 0 exarpin Nonsymmetse igenvalve Problems is Application of Kronecker Farm to Differential Equations Suppose tht we wank to solve BE Az fl, where ANB ism singular psc Wate PBPabgtt ~ PLAPQDyg'2 4 PLS(0) to decompose the peo lem into independent blocs. ‘Thee are four Kinds, oae foreach kind i the Kronecker form. We have alredy deat with J(X)— AF ad Ny Blocks when we considered rgular peacils and Weierstrass frm, a0 we have 10 consider (aly mad LF, blocks. From the Zy bocks we ste [? PE IEE, fe = tm owl) = wi) Rents) + air iy = we toe or asl) = lO) + fant) + gtr finest = n+ Sm OF ns sEl) = thos s(O) + SyCdma7) + gm}: ‘This meas that woean choose ya en arbitrary integrable Function and use the shove recurrence relations 10 gta slut. ‘This becauso we have ope rmgeo unknown than eqution, #9 th the ODE is underdetermined. From the TH blacks we get PSHE JERE Starting with the st equation, we solve to get nm b= nh pa te = tte aed the conssncy condition fnsy = in ~ > ~ en 0 als te 4 Satisfy this equation, thee i sation. 
Heee we ve on mace etion then variable, ad the subproblem Is oerdetermina 182 Apolial Numerical Linear Algebra tion of Kronecker Form to Systems and Control Theory The contotabe subspace of (2) = Az0} + Bult is the space in whieh the system state enn be “okra” by chosing the contol pat ul) tarsi st 7(0) ~0- This equation fs me to mde fb) antl syste here the a(t) is chonen by the contol systrn engineer to rake 2} have cotain esrale propectis, sch ts houndeess. From Pe ase Sn [Paw ‘one ean prove the contellable spe i span, AB, AB, ..,A°-1B)} any ‘ompanents of (0) outside this space eanno by controled hy varying w() ‘Tocanmpute tis spe in pete, in order to determine whether Hie pies! systema being modeled eatin fet be control by input wt), ane apples & QR- The slgortun to the singulsepoval [BAM]. For decals see (77,24, 215 Apo Buls)dr 4.5.3. Nonlinear Eigenvalue Problems Finally, we casi the nontineareigensalue proton oF mats polynomial SOMA = Ng bt AL Ao aa) Suppose for simpy thatthe Ay are n-by- matrices an Ais nonsingular Davtsiti08 4.10. Thecbarsetecstie polynomial of the matri polynomial (4.7) HN) = dott g AL). The rnts ofp) = 0 are dened to be the eigenal- ws. One can confirm that p(X) bas deer dm, so there are dn eigenalues. Snppase that ian eigenonue. A nonzero orton satisfying Soy Ace 0 fsa righ, eignvector for >. Ale eigauerior y is defied enabgensy by Shavva 0 ExaupLs 1.17. Coasler Example 4. once again. The ODE arising were la equation (19) i AFH) 4 B90) + Ka) — 0. If we seck solutions ofthe feem a(¢) — et(0), we get ™'OBA(0) + 4BEl0) + Ka) ~ 0, o8 Niai(0) + A240) + Kai(0) ~0. Thus; is an egenvalve and (0) is as fgonveetr of the mate polynomial BM FAB Ko Sage we aro assuming thot Ay Is nonsingular, we can multiply cough by AG! to get the equivalent problem MT) AEA AEN nb Ay. ‘Therefore, to keep the notation simple, we will assume Ay = (see sede 16 Nonsymmetrie lgenvalive Problems 1s foe the general ease). Inthe very simplest ease where each Ais Iby-l, Ley @ Scala, the ginal mates polynomial equa othe charsetariste polynomial, ‘Wecan urn the problem of finding the eigenvalues of matrix potynonial Iwo a standard eigenvalue probern by using trek analogous to the ane us to change highorder ODE into frcander ODE. Conse rst the simplest fase n ~ 1, whore cach A i «sealnr. Suppose that Is a root. "Then the vector 2 =I V8 anise wae Aa Ao Eta Teta 0 co or Greieoe 0 |e 2” ° a) “ ay ow, "Ths is an eigermertr ad 7s an cigervae of the matrix, which s calle the companion inairr of the polynomial (4.7) (Phe Math rotine roots for nding rots of polyxnil applies the Hessenberg QR iteration of section 1.48 tothe companion matrix C, since this ‘surrey one ofthe mos eiabl, expensive, methods known (8,115, 289, Cheaper alternatives are under development.) "The sme ie works when the Ay are matrices. C bosoms an (= -b- (2-4) tock companion matrix, where the 1's and O's below the top OW became ‘ym identity aod zeeo eats, respectivels. lb, 2” becomes ote te oa whee 2 18 right eigenvector of the matcix polynomial It again turns ot ist Co! = 2 EXAMPLE 4.18. Returning onc agin to 28M + AB 4 KY Wo} AMIS | AP-TIC and th 0 the companion mate Mon Mm 1 0 "This the sme a5 the mate A in equation tof Fxample 4. & © rt poled Numerieal Linear Algebra inaly, Question 4.16 shows how to ase mats polynomials to solve & problem in computational germetry. 4.6. Summary ‘The following ist suai al the eanonical forms, algoithins, threats, ‘nd sppliestons to ODEs desrbed in this ehupter. 
I ako Ines poaters to algorithins exploking eymnezy, although thes are discussed in more detall In the next chapter. Algorithms for sparse matrices re discussed in Chapter 7 salar ~ Jordan form: For sone nosngula 8, ee AM = Saag]... . ou Noa ~ Sehr form: Por yome unitary Q, AMI = QU = AQ", where 7 strangle ~ Real Schur form of eel A: Far some real orthogonal Q, A — MI = QU = AQT, where Tis real quasherangulr ~ Application to ODEs Provides solution of #0) — Ar(t) +0 ~ Algorithm Do Hesenbeng radetio (Algorithm 4.5), followed by QW ierstion to get Schur form (Agorthen 4.5, implemented esr in section 48). Figeasetors ean be compte from the Schur form (as deserted insertion 421), ~ Cost: This costs 10n? fps if egeavalues only are deste, 258 {Tana Q are slo desired nnd litle over 27 I eigenvectors re aso desired. Since not ll pats of tho age can tale advantage ofthe Level 3 BLAS, tho cost setully higher thane comparison with the 2a ost of matrix multiply woul inate: insted of take ing (10m*)/C2n*) ~ 5 ties loge to compute eigenvalues than to ipl mates nen 23 tes longer for n = 1 a 19 ies Tonge form — 1000 on IBM RSG 590 10, pe 62. Ite of taking (2709)/2n8) — 13.5 times longer co expat egeaalics land eigenvectors, i aes 1 times longer fr 8 — 10 a Ties longer for n ~ 1000 om the same machite ‘Thus computing eigen voles of noasymmetsic mato i expensive. (The synmetec ease 1s much chespe; see Chapter 5.) Nonsymmetrie lgenvalive Problems 185 = LAPACIC: ages for Schur forms or age for eigenvalues ad eine eters; egeess oF egeet for ear hounds too ~ Mab: eebur for Setur form or e8g fr cigvalus ed elgonee ~ Exploting symmetry: When A — 4%, better algorithms are di cussed in Chapter 5, especialy ston 5.3 + Regular A AB (det(A~ AB) 20) = Welerstrss form; For some nonsingular Py, and Pr, fae sm ANB = Py diag | Jowda vt ~ Gone Seu fom: Fs ary Qa Qn, A= M8 = QuCty ya whce4 ael Tyae gle ~ Generated rl Serf of lA BF sone tg Owol Qy atl Que A~AB = Qu{Ta ~MihQh, where Pe ‘fn ng tad Tp eal nla Apolention to ODES: Proves slton of Ba) ~ Ax) +5, see the hin sng erin Bt ay Spend ote moot othe data (imple rape) ~ Alin Hester tangle retin flowed by ee tion (QR applied implicitly © AB-'), ~ Com: Campin Ta and Tn cons 9n. Compting Qa Qn dition ests tn Competing econ wel cox 6 le ich nn oil As blr, eel 8 BLAS cot be ted 2 pro th arith = LAPACK:agpe for Snr her o agg or Signals; ages or agers or ror tou Maths 44 foreign nd genectrs Espting ymmety: When A A, 11 — BS nl 6 poste defn, oe can sone. the pote to Sing the eenvalies of snl sonnet rein Tine 1.12 This en [APACK rots sey, sper (for syne mates in pk storage”) , und sabgy (for symmetric band matrices). 1s Singular A AB 186 Apolied Numeral rear Alger ~ Kroneekts frm: For some nonsingslie Parl Pe An ba so " diag | Weierstrass, : Pit fi 1 ‘ da ~ Goneralaal upper twinnglae form: For some unitary Qi aud Qn, A~ AB = QulTa— ATH)Qip, where and Ty ace in sence Upper triangular Tor, with diagonal blocks corresponding to d= ferent pats of the Kronecker frm. Seo (78, 24) for deta of Use form and lgoithins. ~ Cost; ‘The most genera and reliable version of the algorithm can fant as much sO, depenting on the detail of the Kronecker Structire; this mich more than for reg A~ AB. Theresa ‘Sigh less rable O(2) algo 27}, = Application to ODEs: Provides solution of BA) ~ Az(t) +10. 
‘whee the solution may be oerdeteined or underdeteenied ~ Software: NETIIB/inalg/gupte ‘© Matric polynominks S79 AAs (116) = I Ay =F (or Ag is squave and well-coditoned enough to replace each A; by AA), ten linearize to got the standard robles w Aaa Aga or oes oe oy roo ° Geese © |x. o eer) ~ IF Ag conditioned or singular, Katze to get the penel waa Awa ove mda) [Ae Teo: 0 1 Cee oy 1 0 fee 1 Nonsymmetse igenvalve Problems st 4.7. References and Other To for Chapter 4 Fora general dseusin of propectis of eigenvalues and eigenvectors [137 For more deals about porturbation they of eigenalts and eigenvectors soe 1159, 255,51,and ehapter 4 of [10 For a proof of Theorem 7, ee (8. For a cisension of Weierstrass sind Kronockteexnonial form, 6 [108,16 Far thee application to systems and eoatrol theory, sew [346, 245, 77)Por spplcstions to computations geometry, geaplics, al mnnical CAD, see [179,180,163 Foe dseussion of paral slgoelthns forthe noasymmetsie geaprobleny, Se 73). 4.8. Questions for Chapter 4 QuesTi0¥ 4:1. (Baxy) Lat A be defined asin etion (1.1). Show that sbea(a) = [TE ydetAy) ard hen that det MI) — TP deta ~ AD. ‘Conele tht the eta egevales of As the union ofthe es of eigen of Ay though Ay. Quesniox 4.2. (Medium: Z. Bai) Suppose that Ais normal Le, Aa* = AMA Show that i Ais also langue, t must be diagonal. Use this to show that ‘an mty-n matic is toemal fad only I thas » orthonormal eigenvectors Hint: Stow that 4 8 oeral i aud ony if ts Sehur form Is norm Questiox 4.8. (Busy: 2. Bas) Let Nand be dstnet elgenvalics of A, let x bright eloavector for A, a let y bee eigenveeto fej. Sbow that {Pn y re orthogonal QUESTION 4.4. (Medivm) Suppose A has distinc genes, Lat (2) Sj! bea function which scene mt the eigervaleof A. Lat "AQ. T be the Schur form of A (30 Q is unitary and upper triangle) 1. Show that 4(A) = QY(TIQ*. Ths to compute (A) it soe vo be fable to compune f(T). In the rest of the problem you will drive ‘Staple recurrence formula fr (7) 2 Show that (f(T) = f(T) 50 that tediagonal of (7) ean becom From the agonal of T Show that T4(T) ~ f(T. 4. Prom the Inst rest, show thatthe sth superdingonal of §(7) ean be ‘computed from the (é~ I)st and eater subingonals. Ths, starting ‘tthe diagoral of f(T), me ea compute the frst spending exo Supentionsl, sd so 03 188 Apolial Numerical Linear Algebra Quistios 4.5. (Pasy) Let A be» square matcix. Apply ether Question 4.4 to the See foem of Ar equation (15) to the Jordan form ofA to eoztade tht the egenalics of f(A) ate 0), Whee the Ay aze the cigenvals of 4 "This sults ealled the spectral mapping theorem. "Tis questin suse In the root of Theoeem 6.5 and scion 6.3.5. Questi0N 4.6. (Median) tn this problew we will show how to solve the Shbester oF Lyaprnoe cquition AX XB —C, whore X end C are mrby-t, ‘Aiemebyem, aod B ism. This sw spsem of mn lnear equations fr the cents of X. 1. Given the Schur decompeitons of A and By shaw bow AX — NB = C ‘an he transformed into asia system a'Y YB! — where and 1B ae upper angulae 2, Show bow to ove forthe entries of Y one at the by process sak fous to bac substitution. What condition on the eigenvales of sd 1 guarantees tha the system of equations ie nonsingular? 2 Show how to tnsform ¥ to ge the solution X Qquisios 4.7. Medi) Sapo ht 7 = (4 iin Shur frm. Wo ‘wnt to ind a matrix $0 that S~I'7S —( 4 9 J, te tumsut wo ean chooe 8 ofthe form | Show hw ose fr QuusmioN 4.8. (Medium; 2. Bas) Let A be m-s-n and B be n-by-m. Show that the matics ABO) ag (0 0 wo) ™ (a a ce simile. 
Conclude that the nonzero eigenvalies of ABB are the same as Mowe of BA. Questiox 4.9. (Medium; Z Ba) Lot Abe n-by-n with egensalics Show tht SSP apts tas Cursos 4.10, (Meta Bl Lat A be en mate wt lg x 1, Show that A eon be written A= Hf) S, where — 11° i Hesmiian fand $= —S* iy show-Hlermitan. Give expbet formulas for Hand Sin terms of A 2 Show that S22 [RAJ? < Nonsymmetse igenvalve Problems 9 2 Show that S22 [OA < Is. 4. Show that 4 is normal (AA* = AYA) ifand only IEP INP = LAL Quustios 4.11, (Bay) Let bea simple eigenvalue, and lt z andy be right and lt eigenvetors. We define the spetal projection P eorrespnding to X ts P= 2y"/(y'2). Prove that P hs the flowing properties, 1. Ps miguel deined, even tong we could use ar’ nono sear ml tiles a al yin is deition 2 PP=P. (Any mate suisbing P? = P is calle a projection mate) 3. AP = PA~AP. (These propertes motivate the name spectral proje on, singe P “eontaias™ the Flt and eight Invariant subspeces of 4), 4. Pls the condition number of & Quustiox 4.12. (Bay: % Bas) Let A= [fe Show that the condition numbers of the eigenvalues ofA are both equal to (1+ (35). Thus, the ‘conition umber sage the diference ab between the elgenvalues it mal ‘eompared to, the offiagonal part ofthe mati QUESTION 4.18. (Medium, Z. Bai) Lt Abo 9 mints, ¢ be 8 it vector (ila = 1), be a sean, and + — Ar yx. Show that there is mais E with Ee lrla such that A-+ has egenvale and eigenvector 2 Quistox 4.14. (Medium: Programming) In this question we will sea Matlab program to plot eacrlues of «perturbed mstix and thee ‘condition numbers. (Its siiable at HOMEPAGE Matlab/egseat.m) The Inpotis = Input mates, rr = sa af peetsbaton, 1m = numberof perturbed metses to compute The output consists ofthe plots ia whieh cach symbol tthe leaton of an igen of» prt rat ‘ot marks the keaton of each unperturbed eigenvalue, “marks the leaton ofeach perturbed eigenvalue, whore a eal perturbation mats of rem et ade toa “mrs the location ofeach perturbed eigenvalue, where a com ple pertrtation matrix of norm err sed to | table ofthe oigenvalues ofA end thee condition numbers also pete Here are some interesting examples to tr (ora large a ta you Want to walt the lager them the beter, and rw equ to afew hundred is good) 10 Apolial Numerical Linear Algebra (1) = vanda(5) (4f a does not have complex eigenvalves, ‘zy again) (2) = diag (onee(4.2),0) @ettiso a fo2 163 0}; foo 3 10}; foo 1 a} errste-®, 16-7, 16-6, 16-6, 16-4, 16-3 @ taal 3 Crandn($,4)) sasqediag (ones(3,1),1)4q° cs) a ton 406);[0 1 165); (0 0 102, errete-T, 10-6, 50-8, Be-6, 10-5, 156-5, 20-5 @a-t 0 0 0 0 oO; fo 2 1 0 0 oF fo 0 2 0 0 a; fo 0 0 3162 tea); fo 0 0 0 31a; f 0 0 0 0 a err 16-10, 16-8, 1e-8, tend, tors Your assignment isto tey tse examples and compare te regis axeupial by the egenwalics (the so-called pseuoepertran) withthe bounds desebed it section 43. What isthe difference beeween real perturbations snd complex perturbations? What happens to the zeglons oseupied by the eigenvalues 9 the perturbation ert goes to aero? What i iting sae ofthe relons as ert spe to zero (Le, ow many digs ofthe computed egenvalins ate cores)? Quastios 4.15. (Medium; Programming) tn this question we use a Matlab program to ploc the diagonal eis of mates undergoing unsifted QR eration. ‘The values ofeach agonal ae plot after cach QR iteration, cach lingual orresponding 10 one af te plotted enews. (The progeamis valable fa: HOMEPAGE /Matlab/grplt.n ad also shown below.) 
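For reference, the listing below is a minimal sketch of such a program, written here only as an illustration; it is not claimed to reproduce the qrplt.m file cited above, and the function name and plotting details are choices made for this sketch.

    % Minimal sketch: run m steps of unshifted QR iteration on a and plot the
    % diagonal entries after every step.  Not necessarily identical to qrplt.m.
    function e = qr_diag_plot(a, m)
      n = size(a, 1);
      e = zeros(m, n);                 % row i holds diag(a) after i iterations
      for i = 1:m
        [q, r] = qr(a);                % a = q*r
        a = r*q;                       % a <- q'*a*q, one unshifted QR step
        e(i, :) = diag(a).';
      end
      plot(e), xlabel('iteration'), ylabel('diagonal entries of a_i')
    end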
The inputs are ‘m~ nuaber of QR teratons, and the ovtpet «plot of the diagonal Examples to ty this eoe on areas follows (choose m large enough so that th ears either convergn or go int eels) fa = eanan(6): Nonsymmetrie lgenvalive Problems 1 1b xandn(6); 2 = bediag((1,2,5,4,5,6))+ie0(8): 22 TE t):t1 a= 300 1 dbag((1 Seoes(1,5)) .Werdr+(:8)) + (ote (aiag(ones(4,1), Deatagtones(@, DD): ‘What happens if there are comple eigen? In what onler do the eigenvalues appear in the mats after many eran ons? Perform the folowing experiment: Suppose thats n-by-n a symmetric In Matiab, let peem=(a=1:1) This produces Ist ofthe Intezers fom n down to 1, Rum the Herston for m kerations, Let a-a(permper); we call this “tipping” a, beense it reverses the order ofthe rows and columns ofa. Run te Teration again for m Herts, aod agin form aa(peem perm). How does this value ofa compare with the original value of a? You sould not kt ‘be to lege (try m ~ 5) oF ele rondo wil obscure the relationship you Should se. (See also Coolary 5.4 andl Question 5.25) ‘Chang the ene to compute the ere in ich diagonal from is fil vale (do this jst for mato with all real eigenvalue) Plot the lg of this eror versus the iteration rumaber. What do you get asymptotically? roid oft 30 og ploe(e’'#) arta Questi0N 4.16. (Hard: Progamming) ‘This problem deserts sn apiation ofthe ronlnear egesproblem to computer graphics computational seomety, ‘and mecnical CAD; se also 179,18, 163) Let P= [fulaa.3) be 9 mate whose entries are polynomials ia the tyes variables Then det(F) ~ 0 will (generally) dein a twospring system ‘become A(0) = —KaQ), wheee A = dlagim,...,2) aod eke be Ste kathy hs K ate ate ke Since AF Is nonsingular, we can rewrite this as (2) — —M-ACa(), If we sek solutions ofthe form (0) ~e™2(0), then we got e™72x(0) = —AI~Ke2(0), or M~'Ka(0) = ~72x(0). In other words, ~7 i an cigevalue apd 2(0) i fn eigenvector of MK. Now MIA is ot generally symmetric, but we ccan make it symmetric as follows. Define M1? — ding(m;’®,...,ma”), ane toutiply A R(0) — —74(0) by AP on both sks to got A (0) — MERE) = 920) or Ro — —y%, wheve # — M1/29(0) and R= MIRAI, is cay to oe i : wa At ‘Symmes. ‘Ths each eigenvalue ~72 of Kis ral, and each eigenvector 2 MP0) of Ks onthogoal w the ater ‘The Symmetric Eigenproblem ane SVD. i In ft, K ism tringona mate, a special frm to which ar symmetric rusts can be rad, using Algoitrs 46, specialaed to sytaetic antsies ‘8 dseibed in setion 4.1.7. Mest of the lgorithins i setion 5.8 for Badin the elgenelas and clgeaveciors of symmetric matrix assume that che mate Is italy Deca reduend to tediagonal fora. "There is another way’ to express the solution to this mechanical vibe on problem, using the SVD. Dene Kip = dlag(hy,.-.hy) and KY? ing! 4). Then K ean be fctored a6 K= BADD, where 1 «8 can be confirmed by 8 small seuaton. Thus arly = PRK pT? = Wah?) aa 8) (BK) art BK cet. 6.) “There th nga vals of — M01 ae tbe sq rot of he eign ff ad th tiga ets of are the gnc of SSciown'n Thonn 13, No that @ is mmr only on te main goal ‘onthe ft sperngpel. Sich mst ar le bynes t grt othe SYD benny cing the ate ool fo, ng Uh lit in sein at that th actrztion K = GOP ipl tat, Ki postive dfn since Gi woingan, The th egsles 9 o Kae all pe “Throgs, ad the eons ofthe rig eel egtion X00) areaxlasry wth rene Fra Mathb sekton of vibntng, masesprag. stem, xe HOMEPAGEYMatibjammpingm. 
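As a small numerical companion to this example, the sketch below builds a stiffness matrix K for a short chain of masses and springs, forms K-hat = M^(-1/2)*K*M^(-1/2), and computes the vibrational frequencies. The masses, spring constants, and the sign convention in the difference matrix B are made-up illustrative choices, not values from the text, and the code is independent of the Matlab files referenced in this example.

    % Illustrative mass-spring eigenproblem M*x'' = -K*x (made-up data).
    m    = [1 2 3 4];                       % masses (illustrative values)
    k    = [10 20 30 40];                   % spring constants (illustrative values)
    n    = length(m);
    B    = eye(n) - diag(ones(n-1,1), -1);  % one sign convention for the difference matrix
    K    = B * diag(k) * B';                % symmetric tridiagonal stiffness matrix
    Mih  = diag(1 ./ sqrt(m));              % M^(-1/2)
    Khat = Mih * K * Mih;                   % symmetric, same eigenvalues as inv(M)*K
    [Qhat, L] = eig(Khat);                  % Khat = Qhat*L*Qhat', L = diag(omega.^2)
    omega = sqrt(diag(L));                  % vibrational frequencies
    % each normal mode oscillates like cos(omega(j)*t) and sin(omega(j)*t)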
Fora Mala asian ofthe we tana sar pal tn, detours msc beng © 5.2, Perturbation Theory Suppose that A is symmetie, with eignalues ay 2-2 am ad ere Sponing wnt eaves gh-vfo. Suppose Fi ap syne, ad et A= AY E have perturbed eigenaltes dy > --- 2 dyad corresponding pee tured elgenseecors gh. sdq- The majo gal ofthis Seton sto boul the ws Applied Noserical Linear Algebra Aiercoces bexween the eigevalas a sald, ad between te eigenvectors fad in terms ofthe "sb of E. Mast of out bounds wil use] the sae of , except for section 5.2.1, which discusses “relative” perturbation theory. ‘Wi already derived our fist perturbation bound for egervaus in Chap- ter 4, whore we proved Corollary 41: Let A be symmetric with ewencaes 12 Dag. Let AH be symmetric wth eigenvalues &y >= > dye If ‘is simple, then os a < Ila (1B) This result i weak because it asses ag has multiply one, and it is ful only for suficently small Js. ‘The nest here eiminates both sesh. "Turon 5.1. Weyl, Let A ond B Be n-byn symmetric matress, Lat a 2 ay be the eigenelues of A and dy 2 oo" > ty be the eigenvalues of ASASB. Then le ~ 6) < [Bl Conouiane 5:1, Let and F be arbitrary matrices (ofthe sume size) where 2 +> oy are the singular values of G and of > of, are the singular falues of CP. Then ou ofl = Fla Wo ean aso Wee's theorem to got error bounds for the egenvaes com pute any backward stable algritie, sigh a8 QR Reeations Seago itm comptes eigenvals dy that are the exact egerlins of A— AYE svhere [Ej — Olea. Therefore, thei eros ea be boned by ayy) © Tela — Of) Als = OC) sua fy This is very satisfeetry errr boul esl fo large eigenvalues (those a near [in maga), since they will ‘becomputed with ost of thee digits correct. Smal egeavlucs Ja < [) may ve fewer corte digits (but se ston 5.2.1). Wo wll prove WeyT's theorem using snother useful cassie! res: the CCourant"Fischer minima theorem. To stato this theorem we eed 40 into- doce the Rayleigh quotient, which wil ako play sn important roe in =xveral algorithms ch as Algorithm 5. DerINition 5.1. The Rayleigh quotient of asymmetric mats A and nonsero enor wi pA) = (a Anu) ere are some simple but important properties of pu). Fist, pA) — (0,A) for any nonzore sear. Second if Ay ~ aig, then 2A) — More generals, suppose Q™AQ ~~ dioga,) i to egendecomposition of ‘A, with @ = [tyes Gl» Bspand win tho basis of eigenvectors qs follows = QF) = GE = Fpa Then wecan write sway LOA ENE Voth Pee ita aise iors Incaer words, A) sa weg average ofthe epee o A. Is rps vale, masa tA), cece foe (= 4) ad es pe) ‘The Symmetric Eigenproblem ane SVD. 190 Tes smallest valve, ing-g pv), oncurs for = ay (=e) and equals plans A) ay. Togeter, thse foes imply snag lt A] = xe, = Ll 62) ‘Twwonsae 5.2. Courant eigenvalues ofthe symsnetrie matric A and y+. the corresponding wnt eigenvectors lh MEA 05 a ‘hema in the ft expression for a i ovr aj densiona ub gues of and the saben minima i ner ll once ets 7 ih the subspace. The maximus i etained for RP ~ santas), a @ minimising v8, “The minum inthe second expen for a is oer al — + > dimensional sutgaces 7" of By andthe uboquent main oer a rancor rctrs 8th the subsp, “The minimum sated or S* >"? = Spay jy), rd 8 masing 8s 8 ~ 4. BxaMrue 5.2. Let § = 1,50 i the lanest eigenvalue. Given RE, ¢r,) Is the same for all nonaera » € RY, soc all such rare sealae miles of one ‘other. ‘Thus the fst expresioa for at simplifies to ay ~ mayo pl A) Similars, sien n — j + 1 — the only subspace 8° 163, the whose Space. 
Then the seca expression fray also implies toa = maxe-« 4) ‘One can silty show that the theorem spies to the ellowing expres ‘Son foe the stale egenalue: ay = mine-aAts Al. © Proof ofthe Cournt- cher mininar theorem. Chonse any subspaces RY nd S77 ofthe indicated dimensions, See the sum of thee dtaensions J+ (w= 941) — net exces m, there must be a nonzero vector ns € Reset Thus cimin, lr. AY ipl. A) a) ap Eel Nest choose $79 = span(gos dy) 80 that eg a, AL Se oA) 5,4) oe Las tlar ome De, Tins, the lamer and upper hounds are sandwiched betwen a below al 14, aw 0 they ms leq 8 desired. Exanru 5.8. Figure 5. illusteats this theorem graphically for #by-8 nae ‘rics Sino pu/lal,} = pts), we can think of A) a «funtion 00 ‘he unit sper lla ~ 1. Figure 5.1 shows contour pot ofthis funtion on ‘hunt sphere fr A ~ dag(,25,0). For this simple mate q — ey, tho ith alu of the identity mates, "he igure f symmetre about the oda since (HesA) — km A). Th seal rod cles ne eq, sone the global as ‘nm fh.) 1, a the snl geen cic reae stro the plot tmininium fq, A) ~ 0. The two grea cree ane contours for pA) the send eigealie. Within the two narmow (gee) “apple slice” dened by the great eles, ou, A) < 25, and within the wide (red) apple ses, Hay) > 2 "et us Interpet the minis theorem in teins ofthe fgure. Chocsg 8 space R? is equltlent to choosing a great cree C: every’ point an Cs Within R2, and R? consists of all salar multiplestons of the vectors in C. ‘Thus ming pene pl, A) = mideec (A). There are four eases to consider to compute mityec APA): 1. C doesnot go through the intersection pnts ofthe two grea cies in Figure 31. ‘hen Cclsaciy mast interseet bath narrow pron apple slice (a8 wel asa wide red apple slic), 80 min pA) © 25. 2. C does go through the two intersection pints se and oterwie es ia the narrow green apple sles. Then mityec of, A) <2. Tho Symmetric Pigonproblem and SVD. am Tarspoert ponie i N Fig. 5.1, Contour lt of the Reseigh quien de wit apr 3. € does through the two intereetion points -tqe and otherwise les in tho wide red apple sees. Then minegc (ra) — 25, attained for rat 4. € coineides with one ofthe two great circles. Thon pr, A) = 25 fr al ree. ‘Tho minimax theorem says tat a2 ~ 25 & the maximum of mia-co pr, A) ‘overall choles of great elrcle C- This maximum I attained In eases 3 and 4 above. Ta pareular, for C bisecting tho wide ea apple slices (cso 3), BF = spanianta)- Software to draw contour plots Like these in Figure 5. for an sritray Shy symmetric. matric may te found at HOMEPAGE/Atalab/leayleighContoursn. © Finally, we can present the prof of Wels theorem. was Bw . aw a Be © Ste Te, ‘ ty the xin: theorem, cat te) eation (52) :+ (Blaby the minimax thooen agin Reversing the roks of A and A+ #, wo also got a5 < | |. Toner, thse 186 inequalities complete the prot of Wes!s theorem. 'A theorem closely rested to the Courant- Fischer minimax theorem, one hat me will od later to justify the Bietion algorithm in seton 334, Spfvester’sthenvem of inertia, ome Applied Noserical Linear Algebra Derinrmion 5.2. The neta of a symmenie mar i the tile of teers Inertia) = (v6), ater v lathe munber of negative elgenatues af Ay the munaler of 0 eigenvalues ofA and 6 the member of pase eigensalues oA LX is orthogonal, then NTAX and A are similar snl 50 have the same ciggnvalies. When ¥ is ol nosing, wey XT AN and A ae eomgrent In thiscaso X AX wil genevlly not have tho same eigenvalues as, but the rst theorem els Uh the two se of egestas ve the Same signs ‘Twwoneat 5.8. (Sylvester's Inertia Theorem.) 
Let be symmetric nd X be wonsingias. Then A and XTAX have the seme inertia. Proof. Let n be the dimension of A. Now suppose that A has » negative cigenvlis but that XT AX has” < e negatve egenvaless wl Sind 8 fnatradietion to pee chat this canst happen. Let N be se coresponding 1 dimensional negative eienspace of 4; ix, Nis spanned by the egenvetoes of the» negative eignvalnes of This means tt fr aay nonzero 2 € N, aT Ar < (Let P be tho (n—v!}-dimensioaal mgnagative elgempace of XTAN; this means that for any nonawo 2 €P, 2° XTANw > 0. Since X 1s nonsingular, the space XP is also n — of dinensiowal. Since dim(N) + din(XP) — 0 4 n—v > n, the spoces Nand XP oust eontsin a nonzero ‘wetor zIn tein ntersection Buc then D > 2" Az see 2 © N and O-< 2" ae since x € XP, which im eattadietion. Therefor, »— i ey A and XTAX have tho sare number of negative eignvaloes. An analogous argument shows they Ive the se mimber of postive cgennlis. Thos they mst also have the same numberof mero eigenvalues. [Now we corde hom egemetors cn change under perturbations af A +B of A-To sate out bout we need o dee tue gap in the spetru Derininion 5.8. Let A have eigewelves ag > => x ‘Then the ap bet ‘an eigenslue a, and the rst of the spetrum ts defined to be gape, A) — rnin [ay —aul We wil ao write gant) A i enderstod from the contest ‘The basi sult that the senstvty of an elgenwetor depends oa the gap of ts conesponding egensalue: a smal gap implies seaitive elgenvetor. xnrun BL A[2F Land AEB [Ef mtr << g ‘Tos guid) — 9 ~ wid 4B) fori ~ 1,2. The gevetors of Aare fine nee atl qs en Amal computation revel thatthe eevee wea Bare oo [ TO] [3] "The Symmetric Eigenproblem ane SYD 2m coe [., re wire = 1/2 norma toe. We e ot the ple betwen the erturbed vectors and unperturbed vectors q equals 9 to Ast order ine So the angie is proportional to the reciprocal af the exp 9. © ‘he general cas is esetilly tho same asthe 2-b-2 ens just analyz “Tuvoraat 5. Let A= QAQT = Qulag(a)Q” be on exendcmpastion of A tet AE — A QAG! be the prtartedeiendecompontion Wet Qing and Q — Ucn, eer g and ae the wpetered end petabot unt eigenvector reset. Let denote he ace nae teen ‘wand Then 1 ets danas provided at 4994, A) > 0 3S ap a one Siniaty Fainzy < Ele provided that sop, A) > 0 Bent S pA TB Note that wen 8-< 1, then 1/2128 sn 0 ‘The attraction of stating the bound in terms of gap A + #), a8 wol at s9p{i, A) is tat frequently we know only the eigenvalues of A since thay fare typialy the output of the eigenvalue algorithm that we have used. In this eae i is straightforward to vate gp, A+B), wheres we ean only sstimate gupta). ‘When the fst upper bound exceeds 1/2, Le, ly > empl) /2 the ound reduces to sin 29 < 1, whieh provides no information about @ Here Is why we cannot bound @ in this station: IT Is this large, chen A Be égenvalve dy could be suleiently fr from a for A+ # to haw & multpie fégenvalve at a For evample,coasder A — dlag(?,0) and Ay ET. But Sich an A | 2 doesnot have a unique eigenvector Indcd, A = has lay Weetor as a eigenvector. Thos, tans no sense to try to bour 8, The saune considerations apply when the second upper bound exexeds 1/2 Proof It suiees to prove the fist upper found, because th seond one folows by considering A+ Bas the unperturbed matrix and A = (A+) = ‘9 the perturb mae Let 4, + be sa elgenwetor of A | B. To mabe d unique we impose the resceton tht It be orthogonal to 4 (writen q) a shown below. 
Note 2 Applied Noserical Linear Algebra that this means that qd nota uit veto, 50 = (a+ d/l Then and Udy aed ce ladle ae le oy Now write the th colamn of (A+ BXQ = QA ws (AF Ba +) asl, on where wo have also mip eech side by fq + a. Define 1) — 8 — Sobrract Aq ~ ai from both sides of (51) and rearrange to got (A- odd = (nf — Ela + 63) Sinoe¢f(A asf) ~ 0, both sides of (5.5) are orthogonal og. This ets us rite = (nF —EXg 4) — 3,Gq and d= $5, fis Sino (A aslo, (as as, woean write Anal Tosa Loe OF Bhat e 4S Dee Ths since gap A isthe smallest denominator "The Symmetric Eigenproblem ane SYD 205 Ik we were to use Wey! hore and te triangle negli to bound Fels < (Ula fad= fala S21 ose, ten we cold conde tht ind = BIE Lanta. Bt we ca dite bere chan hs by bonding ea — eat — Ba + “ls mor carey: Makiply (5. by af both ses, cone temas a reareange to get 94? Eta 8). Ths 5 On BG +) = (4+ Oil Bla + lg +6) (ato ~ DEG +a, and ssa < Mae =I [Balt We aim that We dg? = Me I al Cee Question 57). Thus Is < lay} ALE 1a 0 let difhels _ stele lela tang < lela pl aot a5 Spt a) sana) = Wee 00 peony — ain Bao ah > et Hngemee = nae ded, ‘An analogous theorem can be proven for singular vectors (ste Question 5.8). ‘The Rayleigh quotients other nie properties. The next theorem tells us that the Rayleigh quotients “best approximation” toaneigenvale nie tral sense, ‘Piss the bss of che Rayeigh quotient iteration in section 53.2 and the iterative agrthms in Chaper 7. It may’ also be used to ovate the ‘accuracy of an approximate cgenpai obtained in ay way ml not jon by the slgoritins darned ee. “Tawonsnt 3.3. Let A be mnt, # be wit vcr, and ea alr ‘Then has on eco Avg fu © [AB Cen the ate 3 phe A) minanies |r Ba With ie more sora tte seta of A cam ge tier tonnde ete Aroha a et be thee of 4 cnet ty ttn! misled thsnraonon th gop defied ere Ua be he ate ol eer sand. Then lal ‘ sng tle os) = ta lo — ea < H si tox oteay He 6a Se Thor 7. for 8 goerzton oti st seo gees. Note tnt nego (37) the dence betwen the Rag quate A) andi os proportional ot are trea wr 20 Applied Noserical Linear Algebra [rl This tigh accuracy isthe bass of the enbie canrngence ofthe Raleigh ‘quotient eration slgorithm of setion 3.32. Proof. We prove only the fst result and kave the otbers for questions 5: find 5.10 a the ed of the chapter, 1 ian elgeavalue of A, the results immedite. So assume instead thot AWBIs nonsingular. Then 2 = (4 ~ B1)-1(A~ ge and 1 [ala = MAA) Aaa A= QAQ™ = Qalngfan. 0919, we Writing 4's eigendecompostion we a= 31 Na = Qa a0" 'Q"lle= MA BMa = 1/ nls — 8, 0 inj ~ 3 fA A 8 desi. To show that J — gz, A) minimizes |x — lly wo wll show that Is ‘orthogonal to Az ~ pe, A). s that applying the Pythagorean theorem co the ‘sum of etogonal veiars Ae — pr = [Av — pla, Ala + [ie A) — a vials lla asl2 = ae — pte, Ael2 + Mola.) adel > Ar ~ pla, Adel ith qualty only when 9 = a A ‘Tocontrm orthogonality of and Ar pry Ae we nod to verify Ut Pike dd le Ete aes 0 cos EXAMPLs 5.5. We illustrate Theorem 5.5 using 8 marx fom Example 5:4. Lat A=[TE" | where O- a Tage dinan s a elgape TaN * "The Symmetric Eigenproblem ane SYD 0 Proof, Lat = fa, HW = A~ buh, oad P= 6 — WBA &X PX NFAY at) 1), Note ut ‘Tins Hay =~ and (H+ BY(XG) = 0 0 that Xi an egenvectoe of 11+ F wih egenslve 0. Le be te seule sale between gad Xf. BY "Theorem 5.4, we can bound a ow, Wo have 2 = flrs Now gap, 7 F) isthe magnitude of the small st oneco cigenvale of Ht. 
Since NUH +P) — XTAN ~ al has ‘dgenvalves 6, ~ dy, Theorem 56 tells us that the eigenvalues of f+ Fi in intervals rom ( — e)s ~ a) t0 (1+ es) ~ ay) Ths gap tl + F) > (eae, XTAX), ads subsisting nto (3.10) sks fee eer aid = nm 28°" Tat TART” Toate Tan Ot) 1 : dana = 6.10) Now let be the acute angle between Xi, aid $0 that @ < Oy + Using trigonometry we ean bound sind = [(X— Nil < |A'— Fla Cad so by the triage Inequality (see Question 5:13) 3 i sina sind fanaa < $sinam + $sinam Tartan Tan 1 8 desied, ‘An analogous theorem ea be prow fo singular vets [BxAMPLE 5.6. We agin ensder the massspring stem of Example 5.1 ad te to show that bounds on egeralis provided by Wes tren (The fre 5.1) can be mich worse (Joo) than the relatve™ yersion of Wes {orem (Theoren 5.6). We wil lo see thatthe eigenseetor bound of Ti. rom 57 can be much bette (ighte) than the bound ef Theorem 5. ‘Suppose thst M = dag(1, 100, 10000) aod Kp ~ dag(1000, 1001). Fo: lowing Bxample 5.1, we deine K = BKB? and R= M-V2KC M12, whee 210 Applied Noserical Linear Algebra snd 50 : oi 10 keartewarie—| 10 Lo 00 <0. ann To five decimal places, tho eigorvalues of Kare 10100, 1.0001 an 0006. Suppose swe now perturb the masses (mn) and spring eoustants (ka) by at ‘most 1% each How mich ean the egenvanes chang? Tho largo mix entry i Ky and engin my to 9 ane fp to 10100 ill ehange yy {o about 10805, » change of 205 in norm. This, Wey!'s there tall oh seh elgenelue ould enue By as emieh as 2205, whieh would change the ‘tulle two elgenelts utely. The egeavetoe bound from Theorems 5. also Indicates thatthe corresponding eigenvectors could change completely Now lt us apply Them 56 10 K, oe setully Conalay 82 10 @ = APTEBICY?, whore K — GGT as defined ia Example 5.1. Changing esc ‘mass by of most 1% 8 equivalent to perturbing G to XG, whose is digo with diagonal entes between 1/99 ~ LMS aud 1//EO1 ~ 95, Then Corollary 3:2 tells us tat tho lage values of G can change only by actors within the invert [996, 1.00), 20 the eigenvalues of A can change only by 1% too. In ther words, the smallest eigenvalue can change only in its sacoad ecimal place, just lke the largest eigenvloe. Similarly, changing the spring coustants by at most 1% is equivalent to changing G to GX, and again the ‘gcovalcs cannot change by more than 1%, Ifwe perturb bath Mar Ky at the same tine, the eignvales wil move by about 25%, Since the geal ee Aller so mveh in mgt, thei reatve gape are all qi age ae so thet ‘iggmeetrs can rotate ony by about 3 in ang 0. Fora diffrent approach to relative ceror analysis, mor suitable for matrices seisng from diferent (“unbounded”) operstos, ee (59) 5.3. Algorithms for the Symmetric Eigenproblem Wo css variety of algorithms forthe symmetric eigenprobler. As men- ‘ion inthe ntrodation, we wil discus only det methods, leaving erative ‘metho for Chapher 7 In Chapter {on the nonsymmetee eigenproblem, the only algorithm thot ve dscused was QR iteration, wich cond fd al the eerie and o9- ‘ioral all the eigenvector. We have many tore algorithms alae or the symmetric eigenproblem, which offer us more Dexibiity aad einey. Fe ‘sample, the Bisertion algorithm deseribd below eas be used to ad only the figealcs ina useespecifed interval esa ean do so much faster Wn it old find al the egenvalies, All the algthms below, exogp Raylegh quotient Heratin and Jacob's method, sume thatthe mutes has Ast been reduced! to trigonal frm, ‘The Symmetric Eigenproblem ane SVD. au ‘sing the vrition of Algorithm 46 inset 44.7. ‘Pie nn ntact of 4? 
lls, oF fr Hops eigenvectors age ao dese 1. Brdhagonal QR iteration. This algorithm find all the elgensalues, ‘and optionally al tho eigenvectors, of a symmetric tridiagonal mati. Implemented ficients, tis curently the fastest peetical method to fin all the eigenvaics of a symmetric tridiagonal matrix, taking OU?) flops. Sine reicing dense matrix to tridiagonal frm costs? ops O(n) rege for large eng n- But forint igen ‘wll, QM iertion tales a itle ewer Gi? lps on average al only the fastest elgit for siall mati, up to about m— 25. This is the algorithm uversing the Matin comma! wig snd the LAPACK routines seyey (or dense matics) and este (fe (dlagonal mares) Rayleigh quotient iteration, This algo nderties QR Herston, but we present it separately in order to more easly analyze is extremely rapid convergence and Because it may’ be ws an algorithen by isl. In fact, it generally converges eubieally (as does QR tration), which roan tht the nimber of cores digits snphotiealy triples wt ach ‘5. Divide-aul-conguer. ‘Piss enerently the fstest method to fad ll te égenvalnes and egeaveeoes of syrnmetee diagonal matziens large than 2 — 25. (The Implementation in LAPACK, este, defaults to Qt eration for smaller matics) ln the worst ease, dividend eonquer gules O(0) Hop, but in estloe the constant is quite small Over & lege set of random test eases, It ‘appears to take oaly Of!) Hopson average, and as low a Oe!) for sme eigenvalue dlsttbutons. a tor, dividend conquer could be eplamesnted to run a O(log) ops, wie p is stall integer (129). This superfast implementation tes the fas utipole rete! (FMM) (122, origi weed for the completely diferent problem of computing the mutual Fores on n eee teleallyeirged particles. But che eamplestyof this superfast imple- mentation meas that QR iteration ls eurenty the algorithm of holce for finding all elgenalues, and dvide-and-conquee without the FM Is the mothod of coie for Knding all eigenvalues and ll eigeavetos 4 Bisertion and inverse ieation. Bistion may be ose to find just subee of the egeavales ofa syametec tridiagonal mate, sa, toe ‘an ites [bey Ht needs aly Ofek) ope, where ks ae cyanate agen vin appegeaea a Applied Noserical Linear Algebra number of eigenvalues desea. Thus Bisetion can be much ister than Qt iteration when k = [emci 2f] dBi Tre pate ate ts to ofthe alr mas D+ poo” where D [| ages, pte ba ae, an ara yer Teeth wl ese With tenth he ding Bydo of D coh dese th Totals aes oD tou sume itt DCA wocmaglar, so compute theca pl se de D + pul” —M) =Aet((D = ANU + (D—APaw), (18) Since D— AP 5 noasngula, det(l + p(D — 3)~tuv?) = 0 whenever 4 5 an fSgeavalye. Note that 1 (D—AJ-Tua? is te deity plus 9 rake mae the detertnant of such mati cary to compute Leanna B:te fa andy are wnctars, det + xy") = 1+ ya, ‘The prot is let to Question 5.4. Terfore eth AD-AI ea) = Lp DI a= 10S AE = 40, (5:19 ‘The Symmetric Bigenproblem ae SVD. a0 Fig. 5:2, Graph of 0) oe tt hse te and heigl rte te eal ear putin 3) =. ial ve dict dll 0 ho gre es), fonction J) bs the eaph shown in gue 2 (nwt p> Oh Site ca we, Ge hn y 1 bo hol tymplte, aad he Ins A= ave vertical asympotes. Since /"(A) = p32. gZige > 0, the faction isi ising ecto Th ro 7) ar tered ty th thre oer to th gtd =n Fg 3) Utp-e0, then J) dren nd here Woe ae ot 1 the a ay) Soe (0) rnotote and stout ote eral de) ps (ond son of Nein rnd ia cone and money te eck vou gv tang pl (de We dct el ter this mln” Alle ea oy bere he pre Neon ener it bomfed nae see pelea. 
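Before turning to the properties of this function, here is a small hedged illustration of equation (5.14): it evaluates the secular function directly and brackets one root in each interval between consecutive diagonal entries using Matlab's fzero. This only demonstrates where the eigenvalues live; it is not the specialized solver described below (nor LAPACK's slaed4, mentioned later in the text), and the data d, u, rho are made up.

    % Eigenvalues of D + rho*u*u' as roots of f(lambda) = 1 + rho*sum(u.^2./(d-lambda)).
    % Illustration only; assumes rho > 0, distinct d sorted decreasing, all u(i) nonzero.
    d   = [4; 3; 2; 1];                    % diagonal of D (made-up)
    u   = [0.5; 0.4; 0.3; 0.2];            % made-up vector with nonzero entries
    rho = 1;
    f   = @(lam) 1 + rho * sum(u.^2 ./ (d - lam));
    n   = length(d);
    alpha = zeros(n, 1);
    alpha(1) = fzero(f, [d(1) + 1e-10, d(1) + rho*(u'*u) + 1]);  % root to the right of d(1)
    for i = 1:n-1
      alpha(i+1) = fzero(f, [d(i+1) + 1e-10, d(i) - 1e-10]);     % one root in each (d(i+1), d(i))
    end
    % sanity check against a dense eigensolver:
    err = alpha - sort(eig(diag(d) + rho*(u*u')), 'descend');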
Sas euane J) 20d 70) Cont Oe fg, Ning ome igs ets (0) foe td oN ‘eigenvalues of D + peu costs O(n*) flops. Teal ey dc expen for he gnc of D+ at Laan 5.2. Ifa tan eigensle of D+ pun, then (D—a)2v i ts eigen swetor. Since Da is dingo, this costs On) flops to compte. Proof (D+ pa? (Dalya) = (D~al+al-+ pus! \(D mal wh a(D — al)" lpn" (D— al)" 20 Applied Noserical Linear Algebra wt a(D at since pu? (D — af)" 41 = fla) = 1 o|(D ~at)'a)asdesiral. Dyaluatng this formula fr all m eigenvectors costs Ofn) Hops. Unoe- tunnel, this simple formula for the igeneetrs is not nureiealy stable brnose two very clase vale of a, can ret in nonorthogonal compat iggmetors ui. Finding «sable alternative took over dee from he oF nal formulation ofthis algorithm. We disuse details Later a this eto, “The oval lgoith i reussve Avcontrint 5.2. Finding eignnalues and eigenvectors of «symmetric tri ‘agonal mate wing dideand conquer roe deg (1.0.8) on Jr input compute outputs Q and A where T = QAQ™ @ rove rem At ae ae fo t= [7 2] ae cutee dr ee ae fe pl fi ha ecg nd gb oD toma [Qf ]-@ met oft rm as nal Wo analyze the eomplesty of Algorithm 5.2 a8 follows, Let ¢(n) be the ‘umber of flops to un desig on an n-by-n mati, Then {in} 20/2) forthe 2 eeu cll 0 de Qi A) 1O(0?) wind te gemies of D +p 402) ofa the dgpavetr of Dt aT ten? tomutiniy @= [9 9 ]-a IF we teat Q, Qo, and Qa dense mation and ose he standard mate rontipeation algorithm, the constant in the lst line fs ¢ — I. Thos wo se thatthe major cost in the algorithm is the mance elation the lst Une. Agnosing the On?) tems, we ge (a) ~ 2n/2)+- on. This gvomete sum can be evaluated, yedding (n) © ef? (owe Questlon 5.13). In peti, € ‘The Symmetric Bigenproblem ae SVD. 2 is esualy much less than 1, beet a phenomenon call deflasion rakes Q” duit span Aer discussing delat in the next section, we discus details of sa. ing the secular equation, nd computing the eigenvectors stably. Fill, we seu how to accelerate the method by expating FMM techniques used ia ‘ecrostate partie solution (12). These sections may be skipped on 8 est reading. Deflation ‘So far in our presentation we have assumed that the dy are distin, and the ‘tu; nonzero, When this not tho ease, the secular equation f(A) = 0 wil hve km vertical asympotes, and so K =m roots But nen ot that the remaining ~ K eigenlies are avilable very eheapye I dy dys, oF i'n, ~ 0, ove can ely slaw that os ako a8 eigenvalue of D+ pu see ‘Question 5.16) This proces elle deflation. la prtee we use a theeshald fd defste dy ether Ics elose enough to dy ow ysl enous In practe, deflation happens quite freqneatly: In experiments with eas om dease maisees with usformly distributed egervalus, over 15% of the ‘égenvales ofthe largest D + pau” delle, and In expeeients with eds Senso matrices with elenvalies approaching O geometrically, over SIE de- tat ti sential to take advantage ofthis behavior to make the algrthin fas [58 208) “The pao in deflation isnt in making the solution ofthe secular equation faster this costs only O(n) anyway. The payo isin making the matric ‘multiplication in the last step ofthe algorithm fast. Foe if wy = 0, then the corresponding eigenvector fr, the Ah eal of the intr mates (ae ‘Question 5:10). This means that tho ft eolumn of Q is ey 89-n0 work i ed to compute tue st ean of i te two rations ty ad Qs, There is simiarsmplifenton when dy ~ dy. When many egenalies| eae me ofthe work in ae mee multipin ca be imate. 
This Isborne out i the aumereal experiments peseatd in section 5.6 Solving the Seeular Bquation ‘When some wis smal but to large to dette, «problem arses whe tying to tse Newton's method to solve the scular equation. eel! that the peineple ‘of Newton's method! fr updating on approsimate solution A; of §(\) 08 1. twappeaimate te funtion J(2) near X=, with a finan fnesion 1X}, whose grap 6 straight line tangent tothe graph of f(3) at X= yy 2. to et Aye the roof this linear spprosimation: (Aye) = ‘The graph in Figure 5.2 offers wo apparent dius to Newton's method, because the function J(X) appears to be reasonably well approwinsted by oe Applied Noserical Linear Algebra Fig. 53. Graph of f(a) — 14 MEE EE EE stright lines near ach ero, But now conser the graph in Figure 5.3, which Aller from Figure 52 only by ehnnging from -3 to 001, whieh nek ‘ary small enangh to diate. ‘Th graph of 0) in the lef-hanlfgare ‘sully ndistinguishable from is wet and ezoatlsaypiotes, inthe Fightsiand figure we blow i up aroua owe ofthe vera! aymptoes, X= 2 We sew that the graph of J(3) “Tuens the cence” very api ad I nearly ocaontal for most values of A. Thus, i we started Newton's method fra most ay 2o, the Unear approxition((X) would aso be neatly hoezostal ‘ith slightly postive slope 3 4, would be an enormous negative number, a {soles proximation t te tee 22, [Newton's method can be modified to deal with tis sustion as follows Since (A) is no well ypprosimated by 8 sight line (2), we approximate it bby enater simple funetion (2). Thor is nothing special about straight nes: any apprasintion (tht Is both easy to compute and has zeros that a fmey to compute exn be ised in plane of Uz) in Newton's method. Sine J(3) has pokes ata nd dss and these ples dominate the behavior of £0) neae thea, tf natural when seeking the root in (dd) 4 eoae (3) ko have these poles aswell, Le nay org Bate ‘There are several ways t0 choose the constants cy, 6 ay 40 that A(R) spovoxinas f(A), we present sigh simpli version of the one usd In the LAPACK routine elaed [170 4 Assuring for moment that we hve chosen e, and ey, We ean easily solve h(A) ~ 0 for 2 by solving the equivalent quedeatie equation faldiyn = 2) + adh 2) +e — Mas = ‘The Symmetric Bigenproblem ae SVD. eo Given the appraxmnte wero Ay here how we compte, el 50 Ut for dear y sans sore gh Ti wie oe bad, + U0) 0400. For € (a dst), (0) 5 sum of negative cers ad Yn) is som of pose ive terms: Thos bosh v2) and vy(\) ean be compute accurately, wheres ‘aiding them togtier would Lie esl in eaorliation and los of ratve ‘accuracy fa the sum. We now chose ey and 250 that oy = aa g hy AD=ViO4) ad HOY=HO) 6.15) "This means tht the graph off} (8 hyperbola is tangent to the graph of sO)at A= Aj. Thetwo caditions in equation (5.15) are tho usual condtons In Newion’s method, xeept insted of using a stright ine approximation, we tse hyperbola. tis easy to verify that ey — of(\dh 2s}? and & — iQ) = #4} = A) See Question 3.17) ‘Siar, we coos and 0 that ha) = eb Ry satis ‘ha(Ag) = Vals) and Wa(As) = 9404). (6.16) aye 4) = 1+ m0 90 Ora rare gery eres BxaMPue 5.8, For example, in the example in Figure 5:3, if we start with N= 25, then Lan 10-8 | 4111 10-8 deer eet neat y= : ‘and its graph is visually ndistinguisable fom the graph of (4) in the ght Ina figure. Solving As) ~ 0, we get y — 2.0011, which i aocurate to 4 decal digits. Conauing, Ay i accurate to 11 digits and Ag is scurate to al 6dgts. 
© ma Applied Noserical Linear Algebra ‘The agoritha used in LAPACK routine staedé isa slight variation on the one desebed bee (the ane here Is ella the Midile Way in 170) The LLAPACK routine storages two to vce lerations per egenvale to eanveese ‘o fll machine presi, ane newer took more than even steps in extensive ‘umerial ests Computing the Eigenvectors Stably ‘Once me have solve the secular eustion to et the egenalaes of Da, Lemmn 32 proves » simple formula for the eigenseetars= (D ~ al) Vafortnstesy the frm cn be unstable 58, 88,2), particule when 00 ‘geval and ay ae ery case ogecer. Intel, che peoble is that (Deut) and (D ~ a4)" are “very eos” frmls yt are supposed to yield orthogonal eigenvectors. More predsly, when a; aid ay ate wey lose, they must also be close to the between tha. Therefor, Ure I & ‘rent doa! of eancllation, either whon evaluating &~as apd d,— anya oF when ‘aluating th secular equation during Newton iteration, Either way dk 05 tn! dy ~ ais1 may contain large relative erors, ste computed eigenvectors (Day)-M and (D mass)" are quite inaccurate andl fr from orthogonal. arly attempts to addres this problem [S8, 28] used double precision arithmetic (when the input data was single presion) to solve the secular equation to high accuracy 0 that day and ~ a4 cul be computed to Nigh secures. But when the pit data is already double precision, this scans quadruple presion woud be ade, and tls ot valle in may ‘machines aud languages, ot st least not cheaply. As described in section 1, 1s posse to simulate quadruple prison using double prechon [282 02) "This canbe dono poetably and relatively eilenty as Jong as Use udenying Sloating point erithmete rounds suffiseatly sceurataly: In pariula, Uae simulations require that fa 8) ~ (@-=O)(1 +8) with [| ~ Ofc), baring ‘rerfow or undertow (se setion 1.5 and Question 1.18). Unfortunately, the Geng 2, YMP, and CW do not round accurately enough to uke these een lochs. Finally, an slternative formula wee found that makes simulating igh pre ‘sion arthmese unnecessary. Its base! on the following theorem of Line luzr, 177 "Tuwonea 5.10. Lawnee, Let D = dlnga,.d) be diagonal ath dy < " ay Be gen, satiny the Itrlcing property by cy co Sgt S541 Ch Ca ee Cy ‘Then there a vector such that the a, ave the exact eigenvalues of D = ‘The Symmetric Bigenproblem ae SVD. Bs D+". The eres of are given by a sea” Proof. ‘The characteristic polynomial of D ean be written oth as det D M)=TIp-s(as~A) and (osing equations (5.18) and (5.14) a cs eab=a0) = lite] wtzty : lites] or Sh +) TT G-a) -@. Setting A= using bot expressions for dt — M) yk Thea TT @-a OT t= Using the ttrlcing property, we can show tat he fection on the gh olive, so we canta ls are rot to get the desired expression for fy Here the stale algorithm for computing the ign nd genetors {where we astm for simplicity of preenttinn that p~ 1). AUGORITIG 5:3. Compute the eigenvalues and eigensectors of D+ wi? Solve the secular equation 1+ SE 8, of Dan, Lie daner's theorem o compte ss at te a ae Peaet™ eigenvalues of Di se Leman ko compute the eigenvectors of D+ ai 0 to get the eigensalues 2 Applied Noserical Linear Algebra More a sketch of wy thls lg sume state By anal lag the topping terion the ule equation soe, one ean show that fue” a © 04) Da + fou) ths means that D-+ wu and DY a? fe 30 low togeter thatthe egenvalis and gemvctors of Dt sa are sable approiations oft genau sd egwetos of + wu? 
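Before describing how the FMM speeds this up, here is a minimal Matlab sketch of steps 2 and 3 of Algorithm 5.3, assuming for simplicity that rho = 1, that the d's are distinct, and that the computed roots alpha satisfy the interlacing of Theorem 5.10 (the function name is ad hoc). Each eigenvector is a scaled column of the Cauchy-like matrix with entries uhat(i)/(d(i)-alpha(j)); forming or applying all n of these columns is exactly the O(n^2)-per-vector work discussed next:

    % Matlab sketch of steps 2-3 of Algorithm 5.3 (rho = 1, distinct d's, and
    % interlacing d(1) < alpha(1) < d(2) < ... < d(n) < alpha(n) are assumed).
    % d and alpha are column vectors.
    function [uhat, Q] = dc_eigenvectors(d, alpha)
      n = length(d);
      uhat = zeros(n,1);
      for i = 1:n
        num = prod(alpha - d(i));                 % prod_j (alpha_j - d_i)
        den = prod(d([1:i-1, i+1:n]) - d(i));     % prod_{j~=i} (d_j - d_i)
        uhat(i) = sqrt(num/den);                  % Loewner's formula (Theorem 5.10)
      end
      Q = zeros(n,n);
      for j = 1:n
        q = uhat ./ (d - alpha(j));               % (D - alpha_j I)^{-1} * uhat, Lemma 5.2
        Q(:,j) = q / norm(q);
      end
    end

For large n the two products can over- or underflow; a careful implementation accumulates ratios of differences instead of forming each product separately.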
Next tote tat the formula for dy in Liwne’s theorem roles only difereaces of foting pint numbers d;~ de and ay — dy products nd qutients ofthese irene, sd aque oot. Priel tht the fating pnt ame facoute enough bat Ala 8) — (a 8+ 8) forall © (hoa) ad Sarto) — y(1+ 8) wih | ~ OL), this forma on be esta Yo high reatve etry. prise, we ete sh Ut y a ya J) san 4) pestis] ) 01a | with j= O(¢), boring ovetow or undertow. Silay, the formals tn ‘Lemans 5.2 eau ako be exlusied to high relative acute, 50 we ean expe the egenveewors of D+ di” to high eelative securacy. In partiulr, they are ‘wry eecurately orthogpaal 1 summary, provided the Boating point arthinete Is aeueste enough, Alyorithm 5-8 compotes vary acura elgenalucs and eigenvectors of @ matrix ‘D-sathat ils ony slighty rom the original matric Dw". This mee ‘ha ti menial stable “The reeder should note that our need for suficiently acura esting point tvithmetie i preesely what prevented the simulation of quadruple presion proposed in [282, 20] from working on some Crays. So we have noe yet suceeded in proving an algorithen that works rlinby on these rachis (One more tick is essary The only operations that fil to be sourate ‘enoh on some Crags are aiton nl subtraction, bese of he lack 0: Calle guard digit in se lating pot hardre. This tens Ut the bottom ‘most it of an operand may be ete 8 during dition subtraction, even IFILL. Ir most higher-order bits eanel, this “lost bit” bowomessigafca. For example, subtracting 1 trom the next smaller sting pot number, in which case all esding bts ene, results ins number twin too large on the Ceay C90 and in 0 0a the Cray 2. But Ifthe bottom bits already 0, no harm is doe. So the trek sto detibertely set al the bottom bts ofthe d {00 before applying Lowner’s theorem or Loma 5.2 in Algoethin 8:8. hie ‘modifeation exes only «smal lative change in the alo, ad = the goth is stl stable." "Ta mre de ier ition er it we for ado dye iba sequent totam ts mary ‘ioc he ts «Hating pu abr tO a Cras ene ca tha tetany nf tome S83)" Thing ompttion dr cna ft {ison machine wih scr Kny arth ring re hich el ed ‘The Symmetric Bigenproblem ae SVD. a "This algorithm is deseribed in more detail in [127,129 ad implemented I LAPACK rote #1903, Accelerating Divide-and-Conquer using the FMM “he FMM [2 oral ier or compel dirs poem Computing ie muta reson nce ren oom ‘Stal fos mses We ely eh thes pol "hed fsing signet gto hvig Ss [2 Tet teh be the hwesimessonl pst vero nates whch a Eat hog i poo moon or Fe ‘es ih it pe arg The he resem le tt te tren tho pel oy do porte nth proarona > satis — 5) $= Taalt I we are modi srs ino nso ed of tht foe I'énge tthe ive pe I= Son) O° d= o5lh Sine dnd ay axe voto in, wo can also eomsider them to be comple variables. In this ease hove dy anc are che complex conjugates of ek and a, respectively, I anda huppen to be real numbers, this simpifes further to. bas Now conser performing « matrix-vector mukiplcaion J7 — 27, where is theelgenvector matrix of Du, From Lerama 5.2, Q — was/t—a3) fm tang ne ce cy hse crs opts he ‘cmp rth cmt eration ope) sping (8 3) with an fatto eon edn eprom hei tines het hy he ine ‘Seen ir aoe st, bt lhe "Stine hi an psa etn si tn bs Applied Noserical Linear Algebra unit vector. Then the where 4 & a gale fetor chosen so that eon j Jub eaey orf? = =F is Wh Yexdly nog wich th sme ama he trent fr exp forthe ee fa tora Thus the most expensive part ofthe dvidesconqe lst, ‘th mat mtpeton I tlt ne of Aon 52, alent 1 vate chert frees lating thi im for j~ 1... appr one OC?) ops. 
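Here is a minimal Matlab sketch of Algorithm 5.4. It takes Negcount as a function handle, since the logic of the algorithm does not depend on how Negcount is evaluated; the O(n) evaluation for a tridiagonal matrix is derived in the discussion that follows, and the function name below is ad hoc:

    % Matlab sketch of Algorithm 5.4 (Bisection).  negcount is a function handle
    % with negcount(z) = number of eigenvalues less than z; tol is the desired
    % absolute accuracy of the computed eigenvalues.
    function evals = bisection_eig(negcount, a, b, tol)
      na = negcount(a);  nb = negcount(b);
      evals = [];
      if na == nb, return; end                   % no eigenvalues in [a, b)
      worklist = {[a, na, b, nb]};               % intervals known to contain eigenvalues
      while ~isempty(worklist)
        t = worklist{end};  worklist(end) = [];
        low = t(1);  nlow = t(2);  up = t(3);  nup = t(4);
        if up - low < tol
          evals = [evals; repmat((low + up)/2, nup - nlow, 1)];  % accept midpoint
        else
          mid = (low + up)/2;  nmid = negcount(mid);
          if nmid > nlow, worklist{end+1} = [low, nlow, mid, nmid]; end
          if nup > nmid,  worklist{end+1} = [mid, nmid, up, nup];  end
        end
      end
      evals = sort(evals);
    end

A hypothetical usage, once the tridiagonal Negcount described below is available as a routine negcount_tridiag(a_diag, b_offdiag, z), is evals = bisection_eig(@(z) negcount_tridiag(a_diag, b_offdiag, z), a, b, tol).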
The xs and ers hk [12,2] can be wo sproinaely (bt ey Serres evlnt this su a tag) tne or even OC) ne ne! {Sie there on nec Neth forthe Sb Prem PARALLEL-HOMEPAGE for eta) It ths ide nes ot emu co en tes ove conee {© O(n hg Aer al th oat egrcton mar Q has 1? nt ‘ie sppear io ma a the ety shu! be let. we ts epee Qing ewe thn n np sr This pa Seems an mn tengo mat hey 201 dees (he eizoal a mperdingna enti) of whch nea be epresata bythe Siseavalies ving fs he tog ats Q nts words nt ee rte! mati ean be the egenvetr mat of syne ioe! T: only a (0 Dimensinal beet he ete (nn 1)/2ineoal ot ertgonal tries canbe sch eigrtor mat ‘Weil repent @ ni te dveconguer te compte by Algo rithm 52, Rather than accumulating @—[ g, |-Q%, we wil store all the © tis och noe in the om. Ad wl ot store expt Su rater jst store Dn th elnino of D's” We cn do {hs sce eal we nto th FMI to ply by Thi ces ti storage tal for @ rom 4 to On -lagn). Ths, the utp ofthe alzrithi 8 fcorf orm of Qeonssing of al th lacoste noe oF to te. Tk an eequne repeat of bts me een ue the FMB to muily any vector by Qn O(n?) tie 5. The bisection algorithm exploits Syeste's inertia theorem (Theorem 5:3) to find only those cievales that one wos, at cost O(n). Real that Inertia A) — (encom), where, Gand 7 are the nimber of negative, inl positive genvalis of A, repoctively. Suppene that N i aansagular Srhesters inertia theorem sssers that Toei A) — leet X AX) Now suppove that one vies Gaussian elimination o factoize A Epi, where Lt toosingular and D diagonal, Then Tneria¢A ~ =f) = Bisection and Inverse Iteration ‘The Symmetric Bigenproblem ae SVD. ea Inertia(D). Sinoe D is dingonsl ts inertia is tiv to compute. (In what Fallows, we use notation such ag “Hf dy = 0 to zac “the number of als of that are les an 20") Tnertia(~ 21) — (ft <0, # da =0, # ds >0) (negative eigenvalues of 4 moro eigenvalues of A— 2, 4. positive eigenvalues of A 21) GF eigenvales of As, if sigevals of A= = if eipovalus ofA > =). Suppose 3 < zp and we eompute Inia (A "Then the aumber of eigenvalues in the interval [= of Aca) ~ (f egenvales of A< 3) "To make this obmervation ito an algorithm, define Negeount(A,) = # elgemlons of A < = Auconiriis 5.4. Bisertion: Bin all eigenvalues of A ssid a8) to a given fron taleronce ol he = Negeot A, a) = Negeri 0) ‘Png =a gl besnse there are wo genus in 00) aia ti onto Work 7° Worst contain tof ner 0.0) comainng eigen nna 1 though ~My which the goin tel epaaly bse wl they ae nse than ah. tee Wonka i not empty do emote lott) fom Works lupe Me ten thr are eigenen low.) ‘ot lo nal nto Work endif {Frag > ts then there ae eens in i 0) ot dl sag ont Wart endif end end we May 2-2 ay are eens, the same Hen cn be used to compute ay for § dyads + yernds. This becse we KOO On-naget BEDARD Cmnye ein the itera fo). 20 Applied Noserical Linear Algebra 1A were dense, we oul! implement Negoount(A,2) by doing symanetic Gaussian eimigatlon with plvoting ss dexibed in seton 2.72.” But this ould cost O(n Hops pe evaluation and 0 not be cst tetive. On the othr hand, Negeoun that we donot po 1 “4 14 $0.05 ~~ didly ~ by and chewater Sobstiuting Bf nto day Fe a % ‘lls he simple eeurence ala oy Naticthat wo are not pivoting, 0 you might think tat this dangerously unstable, especialy when dy is small, In fact, sie is eiiagonal, i ean be sown to be very sable 72, 73,154, Lista 5.8. 
Ted computed in lang pont arithmetic, using equation (3.17), ‘ave the same sgn fend so compte the same Insta) asthe dy compat actly from A ae As ory else to (A= 005 and Assn by = W146), where ei] S25 + OC. Proof. Lee dy denote the quantities computed using equation (5.17) ine rouaing tres Byte a ser ll the es are bounded by mackie rounof en mgt, snd thee Sulerptsindete which Noting point operation they come from (for exp, ‘as from the second subtretion when computing dj). Define the ew vanes a= fle -90 teas Cr gd]U tea Gas) a 4 = Tepito 6.9) ba = be Shall +e) ‘The Symmetric Bigenproblem ae SVD. 1 Note that od nd dy hve the same signs, ade] 2.e + OL). Subang (6.19) into (5.18) yes few completing he prot © ‘Acomplete analysis must take the possibly of overow or underflow nto sccount. Teed, sing the exeeption handing frites of IEEE. aithmeti, fone can safely compute even when somo ahi excl zero! Fr in this ease hy dyn — ain il the computation eontiniesanexoxpeionaly (72, 59, "The cot of single ell 19 Negeouat on tiagol matee 6 ps ‘Theetore the overall east tof fegervalus is Ot). Tiss implementa In LAPACK routine aetebe Note that Bisetion converges linearly, with ane moro bit of securay for ‘ach bisection of aa interval ‘Phere are mary’ ways 10 aecerate convergence ting algorithms like Newron's rmetod ae it relatives, o find aero of the ‘aracterstie polynomial (whieh may be compute by mulling al the d's together [171,172 17, 174, 176,267). "To compute eigenvectors ance we have computed (slated) eigenvalues, we ‘ean use inverse eration (Algorithm 42) this avalale in LAPACK routine fetein Since wo can uso seurate eigenvalues ws shit, convergence sully takes one orto erations, In this ase the costs O(n) Hops per eigenvector, Since one step of inverse iteration requires us oly to solve a tiagonal system ‘of equations (xe section 2.73). When several computed eigenvalues dr v oy are clowe together, thsi corresponding empitd eigenvetors Gy... 4} may ot be orthogonal In this ea the algorithm reorthogonalizes the cornu gennectors, eompoting the QR decomposition Us) = QM and replocing ‘ach de with the fh column of Q this goorantoes that the are orthonoreal ‘This QR decomposition is uslly compute using the MGS orthoganalzaion process (Algorithm 3.1; each computed eigenetor has any ompanents Inthe direction f previously computed eigenvettors exp subtracted ou When the cluster sae kj 1-41 is smal, the oat O(n) ofthis wording. ‘onalzation i ral son pracpl all he eigenvalues and all the eigenvectors ‘ould beeompated by Beton followed by inverse eration in just O(n?) Hops total. This Is meh faster than the O(n?) cost of Qt eration or dvide-end- conquer (In the worst eas). ‘The obstace to obtaining ths speedup reliably is thot If th cluster so k = j ~i 41s large, i, a sizable fraction of, then the total cost ses to O(n) again, Worse, there is no guarantee that the ‘computed eigenvectors are acurate of orthogonal (The trouble i that after rearthogonaliing 8 set of realy dependent ge, eanedlltion may teen some computed eignvertors consis file more than roupdel eros) "There hasbeen reat progiess on this probe, however (103, 19, 201) ‘and ie now appears possible that inverse Keraton may be “wepeed” to provide ane Applied Noserical Linear Algebra sccurate,orthngonal eigenvectors without spending wore than O(n) Hops per fSgeavector to reotogonaiz. 
This would make Bisection aod “repaired” Inverse iteration the algoethm of eboice ia all esses, no matter bow many guavas and elgenvetors are desir Wo look forward to deseribing this slgoetho in 8 futute elton [Note vat Bnetion and inverse eration ae “embarrassing parallel," since cach eigenvalne ne Iter eigenvector may be found independently of the others. (This presumes that inverse iteration has been repaized $0 that ‘eorthagonalzation with many other eigenvectors 10 longer anessry) This makes these algorithms very attactive fr paral eompaees [75 Jacob's Method Jncob's method does not start by reducing A to tedlagonal from a do the Previous methods but Instead works an the orginal dense rates. Jacobi Inethod is vsuelly moch slower than the previous methods and remains of Interest only because it ean sometimes compute tiny eigenvalues nd thle ‘gemvetors with much higher accuracy thn the previous methods ad ean be tess paralliad, Here we desecbe only the base plementation af Subs ‘method, and defer the diexsion of high aoraey to setion 5.43. Given a symmetric matt A = 4g, Jacob's method produces a sequence Aja... of octhogonaly similar mais, whieh eventually converge to ® liggonal matsix with the ezzaalues onthe diagonal Aj, i obtained frm ‘Arby the formula Aisa — Jf Ask, where J; san orthogonal matrix ealled @ “acob rotation, Ths Am = Sit Antted = Ek stot ali= Sinn Aeon = hay It we too ech J apport, An pcs a agonal mite A for lagen Thos wocan ile Ne D1 Adee JD A Tete the clan O14 ar pproinate or We wll nls JAY onary dag by Rertely cooing 10a one alr lagna esis o yy A oat tine, We wl do hy "The Symmetric Eigenproblem ane SYD 2a hosing Jt be & Givens oetion, se wat nl Ten eee ate ge ed —sind J fal? alo] Poon sino [sf Bs] - [se ey [BB] foe ae] ca aaa ee symmetrs, abreviating ¢ = cos8 and $= sin, aed dropping the superserpt (0 for simplicity yt [ss Setting the ofngonas to 0 aod solving for @ we get 0 — seas ~ 45) aul =), oF ) ye 4 ayes? + 2seaye—selage ay) + 0 Selous a3) Fane =) ays? taut? Dea om Way ee DD one r We now let ¢ = £ = tnnd and note that &2 = b to get (via the cqadrate formula) t= PRB, ¢ = aby and «= t-¢, We summarae this to (ere tol is the stoning eiterion st by user) ‘hoate J and ko ay 6 the largest efiagonal entry in magne fall Jacobi Rotation (A, nd whe ‘Tuonsnt 5.11, After one Jacobi rotation inthe clin Jai agri, we have of(A') < y/T— olf(A) atere W — S32 — the number of su- reragonal entries Wf A.” AMer k Jacb-Rettions of) ism more than (1-4)? ott), Pra. By Lemma 4, er one tp, oA?) — of (A) Al, where me isthe Iara oiagonal entry Thin 0F2(4) < MMR > eta OHP A, so off (4) = 03, = (1— 2) (A) andi. © othe laa Jacob alg converges a ast neal with the cor {eased by (4) decreasing by xfactor of fast = fat, eventually converges quately time tn ‘Tuponsn 5.12. Jacob's method & tally quadratically convergent afer Nv steps (ise, enough steps to choose each ay once This manns that fori lage enough fay 9) = Ofof*(AD) In practice, me do not use the elasieal Jacobi algorithm boca searching forthe lrg entry sto slow: We would ed to search entries for every -Ineobi rotation, wich ess only fn) flops ko perc, a so fo large me Search Cine would dominate. 
Teste, we ne the allowing simple method to eon J ad Auconsruas 5.8, Cylichyprow-Jacole Sweep through the offiagonate of A peat for j=1ton—1 fork jet 00 2 Applied Noserical Linear Algebra alt Jacobi Rotation A, 5.8) end for end for uni Ais sufiently dapat A no longer changes when Jncbi-Rotatlon(A, jy) chooses only — Land + —0 for an enti pass through tho inne lop. The eyelic Jacob's algo- ‘it iso saymptotclly qrtially convergent ke the easel Jabs gost 200, p27 The est of One Incobi “smwep” (where ech jk piri sled once) it poroximately ball che est of rection to idsgona form sl the ep {ation of egenalies and eigenvectors using QR Heratlon, aa tote than the cant sing divide adconguer. Sage Jaeah's method often takes 5-10 sweeps tecconverge, iis much slower than the competition, 5.3.6. Performance Comparison In tis section we analyze the performance of the the fastest algethins fe the synmettc cigeaprblem: QR iteration, Beton with iverse tee ton, and divideandconquer. More details may be found in (10, chap. 3} or [NETLIB lpack/lng/lpack ug hel ‘Wo bogin by dlscusing tho fastest algorithm end Inter compare the others Wo used the LAPACK routine seyeva. ‘The slgritin to find only egonva ses reduction to triagonal form followed by QR tration, for sn operation count of $x Of) Hope. The algorithm to find eigenvalues and eigenvectors ' tridiagonal retin followed by dvidesand- 0. “This ss factorization of 73 into lower trnngular matrix A tines es eae pose sing tho Cholesky fetorzaion i unigot this most in fete the Cholesky factorization, ‘The scoond fnctriaton i 12 — BE By BE By. Now ly Algorithm 59,1) ~ ByBif — BY By, 90 we can rewrke 72 ~ BF BoB Bo E(B B,) By — (By Bo)" By By. "This ako a Totorvation of TZ ito 9 lower tiangular mattis (189)? tines is teasspone, so tis mst again be the Cholesky faetoiztin. By uniqueness of tat Cholesky factorization, we ‘once By, thus relating two stops of LR iteration toon stop of QR iteration. We exploit this reltinship as follows: To ~ QU implies T= RQ=RQRA ~ RQR)R" = RTA vecasse T= QR au Applied Noserical Linear Algebra (2B) BKB)" ocuine R= BB aa Ty — BE By = But Bae? BBTB)B:! because BoB — Ts — BY By mit Ty sede, 0 Neither Algorithm 5.9 nor Lemma 5.6 depends on Tp being tridiagonal, Just symmetric postive doinite. Using the relationship betwecn LR iteration fal QR iteration in Lams 5.5, one enn sow that rah of the convergence alysis of QR iteration gas oer to LR iteration; we ill aoe explore this hee ‘urine git, ds, ential ula to eran, ut isnt inphental ecrba n Alitin 2, tease he wd Involve ext foring Ty, ~ BL 47, wich a eton we sowed cout! bo mmercly sabi Ustad om vl fr Bes ety rom Ph, sith eve forming the terme ate Ty “To simply mation, ee cng mo ad mperigoel bbc tt Bin Re dag bea aprdnga Be ism the omen fo fo by 01 We eat Bt Bi BE Baya +78 = Toa = BBE +93 620) auating the jj ents ofthe let and right sides of equation (5.20) for J 1 swh hate! Sy ran To eG <7. in olker won Car isa bound on the relative difference between each entry of B andthe ‘eorresnonding entry of B. Let oy Sos < oy be the singular values of Band tq Sons <0 be the nga values of B. Then le] = le 2-0 andr —1 0 <1, then we can write Fil v8 1 Un -2. +012. ‘Thus, the rtve change in the singulae vals — oo bounded by {in—2 ies the elative eang cin the mati etre. With tle more werk, ‘The Symmetric Bigenproblem ae SVD. a the fctor4n—2 enn be improved to 2n — 1 (8 Question 5:21). 
"The singular wetors ean also be showa to be determined quite aecueately, peopoetional to the eeipocal of the relative gap, as defied i setion 5.21 We will show that both Bisction (Algorithm 5.4 applied to Ze from Lemma 5:5) and dgds (Algorithm 5.1) ean be use to find the singular voles of bingo matrix to high rative aecuraey First me consider Biertion Recall tht the eigenvales of the symmetric tridiagonal matrix Ty ae the Singular values of B an Une negatives. Lemans 58 pbs hat the inertia a Tye M eoxaputed wing equation (5.17) i th exact inertia of some B, where te elaive dilereno of eorespouingenteies of B stad B i ot most about 25e. Therefore, by Theorem 519, the relative diference between the eam pated singular values (Ihe singular vals off) and the tue singular ves ‘a most about (101 ~ 5) ‘Now we cosider Algorithm 5.11. We will use Theorem 5.18 to prove that the sngulae values of (the Input to Algrithy 8.11) and the singular Ylts of B (che output from Algoritin 5.11) agree to high tative securacy. This fet implies cha after many steps of ds whens nea agonal with is Slogula vlues oa the diagonal hase singular vlles match the saga vals ‘ofthe orginal Input mates to high relative accuracy "The simplest situation to understand is when the shift 30. ta this ease, the only operations in dads ate additions of postive umber, multiplications, and divisions! no eancollation occurs. Roughly speaking, any sequence of ex presslons bulk ofthese base operation Is guarantd to compute each output to high relative acursey. Therefore, is computed to high restive accuracy, land so by Theor 15, te singular vals of B and B agee to high relative accuracy. The general case, whore 6 > 0, tir [102 ‘Tuvonsn 5.14. One sen of Algorithm 511 im floesing print erihmetic, op phe to B aod yielding Bi equivalent to the fll sequence of eperations: 1 Make asa rete chang (by obs 1) each nro B, pein B 2. Apply one sep of Algorithm 5.11 i exact arithmetic to B, getting B. S. Make o small elaive change (ty at most 2) im each entry of B, geting 2. Steps end 3 alove mate only small relative changes in the singulr values ofthe bdiogonal mari, 30 by Thaore 5.13 the singular wales of Bend B lage (0 igh relative accuracy Proof. Let us wete the ier loop of Algeithim 5.11 9 flows, introducing subeeipis on the and f variables to et us hep tack of them in different Teraions and inluding subseripad 1 -¢ tems for tbe round eroes: as Applied Noserical Linear Algebra Bh reMto0) = yalGMl +g) B= 6-40 tau Bho GE a) BML FG) Swbstitoting the rst ine into the second line ville _ a hey Bq Te Substituting this expression for fy Unto the lst lise of the leptin aa di vaing through by 1+, ye leaner Ms o ese ms 8 = ea G20), i [Note fom (6:24) that fers fom by a elative change of at mest 1.5 kn each entry (fom the vee I< feta in dsy1 = B85 y3) [Now me can deine dan n Bb Bde, B= Gul) B= 6ls bod hs. This s one step ofthe dels lgoithm applied exo to B, getting B To finally show that lifes from B by a relative charge of at most «in ench entry, noe that a = bse Treat SG teil +G0) Treas Trade) ‘The Symmetric Bigenproblem ae SVD. 20 snd a cred Lig fat [reat 5.4.3. Jacobi’s Method for the SVD. In secton 5:35 we discussed Jacob's method fr finding the eigenvalues sd ‘igonvectors of» dense symmetric matrix 4, apd sid it was the sonst avai. fable method for this problem. 
In this setion wo wil show how to apply “Ieobs meted to fin the SVD of « dense matrix Cb imply spp ‘Algoritn 5.8 fsetion 5.3.5 to te symmetric matrix A ~ GC, This implis that the eanvergence properties of this method are nearly the same ws thane (of Algorithm 53a! in partic Jobs rethod i nls the slowest tho ‘valle fr the SUD. Jacobs method is stil interesting, however, becuse for some kinds of navies C, it ean compute the singular ylies and singular vetors veh nore accurately than the other algorithms we have discussed, Por these “Ieobis method cornpates the Singular les aod singular vectors 10 high relative aceuraey as deveribd in sein 5:21, ‘After describing the impli Jeeabi method for the SVD of we will show that it computes the SVD to high relative scursey when G can be written inthe form G ~ DX, where Dis agonal and is wel conditioned (Phis meas that iil eoeitioned if wn nly iD has both age ane sal agi entries.) More geal wo bert as long as X is sigan beter ‘clitioasd than G. We wil uses this with a mate where any algorithm involving mauetion wo bidiagonal form necessarily oes al init digs in all but the largest singular valve, whereas Jacob computes al inglar valves to fall raachine precion, ‘Then we survey other clases of iatries G for which Jacob's metbod i also sgnlfeantly more accurate than methods using bidiagonaiaton Note that if Gis bidiagonl, then we showed in ection 5.4.2 tha we cok tse iter Bisertion or the dads algorithms (ection 5.1) to compute is SVD to high relative aocrsey. ‘The trouble s that nefucing cate from dense to bilagonal form ean introduce errors that are large enough to destroy high relative seeuray, as our example wl show. Since Jseab!'s method operates on 20 Applied Noserical Linear Algebra the ong matrix without fst raduing io bidiagonsl fom, ten achieve high relative aceuaey i many more scustious. The implicit Jacobi metiad i mathematieally equivalent to applying Al forth 58 19 4—G"G. In otter words, at euch sep we compute 8 Jaeabi Fotaton J a ipbeily update GPG to J7GTGS, whee J chases 50 that {0 llllagonal entries of GTC are set 0 zero in JF GPG. But stead of computing G™G or J7G"GI explicitly, we instead only compute GJ. For this reason, we ell oUF algorithm one-sidad Jacob rotation. Aucontrnst 5.12. Compute and apply © onesided Jacobi rotation to Gin ‘torinaes ‘00 One Siac Rotation (G58) Compnte yy = (GPG), au = (G"G)g, and oan = (GTO lanl 6 not too smal T= lay au)/2-e) te sign(ei/le)+ v7) envi G=G-R.K0) where e e050 and s = sind I rght singular vectors ere desied JF Rsk 8) endif endif [Note tha the J, jk, aa hk enti of A = GG are computed by pace dure One Sided-Janoi otto, alter whic 1 computes the Ssobi tation 'RUK,0) in te seme way’ prose Joeo-Rotation (Algom 53). Aucomrsins 5.18. One-sided Jobe Assume that C is n-bp. The omtps| ‘ere the snguler values o,, the left singer vector metre U, end the Fight ‘singular vector matre Vso that G —UEV", where 3 dagl a). repeat forj=1ton—1 fork=j+1t0n ‘el Ont-Sided- Jacob Ratation( jk) end for end for unl GC is diagonal enough Tat 05 = Gtsi)ia (the norm ofeotunn é of C) Hat fay ov th tehere vs ~ Cli) /66 let V = Jy the accwmilted product of Jaco voations ‘The Symmetric Bigenproblem ae SVD. 21 Question 5.22 ask fora root chat he marin 3, 1, and V compte by ‘one-sided obi do inde fr the SVD of G. ‘The following theorem shows that ouside Jarobi ean compote the SVD to high relative scurse, despite round, proved that we can write G ~ DX, wince D is diagoosl and is wel conditioned ‘Dupont 5.15. 
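Here is a minimal Matlab sketch of Algorithms 5.12 and 5.13 combined. The stopping test abs(ajk) <= tol*sqrt(ajj*akk) is the simplified criterion discussed in Question 5.24, G is assumed to have full column rank, the function name is ad hoc, and no attempt is made to economize on the recomputation of the column inner products:

    % One-sided Jacobi SVD of an m-by-n matrix G (m >= n, full column rank
    % assumed).  Returns U, sigma, V with G = U*diag(sigma)*V'.
    function [U, sigma, V] = onesided_jacobi(G, tol)
      n = size(G, 2);  V = eye(n);
      done = false;
      while ~done                                 % sweep until no rotation is applied
        done = true;
        for j = 1:n-1
          for k = j+1:n
            ajj = G(:,j)'*G(:,j);  akk = G(:,k)'*G(:,k);  ajk = G(:,j)'*G(:,k);
            if abs(ajk) > tol*sqrt(ajj*akk)       % entry (j,k) of G'*G not yet negligible
              done = false;
              tau = (ajj - akk)/(2*ajk);
              if tau == 0, t = 1; else t = sign(tau)/(abs(tau) + sqrt(1 + tau^2)); end
              c = 1/sqrt(1 + t^2);  s = c*t;
              R = [c -s; s c];                    % zeroes the new (j,k) entry of G'*G
              G(:, [j k]) = G(:, [j k]) * R;
              V(:, [j k]) = V(:, [j k]) * R;
            end
          end
        end
      end
      sigma = sqrt(sum(G.^2, 1))';                % column norms are the singular values
      U = G ./ sigma';                            % normalize columns of G to get U
    end

Running [U, s, V] = onesided_jacobi(G, 1e-15) on a random G and comparing with svd(G) is a quick sanity check; the accuracy advantage over bidiagonalization-based methods shows up only on graded matrices like the one in Example 5.9 below.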
Let = DN be an n-by mars, where Dé diagonal and ronsinguler, and X is nonsingular. Let @ be the marie efter calling One Sided Jacobi Rotaon(Gj,8) tines in floating point ahnetic. Let oy > Dom be the sngulr elves of C, and lt dy 2 -- > By be the singular ates of. Then < Otme)R(X), (625) tuhere K(X) = LX]-.X7I] is the condition numberof X. tn other words the ‘elative err in the singular valves is small ifthe condition numberof X small Proof. We first consider m = 1; Le, we apply ony a single Jacobi rection and Iter generalize to larger m. Examining OneSided-Jocbi-Rotation(Cj,8), ww see that = AC), ‘whose itis 9 Hosting point Givens rotation. By constrtion, fe difers from some exact Givens rottion A by Ofc} in noem. (Ie pot important or nee fewnrily tre that fers by Of) from the “true” Jacobi rotation, the one tit Onosie-Incobi-Rotation(C, j,k) woud have compute in exes arithe imotie. Te is nostsary only that dnt it difers fom some rotation ty Ole). ‘This roquirs only that } 5? — 14 O(@), wid easy to wes) ‘Our goal & to show that G = GRU} £) for some 2 that bs smal tn orm: [lly — O(6)RLX). ICE were 2b, then CE sad GR woul ave the Sine siaguse valves, since Mis exactly exthogonal. Whea i less thn one In nor, we can use Corollary 3.2 to bound the elative diference ia sigulse values by a . Pet Lc utes 6)? ~~ eB BET 1 rotations, note that la ext aithietie we would fave G— GR DNR — DX, with w(X) — m(X}, 90 tbat the ‘ound (5.26) would apply at each of them steps, yielding bound (5.25). Bee cause of roundot, n(X) could grow by as much os wl} B) < (+ O(2)X)) at fch step, a fictor very clo tI, which we absorb into the Ole) “To compete the algorithm, we nod to be earful about the stoping = tevin, i¢- how to implement the statement “if [aa fs not too small” in Algorithm 5.12, OneSided-Incbi-Roeaton. The appropriate riteion Fs dsensed frther in Question 5 AuPLi 5.9. We consider an extreme example G = DX where Jacobs method computes all singular values to full machine presion; any method ‘eying on bdiagonalization computes only the largest one, v3, to fll machine Dreision: and all the others with no ceuray at ll (though it stil computes thom with ermes #02) v3, as expected from a backward stable algorithm). In this example « = 2-® = 10° (IPEE double precision) nnd 9 = 10-2 (any value of <2 will do}. We dene Cee eat eres 900 . aeveoo |i Gal oao ” a s0120 joes noon allroot "To at least 16 ds, the singular values of C are ¥%, 3-9, and 9. To see iow accuracy is lost by reduelng Go balagonal form, Wo consider jst the fst step ofthe algorithm section 4.4.7 After step 1, premulipestion by 8 Householder transformation to zero out C{2 41), Cin exeet arithmetic naa F343 See “my 5 3-5 0 0 0 a "The Symmetric Eigenproblem ane SYD 2 Note tha all information about 9 has been ene” from the last three eons (of G). Since te last thie columns of Gare identical Gy is exactly singular and indeed of rank 2. Thus the two smallest singulse values have been changed fron 7100, seomplete los of relative securacy. If we mode no further rounding errors, we would reduce Go the biiagonal form 29 vB 130 e oo 0 with singular vas V5, Bq, 0, and 0, the larger two of which are accurate Singular values of G. But asthe algorithm process to eehuce Gy to bidiagonal form, roundoff introduces nonateo quantities of O(¢) into the zero entees of B, ‘making all three sinall singular valves inaccurate. ‘The two smallest nonzteo ‘computed singular values sre aocidents of zoundoff and propostional to ‘Ono-sided Jacob's method! 
hs no dificult with this mate, converging a three weeps to G = UV, where to mschine peeision 1 gag wiass OFA and 35 = dag( Vp, V3). (Ueobl dows no autowstialy sort the sague Value; this en be done as & postprocessing stp.) © Here are some other examples whe versions of Jacob's method ean be sown to guarantee high relative accurey in the SVD (or syametric eigen ‘composition, wheres methods elying oa bdlagonaliation (or ted falization) may loo all significant digits in the smallest singular value (oe ‘gonvales). Many other examphs appear in (7) 1. IFA = EE isthe Cholesky decomposition ofa symmetcle positive de Ino matrix, then the SVD of L = UV" provides the egendeeompo- sion of —USAU7, IF L = DX, whore X is wel-conitioned and D is digonal, thon Theorem 5:15 tes us thot we ean st Jacob's meio to compute the singuar values 0, of Eto high relative accurey, with relative errors bonne by Oe)x(X)- But we slo he to acount for the ronda eons camping the Cholesky fsetor ising Cholesky Inkard ereor bound (2-1) (along with Theorem 5.6) ane ean bound the eelative error in the singular vals introduc by rourdatl during (Cholesky by O(e)2(.X). 80 EX i wel-conditoned, al the egexvalivs of A will be computed to high rave aocuracy 68 Question 5.28 aad (81,99, 181), a Applied Noserical Linear Algebra Bauru 5.10. As in Beample 5.9, we choose an extreme case where ny algorithin clay on initially reducing A to edlagol foe suse sted 10 lose all zlative seeuray in the smallest gene, whereas Cholesky alowed by one-sided Jacobs tpethod on the Cholesky factor computes all egoaves to nary fll machine preston. AS In tht fxample, le 9-10" (any 2/120 wil do), std et wats} [ee 28] 0 YA 10 100m} [10-1 10 10-2 1 90 reduce 10 trigonal form exsety, en Lm =| Vm 56 5-50 550) 5.440 Ea ich isnot even postive definite, sinc the bottom eight 2by-2 sub- Ioatrc is exatly singular. Thy the smallest eigenvalues of is no Dative, and 50 riingona reduction his lon ll relative seca in the smiles ciggavale. In contre, onesie Jobs method hs m0 rouble mputing the coreet sare 0015 of egenalins of A, namely, Eb yl 10°, 1 110", dm) 9-10", to neay {ull nschine precision. “s 2, Forextensions ofthe precaing result 1 indent symmetric egenprob- lems, [26,248 '. For extensions to the geeralinad symmetsie eigenpeoblem A — AB and the generalized SVD, se (65, 90, 5.5. Differential Equations and Eigenvalue Problems ‘Wo sek our motivation for this section frm conservation laws in pies. We nie once again the mass spring system inteodvced in Example U1 aad teevartined in Example 5.1. We act with the plat ease of oe spring and fae mas, without fetios "The Symmetric Eigenproblem ane SYD 25 Wee dnote horzontldpicement rom ula, Then Newt's law P= ma becomes i) ft) 0. Let 0) a) + 4h") “Kiet egy” “potential energy" Consernton of ence tl vs that $20) stout bw aero. We can cae Us te by eompuing (0) nso) + ha) — HCOKmaL) + he(D) — Oa des More general ne ve Mi(0)} Ka(@)— wave M the mass marian i the ste mars: The enray dined 1 be £(0)— HEAL) + Eat (Watt. That the eth comet dttion ie contd by eying that iis comers: ‘ 4 (Lereynnace + dot one) Low ~ £(Leronan 12eTona0) SET +37 OMHO 4 MORAY + = OKO) MOM) +8TOK A) TOCA) + KIN) ~0, whore we have used the symmetry of A and "The ferential equations MQ) + C2(?) —0 ae near, It 8 remarkable oct that some nonlinear diferental equation also conserve quantities such as ees” 5.5.1. 
The Toda Lattice Foret of notation, we wll write # ented of 20) when the argument scene fron contest ‘The Pd atic is lo 8 rs-springnystem, but the fore from the spring ivan exponetinlly decaying fnetion of it stretch, instead of nen fneton Wo us the bouslary conditions 4") = 0 (i.e, = ~se) and er) (0 (Le, Z9y1 ~ +20). More simply, thse boundary eoudtions mean thore are no walls atthe let or right (soe Figure 4.1). Now we change variables toby — et"! and ay — —bby. This leks the directa equations aaa saa aol Hts —tue) toner a. ig = bey = 20h) 26 Applied Noserical Linear Algebra thy =O ad by ab ob a ano p=] Ones us bet ty mb where = —HP. ‘Then one ean ell confirm that equation (5:27) the sre te f= TB. This scala the Toda flow. [Now dele the two tiiagonal rations ‘Tunones 5.16. T() has the some eigenualues as T(0) forall. In other sword, te eigennles, suchas “energy,” are conserved by the diferental ep tion Prof. Delon UY = BU, U(0) =f, We dao tha Os tog for {To poe thst safes wo stow ZU" =O sine UPU(O) = 4 Aut 070 400 — ut ATU y UTR UT BU -UTBU =o see BW sew syne Now we enn tht 71) = Ul }EO\UCO) satiss the Toda tow 2 = ‘Br ~ TB, iaplying cach TQ) ocosonaly sla 70) a 0h te ove gets: Aro = Honour sunnwore BUQUDTOWU? (t) + VOTO? (NBT) BOTW ~TBE) ded. © ‘Nate that the onky propery of B ane mas sew sper, 50 if P BP TB aad BF ~~ ten 1) has Ue ae eigen or al ‘Toweat 5.17. As + 420 ort > 20, 110) comers a agonal matric sth the elgonaes onthe diagonal Prof We wast to show b(t) —> 0.28 — 0, We begin by showing PASE Nfld oo. Wo se hndution to show f= (0) 1 (0) 32 aa then a these nequalis forall. Wham 50, we ge (0) + {2(e))et, whic iO by assumption. Now lt gl) = 2,0) jsf.) Bourdl by 2171s 21a for allt Then HO = 09 deap) 20-0) ~ 200541084) = + BO) — 208 (0) = 5D) ‘The Symmetric Bigenproblem ae SVD. nd 0 woraen = five = af hoe jona—2 fe 048. ‘Th nt integrals bounded foil by the indution hypothesis and (7) — {7 snl bounded foe al, sie Tat pt) ~ EIS! 42()._ We now know thst [pe < 0; ae! since 1) 2 O wo want to coed that i,m lt) ~ 0. But we need 9 exclude Uh psi tht) his eo spies a 0, whieh ease pO could be finite without 0) apronehing We show 0) hs to bes by Showing ts deiaive is bod G0 408 jdt ns bo bd 8 vo) [Seabee Sa an() —a(9)] — QR to set Reh = Great = Reh git or 7 = QT etn fet QURIQRIR-'— RR deems P= QR QIEITIONQ= HA beeasse Fy = FIT) FUQIT0}Q) — fue" FAT) = RI. hs GR dant at cpt nia (can be opal (95a Ween beep by Si where's wa gina mats ith agen c=). 7 ‘Sot Te cay sey by aint f= ST(S Fos we il ate ee Cac 8, thats har ba he Sa 7,7 ‘The Syme Eger ad SVD 2 Now 1= QQ tint = 40O = FOTO = OFT H(Q"O) ‘Thanos Q7Q ee sym nd 0 01 "Q) — QP ml FT) fie). Sine ae upper tng, Jose aed ty ss Bal Fd” tC). Now ano = gor ora Frog ra"nod = Sreohnoe 1 arroxaa?ve = rare" = “oii oa"@ =REMONT HORT) deed, "The nr clay expan the phnomonn bm In Quon 5, ware cou emote foun bean and ear ols sang Sevaho Quinn 825 Conouat 5. Sapo tht we alin T from the pte definite mati 1h by the fags 1. Dom sep ofthe eatilted QR agri on Ty ot 2. at Ty ~ ped 7" — 3.4, hee J eps he tty at wth ts amma i mers ore, Dom sep of ied QR on Ts a 9 bt tT Then Teo rf. 
1X — X71 ety fo vel that (UJ) ~ Jel 80 Ty) = STUN ae gn — sgn Senay tsancryls LenEC OTD) UEDA ECE)S Soe 21 = UPA TIED) IMEUED ES tam) = RUPE Tat) ‘Thin eri these oon a8 7-H sii lye we coun ‘ gn “ i gree ~ rol PEYE + PRED So with the same intial eoationsT3, (0) and T(—1) must be equal Ite geting forme m, T(—0 takes Ty — TJ bork to FJ, te il sete, 0 yaad and T= Td Ty a8 dese, 200 Applied Noserical Linear Algebra 5.5.2. The Connection to Partial Differential Equations This seton may be sipped on wis eg ae 10) = ~B2 + aot) and) = —A + Hat + Bae.) oth 71) sad} are tear opersors on funeons, Ve, generalizations of Substiuutng into Mf = BrP — 1 yields = 008. tens 629) provided that we choose the correct boundary conditions for q, (2 rust be ew symmetric and T symmetric) Equation (520) i called the Kortowey de Vries equation and deseibes water ow ine shallow canna Ove ean rigorously show that (529) preserves tho eigenlies of P(0) fr all in tha sets tht the ODE, (Bren) ae 069 ‘ug same infinite sot of eigenvalues 2), o,f all. In other woes, Ure [San insite sequence of ergy quantities conserves bythe Kovtewes de ‘Vile equation, This important for both theeetial and numerical reasons, For more detlls on the Tode fw, se [142, 168, 65,67, 237| and papers bby Kruskal [164], lasek [104], and Mosor [185] in 186 5.6. References and Other Topics for Chapter 5 ‘An exellent general merece for the symmetee igenproblem is [195]. The ‘material on relative perturbation theory can be found in [74 SI, 9h so ton 5.2.1 was based on the latter ofthese references, Related work i found 1 65,90, 295, 28) A classical text on pertcbation theory fe general nee operator [50]. For» survey of paral slgoetns foe the sje ogee problem, so [75 The QR algorithm foe nln the SVD of biingonal makes § escussed in [79,66 18), and tho dads algorithm i in [102,108,207]. Foe tn eror analysis ofthe Bisetion algorithm, see [72, 73, 154] and for reat viempts to aceeerae Biscetin see (103, 201, 19,174,171, 17, 26], Curent ‘orkin improving inverse ration appt in [108,199 201. The divide anger eigeoatine ws introduce in [38] and further developed in, 185, 129,131,170, 28, 252). The posiiity of high see eaeyeigenvales ob: tained trom Joc is discussed in 65,74, $1, 90, 18,226) The Ta ow and related plinomena ave dseuse int, 67, 16, 142.16, 168, 185,195, 2) ‘The SyetsieEgenrobem and SVD a 5.7. Questions for Chapter 5 Cunsion 5.1 (Bays 2 Ba) Stow dh A BACs Meni doy a a-c [2 9] ts eymmetic, Expres Une eigenvalues ad eigenvectors of in terms of those aA, Questios 5.2. (Median) Prove Corollary 5.1, using Wes!s tore (Theo- rem 51) sd part 4 of Theorem 83. Questios 5.3, (Medium) Consider Figure 5:1. Consider the corresponding contour plot for nn arbitrary SHby-8 matric A with elgonvalues as < 09 < 03 Tet Cy and Cy be the to great cirles lon which pA) — gb what ‘gle do they intersee? QuEstios 5.4, (Hand) Use the Courant Fischer minim theorem (Theorem 5.2} to prow the Cauchy interface theorems «Supa that A= [8 isan mtn symmetric mse ad His (n— 1pyn— 1). ae ay < o> < ay be the elgonsalins of A and Oya << 0 be the elgenalies of H. Show that tse ewo sas of ignvales intertace: Oy SOuct SSS 015051 S064 SSS 04 eet A=[ fh | be bpm and be mbm wth eens Bn M4. Show that the elgenlues of A and JP iaveoer ta the sense a inom $8 $4 (Oreilly 0S 8; fnm) © 5mm Questios 5.5. 
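As a closing illustration of Theorems 5.16 and 5.17, the following Matlab fragment integrates the Toda flow numerically and checks that the spectrum is conserved while the off-diagonal part decays. The sign convention chosen for B below only affects whether the limiting diagonal comes out in increasing or decreasing order, and ode45 with tight tolerances stands in for exact integration:

    % Integrate Tdot = B(T)*T - T*B(T) for a random symmetric tridiagonal T(0).
    n = 6;
    a = randn(n,1);  b = rand(n-1,1);
    T0 = diag(a) + diag(b,1) + diag(b,-1);
    B  = @(T) tril(T,-1) - triu(T,1);             % skew-symmetric matrix built from T
    rhs = @(T) B(T)*T - T*B(T);
    odefun = @(t, x) reshape(rhs(reshape(x, n, n)), [], 1);
    opts = odeset('RelTol', 1e-10, 'AbsTol', 1e-12);
    [~, xout] = ode45(odefun, [0 30], T0(:), opts);
    Tend = reshape(xout(end, :), n, n);
    Tend = (Tend + Tend')/2;                      % symmetrize away integration noise
    eig_drift   = norm(sort(eig(T0)) - sort(eig(Tend)))  % ~0: spectrum is conserved
    offdiag_nrm = norm(Tend - diag(diag(Tend)))           % small: T(t) approaches diagonal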
(Medium) Let A = AP with eigenvalues ay 21> > dge Let =H with eigenvalues By > + > Oye Met A+ HE hase eigenvalues Xy > 2 ye Use the Courant Fischer minimax theoren (Theorem 5:2) 0 sow Ua ay + OS Ay Say +8. HIT i positive dei, oaclade tht, > ay Ta other weeds, adding a symmetric positive defaite matte 1 to sate symmetric matrix 4 ean ony increase Is egenvalies, "This ele wil be ued in the pro! of Toren Questios 5.6, (Metin) Lot A = [An ll be myn, where Ay is n-bysm and ala is mbydn = mje Let oy > =" > oy be the singular values of A 212" tm be the sngula vals of A. Use che Cael interes theorem ftom Quetin 5A and part 4 of Theor 8 1 prove tt > 1) > osm ome Applied Noserical Linear Algebra Question 5.7. (Metiym) Let be unt veto and be any vector orthng- nal cog. Show that (q+ Ag” ~ Ty ~ lar dla. (This result suse! inthe root of Theorem 8.) Questiox 5.8. (Hrd) Formulate snd prove 8 theorem for singular vectors alogous to Theorem 5 QuesTION 5.9. (ant) Prove ound (5.6) from Theorem 3.5, QueSTION 5.10. (Marler) Prose ound (57) fren "aor 5.5 {Quesmox 5.11. (Easy Supposed ~ 0,0, where all thre angles le bewean (and £/2. Prov that $sin29 < $sin2Hy + }sin2s. This result i used in the proof of Theorem 57 Queso 5.12, (Hard) Prove Crary 5:2, Hints Use part of Pore 33. QUESTION 5.15. (Mem) Let A be a symmetric matte. Consider runing shifted QR iteration (Algorithm 45) with a Rayleigh quotient silt (4, ~ dyn) st every eration, sling a squad? 0),02.--of shits. Abo run Rayleigh ‘quotient iteration (Algorithm 5.1), starting wlth 29 ~[0-.,0,1/, yeding fs sequence of Rayleigh quotients p12... Show th these sequences are ‘enteal: 9, ~ py forall. This justifies the clam in setion 5.3.2 that shifted Qt itezation enjoys local enbie convergence. QUESTION 5.14. (Bany) Prove Len Quesmiox 5.15. (Easy) Prove that if tn) = 2410/2) + ex? + O(@), then Hin) = edn? This juste the complexity analysis ofthe dvide-and-conquer slgoeihon (Algor 52). Queso 5.16, (Bary) Lot A = D 4. pi where D diag.) and = [ute ses? Show thot if dy = diy oF a =O, ten di an eigenvalue of A. if ay — 0, show that the eigenstctor corresponding 0 this, the ‘th column ofthe identity matrix. Derive asinlary simple expression when 4 ~ dist. This shows how to hole dfinton inthe dividend -conquee lost, Algorithm 3.2. {Queso 5.17. (Fay) Let end v be given salut. Show how to capute scalars € and in the fueten defniion M(A) — &+ 3 50 that at \~ &, ‘(g)~ @, and '(¢) This result i noded to derive the secular equation solver in section 53.3, Queson 5.18. (Boxy: % Bas) Use the SVD to show that fA is an me ‘bys rel matrix with m > m, then here exists an mete matrix @ with ‘orthonormal eons (Q"Q — 1) al an n-tyon portion smidinit matrix P Sek that A — QP. This decomposition is allel te polar desomposition of A, ‘ornuse ts analogous othe pola form a «comple aubee =~ e Show that is oasingule, then the polar decomposition Is unique "The Symmetric Eigenproblem ane SYD 2 Quesrios 5.19. (Bay Prove Lemna 3.5. Quastios 5.20. (Bay) Prove Lerman 5.7. Quesrios 5.21, (Hard) Prowe Theorem 513, Also due the exponent An Bin Theorem 5.18 40"2n— 1. Hits In Ler 57, mpl Dy a divide Da lyr an sppoprisely chosen constant. Quesri0y 5.22. (Medium) Prove that Algorithm 5.18 campates the SVD of nso that GTC converges to a dingo ti. Quesmiow 5.28. (ander) Let A be an myn symmetie postive defiite inarix wih Cholesky decomposition a~ ZF, and let be the Cholesky f= tor computed in Boating pot artic. 
Th his quetin we wll bound the relative ertor in the Square) singular values of Za approsmatons of the egeaalues of A Show Uat can be weiten A — DAD, where D = dlag(al?.. a2) and ay = 1 forall Weite b= DX. Show that 2208) —x(4). Calng bound (2.16) for the bnckward eror Bf Coley 2AV84 = HLT, show that one can wee £7 B — YT ETEY, whore [VTY =f = ‘Ofe)aA). Use Theorem 56 10 concli that the egevalves of EE of LL difer relatively by at mort O(2JA. Then show that tis i also true of the eigenoluny of £7 and 1", ‘This means that the squares of the singular yale off dif atively from the eigenvalues of A by a most OtebeA} — OU Questios 5.24, (Harder) This question justifies the stopping criterion for ‘one-sided Jacob's method for the SVD (Algorithm 5.18). Let A= OG, Whore Gand A ate mty-n, Suppose thet al ez forall j =k. Let (te ©" 0) be tho singular males of Gand ad'= 0 (ere Sea operator nor ll such thal tle & KR) Mle depends on teh Rand 20 Applied Noserical Linear Algebra Proof To stow p(B) < fl or any operator norm, let ¢ be an eigenvector for As where {R) = |A| and 0 [Rl] = mexyo heel > Ae — et — |x, ‘Toconstvet an operator norm I uch hat | pt), let SERS = Fb in Jordan form. Lat De = digi y en. ye°-¥). hen (SDyIRISD,) = De“ID, Me Dy me i de 1m Jorden fm” wth «sao th eingpl. Now ine he wetor norm [el (SD) ogee the pert ran ie Le = ong Mele ML = 8 Ta ng tS 285 SD a SDA *MSDrac a = [50M P mgs = Aiie © “Two 6.1. ‘he Merton ts — Ri concept the ston of ‘As b fora tating etre a for fond oty FAB) © Proof. If {R) ® 1, eboese ay ~ #10 be an eigenvector of wit egeavave wee A) = a(R) Then (ems 2) = Rey 2) = 009 = RN xg 2) = Neg = 2) will not eppeeach 0. 19) <1, use Lemma 65 to ehoose an operator noe so ij. < land then apply Lema 6. to conelude thatthe method conver Dereanion 6.5. The eate of convergence of Zmrin = Rw 4 6 4 (0) donno) Iterative Methods for Linear Spates I (Fs the incre inthe number of correct decimal pcs in the soon be Heaton, see 109 ~ 2-09 [m3 ~ le 2 1) + Ole. The ‘sual p(R}, the higher isthe rate a eomwergenee, ie. ta greater i Ue Taumber of corret decal places computed per teeaon, ‘Our gal is wow to choose a spliting A = AM — KC so that both ()) Re = AK 2 and © = Mb are ens to rante, (2) p(t) sma. We wil noe to bale thee eoactng pls Por extmple, choosing M = 7 1s goo for goal (1) but may not make 72) < On the ott hand, cong AMA suit K 0 god for goal (2) bt peobubly lad for oe! (1). "The splittings for the methods discus inthis set all share the fa: lowing uoation. When A has ao zeros ons dagotal, we write Aap- -0=pu-£ . (6.9) where D i the diagonal of A, —E isthe strictly lower telangular part of A, DL~L,-0 eke strktly upper tiangular part of A, and DU = 0 6.5.1. Jacobi’s Method Jacobs metho ean be described as repeats looping through the equations, ‘hanging variable so that equation j ists exactly. Using the notation of ‘ston (6.19), he spitting for Jacob's method A — DUE}; we denote By = DML +0) — L410 and 2) = DM, so: can write one step of Snobs method 8 92 ~ Rus. Tse that ths Formula everespods co or fst eseipion of gobs meta, ane tht it implies Diy ~ (E+ Oy +B fyytmris ~~ Earp Oatinn Hb 0 tori + Daj antnd — bs AUGORITIOS 6.1, One sep of Jacobi’ method frj=Vton, endfor In the spocal esse ofthe mode problem, the implementation of Jacobs algorithin simplifies as follows. Woeking dvety fram equation (6.10) and Ieting tay denote the mth value of te solution wt gid pnt), Jacobs meth! becomes the flowing. (by Eka) AwGonITION 6.2. 
One step of Jacob's meted for two-dimensional Poisson's equation fork=10N for j= 108 trig = ming Htmatas + Basa + magn FERN 2s Applied Noserical Linear Algebra end for en for In other words, at cach sap the new value of 8 obtain by “averaging As reghbors with hifi, Note that all new vals tsn4y may’ Be eoraputed lndepesenty of one otbee. ded, Alain 6-3 can be implemented ia fo line of Matlab i he yj ag0 Stored in squace aray Vt ineludes fn extra fst ae ast rom of arose Hest and Tat column of zeros (sce Question 66) ‘he motivation for this method i that atthe jth step of the loop for Jaci’ rnethad, wo hve improv eles ofthe fist j~ L eomponents of the soliton, ser shold se them in te sm Gauss-Seidel Method Aucontmns 6.3. One stp ofthe Gouse- Seidel method frj=toon smug = ak 6 Sooasmine= 3 enim Secor rare nu for For the purpose of Inter analysis, we want to write ths algorithm in the frm Smt = Rostig + eas. To this end, ote vat can Bs be reritten as Yoewemsne=— So arma tbs (020) “Then ying th notation of equation (6.10), we ean rewrite equation (620) a8 (D=D)tqy3 = Ui + bor Sap = D= Drm + (D= 1 BU (ED = Rost + eas As with Jacobs method, ww corse how to implement the Gans See method fr our mal prblen. In piel iti ie similar exept tit Ive to rep sek of Which vaeiables ore ew (nar mt 1) aad which se old (abe). But departing a the ores in which we oop tough the grid points fj, we will ge diferent (and vad) Implementations ofthe Ierstve Methods for Linear Systems oy Gauss Seidel method, ‘Tis ule Jacoi's method, fn wich the order in snlsch re epdate the erates invert, For exible epdate ous few (before nay other tna), Un lls oeighboring sles are recess ‘old. But we update tg lsty the alts neighboring ales ae erly ‘aw, 0 wo get elillernt ale For by Ho ther are a tay pole Implementation of the Gauss Seidel method st here are ways to order 2 savas (namely, N%), Tit of ll thee orderings, only two are of intrest “Thos isthe oedering shown ia Figure 6 this fea ho natura odaring "The second onering sealed ret-lack ordering. It important because ‘our inst convergence routs ia sections 63.1 and 65.5 depend oI Toe ‘lia ratbleck ordering, consider the chestboard-Hke coloring of the grid of ‘unknowns below: the @® nodes eorrespoud to the black squares on a ebess- boned, and the @® nodes coerespond to the red squaces “The re blak ordering to order the rad rvs before the bck nodes Note that wd noes ar adjacent to ony back nodes. So if we update ll the ted nodes fist, Uh will ie only ol dats from the Dae nodes. The whe 1 update the black nodes, whieh are only adjcent to red nodes, ey wil use ‘nly new data from the ra node. Ths the algorithia became the folowing. AUGORETIN GAL One step ofthe Geass Seilel method on two-dimensional Poisson's eretion wth vi-ack ordering for eit nates i that ae red (B) y+ Bnst hy + tansnt Homa ENA thas end for for ali noes ij that ae Back (B) Nts = ns tint Uo nhs Pina Fata MPLA nd for 6.5.3. Successive Overrelaxation Wie refer to this method as SOR), where ws the relation parameter. “The molivation i lo improve the Gauge Seiel lop by laking at appropite as Applied Numerical Linear Alba righted average o he sty a ny SOR'S sm yj = (wns Herts “Helding tbe following slr. 
Auconsrin 6.5, SOR: fri-1t0n ss ny + by EEL et oof io may earns op for $= 110% Ek juts x, ain using the notation of equation (619), (Daley = (19) bina bob fas = Dawa) 4 allem 4 (Dah) = F-81220 peg Fal 8D = sone) %m 80H): (oa We distinguish thre ewes, dpending on the vas of = 1 ai leat to th Gauss-Seidel method, <1 allel wadereleation, ond > 1 ial orerelarmtion. A somewhat supetil motirton fr overeasation is that if the diction from Z 10 zp isa ood direction in which to move the solution, then moving > tems as fr tn that dierin B btir. Tn the next tio soetions, re wil show ont Lo pik the opis for the mode! protien. This optimality depends on using el-black olen Ausonsrin4 6.6. One step af SOR) on two-dimensional Poisson's amation ‘with redlach endering Jor all nodes ij that re red (B®) tatty = C= 2)mayt inde Fs ison tts 2 FAN nt fr Feat nates toe Back @) Botnas tlm ademagt teeth ttn Het asa ag tA on fr Iterative Methods for Linear Spates oo 6.5.4. Convergence of Jacobi's, Gauss-Seidel, and SOR(.) Methods on the Model Problem Ieisensy to compute how fst Jacobs method converges on the model problem, since the corresponding spitng is Try ~ AP — (Ml ~ Ty), sa 80 Ry (4A ~ Torx) = Trev /Ae This the eigenvalys of Fy aro 1 yy ‘where the Ay ee the eigenvalues of Ty 3-4-2 ong ton gt) ply) the largest of 1 —A, mamy, ds 2 AAR) =I Saal =~ Av = 08 aR WaT Note that as N grows and T becomes ine ilkcolitioned, the spetral radius p(y) approaches 1. Since the error is ulti by the spectral radius at eae sep, convergence sows down. To estate the sped of convecsence ‘aoe precisely, cus eampute the aumbee m of Jacobi iets raul to ecrease the ervor by €-! exp}. Then m must satis (8 ))" — ey (= apap) 64 or m= 28M — Quy) — On). Ths the number of iterations proportional othe numberof unknowns. Since one step of eobi ‘emis O(1) to uprate each soation component oF O(n) to update al of them, ieeosts O() to derease the errr by €~ (or by ny constant factor less than 1). This expins the entry for Jacobs meio’ in Table 6.1 This sa common phenomenon: the mor il-conditoned the orignal prob- lem, the moe slowly mon erative methods converge. ‘There are important cerceptions, seh as mukigrid id domain desompostion, whieh we discs later Ti the next setion we will show, prod that the waeiables in Poisn's ‘equation are updated in rd: black onde (Se Algorth 6.1 and Cola 61), tat p( Pigs) a(R)® — eas? yey. In other words, ope Gauss Sel step decreases the error as much as two Jacobi steps, This I general ple nomenon for maisees arising from approximating diferent equations with certain finite diference eppreximations. This lsoexplsins the entry or the ‘Gauss-Seidel method In Table G1; snes it only ten as fast as Ineo, I sill bas the same complet in the O() sense For the same ed-back update order (see Algorithm 6.6 and Theorem 67), wo wl also sow tht for the relaxation pacameter |< «= 2/(L4sin yy) <2 can Train git Tg PARsomn) = ‘This tn contest to {R) = 1— Of) fo Ry att Rs. Tiss the opin vale oF a ty Hines Reon. With ths ei of , SOR) is 28 Applied Noserical Linear Algebra spprosimately ines faster than Jacob's oF the Gatss-Seidel method, since IT'SOR(.) take j step to decrease the err as much a sep of Jacobs oe the Ganss Seidel meth, then (1~ a) ~ (1— $Y plying t= r= 1 a= -.N. This lowers the comply of SOR) fom O(u!) vo Oln8), at shown in Table 6 Tm the nest section we wil show generally for certain Anite difleence mo Acces how to choo 10 mininize sont) 6.5.5. Detailed Convergence Criteria for Jacot Gauss-Seidel, and SOR(.) 
Methods ‘Wo will gi soqveno of eondtions that guarantee the converge of these methods. The frat erterion i siple to apply but is at always eppbiable in Dieticlae not to the mel problem. ‘Tha we gve several more complies Criteria, which ples stroager conditions on the matrix A but In return give ‘more information about convergence. These moze complicated extra are tailored to Bt the matroas arising fom dlseetzing eatin Kinds of patil Aiferentat equations suchas Poisson's equation ese ig summary ofthe resus of this seetion 1. I ie steely row diagonally dominant (Deiition 6.6), thon Jacob's tnd the Gains Seidel methods hath converge, snd the Gauss Seidel ‘method i faster (Theorem 62). Strit row dingonal domince means tho each diagonal entry of 4s incr in magaitne than the sumo te rmagnitdes ofthe or entries in is row. 2, Sina our model problem is nt sretly row digonslly dominant, the last remit does nt apps: So we ak fora wer form of dagonl dom inane (Definition 6-1) but impos a onditon called roti on the patten of noaero entries of A (Dasiation 67) co prove couse of dcobi's and the Gauss-Seidel methods. ‘The Gains Seidel method tain converges faster than Jacobs method (Theoern 63). Ths result ‘ples to the model problem 5. Turning to SOR(.), weshow that <0 <2 is noessary for converge (Theorem 6.2) 104 also postive diate (Like the model proton), D cur 2 i alo sullen for convergence (Theorem 63). To quantitatively compare Jacob's, Gauss Seidel, ad SOR(w) methods, sxe mae oe more ssnpion about the pater of nxieo entries of This property called Property A (Definition 6:12) and is equivalent to saying that the graph ofthe muir i Sigartte. Property A eseatially sys that we can update the variables using red-black ordering. Given Property A there Is simple sgebrae form relating the clgeavaloes of Ry Ras, abd Reonay (Theorem G4, which Jats us compare tele Iterative Methods for Linear Spates 2 rates of convergence. This forma ko lets v8 eps the optimal 2 that mikes SOR(a) converge a font as posible (Thaoren 67). Durxrri0x 6.6. A is steely row diagonally dominant if eal > 3)-,lesl forall ‘Tueonem 6.2. If A is strictly mw diagonally dominent, Jacob's and the Gauss-Seidel methods bth converge. Infact Rae Rix <1 "The inequality [Rx < Rll imple tat one sep ofthe worst prob. lem forthe Gauss Side! method converges at lest as fasts one te oF the worst problem for Jacob's method, It does not guarantee that for aay parteular sr — D, the Gauss Seidel method will be faster than Jacobis Ietbod Inco mettod could “aeckentally” have @ stale vor at sme stop. Proof. Ags using the notation of qustion (6.19), we wete Ry = LU and R (1 27°U. 
We want to prove Mecsllas = HRaslel < I1Rslelac = Hala, (ox) where € = [IT isthe wetor ofall aves Fqlity (6.22) wil be re i ‘an prow the seoayerenmpontatwise ineity [E-BW]-e= [Rosle Ss [Ral-e= (E+ UD-e (628) lace [EB W |e = EB IUL-e by the triangle equality [S-24)-wwi-e since a =0 < Elut-wie by te wile ecg CALI le sine LIP =0, inequality (6:28) will be true if ean prove the even stronger componentwise ineuality (=| WU) -e s(n + 1UD-e om ‘Since all entries of (I ~|L)}-* = S275} [Lf are nonnegative, inequality (6.24) wil be true i we can pve U|-e< (|B) GE) + W)-e= CE] + UI = ne) OS (E|— [EF — [2 -U)-e=E1-C- IE] -[Up-e (625) ass Applied Noserical Linear Algebra Sine al ents of |) ace nonnative inqualty (6.25) wll be tre if we ean rove OSU -[H—Wh-e o |ul-e= (+ \Uese (620) Finally, inequity (6.26) sre because by assumption jR-llx ~ URlls ‘Aa sinkgous resul hoks when Ais stil column dlagonlly de (le, Al is stretly row diagonally doasnsot) “The reader ray enslyeonirm that this simple ertrion does ot apy 0 the model problem, so we ned to weaken the assumption of tit dlagonl dominance. Doing So reuies looking at tho graph propertis of « mates Desisriox 6.7. A és en reducible matrix i there ts no permutation mats Prout tat ae pape [4a We connec in dtinton to graph ear 9 los Desintion 6.8. A rete graph «finite callertion of rds connected by a frit colection of directed eas, arr fom oe node to nother. A path in dvwtedgruh isa sguence of nodes 21. with an edge from ‘ach ny toes A ol ego 18 an edge fom a node to self. Derininion 6,9. ‘Thectieted geaph ofA, CLA), 6a graph wth nodes, 2,.004m ‘nd on edge from node ito de j if and nly if ay — Exanrus 6.1. The mates tus the directed grap OG 2 3 5 DeMeI0¥ 6.10. A directed graph i elle strongly connected if there exists ‘pth from every node fo every node 3. strony connected eammponent of 1 dct grph i «subgraph (0 subse! of the nodes with all jes connecting ‘hem) which s stramgly cmneoted and ennot be made lager yt sl be trmgly onmecie. Iterative Methods for Linear Spates ~o BxaMPue 6.2. ‘Phe graph in Example 6.1 strongly connected. 6 BxAMPLE 6.8. Lat whl asthe eed rk Las “This raph sot sty conecesnce theres 0 path oe fram anyubere le. Nodes 4, 5 and 6 frm a strongly connerted ecraponent, sce there i 8 pth rom any one of tn to ay ee. Exaurue 6.4. The graph of the model problem I strongly connected, The sroph is sentially ‘except that cach ego ln the ged rpresnis two edges (one In cach rection), fad the slf edges are not shown. ¢ [LEMMA 6.6, A is iodo if and ony if A) is strongly connerted Pr, 64 [8 42 eels hrs cy 0 yo 9 fm th nes eosin ose atk othe cms ermponting Yo Ay {la} not sory eaeced Sirti Cc wot sony cmd rember sd aun) st i nodes arr se Come enact come fim hen theme PAP wl be Dok pet Taner DBxAMPLE 6.5, ‘The motri A in Beample 6s edible, Dasstri0N 6.11. A is weakly row dagonally dominant if Jor al la > Sie sll with sit inept a east ance ‘Tuvonen 6.8. IfA is trencble and weally rw diogonlly dominant, then eth Saco’ and Cows Sede methods omserye, and (Ras) <9) & 1 Fr proof ofthis Uorem, se (247 20 Applied Noserical Linear Algebra Ean 6.6. The model problem is wekly diagonally dominat and ise dveile but not stroagly diagonally dominant. 
(The dgoasl = end the ‘ffagonal sums are ether 2, 8, oF) So Jacobs and Gauss-Selel mets onvergeon the model probe © Despite the above results showing that unde certain eonitions the Gauss Sciel method is faster than Jacob's method, no such general result hol This ie Beene ther are nonsyimcrie mations or which Jacobs metho cnverge ad the Gs Side nett diver, 8 wel ratios foe which the Gans Seidel method converges sd Jacobs method diverges 27 ‘Now we consider the coergente of SOR) [247]. Recalls dio: Reon) = (I~ ty M(1— alt +00. Tunowst 6.4. plRgom)® [=e Therefore << 2 i epi for convergence. Prof. Write the characteristic poems! of Rsony # (3) — de — Reon) Sell oEVAL~ Reppin) eds aML at) 0 ‘tar 20) = £]] MRsomuy) = + det((w —1)1) = He — 1)", implying max, [Agony] bo He "Tunonest 6.5. If 6 somatic postive definite, then ARs) <1 for all 0-20 <2, 50 SOMA) converges for all # Mahl +e) S41 + mv) e> 0 Here is table of vals of Te). Note how fst it grows as m grows, ven when ¢ Is tiny (se Figure 66) wot Tz 44 69-108 38-108 94 ope | 69-10 18-10! 13-108 ‘polynomial with the propetis wo want p(s) = TC /0)/Tal To seo why, noe that p(1) = 1 and that if © [ppl then (putz) 1/T 1/0) For example p= 1/(1-+ 2 the |S H/T). AS WE Ive jos see, his bound i ny for sal al todest "To impleaweat this ebesply, we use the thece-term eeeurrence T(2) 2xTy-y(2) ~ Ta) se 0 define Chis polynoaials. ‘This means tat "we need oy to sve and combine thse Wet Py Bt Atl By ot all the previous 2y- To seo how ths Works, lt ly = )Tm(1/0), 20 ul) = pT /9) ed 5 = Dy the ureter reareece In Definition 6:15. Then to # = dul 9 ~ 2} by equation (6:28) = mata() 2 rb (Biman (B)n-o by Deflation 615, ons Applied Noserical Linear Algebra Fig. 6. Graph of fn) erm 2. The dtd fn indicate that nf for bist - mG 2h Fw "Thi sels the algorithm. Auconrrint 6.7. Chebyshev aceration oft.) = Rey by the deiition ofp Iterative Methods for Linear Spates fo= Nios: = Rav te Sv in =U abs ~ zs) Vn = REPU — ESO + TE endfor [Note that cach iteration takes just one appieation of, x0 tise signi icantly more expensive than the other sear and yetor operations this alo- ‘thr sno more expensive per step than the orginal eration yor — Rr Unfortanately we eae apply ths diet to SOR) fr sling Arb, case yoni snerally has complex eigenvalues, and) Chebyse acceler ‘on uiesthat have eal eigevales i the interval [=p]. But we ean fe this by sig the flowing algorithm Auconrmins 6.8. S808: 1, Tate one step of SOR(x) computing the components of inthe usual ‘We will expres this algorithm af 2442 = Hits +e and show that Bi hs real eigenvaes, 0 wo ean use Chebyshev aceeration, Suppose Ais synmctrie a8 In the moet problem ard again write A D=LaU~ DT ~L-U) asin equation (618). Sinee A= APU = LP. Use ‘ston (621) to rewrite the to sop of SSOR ws Tefyy = Eady NGM 40a, Heya 2 20) 1G a) Hates He, Phiminoing yells 93 ~ Ba +6 where Le bey Wary be B= Usly = FBR ad wb)! + (2) wd Hw 2M = wt) eb) at Wo lam vat By has rl eigenals,sncr thas the sae eggs athe salar mts eth wy? <1 wFU—a Hw = 2Kt why" = 1 0-vF Us Hw = 2K wy, Fad) 4 2) ay Which Is lel symmetle sa So must ave eal genau 200 Applied Noserical Linear Algebra ExaurLe 6.12. Let us apply SSOR(a) with Chebyshev secleration to the ‘model problem, We need to both chose and estimate the spectral ead p ‘(E). The optimal that minimizes ps not Known but Young 265, 185] he Shown that che eokeo 2 Trea god one, yelling gE) = = “Fr. With Chesley aceleration the eros multiple By Be = pers $ 2/(1-4 my) ae stop m. 
Theorefs vo decreas te eor byw Fe fntor < 1 requis m = OLN") = On) iterations. Since each iteratign has the sane cost as an iteration of SOM), O(n), the oneal ast Ofn/) This esplins the entry for SSOR with Chebyshev eecceration in Table 61 Incontrast, after m steps of SORjiy), the error would decrease only by (1 3)", Por example, consider NV — 1000, ‘Then SOR) ries 200 erations to cut the ero inal, whereas SSOR(aie) with Chebysew seceleraton ries only m= 17 iterations. 2 6.6. Krylov Subspace Methods “These methods are ma both to sole Ax — 6 andl to find cigenlues of A They atsune that A is aosesible aly vin 9 black-box” subrotine that ro turns yy A given any = (and perhaps y — AP: iA i nonsymmetsie). In tales Words, no diner sone of manipulation of mate entees we This 5 renonable assumption for several reson. Fis, the ehisspestontsvil ‘operation chat one ean perform on a (Sparse) mates i to molly it by a ‘eta fA ls m nonzero estees, matex-vezor multiplication ests mm Upllatons and (at tnst) m adkltons. Second, A may not be reprcented explicitly a¢« matrix but maybe avalabeonly a4 subroutine or eompating oe EXAMPLE 6.18. Suppose that we have & physical device whose bebosoe is ‘modeled by a program, which takes a veetor # of input parameters and poe ‘oes weet y of output parsers describing the devie’s bese. ‘Th fontpot y may be an arbitrary complizated function y — f(z), peeps re ‘quiing the solution of nonlinear dilereatal equations. Por example, 2 eoilé be parameters deserting the shape of «wing and f(a) could be the deag 00 the wing, computed by solving the Navie Stokes equations for the sitlow ‘over the wing, A common eagineering design problen I to pick the Input ‘o opsimiz the deviee behavior f(2), where for concreteness we assume that this means making f(2} ws small as posible. Our problem is then to ty to seive f(z) ~0 as realy as we ean. Assume fr illustration that ae y aca ‘wetors of eu dimension. ‘Then Newton's method tan obvious exe, ying he iteration 2119 — ato) — (Uyiat))-¥f a), where V/la") BS the Jacobian of fat 20). We can rewrite cis a solving the Haar spsera CG AlatN) 811 al fe ga then computing 219 — 95, Iterative Methods for Linear Spates soi Bt ame eo we solve this eae system with eefcent mate Wy) when cempuring f(a) i ales complicated? Te curas out that we ean compute the materi product (V/(a))~= for an abiteary vector = thal we ea tse Krylov subspace methods to solve the near system One way (0 co pute (W/(z))-2 6 with dda diferences oe by vsing Taylor expansion to see that [f(x + hs) ~ f(e|/h = (Vfla)) 2. Thus, computing (Vft2) «= roqulrs to calls to the subroutine that computes J), once with argument {Pand once with 4 + hz. However, sometimes Is eileult to choose 0 {tan accurate spproximatin ofthe derivative (eoosing Yoo small esl In loss of seearacy duo to roundel). Another way to compute (W(2)) = isto ctu liferntiate tho funetion J. I fis simple enough, this can be one by hard. For complated f, compiler tols can take a (oem) arbitrary subroutine for computing f(z) ad autamatieally produce another subroutine for computing (Vfl2)) = [9 This ean also be dane by wing the operator ‘overlong fits of C++ oF Fortean 9, along thi sss eicent. 2 A sity of ferent Keyor subspace methods exist. Some aro suitable for ronsymineri marin, and eters sso smmety or poitive deinen Some methods for nonsymmetrie matriow sssine tht As an be compte ‘av wel As; depending on how A is represented, A= my oF ry 08 be ‘salable (se Example 613). 
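For the model problem this black-box assumption is particularly easy to satisfy. The following is a minimal Matlab sketch (our own illustration, not code from the text or the class homepage) of such a subroutine, applying T_{NxN} to a vector stored as an N-by-N array of grid values; the function name model_matvec is made up for the example.

   function AV = model_matvec(V)
     % Apply the n-by-n model-problem matrix T_{NxN}, n = N^2, to a vector
     % stored as the N-by-N array V of grid values.  The zero boundary
     % values are implicit and supplied by padding.
     [N, ~] = size(V);
     Vp = zeros(N+2, N+2);              % embed V in a grid of zero boundary values
     Vp(2:N+1, 2:N+1) = V;
     AV = 4*V - Vp(1:N, 2:N+1) - Vp(3:N+2, 2:N+1) ...
              - Vp(2:N+1, 1:N) - Vp(2:N+1, 3:N+2);
   end

Since each entry of the result combines a grid point with its four nearest neighbors, the cost is O(n) flops for n = N^2 unknowns, and any of the Krylov subspace methods below can be driven entirely by such a routine.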
The most eat and best understood met od, the eonjugate gradient method (CG), s suitable only fe symmeti postive efsite rates, including the model problem. We wil concenteate on CC in this ehaptr ‘Given matric that isnot symmetric postive dale, i canbe dicult to pick the best method from the many avelabe. In section 6.6 we will ‘ve ashore summary of the other methods slab, besides CG, alongwith Indice on which method to use in which situation, Wi alo reer the reader 1 the moce eomprebensive online help at NETLIB templates, which ies @ book 21) end implementations in Matlab, Fortran, and C+, Fora survey of current research in Kryloy subspace methods, se (15, 105, 134,212, Tn Chapter 7, we wil nko deus Kryio suibpace methods foe Sing gens 6.6.1. Extracting Information about A via Matrix-Vector Multipi- cation ‘Given vector band a subroutine for computing A, what can we deuce bout A? The most obvious thing that we can do is compute tbe sequence of Inatrbvector produets yn =, ya = Aya — As — Ay Ya — Ae Amy, whore A nbn Let R= [nyse Then we ean write ASI [Ascot = ssa Aan (6.29) Note that the king n — 1 columns of A are the same the ling ‘a1 columns of , sited lft by ene, Assume forthe moment tht Ci me Applied Noserical Linear Algebra oasinguae so we ean enmpute ¢= —K™LANy. Then AK = K feyesyoostay “el = Cy where es the th cola ofthe Ketity mati, oF bo 0 a 10 0g on wtaw c=] 2 5 0 ‘Note that CIs uppor Hessenberg. a fet, Its companion madris sc se tion 4.52), which means that its characteristic polynomial i p(x) — 2" | Sty! Thus, just by matsacvector multiplestion, we have reduced A ‘oa vor simple frm, nn in prep we eauld now find the eigenvalues of A by dng the roms ofp), over, thissmple form sno sel in practi, for the following reson 1, Finding ¢ requires m — 1 matri-wector multiplications by A and then solving w Hinar system with K Bem if is sparse, is likaly to be dense, thre no reason to expect solving ae sytem with K will bere eer thn solving the orginal problem Ar —b 2. K fs bly to be very il-oonditione, so ¢ would be very insearately omputed. hiss beease the algorithn poring tu powee meio’ (Algorithm £1 to get the ealumns y of K, 9 thay converging to fn egenvector corresponding tothe largest eigenvalie of A. This, the ‘columns of tend 0 get more and more parallel, Wo will overcome these problems as flows: We will epee K with an orthogonal matrix Q such that for all k, te feeding k columns of Band Q spon the same the Same spec. ‘Tis space is called a Kron subsyuce. Th contrast to, @ is well conditioned and easy to nveet. Futhormoe, we will compute only as many lading columns of @ as needed to pee an seurate solution (lor ar — bo Ar = Az). In practice we usualy need very few alums compared t the matrix dimension We proceed by writing KC = QM, the QR decomposition of K. ‘Phen FMA = UIQ) AQR) = C, implying @?aQ = ReR” Iterative Methods for Linear Spates sas Since Fe an R- are both upper tsiagular and Ci upper Hessenbers, ti fess Locontin that HRC is also upper Hessenberg (see Question 6.1) Tn thee wards, we have reduc A to upper Hester fem by an ethogoasl teansfrmation Q. (This the fst sep ofthe alge for nding eigen ts of norsyminetie matrices discussed In section 4.46.) 
Note tht i symmetie, so is QTAQ —H, and symmetric matte which Is upper Hes: snherg mist also be lower Hessenberg, ., tridigonal In this ease we write Qlag-T, Wo still nocd 10 sow how to compute the columns of @ one at « time, rather than all of thems Let Q = [arn-al- Stee Q?AQ =H imple AQ QU, we can equate clini J on both sides of AQ QI, skin Any = Yo hast ‘Slane the ate orthonormal, we can multiply both sides ofthis lst equality byl wo get Ags = Yo bisa = bis for 1m Sj and : pynsors Ay ~ Soha "This josie the flowing algorithm. ALGORITIO 6.9, The Arnold algorithm for (patil) reduction to Hessenberg forme = bylbla 7 kis the number of columns of Q and Ht to compute */ for j= 10k Ay forim tis tag = ale Pe ht end for ing = bole ihyvas — 0, quit Gyn shsas end for so Applied Noserical Linear Algebra ‘The g computed by Arnold's algsih are often elle Arwlisvetors ‘The loop overt updating = ean be also be described as applying the modifi Gram Selait algrhm (Algorthn 8.1) to subtract the compooeats In the dliwtions through gj avy franz kaving = ortogoaal to them. Computing 1 through 9g costs K matri-sector multiplications by A, plus O(n) oxbne Work: If vo stop the algorithm her, whot have we kaeotd sbout AT Let us rite @= (Qa, Qul where Qk = fns>s v4 8t Qu = [tepid Ne that 1 have computed only Qy and ey tho oer eolumns of, azo unknown, Then QfAQ QEAQ, = FAQ= [ean Qu)~ | SBE BEBE konok (th Haw nC) 0 [Note that Hy I upper Hessonborg,berause Hf has the same property. Foe ‘the same reason, Hay bas a single (peesibly) nonzero entry in its upper ret corner, namely, heyix. This, Hy si Hg aco unkoown: we know only Hy nl Hew ‘When A is symmetric, —Pfsyrmetic od tridiagonal, andthe Amol algorithm simpies considerably, bare most of the fy ae zero: Write a & ae) be Bact 8 aystng colamn j on both sides of AQ QP yikes Ay Beatin 40505 + Byun Sine the columns of @ are orthonormal, mplsng bah sides this ation by gy vllds qlgy ~ a. This josie the following wrion of the Arnold get, called the Lanesas aly. Avcontms 6:10, ‘The Lanczos algorithm fr (partial eduction to symmetric tridiagonal form. = 0/1. =D, w= 0 Farka =I Iterative Methods for Linear Spates ws 8, =0, git = 2/85 end for ‘The compel bythe Lancs alg arte call Hanns wet. Mer step Lars, bere sw we he ee sont T= QTAQ~ [Qu Qt" AQ Ql faq, afAa, ataa, ataa. kommt i neta net, 7% *] ox Because As syzmmeteic, we know Ty aad Toy = Te but not Ty. Toy bs 0 Single (posi) nonzero ent nis upper right cor, namely, Oy. Nove that ‘Ts oonegatlve, becuse Ils computed as the norm af. ‘We delve some standard notation associated withthe partial factorization of A computed by the Armond Lanes algorithms DurINtTi0N 6.16. The Keyl subspace Kel 4,6} span), Ab, A.A") We wil write Ky instead of Ki(A,0) i A apd b are impli from the contest. Provided that tie algorithm does not ait beease + — 0 the yetors Qt ‘computed by the Arno ce hanes algorithms frm an erthonoral brs of the Krylov subspace Ki. 
(One can show that Ky has dimension kif and caly ithe Arsold or Lancats aigortun ean compe qe without quicting Hest see {Question 6.12) We also ell Hy (or Th} the projection of A onto the Krylov sbpace ‘Our goal eto design algorithms to solve Ar ~b using only the information computed ty steps ofthe Arnold oe Lanczos algorithan We hope that fea te much smaller thn, so the algorithms are fle, (ln Chapter 7 we wil use this same information for Bnd eigenvals of A Wo can already sketch how wo will do this: Note that i sie appens to be see, then If (or 1) is block uppe tvinngulae ands all the eigenvalues of Hy fare also eigenvalues of Hand therefore also ofA, sneo A and Hae snr ‘The (right egemeetors of Hy a eigenvectors of, a we mulkiply them arp, wo ar eigenwetors of A. When fy is nonzero bat sal, we expert the egemalies and eigenvectors of Hy to provide good approximations tothe ‘égenvalnes and eigensetors of A.) ‘We faith this iateduction by noting that oundol eroe cats num ter ofthe algrithats that we discuss to behave entirely diferent rm bow they would in exact arithmesie. In partieula, the wets g computed by 306 Applied Noserical Linear Algebra the Laneaos algorithm ean quickly Inte orthogonality ad in fat often bee come lowrly dependent. This sppaently disastrous nurmerial stably ld restorers to abandon thee algoiths for several years afer thle disov~ fy. But eventually researchers learned elther how to stabilize the algrithins for that converge occured despite Instablity! We return to tbese pants In seton 66.4, wheee we analyze the convergence of the conjugate gredant method for solving. Az ~b which s “unstable” but converges anyway), ane in Chapter 7, espcinly in setions 7.4 and 7.3, where we show howto compte ‘geval (and the base algorithm is modified to ensure stability) 6 ow do we solve Az b ven only the oration available from sepa of titer the Aral or the Lanczos algo? Since the only vectors we know are the eolrans of Qt, the ony place to “look” for an approximate sluton isin the Krylov subspice Ky spanned by these wetors, In other words, we sce the “best” approximate solution ofthe form Solving Ax = b Using the Krylov Subspace Ki ae Sam=Qies) how [Now we have to define “best” There aro several natural but liferent ofnitions, leading to drone algorithms. We lt x AB denote the true solution ad ry = B= Ary denote the residual 1, Tho “best” 2 minimizes I — ara. Unfortunately, we do not have ‘enough information in oar Krylov subspace to compute this 2 The “best” 24 minimizes fr This is implementable, and tho core spending algorithns re called MINRES (Cor mse residual) when ‘As gymimesre [192] and GMRES (lor geuertced minimum residual) ‘hon As noasymmetse [213 8. The “best” se makes re LX, Le, Qfre = 0. This is sometimes called the orthogonal residual proper, or 8 Galerkin contin, by enlogy to similar condition inte theory offsite elements. When is symmet Fi, the eormsponing algorithm is called SYMMLQ [92], When Ais onsymmeti, aration of GMRES works [20 4, When A is symmetric and positive defi, i defies aoe I... (77a) (50 Lemma 1.8). We sy the “best” 24 minnie Ula» ‘This noe isthe same a [sea 4- The agosto alld th conjugate sadlent algorithm [4 When A is symmesri peitive deta, the ast to ditions of “bes bo tue out to be gullet Iterative Methods for Linear Spates sw ‘Tuponsn 6.8. Let A be symmetric, Te = QTAQL, nd re =0— Aris where Ky. ITs nonsingolar andy — Qu ela, where ef" = [1,0,...0)7, then Qh 0. 
I-A aloo paste deft hen T, mst be nonsingular, nd this choice of also minimizes rg» ote ally € Ky. We aso tae that ellie Proof. We deop the snbsespts& fr aie of notation. Let # = Qe [Bla and r= b= Ar, and asoume that T= Q"AQ Is noasingular. We conten hat QI =O by computing Ora" - As) = Q-QhAr = calla -Q*AQT-eyl0ia) cass the fst cluran of Q is /[ble ‘nd its other clurns ate orthogonal to & calla ~(Q"AQMT™ebla = eslbla (TT ely because Q7AQ = 7 0 Now assume that A ls ako postive dai. ‘Then 7’ must be postive efnte and thus nonsingular too (sce Question 6.1). Let #— 3 1 Qe be ‘other canddate solution ia Ky and lee # = b~ Ab We nowd vo show thot Illa is minimized when 20: But Ws = #4-% by detnition = (Ags) AM 492) since # =b— Ab = Ale Q: SAME —2.AQ)" An 4 (AQ) [rl —2e7QPr 4 |AQeH% sloeo (4Q2)"ar = oT Qh Ady = = Urls + Wael, since QP = 0, 0 [ll + Is minnized if and only AGE = 0. But AQ= =O I and nly i P= Singo A is nonsingular and @ has fll column rank. ‘To show that re = +[rallage 1, we reintroduce Subscripts. Since 24 ¢ Ke, west havere = Any © Ka, sore isa linen combination of he clams 0f Qc sie these olumns span K+ But since QEra = 0, the only eau Of Qry1 to which is rot erthogon ausn- AQ: ) Ney 6.6.3. Conjugate Gradient Method "The algoithm of eae for symmetric postive dette mates ig CG. Theo: reat 6 characteris the solution computed by CG. While MINKE might sera more natural than CG because enaimizes laste of Pula it sos Applied Noserical Linear Algebra ‘rns out that MINRES requires more work to implement, fs mare suscepti: ble to numerical instabldles, and thus often produces ks accurate ane than CG. We will ee that CG has the pareuarly atiesctive property that canbe implmented by Keping only four wetor> in memory at one tie, fad not & (9) through 9.) Furchernore, the workin the inne loop, beyond the matrix-vector product, Ie Lmlted to two dot products, three “Saxp9" 0p- cations (edding a tultipe of one yetoe fo another, and « handfl of scalar ‘operations, This vey stall anount of work and Stora. Now we drive CG, There aro svecal ways to do this. We wll tart with the Lance algthm (Algprithm 6.10), which computes the eons ofthe orthogonal matrix Qy sd the entries ofthe trig matrix Ta, slong with the formule 2, — Que [bla from Theorem 6S.” We will show how to camnpute 24 diety via teurtenees for thre set of vectors. We wl keep only the tot recat yetor from each sen memory stone ime, overweting the dom. The fest set of vets ate the appeawinste solutions 2. The second Sex of weetors are the residuals rq ~~ Ary, whieh Theorem 63 showed were paral othe Lanezos wets ey). The thin tof vezors are the conjugate gradients pa Tho pare called gradients because a singe step of CG. can be Interpreted choosing a seslar v0 that the new soliton 4 — 24-1 1 7k ‘minimizes the residval norm [rala-1 = (of AW*re), In other words he fre ase as gradient search dections. The pe ae ealed conjugate oF more pbroiely A-oongate, beeaune pf Ay Wit} —B. In other words the Py ate ‘orthogonal with respect tothe inner prsict defied by A (Gee Lerma 1.8) Since A is symmctie positive defite, 50 Ty — QLAQ, (x Ques toa 6.18). This means we ean perioem Cesky’ on 75 to get Ty ~ EAL LxDAlf, whee Li unit lower bidiagonal and Dy sdagons. The using ‘he formula for fn Theorem 63, wget m= Qufzeilble Qube; 41a = QeteT HDG; ea ti), whew A= QulZT an = Deeb. Wate A fnnfil- The Conjveate gradients yw turn ot be pall othe ols jy of. We nom cough to prove te floras lena Leanna 6.8. The columns fof Pave Acconiugate. 
In other words, PE AP, ‘is diagonal Proof. We compute PEAPL = (ely AQuEGT) = Wy NQEAQH LG = Lye” = GNUADLD GT = Dy Iterative Methods for Linear Spates wo Now we derive simple recurrences for he eolumns of Pk and enti of se We wll show that phon = i---stu-i[” ieatia! to the leading 1 fries of = near aad PL deat tothe ang #1 folunins of F. Therwote wean et oe ee tno ea Gay Sauna haat act os eee iae thes or oy TaDLE 1 a 1 i _ fa 4 thas bad oy aad [tt sme [tT we EB = 81] bs dines k=. Sic, Dy ad te also the leading (k~L}+by-(Kt— 1) submatrices of Dy! = ding Dj!) de?) and 1 [4], respectively, wire the detail f the last rn do nt concern. "his mes lla whore & has dimension 1 is ential t the Pelle 1-['] “Now we nood a recurrence for the columns of Fi = [pty.-.,Pxl- Sineo LZ, scm he gone) | Joa 10 Applied Noserical Linear Algebra sulbatex of £7. Therefore Pha i Metical to the leading —1 calc of A= aut? = (aeral | % Bt 2] = (eu utz ial = (Asi From P, = Qutg" we get ALLE = Qe oF, equating the kth cluran on both ide, the recurenee| Pe= thee (6.33) Altogether, me howe reunions for y (hom the Lanes agri), foe (Grom eaquation (6:3), a for the approximate soto 24 (from cai: to (62). All hes recuesions ae sere they regi ony te previous erate or two to implement. ‘Thus, they together provide the means to com pte 2, while storing «stall umber of vetors and doing «small uber of doe products, saxpys, and sealer work in the inner lop. ‘Wo sill have to simplify thew roursons slighty to got the ultimate CC lgosthm. Since Thorn 6 tel bs Ut 4 ald yy ae parle, we an replace the Lanczos weurence for qx with the reeurenee ry — b— Az, for quently ry ~ ri-y ~ fu (goten fom multiplying the recureence Be — 24-1 4 me by A ad subkesting from 6 — 8). This ylelds the the P= ret mAd (6s) Ze = m4 tmp from equation (082), (ox) Pe = G~beafaes from equation (6.8). (2) In order o eliminate guy substitute gy ~ raea/Pacal a p= rasa Into the above recurrences 10 ge n= ne 6 OTS reid om o eas Peake = 1 Hem os ee nn Utils iealnedan ine = a tne (oa) Wi still nd forms fo the scalars vy a gis As we wil ty ther are several equivnt mathemati! expression for them in terns of dot pits ft year computed the lgorithn. Oue iltimate formulas ace cha to ‘minimize te nner do produets add sad beease they are more stable than the alternatives, Iterative Methods for Linear Spates sn "Toot a forma foro, rst itp both sides of equation (6:9) on the left by Af 4, aed we the fact that pad a Aconjagate (Laan 63) ogee PE An =p Arnon +0 HE spy (6.0) ‘Thea, muluiply both sides of equation (6.37) on the kt by rf and use the fact that rfyne =O (sinc the Far parallel tothe columns of the orthogonal nats Q) 0 ge by equation (6:0) (oa) (Bauntion (6:1) ean lo be derive fom w property of in Theorem 6, mel, that minimis the reside norm nels = rary (rhea MApaITAGa-1 14m) by equation (637) net agin + An, ‘This expression is a quadratic fonetion of 1, 50 ean be ely minieiaed by setting is derivative with espe 10 1, co 20 and ving for v. This yes rhs thane (a tment via by equation (6.2) van” where we have used the fact that i_yre1 — 0, which holds since rian Is rg tl er Ray og a) “oat formula fry, mil bh sides of quton (6) on the et by pe andthe ft hn id Aon Lan 8) 1 as ehsAre Par "The trouble with this formula fo py 6 that it rauiesanotbe dot product, pf yAre-y, besides the to equ for 2. 
So we will erve anode formal rogrig bo new dot products i 2 s12 Applied Noserical Linear Algebra Wed thst svg an alternate fra fr 4: Mail both ies cavation (6) 0 the ty guetta Fy, ad ve form ta ” (6.19) rating th two expos (641) and (648) fr on (st that tne nari from te tarp) fare, td copering ota {en (62) ll ou tite formu An rac = oy Combining recurrences (6.57, (6.38), ane (69) ane forms (6.41) and (6.44 yids ov nal iplementacion ofthe eonjugae gradient lgoitan AUcORETH 6.11. Confgate gradient abort: = 0529 =O: = 0:9 =) peat bbs =A a= are aT) Daya = eB anand na al 6 sal eng ‘The cost of the inne lop foe CG 6 ane matrs-wetor pret |= Apa, two inter produets (by saving the save of fr, team one loop leeaton to the next), thee saxpys, anda few scalar operations, ‘The nly vectors that oad to be stored are the eureat values of 2, p, and > — Ap. For mace limplementation details, inluding how to decide i “Ire small enough,” seo NETLIB templates templates him 6 Wo begin with a convergence analysis of CG that depends nly on the eoadiion smb ofA. This analy wl show that te namber of CG iterations need {o reduce the errr by & Exod factor less chan 1's proportional to the square 100k af the contin mimber. This worsense anal Is «good estimate foe the speed of convergence on our model problem, Poisson's equation. But It Convergence Analysis of the Conjugate Gradient Method Iterative Methods for Linear Spates sis severely andeestinates the spot of convergence in many other eases Aer resenting te bound based on the condition number, we dexcibe when we ‘ean expt faster ernvergence We start with tbe lial appeavimace solution y= 0. Recall chat 2 tminimizts the Aner of the rsidval ~~ Are overall possible solutions 2 EKA). This moans 25 minimizes Ae Ie Ass = JG) = 0 Agha" A) =e cover all © Ky =spanlh, Ab AM... AW), Any © € Ke(4) may be waiten Tih aA = px = pads, where py-a(€) = Tid age? bs a polynnal of degree 1. Therefore, $2) = (mA) AS PAL — AVA] = (ut Ala)" Alt A)2) sTa(a)dan ale, whore gue) = 1 ~ pu) € isa degre polynomial with a(0) = 1. Note that (gD? = aya) boone A— AT. Letting Oy be the sto all degre lyons whieh ake the valve 1 a, ths ens Seu) = sn) = pT a AAA (os) Te simplify this expression, write the eigendecomposition A= QAQ! snd lt lem yso that Jeu) = so 502) = guia Tia QNQNQA" QAO" ie “tig, AT OaL OMAN AAAI (ug, vw Away ig 9 asiacAoa-¥ ng Spoon! 1, (apo?) ats 28, (ass 8 (eae, oon) aon sinc x9 = O implies fey) = 2 Az = yPAy = EP we Throne, Il _ seen . rls ~ an) 5 SERA, (OHO au Applied Noserical Linear Algebra Tak < aia ae Im tae hrc at oo last convo oo ution soon pum Hw wal cas dee tv) Ben tmag oer te seme oA whe satan wing a) 24 Since soe it tsi enh eral ee ere Oc haw Angst ain upe tend ew itl re 1 FaieSM Rig tonal oe th vl carte] and 07h bron) ihn he the propery eens coct fom te ‘Chebyshev polynomials 7),(£) discussed in section 6.5.6, Recall that |Ti(E)| <1 he nena py wie el we ire 0). Now et aon petane® is cay to se that (0) = 1, and I € in, al then nee + Made 26) ere le Fralls-« Poiaa = 23 By OI 1 1 Hep whore = Aya / An te condition number of A the condition nimber is near 1, 1-4 2/( 1) i large, 1/TA(1 + 22x) ‘smal and eonvergence Is rapid. If is lrg, convergece sows dom, with the conmerzee rte 1 2 ih oe EXAMPLE 6.14. For the -by-V model problem, = O(N), so alae steps of CG the residual Is multiplied by about (1 ~ OV"), the same as SOR with optimal overelaxation parameter 0. In otbee words, CG takes O(N) (O(n!) iterations to converge. Sine each Heration costs On), the ovr 1s O(n!) 
This explains the entry for CG in Table 6.1, © ‘This analysis using the condition number dees not expan al the impor ant conseegence Baur af CG. ‘The next example shows Ut the entire Aistibation of eigenvalues ofA i inportant, not jos he ratio of te large to the smallest one. Iterative Methods for Linear Spates sis 44 aage Fig, 67. Graph lative eid compute by CO. BXAMPLE 6.15. Lets consider Figure 6.7, whlch plots the relative residual lralafrala ot each OG step for cight diferent Liner systams. ‘The eclative residual [ra l/pll, measures the spoad of convergence; our implementation ‘f CG terminates when this ratio sinks below 10", or ater k— 200 step Whichever comes fest All eight linear ystems shown have the spe dimension n = 10" and the same condition number f= 134, yet thee convergence behaviors nro radically diferent. ‘The uppormost (dsh-lot) ine Is 1/T3(l + =), whieh inequality (6.46 tals ian upper bound on felis /iall 4-1 Te tors out the gmp of Irela/irala and the graphs of Ira, /Lrola > are nary the sme, se pot only the former, whith are cise t inter. The sl in i ral/plle for Poisons equation on a 10D-y-100 gid with » random right-hand side We see thet the upper bound captures is vec convergence bchawio. ‘The seven dad ines are plots of [r/o for seven diagonal linear systems Dar — Dy bed from Don the ele 10 Dy om the right. Each D, hs the seme dension and eoadition amber 36 Poisons equation, so we nee to study Usa tar lonely to understand the dieing convergence behaviors. ‘We have construe ese D, so that ts smallest my ad largest my ee values ar Metal to those of Poisson's eqion, with te remaining — 2m cdgenvales equal to the geometric man of the largest and smallest elgeaval- ts. In ather words, Dy as oaly dj — 2m, 1 distinct eigen. We bt {K denote the umber of CG erations it takes forthe sluton of Diz = 0 a6 Applied Noserical Linear Algebra coc rally 10°. The convergence properties are summarzal inthe folowing table: ample nomber el asignen ses egies ate: ‘Numba of distin digonvaloes [dp [81a 812017015000 ‘Nunbar af tops to couvwrge | [312769011385 200 ‘Wosee thatthe number by of stp reglred to converge grows with the numbe 4; ofdstnes eigenvalues Dy ls th same spectrum ss Poisson's equation, fd converges about as slow Inthe absence of rounol, we claim that CG would take ezecty by — ds spe vo converge: The rena that we can find » polyno () of degree tht nro athe cigs ao A ite ga(0)— 1, may Tt -9) 0 TD uation (645) tl ws that afer aston, CG misimises Urals ~ fea) overall posble degree pobomias equaling | at 0. Since gy, one of tone Polroatin and ga(A) ~O, we must have fray =O, ort =O. (One ston of Example 6.15 is that f the lgest aad smallest eigenvalues fA ate fin amber (or eustered chly together), then CG will eonverge toch more quik thon sn analysis based just on A's condtion aunier would Indicate “Another lesson is thatthe behavior of CC in floating pont eithmetc can Afr sigaiteanty from ts behavior in exact archmece. We sw this breause the number dy of eistinet eigenvalues frequently fre from the number of steps require to converg, although in theory we showed hat they should borden Sil dy aed were of the same order of magi Tadd, if en were to pecan CG in exit arithmetic and compare the mputed solutions nil rsials with those comput in oti point eit mete, cy would very probably diverge aed soon be quite dflerent. 
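To make the discussion concrete, here is a minimal Matlab sketch of Algorithm 6.11 (illustrative code of ours, not the class homepage's implementation). Running this floating point code on the diagonal matrices of Example 6.15 shows the clustered-spectrum effect directly, even though, as just discussed, the iteration counts need not match the exact-arithmetic prediction.

   function [x, iter] = cg_sketch(A, b, tol, maxit)
     % Conjugate gradient iteration following Algorithm 6.11: x0 = 0, r0 = b.
     x = zeros(size(b));  r = b;  p = r;  rho = r'*r;
     for iter = 1:maxit
       z  = A*p;                     % the only use of A: one matrix-vector product
       nu = rho / (p'*z);            % nu_k = (r_{k-1}'*r_{k-1}) / (p_k'*A*p_k)
       x  = x + nu*p;
       r  = r - nu*z;
       rho_new = r'*r;
       if sqrt(rho_new) <= tol*norm(b), return, end
       mu  = rho_new / rho;          % mu_{k+1} = (r_k'*r_k) / (r_{k-1}'*r_{k-1})
       p   = r + mu*p;
       rho = rho_new;
     end
   end

For instance, on the diagonal matrix with only three distinct eigenvalues from Example 6.15, A = spdiags([1; 1e3; sqrt(1e3)*ones(998,1)], 0, 1000, 1000), this routine typically needs only a handful of iterations, even though the condition number is 10^3.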
SU, as Jong a8 Ais not 00 illeondiioned, the Hosting point relt will eventually aaverge 10 the delved solution of Ax — B, and so CG i sill very usta "The fet that the exset nd Haut point results ean ifr deamatiesly Is Interesting but does not preveat the practi use of CG. ‘When CG was discovered, it was proven that in exact rithmetie it would provide the exaet answer aterm steps, since then ry would be oethogonal tom other orthogonal vectors r through ray and s0 must bo zero. In otbte sons, CG as thought of ws 8 vet metho ater thn aa erative meth ‘When eonvergenee ater m stops not ocr in pret, CG wa ened unstable and then abandoned for many years. Rventualy it was recognized 3s ' erly god iteaivewathod, len providing quite secure answers alter been step Iterative Methods for Linear Spates si Recents, a subele backward error nalyis was devi 4 expin the ob- served behavior of CG in Hosting plat aa expla how i ea dif ron ‘exact selthmetic [123]. This bebavie can also talide lng “plates tn the fonversence, with [fle decrssng ile fr many trations, interspersed with reiodsofcpid convergence. This behavior can be explained by showing that (CG spplind to Az ~ (in Hosting pot slthmetie behaves eanetly ke CG applied 10 Az — 6 a exsetarthmie, where A is ease to A in te flowing sense: A-has @ moeh larger dimension than A, but A's eigenvalues alle In tow clusters around the egenvlues ofA. Thus te piateaus in eonvergenee correspond to the polynomial q, underlying CG developing more and more ‘eros neat the eigenvalues of Ing in luster 6.6.5. Preconditioning In the previous section we Saw that the convergence rate of CG depended on the conttion numb ofA, oF more gneeally the dstebution of ’segernal- ns. Other Krylov subspece metho have the same property: Preconditioning means eplcing the spate Az —b with the system Miz = M1 where [A isan approximation to A with the proper that 1. Mis symmetic and postive defini, 2. AI-1A is wel conditioned or has fow extreme cigenalus, 3. Max = bis casy to solve, ‘A careful,problem-depradent die of M ean often make the codon num ber of M71 much smaller than the condition numberof and thus seclerste convergence dramatically Indes, a gped precondition Is often necessary for fn erative method to converge stall tx veh current resereh In Keeative rnethods is directed at finding botterpreeonlitioaers (so also section 6:10), ‘We cannot apply CG diietly to the system M-lAz — M7", beeanse AHA is gonerally not syumetric. Wo deive the precondtionad conjugate rient method x follows. Let, M ~ QAQ™ be the eigendecomposition of yard defise 807? = QN2Q", No that AP js also symmetric positive elit, and (M2)? — IY. Now mukiply M-!Az — Mf through by MP to ge the new symmetric postive dint estem (4*/2AM 2/2) MP2) = A=?hvor az —b. Note that A and fA have the sane eigenvalues since they ar similar (M!A~ M1"), We now apply CG impli tothe system AP — bin such away that avoids the ne oly by A=, This ‘els the following alge, ALCORN 6.12, Preconditioned CC algorithm: = 0529 = Ory = bem = MMs = Mg peat a8 Applied Noserical Linear Algebra bokat 2A = Ears) eam i w= Mn Bees = (of re COP gPe8) Peat = Yet HeysPe ‘ull small enough ‘Turomes 6.9. 
Let A and A be pnectric noise definite, A= M-N2ANNIA, ‘and b= M~i2), The Calgon app to AB —, oA P= OL yh aT fetes Habe Baya = EAMGT shea) Pus Fe fue sun ila stall enough ‘and Aloriton 6.12 are related solos B= Ml, & = aa, f= wn, fe = My ‘Tevefore, 24 converges to MP times the solution of AL = 6, 0, to ATE b ale, Fora prt, sae Question 6.14 [Now we describe some common preconditioners. Note that our twin goals of minimizing the condition umber of MA snd keeping Mz ~ 8 cay to sive te in eonfet with ope another: Choosing IM = A minimiaes the condition ‘umber of ALA but leaves Mz — bs bal to solve asthe orginal problem Choosing =I miakes solving Ms —bivlt lowes the onion name of ATA unchanged. Since we need 1 solve fz —b in the inns lop ofthe gosta, we restiet our discussion 10 thane M for which solving Mi — bs fosy, and Weseribe when they are hikely to deerese the coditon number of aia Iterative Methods for Linear Spates sig + IF Ans widely varying dngonsl entries, we may use the stole diagonal preconaiioner M~ dliag( a3. --y gy )- One ean show that among All posible diagoaal precnaitioer, this eer redo the cadiion umber of ATA to within factor of of it minimum value [22 ‘This is also called Jacob reconditioning. As a generalization of the fist preconditone, lt Ano Aw acl: by oA bom block mari, where the agonal Blocks Ay are square. Then among Al bloc digonal preconlitiones Po] Whore Mis and Ais have the samo dimensions, the chico Mig = A Iminimizes the condition number of MAM! to within a Factor (fb [6S]. This fe also cael blr Jacobi preondiioning. Like Jacobi, SSOR enn also bo uel to erate a (back) precorliioner An incomplete Cholesky factorization LET of A is an approximation A LIT, whore 1 is limited to a particular sparsity patter, suchas the original potter of A, In other words, no Filkins allowed during Cholesky. ‘Then MI — LL" is used, (Ror noasymmetie problems, there Ie corresponding incomplete LU preconationr Domain decomposition is use when A represents an equation (Such as Poisson's equation) on a psa region So fr, for Potsson's equation, we have let be the unit square. More generally, thereon fray be broken up into disjoint (or sihtly overlapping) subrexions 8 — U8, ‘and the equation may be salve on enc subregion independents- or ‘example, I we az solving Possoa’s equation and Ifthe subegions see ‘Squares o etangles, dese subproblems ean be solved very quick using FFTs Solving these subprblemn corresponds o a block diagonal AF Gi the subregions are disjoint} or a product of block diagonal M (if the subeezlons overiap). Thi is dscusied In more detail in section 6.10 ‘A muaber of tase prsconditoners have been plemented in the software pckages PETSe [230] and PARPRE (NETLIB/scalapek parpre tar). 0 Applied Noserical Linear Algebra 6.6.6. Other Krylov Subspace Algorithms for Solving Ax = b So far we hve concentrate on the symmetric postive definite linear ae tems aml mnie the A“"-norm ofthe esa Tn his section we desetibe snetiods for other kinds of linear systems ad ole vee on hie eth 10 ‘se, bas on simple properties ofthe mate. Se Figure 6 fo summary, (15,105, 181, 212) and NETLIB/templetes for deals, and NETLIB/templates In portiealr for more comprehensive advo on choosing ® eth, slong with softwar, “Any systom Ar = can bo changed toa symmetric positive definite system by solving the norms! equations "Ar — ATS (oc AATy — b, 2 — Ay). This inlides the least squares problem min, [A ~ bla, This hts us use CG, provided that we ean multiply vectors both ty A and AP. 
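As a small illustration (a sketch of ours, not from the text), the normal equations can be handed to a CG routine through a function handle so that A^T A is never formed explicitly; here Matlab's built-in pcg is used, and the matrix A is just a random example.

   % Any routine that can apply A and A' to a vector would do in place of
   % the explicit sparse matrix used here.
   A = sprandn(500, 300, 0.05);
   b = randn(500, 1);
   AtA_times = @(x) A' * (A * x);        % black-box multiply by A'*A, never formed
   [x, flag, relres, iter] = pcg(AtA_times, A'*b, 1e-10, 300);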
Sinco the condition romber of ATA or AAT is the square of the condition number of A, thin rmcthad ean lead to slow convergence if ill conditioned ut fast if Ae sellconditioned (or AAs good” dstbution of envalues, as diseased In section 66). "We can minimize the two-norm of the residual instead of the A~-norm when is symmetie postive defiite. This sealed the minlznum resid lgocthm, or MINRES [192. Since MINRES is more expeasive then CG apd is ten less accurate beeaise of mmercal instil, if ot sel or potive ‘ofiniverysters. But MINES ean bo used when the matrix is synmetric Indefinite, whereas CG cannot. In this eas, wo can also use the SYMMLQ algorithm of Paige and Sunes [12], which produce a residual re x46) teach stp. ‘Vnlortinatly, ther are few matrices other than symmetric matries whee lactis ike CG exist that smultanusly 1 either minimize the residual sla or hep it orthogonal rs Kk 2, require a fed umber of dot proets an saxpy’s in the Inner loop, Indepenteat of k ‘seni, algorithms satiyng thee two properties exst ony for matriees ofthe frm e¥(T-of), where PT (oe TH (HT) for soe symmetric postive definite H), 0 ral do fseuenple [100 249]. Ror those symmetric inl spetal aasynmetsie 4, it tras out we cu ad & shore neuer, 36 In se Lanczos age, foe computing an ortngonsl bass [dau] Ky(4,0). Th oe that thee are ust fe tert in te eure for updating ‘toons that it ean be computed very elie “This exstnce af short recurrences no longer holds for general nonsyn- motte A. In this eas, we can use Arnold's algorithm. So Instead of the ‘wiiogonal matrix Tk ~ QLAQs, we ge a fully upper Hessenberg matrix He = QEAQs. Tho CMRBS algorithm (gencrazad maninusn residual) uses this decomposition to choose xy ~ Qupn © Ku(Asb) to minimize the residual rule = Wary Iterative Methods for Linear Spates m1 so. = BP (QHQ")Qenella by equation (6.30) = IQ) HQ™Qey|l2 since Q is orthogonal eof te |-[ ‘by equation (6.30) and since the first column of oie fests [ 2 Jo Since only the fist row of Ps nonzero, this (+ )-by-& upper Hesenbers least squares problem forthe entries of y. Sine tis upper Hesseaber, the Qt compenition need 0 sae i ean be secs with Givens rotations, ‘a cnt of OC) sate of (09). Aso, the storage rahi s (kD), sine {jst he sored. One way to Limit the growth In cost al stage Is 10 ‘ator GMIRES, 1, aking the answor 2, computed aller steps, sestarting GMRES to solve the Hnwar system Ad ry 0 — Ath, atl updating the soluton to eta d this elled GMRES(A), Stl, even GMRES(E) is more fespensive chan CG, where the cor of the Ine lop does aot depend on Kat a Another approseh to nonsymmetrle nar systems sto abadon comput: Ing an orthonormal bass of Ky(, 6) and compute a nenorthonormal basis that ‘gain reduces A to (nonsymmetec) tridiagonal frm. ‘This is called the non symmetric Lanczos method wd requires motrx-wetae mukipeaion by beth And AT, This is iportant benuse APs sometimes hares (oe impossible) tw-eompate (ss Exanpe 6:3). The advantage of tiagon form that 8 much esi to sole wth trigonal mates than «Hessenberg one. Tho di ‘advoatage is that the bass wetors me wry il conditioned dry in fk fail to exist otal, a phenomenon called bral. 
The potential eiceney tas led to 8 gent deal of scare on svoing or alleviating ths iastablity (loo-ahend Lanes) and co exnpeting meth, inluting bieonjugte gra lente and quas-nsnimamn residuals ‘There are also Some yrsons that do ot eequie multipeation by AT, including eonugate gradients square, aid conjugate gradient stabilized, No one method fe best in all ase, Figure G8 shows decision te giving sple advice on whieh method to ley fist, assuming that we have no other deep knowledge ofthe matrix (Such ‘8 that i arises from the Poisson equation) 6.7. Fast Fourier Transform [a this section ¢ wil always deuote y=T. ‘We gin by showing how to solve the two-dimensional Poisons eae on ln a way rogulrng multiplication by the mate of eignseetors of Ty mm Applied Noserical Linear Algebra Gas Fee] = (ore) SS ES ee ig, 68. Deaton te for cawsing an dere lpritim for Arb. Bh Cab = eonpante grater salcod QUIN ~ qus-mnonam snl A staightfoowaed implementation ofthis matsixmatriccultpication would cent O(N) ~ O(n) operations, which Is expensive, ‘Then we show How this multiplleation can be implemented ving the FFT in only [email protected]) Ofalgn) operations, whieh is within a factor of gn of opin “his solution is @ disrete analogue of the Fourier sores soltion of the viginal diferntal equation (6.1) or (66). Latee we will make this analogy more precise, et Try = ZAZ beth cigndecompstion of Ty, as define in Ler 61 Wo begin with the femlation of the two-timersonal Poison's equation in uation (5.0 TW 6 Vy =P. Substiute Ty = ZZ" and multiply by tho 2 on the lt and Z onthe eight toset Batwa Bveadlye— ere Ave VASE where V1 = 212 and P= 27872, Te Gb) ety ofthis ls equi (AV V8) aa Yd =H i an be sled far fy og aa “Ths elds be Ast vrton of ou agri, Aucomrnins 6.18. Soling te two-dimensional Poison’ eyution sing the igendcomposition Ty = ZZ: Iterative Methods for Linear Spates ws 1) Faate2 2) Pals and dy acme "The cost of step 2 i BN? = 3n operations, and the cost of steps 1 and 3 1s matrt-mateix multiplications by Z and 2” ~ Z, which is 8N* ~ Sn? ‘operations using eonweatonal alot, In the next setion wo show how multiplication by Z is essentially the sumo as computing a aiverete Fourier transform, which can be done in OIN#og N} = Ofna) operations sing the FFT (Using the language of Kronecker prodvets introduced i section 6.3.3, and in poral the eigendecorposition of Ty fom Proposition 6, Tyan =10Ty 1 TyO1= (222) LOA KON (ZO2, we can eewsite the forma jostiving Algorithm 6.18 as follows wee(¥] = (Men) wee?) ((202)-(LeA+NeN-ZoZIy* -weth!P) (Ze2yF- ORF AGT -Z0 2) vel) (Ze 2) Oat Aol (2? of") wel), (647) \We chim that doing the incited! mntrix-vector mulkipiations frm right to left mathemati the same ne Algorithm 6.18 sce Question 6:9. This ao sms how tower! algorithm to Posson's aqui in higher dimensions) 6.7.1. The Discrete Fourier Transform ln this subsection, we wll mmber the rows and calums of mates frm Oto NY =1 insted of fom ft. Devinn 6.17. The dente Fore sso (DFT) of on N-aeton fre wctor y= where aon NbN arsed flls et weit i-sin pn Nh wot of uly. Then ye ‘The inverse discrete Fourier transform (IDFT) of y is the vector x = ®-"y, EMMA 6.9. fy isa symmetric unitary matrix, so o-* — yt" — He Pro Clay 8 — 47,0 8 — yw el nly sh 88 — NI Gree 8), SA ay SE ILO SE en 2 seit Gls." 
oa ptt sm th le ‘Th thie DET a ID oe ust mete mula and can sere pleted 208 Nop This eto eal ma Applied Noserical Linear Algebra ‘DET because of ts dose mathematical evationstip to two other Kinds of Fourier analyses The Foor soem FUG) — [ePaper adits inverse Sts) = Fevel Rohe the Pure sees 6 = Ree fae here fs pero o [1 Bnd verse $e) = EP ety ‘he DF w= Way = See NG fed ives B= On FENG, We will make this cose relationship more eoeret in two ways. Fest, we vil show how to solve the mode problem sing the DFT aod then the origi Poison’ equation (61) sing Fourier seis. Thi example will motivate v= 10 find ast way to mulipy by, bose this wl give us fast way Lo soe the model problem. This fst way i ello the fast Fourier transform oe FFT. Iustend of 2N? Hops, i wil regi only about $V logy Hops, which i much ses. We will devise tho FFT by sting a stcoad miathematea reatoaship shared antong the eliferent Kinds of Pour analyses: reducing convolution to ‘multipllestion, In Algorithm 613 we showed that to solve the discrete Poisson equation TyV + VE = KEP for V-requiced the ability to maki by the N-by=N rants Z, where ee = fen DEY YY SEV areata tar (Recall tint we nob rows al eons frm 0 to. ~ inthis section.) ‘Now conser the (2N 2)hy-(2N 42) DPT mate, whose jk entry i exp (227) — oxy (=F) — cn FE — rn HE De de "Thus the N-tyeN’ matric Z consists of —y/GFy tne the maginay part of the second through (N 4 1st coms and enum of ®. So if we ean malty ‘ficiently ty A uving the FFT, then we ean molly ficient by Z. (To bbe mos ficient, one modes ihe FFT algorithm, which we describe below, to multiply by 2 directly; this & called the fat sive Sransform. Bue one cau also just use the FPT.) Thus, multiplying 22 quickly requtes an FPT- le operation on ese columa of F, snd multiplying #7 requis the sane operation an each row. (In three dimensions, we would lt V be an N-bp- N= bys artay of unknowns and apply tho same operation to each of the 3N# sections pera to the coordinate axes) 6.7.2. Solving the Continuous Model Problem Using Fourier Series We now return to nubering rows ad ealumas of matics from 10 N. Iterative Methods for Linear Spates 5 In this section we show how the agit for sling the sree mode probe Io maurl anatgue of vag Few sees to stv the oifeal itera! equntion (61). We wll do tis for the one-dnesoal del probe, Rem tht Poison’ onto on (0.1 8 —£¥ ~ {2) wit boundary conditions (0) ~H. To solve ti ew expo) kn a Four series @) ~ Soe yay an|jna). (be boundary conden el) —O tll that 0 terns tr appear) Plggng =) ie Pato eatin yds Yass") aingixe) = 12) Mit both sides sin, nate ro 0 1, lhe fe hi Seinresnie)de 01)" kand 12) “htop 2 n= ip [sts rene nd aly we ['saurnrans) sure, Now conser the iserste move! problem Ty = A2Y. Sine Ty = ZZ, wean wate v= TRARY = ZAZ, 90 Boy te ak (At Enero, Lang (F 1), a 2h ann = af sinsinsonay sina tho last sum is just 8 Rimagn sum aproimaton ot nha. Fur tho fr mall al that f= p30 we se how the slam of the discrete role (649) aporosnses ie volt ofthe cetits proerh {Gs with maison by 2 corsponing toatl by Rts) Sd negrtion aed akiation hy 2 corepnding sri th er ‘et Fae component 6 Applied Noserical Linear Algebra ‘Te conolition sa important operation in Pore ana whose definition deers cc whether we are deg Foaser teacras, Fourier sore, othe Dr: Tourer wanna 77a = Te wav Fourerseies (Fe 92) = So S(e—wiainhty per Ha ago e-m 0,0 and hgncsbnen Des ae 2N-vectrs the 6b fogs where = Shatns ‘To llostrate the wr of ieee corolationensier pom ok Ailcton. Laka) Seat and bls) Sop db ere. 
N 1) obromils. Thea their roduc ee) = 03) -6=) — SEN eae, were the Coote, seynny are gen by the diet colton. ‘One purpose of Un ure tran, Fourie series, or DPT i to convert convolution ito mulkipnto. Tn thoeaseo th Furie trator, Fog) — ‘FU)-Flg); be, the Fourier trator of he convolution ithe prod ofthe Feuer transforms. Inthe case of Fourier sein off +9) — oD) 60 1, the Feuer cae of the convaliion are the product of i Fe coniciens. The sme i tr ofthe dere convo. ‘Convolutions | "Turon 6.10. feta = fy. sty 14040009? nd b= fons ye so fe tetrs of dnenaon 2N, and ete = 8 #0 = [eoee ana Then (oe C80 (8 Proof, al = a, then 6 = S38)" aj%, the value ofthe polynmia 2) = ENs!a9) at 20%, Silay Y= 6 mens f= FN ya = 14) an 2 — Be means, = 5:35" ou — ed. here t= atu) blah) = el) = ss despa, © nother words, the DFT is polynomial evalation atthe points w", sul conversely the IDFT sponta interpolation, producing th conics ofa polynomial give its values at wy. 6.7.4. Computing the Fast Fourier Transform ‘Wo will derive tho FET via it interpretation as polynomial evaluation just diseussed. ‘The goa! is to evaluate a(x} — Deg eer* at x — w for 0 (f) where Ta () sa Chehysow polynomial (st section 6.5.6) Les 6.11. py(0)= [TE a(t) whee ty = 2oos(a2ieh, Iterative Methods for Linear Spates ss Proof. The zeros ofthe Chebrshey polynomials are given in Lemma 67. 0 This AO = []F A Boose), wo saving Als ~ es equivalent to solving 2 teagonal systems with tilagoal oxen mates | 2eos(s 24, each of which costs O(N} vio tiiagonal Gaussian liminotion or Chol. More changes are needed to have a numerically stable algrthin. ‘The final gorithm sce wo Buea desert! 1,13 ‘Wo analyze the eos the spe agovithm as lows the stable algrtn is analogous. Mattilsng by tinge! motex or solving wagon! sytem of size cats O(N fps. Therefore multiplying by A oe solving a system with A cots O(2°N) fos sine A? i the produc of ello aris. The ier top of stop 1) of the algorithm toro cots > O(2"N) ~ OLN Bop o update te Ney ae wets. AED is not computed expt. See the Joop in stop 1) exerted = Hoy ¥ tines the total est of step 1) OLN? ons) Foe slay reson, step 2) cota O(N) = O1N2) fos, an step 8) eats O(N Toga N) foo, fr tal ost of OWN? lag 8) Hop This jf the entry fr block eye rection in ‘Table “This algprithi generalizes to say block eiiagonal matrix with a sym rmtre mate repeated slong the ciagoal sad symmetric matrix F that commutes with A (PA ~ AF) repeated long the ofiagonnls Se alo Ques thon 6:10, ‘This sm coron skution when solving near ystems ang rs inci dentin eqaions sachs Poin ion. 6.9. Multigrid Multigrid methods were invent or partial ferential equations nich a8 Poi son's ition, but they work on wide ets of problems oo. In contrast to ‘other teatve seers that we he disessed 80 fn, multigrid eosergence rate is independent of the problem sine, insted of slaming down for lage Problems. As a comoquene, i ean solve problens with m unksowas in O(n) time oe for #coetant ainoant of work per unkown. This opal, ola the (toes) constant hidden inside the Of). ‘er i Why the other iterative algorithms that we havo sessed cannot be optimal forthe model proble. In fc, this ks true of any iertive a aorithm that computes approximation aj by averaging ales of the right-hand site b fram neighboring ged pnts. This icles cobs, ‘Gauss Seidel, SOR(s), SSOR with Chebyshev nesteraion (he lst tee wth rods ordering), ad any’ Keyl sutmpice method ed on atria wetoe ‘multiplication wih the main Ty. 
this becuse maliplng a vector by ‘ryan is aso equivalent to averaging oghbosing ged poi Ylins. Suppose tat we start with lghtchand sie ouch ri, with single nonarso feat, ss shown in the upper let of Figure 63. The true solution + is shown 82 Applied Noserical Linear Algebra Rehan sie sunset Spot sc as ep ssn Fig, 6.8. Lam of eneragngneighoring gi point. ‘in the oppor right ofthe same Buns noe that is everywhere nonzero and fs smaller we ge farther from the center. ‘The bottom lft pot in ge ‘© 6: sons the soltion 29 afte 5 step of Jacobs method, starting with fetal solution of all non Note thatthe soltion ss ero moce than 5 gr points aay from the centr, Ieanse avers with neighoeing ged points ean “progagate information” only one grid point er eration, aad the ‘nly nonzero vale fs intl In the center ofthe grid. More generally, after & erations only gr points within &of the centres be ponzeeo. The bottrn right figure shows the best posible solution eq obtlnable by any “nearest hoighbor” method after 5 stops: it agrees with © on gi points within 5 of the center and is naessarly 0 farther away. We see graphically that che ere ‘say ise t the sae of xo the sixth gid pot aay’ rm the center This ells lange eror by formalising this argument, one can show that ‘wok! tent lest OUlagn) stepson an nbn gid to deren the ertor by ‘eonstant factor less than I, 10 matter what *nearneghbor” algorithm Fue. Ime want to-do beter than OXlogn) step (ene OG 0g) ost), te need to "propagate informatio” farther than one gid point per iteration, Multird dots this by communiating with naaesteghbors on curser ards, where nearest neighbor on 8 canes arid can be sneh farthee away’ thn 8 earest neighbor on 8 fine gd Multged uses cone gr to do dide-andcomuer in two relate senses. Iterative Methods for Linear Spates ss Fit, it obtains an initial soltion for N-byN grid by asian (2/2) hy) geld a8 an approcimation, taking every olher grid pont from the Nby-N ge. The cnrser (1/2}-b5-(/2) ged isn hen apposite y an (UV/8-by-(/4) ged, and 20 on recursively. ‘The second way muligrd ses Aivideand-conque i in the frequency dma. This requires us to think of the err a 8 sum of elgenectors, o sie-cuves of diferent frequencies. Thon the work that we don a prteulr rd wil elingte the ere In haf ofthe frequency compovents not eliminated on ote grids In particular, the wok Perform particular grid averaging the solution a each gid point with Its neighbors, a yriation of Jacobs method —makes th solution smoother, Which is equivalent to geting dof tho high-frequency ero, We wil ilusteate these notions further below 6.9.1. Overview of Multigrid on Two-Dimensional Poisson's Equa- ton We bein by stating the algorithm at high level and then lia deta As with Hock eyele reduction (set 6), it turns out to be aavenient 10 comsider 2! —1)-by-02" —1) gid of unknowns rather than the 2-by-2* ged favored by the FFT (Setlon 67). For understanding snd implementation, i Is convenient to edd the nds atthe boundary, whieh have the known value 0, to get a 2 4 1}-by-2" +1) geld, shown ia Figures 6.10 and 6.13. We also let Ny = 2-1, ‘We wll lt P® denote the problem of slvng a discrete Poisson equation on (2° ety(2 4 1) grid with (2°— 1)? unknowns, oF equivalently» (N+-2)- ty(N; 42) aril with X2 unleowns. ‘The problon is spceied fy the Fightin side ae mpi the gr i 21 and to eowtcent trie T= Ty An approximate soliton of PO wil be denoted 2, Ths, UO ‘and 2 are (2 — De" —1) aera of vas at each grid point. (The zo hundary wales ace implicit.) 
We wil generate a serene of relat robles PO, PE, PB... PA on creasingly corse gids, where the solution 1 PI i's good approximation to the eer a he soltion of PO, "To explain bow malig works, we nd some operators that take a peob- lem oa ae sid ad citer inprove it or transform It toa elated problem oo nother grt: # The solution operator S takes problom Pandit approximate solution 22° god computes an improved 2° improved 2 = $162!) os) ‘The improvement to damp the “high-frequency eomponents of the ror. We will explain what this meas below. ts iapleaented by a ‘raging each gd pot value with es nearest righbors and sa vation of Jacobs method 3 Applied Numerical Linear Alba Pomme! Piepretae, Faiengee ae" wee ig 6.10. Sopne of arid ta by o-dimenainal mali 1 The streon operator takes «ghd se UO frm probern PO ‘and maps i t0 0, whieh ian appreximation on the enter ge #0 Ra, (0.39) ts implementation also rqules Just « wughted over with nearest eighbors an tho gi + The tterotation operator In takes sn spproniate solution 2 for ‘and converts it tan appeuximate solution 2 for the prem IO on the net Baer rd 69 Intel (os) {es implerectation also requires Jost a weghted average wth nearest, ‘nighboes on he grid Sine all trwo epecatos are implemented by eplcing vals at each gr point by some weighted mverages of nenestrehhors, exch operation costs Jost O(1) per unknown, oc OCH) form unkown. This isthe key to the ow ‘ot of the lime algo Multigrid V-Cyele ‘This is onough to state the basic algo, the matigrd V-eyele (MGY)) AuconrinK 6.16. AIGY (the fines ore numeral for later referee): Simetion MGV, 2) replace an approximate solution 2 (of P wath en inproed one yizt buly ge unknoum onmpate the exact solution of ele Iterative Methods for Linear Spates 5 » at S29) improve the solution a 10 79.20 8 compte the resid 3 49) — In(MGVCA- REO), 0)) sole rearsively (on coarser grids 4» Wa 20-40 carer ie gid station 5) 2) — S92) improve the tolation again return 2°) endif In word, the algoritim does the following Starts with a problem on a fine grid (4,2, Improves it by damping the high-trequney error: 9 = S10, 20 5. Computes the residual ofthe approximate solution 2. 44 Approximate the fine grid esi on the next courer gids) 5. Solves the coarser prob ecursivaly, with w eroinial gues: MGV ‘(e),0). The fetor 4 appears because of te A factor in the eght- hn side of Poisons enuation, which changes by factor of Irom Sie st to coos gi 6. Mops the conrwe solution back 40 the fine i 4 MGV (R}.0)) 7. Subteats the eormetion computed n the couse gid from the ne gid solntion: 2 = 2 — dl 8. lproves the solution some mone: 2° = S409, ‘We jst the algorithm bls fllows (we do the details late). Suppose (by induction) that di the ezat solution to the equation 79d) = 70-209 _¥9, Rearranging, we gt 7 (20a) — 9 0 that 29 — A is te desire sltion "The ngorithm i called 2 Vayce, brane if we draw i schematically in (grid umber 5 tne) spn, with x pt foreach rire call to MG, it looks like Figure 6.1, stating with all to MGV (GI), ) ia the upper et ‘omer. ‘This ells MG on grid chen 3, a so. down 10 he canes gid ‘Vand then back upto ged 5 again. ‘Keown only thatthe bulling blocks S,, and Jn replace vals at Points by cert weighted averages of thelr neighbors, we know enough to do 236 Applied Noserical Linear Algebra 5 3 1 tine Pig. Gat. we: 1 0{) complesitynnalysis of MGV. Since exch bung block does a constant !mount of work per rd point, t does a total amount of work propetianal to the aaser of grid points. 
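To make the recursion concrete, here is a minimal Matlab sketch of Algorithm 6.16. The names smooth, restrict, and interpolate are placeholders of ours for the operators S, R, and In, and T{i} holds the coefficient matrix on grid level i; each helper does O(1) work per grid point, which is all the cost analysis below needs.

function x = MGV(i, b, x, T)
% Illustrative sketch of Algorithm 6.16 (multigrid V-cycle); the helper
% routines and the cell array T of grid matrices are assumed, not from the text.
if i == 1
  x = T{1} \ b;                              % coarsest grid: solve exactly
else
  x = smooth(T{i}, b, x);                    % line 1): damp high-frequency error
  r = T{i}*x - b;                            % line 2): residual
  rc = 4*restrict(r);                        % factor 4 from the change in h^2 between grids
  d  = interpolate(MGV(i-1, rc, zeros(size(rc)), T));  % line 3): coarse-grid correction
  x  = x - d;                                % line 4): subtract the correction
  x  = smooth(T{i}, b, x);                   % line 5): smooth once more
end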
Ths each point a grid level on the "i the Vecgle wil oot 0((2 ~1}?) = OL) operations. Ifthe Bes grid at love E with n OC) unknowes, then the total cnt will be given by the grote Sow out 010) ull Multigrid ‘The uke culigd algrithan uses the MGV just described os buling Dock. Ii called fall mage (EMG: Auconreist 6.17. PMG: Sanction PMG, 218). return an acura solation 0 of PIO ‘ole FA enaely to get =) fori 200k 20) = MGV(O, Infa-M)) end for In words, the algorithm does the following 1. Solves the sips problem PO? exacts 2. Given solution fo the coarse problem PA, rap toa starting guess 2 forthe next He problem Ps fat) Iterative Methods for Linear Spates ss Fig. 2, Fue, ‘5. Solves tein problem using the MGV with his starting guess MGV (OM, Inia). Now wo ean do the oxrall O() complesity analysis of PNG, A pictur of PMG in (erd umber i, time) space fs shown in Figure 6.12. ‘Thre is one V" inthis pict foreach ell 0 MGY inthe inner lap of FMC. The "V starting owl & eons OU) before. ‘Thos the toa eat 6 agin ven by the genni ss Yow -o4t) -0, which is optimal, since it does constant amount of work for ech of them tikoens. Thin explains the entry for emltigrid in Tabi 61 ‘A Matlab iplezrattion of multi (bol fo theone sad two nessa ‘model probles) i availabe HOMEPAGE/Matlab/MG README kaa. 6.9.2. Detailed Description of Multigrid on One-Dimensional Pois- son's Equation Now we will explain in det che various opeaiaes , R, and In composing the multigrid algorithn ad sketch the eocvergence pool! We wil do his for Poisons equation in one dimension, sine this will eapture all the relevant buvior but is simpler to wate. La particular, we ean now consider 8 nested set of one diners problems instead of two-doensional probes, 88 shown In Figure 618, ‘As hofor we denote by che problem tobe solved ong i, neal, 219 = 19, where as blore Nj =2—1 and 1° = Ty, We ben by describing the solution operator, which i form of weighted Jacob convergence. Solution Operator in One Dimension [a this subsection we drop the superseripts on 7,2, ae Ui? for simply of potato. Let T= ZAZT be the egendecomposition of 7, a defined ia 38 Applied Numerical Linear Alba WY rvgidsernine — AM anptatspim — A ages aan cfectoaneryid — peafeetenaner pd Fig. 6:13. Sopne of aia tl by one-deneainal mal Lemma 6:1. The standard Jocoi's method for solving T's — Bis 1 ~ Re where R= —T/2 and ¢— h/2. We consider wrighted Jaceb contegence Sonst ~My Wher Rey = (a0 2 aa ey ~ wh/25 w ~ Tcoresponds Uo" the standard Jacobs metho. Note that Ry — ZI wh f2)Z" fs tho sgeadecompostion of Ry. The egealins of determine th comversence ‘of weighted Jacob i te sul way: Let én ~ aig —2 be he ree at the mth Iteration of weighted Jodi convergence £0 that ly = Reatnet = Riteo (Zu ~wn/2y2" yen = A whyrZ eg Bem (1 waj2ynZ eo or (ZPen)y = C= WA /2) 12% We call (2 eq)) the Uh femueney conorent of the err tm S18 Cy Zen) i 8 sm of eokimns of Z weighted by the (Zea) Le 8 sam of ‘Sinusods of varying frequencies (see Figure 2)- The genvalvs (Pe) — 1w)y/2 devermie how fst enh frequency eomponent gous to uro. Figure 6.14 Plots 3s (iy) for Nand varying yles ofthe wight ‘When w= § and j > 4, Le, for the upper half ofthe frequencies 2, we hwo (4y(Ha)| © 4. This nigans tha: the uppee half ofthe errr companeats (Zeq), are mulled by for les at every iteration, independently of Lemetraqecyeeror compononts are nt dered ws ey a we wil eo in Figur 615. 
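These damping factors are easy to check numerically. The following Matlab fragment (an illustrative sketch; the value of N is an arbitrary choice) evaluates the eigenvalues of R_w from the eigenvalues of T_N given by Lemma 6.1 and prints the worst damping factor over the upper half of the spectrum for each weight; for w = 2/3 it is 1/3, as claimed.

% Illustrative check of the weighted Jacobi damping factors.
N = 99;  j = (1:N)';
lamT = 2 - 2*cos(j*pi/(N+1));            % eigenvalues of T_N (Lemma 6.1)
for w = [1 1/2 2/3]
  lamR = 1 - (w/2)*lamT;                 % eigenvalues of R_w = I - (w/2)*T_N
  worst = max(abs(lamR(j >= (N+1)/2)));  % worst factor over the upper half of frequencies
  fprintf('w = %4.2f: max |lambda_j(R_w)| for j >= (N+1)/2 is %5.3f\n', w, worst);
end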
So weight Joeobi convergence with w~ # good at decreasing he tigh-fvquency eer Th, alton operator § in aqustion (B51) eorsts of aking one step of weighted Jacob convergence with w= & S(.2) — Raye 40/3. (os ‘When we want to iozate the gl on wich Ras operates, we wl end write Ho, igi 615 shows the fet of taking two sto of fr 4 6, where we howe 21" GB unknown. There ave thes rows of prs, the Bat row Iterative Methods for Linear Spates 0 are seer i Fig. 6.4. Graph ofthe spectrum of Ry for N = 90 and we = 1 (Jaco’s method, w= 12 andw 2/8 homing th intial solution snd error and the following two roms showing the Soliton ry and 170 Gm after sues appliations ofS. The trv solution Isa sine curve, shown a5 a dotted line inthe leftmost plo. in each ro. ‘The approximate sation i shown as w soll line in the same plot. The maidle Pot shows the error alone, ineling ts two-nor inthe label tthe ttn ‘The rightmost plot shows the feqaeney components of the eror Ze. One can sein the rightenoat plots thats Sf apple, th right (upper bof the Frequency compan re damped ut. Thisean kobe seen in the aid ind It plots, Because the approximate solution grows smoother. This Is because high-requeney’eeror loos like “rough eer az low-fequency errr Lok ike Snooth” errr. Tally, the norm ofthe weetor derenss apy, om 1.65 1 1035, but then decays move gradually, becuse ther ste mee eroe in the hgh requoneles to damp, ‘Thus, i only makes sense to doa few Kerations of Sst cine, Recursive Structure of Multigrid Using this terminology, wo ean desribe the recursive structure of multigrid as follows. What rultigrd does on the Bnet ged 9, stomp the oper hal ‘ofthe frequency compet the error in the slo. This accomplish by che solution operate, a just ceseries. Oa the next couser rl, with In many’ points, multiged dnp the upper bal ofthe remaining reqvenes ‘components inthe ever. ‘This is because taking conser grid, with hal 8 ‘aaay poats, makes Fequeaces appeae twice as igh, as Hustentd tn the ‘example below. FAW ANN \/ I ul ate Applied Noserical Linear Algebra ten Sg Fig, 6.7. striction from grid with 28-1 = 18 point oa grid wth 21 = 7 sms 0 ondary ae ale show) “The simplest way to computer) would be to simply semple rat the common sr points of the corse ved fine grids, uti is beter to compute 1 ato conre grid point by averaging values of neighboring ine rid points the male a conn grid pont 3times the vale atthe erresponding fie gr point, os 25 times each of the fie rid point neighbors. We eal ‘his smoothing Bosh methods se lista! in Fg 6.17 So altogether, we wete the resection operation sx Dl) my aad tad = ei 1, 38) ii ‘he subseript ‘and suprsript/~ 1 on the matrix PY indieate that it maps Soom the pid with 2 1 points to the grid with 2-1 — 1 pnts. Tn two dimensions, restrietion inolwes averaging with the ight nearest rnghbors of each ged pts: tis Use grid eel vole ial, plus Uae ‘he four neighbors to the let, igh, top, and bottom, ps tines the four ‘emmlning neighbors a the upper let Tower lt, uppeeeght ad lower right. Iterative Methods for Linear Spates sis snp i aor Fig. 618. tration om a grid wath 2 — oda (0 Boundary eles aoe shown} 7 pints to a grid with 21 = 18 {Interpolation Operator in One Dimension "The interpolation operator Jn takes an approinate solution a-Y on aeons ‘id and mops toa Turelon on the next fn grid The solution Is interpolated tothe Ser grid se shown In Figure 18: we do simple lest Interpolation to fl In the values on the fine rid (using the fect that the boundary values are known to be ze]. 
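In Matlab, one-dimensional versions of all three building blocks can be written in a few lines. The following sketches (each would go in its own file; the names match the placeholders used in the V-cycle sketch earlier and are ours, not the text's) take the fine grid to have N = 2^i - 1 interior points stored in a column vector, with the zero boundary values implicit.

function x = smooth(T, b, x)             % S: one step of weighted Jacobi with w = 2/3
w = 2/3;
x = x + (w/2)*(b - T*x);                 % diagonal of the 1D model problem T_N is 2

function rc = restrict(r)                % R: fine-grid residual -> coarse grid
N  = length(r);                          % N = 2^i - 1 fine-grid points
rc = 0.25*r(1:2:N-2) + 0.5*r(2:2:N-1) + 0.25*r(3:2:N);   % weighted nearest-neighbor average

function xf = interpolate(xc)            % In: coarse-grid vector -> fine grid
Nc = length(xc);  N = 2*Nc + 1;
xf = zeros(N,1);
xf(2:2:N-1) = xc;                        % copy values at the common grid points
xf(1:2:N)   = ([0; xc] + [xc; 0]) / 2;   % linear interpolation, zero boundary values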
Mathematialy, we write tis as told) = Fg? oY, (636) 1 1 “The subsesipt —1 ad superserit on the mat Py india chat tmp from th id wth 2°”? 1 points wo the pid with 21 pots Note that 2; =2-(P21) ln oche words interpolation sd smoothing ‘re essentially transpose o one another. This fat willbe important in the conversence anal late Tn two dimensions interpolation again involv averaging che values at ‘mre nearest neighbors of ne gid point (one neighbor If the fie rl point au Applied Noserical Linear Algebra 5 alo oars gid point two neghors ithe fn gid points neszest coarse ‘eighbors are co the left and right ar top and bottom; and four neighbors stews). Putting It All Togother ‘Now wo run ho algorithm jon dseribed for eight erations on te probs iturd Inthe top two plot of Figure 6.1% both the te solution 2 (on the {op Het) snd right-hand seb (on Uke top right) are shown. "The sumer fl uniaowss 271 = 127. We sbow how muligedl enaverges in the bot: ‘om thi plots. The midlet plo shows the rato of eoniaetive reals [rmsal/al, where the subscept mi the number of erations af mle (6, calls 9 FMG, or Algoelth 6.17). These ratios are about 1S, ladieating ‘hat the residual deeroses by more than a factor of 6 with cach multigrid ‘tration. This quick eouveezence is indicated in the mid right plat, whieh shows a semilogacthmie pot of [pl] vrsus mt sa straight line with lope logia-15) expected. Finally, the bottom plo plots all eight error vectors mz. Wese how the snooth xt and tome paral on seagate lor, with a constant decrese between njtent pls of lpg 13) Figure 620 slows silar example for «two-dimensional model problem, Convergence Proof ‘Finally, we sktch convergence prot that shows thatthe overall err in an [PMG "Weyce is decreed by neonates than I, independent of rl sie [Ng —2*-1. Thin moun that the number of FG V-ayeles need to deerense the eer by any fcr les Ha 1 ependent of, sd 50 the tal woe § proportional tothe cst ofa single PMG V-cyee i, proportion 1 the ‘mabe of uaknowas Wo will simplify the proof by looking at one Velo and assuming by Indution tht the anes ged problem is soled enc [12]. ta reality, the caro gril problem int solve quite exactly, but this tough analysis slices to eapture the spt ofthe root that low-fequency eror I eliminated on the conver grid end high-frequency error i eliminated on the fine gr [Nolet ut write al the forms dining V-eele aa combine them al 10 et a singe Formals of the fr "ew el) —Af-e9.° wher el) = a0) the ‘ror and A 3 mate whe eigeaaluesdeveine the eae of converge: fe goal iS to show tht they ze bouaded aay tr 1, independently of "The line numiees in te folowing table eer vo Algpethin 6.16 leeratne Methods for Linear Systems ao a ey ‘I |_| ee Pig 6.19, satirslation of one donenional mal psom a 6 Applied Noserical Linear Algebra ‘rue atten Reprise _Lementmtermn)— __romen of we of 1s" ig, 6.20, Matin slat oft dneninal mal pron. ) 2) — sean) ~ Ra 800 by ne 1) and eauton (50, 6 = rg g by bine 2), 4 = IqMtev(e- ee),0)) by line 3) = pr9" 4-H) by our assumption tha the coarse ged probe Is saved exactly Ing {T DY (4H ‘by eqaton (635) © = FI a Bey equation (56) oo = we oy ln) 0) 29) = se x69 = Al 520 te in). 
Inger to at egutons updating the ere, batt ery 2 Hil U3 on is 6 nd (9) nen 0 2989 fa i (8, Iterative Methods for Linear Spates sr ad = fron ine (A) toe fe Bel, (oy 9) reset, fo 8 Elen] ret, fo) = ea, (= Reto Substituting each a the above equations nto the next yl the falewing formula, shaming how the eer is updated by a Vcyce new 9 18 f1— iy fen] cee are (57) Now we oid! 1o compute the eganalues of MWe Ast simpy equa on (57) sng he fats ht Png 2 (P22 and 20 4. PURE a INCRE ay (sce Question 6.15). Substituting chew into the expresion for M in equ tion (57) yells wom (ror ce doping nso snp statin IM Ryo {I~ PP (Pe ty oF [renourse) caren) Pr) Ra (59) ‘We contin, sing the fac tt ll the matrices composing MF (P; Raj and P) can be (nearly dngonatize by the eigenvector matrices 2 7 ae 2M of TT and THY, respectively: Recall that Z = 2? = 2-1, 2N2, and Rays = ZU ~A/8)2 = Zn 2, We leave t the render to conte “HPZO — Ap, wow Ap Is alipost dlagonal (ce Question 6.15) (1 seo /vE kj Anse =} 1+ con VE ith =2— 5, a0) ° otherwise, M2 = (2hya2) {1-carramy.venyerayenae-n]" (eh payara)- tama = fae {1 AE AeANDT "et Ae ai Applied Noserical Linear Algebra ‘The matic 212 i sila to M since Z = 21 also has the se eg alas as Af Alb, 22M i sey diagonal thas nonzero only on is main Alisgonal and “pediagonal” (che dagonal from the lower left caeuer to the ‘upper right comer of the matrix). This Jets ws compute the eigenvalues of 7 expletly "Tuwomea 6.11. The matrr M has eigenvalues 1/9 and 0, independent of 4. Dhersfore multigrid convenes ota fired rate dependent ofthe number of ‘dns Fora proof, se Questlon 6.15. For a move general analy, se [266 Foran implemeniation of this algorithm, see Question 6.16. The Web se {0]contins pointers to an extensive leet re, satware, and son. 6.10. Domain Decomposit n Domain decomposition fo aalving sparse systems of Knar equations is tie fof eurrnt reste. Se [18,1 05) ad especially 290) for reat suewys ‘We will give only simple ecamples The aed for methods beyond those we have discussed arses from of te lergulsity an sae of wel problens and also from the need fr algoithins for parallel computers. The fasest methods that we have discussed 50 fa, those based on block eyeie reuction, the FPT, and multigrid, work best (or only) on pacieulariy regu problems such asthe model problem, i Poison's equttion dseretized with « uniform grid on rectangle. Tut the region of solution of areal problem may ot bea retansle but rore ierexula, representing physieal objet like wing (ste Figure 2.12) Figure 212 also iystrates tha tere may be tore posts reins whet selon fecpecta to be less smooth than in eegions with 2 smooth solu. Also, we nay have more complied equations thn Poisons equation oe eve diflret xquations in dileent melons. Indeprdeat of wither the probisn regula, Imag be oo large to Ht i the computer memory aud way’ have tobe solved in pices.” Or me may want to break the problem into pees that can be solved in poral on parallel computer. Domain decomposition addresses all these sues by showing how tos) tematialy reat hybrid” methods from the simpler methods dicused in rovious setions. These simper methods are applied to smaller sd more re ‘lar subproblems of te veal problem, alter which these partial soutons are eal togethte” to we the overall soon. ‘These suiproblens em be sor fn a time ithe whole problem docs not 6 eto merry, oF in pall on poral! computes. We give examples below. 
There ae gery many ways fo break lange problem ito pics, may ws co sole te vidal pes, sind many was to pee the solutions tage. Domain deompesiion theory oes not provide muagie way to choase the best way to do this i ll cases Iterative Methods for Linear Spates a9 bout thera set of resonable pss to try. ‘There ane some ese (ch ‘8 prblews sulietly like Pesson's equation) whee the Uhory does yield “pptimal methods" (costing O(1) work pee unknown) Wo divide our dscusion ino two pats, nonnerapping methods ad ose Lapping methods. 6.101. Nonoverlapping Methods "This method is as called sbstructaring ora Schur complement method in the Ttrature, Tt has been used fr decades, expecially in the structural analysis ‘community, to trea large problems Into smaller ones that fit Into computer For simplicity wo wil lstrate this meow using the vs isons ote tion with Dirichlet bonndary conditions csertins! with » point ste bt ‘on hn Teshaped region rather than square. This region many be decompose into tro domain smal re ane Inge square of tie the side Seng, whee the sual square i connected to the bottom of the eight side fw lage sare, We will design a solver that can exploit our ability to solve prolens ‘ily on squares. In the figure below, the numberof each gd pont Is shown fr a nese ‘scretinston (the number I above and to the tof the corresponding grid point; only gl points interior tothe “1” are numbered). Lt 2H 2425 14 18 1h 14 4413 12 14 td 34g Noto that we have numbered fst tho grid points aside the ewo subdomains (1 to 4 and 5 to 29) ad then the grid pats on the boundary (30 aed 31). "The rsultag matrix 0 Applied Numerical 0 ae [a | ABTA Ta | ere, Ary = Tawa, Aza = Tyas te Ay = Ther = Ty 2a here Ty is efi in eatin (6.8) and Ty is defined in equntion (6.14). One ofthe toost important properties ofthis matrix i that Aye ~ 0, ince theres 0 leet coupling between the interior ged polns of the cwo subdomains. ‘The only coupling i through the boundary, which Is numbeeed last (gtd polnts 30 fand 31). "Thus Ag comtlns the coupling Batween the small square and the ‘boundary, aad Azy contains the coupling becween the lange square and the oundar. “To sc how to take avantago ofthe spacial structure of A to solve Ax ~ 6 vite the block LU decompaston of A as flows ie erotoqe prints ae ag ace | ae Catee aren atesue| | ere 0s | 0st tke agar? aba 1] loos} lo oor where abet o [aw = Ay — AB AG Ay — ABA Aas (won) Iterative Methods for Linear Spates ss in called the Scar complement of the leading principe submatrix containing ‘Ayn and Az. Therefore, we may wete as Al 0, -AglAw] [to 0 1 Oe0: 0 ag Agana || or 0 ° an: oo 7 oo st) | -ahay -abagh 1 ‘Theetore, to multiply a vetor by AM we noe to multiply by tbe bloeks in the eatees of this fated frm of AY, namely, Ay and Ags (and Ue ans poses), and Ag? and S-. Mulupiyig by ays and Assis hap because thay are Wry sparse. Multplving by Aj? and Ag? Ie alo chesp bacsuse we ‘hee these subtlornins to be solmbl by FET, block cyclic redvetion, elt 12d, or sane oter fast method discussed sofa. I rennin to explain bow to mukiply by $-! ‘Since there ace many fewer ged points an the boundary thn in te subo- mains, ls and Shave a mac male dension than Ay an Aag thi eet rows oe finer ard spacings. Sis symmetic positive definite as is A, andl (in this eas) dense. To compute it explicitly oe wonid teed to solve with enh subdomain ante per hoary gre pit (rom the Agta and Az Aas terms in (66)). This an eataaly be dou, alter whieh one Gould fet ing ease Cholesky and proce! to soe the system. 
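For illustration, the explicit approach just described looks as follows in Matlab (the block names A11, A22, A33, A13, A23 and the right-hand-side pieces b1, b2, b3 follow the partitioning above; this is only a sketch, and in practice the solves with A11 and A22 would be done by one of the fast methods discussed earlier).

% Illustrative sketch: form the Schur complement S explicitly and solve A*x = b.
S  = A33 - A13'*(A11\A13) - A23'*(A22\A23);   % Schur complement (small but dense)
bs = b3  - A13'*(A11\b1)  - A23'*(A22\b2);    % eliminate the subdomain unknowns
R  = chol(S);                                  % dense Cholesky factor of S
x3 = R \ (R' \ bs);                            % boundary unknowns
x1 = A11 \ (b1 - A13*x3);                      % back-substitute on subdomain 1
x2 = A22 \ (b2 - A23*x3);                      % back-substitute on subdomain 2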
Bur this is expensive el ‘moreso than just mulplying vector by , which requires just aa solve er subdomain us equation (6.61). This makes a Krylov subspece-basod Htezative meted svc as CG lok atraetve Seton 6.6), stew these methoas roquire only multipying @ wector by S. The aumbee of matrix-vector multi Plieatons CG requires depends a the condition number of S- What makes Sonia desompositin so attractive that S turns oat to be much biter con tioned that the orginal matrix (e eondtion number that grows lke O(N) Instead! of O(N?) and so convergence is fet [114,28 ‘More severly, one has f > 2 subdomains, separate by boundaries (ce Figure 621, where the heavy ines separate subdomains). If we number the rele in each subdomain coseentvny allow hy the boundary rade, we sot the motes (6.02) ween we en fico i tying ech Ay indepen and fring Uh Schr eampenent 8 Anya Soy Abang a Tn thc, wn tees nt hn Ge aa seen has farther strata tht canbe expat to pron ie For campy mnberog tin gi pons inte ater fe boundary seen lr te gr pol se Applied Noserical Linear Algebra the intersection of boundary segments, oue gets Hock strvetite 8 A "The diagonal bloks of $ a complicated but may be approsimated by 13, which may be Inverted eee using the FFT (85, 36, 87,88, 9]. To som: ‘marae the state of the set, by dosing the precondition for appeopeately, tno can make the numberof steps of conjugate green Independeat ofthe ‘umber of boundary grid points [20 6.10.2. Overlapping Methods “The metas In the lst setion were elle nonoverlapping because the do- tmsins corresponding tothe nodes in As wre dept naling to the block Aingnal structure in equation (6.62). In this section we permit ovelapping domains s shown in tho figure below. As wo wil seo, his oveciap permits uF to desig an algorithm comparable in speed with multigrid but applieable to 1 wider set of problems ‘The rctanglo with a dashed boundary in te igure fs domain M4, and the sauare with 0 sod boundary i domain fg. We hive renumbered the nwes se that the nodes in ace numbered frst and the nodes in {gare numbered Ins, with the node in the overlap fy fin the le. aie a”? ms 2 pad | wim § aang NBRBBE REE RR ‘These domains are showa in dhe matrix A blow, whieh fs the some tise in seetlon 10.1 but with snows and columns dened as abawn above: Iterative Methods for Linear Spates as ‘We have indicted the boundaties between dorskins inthe way that we uve perttonad he mat: The singe lines dlvide the matrix into Use nodes assoclted with ©} (1 through 10) and the rest Q\ 0 (11 through 81). The ‘double lines divide the matrix Into the nodes ssoeated with Sy (7 through 51) and the eet 2% (1 through 6). Tho submaties below ae suserpt ‘econdingl: Ano, | Any |- [agate , Auan| ina. TAaa.oe | [Ano T Aso, | We conformally partition vectors sch as ao) [a a [ata tna) _ [20 ] ara Now we have enough notation to slate tw hase oecaping dain dosan- position algorithms. "The simplest ove i called the addive Sehr method for Histol reasons but could ws well be called evertppng Block tecobs eration breause of ts snl to (Boek) Jaco ertion fn sects 6.5 and 65 ey Applied Noserical Linear Algebra Aucomrnins 6.18. Additoe Schuur: method Jor updating en eprasimate so- anion 2, of Ax ~ to ge attr solution 14 Pab— Ae, /* compute the residual */ nant Bina 0+ AGL gta, /* update the slaion on My */ iit.) —#41,05+ABh ng" /* update the solution om My */ | In words, the algorithm works a follows: The update As! 
g, mn, corresponds to solving Poisson's equation just on 1, usig boundary conditions ot nodes 18, and 10, which depend on tbe previous approwiate solution ‘The updaie Aj! 94 Is analogous, using boundary conditions at nodes 3nd 6 depenng of Ti our case the ©; are rectangles, so any ono of ou eal ast methods, suet as multigrid could be used to solve An! rm. Since the sditve Shane method Is iterative, It snot necessary to save tho problems on 0, exactly Indes, the alse Scare method i typi ws a6 preoaitioner for » Krylov subspace method ike conjugate gradients (se section 6.6.5). Tn the notation of setion 665, the preconditioner Ms siven by “Sete tats] 110, ana 9, ed not onerp, then M- woul imply co [ir ada and we would be doing block Jecobl iteration. But we know that Jacobs method does not converse parsenary quickly, because “information” about the slution from ove domain en only move slowly to the other downainaeoss the boundary beten them (se te diseussion ne the beginning of setion 6.9) But as lng as the oeriap is nrg enough ration ofthe two domains, infor ‘mation wil travel quickly enn toRoranton fas convergence. OF ese we donot want too large an overlap, brea this eres he work signs "The goal in designing «good domain deworponiion metiod i 1 ehowe the domnsizs ad the overlie s0 a8 to have fst convergence while dang os tle ork ae posible; we ay more on how convergence depends on oveap below This algorithm ako be writen in ne in at Ti, Iterative Methods for Linear Spates 5 From the discussion in set 65, we know thatthe Gass Sel metbod is likly to be mor elfective than Jaeabis metid. This the case here a8 well with che overtpping Blok CaussSelel method (move coaraoaly eld the mulipcasve Scheer: method) aften bang tie as fast as additive blo ‘Icobi iteration (the aditive Sears method). ALCOWION 6.19, Mutipliotve Sowers methad for wpdting on appr rate elation of 0) ra =(—Aridy | /* compte residual of . on 21 */ 2) Fiegsqy Fam FAjhgg tos /* Mp soation om My / 2) Fahim, =Fia0, (8) ra Axia, / compute wesidal of 55 00 08 */ (Bena Zu), TAR ng Fm / plate slo ony */ (2) Rene 2am Note tat Bines (2) and (4) do not rape any data movement, provided that Hyy and 4) onerarite ‘This srt fn soe Poss ution on wing toda date tena Agr G18 thea sates Paton ean on thing tary dts that ts ist ben peed Tt meas be wed a 9 prerdtoersKiso sting ed Tips re Jos thn te mo (0a ae sed. hiss dane ithe min whan wore crmpatd rth ae ary nspaten Darl proces salable se epeden rola Agr ost Coes te spree Aa el i iexpene woah oes summary of he neal eonegets ans oth methods fort noel rots a alr lip protien, Leth bette mshig The tery row ry Hats ees) Cova unin ho des 0 Wi two do og Then ron hfs me con he a oni sth nmi arose fr verges dependant af a ho 0 So Thisean atv oper seit maid iho Since sma of mes oh Bu he otf orton Sra sig slams ony sn yt, may emp ieee tothe oil poem, So une tbe lane a es re ver cap (wih he L-sepd ei ee) tl gh, ‘Now soppne hw many dain ah oon fh Thee wort Ui oe fw the tne ly cre ech th sp ine ps woe cls yn the nay 8 own byte a Fie at 11 5H be the sont by wich adjacent dans nea. Nowe tH, 4, ant hallgoso event tn the no ton 677 mai asta 336 Applied Noserical Linear Algebra ] - e Fig. G21, Coarse and fine dscrtenton of on L-shpad rion and Hh. Then the number of iterations requlted for convergence grows lke 1/1, Le, independently ofthe fae mesh spacing h. 
This cose to, but still ot ns good as, ulti, which dows «constant numberof iterations and (04) work per unknown, taining the perfomance of multigrid requires one more ide, which, per- fps not surprising, sine to multigrid. We tse an approximation ly fot the problem on the cave grid with spcing H Yo ge 8 eoerse grid pron dsioner in ton to the fiw ge proenditioners Ag We nae theo matrices to dewribe the slept. Pst, kt Ay be the isis for the model problem diseased with conse rest spac. Sead, we ae restriction ‘peretor vo take a residual on the Re mash and eesti Io Yl oa the are tes; tht Is esenilly the sane as in muliggid (ace section 6:92) Finally, we ncod an interpolation operator to take values oa the coarse mash nd interpolate them tothe Boo mesh; in multigeld this also tara out to ben Aucoin 6.20, Two-leetadative Sours method for wpdating an appro fae sli ay of Ax ~ to got w eter slation 249 for =1 tothe munter of domeine 8 ta, = (b= Anda, endjor naan mien FRE iste [As with Algorithm 6.18 this method is typieall ws at prenditonee for a Keyl subspoce method. Coneraenoe thor foe tis lant, whic applicable o noe generat robles than Poisons qquation, says that 98 HM, 8, and Psa to O with Iterative Methods for Linear Spates sr 45/11 ssing x, the numberof iterations ui to converge independent Of H, hoe 8. This teans that as lng a6 the work to solve the supoblens “Ag! , dA? is proportional ote numberof unkuows, the exnplety 4 od 6 muha Te is probably evident tothe rear tht implementing these methods in 8 real world problem ean be complete. There setae sala on-line that Implements many ofthe bulding blocks deserb here sa lo euns on poral machines. IC & calle! PETSe, for Portable Extersble Tolkt for Selenite computing. PETSe is aallable at hip://www.mcs.anl.goe/petse/ pets ha ‘and i dseibed relly in 290 6.11. References and Other Topics for Chapter 6 UUpto-davesurwys of modern iterative methods ae given in 15,105, 14,212), ‘and thelr parallel implementations ae alo surveyed in 75. Clases methods sich as Jobs, Gauss-Seidel, and SOR methods are cussed In deal a [247 185, Muligid methods ae discus! In 2, 18, 14,258,206] and the references therein 89s @ Web sit with places to an extensive bibliography, saftware, and soon. Domain decomposition are discussed! in [#8 14,208,230, (Chebyshev and other polynomials wre discussed in [238. The FFT is discus in any good textbook on computer scene algorithms, such ms 3) and [246 {sable version of block epic ection fond in [45,43 6.12. Questions for Chapter 6 QuEsTi0S 6.1. (Basy) Prove Lar 6. Questios 6.2, (Besy) Prov the following formulas fr triangular fctrie- Sons of Ty 1. The Cholesky factorization Ty ~ BE By basa upper bidiagonal Cholesky factor By ith ant = YEE ont aut = fe 2. The result of Gaussian elimination with partial pivoting on Ty i Ty LxUy, wow the triangular fetors ae bidiagoal: Inti) 2 ond bli 1) = — rh, i uti = and Uy(Gi+ N= = as Applied Noserical Linear Algebra 8. Ty = DyDf, where Dy isthe N-by(N +1) upper bidiagonal mate wlth ron the main diagonal snd —1 oa the supeediagoaal a n0% 6.3. (any) Conte equation (618) Question 6.4. (Easy 1. Prove Lemna 62 2, Prove Lena 63. '. Prove that the Sylvester oquation AN’ —'B = C is eayvalent to (Un A~ BP o ly)weX} ~ we) 4, Prov that we(AN'B) = (BP. A) we(X) Quesios 6.5. (Matiym) Suppose that 4°" Is digonalizsbe, so A has Indepesdonteigeavtetors? Axi ~ aes, 0F AX = XAq, where X ~ [0-2] and 4 ~ dlag(a). 
Silat, suppose that Bi diagonalizabl, sb bs Independent eigonveetos: Byy — ig, oF BY = Ay, where ¥ = lyst) tind A= eg). Prove the folowing results 1. Tho mn egeaalucs of IgA § 82 fe Ny = 25+ Le all posible suns of pes of eigenvales of A and B. Tho corresponding elgenvetaes sr 2, wheres; = 2424, whose (km +()th entry Is2(8)y(). Watt note way, “ AY BONNY OX) =O) (lm GAA + Aa ela (6.63) 2, The Sylvester equation AX’ 'B7 — C is nonsingular (colvable for X, ven any C) and only i the sum 0, +3; — 0 for allegories 0, (A and 3 of Tho sce i tun forthe slightly ferent Sylvester ‘equation AX +X =C (ee also Question 0), 8. The mn elgenvalins of A. B are Xj ~ a5, Lo, all posible products of pairs of eignvalies of and, "The corresponding eigenvectors are fis, whare 5 ~ 5 © yj, whose (om + th entry fs (A) (0). Weeten nother wa, Woayvor-w X)-(A 2 Aa) (04) Quesi0x 6.6. (Hasy: Programming) Write oneine Mati program to in plement Algorithm 6.2: one step of Jacl algorithm fr Poisson's equation. Test it hy confirming that Ie converges ws fast os predicted in section 6.3.4, {Question 6.7. (Hard) Prow Lemma 67, Iterative Methods for Linear Spates 9 QuESTiON 6.8. (Medi; Progaming) Write » Moin program to slve the dserete model problem on a see sing FFTs. Th inp shoul be the di mension Nand square N-byV mate of vals of fy- The outputs sould be fan N-by-W mates of solution ey and the esidual [Ty 0A? f| (Ten Te. You should aso prodoee thre ciaensonal plots off and Use the ft Tul into Matlab. Your program should nt have toe move than afew ls long if you use all the Features of Matlab that you ean. Solve It for several problems whe solutions you know and several you do ot 1 f= sinGin/( 41) singh lA +0, 2 Sy —salj/(W $1))- SiN 41) 4a / (N42) ih 1) ‘3. Fhe few sharp spe (Doth postive and negative) and Is 0 elsewhere. ‘This approximates the elctestatie potential of charged parties leat atthe spikes and with charges proportional to the heights (poskve or Ingaive) of the spikes. If the spikes are all positive, this also the ravitetional potential. Questi0s 6.9. (Median) Confirm that evaluating the formu in (67) by Dforming the matex-vetor muliplestions frm right to lt is mathezat ally the sume Algoitn 6.8, Quesmos 6.10, (Medium Hard) 1. (Hard) Lat A and H be ral synumettie n-by-n matics that commute, Le, AM = HA. Show tht chere isan orthogonal matrix Q such that QAQ" = diag...) and QHQ" — dist, t) are both dis ‘onal, In other words, and 17 have the sme eigenvectors. Hint: Fist. assume A hay dite eigenvalues, and then remove tis aesumpton. (eam) et o oa bos symmetric tridiagonal Toeplite matrix, ie, asymmetric tengo imatrie wth constant along the cago and sons the odingona Waite down simple Fels fr the geass and eigeavecors of Hats Use Ler 6. (Mend) Let 0 Applied Noserical Linear Algebra bye an n2.byn? block tridiagonal mates, with n copies of A along the agonal. Lat QAQ = diaglay,.--) be the eigendecompeition of ‘Acad let QHUQ? ~ dag(,...,2,) be te egendecomposiion of HF howe, Witte down simple formule for the n egenvalite and gence tors of Tin terms ofthe ay, 8, and @. Hint: Use Kropeckerprodvets (Median) Show how to solve Px = b in OCW) time. In contrast, howe ‘much bigger are the running tines of dense I faetoriztion sd ad TU factoriation? (Medion) Suppose that A and ate (possibly diferent) symmetric tr ‘tgpnal Toepliee matric, as dein above. Show how to us the FFT tolwe T= bin jont OC lag) te QUESTION 6.1, (asy) Suppome that is upper triangular and nonsingvlar tnd that Cis upper Hessenberg. 
Contin that RCR™" is upper Hessen QUESTION 6.12, (Medion) Confirm that the Krylov subspace 1 (A 95) has dlipension fan only ish Arvo algorithm (Algorithm 6:0) othe Lanezos algorithm (Algorithm 6.10) ean compute q, without quitting fist QUESTION 6.13. (Aediam) Confer that whan AM" Is symmetric postive dofnite snd Q°** has fll column eank, then T— QTAQ Is also symmetse postive definite. (Fr this question, Q need not be ofthogoaal) QuESTION 6.14. (Median) Peove Theor 69. QUESTION 6.15. (Mediums Hard) 1, edu) Coatem equation (6.58) 2. (Median) Coaer equation (650 4, (Hard Prove Theor 6.1 QUESTION 6.16. (Median; Programming) A Matlab progr implementing mul to solve tho dserete model problem on a square ssallable an the class hormpage st HOMEPAGE/Matlab/MG.READMEhl, Sear by rune hing the demonstration (type *makemgdems” and thon “tstingv”). "Then, ‘ty running testing for liferen rghtshand sles (input area), elforent numbers of weighted Jacobi convergence steps belor a after each recursive call tothe mukged solver (inputs jack ane pe}, aed diferent numbers of erations (input ee)- Th soltaee wil lot the congener rate (tio of nancevtive sidan) does this ped on the sin of? the Iqueaces in b2 the values of jac and jac? For which vl of set and joe? the sation ‘nos flient? Iterative Methods for Linear Spates wo QuEsTON 6.17. (Means Programming) Using fst model problem solver frown eine Question 6.8 oF Question 6.18, use domain devomposton to blld 4 fist solver fr Poisons equation on a L-shaped region, as described in Section 6.10, The lange square shoul! be I-byl and the smal square should bo by, attach at the bottom sight of the Inege square. Compute the reskdval in order to show that your aswer correct. Quesios 6.18, (Hard) Fill a the entries of «table ike Table 6.1, but for solving Poisons qt in the dimensions insted of two. Asie that the ri af kn s NN, wth nN. Tey to il as ny een of clus? nd 3.38 you ea 7 Iterative Algorithms for Eigenvalue Problems 7.1, Introduction In this ehaptor we discus mthods for fing egerves of matrices that re too lige to ust the “esc algorithms of Chapters and 5. In other ‘words sock algorithms that take far Jes than On) storage and. OC) Tops Sine de eigenvectors of most n-yjn rasteces would take? store 10 represent this means thst we seek slgsitins that eompue jst few use Selected eigenvalues and eigeavetors of «mates ‘We will depend on the material en Keslov subspace methods developed in section 6, che materia on symmetric elgenvalive problems in sction 5.2, and the material 0a the power method sod inverse Keraton in setion 5. The reir i ved to tevew thes sections The simplest egenal rable so compute jos the lrg eigeale in abeoite val, along with i eigenseter. The power eto (Algorithm 41) ie the simplest gorithm suitable fr ths ta Real tha it ner oops wa = An, Whar 2; converses 1 te eigenertor corresponding to the desired eigenvector (provided thot thew Is ony ove eigenvalue of largest absolute vale, and 2 16 noe orthogonal to its eigorwector). Note that the algorithm uses. only to perform mneri-vectoe multpictin, 50 all that wo need to run the algo rion is a “blake that eakes as inpat and returns Aya output (S00 mpl 13). "A cksely elated problem ito fad the egenvalu ehsest toa user-auppiid value oy long with its eigenvector, This is preciely the situation inverse ieeution (Algorithm 42) ws designs to handle. Reel that is iner loop is tn = ale, sus = waafllals 6 st Apolial Numerical Linear Algebra ie, solving near sstem ofecuntions with cooiknt matric Ao. 
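As an illustrative sketch (the matrix A, the shift sigma, and the iteration count are placeholders, not values from the text), the inner loop just described is only a few lines of Matlab:

% Illustrative sketch of inverse iteration with shift sigma.
n = size(A,1);
x = randn(n,1);  x = x/norm(x);          % random starting vector x_1
for i = 1:30
  y = (A - sigma*speye(n)) \ x;          % solve with coefficient matrix A - sigma*I
  x = y / norm(y);                       % x_{i+1} = y_{i+1} / ||y_{i+1}||_2
end
theta = x'*(A*x);                        % Rayleigh quotient: estimate of the eigenvalue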
Arnin converges tothe desired eigenvector, provided tt there just one egenalae ‘dosest 10 @ (and 21 not orthogonal cots eigenvector). Any of the spare nate teciiques In Chapter 6 or setion 27-4 ould be used to solve for ‘net, although this is usually much more expensive than simply ultipying ty. When is symmetric Rayleigh quotient iteration (Algorithm 5.1) can ako be ust fo acalerate convergence (although it ot lays guranteed to converge to the elgeaalie of clot 10 6). Stating with a given 21, &~ 1 Iterations of ether the power method ot Inverse itration produce 6 sequence of weetors21,22,-..,Z4 These vectors span a Krplor subypac, ne defined in sccton G1, In teense ofthe poner retho this Krylov subspace is Ka, A) ~ spank, Ary, AP0 AS, fl in the ese of inverse iteration this Kor submpce fe Ku(aiy(—21)-") Rather than taking 8 our approximate egerwetor tit atral tok forthe “hes” approximate eigenvector in Ky, i th best linear eombintion SEE ave We took the some spprovch for solving Az bin setion 66.2, ‘where me asked forthe best approximate solution to Ax = b fom Ka. We Wl sc tht the est eigenvector (aa lgenvale) sppesiaatios from Ki se uch better than 2p alone. Since Ky has dimension & (ia gee), we can setually ww It co compute best approximate eigenvalues and elgnvectors ‘Those best appeawimatioas are called he Rite values and Fit vectors. ‘We will concentrate on the symmetric ese A= AP. ta the last set we wil bity deseribe the nonsyminetsie ease. "The rest of tis ehaptar is organlod a5 follows. Section 1.2 dlacsss the Raylogh itz method, our base teehique for extracting Information bout ‘igenvalus aa eigenvectors from a Krylov subspace. Section 73 discusses ‘our main algorithm, the Lapeaos algorithm, inexact arithmatl. Section 7A ‘analy the rather dfleent behavior of the Taneaoe algorithm in fot Point arithmetic, end sctions 7.5 and 7. desrbe practical implementations ‘of Lancoos that compute relinble answers despite rondo, Finally, sesion 7.7 bri discuss algorithms for the aonsymmetreeigenproble, 7.2. The Rayleigh-Ritz Method Let @ ~ [Q1,Q be any n-by-m orthogonsl matrix, where Qk is mb and Qu is moby — 8). In prctce the clus of Qe wil be computed by the Tena sgostin (Algorithn 6.10. Agoethen 7.1 elow) aed spas Kyle subspace Ky, and the subserpe indicates that Qy is (mostly) unkown. But foe tow we do not cae where we get. ‘We will uso the folowing notation (whieh was ako used In equation (6.1): OfAg. grag. T= QAQ= (0, Q4F AQUA = | BEBE BES Teatve Meshods for Rigunvale Problems Py pare ah) fae ey When & = 1,7 Is ust the Raleigh quotient T= (Qh, A) (ee Definition 5.) Sofor k> 1, Ty is natural gnoralzation ofthe Rayeigh quotient Darin Tt, The Rayleigh itz poeedne i to epmainete the eigen tales of Ay te eigetales of = QETQy. These aproinatins ere Caled ten. Let Th = VAN" be the egendrompantion of Th cr raging iene apyesinaton ree nun of Qu ar led Tie metas ‘The Ritz values and Rite vectors are considered optimal approximations to the eigenvalues and eigenvectors of for several retons. Fit, when Qe fan 80 Fare known but Qu and so iy ae Ty are akrown, the it wales fad Yeetoes are the natural apprsimations fet the known par ofthe mati, Secon they satis the following generalization of Theorem 3.5. (Whore 3.4 semed that the Raskigh qoasient am a chest approximation” to a single fgg.) Recall at the columns of Qy spa an iarant Subspace of A i and only if AQe = Qu for sme mate ‘Tuwonew 7.1. 
The minimum of |AQe ~ Qala oer all kbp symmetric tnatrioes attained by RT sich case /AQk Qua ~ [Thala Let T= VAV? be the eigendecomponiton of Te. Te mina of |AP,~ PLD oer all tye orthogonal mares Py were span( 4) ~ span( Qa) and D is ‘agonal Sa also Til ad ta attained by P,—QuV and DA In other words, the columns of QaV’ (the Ritz vetors) are the “best” Approximate eigenvectors and the diagonal entries of 4 (the Ritz vals). ane the “best” appravinate eigenvalues in the sease of miming the reid [Ae = PD. Prof. Wo temporsly drop the subseripcs K on Te and Qe ‘0 slmplfy ‘otaton, so we can write the F-by-k matric T ~ QPAQ. Let = TZ. We ‘want t0 show |AQ— QR is minimized when Z ~ 0. Wedo tis by using a Aiguse form ofthe Pythagorean theorem: WAQ—QRIZ = Ane (AQ QRI(AQ— QR] ty Pot of Lama 17 = aw (MQ QUr +297" AQ— OEP 2) Aanas [AQ = QT} (AQ QT) ~ (AQ = QTIQZ) “nag aN 11927102) si Apolied Numerical rear Alger amas [(4Q — QTY" (AQ — QT) — (QTAQ— TZ PQ ag—1)4.2°2) Nae l4Q—QNYF(AQ 0) +277) because QTAQ=T Anon [(AQ — QTY" (AQ - QT) ‘by Question 5.5, since 27Z is Smee postive emilee 1AQ = QTE by Part 7 of Lemma 1.7, Heston suberps sy to compute te minimum vale WAQe ~ QeTilla = QT + QuTieu) — (QuTel2 = Qu Tinea = Tiel Ie ples sy ay prot Qi whe Ue nero gn mat than hte ad Qe go th vane pcan, [AQ ~ Qua = [AQ ~ Qua = HANQ~ (QUILT RU Thee cue se il mined wR = Ti ed chosing U = ¥ sot Us nal ese the sod ton pee in he Sitenn fthe thre, "This theorem justifies sing Riz valves as igenalve approciations. When Qc compute by te Lana algorithin, in whic ease (ee equatian (6.1) a a Ba Beran | oe [eas Dear Bens Bet far On ‘thon it easy to compute all the quantities in ‘Theorem 7.1, ‘Tiss beause ‘her are god alznrithms for finding igenalues and eigenvectors of the sym etre trigonal matrix e section 5.) apd benuse the residual norms Simply Tia — a (From the Lancs algorithm we know that 3 nonnes= ‘ee This simplifies the erroe hounds on th approximate eigen ad ‘Sgenwrctrs inthe following tore ‘Turon 7.2. Let Th Tay eral Qe be sin uation (71). FQ comput ithe Lanceos algrtvn le, be the shale (posi) noniero entry én the pper right corner of Ty. Let Ty VAV? be te eigendorompositin of Ty, there P= [sth] orthogonal end A = dlag(,---0.)- Then Teatve Meshods for Rigunvale Problems sar 1. There are egenraies ay...04 of A (nt nerssriy the forest) sc that, — al Tilly fo" 5 — tan. IPQ 8 computed by the aye20s algriio, then, ~ 4] < [ills ~ 5 2 JAIQue)~(Quehls~ Mage. Thos, the diference fete the Rie alue and some eigeneatue oof Ais at mos [Tay wheh may be much sail than Tra. 1fQe 8 computed by the Lanczos alorithn, then Tala — Sel], where wh) te the Rib (lotion) entry of ‘This forts lets us compote the residual A(Qus) (Qa cheney, fey without multiplying any vector by Qe by A { Wihout any further information abont the spectrum of Tx, we cannot deduce np weal ervor ound on the Rts vetor Quvs. Ife bow that the gap bese ® and eny other eigenvaie of Ty Ty iat las, the we con bd the angle D detwren Qu ada re eigenvector of A by }singe < Hele, (72) 2 5 11's computed by the Lanezs algorithm, then the Bound simplifies to Be ° fanz Proof 1. The eigenvalves of P Include through 8. Sine =I Re om-[[2, Fl rte Weyls theorem (Theorem 5.1) telus that the ogeavalves of 7 and ier by at most [Tigl2- But he egervalies of T and A are identical, proving the result 2: econ WACQeee) — (Qevedile HQ" ALQue) — Q"(Ques)Aille = [Ls ELT so Tin = [Pewvila. 
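Theorems 7.1 and 7.2 translate directly into a few lines of Matlab. The following sketch (A is symmetric and Qk is any matrix with k orthonormal columns; the names are ours) computes the Ritz values and Ritz vectors and the residual norms that, by part 2 of Theorem 7.2, bound the distance from each Ritz value to some eigenvalue of A.

% Illustrative sketch of the Rayleigh-Ritz procedure and its residual bounds.
Tk = Qk' * (A * Qk);                     % projected matrix (tridiagonal if Qk is from Lanczos)
[V, Theta] = eig(Tk);                    % Ritz values on diag(Theta), eigenvectors V
P = Qk * V;                              % Ritz vectors
for j = 1:size(Tk,1)
  res = norm(A*P(:,j) - Theta(j,j)*P(:,j));   % equals beta_k*|V(k,j)| when Qk is from Lanczos
  fprintf('Ritz value %14.8f   residual bound %8.2e\n', Theta(j,j), res);
end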
‘Then by Theoeen 5.5, Aas some cigenvalve @ satssing la ~ 8 < [veal Qh computed by the Laneson soem, the [Tas = ils), brease on the top ight entry af Th, namely, snot, sis poled Numerieal Linear Algebra 5. Wereuse Example. toshow that we ent deduce a useful eror band ‘on the lita vector without further information about the spectra oT [tg]. where 0 < € <9, We lot k= 1 and Qh = lei), s0 Ti = 1+ 9 and ‘the approximate eigenvector i simpy. But as shown it Example 5.4, the eigenvectors of Tae lose to [1/59 an [-e/9,1/". So without ‘lower bound on gi the gap between the eigenvalye of Te and all, the other egenvales irlitng tne of Ty, we cana bows the error in the computed eigenvector. If we do have such lower band, we et apply the soa! hou of Theor $19 T and T+ H ~ diag, Ty) a derive qquation (72). 6 r 7: |. The Lanczos Algorithm in Exact Arithmetic "The Lancaos algrithan for fading eigenvalues of @symmetse matte A com bines the Lancaos algorithm for building a Kryloy subspace (Algorithm 6.10) ‘with the Rayleigh Rite procedure ofthe last secon. In ober words, bulls sn orthogonal matrix Qe = [al of orthogonal Lancios veetors and proximates the cigervalies of by the Rite values (the eigenvalues of the Symmetric tridiagonal matrix Tx = QFAQu} os in uation (7.1). Auconiin 7.1. Lanesos Aloit in exact erithti for finding eiensal sand eigenertors of A A 41 = W/Pla, = 0, w=0 Jory Leak ayn ate 55 04g) ~Bpaati =I 85 ~0, gut ayn 2/3) Compute egenealues, eigenosctors, and exor bounds of Ty end for In this section we explore the enewergenoe of the Lanes algorithm by d= sesiing # numeral example ia soe deta. This example has bees chase 10 istrate both spiel convergence heaviness well some more problematic baliavior, which we ell misconsergenc. Micomergence can oetur beease the stacting vector q i nearly orthogonal to the eigenvector of the desl fgenvalue or when thee are multiple (or very cos) egenvals Teatve Meshods for Rigunvale Problems va ‘The tite of this setion indicates that we have (aeary) eliminated he lls of roundel eor oa our example. Ofcourse, the Matlsh code (HOME PAGE/Matlab/Lanezosullteorthog.m) used 0 produce the example below ran In floating point arithneti, ut we Implemented Lanczos (in patinlar the iner loop of Algorith 6.10) in particularly carcful andl expensive wey ‘onder to make It mime the exact resulta closely ne possible, ‘This careful Implementation is ealed Lanczos with full eorthegonalization, a indented in the ites ofthe figures below. In the nest seton we wil explore the some nureseal example using the ginal, expensive implementation of Algorthin 6.10, which wo call Lane- ‘os with no terthoganatization In der to contrast with Laneson with fal rearthogonalization. (We wil so explain the diferencia the two impeneats tions) We wl sc thatthe original Lanczos lgorichm ean behave signifier Aiferensy fom the more expensive “esaet™ algorithm. Neverioks, we will ‘hem tom toms the ls expensive algorithm to compute eignvales realy Exaurut 7.1. We llustrate the Lancs algorithm aud its extor bounds by runing large example, « 1000-1000 diagonal mately A, most of whose ‘gonvalves were chase randomly fom a normal Gaussian distribution. Fig ture 71 i plot of the eigenvalues. To make later pls easy to understand, ‘ve have aso sorted the digont! ences of A from largest to smallest, <0 (A) ae with eorespondn eigenvector the th calean of th entity ratrix. ‘Ther aro a few extreme ogenvales, and the rst clster near the frner of te spetevin. 
The starting Lanems vetor gy os al eal eee, lescept for on, as described below. ‘Theres a losin generality in experimenting with diagonal matrix, pee running Lanczos on wth starting vector q is equivalent to running Lanczos on QPAQ with starting wer Q (se Question 7.1). ‘To iluseate coergenen, we Will use sevecal plots of the sort show in Figure 72. In this figure the eigenvalues of ach Te are shown plotted in column, for &= 1 109 00 the top, and for 1 0 29 on the bottom, with th cigenvales ofA pleted in an extra column at he right. Thus, column [bhas pluses, one marking each eigenvalue of Tk. We have sso eolr- MUTA) > AyaTaoa) 2 Apa (Ta)s In other words, (7) 0 poled Numerieal Linear Algebra Fig. 7.1. Bienes ofthe diagonal matre A Increases monoonleally with For any sedi ot just ¢ = 1 (the largest lg ‘vale). This ilustrated by the colored sequences of plus moving ight and up in the igure. ‘A completely anslogous phenomenon occurs with the smallest clgenvals: "The bottom bise plus sign In each column of Fgueo 7.2 shows tho smallest cigenvaloe of eee, and these are monotonically decreasing asf increases Similars ho sth sma eigenvalue i lo monotoniealy doeresing. This also a simple consaquenee of the Cauchy interlace theorem ‘Now wo can ask to which egervalue of A Ue eigenvalue (7) ean converge asf increases, Clearly the largest eigonvaloe of Ty, \y(Te), ought to eonvecae tothe largest eigenvale ofA u(). Indeed If Lancans proceeds to sap kn (without quiting early because some 0), then Ty nnd Aare sia, and 50 Ai(Fa) — AA). Sisley, the th largest elgeaalie A(T) of Ty must inerese monotonically and converge to the ih gest eignvalve (A) of (provided that Lanezos does not quit eri). And the ith smalkst eigenvalue Neva Ta) of Tost sinlarly derease monotonically and ennverge to the 4h Salles lgenale Reys-iA) OA All hse converging uence re reprsend ty saquenees of ples of ‘common coir in Figure 72nd othe gues ia this sion. Consider he ght ‘rap in Figure 72: or large than abut 13, te topmost and botommost Diack pluses for ociaoatl ows text tothe extern eigenvalues of, which ‘re potted in the eighties column this demonstrates convergence. Sitialy, the outermost soqucnes of tel pluss fot hoeaontl rows next to te saad Iteative Methods for Eigenvalue Probiems m a {a eo Pig. 7.2. The Lancs alorim api to A. The ist 9 sepa ore sow on the tay, sd the fr 9 sap ee shun Be Stim Calan Kahane the eels fe, exo tht righnest crn (etn 1 onthe ft and sal 300 the elt show ll the genes of 8 largest an second smallest elgenvaves of A in the rightmost column; they ‘convengt late than the outemos eigenvalues. A blow-up of this behavoe for more Lanezo spss shown in the top two graphs of Figure 7.3 ‘To summarize the above discussion, extreme eigenelue, i, the largest ‘and smalest ones, conterye ft, ad the interior eigenvalues coneege last arthermore, comeryence is monotonic, withthe i farpest (smallest) eigen- tnlue of Ty inreasing (decreasing) tothe ith leryest(emallest) egensatue of A, provided tht Lanesos dees nat stop prematurely with some 0. Now we examine the convergence behavior in move detail, compute the seal erorsin the Ii walues and compare these errors with te error bounds in part 2 of Theorem 7.2. We run Lanezos for 9 steps on the same motit pictured in Figure 72 and display Ue esl in Figure 73. Th top lel gah sr poled Numerieal Linear Algebra in Figure 73 shows only the legs egenvalcs, al the tp righ graph shows aly the stale geval. 
"The mile two grap in Figure 73 show he eres in te fue largest computed eigenvalues (onthe lft) andthe fou smallest computed eigen (oa tho sigh). ‘The colors in the middle graphs mete the ears inthe top rapt. We measure and plot the errors in thre ways 1 The global errors (he slid Lines) ar given by [A Ts) ~ (AIDA Weedlvide by [iA] i oder to normal the eror oe betwen (20 securacy) and shout 10°? (machine epson, oe fll curacy). Ask Increases, the bel erro decreases monotonically, and we expect 10 eerease io machine epsilon, unis Laneaos quits prmatures +The toe! crore (tho dotted tines) aw given by sminy (Ta) ~ CAVING) The loa error measures te sales is toner between NCTA) andthe nearest egenaline 2(A) of Ay not jut the taste valve (4). We plo this hastne sometanes ta local roe is ‘nuh salle than the global eror. +The err Bounds (the dashed fins) are the quits [2em(44/1\A) computed by the agortim (excep for the normalza- tion by [M(A)], whit of couse the algorithm does nt kaon). ‘The bottom two grephs in Figure 73 show the eigenvector components 1 the Lanezns vectors 9, forthe four eigenvectors corresponding tothe four largest eigenvalues (on the Kft) and fr the four eigenvectors eoeresponding to the four smallest eigenvalis (on the right). nother words, they ple akes — ai), whore esis tho jth eigenvector of the diagonal matric, for EA 1 Sand for j 110-4 (othe lft) j — OT to 1000 (on the Fgh). The components are plotted 0a logarithm sel with * "ad 0 to lnicate whethor the compouent is postive oe aczative, respectively. We ie these pls to help explain convergence below. ‘Now we use Figuee 7-3 10 examine conversence In move detail. "The largest cigenvalue of Ty (topmost black ploss in the 1p left graph of Figure 73) bri conversing to its nal value (about 2.81) right eWay Is correct to six lcimal places aftr 25 Laneas stops, and is coret to machine preison by step 10. ‘Tho global eror is shown by tho slid black ine in the mile fet zap. ‘The lal error (the dotted block lin) is the sane ws te global error ee not too many’ steps, lthough ican be “ocidentally” much stall if ‘an egenvalve (7) happens to fll low to some other 2y(4) an its way 0 (A). The dad block ie in the sare graph ts relive eror bow ‘computed by he algorithm, which verstnates the tue ror up tO about ‘Sep 75. Stil, the relative erroe our corrects ideas tha the largest ‘detuvale is eoeret to several decal digs ‘The cond through fourth largest geal (the topmost red, green ad blue puss inthe top le graph of Figure 7) converge in slr ashon, Iteative Methods for Eigenvalue Probiems a Fig, 73, 90 stp of lanes applied to A, The the lat eevaes ave shun on the bf and the ames onthe right ‘The opto grap show the eigenen ‘Mle, the mde trp the ers fll ~ nd, al ~ det. Bonds nce) en the btn tn graph show wencomponentn of Lane eto. The lars im clr of tre gre mah wt poled Numerieal Linear Algebra ‘with gene converging slightly faster than egenale #1. Thi typical Travir of the Lacon. ‘The bottom ltt graph of Figure 78 measures eonvorgence In terms of, the eigenvector components gfe. To explain ths graph, onside what hap- pens to the Lancaos vectors gx the fat eigenvale converges, Convergence Incas thatthe eoreesponwing eigenvector ey nearly is in te Krylov subspace synned by che Lancs vectors. Tn partes, since the fst egenvalie hs fonvorged after — 50 Lanczos steps, this meer that cx must very eae be ‘fnear combination of q though ga Snot the ge are mutually orthogonal, this means ge tus also be orthogonal to e for k> 50. 
‘This borne oot by the bbe eure in the bottom le raph, whieh has dereased to es than by step 50. The re eure is the component of ex gy a his reaches 10°° by step 60. The green ene (Iki egeacomponent and blue eure (our ‘gecomponen!) get comparably smal afew steps later. ‘Now we dscus the smallest four egenvales, whose behavior Is deseebd bythe three grapis on the right of Figure 73.” We have chosen the matrix Aud starting vector qt illustrate cercln difenties that ean arise inthe ‘onvergence ofthe Lanczos algorithm to show that convergence isnot lays fs sraightforwar 4s inthe eas of the four eigenvales just exo In particular, we have chowen (99), the egeneomporent of gx in the egtion of the secon smallest eignvalve (281), to be bout 10-7, whieh i UP times slr in ll the other componeats of), which sree. Ao, ‘we have cowen the third and fourth smallest eigenvalues (numbers 98 and 907) to be nearly the saune:—2.7ON0D! ane 27 ‘The convergence of the smallest eigenvalue of Tyo Daan) ~~ 6 uneventful similar to the largest eigenalues, It core to 16 cigs by step 0. ‘The stondamaat gala 7, sh i od, besa iy miscomerng tothe sda ena of, nae “2 ade th dt i i {In might gap of gue 7. ws that SCT) ees tho) tonic decimal pla for Lato step 40 <0 Te cormponing roe Bonod (th we dated in) ell tha on) enn some eg of A toto eile forth same vaso Th reson yp ‘miconerger shat the Ko subypace start wh ery sal open ot {hecorepondins Kr iby nn any 0 Thscan been he te curvoin bottom ist graph, wich stars 1-! a takes ntl tp 1 before compart oe appar Onl at this pt when the Kel tutngcecotio » sfcely ge ermons: ft Sesto omy eh Aoi) sare conegg sein Ks hel lw May) = “28%, 8 own ithe top end wile pt grophe. Ons ti contre bas atin a, the crnonen of tp starts Geretng agin ad etm very el ees w(t) bas cone to hasty crt. (For «aaa ‘lta ewe th converge ate ate egecponta , the tort of Kane a Sua daca below) Iteative Methods for Eigenvalue Probiems a5 Fig. TAL Laneson apie to A, here the alerting eer. is orthogonal Lo the ‘anentor sorepoding to the second malt egveae 281. No eprrimetion Tagen, fg were exact etn 1 en, 0 ey = 0 ae than jun ey ~ 10° the later Laneonveciom woulda be ogo to ti. This means Ayg( Ta) would power conser to Naw). (FOr 8 pa te Question 73) We late this in Figure were we ve odie st ig s0 Uae fern O. Note tt no sprain Wo Awa)» 281 ce appen Fortunstely if we chose qa ado, extremely why o be o- Uogutal to ab gence. We can olvaps rerun Lage with dierent andor to provide more “waite” ever that we have ot mised any Seorvalins ‘tbe sone of “covers” are (searh) mle geval, seh ss the the third smallest eigamlie AA) ——2-THO0. sd the fur Stale Gieaaloe A) 2-1. By examining Na) the i Inet green curr inthe tp right med middle right graph of igure 73, we sm tat during Lanes sap 50 © k 75, MT) mdsonerges Lo abot “Sons, hayeay ec the two dows egenalve of A This to ‘ible at the elton povided by the top sgh grap but evident fro Ue boron een of te sod ve ine a te ke gh arp Suing Uaneon sepa 50-2 F< 75. AL sp 70 raph converte tthe fal vale Sou() 2700 st aa. ‘Nani, the fourth sae aenalue Av), shown i Be, as isomer toa ae peat om(A) = 24; the ble dotted oe Ih the tidle righ gop nats at Ayr) at Ava( A) re 1 op tose {einalpaces ear sep Gl. At sep k= 05 rp convergent In gain othe Hal wie hr) ~~ "Tis ean lobe seen nthe oti x6 Applied Numeral Lincar Algeben Fig. 7.5. 
taneon applied lo A, here the thd and fourth wna eigenalacsore rl Only me appreciation itn dete eigenvalue i mut Fight graph, where the eigenwetor componcats of ear al ena grow agai ring step 30 << 65, afer which rapid convergence sets in ad ey agin decrease Tadd, i yn( A) were exactly double igenalue, we ela Uns, woud ver have two egenvalivs fear that value but aly one (a exset arith) (For a pro, se Question 7.) We ilustrate this in Figure 7.3, where we have ‘modifed A just slghly so uh i has two eigenvalues exactly equal to —2.7. ‘Noto tht only ono approsimation to 2yi(A) — Ayn (A) = —2.7 ever appears. Fortumsoly theo ae many epplications weee iti slant to And one ‘copy ofeach cgonvalve rater than all mile copies Also iti pssble to use “block Lanez” to recover multiple eigenvalues (ee the algorithms ced insecion 7.) Examining other eigenvalucs in the top right graph of Figure 73, we see Liat maeonwergence quite common ident by Ue fu shor. bor ronal segments of Hoard pluses, whieh then dropoff tothe right tothe eet stale eigenvalue. For example, ue seveath sales geal ppeowimatid by the fit (back), sth (ed), and seventh (aren) ‘dgonvalves of 7, at various Lance steps “These rasconversee piernmena expla why tt compatable eroe bana ovided by part 2 of Theorem 7.2 exentnl to monitor convergence. I the ‘ers boul is snl, the computed eigenvalue inde & 00d approsiation to some eigenvale, even ifone is nssing” © ‘There is nother erro bound, de to Kani! and Sed, that sheds ight on why miseorvergeace occurs. ‘This ert bound depends on tke age betwee the starting vector gat the desi eigenvectors, the Ritz wales, snd the erative Mechods for Rigenvalve Problems w7 ese! egervalus. In other word it depends on quantities vaknown dosing the computation, soit not of pestial tse, But It shows ct if yf peel forthogoaal to the dsied elgenvector, oF Ifthe dese elgenalue is nerly mull, then we ean expect stow eonvergence. Soe [15,set. 121 for details 7.4. The Lanczos Algorithm in Floating Point Arith- metic ‘The example in the ast section desribed the behvior ofthe “dea” Lancoos orth, esetilly without rou. We el the coresponding ere bat ‘expensive mplereatation of Aletha 6.10 Lanezos wih fal rorthoqonalis tion to coatrast it with the orignal inexpensive implreatation, whieh we ell {eos with no reovthogoalaion (HOMEPAGE /Matlab/LaneaosNoReothog.). Bath algorithms are shown below. Aucoin 7.2. Lanczos alpritim with fll or no reorthagonatiation for Finding eigenvalues end eigenvectors of AAT 41 = O/Pla, = 0, = 0 forj=ltok Ay age 2a 2 DE CMa = 2— EI ICT fal rortogonaistion B= 2- ay Bit to rerdhogonaization 3 Iele 3735 —0. gut aya 2h Compute cipenalus, eigenvectors, end error bounds of Te ce for Pull reorthogonaization corresponds to applying the Gram-Schmidt or ‘hogonlization process = = 2—3-/-{(sT a)” fen order to almost surely rake = orthogonal to; tough qi. (Se Algorithm 3.1 as wol as (195, set 6.9} and [103 chap. 7] for disesiens of when "twice is enh”) In eet arithmetic, we stomed in soetion G1 that =f orthognnal to a through jt Without reorthogonalization. Untortusaely, we wil se tat oad! destroys this orthogonality property upon whieh all of due analysis has depended fe. This loss of orthogonality doesnot emse the algorithm to bebe om pletely unpretitae- Ted, we wil ee that the price we pay i 0 st nuliple copies of converged Bits wales. 
In other word, ested of Ty having fe eigenvalue eae eal to (A) fe Bare, may have many egeoies arly equal to (This not edsaster i ne mak eonceened about srs Apolial Numerical Linear Algebra ‘computing multiplicities of egenvalves an! doesnot mind the msuling de Tal convergence of interior eigerals. See (56) fr dead description of ‘ Lancars ipeanentation Ut operates la this fashion, nd NETLIB lanes foe the software ial But if acurate multiplicities are important then one neds to keep the Laneats vectors (nets) orthogonal So one could use the Lanczos lgnethm ‘wth fall reorganization, as we did in the Tost section. But one an easy ‘onfir that this eons O(n) flop insted of OU) flops for stop, sod ‘O(n space ies of O(n) spacey wih tay Be 0 Mig ee Co ey. Fortunately, there isa middle ground breween no reorthozonalization and full reortogoaalization, whieh realy gets tho best of oth Wolds. Tt turns ‘at that the q lose thee orthogonality na very spstematie way by developing Inrge components inthe decions of seedy converged Rite vectors, (This = ‘wnt lea to rnlipe copies of converged Ritz values.) This systematic loss ‘of onthogonaity is llustrated by the ext example and expnined by Pale "hcorem below. We wll se that ly monitoring the compute era bonds, we fan conservatively preit which 1 wll hae lee components of which Rice vetoes. ‘Then we can seertivelyortagonalise qe agus just thse few price Rite wetor, eather than aginst al the eal qu at each step, a8 with full reoathogonaization. This ktpe the Lanezos vectors (beady) orthogonal oe very Kl extra work. The next section disuses seltive orthogonalization In deca [EXAMPLE T.2. Figur07-7 shows the convergence beso of 149 tops of Lane ‘anon the matrix in Example 7.1. The grap on the right are with fal ‘xthogonaliation, ard the graphs on the kt are with no reorthogoaalzaion ‘These wraps are sie to tae in Figure 7.3, expt that the shel eae ‘ait, sine this elutes te middle graph igure 7.6 plots the smallest singular value own( Qa) versus Lanczos stop 1 In-exaet arithmetic, Qu orthogonal rd $0 Omu( Qu) — Ie With round, Gc lows orthogonality starting nt around stop kT, cd aul Qe) dope to DI by step E80, which fe where the top two graphs in Figure 77 begin to diverge visuals In particular, starting at sop &~ 80 Inthe tp le. graph of Figure 77, the second smallest (red) eigenvalue Aa(7., which hed converged to XA) = 2.7 to almost 16 cigs, kmps up to (A) ~ 21 in just fe steps, ying © “second copy” of §y(A) ning with A(T) (in back). (This may be har ose, ‘nce ther plises oxerwrite and so ube the black plas.) This earsiton fan beso in the leap in the dashed rod errr boul nthe mil et graph ‘Abo, this (anton as “oresbndowed” by th increasing component of ¢) in the bottom let graph, where the hick curve stats sing ain a Stop [E50 rather than coating to decrease to msehine epalla, aI does with full rethogonszation ia the botiom eight graph. Both of tise ladeate that the algrithan is diverging fom Ks exact path (and that some selective Teatve Meshods for Rigunvale Problems so Fig. 7.6, Lancos algriti alto morthognslcatin ep tA. The amalent ‘anlar vl opus) afte Lanes lor mates Qe show for k= | to 1 Inthe ebenc frome Qs se ordngenal tnd 0 al ager ace shold be oe hth rund Qe comes ak dist xthogonallzation is ale for). 
After the second eopy of (A) has conversed, the component of ey in the Lanes vetors starts dropping agua, starting @ Tae after step k= 80, Similarly, starting at about stop k = 95, a second copy of Xa(A) appears ‘when the bie eure (4(T,} i the pe ll ph maw rm about ACA) = 26 10 (A) ~ 27. Ac this point we have Iwo cops af y() = 2.81 ad 190 ‘opis of g(a). ‘This i a bit hard to ate on the graphs, since the plies fof one elor obscure te ples of te ober eolo (el enero ble, ad blue overwetes geen). This transition Is indested by the dashed blue eror bound for A(T) in the middo ee graph ssing sharply near f ~ 95 apd le ‘ocesdonea y the ring red eure in the bottom lt graph, which indicates ‘that tho component of ea in the Lanczos vectors is eisng. ‘This eomponent rks near had starts dropping aga, Final, around step & = 145,» thirdenpy of 4(4) appears again ented ‘and foreshadowed by changes jn tho wo baton ltt graphe- If we mer 10 fontinue the Lanezon procs, we would pvioieally get dion copes of any other converge Ritz wales. © The next theorem provides an explanation forthe behavior seen in the bow example, td hints tx peti! erterion for selctively ort booing TLeancas wetor. Ino ns to bo veri by taking ll posible rondo rors ito eenount, we wil dea an ars expeience Uo idea Use few rounding cr that ae kaportant, ad spy ignore the st (195, set. 13 4, This ets us sunmare the Lanczos algorithm with no weoethagonalztion 380 Applied Numeral Lincar Algeben Fig, 27. 180 sep of Lancon spied to A. Column 10 (a he gh of the top ‘pap stows the egenalcn of AI the graph, na rearthogonalcaion ss dane Inthe righ rpin Jl rarthgpalcatonw dant erative Mechods for Rigenvalve Problems ss 1a one ine Basin By = May ~ 0985 ~ By) ws) In thie equation the variables represent the values scully sored inthe rn chine, except for J, which represents the ronda erorinewreed by eval ing the righ-hand si aa then computing, and gh non [is ound by Of), shee es machin cpio, wih val we nae to ko bout fy. In addition, we wil write Ty —VAV! emely,since we kuow tat th roundoff eors occurring in this elgendecompostin age not important "Ths, Qe not maessrily an orthogonal mates, but Vi. ‘Turonss 7.8. Paige. We use the notation end assumption of the last para sgreph We as et Qy ~[a--sayl,V ~ [tra], and A= dig(,--- 0) We eontinne ro call the colunns uy ~ Ques of QuY the Rts vectors and the 8 the Rit walues. Then oan Tale vhetee In ctr worth the emponet fs of th compe Lae tor turin the dieton ofthe Rite wer, — Qs proportional the Fra of se wich the enor tiund on U orepoing ike {elu fe Pa of Theorem 12). Ty ne the Rie wie comers Zn ivenror bond es Bo oa, Ue Laue eco ces a lrg companent in the ein offs etn woe Th er wetor Dace nar dent yn fn Fsample 73, Ted, Pure 73 et both th err bn Be] PCAN = [LAL sd ite etn component fy for the tare Rite val (6, the top sap or the mond lagen Rit ae (12 the btm sap) for 10 diagonal exp Acorn to Paige theorem, th posit ofthe 0 ume ould be Of) nde ti an cab te by th pty a the Cures ait the ie lie y= ate sone sp Fr of Page's thane. We war with quton (73) Tor} 1 10. ~ ba ‘we thse ations ste snl eqvton AQy = ONT + (0.0, Baas + Fe uti + Bates rel + Fey whee ef 6 the k-dimensional rom vector (0.01) ad Fe = Ufi-ooSe the mats of roundof eros. We spl nation by dropping the subceript Eto get AQ — QF 4 dye! | F. Multiply on tie let by QP to ext Q7™AQ GEG 1 aQT ac? + QI. Sineo QTAQ Is symm, wo get that Q*QT Qh” + QTP equals ts transpose o, rearranging this equality 0=@ar— TQ). (ray Q7Q) | 10" ae? 
a4" Q) QF se Apolial Numerical Linear Algebra 1 and wae it ae x vector, eet that — Ae then tote ta wT aed Qyw ~ (9o(8)|-lU"(QUI} as) is the product of error bound du(k} and the Ritz vector component g”(Qu) ay, which Paige's theorem says should be O(={A]). Our goal is now to manu enon (7) to mn expen Fore lo, then cunton 13) “oth er, we now invoke mor sinplfpingssmptions abo ound Sino ech ann o Q m gotten ty diving’ 8 wet 2 by Re rm ho diagonal of Q"Q is equal to 1 to full machine precision; we will suppose that it Sensei. Furthermory the vecee 2 == —yy = 4 compte the Lani lim otc to be othgnal oft eo te tha gs an gy aretha to mal fl eactne pect, Ths (GQjnug OU wewilsiny soars (Q"Q)jyny Te Now write OO 1G eh, wt Oi lor trguler Hein fou nein abt fon, Co ft seme nl th sod suidiagyal un ow. Tis @or-19'Q~ (er TE) (CFTC, ‘whete we enn use the ze seuctures of C and T to easily show that CT —7C Kestrel lower triangular andl CT ~ TC is stritly upper triangular, Ako, See e is ronzero only in its last entry, eg? Is nonzero only i the last row Frehermore, the sree of Q7Q jost esrb implies that th last two nts ofthe ast row of 4 ate mo. So in pula, egQ ako strict lower triangular and Q¥ ge istry upper triangular. Applying theft that ea @ and CT —TC sve both sects lower tingle to eatin (7.4) yes ~ BenQ y eo) hae Ls he strict lower rangle of Q"F— FQ. Mukiplying equation (7.6) ‘the lft by 0 and othe Fgh by, wsing equation (7.5) ad the fat tat (CT TCyw 0" Cxo — Oo! Cv ~ 0, sills Blea @yv = (9h) -[g(Qe] = be Since ei Lo] <= O(IQ"F ~ PPQI) = FI) = OFA), we set (49) fa" Qe] = OFA, whichis equivalent to Pages theorem, © o-er 7.5. The Lanczos Algorithm with Selective Orthogonal- ization Wo liscuss variation ofthe Lanczos alge which has (cay the high = curacy ofthe Lanesos algthmn with fll eorthogonalzation but (sey) the Iteative Methods for Eigenvalue Probiems 3 Pig. TA. Laneon with no rothponasation ppl Uo A The fr 119 steps are ‘shou forthe larg! igenel in ac, a lop) and forte sz gegen (icrod wollen). The dhe linen ar err buns lore, The lines mand pean os chew piu he comproc of Lame err Bt in the trees of the ht tor for he eget Hk a, tp) ofr the ea agen = fede (2, atom) ast poled Numerieal Linear Algebra lo cos ofthe Lance algorithm with n reorthogonaizion. This algcihin Wale the Lanconalgoithn wth selective orthogonatzation, A discussed in the ist section, egal to ep he computl Laneaon vector ym ey thos! as pasible fr high seearay) by orhogonaiing them sens 95 few other vectors posile at exch step (fe law cost. Paige's theorem (Te fe 7.3 in the Inst seco) tell vs thatthe ay lote ortho beease hey fcaure large components In the direction of itz veto ua — Qed whose Rite valves 0 have converged, mesure by the error boul yt] ex coming smal. This paornenon wa usted In Beample 7.2 "This, the simplest venon of sletie ortngnaliation simply motors ‘he err bound Sli) teach sep and wen i becomes sal enough, the ‘eto = in the Jane lop ofthe Lanes algorithms ortowanalina agin seg: 22 (cla We conser et(}| to be small when ts kes than ‘lal, sce Pag’ theorem tls us that te vector component yf a1 = Intl then kel to exes YE. 
(In practice we may replace A by THE, since TL known and All may nak be) This leads tothe following slgosi Ausonmne 7.3, The Lanczos algritho with selective orthagonaisation for Finding eigencaives ond eigensectors of A~ A 41 = b/Bla, = 0, w= 0 forj=Vt0ok += Ay, 52-046) ~8)-1t;-1 / Selectively onthogonelce agonal converge Rit vectors */ forall < ke suc that l0(8 $ veh a= e(ueelve ed or 9) - lela 0, —0, quit ass 213; Compute egencalues, eigenosctors, and eror bounds of Ty end for ‘The flowing example shows what wil happen to our eater 100-by- 1000 diagonal matvix when this algeithm is sed (HOMEPAGE /Matlab/ LancaceSeectiveOrthog). Exaurue 1.8. ‘The behavior ofthe Lancas algoritha with slotve oetbog- ‘onalzation is wall idstingsinble fom the ebavior of the Lancy forth with fall orhogonalzaion shown ia the thre graphs oa the right ff Figuee 7.7. In other womds, sletve rthagonalization provided 86 much ‘securay as fll orthogonalation, Teatve Meshods for Rigunvale Problems as "The smallest singular values ofall the Qe were gzeaer than 110°, which means that selective rthogonalization dl ke the anezos vector oregon 1o bout half presion, as dese Figure 79 shows the Ris vals ofthe Ritz vectors selected for reortogo- rallzation. Since the seed Rit vetoes correspond to converged Ritz vals fd the langset and smalet Rita vals converge fst, there are two grap ‘the large converged Fitz values are at the top end the small conversed Ritz valves aro atthe bottom, ‘The top graph matches the Ritz vals shown in the upper right graph in Figure 7.7 that have converged to a est bal pre ‘on. All tether, M85 Mite tts wore sled for orthogonalization af 3 total posible 1499150/2 = 11175. Thi selective ortiogmration dit only 1485/1175 ~ 19% a5 ek woek to ap he Lana vectors (nes) orth onal fll worthogonslization, Figure 7.10 shows bow the Lanes algorithm with selective ceorthagnal- lation keps the Lanczos vetoes orthogonsl just tothe Rutz vetoes for the Tnrgst wo Fu yl. ‘The graph at tho top a supeepesition of the two rapt in Figure 78, which show the errr bounds and Ritz vectors eompo- ‘ents forthe Lancaoe gorithm with no reorthogonazatio. ‘The gph atthe bottom sth cormesponding trap fo the Laneen algorithm with sete or thogonlization. Note that at step k — 50, the err bound fo the largest cigenvale (the dase black in) has reached the threshold of y. ‘The Rite ‘ror islet for orthogonlizton (as sown ty the top black pases inthe ‘op of Faure 7.9), and the eompodent in this Rit vector direction diseppears from dhe beim grap of Figure 7.10. Te sap Iter, at = 58, ta eror ound foe the second Innis Ritx vale races /, and it too is selected for ‘rthogonalation. ‘The ere bounds in the top graph continue to decrese to mochin epsilon © and stay there, whereas the ero bounds inthe bot raph eventually grow again, © 7.6. Beyond Selective Orthogonalization Selective orthogoalizaton isnot the end of the tory, beans the symmetric Laneaos agri ean be made even less expecve. It tras out that enc Lancs vector ls been orthogonaliznd agaist partlelar Hitz vector It tales many steps before the Lancaos vector again requltesorthogoliaion inst. So teh of the oethogonalzation work in Algorithm 7-3 ean be ‘liminated. Indeed, there is «simple and inexpensive recurrence for deciding ‘when to reorthogonalze (222, 10), Another enhancement to us the eror bounds toefcentl distinguish between converged and cmscorwergd” eigen ‘ales [16]. 
4 tae-o¢-hoare implementation ofthe Latico lgoritn de- ‘eribed in [25] A dierent software implementation is wala in ARPACK (NETLIB scalapuck rede rpc (168, 281) we apply Lane2o tothe sifted and iver mates (Ao), cea we cexpoet th eigenvalues osest to 6 to converge fest. There are ater mathods 386 Applied Numeral Lincar Algeben erthogenaiation sped tA. The ‘wefrn ae select or othgpatcaton are teu 1 “precondition” a matetx Ato egverge to eatin egemalaes more que Focesample, Davidson's method (8) i wel in quantum chemise whee A ts strongly diagonally dominant, 1 also posite to combine Daval- {en's method with Jacobs method (27 7.7. Iterative Algorithms for the Nonsymmetric Eigen- problem Wha A & nonsymmetsie the Lanes algorithm deseiba above no longer applicable. Thore ar two alternative ‘The first alternative is to use the Amal lprithm (Algorithm 6.9). Re- call that the Arnoldi algorithm eompates an orthogonal basis Qe of 2 Krylow subspace Ke{an A) such that QZAQK = His upper Hesse rather than Symmetric tiiagonal The Taygh Ritxprocesire is ain to apposite Iteative Methods for Eigenvalue Probiems a Fig. 7.10, The Lancson agri with slate othoonaztion applied te A. The tap org sh he a 1 se fhe Lanecn agri thn retention te te lotion soph sons the Lamon arth wit elective othoowalation The lags cient mown Mack wd the second lye egea¥e ‘red The dahl tines re ero ands as bere The Ines mare by oad o's ‘how ues the component of Lacon rstar B+ the dection of the Ha wet Jor the ingest Rt ale 0 ~ yw ack) 0 forthe second nest Hus ale 2 tn ra). Note hat sel orthgonaleationcleinten component hese samen! ‘ker theft sles orthoonacations a step 3 (I) wad 8 (0). ass poled Numerieal Linear Algebra the cigenls of A by the eigenalies of He. Since Ais nonsymmmeti, is geval may be comple and jor badly coaditond, 0 many’ af the sk Tesetive error bounds ase! monotonic eoaergeace propetics enjoyed by the Lancaoe algorithm and deselbed in sxcton 7-4 no loager bold. Novethe- les, effective algortins and implementations exist. Good referees inelnde [152, 168, 10, 214,215,281) aud the book 211) The Invest sotware is do- Scribd in 169,281 and may be found in NETLIB/sealapack readme arpoek ‘The Matib command apeig (or “sparse eigenvalue") uses this software, ‘Asecond aernatvo so use the nonsynnetrie Lanes algo. Thi orth attempts to reduce 4 to onsymmeti tridiagonal form by nano bog ‘onal similarity. Th hope that i wll be easier to Bind the eigenvalues of | (spore!) nonsymmesre tegiagonal matrix than the Hessenberg matrix po- ded by the Arnoldi algorithm. Unfortuntey, te similarity trasormations| fan be quite leon, whieh reas that the inva of the rings ‘nal nd of the origin matrix may re fle In fact, 3 90 says possible o Bad s appropriate sindlaety because of plsaomenan hewn 38 rekon” [11,182 138, 197] Attempts to rep beso by by a ro. css called “lookahead” hive been proposed, implermeated, and analyzed in [i618 54, 55, 68,105, 200, 265,26) ‘izally i posible 10 apply subspace eration (Algorithm 43) (19), Davidson's algorithm [214, of the Jacob Davidson slgoethin [228] to the sparse noxsymmetre genproble, 7.8. References and Other Topics for Chapter 7 In addition tothe references in sections 7.6 and 77, there ave a number of ged surveys sible on algcitimns fr sparse eigenvals probiens: see [17 30, 128, 161, 195, 211, 260), Paral implementations ae also diseased in 75), ‘In scion 62'ne discussed the existence of ole help to choos from mang the variety of erative methods wllble for solving zB. 
A sill projet is underway for elgenpeoblems nnd will be incorporated Ia a ature tation of this book 7.9. Questions for Chapter 7 Qursri0y 7.1. (Baxy) Confirm thot running the Aeoold algorithm (Algo- than 69) o the Lanezos alge (Algorithm 610) on 4 with tating wetoe 4 leds the denial teidiagonal mateees Ty (oe Hesenberg matelees My) 8 Funning on QPAQ with starting weetor Q"y Quesrioy 7.2. (Medi) Let A, be simple eigenvalne of A. Con that 1g orthogonal o the coreesponing eigenvector of a, then the egenvalis of the tdingoaal mateices Ty computed by Use Lanes algorthen in exset facthmetie cannot converge «0 A, i the sese thatthe largest 7, computed erative Mechods for Rigenvalve Problems ao cannot have A a8 an eigenalue. Show by meas ofa 3:by-$ example, that fgenvalue of some other Ty ean equal *eeseatally Quesnios 7.3. (Medi) Contre that no symmetsctridigonal matrix Te ‘computed by the Lancio algorthm ean hve an exatly ltipke geal Shove that if A has a muliple cignvale, then Lancaen applied oA must rk down before te last step. Bibliography [1] R. Agarwal, F. Gustavson, end M. Zubalt. Exploiting functional par allel of POWER® to design high pevformanee numeelealalgthins TBM. Res. Development, 38:63 576, 1904, [2] LAs. Compler Analysis. MeCrowe Hi, New York, 1966 [5] A. Abo, J. Hoper, and J Ullman. The Design and Anais of Com ter Agorthms. Addvor-Wesley, Resling, MA, 197% [a] G. Aled and J. Herabergee. Iureduetion to Interal Computations Academie Prss, New York, 188. [5] PR. Amestoy and 1. $. Duff, Veetortanton ofa multiprocessor sult frontal eo. International Journal of Sepereomputer Applications, 1 5,198, [6] P.M Amestoy: Pactorzation of large unsyrmmotrie sparse matrices bas ‘on mulirontal approach in multiprocesor enviroment. "Toebni= tal Report TH/PS/91/2, CERFACS, Touloe, Francs, Febery 1901 PhD. thesis IT] A. And and 1 Park, Fast plane rotations with dai scaling, SAE SH Masrie val Appl, 1162-174, 1904 [5] A. Anda oud H, Pask, Sal sealing fst eottions fr st least squares problems. Linear Algera Appt, 24:187-162, 16 [9] A. Anderson, D. Caller, D. Paterson, aod the NOW Team. A ens for networks of workstations: NOW. TEBE Micra, 151), February 1955. [NO] B, Anderson, Z. Bal, C. Bisho, 3. Demmel, J. Dongarra, J. Du Croz, A. Gree, Hammaring, A. MeKenney, S. Ostrov an D, Sorensen, LAPACK Users" Guide (2nd edition). SIAM, Philadel hia, PA, 185 [tt] ANSI/IEEE, New York. 1EBE Standard for Binary Mlatng Point Arh Inet, Si T1085 eiton, 1985. 112] ANSIIEEE, New York, 126 Stondand for Radi buependent Pang Point Arihmetie, Std 854-1987 edition, 1987 om Bibiograply [18] P.Arbens and G. Golub. On the spect decampesition of Hermitian ruatices medied by sow rank perturbations with spplestions. STAAL J Matriz Anal. App, 940-58, 188, [14] ML Aol, 3. Demme, and 1S. Dull. Solving sparse ner systms with ‘ise bad ert. SLAM J. Aforir Anal pn, 10.165 29,198, [05] ©. Axelson. Hterative Solution Methods. Cambridge University Press, Cambridge, UK, 1994 [06] 2 Bri. Rrvor sna ofthe Lancso algorithm for the nonsymmeti égevale prea Math. Comp, 62210 226, 104 [17] 2B, Progress inthe numerical solution ofthe nosymmeti ignvale problem. J. Nemer Linear Algeew Appl, 22023, 195. [IS] 2B, D. Da, atl Q. Ye. ABLE: An aeaptive block Lanczos method fee ron- Hermitian eigenalie problems. Mathematis Dept. Report 95.0%, University of Kentucky, May 1955. submitted to Math. Comp. [19] 2 Hos and GW. Stewart. SRRIT: A Fortran subroatine to ealeaate the

You might also like