58% found this document useful (19 votes)
11K views

Solution Optimization 2ed

An introduction to optimization 2nd edition solution

Uploaded by

Sun Young Jo
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
58% found this document useful (19 votes)
11K views

Solution Optimization 2ed

An introduction to optimization 2nd edition solution

Uploaded by

Sun Young Jo
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 138
An Introduction to Optimization Solutions Manual Second Edition Edwin K. P. Chong and Stanislaw H. Zak ‘A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York / Chichester / Weinheim / Brisbane / Singapore / Toronto 1, Methods of Proof and Some Notation Li A B|notA notB | ASB (not B)=>(not A) Fr>T TT [tT T Fqoo?t FT T TF) F | F F ror Ft T 12 A B| notA notB | AB not(Aand (not B)) FF)? | T T Ft)tr Flt T TF) F | F F tr} rF FT T 13. ‘AB | not(Aand B) | not A not B | (not A) or (not B)) FOF T rT T T FT T TO OF T TF T FOOT T TT F FOF F 14. A B|AandB Aand(notB) | (A and B) or (A and (not B)) FF] F F F FT| F F F TF F T T rotor F T 1s ‘The cards that you should turn over are 3 and A. The remaining cards are irrelevant to ascertaining the truth or falsity of the rule, The card with $ is irrelevant because S is not a vowel. The card with 8 is not relevant because the rule does not say that if a card has an even rumber on one side, then it has a vowel on the other side. ‘Turning over the A card directly verifies the rule, while turning over the 3 card verifies the contraposition. 2. Vector Spaces and Matrices 2d ‘We show this by contradiction. Supposen < m. Then, thenumber of columnsof A isn. Since rank A isthe maximum 1 number of linearly independent columns of A, then rank A cannot be greater than n < m, which contradicts the assumption that rank A = m. 22 >: Since there exists a solution, then by Theorem 2.1, rank A = rank[AZb]. So, it remains to prove that rank A =n. For this, suppose that rank A < n (note that it is impossible for rank A > n since A has only n columns). Hence, there exists y € R", y # 0, such that Ay = 0 (this is because the columns of A are linearly dependent, and Ay is 2 linear combination of the columns of A). Let a be a solution to Aw = b. Then clearly « + y # 2 is algo a solution. This contradicts the uniqueness ofthe solution. Hence, rank A =: By Theorem 2.1, a solution exists, It remains to prove that itis unique. For this, let and y be solutions, i. Az = band Ay = b. Subtracting, we get A(z ~ y) = 0. Since rank A = n and A has n columns, then 2 — y and hence 2 = y, which shows that the solution is unique. 23 Consider the vectors ai = [laf]? € Re, 7 linearly independent in Re. Hence, there exist a1, Sosa, ai 1k, Since k > n+ 2, then the vectors @i,...,, must be 44g, not al zero, such that The first component of the above vector equation is SOE TAL, asa; = 0, completing the proof 24 1. Apply the definition of| — a) 4 = 0, while the last m components have the form if-a>0 [=a] = if-a=0 ~(-a) if-ao 2. Ifa > 0, then lal = a. Ifa < 0, then jal = ~a > 0 > a. Hence |al > a. On the other hand, | ~ al > «(by the above). Hence, a > —| — al = —lal (by property 1). 3. We have four cases to consider. First. if a,b > 0, then a +6 > 0. Hence, ja +3 Second, ifa,b > 0, then a+b < 0. Hence [a + }| = —(a +5) ‘Third, if a > Oand 6 < 0, then we havetwo further subcases: 1. a+b > 0,then|a+b|=a+6 <|al +[b} 2. a+b <0, then|a+ 6] = ~a~b< lal + [i ‘The fourth case, a < O and b > 0, is identical tothe third case, with a and b interchanged. 4. We first show ja — 6] < Jal + |b]. Wehave la-4 = |a+(-0) S |al+|-0] by property 3 = [al+ |) by property 1 To show [Jal — jl] < Ja —b), we note that ‘other hand, from the above we have [b| — wwe have ab > 0 and hence lab] = ab = (~a)( lal < [0 — al 5. We have four cases. First, if a,b > 0, we have ab > 0 and hence |abj = ab = la ~ b+ b| < Ja ~b| + [0|, which implies Jal — |b] < Ja — 6]. On the la b| by property 1. Therefore al — [| < Ja ~ b. lalli|. 
Second, if a, < 0, —b) = Jallo|. Third, if a <0, 6 < 0, we have ab <0 and hence 2 lab] = —ab = a(—b) = |allbl. The fourth case, a < 0 and b > 0, is identical to the third case, with a and b interchanged. 6. We have lato < by property 3 < 7. =>: By property 2, ~a < |al anda < |a. Therefore, al < b implies ~a < |al < banda < Jal 0, then [al =a < 6. Ifa <0, then fal For the case when “<" is replaced by “<", we simply repeat the above proof with “<" replaced by “<" 8. This is simply the negation of property 7 (apply DeMorgan’s Law). 28 ‘Observe that we can represent (2, y)2 a8 (ews =2" [3 $]y= (anew = 27a", where 14 a=[I 2: Note that the matrix Q = QT is nonsingular. 1. Now, (z,2)2 = (Qz)"(Qz) = |Qall? > 0. and (e.2=0 & [I@el?=0 Qr=0 oe since Q is nonsingular. 2. (@,y)a = (Qz)" (Qu) = (Qy)"(Qz) = (y,2)2- 3. We have (@ty.z)2 = (e+y)7™Q?z = aTQ@ztyTQe = (zh t(nzle 4. (ray u)a = (rat) Qy = re Qty = r(z,y)2- 26 We have [a other hand, from the above we have j[ull — [kell < lly — 2! Ile] ~ lly < lle - vil. 27 Let e > 0 be given. Set 6 = «. Hence, if x — yl] < 6, then by Exercise 2.6, ||| -llylll < ll — yl] <6 Ia ~ y) + ull < [le — yll + lly] by the Triangle Inequality. Hence, ||x|| ~ jlyl] < || — yl. On the llz — yll. Combining the two inequalities, we obtain 3. Transformations 3a Let be the vector such that are the coordinates of v with respect to {€1,€2,..-€n}. and 2" are the coordinates of ‘v with respect to fe},es,...,€%,}. Then, Hence, fet, senke which implies [eee senje = Ta. 32. ‘Suppose v1, ...,Uq are eigenvectors of A corresponding to A1,..., An, respectively. Then, for each i = 1,...,m, we have (Ln - Avs = which shows that 1 —\,,..., 1— Ay arethe eigenvalues of I, ~ A. Alternatively, we may write the characteristic polynomial of I, ~ A as rp, A(L—A) = det((1- In ~ (In = A)) = det(—[Mn — Al) = = Av, = 1 — AV, = (1 ADD —)"rAQ), ‘which shows the desired result. 33. Leta, y € V4, and a, € RB. To show that V» is a subspace, we need to show that ax + By € V+. For this, let v be any vector in V. Then, v" (ae + By) = ave + Buty = since ve = 34 Letw,y € R(A), and a, 8 € R. Then, there exists v,u such that z = Av and y = Au. Thus, 'y = O by definition. az ~ By = aAv + PAu = A(av + Bu). Hence, ax + Sy € R(A), which shows that R(A) is a subspace. Lete,y €.V(A), anda, € R Then, Az =O and Ay = 0. Thus, A(aw + By) = aAz + BAy = 0. Hence, az + Sy € (A), which shows that (A) is a subspace. 35 Let v € R(B). ic., v = Be for some a. Consider the matrix [A uw éN(A"), then u € N(B") by assumption, and hence uw N((A v]?), since if Now, dim R(A) + dim .N(AT and dim R((A v]) + dimN(A v]7) =m, Since dimW(A®) = dimN([A vf”), then we have dim R(A) = dim R({A v]). Hence, » isa linear combination of the columns of A, ic., v € (A), which completes the proof. 36 We first show Vc (V=)+, Leto € V,andu any clement of V+. Then uv = 07 u = 0. Therefore, v € (V2) We now show (V+)4 C V. Let {a1,...,ax} be a basis for V, and {bi,...,bi} a basis for (V+)+. Define A= [ay---ay] and B = [by ---by), so that V = R(A) and (V+)+ = R(B)- Hence, it remains to show that R(B) C R(A). Using the result of Exercise 3.5, it suffices to show that V(A) c (BT). So let z € N(A™), Which implies that € R(A)+ = V+, since R(A)* = (A). Hence, forall y, wehave (By)"x = 0 = y™BTx, which implies that B? = 0. Therefore, € VB"), which completes the proof. 37 Let w € W4, and y be any element of V. Since ¥ CW, then y € YY. Therefore, by definition of w, we have wy =0. Therefore, w € V+. 38 Letr = dim V. 
Let vy,...,0- bea basis for V, and V the matrix whose ith column is »j. Then, clearly V = R(V) 4 Let t1,.+.thnap be a basis for V4, and U the matrix whose ith row is u. Then, V4 = R(UT), and v= (v4)t = RUT) = V(U) (by Exercise 3.6 and Theorem 3.4). 39 a. Let € V. Then, = Pa +(I-P)x. Note that Pa € Vand (I—P)x € V", Therefore,e = Pa+ (I P)x is an orthogonal decomposition of ¢ with respect to V. However, x = 2 + 0 is also an orthogonal decomposition of <= with respect to V. Since the orthegonal decomposition is unique, we must have e = Pr. ', Suppose P is an orthogonal projector onto V. Clearly, R(P) C V by definition. However, from part a, z = Pz. for all z € V,and hence VC R(P}. Therefore, R(P) 3.10 To answer the question, we have to represent the quadratic form with a symmetric matrix as PGE SLs Deets Pe -7/2 1 =45/4. Therefore, the quadratic form is indefinite, ‘The leading principal minors are Ay = 1 and Aa 31 ‘The leading principal minors are Ay = 2, Ay = 0, As = 0, which are all nonnegative. However, the eigenvalues of A are 0, ~1.4641, 5.4641 (for example, use Matlab to quickly check this). This implies that the matrix A is indefinite (by Theorem 3.7). An alternative way to show that A is not positive semidefinite is to find a vector z such that 2” Az < 0. So, let x be an eigenvector of A corresponding to its negative eigenvalue X = ~1.4641. Then, a? Ag = 2" (Az) = Axx = Alle <0. For this example, we can take x = (0.3251, 0.3251, -0.8881]?, for which we can verify that 7 Az: = ~1.4643. 32 a. The matrix Q is indefinite, since Mg = —1 and Ay = 2. b. Leta € M. Then, 22 +2 =—m1,21 +43 = aa, and 2) + 22 = a5. Therefore, 27 Qu = ai(z +25) + 20(a1 +20) + 20(t +22) = (0) +24 +29). ‘This implies that the matrix Q is negative definite on the subspace M. 3.13. ‘We represent this quadratic form as f(x) = 27 Qzx, where 1 ¢-1 a=|e 12 “125 ‘The leading principal minors of Q are Ay = 1, Aa = 1 ~ 2, Ay = —5€ —4€. For the quadratic form to be positive definite, all the leading principal minors of Q must be positive. This is the case if and only if € € (—4/5,0). 3.14 ‘The matrix Q = QT > 0 can be represented as Q = Q'/?Q'/?, where Q'/? = (Q'/?)F > 0. 1. Now, (2,2)9 = (QY/#2)"(Q'*z) = Qa? > 0, and (e,2)9=0 & [1Q'a|? & QVr= » 2-0 since Q"/? is nonsingular. 2. (au)q = 27Qy = y"Qle = y"Qe = (y,2)Q. 3. We have (@+u)7Qz 27Qz+y"Qz (e.z)9 + (yz). 5 (e+uz)0 4. (ra,u)q = (ra)? Qu = rat Qy = (a, va. 34s We have HAlloo = max{||Az|}oo : {|#lloo = 1}- We first show that |[Allao < max; Sf, [agel- For this, note that for each 2 such that aay = 1, we have WAztlos = ona < mpc 5 le < mpc 5 fa since [2g] < max [24 = [lho = 1. Therefore, [Ales < max lol To show that |[Alleo = maxi Df, oul, it remains to find a € BY, [Iélloo = 1, such that ||Azlloo max; Sf; laiel- So, let j be such that Seal = max} loa Define & by 1 otherwise wel/ase ifoje # A { layal/a; 0 1. Furthermore, fori #5, So ante S$ Do leaiel < max) laiel = D> lose! mm and > ase] = D> loyal. ‘Therefore, ABI = max | aunts] =D lasal = max) lol. im i & 3.16 We have |All = max{||Azlh + fhelh = 1}- We first show that || All: < maxe S77, lois|- For this, note that for each z such that [zy = 1, we have 4a, = 2] ane < EY batted 6 Steal Sool (=p:35 eu) Sisal A " wa ap Sle la = [ally = 1. Therefore, All: < max 5 Jaie| a Toshow that || All; = max S72, lash itremains to findad € R, || So, et j be such that = L,such that | Ag]; = maxe D7, lawl. max > lau “fi Define & by qo{) itk=s | 0 otherwise Clearly |J@]: = 1. Furthermore, az =o Slay! = mp5 eal 4. 
Concepts from Geometry 4a =: Let S = (w: Aw = b} be a linear variety. Let x,y € Sanda € R. Then, A(ax + (I -a)y) = Aa + (1-0) Ay = ab + (1-a)b=b. ‘Therefore, aw + (I-a)y € S. 4: IFS is empty, we are done. So, suppose azo € S. Consider the set So = S — 29 = {x ~ 29 : x € S}. Clearly, for all «,y € Sp and a € R, we have az + (1 —a)y € So. Note that 0 € So. We claim that Sp is a subspace. To see this, let 2,y € So, and a € R. Then, aa = ax +(1-a)0 € Sp. Furthermore, ba + ly € So, and therefore a + y € So by the previous argument. Hence, So is a subspace. Therefore, by Exercise 3.8, there exists A such that Sy = N(A) = {2 : Aw = 0}. Define b = Az. Then, S = So+eo={yt+zo:yEN(A)} = {y+2o: Ay =0) = {y +20: A(y +o) = b} = (e:Az=b}, Let u,v € © = {x € R" = lal] < r),and a € (0, 1]. Suppose z = au + (1 —a)v. To show that O is convex, we need to show that z € ©, ie. ||zl] 0}, anda € [0,1], Suppose 2 = au + (1 —a)v. To show that @ is convex, we need to show that z € ©, ie., 2 > 0. To thisend, write a = [z1,...,2n]". y = [v15--- sn)”, and z = [21,...,2n]”, 1. Since 24, us > 0, and a,1 — > 0, we have 2 > 0. Therefore, z > 0, Then, 2 = 02; + (1- aye and hence z € @. 5. Elements of Calculus Sa Observe that WAR < A**UILAll < [44 ULAIP < +++ [Axl] = [Axl] = [Alllell = Al, which completes the proof for this case. In general, the eigenvalues of A and the corresponding eigenvectors may be complex. In this case, we proceed as follows (see (271). Consider the matrix A TAll+e" where ¢ is a positive real number. We have — All |B\| = W8l= Tapes By Exercise 5.1, B* -+ O as k > 00, and thus by Lemma 5.1, |As(B)| < 1,4 = 1,...,7. On the other hand, for each i and thus, which gives [As(A)] < [IAI] +. ‘Since the above arguments hold for any = > 0, we have |A;(A)| < |JAl]. 53 We have Df(ax) = (1/3, 22/2), and doy [3 40 =[}]. By the chain rule, a u dro = vio %o [(at +5)/3, (2t — 6)/2] [2] = 5-1, 54 We have Df(x) = [22/2,21/2), and Be, = [3] : By the chain rule, oe 0 _ 29 6 Filo.) = dso Zoo 1 2 = Foss area] = 10474, to Fila = Dsoey Bts.0 1 3 = Beste [ = 5043t ss Weave Df(a) = [Sxjzax} + 22, 2}2} +m, 2z}zoz3 + 1] sod de ef + 3t? go-| ft . By the chain rule, a dx ge) = Dfe_o " [321 (t)?x2(t)aa(t)? + x2(t), 21 (t)°xa(t)? + 21 (t), 2x1 (t)*za(t)za(t) +1) [ eI ‘ = 12t(e +302) + ate! +60? + 2+ 1 9 56 Lete > O be given. Since f(z) = o(g(z)), then se 50 az) ~° Hence, there exists 6 > 0 such that if xl] < 6, then WFG@yih az) <" which can be rewritten as I F(@)l < e9(e). Sa By Exercise 5.6, there exists 6 > 0 such that if || < 6, then jo(g(z))| < g(2)/2. Hence, if al] < 6, ae # 0, then E(x) < ~9(@) + lo(g(a))| < —g(a) + o(a)/2 38 We have that fe: file) A= 12), and {a fale) = 16) = (a: = = 8/21} ‘To find the intersection points, we substitate = 8/21 into x? — 23 = 12to get xf ~ 122} — 64 = O. Solving gives 2} = 16,4, Clearly, the only two possibilities for 2; are 21 = +4, ~4, from which we oblain 2 = +2, -2. Hence, the intersection points are located a (4,2/" and {—4, ~2]7, ‘The level sets associated with fi(t1,2) = 12 and fa(1,#2) = 16 are shown as follows. Xe fbx x9) = 16 Fy(1.Xq) = 12 Fy(x40%Q) = 1 x 10 59 a Wehave 1 Fle) = fs) + Df wo)le 2.) + Ha ~ #0)" DY (aa) @ ~ ae) + We compute Df(z) = [e*,-me-** +1], pe) = [2a Zon]: Hence, fle) = ata [*5 1] + ternal [% ul [* = ltatm-sn+tete b. We compute Dj{a) = [det + Aare Auten + 4%), a 12a? +40} Bayz Dij(e) = [ Rez, 4a} + 1223 Expanding f about the point z» yields - pote 1 Pe f(a) = ++. [2 ries neal? sb] [B21] = 82} +82} - 16x, - 16x, + 8xyz2 + 12+--- cc. 
We compute Dla) = [e774 e 4 41, -e7! pet 4 1], oye) = [SLE ease Expanding f about the point 2, yields fla) = a4 aee pert [* 5") + 5t— ned [S pal ["5"]+ = ltarte(itat+a3)+ 6. Basics of Unconstrained Optimization 61 a. In this case, 2° is definitely not a local minimizer. To see this, note that d = [1,—2]" is a feasible direction at «*. However, d”V f(z") = 1, which violates the FONC. b. In this case, satisfies the ONC, and thus is possibly a local minimizer, bt it is impossible to be definite based ‘on the given information. ¢. In this case, 2* satisfies the SOSC, and thus is definitely a (strict) local minimizer. 4, In this case, 2* is definitely nota local minimizer. To see this, note that d = [0, IJ” is a feasible direction at 2°, and dV f(x") =0. However, d” F(2*)d = —1, which violates the SONC. 62 Suppose a is a global minimizer of f over 2, and 2* € 1 CM. Letw € WM. Then, x € M and therefore f(e") < f(z). Hence, 2* isa global minimizer of f over 9. 63 ‘Suppose 2° is an interior point of ©. Therefore, there exists ¢ > 0 such that {y : lly ~ 2*l| < e} CO. Since 2* is a local minimizer of f over 2, there exists e" > O such that f(x") < f(x) for all & € {y : lly ~ 2*l] f(z0 — #0). Sofixy € . Then, y ~ zo € 2. Hence, fy) 2 min fe) 4 (axgmin (2) f(Z0 - 0), Which completes the proof. 65 a. The gradient and Hessian of f are ovo = off sa [5 3]. Hence, Vf ({1, 1)) = [11, 25]", and F([1, 1”) is as shown above. . The direction of maximal rate of increase is the direction of the gradient. Hence, the directional derivative with, respect {0 a unit veetor inthis direction is F(z) ( Vil@) Vey" VF Ifa IVF Ata =(1,1]7, wehave IV/((1 1") 2731 «©. The FONC in this case is Vf(2) = 0. Solving, we get 3/2 [2] ‘The point above does not satisfy the SONC because the Hessian is not positive semidefinte (its determinantis negative). 2 66. a. We can rewrite f as ‘The gradient and Hessian of f are Hence V([0, 1") = (7,6]”. The directional derivative is [07 VF((0,1]7) = 7. ». The FONC in this case is V ( ‘The only point satisfying the FONC is, 1f-s al2]- The point above does not satisfy the SONC because the Hessian is not positive semidefinite (its determinantis negative). Therefore, f does not have a minimizer 67 a Write the objective function as f(z) = —23. In this problem the only feasible directions at O are of the form [di,0]". Hence, dV f(0) = 0 fer all feasible directions d at 0. b. The point 0 is a local maximizer, because f(0) = 0, while any feasible point a: satisfies f(«) < 0. The point 0 is not a strict local maximizer because for any z of the form a = [z1, 0)", we have f(z) and there are such points in any neighborhood of 0. ‘The point 0 is not a local minimizer because for any point « of the form « = [1,23] with x; > 0, we have ‘f(«) = =z} < 0, and there are such points in any neighborhood of 0. Since 0 is not a local minimizer, itis also not a rit local minimizer. 68 a. We have Vf(x*) = [0,5]. ‘The only feasible directions at «* are of the form d = [dy,ds]” with dz > 0. ‘Therefore, for such feasible directions, d” V f (w*) = 5d2 > 0. Hence, «* = (0, 1)” satisfies the first order necessary condition. b. We have F(x") = O. Therefore, for any d, d”F(a")d > 0. Hence, * = (0, 1]7 satisfies the second order necessary condition, = £(0), ©. Consider points of the form a: 2} + 1]. 21 € R Such points are in ®, and are arbitrarily close to 2” However, for such points x # 2°, S(a) = 5(-23 +1) = 5-523 <5 = f(a") Hence, z* is not a local minimizer. 69 a. We have V(2*) = 0. 
Therefore, fr any feasible direction d at 2°, we have dV f(z") = 0. Hence, 2° satisfies the first-order necessary condition. b. We have 0 F(e") [°c I Any feasible direction d at * has the form d = [d;,dz]" where dz < 2dy, d;,dz > 0. Therefore, for any feasible direction d at 2°, we have a” Pa*)d = 8d} ~ 2d > 8 - 2(24,)* = 0 B Hence, «* satisfies the second-order necessary condition. ©. We have f(a") =0. Any point ofthe form x = [01,27 + 22}, 2 > 0, s feasible and has objective function value given by F(a) = 4a} — (23 + 2m)? = —(2 + de8) < 0= fla"), ‘Moreover, there are such points in any neighborhood of *. Therefore, the point zc* is nota local minimizer 6.10 Given € R, let so that 2 is the minimizer of f. By the FONC, ‘and hence which on solving gives 611 Let 6 be the angle from the horizontal to the bottom of the picture, and 4 the angle from the horizontal to the top of the picture. Then, tan(6) = (tan(@z) ~ tan(8,))/(1 + tan(@2) tan(6,)). Now, tan(®y) = b/z and tan(0,) = (a+ 6)/2. Hence, the objective Function that we wish to maximize is (a+ b)/e—bje _ a 10)" Te iar b/s? ~ Feat ole Weave a wax) £0) =~ Faia spay (be Let 2" be the optimal distance. Then, by the FONC, we have /'( 0, which gives a+b) _ 4 = at = ear 6.12 ‘The squared distance from the sensor to the baby’s heart is 1 + 2, while the squared distance from the sensor to the mother’s heart is 1 + (2 —z)?. Therefore, the signal to noise ratio is 1+(2-2)? T+a f(z) We have =2(2~ 2)(1 + 2) ~ 22(1 + (2-2)? (+27) 4(2? ~ 22-1) (+27? f'(@) By the FONC, at the optimal position *, we have f'(* 1- V2 ora* = 1+ V2. From the figure, it easy to see that z* = 1 — 2 is the optimal pos 4 6.13: a. Let be the decision variable. Write the total travel time as f(2), which is given by vit# | Vit@ tay = LEH , YU ea? v1 v2 Differentiating the above expression, we get d-2 f@) avite wit d-2) 0, which corresponds to By the first order necessary condition, the optimal path satisfies f"(2" z d-2" uvit@y wyltd-ey or sind /v1 = b. The second derivative of f is given by iin 82 /v2. Upon rearranging, we obtain the desired equation He 1 1 10)= Taaar + yas -appe Hence, f”(x*) > 0, which shows that the second order sufficient condition holds. 6.14 We have Vela) (ay — 22)° + 2x1 —2 A(z — 22)° ~ 2p +2 Setting V(x) = 0 we get A(z, —22))+22,-2 = 0 —A(a, — 22) - aa +2 0. Adding the two equations, we obtain x; = 22, and substituting back yields msm =1 Hence, the only point satisfying the FONC is [1,1]. We have FG 12(e1 — a2)? +2 —12(z, — 22)? 1221 22)? 12(a, — 29)? -2 Hence rau)=[> 5] ‘Since F({1, 7) is not positive semide‘nite, the point 1, 1}® does not satisfy the SONC. 6.13 Suppose d is a feasible direction ata. Then, there exists cg > O such that + ad € 2 forall a € [0,a9). Let 8 > 0 be given. Then, x + a(8d) € 2 forall a € [0, 0/3]. Since ap/8 > 0, by definition Ad is also a feasible direction at 2 6.16 =: Suppose dis feasible at a € 2. Then, there exists a > O such that x + ad € O, that is, A(@ + ad) = Aa = band a # 0, we conclude that Ad = 0. =: Suppose Ad = 0. Then, for any a € (0, I], we have Ad = 0. Adding this equation to Ax = b, we obtain A(w + ad) = b, that is, x + ad € for ll a € (0, 1]. Therefore, dis a feasible direction at x. Since 15 647 The vector d = [1,1]? is a feasible direction at 0. Now, 4970) = £0) + Leo Since V f(0) < 0 and V f(0) # 0, then a9 F(0) <0. Hence, by the FONC, 0 is not a local minimizer. 6.8 ; ~ We have Vf(«) = © # 0. Therefore, forany « Ef, we have V f(a) # 0. Hence, by Corollary 6.1, « €f? 
cannot be a local minimizer (and therefore it cannot be a solution). 6.19. ‘The objective function is f(x) = —cyr1 ~ exz2. Therefore, Vf(«) = [-c1,—ca]” # 0 for all x. Thus, by FONC, the optimal solution * cannot lie inthe interior of the Feasible set. Next, forall ¢ € Lx UJ La. d= (1, 1)" isa feasible direction. Therefore, d”V f(a) = ce: — e2 < 0. Hence, by FONG, the optimal solution «* cannot lie in Ly J Lo. Lastly, for all z € Ly, d = {1,—1]” is a feasible direction. Therefore, d” V f(z) = c2 — c1 < 0. Hence, by FONC, the optimal solution a* cannot lie in L, Therefore, by elimination, the unique optimal feasible solution must be fo”. 620 a. We write f(a,b) Me = : = (EE) a H =o eeeuSe) t Sie 27Qz—2cTz +d, where 2, Q, ¢ and d are defined in the obvious way. b. Ifthe point * = [a*,*]? is a solution, then by the FONC, we have V f(z") = 2Qz" ~ 2¢ = 0, which means Qz* =e. Now, since X? — (X)? = 1 LN (a — XY". and the x; are not all equal, then det Q = X? — (X)? # 0, Hence, @ is nonsingular, and hence Since Q > 0, then by the SOSC, the point z* is a strict local minimizer. Since =” is the only point satisfying the FONC, then 2* is the only local minimizer. cc. We have exee = (QP) 2, BH - Ho xX? (XP xX? (XP 16 621 Given z € R", let ns 2 fa) = 2 je — 20 Pit be the average squared error between z and a),..., 2°). We can rewrite f as fe) = FD e- 2 e- 2) SI 12m) 2.2 = atz-2(- S20) 24-2)? xy alk If So f is a quadratic function. Since & is the minimizer of f, then by the FONC, V (2) = 0, ie., 2-28 Sal) = ae-2h ral =o Hence, we get a=1yr20, P i.e, @ is just the average of 2(0),...,219), The Hessian of f at 2 is F(@) =n, which is positive definite. Hence, by the SOSC, # isa strict local minimizer of f (infact, it sa strict global minimizer because fis a convex quadratic function). 622 Fix any x € 9. The vector d = a ~ a is feasible at 2* (by convexity of ). By Taylor's formula, we have S(w) = flew") + d?V f(x") + o(lldll) = F(a") + elldl] + o(|}a)). ‘Therefore, for all « sufficiently close to x", we have f(z) > f(x" 623 Since f € C?, F(w*) = FT(a*). Let d # 0 bea feasible directions at 2*. By Taylor's theorem, |. Hence, 2° is a strict local minimizer. Sa" +d)~ f(a) = hd Vi(e") +d F(@")d + oll). Using conditions a and b, we get Sle" +d) ~ fa") > ela? + ofa), ‘Therefore, for all d such that ||dl] is sulficiently small, Sa" +d) > fle"), and the proof is completed. 6.24 "Necessity follows from the FONC. To prove sufficiency, we write f as 1 aera fe) = 5(@- 27)" Qe )-— 32°7 Qe where 2* = Q~'D is the unique vector satisfying the FONC. Clearly, since $+? Qzx* is a constant, and Q > 0, then 1 Sa) > fe") = -F2"T Qe", " and f(«) = f(a") if and only if x = 62s Write w = [uiy.++5tha)» We have Bn = Gant + bun a(ay-2 + bint) + bn = arya + abtinat + Ditn = aay ba" tbuy +--+ abtinaa + btn cu, where ¢ = [a"=*b, ..,ab,}]”. Therefore, the problem can be writen as minimize ruTu ~ qeTu, which is a positive definite quadratic in u, The solution is therefore equivalently, us = ga"~*b/(2r), 4 7. One Dimensional Search Methods 7a ‘The range reduction factor for 3 iterations ofthe Golden Section method is ((V — 1/2)° = 0.236, while that ofthe Fibonacei method (with ¢ = 0) is 1/Fhs1 = 0.2. Hence, if the desired range reduction factor is anywhere between 0.2 and 0.236 (e.g, 0.21), then the Golden Section method requires at least 4 iterations, while the Fibonacci method requires only 3. So, an example of a desired final uncertainty range is 0.21 x (8 ~ 5) = 0.63. 12 a. The plot of f(z) versus 2 is as below: 0 b. 
The number of steps needed for the Golden Section method is computed from the inequality: 61303" < 2? > N2334 2-1 18 ‘Therefor, the fewest possible number of steps is 4. Applying 4 steps of the Golden Seetion method, we end up with a uncertainty interval of [44,00] = (1.8541, 2.000]. The table with the results of the intermediate steps is displayed below: [Teraon& | ax |b | Flex) | F(x) | Newuncerainty interval | 1 | 1.3820 | 1.6180 | 2.6607 | 2.4292 | 1.3820,2] | [2 [16180 | 1.7639 | 24292 |[23437| (161802) —~| [3 [1769 | rasa | 23437 | 23196] ear | [4 [asst [19098 [23196 ]2311| assez) ‘| c. The number of steps needed for the Fibonacci method is computed from the inequality: 1422. 02 Fy Therefore, the fewest possible number of steps is 4. Applying 4 steps of the Fibonacci method, we end up with a uncertainty interval of [a4, bo} = (1.8750, 2.000]. The table with the results of the intermediate steps is displayed below: | Merationk | pe | ax | be | Flax) | F(be) | New uncertainty interval | |_| 0.3750 | 1.3750 | 1.6250 | 2.6688 | 2.4239 | 1.37502) | [2 | 04 | 16250 | 1.7500 | 24239 | 23495] 2s02) | [3 [0.3333 | 17500 | 1.8750 | 23495 23175] (1.75002) ‘| [4 | 045 | 18750 | 18875 2.3175 | 23160| (187502) ‘| = 20 —Asinz, f"(2) = 2 —4co82. Hence, Newton’s algorithm takes the form: 21) ~2sin2(*) tet) = gf) T= 2eos2 Applying 4 iterations with 2( = 1, we get 2(!) = -7.4727, 2) = 14.4785, 2) = 6.9351, 2) Apparently, Newton’s method is not eflective in this case. 13 , We first create the M-file £m as follows: fem function y=£(x) y=8*exp(1-x) +7*1og (x) : ‘The MATLAB commands to plot the function are: fplot("£", (1 21) xlabel (/x!); ylabel (/£(x)") ; The resulting plot is as follows: 19 . The MATLAB routine for the Golden Section method is: SMatlab routine for Golden Section Search lett=-20; right=20; uncert=0.1; hos (3-sart (5))/27 Nsceil(1og(uncert/ (right-left)!/1og(1-rho)) Sprint W lowers'a’; aeleftt (1-rho)* (right-left) ; easéla): for isl Lf lower bea Ebefa azleft+rho? (right-left) fasé(a) else acb fast beleft+(1-rho) *(right-left) fb=£(b) end tif if Lact righteb; lower="a" end 3if New_tnterval = (left, right) end @for i ® Using the above routine, we obtain N intermediate steps is displayed below: 20 and a final interval of [1.528, 1.674]. The table withthe results of the [terion | an | bx | Flen) | Flbx) | New uncertainty interval | [1382 [ 1618 | 7.7247 | 7.6805 | 2 | 118 | 1.764 | 7.6805 | 7.6995 | 3 | 1528 | 1.618 | 7.6860 | 7.680 | 4 | L618 | 1.674 | 7.6805 | 7.6838 | ¢. The MATLAB routine for the Fibonacci method is: awatlab routine for Fibonacci Search technique No; while F(W#2) < (142*epsilon) *(right-Left) /uncert (+2) +P (N+) (N43) NeNeL; end while N Sprint N fasf(a); AF (42-4) /F (N+3~i) rho=0.5-epsilon end #if LE Lowes bea fb azleft+rho* (right-left) fe else ab fantb beleft+(1-rho) * (right-1eft) f_bef (b) end sit if Lactb right=! lowe: alee leftea; lower="b’ end tig @ 24 (1.3822) (1.382, 1.764] [1.528,1.764] [1.528,1.674] | | | | New_Interval = (left, right] end for i a - 3 and a final interval of (1.58, 1.8}. The table with the results of the Using the above routine, we obtain V intermediate steps is displayed below: [eraion [px [an [ bx | Fox) | 10x) | New uncertainty interval | cneraton | te | ae | be | fax) | £06) | New uncertainty interval | [Tos [ta [ref are [r6e0s) tiaay | [2 [03a | te [is | 7680s [77091 | nana) [3 | 045 | 1.58 | 1.6 | 7.6812 | 7.6805 | 1.58,1.8) | ee | | Fr . 
Hence, 1s Fret /Fiv—e42 T= Frys /Fvnsa Fynns2 ~ Fyne Pres Fry-« Frans = Pes 1 To show that 0 < px < 1/2, we proceed by induction. Clearly py = 1/2 satisfies 0 < pr < 1/2. Suppose 0 < pe <1/2,wherek€ {I,...,N ~ 1}. Then, 1 ees pSl-mst and hence Therefore, <1 Since pay = 1— 24, then 1 Span $5

xcurr*uncert, xoldexcurr: guoldeg_curr: g_curr=feval (g,xcurr) ; xnew= (g_curr*xold-g_old*xcurr) / (g_curr-g_old) ; end fwhile ‘print out solution and value of g(x) Af nargout >= 1 Af nargout == 2 sfeval (g.xnew) + end else final_point=xnew value=feval (g,xne) end tif ® b. We get a solution of « = 0.0039671, with corresponding value g(t) = —9.908 x 10-*. 19 24 function alpha=linesearch_secant (grad, x,4) Line search using secant method epsilon=10"(-4); #1ine search tolerance max = 100; tmaximum number of iterations alpha_curr=0; alpha=0.001; aphi_zero=feval (grad,x)/*d; aphi_curr=dphi_zero; while abs (dphi_curr) >epsilon‘abs (dphi_zero), alpha_old=alpha_curr; alpha_curr=alpha; aphi_old=dphi_curr; dphi_curr=feval (grad, xtalpha_curr*) '*4; alphas (dphi_curr*alpha_old-dphi_old*alpha_curr) / (dphi_curr-dphi_old) ; Seis; Af (i >= max) & (abs (dphi_curr) >epsilon*abs (dphi_zero)), disp(‘Line search terminating with number of iterations:’); aisp(i); break; end fend while 2 8. Gradient Methods 81 Let s be the order of convergence of {2"")}. Suppose there exists © > 0 such that forall k sufficiently large, je) —2*|] > ofa” —2"IP. Hence, for all k sufficiently large, lle — 2 lla) — at 1 Ole, > ei ep oR 2 ewe ‘Taking limits yields sn HAH 2 e pote |e — a = Tima ee Since by definition s is the order of convergence, lim esteo [fe Combining the above two inequalities, we get 0, we conclude that s < p, i.e, the order of convergence is at most p. ‘Therefore, since lim, sco {la*) ~ ar” 82 ‘We use contradiction. Suppose ar") + a:* and fmm (ett? — 2) lim 2" =2"ll . 9 ete [ja — ale 25 for some p < 1. We may assume that «'*) # 2* for an infinite number of & (for otherwise, by convention, the ratio above is eventually 0). Fix € > 0. Then, there exists , such that for all k > K, jot) — atl] tee as'l ea Dividing both sides by jx(*) — ||'-, we obtain [oer — ah . [eel ie [eae] 7 Ie — a? Because x(#) -+ x* and p < 1, we have jl — a*||!-? — 0. Hence, there exists Kz such that for all k > Ko, je) — 2*||!-” < ©, Combining this inequality with the previous one yields at) — 2" Tee >. for all k > max(Ki, Ka)s ie. 2) — aI] > [2 — 2", Which contradicts the assumption that 2) + 2*, 83 We have une: = (1 — pte, and ue + 0, Therefore, san sal =1-p>0 ws Tel and thus the order of convergence is 1. 84 4. The value of 2* (in terms of a, 6, and c) that minimizes f is 2* = b/a. b. We have f"(z) = ax — b. Therefore, te recursive equation for the DDS algorithm is - a+) = 2 — o(az\ —b) = (1-aa)z") + ab, c. Let é e-so0 2, Taking limits of both sides of z(*#) = x" — a(az'*) — 5) (from part b), we get =#-a(ak—0). Hence, we get = bja = 2° 4. To find the order of convergence, we compute Jz) — b/al aaa)e"*) + ab — bf [x® = d/alP [a pal? [1 = aja") ~ (1 - aa)b/al 124) b/al [1 = cal [e — b/a}' Let 2(#) = |1 ~ aal[e\*) ~b/al!-P. Note that 2“) converges to finite nonzero number if and only if p = then 2") — 0, and if p > 1, then 2(®) —> 00). Therefore, the order of convergence of {2'*)} is 1, €. Let y!*) = |2(®) —b/al, From partd, ater some manipulation we obtain (itp <1, yl) = [1 —aaly! = [1 — aalt'y ‘The sequence {2} converges (to b/a) if and only ify) > 0. This holds if and only if |1 — aa] < 1, which is ‘equivalent to 0 < a < 2/a. 26 85 We rewrite fas f(z) = je™ Qa — 6x, where 64 a=[f 5 ‘The characteristic polynomial of Q is 1? ~ 12A +20. Hence, the eigenvalues of @ are 2 and 10. Therefore, the largest range of values of a for which the algorithm is globally convergent is 0 cif and only if f(z) -> 0. 
Hence, the algorithm is globally convergent if and only if f (2) -+ 0 for any zp. From part a, we deduce that f{r,) —+ 0 for any zo if and only if []j2.g(1 — ae)” = 0. Because 0 0, which means that f(24s1) < f(x) if re # 1 for k > 0. This implies that the algorithm has the descent property (for k > 0). Se Son aat (Set Foet) - ot (2-1) co Since “4 > 0 for all k > 0, we can apply the theorem given in class to deduce that the algorithm is not globally convergent. 810 We have bea SOF) By Taylor's Theorem, Fe) = f(a") + (a*V(a ~ 2") + O(|2 — 2"P) Since f'(2") = 0 by the FONC, we get oC lle — — Fa) Fe" ‘Combining the above with the first equation, we get Je) — 2] = Ole —2"/), which implies that the order of convergence is at least 2. 8.1 a. We have fle) =||Az-b]? = (Ae—b)"(Az 6) (2? AT — 0")( Ax —b) a? (AT A)x ~ 2(AT)"x +67 which is a quadratic function. The gradient is given by Vf(«) = 2(A™ A) — 2(A76) and the Hessian is given by F(a) = 2(A" A). ». The fixed step size gradient algorithm for solving the above optimization problem is given by att) 2 2 —a(2(A?A)a — 247) wl — 2a AT (Aa — b). c. The largest range of values for «such that the algorithm in part b converges to the solution of the problem is given by 2 Xmas(2A?A) 1 O 1, we conclude that the slgorithm is not globally monotone. 'b. Note thatthe algorithm is identical toa fixed step size gradient algorithm applied to a quadratic with Hessian A. ‘The eigenvalues of A are 1 and 5. Therefore, the largest range of values of a for which the algorithm is globally convergentis 0 0, and by Lemma 8.2, Amin(Q) ma na-a (Fmei$}) > which implies that S729 74 = oo. Hence, by Theorem 8.1, 2(*) -+ &* for any x), If8 2, then 6(2 — 8) <0, and by Lemma 82, 2) Pat me sata) (Seta? 29 By Lemma 8.1, V(2'")) > V(@)), Hence, if) £ 2°, then {V(c'"} does not converge to 0, and consequently 221") does not converge to 2". 8.16 By Lemma 8.1, V(2(**)) = (1 — 94)V(z") for all k. Note that the algorithm has a descent property if and only iF V(2*D) < V(@l) whenever gi") 4 0. Clearly, whenever g!") 4 0, V(el**D) < Vel) if and only if 1 ~ "4 < 1. The desired result follows immediately. 8.17 We have HD (8) — aya”) and hence (aD — 2), of 2l44)) = ald”), Vf(el**)), Now, let ¢e(a) = f(a + ad), Since a4 minimizes dp, then by the FONC, di, (ax Hla) = dV F(2 + ad). Hence, 0. By the chain rule, 0 = de(ag) = dT Vy (al) + apd) = (a), Vf(e**™Y)), and so (al — 24), 7 (al) <0, 818. Assimple MATLAB routine for implementing the steepest descent method is as follows. function (x,N]=steep_desc (grad, xnew, options) ; STEEP_DESC( "grad, x0) 7 STEEP_DESC( ‘grad’ , x0, OPTIONS) ; x = STEEP_DESC( ‘grad’ , x0) 7 x = STEEP_DESC(" grad’ ,x0,02TTONS) ; [eM] = stBEP_pEsc( ‘grad’, x0); Ge.M] = STEEP_DESC(’ grad’, x0, OPTIONS) ; Ho athe first variant finds the minimizer of a function whose gradient Bis described in grad (usually an M-file: grad.m), using a gradient Sdescent algorithm with initial point x0. The line search used in the Ssecant method. ‘The second variant allows a vector of optional parameters to defined. OPTIONS(1) controls how mich display output is given; set to 1 for a tabular display of results, (default is no display: 0) SOPTIONS(2) is a measure of the precision required for the final point. SOPTIONS(3) is a measure of the precision required of the gradient. 2OPTIONS(14) is the maximum numker of iterations ‘For more information type HELP FOPTIONS. 
° @the next two variants returns the value of the final point ‘The last two variants returns a vector of the final point and the ‘number of iterations: if nargin “= 3 options = (1; if nargin “= 2 disp(‘Wrong number of arguments."); return; end end 30 if Length (options) >= 14 if options (14 options (14) =1000*Length (xnew) end else options (14. 000*1ength (xnew) ; end cle: format compact; format short e7 options = foptions (options) + print = options(1); epsilon_x = options (2); epsilon_g = options (3); max_itersoptions (14); for k max_iter, g_curr=feval (grad, xcurr) ; if norm(g_curr) <= epsilon_g disp('Terminating: Norm of gradient less than’); disp (epsilon_g| alpha=linesearch_secant (grad, xcurr,-g_curr) + xnew = xcurr-alpha*g_curr; if print, disp(‘Tteration number k =") disp(k); tprint iteration index k disp(/alpha ="); @isp(alpha); Sprint alpha disp('Gradient =); disp(g_curr'); eprint gradient disp(‘New point ="); disp(xmew'); §print new point end tif Af norm(mnew-xcurr) <= epsilon_x*norm(xcurr) disp(‘Terminating: Norm of difference between iterates less than’); disp(epsilon_x) ; break: end tif if k == max_iter disp(/Terminating with maximum number of iterations’); end #if end tfor AE nargout >= 1 if nargout Nek; 31 end else disp(’Final point ="); disp (xnew') ; disp(‘Munber of iterations ="); displ; end if 8 ‘To apply the above MATLAB routine o the function in Example 8.1, we need the following M-file to specify the gradient. function y=g(x) Y=(A*(3e(1) 4) 73; 24 (30(2)-3) 7 16% (4(3)45) We applied the algorithm as follows: >> options (2) = 107 (-6 >> options(3) = 10(-6); >> steep_desc('g', [-4;5;1] options) 31; ‘Terminating 1.0000e-06 Final point = 4.0022e+00 3.0000e+00 -4.5962e+00 Number of iterations = 25. Norm of gradient less than ‘As we can see above, we obtained the final point [4.002, 3.000, —4.996]" after 25 iterations. ‘The value of the objective function at the final point is 7.2 x 10-2, 819 ‘The algorithm terminated after 9127 iterations. The final point was (0.99092, 0.99085)" 9. Newton’s Method 9 a. We have f'(2) = 4(2 — 20)? and f(x) = 12(a ~ 20)®. Hence, Newton’s method is represented as 26k) 2 8) which upon rewriting becomes 2 20 xy = 2 (al 2g o 0 = 5 (" ~20) », From part, y(#) = |2(4) — zo] = (2/3)§2 — zal = (2/3)y'*-, €. From part b, y) + 0, and hence 2") > zy for any 21) 4. From part b, we have in OY = zo] 2_ 2 and hence the order of convergence is 1 ©. The theorem assumes that f”(2*) # 0. However, in this problem, 2* 92 & We compute f'() = 4'/*/3 and f"(2) = 4z~*/3/9, Therefore Newton's algorithm for this problem takes the form afte) 2 gy — A@O)9/3_ = Ta) 779 32 ra.and f(x") 22 >. From part a, we have 2(*) = 2x(©), Therefore, as long as (©) # 0, the sequence {2\")} does not converge to. 93. a. Clearly f(#) > 0 forall «. We have fle)=0 @ m-z}=0 and 1-4 @ #=[1, 17. Hence, f(a) > f((L, 1") for all 2 4 1,1)", and therefore [1,1]? is the unique global minimizer. —— v#le) Pee son - *] ~ 12002? - 40022+2 40021 ea) = [10st port? oe ‘To apply Newton’s method we need the inverse of the Hessian, which is Flay" 1 200 40021 ‘B0000(z7 — z2) +400 | 4002, 1200z7 ~ 40022 +2] ° Applying two iterations of Newton's method, we have z(") = (1,0]7, 2) = (1,1]7. Therefore, in this particular case, the method converges in two steps! We emphasize, however, that this fortuitous situation is by no means typical, and is highly dependent on the initial condition, c. Applying the gradient algorithm z'*+!) = 2) — a,V f(z) with a fixed step size of ae = al) = (0.,0]7, 2 = [0.17,0.1]7. 
94 If al = 2*, we are done. So, assume x) # x*. Since the standard Newton's method reaches the point «* in one step, we have 05, we obtain $2 +Q7'g) rain f(2) < se +0Q-9) f(a" for any a > 0. Hence, gmin f(2'°) +aQ~4g!) = 1. 0 Hence, in this case, the modified Newton's algorithm is equivalent to the standard Newton's algorithm, and thus a) al 0 10. Conjugate Direction Methods 104 We proceed by induction to show that for k = 0,...,n — 1, the set {d"), ...,d)} is Q-conjugate. We assume that 9) £ 0,4 1,...,, $0 50 that d!™ Qa! # 0 and the algorithm is well defined. For k = 0, the statement trivially holds. So, assume that the statement is true for k 0, for any a € RT, we have (D'@p) a= (Da)FQDa) > 0 and (D¥Qb) a= (Day"Q\De) = = 0. Since rank D = r, Da = 0 if and only if a = 0. Hence, the matrix D™QD is positive if and only if De definite. 10.7 a. We have f() = ba" Qa — 6 where b. Since f is a quadratic function on I, we need to perform only two iterations. For the first iteration we compute a = a = aw) = [0.51724,0.17241]" = [-0.06897, 0.20690)". For the second iteration we compute fy = 0.047534 a) (0.08324, -0.20214]7 ay = 5.7952 2) = (1.000, -1.000]7. cc. The minimizer is given by 2* = Q~'b = [1,1], which agrees with part b. 108. ‘A MATLAB routine for the conjugate gradient algorithm with options for different formulas of 3 is: function [x,N]=conj_grad(grad, »new, options) ; 2 CONJ_GRAD( ‘grad’, x0) &—CONJ_GRAD( ‘grad, x0, OPTIONS) ; a x = CONJ_GRAD( ‘grad’, x0); & x = CONJ_GRAD( ‘grad’, x0, OPTIONS) ; a 2 DEN] = CONJ_GRAD( ‘grad, x0); & — DeyN] = CONJ_GRAD( "grad" , x0, OPTIONS) ; a ‘athe first variant finds the minimizer of a function whose gradient Qie described in grad (usually an M-file: grad.m), using initial point x0. The second variant allows a vector of optional parameters to be Sdefined: SOPTIONS(1) controls how much display output is given; set Sto 1 for a tabular display of results, (default is no display: 0) SOPPIONS(2) is a measure of the precision required for the final point. BOPTIONS(3) is a measure of the precision required of the gradient. SOPTIONS(5) specifies the formla for beta: 8 0=Powell; 8 ‘Letcher-Reeves; ® olak-Ribiere; 8 lestenes-stiefel ‘0PTIONS(14) is the maximum nurber of iterations. ‘ror more information type HELE FOPTIONS. * athe next two variants return the value of the final point athe last two variants return a vector of the final point and the ‘number of iterations if nargin “= 3 options = (1; if nargin “= 2 disp(‘Wrong number of arguments."); end end 36 nunvars = length (xnew) ; if length (options) >= 14 Af options (14 options (14) =1000*nunvars; end else options (14 end 000*numvars; ele, format compact; format short e options = foptions (options) : print = options(1); epsilon_x = options (2); epsilon. 
= options (3); max_itersoptions (14) ; g_euri AE norm(g_curr) <= epsilon.g isp(’Terminating: Norm of initial gradient less than’) disp(epsilon_g): end #if eval (grad, xnew) ; a-g_curr; reset_cnt = 0; for k= L:max_iter, alpha=1inesearch_secant (grad, xcurr,d) ; Salpha=—(a" *g_curr) /(a"*9*d) ; umew = xcurrtalpha* if print, disp(’ Iteration number k =") @isp(k); @print iteration index k @isp(‘alpha ="); disp(alpha); print alpha disp( ‘Gradient ="); disp(g_curr'); Sprint gradient disp(’New point =); disp(mew'); tprint new point end tit Lf norm(snew-xcurr) <= epsilon_x‘norm(xcurr) disp(‘Terminating: Norm of difference between iterates less than’); disp (epsilon_x); break; end Sif g_old=g_curr; g_eurr=feval (grad, new) ; if norm(g_curr) <= epsilon disp(‘Terminating: Norm of gradient less than’); disp(epsilon_g): 37 break; end Sif reset_cnt = reset_ent+1; if reset_ent == 3*numvars s-g_curr; reset_ent = 0. else if options (5)==0 $Powell beta = max(0, (g_curr'*(g_curr-g_old)}/ (g_old’*g_old)); elseif options(5)==1 ¢Fletcher-Reeves beta = (g_curr’*g_curr) /(g_old’*g_old) ; elseif options(5)==2 tPolak-Ribiere beta = (g_curr’*(g_curr-g_old)}/(g_old’*g_old) ; else tHestenes-stiefel beta = (g_curr’*(g_curr-g_old)}/(4’* (g_curr-g_old)}; end Bit d=-g_curr+beta*d; end if print, isp('New beta ="); disp (beta) ; disp(/New a disp(a); end if k == max_iter disp(/Terminating with maximum number of iterations’); end tif end @for Af nargout Nek; end, else disp(’Final point aisp(xnew’); disp("Number of iterations disp (k) : end if * ‘We created the following M-file, g .m, forthe gradient of Rosenbrock’s function: function y=g (x) 400% (5¢(2) —2e(1) 72) (1) -2* (Lv), 2004 (302) x (2) « We tested the above routine as follows: >> options (2)=107(-7); >> options (3) =107(-7); >> options (14) =100; >> options (5) =0; >> conj_grad(‘g’, [-2;2], options) ; ‘Terminating: Norm of difference between iterates less than 1.0000e-07 38 Final_point = 1,0000e+00 1.0000e+00 Number_of_iteration = a >> options (5) =1; >> conj_grad('g", [2:2] ,opticns) ; ‘Terminating: Norm of difference between iterates less than 1.0000e-07 Final_point = 1.0000e+00 1.0000e+00 Mumber_of_iteration = 10 >> options (5 >> conj_grad('g’,, [-2;2] ,opticns) ; ‘Terminating: Norm of difference between iterates less than 1.0000e-07 Final_Point = 1.0000e+00 1.0000e+00 Number_of_iteration = e >> options (5) =3; >> conj_grad(’g’, [-2:2] options) Terminating: Norm of difference between iterates less than 1.0000e-07 Final_point = 1.0000e+00 1.0000e+00 Number_of_iteration = 8 ‘The reader is cautioned not to draw any conclusions about the superiority or inferiority of any of the formulas for ‘x based only on the above single numerical experiment. 11. Quasi-Newton Methods wt a. Let (2) + ad"), oa) ‘Then, using the chain rule, we obtain #(a) = dT VF(e + ad) Hence (0) =a" 4 Since @' is continuous, then, if d"g'*) < 0, there exists & > 0 such that for all a € (0,4). (a) < (0), ie. F(a) +ad™) < f(a), b. By parta, #(a) < 4(0) for all « € (0,4). Hence, ay = argmin g(a) #0 m0 which implies that ax > 0. c. Now, ATG = aTY Fe + ad) = (as). 39. 0. Hence, g**)Td® = 0, Since ay = argmingso f(a + ad) > 0, we have 6} (a) a. llo™|P. 16g # 0, then i. We have d® = -g, If 2 0, then lg)? > 0. Hence, d®™™ 9 llg®[?? > 0, and hence dg) < 0. ii, Wehave d® = —F(2'®)~1g, Since F(a") > 0, we aso have P(e)" > 0. Therefore, d®%g'") = <9!" F(a)-1g") < itg® £0. iii, We have d®) =~) +6 d®, Hence, . A®T 9") = Ig |? + Bp rd*YT gl By parte, d*-YT g() = 0, Hence, if g") # 0, then |lg'®)||? 
> 0, and aT {®) = —Ig®IP <0. iv. We have d*) = —Hyg"). Therefore, if Hy > 0 and g\* #0, then d®)7g(®) = —g®T Hyg <0. ¢. Using the equation V f(x) = Qu —b,we get ahTgt) — gta) —») = dT (Q(e + and) - 6) and Qa") +a" (Qe - b) = ad Qa” + a7. By parte, dg) = 0, which implies _a87y =~ eT QE 2 ‘We are guaranteed that the step size satisfies ay > O if the search direction is in the descent direction, ice., the search direction d) = —MgV f(zl®)) has strictly positive inner product with ~Vf(c')) (see Exercise I1.1). Thus the condition on My that guarantees ay > 0 is Vf(a'*))" M,V f(z") > 0, which corresponds to 1 + a > 0, or @ > —1. (Note that if a < 1 the search direction is notin the descent direction, and thus we cannot guarantee that ay > 0) 113 Leta € R". Then (de ~ Hy Ag®) (Aa — soon) . aga) — A ag®) (2? (Ax — Hag")? Ag®T (de® — H,dg) 27 Hye = eye +e? ( = a Ayet Note that since Hy > 0, we have 2” Hy > 0. Hence, if Ag" (Az — Hy Ag") > 0, then 27 Hes > 0. 1s ‘The complement of the Rank One update equation is (dol — Bide)(Ag — Beda)? Ae®T (Ag — B,Az™) 40 Bus = Bat Using the matrix inverse formula, we get a1 1 Bui = By - 1(Ag!" — ByAx®) (Ag — ByAx®)) By! TeOT ag" — B,Az™) + (Ag ~ B,Ae®)TB, (Ag — Byda®) pet 4 al = BztAg') 2 — Be ag)? a! + JT aa) — Beta gh Bp ag) Substituting Hf, for By;*, we get a formula identical to the Rank One update equation. This should not be surprising, since there is only one update equation involving a rank one correction that satisfies the quasi-Newton condition. us a. Since fis quadratic, and ay argmingsy f(a + ad")), then ora aga" b. Now, d) =—Hyg'"), where Hy, = HT > 0. Substituting this into the formula for ay in part a, yields OT yg” Hg on GeTQae *° 16 ‘The first step for both algorithms is clearly the same, since in either case we have 2) = 2 — agg. For the second step, a) = Mg OTA) AxMalOT = - (1,4 (14 Sorta) daOacO™ AGT a2 } AsO age Agaz™ + (aga) 4) raed = gt (14 BAI) aan!" 91) Agar) AeOPAgr— Ag A291) 4 da Ag Fy”) +A Since the line search is exact, we have MM) 2 gi Agitg (0) = —g) gag | yor where o GPT Ag — GOT gl) — g( CT Ag ~ GOT (gi) — gO) 1), and g(tI7 9) = 0, we have 9) is the Hestenes-Stiefel update formula for Bo. Since d) oT (gi? gg Bo Which isthe Polak-Ribiere formula. Applying g(?)" 9) = 0 again, we get gg gOT which isthe Fleicher-Reeves formula, "7 a. Suppose we apply the algorithm ta quadratic. Then, bythe quasi-Newion property of DFR, we have HPAP Agt™ Aa"), 0 O and H'E*CS > 0. Hence, for any = #0, a” Hyx = da HPPP a + (1— da HPF > 0 since ¢ and 1 ~ ¢ are nonnegative. Hence, Hy. > 0, fom which we conclude that the algorithm has the descent property if is computed by line search (by Proposition 11.1), 118 ‘We proceed by induction, For k++ 1 ig clearly Q-conjugate). ‘Assume the results true for k ~ 1 < ns ity that d,,..,d(, & > 0, are Q-conjugate., We now prove the result for by ie thacd,.-.,d**) & < n— ty are Qeeonjupate, it sufices to show that d™*® Qa = 0, O 0, and dl = Aa Jay, So, giveni,0 = 14 Lf options (14) ==0 options (14) =1000+numvars: end else options (14) =1000¢numvars; end cle: format compact; format short e: options = foptions (options) ; print = options(1); epsilon_x = options (2); ‘epsilon_g = options (3); reset_ent = 0; g_curr=feval (grad, xnew) ; Af norm(g_curr) <= epsilon. 
43 disp( ‘Terminating: Norm of initial gradient less than’); disp(epsilon_g); return; end Bie de-H+g_curr; for k = I:max_ iter, alpha=linesearch_secant (grad, xcurr, 4) ; xnew = xcurrtalphatd; if print, disp(Tteration number k =") disp(k); @print iteration index k disp(‘alpha ="); disp(alpha); @print alpha disp(‘Gradient ="); disp(g_curr'); tprint gradient disp(’New point ="); disp(mew'); tprint new point end tif LE norm(xnew-xcurr) <= epeilon_xtnorm(xcurr) disp(’Terminating: Norm of difference between iterates less than’); disp (epsilon_x) : . break; end tif gold=g_eurr; g_curr=feval (grad, xnew) + Af norm(g_curr) <= epsilon. disp(/Terminating: Norm of gradient less than’); disp (epsilon_g) break; end 8if pealpha*a, geg_curr-g_old; reset_cnt = reset_ent+l; LE reset_ent == 3*numvare g_curr; reset_cnt = 0; else if options(5)==0 Rank one a" (p-H*a) H = H+ (p-H*q) * (p-#*q) "/(q"* (pHa) ) elseif options (S)==1 ¢DrP H = Hepp! /(p! a) = (H¥q) * (Hq) "/ (a? tA) else tEFGS Hs He (L4q/ *H*q/ (q! *p)) *E*p’ / (p’ *a) - (tap! + (Heat) / (a *B) end #if H*g_curr; end if print, disp('New disp; disp('New d ="); dispia); end ie k max_iter disp(’Terminating with maximum number of iterations’); end tif end for if nargout ©: end else disp(/Final point disp (xnew’) : disp(‘Number of iterations disp); end Bit s V We created the following Mile, gm, for the gradient of Rosenbrock’s functio function y=g(x) y= (-400* (3¢(2) (1) .72) #2e(2) -2* (1-1), 200° (22) ~2€01) 7291" ‘We tested the above routine as follows: >> options (2)=107(-7) ; >> options (3)=10°(-7); >> options (14) =100; >> xO=[-2/2); >> HO=eye (2): >> options (5: >> quasi_newton('g',x0,H0, options) ; ‘Terminating: Norm of difference between iterates less than 1.0000e-07 Final point 1.0000e+00 1.00008+00 Number of iterations = 8 >> options (5) =: >> quasi_newton (’g’,x0,H0,options) ; ‘Terminating: Norm of difference between iterates less than 1.0000e-07 Final point = 1.0000e+00 1.0000e+00 Number of iterations = a >> options (5) =2; >> quasi_newton(’g’,x0,H0, options) ; ‘Terminating: Norm of difference between iterates less than 1.0000e-07 Final point 1,0000e+00 1. 0000e+00 Number of iterations = 8 ‘The reader is again cautioned not to draw any conclusions about the superiority or inferiority of any of the formulas, for Hx based only on the above single numerical experiment. 11.10 4, The plot of the level sets of f were obtained using the following MATLAB commands: >> (K,¥]=meshdom(~-2:0.1:2, -1:0.1:3); >> aK." 4/44¥.°2/2 >> Va(-0.72, -0.6, >> contour (X,¥,2,V) ‘The plot is depicted below: b. With the initial condition [0, 0)", the algorithm converges to [1,0], while with the intial condition {1.5, 1J", the algorithm converges to [1,2|". These two points are the two strict local minimizers of f (as can be checked using the ‘SOSC). The algorithm apparently converges to the minimizer “closer” to the initial point. 12. Sol a Write the least squares cost ing Ax =b the usual notation || Az — bl? where ‘The least squares estimate of the mass is m* = (ATA ATH = = 122 ‘Write the least squares cost in the usual notation || Ax — bl? where si} Bh Eh 46 The least squares estimate for (a, Bis (e] = (ATA)AT + al 1 far -7)"* fiz u[-7 3 31 1 [35 ult] - [52 > [sf] 123 a. We form ae = a- [2A]: ’ (i) 3/2 440 ‘The least squares estimate of g is then given by 9 = (ATA) AT = 9.776. 040816, and w!°) = 9.776. We have a; = 42/2 = 8, and 6") = 78.5. Using the RLS 9.802, which is our updated estimate of g. b. We start with Py formula, we get a!) 
12.4 ‘The least squares estimate of Ris the least squares solution t0 LR =%M LR = Vp ‘Therefore, the least squares solution is, bol) ‘We represent the data in the table and the decision variables a and b using the usual least squares matrix notation: 12 6 as|uaf, sla, =[3] 32 5 The least squares estimate is given by =f] fly ll = lle* - 2|]. ‘Therefore 2” minimizes lla ~ «|| subject to {x : Ax ‘To prove necessity, suppose a:* minimizes ||ax — (|) subject to {x : Ax = b}. Lety* = 2" ~ 2!), Consider any point y, € {y: Ay = 6 — Az}, Now, Aly; +2) = Hence, by definition of 2°, Hull =I, +2) — 2} > If TL = HI". Therefore, y* minimizes |y|| subject » {Ay = b — Awl}. By Theorem 12.2, there exists a unique vector y* minimizing |[yl| subject to {Ay = b - Aa\}. Hence, by the above claim, there exists a unique 2* minimizing || — (| subject to {: Ax = 6) b. Using the notation ofthe proof of Theorem 12.3, Kaczmarz’s algorithm is given by wl) = ol) + ubacayer — Oho 2”) )anaye Subtract 2 from each side to give (att) — 20) = (al) — 2) + w(aayer ~ Ayr 2) — OF ays (2 — 2am Writing y(® = 21 — 20, we get yD = y + u(Omayer ~ @hayert) — aay Yon Note that y® = 0. By Theorem 12.3, the sequence {y(")} converges to the unique point y* that minimizes [ly subject to {Ay = b— Az}. Hence {2\")} converges to y* + 2, From the proof of part a, 2" = y° +2) minimizes llr — 22] subject to {a : Aw = b}. This completes the proof. aa Following the proof of Theorem 12.3, ssuming lal = 1 without loss of generality, we arrive at (ee) je" — 2°)? — w(2— w)(a" (2) ~ 2°)? Ue al? Since 2("),2* € R(A) = R([a™)) by Exercise 12.19, we have 2(") —x* € R(A). Hence, by the Cauchy-Schwarz inequality, (aT — 2"))? = Jal |f2® — since al] = 1 by assumption. Thus, we obtain 1 — (2 — p))|he — 2°)? = 7? — 2°? IP = [le —2"/P, jer where 7 = T= p(2— A). It is easy to check that 0 < 1 ~ u(2~ 2) <1 forall € (0,2). Hence,0 <7 <1. 38 12.22 In Kaczmarz’s algorithm with = 1, we may write on) =a) paul) att) 22! + (payer — aya ; Cntr — Bhar Tenaya Subtracting 2* and premultiplying both sdes by aF 4,4. yields Doe) 2 al . anys ahayea(et 2°) = af uggs (29-2 + Cnayes — ohana) eee rg rll = aha 2” ~ ah uyere” + bmeayst — hayne™) Tat brenyet — Ayre 0. Substituting aF,4)42° = Br(aysa Yields the desired result, 1223. ‘We will prove this by contradiction. Suppose C-z" is not the minimizer of || By — Ol]? over RT. Let 9 be the minimizer of || By — D|/? over BF. Then, ||Bg — BI? < ||BCx* — bl? = ||Ax* — bl. Since C is of full rank, there exists & ER” such that 9 = C. Therefore, BCS — b|? Ae — bl? I1Bg — bl? < [Aa — bl? ‘which contradicts the assumption that z* is a minimizer of || Aw ~ bl? over R. 1224. a. Let A= BC bea full rank factorization of A. Now, we have At = O'B". where BY = (B™B)-1B™ and Ct = CTC), On the other hand (A7)t = (C7 B7)!. Since AT = C7 B? isa full rank factorization of AT, we have (AT)* = (C7 BT)t = (B7)*(CT)'. Therefore, to show that (A7)t = (At)*, it is enough to show that (BT = (Blt (ery = (ey? ‘To this end, note that (B7)t = B(BTB)-", and (C7)! = (CCT)-'C. On the other hand, (Bt) ((BTB)B?)? = B(B™B)~', and (C")" = (C™(CCT)“1)” = (CCT) *C, which completes the proof. b, Note that At = Ct", which isa fullrank factorization of Al. Therefore, (A")! = (Bt)'(C1)*. Hence, to show that (A!)t = A, itis enough to show that (By! = B (cy! = 7. ‘To this end, note that (B#)* = ((B7B)-B7)! = B since BY is a full rank matrix. Similarly, (Ct)! = (CT (CCT)! = C since Ct is a full rank matrix. This completes the proof, 1225. =: We prove properties 1-4 in turn. 1. 
This is immediate 2. Let A = BC be a full rank factorization of A. We have AT = C!BI, where BY = (B7B)-'B™ and Ct =C?(CCT)*. Note that B'B = and CC! = I. Now, Alaa! = clB'Bcc'Bt cia At 3, We have (AAN? = (Becta ht (BBty" (BiB? ((B7 B)"' BT)" BT B(BTB)-'BT BBt BcC'B! AAl 4. We have (ala? = (ctatBoy™ (coy oncyt = eMctec")-y? etc") "Cc etc ctBtBc = Ata. <=: By property 1, we immediately have AA'A = A. Therefore it remains to show that there exist matrices U7 and V such that At = UAT and At = ATV. For this, we note from property 2that A''= 'A.Al. But from property 3, AA! = (ANT = (Al) AT. Hence, AY = AN(ANTAT. Seting U = AI(A')?, we getihat Al = UAT. Similarly, we note from property 4 that AYA = (A'A)T = AT(A')™. Substituting this back into property 2 yields At = ATAAT = A7(A')T Al, Setting V = (A')? At yields At = ATV. This completes the proof. 12.26 (Taken from [18, p. 24) Let co 0 100 cid, Ar=|0 10 c1o0 000 ‘We compute 00 100 oa, aj=|o 1 0] =4, 1-1 000 Wehave 000 AiAr=|0 1 0 010 which is a full rank factorization. Therefore, (ArAz)! = [ But Hence, (Ai Aa)! # ASAI. 13. Unconstrained Optimization and Feedforward Neural Networks tof f is given by Vs (w) = —Xalua — X7w). b. The Conjugate Gradient algorithm applied to our training problem is wy, L. Set k select the initial poi 2. gl) =—Xaya— XTw). If g = 0, stop, else set d = —g') Bens — gut ras 4. wt) = wi!) + and” 5. gt) _ ghtirx.xta® 6 te = Sore eta 7. dtl) = gt) + Beal” 8, Set ks Xalug— XTw"), ttgt) 0, stop. + 1igoto3. c. We form the matrix Xu as 0 0 05 05 05 005 -05 0 05. and the vector yy as u4 = [-0.42074, ~0.47943, 0.42074, 0,0,0,0.42074, 0.47943, 0.42074)". Running the Conjugate Gradient algorithm, we get a solution of wo" = (0.8806, 0.000)". 4. The level sets are shown in the figure below. ] O ‘The solution in part agrees with the level sets. . The plot of the error function is dep cted below. 132 a. The expression we seck is ass = (1- Wer To derive the above, we write een ee = ya~ayw") — (yy ato") -2F (ui — wt) Substituting for w(**)) — w(#) from the Widrow-Hoff algorithm yields eet — Hence, ex41 = (1 — wer b. For ex + 0, it is necessary and sufficient that |1 — | < 1, which is equivalent to 0

= 14 Lf options (14) ==0 options (14) =1000*Length (xnew) ; end elee options (14) =1000*1ength (xnew) ; end ele: format compact format short e; options = foptions (options); print = options(1): epsilon_x = options(2) ; epsilon_g = options (3); max_iter=options (14) ; for k L:max_iter, g_curr=feval (grad, xcurr) ; if norm(g_curr) <= epsilon disp(‘Terminating: Norm of gradient less than’); disp (epsilon_g) ; a1: break: end Bi alpha=10.0; xnew = xcurr-alpha*g_curr; if print, 59 disp('Iteration number k =’) @isp(k); Sprint iteration index k aiep(‘alpha ="); @isp(alpha); Sprint alpha disp(‘Gradient ="); @isp(g_curr'); print gradient @isp(‘New point ="); disp(xnew’); Sprint new point end 2 LE norm(xmew-xcurr) <= epsilon_xtnorm(xcurr) disp(/Terminating: Norm of difference between iterates less than’); disp (epsilon_x) ; break; end Bit Af k == max_iter disp(/Terminating with maximum number of iterations"); end Bit end Sfor if nargout >= 1 if nargout else disp( ‘Final point isp (xnew }; disp('Number of iterations aispik); end Bit 2 ‘To apply the above routine, we need the following M-file for the gradient. function y=grad(w, xd, ya) 5 wh: whoLew(2) whtz=w(3); whod=w(4) 7 woll=w(5); wol2=w(6) ehew(7): (a; i); ce wedd=xd(1) 5 xd2=xa(2) nL *xd1+wh12*%d2-C17 ‘vaewh2i *xdi +wh22"xa2-t2; zissigmoid(vi) ; 22eeigmold(v2); ylesigmoid(woll*zi+wol2*22-t3) ; d= (yd-yl) ty (1-y1) 5 Al*wol1*214 (1-21) x41; dl*wol2*22* (1-22) *xdl; dl*woll*21* (1-21) *xd2; y (4) =-d1*wo12*22* (1-22) a2; y(S)=-d1*21; ¥(6)=-d1*22; (7) =d1*wold*21* (1-21) ; y(8)=d1*wo12*22* (1-22); y(9)=a1; yey" We applied our MATLAB routine as follows. >> options(2)=107(-7) ; >> options (3)=10°(-7); >> options (14 >> w0=(0.1,0.3,0.3,0.4,0.4,0.6,0-1,0.1, >> [wstar,N]=bp( ‘grad’, w0, options) 0.11; ‘Terminating with maxinum number of iterations wstar -7.7771e+00 -5.5932e+00 -8.4027=+00 -5.6384e+00 =1.1010e+01 1.0918e+01 -3.2773e+00 -8.3565e+00 5.26068+00 10000 As we can see from the above, the results coincide with Example 13.3, ‘The table of the outputs of the trained ‘network corresponding to the training input data is shown in Table 13.2. 14. Genetic Algorithms 141 a. Expanding the right hand side of the second expression gives the di result . Applying the algorithm, we get a binary representation of 11111001011, i.c., 1995 = 29-42 42° 427 + 2° 429 428 4 2°. . Applying the algorithm, we get a binary representation of 0.101101, i.e. 0.7265625 = 2-1 42-9 42-4 4 2-8 4 2-7, 4, We have 19 = 24 +2! + 29, ie, the binary representation for 19 is 10011. For the fractional part, we need at least 7 bits to keep at least the same accuracy. We have 0.95 = 27? 42-2 42-3 49-4 49-74. . the binary representation is 0.1111001 ---, Therefore, the binary representation of 19.05 with at least the same degree of accuracy is 10011.1111001 142 It suffices to prove the result for the case where only one symbol is swapped, since the general case is obtained by repeating the argument. We have two scenarios. Fist, suppose the symbol swapped is ata position corresponding to a. don’t care symbol in H. Cleary, after the swap, both chromosomes will still be in H. Second, suppose the symbol ‘swapped is ata position corresponding ‘0a fixed symbol in IT. Since both chromosomes are in H their symbols at that position must be identical. Hence, the swap does not change the chromosomes. This completes the proof. 6 143, Consider a given chromosome in M(k)(VH. ‘The probability that itis chosen for crossover is qe. 
If neither of its offsprings is in HT, then at least one of the crossover points must be between the corresponding first and last fixed symbols of H. The probability of this is 1 — (1 — 6(41)/(L ~ 1))?. To see this, note that the probability that ‘each crossover point is not between the corresponding first and last fixed symbols is 1 ~ 6()/(L — 1), and thus the probability that both crossover points are not between the corresponding first and last fixed symbols of HT is (1 ~ 6(H)/(Z — 1))?. Hence, the probability that the given chromosome is chosen for crossover and neither of its offsprings is in HT is bounded above by 144 As for two-point crossover, the n-point crossover operation is a composition of n one-point crossover operations (i.e., ‘m one-point crossover operations in succession). The required result for this case is as follows. Lemma: Given a chromosome in (k) (H, the probability that it is chosen for crossover and neither of its offspringsis in FT is bounded above by For the proof, proceed as in the solution of Exercise 14.3 replacing 2 by n. 145 function Meroulette_wheel (fitness) ; function M=roulette wheel (fitness) fitness = vector of fitness values of chromosomes in population aM = vector of indices indicating which chromosome in the % given population should appear in the mating pool fitness = fitness - min(fitness); % to keep the fitness positive if sum(fitness) =="0, disp(’ Population has identical chromosomes break: else Fitness = fitness/sum(fitness) ; end cum fitness = cumsum(fitness) ; sToP"); for i = i:length(titness), tmp = find(cum_fitness-rand>C) ; Mi) = tmp(1); end 146 & parenti, parent? = two binary parent chromosomes (row vectors) L = length (parent); crossover_pt = ceil(rand*(L-1)): offspring = (parentl(1:crossover_pt) parent2(crossover_pt+l:L)]: offepring2 = [parent2(1:crossover_pt) parent (crossover_pt+1:L)]; 2 147 % mating_pool = matrix of 0-1 elements; each row represents a chromosome % pm = probability of mutation N = size(mating_pool, 1); L = size(mating_pool, 2); mutation_points = rand(N,t) < p_m new_population = xor(mating_pool, mutation points) ; 148 A MATLAB routine for a genetic algorithm with binary encoding is: function (winner, bestfitness] = ga(L,N, £it_func, options) % function winner = GA(L,N, £it_func) % Function call: GA(L,N,"£") 8 L = length of chromosomes N= population size (must ke an even number) 3 £ = name of fitness value function 8 Soptions: Sprint = options (1); Sselection = options(5) ; Smax_iter=options (14) ; Spc = options (18); Spm = p_c/100; e selection % options(5) = 0 for roulette wheel, 1 for tournament elf; if nargin options if nargin isp(‘Wrong number of arguments.’); return; end, end Af length (options) >= 14 Lf options (14) =0 options (14) =34N; end else options (14)=3*N; end Af length (options) < 18 options (18)=0.75; toptional crossover rate end format compact; Sformat short e; options = foptions (options) ; print = options(1); selection = options (5); max_iter=options (14) ; pic = options (18); 8 Pim = p_c/100; P = rand(N,)>0.5; bestvaluesofar = @initial evaluation for i= 1m, fitness(i) = feval(fit_fune, P(i,:)); end (bestvalue, best] = max(fitness); Af bestvalue > bestvaluesofar, bestsofar = P(best,:); bestvaluesofar = bestvalue; end for k = L:max_iter, ‘selection fitness = fitness - min(fitness); ¢ to keep the fitness if sum(fitness) == 0, isp(’ Population has identizal chromosomes -- STOP’); disp( ‘Number of iterations disp (kd; for i = k:maxiter, upper (i) =upper (i-1) average (i) =average (1-1); lower (i)=Lower 
(i-1) : end break; else fitness = fitness/sum(fitness) ; end if selection == 0, ‘roulette-wheel cum_fitness = cumsum(fitness) ; for i= Lin, tmp = £ind(cum_fitness-rand>0) ; m(i) = tmp(2); end else ‘eeournament for & fighteri=ceil (rand*N) ; fighter2=ceil (rand*n) ; if fitness(fighter1)>fitness(fighter2), m(i) = fighter1; else m(i) = fighter2; end end end M = zeros(N,L); for i= 1:0, M(i,2) = P(m(i), 2): end Scrossover Mnew = M; for i = 1:8/2 64 positive indi = ceil (rana‘n); ind2 = ceil (rand*N); parenti = M(indl,:); parent? = M(ind2,:) if rand < pe crossover_pt = ceil(rand* (L-1)); offspring] = (parenti(::crossover_pt) parent2(crossover_pt+1:L) 1; offspring? = (parent2(::crossover_pt) parent] (crossover_pt+1:L) ] Mnew(indl, :) = offspringl; Mnew(ind2,:) = offspring2; end end amutation mutation_points = rand(N,L) < p_m P = xor(Mnew,mutation_points) ; sEvaluation for i= 1m, fitness(i) = feval(fit_func,P(i,:)); end (bestvalue, best] = max(fitress) if bestvalue > bestvaluesofar, bestsofar = P(best, :); bestvaluesofar = bestvalve; end upper (k) = bestvalu average(k) = mean (fitness); lower (k) = min(fitness); end for if k == maxiter, disp(‘Algorithm terminated after maximum number of iterations: disp(max_iter) ; end winner = besteofar; bestfitness = bestvaluesofar; if print, iter = (1:max_iter]’; plot (iter upper, ’0:", iter, average, 'x-', iter, lower, '*--"); legend( ‘Best’, ‘Average’, ‘Norst'): xlabel ( ‘Generations’, 'Fontsize’, 14); ylabel (‘Objective Function Value’, 'Fontsize’, 14); set (gca, ‘Fontsize’, 14); hold off; end a. To run the routine, we create the following M-files. function dec = bin2dec (bin, range) ; Seunction dec = bin2dec (bin, range) ; SPunction to convert from binary (bin) to decimal (dec) in a given range index = polyval (bin, 2); dec = index* ( (range(2)-range(1))/(271ength (bin)-1)) + range(1) 65 function y=f_manymax (x) ; yo-15* (sin (24%) )°2> (x2) 724260; function y=£it_funcl (binchrom) + Q1-D fitness function _manymax’ ; range=(-10, 10]; xebin2dec (binchrom, range) ; sfeval (£,%) 7 ‘We use the following script to run the algorithm: clear; options (1) =1; (x,yJ=ga(8, 10," €it_funct", opticns) ; f=" fmanymax’ range=(-10, 10]; disp(’GA solution: "); disp (bin2dec (x, range) ): disp(/Objective function value:'}; dispiy); Running the above algorithm, we obtain a solution of 2* = 1.6078, and an objective function value of 159.7640. ‘The figure below shows a plot of the best, average, and worst solution from each generation of the population. a ) ». To run the routine, we create the following M-files (we also use the routine bin2dec from part a. function y=£_peaks (x); ya (Lx (1) ) 72 erp (= (20(1) -°2) = (20(2) 42) 072) = 10." ((1) /5-x(1) .73-¥(2) .°5) -Pexp (x (1) .°2-(2) 72) = el (2e(1) 41) 72x (2) 729/37 function y=fit_func? (binchrom) ; 92-D fitness function f=" peaks’: xrange=[-3,3]+ 66, Lslength (binchrom) ; jin2dec (binchrom(1:1/2) ,xxange) ; in2dec (binchrom(L/2+1:L) yrange) ; yefeval (£, [x1,x2]): We use the following script to run the algorithm: clear; options (1)=1; (x, y]=ga(16, 20, "£it_fune2" , options) ; =" f_peaks"; xrange=[-3,3)7 yeange=(-3,3]7 L=Leagth (x) 7 in2dec (x(1:L/2) ,xrange) ; x2ebin2dec (x(L/2+1:1) ,yrange) ; @isp(‘GA Solution: "); @isp((x1,22)); disp( ‘Objective function value:") disp(y); ‘A plot ofthe objective function is shown below. Sa Running the above algorithm, we ottain a solution of {—-0.0353, 1.4941), and an * = [-0.0588, 1.5412], and an objective function value of 7.9815. (Compare this solution with that of Example 14.3.) 
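For reference, the decoding performed by the bin2dec routine above maps an L-bit chromosome to a point in the interval [range(1), range(2)]. A small worked example (the chromosome below is arbitrary):
% Decode an 8-bit chromosome into [-10, 10], as in fit_func1.
bin = [1 0 1 1 0 1 0 1];     % arbitrary chromosome
range = [-10, 10];
index = polyval(bin, 2);     % bits read as base-2 digits: index = 181
x = index*(range(2)-range(1))/(2^length(bin)-1) + range(1);
disp(x)                      % 181*20/255 - 10 = 4.1961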
The figure below shows a plot ofthe best, average, and worst solution from each generation of the population. 67 Cbjectve Function Vale 149 A MATLAB routine for a real-number geretic algorithm: function (winner, bestfitness) = gar(Donain,N, £it_func, options) % function winner = GAR(Donain,N, £it_func) % Function call: GAR(Domain,N, '£") % Domain = search space; e.g, [-2,2/-3,3] for the space (~-2,2]x(~ % (number of rows of Domain = dimension of search space) 3 N= population size (must be an even number) f= name of fitness value function 3 Soptions: ‘print = options (1); selection = options (5) ‘max_iterzoptions (14) ; ‘ap_c = options (18); ‘pm = p_c/100; a tselection: % options (5) = 0 for roulette wheel, 2 for tournament elf: if margin “= 4 options = () if nargin “= 3 igp(‘Wrong number of argunents.'); return; end end if Jength(options) >= 14 Lf options (14) ==0 options (14) end else options (14) =3*N; end if length(options) < 18 options (18)=0.75; Yoptional crossover rate end 68 ‘format compact; Sformat short 7 options = foptions (options! print = options (1); selection = options(S); max_iter=options (14); PLc = options (18); Pom = p_c/100; n= size (Domain, 1); Lowb = Domain(:,2)'; upb = Domain(:,2)": bestvaluesofar = 0. for i = 1:8, PG lowb + rand(1,n) .*(upb-lowb) ; tinitial evaluation fitness (i) = feval (fit_fune, P(A, :))7 end (bestvalue, best] = max( fitness) ; if bestvalue > bestvaluesofar, bestsofar = P(best, :); bestvaluesofar = bestvalue; end for k max_iter, aselection fitness = fitness - min(fitness); $ to keep the fitness positive if sum(fitness) == 0, disp(‘Population has identical chromosomes -- STOP’); isp( ‘Number of iterations:"): disp): for i = k:max_iter, upper (i) supper (1-1); average (i) saverage(i-1 Lower (i)=Lower (i-1) + end break: else fitness = fitness/sum(fitaess) ; end if selection == 0, froulette-wheel cum fitness = cumsum(fitness) for i= 1:n, tmp = find(cum_fitness-rand>0) ; m(i) = tmp(1); end else ‘@rournament for 1 = 1:N, fighteri=ceil (rand*w) ; fighter2«ceil(rand*N) ; if Eitness (fightert) >fitness(fighter2), m(i) = fighterl; else m(i) = fighter2; end 0. end end M = zeros(N,n); for i= 1:m, M(iss) = Bim), end Scrossover mew = M: for i= 1:m/2 indi = ceil(rand*N) ; ind2 = ceil(rand*n) ; parent = M(indl,:); parent2 = M(ind2,:); AE rand < pe a= rand; offspringl = a*parenti+(1~a) *parent2+(rand(1,n)-0.5) .* (upb-Lowb) /10; offspring? = a*parent2+(1-a) *parent1+(rand(1,n)-0.5) .* (upb-Lowb) /10; do projection for j = lin, LE offspring! (3) upb(3), of fpsringh (3) =upb(3) end Lf offspring? (3)upb(3), of fpsring2 (3) =upb(3) end end Mnew (indi, :) = offspring: Mnew (ind? offspring? 
end end amutation for i= Lm, if rand < pm, a = rand) Mnew(i,:) = atMmew(i,:) + (1-a)*(lowb + rand(2,n) end, end P = wnew: ‘evaluation for i = 1:n, fitness (i) = feval(£it_func, P(i,: end (bestvalue, best} = max(fitness); if bestvalue > bestvaluesofar, besteofar = P(best, :) bestvaluesofar = bestvalue: end upper (k) = bestvalue; average(k) = mean( fitness); lower (k) = min(fitness) ; 70 + (upb-Lowb) end for if k == maxiter, disp( ‘Algorithm terminated after maximum number of iterations:') disp(max_iter) ; end winner = bestsofar; bestfitness = bestvaluesofar if print, iter = (1:maxiter]’; plot (iter, upper, ’0:’, iter average, 'x-', iter, lower, /* Jegend( ‘Best’, ‘Average’, ‘Worst'); xlabel (‘Generations , ‘Fontsize’ , 14); ylabel (‘Objective Function Value’, ‘Fontsize’,14) ; set (aca, ‘Fontsize’, 14); hold off; end ‘Torun the routine, we create the following M-file for the given function, function y=£_wave(x) ; yox(1) sin (3e(1)) + 2€(2) *9in (542¢(2)) 7 We use the following script to run the algorithm: options (1) options (14) =50; (x, yl=gar( (0, 10;4,6],20,'£_wave' options) ; disp('Ga solution: ’); disp (x) ; disp Objective function value:' disp(y); Running the above algorithm, we obtain a solution of 2* = {7.9711, 5.3462]”, and an objective function value of 18.2607. The figure below shows a plotof the best, average, and worst solution from each generation of the population. 3 ‘enaratone Using the MATLAB function €minunc (from the Optimization Toolbox), we found the optimal point to be (7.9787, 5.3482], with objective function value 13.2612. We can see that this solution agrees with the solution ‘obtained using our real-number genetic algorithm. n 15, Introduction to Linear Programming, 15.1 minimize ~ 22; — 22 subject tom; +25 = 2 ntmtnm = 3 mt+2mtm = 5 Byevnte 2 0 15.2 We have 2 = ary + buy = 029 + abug + buy = a + (ab, bu where u = (uo, 11]” is the decision variatle. We can write the constraint as uj < 1 and u; > —1. Hence, the problem is: minimize a? + [ab,b]u subjectto — -1 Obe such that = 2) +2). ‘Substituting into the original problem, we have minimize o:(af +2;) +e2(ef +05) +--+ + enlet +29) subjectto A(wt-2-) =6 ate 20, J7. Rewriting, we get where a* = [2,...,2t]7 and 2~ 1 minimize e" e"}z subjectto [A - A]z=b 220, which is an equivalent linear programming problem in standard form, ‘Note that although the variables 2" and 2" in the solution are required to satisfy 2{'z7 = 0, we do not need to explicitly include this in the constraint because any optimal solution to the above transformed problem automatically satisfies the condition «27 = 0. To see this, suppose we have an optimal solution with both zi? > O and.x; > 0. In this case, note that cj > 0 (For otherwise we can add any arbitrary constant to both 2° and 2 and sill satisfy feasibility, but decrease the objective function value). Then, by subtracting min(2,27) from 2f' and 27, we have 4 new feasible point with lower objective function value, contradicting the optimality assumption. [See also M. A. Dahteh and I,J. Diaz-Bobillo, Control of Uncertain Systems: A Linear Programming Approach, Prentice Hall, 1995, pp. 189-190.] 15.4 — Not every linear programming problem in standard form has a nonempty feasible set. Example: subject to —21 = 20. Not every linear programming problem in standard form (even assuming a nonempty feasible set) has an optimal solution, Example: minimize ay subjectto 2 =1 2,22 >0. 185 Let 2; > 0,1 = 1,...,4, be the weight in pounds of item i to be used. 
Then, the total weight is x_1 + x_2 + x_3 + x_4. To satisfy the percentage contents of fiber, fat, and sugar, and the total weight of 1000, we need
3x_1 + 8x_2 + 16x_3 + 4x_4 = 10(x_1 + x_2 + x_3 + x_4)
6x_1 + 46x_2 + 9x_3 + 9x_4 = 2(x_1 + x_2 + x_3 + x_4)
20x_1 + 5x_2 + 4x_3 + 0x_4 = 5(x_1 + x_2 + x_3 + x_4)
x_1 + x_2 + x_3 + x_4 = 1000.
The total cost is 2x_1 + 4x_2 + x_3 + 2x_4. Therefore, the problem is:
minimize 2x_1 + 4x_2 + x_3 + 2x_4
subject to -7x_1 - 2x_2 + 6x_3 - 6x_4 = 0
4x_1 + 44x_2 + 7x_3 + 7x_4 = 0
15x_1 - x_3 - 5x_4 = 0
x_1 + x_2 + x_3 + x_4 = 1000
x_1, x_2, x_3, x_4 >= 0.
Alternatively, we could have simply replaced x_1 + x_2 + x_3 + x_4 in the first three equality constraints above by 1000, to obtain:
3x_1 + 8x_2 + 16x_3 + 4x_4 = 10000
6x_1 + 46x_2 + 9x_3 + 9x_4 = 2000
20x_1 + 5x_2 + 4x_3 + 0x_4 = 5000
x_1 + x_2 + x_3 + x_4 = 1000.
Note that the only vector satisfying the above linear equations has a negative component (approximately [179, -175, 573, 422]^T), and hence is not feasible. Therefore, the constraint set does not contain any feasible points, which means that the problem does not have a solution.
15.6
The objective function is p_1 + ... + p_n. The constraint for the ith location is g_{i1} p_1 + ... + g_{in} p_n >= P. Hence, the optimization problem is
minimize p_1 + ... + p_n
subject to g_{i1} p_1 + ... + g_{in} p_n >= P, i = 1, ..., m
p_1, ..., p_n >= 0.
By defining the notation G = [g_{ij}] (m x n), e_n = [1, ..., 1]^T (with n components), and p = [p_1, ..., p_n]^T, we can rewrite the problem as
minimize e_n^T p
subject to Gp >= P e_m
p >= 0.
15.7
It is easy to check (using MATLAB, for example) that the matrix
A = [2 -1 2 -1 3; 1 2 3 1 0; 1 0 -2 0 -5]
is of full rank (i.e., rank A = 3). Therefore, the system has basic solutions. To find the basic solutions, we first select bases. Each basis consists of three linearly independent columns of A. These columns correspond to the basic variables of the basic solution; the remaining variables are nonbasic and are set to 0. The matrix A has 5 columns; therefore, we have C(5,3) = 10 candidate basic solutions (corresponding to the 10 combinations of 3 columns out of 5). It turns out that all 10 combinations of 3 columns of A are linearly independent. Therefore, we have 10 basic solutions. These are tabulated as follows:
Columns   Basic Solution
1,2,3     [-4/17, -80/17, 83/17, 0, 0]^T
1,2,4     [-10, 49, 0, -83, 0]^T
1,2,5     [105/31, 25/31, 0, 0, 83/31]^T
1,3,4     [-12/11, 0, 49/11, 80/11, 0]^T
1,3,5     [100/35, 0, 25/35, 0, 80/35]^T
1,4,5     [65/18, 0, 0, 25/18, 49/18]^T
2,3,4     [0, -6, 5, 2, 0]^T
2,3,5     [0, -100/23, 105/23, 0, 4/23]^T
2,4,5     [0, 13, 0, -21, 2]^T
3,4,5     [0, 0, 65/19, 100/19, 12/19]^T
15.8
In the figure below, the shaded region corresponds to the feasible set. We then translate the line 2x_1 + 5x_2 = 0 across the shaded region until the line just touches the region at one point, and the line is as far as possible from the origin. The point of contact is the solution to the problem. In this case, the solution is [2, 6]^T, and the corresponding cost is 34.
15.9
We use the following MATLAB commands (based on the lp routine from the Optimization Toolbox):
f = [0; -10; 0; -6; -20];
A = [1, -1, -1, 0, 0; 0, 0, 1, -1, -1];
b = [0; 0];
vlb = zeros(5,1);
vub = [4; 3; ...];
x = lp(f, A, b, vlb, vub, x0, neqcstr)
x =
    4.0000
    2.0000
    2.0000
    0.0000
    2.0000
The solution is [4, 2, 2, 0, 2]^T.
16. The Simplex Method
16.1
b. Pivoting the problem tableau about the elements (1,4) and (2,3), we obtain the canonical tableau.
c. Basic feasible solution: x = [0, 0, 1, 4]^T.
d. [r_1, r_2, r_3, r_4] = [5, 0, 0, 0].
e. Since the reduced cost coefficients are all >= 0, the basic feasible solution in part c is optimal.
f. The original problem does indeed have a feasible solution, because the artificial problem has an optimal feasible solution with objective function value 0, as shown in the final phase I tableau.
g. Extract the submatrices corresponding to A and b, ap-
pend the last row (c,0], and pivot about the (2, 1)th element oo - 1 3 1 3 1/3 0 1/3 0 -5/3 -5/3 0 -2/3 16.2 ‘The problem in standard form is: minimize -2 xy — 3x9 subject to ay -+2y = 1 ata =2 112,03 > 0. ‘We form the tableau for the problem: 10 1 We pivot about the (1, 3)th element to get ‘The reduced cost coefficients are all nonnegative. Hence, ‘optimal cost is 4. 163 the current basi feasible solution is optimal: [0, 1,1]. The ‘The problem in standard form is: minimize subject to 2m =m a tay =5 mtr =7 ata +25 By Bh 20. 76 We form the tableau for the problem: 101005 o 10107 1 10019 -2 -10000 ‘The above tableau is already in canonical form, and therefore we can proceed with the simplex procedure. We first pivot about the (1, 1)th element, to get 10 1005 o1 0107 Oo 1 1014 0-1 200 10 Next, we pivot about the (3, 2)th element to get 10100 5 oo 11-13 o1-101 4 oo 101 1 ‘The reduced cost coefficients are all nonnegative. Hence, the optimal solution to the problem in standard form is [5,4,0,3, 0)". The corresponding optimal cost is —14. 16.4 a. The BFS is (6,0,7,5, 0], with objective function value ~8. b. r= [0,4,0,0, —4]". «©. Yes, because the Sth column has all negative entries. 4. We pivot about the element (3,2). The new canonical tableau is: 00 -1/3 1 0 8/3 10 -2/3 0 0 4/3 o1 1/3 0 7/3 0 0 -4/3 0 0 -4/3 ¢. First note that based on the Sth column, the following point is feasible: 6 0 +e 7 5 0. Note that 25 = «. Now, any solution of the form x = [+,0,«,+,¢] has an objective function value given by zaatrse —4 (from pars and b). If » = —100,then ¢ = 23. Hence, the following point has objective function value z = 106 6 2) p52 0 0 0 7| +23|3] =| 76 5 1 28 0. 1 23 1 f. The entries of the 2nd column of the given canonical tableau are the coordinates of a with respect tothe basis {ax a1,a3}. Therefore, 44 + 2a, + 3a3. ‘Therefore, the vector (2, ~1,3, 1,0]” lies in the nullspace of A. Similarly, using the entries of the Sth column, we deduce that (~2, 0, ~3,~1, ~1]P also lies in the nullspace of A. These two vectors are linearly independent. Because A has rank 3, the dimension of the nullspice of A is 2. Hence, these two vectors form a basis forthe nullspace of A. 165 1 We can convert the problem to standard form by multiplying the objective function by —1 and introduc Variable as. We obtain: a 2a surplus minimize 2, +2 subject to zz ~e =1 1, 22,23 > 0. Note that we do not need to deal with the absence of the constraint x2 > 0 in the original problem, since 22 > 1 implies that 22 > 0 also. Had we used the rule of writing 2 = u —v with u,v > 0, we obtain the standard form problem: minimize subject to 21,1, 0,273 2 0. ». For phase I, we set up the artificial problem tableau as; o1-a11 00 10 Pivoting about element (1,4), we obtain the canonical tableau: Ol 1d o-1 10 Pivoting now about element (1,2), we obtain the next canonical tableau: O1r-l1i1 ooo010 Hence, phase I terminates, and we use 2 as our initial basic variable for phase TI For phase Il, we set up the problem tableau as: 01-11 1200 Pivoting about element (1,2), we obtain o1-1 1 10 2 -2 Hence, the BFS [0, 1,0)” is optimal, with objective Functi problem is (0, 1]! with objective function value ~2. value 2. Therefore, the optimal solution to the original 16.6 a. 1,0)" b [1-1 1] Note that the answer is not [1 =" 1]. which is the canonical tabl weanswer is not |} 1], which isthe canonical tableau, 78 c. We choose q = 2 because the only regative RCC value is r2. However, y,2 <0. 
Therefore, the simplex algorithm terminates with the condition that the problem is unbounded. 4. Any vector of the form (21,11 ~ 1]? 11 > 1, is feasible. Therefore the first component can take arbitrarily large (positive) values. Hence, the objective function, which is —21, can take arbitrarily negative values. 167 ‘The problem in standard form minimize 1 +22 subject to 2, +2n2 a5 2a, +22 ~ a4 = 2, 09,9524 > 0. We will use zr and zr as initial basic variables. Therefore, Phase I is not needed, and we immediately proceed with Phase II. The tableau for the problem is: a a a3 a b 1 2-103 2 1 0 -13 110 00 ‘We compute Av = eSB onl} 7] (1/3,1/3}, rh = ch -aD =(0,0]—[1/3,1/3] icy 4] [1/3,1/3] = (ra,r4) ‘The reduced cost coefficients are all nonnegative. Hence, the solution to the standard form problem is (1,1, 0,0)". ‘Therefore, the solution tothe original problem is (1, 1]”, and the corresponding cost is 2. 16.8 a. The problem in standard form is: minimize 4x, + 322 subject to 5x +22 ~ 5 = 11 We do not have an apparent basic feasible solution. Therefore, we will need to use the two phase method. Phase I: We introduce artificial variables 26, 27,25 and form the following tableau. 6 a) a3 a4 a5 a5 a7 as Db 5 1-10 0100n 2 1 0 -10 0108 12 0 0-10 017 ef 0 0 0 0 0 1 11 0 We compute: aT os [1,11 7B = [rurasrssrasts] 4,111) We form the augmented revised tableau by introducing y = B™'ay = ay: Variable | BO! [up ve ze [to ojm 5 av jorols 2 as oo? t We now pivot about the frst componentof yo gt Variable] B- | x | 1/5 0 01/s zr |-2/5 1 0) 18/5 zs |-1/5 0 1| 24/5 ‘We compute " aT = [-3/5,1,1] rh = [rasrasresrsyra] = [-12/5,—3/5,1, 1,8/5] We bring yp = B~'ay into the basis to get Wwriable| B- | vo ws zm | 1/5 0 0] 11/5 1/5 a, | -2/5 1 0) 18/5 3/5 zy | -1/5 0 1] 24/5 9/5 We pivot about the third component of ys to get ‘Variable | BO | vo an | 2/9 0 -1/9/ 5/3 zs |-13 1-13] 2 » |-9 0 5/9 |8/3 ‘We compute AT = [-1/3,1,~1/3} rh = [rsrasrssresrs) = [-1/3,1,-1/3,4/3,4/3} We bring y; = B™ay into the basis to get Variable | lve us mn | 2/9 0 -1/9/5/3 -2/9 a [1/8 1-1/3] 2 1/3 nm | -1/9 0 5/9 [8/3 1/9 80 We pivot about the second component of yg to obtain Variable] Bo Iu m |0 2/3 -1/3| 3 my |-1 3 -1\6 m |o -1/3 23 | 2 ‘We compute ar [0,0,0} 1B = [rasrssresrrsre] = [0,0,1,1,1] 2 07 I basic feasible solution is [3, 2,6, 0,0)”. a a a; ay as b 5 1 -1 0 0 210-108 1200 0-17 oT 4 3 0 0 0 0 ‘The initia revised tableau for Phase Ts the final revised tableau for Phase I. We compute AT = [0,5/3,2/3) rh = [rar] =(5/3,2/3] > 07. Hence, the optimal solution to the original problem is (3, 2)". ». The problem in standard form is: minimize ~62y ~ dary ~ Trg — 524 subject to 2 + 2x, +25 + 2ry +25 = 20 Gr; + 5z2 +325 + 2x4 +26 = 100 Bx, +429 + 9r5 + 12r4 +27 = 75 Bayer 20. We have an apparent basic feasible solution: [0,0,0,20, 100, 75)”, corresponding to B = Is. We form the revised tableau corresponding to this basic feasible solution: Variable] Bo! | yo zs [1 0 0] 20 zy {0 1 0] 100 ze [0 0 1] 7% We compute AT = 00,0] rh = [nyrarayre We bring yz = B™'as = as into the basis to obtain variable] B- | vy vs ‘We pivot about the third component of ys to get Variable i | v0 zs | 1 0 -1/9/ 35/3 zy |o 1 -1/3| 75 zy |o 0 1/9 | 25/3 ‘We compute aT = [0,0,-7/9] vB = (rasrasrar] =[-11/3, -8/9, 13/3,7/9). 
We bring y, = Bay into the basis to obtain variable] BT | woh a =1/9 | 35/3 2/3 10 re [01 -1/3| 75 5 x; |0 0 19 [25/3 1/3 about the second component of y, to obtain Variable | |v zs {1 2/15 -1/15| 5/3 x |0 1s -1/15| 15 zy {0 -1/15 2/15 | 10/3 We compute A? = (0,-11/15,-8/15) rh = (rayra,re,ra] = (27/15,43/15, 11/15, 8/15) > 07. ‘The optimal solution tothe original problem is therefore [15,0, 10/3, 0)" 16.9 1. By inspection of r™, we conclude thatthe basic variables are ,3,-r4, and the basis matrix is. oo1 =|100 010 Since r? > 07, the basic feasible solution corresponding to the basis B is optimal. This optimal basic feasible solution is [8,0,9, 7)". b. We have rf, = ch — ATD, where rh = [I], ef = [ea], Av = (5,6,4], and D = (2,1,3)7. We get 1 =e, ~ 10 ~ 6 ~ 12, Which yields c» = 29. 16.10 a. There are two basic feasible solutions: [1,0] and [0,2]. b. The feasible set in R? for this problem isthe line segment joining the two basic feasible solutions (1, 0[” and (0, 2)". “Therefore, if the problem has an optimal feasible solution thats not basic, then all points inthe feasible set are optimal. For this, we need [2] =a[?]. 82 where a € R. c. Since all basic feasible solutions are optimal, the relative cost coefficients are all zero. 16.11 a. 2-0 <0,8 0,7 < 0,and 5 anything, b. 2-a > 0,6 = ~7,8 and 7 anything. ©. 2-a <0,7> Ojeither 8 <0 or 5/7 < 4/8, and 5 anything. 16.12. a. We form the tableau for the problem: 100 1/4 -8§ -1 90 010 12 -12 -1/2 30 001 0 0 1 01 00 0 -3/4 20 -1/2 6 0 The above tableau is already in canonical form, and therefore we can proceed with the simplex procedure. We first pivot about the (1, 4)th element, to get 4001-32 -4 36 0 2100 4 3/2 -5 0 oo100 1 O18 3000 -4 -7/2 33 0 ig about the (2, 5)th element, we get -12 8 010 8 ~84 0 1/2 1/4 00 1 3/8 -15/4 0 0 0 1001 0 4 1 1000 0 Pivoting about the (1,6)th element, we get -32 1 0 18 01 -21/2 0 1/16 1/8 0 -3/64 10 3/16 3/2 -1 1 -1/8 00 2/2 1 -2 3 0 14 00 -3 0 Pivoting about the (2, 7)th element, we get 2-6 0 -5/2 56 100 1/3 -2/3 0 -1/4 16/3 0 1 0 -2 6 1 5/2 -% 001 -1 1 0 -1/2 18 000 Pivoting about the (1, 1)th element, we get 1-3 0 5/4 2 1/2 0 0 0-1/3 0 1/6 -4 -1/6 1 0 oo 1 0 0 1 o18 0-2 0 -7/4 44 1/2 00 PPivoting about the (2, 2)th element, we get o 14 -8 -1 9 0 1/2 -12 -1/2 3 1 0 0 1 0 0-3/4 20 -1/2 6 3 10 on 00 oo hich is identical tothe inital tableau, Therefore, eycling occurs. b, We start with the initial tableau of parta, and pivot about the (1, 4)th etement to obtain 4001 -32 -4 3 0 2100 4 3/2 -15 0 oo100 1 O11 3000 -7/2 330 Pivoting about the (2,5)th element, we get -2 8 010 8 -% 0 “1/2 1/4 0 0 1 3/8 15/4 0 o oO 1001 0 2 1 1000-2 B oO Pivoting about the (1, 6)th element, we get -3/2 1 0 18 0 1 -21/2 0 1/16 -1/8 0 -3/64 1 0 3/16 0 3/2 1-1/8 00 2/2 1 -2 0 14 00 -3 0 Pivoting about the (2, 1)th element, we get 0-20 -1 2% 1-60 1-20 -3/4 16 0 3 0 o21 1 -40 61 0-10 -5/4 320 3 0 Pivoting about the (3, 2)th element, we get oo 1 0 O 101 101 14 -§ 09 1 o112 12 -2 03 1/22 00 1/2 -3/4 2 0 6 1/2 Pivoting about the (3, 4)th element, we get oo 10010 1 1-1/2 3/4 0 -2 0 15/2 3/4 o 2 11-40 6 1 0 3/2 5/4 0 2 0 2/2 5/4 ‘The reduced cost coefficients are all nonnegative. Hence, the optimal solution to the problem is (3/4, 0,0,1,0,1,0]7. ‘The corresponding optimal cost is 5/4. 16.13 ‘The following is a MATLAB function that implements the simplex algorithm. 
function [x,v]=simplex(c,A,b,v,options) * SIMPLEX (C,A,b,¥) e SIMPLEX (c, A,b, v, options) 3 8 x = SIMPLEX(c,A,b,v); 84 ® SIMPLEX (c,A,b,v, options) ; ® ® (x,v] = STMPLEK(c,A,b,v) : ® (x,v] = STMPLEX(c,A/b,v, options) ; 8 QSIMPLEX(c,A,b,v) solves the following linear program using the simplex Method: % min c’x subject to Axb. x=0, Qwhere [A b] is in canonical form, and v is the vector of indices of ‘basic colunns. specifically, the v(i)-th column of A is the i-th Sstandard basis vector. She second variant allows a vector of optional parameters to be defined: SOPrIONS(1) controls how much display output is given; set Sto 1 for a tabular display of results (default is no display: 0). SOPTIONS(5) specifies how the pivot element is selected; % Oschoose the most negative relative cost coefficient & Isuse Bland’s rule. Af margin options if nargin @isp(‘Wrong number of arguments.’); return; end end format compact; ‘format short e options = foptions (options) + print = options(1); nelength(c) m=length (b) ; eBsctv(:)): r= c’-cB’*A; Brow vector of relative cost coefficients cost = -cBr* tabl={A brr cost); if print, disp); disp(’ Initial tableau’ disp(tabl); end Bit while ones(1,n)*(r' >= zeros(a,1)) if options (5) == 0; [aa] = min(x); else epland’s rule while r(q) >= 0 aeatl; end 85 end aif min_ratio = inf; p=0; for isin, if tabl (i,g)>0 Af tabl(i,nti}/tabl(i,g) < min_ratio min_ratio = tabl(i,n+1)/tabl (i,q); pei; end aif end tif end Sfor if p= 0 disp( ‘Problem unbounded’) ; break; end tif tabl=pivot (tabl,p,q); if print disp( "Pivot point:'); disp i tp.al) disp(‘New tableau: ') isp(tabl) ; end &if vip) = a r= tabl(mt,i:n); end twhile eros (n,1); x(v(:))=tabl (2:m,n42) ; ‘The above function makes use of the following function that implements pivoting: function Mnew=pivot (M,p,a) aunew=pivot (M,p.a) ‘Returns the matrix Mnew resulting from pivoting about the S(p,q)th element of che given matrix for ist:size(u,1), if isp Mnew(p, :)=M(p, 2) /M(p.a else Mnew (i, :)=M(i, 2) -Mip, 2) *(M(£,q) /(p.a)) end tif end for ® We now apply the simplex algorithm to the problem in Example 16.2, as follows: >> AeI1 0.10 > 6:8); > 0:01; >> ve[3:4:5]; >> options (1) =2; >> [e, v]=simplex(c,A,b,v, options); o10n 1icoa; Initial Tableau: 1 0 2 0 ok 86 o 1 0 1 6 6 . 1 2-5 0 0 oo a Pivot point: 2 2 New tableau: 1 ° 1 2 Pivot point: 3 New tableas 0 >> disp(x'); 2 6 2 0 2 >> disp(v’ 3 ‘As indicated above, the solution tothe problem in standard form is [2,6,2,0,0]", and the objective function value is ~34. The optimal cost forthe original maximization problem is 34. 16.14 The following is a MATLAB routine that implements the two-phase simplex method, using the MATLAB function from Exercise 16.13, function [x,v] =tpsimplex(c,A,b, options) ‘TPSIMPLEX(c,A,) : ‘TPSIMPLEX(c, A,b, options) + x = TPSIMPLEX(c,A,b); x = TPSIMPLEX(c,A,b, options) ; lewd bev) ‘TPSIMPLEX(c,A,b); "PPSIMPLEX (c,A,b, options) ; STPSIMPLEX(c,A,b) solves the following linear program using the Stwo-phase simplex method: % min c'x subject to Axsb, 10-0. Sthe second variant allows a vector of optional parameters to be defined: OPTIONS (1) controls how much display output is given; set ato 1 for a tabular display of results (default is no display: 0) BoprIONs(S) specifies how the pivot element is selected; & Oschoose the most negative relative cost coefficient; & leuse Bland’s rule. 
if margin “= 4 options = (]; if nargin “= 3 disp(‘Wrong number of argunents.‘); return; end end ele; 87 format compact; Sformat short e: options = foptions (options) ; print = options(1); nslength(¢) : m=length(b) : ‘aphase I if print, disp"); aisp(‘Phase 1); disp(" y end ventones (m, 1); for islim v(ddev(iy ris end (x, v] =simplex([zeros(n, 1) snes (m,1)], [A eye (m)],b,v, options) ; Af all(veen), ‘Phase II if print displ); isp(’Phase 11"); disp’ =e disp(/Basic columns: ") disp’) end Sconvert [Ab] into canonical augmented matrix Binv=inv(A(:,{v)))2 AsBinvta; beBinv*b; [x,v]=simplex(c,A,b,v, options) : if print disp); disp(‘Final solution: ’}; disp(x'); end else ‘assumes nondegeneracy disp(‘Terminating: problem has no feasible solution.") end ® We now apply the above MATLAB routine to the problem in Example 16.5, as follows: >> Ae(1.110; 530-11; >> be(4:8)7 p> c=[-3;-5;0;0) >> options (1)=1; >> format rat; >> tpsimplex(c,A,b, options) ; 88 Initial Tableau: 2 1 2 0 2 0 4 5 3 0 2 o 2 28 “6 dtd wan Pivot point: 2 1 New tableau: 0 a/s 2 SAS 12/5, 13/5 0-4/5 1/5 8/5 0-2/5 -2 3/5 06/5 22/5, Pivot point 1 3 New tableau: 0 2/5 1 4/8 2-4/5 125 1 03/5 0-1/5 0/5 B/S Cr) aoa Pivot point: 2 2 New tableau: -2/3 0 2 4/3 2-4/9 ay. 5/31 0 -1/3 0 4/3 8/3 “0 0 * a oa # Pivot point 1 New tableau: 2 3 2 3 24 4 1 o 4 4 * * @ 4a a. Basic columns 4 Phase IT initial Tableau: 2 0 3 2 4 2 1 1 0 @ 2 0 5 0 20 Final solution: 0 4 0 4 16.15 ‘The following is a MATLAB function that implements the revised simplex algorithm. function [x,v,Binv] =revsimp(c,A,b, v, Binv, options) REVSIMP(c,A,b,v,Binv) ; REVSIMP(c,A,b,v,Binv, options) ; x = REVSIMP(c,A,b,v,Binv) ; REVSIMP(c,A,b,v,Binv, options) + [x,v,Binv] = REVSIMP(c,A,b,v,Binv) ; (x, v,Binv] = REVSINP(c,A,b,v, Binv, options) ; SREVSIMP(c,A,b,v,Binv) solves the following Linear program using the 89 ‘revised simplex method: & min c’x subject to Ax=b, x ‘here v is the vector of indices of basic columns, and Binv is the ‘inverse of the basis matrix. 
specifically, the v(i)-th colum of 4a is the i-th column of the bssis vector he second variant allows a vector of optional parameters to be ‘defined: QopTIONS(1) controls how much éisplay output is given; set to 1 for a tabular display of results (default is no display: 0) ‘oPTI0Ns(5) specifies how the pivot element is selecte: © Oschoose the most negative relative cost coefficient; a ge Bland’s rule if margin “= 6 options = (1; if nargin “= 5 igp(‘Wrong number of argunents."); return; end end format compact: ‘format short e; options = foptions (options) ; print = options (1); nelength(c) melength(b) eBec(y(:)); yO = Binv*b; Lambda?=cB’ *Binv; ‘c/-lambdaT*A; $row vector cf relative cost coefficients if print, displ); disp(’ Initial revised tableau [v B(-1) y0}:'); disp({v Biny y0}); disp(‘Relative cost coefficients:"); dispir) end 3if while ones (1,n)*(r" A if options (5) == 0; (egal = min(x); else ‘SBland’s rule zeros(n/1)) while x(q) >= 0 gael; end end #if ya = Binvta(s,a): min_ratio = inf =O; for isi:m, LE ya(i)>0 90 if yO(i)/ya(i) < min_ratio min_ratio = y0(i)/yalil: pei; end tif end #if end Sfor if p == 0 isp(*Problem unbounded’); break; end tif if print, disp(/Augmented revised tableau [v B7(-1) yO yal:/) disp({v Binv yO yal); disp’ (pea) i"): disp({p.a}): end augrevtabl=pivot ({Binv yO ya],p,m+2) ; Binvsaugrevtabl (:,1:m) yOsaugrevtabl (:,m+1); vip) = eBec(v(:)); lambda?=cB’ *Binv; r= c’-lambdaT*a; trow vector of relative cost coefficients if print, disp(‘New revised tableau [v B7(-1) y0l:/; disp({v Binv y0]); isp( ‘Relative cost coefficients:"): disp(r); end aif end Suhile eros (n,1); x(v(s ‘The function makes use of the pivoting function in Exercise 16.13. ‘We now apply the simplex algorithm to the problem in Example 16.2, as follows: 1010 >> 6:8); >> c=[-2/-5:0 3> ve[374s5]; >> Binvseye(3); >> options (1) =1; >> (x,v,Binv] =rev_simp(c,A,b, ¥, Bin, options) ; Initial revised tableau [v B*(-1) yO): 30 1 0 «0 4 4 0 2 0 6 5 0 0 2 8 Relative cost coefficients: 2-5 0 0 Augmented revised tableau [v E*(-1) yO yal: 3 1 0 0 4 0 @ 0 2 0 6 2 a1 5 0 09 4 8 2 (pea: 2 2 New revised tableau [v B*(-1) yo]: 30 1 0 9 @ 2 0 21 0 6 ed Relative cost coefficients: 2 0 0 5 0 Augmented revised tableau [v B°(-1) y0 ya): 302 00 [Ol UMD 200 2 o 6 0 5 0 2A 2 2 2 a: 3 oa New revised tableau [v B°(-1) yd]: 3 1 1k 2 0 2. o 6 1 0 - 2 2 Relative cost cosfficients: 0 0 0 3 2 >> disp(x'); 2 6 2 o o >> dispiv'): 3002002 >> disp(Binv) ; .oolo-t o 1 0 oo oa 2 16.16. The following is a MATLAB routine that implements the two-phase revised simplex method, using the MATLAB function from Exercise 16.15. function (x,v1 prevsimp(c,A,b, options) ‘TPREVSIMP(C,A,bi: TPREVSIMP(c,A,b. options) ; ‘TPREVSIMP(C.A,b) ‘TPREVSIMP (c.A,b, options) bev) (xvi ‘TPREVSIMP(c,A,) 7 = TEREVSINP(c,A,b, options) ; 8 STPREVSIMP(c,A,b) solves the fo:lowing linear program using the Stwo-phase revised simplex method: % min c'x subject to Ax=b,,.x>=0. the second variant allows a vector of optional parameters to be ‘defined: QoPTIONS(1) controls how mich display output is given; set to 1 for a tabular display of results (default is no display: 0). oPrIONS(5) specifies how the pivot element is selected; % Oschoose the most negative relative cost coefficient; @ Isuse Bland’s rule. 
if nargin “= 4 options = [17 if nargin “= 3 disp(/Wrong number of arguments.’); end end cle: format compact; ‘format short ©; options = foptions options) ; print = options (1); nelength(c) ; melength (b) ; ‘phase I if print, disp); disp(*Phase I'); displ’ 4 end ventones (m,1) ; for w(iav (i) eis end (x,v, Binv] srev_simp (zeros (n,1) :ones(m,1)], [A eye(m) ],b,v, eye (m) options) ‘hase IT if print disp(' "): disp(*Phase IZ") disp(’ oN end (x,v, Binv] =rev_simp(c,A,b, v, Binv, options) Af print dist): disp("Pinal solution: "); dispix'); end ® - ‘We now apply the above MATLAB routine to the problem in Example 16.5, as follows: >> Ae[4.2-10; 140-11 >> be[12;61; >> c=[2:3;0:01; >> options (1 >> format rat; >> tprevsimp(c,A,b, options) ; Phase I Initial revised tableau [v B"(-1) yO}: 5s 1 0 6 0 2 6 Relative cost coefficients: 93 Ss 6 1 1 o Augmented revised tableau (v 8°(-1) yO yal: 5 1 0 Ww 2 6 0 1 6 4 (ea: 2 2 New revised tableau [v B7(-1) yO] 5 1-1/2 9 2 0 wa 3 Relative cost coefficients: “17202022 1-1/2 03/2 Augmented revised tableau (v B*(-1) y0 yal 5 1-1/2 9 (972 2 0 1/6 3/2 18 wa: a4 New revised tableau (v B°(-1) yO): 12/7 -1/7 18/7 2 1/14 2/7 6/7 Relative cost coefficients: 0 0 0 0 4 8 Phase IZ Initial revised tableau (v B*(-1) yO]: 12/7 3/7 18/7 2 1st 2/7 6/7 Relative cost coefficients + 0 s/n 4/7 Final solution: 18/7 6/7 0 oO 17. Duality a4 Since a and A are feasible, we have Az > 6, x > 0, and ATA < eT, A > 0. Postmultiplying both sides of ATA < eT by x > O yields AT Ax < che, Since Aw > band A” > 07, we have AT Aw > AT. Hence, A7b < eT x. 172 ‘The primal problem is: minimize ep subjectto Gp > Pem p20, where G = [91,5]. én symmetric duality): «++ 1]? (with n components), and p = {p1,.--+Pn]7- The dual of the problem is (using maximize PAT Em subjectto ATG < eT A20. 94 173 a. We first transform the problem into standard form: minimize —2r, ~ 3ry subject to 21 +202 +23 2a, + 2a +04 = 5 71,2, 03,44 > 0. ‘The initial tableau is: ‘We now pivot about the (1, 2)th element to get: 2 1 12 02 3/2 0 -1/2 1 3 “1/20 3/2 0 6 Pivoting now about the (2, 1)th element gives: O01 2/3 -1/38 1 10 -1/3 2/3 2 00 4/3 1/8 7 ‘Thus, the solution to the standard form problem is 21 = 2, 22 = 1, 23 = 0, 24 = 0. The solution tothe original problem is 2) = 2,22 = 1. b. The dual to the standard form problem is maximize 4A, + 5.2 subject toy + 2M $2 2M + Aes -3 Aide $0. From the discussion before Example 17-4, it follows thatthe solution to the dual is A” = ef — rf 174 ‘The dual problem is maximize 114, +842 + 7s subjectto 5 +2 +As <4 Dit Aa +203 $3 Das day As 2 0. Note that we may arrive at the above in one of two ways: by applying the asymmetric form of duality, or by applying the symmetric form of duality to the original problem in standard form. From the solution of Exercise 16.7a, we have that the solution to the dual is 7 = ef B~' = (0, 5/3, 2/3] (using the proof of the duality theorem). 175 . Multiplying the objective function by ~1, we see thatthe problem is ofthe form of the dual in the asymmetric form of duality. Therefore, the dual (0 the problem is ofthe form ofthe primal in the asymmetric form: minimize 7B subjectto A? A = ~c! Azo 95 b. The given vector y is feasible in the dua. Since b = 0, any feasible point inthe dual is optimal. Thus, y is optimal im the dual, and the objective function value for y is 0. ‘Therefore, by the Duality Theorem, the primal also has an ‘optimal feasible solution, and the corresponding objective function value is 0. 
Since the vector 0 is feasible in the primal and has objective function value 0, the vector 0 is a solution tothe primal 176 4. The dual (asymmetric form) is maximize subject to Aa SLi ‘We can write the constraint as XS min(I/as:i ‘Therefore, the solution to the dual problem is d= Vay ». Duality ‘Theorem: If the primal problem has an optimal solution, then so does the dual, and the optimal values of their respective objective functions are equal. By the duality theorem, the primal hasan optimal solution, and the optimal value of the objective function is 1/4. ‘The only feasible point inthe primal with this objective function value isthe basi feasible solution [0,...,0,1/aq] c. Suppose we start at a nonoptimal initial basic feasible solution, (0,...,1/a;,...,0]”, where 1 < i ay for any j # m, rq is the most negative relative cost coefficient if and only if q a7 a. The dual is minimize 970 subjectto ATA De™ A>0. b. By the duality theorem, we conclude that the optimal value of the objective function is 0. The only vector Satisfying © > O that has an objective function value of 0 is = 0. Therefore, the solution is ar = 0. ¢. The constraint set contains only the vector 0. Any other vector & satisfying a > 0 has at least one positive ‘component, and consequently has a positive objective function value. But this contradicts the fact that the optimal solution has an objective function value 00. 178 a. The artificial problem is: minimize (07 e" subjectto [A,I]z=6 220, where € = [1.51]? and 2 = fe", y] b. The dual tothe artificial problem maximize AT subjectto ATA < 07 aT cel. 96 . Suppose the given original linear programming problem has a feasible solution. By the FTLP, the original LP problem has a BFS. Then, by a theorem given in class, the artificial problem has an optimal feasible solution with y=. Hence, by the Duality Theorem, the dual ofthe artificial problem also has an optimal feasible solution. 179 ‘To prove the result, we use Theorem 17.3 (Complementary Slackness) Hence, 2 is a feasible solution to the dual. Now, (¢~ ATA)? = wTe are optimal for their respective problems. 17.10 7 ‘Touse the symmetric form of duality, we need to rewrite the problem as ee > 0, wehave ATA=e—p ce. 0. Therefore, by Theorem 17.3, and A minimize ~c"(u-»), subjectto —A(u-v) > ~b wv 20, which we represent in the form minimize [=e e?] [:] : subjectto [-A A] (2) 2b [Jeo By the symmetric form of duality, the dua i: maximize —-7(—6) subjectto A"[-A A] <[-e" e7] ADO. Note that for the constraint involving A, we have AT[-A A] <[-eP eT) -ATAS @ OTA eT and ATA <7 r Therefore, we can represent the dual as minimize AT subjectto ATA = Azo. wan ‘The corresponding dual can be written as: maximize 3h +32 subjectto A, +22 $1 2+ <1 Dn. A2 2 0. 16.7. Using the idea ofthe proof of the duality theorem ,B-' = (1/3,1/3]. The cost of the dual problem is 2, To solve the dual, we refer back to the solution of Exerci (Theorem 17.2), we obtain the solution to the dual as AT which verifies the duality theorem. ” 17.12. ‘The dual to the above linear program (asymmetric form) is maximize 70 subjectto — ATO 0. Since any feasible solution to the dual is also ‘optimal, the dual has an optimal solution f and only if ¢ > 0. Therefore, by the duality theorem, the primal problem has a solution if and only if > 0. IF the solution to the dual exists, then the optimal value of the objective Function in the primal sequal to that ofthe dual, which is clearly 0. In this case, 0 is optimal, since e 0 = 0. 1743. 
Consider the primal problem minimize OT x subjectto Ar >b 220, and its corresponding dual maximize yb subjectta yTA $0 y20. =: By assumption, there exists a feasible solution to the primal problem. Note that any feasible solution is also ‘optimal, and has objective function value 0. Suppose y satisfies A” y <0 and y > 0. Then, y isa feasible solution to the dual. Therefore, by the Weak Duality Lemma, 6”y < 0. 4: Note that the feasible region for the dual is nonempty, since 0 is a feasible point. Also, by assumption, 0 is an ‘optimal solution, since any other feasible point y satisfies 6” y : By assumption, there exists a feasible solution to the dul problem. Note that any feasible solution is also ‘optimal, and has objective function value 0. Suppose y satisfies A”y = 0 and y > 0. Then, y is a feasible solution to the primal. Therefore, by the Weak Duality Lemma, 6”y > 0. =: Note that the feasible region forthe primal is nonempty, since 0 is a fe an optimal solution, since any other feasible point y satisfies 6 y > 6"0 dual problem has an (optimal) feasible olution. 17.16 Let e = [1,..., 1)”. Consider the primal problem ible point. Also, by assumption, 0 is 0. Hence, by the duality theorem, the subject to and its corresponding dual maximize e"y subject to yTA=0 y20. = Suppose there exists Az < 0. Then, the vector! = /min{|(Az)i|} isa feasible solution tothe primat problem. Note that any feasible solution is also optimal, and has objective function value 0. Suppose y satisfies ATy =0,y > 0. Then, y is a feasible solution to the dual. Therefore, by the Weak Duality Lemma, ey < 0. Since 1v 2 O, we conclude that y = 0 “=! Suppose 0 isthe only feasible solution to the dual. Then, 0 is clearly also optimal. Hence, by the duality theorem, the primal problem hasan (optimal) feasible solution z. Since Ax < —e and -e <0, we get Az <0. 1147 Write the LP problem minimize ex subjectto Az > 220 and the corresponding dual problem maximize AT subjectto ATA < o™ AZO. By a theorem on duality, if we can find feasible points x and A for the primal and dual, respectively, such that che =X", then & and 2 are optimal for their respective problems. We can rewrite the previous set of relations as -c" oot 0 er 0 A 0 {fe} }o m 0 |G]2/0 0 -aT 0 In o ‘Therefore, writing the above as Ay > 6, where A € ROM+2A+2)x(m+") and 8 R242), we have that the fist ‘components of f((2m + 2n + 2), (m +n), A, 6) is a solution to the given linear programming problem. 99 17.18 1 Weak duality lemma: if # and yg are feasible points in the primal and dual, respectively, then (ao) > Ja(uo) Proof: Because Yo 2 Oand Azo ~ b <0, we have yf (Aaa ~ b) <0. Therefore, filto) > filwo) + us (Aaa —b) 1 7to to + Up Ato ~ 5b. Now, we know that 1 1 52040 + Uy Ato > 52°Ta* + ys Aw", where 2* = —ATyo, Hence, Lit gat, pabeo + vB Az > uf AA" yy — yPAAT yy = —lyf AAT yp ‘Combining this with the above, we have file) > —tyBAAT yy ~ ub hia). ‘Alternatively, notice that 1 1 Silo) ~ falto) = 32020 + 5uSAAT Uy +07 0 2 j0Bzo + LyPAAT y +20 A" yy 1 ; pllto + AT yoll? 2 0. ». Suppose fi (a0) = fala) for feasible points 2 and yp. Let be any feasible point in the primal. Then, by part a, file) > falyo) = fuleo)- Hence 2p is optimal inthe primal. Similarly, lety be any feasible point in the dual. Then, by part a, f(y) < f(a) = falta). Hence 4p is optimal in the dual 18. 
Non-Simplex Methods 18.1 The following is a MATLAB function that implements the affine scaling algorithm, function [x,¥] affscale(c,A,b,u, options) ; AFFSCALE(¢,A,,) APRSCALE(c,A,b,u, options) : x = AFFSCALE(c,A,b,u); x = AFFSCALE(c,A,b,u, options) ; (x,N) = APPScALR(C,A,b,u) ; (e.N] = APFSCALE(c,A,b,u, options) GAFFSCALE(c,A,b,u) solves the following linear program using the taffine scaling Metho % min c’x subject to ax=b, x>=0 awhere u is a strictly feasible initial solution Sthe second variant allows a veccor of optional paraneters to be 100 ‘defined: ROPTIONS(1) controls how much display output is given; set Sto 1 for a tabular display of results (default is no display: 0). S0PrIONS(2) is a measure of the precision required for the final point. B0PTIONS(3) is a measure of the precision required cost value. SOPrIONS (14) = max number of iterations. SOPTIONS(18) = alpha. if margin “= 5 options = 7 if margin “= 4 disp(‘Wrong number of arguments. return; end end if Length(options) >= 14 if options (14 ‘options (14)=1000*1ength (2mew) ; end else options (141 end 000*2ength (new) + Sif Length(options) < 18 options(18)=0.99; Yoptional step size end Sele; format compact; format short ¢ options = foptions (options): print = options(1); epsilon_x = options (2) epsilon_f = options(3) max_iter=options (14); alphasoptions (18) ; nelength(c) ; m=length (b) ; for k = L:max_iter, D = diag(xcurr) ; Abar = A‘D; Pbar = eye(n) ~ Abar’*inv(Asar‘Abar’) ‘Abar; d= -D*Pbar*D*c; if d “= zeros(n,1), nonzd = find(d<0) ; r = min(-xcurr(nonzd) . /d(nonzd) ): else disp(’Terminating: d= 0' break; end xnew = xcurrtalpha*r*d; 101 if print, disp(/Iteration number k =) disp(k); ‘print iteration index k disp(‘alpha_k ="); disp(aipha*r); @print alpha_k disp (‘New point ="); disp(xnew’); aprint new point end #1 ‘epsilon_x*norm(xcurr) Relative difference between iterates <'); Af norm(xnew-xcurr) disp(/Terminating: disp(epsilon_x) ; break; end sit LE abs (c'* (xnew-xcurr)) < epsilon_ftabs(c! *xcurr), isp(‘Terminating: Relative change in objective function < '); disp(epsilon_e); break: end tif LE k == max_iter ‘disp( ‘Terminating with maximum number of iterations’); end tif end for if nargout if nargout w end else disp(‘Final point disp (xnew’): disp(‘Number of iterations disp(e); end 3if * ‘We now apply the affine scaling algorithm to the problem in Example 16.2, as follows: >> As[1.0100;01010;11002); >> be[d:6:8): >> cn[-2:-5:0;0;01; >> un (23:2:3:31; >> options (1)=0; >> options (2)=107(-7); >> options (3)=107(-7); >> affscale(c,A,b,u, options) ; ‘Terminating: Relative difference between iterates < 1.0000e-07 Final point 2.0000e+00 6.0000e+00 2.0000e+00 1.0837e-09 | 1.7257e-08 Number of iterations = 8 ‘The result obtained after 8 iterations as indicated above agrees with the solution in Example 16.2: [2,6,2,0, 0)". 18.2 ‘The following is a MATLAB routine that implements the two-phase affine scaling method, using the MATLAB function from Exercise 18.1 102 function [x,N] =tpaffscale(c,A,b, options) 3 March 28, 2000 ‘TPAFFSCALE(¢,A,b) + 'TPAFFSCALE(c,A,b, options) : x = TAFFSCALE(¢,A,) ; x = TPAFFSCALE(c,A,b, options) + (x,w) (x8) TPAFFSCALE(¢,A,) : TPAFPSCALE(c,A,b, options) ; STPAFFSCALE(c,A,b) solves the following Linear program using the 21wo-Phase Affine Scaling Method: @ min c’x subject to Axeb, x20. ‘the second variant allows a vector of optional parameters to be defined: SOPTIONS(1) controls how mich display output is given; set Sto 1 for a tabular display of results (default is no display: 0). 
SOPTIONS(2) is a measure of the precision required for the final point. SOPTIONS(3) is a measure of the precision required cost value. Q0PTIONS(14) = max number of iterations, S0PrrONS(18) = alpha. if nargin “= 4 options = () if nargin “= 3 disp(‘Wrong number of arcuments.’); return; end end Bele: format compact; format short e7 ceptions = foptions (options) ; print = options(1); nelength(c) ; melength (b) ; ‘phase 1 if prin, displ); isp(/Phase I"); disp’ =95 ena u = rand(n, 1); v= bat; if v “= zerosim,1), u = affscale({zeros(1,n),1]', {A v],b,[u’ 1]’, options) ; uinel) = Os end Af print dispt’ ") disp(/Initial condition for Phase IT:') 103 disp(u) end Lf uint1) < options(2), ‘hase IT une) = (5 Af print disp(" 9); disp("Phase II"); aiep('~ > options (1)=0; >> tpaftscale(c,A,b, options) ; ‘Terminating: Relative difference between iterates < 1.00006-07 ‘terminating: Relative difference between iterates < 1,0000e-07 Final point 4.0934e-09 4.0000e+00 9.4280e-09 4.0000e+00, Number of iterations = 7 ‘The result obtained above agrees with the solution in Example 16.5: (0,4,0,4]"- 183 The following is a MATLAB routine that implements the affine scaling method applied to LP problems of the form given in the question by converting the given problem in Karmarkar’s artificial form and then using the MATLAB, function from Exercise 18.1. function [x,N]=karaffscale(c,A,b, options) KARAFFSCALE(c,A,) EARAFFSCALE(c,A,b, options) ; x = KARAFPSCALE(C,A,D) x = KARAFFSCALE(C,A,b, options) ; bem) teem) CALE (C/A, D) PALE (c,A,b, options) ; 8 SKARAFFSCALE(c,A,b) solves the following Linear program using the QAEEine Scaling Method: Los % min c’x subject to Ax>=b, x>=0. She use Karmarkar‘s artificial problem to convert the above problem into 8a form usable by the affine scaling method, Sthe second variant allows a vector of optional parameters to be defined: SoprroNs(1) controls how mich display output is given; set Sto 1 for a tabular display of results (default is no display: 0). GOPTIONS(2) is a measure of the precision required for the final point. SOPTIONS(3) is a measure of the precision required cost value. OPTIONS (14) = max number of iterations OPTIONS (18) = alpha. if nargin “= 4 options = (1 if nargin “= 3 Gisp(‘Wrong number of arguments.’); return; end end sele; format compact, format short e| options = foptions (options); print = options(1); nelength(c) : melength(b) : Qconvert to Karmarkar's aftificial problem x0 = ones(n,1); 10 = ones(m,1) ud = onesin, 1); vO = onesim, 1): aa= ¢! -b’ zeros(1,n) zeros (1,m) (-c’*x0+b’*10); A zeros(m,m) zeros(m,n) -eye(n) (b-A*x0+v0) 5 zeros(n,n) A’ eye(n) zeros(n,m) (c-A‘*10) 1 bb = (0; br cl: ce = [zeros (2tme2*n,1); 1); yO = (x0; 10; uO; vO; 2); (y:M)-afEscale(ce,AA,bb, yO, options) : if cory <= options(3), x = y(iinl: if nargout == 0 disp( ‘Final point disp(x'); disp(‘Final cost =) disp(c’*x): Aisp(’Number of iterations ="); dispin) ; end #if else disp(‘Terminating: problem has no optimal feasible solution.’ end 105 ‘We now apply the above MATLAB routine to the problem in Example 15.14, as follows: >> options (2 >> karaffscale(c,A,b, options) ; ‘Terminating: Relative difference between iterates < 1.0000e-04 Final point = 5.1992e+00 6.5959e+00 Final cost = =4.8577e+02 wumber of iterations = 3 “The solution from Example 15.14 is (5,7). The accuracy of the result obtained above is disappointing. We believe that the inaccuracy here may be caused by our particularly simple numerical implementation of the affine scaling ‘method. 
This illustrates the numerical issues that must be dealt with in any practically useful implementation ofthe affine scaling method. 184 a. Suppose T(z) = T(y). Then, Tia) = Ti(y) for i = 1,...,n +1. Note that for i = (ei /as)Tnon(@) and T3(y) = (us/04)Tasa(y)- Therefore, Tile) = (24/a4)Tr4i (©) = Tay) = (¥/ 04) Tn4a(y) = (Ys/a)Tnoa(@), which implies that 2; = ys,i = 1,-..,n. Hence a = y. b, Lety € {@ € A: angi > 0}. Hence yner > 0. Define « ‘Then, T(2) om, T(z) = [21,..-san]? by a4 ays /Ungiet 1y. To see this, note that 1 vost Tyas) = = i, 10) = Tab tile Fl te Poe Also, fori = 1,4 2) = (vil vasa Tos) = ve An immediate consequence ofthe solution to part 4. We have 1 1 Tol) = STa po tanfan ed wT ~ (a) = (0/0) Tua) = > n(a) = (e/a)Tass(a) = e. Since y = 17(2), we have that fori = iy... 94 = (ula) Merefore 2 = yas = 2i¥nese which implies that a! = yas. Hence, Ae! = ynyi Ae = bynst 18.5 Leta: € R", and y = T(x). Let ay be the ith column of A, i 1y.--ym. Asin the hint, let A’ be given by Al = lara, .--,4n0n,-b) ‘Then, Av=b @ Ar—b=0 4 [aty--5@n, 8] 0 1 106 i/o [a1@1,...,Gn@n, —b] «t ont Olan 1 (ei/ar)vnes (2n/an)ynsa Unt & Aly=0. 18.6 ‘The result follows from Exercise 18.5 by setting A := e and 18.7 Consider the set {a € RB" : e 1 > 0,21 =0)}, which can be written as (a € R® : Az = b,z > 0), where Jr a-[a], [i]. {1,... M7 e1 = [1,0,....0]". Leta = e/n, By Exercise 12.16, the closest pointon the set {x : Aa = b} to the point ag is 7 AAT)“H 1)" AT AA?) (0 Aas) +49 = [0, 45... Pil : Since #* € {x : Ax = b,x 2 0} C {x : Ax = b}, the point 2* is also the closest point on the set (@ : Az = b,x > 0} to the point ao, Let r = lag — 2*||. Then, the sphere of radius r is inscribed in A. Note that 17 = |lao — 2" || Hence, the radius ofthe largest sphere inscribed in A is larger than or equal to 1/ /n{m 1). It remains to show that this largest radius is less than or equal to 1//n(n — 1). To this end, we show that this largest radius is less than oF equal to 1/ Vn) +e for any © > 0. For this, it suffices to show thatthe sphere of radius 1/ nn =) +e is ‘not inscribed in A. To show this, consider the point 1 1 r Fron mel It is easy to verify that the point a above is on the sphere of radius 1/\/n(n=1) + ¢. However, clearly the frst ‘component of 2 is negative. Therefore, the sphere of radius 1/ y/n(n = I) + is not inscribed in A. Our proof is thus completed. 188. We first consider the constraints. We claim that x € A 5 & € A. To see this, note that if € A, theneT a = land hence eB = e™D"'2/e"D x = 1, hich means that @ € A. The same argument can be used for the converse. Next, we claim Az = 0.4 AD = 0. ‘Tosee this, write | Ag = ADD™'z = AD2(e Dz). 107 Since eT D™'z > 0, wehave Ax = 0+ AD = 0. Finally, we claim that if is an optimal solution to the original problem, then 2° = U (z*) is an optimal solution to the transformed problem. To see this, recall that the problem in a Karmarkar's restricted problem, and hence by Assumption (B) we have ¢7x* = 0. We now note that the minimum value of the objective function e” Dz in the transformed problem is zero. This is because €” Dz = ez/e"D~'z, and 7 D™'ar > 0. Finally, we observe that at the point, £* = U (2) the objective function value for the transformed problem is zero. Indeed, oT De 0. ‘Therefore, the two problems are equivalent. 18.9 Let v € RB"! be such that vu? B "DD" Je™ Dx 07, We will show that and hence v = 0 by virtue of the assumption that owl] (<. where u € R” constitute the first m components of v. Then, vB =u AD + mie” m+1. 
‘This in turn gives us the desired result. To proceed, write v as Postmultiplying the above by e, and using the facts that De = arp, Aaro = 0, and ee = n, we get ul Azo + Umsin = Um4in = 0, which implies that vms1 = 0. Hence, u?AD = 0", which after postmultiplying by D~' gives uA = 07. Hence, ph that » = 0. Hence, rank B 18.10 ‘We proceed by induction. For k = 0, the result is true because ar(°) point of A. We first show that ("+") is ¢ strictly interior point. Now, 10. Now suppose that (is a strictly interior att) = ay — are”) Then, since a € (0, 1) and ||2)} 1, we have lle) ~ aol < Jar|le || 0. To see tis, note that e? D,(*+") > 0. Furthermore, we can write ee g(t) Death) att Since a\*) = [e{*),,.., 26°] > 0 by the induction hypothesis, and 20+!) = [2{**9),...2*)T > 0 by the y \ypot above, 2'**!) > 0 and hence it is a strictly interior point of A. 108 19. Problems with Equality Constraints 191 a. As usual, let f be the objective function, and fh the constraint function. We form the Lagrangian I(z, 4) = ‘J(z) + ATh(e), and then find critical points by solving the following equations (Lagrange condition): Dzl(x,d) = 07, We obtain 22014 26020 00005 12000 40500 ‘The unique solution to the above system is = (16/5, -1/10, ~34/25]", Dalle) = 0". ny [-4 2 5 23| = |-6 AD 3 wd Le AT = [-27/5, -6/5]". ‘Note that zc* isa regular point. We now apply the SOSC. We compute 2 L(a*,\*) = Fla’) +N H(@")) = [ o ‘The tangent plane is T(2*) Let y = a[-5/4,5/8, 1]7 € T(w*), a #0. We have yt Le", A) ‘Therefore, 2* is a strict local minimizes. ». The Lagrange condition for this protlem is 44 2A81 20 + 2X22 af+at-9 We have four points satisfying the Lagrange condition: {uew:[{ 5 gy {a[-5/4,5/8, 1)” a € R}. } peo. =0 20 0. AO = -2/3 NO 22/3 x XO = a1 [Note that all four points)... are regular. We now apply the SOSC. We have U(x, A) T(z) 109 00 20 = 2 Joy 0 2° {y: 2x1, 2x2]y = 0}. For the first point, we have yay . [4/3 0 Ee, 40) = [ as Ta) = {al0,1]":a€ Rh. u Lety = a(0, 1)” € T(x), a #0. Then WT H(e!,AMy = 2a? > 0 Hence, x") is a strict local minimizer. For the second point, we have He 4) lhc wos] > Hence, x') is a strict local minimizer. For the third point, we have Ea, 1) = leat a Te) = {a[-v5,2]":a€ RB}. Let y = al-V3,2)" € Te), a £0. Then TE (2),)y = —100 <0. Hence, 2\®) is a strict local maximizer. For the fourth point, we have seas) = [29 Te) = {alv5,27 sa € Rh. Lety = a[¥5,2]" € Tel), a #0. Thea yTL(2), Ay = 10a? < 0. Hence, 2) is a strict local maximizer. ©. The Lagrange condition for this problem is mt2n = 0 a +8 = 0 st4de?—1 = 0. ‘We have four points satisfying the Lagrange condition: V2, ~1/(2v2))7, x Yv2,1/(2v2)J?, N°) = 1/4 f/v2,1/2v3\7, Ae) = 1/4 [-1/v2,-1/(2vB)J", A = ~1/4, ‘Note that all four points a"), ..., x‘) are regular. We now apply the SOSC. We have 0 1), ,[2 0 [i }ea[e 3). ty: (2r1,802]y = 0}. 110 any 2 = 2) a) 4 L(@,d) Tie) Note that 2 fae. Kz,-1/4) = [ i 4] = fed La,1/4) = le ale After standard manipulations, we conclude thatthe first two points are strict local maximizers, while the last two points are strict local minimizer. 192 : By the Lagrange condition, 2* = [xi,2|” satisfies a+r = 0 n+4t4 = 0. Eliminating A we get 3x, -4=0 which implies that 21 = 4/3. Therefore, V f(x") = [4/3,16/3]". 193 a. The Lagrange condition for this problem 2" - a0) +2Me* = 0 le"? 9% where A* € R. Rewriting the first equation we get (1+ A*)2* = ao, which when combined with the second equation ives two values for 1+ \*: 1+ Af = 2/3 and 1+ Aj = -2/3. 
Hence there are two solutions to the Lagrange condition: 2°!) = (3/2)[1, V3), and 2°) = ~(3/2){1, V3}. b, We have L(a*(), 4t) = (1+ AZ). To apply the SONC Theorem, we need to check regularity. This is easy, since the gradient ofthe constraint function at any point z is 22, which is nonzero at both the points in part a For the second point, 1+ A = ~2/3, which implies tha the point isnot a local minimizer because the SONC does not hold. ‘On the other hand, the first point satisfies the SOSC (since 1 + Af = 2/3), which implies that it is a strict local minimizer. ~ 194 4. Let, 2, and zy be the dimensions ofthe closed box. The problem is minimize 2(erz2 +2275 + 521) subject to 1t2ts = V. We denote f(w) = 2(c122-+2222-+2921),and A(z) = 2y2275—V. Wehave V f(x) = 2[z2+23,21 423,21 +22)" and VA(z) = (raza, 123, 21:22]". By the Lagrange condition, the dimensions ofthe box with minimum surface area satisfies 2b+0)+Abe = 0 ate) +Aac = 0 2(a+)+Aab = 0 abe = V, where A ER b, Regularity of 2* means Vh(zx*) # 0 (since there is only one scalar equality constraint in this case). Since 2° = [a,b,c]? is a feasible point, we must have a,b,c # 0 (for otherwise the volume will be 0). Hence, Vi(z*) # 0, which implies that 2 is regular. ul «©. Multiplying the first equation by a and the second equation by b, and then subtracting the first from the second, we obtain: ea~0) Since c # 0 (see part b), we conclude that a = 8. By a similar procedure on the second and third equations, we conclude that b = c. Hence, substituting into the fourth (constraint) equation, we obtain vis, with \ = —4v-¥, d. The Hessian of the Lagrangian is given by 0 24Ac 2426 0 EeAy=|2+% 0 2400 -2 2+ 24a 0 -2 oid 2/1 0 1]. 110 ‘The matrix L(:e*, A) is not positive definite (there are several ways to check this: we could use Sylvester's criterion, or we could compute the eigenvalues of L(", A), which are 2,2, ~4). Therefore, we need to compute the tangent space T(x"). Note that Dh(x*) = Va(z)" = [bc, ac, ab] = VSIA, 1,1) Hence, Tle") = {y: Dhw")y = 0} = (y: (1,1, Uy = 0} = {ys = ~(v1 +92)}. Lety € T(@*), y #0. Note that either yx 4 0 or y» £0. We have, at vaya 0 i|» ~A(viye + vas + voys)- 10 ‘Substituting ys = —(y: + yz), we obtain PL", dy = —A(yiv2 — lyn + 42) — als +2) = ACV} + nan + v8) = 427 Qe _fa ap Q= lye 1 ] au ‘Therefore, y7 L(x", X)y > 0, which shows that the SOSC is satisfied. ‘An alternative (simpler) calculation: where = = (yi, ya] # 0 and VP Ay = 29? [ ] y= —Uyi(y2 + ys) + ye(vn +s) + va(t + y2)). Substituting n= ~(va+ys).4n = ~(v1-+v0)-and ys = (ys +02) inthe fist, second, and hid terms, respectively, we obtain YT Da" Ny = 207 + v3 +28) > 0. 198 1. We frst compute critical points by applying the Lagrange conditions. These are: 2m, +22) = 0 6x +2dr = 0 1+ 2X23 0 ai+ated-16 = 0. 12 ‘There are six points satisfying the Lagrange condition: a2(?) = [V63/2,0,1/2]", 2) = [-V63/2,0,1/2]", 2 = [00,4], al") = [0,0,-4)", a) = (0, V575/6, 1/6)", ~V575/6,1/6)", All the above points are regular. We now apply second order conditions to establish their nature, For this, we compute 20 0 200 =|0 6 0], me)=/o0 2 of, 002 000 {ye B® : (201,229, 2za]y = 0}. a= F(a) and Te") 00 0 @,.) = Jo 4 o For the first point, we have oo -2 Tee) = {[-a/V,b,a)" : a,b € R}. [-a/V63, b, a]” € T(x"), where a and b are not both zero. Then >0 if lal bv2 Lety From the above, we see that «!) does not satisfy the SONC. Therefore, 22") cannot be an extremizer. Performing similar calculations for 2°), we conclude that x‘) cannot be an extremizer either. 
For the third point, we have 7/4 0 0 1208) = [’ aja :,| 0 0 ~1/4 Te) = {[a,,0)" :a,6€ R}. Let y = (0,6,0]" € T(x"), where a and b are not both zero. Then VE (2, 1)y = a? + i > 0, Hence, 2") js a strict local minimizer. Performing similar calculations for the remaining points, we conclude that 2") isa strict local minimizer, and x) and x'®) are both strict local maximizers. b. The Lagrange condition for the problem is: 2x +A601 +402) = 0 2eq + Md: +1222) = 0 3x} +4ziz +62}-140 = 0 Pa? tal J+ (2) 113 ‘We represent the first two equations as From the constraint equation, we note that = [0,0] cannot satisfy the Lagrange condition. Therefore, the determinant of the above matrix must be zero. Solving for A yields two possible values: —1/7 and —1/2. We then have four points satisfying the Lagrange condition: al = Ba)", AO) = 1/7 2 N@) = 47 2) = [-2vT4, Vig?, A) = 1/2 RV -vig, AM = -1/2, Applying the SOSC, we conclude that a) and 2") are strict local minimizers, and a>) and @*) are strict local maximizers. 196 a. We can represent the problem as at minimize f(z) subject to h(a) where f(x) = 221 + 22 ~ 4, and h(z) = 2122 ~ 6. We have D(x) = 2,3], and Dh(ze) = [2,24]. Note that O'is nota feasible point. Therefore, any feasible point is regular. If" isa local extremizer, then by the Lagrange ‘multiplier theorem, there exists A* € R such that Df(2*) + * Dh(a*) = 07, or 2eaep = 0 34+Mzl = 0. Solving, we get two possible extremizers: 20) = [3,2I7, with corresponding Lagrange multiplier A) = —1, and 20) = —(3, 2)", with corresponding Lagrange multiplier \®) = 1 b. We have F(x) = O, and H(z) =(0 11 0} First, consider the point 2(?) = (8, 2", with corresponding Lagrange multiplier A®) = We have U2, =-[0 11 0], and. T(@) = {y:[2,3]y = 0} [-3,2]7 € T(e"), a £0. Wehave a[-3,2]" :a € R}. Lety 2a? > 0, PE, AM)y Therefore, by the SOSC, 2?) = [3,2]? is «strict local minimizer. ‘Next, consider the point ®) = —(3, 2)", with corresponding Lagrange multiplier Le,\2) =[0 11 0) and P(e) = {y:-2,3]y = 0} = {al-3,2]" ae RY = Te") 3,2] © T(x), a #0. We have VTE (2, Ay 120" <0, ‘Therefore, by the SOSC, x?) «. Note that f(a") = 8, while f(a) = -16. Therefore 2), although a strict local minimizer, is nota global minimizer. Likewise, x'), although a strict local maximizer, is not a global maximizer. =[3,2]7 isa strict focal maximizer. 114 19.7 We use the technique of Example 19.7. First, we write the objective function in the form Qa, where o-0"-[! 3]. ‘The characteristic polynomial of @ is A? 6 +5, and the eigenvalues of Q are 1 and 5. The solutions to the problem are the unit length eigenvectors of Q corresponding to the eigenvalue 5, which are -{1, 1]"/ V3. 198 Consider the following optimization problem (we need to use squared norms to make the functions differentiable): minimize — -||Az||? subject to |ja||? = 1. As usual, write f(a) = —|| Az? and h(x) = |jx||? — 1. We have V f(z) = -2A7 Ax and VA(z) = 2x. Note that all feasible solutions are regular. Let 2* be an optimal solution. Note that the optimal value of the objective function is f(z") = -||AIlg. The Lagrange condition for the above problem is: -2AT Ax" +\(2n") = 0 lle"? = 1. From the first equation, we see that AT Ant = M2", which implies that A* is an eigenvalue of AT A, and 2 is the corresponding eigenvector. Premultiplying the above equation by a*” and combining the result with the constraint equation, we obtain f(a" izes f(x"), we deduce that A* must be the largest eigenvalue of AT A; ice., X Alb = Vm. Mt ae'TAT At = ||Az"|l? 
Ale Therefore, because 2* mit Therefore, 199 Let h(w) = 1-27 Pa = 0. Let xp be such that h(eo) = 0. Then, xo # O. For ao to be a regular point, we need to show that {Wh(ao)} is a linearly independent set, ie., VA(z0) # 0. Now, Vh(a) = -2P2. Since P is nonsingular, and ao # 0, then Vi(to) = —2P 29 #0. 19.10 Note that the point (1, 1)” isa regular point. Applying the Lagrange multiplier theorem gives ata = 0 b+arr = 0. Hence, a = b. 19.11 a, Denote the solution by [2}, 23]. The Lagrange condition for this problem has the form Bh - 24202} 2-225 (ai)? ~ (23)? 0. From the first and third equations it follows that 23,23 0. Then, combining the first and second equations, we obtain 2-23 xt Qep Day 1s which implies that 23 — (23)? = (77)?. Hence, 1 = 1, and by the third Lagrange equation, (27)? = 1, Thus, the only two points satisfying the Lagrange condition are (1, 1] and [-1, 1]. Note that both points are regular. b. Consider the point e* = [-1,1]". The corresponding Lagrange multiplier is A* = —1/2. ‘The Hessian of the erat JG SJ-[8 ‘The tangent plane is given by T(a") = (y [-2,~2ly = 0} = (a,-a)" sae Rp Let y € T(z*), y # 0. Then, y = [a,~a] for some a 4 0. We have y™L(2*, A*}y = —2a? < 0. Hence, SONC does not hold in this case, and therefore * = {—1, 1] cannot be local minimizer. In fact, the point is a strict local ‘maximizer. Consider the point * = 1,1]. The corresponding Lagrange multiplier is \* = 1/2. The Hessian of the Lagrangian is nerx)=[? ye 4] i A Tle") = {y : 2,—2y =0} = {{a,a]” a € Rh Lety € T(x"), y #0. Then, y = [a,a] for some a 0. We have y? L(zx*, \*)y = 2a? > 0. Hence, by the SOSC, the point * = [1,1] is a strict local mininizer. 19.12 a The point z* is the solution to the optimization problem ‘The tangent plane is given by me 2 minimize ye — eel subjectto Az =0 Since rank A = m, any feasible points regular. By the Lagrange multiplier theorem, there exists A* € RM such that (@* — 20)" = X°7A = 07. Postmultiplying both sides by a* and using the fact that Az* = 0, we get (w* - a)" 2* ». From part a, we have ay = ATA Premultiplying both sides by A we get Az = (AAT) from which we conclude that \* = —(AA")-' Azo. Hence, Bo + ATA" = xe — AT(AAT)~! Azo = (Ip — AT(AAT) 1933 Write f(x) = $27 Qe and h(x) = b — Az. We have Dfiz) = 27Q, ‘The Lagrange condition is 2TQ-xTA b-Az* = 0. From the first equation we get Qt Ata, ‘Multiplying both sides by A and using the second equation (constraint), we get 2 AQ ATA = 6. Since Q > 0 and A is of full rank, wecan write At = (AQ™AT)“0. Hence, 2* =Q"'AT(AQ™AT) 19.14. Clearly, we have M = R(B), ic., y €M if and only if there exists 2 € R™ such that y = Ba. Hence Lis positive semidefinite on M forallyeM, y"Ly >0 forallz€R™, (Be)"L(Ba) > 0 forallz €R", 27(B™LB)x > 0 forall ER", 27 Lyx >0 Lu 20. eoaag For positive definiteness, the same argument applies, with > replaced by >. 19.15 a. By simple manipulations, we can write 2, = 09 + abug + bus Therefore, the problem is ; Lad gud sminimize — 3(u3 + uf) subject to aap + abuy + bu Alternatively, we may use a veetor notation: writing u = [up, ui], we have minimize f(u) subject to Au) = where f(u) = 4llull?, and h(u) = aro + (ab, blu. Since the vector Vh(u) = (ab, |” is nonzero for any u, then any feasible point is regular. Therefore, by the Lagrange multiplier theorem, there exists A* € IR such that uptaad = 0 uy ead @xp + abus + buj ‘We have three linear equations in three unknowns, that upon solving yields oto ol +a)’ uj = '. 
The Hessians of f and h are F(u) = Iz (2 x 2 identity matrix) and H{(1) = O, respectively. Hence, the Hessian of the Lagrangian is L(u*, \*) = Ia, which is positive definite. Therefore, u* satisfies the SOSC, and is therefore a strict local minimizer. 7 19.16 Letting 2 = (22,1, 2)", the objective function is 2"Qz, where The linear constraint on 2 is obtained by writing 2 = 2x1 $y = 224m) +42, which can be written as Az = b, where A=[l,-2-1, b=4 Clamae! 4] -(12)7h 4 = | 4/3. -3 -1 Hence, using the method of Section 19.6, the solution is| Q7Aat(AQ7at)-! 4/3 and uj = ‘Thus, the optimal controls are uj = 20. Problems With Inequality Constraints 201 a. We first find critical points by applying the Karush-Kuhn-Tucker conditions, which are ry -2- rrr +52 = 0 1 2 1 = n (lacs) +m(enete-s) = 0 we 0. We have to check four possible combinations. ‘Case 1: (44 = 0, p22 = 0) Solving the first and second Karush-Kuhn-Tucker equations yields x!) = (1,5]", However, this point isnot feasible and is therefore nota candidate minimizer. Case 2: (1 > 0, pp = 0) We have two possible solutions al =(-0.98,48)" yl) = 2.02 2 = (-0.02,0)7 yf” = 50. Both x(®) and 2\®) satisfy the constraints, and are therefore candidate minimizers. Case 3: (j1, = 0, jz > 0) Solving the zorresponding equation yields: al = (05050,4.9505)" pf?” = 0.198 ‘The point «(is not feasible, and hence is not a candidate minimizer. Case 4: (11; > 0, 1 > 0) We have two solutions: (5) (C.732, 2.6797 xl) = (13.246, 3.986)" xl = [-2.73,37.32)7 al) = (188.8, -204]". ‘The point 2") is feasible, but 2°) is not. 18 We are left with three candidate minimizers: 2), 2°), and 2). It is easy to check that they are regular. We now check if each satisfies the second order conditions. For this, we compute Dan) = [ae 2] For x"), we have ery) = [20 9) Te = {al-0.1021, 11" :a€ R} [-0.1021, 1] € F(x) with a #0. Then TL (@, ul))y = 1.9790" > 0. Thus, by the SOSC, x!) isa strict local minimizer. For 2), we have L(2),y) = lacus 9] Ta) = {a[-4.898,1]":a€ R}. Let y = a[—4.898, 1] € P(e) with a £0. Then, yTL(2™, wy = -2347.90? < 0. Thus, °°) does not satisfy the SOSC. In fact, in this case, we have T(x) = F(x), and hence 2) does not satisfy the SONC either. We conclude that x) is not a local minimizer. We can easily check that 2") is not a local ‘maximizer either. For 2), we have sau) = [-MgoI9 9] 0 2 Te) = (0). ‘The SOSC is trivially satisfied, and therefore 2!) is a strict local minimizer. b. The Karush-Kuhn-Tucker conditions are: 2e1-j—ps = 0 2e2-I—- py = 0 -1 <0 -m <0 -m-m+5 <0 pat ~ pata + ps(—21 — 22 +5) = 0 HisHayws > 0. 1c is easy to verify that the only combination of Karush-Kuhn-Tucker multipliers resulting in a feasible point is Ha = 0, 4g > 0. For this case, we obtain 2* = [2.5,2.5]", 1" = [0,0,5]", We have = a] >o. L(e", 2") 02 Hence, x is.a strict local minimizer (in fact, the only one for this problem), 119 . The Karush-Kuhn-Tucker conditions are: 2a + 6x — 442121 +2 = 0 62, -2+21-2n = 0 ai+2m-1 < 0 2x -2s-1 < 0 wa (e} = 20a -1) + ya(2z - 222-1) = 0 Min > 0. tis easy to verify that the only combination of Karush-Kubn-Tucker multipliers resulting in a feasible point sp M2 > 0. For this case, we obtain 2* = (9/14,2/14]", yu" = [0,13/14]". We have wen) = [§ 5] Ta") {all 1] sa € R}. Let y = afl, 1] € T(w*) with a #0. Then yTL(2*,n")y = Ma? > 0. Hence, 2* is a strict local minimizer (in fact, the only one for this problem). 
20.2 ‘The Karush-Kuhn-Tucker conditions are: 2ey + 2Az1 + 2dy + 2p) = 2p + 2a + 2hz2 =H 42nd 1 wom wz} — 22) 1 < > We have two cases to consider: Case 1: (jz > 0) Substituting 2» = 2} into the third equation and combining the result withthe first two yields two possible points: 20) =[-1.618,0618)" ol) = (2.618, 0.382)" Note that the resulting j2 values violate the condition jx > 0. Hence, neither of the points are minimizers (although they are candidates for maximizers). Case 2: (11 = 0) Subtracting the second equation from the frst yields 2, = 22, which upon substituting into the third equation gives two possible points: 2 2 [1/2,1/7, 2) = [-1/2,1/2]7. Note that ("is not a feasible point, and s therefore not a candidate minimizer. ‘Therefore, the only remaining candidate is z°), with corresponding A‘) = ~1/2 (and ys = 0). We now check second order conditions. We have 12,0.) = [1 7] Pe) = {ofl,-1]7:a€ Rh 120 Lety = all, -1]" € F(@)) with a #0. Then uP L(2,0,A))y = 4a? > 0. ‘Therefore, by the SOSC, 2 is a strict local minimizer. 203. ‘The optimization problem is: minimize eZp subjectto Gp > Pem p20, ++++)Pn]®. The KKT condition for this problem where G = [gij]+€n = [1,---,1]” (with n components), and p = is \ 2 en TG — wy HT (Pem—Gp)- Wp = 0 Gp ys MosP viv 204 a. Wehave f(z) = 22 ~ (21 ~2)° +3 and g(x) = 1-2. Hence, V f(@) = [-3(z1—2)?, 1]? and Vol ‘The KKT condition is, = (0-17 wero -3(2,-2)? = 0 I= = 0 (1 = 22) 0 I-m < 0. ‘The only solution to the above conditions is 2* = [2,1], x" = 1. To check if e* is regular, we note that the constraint is active. We have Vg(:2" Hence, 2° is regular. = [0,-1]7, which is nonzero, b. We have L(a*,u*) Pee) + wate") = [9 A Hence, the point z-* satisfies the SONC. ©, Since p* > 0, we have F(x*, u") = T(a*) = {y : [0,—I]y = nonzero vectors, Hence, the SOSC does not hold at 2". 205 a. Write (2) = 22, 9(@) = —(a2 + (t1 ~ 1)? = 3). We have V(x) = [0,1]? and Vo(x) = [22 — 1), 1)". ‘The KKT conditions are: = {12 =0}, which means that F contains wz 0 (2-1) = 0 1-4 = 0 (zz +(a-1)?-3) = 0 m+ (m1? +3 < 0. From the third equation we get = 1. The second equation then gives 2; = 1, and the fourth equation gives 22 = 3. ‘Therefore, the only point that satisfies the KKT condition is e* = [1,3], with a KKT multiplier ofp 12 b. Note that the constraint z2+(2;—1)?+3 > Oisactiveate*. WehaveT(x*) and N(z*) = {y:y = (0, Iz, 2 €R: oh ¢. Wehave ool tae new)=o41[ 3 ° ie a: From part b, T(a*) = {y : yo = 0}. Therefore, for any y € T(2e"), y" L(w*,u*)y = —2y? <0, which means that ‘2* does not satisfy the SONC. 206. a. Write h(a) = 21 — 22, g(x) = ~a1. We have Df (x) = (23, 2x22], Dh(x) = (1, —1), Dg(x) that all feasible points are regular. The KKT condition is [=1, 0]. Note aedap 222-2 yy “ u-m vu Vv We first try 2; = 2} = 0 (active inecuality constraint). Substituting and manipulating, we have the solution ] = 2} = 0 with w* = 0, which is a legitimate solution. If we then try x = zj > 0 (inactive inequality constraint), ‘we find that there is no consistent solution to the KKT condition. Thus, there is only one point satisfying the KKT condition: a" = 0. . The tangent space at 2* = is given by T(0) = {y : [1,—1]y = 0, [-1, O]y = 0} = {0}. ‘Therefore, the SONC holds for the solution in part a. ar oo 7 werd=[50, 22] Hence, at * = 0, we have L(0,0,0) = O. Since the active constraint at 2° is degenerate, we have T(0,0) = {y: [1,1 0}, which is nontrivial. Hence, for any nonzero vector y € T(0,0), we have y7L(0,0,0)y = 0 # 0. Thus, the SOSC does not hold for the solution in part a. 
20.7 a. The KKT condition for the problem is: (Av-b)"A+AeT—p™ = OT we 0 uezo eTr-1 0 z>0 where e = [1,417 . A feasible point «* is regular in this problem if the vectors e, ei, i € J(x*) are linearly independent, where J(w*) = {i 27 = 0} and ¢; is the vector with 0 in all components except the ith component, which is 1 122 In this problem, al feasible points are regular. To see this, note that 0 is not feasible. Therefore, any feasible point results in the set J(z*) having fewer than n elements, which implies that the vectors e, e;, i € J(z*) are linearly independent. 208 By the KKT Theorem, there exists 1* > O such that (w*— 20) + p°Vg(2") = 0 He g(a") 0. Premultiplying both sides of the first equation by (z2* ~ a9)", we obtain he* — oll? + 1° (ae* — 0)" Vola" Since ||zx* — aol|? > 0 (because g(a) > 0) and yx* > 0, we deduce that (a:* — x9)" Vg(x*) < 0 and p* > 0. From the second KKT condition above, we conclude that g(z*) = 0. 20.9 a. By inspection, we guess the point [2,2] (drawing a picture may help) b. We write f(x) = (21 — 3)? + (t2 — 4)", u(w) = -21, gow) = —z2, go(w) = 21 — 2, gale) = 22 — 2, [91, 92,93, 94)”. The problem becomes minimize — f(«) subject to g(x) <0. We now check the SOSC for the point 2* = [2,2]. We have two active constraints: gy, 4. Regularity holds, since Voa(z*) = (1,0)? and Vou(z*) = (0,1). We have Vf(a*) = [-2,~4]". We need to find a yu" € RY, po satisfying FONC, From the condition p*? (x*) = 0, we deduce that | =o} = 0. Hence, D(a") +m"? Do(a* 07 if and only if * = (0,0,2,4]”. Now, F(a" 20) scrgey [0 0 [02] ween=[5 9] Hence Ono wermy=[5 3] which is positive definite on R?. Hence, SOSC is satisfied, and «* is a strict local mi 20.10 ‘The KKT condition is aTQ+p™A = OF wM(Ac-b) = 0 n2o0 At-b <0. Postmultiplying the first equation by a gives 2 Qe +m" Ax = u7b. Hence, 27Qz+uTb=0. We note from the second equation that 1” As Since Q > 0, the first term is nonnegative. Also, the second term is nonnegative because 1 > 0 and b > 0. Hence, we conclude that both terms must be zero, Because Q > 0, we must have « = 0. Aside: Actually, we can deduce that the only solution to the KKT condition must be 0, as follows. The problem is convex; thus, the only points satisfying che KKT condition are global minimizers. However, we see that 0 is a feasible 123 point, and is the only point for which the objective function value is 0. Further, the objective function is bounded below by 0. Hence, 0 is the only global minimizer. 20.1 Let a* bea solution. Since A is of full rank, z* is regular. The KKT Theorem states that 2" satisfies: nw > 0 Tera 0 wT Aa” = 0 If we postmuliply the second equation by e* and subtract the third from the result, we get oat 20.12 . a. We can write the LP as minimize f(x) subject to A(z) =0, (2) <0, where f(x) = 67, h(x) = Ax—b, andg(x) = —z. Thus, wehave D(z) = e", Dh(w) = A.and Dg (x) ‘The Karush-Kuhn-Tucker conditions for te above problem have the form: if «isa local minimizer, then there exists * and y2* such that wi 20 Pata? = oF wet = 0. b. Leta* be an optimal feasible solution. Then, z* satisfies the Karush-Kuhn-Tucker conditions listed in part a, Since 1" > 0, then from the second condition in part a, we obtain (~A")"A < e”, Hence, X= —A" is a feasible solution to the dual (see Chapter 17). Postmultiplying the second condition in part a by ze*, we have which gives Hence, X achieves the same objective function value for the dual as r* for the primal. . From part a, we have *? =e — XA. 
Substituting this into 4*?x* = 0 yields the desired result 20.13 By definition of J(2*), we have gi(a*) < 0 for all i ¢ J(z*). Since by assumption g, is continuous for all i, there exists © > 0 such that gi(ar) <0 for all i ¢ J(w*) and all « in the set B= {a : lle ~ al] 0. giving sn (af — 22-4) =0, pole — 21-2) =0. ‘The vector 4 has two components; therefore, we try four different cases. Case 1: (4, > 0, 42 > 0) We have a}-m-4=0, mom -2=0. (-2,0]7 and ze) = [3,5]. For 2, the two FONC equations give jy = jg and 4/5. This isnot a legitimate solution since we require yz > 0. For 2), 10 and 3(2 + 22) = pra, which yield x = (~16/5, 66/5]. Again, this is We obtain two solutions: (1) =2(2-+ 2p) = 1, which yield ps the two FONC equations give jr — pia not a legitimate solution, Case 2: (14: = 0, 2 > 0) We have -2, 22-21 -2=0, n 3 % Hence, 21 = aa, and thus x = (1,1) 12 = —2. This is nota legitimate solution since we require js > 0. Case 3: (1 > 0, po = 0) We have a}-m-4=0, n - and agzin we don't have a legitimate solution. 0) We have 2, = 2 = 0, and all constraints are inactive. This a legitimate candidate for the minimizer. We now apply the SOSC. Note that since the candidate is an interior point of the constraint set, the SOSC for the problem is equivalent to the SOSC for unconstrained optimization. The Hessian matrix D? f(z) = diag{2, 2] is symmetric and positive definite. Hence, by the SOSC, the point ze" = (0, 0] is the strict local minimizer (in fact, it is easy to see that itis a global minimizer). 20.15 Write f(a) =a, +23 +4, (a) = x ~ 10, andg = [gr,g2]”. We have Vj (@) = [2z1,222]", Volz) (1,0), D84(2) = diag(2,2], D%gi (x) = diag(0,2), and D¥ga(z) = O. We compute V F(x) + aT V(x) = [2 — pr + pa, 202 + Qyrza]”. ‘We use the FONC to find critical points. Rewriting V (2) + u7Vg(z) = 0, we obtain Mia zo Since we require 1 > 0, we deduce tha: 22 = 0. Using #7 g(x) = 0 gives ®; 2(1 +) = 0. wi(—z1 +4) Me We are left with two cases. Case 1: (141 > 0, jz = 0) We have 2; +4 = 0, and sx: = 8, which is a legitimate candidate. Case 2: (11 = 0, pz = 0) We have 2 = 2 = 0, which isnot a legitimate candidate, since itis nota feasible point. ‘We now apply SOSC to our candidate = [4, 0)", 4 = [8,0]. Now, casorinori=[3 SJool6 s}= [8 8 which is positive definite on all of R?. The point [4, 0]” is clearly regular. Hence, by the SOSC, x* = [4,0] isa strict local minimizer. 125 ka — 21, 93(z) = —3a2—a andg = (91,92, 93)". We have f+}, g1(@) = -21—2}+4, 9(2) (-1,3]", Voa(z) = [-1,-3]", D?f(e) = diagl2,2), [221, 2x2)", Vn (a) = [-1, 2a.) va D'gu(2#) = diag(0,~2), and D(x) = D?ga(e) = From the figure, we see that the two candidates are a= (3, 1] and 2’) = (3, -1). Both points are easily verified to be regular For 2"), we have jag = 0. Now, Df (ae) + p7 Dg (wl) = (6 — pr — p22 — 2p + Spe] = which yields py = 4, op = 2. Now, T(a) the SOSC, 2") is a strict local minimizer. For 21), we have pn = 0. Now, Dj(a) +p? Dg(a")) = which yields ju = 4, iy = 2. Now, again we have F(a) = {0}. Therefore, any matrix is positive definite on T(a")), Hence, by the SOSC, x") is a strict local minimizer. 20.17 ‘The KKT condition for the problem is r {0}. Therefore, any matrix is positive definite on T(x”). Hence, by [6 — pr — wa, —2 + 2p — Spa] = 07, wee 0 e+Manpt = wee 0, Premultiplying the second KKT condition above by 1°" and using the third condition, we get ale ‘Also, premultiplying the second KKT condition above by *T and using the feasibility condition a7x* = b, we get Mata = <0. We conclude that jz" = 0. 
For if not, the equation A**7'a = |ys*||? implies that x*7.a < 0, which contradicts wo > Oanda > 0. Rewriting the second KKT condition with 1 O yields Using the feasibility condition a7 2* = b, we get owe lal? 20.18. a. Suppose (23)?+ (23)? < 1. Then, the point" = [x], 23]” lies in the interior ofthe constraint set 9 = {a : lll]? < 1}. Hence, by the FONC for unconstrained optimization, we have that Vf(z*) = 0. where f(a) = far — [0,8] |P is the objective function. Now, V (ar) = 2(2" ~ [a,0]*) = 0, which implies that 2* = [a,]” which violates the assumption (2)? + (x3)? < 1. note that if we write the constraint as g(22) = [2-1 < 0, . Hence, by the Karush-Kuhn-Tucker theorem, there exists b. First, we need to show that x* isa regular point. Forth then Vo(2*) = 20° # 0. Therefore, 2* isa regular p HER ye O, such that Vfl") + nVg(z") = 0, wali) 126 which gives Hence, 2” is unique, and we can writezx} = aa, 3 = ab, where a = 1/(1+ 1) > 0. . Using part b and the fact that |la*l] = 1, we get lla*||? = a2i)[a,0]"||? = 1, which gives a = 1/I\[,8)"| 1/var +B. 20.19 a. The Karush-Kuhn-Tucker conditions for this problem are 2aj +p" explei) = 0 2a} +1) —p" 0 H'(exp(ei) —23) = 0 expe) exp(z{) > 0, then jp" > 0. Hence, by the third equation in part a, we obtain 23 = exp($). ©. Since y* = 2(z3 + 1) = 2(exp(2}) + 1). then by the first equation in part a, we have 2a} + 2(exp(2i) + 1) exp(zj) = 0 which implies ~(exp(2z}) + exp(z})). Since exp(z{),exp(22}) > 0, then 2} < 0, and hence exp(z}),exp(22t) < 1. Therefore. 2} > —2. 20.20. a. We rewrite the problem as minimize f(z) subject to g(a) <0, where f(a) = ex and g(r) = }[lxl|* ~ 1. Hence, Vf(2) = cand Va(z) = 2. Note that 2* #0 (for otherwise it ‘would not be feasible), and therefore itis a regular point. By the KKT theorem, there exists u* < Ouch that © = y%=* and j"9(z*) = 0. Since ¢ # 0, we must have p* # 0. Therefore, g(z*) = 0, which implies that ll2"|[? = 2 we havea = 27a. +8, b. From part a, we have a[el|? = 2. Since |lel|? = Tofind ¢, we use and thus 1" = —2. Hence, 2021 ‘We can represent the equivalent problem as minimize f(a) subject to g(«) <0, where g(x) = }|h(2)|/?. Note that Vola) = Da(x)TA( ‘Therefore, the KKT condition is: uw > 0 Vs (w*) +e" Dh(x*)" h(x") 0 a'Ia(@*)I| = 0. 127 Note that for a feasible point «”, we have h(x") = 0. Therefore, the KKT condition becomes Note that Vg(a) = 0. Therefore, any feasible point 2* is not regular. Hence, the KKT theorem cannot be applied in this case, This should be clear, since obviously V f(2*) = 0 is not necessary for optimality in general. 20.22 ‘The given problem is equivalent to the problem given in the hint minimize 2 subject to fi(z)-2 $0, 0=1,2 ‘Suppose [2*", 2*]" isa local minimizer for the above problem (which is equivalent to a* being a local minimizer to for the original problem). Then, by the KKT Theorem, there exists 4° > 0, where ja" R2, such that ora [EIN = oF “[Aei-z] = Rewriting the first equation above, we get HVA(t") + u3Vfale")=0, wp +h Rewriting the second equation, we get =12 Hi(flw") - 2°) = 0, Suppose fi(z*) < max{fi(e*), fo(z*)), where i € {1,2}. Then, 2* > fi(z*). Hence, by the above equation we conclude that jz? = 0. 21. Convex Optimization Problems Wa We have o(a) Flat ad)" Qa + ad) ~ (2 + ad)b (d" Qd)a? + d™ (Qe - bla + (je"@e + 27») This is a quadratic function of a. Since Q > 0, then £6 =d7Qd>0 and hence by Theorem 21.4, is stritly convex. 212 Write f(a) = 2" Qe, where 1fo 1 a=s[t ag): Let z,y €M. Then, x = [a:,may]” andy = [az,maz]” for some a1, a2 € R. 
By Proposition 21.1, itis enough to show that (y ~ a)"Q(y — «) > 0. By substitution, (y~ 2)" Qly ~ 2) = m(az ~ a1)? > 0, 128 which completes the proof. 213. Let @,y € Mand a € (0,1). Then, Ale) = h(y Alaz + (1 —a)y) = ah(z) + (1 - aay) ¢. By convexity of 2, h(azz + (1 — a)y) = c. Therefore, and so h is convex over . We also have ~h(aw + (1— ay) = a(~A(@)) + (1 - a)(—A(y)), which shows that ~h is convex, and thus h is concave. 214 =>: This is true by definition, 4: Let d € R” be given. We wantto show that d”Qd > 0. Now, fix some vector « € (2. Since is open, there exists « 0 such that y = ~ ad € Nl. By assumption, 9<(y—2)"Qy-2) =a"d"Qa which implies that d” Qa > 0. 2s Yes, the problem is a convex optimization problem. First we show that the objective function f (2) = }| Az ~ ll? is convex. We write fe feT(AT Ale = (67 A)z + constant which is a quadratic function with Hessian A” A. Since the Hessian A” A is positive semidefinite, the objective function f is convex. [Next we show that the constraint set is convex. Consider two feasible points 2 and y, and let A € (0,1). Then, and y satisfy ez = 1,2 > 0 and e”y = 1,y > 0, respectively. We have eT (Ax + (1-A)y) = Ae?x + (1- A)e™y = A+ (1-A) =1. Moreover, each component of Aa + (1 — A)y is given by Az: + (1 — A)ys, which is nonnegative because every term here is nonnegative. Hence, Ax + (1 —)y isa feasible point, which shows that the constraint set is convex. 21.6 We need to show that 9 is a convex set. and f is a convex function on ©. To show that 1 isa convex set, we need to show that for any y, z € @ anda € (0,1), we have ay + (I—a)z € 2. Let y,2 € © and a € (0,1). Thus, ys = v2 2 Oand = = 22 > 0. Hence, ay + (1 a)21 ay: + (1—a)z2 2 =ay +(1-a)z [ Now, a + (1-a)z2 = 22, a1 =a + (1-9) and since a,1 ~ a > 0, 20. Hence, x € 92 and therefore 2 is convex. To show that f is convex on 1, we need to show that for any y,2 € Mand a € (0,1), f(ay + (1 -a)z) < af(y) + (1 — a) f(z). Let y, 2 € 9 and a € [0,1]. Thus, yr = y2 > O and 2 = 22 > 0, so that f(y) = y} and F(z) = 2}. Also, a? < a and (1 — a) < (1a). We have Flay +(1-a)2) = (am +(1-a)a)® ay} + (1 ~ @)2} + 3023(1 —a)yn + 30x (1 ~ 0)*y? ayf + (1 - a)2} + max(ys, 21)(a* ~ a + (1 — @)* - (1a) +3a*(1- a) + 3a(1 - @)?) ay} + (1-a)zf af(y) + (=a) $2). 129 wu Hence, f is convex. 27 Since the problem is a convex optimization problem, we know for sure that any point of the form ay + (1 — @)z, @ € (0,1), isa global minimizer. However, any other point may or may not be a minimizer. Hence, the largest set of Points GC £2 for which we can be sure that every point in Gis a global minimizer, is given by @={ayt(1-a)z:0 f@")- YO njat(e sera") Observe that foreach j € J(a), ofa" +5, and foreach & € 2, afz+b,>0 Hence, foreach j € J(z*), al (e— 2") >0. - D wale sella") Since 1 < 0, we get f(a) 2 fe" ) > f(z") and the proof is completed, 21.10 a Let = (ce Ria’ ‘Q,and A € [0,1]. Then, a? x; > band a?» > b. Therefore, > ohaie data, +(1-A)aT a2 Ab+(L—A)b a7 (Aa) + (1 ~ d)@2) Bw 30 which means that Ax, + (1 —A)zr2 € 0. Hence, 9 is a convex set », Rewrite the problem as, minimize f(x) subject to g(x) <0 where f(z2) = lla? and g(x) = b- a7 ¢. Now, Vg(«) = a # 0. Therefore, any feasible point is regular. By the Karush-Kuhn-Tucker theorem, there exists 1" > O such that, 0 wb-aTa") = 0. Since 2 is a feasible point, then z* # 0. Therefore, by the first equation, we see that u* 4 0. The second equation then implies that b— a7 x* = 0. c. By the first Karush-Kuhn-Tucker equation, we have 2* = :*a/2. 
Since a7 z* and therefore u* = 26/l\al|2. Since «* = 1*a/2 then «is uniquely given by a 2.1 a. Let f(c) = eT and @ = {a : © > O}. Suppose zy € 2, anda € (0,1). Then, zy > 0. Hence, az + (1 a)y > 0, which means az + (1 -a)y € 2. Furthermore, then p*a?a/2 = abn = /Ilall?. Tax + (1—a)y) acs + (1-a)e"y. Therefore, fis convex. Hence, the problem is a convex programming problem. b. =: We use contraposition. Suppose ey < 0 for some i. Let d = [0,..-1,---,0), where 1 appears in the ith component, Clearly dis a feasible direction for any point 2* > 0. However, d?V f(x") = de = ¢; < 0. Therefore, the FONC does not hold, and any point 2* > 0 cannot be @ minimizer. =: Suppose ¢ > 0. Let * = 0, and da feasible direction at 2*. Then, d > 0. Hence, d™Vj(x*) > 0. ‘Therefore, by Theorem 21.6, 2" is a solution. The above also proves that if a solution exists, then 0 isa solution. ©. Write g(x) = ~2 so thatthe constraint can be expressed as g(2#) <0. =: We have Dg(z) = —I, which has full rank. Therefore, any point is regular. Suppose a solu Then, by the KKT theorem, there exists" > 0 such that e? — x*? = OT and pe*T 2 4: Suppose ¢ > 0. Let 2* = 0 and yo" conditior . Hw >0. . Then, w* > 0, e7 = wT = 07, and y*T2* = 0, ie, the KKT satisfied. By part a, «* is a solution to the problem. ‘The above also proves that if a solution exists, then 0 is a solution. 2.12 a. The standard form problem minimize! subjectto Ag which can be written as minimize f(a) subject to A(x) = 0, g(a) $0, where f(ze) = 6? x, h(x) = Az—b, and g(x) = —2. Thus, wehave Df(z) =e", Dh(w) = A,and g(a) = -. ‘The Karush-Kuhn-Tucker conditions for the above problem has the form: wo > 0 cT4NTA-pwT = OT wet = 0. 1B b. ‘The Karush-Kuhn-Tucker conditions are sufficient for optimality in this case because the problem is a convex ‘optimization problem, i-c., the objective function is a convex function, and the feasible set is a convex set. . The dual problem is maximize Tb subject to ATA << T c. Let wa NTA ‘Since * is feasible for the dual, we have u* > 0. Rewriting the above equation, we get ch -NTA-p =00. ‘The Complementary Slackness conditior (e7 — \*7A)z" = 0 can be written as "Ta" = 0, Therefore, the Karush-Kuhn-Tucker conditions hold. By part b, «* is optimal. 1.13 1, We first show that the set of probability vectors = (GER ig ted =1, W> 0, = 1-0} is a convex set. Let y,z € 50 yr too + Yn = Ly ye > 0,21 bees + zy = Land 2 > 0. Let a € (0,1) and @ = ay +(1~a)z. Wehave Byte tan = ayy +(1-a)zy +--+ +04 + (1-a)z_ ayy beet un) + (1 a)(ar +--+ Zn) a+(1-a) L Also, because y; > 0, 2; > 0, a > 0, and 1 ~ a > 0, we conclude that 2; > 0. Thus, # € 9, which shows that 2 is b. We next show that the function f is a convex function on ©. For this, we compute a 0 F(a) = which shows that F(q) > 0 for all qin te open set {q : qi > 0, 1 = L,...yn}, which contains 2. Therefore, f is convex on 2. ©. Fix a probability vector p. Consider the optimization problem nine pig) +--+ rt (22) subject to xy +e++aq=1 >0=1.yn By parts a and b the problem is a convex optimization problem. We ignore the constraint 5 > O and write down the Lagrange conditions forthe equality-constraint problem: Bitte = 1 Rewrite the first set of equations as 2} = p,/A*. Combining this with the constraint andthe fact that p +--+ Pa we obtain A* = 1, which means that 2} = p;. Therefore, the unique global minimizer is 2* Note that f(x*) = 0. Hence, we conclude that f(x) > 0 for all « € 1. Moreover, f(x) ‘This proves the required result. 4. 
Given two probability vector panda, the number Dona) = nitog (22) +--+ pang (22) % iff and only if.2* is called the relative entropy (or Kullback-Liebler divergence) between p and q. It is used in information theory to measure the “distance” between two probability vectors. The result of part c justifies the use of D as a measure of “distance” (although D is not a metric secause itis not symmetric). 2.14 a. Let A € RY" and B © R™ be symmetric and A > 0, B > 0. Fix a € (0,1), @ © R®, and let C @A+(1~a)B. Then, Cz = 2"aA+(1~0)Blz = a2" Ax +(1-a)2™ Br. Since e” Ax > 0,27 Ba > 0, anda, (1a) > Oby assumption, then 2Czx > 0, which proves the required result. ', We first show that the constraint set 2 = {a : Fy + D7, 2jF; > O} is convex. So, let 2,y € Wanda € (0,1). Letz =a + (1—a)y. Then, Pot Dak) = Fo+ Daz) + ayslPs = PotaScaF; + (1-0) oyF; a = alFo+ Sarl +(1-a)[Fo+ Suri By assumption, we have Fot+doaF; > 0 = Po+ uF; Vv By parte, we conclude that Fot Fj 20, which implies that z € 2. ‘To show that the objective function f(z) = ez is convex on 2, let, y € Mand a € (0, 1). Then, Faw+(1-a)y) = Tax +(1-a)y) = ac’z+(1-a)ety af(2) + (1-a)f(y) which shows that fis convex. «. The objective function is already in the required form. To rewrite the constraint, et aj be the (i, )th entry of A, 1,...5m,j = 1,-...n. Then, the constraint Az > b can be writen as iam tage tt dintn 2b aE 133 Now form the diagonal matrices Fo = diag{-bi,...;—bm)} diag{aiy,--- ama}; a u Note that a diagonal matrix is positive semidefinte if and only if every diagonal element is nonnegative. Hence, the constraint Az > 6 can be written as Fy -+ Sc", 2); > 0. ‘The left hand side is a diagonal matrix, and the ith diagonal element i simply ~B; + aj,21 + @4222 +" + Ginn 21.15 - 1. We rewrite the problem into a minimization problem by multiplying the objective function by —1. Thus, the new objective function is the sum of the functions —U;. Because each U; is concave, —U; is convex, and hence their sum ‘To show that the constraint set @ = {a : eT < C} (where € 2 [0,1]. Then, e?ay < Cand e? x, 0. Hence, d7V f(a") > 0, which shows that the FONC holds. Because Vg(se") = —[0, 1] (so «" is regular), we see that if" = 1, then V(x") + pV g(a") KKT condition holds. Because F(x") = O, the SONC for se: constraint 9 holds. However, and so the 0 reweore[? fo and T(z") = {y : y2 = 0}, which shows that the SONC for inequality constraint g(2-) < 0 does not hold. 22. Algorithms for Constrained Optimization 134 224 By definition of TI, we have Mizo+y] = argmin lle zen arg min [I — 0) ~ ull (wo +) By Exercise 6.4, we can write argimin ||(x — ao) — yl| = 20 + arg min ||z — yl] en zeNiA) The term arg min, yy) [Iz — ull is simply the orthogonal projection of y onto N’(A). By Exercise 6.4, we have argmin lz - ull = Py, FEN (A) i where P = I~ AT(AA™)~"A. Hence, Tzo + y) = 20+ Py. 2.2 Since ax > 0 is a minimizer of x(a) = f(e™ — aPg™), we apply the FONC to 44 (a) to obtain #4(a) = (2 — aPg))7Q(-Pg™) - 67(—Pg™). ifag)T PQP¢*) = (2TQ ~ 6") Pg), But Therefore, g(a) = 20TQ— pF = g, __ oP Pg® Oh = THF PQPG® 223. By Exercise 22.2, the projected steepest descent algorithm applied to this problem takes the form oi) 2g) pyl) = (In-P)o®) AT(AAT)“ Az), Ifa) € (w : Ax = 6}, then Aa’ =, and hence 0 = AT(AAT)"16 which solves the problem (see Section 12.3). 22.4. a. Define x(a) = fz ~aPVf(2)) By the Chain Rule, dla) = Pa) = -(0/(al ~ a PV se)" PVHa™) (fle) PVs (a) and thus 997 Pg = 0, 13s Since o minimizes $e, #4, (a%) = b. 
Wehave 2(t#1) — 2(8) = —ay Pg) and (t+2) — (841) = ayy Pg"), Therefore, (ol) gltH)T Galt 2) = agyreng*)? PT Pg” = agurag Pg 0 by part a, and the fact that P = P? = P* 22.5 Using the penalty method, we construct th: unconstrained problem minimize x + >(max(a ~ 2,0)? “To find the solution to the above problem, we use the FONC. It is easy to see that the solution 2* satisfies 2* < a. ‘The derivative of the above objective function in the region x < a is 1 + 27(ar ~ a). Thus, by the FONC, we have z* = a—1/(27). Since the true solution is ata, the difference is 1/(27). Therefore, for 1/(27) < e, we need 7 1/(2e). The smallest such 7 is 1/(2e}. 2.6 a. Wehave 1 (2 sllell + le - IP = ar[it2y 2% | _ ar [2 lees 1427]*7* [ay] +7 ‘The above is a quadratic with positive defnite Hessian. Therefore, the minimizer is, = L427 2y Yo fay 2y 127] |27 ~ aster]: Hence, ‘The solution to the original constrained problem is (see Section 12.3) + AM AAT) = Bf? a = AMAAT b= 51) ». We represent the objective function of the associated unconstrained problem as Zilel? + alla — bl? = $27 (Ip +2yA7A) 2 27 (27A76) + 707. 2 ‘The above is a quadratic with positive definite Hessian. Therefore, the minimizer is ay = (In+27A7A) (2747) (Er + ava) AT. oa LetA=U[S 0] V7 be the singular value decomposition of A. For simplicity, denote © = 1/2. We have (ete a") At (n+ Vv [3] utu[s ov") “at ) AT " & : x ° ° < S Note that where (¢Zn + S*)~ is diagonal. Hence, = Vee uw] [3]e" a = (ent+a™A) A" - vfs ayy? 5 v [5] mes u' - viS]on 2)-ayt = v[S] tv tn +5*)0' = ATU Clg + 5°)1UT Note that as 7 + 00, € > 0, and U(eIm + S?)"1UT + U(S?)"'UT, But, U(s*)""UT = (Us*uT) * = (AAT). Therefore, wy AT(AAT)“ 137

You might also like