Solution Optimization 2ed
xcurr*uncert,
  xold=xcurr;
  xcurr=xnew;
  g_old=g_curr;
  g_curr=feval(g,xcurr);
  xnew=(g_curr*xold-g_old*xcurr)/(g_curr-g_old);
end %while
%print out solution and value of g(x)
if nargout >= 1
  if nargout == 2
    v=feval(g,xnew);
  end
else
  final_point=xnew
  value=feval(g,xnew)
end %if
b. We get a solution of x = 0.0039671, with corresponding value g(x) = -9.908 x 10-*.
function alpha=linesearch_secant(grad,x,d)
%Line search using secant method
epsilon=10^(-4); %line search tolerance
max = 100; %maximum number of iterations
alpha_curr=0;
alpha=0.001;
dphi_zero=feval(grad,x)'*d;
dphi_curr=dphi_zero;
i=0; %iteration counter
while abs(dphi_curr)>epsilon*abs(dphi_zero),
  alpha_old=alpha_curr;
  alpha_curr=alpha;
  dphi_old=dphi_curr;
  dphi_curr=feval(grad,x+alpha_curr*d)'*d;
  alpha=(dphi_curr*alpha_old-dphi_old*alpha_curr)/(dphi_curr-dphi_old);
  i=i+1;
  if (i >= max) & (abs(dphi_curr)>epsilon*abs(dphi_zero)),
    disp('Line search terminating with number of iterations:');
    disp(i);
    break;
  end
end %while
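As a rough illustration of the secant update used in the routine above, the sketch below runs the same iteration inline on a small quadratic. The matrix `Q`, vector `b`, starting point, and search direction are made-up test data, not part of the exercise. For a quadratic, the derivative of the line-search function is linear in the step size, so the secant step should land on the exact minimizer along `d`.

```matlab
% Hypothetical test problem (assumption): f(x) = 0.5*x'*Q*x - b'*x
Q = [4 1; 1 3];
b = [1; 2];
grad = @(x) Q*x - b;          % gradient of f
x = [2; 1];                   % assumed starting point
d = -grad(x);                 % steepest-descent direction

% Secant iteration on phi'(alpha) = grad(x + alpha*d)'*d
epsilon = 1e-4;
alpha_curr = 0;
alpha = 0.001;
dphi_zero = grad(x)'*d;
dphi_curr = dphi_zero;
i = 0;
while abs(dphi_curr) > epsilon*abs(dphi_zero) && i < 100
    alpha_old = alpha_curr;
    alpha_curr = alpha;
    dphi_old = dphi_curr;
    dphi_curr = grad(x + alpha_curr*d)'*d;
    alpha = (dphi_curr*alpha_old - dphi_old*alpha_curr)/(dphi_curr - dphi_old);
    i = i + 1;
end

% Closed-form minimizer along d, available only because f is quadratic
alpha_exact = -(grad(x)'*d)/(d'*Q*d);
fprintf('secant alpha = %.6f, exact alpha = %.6f\n', alpha, alpha_exact);
```

The loop cap of 100 mirrors the `max` setting in the routine; on non-quadratic functions more iterations would typically be needed.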
8. Gradient Methods
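8.1
Let $s$ be the order of convergence of $\{x^{(k)}\}$. Suppose there exists $c > 0$ such that for all $k$ sufficiently large,
$$\|x^{(k+1)} - x^*\| \geq c\,\|x^{(k)} - x^*\|^p.$$
Hence, for all $k$ sufficiently large,
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^s} = \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^p} \cdot \frac{1}{\|x^{(k)} - x^*\|^{s-p}} \geq \frac{c}{\|x^{(k)} - x^*\|^{s-p}}.$$
Taking limits yields
$$\lim_{k\to\infty} \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^s} \geq \frac{c}{\lim_{k\to\infty} \|x^{(k)} - x^*\|^{s-p}}.$$
Since by definition $s$ is the order of convergence,
$$\lim_{k\to\infty} \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^s} < \infty.$$
Combining the above two inequalities, we get
$$\frac{c}{\lim_{k\to\infty} \|x^{(k)} - x^*\|^{s-p}} < \infty.$$
Therefore, since $\lim_{k\to\infty} \|x^{(k)} - x^*\| = 0$, we conclude that $s \leq p$, i.e., the order of convergence is at most $p$.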
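8.2
We use contradiction. Suppose $x^{(k)} \to x^*$ and
$$\lim_{k\to\infty} \frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^p} > 0$$
for some $p < 1$. We may assume that $x^{(k)} \neq x^*$ for an infinite number of $k$ (for otherwise, by convention, the ratio above is eventually 0). Fix $\varepsilon > 0$ smaller than this limit. Then, there exists $K_1$ such that for all $k \geq K_1$,
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|^p} > \varepsilon.$$
Dividing both sides by $\|x^{(k)} - x^*\|^{1-p}$, we obtain
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|} > \frac{\varepsilon}{\|x^{(k)} - x^*\|^{1-p}}.$$
Because $x^{(k)} \to x^*$ and $p < 1$, we have $\|x^{(k)} - x^*\|^{1-p} \to 0$. Hence, there exists $K_2$ such that for all $k \geq K_2$, $\|x^{(k)} - x^*\|^{1-p} < \varepsilon$. Combining this inequality with the previous one yields
$$\frac{\|x^{(k+1)} - x^*\|}{\|x^{(k)} - x^*\|} > 1$$
for all $k \geq \max(K_1, K_2)$; i.e.,
$$\|x^{(k+1)} - x^*\| > \|x^{(k)} - x^*\|,$$
which contradicts the assumption that $x^{(k)} \to x^*$.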
8.3
We have $u_{k+1} = (1 - \rho)u_k$, and $u_k \to 0$. Therefore,
$$\lim_{k\to\infty} \frac{|u_{k+1}|}{|u_k|} = 1 - \rho > 0,$$
and thus the order of convergence is 1.
8.4
a. The value of $x^*$ (in terms of $a$, $b$, and $c$) that minimizes $f$ is $x^* = b/a$.
b. We have $f'(x) = ax - b$. Therefore, the recursive equation for the DDS algorithm is
$$x^{(k+1)} = x^{(k)} - \alpha(ax^{(k)} - b) = (1 - \alpha a)x^{(k)} + \alpha b.$$
c. Let $\hat{x} = \lim_{k\to\infty} x^{(k)}$. Taking limits of both sides of $x^{(k+1)} = x^{(k)} - \alpha(ax^{(k)} - b)$ (from part b), we get
$$\hat{x} = \hat{x} - \alpha(a\hat{x} - b).$$
Hence, we get $\hat{x} = b/a = x^*$.
d. To find the order of convergence, we compute
$$\frac{|x^{(k+1)} - b/a|}{|x^{(k)} - b/a|^p} = \frac{|(1 - \alpha a)x^{(k)} + \alpha b - b/a|}{|x^{(k)} - b/a|^p} = \frac{|(1 - \alpha a)x^{(k)} - (1 - \alpha a)b/a|}{|x^{(k)} - b/a|^p} = |1 - \alpha a|\,|x^{(k)} - b/a|^{1-p}.$$
Let $z^{(k)} = |1 - \alpha a|\,|x^{(k)} - b/a|^{1-p}$. Note that $z^{(k)}$ converges to a finite nonzero number if and only if $p = 1$ (if $p < 1$, then $z^{(k)} \to 0$, and if $p > 1$, then $z^{(k)} \to \infty$). Therefore, the order of convergence of $\{x^{(k)}\}$ is 1.
e. Let $y^{(k)} = |x^{(k)} - b/a|$. From part d, after some manipulation we obtain
$$y^{(k+1)} = |1 - \alpha a|\,y^{(k)} = |1 - \alpha a|^{k+1} y^{(0)}.$$
The sequence $\{x^{(k)}\}$ converges (to $b/a$) if and only if $y^{(k)} \to 0$. This holds if and only if $|1 - \alpha a| < 1$, which is equivalent to $0 < \alpha < 2/a$.
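The step-size condition in part e can be checked numerically. The sketch below is only an illustration: the values of `a`, `b`, and the starting point are arbitrary assumptions, and the objective is taken to be a quadratic with derivative $ax - b$, which is what part b implies. Two step sizes lie inside $0 < \alpha < 2/a$ and one lies outside.

```matlab
% Hypothetical values (assumptions): any f with f'(x) = a*x - b, a > 0
a = 2; b = 3;
xstar = b/a;
alphas = [0.5, 0.9, 1.1];      % here 2/a = 1, so the last value is outside the range
x0 = 10;                       % assumed starting point
final_err = zeros(size(alphas));
for j = 1:length(alphas)
    x = x0;
    for k = 1:50
        x = x - alphas(j)*(a*x - b);   % x^(k+1) = (1 - alpha*a)*x^(k) + alpha*b
    end
    final_err(j) = abs(x - xstar);
    fprintf('alpha = %.2f: |1 - alpha*a| = %.2f, error after 50 steps = %.3e\n', ...
        alphas(j), abs(1 - alphas(j)*a), final_err(j));
end
```

With these numbers the contraction factor $|1 - \alpha a|$ is 0, 0.8, and 1.2 respectively, so the first two runs shrink the error and the third grows it geometrically.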
8.5
We rewrite $f$ as $f(x) = \frac{1}{2}x^T Q x - b^T x$, where
$$Q = \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix}.$$
The characteristic polynomial of $Q$ is $\lambda^2 - 12\lambda + 20$. Hence, the eigenvalues of $Q$ are 2 and 10. Therefore, the largest range of values of $\alpha$ for which the algorithm is globally convergent is $0 < \alpha < 2/10$.
... if and only if $f(x_k) \to 0$. Hence, the algorithm is globally convergent if and only if $f(x_k) \to 0$ for
any zp. From part a, we deduce that f{r,) —+ 0 for any zo if and only if []j2.g(1 — ae)” = 0. Because 0 0, which means that f(24s1) < f(x) if re # 1 for k > 0. This implies that the algorithm has the
descent property (for k > 0).
Se
Son aat (Set Foet) - ot (2-1) co
Since “4 > 0 for all k > 0, we can apply the theorem given in class to deduce that the algorithm is not globally
convergent.
8.10
We have
bea SOF)
By Taylor's Theorem,
$$f'(x^{(k)}) = f'(x^*) + f''(x^*)(x^{(k)} - x^*) + O(|x^{(k)} - x^*|^2).$$
Since $f'(x^*) = 0$ by the FONC, we get
oC
lle —
— Fa) Fe"
Combining the above with the first equation, we get
$$|x^{(k+1)} - x^*| = O(|x^{(k)} - x^*|^2),$$
which implies that the order of convergence is at least 2.
8.1
a. We have
$$f(x) = \|Ax - b\|^2 = (Ax - b)^T(Ax - b) = (x^T A^T - b^T)(Ax - b) = x^T(A^T A)x - 2(A^T b)^T x + b^T b,$$
which is a quadratic function. The gradient is given by $\nabla f(x) = 2(A^T A)x - 2(A^T b)$ and the Hessian is given by $F(x) = 2(A^T A)$.
b. The fixed step size gradient algorithm for solving the above optimization problem is given by
$$x^{(k+1)} = x^{(k)} - \alpha\left(2(A^T A)x^{(k)} - 2A^T b\right) = x^{(k)} - 2\alpha A^T(Ax^{(k)} - b).$$
c. The largest range of values for $\alpha$ such that the algorithm in part b converges to the solution of the problem is given by
$$0 < \alpha < \frac{2}{\lambda_{\max}(2A^T A)}.$$
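A small numerical sketch of the iteration in part b follows. The matrix `A` and vector `b` are invented test data (the exercise's own data are not reproduced here), and the step size is chosen inside the range from part c, using the fact that $2/\lambda_{\max}(2A^T A) = 1/\lambda_{\max}(A^T A)$. The result is compared with MATLAB's least-squares solve.

```matlab
% Hypothetical data (assumption, not from the exercise)
A = [1 2; 3 4; 5 6];
b = [1; 2; 3];
lam_max = max(eig(A'*A));
alpha = 0.9/lam_max;              % inside 0 < alpha < 1/lambda_max(A'*A)
x = zeros(2,1);                   % assumed starting point
for k = 1:20000
    x = x - 2*alpha*A'*(A*x - b); % fixed step size gradient update from part b
end
x_ls = A\b;                       % least-squares solution for comparison
fprintf('||x - x_ls|| = %.3e\n', norm(x - x_ls));
```

Many iterations are used because the smallest eigenvalue of `A'*A` is tiny for this particular matrix, which makes a fixed-step gradient method slow; a better-conditioned `A` would converge far sooner.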
O 1, we conclude that the algorithm is not globally monotone.
b. Note that the algorithm is identical to a fixed step size gradient algorithm applied to a quadratic with Hessian $A$. The eigenvalues of $A$ are 1 and 5. Therefore, the largest range of values of $\alpha$ for which the algorithm is globally convergent is $0 < \alpha < 2/5$.
... $> 0$, and by Lemma 8.2,
Amin(Q)
ma na-a (Fmei$}) >
which implies that $\sum_{k=0}^{\infty} \gamma_k = \infty$. Hence, by Theorem 8.1, $x^{(k)} \to x^*$ for any $x^{(0)}$.
if length(options) >= 14
  if options(14)==0
    options(14)=1000*length(xnew);
  end
else
  options(14)=1000*length(xnew);
end
clc;
format compact;
format short e;
options = foptions(options);
print = options(1);
epsilon_x = options(2);
epsilon_g = options(3);
max_iter=options(14);
for k = 1:max_iter,
  g_curr=feval(grad,xcurr);
  if norm(g_curr) <= epsilon_g
    disp('Terminating: Norm of gradient less than');
    disp(epsilon_g);
    k=k-1;
    break;
  end %if
  alpha=10.0; %fixed step size
  xnew = xcurr-alpha*g_curr;
  if print,
    disp('Iteration number k =')
    disp(k); %print iteration index k
    disp('alpha =');
    disp(alpha); %print alpha
    disp('Gradient =');
    disp(g_curr'); %print gradient
    disp('New point =');
    disp(xnew'); %print new point
  end %if
  if norm(xnew-xcurr) <= epsilon_x*norm(xcurr)
    disp('Terminating: Norm of difference between iterates less than');
    disp(epsilon_x);
    break;
  end %if
  xcurr=xnew; %advance to the new iterate
  if k == max_iter
    disp('Terminating with maximum number of iterations');
  end %if
end %for
if nargout >= 1
  x=xnew; %output names assumed from the call [wstar,N]=bp(...)
  if nargout == 2
    N=k;
  end
else
  disp('Final point =');
  disp(xnew');
  disp('Number of iterations =');
  disp(k);
end %if
To apply the above routine, we need the following M-file for the gradient.
function y=grad(w,xd,yd);
wh11=w(1);
wh21=w(2);
wh12=w(3);
wh22=w(4);
wo11=w(5);
wo12=w(6);
t1=w(7);
t2=w(8);
t3=w(9);
xd1=xd(1); xd2=xd(2);
v1=wh11*xd1+wh12*xd2-t1;
v2=wh21*xd1+wh22*xd2-t2;
z1=sigmoid(v1);
z2=sigmoid(v2);
y1=sigmoid(wo11*z1+wo12*z2-t3);
d1=(yd-y1)*y1*(1-y1);
y(1)=-d1*wo11*z1*(1-z1)*xd1;
y(2)=-d1*wo12*z2*(1-z2)*xd1;
y(3)=-d1*wo11*z1*(1-z1)*xd2;
y(4)=-d1*wo12*z2*(1-z2)*xd2;
y(5)=-d1*z1;
y(6)=-d1*z2;
y(7)=d1*wo11*z1*(1-z1);
y(8)=d1*wo12*z2*(1-z2);
y(9)=d1;
y=y';
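One way to gain confidence in gradient formulas of this kind is a finite-difference check. The sketch below is an illustration under stated assumptions: the activation is taken to be the logistic sigmoid, the objective is taken to be the squared error $\frac{1}{2}(y_d - y_1)^2$ for a single training pair, and the weight vector and training pair are hypothetical. It uses the same weight ordering as the M-file above and compares the analytic gradient with central differences.

```matlab
sigmoid = @(v) 1./(1 + exp(-v));          % assumed logistic activation
% Network output, using the weight ordering w = [wh11 wh21 wh12 wh22 wo11 wo12 t1 t2 t3]
z1 = @(w,xd) sigmoid(w(1)*xd(1) + w(3)*xd(2) - w(7));
z2 = @(w,xd) sigmoid(w(2)*xd(1) + w(4)*xd(2) - w(8));
y1 = @(w,xd) sigmoid(w(5)*z1(w,xd) + w(6)*z2(w,xd) - w(9));
E  = @(w,xd,yd) 0.5*(yd - y1(w,xd))^2;    % assumed squared-error objective

w  = [0.1 0.3 0.3 0.4 0.4 0.6 0.1 0.1 -0.1]';  % hypothetical weights
xd = [0.5; 1]; yd = 1;                          % hypothetical training pair

% Analytic gradient, following the formulas in the M-file
a1 = z1(w,xd); a2 = z2(w,xd); out = y1(w,xd);
d1 = (yd - out)*out*(1 - out);
g = [-d1*w(5)*a1*(1-a1)*xd(1);
     -d1*w(6)*a2*(1-a2)*xd(1);
     -d1*w(5)*a1*(1-a1)*xd(2);
     -d1*w(6)*a2*(1-a2)*xd(2);
     -d1*a1;
     -d1*a2;
     d1*w(5)*a1*(1-a1);
     d1*w(6)*a2*(1-a2);
     d1];

% Central finite differences, one coordinate at a time
h = 1e-6;
g_fd = zeros(9,1);
for j = 1:9
    e = zeros(9,1); e(j) = h;
    g_fd(j) = (E(w+e,xd,yd) - E(w-e,xd,yd))/(2*h);
end
fprintf('max |analytic - finite difference| = %.3e\n', max(abs(g - g_fd)));
```

A small discrepancy (on the order of the finite-difference error) suggests the signs and index pairings in the formulas are consistent with the assumed objective.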
We applied our MATLAB routine as follows.
>> options(2)=10^(-7);
>> options(3)=10^(-7);
>> options(14)=10000;
>> w0=[0.1,0.3,0.3,0.4,0.4,0.6,0.1,0.1,0.1];
>> [wstar,N]=bp('grad',w0,options)
Terminating with maximum number of iterations
wstar =
-7.7771e+00
-5.5932e+00
-8.4027e+00
-5.6384e+00
-1.1010e+01
1.0918e+01
-3.2773e+00
-8.3565e+00
5.2606e+00
N =
10000
As we can see from the above, the results coincide with Example 13.3. The table of the outputs of the trained network corresponding to the training input data is shown in Table 13.2.
14. Genetic Algorithms
14.1
a. Expanding the right hand side of the second expression gives the desired result.
b. Applying the algorithm, we get a binary representation of 11111001011, i.e.,
$$1995 = 2^{10} + 2^9 + 2^8 + 2^7 + 2^6 + 2^3 + 2^1 + 2^0.$$
c. Applying the algorithm, we get a binary representation of 0.1011101, i.e.,
$$0.7265625 = 2^{-1} + 2^{-3} + 2^{-4} + 2^{-5} + 2^{-7}.$$
d. We have $19 = 2^4 + 2^1 + 2^0$, i.e., the binary representation for 19 is 10011. For the fractional part, we need at least 7 bits to keep at least the same accuracy. We have $0.95 = 2^{-1} + 2^{-2} + 2^{-3} + 2^{-4} + 2^{-7} + \cdots$, i.e., the binary representation is $0.1111001\cdots$. Therefore, the binary representation of 19.95 with at least the same degree of accuracy is 10011.1111001.
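The expansions in parts b and c can be reproduced with a few lines of code. The sketch below assumes that the algorithm referred to is the usual scheme of repeated division by 2 for the integer part and repeated multiplication by 2 for the fractional part; the variable names are illustrative.

```matlab
% Integer part: repeated division by 2 (each remainder is the next bit, LSB first)
n = 1995;
bits_int = [];
while n > 0
    bits_int = [mod(n,2), bits_int];   % prepend so the MSB ends up first
    n = floor(n/2);
end
disp(bits_int);

% Fractional part: repeated multiplication by 2 (each integer part is the next bit)
x = 0.7265625;
bits_frac = zeros(1,7);
for i = 1:7
    x = 2*x;
    bits_frac(i) = floor(x);
    x = x - bits_frac(i);
end
disp(bits_frac);
```

Because 0.7265625 is a finite sum of powers of 2, the fractional loop terminates exactly after 7 bits; for a number such as 0.95 the expansion does not terminate and must be truncated, as in part d.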
14.2
It suffices to prove the result for the case where only one symbol is swapped, since the general case is obtained by repeating the argument. We have two scenarios. First, suppose the symbol swapped is at a position corresponding to a don't care symbol in $H$. Clearly, after the swap, both chromosomes will still be in $H$. Second, suppose the symbol swapped is at a position corresponding to a fixed symbol in $H$. Since both chromosomes are in $H$, their symbols at that position must be identical. Hence, the swap does not change the chromosomes. This completes the proof.
14.3
Consider a given chromosome in $M(k) \cap H$. The probability that it is chosen for crossover is $p_c$. If neither of its offsprings is in $H$, then at least one of the crossover points must be between the corresponding first and last fixed symbols of $H$. The probability of this is $1 - (1 - \delta(H)/(L-1))^2$. To see this, note that the probability that each crossover point is not between the corresponding first and last fixed symbols is $1 - \delta(H)/(L-1)$, and thus the probability that both crossover points are not between the corresponding first and last fixed symbols of $H$ is $(1 - \delta(H)/(L-1))^2$. Hence, the probability that the given chromosome is chosen for crossover and neither of its offsprings is in $H$ is bounded above by
$$p_c\left(1 - \left(1 - \frac{\delta(H)}{L-1}\right)^2\right).$$
14.4
As for two-point crossover, the $n$-point crossover operation is a composition of $n$ one-point crossover operations (i.e., $n$ one-point crossover operations in succession). The required result for this case is as follows.
Lemma:
Given a chromosome in $M(k) \cap H$, the probability that it is chosen for crossover and neither of its offsprings is in $H$ is bounded above by
$$p_c\left(1 - \left(1 - \frac{\delta(H)}{L-1}\right)^n\right).$$
For the proof, proceed as in the solution of Exercise 14.3, replacing 2 by $n$.
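To get a feel for how the bound behaves, the sketch below evaluates $p_c\,(1 - (1 - \delta(H)/(L-1))^n)$, the expression obtained from the two-point argument of Exercise 14.3 with 2 replaced by $n$, for a few values of $n$. The chromosome length, defining length, and crossover probability are hypothetical values chosen only for illustration.

```matlab
% Hypothetical schema and parameters (assumptions for illustration only)
L = 10;          % chromosome length
delta = 3;       % defining length delta(H) of the schema
p_c = 0.75;      % crossover probability
n = 1:4;         % number of crossover points
bound = p_c*(1 - (1 - delta/(L-1)).^n);
for j = 1:length(n)
    fprintf('n = %d: bound = %.4f\n', n(j), bound(j));
end
```

For $n = 1$ the expression reduces to $p_c\,\delta(H)/(L-1)$, and it increases with $n$ while staying below $p_c$, which matches the intuition that more crossover points give more chances to disrupt the schema.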
14.5
function M=roulette_wheel(fitness);
%function M=roulette_wheel(fitness)
%fitness = vector of fitness values of chromosomes in population
%M = vector of indices indicating which chromosome in the
%    given population should appear in the mating pool
fitness = fitness - min(fitness); % to keep the fitness positive
if sum(fitness) == 0,
  disp('Population has identical chromosomes -- STOP');
  M = []; % no mating pool can be formed
  return; % 'break' is only valid inside a loop
else
  fitness = fitness/sum(fitness);
end
cum_fitness = cumsum(fitness);
for i = 1:length(fitness),
  tmp = find(cum_fitness-rand>0);
  M(i) = tmp(1);
end
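The selection step above can be examined in isolation. The following sketch repeats the same cumulative-sum draw many times on a made-up fitness vector and tallies how often each index is picked; the fitness values and the number of trials are assumptions for illustration.

```matlab
fitness = [1 2 3 4];              % hypothetical fitness values
p = fitness - min(fitness);       % same shift as in the routine
p = p/sum(p);                     % selection probabilities: [0 1/6 2/6 3/6]
cum_p = cumsum(p);
trials = 6000;
counts = zeros(size(fitness));
for t = 1:trials
    tmp = find(cum_p - rand > 0); % first index whose cumulative value exceeds the draw
    counts(tmp(1)) = counts(tmp(1)) + 1;
end
disp(counts/trials);              % empirical frequencies, expected to be close to p
```

One consequence of subtracting the minimum is visible here: the least-fit chromosome receives selection probability zero, so it never enters the mating pool.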
14.6
% parent1, parent2 = two binary parent chromosomes (row vectors)
L = length(parent1);
crossover_pt = ceil(rand*(L-1));
offspring1 = [parent1(1:crossover_pt) parent2(crossover_pt+1:L)];
offspring2 = [parent2(1:crossover_pt) parent1(crossover_pt+1:L)];
14.7
% mating_pool = matrix of 0-1 elements; each row represents a chromosome
% p_m = probability of mutation
N = size(mating_pool,1);
L = size(mating_pool,2);
mutation_points = rand(N,L) < p_m;
new_population = xor(mating_pool,mutation_points);
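The crossover and mutation snippets of Exercises 14.6 and 14.7 are easiest to follow on a concrete case. In the sketch below the parents, the crossover point, and the mutation mask are all hand-picked assumptions (in the snippets themselves the crossover point and mask are drawn at random), so the outcome is deterministic.

```matlab
parent1 = [1 1 1 1 1 1 1 1];      % hypothetical parents
parent2 = [0 0 0 0 0 0 0 0];
L = length(parent1);
crossover_pt = 3;                 % fixed here instead of ceil(rand*(L-1))
offspring1 = [parent1(1:crossover_pt) parent2(crossover_pt+1:L)];
offspring2 = [parent2(1:crossover_pt) parent1(crossover_pt+1:L)];
disp(offspring1);
disp(offspring2);

% Mutation: flip exactly the bits marked in a hand-picked mask
mating_pool = [offspring1; offspring2];
mutation_points = logical([0 1 0 0 0 0 0 0; 0 0 0 0 0 0 0 1]);
new_population = xor(mating_pool, mutation_points);
disp(new_population);
```

The offspring swap tails after position 3, and `xor` with the mask flips the second bit of the first row and the last bit of the second row.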
14.8
A MATLAB routine for a genetic algorithm with binary encoding is:
function [winner,bestfitness] = ga(L,N,fit_func,options)
% function winner = GA(L,N,fit_func)
% Function call: GA(L,N,'f')
% L = length of chromosomes
% N = population size (must be an even number)
% f = name of fitness value function
%
%Options:
%print = options(1);
%selection = options(5);
%max_iter=options(14);
%p_c = options(18);
%p_m = p_c/100;
%
%Selection:
% options(5) = 0 for roulette wheel, 1 for tournament
clf;
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
if length(options) >= 14
  if options(14)==0
    options(14)=3*N;
  end
else
  options(14)=3*N;
end
if length(options) < 18
  options(18)=0.75; %optional crossover rate
end
format compact;
%format short e;
options = foptions(options);
print = options(1);
selection = options(5);
max_iter=options(14);
p_c = options(18);
p_m = p_c/100;
P = rand(N,L)>0.5;
bestvaluesofar = 0;
%Initial evaluation
for i = 1:N,
  fitness(i) = feval(fit_func,P(i,:));
end
[bestvalue,best] = max(fitness);
if bestvalue > bestvaluesofar,
  bestsofar = P(best,:);
  bestvaluesofar = bestvalue;
end
for k = 1:max_iter,
  %Selection
  fitness = fitness - min(fitness); % to keep the fitness positive
  if sum(fitness) == 0,
    disp('Population has identical chromosomes -- STOP');
    disp('Number of iterations:');
    disp(k);
    for i = k:max_iter,
      upper(i)=upper(i-1);
      average(i)=average(i-1);
      lower(i)=lower(i-1);
    end
    break;
  else
    fitness = fitness/sum(fitness);
  end
  if selection == 0,
    %roulette-wheel
    cum_fitness = cumsum(fitness);
    for i = 1:N,
      tmp = find(cum_fitness-rand>0);
      m(i) = tmp(1);
    end
  else
    %tournament
    for i = 1:N,
      fighter1=ceil(rand*N);
      fighter2=ceil(rand*N);
      if fitness(fighter1)>fitness(fighter2),
        m(i) = fighter1;
      else
        m(i) = fighter2;
      end
    end
  end
  M = zeros(N,L);
  for i = 1:N,
    M(i,:) = P(m(i),:);
  end
  %Crossover
  Mnew = M;
  for i = 1:N/2
    ind1 = ceil(rand*N);
    ind2 = ceil(rand*N);
    parent1 = M(ind1,:);
    parent2 = M(ind2,:);
    if rand < p_c
      crossover_pt = ceil(rand*(L-1));
      offspring1 = [parent1(1:crossover_pt) parent2(crossover_pt+1:L)];
      offspring2 = [parent2(1:crossover_pt) parent1(crossover_pt+1:L)];
      Mnew(ind1,:) = offspring1;
      Mnew(ind2,:) = offspring2;
    end
  end
  %Mutation
  mutation_points = rand(N,L) < p_m;
  P = xor(Mnew,mutation_points);
  %Evaluation
  for i = 1:N,
    fitness(i) = feval(fit_func,P(i,:));
  end
  [bestvalue,best] = max(fitness);
  if bestvalue > bestvaluesofar,
    bestsofar = P(best,:);
    bestvaluesofar = bestvalue;
  end
  upper(k) = bestvalue;
  average(k) = mean(fitness);
  lower(k) = min(fitness);
end %for
if k == max_iter,
  disp('Algorithm terminated after maximum number of iterations:');
  disp(max_iter);
end
winner = bestsofar;
bestfitness = bestvaluesofar;
if print,
  iter = [1:max_iter]';
  plot(iter,upper,'o:',iter,average,'x-',iter,lower,'*--');
  legend('Best','Average','Worst');
  xlabel('Generations','Fontsize',14);
  ylabel('Objective Function Value','Fontsize',14);
  set(gca,'Fontsize',14);
  hold off;
end
a. To run the routine, we create the following M-files.
function dec = bin2dec(bin,range);
%function dec = bin2dec(bin,range);
%Function to convert from binary (bin) to decimal (dec) in a given range
index = polyval(bin,2);
dec = index*((range(2)-range(1))/(2^length(bin)-1)) + range(1);

function y=f_manymax(x);
y=-15*(sin(2*x))^2-(x-2)^2+160;

function y=fit_func1(binchrom);
%1-D fitness function
f='f_manymax';
range=[-10,10];
x=bin2dec(binchrom,range);
y=feval(f,x);
We use the following script to run the algorithm:
clear;
options(1)=1;
[x,y]=ga(8,10,'fit_func1',options);
f='f_manymax';
range=[-10,10];
disp('GA Solution:');
disp(bin2dec(x,range));
disp('Objective function value:');
disp(y);
Running the above algorithm, we obtain a solution of $x^* = 1.6078$, and an objective function value of 159.7640. The figure below shows a plot of the best, average, and worst solution from each generation of the population.
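The decoding step can be illustrated on a single chromosome. In the sketch below, the 8-bit string is a hand-picked assumption, chosen because it decodes to a point that agrees with the reported solution to four decimal places; the decoding formula and the objective follow the M-files of part a, written inline so that the snippet does not depend on them.

```matlab
bin = [1 0 0 1 0 1 0 0];          % hypothetical 8-bit chromosome (binary for 148)
range = [-10, 10];
index = polyval(bin, 2);          % binary string -> integer index
x = index*((range(2)-range(1))/(2^length(bin)-1)) + range(1);   % map index into the range
y = -15*(sin(2*x))^2 - (x-2)^2 + 160;                           % objective from f_manymax
fprintf('x = %.4f, f(x) = %.4f\n', x, y);
```

With 8 bits the interval [-10, 10] is divided into 255 equal steps of about 0.078, which limits how closely the genetic algorithm can approach the true maximizer.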
b. To run the routine, we create the following M-files (we also use the routine bin2dec from part a).
function y=f_peaks(x);
y=3*(1-x(1)).^2.*exp(-(x(1).^2)-(x(2)+1).^2) - ...
  10.*(x(1)/5-x(1).^3-x(2).^5).*exp(-x(1).^2-x(2).^2) - ...
  exp(-(x(1)+1).^2-x(2).^2)/3;

function y=fit_func2(binchrom);
%2-D fitness function
f='f_peaks';
xrange=[-3,3];
yrange=[-3,3];
L=length(binchrom);
x1=bin2dec(binchrom(1:L/2),xrange);
x2=bin2dec(binchrom(L/2+1:L),yrange);
y=feval(f,[x1,x2]);
We use the following script to run the algorithm:
clear;
options(1)=1;
[x,y]=ga(16,20,'fit_func2',options);
f='f_peaks';
xrange=[-3,3];
yrange=[-3,3];
L=length(x);
x1=bin2dec(x(1:L/2),xrange);
x2=bin2dec(x(L/2+1:L),yrange);
disp('GA Solution:');
disp([x1,x2]);
disp('Objective function value:');
disp(y);
A plot of the objective function is shown below.
Running the above algorithm, we obtain a solution of [-0.0353, 1.4941], and an $x^*$ = [-0.0588, 1.5412], and an objective function value of 7.9815. (Compare this solution with that of Example 14.3.) The figure below shows a plot of the best, average, and worst solution from each generation of the population.
14.9
A MATLAB routine for a real-number genetic algorithm:
function [winner,bestfitness] = gar(Domain,N,fit_func,options)
% function winner = GAR(Domain,N,fit_func)
% Function call: GAR(Domain,N,'f')
% Domain = search space; e.g., [-2,2;-3,3] for the space [-2,2]x[-3,3]
% (number of rows of Domain = dimension of search space)
% N = population size (must be an even number)
% f = name of fitness value function
%
%Options:
%print = options(1);
%selection = options(5);
%max_iter=options(14);
%p_c = options(18);
%p_m = p_c/100;
%
%Selection:
% options(5) = 0 for roulette wheel, 1 for tournament
clf;
if nargin ~= 4
  options = [];
  if nargin ~= 3
    disp('Wrong number of arguments.');
    return;
  end
end
if length(options) >= 14
  if options(14)==0
    options(14)=3*N;
  end
else
  options(14)=3*N;
end
if length(options) < 18
  options(18)=0.75; %optional crossover rate
end
%format compact;
%format short e;
options = foptions(options);
print = options(1);
selection = options(5);
max_iter=options(14);
p_c = options(18);
p_m = p_c/100;
n = size(Domain,1);
lowb = Domain(:,1)'; %lower bounds (first column of Domain)
upb = Domain(:,2)'; %upper bounds (second column of Domain)
bestvaluesofar = 0;
for i = 1:N,
  P(i,:) = lowb + rand(1,n).*(upb-lowb);
  %Initial evaluation
  fitness(i) = feval(fit_func,P(i,:));
end
[bestvalue,best] = max(fitness);
if bestvalue > bestvaluesofar,
  bestsofar = P(best,:);
  bestvaluesofar = bestvalue;
end
for k = 1:max_iter,
  %Selection
  fitness = fitness - min(fitness); % to keep the fitness positive
  if sum(fitness) == 0,
    disp('Population has identical chromosomes -- STOP');
    disp('Number of iterations:');
    disp(k);
    for i = k:max_iter,
      upper(i)=upper(i-1);
      average(i)=average(i-1);
      lower(i)=lower(i-1);
    end
    break;
  else
    fitness = fitness/sum(fitness);
  end
  if selection == 0,
    %roulette-wheel
    cum_fitness = cumsum(fitness);
    for i = 1:N,
      tmp = find(cum_fitness-rand>0);
      m(i) = tmp(1);
    end
  else
    %tournament
    for i = 1:N,
      fighter1=ceil(rand*N);
      fighter2=ceil(rand*N);
      if fitness(fighter1)>fitness(fighter2),
        m(i) = fighter1;
      else
        m(i) = fighter2;
      end
    end
  end
  M = zeros(N,n);
  for i = 1:N,
    M(i,:) = P(m(i),:);
  end
  %Crossover
  Mnew = M;
  for i = 1:N/2
    ind1 = ceil(rand*N);
    ind2 = ceil(rand*N);
    parent1 = M(ind1,:);
    parent2 = M(ind2,:);
    if rand < p_c
      a = rand;
      offspring1 = a*parent1+(1-a)*parent2+(rand(1,n)-0.5).*(upb-lowb)/10;
      offspring2 = a*parent2+(1-a)*parent1+(rand(1,n)-0.5).*(upb-lowb)/10;
      %do projection
      for j = 1:n,
LE offspring! (3)