Lecture A1 07 LineSearch
• Understand simple algorithms that rely only on evaluations of the function, without the need for derivatives.
• These methods are often used inside multi-dimensional optimization algorithms: within each iteration, once the descent direction has been chosen, a line search determines how far to move along it.
However, such methods typically also converge faster, as they use higher-order derivative information at each iteration.
• If $f(a_1) < f(b_1)$, then the minimizer is assumed to lie within $[a_0, b_1]$. Therefore, we set $b_0 = b_1$ and iterate again.
• If $f(a_1) \geq f(b_1)$, then the minimizer is assumed to lie within $[a_1, b_0]$. Therefore, we set $a_0 = a_1$ and iterate again.
This way, our interval becomes progressively smaller as we seek the minimizer. After a suitable number of iterations (e.g., when $a_0$ and $b_0$ are sufficiently close to each other), we can stop and return the value $(a_0 + b_0)/2$ as our estimated minimizer.
Now, the question is how to pick $\rho$. In practice, a clever choice of $\rho$ is to satisfy the constraint:
$$\frac{1-\rho}{1} = \frac{\rho}{1-\rho} \tag{8.1}$$
which leads to $\rho \approx 0.382$. This relationship is also known as the golden section.
Question: Why is this choice of $\rho$ common (and clever)? Think about the number of function evaluations needed at each iteration: can we get away with a single function evaluation instead of the expected two?
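A minimal Python sketch of this interval-reduction scheme is given below (the function name, test interval, and tolerance are illustrative assumptions). Thanks to the constraint (8.1), the interior point that survives one iteration can be reused in the next, so each iteration needs only one new function evaluation:

```python
# Minimal sketch of golden-section search; names and stopping rule are
# illustrative assumptions, not taken from the lecture.
def golden_section_search(f, a0, b0, tol=1e-6):
    rho = (3 - 5 ** 0.5) / 2          # rho ~ 0.382, from (1 - rho)/1 = rho/(1 - rho)
    a, b = a0, b0
    # Interior points and their function values.
    x1 = a + rho * (b - a)
    x2 = a + (1 - rho) * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:
            # Minimizer assumed in [a, x2]: shrink from the right; old x1 becomes new x2.
            b, x2, f2 = x2, x1, f1
            x1 = a + rho * (b - a)
            f1 = f(x1)                # only one new evaluation in this iteration
        else:
            # Minimizer assumed in [x1, b]: shrink from the left; old x2 becomes new x1.
            a, x1, f1 = x1, x2, f2
            x2 = a + (1 - rho) * (b - a)
            f2 = f(x2)                # only one new evaluation in this iteration
    return (a + b) / 2

# Example: minimize f(x) = (x - 2)**2 on [0, 5]; the minimizer is x = 2.
print(golden_section_search(lambda x: (x - 2) ** 2, 0.0, 5.0))
```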
Suppose that at iteration $k$, we have an estimate $x^{(k)}$ for our minimizer of $f(x)$. Then,
$$f(x) \approx f(x^{(k)}) + f'(x^{(k)})(x - x^{(k)}) + \frac{1}{2} f''(x^{(k)})(x - x^{(k)})^2 \tag{8.2}$$
Then, we can update our estimate as follows:
$$x^{(k+1)} = x^{(k)} - \frac{f'(x^{(k)})}{f''(x^{(k)})} \tag{8.3}$$
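A minimal Python sketch of this update is given below; the names, the example objective, and the stopping rule are illustrative assumptions (the example has $f''(x) > 0$ everywhere, so the iteration is well behaved):

```python
import math

# Minimal sketch of Newton's method for 1-D minimization, update (8.3).
# Names, tolerance, and iteration cap are illustrative assumptions.
def newton_1d(fprime, fsecond, x0, tol=1e-8, max_iter=50):
    x = x0
    for _ in range(max_iter):
        step = fprime(x) / fsecond(x)   # Newton step f'(x) / f''(x)
        x = x - step
        if abs(step) < tol:             # stop when the update is tiny
            break
    return x

# Example: f(x) = x**2 + exp(x), so f'(x) = 2x + exp(x) and f''(x) = 2 + exp(x) > 0.
x_star = newton_1d(lambda x: 2 * x + math.exp(x),
                   lambda x: 2 + math.exp(x),
                   x0=0.0)
print(x_star)   # approximately -0.3517, where f'(x) = 0
```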
Newton's method tends to work well if $f''(x) > 0$ everywhere. However, if at some iteration $f''(x^{(k)}) < 0$, Newton's method may fail to converge to a minimizer. Specifically, note that $-f'(x^{(k)})$ points in a descent direction of $f(x)$. Therefore, if $f''(x^{(k)}) < 0$, then the Newton step $-f'(x^{(k)})/f''(x^{(k)})$ points in an ascent direction of $f(x)$.
One common fix is to modify the update by adding a constant $\mu_k > 0$ to the second derivative:
$$x^{(k+1)} = x^{(k)} - \frac{f'(x^{(k)})}{f''(x^{(k)}) + \mu_k} \tag{8.4}$$
With $\mu_k$ chosen large enough that $f''(x^{(k)}) + \mu_k > 0$, the step again points in a descent direction.
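A small sketch of this safeguarded step follows; the particular rule for choosing $\mu_k$ (raise the denominator to at least some threshold $\delta > 0$) is only an illustrative assumption:

```python
# Sketch of the safeguarded Newton step (8.4).
# The rule for choosing mu_k below (keep the denominator at least `delta`)
# is an illustrative assumption, not the lecture's prescription.
def damped_newton_step(fprime_x, fsecond_x, delta=1.0):
    mu_k = 0.0 if fsecond_x >= delta else delta - fsecond_x
    return -fprime_x / (fsecond_x + mu_k)   # descent step whenever fprime_x != 0
```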
When the second derivative is unavailable or expensive to compute, we can approximate it using the first derivatives at the two most recent iterates:
$$f''(x^{(k)}) \approx \frac{f'(x^{(k)}) - f'(x^{(k-1)})}{x^{(k)} - x^{(k-1)}} \tag{8.5}$$
which leads to the secant algorithm, where we update our estimate as follows:
$$x^{(k+1)} = x^{(k)} - f'(x^{(k)})\,\frac{x^{(k)} - x^{(k-1)}}{f'(x^{(k)}) - f'(x^{(k-1)})} \tag{8.6}$$
Note that in the secant algorithm, each iteration relies on the previous two iterations in order to approximate the second derivative. Therefore, the secant algorithm requires two initial points. These can be two different initial guesses, or one initial guess followed by one iteration of a different algorithm (e.g., one of the algorithms described above).
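A minimal Python sketch of the secant iteration (8.5)-(8.6) is given below; the names, the two starting points, and the stopping rule are illustrative assumptions:

```python
import math

# Minimal sketch of the secant method, updates (8.5)-(8.6).
# Names, tolerance, and iteration cap are illustrative assumptions.
def secant_method(fprime, x0, x1, tol=1e-8, max_iter=100):
    x_prev, x = x0, x1                      # two initial points, as noted above
    g_prev, g = fprime(x_prev), fprime(x)
    for _ in range(max_iter):
        # Secant update: x - f'(x) * (x - x_prev) / (f'(x) - f'(x_prev))
        x_new = x - g * (x - x_prev) / (g - g_prev)
        if abs(x_new - x) < tol:
            return x_new
        x_prev, g_prev = x, g
        x, g = x_new, fprime(x_new)
    return x

# Example: same f'(x) = 2x + exp(x) as before, starting from x0 = -1 and x1 = 0.
print(secant_method(lambda x: 2 * x + math.exp(x), -1.0, 0.0))  # ~ -0.3517
```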