A Feasible Directions Method For Nonsmooth Convex Optimization
DOI 10.1007/s00158-011-0634-y
RESEARCH PAPER
1 Introduction
In this paper, we propose a new algorithm for solving the
unconstrained optimization problem:
$$\min_{x \in \mathbb{R}^n} f(x) \tag{P}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is a convex function, not necessarily smooth.
Nonsmooth optimization problems arise in advanced structural analysis and optimization.
This is the case for several types of unilateral problems, such as contact and impact
in solids or delamination analysis in composite materials. Dynamic analysis and stability
studies involve eigenvalues and nonsmooth functions. Most of these applications involve
nonconvex problems with constraints. The authors consider the present study the first
step toward a new technique for numerical algorithms for nonconvex optimization with
smooth and nonsmooth constraints.
It is assumed that $f$ has bounded level sets. Thus, there is $a \in \mathbb{R}$ such that
$\Omega_a \equiv \{x \in \mathbb{R}^n \mid f(x) \le a\}$ is compact.
Let $\partial f(x)$ be the subdifferential (Clarke 1983) of $f$ at $x$. In what follows, it is
assumed that one arbitrary subgradient $s \in \partial f(x)$ can be computed at any point
$x \in \mathbb{R}^n$.
A special feature of nonsmooth optimization is the fact
that f (x) can change discontinuously and is not necessarily small in the neighborhood of a local extreme of the
J. Herskovits et al.
$$\min_{(x,z) \in \mathbb{R}^{n+1}} z \quad \text{s.t.} \quad f(x) \le z, \tag{EP}$$
where $z \in \mathbb{R}$ is an auxiliary variable. With the present approach, a decreasing
sequence of feasible points $\{(x^k, z^k)\}$ converging to a minimum of $f(x)$ is obtained.
That is, we have that $z^{k+1} < z^k$ and $z^k > f(x^k)$ for all $k$. At each iteration, an
auxiliary linear program is defined using cutting planes. A feasible descent direction of
the linear program is obtained employing FDIPA, and a step-length is computed. Then, a new
iterate $(x^{k+1}, z^{k+1})$ is defined according to suitable rules. To determine a new
iterate, the algorithm produces auxiliary points $(y_i, w_i)$ and, when an auxiliary point
is in the interior of epi($f$), that is, $w_i > f(y_i)$, we say that the step is serious and
we take it as the new iterate. Otherwise, the iterate is not changed and we say that the
step is null. A new cutting plane is then added and the procedure is repeated until a
serious step is obtained. It will be proved that, when a serious step is obtained, the
search direction given by FDIPA is also a feasible descent direction for Problem EP.
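The serious/null step test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names (`cutting_plane`, `plane_value`, `is_serious_step`) and the test function `f_abs` are hypothetical and do not come from the paper, and a cutting plane is taken in the usual subgradient form $f(y) + s^T(x - y) - z$.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Plane = Tuple[List[float], float]  # (s, c) encodes g(x, z) = c + s.x - z


def dot(u: Vector, v: Vector) -> float:
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def cutting_plane(f: Callable[[Vector], float],
                  subgrad: Callable[[Vector], Vector],
                  y: Vector) -> Plane:
    """Plane g(x, z) = f(y) + s.(x - y) - z built from one subgradient s at y."""
    s = list(subgrad(y))
    return s, f(y) - dot(s, y)


def plane_value(plane: Plane, x: Vector, z: float) -> float:
    """Evaluate the cutting plane at (x, z); a value <= 0 means the cut is satisfied."""
    s, c = plane
    return c + dot(s, x) - z


def is_serious_step(f: Callable[[Vector], float], y: Vector, w: float) -> bool:
    """Serious step: the auxiliary point (y, w) lies strictly inside epi(f)."""
    return w > f(y)


# Hypothetical nonsmooth convex test function: f(x) = |x_1| + |x_2|.
def f_abs(x: Vector) -> float:
    return sum(abs(xi) for xi in x)


def subgrad_abs(x: Vector) -> List[float]:
    # One arbitrary subgradient; at a kink (xi == 0) we pick +1.
    return [1.0 if xi >= 0 else -1.0 for xi in x]


if __name__ == "__main__":
    plane = cutting_plane(f_abs, subgrad_abs, [1.0, -2.0])
    print(plane_value(plane, [-1.0, 1.0], f_abs([-1.0, 1.0])))
    print(is_serious_step(f_abs, [1.0, -2.0], 3.5))
```

Because $f$ is convex, every point of the epigraph satisfies every cut, so a null step (a point with $w \le f(y)$) always yields a new cut that the previous model did not contain.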
$$\min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad g(x) \le 0, \tag{1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ are continuously
differentiable. The FDIPA requires the following assumptions about Problem 1:
Assumption 1 Let $\Omega \equiv \{x \in \mathbb{R}^n \mid g(x) \le 0\}$ be the feasible set.
There exists a real number $a$ such that the set $\Omega_a \equiv \{x \in \Omega \mid f(x) \le a\}$
is compact and has an interior int($\Omega_a$).
Assumption 2 Each $x \in$ int($\Omega_a$) satisfies $g(x) < 0$.
Assumption 3 The functions $f$ and $g$ are continuously differentiable in $\Omega_a$ and their
derivatives satisfy a Lipschitz condition.
Assumption 4 (Regularity Condition) A point $x \in \Omega_a$ is regular if the gradient vectors
$\nabla g_i(x)$, for $i$ such that $g_i(x) = 0$, are linearly independent. FDIPA requires the
regularity assumption at a local solution of Problem 1.
Let us remind the reader of some well known concepts (Luenberger 1984), widely employed in
this paper.
Definition 1 $d \in \mathbb{R}^n$ is a descent direction for a smooth function
$\phi : \mathbb{R}^n \to \mathbb{R}$ at $x$ if $d^T \nabla\phi(x) < 0$.
Definition 2 $d \in \mathbb{R}^n$ is a feasible direction for Problem 1, at $x \in \Omega$, if for
some $\theta > 0$ we have $x + td \in \Omega$ for all $t \in [0, \theta]$.
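$$\nabla f(x^*) + \nabla g(x^*)\lambda^* = 0 \tag{2}$$
$$G(x^*)\lambda^* = 0 \tag{3}$$
$$\lambda^* \ge 0 \tag{4}$$
$$g(x^*) \le 0, \tag{5}$$
$$\begin{bmatrix} S^k & \nabla g(x^k) \\ \Lambda^k \nabla g(x^k)^T & G(x^k) \end{bmatrix}
\begin{bmatrix} x_\alpha^{k+1} - x^k \\ \lambda_\alpha^{k+1} - \lambda^k \end{bmatrix}
= - \begin{bmatrix} \nabla f(x^k) + \nabla g(x^k)\lambda^k \\ G(x^k)\lambda^k \end{bmatrix} \tag{6}$$
Eq. 6 is a Newton iteration. However, $S^k$ can be a quasi-Newton approximation or even the
identity matrix. FDIPA requires $S^k$ symmetric and positive definite. Calling
$d_\alpha^k = x_\alpha^{k+1} - x^k$, Eq. 6 becomes
$$S^k d_\alpha^k + \nabla g(x^k)\lambda_\alpha^{k+1} = -\nabla f(x^k) \tag{7}$$
$$\Lambda^k \nabla g(x^k)^T d_\alpha^k + G(x^k)\lambda_\alpha^{k+1} = 0. \tag{8}$$
$$S^k d^k + \nabla g(x^k)\bar\lambda^{k+1} = -\nabla f(x^k)$$
$$\Lambda^k \nabla g(x^k)^T d^k + G(x^k)\bar\lambda^{k+1} = -\rho^k\lambda^k.$$
The addition of a negative vector in the right hand side of Eq. 8 produces the effect of
deflecting $d_\alpha^k$ into the feasible region, where the deflection is proportional to
$\rho^k$. As the deflection of $d_\alpha^k$ grows with $\rho^k$, it is necessary to bound
$\rho^k$, in a way to ensure that $d^k$ remains a descent direction. Since
$(d_\alpha^k)^T \nabla f(x^k) < 0$, we can get these bounds by imposing
$$(d^k)^T \nabla f(x^k) \le \xi\, (d_\alpha^k)^T \nabla f(x^k), \tag{9}$$
$$S^k d_\beta^k + \nabla g(x^k)\lambda_\beta^{k+1} = 0$$
$$\Lambda^k \nabla g(x^k)^T d_\beta^k + G(x^k)\lambda_\beta^{k+1} = -\lambda^k.$$
$$\rho^k \le (\xi - 1)\,\frac{(d_\alpha^k)^T \nabla f(x^k)}{(d_\beta^k)^T \nabla f(x^k)}.$$
In (Herskovits 1998), $\rho$ is defined as follows: If $(d_\beta^k)^T \nabla f(x^k) \le 0$, then
$\rho^k = \varphi\,\|d_\alpha^k\|^2$. Otherwise,
$$\rho^k = \min\left[\varphi\,\|d_\alpha^k\|^2,\ (\xi - 1)\,\frac{(d_\alpha^k)^T \nabla f(x^k)}{(d_\beta^k)^T \nabla f(x^k)}\right].$$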
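The rule for the deflection parameter can be illustrated with a short sketch. It assumes the usual FDIPA combination $d = d_\alpha + \rho\, d_\beta$ (stated later in the convergence analysis), a parameter $\xi \in (0,1)$ and a positive coefficient multiplying $\|d_\alpha\|^2$ (called `phi` here). The function names and the default values are hypothetical choices for the example, not the authors' code.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def deflection_bound(d_alpha: Vector, d_beta: Vector, grad_f: Vector,
                     xi: float = 0.7, phi: float = 0.1) -> float:
    """Deflection parameter rho following the two-case rule quoted above.

    xi in (0, 1) and phi > 0 are illustrative defaults (assumptions).
    d_alpha is assumed to be a descent direction: d_alpha . grad_f < 0.
    """
    cap = phi * dot(d_alpha, d_alpha)          # phi * ||d_alpha||^2
    slope_beta = dot(d_beta, grad_f)
    if slope_beta <= 0.0:
        # d_beta does not increase f to first order: only the cap applies.
        return cap
    slope_alpha = dot(d_alpha, grad_f)
    # Largest rho keeping d . grad_f <= xi * d_alpha . grad_f.
    return min(cap, (xi - 1.0) * slope_alpha / slope_beta)


def search_direction(d_alpha: Vector, d_beta: Vector, rho: float) -> List[float]:
    """Deflected direction d = d_alpha + rho * d_beta."""
    return [a + rho * b for a, b in zip(d_alpha, d_beta)]


if __name__ == "__main__":
    grad = [1.0, 0.0]
    d_a, d_b = [-1.0, 0.0], [2.0, 1.0]
    rho = deflection_bound(d_a, d_b, grad)
    d = search_direction(d_a, d_b, rho)
    print(rho, d, dot(d, grad) <= 0.7 * dot(d_a, grad))
```

With either branch of the rule, the resulting direction satisfies the descent condition of Eq. 9 by construction.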
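$i = 0, 1, ..., \ell$,
where $y_i^k \in \mathbb{R}^n$ are auxiliary points, $s_i^k \in \partial f(y_i^k)$ are subgradients
at those points and $\ell$ represents the number of current cutting planes. We call
$g_\ell^k(x, z)$, with $g_\ell^k : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^{\ell+1}$,
$$\min_{(x,z) \in \mathbb{R}^{n+1}} \phi(x, z) = z \quad \text{s.t.} \quad g_\ell^k(x, z) \le 0. \tag{AP}$$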
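$$t = \max\{t \mid g_\ell^k((x^k, z^k) + t\, d_\ell^k) \le 0\}.$$
$$t_\ell^k := \min\{t_{\max}/\mu,\ t\}$$
$$(x_{\ell+1}^k, z_{\ell+1}^k) = (x^k, z^k) + t_\ell^k d_\ell^k \tag{10}$$
$$(y_{\ell+1}^k, w_{\ell+1}^k) = (x^k, z^k) + \mu\, t_\ell^k d_\ell^k. \tag{11}$$
If $(y_{\ell+1}^k, w_{\ell+1}^k)$ is strictly feasible with respect to Problem EP, that is, if
$w_{\ell+1}^k > f(y_{\ell+1}^k)$, we consider that the current set of cutting planes is a good
local approximation of $f(x)$ in a neighborhood of $x^k$. Then, we say that the step is serious.
Compute $d_{\alpha\ell}^k$ and $\lambda_{\alpha\ell}^k$, solving
$$B^k d_{\alpha\ell}^k + \nabla g_\ell^k(x^k, z^k)\,\lambda_{\alpha\ell}^k = -\nabla\phi(x, z) \tag{12}$$
$$\Lambda_\ell^k [\nabla g_\ell^k(x^k, z^k)]^T d_{\alpha\ell}^k + G_\ell^k(x^k, z^k)\,\lambda_{\alpha\ell}^k = 0. \tag{13}$$
Compute $d_{\beta\ell}^k$ and $\lambda_{\beta\ell}^k$, solving
$$B^k d_{\beta\ell}^k + \nabla g_\ell^k(x^k, z^k)\,\lambda_{\beta\ell}^k = 0 \tag{14}$$
$$\Lambda_\ell^k [\nabla g_\ell^k(x^k, z^k)]^T d_{\beta\ell}^k + G_\ell^k(x^k, z^k)\,\lambda_{\beta\ell}^k = -\lambda_\ell^k, \tag{15}$$
where
$\lambda_{\alpha\ell}^k := (\lambda_{\alpha 0}^k, ..., \lambda_{\alpha\ell}^k)$,
$\lambda_{\beta\ell}^k := (\lambda_{\beta 0}^k, ..., \lambda_{\beta\ell}^k)$,
and
$\lambda_\ell^k := (\lambda_0^k, ..., \lambda_\ell^k)$,
$\Lambda_\ell^k := \mathrm{diag}(\lambda_0^k, ..., \lambda_\ell^k)$.
Set $(y_{\ell+1}^k, w_{\ell+1}^k) = (x^k, z^k) + \mu\, t_\ell^k d_\ell^k$.
If $w_{\ell+1}^k \le f(y_{\ell+1}^k)$, we have a null step. Then, define
$\lambda_{\ell+1}^k > 0$ and set $\ell := \ell + 1$.
Otherwise, we have a serious step. Then, call $d^k = d_\ell^k$,
$d_\alpha^k = d_{\alpha\ell}^k$, $d_\beta^k = d_{\beta\ell}^k$,
$\lambda_\alpha^k = \lambda_{\alpha\ell}^k$, $\lambda_\beta^k = \lambda_{\beta\ell}^k$ and
$\ell^k = \ell$. Take $(x^{k+1}, z^{k+1}) = (y_{\ell+1}^k, w_{\ell+1}^k)$, define
$\lambda_0^{k+1} > 0$, $B^{k+1}$ symmetric and positive definite and
set $k = k + 1$, $\ell = 0$, $y_0^k = x^k$.
Go to Step 1).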
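Since the constraints of the auxiliary problem are linear in $(x, z)$, the largest step that keeps all current cutting planes satisfied can be found with a simple ratio test. The sketch below is an assumption-laden illustration: each plane is stored as a hypothetical pair `(s, c)` meaning $c + s^T x - z \le 0$, the current point is assumed strictly feasible, and the function name `max_feasible_step` is not from the paper.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Plane = Tuple[List[float], float]  # (s, c) encodes g(x, z) = c + s.x - z


def dot(u: Vector, v: Vector) -> float:
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def max_feasible_step(planes: Sequence[Plane], x: Vector, z: float,
                      d_x: Vector, d_z: float) -> float:
    """Largest t >= 0 with g_i((x, z) + t (d_x, d_z)) <= 0 for every plane.

    Assumes g_i(x, z) < 0 at the current point. Returns math.inf when no
    plane blocks the direction.
    """
    t = math.inf
    for s, c in planes:
        value = c + dot(s, x) - z      # g_i at the current point (negative)
        slope = dot(s, d_x) - d_z      # rate of change of g_i along the direction
        if slope > 0.0:                # only increasing constraints can block
            t = min(t, -value / slope)
    return t


if __name__ == "__main__":
    # Two cuts of f(x) = |x| in one dimension: x - z <= 0 and -x - z <= 0.
    planes = [([1.0], 0.0), ([-1.0], 0.0)]
    print(max_feasible_step(planes, [1.0], 2.0, [-1.0], -1.0))
```

In the example the direction moves parallel to the first cut, so only the second cut limits the step; the step returned places the point exactly on that cut.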
I ,
S ,
such
4 Convergence analysis
In this section, we prove global convergence of the present algorithm. We first show that
the search direction $d_\ell^k$ is a descent direction for $\phi$. Then, we prove that the
number of null steps at each iteration is finite. That is, since
$(x^k, z^k) \in$ int(epi $f$), after a finite number of subiterations, we obtain
$(x^{k+1}, z^{k+1}) \in$ int(epi $f$). In consequence, the sequence
$\{(x^k, z^k)\}_{k \in \mathbb{N}}$ is bounded and belongs to the interior of the epigraph
of $f$. In what follows, it is proved
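$$\begin{bmatrix} B & \nabla g(x, z) \\ \Lambda [\nabla g(x, z)]^T & G(x, z) \end{bmatrix}$$
is nonsingular.
It follows that $d_\alpha$, $d_\beta$, $\lambda_\alpha$ and $\lambda_\beta$ are bounded in
$\Omega_a$. Since $\rho$ is bounded above we also have that
$\bar\lambda = \lambda_\alpha + \rho\,\lambda_\beta$ is bounded.
Lemma 2 The vector $d_\alpha$ satisfies
$$d_\alpha^T \nabla\phi(x, z) \le -d_\alpha^T B\, d_\alpha.$$
Proof It follows from Eq. 12
$$d_\alpha^T B\, d_\alpha + d_\alpha^T \nabla g(x, z)\,\lambda_\alpha = -d_\alpha^T \nabla\phi(x, z), \tag{17}$$
$$d_\alpha^T \nabla g(x, z)\,\lambda_\alpha = -\lambda_\alpha^T \Lambda^{-1} G(x, z)\,\lambda_\alpha. \tag{18}$$
$$d_\alpha^T \nabla\phi(x, z) = -d_\alpha^T B\, d_\alpha + \lambda_\alpha^T \Lambda^{-1} G(x, z)\,\lambda_\alpha.$$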
1 is positive definite, in consequence of AssumpSince
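$$\rho \le (\xi - 1)\,\frac{d_\alpha^T \nabla\phi(x, z)}{d_\beta^T \nabla\phi(x, z)}.$$
Therefore,
$$d^T \nabla\phi(x, z) \le d_\alpha^T \nabla\phi(x, z) + (\xi - 1)\, d_\alpha^T \nabla\phi(x, z)
= \xi\, d_\alpha^T \nabla\phi(x, z) < 0.$$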
i
i
and
(gi (x, z))T d = 1 gi (x, z)
i
.
i
i
) ti 0,
i
lim
ki sik
= 0 and
lim
i=0
ki = 1.
(21)
i=0
k }. In consequence k > 0
We define Ik = {i | yik Y
i
for i Ik and k large enough. Then,
lim
ki sik = 0
and
lim
iIk
ki = 1.
(22)
iIk
thus
z k+1 = z k + t k dzk
(20)
ki f (x)
ki f (x k )
iIk
iIk
0 = dz (d )T (x, z) = dz
0, thus dz
= 0.
T
ki sik
ki ki ,
x xk
iIk
iIk
and
0 = dz
= (d )T (x, z) (d )T Bd 0,
k
f (x) f x k +
i
$d^k = d_\alpha^k + \rho^k d_\beta^k \to 0$ when $k \to \infty$, $k \in \mathbb{N}$,
since $\rho^k \to 0$ if $d_\alpha^k \to 0$. In consequence, the iterations can
be stopped when $d_\alpha^k$ is small enough.
Proposition 5 For any accumulation point $(x^*, z^*)$ of the
sequence $\{(x^k, z^k)\}_{k \in \mathbb{N}}$, we have $0 \in \partial f(x^*)$.
Proof Consider
$$Y := \{y_0^0, y_1^0, ..., y_{\ell^0}^0, y_0^1, y_1^1, ..., y_{\ell^1}^1, ....., y_0^k, y_1^k, ..., y_{\ell^k}^k, .....\},$$
sk
k i
iIk i
iIk
k i
k
iI ik k .
iIk i
we have that
ki
iIk
iIk
sk .
ki i
x x k k ,
Since
ki = 1,
iIk
lim k
0.
Considering s k
Table 1 Numerical results
5 Numerical results
Solver
= 0.75,
= 0.1,
= 0.7,
and
tmax = 1.
(23)
Problem      n    SBM   M1FC1    BTC     PB   NFDA
                   NI      NI     NI     NI     NI
CB2                31      11     13     15     13
CB3                14      12     13     15     24
DEM                17      10     09     07     22
QL                 13      12     12     17     27
LQ                 11      16     10     14     15
Mifflin1           66     143     49     22     21
Rosen              43      22     22     40     49
Shor               27      21     29     26     51
Maxquad     10     74      29     45     41    115
Maxq        20    150     144    125    158    272
Maxl        20     39     138     74     34     70
TR48        48    245     163    165    152    317
Goffin      50     52      72     51     51     79
$$r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s} : s \in S\}},$$
Problem      n    SBM   M1FC1    BTC     PB   NFDA
                   NF      NF     NF     NF     NF
CB2                33      31     16     16     14
CB3                16      44     21     16     25
DEM                19      33     13     08     22
QL                 15      30     17     18     28
LQ                 12      52     11     15     16
Mifflin1           68     281     74     23     22
Rosen              45      61     32     41     50
Shor               29      71     30     27     52
Maxquad     10     75      69     56     42    116
Maxq        20    151     207    128    159    273
Maxl        20     40     213     84     35     71
TR48        48    251     284    179    153    318
Goffin      50     53      94     53     52     80
Problem
SBM
M1FC1
BTC
PB
NFDA
1.95222e+0
1.95225e+0
1.95222e+0
1.95222e+0
CB2
1.95220e+0
1.95222e+0
CB3
2.00000e+0
2.00141e+0
2.00000e+0
2.00000e+0
2.00010e+0
2.00000e+0
DEM
3.00000e+0
3.00000e+0
3.00000e+0
3.00000e+0
2.98630e+0
3.00000e+0
QL
7.20000e+0
7.20001e+0
7.20000e+0
7.20001e+0
7.20000e+0
7.20000e+0
LQ
1.41421e+0
1.41421e+0
1.41421e+0
1.41421e+0
1.41390e+0
1.41421e+0
Mifflin1
9.99990e1
9.99960e1
1.00000e+0
1.00000e+0
9.99800e1
1.00000e+0
Rosen
4.399999e+1
4.399998e+1
4.399998e+1
4.399994e+1
4.399990e+1
4.400000e+1
Shor
2.260016e+1
2.260018e+1
2.260016e+1
2.260016e+1
2.260020e+1
8.4140e1
8.4140e1
0.16712e6
0.00000e+0
0.00000e+0
0.00000e+0
3.17800e7
0.00000e+0
0.12440e12
0.00000e+0
0.00000e+0
0.00000e+0
2.83150e4
0.00000e+0
48
6.3853048e+5
6.3362550e+5
6.3856500e+5
6.3856000e+5
6.3856499e+5
6.3856500e+5
50
0.11665e11
0.00000e+0
0.00000e+0
0.00000e+0
1.33720e4
0.00000e+0
10
Maxq
20
Maxl
20
TR48
Goffin
8.4140e1
$$\rho_s(\tau) = \frac{1}{n_p}\,\bigl|\{p \in P : r_{p,s} \le \tau\}\bigr|,$$
This section presents an application of the proposed algorithm in the topology design of robust trusses. A largely
employed model for truss topology optimization consid-
[Figure: plot comparing the solvers M1FC1, BTC, PB, SBM and NFDA]
8.4140e1
8.4140e1
where $|\{\cdot\}|$ is the number of elements in the set $\{\cdot\}$. The value
of $\rho_s(1)$ indicates the probability that method $s$ is the best method
(among all those belonging to the set $S$), using $t_{p,s}$ as the performance measure.
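The performance ratio and the resulting profile can be computed directly from a table of measurements. The sketch below is illustrative only: the nested-dictionary layout, the function names and the sample numbers (solvers "A" and "B" on problems "p1" to "p3") are made up for the example and are not the paper's data.

```python
from typing import Dict

# times[p][s]: performance measure t_{p,s} of solver s on problem p (hypothetical layout).
Table = Dict[str, Dict[str, float]]


def performance_ratios(times: Table) -> Table:
    """r_{p,s} = t_{p,s} / min over solvers of t_{p,s}, for every problem p."""
    ratios: Table = {}
    for problem, row in times.items():
        best = min(row.values())
        ratios[problem] = {solver: t / best for solver, t in row.items()}
    return ratios


def profile(ratios: Table, solver: str, tau: float) -> float:
    """Fraction of problems on which `solver` is within a factor tau of the best."""
    n_p = len(ratios)
    return sum(1 for row in ratios.values() if row[solver] <= tau) / n_p


if __name__ == "__main__":
    # Made-up measurements, only to exercise the functions.
    times = {
        "p1": {"A": 10.0, "B": 20.0},
        "p2": {"A": 30.0, "B": 15.0},
        "p3": {"A": 5.0, "B": 5.0},
    }
    r = performance_ratios(times)
    for tau in (1.0, 2.0):
        print(tau, profile(r, "A", tau), profile(r, "B", tau))
```

Evaluating the profile at $\tau = 1$ counts the problems on which a solver ties for best, which is why ties make the values at $\tau = 1$ sum to more than one across solvers.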
2.260016e+1
8.4135e1
Maxquad
[Figure: second plot comparing the solvers M1FC1, BTC, PB, SBM and NFDA]
$$K(x) = \sum_{j=1}^{b} x_j K_j, \tag{24}$$
$K_j \in \mathbb{R}^{m \times m}$, $j = 1, 2, ..., b$, are the reduced stiffness matrices
corresponding to bars of unitary volume. To obtain a well-posed problem, the matrix
$\sum_{j=1}^{b} K_j$ must be positive definite (Ben-Tal and Nemirovski 1997). The compliance
related to the loading condition $p_i \in P$ can be defined as (Bendsøe 1995):
$$\phi(x, p_i) = \sup_u \left\{ 2u^T p_i - u^T K(x)\, u,\ u \in \mathbb{R}^m \right\}. \tag{25}$$
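As a worked check of the compliance in Eq. 25, the supremum can be evaluated in closed form under the assumption that $K(x)$ is symmetric and positive definite, so that the quadratic is strictly concave in $u$. The symbol $u^*$ is introduced here only for this illustration.

```latex
\begin{align*}
\nabla_u \left( 2u^T p_i - u^T K(x)\, u \right) &= 2p_i - 2K(x)\, u = 0
  \;\Longrightarrow\; u^* = K(x)^{-1} p_i, \\
\sup_u \left\{ 2u^T p_i - u^T K(x)\, u \right\}
  &= 2\, p_i^T K(x)^{-1} p_i - p_i^T K(x)^{-1} p_i
   = p_i^T K(x)^{-1} p_i .
\end{align*}
```

Under this assumption the maximizer $u^*$ solves the equilibrium equation $K(x)\,u = p_i$, and the compliance equals $p_i^T u^*$.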
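Let $\phi(x) = \sup_{p_i} \{ \phi(x, p_i),\ p_i \in P \}$.
$$\min_{x \in \mathbb{R}^b} \phi(x) \quad \text{s.t.} \quad \sum_{j=1}^{b} x_j \le V, \quad x_j \ge 0,\ j = 1, ..., b. \tag{26}$$
The value $V > 0$ is the maximum quantity of material to distribute in the truss.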
Instead of maximizing on the finite domain P, we consider a model proposed by Ben-Tal and Nemirovski (1997)
(28)
$$\phi(x) = \sup_p \{ \phi(x, p),\ p \in M \}.$$
(x)
(29)
and
$$A(\tau, x) = \begin{bmatrix} \tau I_q & Q^T \\ Q & K(x) \end{bmatrix} \succeq 0, \tag{30}$$
where $\tau \in \mathbb{R}$ and $A \succeq 0$ means that $A$ is positive semidefinite. Since the
epigraph of $\phi$ coincides with $\{(\tau, x) \mid A(\tau, x) \succeq 0\}$, and this last set
is convex (Vandenberghe and Boyd 1996), $\phi$ is a convex function. Then, for robust
optimization we can employ the model proposed by Ben-Tal and Nemirovski (1997), and
solve the problem of Eq. 26 using the present nonsmooth optimization algorithm. Note that
this problem has linear inequality constraints. To solve the optimization problem
we must include all these constraints in the initial set of linear inequality constraints
of the auxiliary Problem AP. In addition, the initial point $x^0$ must be interior to the
feasible region defined by the linear inequality constraints.
It remains to show how to compute the function $\phi$ at an interior point $x$.
(31)
$$\min_{\tau} \tau \quad \text{s.t.} \quad \tau K(x) - Q Q^T \succeq 0. \tag{32}$$
0
0
0
2
p1 =
0 ,
0
0
0
0
0
0
A=
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
.
0
0
1
1
0
1 2 3
[v , v , v ] =
0
0
0
0
1
0
0
0
.
0
1
0
0
2
1
1
2
3
Q = [p ,r f ,r f ,r f ] =
0
0
0
0
0
0.3
0
0
0
0
0
0
0
0
0
0
0
0
0
.
0
0
0
0
0.3 0
0 0.3
[Figure: eigenvalues versus iteration (vertical axis: Eigenvalues, horizontal axis: Iteration)]
x=
(33)
$
%T
pi = 1/ N (1+ 2 ) sin(2i/N ),cos(2i/N ), ,
i {N + 1, . . . , 2N },
6.3 Example 3
(34)
[Figure: eigenvalues versus iteration (vertical axis: Eigenvalues, horizontal axis: Iteration)]
Example 2
volume
bar
volume
bar
volume
bar
volume
53
2.478e-1
53
2.448e-1
16
1.247e-1
17
1.000e-1
64
1.276e-1
31
1.195e-1
18
1.246e-1
110
9.926e-2
42
1.251e-1
64
2.448e-1
25
1.246e-1
26
9.947e-2
54
3.715e-3
42
1.195e-1
27
1.247e-1
28
1.003e-1
63
2.478e-1
54
1.265e-2
36
1.246e-1
37
9.922e-2
32
2.478e-1
63
1.265e-2
38
1.247e-1
39
1.002e-1
32
2.368e-1
45
1.247e-1
48
9.943e-2
41
9.195e-3
47
1.246e-1
410
1.001e-1
56
4.842e-4
56
1.003e-1
57
4.343e-4
59
9.935e-2
58
4.847e-4
67
2.573e-4
67
4.847e-4
68
2.169e-4
68
4.343e-4
69
2.366e-4
78
4.842e-4
610
2.530e-4
78
2.548e-4
79
2.345e-4
300
Eigenvalues
200
710
2.162e-4
89
2.728e-4
810
2.137e-4
910
2.669e-4
5
6
150
100
50
Example 4
bar
250
Example 3
10
15
20
25
Iteration
NI
NF
Example 1
10
42
72
Example 2
10
37
161
278.40
Example 3
22
25
35
110.56
Example 4
35
35
112
135.27
258.24
[Figure: eigenvalues versus iteration (vertical axis: Eigenvalues, horizontal axis: Iteration)]
6.4 Example 4
This example is similar to the previous one. The nodal
coordinates and nodal forces are given by Eqs. 33 and 34,
respectively, but with N = 5 and = 0.01. The secondary
loadings have a magnitude $r = 0.3$ and define a basis of the
orthogonal complement of $L(P)$ in the linear space $F$ of all
the degrees of freedom of the structure. The optimal truss
of this example and the evolution of the six highest eigenvalues
of the system $(Q Q^T, K(x))$ are shown in Figs. 11 and 10.
Tables 4 and 5 describe the numerical results obtained
using the present algorithm.
References