CS407 Neural Computation: Neural Networks For Constrained Optimization. Lecturer: A/Prof. M. Bennamoun
Lecture 8:
Neural Networks for Constrained
Optimization.
Fausett
Introduction
There are nets that are designed for constrained
optimization problems (such as the Traveling
Salesman Problem, TSP).
These nets have fixed weights that incorporate
information concerning the constraints and the
quantity to be optimized.
The nets iterate to find a pattern of output signals
that represents a solution to the problem.
Examples of such nets are the Boltzmann machine
(without learning), the continuous Hopfield net,
and several variations (Gaussian and Cauchy
nets).
Other optimization problems to which this type of
NN can be applied include job shop scheduling and
space allocation, among others.
Traveling Salesman Problem (TSP)
City 1 2 3 4 5 6 7 8 9 10
A UA,1 UA,2 UA,3 UA,4 UA,5 UA,6 UA,7 UA,8 UA,9 UA,10
B UB,1 UB,2 UB,3 UB,4 UB,5 UB,6 UB,7 UB,8 UB,9 UB,10
C UC,1 UC,2 UC,3 UC,4 UC,5 UC,6 UC,7 UC,8 UC,9 UC,10
D UD,1 UD,2 UD,3 UD,4 UD,5 UD,6 UD,7 UD,8 UD,9 UD,10
E UE,1 UE,2 UE,3 UE,4 UE,5 UE,6 UE,7 UE,8 UE,9 UE,10
F UF,1 UF,2 UF,3 UF,4 UF,5 UF,6 UF,7 UF,8 UF,9 UF,10
G UG,1 UG,2 UG,3 UG,4 UG,5 UG,6 UG,7 UG,8 UG,9 UG,10
H UH,1 UH,2 UH,3 UH,4 UH,5 UH,6 UH,7 UH,8 UH,9 UH,10
I UI,1 UI,2 UI,3 UI,4 UI,5 UI,6 UI,7 UI,8 UI,9 UI,10
J UJ,1 UJ,2 UJ,3 UJ,4 UJ,5 UJ,6 UJ,7 UJ,8 UJ,9 UJ,10
Boltzmann machine
The states of the units of a Boltzmann machine are
binary valued, with probabilistic state transitions.
The configuration of the net is the vector of the states of the units.
The Boltzmann machine described in this lecture has fixed
weights wij, which express the degree of desirability that units
Xi and Xj both be “on”.
In applying Boltzmann machine to constrained optimization
problems, the weights represent the constraints of the problem
and the quantity to be optimized. Note that the description
presented here is based on the maximization of a consensus
function (rather than the minimization of a cost function).
The architecture of a Boltzmann machine is quite general,
consisting of
– a set of units (Xi and Xj are 2 representative units)
– a set of bi-directional connections between pairs of units.
If units Xi and Xj are connected, wij != 0.
The bi-directional nature of the connection is often represented
as wij = wji
A unit may also have a self-connection wii (or equivalently, there
may be a bias unit, which is always “on” and connected to
every other unit; in this interpretation, the self-connection
weight would be replaced by the bias weight).
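The consensus function of the Boltzmann machine is
\[ C = \sum_{i} \sum_{j \le i} w_{ij} x_i x_j \]
where x_i = 1 if unit X_i is “on” and x_i = 0 if unit X_i is “off”.
Expanding the inner sum for the first five units:
\[
\begin{aligned}
i=1:\quad & \sum_{j \le 1} w_{1j} x_1 x_j = w_{11} x_1 x_1 \\
i=2:\quad & \sum_{j \le 2} w_{2j} x_2 x_j = w_{21} x_2 x_1 + w_{22} x_2 x_2 \\
i=3:\quad & \sum_{j \le 3} w_{3j} x_3 x_j = w_{31} x_3 x_1 + w_{32} x_3 x_2 + w_{33} x_3 x_3 \\
i=4:\quad & \sum_{j \le 4} w_{4j} x_4 x_j = w_{41} x_4 x_1 + w_{42} x_4 x_2 + w_{43} x_4 x_3 + w_{44} x_4 x_4 \\
i=5:\quad & \sum_{j \le 5} w_{5j} x_5 x_j = w_{51} x_5 x_1 + w_{52} x_5 x_2 + w_{53} x_5 x_3 + w_{54} x_5 x_4 + w_{55} x_5 x_5
\end{aligned}
\]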
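The double sum over j ≤ i can be checked numerically. The following is a minimal sketch, assuming a small 3-unit net with made-up weights (self-connections of 2.0 and mutual weights of -3.0); the function name `consensus` and the weight values are illustrative and do not come from the lecture.

```python
def consensus(w, x):
    """Consensus C = sum over i, and over j <= i, of w[i][j] * x[i] * x[j].

    w is a symmetric weight matrix (list of lists), x a list of 0/1 states.
    Each pair of units is counted once; j == i gives the self-connection.
    """
    n = len(x)
    return sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1))


# Hypothetical 3-unit net: self-connection 2.0, mutual weights -3.0.
w = [[2.0, -3.0, -3.0],
     [-3.0, 2.0, -3.0],
     [-3.0, -3.0, 2.0]]

one_on = consensus(w, [1, 0, 0])   # only w[0][0] contributes
two_on = consensus(w, [1, 1, 0])   # w[0][0] + w[1][0] + w[1][1]
print(one_on, two_on)
```

With these illustrative weights, a configuration with a single unit "on" has a higher consensus than one with two units "on", because the negative mutual weight outweighs the second self-connection.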
The change in consensus if unit Xi were to change its state (from
1 to 0 or from 0 to 1) is
\[ \Delta C(i) = [1 - 2x_i] \left[ w_{ii} + \sum_{j \ne i} w_{ij} x_j \right] \]
where xi is the current state of unit Xi.
The sum is the contribution from all units Xj that are “on” and connected to Xi through wij.
The coefficient [1 − 2xi] is +1 if unit Xi is currently “off” (xi = 0)
and –1 if unit Xi is currently “on” (xi = 1).
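The expression for the change in consensus can be verified against a direct recomputation of the consensus before and after a flip. This is a sketch under the assumptions that the weights are symmetric and that the coefficient [1 − 2xi] multiplies both the self-connection and the sum; the helper names and the 3-unit weights are hypothetical.

```python
def consensus(w, x):
    """Consensus with each pair of units counted once (j <= i)."""
    n = len(x)
    return sum(w[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1))


def delta_consensus(w, x, i):
    """Change in consensus if unit i flips its state.

    (1 - 2*x[i]) is +1 for a unit that is off and -1 for a unit that is on.
    """
    net = w[i][i] + sum(w[i][j] * x[j] for j in range(len(x)) if j != i)
    return (1 - 2 * x[i]) * net


# Hypothetical symmetric weights for a 3-unit net.
w = [[2.0, -3.0, -3.0],
     [-3.0, 2.0, -3.0],
     [-3.0, -3.0, 2.0]]
x = [1, 0, 1]

for i in range(len(x)):
    flipped = list(x)
    flipped[i] = 1 - flipped[i]
    direct = consensus(w, flipped) - consensus(w, x)
    print(i, delta_consensus(w, x, i), direct)
```

Computing the change locally needs only the row of weights for unit i, which is why the update can be done without re-evaluating the whole consensus function.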
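However, unit Xi does not necessarily change its state, even if
doing so would increase the consensus of the net.
The probability that the net accepts a change in the state of unit Xi is
\[ A(i, T) = \frac{1}{1 + \exp\left(-\Delta C(i) / T\right)} \]
where T is the control parameter (temperature).
Lower values of T make it more likely that the net will accept a
change of state that increases its consensus and less likely that it
will accept a change that reduces its consensus.
For example, assuming ∆C > 0:
\[ T \to 0 \;\Rightarrow\; \exp\left(-\frac{\Delta C}{T}\right) \to 0 \;\Rightarrow\; A(i, T) \to 1 \]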
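The effect of T on the acceptance probability can be illustrated numerically. The sketch below assumes the logistic acceptance rule A = 1 / (1 + exp(−∆C / T)), which is consistent with the limit stated above; the function name and the overflow guard are illustrative choices.

```python
import math


def acceptance_probability(delta_c, temperature):
    """Assumed acceptance rule A = 1 / (1 + exp(-delta_c / T))."""
    z = -delta_c / temperature
    if z > 700.0:
        # exp(z) would overflow a float; the probability is effectively 0.
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


for t in (100.0, 1.0, 0.01):
    up = acceptance_probability(+1.0, t)    # change that raises consensus
    down = acceptance_probability(-1.0, t)  # change that lowers consensus
    print(t, up, down)
```

At high T both kinds of change are accepted with probability close to 0.5; as T falls, improving changes are accepted almost always and worsening changes almost never.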
The use of a probabilistic update procedure for the activations,
with the control parameter decreasing as the net searches for the
optimal solution to the problem represented by its weights,
reduces the chances of the net getting stuck in a local maximum.
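A probabilistic update with a decreasing control parameter can be sketched as a short loop. This is an illustrative sketch, not the lecture's algorithm: the logistic acceptance rule, the geometric cooling schedule, the number of trials per temperature, and the 3-unit weights are all assumptions.

```python
import math
import random


def delta_consensus(w, x, i):
    """Change in consensus if unit i flips (w is assumed symmetric)."""
    net = w[i][i] + sum(w[i][j] * x[j] for j in range(len(x)) if j != i)
    return (1 - 2 * x[i]) * net


def accept_probability(delta_c, t):
    """Assumed logistic acceptance rule 1 / (1 + exp(-delta_c / t))."""
    z = -delta_c / t
    if z > 700.0:  # exp would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def anneal(w, t_start=10.0, t_min=0.01, cooling=0.9, seed=0):
    """Run probabilistic updates while lowering the temperature geometrically."""
    rng = random.Random(seed)
    n = len(w)
    x = [rng.randint(0, 1) for _ in range(n)]  # random initial configuration
    t = t_start
    while t > t_min:
        for _ in range(10 * n):                # several trials per temperature
            i = rng.randrange(n)               # pick a unit at random
            if rng.random() < accept_probability(delta_consensus(w, x, i), t):
                x[i] = 1 - x[i]                # accept the change of state
        t *= cooling                           # hypothetical cooling schedule
    return x


# Hypothetical net whose consensus is maximal when exactly one unit is on.
w = [[2.0, -3.0, -3.0],
     [-3.0, 2.0, -3.0],
     [-3.0, -3.0, 2.0]]
final_state = anneal(w)
print(final_state)
```

Early on, at high temperature, the net accepts changes in both directions and can leave poor configurations; by the end only consensus-increasing changes are accepted in practice.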
Neural Nets for Constrained Optimization.
Introduction
Boltzmann machine
– Introduction
– Architecture and Algorithm
Boltzmann machine: application to the TSP
Continuous Hopfield nets
Continuous Hopfield nets: application to the TSP
References and suggested reading
Each unit is connected to every other unit in the same row with
weight –p (p > 0)
Similarly, each unit is connected to every other unit in the same
column with weight –p.
The weights are penalties for violating the condition that at most one
unit be “on” in each row and each column.
In addition, each unit has a self-connection, of weight b>0.
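The row and column penalties and the self-connections described above can be written out as a weight table. The sketch below assumes an n × n grid of units indexed by (row, column) pairs and counts each connection once in the consensus; the function names and the values b = 2, p = 3 are illustrative.

```python
def constraint_weights(n, b, p):
    """Weights for an n x n grid of units indexed by (row, col).

    Self-connection: b. Distinct units sharing a row or a column: -p.
    All other pairs: 0.
    """
    units = [(r, c) for r in range(n) for c in range(n)]
    w = {}
    for u in units:
        for v in units:
            if u == v:
                w[u, v] = b
            elif u[0] == v[0] or u[1] == v[1]:
                w[u, v] = -p
            else:
                w[u, v] = 0.0
    return units, w


def consensus(units, w, on):
    """Consensus of the configuration in which exactly the units in `on` are on."""
    total = 0.0
    for a, u in enumerate(units):
        if u not in on:
            continue
        for v in units[: a + 1]:      # each unordered pair once, plus self
            if v in on:
                total += w[u, v]
    return total


units, w = constraint_weights(3, b=2.0, p=3.0)
one_per_row_and_col = {(0, 0), (1, 1), (2, 2)}
two_in_same_row = {(0, 0), (0, 1)}
print(consensus(units, w, one_per_row_and_col))
print(consensus(units, w, two_in_same_row))
```

A configuration with exactly one unit on in each row and column collects n bonuses of b and no penalties, while two active units in one row lose p.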
The tour positions wrap around: j = n + 1 is identified with j = 1, and j = 0 with j = n.
Ui,j is the unit representing the hypothesis that the ith city is visited at the jth
step of the tour.
ui,j is the activation of unit Ui,j:
– ui,j = 1 if the hypothesis is true,
– ui,j = 0 if the hypothesis is false.
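The correspondence between a tour and the activations ui,j can be made concrete. This is a minimal sketch using 0-based city and step indices (so the wraparound becomes arithmetic modulo n); the function names are hypothetical.

```python
def tour_to_activations(tour):
    """tour[j] is the city visited at step j; returns u with u[i][j] in {0, 1}."""
    n = len(tour)
    u = [[0] * n for _ in range(n)]
    for j, city in enumerate(tour):
        u[city][j] = 1            # hypothesis "city i is visited at step j" is true
    return u


def is_valid_tour(u):
    """True if exactly one unit is on in every row and every column."""
    n = len(u)
    rows_ok = all(sum(row) == 1 for row in u)
    cols_ok = all(sum(u[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


def neighbour_steps(j, n):
    """Previous and next step of the tour, wrapping around at both ends."""
    return (j - 1) % n, (j + 1) % n


u = tour_to_activations([2, 0, 1])
print(u)
print(is_valid_tour(u))
print(neighbour_steps(0, 3))
```

Each row of `u` corresponds to a city and each column to a step, matching the layout of the units in the table of the 10-city problem.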
Ui,j is connected to all other units in row i with penalty weight –p;
this represents the constraint that the same city is not to be
visited twice.
Ui,j is connected to all other units in column j with penalty weight –p;
this represents the constraint that 2 cities cannot be visited
simultaneously.
To take the tour length into account, a typical unit Ui,j is connected to the units Uk,j-1 and
Uk,j+1 (for all k ≠ i) by weights that represent the distances between
city i and city k.
[Figure: architecture of the Boltzmann NN for the TSP: an n × n array of units U1,1 … Un,n; each unit has a self-connection of weight b and is connected to the other units in its row and in its column with weight –p.]
[Figure: Boltzmann NN for the TSP; weights represent the distances for unit Ui,j. The units Uk,j-1 and Uk,j+1 are connected to Ui,j with weight –dk,i, and the units Un,j-1 and Un,j+1 with weight –dn,i.]
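The distance connections for a unit Ui,j can be sketched as a small function. The sketch assumes 0-based indices, a symmetric distance matrix `dist`, and a weight of minus the distance between the two cities for units in adjacent steps; the function name and the 4-city distances are made up for illustration.

```python
def distance_weight(i, j, k, l, dist, n):
    """Distance weight between unit U[i][j] and unit U[k][l].

    Non-zero only for different cities (i != k) at adjacent tour steps,
    where step adjacency wraps around modulo n.
    """
    if i == k:
        return 0.0
    adjacent = l == (j + 1) % n or l == (j - 1) % n
    return -float(dist[i][k]) if adjacent else 0.0


# Hypothetical symmetric distances between 4 cities.
dist = [[0, 1, 2, 3],
        [1, 0, 4, 5],
        [2, 4, 0, 6],
        [3, 5, 6, 0]]
n = 4

print(distance_weight(0, 0, 1, 1, dist, n))  # next step: -d(0, 1)
print(distance_weight(0, 0, 2, 3, dist, n))  # previous step via wraparound: -d(0, 2)
print(distance_weight(0, 0, 1, 2, dist, n))  # steps not adjacent: no connection
```

These weights are added on top of the self-connections b and the row and column penalties –p.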
We now consider the relation between the constraint weight b and the
distance weights.
Let d denote the maximum distance between any 2 cities in the tour.
Assume that no city is visited in the jth position of the tour and that
no city is visited twice.
In this case, some city, say i is not visited at all; i.e. no unit is “on” in
column j or in row i.
Since allowing Ui,j to turn on should be encouraged, the weights
should be set so that the consensus will be increased if it turns on.
The change in consensus will be
\[ b - d_{i,k_1} - d_{i,k_2} \]
where
– k1 denotes the city visited at stage j-1 of the tour
– k2 denotes the city visited at stage j+1 (and city i is visited at
stage j).
This change is at least b − 2d, so it is positive even for the
maximum distance d between cities, provided that b > 2d.
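Conversely, if Ui,j turns on while another unit in row i or column j is
already “on”, the change in consensus is at most b − p, which is negative if p > b.
Thus, if p > b, the consensus function has a higher value
for a feasible solution (one that satisfies the constraints) than for a
non-feasible solution.
If b > 2d, the consensus will be higher for a short feasible solution
than for a longer tour.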
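The conditions p > b > 2d can be turned into a small parameter-selection sketch. It assumes that each connection is counted once in the consensus, so that a feasible tour of n ≥ 3 cities has consensus n·b minus the tour length; the function names, the `margin` parameter, and the 4-city distances are illustrative.

```python
def choose_weights(dist, margin=1.0):
    """Pick b and p so that p > b > 2d, with d the largest inter-city distance."""
    d = max(max(row) for row in dist)
    b = 2 * d + margin
    p = b + margin
    return b, p


def tour_length(tour, dist):
    """Length of the closed tour, returning to the first city at the end."""
    n = len(tour)
    return sum(dist[tour[j]][tour[(j + 1) % n]] for j in range(n))


def tour_consensus(tour, dist, b):
    """Consensus of a feasible tour: n self-connections minus the distances.

    No -p penalties are active because one unit is on per row and column.
    """
    return len(tour) * b - tour_length(tour, dist)


# Hypothetical distances: 4 cities on a ring, neighbours at distance 1.
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]
b, p = choose_weights(dist)
short = tour_consensus([0, 1, 2, 3], dist, b)
long_ = tour_consensus([0, 2, 1, 3], dist, b)
print(b, p, short, long_)
```

Under these assumptions the shorter tour has the higher consensus, and each unit that turns on in an empty row and column raises the consensus by at least b − 2d.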
Equilibrium:
– The net is in thermal equilibrium (at a particular
temperature) when the probabilities Pα and Pβ of 2 configurations
of the net, α and β, obey the Boltzmann distribution
\[ \frac{P_\alpha}{P_\beta} = \exp\left(\frac{E_\beta - E_\alpha}{T}\right) \]
where Eα is the energy of configuration α and Eβ is the energy of configuration β.
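The Boltzmann distribution above can be evaluated for a few temperatures to see how T controls the preference for low-energy configurations. This is a minimal sketch; the function name and the energy values are illustrative.

```python
import math


def probability_ratio(e_alpha, e_beta, t):
    """P_alpha / P_beta = exp((E_beta - E_alpha) / T) at thermal equilibrium."""
    return math.exp((e_beta - e_alpha) / t)


# Configuration alpha has lower energy than beta by one unit.
for t in (10.0, 1.0, 0.1):
    print(t, probability_ratio(0.0, 1.0, t))
```

At high T the two configurations are almost equally likely; at low T the lower-energy configuration dominates.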
The energy gap between the configuration with unit Xk “off” and that with
unit Xk “on” is
\[ \Delta E(k) = \sum_{i} w_{ik} x_i \]
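For n = 2 (with i ≠ j):
\[ E = 0.5 \sum_{i=1}^{2} \left( w_{i1} v_i v_1 + w_{i2} v_i v_2 \right) + (\theta_1 v_1 + \theta_2 v_2) \]
\[ E = 0.5 \left( w_{12} v_1 v_2 + w_{21} v_2 v_1 \right) + (\theta_1 v_1 + \theta_2 v_2) \]
[Figure: two units U1 and U2 with biases θ1 and θ2, connected by the weights w12 and w21.]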
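By the chain rule,
\[ \frac{dE}{dt} = \sum_{i} \frac{\partial E}{\partial v_i} \cdot \frac{dv_i}{du_i} \cdot \frac{du_i}{dt} \]
with
\[ \frac{\partial E}{\partial v_i} = \sum_{j \ne i} w_{ij} v_j + \theta_i = net_i, \qquad \frac{dv_i}{du_i} = g'(u_i) > 0, \qquad \frac{du_i}{dt} = -net_i = -\frac{\partial E}{\partial v_i} \]
Hence
\[ \frac{dE}{dt} = -\sum_{i} g'(u_i) (net_i)^2 \le 0 \]
so E is a Lyapunov energy function.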
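The Lyapunov property can be observed numerically by integrating du_i/dt = −net_i and recording E at each step. The sketch below makes several assumptions not stated in the slides: a logistic sigmoid for g, forward-Euler integration with a small step, symmetric weights with no self-connections, and made-up values for the weights, biases, and initial state.

```python
import math


def g(u):
    """Assumed output function: logistic sigmoid, so that g'(u) > 0."""
    return 1.0 / (1.0 + math.exp(-u))


def energy(w, theta, v):
    """E = 0.5 * sum_{i != j} w_ij v_i v_j + sum_i theta_i v_i."""
    n = len(v)
    pairs = 0.5 * sum(w[i][j] * v[i] * v[j]
                      for i in range(n) for j in range(n) if i != j)
    return pairs + sum(theta[i] * v[i] for i in range(n))


def net_input(w, theta, v, i):
    """net_i = sum_{j != i} w_ij v_j + theta_i, the partial derivative of E."""
    return sum(w[i][j] * v[j] for j in range(len(v)) if j != i) + theta[i]


def simulate(w, theta, u, dt=0.01, steps=500):
    """Forward-Euler integration of du_i/dt = -net_i; returns the energies."""
    energies = []
    for _ in range(steps):
        v = [g(ui) for ui in u]
        energies.append(energy(w, theta, v))
        nets = [net_input(w, theta, v, i) for i in range(len(u))]
        u = [ui - dt * ni for ui, ni in zip(u, nets)]
    return energies


# Hypothetical symmetric weights, biases, and initial internal activities.
w = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 0.5],
     [-2.0, 0.5, 0.0]]
theta = [0.3, -0.2, 0.1]
energies = simulate(w, theta, u=[0.5, -1.0, 0.2])
print(energies[0], energies[-1])
```

With a sufficiently small step the recorded energies never increase, which matches the sign of dE/dt derived above; a large step would not carry that guarantee.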
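\[ \frac{du_i}{dt} = -\frac{\partial E}{\partial v_i} = -\sum_{j=1}^{n} w_{ij} v_j - \theta_i \]
The energy function with the time constant τ (note that the weight and bias terms carry the opposite sign in this form):
\[ E = -0.5 \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} v_i v_j - \sum_{i=1}^{n} \theta_i v_i + \frac{1}{\tau} \sum_{i=1}^{n} \int_{0}^{v_i} g_i^{-1}(v)\, dv \]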
[Figure: flow diagram: Problem → Formulation by Hopfield Energy → Energy Minimization by Hopfield → Solution State.]
The units used to solve the 10-city TSP are arranged as shown
Position
City 1 2 3 4 5 6 7 8 9 10
A UA,1 UA,2 UA,3 UA,4 UA,5 UA,6 UA,7 UA,8 UA,9 UA,10
B UB,1 UB,2 UB,3 UB,4 UB,5 UB,6 UB,7 UB,8 UB,9 UB,10
C UC,1 UC,2 UC,3 UC,4 UC,5 UC,6 UC,7 UC,8 UC,9 UC,10
D UD,1 UD,2 UD,3 UD,4 UD,5 UD,6 UD,7 UD,8 UD,9 UD,10
E UE,1 UE,2 UE,3 UE,4 UE,5 UE,6 UE,7 UE,8 UE,9 UE,10
F UF,1 UF,2 UF,3 UF,4 UF,5 UF,6 UF,7 UF,8 UF,9 UF,10
G UG,1 UG,2 UG,3 UG,4 UG,5 UG,6 UG,7 UG,8 UG,9 UG,10
H UH,1 UH,2 UH,3 UH,4 UH,5 UH,6 UH,7 UH,8 UH,9 UH,10
I UI,1 UI,2 UI,3 UI,4 UI,5 UI,6 UI,7 UI,8 UI,9 UI,10
J UJ,1 UJ,2 UJ,3 UJ,4 UJ,5 UJ,6 UJ,7 UJ,8 UJ,9 UJ,10
The connection weights are fixed and are usually not shown or even
explicitly stated.
The weights for connections between units in the same row correspond to the parameter A in
the energy equation;
– There is a contribution to the energy if 2 units in the same row are
“on”.
Similarly, the connections between units in the same column have weights B.
The distance connections appear in the fourth term of the energy
equation.
More explicitly, the weights between units Ux,i and Uy,j are
\[ w(x, i : y, j) = -A \delta_{xy} (1 - \delta_{ij}) - B \delta_{ij} (1 - \delta_{xy}) - C - D d_{xy} (\delta_{i,j+1} + \delta_{i,j-1}) \]
where δij is the Kronecker delta:
\[ \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \]
In addition, each unit receives an external input signal
\[ I_{xi} = +CN \]
The parameter N is usually taken to be
somewhat larger than the number of cities n.
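The weight expression can be evaluated case by case. The sketch below assumes 0-based indices with wraparound of the tour positions, a global term that enters with a negative sign (−C, the form consistent with the external input +CN), and the parameter values A = B = 500, C = 200, D = 500, N = 15; the function names and the 4-city distances are hypothetical.

```python
def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def hopfield_tank_weight(x, i, y, j, dist, n, A, B, C, D):
    """Weight between unit (city x, position i) and unit (city y, position j)."""
    adjacent = delta(i, (j + 1) % n) + delta(i, (j - 1) % n)
    return (-A * delta(x, y) * (1 - delta(i, j))      # same city, two positions
            - B * delta(i, j) * (1 - delta(x, y))     # same position, two cities
            - C                                       # global term
            - D * dist[x][y] * adjacent)              # distance term


def external_input(C, N):
    """External input signal received by every unit."""
    return C * N


# Hypothetical symmetric distances between 4 cities.
dist = [[0.0, 0.5, 0.8, 0.6],
        [0.5, 0.0, 0.3, 0.9],
        [0.8, 0.3, 0.0, 0.4],
        [0.6, 0.9, 0.4, 0.0]]
params = dict(dist=dist, n=4, A=500, B=500, C=200, D=500)

print(hopfield_tank_weight(0, 0, 0, 1, **params))  # same city, positions 0 and 1
print(hopfield_tank_weight(0, 0, 1, 0, **params))  # cities 0 and 1, same position
print(hopfield_tank_weight(0, 0, 1, 1, **params))  # adjacent positions
print(hopfield_tank_weight(0, 0, 1, 2, **params))  # non-adjacent positions
print(external_input(200, 15))
```

Only one of the A, B, and D terms can be active for a given pair of distinct units, so each weight is the global term plus at most one penalty.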
Hopfield and Tank used the following parameter values in their solution
of the problem:
A = B = 500 , C = 200, D = 500, N = 15, α = 50
The large value of α gives a very steep sigmoid function, which
approximates a step function.
The large coefficients and a correspondingly small ∆t result in very little
contribution from the decay term, u_{x,i}(old) ∆t.
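The effect of a large α on the steepness of the sigmoid can be seen numerically. The slides do not state the form of the output function here, so the sketch assumes v = 0.5 (1 + tanh(α u)), the form commonly associated with Hopfield and Tank's formulation; the function name is illustrative.

```python
import math


def activation(u, alpha=50.0):
    """Assumed sigmoid output function v = 0.5 * (1 + tanh(alpha * u))."""
    return 0.5 * (1.0 + math.tanh(alpha * u))


for u in (-0.1, -0.01, 0.0, 0.01, 0.1):
    print(u, activation(u), activation(u, alpha=1.0))
```

With α = 50 the output is already close to 0 or 1 for |u| around 0.1, whereas with α = 1 it is still near 0.5, which is the sense in which the steep sigmoid approximates a step function.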
Suggested Reading.
L. Fausett, “Fundamentals of Neural Networks”, Prentice-Hall, 1994, Chapter 7.