20 1 7 Rubinstein
1. Introduction
2. Methodology: Partially Connected TSP
3. A Motivating Example
4. Trajectory Generation
5. The Rare Events Framework
6. Calculating the Permanent, etc
7. The Minimum Cross-Entropy (MCE) Method
8. Some Theory on CE and MCE
9. Conclusion
[Figure: panels for Iteration 1, Iteration 2, Iteration 3 and Iteration 4; horizontal axis from 0 to 20.]
A Motivating Example
Consider the following distance matrix C of a partially connected
TSP with n = 4:
\[
C = \begin{pmatrix}
0 & c_{12} & c_{13} & c_{14} \\
c_{21} & 0 & c_{23} & \infty \\
\infty & c_{32} & 0 & c_{34} \\
c_{41} & \infty & c_{43} & 0
\end{pmatrix}. \tag{1}
\]
Trajectory Generation
We shall associate with each cost matrix D a probability matrix.
For our D matrix
\[
D = \begin{pmatrix}
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0
\end{pmatrix}
\]
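One way to read the matrix D above is as an indicator of allowed transitions. The Python sketch below assumes, which the excerpt does not state, that an initial transition matrix is obtained by normalizing each row of D, and that a trajectory is built node by node with already visited nodes masked out. The names `row_normalize` and `generate_tour` are illustrative only.

```python
import random

# Allowed-transition matrix D from the 4-node example (1 = edge exists).
D = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

def row_normalize(matrix):
    """Divide each row by its sum so that every row is a probability vector."""
    return [[v / sum(row) for v in row] for row in matrix]

def generate_tour(P, rng):
    """Sample a tour starting at node 0; return None if it hits a dead end.

    At each step the next node is drawn from the current row of P, with
    visited nodes given weight zero (random.choices renormalizes the rest).
    """
    n = len(P)
    tour = [0]
    visited = {0}
    while len(tour) < n:
        current = tour[-1]
        weights = [0.0 if j in visited else P[current][j] for j in range(n)]
        if sum(weights) == 0.0:
            return None  # no unvisited node is reachable from here
        nxt = rng.choices(range(n), weights=weights, k=1)[0]
        tour.append(nxt)
        visited.add(nxt)
    # The last node must be able to return to the start node.
    return tour if P[tour[-1]][0] > 0 else None

P = row_normalize(D)
rng = random.Random(0)
samples = [generate_tour(P, rng) for _ in range(1000)]
valid_tours = {tuple(t) for t in samples if t is not None}
```

In a partially connected graph some sampled trajectories cannot be completed, which is why the generator may return `None`.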
\[
\hat{\ell} = \frac{1}{N} \sum_{i=1}^{N} I_{\{X_i \in A\}} \tag{6}
\]
\[
W(x) = \frac{f(x, P_0)}{f(x, P)}
\]
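\[
\hat{\ell} = \frac{1}{N} \sum_{i=1}^{N} I_{\{X_i \in A\}} W(X_i), \qquad
W(X_i) = \frac{f(X_i, P_0)}{f(X_i, P)} \tag{8}
\]
and
\[
|\widehat{\mathcal{X}^{*}}| = \hat{\ell}\, |\mathcal{X}| = \frac{1}{N} \sum_{i=1}^{N} I_{\{X_i \in A\}} \frac{1}{f(X_i, P)}, \tag{9}
\]
respectively. The second equality holds since $f(x, P_0)$ is the uniform
distribution over the set $\mathcal{X}$, that is,
\[
f(x, P_0) = f_1(x_1) f_2(x_2 \mid x_1) \cdots f_n(x_n \mid x_1, \ldots, x_{n-1}) = \frac{1}{n-1} \cdot \frac{1}{n-2} \cdots 1 .
\]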
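The estimator in (9) can be tried on the 4-node matrix D from the motivating example. The sketch below is only an illustration under two assumptions not stated in the excerpt: P is taken to be the row-normalized D, and A is the set of closed tours that use only allowed transitions. The function names are hypothetical.

```python
import random

D = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

def sample_path(D, rng):
    """Draw one trajectory from node 0 and return (is_valid_tour, f).

    f is the probability of the sampled path. Because every allowed entry
    in a row of the normalized D has the same weight, masking visited nodes
    and renormalizing amounts to a uniform choice among allowed nodes.
    """
    n = len(D)
    current, visited, f = 0, {0}, 1.0
    for _ in range(n - 1):
        allowed = [j for j in range(n) if D[current][j] and j not in visited]
        if not allowed:
            return False, f  # dead end: the indicator in (9) is zero
        nxt = rng.choice(allowed)
        f *= 1.0 / len(allowed)
        visited.add(nxt)
        current = nxt
    return bool(D[current][0]), f  # must be able to close the tour

def estimate_count(D, n_samples, seed=0):
    """Average of I{X_i in A} / f(X_i, P) over n_samples draws, as in (9)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        valid, f = sample_path(D, rng)
        if valid:
            total += 1.0 / f
    return total / n_samples

estimate = estimate_count(D, 20000)

# Fully connected case: every path has f = 1/(n-1)!, so the estimate is (n-1)!.
K4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
estimate_full = estimate_count(K4, 100)
```

For this D only the tours 1-2-3-4 and 1-4-3-2 are feasible, so the estimate should settle near 2.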
THE CROSS-ENTROPY METHOD IN OPTIMIZATION AND MONTE-CARLO SIMULATION – p. 19/48
Partially Connected TSP
Example
For our example with
\[
D = \begin{pmatrix}
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0
\end{pmatrix}
\]
Updating P and γ
Adaptive updating of $\gamma_t$. Let $\gamma_t$ be the $(1-\rho)$-quantile of
$S(X)$ under $P_{t-1}$. A simple estimator of $\gamma_t$, denoted $\hat{\gamma}_t$, can be
obtained by drawing a random sample $X_1, \ldots, X_N$ from
$f(x, P_{t-1})$, calculating the associated function values
$S(X_1), \ldots, S(X_N)$ and their order statistics $S_{(1)}, \ldots, S_{(N)}$, and
taking the sample $(1-\rho)$-quantile of these order statistics, that is,
\[
\hat{\gamma}_t = S_{((1-\rho)N+1)} . \tag{13}
\]
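As a small illustration of (13), the sketch below sorts the sampled values and picks the order statistic with 1-based index (1 - ρ)N + 1. It assumes that a non-integer (1 - ρ)N is rounded down, which the excerpt does not specify; the function name `gamma_hat` and the sample values are made up for the example.

```python
import math

def gamma_hat(scores, rho):
    """Return the order statistic S_((1-rho)N + 1) of the sampled scores."""
    ordered = sorted(scores)
    n = len(ordered)
    # Small epsilon guards against floating-point error in (1 - rho) * n.
    k = math.floor((1.0 - rho) * n + 1e-9) + 1  # 1-based index
    k = min(k, n)  # stay inside the sample
    return ordered[k - 1]

scores = [7, 2, 9, 4, 10, 1, 6, 3, 8, 5]
threshold = gamma_hat(scores, rho=0.2)
elite = [s for s in scores if s >= threshold]
```

With N = 10 and ρ = 0.2 the threshold is the ninth smallest value, which leaves ρN = 2 samples at or above it.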
Updating P and γ
Adaptive updating of $P_t$. For fixed $\gamma_t$ and $P_{t-1}$, derive $P_t$
from the solution of the program
\[
\max_{\tilde{P}_t} \hat{D}(\tilde{P}_t) = \max_{\tilde{P}_t} \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \hat{\gamma}_t\}} \log f(X_i; \tilde{P}_t) . \tag{15}
\]
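and
\[
\tilde{p}_{t,ij} = \frac{\sum_{k=1}^{N} I_{\{X_k \in \mathcal{X}_{ij}\}} I_{\{S(X_k) \ge \hat{\gamma}_t\}}}{\sum_{k=1}^{N} I_{\{S(X_k) \ge \hat{\gamma}_t\}}} \tag{17}
\]
for updating the parameters of the matrices $P_t$ and $\tilde{P}_t$.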
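The update (17) is a ratio of counts, which the following sketch makes concrete. It assumes that each sample is a tour given as a sequence of node indices, and that a tour belongs to the set indexed by (i, j) when it contains the transition from i to j, including the closing step back to the start. The comparison direction follows program (15); for a minimization problem it would be reversed. All names are illustrative.

```python
def update_transition_matrix(tours, scores, gamma, n):
    """Elite-frequency estimate of the transition probabilities, as in (17)."""
    elite = [t for t, s in zip(tours, scores) if s >= gamma]
    if not elite:
        raise ValueError("no sample reaches the threshold gamma")
    counts = [[0] * n for _ in range(n)]
    for tour in elite:
        # Consecutive node pairs, plus the step from the last node to the first.
        for a, b in zip(tour, tour[1:] + tour[:1]):
            counts[a][b] += 1
    return [[c / len(elite) for c in row] for row in counts]

# Three sampled tours on 3 nodes with made-up scores; two of them are elite.
tours = [(0, 1, 2), (0, 2, 1), (0, 1, 2)]
scores = [5, 4, 1]
P_new = update_transition_matrix(tours, scores, gamma=4, n=3)
```

Each row of `P_new` sums to one because every elite tour leaves every node exactly once.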
The Main Algorithm
\[
R_A = \{\sigma : S(\sigma) = 1\}, \qquad S(\sigma) = \prod_{i=1}^{n} a_{i\sigma(i)}, \tag{19}
\]
For the matrix A there are 3! = 6 permutations. They are:
Permanent Example
\[
\sigma = \begin{pmatrix}
1 & 2 & 3 \\
1 & 3 & 2 \\
2 & 1 & 3 \\
2 & 3 & 1 \\
3 & 1 & 2 \\
3 & 2 & 1
\end{pmatrix}, \tag{20}
\]
\[
S(\sigma) = \begin{pmatrix}
a_{11} a_{22} a_{33} = 1 \\
a_{11} a_{23} a_{32} = 0 \\
a_{12} a_{21} a_{33} = 1 \\
a_{12} a_{23} a_{31} = 0 \\
a_{13} a_{21} a_{32} = 1 \\
a_{13} a_{22} a_{31} = 0
\end{pmatrix}, \tag{21}
\]
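The enumeration can be reproduced in a few lines. The matrix `A` below is a hypothetical 0-1 matrix chosen so that the six products take the values 1, 0, 1, 0, 1, 0 in the order listed; the excerpt does not show A itself. The sketch evaluates the product from (19) for every permutation and sums the results, which for a 0-1 matrix gives both the permanent and the number of permutations with S(σ) = 1.

```python
from itertools import permutations
from math import prod

# Assumed example matrix (not given in the excerpt).
A = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
]

def S(A, sigma):
    """Product of a_{i, sigma(i)} over all rows i (0-based indices)."""
    return prod(A[i][sigma[i]] for i in range(len(A)))

# permutations() yields (0,1,2), (0,2,1), (1,0,2), (1,2,0), (2,0,1), (2,1,0),
# the 0-based counterpart of the six rows listed above.
sigmas = list(permutations(range(3)))
values = [S(A, sigma) for sigma in sigmas]
permanent = sum(values)
```

Full enumeration costs n! products, which is what motivates estimating the permanent by sampling for larger n.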
where
\[
S(\sigma) = \sum_{i=1}^{n} a_{i\sigma(i)}, \tag{26}
\]
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 0 0 0 0 0
1 1 1 1 1 0 0 0 0 0
0.079 0.065 0.037 0.044 0.044 0.142 0.135 0.142 0.135 0.177
0.044 0.079 0.058 0.058 0.051 0.142 0.121 0.107 0.149 0.191
0.037 0.037 0.044 0.072 0.037 0.191 0.177 0.093 0.177 0.135
0.058 0.072 0.044 0.044 0.051 0.128 0.135 0.177 0.142 0.149
0.044 0.051 0.037 0.044 0.051 0.128 0.205 0.142 0.163 0.135
0.180 0.166 0.208 0.145 0.201 0.006 0.006 0.062 0.006 0.020
0.173 0.166 0.208 0.145 0.201 0.125 0.006 0.006 0.006 0.006
0.145 0.173 0.166 0.194 0.117 0.006 0.083 0.104 0.006 0.006
0.152 0.201 0.166 0.166 0.243 0.006 0.006 0.006 0.048 0.006
0.208 0.131 0.180 0.159 0.166 0.006 0.006 0.041 0.048 0.055
0.0237 0.0195 0.0111 0.0132 0.0132 0.1756 0.2085 0.1406 0.1945 0.2001
0.0132 0.0237 0.0174 0.0174 0.0153 0.1896 0.1343 0.1511 0.1637 0.2743
0.0111 0.0111 0.0132 0.0216 0.0111 0.2463 0.1721 0.1329 0.2631 0.1175
0.0174 0.0216 0.0132 0.0132 0.0153 0.1784 0.2435 0.1686 0.1119 0.2295
0.0132 0.0153 0.0111 0.0132 0.0153 0.1784 0.2435 0.1686 0.1119 0.2295
0.1240 0.1968 0.3214 0.1625 0.1653 0.0018 0.0018 0.0186 0.0018 0.0060
0.2829 0.1065 0.1310 0.2122 0.2227 0.0375 0.0018 0.0018 0.0018 0.0018
0.2045 0.2339 0.1338 0.2262 0.1401 0.0018 0.0249 0.0312 0.0018 0.0018
0.1716 0.2353 0.1758 0.1618 0.2339 0.0018 0.0018 0.0018 0.0144 0.0018
0.1744 0.1723 0.2080 0.1947 0.2038 0.0018 0.0018 0.0123 0.0144 0.0165
0.0071 0.0058 0.0033 0.0039 0.0039 0.1891 0.2260 0.1421 0.1959 0.2223
0.0039 0.0071 0.0052 0.0052 0.0045 0.2015 0.1602 0.1629 0.1749 0.2740
0.0033 0.0033 0.0039 0.0064 0.0033 0.2644 0.1751 0.1539 0.2636 0.1223
0.0052 0.0064 0.0039 0.0039 0.0045 0.1423 0.1744 0.3115 0.2083 0.1391
0.0039 0.0045 0.0033 0.0039 0.0045 0.1782 0.2436 0.1988 0.1359 0.2229
0.1572 0.1708 0.2940 0.1911 0.1778 0.0005 0.0005 0.0055 0.0005 0.0018
0.2742 0.1413 0.1557 0.1824 0.2326 0.0112 0.0005 0.0005 0.0005 0.0005
0.2213 0.2289 0.1342 0.2219 0.1549 0.0005 0.0074 0.0093 0.0005 0.0005
0.1726 0.2505 0.1703 0.1708 0.2289 0.0005 0.0005 0.0005 0.0043 0.0005
0.1617 0.1916 0.2365 0.2007 0.1952 0.0005 0.0005 0.0036 0.0043 0.0049
0.0021 0.0017 0.0009 0.0011 0.0011 0.1784 0.2377 0.1570 0.1976 0.2219
0.0011 0.0021 0.0015 0.0015 0.0013 0.2017 0.1583 0.1779 0.1774 0.2766
0.0009 0.0009 0.0011 0.0019 0.0009 0.2533 0.1750 0.1687 0.2702 0.1265
0.0015 0.0019 0.0011 0.0011 0.0013 0.1635 0.1903 0.2796 0.2038 0.1552
0.0011 0.0013 0.0009 0.0011 0.0013 0.1955 0.2323 0.2074 0.1445 0.2139
0.1582 0.1786 0.2932 0.1945 0.1726 0.0001 0.0001 0.0016 0.0001 0.0005
0.2725 0.1494 0.1471 0.1780 0.2486 0.0033 0.0001 0.0001 0.0001 0.0001
0.2273 0.2149 0.1472 0.2441 0.1608 0.0001 0.0022 0.0028 0.0001 0.0001
0.1808 0.2499 0.1654 0.1786 0.2230 0.0001 0.0001 0.0001 0.0012 0.0001
0.1571 0.2020 0.2441 0.2007 0.1917 0.0001 0.0001 0.0011 0.0012 0.0014
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
0.18 0.18 0.18 0.18 0.18 0.02 0.02 0.02 0.02 0.02
0.18 0.18 0.18 0.18 0.18 0.02 0.02 0.02 0.02 0.02
0.18 0.18 0.18 0.18 0.18 0.02 0.02 0.02 0.02 0.02
0.18 0.18 0.18 0.18 0.18 0.02 0.02 0.02 0.02 0.02
0.18 0.18 0.18 0.18 0.18 0.02 0.02 0.02 0.02 0.02
1 0 5.50E-07 0.721
2 2.35E-17 4.28E-06 0.519
3 6.62E-14 0.003 0.427
4 6.14E-11 0.013 0.355
5 1.81E-08 0.131 0.296
6 2.24E-06 1.692 0.251
7 19.158 28.520 0.116
8 100 100 0.036
1 0 1.664 0.199
2 1.435E-10 0.753 0.160
3 3.592E-07 1.793 0.137
4 0.052 1.825 0.130
5 11.073 18.453 0.101
6 100 100 0.067
7 100 100 0.035
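\[
(P_0) \quad
\begin{array}{ll}
\min_{f(x)} & D(f \mid h) = \displaystyle\int \ln \frac{f(x)}{h(x)} \, f(x) \, dx = \mathbb{E}_f \ln \frac{f(X)}{h(X)} \\[2ex]
\text{s.t.} & \displaystyle\int S_j(x) f(x) \, dx = \mathbb{E}_f S_j(X) = \gamma_j, \quad j = 1, \ldots, k, \\[2ex]
 & \displaystyle\int f(x) \, dx = 1, \quad \int h(x) \, dx = 1 .
\end{array}
\tag{29}
\]
Here f and h are joint n-dimensional pdfs or
n-dimensional pmfs, $S_j(x)$, $j = 1, \ldots, k$, are well
defined functions and x is an n-dimensional vector.
The pdf h is typically assumed to be known and is
called the prior pdf.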
reduces to
\[
\max_{f(x)} H(f) = -\int f(x) \ln f(x) \, dx = -\mathbb{E}_f \ln f(X) . \tag{31}
\]
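\[
(P_1) \quad
\begin{array}{ll}
\min_{p} D(p \mid u) = \min_{p} \displaystyle\sum_{i=1}^{m} p_i \ln \frac{p_i}{u_i} & \\[2ex]
\text{s.t.} \quad \displaystyle\sum_{i=1}^{m} S_j(x_i) p_i = \mathbb{E}_p S_j(X) = \gamma_j, & j = 1, \ldots, k, \\[2ex]
\displaystyle\sum_{i=1}^{m} p_i = 1, \quad \sum_{i=1}^{m} u_i = 1, \quad p_i \ge 0, \quad u_i > 0 . &
\end{array}
\tag{32}
\]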
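\begin{align}
p_i^* &= \frac{u_i \exp\{-\sum_{j=1}^{k} \lambda_j^* S_j(x_i)\}}{\sum_{r=1}^{m} u_r \exp\{-\sum_{j=1}^{k} \lambda_j^* S_j(x_r)\}} \tag{33} \\
&= \frac{\mathbb{E}_u I_{\{X = x_i\}} \exp\{-\sum_{j=1}^{k} \lambda_j^* S_j(X)\}}{\mathbb{E}_u \exp\{-\sum_{j=1}^{k} \lambda_j^* S_j(X)\}}, \quad i = 1, \ldots, m, \tag{34}
\end{align}
\[
{} = -\frac{\mathbb{E}_u S_j(X) \exp\{-\sum_{r=1}^{k} S_r(X) \lambda_r\}}{\mathbb{E}_u \exp\{-\sum_{r=1}^{k} S_r(X) \lambda_r\}} + \gamma_j = 0 .
\]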
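For readers who want to see where the exponential form in (33) comes from, here is a sketch of the standard Lagrangian argument for program (32). The multiplier $\beta$ for the normalization constraint is notation introduced here, not taken from the slides.

```latex
\begin{align*}
L(p, \lambda, \beta) &= \sum_{i=1}^{m} p_i \ln \frac{p_i}{u_i}
  + \sum_{j=1}^{k} \lambda_j \Big( \sum_{i=1}^{m} S_j(x_i) p_i - \gamma_j \Big)
  + \beta \Big( \sum_{i=1}^{m} p_i - 1 \Big), \\
\frac{\partial L}{\partial p_i} &= \ln \frac{p_i}{u_i} + 1
  + \sum_{j=1}^{k} \lambda_j S_j(x_i) + \beta = 0
  \;\Longrightarrow\;
  p_i = u_i \exp\Big\{ -1 - \beta - \sum_{j=1}^{k} \lambda_j S_j(x_i) \Big\}, \\
\sum_{i=1}^{m} p_i = 1 &\;\Longrightarrow\;
  p_i = \frac{u_i \exp\{ -\sum_{j=1}^{k} \lambda_j S_j(x_i) \}}
             {\sum_{r=1}^{m} u_r \exp\{ -\sum_{j=1}^{k} \lambda_j S_j(x_r) \}} .
\end{align*}
```

Substituting this $p_i$ into the moment constraints $\sum_i S_j(x_i) p_i = \gamma_j$ gives one equation per $j$ for the multipliers, which is the ratio-of-expectations condition displayed above.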
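\[
\begin{array}{ll}
\min_{p} D(p \mid u) = \min_{p} \displaystyle\sum_{i=1}^{6} p_i \ln \frac{p_i}{u_i} & \\[2ex]
\text{s.t.} \quad \displaystyle\sum_{i=1}^{6} i \, p_i = \gamma, & \\[2ex]
\displaystyle\sum_{i=1}^{6} p_i = 1, \quad \sum_{i=1}^{6} u_i = 1, \quad p_i \ge 0, \quad u_i > 0 . &
\end{array}
\tag{35}
\]
The solution:
\[
p_i^* = \frac{u_i \exp\{-i \lambda^*\}}{\sum_{r=1}^{6} u_r \exp\{-r \lambda^*\}}
      = \frac{\mathbb{E}_u I_{\{X = i\}} \exp\{-X \lambda^*\}}{\mathbb{E}_u \exp\{-X \lambda^*\}}, \quad i = 1, \ldots, 6, \tag{36}
\]
and similarly $\lambda^*$.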
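The die example is small enough to solve numerically. The sketch below assumes a uniform prior u and finds λ* by bisection, using the fact that the mean of the tilted pmf in (36) decreases as λ grows. The bisection bounds and the function names are choices made for this illustration, not part of the slides.

```python
import math

def tilted_pmf(u, lam):
    """p_i proportional to u_i * exp(-i * lam) for faces i = 1..len(u)."""
    weights = [ui * math.exp(-(i + 1) * lam) for i, ui in enumerate(u)]
    z = sum(weights)
    return [w / z for w in weights]

def mean(p):
    """Expected face value under the pmf p."""
    return sum((i + 1) * pi for i, pi in enumerate(p))

def solve_lambda(u, gamma, lo=-20.0, hi=20.0, iterations=200):
    """Bisection for lam such that the tilted mean equals gamma."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean(tilted_pmf(u, mid)) > gamma:
            lo = mid  # mean too large: a larger lam is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = [1.0 / 6.0] * 6           # fair-die prior (an assumption)
lam_star = solve_lambda(u, gamma=4.5)
p_star = tilted_pmf(u, lam_star)
lam_fair = solve_lambda(u, gamma=3.5)
```

With γ = 4.5 the multiplier is negative and the probabilities increase with the face value; with γ = 3.5 the constraint is already met by the prior, so λ* is zero and p* = u.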
Conclusions
Advantages:
Universally applicable (discrete/continuous/mixed problems)
Very easy to implement
Easy to adapt, e.g. when constraints are added
Generally works well
Disadvantages:
Performance function should be relatively cheap to evaluate
Tweaking (modifications) may be required