Lecture 24: Simulated Annealing
The instructor of this course owns the copyright of all the course materials. This lecture
material was distributed only to the students attending the course MTH511a: “Statistical
Simulation and Data Analysis” of IIT Kanpur, and should not be distributed in print or
through electronic media without the consent of the instructor. Students can make their own
copies of the course materials for their use.
Recall that when the objective function is non-concave, none of the methods we have discussed so far can escape a local maximum. This makes it difficult to find the global maximum. This is where the method of simulated annealing has an advantage over the other methods.
Consider an objective function f(θ) to maximize. Note that maximizing f(θ) is equivalent to maximizing exp(f(θ)). The idea in simulated annealing is that, instead of trying to find a maximum directly, we will obtain samples from the density

π(θ) ∝ exp(f(θ)).

Wherever there is a maximum, samples drawn from π(θ) are likely to come from areas near that maximum. However, obtaining samples from π(θ) means there will be samples from low-probability areas as well. So how do we force the samples to come from areas near the maxima?
Consider, for T > 0,

∂ exp(f(θ)/T) / ∂θ = exp(f(θ)/T) f'(θ)/T,

which has the same roots and the same sign as f'(θ). Thus, for any T > 0, exp(f(θ)/T) increases and decreases exactly where f(θ) does, and has the same maxima as f(θ). For 0 < T < 1, the modes of the objective function are exaggerated, thereby amplifying the maxima.
Example 1. Consider the following objective function
[Figure: exp(f(θ)/T) plotted for T = 1, 0.83, 0.75 and 0.71; the maxima become more pronounced as T decreases.]
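The behaviour in the figure can be reproduced with a few lines of R. The bimodal objective f below is an assumed toy function chosen only for illustration (it is not the objective of Example 1); the T values match those in the figure.

# How dividing by T < 1 amplifies the maxima of exp(f/T). The bimodal
# objective f is an assumed toy function, not the one in Example 1.
f <- function(theta) 3 - 2*(theta^2 - 1)^2 + 0.5*theta
theta <- seq(-2, 2, length.out = 500)
Ts <- c(1, .83, .75, .71)
vals <- sapply(Ts, function(T) exp(f(theta)/T))
matplot(theta, vals, type = 'l', lty = 1, ylab = "exp(f/T)", xlab = expression(theta))
legend("topleft", legend = paste("T =", Ts), col = 1:4, lty = 1)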
In simulated annealing, this feature is utilized so that every subsequent sample is drawn from an increasingly concentrated distribution. That is, at time point k, a sample will be drawn from

π_{k,T}(θ) ∝ exp(f(θ)/T).

Certainly, we can try and use accept-reject or another Monte Carlo sampling method, but such methods cannot be implemented in general.
Note that for any θ', θ,

π_{k,T}(θ') / π_{k,T}(θ) = exp{ (f(θ') − f(θ)) / T },

so this ratio can be computed without knowing the normalizing constant of π_{k,T}.
Let G be a proposal distribution with density g(θ' | θ) such that g(θ' | θ) = g(θ | θ'). Such a proposal distribution is called a symmetric proposal distribution.
The simulated annealing algorithm is the following.

1: Choose a starting value θ_1 and an initial temperature T_1.
2: At step k, draw a proposal θ' ~ g(· | θ_k).
3: Compute α = min{ 1, exp( (f(θ') − f(θ_k)) / T_k ) }.
4: Draw U ~ U[0, 1]. If U < α, set θ_{k+1} = θ'.
5: Else θ_{k+1} = θ_k.
6: Update T_{k+1}.
7: Store θ_{k+1} and exp(f(θ_{k+1})/T_{k+1}).
8: Return θ* = θ_{k*}, where k* = argmax_k exp(f(θ_k)/T_k).
Thus, if the proposed value is such that f(θ') > f(θ), then α = 1 and the move is always accepted. The reason simulated annealing works is that, even when θ' is such that f(θ') < f(θ), the move is still accepted with probability α. Thus, there is always a chance of moving out of a local maximum.
Essentially, each θ_k is approximately distributed as π_{k,T}, and as T → 0, π_{k,T} puts more and more mass on the maxima; thus θ_k will typically get increasingly close to θ*.
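To get a feel for the acceptance probability, consider a downhill proposal with f(θ') − f(θ) = −2 (a value picked purely for illustration); it is accepted with probability exp(−2/T):

exp(-2/1)      # T = 1
[1] 0.1353353
exp(-2/0.2)    # T = 0.2
[1] 4.539993e-05

Early on (large T) such moves are accepted fairly often, which lets the chain escape local maxima; as T decreases, downhill moves become increasingly rare and the chain settles near a maximum.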
• Typically, G(· | θ) is U(θ − r, θ + r) or N(θ, r), both of which are valid symmetric proposals. The parameter r dictates how far or close the proposed values are from the current value.
• T_k is often called the temperature parameter. A common choice is T_k = d/log(k) for some constant d; a quick look at this schedule is given below.
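For instance, with d = 1 (the value used in the implementation below), the temperature decreases very slowly:

round(1/log(c(2, 10, 100, 1000, 10000)), 3)
[1] 1.443 0.434 0.217 0.145 0.109

so even after thousands of steps the chain is still allowed occasional downhill moves.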
}
for(k in 2:N)
{
  # propose from U(x[k-1] - r, x[k-1] + r), a symmetric proposal
  a <- runif(1, x[k-1] - r, x[k-1] + r)

  # temperature schedule T_k = 1/log(k)
  T <- 1/log(k)

  # acceptance ratio exp(f(a)/T) / exp(f(x[k-1])/T)
  ratio <- fn(a, T)/fn(x[k-1], T)
  if(runif(1) < ratio)
  {
    x[k] <- a           # accept
  } else {
    x[k] <- x[k-1]      # reject, so stay
  }
}
return(x)
}
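For reference, a self-contained version of this sampler might look like the sketch below. Here the objective fn (reusing the toy f from earlier), the starting value x[1], and the default value of r are assumptions made for illustration, not necessarily the choices used in the example above.

# Self-contained sketch. The objective fn, the starting value, and the
# default r are assumed for illustration only.
fn <- function(x, T = 1) exp((3 - 2*(x^2 - 1)^2 + 0.5*x)/T)   # exp(f(x)/T) for a toy f

simAn <- function(N, r = .5)
{
  x <- numeric(N)
  x[1] <- runif(1, -2, 2)                    # assumed starting value
  for(k in 2:N)
  {
    a <- runif(1, x[k-1] - r, x[k-1] + r)    # symmetric uniform proposal
    T <- 1/log(k)                            # temperature T_k = 1/log(k)
    ratio <- fn(a, T)/fn(x[k-1], T)
    if(runif(1) < ratio) x[k] <- a else x[k] <- x[k-1]
  }
  return(x)
}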
Below I implement the algorithm for 500 steps and return the estimate of θ*. I also plot the values of θ obtained.
N <- 500
sim <- simAn(N = N)
sim[which.max(fn(sim))] # theta^*
[1] 0.3792136
[Figure: the values of θ sampled by the algorithm, plotted against exp(f/T).]
Example 3 (Location Cauchy). Recall the location Cauchy example discussed in Week 6, Lecture 15. The objective function is the log-likelihood of the location Cauchy distribution with mode at µ ∈ R. The goal is to find the MLE of µ. The density is

f(x | µ) = 1 / ( π (1 + (x − µ)²) ).
The log-likelihood is

l(µ) := log L(µ | X) = −n log π − Σ_{i=1}^{n} log(1 + (X_i − µ)²).
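As a minimal sketch (the function name loglik and the argument X, the observed data vector from Lecture 15, are illustrative choices), this log-likelihood can be coded directly:

# log-likelihood of the location Cauchy model at mu, for data vector X
loglik <- function(mu, X) -length(X)*log(pi) - sum(log(1 + (X - mu)^2))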
[Figure: the log-likelihood l(µ) for the generated dataset, plotted over µ ∈ (−10, 40).]
Recall that for the dataset generated, the log-likelihood (above) was not concave and presented many local maxima. This caused Newton-Raphson to possibly diverge or converge to a minimum or local maximum, and caused the gradient ascent algorithm to converge to a local maximum. We will implement the simulated annealing algorithm here with G = U(θ − r, θ + r).
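A sketch of how simAn might be adapted for this example is below. It assumes the data vector X from Lecture 15 is available in the workspace, uses an assumed starting value, compares proposals on the log scale, and returns a list with components x and fn.value so that it matches the plotting code that follows; all of these are assumptions made for illustration.

# Sketch of a simulated annealing sampler for the Cauchy log-likelihood.
# Assumes the data vector X is available; the starting value and the
# returned list structure are assumptions.
simAn <- function(N, r)
{
  loglik <- function(mu) -length(X)*log(pi) - sum(log(1 + (X - mu)^2))
  x <- numeric(N)
  x[1] <- runif(1, -10, 40)                  # assumed starting value
  for(k in 2:N)
  {
    a <- runif(1, x[k-1] - r, x[k-1] + r)    # U(x - r, x + r) proposal
    T <- 1/log(k)                            # temperature schedule
    # compare on the log scale to avoid under/overflow
    if(log(runif(1)) < (loglik(a) - loglik(x[k-1]))/T)
    {
      x[k] <- a
    } else {
      x[k] <- x[k-1]
    }
  }
  list(x = x, fn.value = exp(sapply(x, loglik)))
}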
I run the algorithm for 100 steps from four randomly chosen starting points.
par(mfrow = c(2,2))
sim <- simAn(N = 1e2, r = 5)
plot(mu.x, ll.est, type = 'l', ylab = "log-likelihood", xlab = expression(mu))
points(sim$x, log(sim$fn.value), pch = 16, col = adjustcolor("darkred", alpha = .4))
[Figure: four runs of simulated annealing (2 × 2 panels), each overlaid on the log-likelihood over µ ∈ (−10, 40).]
Note that the simulated annealing algorithm is able to escape local modes and head towards the global maximum. However, the above algorithm is implemented only after tuning r, and tuning r can be challenging.
• Large r: the proposed values are too far away, in regions where the objective function is very low. These values get rejected and the algorithm does not move.
• Small r: the proposed values are too close, so the change in the objective function is small. These values are often accepted, but the algorithm makes very tiny jumps.
Below are runs of the simulated annealing algorithm with r chosen to be too high (500)
and too low (.1).
par(mfrow = c(1,2))

## Different values of r
# very large r
sim <- simAn(N = 1e3, r = 500)
plot(mu.x, ll.est, type = 'l', main = "r = 500. Many rejections", ylab = "log-likelihood", xlab = expression(mu))
points(sim$x, log(sim$fn.value), pch = 16, col = adjustcolor("blue", alpha = .2))

# very small r
plot(mu.x, ll.est, type = 'l', main = "r = .1. Many small acceptances", ylab = "log-likelihood", xlab = expression(mu))
sim <- simAn(N = 1e3, r = .1)
points(sim$x, log(sim$fn.value), pch = 16, col = adjustcolor("blue", alpha = .2))
[Figure: left panel, r = 500 (many rejections); right panel, r = .1 (many small acceptances); sampled values overlaid on the log-likelihood.]
2 Questions to think about
• How do you think this algorithm will scale in higher dimensions? Try implement-
ing simulated annealing for a Lasso optimization problem.
• Is there any benefit to having T > 1?