Lab 4: Numerical methods for engineers

Prof. HSSAYNI

Goal
Implement advanced gradient descent methods and compare their convergence behavior, speed,
and stability.

Dataset and Model


Use a synthetic dataset (e.g., for linear or logistic regression). The goal is to minimize the cost
function with each of the advanced optimization techniques below; a minimal setup is sketched after this paragraph.
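
A minimal sketch of one possible setup, assuming a synthetic linear regression problem with a mean-squared-error cost. The array names, sizes, noise level, and starting point below are illustrative choices, not prescribed by the lab; the later sketches reuse cost, grad, and theta0 from this block.

    import numpy as np

    # Hypothetical synthetic dataset for linear regression: y = X @ theta_true + noise.
    rng = np.random.default_rng(0)
    n_samples = 200
    X = np.c_[np.ones(n_samples), rng.normal(size=n_samples)]  # bias column + one feature
    theta_true = np.array([2.0, -3.0])
    y = X @ theta_true + 0.1 * rng.normal(size=n_samples)

    def cost(theta):
        # Mean squared error cost J(theta).
        residual = X @ theta - y
        return 0.5 * np.mean(residual ** 2)

    def grad(theta):
        # Gradient of J(theta) with respect to theta.
        return X.T @ (X @ theta - y) / n_samples

    theta0 = np.zeros(X.shape[1])  # shared starting point for all methods below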

Lab Outline and Exercises


1. Momentum-based Gradient Descent
Objective: Implement gradient descent with momentum and visualize its convergence compared
to standard gradient descent (a Python sketch of both updates follows this exercise).
Instructions:

• Define the cost function J(θ).

• Set a learning rate α and momentum factor β.

• Implement momentum-based updates:

v = βv + (1 − β)∇θ J(θ)

θ = θ − αv

• Visualizations:

– Plot the cost function over iterations to observe convergence speed.


– Overlay the convergence curve of momentum-based gradient descent and standard
gradient descent on the same plot.
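
A minimal sketch of the two update rules above, assuming the cost, grad, and theta0 helpers from the dataset sketch; the step size, momentum factor, and iteration count are illustrative values.

    import numpy as np
    import matplotlib.pyplot as plt

    def gradient_descent(grad, cost, theta0, alpha=0.1, n_iters=200):
        # Standard gradient descent, kept for comparison.
        theta, history = theta0.copy(), []
        for _ in range(n_iters):
            theta = theta - alpha * grad(theta)
            history.append(cost(theta))
        return theta, history

    def momentum_gd(grad, cost, theta0, alpha=0.1, beta=0.9, n_iters=200):
        # Momentum update: v = beta*v + (1 - beta)*grad(theta); theta = theta - alpha*v.
        theta, v, history = theta0.copy(), np.zeros_like(theta0), []
        for _ in range(n_iters):
            v = beta * v + (1 - beta) * grad(theta)
            theta = theta - alpha * v
            history.append(cost(theta))
        return theta, history

    _, hist_gd = gradient_descent(grad, cost, theta0)
    _, hist_mom = momentum_gd(grad, cost, theta0)
    plt.plot(hist_gd, label="standard GD")
    plt.plot(hist_mom, label="momentum GD")
    plt.xlabel("iteration"); plt.ylabel("J(theta)"); plt.legend(); plt.show()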

2. AdaGrad (Adaptive Gradient Algorithm)
Objective: Implement AdaGrad and observe how adaptive learning rates affect convergence (see the code sketch after this exercise).
Instructions:
• Initialize the parameter θ and the gradient accumulation variable G = 0.
• Update parameters according to:
G = G + (∇θ J(θ))²
θ = θ − (α / √(G + ϵ)) ∇θ J(θ)
where ϵ is a small constant to avoid division by zero.
• Visualizations:
– Plot the cost function over iterations.
– Compare AdaGrad’s convergence with standard gradient descent.
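
One possible AdaGrad implementation matching the update above, assuming the same cost, grad, and theta0 as in the dataset sketch; the hyperparameter values are illustrative.

    import numpy as np

    def adagrad(grad, cost, theta0, alpha=0.5, eps=1e-8, n_iters=200):
        # Accumulate squared gradients in G; each coordinate's step is scaled by 1/sqrt(G + eps).
        theta, G, history = theta0.copy(), np.zeros_like(theta0), []
        for _ in range(n_iters):
            g = grad(theta)
            G = G + g ** 2
            theta = theta - alpha * g / np.sqrt(G + eps)
            history.append(cost(theta))
        return theta, history

    _, hist_ada = adagrad(grad, cost, theta0)

Overlaying hist_ada and hist_gd on one set of axes (as in the momentum sketch) gives the requested comparison plot.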

3. RMSprop (Root Mean Square Propagation)


Objective: Implement RMSprop to see how it improves on AdaGrad by balancing learning rates (a code sketch follows this exercise).
Instructions:
• Initialize G = 0 and a decay rate ρ.
• Update parameters as follows:
G = ρG + (1 − ρ) (∇θ J(θ))²
θ = θ − (α / √(G + ϵ)) ∇θ J(θ)
• Visualizations:
– Plot the cost function over iterations.
– Overlay convergence of RMSprop and AdaGrad.
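
A sketch of RMSprop under the same assumptions as the previous sketches; the only change from the AdaGrad code is the exponentially decaying average of squared gradients (rho = 0.9 is an illustrative default).

    import numpy as np

    def rmsprop(grad, cost, theta0, alpha=0.05, rho=0.9, eps=1e-8, n_iters=200):
        # G is a decaying average of squared gradients instead of a running sum.
        theta, G, history = theta0.copy(), np.zeros_like(theta0), []
        for _ in range(n_iters):
            g = grad(theta)
            G = rho * G + (1 - rho) * g ** 2
            theta = theta - alpha * g / np.sqrt(G + eps)
            history.append(cost(theta))
        return theta, history

    _, hist_rms = rmsprop(grad, cost, theta0)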

4. Adam (Adaptive Moment Estimation)


Objective: Implement Adam to explore how combining momentum with adaptive learning
rates affects optimization (an implementation sketch follows this exercise).
Instructions:
• Initialize m = 0, v = 0, and set decay rates β1 and β2.
• Use the following update rules:
m = β1 m + (1 − β1) ∇θ J(θ)
v = β2 v + (1 − β2) (∇θ J(θ))²
m̂ = m / (1 − β1^t),   v̂ = v / (1 − β2^t)
θ = θ − (α / (√v̂ + ϵ)) m̂
where t is the iteration number.
• Visualizations:
– Plot the cost function over iterations.
– Compare Adam’s convergence with other methods.
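
A minimal Adam sketch following the update rules above, again assuming cost, grad, and theta0 from the dataset sketch; beta1 = 0.9 and beta2 = 0.999 are common illustrative defaults.

    import numpy as np

    def adam(grad, cost, theta0, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8, n_iters=200):
        # First moment m (momentum) and second moment v (RMSprop-style), with bias correction.
        theta, history = theta0.copy(), []
        m, v = np.zeros_like(theta0), np.zeros_like(theta0)
        for t in range(1, n_iters + 1):
            g = grad(theta)
            m = beta1 * m + (1 - beta1) * g
            v = beta2 * v + (1 - beta2) * g ** 2
            m_hat = m / (1 - beta1 ** t)
            v_hat = v / (1 - beta2 ** t)
            theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
            history.append(cost(theta))
        return theta, history

    _, hist_adam = adam(grad, cost, theta0)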

5. Nadam (Nesterov-accelerated Adaptive Moment Estimation)
Objective: Implement Nadam to study the effect of Nesterov acceleration on Adam’s convergence (a code sketch follows this exercise).
Instructions:

• Use the Adam update with Nesterov acceleration:

m = β1 m + (1 − β1) ∇θ J(θ)
v = β2 v + (1 − β2) (∇θ J(θ))²
m̂ = m / (1 − β1^t),   v̂ = v / (1 − β2^t)
θ = θ − (α / (√v̂ + ϵ)) (β1 m̂ + (1 − β1) ∇θ J(θ))
• Visualizations:

– Plot the cost function over iterations.


– Compare Nadam’s convergence with Adam and other methods.
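
A Nadam sketch under the same assumptions as the Adam sketch; it reuses the Adam moments and only changes the final step to the Nesterov-style combination of m̂ and the current gradient.

    import numpy as np

    def nadam(grad, cost, theta0, alpha=0.1, beta1=0.9, beta2=0.999, eps=1e-8, n_iters=200):
        # Adam moments plus a look-ahead term (beta1 * m_hat + (1 - beta1) * g) in the update.
        theta, history = theta0.copy(), []
        m, v = np.zeros_like(theta0), np.zeros_like(theta0)
        for t in range(1, n_iters + 1):
            g = grad(theta)
            m = beta1 * m + (1 - beta1) * g
            v = beta2 * v + (1 - beta2) * g ** 2
            m_hat = m / (1 - beta1 ** t)
            v_hat = v / (1 - beta2 ** t)
            theta = theta - alpha * (beta1 * m_hat + (1 - beta1) * g) / (np.sqrt(v_hat) + eps)
            history.append(cost(theta))
        return theta, history

    _, hist_nadam = nadam(grad, cost, theta0)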

Extended Visualization and Comparison Tasks


6. Comparison Summary Plots
Objective: Summarize and compare each method’s performance (a plotting sketch follows this exercise).
Instructions:

• Create a line plot of convergence speed for each method.

• Use a heatmap of fluctuations in cost values to highlight each method’s stability.
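
One way to build both summary figures, assuming the cost histories hist_gd, hist_mom, hist_ada, hist_rms, hist_adam, and hist_nadam collected from the sketches above (these variable names are illustrative). The handout does not define the heatmap precisely; here it is interpreted as the absolute per-iteration change in cost, one row per method.

    import numpy as np
    import matplotlib.pyplot as plt

    histories = {"GD": hist_gd, "Momentum": hist_mom, "AdaGrad": hist_ada,
                 "RMSprop": hist_rms, "Adam": hist_adam, "Nadam": hist_nadam}

    # Line plot: cost over iterations for every method on one set of axes.
    for name, h in histories.items():
        plt.plot(h, label=name)
    plt.xlabel("iteration"); plt.ylabel("J(theta)"); plt.legend(); plt.show()

    # Heatmap of stability: one row per method, |change in cost| per iteration.
    fluct = np.array([np.abs(np.diff(h)) for h in histories.values()])
    plt.imshow(fluct, aspect="auto", cmap="viridis")
    plt.yticks(range(len(histories)), list(histories.keys()))
    plt.xlabel("iteration"); plt.colorbar(label="|ΔJ| per step"); plt.show()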

Discussion Questions
• Speed vs. Stability: How do the convergence speed and stability differ across methods?
Which method converged fastest? Which was the most stable?

• Effect of Parameters: How do the choices of learning rate, momentum, and decay rates
affect each method’s performance?

• Practical Applications: In a real-world scenario, which method might be the best choice, and why?
