
Intelligent Optimization Algorithm for Master (2)

The document discusses intelligent algorithms for solving optimization problems, highlighting methods such as hill-climbing and genetic algorithms. It also covers reinforcement learning techniques, including Q-learning and deep Q-learning, to optimize decision-making in uncertain environments. Additionally, it suggests hybrid approaches combining heuristic methods and neural networks for improved problem-solving in specific cases like RCPSP.


Topic: Intelligent algorithm for optimization problems
• Optimization problem example
• Optimization problem example b
• Optimization problem example c
RCPSP math formulation
• Objective function: minimize the makespan, i.e. min(max(ft[1], …, ft[N])), given es[i], i = 1, …, N
• Subject to: ft[i] = es[i] + d[i]
• If activity i is a predecessor of activity j (i is in pa[j]), then es[j] >= ft[i]
• For any day t and resource type k, the total amount of resource k consumed by the activities i that are active on day t must not exceed the limit:
  ∑_i rs[i,k,t] <= rsl[k]
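Below is a minimal sketch of checking this resource constraint, assuming lists es and d, a per-activity consumption table rs, and a per-type limit rsl; these names and the data layout are illustrative, not taken from the course code.

```python
# Minimal sketch (illustrative names): activity i runs from day es[i] to
# ft[i] = es[i] + d[i] and consumes rs[i][k] units of resource type k on each
# of those days; rsl[k] is the availability limit for resource type k.

def resource_feasible(es, d, rs, rsl, horizon):
    """Check that, for every day t and resource k, sum_i rs[i][k] <= rsl[k]."""
    ft = [es[i] + d[i] for i in range(len(es))]
    for t in range(horizon):
        for k in range(len(rsl)):
            used = sum(rs[i][k] for i in range(len(es)) if es[i] <= t < ft[i])
            if used > rsl[k]:
                return False
    return True

# Example: 3 activities, 1 resource type with a limit of 4 units per day.
print(resource_feasible(es=[0, 0, 2], d=[2, 3, 2],
                        rs=[[2], [2], [3]], rsl=[4], horizon=5))  # False: day 2 needs 5 units
```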
It is not possible to solve the problem by manual operation. A heuristic approach is promising for this problem.
How can we use IA to solve the optimization problem?
• Hill-climbing algorithm (competition between two individuals)
• Compare f(x0) with f(x0 ± delta); keep the better point and repeat.
Example
• See hill_climb algorithm.py
• Find the maximum value of sin(x^2) + 2*cos(2*x)
• x is in [5, 8]
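The referenced hill_climb algorithm.py is not reproduced here; the following is a minimal sketch of a hill climber for this objective, with the step size delta and the iteration count chosen arbitrarily.

```python
import math
import random

def f(x):
    # Objective from the slide: maximize sin(x^2) + 2*cos(2*x) on [5, 8].
    return math.sin(x ** 2) + 2 * math.cos(2 * x)

def hill_climb(lo=5.0, hi=8.0, delta=0.01, iters=10000):
    x = random.uniform(lo, hi)                 # random starting point x0
    for _ in range(iters):
        step = random.choice([-delta, delta])  # propose x0 +/- delta
        x_new = min(hi, max(lo, x + step))     # stay inside [lo, hi]
        if f(x_new) > f(x):                    # competition between the two points
            x = x_new                          # keep the better one
    return x, f(x)

print(hill_climb())
```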
How can we use IA to solve the optimization problem?
• Genetic algorithm (competition among a group)
• Many individuals
• Crossover and mutation
GA process
• Step 1: Generate an individual answer (the answer should be feasible)
• Step 2: Generate a population of answers
• Step 3: Build the objective function for the problem
• Step 4: Evaluate the population using the objective function
• Step 5: Select answers according to their fitness values
• Step 6: Crossover
• Step 7: Mutation
• Step 8: Go back to Step 4
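A minimal real-valued GA sketch of the eight steps above, applied to the same toy objective as the hill-climbing example; the population size, crossover rate, and mutation rate are illustrative choices, not values from the slides.

```python
import math
import random

LO, HI = 5.0, 8.0

def fitness(x):
    # Step 3: the objective function (same toy problem as before).
    return math.sin(x ** 2) + 2 * math.cos(2 * x)

def ga(pop_size=30, generations=100, cx_rate=0.8, mut_rate=0.2):
    # Steps 1-2: generate a population of feasible answers.
    pop = [random.uniform(LO, HI) for _ in range(pop_size)]
    for _ in range(generations):
        # Step 4: evaluate the population with the objective function.
        ranked = sorted(pop, key=fitness, reverse=True)
        # Step 5: selection; keep the better half as parents.
        parents = ranked[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            # Step 6: crossover; blend two parents.
            child = (a + b) / 2 if random.random() < cx_rate else a
            # Step 7: mutation; small Gaussian perturbation, clipped to stay feasible.
            if random.random() < mut_rate:
                child = min(HI, max(LO, child + random.gauss(0, 0.1)))
            children.append(child)
        pop = children                 # Step 8: back to Step 4.
    best = max(pop, key=fitness)
    return best, fitness(best)

print(ga())
```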
Variation: crossover and mutation for binary values
Variation: crossover and mutation for decimal values
Variation: mutation for decimal values
Advantages and disadvantages
• Problem-independent (treats the objective as a black box)
• Not guaranteed to reach the global optimum
• Many parameters to tune
• Relatively slow because of the crossover/mutation operators
Several algorithms with few parameters and a simple evolution structure
• (1+1) ES
• Only mutation
Several algorithms with few parameters and a simple evolution structure
• (μ + λ) ES (μ parents, each parent produces λ children, all are evaluated, select the best μ, repeat)
• Only mutation
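A minimal sketch of a (μ + λ) ES with mutation only, following the description above (each parent produces λ children) on the same toy objective; μ, λ, and the mutation step size are illustrative.

```python
import math
import random

def f(x):
    return math.sin(x ** 2) + 2 * math.cos(2 * x)

def mu_plus_lambda_es(mu=5, lam=4, sigma=0.1, generations=200, lo=5.0, hi=8.0):
    parents = [random.uniform(lo, hi) for _ in range(mu)]
    for _ in range(generations):
        # Mutation only: each parent produces lam children by Gaussian perturbation.
        children = [min(hi, max(lo, p + random.gauss(0, sigma)))
                    for p in parents for _ in range(lam)]
        # Parents and children are all evaluated together; keep the best mu.
        parents = sorted(parents + children, key=f, reverse=True)[:mu]
    return parents[0], f(parents[0])

print(mu_plus_lambda_es())
```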
DE flow chart (more on mutation)
Differential evolution (DE)
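The slides only show the DE flow chart, so the sketch below is an assumption: the classic DE/rand/1/bin scheme applied to a stand-in sphere objective, with F and CR set to common textbook values.

```python
import random

def sphere(x):
    # Stand-in objective (not from the slides): minimize the sum of squares.
    return sum(v * v for v in x)

def de(dim=5, pop_size=20, F=0.5, CR=0.9, generations=200, lo=-5.0, hi=5.0):
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three other random individuals (DE/rand/1).
            a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
            mutant = [a[d] + F * (b[d] - c[d]) for d in range(dim)]
            # Binomial crossover between the target vector and the mutant.
            j_rand = random.randrange(dim)
            trial = [mutant[d] if (random.random() < CR or d == j_rand) else pop[i][d]
                     for d in range(dim)]
            trial = [min(hi, max(lo, v)) for v in trial]
            # Greedy selection: the trial replaces the target only if it is no worse.
            if sphere(trial) <= sphere(pop[i]):
                pop[i] = trial
    best = min(pop, key=sphere)
    return best, sphere(best)

print(de())
```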
PSO (competition and cooperation)
• Particle Swarm Optimization (PSO)
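A minimal PSO sketch on the same toy objective as the earlier examples, showing the cooperation (personal best and swarm best) and competition aspects; the inertia weight and acceleration coefficients are common default values, not taken from the slides.

```python
import math
import random

def f(x):
    return math.sin(x ** 2) + 2 * math.cos(2 * x)

def pso(n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, lo=5.0, hi=8.0):
    x = [random.uniform(lo, hi) for _ in range(n_particles)]
    v = [0.0] * n_particles
    pbest = x[:]                   # each particle's own best position
    gbest = max(x, key=f)          # best position found by the whole swarm
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = random.random(), random.random()
            # Velocity: inertia + pull toward personal best + pull toward swarm best.
            v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (gbest - x[i])
            x[i] = min(hi, max(lo, x[i] + v[i]))
            if f(x[i]) > f(pbest[i]):
                pbest[i] = x[i]
                if f(x[i]) > f(gbest):
                    gbest = x[i]
    return gbest, f(gbest)

print(pso())
```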
What is a NN?
Surrogate optimization
• To find an approximate function for the data, traditionally using a Gaussian process with a kernel function
Neural network (surrogate optimization)
• The concept of surrogate optimization
• To find an approximate function for the data, traditionally using a Gaussian process with a kernel function
• But a NN is more powerful at fitting the data
• (An example) … NN for optimization
• Differentiable, continuous function
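A minimal sketch of the surrogate-optimization loop: sample the expensive function, fit a cheap model, optimize the model, and evaluate the true function only at the suggested point. To keep it dependency-light, the sketch swaps in a simple polynomial (numpy.polyfit) as the surrogate where the slides propose a Gaussian process or a NN; the objective and all parameters are illustrative.

```python
import numpy as np

def expensive_f(x):
    # Stand-in for an expensive objective (illustrative only).
    return np.sin(x ** 2) + 2 * np.cos(2 * x)

lo, hi = 5.0, 8.0
xs = list(np.linspace(lo, hi, 6))       # a few initial sample points
ys = [expensive_f(x) for x in xs]

for _ in range(20):
    # Fit a cheap surrogate to the data collected so far (a polynomial here;
    # the slides suggest a Gaussian process or, more powerfully, a NN).
    coeffs = np.polyfit(xs, ys, deg=min(6, len(xs) - 1))
    surrogate = np.poly1d(coeffs)
    # Optimize the surrogate cheaply by dense evaluation on a grid.
    grid = np.linspace(lo, hi, 1000)
    x_next = float(grid[np.argmax(surrogate(grid))])
    # Evaluate the true function only at the suggested point and grow the data set.
    xs.append(x_next)
    ys.append(expensive_f(x_next))

best = int(np.argmax(ys))
print(xs[best], ys[best])
```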


RL (reinforcement learning)
• Based on dynamic programming and control theory
• Subproblems
• Each subproblem is represented by states and controlled variables
RL (reinforcement learning)
• Learning what?
• Learning a reaction strategy (policy) for an unknown environment or a given state
RL
• Learning from data
• State: (fire)
• Action: (oil)
• Rw_f(state, action) = reward
• Rw_f(fire, use oil) = -50
• Rw_f(fire, use water) = 100
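The fire example can be stored as a small lookup table; the dictionary layout below is just one possible representation, while the -50 and 100 values come from the slide.

```python
# Reward function of the fire example, stored as a (state, action) lookup table.
reward = {
    ("fire", "use oil"): -50,
    ("fire", "use water"): 100,
}

def rw_f(state, action):
    return reward[(state, action)]

print(rw_f("fire", "use water"))   # 100
```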
Using a Q table to store the knowledge
• Data is stored in a table with the results for paired data (state, action)
• Given the Q table, a greedy strategy selects the action for the current state
• Here, states are discrete and independent in the fire example.

          Action a   Action b
State 1   Q(1,a)     Q(1,b)
State 2   Q(2,a)     Q(2,b)
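A minimal sketch of the greedy strategy over such a Q table; the numeric Q values below are placeholders for illustration.

```python
# Q table for the two-state, two-action layout shown above (placeholder values).
q_table = {
    ("state 1", "a"): 0.2, ("state 1", "b"): 0.8,
    ("state 2", "a"): 0.5, ("state 2", "b"): 0.1,
}

def greedy_action(state, actions=("a", "b")):
    # Greedy strategy: pick the action with the highest stored Q value for this state.
    return max(actions, key=lambda a: q_table[(state, a)])

print(greedy_action("state 1"))   # "b"
```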
Using a Q table to store the knowledge
• For a consecutive (sequential) task, the states have specific requirements.
• They should satisfy the Markov property.
Consecutive task or risky environment
Explore vs exploit in RL
• For an unknown environment, how do we explore?
• Epsilon-greedy strategy
RL target
• For an unknown environment, by taking a lot of trial and error, the agent obtains precious data:
• s0-a0-r0-s1-a1-r1-…-sn-an-rn-… (one episode)
• Sometimes the immediate reward may not be clear until the end of the episode:
• s0-a0-s1-a1-…-sn-an-…
• This is much like a multi-armed bandit slot game.
• So the target for the agent is to maximize the expected return
• Return = r0 + dis*r1 + dis^2*r2 + … + dis^n*rn
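The return can be computed directly from the list of rewards of one episode; dis is the discount factor (the later slides use 0.9).

```python
def discounted_return(rewards, dis=0.9):
    # Return = r0 + dis*r1 + dis^2*r2 + ... + dis^n*rn
    return sum((dis ** t) * r for t, r in enumerate(rewards))

# Rewards r0, r1, r2 of one short episode:
print(discounted_return([10, -100, 0]))   # 10 + 0.9*(-100) + 0.81*0 = -80.0
```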
Consecutive task or risky environment
• For an unknown environment, by taking a lot of trial and error, the agent obtains precious data:
• 2,right,0,3
• 2,left,0,3
• 2,left,10,1,left,-100,0
• 2,left,10,1,right,-100,0
• How can we use this experimental data to calculate the Q table?
Bellman equation

For a trajectory s1-r1-s2-r2-…, the value satisfies v(s1) = r1 + dis*v(s2)
Monte Carlo Q table
• One episode:
• 2,left,10,1,left,-100,0
• 2,left,10,1
• 1,left,-100,0 (end state)
• q(1,left)=-100+dis*0=-100
• q(2,left)=10+0.9*(-100)=-80
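A minimal sketch of the Monte Carlo estimate: walk backward through each complete episode, accumulate the discounted return, and average the sampled returns per (state, action) pair. It reproduces the hand calculation above with dis = 0.9.

```python
from collections import defaultdict

dis = 0.9
returns = defaultdict(list)          # (state, action) -> sampled returns

def update_from_episode(episode):
    """episode = [(state, action, reward), ...] ending in a terminal state."""
    g = 0.0
    for state, action, reward in reversed(episode):
        g = reward + dis * g         # discounted return from this step onward
        returns[(state, action)].append(g)

def q(state, action):
    vals = returns[(state, action)]
    return sum(vals) / len(vals)     # Monte Carlo average over episodes

# Episode from the slide: 2,left,10,1,left,-100,0 (terminal state 0)
update_from_episode([(2, "left", 10), (1, "left", -100)])
print(q(1, "left"))    # -100.0
print(q(2, "left"))    # 10 + 0.9*(-100) = -80.0
```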
Update the knowledge
• q(1,left)=-100
• q(2,left)=-80
• New episode
• 2,left,10,1,right,-100,0
• 2,left,10,1
• 1,right,-100,0
• q(1,right)=-100+dis*0=-100
• q(2,left)=10+0.9*(-100)=-80
• Update the knowledge again
• q(1,left)=-100, q(1,right)=-100, q(2,left)=(-80-80)/2=-80
Monte Carlo Q table
• Needs a lot of experiments (exploration) to obtain a Q table useful for exploitation
• If the environment is too complicated, some states may never be visited, for example in NP problems.
• Needs a complete episode
Bellman optimality equation: Q-learning
• s1-a1-r1-s2 (section of an episode)
• Q(s1,a1) = r1 + dis * max_a Q(s2,a)   (Q-learning)
• s1-a1-r1-s2-a2 (section of an episode)
• Q(s1,a1) = r1 + dis * Q(s2,a2)   (SARSA learning)
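A minimal tabular sketch of both update rules, replayed on the transitions from the earlier example; as on the slide, the value is overwritten directly (no learning rate), and unseen entries default to 0.

```python
dis = 0.9
Q = {}                       # (state, action) -> value; missing entries default to 0
actions = ["left", "right"]

def get(s, a):
    return Q.get((s, a), 0.0)

def q_learning_update(s1, a1, r1, s2):
    # Q-learning: bootstrap from the best action in the next state.
    Q[(s1, a1)] = r1 + dis * max(get(s2, a) for a in actions)

def sarsa_update(s1, a1, r1, s2, a2):
    # SARSA: bootstrap from the action actually taken in the next state.
    Q[(s1, a1)] = r1 + dis * get(s2, a2)

q_learning_update(1, "left", -100, 0)    # 1,left,-100,0  -> Q(1,left)  = -100
q_learning_update(1, "right", -100, 0)   # 1,right,-100,0 -> Q(1,right) = -100
q_learning_update(2, "left", 10, 1)      # 2,left,10,1    -> 10 + 0.9*max(-100,-100) = -80
print(Q[(2, "left")])
```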
Deep Q learning
• If the state space is infinite, a Q table is not feasible.
• We use a NN to fit the data for the Q value.
Data
• s0,a0,r0,s1,a1,r1,s2,…
• (s0,a0,r0), (s1,a1,r1), … (the form we used before to build the NN)
• (s0,a0,r0,s1): f(s0,a0) = r0 + dis * max_i f(s1,ai), where i ranges over all actions
• Use this estimated value as the target to train and validate the NN
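A sketch of how the training target is formed from one transition (s0, a0, r0, s1), assuming some function approximator q_net(state) that returns one predicted Q value per action; q_net below is only a placeholder, since the actual network and training loop are framework-specific and not shown in the slides.

```python
dis = 0.9

def q_net(state):
    # Placeholder: in practice a trained neural network returning one value per action.
    return [0.0, 0.0]

def td_target(r0, s1, done):
    # Target from the slide: f(s0, a0) = r0 + dis * max_i f(s1, a_i).
    # If s1 is terminal, there is no future value to bootstrap from.
    return r0 if done else r0 + dis * max(q_net(s1))

# The network is then trained so that its prediction q_net(s0)[a0] moves toward this target.
print(td_target(r0=10, s1="some state", done=False))   # 10.0 with the placeholder net
```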
Policy gradient
https://towardsdatascience.com/reinforcement-learning-explained-visually-part-6-policy-gradients-step-by-step-f9f448e73754
Actor critic method
Application in RCPSP
• NN to approximate a function from a matrix which stores the results of the function, with row and column as inputs
• Monte Carlo may not be enough to explore the whole search space
• Heuristic methods are good at searching.
• A hybrid method may be a way to solve RCPSP
See practical ga for case 1.py
isos for case 1 improved.py
puregaforcase2.py
Thanks
