Q-Learning Algorithm (1)

The document presents a group project on the Q-learning algorithm, a reinforcement learning technique that allows machines to learn from interactions with their environment. It covers key concepts such as the Q-function, Q-table, and the steps involved in the Q-learning algorithm, along with examples and applications. The presentation also discusses the advantages and disadvantages of Q-learning.

Uploaded by

anum.ashraf237


GOVT. RABIA BASRI GRADUATE COLLEGE (W)

WALTON ROAD LAHORE

Presentation Topic: Q-learning Algorithm

Group No. 6:
Samia Anwar (116)
Fatima Liaqat (105)
Mahnoor (122)
Nimra Mehboob (123)

Content:
• Reinforcement learning technique: Q-learning
• Some important terms in Q-learning
• Factors and algorithm of Q-learning
• Steps with examples
• Advantages and disadvantages
• Applications

What is reinforcement learning?

• Reinforcement learning (RL) is a branch of machine learning.
• RL allows machines to learn by interacting with an environment and receiving feedback based on their actions. This feedback comes in the form of rewards or penalties.
Q-LEARNING:
• The "Q" in Q-learning stands for quality: how useful a given action is, in a given state, for gaining future reward.
• It is an off-policy, model-free, value-based reinforcement learning algorithm.
• The agent has to learn actively through the experience of interacting with the environment.
• Off-policy: the agent learns the value of the best (greedy) action in each state, regardless of which action it actually chooses while exploring.
• Model-free: the agent learns the consequences of its actions through experience, without being given the transition and reward functions of the environment.
• Value-based: the agent trains a value function to learn which states and actions are more valuable, and picks its actions from those values.
• The agent uses trial and error to determine which actions result in rewards (good outcomes) and which result in penalties (bad outcomes, i.e. negative rewards).
• The decision making of Q-learning improves over time as the values in the Q-table are updated.
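The trial-and-error behavior described above can be sketched as follows. The slides do not say how the agent picks its actions, so the epsilon-greedy rule used here is an assumption (it is one common choice), and the function name `choose_action` and the sample Q-values are hypothetical.

```python
import random

def choose_action(q_row, epsilon, rng):
    """Pick an action index for one state.

    With probability epsilon, explore (random action: the "trial" part);
    otherwise exploit the action with the highest known Q-value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])

rng = random.Random(0)          # fixed seed so runs are repeatable
q_row = [0.0, 5.0, 2.0]         # hypothetical Q-values of one state's 3 actions

# With epsilon = 0 the agent never explores and always takes the best action.
print(choose_action(q_row, epsilon=0.0, rng=rng))  # 1
```

Because Q-learning is off-policy, the values it learns describe the greedy choice even while the agent is exploring with random actions like this.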

Some important terms in Q-learning:

Factors of Q-learning:
• Q-learning has two factors: the Q-function (Bellman equation) and the Q-table.
1. Q-function (Bellman equation):
• It is a recursive formula used to calculate the value of taking a given action in a given state, and so to determine the optimal action.

• $Q(s,a) = R(s,a) + \gamma \max_{a'} Q(s',a')$
Where:

• $Q(s,a)$ is the Q-value for the given state-action pair.

• $R(s,a)$ is the immediate reward for taking action $a$ in state $s$.
• $\gamma$ (gamma) is the discount factor, representing the importance of future rewards.
• $\max_{a'} Q(s',a')$ is the maximum Q-value over all possible actions $a'$ in the next state $s'$.
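As a minimal sketch of the equation above, the function below computes one Q-value from an immediate reward and the Q-values of the next state. This is the simplified form given in the slides, without a learning rate. The function name `bellman_q` and the sample numbers are hypothetical.

```python
def bellman_q(reward, next_q_values, gamma=0.8):
    """Return R(s,a) + gamma * max over a' of Q(s',a')."""
    return reward + gamma * max(next_q_values)

# Hypothetical case: no immediate reward, but the next state offers
# actions whose current Q-values are 0, 100 and 50.
q_value = bellman_q(reward=0, next_q_values=[0, 100, 50], gamma=0.8)
print(q_value)  # 80.0
```

The result is below 100 because the reward lies one step in the future and is discounted by gamma.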
2. Q-table:
• The Q-table is a data structure that stores a Q-value for every combination of state and action; the Q-learning algorithm updates these values.
• Number of rows = number of states.
• Number of columns = number of actions.
• Initially, every entry in the Q-table is set to 0.
• The agent uses the Q-table to take the best possible action, based on the expected reward of each action in the current state.
• In simple words, the Q-table is a lookup table: given a state (row) and an action (column), it returns the current estimate of that action's quality.
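The table layout described above might look like this in code. It is a sketch using plain Python lists. The sizes and the single value written into the table are hypothetical and only serve to show the lookup.

```python
n_states, n_actions = 6, 6      # hypothetical sizes

# One row per state, one column per action, every entry starting at 0.
q_table = [[0.0] * n_actions for _ in range(n_states)]

# Pretend the agent has already learned that action 5 is good in state 1.
q_table[1][5] = 100.0

# Reading the table: the best action in a state is the column
# holding the largest value in that state's row.
state = 1
best_action = max(range(n_actions), key=lambda a: q_table[state][a])
print(len(q_table), len(q_table[0]), best_action)  # 6 6 5
```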
Q-Learning algorithm:
Steps to follow in the Q-learning algorithm:
• Step 1: Create an initial Q-table with all values initialized to 0.
• Step 2: Choose an action in the current state and perform it.
• Step 3: Get the value of the reward, calculate the Q-value using the Bellman equation (Q-function), and update that value in the table.
• Step 4: Continue the same process until the episode ends, and repeat for further episodes until the values in the table stop changing.
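The four steps can be sketched end to end on a toy problem. Everything about the environment here is an assumption made for illustration: a hypothetical corridor of 4 states where action 0 moves left, action 1 moves right, and reaching state 3 gives a reward of 100. Actions are chosen at random, and the update is the simplified Bellman form from the slides.

```python
import random

N_STATES, N_ACTIONS, GOAL, GAMMA = 4, 2, 3, 0.8

def step(state, action):
    """Hypothetical corridor: action 0 moves left, action 1 moves right."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 100 if next_state == GOAL else 0
    return next_state, reward

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    # Step 1: Q-table with all values initialized to 0.
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:                      # Step 4: until the episode ends
            action = rng.randrange(N_ACTIONS)     # Step 2: choose and perform an action
            next_state, reward = step(state, action)
            # Step 3: compute the Q-value with the Bellman equation and store it.
            q[state][action] = reward + GAMMA * max(q[next_state])
            state = next_state
    return q

q = train()
# Greedy action per non-goal state after training (1 means "move right").
policy = [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(GOAL)]
print(policy)
```

After training, the largest value in each row should point toward the goal, with values shrinking by a factor of gamma for every extra step of distance.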
Example:
• Here, rooms are the states (s) and doors are the actions (a).
• Suppose that we have 5 rooms in a building. We number the rooms from 0 to 4, and the outside of the building can be thought of as one big room (5).
• We can represent each room as a node (state) and each door as a link (action).
• We have to get outside, into room 5, so our goal state is room 5.
• Important points:
• Goal room: 5.
• Doors that lead directly to room 5 have a reward of 100.
• Doors that do not lead directly to room 5 have a reward of 0.
• Where there is no door between two rooms (no link between the nodes), the reward is -1 (invalid link).
• Discount factor gamma: 0.8.
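The reward rules of this example can be turned into a reward matrix and used for two updates. The text does not say which rooms are connected, so the door list `DOORS` below is a hypothetical layout chosen for illustration. Only the reward values (100, 0, -1), the goal room and gamma = 0.8 come from the example.

```python
GAMMA = 0.8
GOAL = 5
N_ROOMS = 6   # rooms 0-4 plus the outside as room 5

# ASSUMPTION: hypothetical door layout; each pair is a door usable both ways.
DOORS = [(0, 4), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5)]

# Reward matrix R[state][action]; the action is "go to that room".
# -1 marks an invalid link (no door between the two rooms).
R = [[-1] * N_ROOMS for _ in range(N_ROOMS)]
for a, b in DOORS:
    for src, dst in ((a, b), (b, a)):
        R[src][dst] = 100 if dst == GOAL else 0

Q = [[0.0] * N_ROOMS for _ in range(N_ROOMS)]

def update(state, action):
    """Walk from room `state` to room `action` and update Q[state][action]."""
    assert R[state][action] != -1, "no door between these rooms"
    # Only doors that exist in the next room count as possible next actions.
    next_values = [Q[action][a] for a in range(N_ROOMS) if R[action][a] != -1]
    Q[state][action] = R[state][action] + GAMMA * max(next_values)
    return Q[state][action]

print(update(1, 5))  # 100.0  (door leads straight to the goal)
print(update(3, 1))  # 80.0   (0 now, plus 0.8 * 100 one step later)
```

The second update shows how the reward of 100 propagates backwards: room 3 has no door to the outside, yet moving to room 1 becomes valuable because room 1 does.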

Application: