Course 2: Sample-Based Learning Methods
This document outlines the learning objectives for a course on sample-based learning methods, with modules on Monte Carlo methods, temporal-difference learning, and planning. Each module's lessons cover key concepts, algorithms, and applications in reinforcement learning, and the course aims to equip learners to apply these methods and to understand their advantages and limitations.
Sample-Based Learning Methods: Learning Objectives
Module 00: Welcome to the Course
Understand the prerequisites, goals and roadmap for the course.
Module 01: Monte Carlo Methods for Prediction & Control
Lesson 1: Introduction to Monte Carlo Methods
Understand how Monte Carlo methods can be used to estimate value functions from sampled interaction
Identify problems that can be solved using Monte Carlo methods
Use Monte Carlo prediction to estimate the value function for a given policy
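As a concrete illustration of Monte Carlo prediction, here is a minimal first-visit sketch in Python. The generate_episode interface and the toy two-state episode are illustrative assumptions, not course materials.

from collections import defaultdict

def first_visit_mc_prediction(generate_episode, num_episodes, gamma=0.9):
    # Estimate v_pi by averaging first-visit returns over sampled episodes.
    # generate_episode (an assumed interface) returns [(S_t, R_{t+1}), ...]:
    # each pair is a state and the reward received on leaving it.
    V = defaultdict(float)   # value estimates, default 0
    n = defaultdict(int)     # first-visit counts for incremental averaging
    for _ in range(num_episodes):
        episode = generate_episode()
        states = [s for s, _ in episode]
        G = 0.0
        # Walk backwards so G accumulates the discounted return from t.
        for t in range(len(episode) - 1, -1, -1):
            s, r = episode[t]
            G = gamma * G + r
            if s not in states[:t]:          # first visit to s this episode
                n[s] += 1
                V[s] += (G - V[s]) / n[s]    # incremental sample average
    return V

# Toy episode: state 'A' (reward 0), then state 'B' (reward 1), then done.
V = first_visit_mc_prediction(lambda: [('A', 0.0), ('B', 1.0)], 1000)
print(V['A'], V['B'])   # 0.9 and 1.0 with gamma = 0.9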
Lesson 2: Monte Carlo for Control
Estimate action-value functions using Monte Carlo
Understand the importance of maintaining exploration in Monte Carlo algorithms
Understand how to use Monte Carlo methods to implement a generalized policy iteration (GPI) algorithm
Apply Monte Carlo with exploring starts to solve an MDP
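A minimal sketch of Monte Carlo control with exploring starts, one form of GPI: evaluation averages first-visit returns, improvement acts greedily. The states/actions/step interface is a hypothetical, assumed-episodic MDP (step returns (reward, next_state), with next_state None at termination).

import random
from collections import defaultdict

def mc_control_exploring_starts(states, actions, step, num_episodes, gamma=0.9):
    # Every episode starts from a random state-action pair (the exploring
    # start), then follows the current greedy policy.
    Q = defaultdict(float)
    n = defaultdict(int)
    pi = {s: random.choice(actions) for s in states}
    for _ in range(num_episodes):
        s, a = random.choice(states), random.choice(actions)
        episode = []
        while s is not None:                 # assumes episodes terminate
            r, s2 = step(s, a)
            episode.append((s, a, r))
            s = s2
            a = pi[s] if s is not None else None
        pairs = [(s_, a_) for s_, a_, _ in episode]
        G = 0.0
        for t in range(len(episode) - 1, -1, -1):
            s_, a_, r = episode[t]
            G = gamma * G + r
            if (s_, a_) not in pairs[:t]:    # first visit to the pair
                n[(s_, a_)] += 1
                Q[(s_, a_)] += (G - Q[(s_, a_)]) / n[(s_, a_)]
                # Policy improvement: act greedily w.r.t. the new estimates.
                pi[s_] = max(actions, key=lambda b: Q[(s_, b)])
    return Q, pi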
Lesson 3: Exploration Methods for Monte Carlo
Understand why Exploring Starts can be problematic in real problems
Describe an alternative exploration method for Monte Carlo control
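The alternative typically used is an epsilon-soft policy. A minimal sketch of epsilon-greedy action selection follows; the function name and Q layout are illustrative assumptions.

import random

def epsilon_greedy(Q, state, actions, epsilon=0.1):
    # Epsilon-soft exploration: with probability epsilon take a random
    # action, otherwise act greedily w.r.t. the current action values.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

Because every action keeps probability at least epsilon divided by the number of actions, exploration is maintained without requiring exploring starts.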
Lesson 4: Off-policy learning for prediction
Understand how off-policy learning can help deal with the exploration problem
Produce examples of target policies and examples of behavior policies
Understand importance sampling
Use importance sampling to estimate the expected value of a target distribution using samples from a different distribution
Understand how to use importance sampling to correct returns
Understand how to modify the Monte Carlo prediction algorithm for off-policy learning
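One way to make these objectives concrete: a minimal sketch of off-policy Monte Carlo prediction using ordinary (every-visit) importance sampling. The generate_episode, pi_prob, and b_prob interfaces are hypothetical.

from collections import defaultdict

def off_policy_mc_prediction(generate_episode, pi_prob, b_prob,
                             num_episodes, gamma=0.9):
    # Episodes are generated by the behavior policy b; returns are
    # reweighted by the importance-sampling ratio so their average
    # estimates v_pi for the target policy pi.
    # generate_episode (assumed) returns [(S_t, A_t, R_{t+1}), ...];
    # pi_prob(a, s) and b_prob(a, s) give each policy's action probabilities.
    V = defaultdict(float)
    n = defaultdict(int)
    for _ in range(num_episodes):
        G, rho = 0.0, 1.0
        # Walking backwards, rho accumulates the product of
        # pi(A_k|S_k) / b(A_k|S_k) over the episode tail from time t on.
        for s, a, r in reversed(generate_episode()):
            G = gamma * G + r
            rho *= pi_prob(a, s) / b_prob(a, s)
            n[s] += 1
            V[s] += (rho * G - V[s]) / n[s]   # running average of rho * G
    return V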
Module 2: Temporal Difference Learning Methods for Prediction
Lesson 1: Introduction to Temporal Difference Learning
Define temporal-difference learning
Define the temporal-difference error
Understand the TD(0) algorithm (see the sketch below)
Lesson 2: Advantages of TD
Understand the benefits of learning online with TD
Identify key advantages of TD methods over Dynamic Programming and Monte Carlo methods
Identify the empirical benefits of TD learning
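A minimal sketch of tabular TD(0) prediction; env_step and policy are hypothetical interfaces (env_step returns (reward, next_state), with next_state None at termination).

from collections import defaultdict

def td0_prediction(env_step, policy, start_state, num_episodes,
                   alpha=0.1, gamma=0.9):
    V = defaultdict(float)
    for _ in range(num_episodes):
        s = start_state
        while s is not None:
            r, s2 = env_step(s, policy(s))
            # The TD target bootstraps from the current estimate of the
            # next state's value; terminal states have value 0.
            target = r + (gamma * V[s2] if s2 is not None else 0.0)
            V[s] += alpha * (target - V[s])   # alpha times the TD error
            s = s2
    return V

Unlike Monte Carlo, the update happens online after every step, without waiting for the episode to finish, which is one of the advantages Lesson 2 examines.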
Module 3: Temporal Difference Learning Methods for Control
Lesson 1: TD for Control
Explain how generalized policy iteration can be used with TD to find improved policies
Describe the Sarsa control algorithm
Understand how the Sarsa control algorithm operates in an example MDP
Analyze the performance of a learning algorithm
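A minimal sketch of Sarsa control under the same assumed env_step interface, with a single epsilon-greedy policy used both to behave and to improve.

import random
from collections import defaultdict

def sarsa(env_step, start_state, actions, num_episodes,
          alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = defaultdict(float)

    def eps_greedy(s):
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(s, a)])

    for _ in range(num_episodes):
        s = start_state
        a = eps_greedy(s)
        while s is not None:
            r, s2 = env_step(s, a)
            a2 = eps_greedy(s2) if s2 is not None else None
            # On-policy: the target uses the action A' actually selected
            # by the same epsilon-greedy policy that is being improved.
            target = r + (gamma * Q[(s2, a2)] if s2 is not None else 0.0)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s, a = s2, a2
    return Q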
Lesson 2: Off-policy TD Control: Q-learning
Describe the Q-learning algorithm
Explain the relationship between Q-learning and the Bellman optimality equations
Apply Q-learning to an MDP to find the optimal policy
Understand how Q-learning performs in an example MDP
Understand the differences between Q-learning and Sarsa
Understand how Q-learning can be off-policy without using importance sampling
Describe how the on-policy nature of Sarsa and the off-policy nature of Q-learning affect their relative performance
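A minimal Q-learning sketch under the same assumed interface; compare the target computation with Sarsa's.

import random
from collections import defaultdict

def q_learning(env_step, start_state, actions, num_episodes,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = defaultdict(float)
    for _ in range(num_episodes):
        s = start_state
        while s is not None:
            if random.random() < epsilon:    # behavior: epsilon-greedy
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            r, s2 = env_step(s, a)
            # Off-policy: the target bootstraps from the greedy (max)
            # action regardless of what the behavior policy does next,
            # which is why no importance sampling is needed.
            nxt = max(Q[(s2, b)] for b in actions) if s2 is not None else 0.0
            Q[(s, a)] += alpha * (r + gamma * nxt - Q[(s, a)])
            s = s2
    return Q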
Lesson 3: Expected Sarsa
Describe the Expected Sarsa algorithm
Describe Expected Sarsa's behavior in an example MDP
Understand how Expected Sarsa compares to Sarsa control
Understand how Expected Sarsa can do off-policy learning without using importance sampling
Explain how Expected Sarsa generalizes Q-learning
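A minimal sketch of the Expected Sarsa target under an epsilon-greedy target policy; the helper name and Q layout are illustrative assumptions.

def expected_sarsa_target(Q, s2, actions, r, gamma=0.9, epsilon=0.1):
    # Replace Sarsa's sampled next action with an expectation over the
    # target policy's action probabilities at the next state.
    if s2 is None:
        return r
    greedy = max(actions, key=lambda a: Q[(s2, a)])
    probs = {a: epsilon / len(actions) for a in actions}
    probs[greedy] += 1.0 - epsilon
    return r + gamma * sum(probs[a] * Q[(s2, a)] for a in actions)

With epsilon = 0 the target policy is greedy and the expectation collapses to a max, recovering the Q-learning target; in this sense Expected Sarsa generalizes Q-learning. Because the expectation is taken directly over the target policy, it can learn off-policy without importance sampling.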
Module 4: Planning, Learning & Acting
Lesson 1: What is a model?
Describe what a model is and how models can be used
Classify models as distribution models or sample models
Identify when to use a distribution model or a sample model
Describe the advantages and disadvantages of sample models and distribution models
Explain why sample models can be represented more compactly than distribution models
Lesson 2: Planning
Explain how planning is used to improve policies
Describe random-sample one-step tabular Q-planning
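A minimal sketch of random-sample one-step tabular Q-planning, assuming a learned sample model stored as a dict mapping (s, a) to the last observed (reward, next_state), and Q as a defaultdict(float).

import random

def q_planning_step(Q, model, actions, alpha=0.1, gamma=0.9):
    # Search control: sample a previously visited state-action pair
    # uniformly at random, then apply a Q-learning update using the
    # model's simulated outcome instead of real experience.
    s, a = random.choice(list(model.keys()))
    r, s2 = model[(s, a)]
    nxt = max(Q[(s2, b)] for b in actions) if s2 is not None else 0.0
    Q[(s, a)] += alpha * (r + gamma * nxt - Q[(s, a)])

Repeating this update on random samples from the model is planning in its simplest form: simulated experience improves the value function without further interaction with the environment.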
Lesson 3: Dyna as a formalism for planning
Recognize that direct RL updates use experience from the environment to improve a policy or value function
Recognize that planning updates use experience from a model to improve a policy or value function
Describe how both direct RL and planning updates can be combined through the Dyna architecture
Describe the Tabular Dyna-Q algorithm
Identify the direct-RL and planning updates in Tabular Dyna-Q
Identify the model-learning and search-control components of Tabular Dyna-Q
Describe how learning from both direct and simulated experience impacts performance
Describe how simulated experience can be useful when the model is accurate
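A minimal sketch of Tabular Dyna-Q combining the pieces above. It assumes a deterministic environment, so the model can store the last observed outcome for each pair; env_step is the same hypothetical interface as before.

import random
from collections import defaultdict

def dyna_q(env_step, start_state, actions, num_episodes,
           n_planning=10, alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = defaultdict(float)
    model = {}                               # (s, a) -> (r, s')

    def q_update(s, a, r, s2):
        nxt = max(Q[(s2, b)] for b in actions) if s2 is not None else 0.0
        Q[(s, a)] += alpha * (r + gamma * nxt - Q[(s, a)])

    for _ in range(num_episodes):
        s = start_state
        while s is not None:
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            r, s2 = env_step(s, a)
            q_update(s, a, r, s2)            # direct RL from real experience
            model[(s, a)] = (r, s2)          # model learning
            for _ in range(n_planning):      # planning: simulated experience
                ps, pa = random.choice(list(model.keys()))  # search control
                pr, ps2 = model[(ps, pa)]
                q_update(ps, pa, pr, ps2)
            s = s2
    return Q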
Lesson 4: Dealing with inaccurate models
Identify ways in which models can be inaccurate
Explain the effects of planning with an inaccurate model
Describe how Dyna can plan successfully with a partially inaccurate model
Explain how model inaccuracies produce another exploration-exploitation trade-off
Describe how Dyna-Q+ proposes a way to address this trade-off
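A minimal sketch of the Dyna-Q+ idea: during planning, reward long-untried pairs with an exploration bonus so the agent re-tests parts of the world where its model may have gone stale. The kappa value and the last_tried bookkeeping structure are illustrative assumptions.

import math

def planning_reward(r_model, last_tried, t_now, s, a, kappa=1e-3):
    # last_tried maps (s, a) to the real time step at which the pair was
    # last executed; tau is how long the pair has gone untested.
    tau = t_now - last_tried.get((s, a), 0)
    # The bonus kappa * sqrt(tau) makes long-neglected pairs attractive
    # during planning, trading off exploiting the current model against
    # re-exploring where it may now be wrong.
    return r_model + kappa * math.sqrt(tau)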