
CS 4501: Introduction to Reinforcement Learning

Instructor: Hongning Wang (hw5x)


TA: Wanyu Du (wd5jq)
https://www.cs.virginia.edu/~hw5x/Course/RL2022-Fall/_site/
Department of Computer Science
University of Virginia

1 Course Overview
Be curious, not judgmental.
– Walt Whitman, an American poet, essayist, and journalist.

Reinforcement learning has been widely adopted in our everyday life, from robotics, self-driving
cars, and game playing to intelligent information systems such as search engines and
recommender systems, and many more! It has been extensively studied in multiple disciplines,
including computer science, psychology, neuroscience, optimization, and operations research.
The recent success of DeepMind's AlphaGo has put reinforcement learning under the spotlight
and helped it receive ever-increasing attention from the field.
In this undergraduate-level course, we will discuss the foundations of reinforcement learning,
starting from multi-armed bandits and moving through Markov Decision Processes, planning,
on-policy and off-policy learning, recent developments in the context of deep learning, and
real-world applications. Through our lecture discussions and working on your course projects
with your teammates, you will be able to:

• Get familiar with the history of reinforcement learning research and its focus in different
fields of study, and recognize the boundaries between it and other machine learning
paradigms;
• Understand core reinforcement learning techniques, including dynamic programming,
Monte Carlo methods, temporal-difference methods, function approximation methods, and
off-policy evaluation and learning;
• Go deep into a specific topic in reinforcement learning on your own and present your
thoughts and ideas to your instructor and peers in classroom discussions;
• Team up with others to solve challenging reinforcement learning problems;
• Most importantly, get a sense of basic research activities: formulating a real-world problem
in an abstract and mathematical way, and developing principled solutions for it.

2 Prerequisites
Due to the intrinsic complexity of reinforcement learning, a number of prerequisites are
imposed or assumed. First of all, a good level of familiarity with general machine learning is
needed. As each one of you is expected to work with specific reinforcement learning algorithms
in your final course project, a strong background in machine learning is necessary. For
example, you should know what supervised machine learning is, so you can better appreciate how
reinforcement learning differs from it.
Second, strong mathematical skills will help you gain an in-depth understanding of the
concepts and algorithms discussed in the course and develop your own ideas for new solutions.
You need to be familiar with basic concepts of probability (e.g., probability distributions,
Bayes' theorem, and expectation), linear algebra (e.g., matrix inverse and decomposition), and
calculus (e.g., the Hessian and second-order optimality conditions).
Last but not least, significant programming experience will be helpful, as it lets you focus
on the exciting reinforcement learning algorithms being explored rather than the syntax of
programming languages. It is recommended that you have taken CS 2150 (or higher) and have a good
working familiarity with at least one programming language (Python is highly recommended).
If you are not sure whether you have met these prerequisites, please feel free to contact the instructor.

3 Course Content & Schedule


To give you broad exposure to reinforcement learning techniques, we will cover a variety of
basic elements, techniques, and modern advances in reinforcement learning, though the coverage
can never be exhaustive. The course will mainly be delivered through lecture-style discussions
led by the instructor, but students are highly encouraged to read the recommended papers after
each lecture to broaden the scope and deepen their understanding of reinforcement learning.
You will learn through lectures, in-class paper discussions, homework assignments, and course
projects. Topics to be covered include (the schedule is tentative and subject to change; please
keep track of it on our course website):

1. Introduction (∼2 lectures): We will highlight the basic structure and major topics of this
course, and go over some logistical issues and course requirements.

2. Basic elements of reinforcement learning (∼3 lectures): We will lay down the basic con-
cepts (e.g., reward vs. return, value function vs. policy) and building blocks of rein-
forcement learning, and introduce categorizations of reinforcement learning problems (e.g.,
planning/control vs. learning) and of the corresponding solutions (e.g., model-based vs.
model-free reinforcement learning).

3. Multi-armed bandits (∼4 lectures): They are a good entry point to more complicated reinforce-
ment learning problems, as a bandit can be understood as reinforcement learning with no state,
or with state but no state transitions (a.k.a. contextual bandits). We will focus on the key
challenge in multi-armed bandits, i.e., the explore-exploit trade-off, and introduce classical
solutions that effectively balance this trade-off.
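To make the explore-exploit trade-off concrete, here is a minimal, illustrative sketch (in Python, the language this course recommends) of the classical ε-greedy strategy on a Bernoulli bandit. The arm means and parameter values below are invented for illustration and are not part of any assignment:

```python
import random

def epsilon_greedy_bandit(true_means, n_steps=10000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit; return the per-arm
    sample-average reward estimates and the total reward collected."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k          # number of pulls per arm
    estimates = [0.0] * k     # sample-average reward estimate per arm
    total_reward = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:                  # explore: random arm
            arm = rng.randrange(k)
        else:                                       # exploit: current best estimate
            arm = max(range(k), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental sample-average update of the pulled arm's estimate
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward
```

With a higher ε the agent explores more and wastes pulls on bad arms; with ε too low it risks locking onto a suboptimal arm — exactly the trade-off the lectures will formalize.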

4. Markov decision processes (∼2 lectures): The MDP is one of the most well-studied reinforcement
learning problems, and it is often (mistakenly) considered to be the reinforcement learning
problem. It provides a mathematical framework for modeling decision making in situations
where outcomes are partly random and partly under the control of a decision maker. We
will carefully discuss its structure, underlying assumptions, and limitations. This will be
our primary problem setup for discussing existing reinforcement learning algorithms in this
course.

5. Dynamic programming (∼2 lectures): Dynamic programming is a well-known problem-solving
principle in computer science (e.g., in algorithm design), and it is the foundation for computing
optimal policies given a perfect model of the environment. We will cover the important con-
cepts of the Bellman optimality equation, value iteration, and policy iteration, which originate
from the dynamic programming technique.
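As a taste of these ideas, the value iteration update V(s) ← max_a [R(s,a) + γ Σ_{s'} p(s'|s,a) V(s')] fits in a few lines of Python. This is an illustrative sketch only; the data-structure conventions (dicts keyed by state and action) are our own choice, not course material:

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """Compute optimal state values for a known finite MDP.
    P[s][a] = list of (probability, next_state); R[s][a] = expected reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup over all actions
            v_new = max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                        for a in actions)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:       # stop once the largest change is negligible
            return V
```

For a toy two-state MDP where staying in state 1 yields reward 1 and γ = 0.9, this converges to V(1) = 1/(1-0.9) = 10 and V(0) = 0.9 · V(1) = 9, matching the geometric-series intuition behind discounted return.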

6. Monte Carlo methods (∼2 lectures): They relax the assumption of complete knowledge of the
environment and instead enable an agent to learn from the experience gained by inter-
acting with the environment. We will cover Monte Carlo methods for value estimation and
control. Time permitting, we will also cover another important direction in reinforcement
learning, i.e., off-policy evaluation and learning, using Monte Carlo methods.
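The core of Monte Carlo value estimation is simply averaging observed returns. Here is a minimal first-visit Monte Carlo prediction sketch (illustrative only; the episode representation as a list of (state, reward) pairs is our own convention):

```python
import random

def first_visit_mc(sample_episode, n_episodes=5000, gamma=1.0, seed=0):
    """Estimate V(s) by averaging first-visit returns over sampled episodes.
    sample_episode(rng) must return a list of (state, reward) pairs, where
    the reward is the one received on leaving that state."""
    rng = random.Random(seed)
    returns = {}  # state -> (sum of first-visit returns, visit count)
    for _ in range(n_episodes):
        episode = sample_episode(rng)
        # compute the return G at every step by sweeping backwards
        G = 0.0
        rets = []
        for s, r in reversed(episode):
            G = r + gamma * G
            rets.append((s, G))
        rets.reverse()
        # record only the FIRST visit of each state in this episode
        seen = set()
        for s, G in rets:
            if s not in seen:
                seen.add(s)
                total, count = returns.get(s, (0.0, 0))
                returns[s] = (total + G, count + 1)
    return {s: total / count for s, (total, count) in returns.items()}
```

No transition model is needed: the agent learns purely from sampled experience, which is exactly the relaxation this lecture block is about.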

7. Temporal-difference learning (∼3 lectures): It is an important class of model-free rein-
forcement learning solutions, which combine ideas from Monte Carlo methods (learning
from experience) and dynamic programming (model-based planning) via bootstrapping.
It is widely considered one of the most fundamental ideas in reinforcement learning.
We will cover n-step TD learning methods, TD(λ), and Q-learning in our lectures.
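To preview the bootstrapping idea, here is tabular Q-learning on a toy chain environment (an illustrative sketch with made-up parameters, not an assignment solution): the TD target bootstraps from the current estimate of the next state's value instead of waiting for the full return.

```python
import random

def q_learning(n_states=5, n_episodes=2000, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy deterministic chain: actions move left (-1)
    or right (+1); reaching the rightmost state ends the episode, reward 1."""
    rng = random.Random(seed)
    actions = [-1, +1]
    Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(n_episodes):
        s = 0
        while s != n_states - 1:
            if rng.random() < epsilon:           # explore
                a = rng.choice(actions)
            else:                                # exploit, breaking ties randomly
                best = max(Q[(s, a_)] for a_ in actions)
                a = rng.choice([a_ for a_ in actions if Q[(s, a_)] == best])
            s2 = min(max(s + a, 0), n_states - 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # bootstrap from the greedy value of the next state (zero at terminal)
            boot = 0.0 if s2 == n_states - 1 else max(Q[(s2, a_)] for a_ in actions)
            Q[(s, a)] += alpha * (r + gamma * boot - Q[(s, a)])
            s = s2
    return Q
```

After training, the greedy policy at every state is "move right," and Q(s, +1) approaches γ^(distance to goal minus one), illustrating how value propagates backwards through bootstrapped updates.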

8. Policy gradient (∼3 lectures): Policy gradient methods are another important family of
model-free reinforcement learning solutions, which perform gradient-based optimization directly
in a parameterized policy space. They are especially useful in problems with continuous state
spaces. We will cover the classical REINFORCE algorithm, off-policy policy gradient, and the
actor-critic method.
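As a small preview of gradient-based policy optimization, here is one-step REINFORCE with a softmax policy on a two-armed Bernoulli bandit (an illustrative sketch with invented parameters; in the one-step case this coincides with the gradient bandit algorithm). A running average reward serves as a baseline to reduce variance:

```python
import math
import random

def reinforce_bandit(true_means=(0.2, 0.8), n_steps=5000, lr=0.1, seed=0):
    """One-step REINFORCE with a softmax policy over action preferences.
    Returns the final action probabilities."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]   # action preferences (policy parameters)
    baseline = 0.0       # running average reward, used as a baseline

    def softmax(prefs):
        m = max(prefs)
        exps = [math.exp(x - m) for x in prefs]
        z = sum(exps)
        return [e / z for e in exps]

    for t in range(1, n_steps + 1):
        probs = softmax(theta)
        # sample an action from the current stochastic policy
        a = 0 if rng.random() < probs[0] else 1
        r = 1.0 if rng.random() < true_means[a] else 0.0
        baseline += (r - baseline) / t
        # grad of log pi(a) w.r.t. theta[i] is 1{i==a} - probs[i]
        for i in range(2):
            grad = (1.0 if i == a else 0.0) - probs[i]
            theta[i] += lr * (r - baseline) * grad
    return softmax(theta)
```

Unlike value-based methods, nothing here estimates Q-values: the policy parameters are nudged directly in the direction that makes above-baseline rewards more likely.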

9. Approximation methods (∼2 lectures): Function approximation is an important technique
for making reinforcement learning applicable in practice, especially when the state or action
space is prohibitively large. We will cover various commonly employed approximation meth-
ods that extend reinforcement learning from tabular methods to parametric methods.
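To illustrate the tabular-to-parametric jump, here is semi-gradient TD(0) with a linear value function on a simple random walk (an illustrative sketch; the two-feature representation — normalized position plus a bias — is our own choice for this toy problem):

```python
import random

def semi_gradient_td0(n_states=10, n_episodes=3000, alpha=0.05, gamma=1.0, seed=0):
    """Semi-gradient TD(0) with a linear value function on a random walk over
    states 0..n_states, where both ends are terminal and only the right end
    pays reward 1. Returns the learned value function."""
    rng = random.Random(seed)
    w = [0.0, 0.0]  # weights for features [position / n_states, bias]

    def features(s):
        return [s / n_states, 1.0]

    def value(s):
        return sum(wi * xi for wi, xi in zip(w, features(s)))

    for _ in range(n_episodes):
        s = n_states // 2                       # start in the middle
        while 0 < s < n_states:
            s2 = s + (1 if rng.random() < 0.5 else -1)
            r = 1.0 if s2 == n_states else 0.0
            v_next = 0.0 if s2 in (0, n_states) else value(s2)
            delta = r + gamma * v_next - value(s)
            # semi-gradient update: only differentiate through value(s)
            for i, xi in enumerate(features(s)):
                w[i] += alpha * delta * xi
            s = s2
    return value
```

Instead of one table entry per state, two weights summarize the whole value function — the same idea that scales, via richer function classes, to the deep methods discussed next.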

10. Deep reinforcement learning (∼4 lectures): Among the commonly employed function
approximation methods, deep neural networks stand out due to their exceptional repre-
sentation learning power. We will introduce deep neural network based methods for both
value-based and policy-based reinforcement learning algorithms.

11. Offline reinforcement learning (∼2 lectures): Reinforcement learning improves a policy
through interactions with the environment. However, in many real-world applications, online
interaction with an environment is expensive. It is thus important to study how to leverage
existing offline data for policy learning. We will cover the most recent developments in
this direction and elaborate on the key insights for addressing this challenging problem.

4 Assessments
This course will be structured as a hybrid of lecture-driven classes, paper reading, and a hands-on
course project. Four lightweight homework assignments will be provided to help you practice
reinforcement learning algorithms, four in-class quizzes will help you master the basic concepts and
key ideas in reinforcement learning, and the course project will help you take a deep dive into the
field. These planned activities should give you a comprehensive understanding of the
course materials and the spectrum of reinforcement learning.
Paper Review (10%) Paper reading is a vital skill in any field of research, and we would like
to provide you with training to hone this skill. Throughout the semester, every student is required
to choose a paper from the suggested readings of each lecture, carefully read it, summarize your
understanding of it, and post your summary on Piazza. We will ask you to act as a reviewer of
the paper and write a critical review of it (covering, e.g., both positive and negative aspects of the
paper). The summaries will be peer-evaluated, and high-quality summaries will receive
bonus points.
Homework Assignments (36%) We have prepared three machine problems (12% each) to
guide you through the core details of multi-armed bandit algorithms (MP1), Markov Decision
Processes (MP2), and policy gradient methods (MP3). Given the nature of reinforcement learning,
the effectiveness of your algorithm, in terms of optimality (e.g., regret), will be prioritized.
Implementation efficiency will also be emphasized, since these algorithms are often
applied in an online fashion.
In-class Quizzes (24%) To help you master the basic concepts and key ideas in reinforce-
ment learning, we will have four in-class quizzes (6% each). The quizzes consist of
True/False questions, multiple-choice questions, and short-answer questions.
Course Project (40%) Practice makes perfect. Given the intrinsic complexity of reinforcement
learning, it is hard to believe one can gain any in-depth understanding of an algorithm without
using it to solve real problems. Our course project gives you such hands-on experience in solving
interesting reinforcement learning problems, e.g., playing StarCraft. The project welcomes
either research-oriented problems or “deliverables.” You need to identify the problem on your
own, apply the knowledge learned in class and beyond, and work in a group of 2-3 students
to solve it. It is preferred that the outcome of your project be publishable, e.g., your
(unique) solution to some (interesting/important/new) problem, or tangible, e.g., a
prototype system that can be demonstrated. Bonus points will be given to groups that meet
either of the above criteria. Discussing your project idea and progress with the instructor and
TA is an important way to ensure your success in the end. Every group needs to present
their work to the class and submit a written report summarizing their results.

5 Resources
There are already tons of video lectures, informative documentation, blog articles, technical
reports, research papers, and open implementations available on the Internet about reinforce-
ment learning. As a student, it is very important to leverage such online resources to
boost your knowledge and research.
We have an official textbook for this course: “Reinforcement Learning: An Introduc-
tion, Second Edition,” by Richard S. Sutton and Andrew G. Barto, MIT Press, November 2018.
Our lecture discussions will be mostly based on this textbook.
The instructor has also listed several good online resources to help you master the course
material, including online tutorials, similar courses offered at other institutions, public toolkits
and libraries, and research papers and reports. You can find them on our course website. You
are also welcome to share any material you find helpful in our course forum.

6 Policies
How to participate in this course? To minimize the impact of COVID-19 on our students,
we will take extra caution and safety measures to protect everyone's health, including streaming
this course live on Zoom throughout the whole semester. Although the lectures will be given
in person in a classroom, the instructor will also present the materials live via Zoom and have
them recorded, so that if you do not feel well or have concerns about COVID exposure you
can still access the class remotely. It is completely fine for you to participate in our discussions
remotely via Zoom, including your final project presentations. For your convenience, the
Zoom link for our live lecture discussions is listed below: https://virginia.zoom.us/meeting/
tJIpfumopzstEtdMTHLrGt_A6jj_dnNWxopj/ics?icsToken=98tyKuCuqjIqGt2VtxGERowABor4c_
TxmGJaj7dZsSvNLzJ0djzXYOhIDbZxPu_I.
When should I start working on my assignments? You will be given two weeks to finish each
assignment. Our late policy is simple: 1) you have 7 free late days in total across all assignments;
2) you can use late days on any assignment, and each late day extends the deadline by 24 hours;
3) once you have used all 7 late days, the penalty is 10% for each additional late day (until 0
points are left). Starting early is always recommended: given the nature of computer programming,
exceptions and errors always show up at the last step.
Evaluation Rubrics The detailed evaluation rubrics will be carefully discussed in the instruc-
tions for your homework assignments, paper reading, and final project report, and you can find
them on our course website. Please note that NO curving will be applied at the end of the
semester; the final grade will be calculated using the weight of each assessment
defined in the Assessments section.
What should you not do in this course? Plagiarism is considered serious misconduct in
computer science, in both industry and academia: it hurts your credibility and may also lead to
legal issues involving copyright and intellectual property. For our machine problems, while
discussing with peers or the instructor is allowed, copying others' code or implementations
(including those of former students in this class) is strictly prohibited. All our machine problems
are designed for students to finish individually, and therefore sharing code or implementations,
or collaborating, is not allowed. Using third-party public libraries is allowed (unless explicitly
instructed not to), but it has to be clearly documented and explained in your assignment report.
Disabilities The University of Virginia strives to provide accessibility to all students. If you
require an accommodation to fully access this course, please contact the Student Disability
Access Center (SDAC) at (434) 243-5180 or [email protected]. If you are unsure whether you
require an accommodation, or to learn more about their services, you may contact the SDAC
at the number above or by visiting their website at http://studenthealth.virginia.edu/
student-disability-access-center/faculty-staff.
Religious Accommodations It is the University’s long-standing policy and practice to rea-
sonably accommodate students so that they do not experience an adverse academic consequence
when sincerely held religious beliefs or observances conflict with academic requirements. Stu-
dents who wish to request academic accommodation for a religious observance should submit
their request in writing directly to me by email as far in advance as possible. Students and
instructors who have questions or concerns about academic accommodations for religious obser-
vance or religious beliefs may contact the University’s Office for Equal Opportunity and Civil
Rights (EOCR) at [email protected] or 434-924-3200. Accommodations do not relieve
you of the responsibility for completing any part of the coursework missed as the result of a
religious observance.
Grade Cutoffs We will use the standard grade cutoff points, and no curving will be applied to
your final grades, so that you can keep track of and predict your final letter grade on the fly:

Table 1: Grade cutoff points


Letter Grade Point Range
A+ [97,110]
A [93,97)
A- [90, 93)
B+ [87, 90)
B [83, 87)
B- [80, 83)
C+ [77, 80)
C [73, 77)
C- [70, 73)
D+ [67, 70)
D [63, 67)
D- [60, 63)
F [0, 60)
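Read mechanically, the table assigns the letter of the first cutoff a point total meets or exceeds. As a small illustrative Python helper (our own sketch of the table above, not official grading code):

```python
def letter_grade(points):
    """Map a final point total to a letter grade using the cutoffs in Table 1.
    Each entry is (lower bound, grade); intervals are inclusive at the bottom."""
    cutoffs = [
        (97, "A+"), (93, "A"), (90, "A-"),
        (87, "B+"), (83, "B"), (80, "B-"),
        (77, "C+"), (73, "C"), (70, "C-"),
        (67, "D+"), (63, "D"), (60, "D-"),
    ]
    for lower, grade in cutoffs:
        if points >= lower:
            return grade
    return "F"   # anything below 60
```

Because the A+ range extends to 110, bonus points earned on paper reviews and the course project can push a total above 100 without being wasted.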

7 Communications
Meeting Times We will have our lectures every Tuesday and Thursday morning from
9:30am to 10:45am, both in person in Thornton Hall E303 and via Zoom. The Zoom
link is https://virginia.zoom.us/meeting/tJIpfumopzstEtdMTHLrGt_A6jj_dnNWxopj/ics?
icsToken=98tyKuCuqjIqGt2VtxGERowABor4c_TxmGJaj7dZsSvNLzJ0djzXYOhIDbZxPu_I, which
can also be found on our Collab site.
Office Hours The instructor's office hours will be held on Tuesday and Thursday afternoons
from 4:00pm to 5:00pm, online via Zoom. The TA's office hours will be held on Wednesday
and Friday afternoons from 2:00pm to 3:00pm, online via Zoom. You can find the Zoom links for
our office hours on Collab, but please make an appointment beforehand (at least two hours in
advance). Additional office hours can be requested by email, and you can also request in-person
meetings if you feel that would make our discussions more effective.
Course Web Site The course website is located at http://www.cs.virginia.edu/~hw5x/
Course/RL2022-Fall/_site/. All course announcements and materials will be posted on
this website. Our Collab site will be used for homework submission, grade release, Zoom
meetings, and recordings.
Piazza The most important forum for communication in this class is the course's Piazza fo-
rum. Piazza is like a newsgroup or forum: you are encouraged to use it to ask questions,
initiate discussions, express opinions, share resources, and give advice. The Piazza site for this
class is https://piazza.com/virginia/fall2022/cs450120055. Please enroll yourself at the
beginning of the semester.
We expect that you will be courteous and post only material that is somehow related to the
topic of reinforcement learning or the course content. Posts will be lightly moderated. Note
that private posts on Piazza can be used for things like conflict requests, or for letting us know
about anything you do not really want to share with your classmates.

8 At the end
Thanks to you for reading the entire syllabus. Hopefully it makes your experience a bit easier
and less stressful, and focus on more on this exciting area of research!
