Quiz AI1704 Page 2 of 2

Uploaded by

luchtse173080

10:01 18/7/24 Quiz - AI1704 (page 2 of 2)

Question 11
Answer saved
Marked out of 0.50

How can we estimate the performance gradient with respect to the policy parameter when the gradient depends on the unknown
effect of policy changes on the state distribution?

a. the TD
b. the dynamic programming
c. the Monte Carlo
d. the policy gradient theorem


Question 12
Answer saved
Marked out of 0.50

SARSA is a variant of the Expected SARSA algorithm that enhances learning by taking the expected value of action selection instead of
selecting a single action deterministically.

Select one:
True
False

Question 13
Answer saved
Marked out of 0.50

One advantage of parameterizing policies according to the soft-max in action preferences is that the approximate policy can
approach a deterministic policy.

Select one:
True
False
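The property named in Question 13 can be explored numerically. The following is a minimal sketch, not course material: the function name `softmax_policy` and the preference values are hypothetical, chosen only to show how action probabilities behave as the gaps between preferences grow.

```python
import math


def softmax_policy(preferences):
    """Return action probabilities proportional to exp(preference)."""
    m = max(preferences)  # subtract the max for numerical stability
    exps = [math.exp(h - m) for h in preferences]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical preferences for three actions; the scale widens the gaps.
base = [1.0, 0.5, 0.0]
for scale in (1, 10, 100):
    probs = softmax_policy([scale * h for h in base])
    print(scale, [round(p, 4) for p in probs])
```

Because the preferences are unbounded parameters, learning can drive them far apart, and the printed probabilities show what happens to the policy in that case.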


Question 14
Answer saved
Marked out of 0.50

Imagine the agent is learning in an episodic problem. Which of the following is true?

a. The number of steps in an episode is always the same.
b. The agent takes the same action at each step during an episode.
c. The number of steps in an episode is stochastic: each episode can have a different number of steps.


Question 15
Answer saved
Marked out of 0.50

Action selection is based on the expected value of all possible actions according to the current policy. It computes the expected value
of all actions and selects actions probabilistically based on their probabilities under the current policy.

a. SARSA
b. Expected SARSA
c. Bellman
d. Deep Learning
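The difference between bootstrapping from one sampled next action and bootstrapping from an expectation over next actions can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the two-action value list `q_next`, and the probabilities in `policy_probs` are all hypothetical.

```python
def sarsa_target(reward, gamma, q_next, next_action):
    """One-step SARSA target: bootstrap from the single sampled next action."""
    return reward + gamma * q_next[next_action]


def expected_sarsa_target(reward, gamma, q_next, policy_probs):
    """One-step Expected SARSA target: bootstrap from the policy-weighted
    average of the next state's action values."""
    expected_q = sum(p * q for p, q in zip(policy_probs, q_next))
    return reward + gamma * expected_q


# Hypothetical action values and policy probabilities in the next state.
q_next = [1.0, 3.0]
policy_probs = [0.25, 0.75]

print(sarsa_target(0.0, 0.9, q_next, next_action=1))
print(expected_sarsa_target(0.0, 0.9, q_next, policy_probs))
```

The first target changes with whichever next action happens to be sampled, while the second is the same for every sample drawn from the same policy.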

Question 16
Answer saved
Marked out of 0.50

Which algorithm has the step: "Interact with Environment: Sample trajectories by following the current policy in the environment"?

a. Actor Critic
b. Temporal Difference
c. Dynamic programming
d. Monte Carlo


Question 17
Answer saved
Marked out of 0.50

Given a state, the effect of the policy parameter on the actions, and thus on reward, can be computed in a relatively straightforward
way from knowledge of _____________.

a. the gradient
b. the parameterization
c. the value function
d. the algorithm

Question 18
Answer saved
Marked out of 0.50

The one-step algorithm is semi-gradient Expected Sarsa, which involves importance sampling.

Select one:
True
False


Question 19
Answer saved
Marked out of 0.50

Which method directly optimizes the parameters of a parameterized policy in order to maximize the expected cumulative reward obtained
by an agent in an environment?

a. Model
b. Policy Gradient
c. Bellman
d. TD(0)
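Directly adjusting policy parameters to increase expected return can be sketched in a few lines. This is a minimal sketch, assuming a single-state problem with a soft-max policy and a REINFORCE-style update without a baseline; the names `softmax` and `reinforce_step`, the step size, and the return value are hypothetical.

```python
import math


def softmax(theta):
    """Action probabilities from one preference parameter per action."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def reinforce_step(theta, action, ret, alpha):
    """One gradient-ascent step: theta += alpha * return * grad ln pi(action).

    For a soft-max policy, d ln pi(action) / d theta_i = 1[i == action] - pi(i).
    """
    probs = softmax(theta)
    return [
        t + alpha * ret * ((1.0 if i == action else 0.0) - p)
        for i, (t, p) in enumerate(zip(theta, probs))
    ]


theta = [0.0, 0.0]
new_theta = reinforce_step(theta, action=0, ret=1.0, alpha=0.5)
print(softmax(theta)[0], softmax(new_theta)[0])
```

After a positive return for action 0, the update raises that action's probability; no value table is consulted to choose actions.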


Question 20
Answer saved
Marked out of 0.50

