Improving Interactive Reinforcement Agent Planning with Human Demonstration

Li, Guangliang; Gomez, Randy; Nakamura, Keisuke; Lin, Jinying; Zhang, Qilei; He, Bo

arXiv:1904.08621 (cs)

[Submitted on 18 Apr 2019]

Abstract: TAMER has proven to be a powerful interactive reinforcement learning method that allows ordinary people to teach and personalize an autonomous agent's behavior by providing evaluative feedback. However, a TAMER agent planning with UCT, a Monte Carlo Tree Search strategy, can only update states along its path, which can incur a high learning cost, especially for a physical robot. In this paper, we propose to drive the agent's exploration along the optimal path and reduce the learning cost by initializing the agent's reward function via inverse reinforcement learning from demonstration. We test the proposed method in Grid World, an RL benchmark domain, with different discount factors on human reward. Our results show that learning from demonstration allows a TAMER agent to learn a roughly optimal policy up to the deepest search and encourages the agent to explore along the optimal path. In addition, learning from demonstration improves learning efficiency: it reduces the total feedback and the number of incorrect actions needed to obtain an optimal policy and increases the ratio of correct actions, allowing a TAMER agent to converge faster.
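The idea of seeding the reward model before interactive training can be illustrated with a minimal sketch. It is not the authors' method: instead of inverse reinforcement learning, it uses a crude stand-in that gives demonstrated state-action pairs a positive starting value, and it omits UCT planning entirely. The grid coordinates, the action names, and the functions `init_reward_from_demo`, `tamer_update`, and `greedy_action` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical 4-action grid world; names are illustrative, not from the paper.
ACTIONS = ("up", "down", "left", "right")


def init_reward_from_demo(demo, bonus=1.0):
    """Seed a tabular human-reward model H(s, a) from demonstrated pairs.

    Assumption: a crude stand-in for inverse reinforcement learning.
    Demonstrated (state, action) pairs start at `bonus`, all others at 0.
    """
    H = defaultdict(float)
    for state, action in demo:
        H[(state, action)] = bonus
    return H


def tamer_update(H, state, action, feedback, lr=0.5):
    """Move H(s, a) toward the trainer's scalar feedback (TAMER-style step)."""
    H[(state, action)] += lr * (feedback - H[(state, action)])


def greedy_action(H, state):
    """Pick the action with the highest predicted human reward.

    Ties are broken in favor of the first action in ACTIONS.
    """
    return max(ACTIONS, key=lambda a: H[(state, a)])


# A short demonstrated path along the top row, then down.
demo = [((0, 0), "right"), ((0, 1), "right"), ((0, 2), "down")]
H = init_reward_from_demo(demo)

# Before any feedback, the agent already follows the demonstration.
first_choice = greedy_action(H, (0, 0))

# Evaluative feedback can still override the seeded value.
tamer_update(H, (0, 2), "down", feedback=-1.0)
```

In this sketch the seeded values bias the agent's first choices toward the demonstrated path, while later evaluative feedback remains free to correct them.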

Subjects:	Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Cite as:	arXiv:1904.08621 [cs.AI]
	(or arXiv:1904.08621v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.1904.08621

Submission history

From: Guangliang Li
[v1] Thu, 18 Apr 2019 07:45:36 UTC (1,340 KB)
