Scaling up budgeted reinforcement learning

Carrara, Nicolas; Leurent, Edouard; Laroche, Romain; Urvoy, Tanguy; Maillard, Odalric; Pietquin, Olivier

Computer Science > Machine Learning

arXiv:1903.01004v1 (cs)

[Submitted on 3 Mar 2019 (this version), latest version 27 May 2019 (v3)]

Title: Scaling up budgeted reinforcement learning

Authors: Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric Maillard, Olivier Pietquin

Abstract: Can we learn a control policy able to adapt its behaviour in real time so as to take any desired amount of risk? The general Reinforcement Learning framework solely aims at optimising a total reward in expectation, which may not be desirable in critical applications. In stark contrast, the Budgeted Markov Decision Process (BMDP) framework is a formalism in which the notion of risk is implemented as a hard constraint on a failure signal. Existing algorithms solving BMDPs rely on strong assumptions and have so far only been applied to toy examples. In this work, we relax some of these assumptions and demonstrate the scalability of our approach on two practical problems: a spoken dialogue system and an autonomous driving task. On both examples, we reach performance similar to that of Lagrangian Relaxation methods, with a significant improvement in sample and memory efficiency.
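
For orientation, the constrained objective that the abstract describes can be sketched as below. This is not quoted from the paper. It is a common way to write a budgeted problem, and the notation is assumed for illustration: policy \pi, reward signal R_r, failure (cost) signal R_c, discount factor \gamma, and budget \beta. The sketch also assumes that the constraint bounds the expected discounted cost.

```latex
% Illustrative sketch only: the symbols R_r, R_c, \gamma and \beta are assumed notation.
\max_{\pi} \; \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} R_r(s_t, a_t)\Big]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t} R_c(s_t, a_t)\Big] \le \beta
```

Under this reading, varying \beta at run time corresponds to the "desired amount of risk" mentioned in the abstract. A Lagrangian Relaxation baseline would instead optimise the reward minus a fixed multiple of the cost, with a separate multiplier for each risk level.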

Comments:	this http URL and this http URL have equally contributed. The source code, videos and additional details for all experiments are available at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1903.01004 [cs.LG]
	(or arXiv:1903.01004v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1903.01004

Submission history

From: Nicolas Carrara
[v1] Sun, 3 Mar 2019 22:24:01 UTC (1,012 KB)
[v2] Wed, 6 Mar 2019 17:37:51 UTC (1,012 KB)
[v3] Mon, 27 May 2019 21:50:33 UTC (1,297 KB)
