Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

Li, Tianyu; Mazoure, Bogdan; Precup, Doina; Rabusseau, Guillaume

arXiv:1911.05010 (cs)

[Submitted on 12 Nov 2019 (v1), last revised 22 Nov 2019 (this version, v2)]

Abstract: Learning and planning in partially observable domains are among the most difficult problems in reinforcement learning. Traditional methods treat these two problems as independent, resulting in a classical two-stage paradigm: first learn the environment dynamics, then plan accordingly. This approach, however, disconnects the two problems and can consequently lead to algorithms that are sample inefficient and time consuming. In this paper, we propose a novel algorithm that combines learning and planning. Our algorithm is closely related to the spectral learning algorithm for predictive state representations and offers appealing theoretical guarantees and time complexity. We show empirically on two domains that our approach is more sample- and time-efficient than classical methods.

Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:1911.05010 [cs.AI]
	(or arXiv:1911.05010v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.1911.05010

Submission history

From: Tianyu Li
[v1] Tue, 12 Nov 2019 16:56:37 UTC (242 KB)
[v2] Fri, 22 Nov 2019 02:37:51 UTC (247 KB)
