A Framework for Transforming Specifications in Reinforcement Learning

Alur, Rajeev; Bansal, Suguman; Bastani, Osbert; Jothimurugan, Kishor

arXiv:2111.00272 (cs)

[Submitted on 30 Oct 2021 (v1), last revised 30 May 2022 (this version, v3)]

Abstract: Reactive synthesis algorithms allow automatic construction of policies to control an environment modeled as a Markov Decision Process (MDP) that are optimal with respect to high-level temporal logic specifications. However, they assume that the MDP model is known a priori. Reinforcement Learning (RL) algorithms, in contrast, are designed to learn an optimal policy when the transition probabilities of the MDP are unknown, but require the user to associate local rewards with transitions. The appeal of high-level temporal logic specifications has motivated research to develop RL algorithms for synthesis of policies from specifications. To understand the techniques, and nuanced variations in their theoretical guarantees, in the growing body of resulting literature, we develop a formal framework for defining transformations among RL tasks with different forms of objectives. We define the notion of a sampling-based reduction to transform a given MDP into another one which can be simulated even when the transition probabilities of the original MDP are unknown. We formalize the notions of preservation of optimal policies, convergence, and robustness of such reductions. We then use our framework to restate known results, establish new results to fill in some gaps, and identify open problems. In particular, we show that certain kinds of reductions from LTL specifications to reward-based ones do not exist, and prove the non-existence of RL algorithms with PAC-MDP guarantees for safety specifications.
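The idea of simulating a transformed MDP without knowing the original transition probabilities can be illustrated with a small sketch. The code below is not taken from the paper. It assumes a hypothetical black-box sampler `step(state, action)` for the original MDP, a hypothetical labeling function `label`, and a hand-written finite automaton that monitors a simple "eventually reach the goal" specification. The wrapper only ever calls the sampler, so it can produce transitions and rewards of the product MDP without reading any probabilities. It is meant only to convey the flavor of a sampling-based construction; the abstract itself notes that certain reductions from LTL specifications to rewards do not exist, so this pattern should not be read as a general recipe.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

State = Hashable
Action = Hashable
Prop = str          # atomic proposition observed in a state (assumed encoding)
AutState = str      # automaton state (assumed encoding)


class ProductSimulator:
    """Simulates a product MDP using only sample access to the original MDP."""

    def __init__(
        self,
        step: Callable[[State, Action], State],       # hypothetical black-box sampler
        label: Callable[[State], Prop],               # hypothetical labeling function
        delta: Dict[Tuple[AutState, Prop], AutState],  # automaton transition table
        init: AutState,
        accepting: Set[AutState],
    ) -> None:
        self.step = step
        self.label = label
        self.delta = delta
        self.init = init
        self.accepting = accepting

    def reset(self, s0: State) -> Tuple[State, AutState]:
        # The automaton reads the label of the initial state first.
        return (s0, self.delta[(self.init, self.label(s0))])

    def transition(
        self, prod_state: Tuple[State, AutState], action: Action
    ) -> Tuple[Tuple[State, AutState], float]:
        s, q = prod_state
        s_next = self.step(s, action)                 # one sample from the unknown MDP
        q_next = self.delta[(q, self.label(s_next))]  # deterministic automaton update
        # Reward 1 exactly once, when the automaton first enters an accepting state.
        entered = q_next in self.accepting and q not in self.accepting
        return (s_next, q_next), (1.0 if entered else 0.0)


# Illustrative monitor for "eventually goal" on a 4-state chain (states 0..3).
def label(s: State) -> Prop:
    return "goal" if s == 3 else "none"


DFA_DELTA: Dict[Tuple[AutState, Prop], AutState] = {
    ("wait", "none"): "wait",
    ("wait", "goal"): "done",
    ("done", "none"): "done",
    ("done", "goal"): "done",
}
```

A learner would interact with `ProductSimulator` exactly as it would with any reward-based environment, while the underlying sampler stays opaque. Whether optimal policies are preserved by such a construction depends on the class of specification and the objective, which is the kind of question the framework described in the abstract is intended to make precise.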

Subjects:	Formal Languages and Automata Theory (cs.FL)
Cite as:	arXiv:2111.00272 [cs.FL]
	(or arXiv:2111.00272v3 [cs.FL] for this version)
	https://doi.org/10.48550/arXiv.2111.00272

Submission history

From: Kishor Jothimurugan
[v1] Sat, 30 Oct 2021 15:28:43 UTC (52 KB)
[v2] Sat, 12 Mar 2022 20:27:07 UTC (55 KB)
[v3] Mon, 30 May 2022 03:01:15 UTC (55 KB)
