Improving Intrinsic Exploration by Creating Stationary Objectives

Castanyer, Roger Creus; Romoff, Joshua; Berseth, Glen

Computer Science > Machine Learning

arXiv:2310.18144 (cs)

[Submitted on 27 Oct 2023 (v1), last revised 23 Apr 2024 (this version, v4)]

Abstract: Exploration bonuses in reinforcement learning guide long-horizon exploration by defining custom intrinsic objectives. Several exploration objectives, such as count-based bonuses, pseudo-counts, and state-entropy maximization, are non-stationary and hence difficult for the agent to optimize. While this issue is generally known, it is usually overlooked and solutions remain under-explored. The key contribution of our work lies in transforming the original non-stationary rewards into stationary rewards through an augmented state representation. For this purpose, we introduce the Stationary Objectives For Exploration (SOFE) framework. SOFE requires identifying sufficient statistics for different exploration bonuses and finding an efficient encoding of these statistics to use as input to a deep network. SOFE proposes state augmentations that expand the state space but promise to simplify the optimization of the agent's objective. We show that SOFE improves the performance of several exploration objectives, including count-based bonuses, pseudo-counts, and state-entropy maximization. Moreover, SOFE outperforms prior methods that attempt to stabilize the optimization of intrinsic objectives. We demonstrate the efficacy of SOFE in hard-exploration problems, including sparse-reward tasks, pixel-based observations, 3D navigation, and procedurally generated environments.
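The idea of making a count-based bonus stationary through state augmentation can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `CountAugmentedChain`, the function `count_bonus`, the 1-D chain environment, and the bonus form 1/sqrt(N + 1) are all assumptions chosen for illustration. The sketch shows that once the visitation counts (the sufficient statistics of the bonus) are appended to the observation, the intrinsic reward becomes a fixed function of the augmented state and the next position, rather than a function that drifts as training proceeds.

```python
import numpy as np


def count_bonus(counts, next_pos):
    """Assumed count-based bonus: 1 / sqrt(N(s') + 1).

    It depends only on the counts and the next position, so it is a
    stationary function once the counts are part of the state.
    """
    return 1.0 / np.sqrt(counts[next_pos] + 1.0)


class CountAugmentedChain:
    """Hypothetical 1-D chain of cells whose observation includes visit counts."""

    def __init__(self, n_cells=5):
        self.n_cells = n_cells
        self.reset()

    def reset(self):
        self.pos = 0
        self.counts = np.zeros(self.n_cells, dtype=np.float64)
        self.counts[self.pos] += 1.0  # the start cell counts as visited
        return self.augmented_state()

    def augmented_state(self):
        # Original observation: one-hot encoding of the agent's position.
        one_hot = np.zeros(self.n_cells)
        one_hot[self.pos] = 1.0
        # Augmentation: the raw visitation counts of every cell.
        return np.concatenate([one_hot, self.counts])

    def step(self, action):
        # action is -1 (left) or +1 (right); moves are clipped at the ends.
        self.pos = int(np.clip(self.pos + action, 0, self.n_cells - 1))
        # Reward is computed from the counts *before* this visit is recorded.
        reward = count_bonus(self.counts, self.pos)
        self.counts[self.pos] += 1.0
        return self.augmented_state(), reward
```

In this toy setting the raw count vector is small enough to append directly. For pixel-based or high-dimensional environments the abstract notes that an efficient encoding of the statistics is needed, which this sketch does not attempt.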

Comments:	Accepted at ICLR 2024
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.18144 [cs.LG]
	(or arXiv:2310.18144v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.18144

Submission history

From: Roger Creus Castanyer
[v1] Fri, 27 Oct 2023 13:51:18 UTC (16,117 KB)
[v2] Fri, 3 Nov 2023 00:02:27 UTC (16,117 KB)
[v3] Mon, 4 Dec 2023 17:32:31 UTC (17,470 KB)
[v4] Tue, 23 Apr 2024 00:03:32 UTC (17,466 KB)
