Efficient Exploration by Novelty-Pursuit
Distributed Artificial Intelligence: Second International Conference, DAI 2020 …, 2020, Springer
Abstract
Efficient exploration is essential to reinforcement learning in tasks with a huge state space and a long planning horizon. Recent approaches to this issue include intrinsically motivated goal exploration processes (IMGEP) and maximum state entropy exploration (MSEE). In this paper, we propose a goal-selection criterion for IMGEP based on the principle of MSEE, which results in a new exploration method, novelty-pursuit. Novelty-pursuit performs exploration in two stages: first, it selects a seldom-visited state as the target for the goal-conditioned exploration policy to reach the boundary of the explored region; then, it takes random actions to explore the unexplored region. We demonstrate the effectiveness of the proposed method in environments ranging from simple mazes and MuJoCo tasks to the long-horizon video game SuperMarioBros. Experimental results show that the proposed method outperforms state-of-the-art approaches that use curiosity-driven exploration.
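To make the two-stage procedure concrete, the following is a minimal sketch of one novelty-pursuit exploration episode. The toy ChainEnv environment, the greedy_goal_policy stand-in for a learned goal-conditioned policy, and the visit-count-based goal selection are illustrative assumptions for this sketch, not the paper's implementation.

```python
import random
from collections import defaultdict

# Minimal sketch of the two-stage novelty-pursuit loop on a toy chain
# environment. The environment, goal policy, and visit-count goal
# selection below are illustrative assumptions, not the paper's code.

class ChainEnv:
    """Toy 1-D chain: states 0..n-1, actions move left (-1) or right (+1)."""
    actions = (-1, +1)

    def __init__(self, n=20):
        self.n = n
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Clamp movement to the chain's endpoints.
        self.state = min(max(self.state + action, 0), self.n - 1)
        return self.state

def greedy_goal_policy(state, goal):
    """Stand-in for a learned goal-conditioned policy: step toward the goal."""
    return +1 if goal > state else -1

def novelty_pursuit_episode(env, visit_counts, goal_steps=30, random_steps=30):
    state = env.reset()
    visit_counts[state] += 1

    # Goal selection: the least-visited known state, using visitation
    # counts as a proxy for the maximum-state-entropy criterion.
    goal = min(visit_counts, key=visit_counts.get)

    # Stage 1: drive the agent to the boundary of the explored region.
    for _ in range(goal_steps):
        if state == goal:
            break
        state = env.step(greedy_goal_policy(state, goal))
        visit_counts[state] += 1

    # Stage 2: take random actions to push into unexplored states.
    for _ in range(random_steps):
        state = env.step(random.choice(env.actions))
        visit_counts[state] += 1

env = ChainEnv()
counts = defaultdict(int)
counts[0] = 1  # seed with the start state
for _ in range(10):
    novelty_pursuit_episode(env, counts)
print(sorted(counts))  # states discovered after ten episodes
```

The split mirrors the abstract's two stages: the goal-conditioned policy reuses what the agent already knows to reach a rarely visited frontier state cheaply, and random actions from that frontier are then far more likely to discover genuinely new states than random actions from the start state would be.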