Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

Shi, Wenjie; Song, Shiji; Wu, Hui; Hsu, Ya-Chu; Wu, Cheng; Huang, Gao

arXiv:1909.03245 (cs)

[Submitted on 7 Sep 2019 (v1), last revised 6 Dec 2021 (this version, v3)]

Abstract: Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous and high-dimensional state spaces. To tackle these problems, we propose a general acceleration method for model-free, off-policy deep RL algorithms that draws on the idea underlying regularized Anderson acceleration (RAA), an effective approach to accelerating the solution of fixed-point problems with perturbations. Specifically, we first explain how Anderson acceleration can be applied directly to policy iteration. We then extend RAA to deep RL by introducing a regularization term that controls the impact of the perturbation induced by function approximation errors. We further propose two strategies, progressive update and adaptive restart, to enhance performance. The effectiveness of our method is evaluated on a variety of benchmark tasks, including Atari 2600 and MuJoCo. Experimental results show that our approach substantially improves both the learning speed and the final performance of state-of-the-art deep RL algorithms.
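
To make the fixed-point idea in the abstract concrete, the following is a minimal sketch of Anderson acceleration with a Tikhonov regularization term, applied to tabular policy evaluation (the Bellman operator v -> r + gamma * P v). It is an illustration under stated assumptions, not the authors' deep RL implementation: the function names, the ring-shaped Markov chain, the window size m, and the choice to scale the regularization weight by the mean squared residual norm are all hypothetical choices made for this example. Progressive update and adaptive restart are not shown.

```python
import numpy as np


def raa_weights(residuals, lam):
    """Mixing weights for regularized Anderson acceleration.

    residuals: (d, m) matrix whose columns are g(x_i) - x_i.
    Minimizes ||R a||^2 + lam_eff * ||a||^2 subject to sum(a) = 1.
    Assumption of this sketch: lam_eff = lam * mean squared residual norm,
    which keeps the regularization scale-invariant.
    """
    m = residuals.shape[1]
    gram = residuals.T @ residuals
    scale = np.trace(gram) / m
    z = np.linalg.solve(gram + lam * scale * np.eye(m), np.ones(m))
    return z / z.sum()


def anderson_fixed_point(g, x0, m=5, lam=1e-8, tol=1e-10, max_iter=500):
    """Iterate x <- sum_i a_i g(x_i) over a sliding window of m iterates."""
    xs, gs = [], []
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        gx = g(x)
        if np.linalg.norm(gx - x) < tol:
            return x, k  # converged after k accelerated updates
        xs, gs = (xs + [x])[-m:], (gs + [gx])[-m:]
        G = np.stack(gs, axis=1)            # columns g(x_i)
        R = G - np.stack(xs, axis=1)        # columns g(x_i) - x_i
        x = G @ raa_weights(R, lam)
    return x, max_iter


def demo(n=8, gamma=0.9):
    """Policy evaluation on a hypothetical n-state ring random walk."""
    P = np.zeros((n, n))
    for s in range(n):
        P[s, (s - 1) % n] = 0.5
        P[s, (s + 1) % n] = 0.5
    r = np.arange(n, dtype=float)           # made-up per-state rewards

    def bellman(v):
        return r + gamma * P @ v

    v, iters = anderson_fixed_point(bellman, np.zeros(n))
    v_exact = np.linalg.solve(np.eye(n) - gamma * P, r)
    return v, v_exact, iters


if __name__ == "__main__":
    v, v_exact, iters = demo()
    print("iterations:", iters)
    print("max abs error:", np.max(np.abs(v - v_exact)))
```

Setting lam to zero recovers plain Anderson mixing over the window, while a larger lam pulls the weights toward a uniform average of the stored iterates. This is one way to picture how a regularization term can damp the effect of noisy residuals.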

Comments:	33rd Conference on Neural Information Processing Systems (NeurIPS 2019)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1909.03245 [cs.LG]
	(or arXiv:1909.03245v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1909.03245

Submission history

From: Wenjie Shi
[v1] Sat, 7 Sep 2019 11:18:32 UTC (791 KB)
[v2] Tue, 17 Dec 2019 01:11:40 UTC (1,590 KB)
[v3] Mon, 6 Dec 2021 10:52:14 UTC (623 KB)
