A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

Xu, Pan; Gu, Quanquan

arXiv:1912.04511 (cs)

[Submitted on 10 Dec 2019 (v1), last revised 3 Mar 2020 (this version, v2)]

Abstract: Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning remains virtually unknown. In this paper, we present a finite-time analysis of a neural Q-learning algorithm, where the data are generated from a Markov decision process and the action-value function is approximated by a deep ReLU neural network. We prove that neural Q-learning finds the optimal policy with an $O(1/\sqrt{T})$ convergence rate if the neural function approximator is sufficiently overparameterized, where $T$ is the number of iterations. To the best of our knowledge, our result is the first finite-time analysis of neural Q-learning under a non-i.i.d. data assumption.
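
The abstract does not spell out the update rule, so the following is only a minimal sketch of generic semi-gradient Q-learning with a two-layer ReLU network in NumPy, not the authors' algorithm. The network depth and width, the initialization scale, the step size, and the function names (`init_params`, `q_values`, `q_learning_step`) are all illustrative assumptions; the paper itself analyzes a deep ReLU network trained on data from a Markov decision process.

```python
import numpy as np

def init_params(state_dim, num_actions, width, rng):
    """Gaussian-initialized two-layer ReLU network (assumed architecture)."""
    W1 = rng.normal(0.0, np.sqrt(2.0 / state_dim), size=(width, state_dim))
    W2 = rng.normal(0.0, np.sqrt(1.0 / width), size=(num_actions, width))
    return W1, W2

def q_values(params, state):
    """Return one estimated action value per action for a single state."""
    W1, W2 = params
    hidden = np.maximum(W1 @ state, 0.0)  # ReLU activation
    return W2 @ hidden

def q_learning_step(params, state, action, reward, next_state, gamma, lr):
    """One semi-gradient Q-learning update on a single transition.

    The bootstrapped target is treated as a constant, so only
    Q(state, action) is differentiated.
    """
    W1, W2 = params
    pre = W1 @ state
    hidden = np.maximum(pre, 0.0)
    q_sa = W2[action] @ hidden

    # Bootstrapped target: r + gamma * max_a' Q(s', a')
    target = reward + gamma * np.max(q_values(params, next_state))
    td_error = q_sa - target

    # Gradient of Q(state, action) with respect to each weight matrix.
    grad_W2 = np.zeros_like(W2)
    grad_W2[action] = hidden
    grad_W1 = np.outer(W2[action] * (pre > 0), state)

    new_params = (W1 - lr * td_error * grad_W1,
                  W2 - lr * td_error * grad_W2)
    # td_error is the error measured before this update was applied.
    return new_params, td_error
```

In a full agent, `q_learning_step` would be called on consecutive transitions from a single trajectory, which is what makes the data non-i.i.d.; wider networks (larger `width`) correspond to the overparameterized regime the abstract refers to.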

Comments: 22 pages, 1 table. This version simplifies the proof and improves the presentation
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as: arXiv:1912.04511 [cs.LG]
(or arXiv:1912.04511v2 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.1912.04511

Submission history

From: Quanquan Gu
[v1] Tue, 10 Dec 2019 05:52:32 UTC (26 KB)
[v2] Tue, 3 Mar 2020 21:31:07 UTC (26 KB)
