Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation

Tang, Ziyang; Feng, Yihao; Li, Lihong; Zhou, Dengyong; Liu, Qiang

arXiv:1910.07186 (cs)

[Submitted on 16 Oct 2019]

Abstract: Infinite-horizon off-policy policy evaluation is a highly challenging task due to the excessively large variance of typical importance sampling (IS) estimators. Recently, Liu et al. (2018a) proposed an approach that significantly reduces the variance of infinite-horizon off-policy evaluation by estimating the stationary density ratio, but at the cost of introducing potentially high bias due to the error in density ratio estimation. In this paper, we develop a bias-reduced augmentation of their method, which can take advantage of a learned value function to obtain higher accuracy. Our method is doubly robust in that the bias vanishes when either the density ratio or the value function estimate is perfect. More generally, the bias is reduced when either of the two is accurate, even if neither is exact. Both theoretical and empirical results show that our method yields significant advantages over previous methods.
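The double robustness claim can be checked numerically on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a tabular MDP with known dynamics, works at the population level (exact expectations, no sampling), and assumes one common doubly robust form, namely a value-function term plus a density-ratio-weighted Bellman residual. The MDP numbers, the policies, and all function names are hypothetical.

```python
# Population-level sketch on a tiny tabular MDP (all numbers are made up).
GAMMA = 0.9
N_S, N_A = 3, 2

# P[s][a][s2]: transition probabilities; R[s][a]: rewards.
P = [
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]],
    [[0.3, 0.3, 0.4], [0.5, 0.1, 0.4]],
    [[0.2, 0.5, 0.3], [0.6, 0.2, 0.2]],
]
R = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.5]]
MU0 = [0.5, 0.3, 0.2]                       # initial state distribution
PI = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]   # target policy pi(a|s)
PI0 = [[0.5, 0.5], [0.6, 0.4], [0.3, 0.7]]  # behavior policy pi0(a|s)


def policy_model(policy):
    """Return the state-to-state matrix P_pi and expected reward r_pi."""
    p_pi = [[sum(policy[s][a] * P[s][a][s2] for a in range(N_A))
             for s2 in range(N_S)] for s in range(N_S)]
    r_pi = [sum(policy[s][a] * R[s][a] for a in range(N_A))
            for s in range(N_S)]
    return p_pi, r_pi


def value_function(policy, iters=2000):
    """Solve V = r_pi + gamma * P_pi V by fixed-point iteration."""
    p_pi, r_pi = policy_model(policy)
    v = [0.0] * N_S
    for _ in range(iters):
        v = [r_pi[s] + GAMMA * sum(p_pi[s][s2] * v[s2] for s2 in range(N_S))
             for s in range(N_S)]
    return v


def occupancy(policy, iters=2000):
    """Solve d = (1 - gamma) * mu0 + gamma * P_pi^T d by iteration."""
    p_pi, _ = policy_model(policy)
    d = list(MU0)
    for _ in range(iters):
        d = [(1 - GAMMA) * MU0[s2]
             + GAMMA * sum(d[s] * p_pi[s][s2] for s in range(N_S))
             for s2 in range(N_S)]
    return d


def dr_estimate(w, v):
    """Assumed DR form: value term + ratio-weighted Bellman residual.

    States are weighted by the behavior occupancy d0. The action-level
    policy ratio pi/pi0 is integrated out exactly, which is why the
    residual is written directly under the target policy.
    """
    d0 = occupancy(PI0)
    p_pi, r_pi = policy_model(PI)
    value_term = (1 - GAMMA) * sum(MU0[s] * v[s] for s in range(N_S))
    residual = [r_pi[s]
                + GAMMA * sum(p_pi[s][s2] * v[s2] for s2 in range(N_S))
                - v[s] for s in range(N_S)]
    correction = sum(d0[s] * w[s] * residual[s] for s in range(N_S))
    return value_term + correction


V_TRUE = value_function(PI)
D_PI, D_PI0 = occupancy(PI), occupancy(PI0)
W_TRUE = [D_PI[s] / D_PI0[s] for s in range(N_S)]   # exact density ratio
TRUE_VALUE = (1 - GAMMA) * sum(MU0[s] * V_TRUE[s] for s in range(N_S))

W_BAD = [1.0, 1.0, 1.0]   # deliberately wrong density ratio
V_BAD = [5.0, 0.0, 0.0]   # deliberately wrong value function

print("true value         :", TRUE_VALUE)
print("exact w, wrong V   :", dr_estimate(W_TRUE, V_BAD))
print("wrong w, exact V   :", dr_estimate(W_BAD, V_TRUE))
print("wrong w and wrong V:", dr_estimate(W_BAD, V_BAD))
```

In this sketch the first two estimates agree with the true normalized value up to floating-point error, while the last one, where both inputs are wrong, is biased. With sampled data the expectations would be replaced by empirical averages, so the estimates would carry sampling noise in addition to any bias.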

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as: arXiv:1910.07186 [cs.LG]
(or arXiv:1910.07186v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.1910.07186

Submission history

From: Yihao Feng
[v1] Wed, 16 Oct 2019 06:33:17 UTC (164 KB)
