Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

Voloshin, Cameron; Le, Hoang M.; Jiang, Nan; Yue, Yisong

arXiv:1911.06854 (cs)

[Submitted on 15 Nov 2019 (v1), last revised 27 Nov 2021 (this version, v3)]

Abstract: We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety-critical applications. Given the increasing interest in deploying learning-based methods, there has been a flurry of recent proposals for OPE methods, leading to a need for standardized empirical analyses. Our work takes a strong focus on diversity of experimental design to enable stress testing of OPE methods. We provide a comprehensive benchmarking suite to study the interplay of different attributes on method performance. We distill the results into a summarized set of guidelines for OPE in practice. Our software package, the Caltech OPE Benchmarking Suite (COBS), is open-sourced and we invite interested researchers to further contribute to the benchmark.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:1911.06854 [cs.LG]
	(or arXiv:1911.06854v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1911.06854

Submission history

From: Cameron Voloshin
[v1] Fri, 15 Nov 2019 19:58:42 UTC (2,939 KB)
[v2] Tue, 25 Feb 2020 02:24:05 UTC (3,140 KB)
[v3] Sat, 27 Nov 2021 23:54:27 UTC (4,090 KB)
