Reward Potentials for Planning with Learned Neural Network Transition Models

Say, Buser; Sanner, Scott; Thiébaux, Sylvie

Computer Science > Artificial Intelligence

arXiv:1904.09366 (cs)

[Submitted on 19 Apr 2019 (v1), last revised 26 Jul 2019 (this version, v4)]

Title: Reward Potentials for Planning with Learned Neural Network Transition Models

Authors: Buser Say, Scott Sanner, Sylvie Thiébaux

Abstract: Optimal planning with respect to learned neural network (NN) models in continuous action and state spaces using mixed-integer linear programming (MILP) is a challenging task for branch-and-bound solvers due to the poor linear relaxation of the underlying MILP model. For a given set of features, potential heuristics provide an efficient framework for computing bounds on cost (reward) functions. In this paper, we model the problem of finding optimal potential bounds for learned NN models as a bilevel program, and solve it using a novel finite-time constraint generation algorithm. We then strengthen the linear relaxation of the underlying MILP model by introducing constraints to bound the reward function based on the precomputed reward potentials. Experimentally, we show that our algorithm efficiently computes reward potentials for learned NN models, and that the overhead of computing reward potentials is justified by the overall strengthening of the underlying MILP model for the task of planning over long horizons.
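The "poor linear relaxation" mentioned in the abstract can be illustrated on a single ReLU unit. The sketch below is not taken from the paper: it assumes a standard big-M encoding of y = max(0, x) with a binary indicator z and known pre-activation bounds lower < 0 < upper, and the function names are hypothetical. Relaxing z to the interval [0, 1] lets y rise above the true ReLU value, and the slack grows as the bounds widen, which is one reason additional bounding constraints can help a branch-and-bound solver.

```python
def relu(x: float) -> float:
    """Exact ReLU activation."""
    return max(0.0, x)


def relaxed_relu_upper(x: float, lower: float, upper: float) -> float:
    """Largest y the LP relaxation allows at pre-activation x.

    Assumed big-M encoding (illustrative, not the paper's stated model):
        y >= 0,  y >= x,
        y <= x - lower * (1 - z),
        y <= upper * z,
        z in {0, 1}, relaxed here to 0 <= z <= 1.
    Eliminating z from the two upper bounds gives
        y <= upper * (x - lower) / (upper - lower).
    """
    if not lower < 0.0 < upper:
        raise ValueError("expected lower < 0 < upper")
    if not lower <= x <= upper:
        raise ValueError("x must lie within [lower, upper]")
    return upper * (x - lower) / (upper - lower)


def relaxation_gap(x: float, lower: float, upper: float) -> float:
    """Slack between the relaxed upper bound and the exact ReLU value."""
    return relaxed_relu_upper(x, lower, upper) - relu(x)


if __name__ == "__main__":
    # The gap at x = 0 grows with the width of the assumed bounds.
    for bound in (1.0, 10.0, 100.0):
        print(bound, relaxation_gap(0.0, -bound, bound))
```

Under these assumptions the gap vanishes at the endpoints x = lower and x = upper and is largest at x = 0, so tighter bounds on the quantities feeding the model directly tighten the relaxation.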

Comments:	To appear in the proceedings of the 25th International Conference on Principles and Practice of Constraint Programming
Subjects:	Artificial Intelligence (cs.AI)
Cite as:	arXiv:1904.09366 [cs.AI]
	(or arXiv:1904.09366v4 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.1904.09366

Submission history

From: Buser Say
[v1] Fri, 19 Apr 2019 23:15:59 UTC (378 KB)
[v2] Tue, 7 May 2019 07:03:01 UTC (389 KB)
[v3] Sun, 19 May 2019 11:01:30 UTC (390 KB)
[v4] Fri, 26 Jul 2019 14:54:45 UTC (390 KB)
