Stochastic convex optimization with bandit feedback

Agarwal, Alekh; Foster, Dean P.; Hsu, Daniel; Kakade, Sham M.; Rakhlin, Alexander

arXiv:1107.1744 (math)

[Submitted on 8 Jul 2011 (v1), last revised 8 Oct 2011 (this version, v2)]

Abstract: This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $\mathcal{X}$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in \mathcal{X}$. The quantity of interest is the regret of the algorithm, which is the sum, over the algorithm's query points, of the gap between the function value at the query point and the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs $\tilde{O}(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.
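
To make the feedback model and the regret quantity concrete, here is a minimal Python sketch. It is not the paper's algorithm: the 1-D objective, the Gaussian noise, the uniform query rule, and the names `noisy_oracle` and `regret` are all illustrative assumptions. It only shows that the learner sees noisy values while regret is measured on the true function.

```python
import random

def noisy_oracle(f, x, sigma, rng):
    """Bandit feedback (assumed Gaussian noise): return f(x) plus zero-mean noise."""
    return f(x) + rng.gauss(0.0, sigma)

def regret(f, queries, f_opt):
    """Sum over rounds of f(x_t) - f(x*), computed on the true, noiseless values."""
    return sum(f(x) - f_opt for x in queries)

# Hypothetical instance: f(x) = |x - 0.3| on X = [0, 1], minimized at x* = 0.3
# with optimal value 0. This f is convex and 1-Lipschitz.
f = lambda x: abs(x - 0.3)
rng = random.Random(0)

# A naive learner that ignores feedback and queries uniformly at random.
queries = [rng.uniform(0.0, 1.0) for _ in range(1000)]
observations = [noisy_oracle(f, x, 0.1, rng) for x in queries]  # all the learner sees

print(regret(f, queries, 0.0))
```

A learner that ignores its observations, as above, pays regret that grows linearly in the number of rounds $T$; the abstract's claim is that a suitable algorithm can bring this down to order $\sqrt{T}$ up to polynomial factors in the dimension $d$ and logarithmic terms.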

Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Systems and Control (eess.SY)
Cite as: arXiv:1107.1744 [math.OC] (or arXiv:1107.1744v2 [math.OC] for this version)
DOI: https://doi.org/10.48550/arXiv.1107.1744

Submission history

From: Alekh Agarwal
[v1] Fri, 8 Jul 2011 22:18:05 UTC (525 KB)
[v2] Sat, 8 Oct 2011 06:06:43 UTC (526 KB)
