The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation

Feng, Zhe; Parkes, David C.; Xu, Haifeng

arXiv:1906.01528 (cs)

[Submitted on 4 Jun 2019 (v1), last revised 12 Nov 2020 (this version, v2)]

Abstract: Motivated by economic applications such as recommender systems, we study the behavior of stochastic bandit algorithms under \emph{strategic behavior} conducted by rational actors, i.e., the arms. Each arm is a \emph{self-interested} strategic player who can modify its own reward whenever pulled, subject to a cross-period budget constraint, in order to maximize the expected number of times it is pulled. We analyze the robustness of three popular bandit algorithms: UCB, $\varepsilon$-Greedy, and Thompson Sampling. We prove that all three algorithms achieve a regret upper bound of $\mathcal{O}(\max \{ B, K\ln T\})$, where $B$ is the total budget across arms, $K$ is the total number of arms, and $T$ is the length of the time horizon. This regret guarantee holds under \emph{arbitrary adaptive} manipulation strategies of the arms. Our second set of main results shows that this regret bound is \emph{tight}; in fact, for UCB it is tight even when we restrict the arms' manipulation strategies to form a \emph{Nash equilibrium}. The lower bound makes use of a simple manipulation strategy, the same for all three algorithms, yielding a bound of $\Omega(\max \{ B, K\ln T\})$. Our results illustrate the robustness of classic bandit algorithms against strategic manipulations as long as $B=o(T)$.
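The setting in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it runs the standard UCB1 index rule on Bernoulli arms, and each arm adds a bonus to its realized reward from a fixed budget. The manipulation rule (spend the whole remaining budget on the first pull), the function names, and all numeric values are assumptions chosen for illustration; the abstract does not specify the strategy used in the lower bound.

```python
import math
import random


def run_ucb_with_manipulation(means, budgets, horizon, seed=0):
    """Simulate UCB1 when arms may inflate the rewards the learner observes.

    means   -- true Bernoulli mean of each arm (hypothetical values)
    budgets -- total manipulation budget of each arm across all rounds
    horizon -- number of rounds T (assumed to be at least the number of arms)
    Returns (pull counts per arm, budget left per arm).
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    observed_sums = [0.0] * k  # sums of manipulated rewards, as seen by UCB1
    remaining = list(budgets)

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize the indices
        else:
            # UCB1 index: empirical mean plus confidence radius.
            arm = max(
                range(k),
                key=lambda i: observed_sums[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        true_reward = 1.0 if rng.random() < means[arm] else 0.0
        # Assumed strategy: the pulled arm spends all of its remaining budget.
        boost = remaining[arm]
        remaining[arm] = 0.0
        pulls[arm] += 1
        observed_sums[arm] += true_reward + boost
    return pulls, remaining


def pseudo_regret(means, pulls):
    """Regret measured against the true means, ignoring the manipulation."""
    best = max(means)
    return sum((best - mu) * n for mu, n in zip(means, pulls))


if __name__ == "__main__":
    means = [0.9, 0.5, 0.4]  # hypothetical arm means
    horizon = 5000
    for budgets in ([0.0, 0.0, 0.0], [0.0, 50.0, 50.0]):
        pulls, _ = run_ucb_with_manipulation(means, budgets, horizon)
        print(budgets, pulls, round(pseudo_regret(means, pulls), 1))
```

Running the script prints the pull counts and pseudo-regret with and without budgets, which gives a rough feel for how the budget $B$ enters the regret alongside the $K\ln T$ term. A single seeded run is only an illustration and says nothing about the tightness of the bounds.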

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Machine Learning (stat.ML)
Cite as:	arXiv:1906.01528 [cs.LG]
	(or arXiv:1906.01528v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.01528

Submission history

From: Zhe Feng
[v1] Tue, 4 Jun 2019 15:40:49 UTC (333 KB)
[v2] Thu, 12 Nov 2020 21:56:33 UTC (1,267 KB)
