MABWiser: Parallelizable Contextual Multi-Armed Bandits
E Strong, B Kleynhans, S Kadıoğlu
International Journal on Artificial Intelligence Tools, 2021 • World Scientific
Contextual multi-armed bandit algorithms are an effective approach for online sequential decision-making problems. However, there are limited tools available to support their adoption in the community. To fill this gap, we present an open-source Python library with context-free, parametric and non-parametric contextual multi-armed bandit algorithms. The MABWiser library is designed to be user-friendly and supports custom bandit algorithms for specific applications. Our design provides built-in parallelization to speed up training and testing for scalability with special attention given to ensuring the reproducibility of results. The API makes hybrid strategies possible that combine non-parametric policies with parametric ones, an area that is not explored in the literature. As a practical application, we demonstrate using the library in both batch and online simulations for context-free, parametric and non-parametric contextual policies with the well-known MovieLens data set. Finally, we quantify the performance benefits of built-in parallelization.
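
To make the idea of a context-free bandit policy with reproducible results concrete, here is a minimal sketch of an epsilon-greedy policy in plain Python. It is not MABWiser's implementation or API: the class name `EpsilonGreedyBandit`, its `fit` and `predict` methods, and the toy decision and reward data are all hypothetical, chosen only for illustration. The one design point it borrows from the abstract is a fixed random seed, so that repeated runs make identical choices.

```python
import random


class EpsilonGreedyBandit:
    """Context-free epsilon-greedy policy (illustrative; not MABWiser's code)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        # A dedicated, seeded generator keeps exploration reproducible.
        self.rng = random.Random(seed)
        self.counts = {arm: 0 for arm in self.arms}
        self.means = {arm: 0.0 for arm in self.arms}

    def fit(self, decisions, rewards):
        """Update the running mean reward of each arm from logged data."""
        for arm, reward in zip(decisions, rewards):
            self.counts[arm] += 1
            self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

    def predict(self):
        """Explore with probability epsilon, otherwise pick the best arm."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[arm])


# Hypothetical logged data: arm "B" was rewarded more often than arm "A".
decisions = ["A", "A", "B", "B", "A"]
rewards = [1, 0, 1, 1, 0]

greedy = EpsilonGreedyBandit(["A", "B"], epsilon=0.0, seed=42)
greedy.fit(decisions, rewards)
best_arm = greedy.predict()  # with epsilon=0.0 this is purely greedy
```

With `epsilon=0.0` the policy never explores, so `best_arm` is the arm with the highest observed mean reward. Two instances built with the same seed and trained on the same data return the same sequence of predictions, which is the property a reproducible simulation relies on.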
