Multiagent Soft Q-Learning.

Multiagent soft Q-learning

E Wei, D Wicke, D Freelan, S Luke - arXiv preprint arXiv:1804.09817, 2018 - arxiv.org

… We will first derive the policy gradient estimator for the cooperative multiagent case and then
… We first introduce Soft Q-Learning and then describe how we use it for multiagent training. …

Cited by 111

Regularized softmax deep multi-agent Q-learning

L Pan, T Rashid, B Peng, L Huang… - Advances in Neural …, 2021 - proceedings.neurips.cc

… Tackling overestimation in Q-learning is an important … multi-agent setting. In this work,
we empirically demonstrate that QMIX, a popular Q-learning algorithm for cooperative multiagent …

Cited by 36

Inverse factorized soft Q-Learning for cooperative multi-agent imitation learning

T Mai, T Nguyen - 2024 - ink.library.smu.edu.sg

… Recall that when convex activation functions are used in building mixing networks, our
Theorem 4.5 shows that the objective function of the multi-agent inverse soft Q-learning is convex …

Cited by 1

Modular Q-learning based multi-agent cooperation for robot soccer

KH Park, YJ Kim, JH Kim - Robotics and Autonomous systems, 2001 - Elsevier

… from other multi-agent … , Q-learning can be used in the reinforcement scheme as it is
applicable where no model of the environment is available [8], [16]. In this paper, modular Q-learning …

Cited by 162

IQ-Learn: Inverse soft-Q learning for imitation

D Garg, S Chakraborty, C Cundy… - Advances in Neural …, 2021 - proceedings.neurips.cc

… Q-learning update rule for imitation learning that can be implemented on top of soft-Q learning
or soft actor-… Our objective forms a variant of soft-Q learning: to learn the optimal Q-function …

Cited by 196

Learning to Coordinate Efficiently through Multiagent Soft Q-Learning in the presence of Game-Theoretic Pathologies

S Danisa - 2022 - open.uct.ac.za

… Lastly, we introduce extensions of multiagent soft Q-learning whose designs are in line
with our hypotheses and their respective experimental investigations. This chapter is then …

Cited by 1

Multi-Agent Optimistic Soft Q-Learning: A Co-MARL Algorithm with a Global Convergence Guarantee

R Hu, L Ying - openreview.net

… Based on this observation, we propose multi-agent optimistic soft Q-learning (MAOSQL) by
… it determines local policy resembles that of soft Q-learning. This definition naturally leads to a …

Cited by 1

Reducing Q-Value Estimation Bias via Mutual Estimation and Softmax Operation in MADRL

Z Li, X Chen, J Fu, N Xie, T Zhao - Algorithms, 2024 - mdpi.com

… a multi-agent mutual evaluation method and a multi-agent softmax method to reduce the
estimation bias of Q values in multi-agent … This paper primarily studies the Q-learning of agents …

Cited by 2

Balancing two-player stochastic games with soft Q-learning

J Grau-Moya, F Leibfried, H Bou-Ammar - arXiv preprint arXiv:1802.03216, 2018 - arxiv.org

… behaviour by generalising soft Q-learning to stochastic games, … On the theory side, we show
that games with soft Q-learning … Markov games as a framework for multi-agent reinforcement …

Cited by 58

Maximum entropy GFlowNets with soft Q-learning

S Mohammadpour, E Bengio… - International …, 2024 - proceedings.mlr.press

… It is convenient to learn log n both for numerical purposes and for the synergy it has with
Soft Q-learning. In the log space, the sum in (3) is replaced by a log-sum-exp. Indeed, as the …

Cited by 13
