Kernelized Multiplicative Weights for 0/1-Polyhedral Games:...

Video Link: https://www.youtube.com/watch?v=0lsVLecBCOg



Duration: 51:35
378 views


Gabriele Farina (Carnegie Mellon University)
https://simons.berkeley.edu/talks/geometry-correlated-strategies-multiplayer-extensive-form-games-positive-and-parametric
Multi-Agent Reinforcement Learning and Bandit Learning

While extensive-form games (EFGs) can be converted into normal-form games (NFGs), doing so comes at the cost of an exponential blowup of the strategy space. So, progress on NFGs and EFGs has historically followed separate tracks, with the EFG community often having to catch up with advances (e.g., last-iterate convergence and predictive regret bounds) from the larger NFG community. In this talk I will show that the Optimistic Multiplicative Weights Update (OMWU) algorithm -- the premier learning algorithm for NFGs -- can be simulated on the normal-form equivalent of an EFG in linear time per iteration in the game tree size using a kernel trick. The resulting algorithm, Kernelized OMWU (KOMWU), applies more broadly to all convex games whose strategy space is a polytope with 0/1 integral vertices, as long as the kernel can be evaluated efficiently. In the particular case of EFGs, KOMWU closes several standing gaps between NFG and EFG learning, by enabling direct, black-box transfer to EFGs of desirable properties of learning dynamics that were so far known to be achievable only in NFGs. Specifically, KOMWU gives the first algorithm that guarantees at the same time last-iterate convergence, lower dependence on the size of the game tree than all prior algorithms, and \tilde{O}(1) regret when followed by all players.
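The NFG primitive that KOMWU simulates is Optimistic Multiplicative Weights Update on the probability simplex. A minimal sketch of that primitive is below; the two-player matching-pennies demo, the step size, and the function name are illustrative choices, not details from the talk.

```python
import numpy as np

def omwu_step(x, grad, prev_grad, eta=0.1):
    """One Optimistic MWU step on the simplex (gains convention):
    x_{t+1}(a) proportional to x_t(a) * exp(eta * (2*g_t(a) - g_{t-1}(a)))."""
    logits = np.log(x) + eta * (2.0 * grad - prev_grad)
    logits -= logits.max()            # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

# Self-play demo on matching pennies (zero-sum), starting off-equilibrium.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # payoff matrix for player 1
x, y = np.array([0.7, 0.3]), np.array([0.4, 0.6])
gx, gy = np.zeros(2), np.zeros(2)
for _ in range(500):
    new_gx, new_gy = A @ y, -A.T @ x       # each player's payoff gradient
    x = omwu_step(x, new_gx, gx)
    y = omwu_step(y, new_gy, gy)
    gx, gy = new_gx, new_gy
```

In zero-sum games OMWU is known to exhibit last-iterate convergence, so the iterates spiral inward toward the unique equilibrium (0.5, 0.5) rather than cycling as plain MWU does.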

Based on joint work with Christian Kroer, Chung-Wei Lee, and Haipeng Luo. ArXiv Link: https://arxiv.org/abs/2202.00237
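For intuition on the kernel trick: the paper's 0/1-polyhedral kernel is K(x, y) = sum over vertices v of prod_{i : v_i = 1} x_i y_i. On the Boolean hypercube, whose vertices are all 0/1 vectors, this exponentially large sum factorizes coordinate-wise into a product of n terms. The sketch below checks that closed form against brute force; the hypercube is chosen only for simplicity (EFG strategy polytopes require the tree-structured kernel evaluation from the paper), and the function names are illustrative.

```python
import numpy as np
from itertools import product

def hypercube_kernel_fast(x, y):
    # Summing prod_{i: v_i=1} x_i*y_i over all 2^n subsets v
    # factorizes into prod_i (1 + x_i*y_i): n terms, linear time.
    return float(np.prod(1.0 + x * y))

def hypercube_kernel_naive(x, y):
    # Brute force over all 2^n vertices of the hypercube.
    total = 0.0
    for v in product([0, 1], repeat=len(x)):
        v = np.array(v)
        total += float(np.prod(np.where(v == 1, x * y, 1.0)))
    return total

rng = np.random.default_rng(0)
x, y = rng.random(6), rng.random(6)
fast, naive = hypercube_kernel_fast(x, y), hypercube_kernel_naive(x, y)
```

The efficiency requirement in the abstract ("as long as the kernel can be evaluated efficiently") is exactly the requirement that such a factorized or recursive evaluation exists for the polytope at hand.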




Other Videos By Simons Institute for the Theory of Computing


2022-05-06: The Role of Conventions in Adaptive Human-AI Interaction
2022-05-06: Learning Decentralized Policies in Multiagent Systems: How to Learn Efficiently and ...
2022-05-06: No-Regret Learning in Extensive-Form Games
2022-05-06: Learning and Equilibrium Refinements
2022-05-05: Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
2022-05-05: Multi-Player Bandits With No Collisions
2022-05-05: What Does Machine Learning Offer Game Theory (And Vice Versa)?
2022-05-05: Variants and Invariants in No-Regret Algorithms
2022-05-05: When Is Offline Two-Player Zero-Sum Markov Game Solvable?
2022-05-05: General Game-Theoretic Multiagent Reinforcement Learning
2022-05-05: Kernelized Multiplicative Weights for 0/1-Polyhedral Games:...
2022-05-04: Multi-Agent Reinforcement Learning Towards Zero-Shot Communication
2022-05-04: Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning
2022-05-04: Independent Learning in Stochastic Games
2022-05-04: On Rewards in Multi-Agent Systems
2022-05-04: Learning Automata as Building Blocks for MARL
2022-05-04: Efficient Error Correction in Neutral Atoms via Erasure Conversion | Quantum Colloquium
2022-05-04: Multi-Agent Reinforcement Learning in the High Population Regime
2022-05-04: A Regret Minimization Approach to Multi-Agent Control and RL
2022-05-03: The Complexity of Markov Equilibrium in Stochastic Games
2022-05-03: The Complexity of Infinite-Horizon General-Sum Stochastic Games: Turn-Based and Simultaneous Play



Tags: Simons Institute, theoretical computer science, UC Berkeley, Computer Science, Theory of Computation, Theory of Computing, Multi-Agent Reinforcement Learning and Bandit Learning, Gabriele Farina