Multi-Player Bandits With No Collisions

Video Link: https://www.youtube.com/watch?v=J-H8ka4Zhfc

Duration: 40:34


Mark Sellke (Stanford)
https://simons.berkeley.edu/talks/multi-player-bandits-no-collisions
Multi-Agent Reinforcement Learning and Bandit Learning

In the stochastic multi-player bandit problem, m > 1 players cooperate to maximize their total reward on a set of K > m bandit arms. However, the players cannot communicate online and are penalized (e.g., they receive no reward) if they collide by pulling the same arm at the same time. This problem was introduced in the context of wireless radio communication and serves as a natural model for decentralized online decision-making. There have been many results for different versions of this model, most of which rely on a small number of collisions to communicate implicitly. However, it was later realized that nearly optimal T^{1/2} regret is possible with no collisions (and hence no communication) at all. I will discuss this construction, as well as our recent characterization of the Pareto-optimal trade-offs for gap-dependent regret without communication. Based on collaborations with Sebastien Bubeck, Thomas Budzinski, and Allen Liu.
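The setting lends itself to a quick simulation. Below is a minimal Python sketch of the collision model, with illustrative parameter values; it is not the construction from the talk. Each round, every player picks an arm, colliding players receive zero reward, and collective regret is measured against the oracle that always plays the m best arms. The toy strategy is a staggered explore-then-commit: players offset a shared round-robin schedule so they never collide during exploration.

```python
import numpy as np

rng = np.random.default_rng(0)
K, m, T = 10, 3, 20000              # arms, players, horizon (illustrative values)
means = rng.uniform(0.2, 0.8, K)    # unknown Bernoulli arm means
oracle = np.sort(means)[-m:].sum()  # per-round reward of the best m-arm allocation

def pull(choices):
    """Collision model: players pulling the same arm in the same round get 0."""
    rewards = np.zeros(m)
    for arm in set(choices):
        pullers = [p for p in range(m) if choices[p] == arm]
        if len(pullers) == 1:  # reward only when exactly one player pulls the arm
            rewards[pullers[0]] = rng.binomial(1, means[arm])
    return rewards

# Staggered explore-then-commit: in exploration round t, player p pulls arm
# (t + p) mod K, so no two players ever collide. Afterwards, player p commits
# to the (p+1)-th best arm under its own empirical means.
explore_rounds = 200 * K
counts, sums = np.zeros((m, K)), np.zeros((m, K))
total = 0.0
for t in range(T):
    if t < explore_rounds:
        choices = [(t + p) % K for p in range(m)]
    else:
        choices = [int(np.argsort(sums[p] / np.maximum(counts[p], 1))[-(p + 1)])
                   for p in range(m)]
    r = pull(choices)
    for p in range(m):
        counts[p, choices[p]] += 1
        sums[p, choices[p]] += r[p]
    total += r.sum()

print(f"collective regret: {T * oracle - total:.1f} (oracle per round: {oracle:.3f})")
```

Note that in the commit phase the players' empirical rankings can disagree, which can reintroduce collisions; guaranteeing such agreement with no communication at all is precisely the difficulty behind the collision-free T^{1/2} construction discussed in the talk.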

Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Multi-Agent Reinforcement Learning and Bandit Learning
Mark Sellke