Near-Optimal No-Regret Learning for General Convex Games

Video Link: https://www.youtube.com/watch?v=9DA2w8CrBeA



Duration: 1:03:31


Gabriele Farina (Carnegie Mellon University)
https://simons.berkeley.edu/talks/near-optimal-no-regret-learning-general-convex-games
Structure of Constraints in Sequential Decision-Making

A recent line of work has established uncoupled learning dynamics such that, when employed by all players in a game, each player's regret after T repetitions grows polylogarithmically in T, an exponential improvement over the traditional guarantees within the no-regret framework. So far, however, these results have been limited to certain classes of games with structured strategy spaces---such as normal-form and extensive-form games. An important open question is whether O(polylog T) regret bounds can be obtained for general convex and compact strategy sets---which occur in many fundamental models in economics and multiagent systems---while retaining efficient strategy updates.

In this talk, we answer this question in the positive by establishing the first uncoupled learning algorithm with O(log T) per-player regret in general convex games, that is, games with concave utility functions supported on arbitrary convex and compact strategy sets. Our learning dynamics are based on an instantiation of optimistic follow-the-regularized-leader over an appropriately lifted space, using a self-concordant regularizer that is, peculiarly, not a barrier for the feasible region. Further, our learning dynamics are efficiently implementable given access to a proximal oracle for the convex strategy set, leading to O(loglog T) per-iteration complexity; we also give extensions when access to only a linear optimization oracle is assumed. Finally, we adapt our dynamics to guarantee O(sqrt(T)) regret in the adversarial regime. Even in the special cases where prior results apply, our algorithm improves over the state-of-the-art regret bounds either in terms of the dependence on the number of iterations or on the dimension of the strategy sets.

Based on joint work with Ioannis Anagnostides, Haipeng Luo, Chung-Wei Lee, Christian Kroer, and Tuomas Sandholm.

Paper link: https://arxiv.org/abs/2206.08742
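To give a flavor of the optimistic follow-the-regularized-leader template the talk builds on, here is a minimal sketch for the probability simplex with an entropic regularizer. This is an illustration of generic optimistic FTRL only, not the paper's construction: the paper operates over a lifted space with a self-concordant, non-barrier regularizer and handles arbitrary convex and compact sets. The function names and step size below are made up for the example.

```python
import numpy as np

def optimistic_ftrl_simplex(grad_fn, dim, T, eta=0.1):
    """Optimistic FTRL on the probability simplex with an entropic
    regularizer (illustrative; the talk's algorithm differs as noted
    in the lead-in). grad_fn(x, t) returns the gradient of the
    player's utility at strategy x on round t."""
    cum_grad = np.zeros(dim)    # running sum of observed utility gradients
    prediction = np.zeros(dim)  # optimistic guess for the next gradient
    plays = []
    for t in range(T):
        # Maximize <cum_grad + prediction, x> - R(x)/eta over the simplex;
        # for entropic R this has the closed-form softmax solution.
        logits = eta * (cum_grad + prediction)
        x = np.exp(logits - logits.max())  # subtract max for stability
        x /= x.sum()
        plays.append(x)
        g = grad_fn(x, t)
        cum_grad += g
        prediction = g  # standard one-step recency prediction
    return plays
```

On a fixed (non-adversarial) utility vector, the iterates concentrate on the best action, as expected from a no-regret method; the optimism term is what lets regret shrink to polylog(T) when all players run such dynamics against each other.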




Other Videos By Simons Institute for the Theory of Computing


2022-10-26Mathematics of the COVID-19 Pandemics: Lessons Learned and How to Mitigate the Next One
2022-10-25Efficient and Targeted COVID-19 Border Testing via Reinforcement Learning
2022-10-25Random Walks on Simplicial Complexes for Exploring Networks
2022-10-25Functional Law of Large Numbers and PDEs for Spatial Epidemic Models with...
2022-10-25Algorithms Using Local Graph Features to Predict Epidemics
2022-10-24Epidemic Models with Manual and Digital Contact Tracing
2022-10-21Pandora’s Box: Learning to Leverage Costly Information
2022-10-20Thresholds
2022-10-19NLTS Hamiltonians from Codes | Quantum Colloquium
2022-10-15Learning to Control Safety-Critical Systems
2022-10-14Near-Optimal No-Regret Learning for General Convex Games
2022-10-14The Power of Adaptivity in Representation Learning: From Meta-Learning to Federated Learning
2022-10-14When Matching Meets Batching: Optimal Multi-stage Algorithms and Applications
2022-10-13Optimal Learning for Structured Bandits
2022-10-13Dynamic Spatial Matching
2022-10-13New Results on Primal-Dual Algorithms for Online Allocation Problems With Applications to ...
2022-10-12Learning Across Bandits in High Dimension via Robust Statistics
2022-10-12Are Multicriteria MDPs Harder to Solve Than Single-Criteria MDPs?
2022-10-12A Game-Theoretic Approach to Offline Reinforcement Learning
2022-10-11The Statistical Complexity of Interactive Decision Making
2022-10-11A Tutorial on Finite-Sample Guarantees of Contractive Stochastic Approximation With...



Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Structure of Constraints in Sequential Decision-Making
Gabriele Farina