Policy Gradients in General-Sum Dynamic Games: When Do They Even Converge?
Video Link: https://www.youtube.com/watch?v=pWVK2DZRJpM
Eric Mazumdar (Caltech)
https://simons.berkeley.edu/talks/policy-gradients-general-sum-dynamic-games-when-do-they-even-converge
Multi-Agent Reinforcement Learning and Bandit Learning
In this talk I will present work showing that agents using simple policy gradient algorithms in arguably the simplest class of continuous action- and state-space multi-agent control problems, general-sum linear quadratic (LQ) games, have no guarantees of asymptotic convergence, and that proximal point and extra-gradient methods do not resolve these issues. I will then focus on zero-sum LQ games, in which stronger convergence guarantees are possible when agents use independent policy gradients with a finite timescale separation.
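To make the setting concrete, here is a minimal illustrative sketch of independent policy gradients with a finite timescale separation in a scalar zero-sum LQ game. This is not the talk's algorithm or analysis: the dynamics and cost constants, the finite-difference gradient estimator, and the step-size choices below are all assumptions made for illustration.

```python
# A minimal sketch: independent policy gradients with finite timescale
# separation in a scalar zero-sum LQ game. All constants (a, b1, b2, q,
# r1, r2, eta, tau, T) are illustrative assumptions, not from the talk.
import numpy as np

a, b1, b2 = 0.9, 1.0, 1.0    # dynamics: x' = a*x + b1*u + b2*v
q, r1, r2 = 1.0, 1.0, 1.0    # stage cost: q*x^2 + r1*u^2 - r2*v^2
T = 50                       # rollout horizon

def cost(k1, k2, x0=1.0):
    """Finite-horizon cost J(k1, k2) under linear policies u = -k1*x, v = -k2*x.
    Player 1 (u) minimizes J; player 2 (v) maximizes J (zero-sum)."""
    x, J = x0, 0.0
    for _ in range(T):
        u, v = -k1 * x, -k2 * x
        J += q * x**2 + r1 * u**2 - r2 * v**2
        x = a * x + b1 * u + b2 * v
    return J

def grad(f, k, eps=1e-5):
    """Central finite-difference derivative of f at the scalar gain k."""
    return (f(k + eps) - f(k - eps)) / (2 * eps)

k1, k2 = 0.1, 0.1            # initial feedback gains
eta, tau = 1e-3, 10.0        # base step size; timescale separation factor
for _ in range(5000):
    g1 = grad(lambda k: cost(k, k2), k1)  # player 1 differentiates only its own gain
    g2 = grad(lambda k: cost(k1, k), k2)  # player 2 likewise (independent learners)
    k1 -= tau * eta * g1                  # fast gradient descent (minimizer)
    k2 += eta * g2                        # slow gradient ascent (maximizer)

print("final gains:", k1, k2)
```

Under these assumptions the fast player effectively best-responds between the slow player's updates, which is the intuition behind the timescale-separation guarantees mentioned in the abstract; with tau = 1 (no separation), such gradient play can fail to converge in general.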
Tags: Simons Institute, theoretical computer science, UC Berkeley, Computer Science, Theory of Computation, Theory of Computing, Multi-Agent Reinforcement Learning and Bandit Learning, Eric Mazumdar