Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

Video Link: https://www.youtube.com/watch?v=mu7siHbCOXw



Category: Guide
Duration: 38:05
753 views


Zhuoran Yang (UC Berkeley)
https://simons.berkeley.edu/talks/sequential-information-design-markov-persuasion-process-and-its-efficient-reinforcement
Multi-Agent Reinforcement Learning and Bandit Learning

In today's economy, it has become important for Internet platforms to consider the sequential information design problem in order to align their long-term interests with the incentives of gig service providers. In this talk, I will introduce a novel model of sequential information design, namely Markov persuasion processes (MPPs), in which a sender with an informational advantage seeks to persuade a stream of myopic receivers to take actions that maximize the sender's cumulative utility in a finite-horizon Markovian environment with varying prior and utility functions. Planning in MPPs thus faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender. For the online setting where the model is unknown, I will introduce a provably efficient reinforcement learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of the optimism and pessimism principles and enjoys a sublinear regret upper bound.
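
To make the interaction protocol concrete, here is a minimal, illustrative sketch (not the paper's implementation or the OP4 algorithm) of one episode of a finite-horizon MPP with direct action recommendations: at each step, nature draws an outcome from the current prior, the sender signals according to a committed scheme, the myopic receiver best-responds to the induced posterior (a persuasive scheme is one the receiver is willing to obey), and the state transitions according to the action taken. All names and the randomly generated tabular primitives (prior, receiver_utility, sender_utility, transition) are hypothetical placeholders; the fully revealing scheme at the end is only a trivially persuasive baseline.

import numpy as np

rng = np.random.default_rng(0)

H, n_states, n_actions, n_outcomes = 4, 3, 2, 2

# Hypothetical tabular primitives, randomly generated purely for illustration.
prior = rng.dirichlet(np.ones(n_outcomes), size=(H, n_states))                 # mu_h(omega | s)
receiver_utility = rng.random((H, n_states, n_outcomes, n_actions))            # u_h(s, omega, a)
sender_utility = rng.random((H, n_states, n_outcomes, n_actions))              # v_h(s, omega, a)
transition = rng.dirichlet(np.ones(n_states), size=(H, n_states, n_actions))   # P_h(s' | s, a)

def myopic_best_response(posterior, h, s):
    # A myopic receiver maximizes only their one-step expected utility
    # under the posterior over outcomes induced by the sender's signal.
    return int(np.argmax(posterior @ receiver_utility[h, s]))

def run_episode(scheme):
    # scheme[h][s][omega] is a distribution over recommended actions.
    # Returns the sender's cumulative utility over one episode.
    s, total = int(rng.integers(n_states)), 0.0
    for h in range(H):
        omega = rng.choice(n_outcomes, p=prior[h, s])                  # nature draws the outcome
        rec = rng.choice(n_actions, p=scheme[h][s][omega])             # sender recommends an action
        # Bayes' rule: posterior over omega given that `rec` was recommended.
        joint = prior[h, s] * np.array([scheme[h][s][w][rec] for w in range(n_outcomes)])
        posterior = joint / joint.sum() if joint.sum() > 0 else prior[h, s]
        a = myopic_best_response(posterior, h, s)                      # persuasive schemes make a == rec
        total += sender_utility[h, s, omega, a]
        s = int(rng.choice(n_states, p=transition[h, s, a]))           # Markovian state transition
    return total

# Fully revealing scheme: recommend the receiver's best action for each realized outcome.
# It is trivially persuasive but generally suboptimal for the sender's cumulative utility.
reveal = [[[np.eye(n_actions)[myopic_best_response(np.eye(n_outcomes)[w], h, s)]
            for w in range(n_outcomes)]
           for s in range(n_states)]
          for h in range(H)]
print("Sender's cumulative utility under full revelation:", run_episode(reveal))

The sketch only simulates the model with a known environment; the talk's focus is the harder online problem of learning a persuasive, near-optimal signaling policy when the priors, utilities, and transitions are unknown.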




Other Videos By Simons Institute for the Theory of Computing


2022-05-06  No-Regret Learning in Extensive-Form Games
2022-05-06  Learning and Equilibrium Refinements
2022-05-05  Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
2022-05-05  Multi-Player Bandits With No Collisions
2022-05-05  What Does Machine Learning Offer Game Theory (And Vice Versa)?
2022-05-05  Variants and Invariants in No-Regret Algorithms
2022-05-05  When Is Offline Two-Player Zero-Sum Markov Game Solvable?
2022-05-05  General Game-Theoretic Multiagent Reinforcement Learning
2022-05-05  Kernelized Multiplicative Weights for 0/1-Polyhedral Games:...
2022-05-04  Multi-Agent Reinforcement Learning Towards Zero-Shot Communication
2022-05-04  Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning
2022-05-04  Independent Learning in Stochastic Games
2022-05-04  On Rewards in Multi-Agent Systems
2022-05-04  Learning Automata as Building Blocks for MARL
2022-05-04  Efficient Error Correction in Neutral Atoms via Erasure Conversion | Quantum Colloquium
2022-05-04  Multi-Agent Reinforcement Learning in the High Population Regime
2022-05-04  A Regret Minimization Approach to Multi-Agent Control and RL
2022-05-03  The Complexity of Markov Equilibrium in Stochastic Games
2022-05-03  The Complexity of Infinite-Horizon General-Sum Stochastic Games: Turn-Based and Simultaneous Play
2022-05-03  Policy Gradients in General-Sum Dynamic Games: When Do They Even Converge?
2022-05-03  No-Regret Learning in Time-Varying Zero-Sum Games



Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Multi-Agent Reinforcement Learning and Bandit Learning
Zhuoran Yang