Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

Video Link: https://www.youtube.com/watch?v=mu7siHbCOXw



Category: Guide
Duration: 38:05
753 views


Zhuoran Yang (UC Berkeley)
https://simons.berkeley.edu/talks/sequential-information-design-markov-persuasion-process-and-its-efficient-reinforcement
Multi-Agent Reinforcement Learning and Bandit Learning

In today's economy, it has become important for Internet platforms to consider the sequential information design problem in order to align their long-term interests with the incentives of gig service providers. In this talk, I will introduce a novel model of sequential information design, namely Markov persuasion processes (MPPs), in which a sender with an informational advantage seeks to persuade a stream of myopic receivers to take actions that maximize the sender's cumulative utility in a finite-horizon Markovian environment with varying prior and utility functions. Planning in MPPs thus faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender. For the online setting where the model is unknown, I will introduce a provably efficient reinforcement learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of the optimism and pessimism principles and enjoys a sublinear regret upper bound.
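
To make the interaction protocol concrete, here is a minimal, illustrative sketch (not the paper's implementation or the OP4 algorithm) of one episode of a finite-horizon MPP with direct action recommendations: at each step, nature draws an outcome from the current prior, the sender signals according to a committed scheme, the myopic receiver best-responds to the induced posterior (a persuasive scheme is one the receiver is willing to obey), and the state transitions according to the action taken. All names and the randomly generated tabular primitives (prior, receiver_utility, sender_utility, transition) are hypothetical placeholders; the fully revealing scheme at the end is only a trivially persuasive baseline.

import numpy as np

rng = np.random.default_rng(0)

H, n_states, n_actions, n_outcomes = 4, 3, 2, 2

# Hypothetical tabular primitives, randomly generated purely for illustration.
prior = rng.dirichlet(np.ones(n_outcomes), size=(H, n_states))                 # mu_h(omega | s)
receiver_utility = rng.random((H, n_states, n_outcomes, n_actions))            # u_h(s, omega, a)
sender_utility = rng.random((H, n_states, n_outcomes, n_actions))              # v_h(s, omega, a)
transition = rng.dirichlet(np.ones(n_states), size=(H, n_states, n_actions))   # P_h(s' | s, a)

def myopic_best_response(posterior, h, s):
    # A myopic receiver maximizes only their one-step expected utility
    # under the posterior over outcomes induced by the sender's signal.
    return int(np.argmax(posterior @ receiver_utility[h, s]))

def run_episode(scheme):
    # scheme[h][s][omega] is a distribution over recommended actions.
    # Returns the sender's cumulative utility over one episode.
    s, total = int(rng.integers(n_states)), 0.0
    for h in range(H):
        omega = rng.choice(n_outcomes, p=prior[h, s])                  # nature draws the outcome
        rec = rng.choice(n_actions, p=scheme[h][s][omega])             # sender recommends an action
        # Bayes' rule: posterior over omega given that `rec` was recommended.
        joint = prior[h, s] * np.array([scheme[h][s][w][rec] for w in range(n_outcomes)])
        posterior = joint / joint.sum() if joint.sum() > 0 else prior[h, s]
        a = myopic_best_response(posterior, h, s)                      # persuasive schemes make a == rec
        total += sender_utility[h, s, omega, a]
        s = int(rng.choice(n_states, p=transition[h, s, a]))           # Markovian state transition
    return total

# Fully revealing scheme: recommend the receiver's best action for each realized outcome.
# It is trivially persuasive but generally suboptimal for the sender's cumulative utility.
reveal = [[[np.eye(n_actions)[myopic_best_response(np.eye(n_outcomes)[w], h, s)]
            for w in range(n_outcomes)]
           for s in range(n_states)]
          for h in range(H)]
print("Sender's cumulative utility under full revelation:", run_episode(reveal))

The sketch only simulates the model with a known environment; the talk's focus is the harder online problem of learning a persuasive, near-optimal signaling policy when the priors, utilities, and transitions are unknown.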




Other Videos By Simons Institute for the Theory of Computing


2022-05-06  No-Regret Learning in Extensive-Form Games
2022-05-06  Learning and Equilibrium Refinements
2022-05-05  Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
2022-05-05  Multi-Player Bandits With No Collisions
2022-05-05  What Does Machine Learning Offer Game Theory (And Vice Versa)?
2022-05-05  Variants and Invariants in No-Regret Algorithms
2022-05-05  When Is Offline Two-Player Zero-Sum Markov Game Solvable?
2022-05-05  General Game-Theoretic Multiagent Reinforcement Learning
2022-05-05  Kernelized Multiplicative Weights for 0/1-Polyhedral Games:...
2022-05-04  Multi-Agent Reinforcement Learning Towards Zero-Shot Communication
2022-05-04  Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning
2022-05-04  Independent Learning in Stochastic Games
2022-05-04  On Rewards in Multi-Agent Systems
2022-05-04  Learning Automata as Building Blocks for MARL
2022-05-04  Efficient Error Correction in Neutral Atoms via Erasure Conversion | Quantum Colloquium
2022-05-04  Multi-Agent Reinforcement Learning in the High Population Regime
2022-05-04  A Regret Minimization Approach to Multi-Agent Control and RL
2022-05-03  The Complexity of Markov Equilibrium in Stochastic Games
2022-05-03  The Complexity of Infinite-Horizon General-Sum Stochastic Games: Turn-Based and Simultaneous Play
2022-05-03  Policy Gradients in General-Sum Dynamic Games: When Do They Even Converge?
2022-05-03  No-Regret Learning in Time-Varying Zero-Sum Games



Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Multi-Agent Reinforcement Learning and Bandit Learning
Zhuoran Yang