Dynamic Regret Minimization for Bandits without Prior Knowledge

Video Link: https://www.youtube.com/watch?v=kLMVbPnD5kc



Duration: 46:00


Chen-Yu Wei (University of Southern California)
https://simons.berkeley.edu/talks/tbd-482
Quantifying Uncertainty: Stochastic, Adversarial, and Beyond

To evaluate the performance of a bandit learner in a changing environment, the standard notion of regret is insufficient. Instead, "dynamic regret" is a better measure that can evaluate the learner's ability to track the changes. How to achieve the optimal dynamic regret without prior knowledge of the number of times the environment changes had long been an open problem, and was recently resolved by Auer, Gajane, and Ortner in their COLT 2019 paper. We will discuss their consecutive sampling technique, which is rarely seen in the bandit literature, and see how their idea can be elegantly generalized to a wide range of bandit/RL problems. Finally, we will discuss important open problems that remain in the area.
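As a hedged illustration of why the abstract distinguishes the two notions (this sketch is not from the talk, and all names in it are hypothetical): static regret compares the learner to the best single fixed arm in hindsight, while dynamic regret compares it to the best arm at every round. On a two-armed bandit whose means switch once, a learner that never adapts can have zero static regret yet large dynamic regret.

```python
# Toy comparison of static vs. dynamic regret on a two-armed bandit
# whose reward means switch once (illustrative sketch, not the talk's algorithm).

def static_regret(means, choices):
    """Regret against the best single fixed arm in hindsight."""
    T = len(choices)
    best_fixed = max(range(len(means[0])),
                     key=lambda a: sum(m[a] for m in means))
    return (sum(m[best_fixed] for m in means)
            - sum(means[t][choices[t]] for t in range(T)))

def dynamic_regret(means, choices):
    """Regret against the best arm at every round (tracks changes)."""
    return sum(max(means[t]) - means[t][choices[t]]
               for t in range(len(means)))

# Means switch at t = 50: arm 0 is best first, then arm 1 takes over.
T = 100
means = [(0.9, 0.1)] * 50 + [(0.1, 0.9)] * 50
choices = [0] * T  # a learner that never adapts keeps playing arm 0

print(static_regret(means, choices))   # 0: arm 0 ties for best fixed arm
print(dynamic_regret(means, choices))  # ~40: 0.8 lost per round after the switch
```

A learner with small dynamic regret must detect the change at t = 50 and switch to arm 1; doing so without being told how many changes occur is exactly the difficulty the talk addresses.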

Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Quantifying Uncertainty: Stochastic Adversarial and Beyond
Chen-Yu Wei