Fixed-point Error Bounds for Mean-payoff Markov Decision Processes

Channel:

Google TechTalks

Subscribers:

349,000

Published on March 25, 2024 4:56:04 PM ● Video Link: https://www.youtube.com/watch?v=rTC_YYsBzz8

Duration: 57:54

481 views

A Google TechTalk, presented by Roberto Cominetti, 2024-03-19
A Google Algorithms Seminar. ABSTRACT: We discuss the use of optimal transport techniques to derive finite-time error bounds for reinforcement learning in mean-payoff Markov decision processes. The results are obtained as a special case of stochastic Krasnoselskii-Mann fixed-point iterations for nonexpansive maps. We present sufficient conditions on the stochastic noise and stepsizes that guarantee almost sure convergence of the iterates to a fixed point, as well as non-asymptotic error bounds and convergence rates. Our main results concern the case of martingale difference noise whose variances may grow without bound. We also analyze the case of uniformly bounded variances, and how the results apply to stochastic gradient descent in convex optimization.
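
As a rough illustration of the kind of iteration the abstract refers to, the sketch below runs a stochastic Krasnoselskii-Mann update of the form x_{n+1} = (1 - a_n) x_n + a_n (T(x_n) + U_{n+1}) on a toy problem. Everything specific here is an assumption made for the example and is not taken from the talk: the map T (a 90-degree rotation of the plane, which is nonexpansive with unique fixed point 0), the stepsizes a_n = 1 / (n + 1)^0.75, and the i.i.d. Gaussian noise standing in for a martingale difference sequence. The function names are hypothetical.

```python
import math
import random


def rotate(x, theta=math.pi / 2):
    """Toy nonexpansive map T: rotation of the plane about the origin.

    It is an isometry, so plain iteration x -> T(x) never converges,
    but its only fixed point is the origin.
    """
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def stochastic_km(T, x0, steps, noise_std=0.1, seed=0):
    """Run x_{n+1} = (1 - a_n) x_n + a_n (T(x_n) + U_{n+1}).

    Assumptions for this sketch: U_{n+1} is i.i.d. zero-mean Gaussian
    noise, and a_n = 1 / (n + 1) ** 0.75 is an illustrative stepsize.
    """
    rng = random.Random(seed)
    x = tuple(x0)
    for n in range(1, steps + 1):
        a = 1.0 / (n + 1) ** 0.75
        # Noisy evaluation of T at the current iterate.
        noisy = tuple(t + rng.gauss(0.0, noise_std) for t in T(x))
        # Averaging step between the current iterate and the noisy image.
        x = tuple((1.0 - a) * xi + a * yi for xi, yi in zip(x, noisy))
    return x


def residual(T, x):
    """Fixed-point residual ||x - T(x)||, the quantity error bounds track."""
    return math.dist(x, T(x))


x_final = stochastic_km(rotate, (1.0, 0.0), steps=20000)
print("initial residual:", residual(rotate, (1.0, 0.0)))
print("final residual:  ", residual(rotate, x_final))
```

In this toy run the residual starts near 1.41 and should end up far smaller, since the averaging damps the rotation while the shrinking stepsizes average out the noise. It says nothing about the rates or conditions established in the talk.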

ABOUT THE SPEAKER: Roberto Cominetti is a professor with the Faculty of Engineering and Sciences at Universidad Adolfo Ibáñez, Santiago, Chile. His research interests include convex analysis and game theory and their applications in transportation networks.

Other Videos By Google TechTalks

2024-05-20	The Data Minimization Principle in Machine Learning
2024-05-20	Challenges in Augmenting Large Language Models with Private Data
2024-05-20	Oblivious RAM: From Theory to Large-scale Real-world Deployment
2024-05-20	Low Cost High Power Membership Inference Attacks
2024-05-20	Can LLMs Keep a Secret? Testing Privacy Implications of Language Models
2024-04-22	Design is Testability
2024-04-12	Charles Hoskinson | CEO of Input Output Global | web3 talks | Apr 4th 2024 | MC: Marlon Ruiz
2024-04-08	Limitations of Stochastic Selection with Pairwise Independent Priors
2024-04-02	NASA CARA - Air Traffic Control in Spaaaaaaaace
2024-03-28	How Your Brain Processes Code
2024-03-25	Fixed-point Error Bounds for Mean-payoff Markov Decision Processes
2024-03-19	One Tree to Rule Them All: Polylogarithmic Universal Steiner Trees
2024-01-26	Understanding Oversmoothing in Graph Neural Networks (GNNs): Insights from Two Theoretical Studies
2023-12-05	Socially Responsible Software Development (Teaching Software Design Systematically)
2023-12-04	Understanding and Mitigating Copying in Diffusion Models
2023-12-04	Efficient Training Image Extraction from Diffusion Models Ryan Webs
2023-11-30	High-Dimensional Prediction for Sequential Decision Making
2023-09-01	Representational Strengths and Limitations of Transformers
2023-09-01	Steven Goldfeder | CEO Offchain Labs / Arbitrum | web3 talks | Aug 24 2023 | MC: Marlon Ruiz
2023-08-29	Differentially Private Sampling from Distributions
2023-07-14	Revisiting Nearest Neighbors from a Sparse Signal Approximation View