Fixed-point Error Bounds for Mean-payoff Markov Decision Processes

Video Link: https://www.youtube.com/watch?v=rTC_YYsBzz8



Duration: 57:54


A Google TechTalk, presented by Roberto Cominetti, 2024-03-19
A Google Algorithms Seminar. ABSTRACT: We discuss the use of optimal transport techniques to derive finite-time error bounds for reinforcement learning in mean-payoff Markov decision processes. The results are obtained as a special case of stochastic Krasnoselski-Mann fixed-point iterations for nonexpansive maps. We present sufficient conditions on the stochastic noise and stepsizes that guarantee almost sure convergence of the iterates towards a fixed point, as well as non-asymptotic error bounds and convergence rates. Our main results concern the case of martingale difference noise with variances that may grow without bound. We also analyze the case of uniformly bounded variances, and how the results apply to stochastic gradient descent in convex optimization.
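
For readers unfamiliar with the iteration named in the abstract, a stochastic Krasnoselski-Mann step is commonly written as x_{n+1} = (1 - a_n) x_n + a_n (T(x_n) + U_{n+1}), where T is a nonexpansive map, a_n is a stepsize in (0, 1), and U_{n+1} is zero-mean noise. The following is a minimal sketch of that recursion, not the speaker's implementation. The map (a plane rotation), the Gaussian noise, the stepsize rule a_n = 1 / (n + 1)^0.75, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def rotate(x, theta=math.pi / 2):
    """Illustrative nonexpansive map: a rotation of the plane.

    It preserves Euclidean distances and its only fixed point is (0, 0).
    """
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def stochastic_km(T, x0, n_iters, noise_std=0.1, step_exponent=0.75, seed=0):
    """Run x_{n+1} = (1 - a_n) x_n + a_n (T(x_n) + U_{n+1}).

    Assumptions (not taken from the talk): a_n = 1 / (n + 1)**step_exponent
    and U_{n+1} is i.i.d. zero-mean Gaussian noise, which is one simple
    example of a martingale difference sequence.
    """
    rng = random.Random(seed)
    x = tuple(x0)
    for n in range(1, n_iters + 1):
        a = 1.0 / (n + 1) ** step_exponent
        # Noisy evaluation of T at the current iterate.
        noisy_tx = tuple(t + rng.gauss(0.0, noise_std) for t in T(x))
        # Averaged (Krasnoselski-Mann) update.
        x = tuple((1 - a) * xi + a * ti for xi, ti in zip(x, noisy_tx))
    return x


def residual(T, x):
    """Fixed-point residual ||x - T(x)|| in the Euclidean norm."""
    tx = T(x)
    return math.hypot(x[0] - tx[0], x[1] - tx[1])


if __name__ == "__main__":
    start = (5.0, -3.0)
    final = stochastic_km(rotate, start, n_iters=5000)
    print("initial residual:", residual(rotate, start))
    print("final residual:  ", residual(rotate, final))
```

The residual ||x_n - T(x_n)|| is the quantity for which error bounds of the kind described in the abstract are typically stated; here it is only monitored numerically.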

ABOUT THE SPEAKER: Roberto Cominetti is a professor with the Faculty of Engineering and Sciences at Universidad Adolfo Ibáñez, Santiago, Chile. His research interests include convex analysis and game theory and their applications in transportation networks.



