Sample Complexity Of Policy-Based Methods Under Off-Policy Sampling And ...

Channel:

Simons Institute for the Theory of Computing

Published on October 8, 2022 10:13:02 AM ● Video Link: https://www.youtube.com/watch?v=9nrG_8ujO7Q

Duration: 30:00

Siva Theja Maguluri (Georgia Institute of Technology)
https://simons.berkeley.edu/talks/sample-complexity-policy-based-methods-under-policy-sampling-and-linear-function-approximation
Joint IFML/Data-Driven Decision Processes Workshop

In this work, we study policy-space methods for solving the reinforcement learning (RL) problem. We first consider the setting where the model is known (the MDP setting) and show that the Natural Policy Gradient (NPG) algorithm and its variants converge linearly (geometrically) without any regularization. In contrast to the optimization approaches used in the literature, our approach is based on approximate dynamic programming. We then consider a variant of NPG with linear function approximation and establish its convergence guarantees. Finally, we consider the RL setting, where the model is unknown and a critic is deployed for policy evaluation. We consider a critic that is based on TD learning and uses off-policy sampling with linear function approximation, which leads to the infamous deadly-triad issue. We propose a generic algorithmic framework of single time-scale, multi-step TD learning with generalized importance sampling ratios that enables us to overcome the high-variance issue in off-policy learning, and we establish its finite-sample guarantees. We show that this leads to an overall sample complexity of O(epsilon^-2).
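
To make the known-model setting in the abstract concrete, here is a minimal sketch of unregularized NPG on a tabular MDP. It assumes the standard softmax parameterization, under which the NPG step reduces to the multiplicative update pi_{k+1}(a|s) ∝ pi_k(a|s) exp(eta Q^{pi_k}(s,a)), and it uses exact policy evaluation. The function names, the step size eta, and the two-state example MDP are illustrative assumptions and are not taken from the talk. The sketch does not cover the linear function approximation variant or the off-policy TD critic.

```python
import numpy as np

def evaluate_q(P, R, pi, gamma):
    """Exact Q^pi for a known tabular MDP.
    P: (S, A, S) transition probabilities, R: (S, A) rewards, pi: (S, A) policy."""
    S, _ = R.shape
    P_pi = np.einsum("sa,sat->st", pi, P)   # state-to-state kernel under pi
    r_pi = np.sum(pi * R, axis=1)           # expected one-step reward under pi
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return R + gamma * (P @ V)              # shape (S, A)

def softmax(theta):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = theta - theta.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def npg_tabular(P, R, gamma=0.9, eta=1.0, iters=100):
    """Unregularized NPG with softmax policies (assumed form, see lead-in).
    Accumulating eta * Q in the logits is equivalent to the multiplicative
    update on pi; using Q instead of the advantage gives the same policy
    because softmax ignores per-state shifts."""
    theta = np.zeros_like(R, dtype=float)   # uniform initial policy
    for _ in range(iters):
        pi = softmax(theta)
        theta += eta * evaluate_q(P, R, pi, gamma)
    pi = softmax(theta)
    return pi, evaluate_q(P, R, pi, gamma)

def two_state_mdp():
    """Hypothetical toy MDP: reward 1 only for staying in state 1."""
    P = np.zeros((2, 2, 2))
    P[0, 0, 0] = 1.0   # state 0, action 0: stay in state 0
    P[0, 1, 1] = 1.0   # state 0, action 1: move to state 1
    P[1, 0, 1] = 1.0   # state 1, action 0: stay in state 1
    P[1, 1, 0] = 1.0   # state 1, action 1: move to state 0
    R = np.zeros((2, 2))
    R[1, 0] = 1.0
    return P, R

if __name__ == "__main__":
    P, R = two_state_mdp()
    pi, Q = npg_tabular(P, R)
    print("policy:\n", pi)
    print("state values:", (pi * Q).sum(axis=1))
```

On this toy MDP with gamma = 0.9, the optimal policy moves from state 0 to state 1 and then stays there, so the optimal state values are 9 and 10. The sketch can be checked against those numbers.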
