Policy Gradient: Optimal Estimation, Convergence, and Generalization beyond Cumulative Rewards
Video Link: https://www.youtube.com/watch?v=gRI4ZyHLGzc
Mengdi Wang (Princeton University)
https://simons.berkeley.edu/talks/tbd-365
Adversarial Approaches in Machine Learning