DeepMind x UCL RL Lecture Series - Policy-Gradient and Actor-Critic methods [9/13]
Video Link: https://www.youtube.com/watch?v=y3oqOjHilio
Research Scientist Hado van Hasselt covers policy-gradient algorithms, which learn policies directly, and actor-critic algorithms, which combine policy learning with value predictions for more efficient learning.
Slides: https://dpmd.ai/policygradient
Full video lecture series: https://dpmd.ai/DeepMindxUCL21