Thinking While Moving: Deep Reinforcement Learning with Concurrent Control

Channel: Yannic Kilcher

Subscribers: 301,000

Published on April 23, 2020 1:26:07 PM ● Video Link: https://www.youtube.com/watch?v=pZyxlf6l0N8

Duration: 29:41

2,840 views

Classic RL "stops" the world whenever the agent computes a new action. This paper considers a more realistic scenario in which the agent has to decide on its next action while it is still executing the previous one. The result is a fascinating reformulation of Q-learning: first in continuous time, then with concurrency added, and finally discretized back to discrete time.

https://arxiv.org/abs/2004.06089

Abstract:
We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system, such as when a robot must decide on the next action while still performing the previous action. Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed. In order to develop an algorithmic framework for such concurrent control problems, we start with a continuous-time formulation of the Bellman equations, and then discretize them in a way that is aware of system delays. We instantiate this new class of approximate dynamic programming methods via a simple architectural extension to existing value-based deep reinforcement learning algorithms. We evaluate our methods on simulated benchmark tasks and a large-scale robotic grasping task where the robot must "think while moving".
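To make the "think while moving" setting concrete, here is a minimal tabular sketch in Python. It is not the paper's algorithm: the paper works with deep value-based methods and a delay-aware discretization of the continuous-time Bellman equations, whereas this toy only illustrates the basic idea of conditioning the Q-function on the action that is still in flight. The environment `DelayedChain`, the helper `q_update`, the one-step delay, and all constants are assumptions made up for illustration.

```python
from collections import defaultdict

ACTIONS = (-1, +1)  # move left / move right on a 1-D chain


class DelayedChain:
    """Hypothetical toy chain: the action picked now is executed one step later."""

    def __init__(self, length=5):
        self.length = length
        self.reset()

    def reset(self):
        self.pos = 0
        self.pending = +1  # assumed default: an action is already in flight
        return self.pos, self.pending

    def step(self, next_action):
        # The world keeps moving under the previously committed action
        # while the agent is still deciding on next_action.
        self.pos = min(max(self.pos + self.pending, 0), self.length - 1)
        self.pending = next_action
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        # The observation includes the in-flight action, so the augmented
        # state (position, pending action) is Markov despite the delay.
        return (self.pos, self.pending), reward, done


def q_update(q, obs, action, reward, next_obs, done, gamma=0.9, alpha=0.5):
    """One standard Q-learning backup on the augmented state (pos, pending)."""
    target = reward
    if not done:
        target += gamma * max(q[(next_obs, a)] for a in ACTIONS)
    q[(obs, action)] += alpha * (target - q[(obs, action)])
    return q[(obs, action)]


if __name__ == "__main__":
    env = DelayedChain(length=3)
    q = defaultdict(float)
    obs = env.reset()
    done = False
    while not done:
        action = +1  # fixed policy, just to walk through one episode
        next_obs, reward, done = env.step(action)
        q_update(q, obs, action, reward, next_obs, done)
        obs = next_obs
    print(dict(q))
```

Note that the reward at each step depends on the action committed one step earlier, not on the one just chosen. Without the pending action in the observation, the agent could not tell which move is about to be executed.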

Authors: Ted Xiao, Eric Jang, Dmitry Kalashnikov, Sergey Levine, Julian Ibarz, Karol Hausman, Alexander Herzog

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Other Videos By Yannic Kilcher

2020-05-03	I talk to the new Facebook Blender Chatbot
2020-05-02	Jukebox: A Generative Model for Music (Paper Explained)
2020-05-01	[ML Coding Tips] Separate Computation & Plotting using locals
2020-04-30	The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies (Paper Explained)
2020-04-29	Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask (Paper Explained)
2020-04-28	[Rant] Online Conferences
2020-04-27	Do ImageNet Classifiers Generalize to ImageNet? (Paper Explained)
2020-04-26	[Drama] Schmidhuber: Critique of Honda Prize for Dr. Hinton
2020-04-25	How much memory does Longformer use?
2020-04-24	Supervised Contrastive Learning
2020-04-23	Thinking While Moving: Deep Reinforcement Learning with Concurrent Control
2020-04-22	[Rant] The Male Only History of Deep Learning
2020-04-21	Gradient Surgery for Multi-Task Learning
2020-04-20	Longformer: The Long-Document Transformer
2020-04-20	Backpropagation and the brain
2020-04-18	Shortcut Learning in Deep Neural Networks
2020-04-17	Feature Visualization & The OpenAI microscope
2020-04-16	Datasets for Data-Driven Reinforcement Learning
2020-04-15	FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
2020-04-14	Imputer: Sequence Modelling via Imputation and Dynamic Programming
2020-04-13	The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

Tags: deep learning, machine learning, reinforcement learning, vector to go, vtg, continuous, control, robot, concurrent, deep rl, deep neural networks, berkeley, google, grasping, qlearning
