Regularizing Trajectory Optimization with Denoising Autoencoders (Paper Explained)

Subscribers:
284,000
Published on ● Video Link: https://www.youtube.com/watch?v=UjJU13GdL94



Duration: 29:58
5,151 views
188


Can you plan with a learned model of the world? Yes, but there's a catch: The better your planning algorithm is, the more the errors of your world model will hurt you! This paper solves this problem by regularizing the planning algorithm to stay in high probability regions, given its experience.

https://arxiv.org/abs/1903.11981

Interview w/ Harri: https://youtu.be/HnZDmxYnpg4

Abstract:
Trajectory optimization using a learned model of the environment is one of the core elements of model-based reinforcement learning. This procedure often suffers from exploiting inaccuracies of the learned model. We propose to regularize trajectory optimization by means of a denoising autoencoder that is trained on the same trajectories as the model of the environment. We show that the proposed regularization leads to improved planning with both gradient-based and gradient-free optimizers. We also demonstrate that using regularized trajectory optimization leads to rapid initial learning in a set of popular motor control tasks, which suggests that the proposed approach can be a useful tool for improving sample efficiency.

Authors: Rinu Boney, Norman Di Palo, Mathias Berglund, Alexander Ilin, Juho Kannala, Antti Rasmus, Harri Valpola

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher




Other Videos By Yannic Kilcher


2020-06-03Learning To Classify Images Without Labels (Paper Explained)
2020-06-02On the Measure of Intelligence by François Chollet - Part 1: Foundations (Paper Explained)
2020-06-01Dynamics-Aware Unsupervised Discovery of Skills (Paper Explained)
2020-05-31Synthesizer: Rethinking Self-Attention in Transformer Models (Paper Explained)
2020-05-30[Code] How to use Facebook's DETR object detection algorithm in Python (Full Tutorial)
2020-05-29GPT-3: Language Models are Few-Shot Learners (Paper Explained)
2020-05-28DETR: End-to-End Object Detection with Transformers (Paper Explained)
2020-05-27mixup: Beyond Empirical Risk Minimization (Paper Explained)
2020-05-26A critical analysis of self-supervision, or what we can learn from a single image (Paper Explained)
2020-05-25Deep image reconstruction from human brain activity (Paper Explained)
2020-05-24Regularizing Trajectory Optimization with Denoising Autoencoders (Paper Explained)
2020-05-23[News] The NeurIPS Broader Impact Statement
2020-05-22When BERT Plays the Lottery, All Tickets Are Winning (Paper Explained)
2020-05-21[News] OpenAI Model Generates Python Code
2020-05-20Investigating Human Priors for Playing Video Games (Paper & Demo)
2020-05-19iMAML: Meta-Learning with Implicit Gradients (Paper Explained)
2020-05-18[Code] PyTorch sentiment classifier from scratch with Huggingface NLP Library (Full Tutorial)
2020-05-17Planning to Explore via Self-Supervised World Models (Paper Explained)
2020-05-16[News] Facebook's Real-Time TTS system runs on CPUs only!
2020-05-15Weight Standardization (Paper Explained)
2020-05-14[Trash] Automated Inference on Criminality using Face Images



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
rl
reinforcement learning
model predictive control
dae
denoising autoencoders
trajectory
trajectory optimization
planning
adversarial attack
errors
open loop
closed loop
joint
probability
derivative
gaussian
experience
learned model
world model
model predictive
mpc