Datasets for Data-Driven Reinforcement Learning

Video Link: https://www.youtube.com/watch?v=-h1KB8ps11A
Duration: 19:55
Views: 4,504
Subscribers: 284,000

Offline Reinforcement Learning has recently received growing attention in domains where training classic on-policy RL algorithms is infeasible, such as safety-critical tasks or learning from expert demonstrations. This paper presents an extensive benchmark for evaluating offline RL algorithms across a variety of settings.

Paper: https://arxiv.org/abs/2004.07219
Code: https://github.com/rail-berkeley/offline_rl
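
As a rough illustration of how benchmark datasets like these are typically consumed, here is a minimal sketch assuming a D4RL-style Python API (the package name, environment ID, and helper functions are assumptions and may differ from the linked repository):

```python
# Minimal sketch of loading one of the benchmark datasets, assuming a
# D4RL-style Python API (package name, environment ID, and helpers are
# assumptions and may differ from the linked repository).
import gym
import d4rl  # registers the offline RL environments with gym (assumed package)

env = gym.make('halfcheetah-medium-v0')  # example task ID (assumed)

# Raw dataset: a dict of numpy arrays keyed by 'observations', 'actions',
# 'rewards', 'terminals', ...
dataset = env.get_dataset()
print(dataset['observations'].shape, dataset['actions'].shape)

# Convenience helper that also provides 'next_observations' for
# Q-learning-style algorithms.
qdata = d4rl.qlearning_dataset(env)
```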

Abstract:
The offline reinforcement learning (RL) problem, also referred to as batch RL, refers to the setting where a policy must be learned from a dataset of previously collected data, without additional online data collection. In supervised learning, large datasets and complex deep neural networks have fueled impressive progress, but in contrast, conventional RL algorithms must collect large amounts of on-policy data and have had little success leveraging previously collected datasets. As a result, existing RL benchmarks are not well-suited for the offline setting, making progress in this area difficult to measure. To design a benchmark tailored to offline RL, we start by outlining key properties of datasets relevant to applications of offline RL. Based on these properties, we design a set of benchmark tasks and datasets that evaluate offline RL algorithms under these conditions. Examples of such properties include: datasets generated via hand-designed controllers and human demonstrators, multi-objective datasets, where an agent can perform different tasks in the same environment, and datasets consisting of a heterogeneous mix of high-quality and low-quality trajectories. By designing the benchmark tasks and datasets to reflect properties of real-world offline RL problems, our benchmark will focus research effort on methods that drive substantial improvements not just on simulated benchmarks, but ultimately on the kinds of real-world problems where offline RL will have the largest impact.
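
To make the offline setting concrete, below is a minimal sketch of learning a policy from a fixed dataset of transitions with no further environment interaction, shown as simple behavioral cloning on synthetic stand-in data. This is only an illustration of the problem setup, not the paper's method or benchmark code.

```python
# Behavioral cloning from a static dataset: regress actions onto observations.
# The data here is synthetic; in practice it would come from logged trajectories.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
obs_dim, act_dim, n = 17, 6, 10_000

# Stand-in for a previously collected dataset.
observations = rng.normal(size=(n, obs_dim)).astype(np.float32)
actions = rng.normal(size=(n, act_dim)).astype(np.float32)

policy = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, act_dim))
optim = torch.optim.Adam(policy.parameters(), lr=3e-4)

obs_t = torch.from_numpy(observations)
act_t = torch.from_numpy(actions)

for step in range(1000):
    idx = torch.randint(0, n, (256,))  # sample a mini-batch from the static dataset
    loss = ((policy(obs_t[idx]) - act_t[idx]) ** 2).mean()  # imitate logged actions
    optim.zero_grad()
    loss.backward()
    optim.step()
```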

Authors: Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Tags:
deep learning
machine learning
reinforcement learning
deep rl
off-policy
on-policy
replay buffer
dataset
benchmark
berkeley
rail
offline
online