Policy Optimization as Predictable Online Learning Problems: Imitation Learning and Beyond

Channel: Microsoft Research

Subscribers: 351,000

Published on November 28, 2018 7:10:45 PM ● Video Link: https://www.youtube.com/watch?v=RW9nAj8tCls

Duration: 1:10:13

1,284 views

Efficient policy optimization is fundamental to solving real-world reinforcement learning problems, where agent-environment interactions can be costly. In this talk, I will discuss my recent research on improving the efficiency of policy optimization from the perspective of online learning. The use of online learning to analyze policy optimization was pioneered by Ross et al., who proposed reducing imitation learning to an adversarial online learning problem. However, as I will discuss, this reduction loses information: the policy optimization problem is not truly adversarial, because the loss in each round can be partly predicted from past information. Based on this observation, I will present conditions for the last-iterate convergence of value aggregation for imitation learning. Furthermore, I will show how this predictable information can be leveraged to design algorithms that speed up imitation learning and reinforcement learning.
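To make the idea of a predictable (rather than adversarial) online problem concrete, here is a minimal toy sketch in Python. It is not the algorithm or the setting from the talk: the scalar decision `x`, the quadratic loss, and the parameters `theta` and `b` are all assumptions made up for illustration. In each round the loss depends on the learner's own current decision, and the learner aggregates all past losses and plays their joint minimizer, in the spirit of follow-the-leader style aggregation.

```python
def aggregate(theta, b=1.0, rounds=10_000, x0=0.0):
    """Follow-the-leader on a toy decision-dependent loss (illustrative only).

    Round n reveals l_n(x) = 0.5 * (x - c_n)**2 with c_n = theta * x_n + b,
    so the loss is determined by the learner's own decision x_n instead of
    being chosen by an adversary. `theta` and `b` are hypothetical parameters.
    """
    x = x0
    target_sum = 0.0
    for n in range(1, rounds + 1):
        # The loss revealed in round n depends on the current decision x_n.
        target_sum += theta * x + b
        # Minimizer of the sum of all quadratic losses seen so far:
        # the running mean of the targets c_1, ..., c_n.
        x = target_sum / n
    return x


# Weak dependence on the current decision: the last iterate settles near
# the fixed point b / (1 - theta) = 2.
stable = aggregate(theta=0.5)

# Strong dependence: the last iterate drifts away from the fixed point -2.
unstable = aggregate(theta=1.5)

print(stable, unstable)
```

In this toy problem, whether the last iterate converges depends on how strongly the loss depends on the current decision, which is one simple way to see why last-iterate convergence needs conditions beyond an average-regret guarantee.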

View slides and more at https://www.microsoft.com/en-us/research/video/policy-optimization-as-predictable-online-learning-problems-imitation-learning-and-beyond/


Tags:

microsoft research
