Research talk: Reinforcement learning with preference feedback

Published on February 8, 2022 6:05:41 PM ● Video Link: https://www.youtube.com/watch?v=MJzBUNtv0Ho

Duration: 14:23

899 views

Speaker: Aadirupa Saha, Postdoctoral Researcher, Microsoft Research NYC

In Preference-based Reinforcement Learning (PbRL), an agent receives feedback only in terms of rank-ordered preferences over a set of selected actions, unlike the absolute reward feedback in traditional reinforcement learning. This is relevant in settings where it is difficult for the system designer to explicitly specify a reward function that achieves a desired behavior, but where it is possible to elicit coarser feedback, say from an expert, about which actions are preferred over others at given states. The success of the traditional reinforcement learning framework crucially hinges on the underlying agent-reward model. This, however, depends on how accurately a system designer can express an appropriate reward function, which is often a non-trivial task. The main novelty of the PbRL framework is the ability to learn from non-numeric, preference-based feedback, which eliminates the need to handcraft numeric reward models. We will set up a formal framework for PbRL and discuss different real-world applications. Although PbRL was introduced almost a decade ago, most work in the area has been primarily applied or experimental in nature, barring a handful of very recent ventures on the theory side; we will discuss this gap as well. Finally, we will discuss the limitations of the existing techniques and the scope of future developments.
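To make the contrast with numeric rewards concrete, here is a minimal Python sketch of preference feedback. It is not taken from the talk. It assumes a single state (a bandit-style simplification of PbRL), pairwise rather than full rank-ordered feedback, and a Bradley-Terry (logistic) preference model. The names `preference_feedback` and `estimate_best_action`, the hidden `utilities`, and the win-counting rule are all hypothetical choices made for illustration.

```python
import math
import random


def preference_feedback(utilities, a, b, rng):
    """Return whichever of actions a and b the (simulated) expert prefers.

    Assumption: preferences follow a Bradley-Terry model over hidden
    utilities. The learner only ever sees the winner, never the utilities.
    """
    p_a = 1.0 / (1.0 + math.exp(-(utilities[a] - utilities[b])))
    return a if rng.random() < p_a else b


def estimate_best_action(utilities, n_rounds, rng):
    """Pick the action with the most pairwise wins over random duels.

    This is a simple win-counting heuristic, not an algorithm from the talk.
    """
    n_actions = len(utilities)
    wins = [0] * n_actions
    for _ in range(n_rounds):
        a, b = rng.sample(range(n_actions), 2)  # two distinct actions
        wins[preference_feedback(utilities, a, b, rng)] += 1
    return max(range(n_actions), key=lambda i: wins[i])


if __name__ == "__main__":
    hidden_utilities = [0.0, 1.0, 3.0]  # illustrative values only
    rng = random.Random(0)
    print(estimate_best_action(hidden_utilities, 3000, rng))
```

The learner in this sketch never reads `utilities` directly; it only observes which of two actions won each comparison, which is the kind of coarse signal the abstract describes.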

Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit

Tags:

reward-based learning

reinforcement learning

innovation in artificial environments

accelerate AI

microsoft research summit
