Reward Is Enough (Machine Learning Research Paper Explained)

Subscribers: 291,000
Video Link: https://www.youtube.com/watch?v=dmH1ZpcROMk



Duration: 35:50
27,873 views


#reinforcementlearning #deepmind #agi

What's the most promising path to creating Artificial General Intelligence (AGI)? This paper makes the bold claim that a learning agent maximizing its reward in a sufficiently complex environment will necessarily develop intelligence as a by-product, and that Reward Maximization is the best way to move the creation of AGI forward. The paper is a mix of philosophy, engineering, and futurism, and raises many points of discussion.

OUTLINE:
0:00 - Intro & Outline
4:10 - Reward Maximization
10:10 - The Reward-is-Enough Hypothesis
13:15 - Abilities associated with intelligence
16:40 - My Criticism
26:15 - Reward Maximization through Reinforcement Learning
31:30 - Discussion, Conclusion & My Comments

Paper: https://www.sciencedirect.com/science/article/pii/S0004370221000862

Abstract:
In this article we hypothesise that intelligence, and its associated abilities, can be understood as subserving the maximisation of reward. Accordingly, reward is enough to drive behaviour that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalisation and imitation. This is in contrast to the view that specialised problem formulations are needed for each ability, based on other signals or objectives. Furthermore, we suggest that agents that learn through trial and error experience to maximise reward could learn behaviour that exhibits most if not all of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence.
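
The mechanism the authors appeal to is reward maximisation through trial-and-error reinforcement learning. As a minimal sketch of that setting (the toy chain environment, hyperparameters, and helper names below are illustrative assumptions, not taken from the paper or the video), here is a tabular Q-learning agent that learns behaviour purely from a scalar reward signal:

import random

# Toy 5-state chain: the agent starts at state 0 and gets reward 1.0
# only when it reaches the rightmost state, which ends the episode.
# (Illustrative environment, not from the paper.)
N_STATES = 5
ACTIONS = [0, 1]  # 0 = move left, 1 = move right

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

# Tabular action-value estimates, learned purely from the reward signal.
q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

def greedy(state):
    """Highest-value action for a state, breaking ties randomly."""
    best = max(q[state])
    return random.choice([a for a in ACTIONS if q[state][a] == best])

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy trial and error: mostly exploit, occasionally explore.
        action = random.choice(ACTIONS) if random.random() < epsilon else greedy(state)
        next_state, reward, done = step(state, action)
        # Move the estimate toward reward plus discounted best future value.
        q[state][action] += alpha * (reward + gamma * max(q[next_state]) - q[state][action])
        state = next_state

print("Greedy policy per state:", [greedy(s) for s in range(N_STATES)])

The paper's claim is that the same single objective, applied in a sufficiently rich environment, would be enough to drive abilities such as perception, language, generalisation and social intelligence as by-products of maximising reward.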

Authors: David Silver, Satinder Singh, Doina Precup, Richard S. Sutton

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/
BiliBili: https://space.bilibili.com/1824646584

If you want to support me, the best thing to do is to share the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n




Other Videos By Yannic Kilcher


2021-06-27The Dimpled Manifold Model of Adversarial Examples in Machine Learning (Research Paper Explained)
2021-06-24[ML News] Hugging Face course | GAN Theft Auto | AI Programming Puzzles | PyTorch 1.9 Released
2021-06-23XCiT: Cross-Covariance Image Transformers (Facebook AI Machine Learning Research Paper Explained)
2021-06-19AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control (Paper Explained)
2021-06-16[ML News] De-Biasing GPT-3 | RL cracks chip design | NetHack challenge | Open-Source GPT-J
2021-06-11Efficient and Modular Implicit Differentiation (Machine Learning Research Paper Explained)
2021-06-09[ML News] EU regulates AI, China trains 1.75T model, Google's oopsie, Everybody cheers for fraud.
2021-06-08My GitHub (Trash code I wrote during PhD)
2021-06-05Decision Transformer: Reinforcement Learning via Sequence Modeling (Research Paper Explained)
2021-06-02[ML News] Anthropic raises $124M, ML execs clueless, collusion rings, ELIZA source discovered & more
2021-05-31Reward Is Enough (Machine Learning Research Paper Explained)
2021-05-30[Rant] Can AI read your emotions? (No, but ...)
2021-05-29Fast and Slow Learning of Recurrent Independent Mechanisms (Machine Learning Paper Explained)
2021-05-26[ML News] DeepMind fails to get independence from Google
2021-05-24Expire-Span: Not All Memories are Created Equal: Learning to Forget by Expiring (Paper Explained)
2021-05-21FNet: Mixing Tokens with Fourier Transforms (Machine Learning Research Paper Explained)
2021-05-18AI made this music video | What happens when OpenAI's CLIP meets BigGAN?
2021-05-15DDPM - Diffusion Models Beat GANs on Image Synthesis (Machine Learning Research Paper Explained)
2021-05-11Research Conference ICML drops their acceptance rate | Area Chairs instructed to be more picky
2021-05-08Involution: Inverting the Inherence of Convolution for Visual Recognition (Research Paper Explained)
2021-05-06MLP-Mixer: An all-MLP Architecture for Vision (Machine Learning Research Paper Explained)



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
deep learning tutorial
introduction to deep learning
what is deep learning
how to achieve agi
artificial general intelligence
how to create intelligence
reward maximisation
reward maximization
reinforcement learning
is alphago intelligence
is gpt 3 self aware
is gpt 3 intelligent
how to create ai
how to achieve ai
general ai
agent environment
deepmind