Gradients are Not All You Need (Machine Learning Research Paper Explained)

Channel: Yannic Kilcher

Subscribers: 300,000

Published on November 16, 2021 12:36:11 AM

Video Link: https://www.youtube.com/watch?v=EeMhj0sPrhE

Duration: 48:29

Views: 37,425

#deeplearning #backpropagation #simulation

More and more systems are made differentiable, which means that exact gradients of these systems' dynamics can be computed. While this development has led to many advances, there are also distinct situations where backpropagation can be a very bad idea. This paper characterizes a few such systems in the domain of iterated dynamical systems, often including some source of stochasticity, which results in chaotic behavior. In these systems, it is often better to use black-box gradient estimators than to compute the gradients exactly.
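The contrast between exact gradients and black-box estimates can be illustrated with a toy system. The sketch below is not taken from the paper: it assumes the logistic map as a stand-in for a chaotic iterated system, and the function names (`rollout`, `exact_gradient`, `black_box_gradient`) and all constants are hypothetical choices for illustration. The black-box estimator is an antithetic, evolution-strategies-style estimate of the gradient of a Gaussian-smoothed objective, which is one possible black-box method among several.

```python
import random


def rollout(r, x0=0.2, steps=100):
    """Iterate the logistic map x <- r*x*(1-x) and return the final state."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def exact_gradient(r, x0=0.2, steps=100):
    """Exact d x_T / d r, carried forward through the iteration by the chain rule."""
    x, dx_dr = x0, 0.0
    for _ in range(steps):
        # Differentiate r*x*(1-x) with respect to r, using the state before the update.
        dx_dr = x * (1.0 - x) + r * (1.0 - 2.0 * x) * dx_dr
        x = r * x * (1.0 - x)
    return dx_dr


def black_box_gradient(r, sigma=0.01, samples=256, seed=0):
    """Antithetic estimate of the gradient of the Gaussian-smoothed final state.

    Only function evaluations are used; nothing is backpropagated.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        eps = rng.gauss(0.0, 1.0)
        diff = rollout(r + sigma * eps) - rollout(r - sigma * eps)
        total += eps * diff / (2.0 * sigma)
    return total / samples


# r = 3.9 is assumed here as a parameter in the chaotic regime of the logistic map.
exact = exact_gradient(3.9)
estimate = black_box_gradient(3.9)
print("exact gradient magnitude:    ", abs(exact))
print("black-box estimate magnitude:", abs(estimate))
```

Because the final state stays in [0, 1], each term of the black-box estimate is bounded by |eps| / (2 * sigma), so the estimate stays moderate while the exact gradient grows with the number of steps. The price is that the estimator targets a smoothed version of the objective rather than the objective itself.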

OUTLINE:
0:00 - Foreword
1:15 - Intro & Overview
3:40 - Backpropagation through iterated systems
12:10 - Connection to the spectrum of the Jacobian
15:35 - The Reparameterization Trick
21:30 - Problems of reparameterization
26:35 - Example 1: Policy Learning in Simulation
33:05 - Example 2: Meta-Learning Optimizers
36:15 - Example 3: Disk packing
37:45 - Analysis of Jacobians
40:20 - What can be done?
45:40 - Just use Black-Box methods

Paper: https://arxiv.org/abs/2111.05803

Abstract:
Differentiable programming techniques are widely used in the community and are responsible for the machine learning renaissance of the past several decades. While these methods are powerful, they have limits. In this short report, we discuss a common chaos based failure mode which appears in a variety of differentiable circumstances, ranging from recurrent neural networks and numerical physics simulation to training learned optimizers. We trace this failure to the spectrum of the Jacobian of the system under study, and provide criteria for when a practitioner might expect this failure to spoil their differentiation based optimization algorithms.
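The link to the spectrum of the Jacobian can be made concrete on a one-dimensional system, where the Jacobian of a single step is a single number. The following is a minimal sketch under assumptions that are not in the paper: it uses the logistic map as the iterated system, and the names `logistic_step` and `log_abs_gradient` as well as the parameter values are hypothetical. The gradient of the final state with respect to the initial state is the product of the per-step Jacobians, so its logarithm is accumulated to avoid overflow.

```python
import math


def logistic_step(x, r):
    """One step of the logistic map."""
    return r * x * (1.0 - x)


def log_abs_gradient(x0, r, steps):
    """Return log|d x_T / d x_0|, the sum of log-magnitudes of the per-step Jacobians."""
    x = x0
    log_abs = 0.0
    for _ in range(steps):
        jac = r * (1.0 - 2.0 * x)  # 1x1 Jacobian of one step at the current state
        log_abs += math.log(abs(jac))
        x = logistic_step(x, r)
    return log_abs


# Assumed parameter values: r = 2.5 converges to a fixed point, r = 3.9 is chaotic.
stable = log_abs_gradient(0.2, 2.5, 200)
chaotic = log_abs_gradient(0.2, 3.9, 200)
print("log|gradient|, stable regime: ", stable)
print("log|gradient|, chaotic regime:", chaotic)
```

In the stable regime the per-step Jacobian has magnitude below one near the fixed point, so the product shrinks toward zero. In the chaotic regime the magnitudes average above one, and the gradient grows exponentially with the number of steps.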

Authors: Luke Metz, C. Daniel Freeman, Samuel S. Schoenholz, Tal Kachman

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n


Tags: deep learning, machine learning, arxiv, explained, neural networks, artificial intelligence, paper, backpropagation, all you need, gradients, machine learning gradients, differentiable environment, differentiable physics, differentiable simulation, when to use gradients, when not to use gradients, when to avoid gradients, google research, google ai
