Self-Tuning Networks: Amortizing the Hypergradient Computation for Hyperparameter Optimization

Published on: 2021-04-12
Video Link: https://www.youtube.com/watch?v=eoLeANtBGKs



Duration: 1:00:50


Optimization of many deep learning hyperparameters can be formulated as a bilevel optimization problem. While most black-box and gradient-based approaches require many independent training runs, we aim to adapt hyperparameters online as the network trains. The main challenge is to approximate the response Jacobian, which captures how the minimum of the inner objective changes as the hyperparameters are perturbed. To do this, we introduce the self-tuning network (STN), which fits a hypernetwork to approximate the best response function in the vicinity of the current hyperparameters. Differentiating through the hypernetwork lets us efficiently approximate the gradient of the validation loss with respect to the hyperparameters. We train the hypernetwork and hyperparameters jointly. Empirically, we can find hyperparameter settings competitive with Bayesian Optimization in a single run of training, and in some cases find hyperparameter schedules that outperform any fixed hyperparameter value.
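
To make the mechanism concrete, the following is a minimal PyTorch sketch of the idea described above, not the exact STN architecture from the talk: a small hypernetwork maps a hyperparameter to model weights, the hypernetwork is fit on the training loss with the hyperparameter perturbed around its current value, and the hyperparameter is then updated on the validation loss by differentiating through the hypernetwork's approximate best response. All data, dimensions, and names are illustrative.

import torch

torch.manual_seed(0)

# Toy regression data with separate training and validation splits.
X_tr, y_tr = torch.randn(200, 10), torch.randn(200, 1)
X_va, y_va = torch.randn(100, 10), torch.randn(100, 1)

# Hypernetwork: an affine map from the (log) L2-penalty hyperparameter to the
# weights of a linear model, i.e. a local approximation of the best-response function.
class HyperNet(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.w0 = torch.nn.Parameter(torch.zeros(dim, 1))  # weights at the current hyperparameter
        self.dw = torch.nn.Parameter(torch.zeros(dim, 1))  # response direction as the hyperparameter changes

    def forward(self, log_lam):
        return self.w0 + log_lam * self.dw

hyper = HyperNet(dim=10)
log_lam = torch.tensor(0.0, requires_grad=True)  # hyperparameter: log of the L2 penalty

opt_inner = torch.optim.Adam(hyper.parameters(), lr=1e-2)  # updates the hypernetwork
opt_outer = torch.optim.Adam([log_lam], lr=1e-2)           # updates the hyperparameter

def train_loss(w, lam):
    return ((X_tr @ w - y_tr) ** 2).mean() + lam * (w ** 2).sum()

for step in range(2000):
    # Inner step: fit the hypernetwork on the training loss, sampling
    # hyperparameters in a neighborhood of the current value.
    pert = (log_lam + 0.1 * torch.randn(())).detach()
    loss_tr = train_loss(hyper(pert), pert.exp())
    opt_inner.zero_grad(); loss_tr.backward(); opt_inner.step()

    # Outer step: update the hyperparameter on the validation loss,
    # backpropagating through the hypernetwork's approximate best response.
    loss_va = ((X_va @ hyper(log_lam) - y_va) ** 2).mean()
    opt_outer.zero_grad(); loss_va.backward(); opt_outer.step()

print("learned log L2 penalty:", log_lam.item())

In the full method described in the talk, the best-response approximation covers the weights of an entire deep network and the hyperparameters are adapted over a single training run, which is what allows hyperparameter schedules rather than fixed values to emerge; the toy version above tunes only a single regularization hyperparameter for a linear model.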

Roger Grosse is an Assistant Professor of Computer Science at the University of Toronto, and a founding member of the Vector Institute for Artificial Intelligence. He received his Ph.D. in computer science from MIT, and then spent two years as a postdoc at the University of Toronto. He holds a Canada Research Chair in Probabilistic Inference and Deep Learning, an Ontario MRIS Early Researcher Award, and a Canada CIFAR AI Chair.

Learn more about the 2020-2021 Directions in ML: AutoML and Automating Algorithms virtual speaker series: https://aka.ms/diml




Other Videos By Microsoft Research


2021-04-29 Virtual Lake Nona Impact Forum “Health Innovation in the New Reality”
2021-04-28 Sound Capture and Speech Enhancement for Communication and Distant Speech Recognition
2021-04-27 Virtual Lake Nona Impact Forum “Health Innovation in the New Reality”
2021-04-26 FastNeRF: High-Fidelity Neural Rendering at 200FPS [Condensed]
2021-04-21 Research for Industries (RFI) Lecture Series: Warren Powell
2021-04-21 Research for Industries (RFI) Lecture Series: Andreas Haeberlen
2021-04-13 Discovering hidden connections in art with deep, interpretable visual analogies
2021-04-13 ZeRO & Fastest BERT: Increasing the scale and speed of deep learning training in DeepSpeed
2021-04-13 Interactive sound simulation: Rendering immersive soundscapes in games and virtual reality
2021-04-13 A prototype implementation of 4G packet gateway Microsoft Catapult FPGA platform
2021-04-12 Self-Tuning Networks: Amortizing the Hypergradient Computation for Hyperparameter Optimization
2021-04-06 Ultra-dense data storage and extreme parallelism with electronic-molecular systems
2021-04-06 Harmonizing the declarative and imperative in database systems
2021-04-06 Domain-specific language model pretraining for biomedical natural language processing
2021-03-30 Platform Biography: A framework for analyzing the structures and dynamics of social media
2021-03-30 Building multimodal, integrative AI systems with Platform for Situated Intelligence
2021-03-29 From player to creator: Designing video games on gaming handhelds with Microsoft TileCode webinar
2021-03-29 Camera-based non-contact health sensing
2021-03-29 Foundations of causal inference and its impacts on machine learning webinar
2021-03-29 Avatars: Finding a sense of self and others in the virtual world
2021-03-25 In pursuit of responsible AI: Bringing principles to practice