Neural Architecture Search without Training (Paper Explained)

Channel: Yannic Kilcher

Subscribers: 291,000

Published on July 21, 2020 1:00:01 PM ● Video Link: https://www.youtube.com/watch?v=a6v92P0EbJc

Duration: 35:06

24,790 views

1,060

#ai #research #machinelearning

Neural Architecture Search is typically very slow and resource-intensive. A meta-controller has to train many hundreds or thousands of different models to find a suitable building plan. This paper proposes to use statistics of the Jacobian around data points to estimate the performance of proposed architectures at initialization. This method does not require training and speeds up NAS by orders of magnitude.
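To make "the Jacobian around data points" concrete, here is a minimal sketch in Python with NumPy. It assumes a toy two-layer ReLU network, which is a hypothetical stand-in and not one of the architectures from the paper. The names `init_mlp`, `forward` and `input_jacobian` are made up for this illustration. For a ReLU network, the Jacobian with respect to the input is the linear map the network applies in the neighbourhood of a given data point.

```python
import numpy as np

def init_mlp(rng, d_in, d_hidden, d_out):
    """Randomly initialise a two-layer ReLU network (hypothetical toy model)."""
    w1 = rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(d_hidden, d_in))
    b1 = rng.normal(0.0, 0.1, size=d_hidden)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(d_hidden), size=(d_out, d_hidden))
    return w1, b1, w2

def forward(params, x):
    """Network output f(x) = W2 relu(W1 x + b1)."""
    w1, b1, w2 = params
    return w2 @ np.maximum(w1 @ x + b1, 0.0)

def input_jacobian(params, x):
    """Jacobian of f with respect to x: W2 @ diag(active units) @ W1."""
    w1, b1, w2 = params
    active = (w1 @ x + b1 > 0).astype(w1.dtype)  # 1 where the ReLU is on
    return (w2 * active) @ w1                    # shape (d_out, d_in)

rng = np.random.default_rng(0)
params = init_mlp(rng, d_in=8, d_hidden=32, d_out=4)
x = rng.normal(size=8)
jac = input_jacobian(params, x)

# For a small step eps, f(x + eps) should match f(x) + J eps closely,
# because the network is affine as long as no ReLU switches on or off.
eps = 1e-6 * rng.normal(size=8)
lin_error = np.max(np.abs(forward(params, x + eps)
                          - (forward(params, x) + jac @ eps)))
```

No training step appears anywhere: the Jacobian is computed directly from the randomly initialised weights, which is what makes this kind of statistic cheap.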

OUTLINE:
0:00 - Intro & Overview
0:50 - Neural Architecture Search
4:15 - Controller-based NAS
7:35 - Architecture Search Without Training
9:30 - Linearization Around Datapoints
14:10 - Linearization Statistics
19:00 - NAS-201 Benchmark
20:15 - Experiments
34:15 - Conclusion & Comments

Paper: https://arxiv.org/abs/2006.04647
Code: https://github.com/BayesWatch/nas-without-training

Abstract:
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be extremely slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be remedied if we could infer a network's trained accuracy from its initial state. In this work, we examine how the linear maps induced by data points correlate for untrained network architectures in the NAS-Bench-201 search space, and motivate how this can be used to give a measure of modelling flexibility which is highly indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU. Code to reproduce our experiments is available at https://github.com/BayesWatch/nas-without-training.
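The search idea in the abstract can be sketched roughly as follows, in Python with NumPy. Everything here is a simplification under stated assumptions: the candidate "architectures" are just hidden widths of a toy two-layer ReLU network rather than NAS-Bench-201 cells, the helper names (`batch_jacobians`, `jacobian_score`, `search`) are hypothetical, and the particular score (largest when the correlation matrix of the per-datapoint Jacobians is close to the identity) is an assumed form chosen for illustration. The linked repository is the reference for what the authors actually do.

```python
import numpy as np

def batch_jacobians(w1, b1, w2, xs):
    """Flattened input-Jacobian of a two-layer ReLU net for each row of xs."""
    active = (xs @ w1.T + b1 > 0).astype(w1.dtype)       # (n, hidden)
    jacs = np.einsum("oh,nh,hi->noi", w2, active, w1)    # (n, out, in)
    return jacs.reshape(len(xs), -1)

def jacobian_score(jacs, k=1e-5):
    """Assumed score: high when the linear maps of different data points
    are weakly correlated, very low when they are nearly identical."""
    corr = np.corrcoef(jacs)                  # (n, n), one row per data point
    eig = np.clip(np.linalg.eigvalsh(corr), 0.0, None)
    return float(-np.sum(np.log(eig + k) + 1.0 / (eig + k)))

def search(candidate_widths, xs, d_out, seed=0):
    """Score each untrained candidate once and return the best-scoring one."""
    rng = np.random.default_rng(seed)
    d_in = xs.shape[1]
    scores = {}
    for width in candidate_widths:
        w1 = rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(width, d_in))
        b1 = rng.normal(0.0, 0.1, size=width)
        w2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(d_out, width))
        scores[width] = jacobian_score(batch_jacobians(w1, b1, w2, xs))
    best = max(scores, key=scores.get)
    return best, scores

rng = np.random.default_rng(1)
xs = rng.normal(size=(16, 8))                 # a stand-in minibatch
best_width, scores = search([32, 64, 128], xs, d_out=4)
```

Each candidate costs one minibatch of Jacobians and one small eigendecomposition, with no gradient steps. Whether such a score tracks trained accuracy is the empirical claim the paper tests; this sketch does not demonstrate it.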

Authors: Joseph Mellor, Jack Turner, Amos Storkey, Elliot J. Crowley

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar (preferred to Patreon): https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Other Videos By Yannic Kilcher

2020-08-18	[Rant] REVIEWER #2: How Peer Review is FAILING in Machine Learning
2020-08-14	REALM: Retrieval-Augmented Language Model Pre-Training (Paper Explained)
2020-08-12	Meta-Learning through Hebbian Plasticity in Random Networks (Paper Explained)
2020-08-09	Hopfield Networks is All You Need (Paper Explained)
2020-08-06	I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)
2020-08-04	PCGRL: Procedural Content Generation via Reinforcement Learning (Paper Explained)
2020-08-02	Big Bird: Transformers for Longer Sequences (Paper Explained)
2020-07-29	Self-training with Noisy Student improves ImageNet classification (Paper Explained)
2020-07-26	[Classic] Playing Atari with Deep Reinforcement Learning (Paper Explained)
2020-07-23	[Classic] ImageNet Classification with Deep Convolutional Neural Networks (Paper Explained)
2020-07-21	Neural Architecture Search without Training (Paper Explained)
2020-07-19	[Classic] Generative Adversarial Networks (Paper Explained)
2020-07-16	[Classic] Word2Vec: Distributed Representations of Words and Phrases and their Compositionality
2020-07-14	[Classic] Deep Residual Learning for Image Recognition (Paper Explained)
2020-07-12	I'M TAKING A BREAK... (Channel Update July 2020)
2020-07-11	Deep Ensembles: A Loss Landscape Perspective (Paper Explained)
2020-07-10	Gradient Origin Networks (Paper Explained w/ Live Coding)
2020-07-09	NVAE: A Deep Hierarchical Variational Autoencoder (Paper Explained)
2020-07-08	Addendum for Supermasks in Superposition: A Closer Look (Paper Explained)
2020-07-07	SupSup: Supermasks in Superposition (Paper Explained)
2020-07-06	[Live Machine Learning Research] Plain Self-Ensembles (I actually DISCOVER SOMETHING) - Part 1

Tags: deep learning, machine learning, arxiv, explained, neural networks, artificial intelligence, paper, nas, nas-bench, architecture search, initialization, untrained, cifar10, imagenet, neural architecture search, controller, rnn, correlation, gradient, jacobian, linearization
