Weight Standardization (Paper Explained)

Subscribers: 286,000
Published on: 2020-05-15
Video Link: https://www.youtube.com/watch?v=p-zOeQCoG9c
Duration: 19:16
Views: 9,329
Likes: 333


It's common for neural networks to normalize their activations with layers such as BatchNorm or GroupNorm. This paper extends normalization to the weights of the network as well. This surprisingly simple change leads to a boost in performance and, combined with GroupNorm, new state-of-the-art results.

https://arxiv.org/abs/1903.10520

Abstract:
In this paper, we propose Weight Standardization (WS) to accelerate deep network training. WS is targeted at the micro-batch training setting where each GPU typically has only 1-2 images for training. The micro-batch training setting is hard because small batch sizes are not enough for training networks with Batch Normalization (BN), while other normalization methods that do not rely on batch knowledge still have difficulty matching the performance of BN in large-batch training. WS solves this problem because, when used with Group Normalization and trained with 1 image/GPU, WS is able to match or outperform the performance of BN trained with large batch sizes, with only 2 more lines of code. In micro-batch training, WS significantly outperforms other normalization methods. WS achieves these superior results by standardizing the weights in the convolutional layers, which we show is able to smooth the loss landscape by reducing the Lipschitz constants of the loss and the gradients. The effectiveness of WS is verified on many tasks, including image classification, object detection, instance segmentation, video recognition, semantic segmentation, and point cloud recognition. The code is available online.
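For readers curious what the "2 more lines of code" look like in practice, here is a minimal PyTorch-style sketch of Weight Standardization applied to a convolutional layer: before each forward pass, every output filter is re-centered to zero mean and rescaled to unit standard deviation over its (in_channels, kH, kW) entries. The class name WSConv2d and the epsilon value are illustrative assumptions, not the authors' reference implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    # Drop-in replacement for nn.Conv2d that standardizes its weights
    # before every convolution (a sketch of Weight Standardization).
    def forward(self, x):
        w = self.weight
        # Mean and std of each output filter over (in_channels, kH, kW).
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5  # eps avoids division by zero
        w = (w - mean) / std
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

# Typical pairing suggested by the abstract: WS together with GroupNorm.
block = nn.Sequential(
    WSConv2d(64, 128, kernel_size=3, padding=1),
    nn.GroupNorm(num_groups=32, num_channels=128),
    nn.ReLU(),
)

The paper's actual implementation may differ in details, e.g. where the epsilon is applied and whether a biased or unbiased standard deviation is used.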

Authors: Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher




Other Videos By Yannic Kilcher


2020-05-25  Deep image reconstruction from human brain activity (Paper Explained)
2020-05-24  Regularizing Trajectory Optimization with Denoising Autoencoders (Paper Explained)
2020-05-23  [News] The NeurIPS Broader Impact Statement
2020-05-22  When BERT Plays the Lottery, All Tickets Are Winning (Paper Explained)
2020-05-21  [News] OpenAI Model Generates Python Code
2020-05-20  Investigating Human Priors for Playing Video Games (Paper & Demo)
2020-05-19  iMAML: Meta-Learning with Implicit Gradients (Paper Explained)
2020-05-18  [Code] PyTorch sentiment classifier from scratch with Huggingface NLP Library (Full Tutorial)
2020-05-17  Planning to Explore via Self-Supervised World Models (Paper Explained)
2020-05-16  [News] Facebook's Real-Time TTS system runs on CPUs only!
2020-05-15  Weight Standardization (Paper Explained)
2020-05-14  [Trash] Automated Inference on Criminality using Face Images
2020-05-13  Faster Neural Network Training with Data Echoing (Paper Explained)
2020-05-12  Group Normalization (Paper Explained)
2020-05-11  Concept Learning with Energy-Based Models (Paper Explained)
2020-05-10  [News] Google’s medical AI was super accurate in a lab. Real life was a different story.
2020-05-09  Big Transfer (BiT): General Visual Representation Learning (Paper Explained)
2020-05-08  Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning (Paper Explained)
2020-05-07  WHO ARE YOU? 10k Subscribers Special (w/ Channel Analytics)
2020-05-06  Reinforcement Learning with Augmented Data (Paper Explained)
2020-05-05  TAPAS: Weakly Supervised Table Parsing via Pre-training (Paper Explained)



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
normalize
batchnorm
groupnorm
layernorm
mean
center
std
standardize
backpropagation
convergence
gradients
norm
convolution
cnn
convolutional neural networks
filters
kernel
channel
architecture