SynFlow: Pruning neural networks without any data by iteratively conserving synaptic flow

Subscribers:
284,000
Published on ● Video Link: https://www.youtube.com/watch?v=8l-TDqpoUQs



Duration: 44:53
16,208 views
661


The Lottery Ticket Hypothesis has shown that it's theoretically possible to prune a neural network at the beginning of training and still achieve good performance, if we only knew which weights to prune away. This paper does not only explain where other attempts at pruning fail, but provides an algorithm that provably reaches maximum compression capacity, all without looking at any data!

OUTLINE:
0:00 - Intro & Overview
1:00 - Pruning Neural Networks
3:40 - Lottery Ticket Hypothesis
6:00 - Paper Story Overview
9:45 - Layer Collapse
18:15 - Synaptic Saliency Conservation
23:25 - Connecting Layer Collapse & Saliency Conservation
28:30 - Iterative Pruning avoids Layer Collapse
33:20 - The SynFlow Algorithm
40:45 - Experiments
43:35 - Conclusion & Comments

Paper: https://arxiv.org/abs/2006.05467
Code: https://github.com/ganguli-lab/Synaptic-Flow
My Video on the Lottery Ticket Hypothesis: https://youtu.be/ZVVnvZdUMUk
Street Talk about LTH: https://youtu.be/SfjJoevBbjU

Abstract:
Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy both during training and at test time. Recent works have identified, through an expensive sequence of training and pruning cycles, the existence of winning lottery tickets or sparse trainable subnetworks at initialization. This raises a foundational question: can we identify highly sparse trainable subnetworks at initialization, without ever training, or indeed without ever looking at the data? We provide an affirmative answer to this question through theory driven algorithm design. We first mathematically formulate and experimentally verify a conservation law that explains why existing gradient-based pruning algorithms at initialization suffer from layer-collapse, the premature pruning of an entire layer rendering a network untrainable. This theory also elucidates how layer-collapse can be entirely avoided, motivating a novel pruning algorithm Iterative Synaptic Flow Pruning (SynFlow). This algorithm can be interpreted as preserving the total flow of synaptic strengths through the network at initialization subject to a sparsity constraint. Notably, this algorithm makes no reference to the training data and consistently outperforms existing state-of-the-art pruning algorithms at initialization over a range of models (VGG and ResNet), datasets (CIFAR-10/100 and Tiny ImageNet), and sparsity constraints (up to 99.9 percent). Thus our data-agnostic pruning algorithm challenges the existing paradigm that data must be used to quantify which synapses are important.

Authors: Hidenori Tanaka, Daniel Kunin, Daniel L. K. Yamins, Surya Ganguli

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher




Other Videos By Yannic Kilcher


2020-06-24How I Read a Paper: Facebook's DETR (Video Tutorial)
2020-06-23RepNet: Counting Out Time - Class Agnostic Video Repetition Counting in the Wild (Paper Explained)
2020-06-22[Drama] Yann LeCun against Twitter on Dataset Bias
2020-06-21SIREN: Implicit Neural Representations with Periodic Activation Functions (Paper Explained)
2020-06-20Big Self-Supervised Models are Strong Semi-Supervised Learners (Paper Explained)
2020-06-19On the Measure of Intelligence by François Chollet - Part 2: Human Priors (Paper Explained)
2020-06-18Image GPT: Generative Pretraining from Pixels (Paper Explained)
2020-06-17BYOL: Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (Paper Explained)
2020-06-16TUNIT: Rethinking the Truly Unsupervised Image-to-Image Translation (Paper Explained)
2020-06-15A bio-inspired bistable recurrent cell allows for long-lasting memory (Paper Explained)
2020-06-14SynFlow: Pruning neural networks without any data by iteratively conserving synaptic flow
2020-06-13Deep Differential System Stability - Learning advanced computations from examples (Paper Explained)
2020-06-12VirTex: Learning Visual Representations from Textual Annotations (Paper Explained)
2020-06-11Linformer: Self-Attention with Linear Complexity (Paper Explained)
2020-06-10End-to-End Adversarial Text-to-Speech (Paper Explained)
2020-06-09TransCoder: Unsupervised Translation of Programming Languages (Paper Explained)
2020-06-08JOIN ME for the NeurIPS 2020 Flatland Multi-Agent RL Challenge!
2020-06-07BLEURT: Learning Robust Metrics for Text Generation (Paper Explained)
2020-06-06Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search (Paper Explained)
2020-06-05CornerNet: Detecting Objects as Paired Keypoints (Paper Explained)
2020-06-04Movement Pruning: Adaptive Sparsity by Fine-Tuning (Paper Explained)



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
initialization
lottery ticket hypothesis
pruning
training
magnitude
snip
grasp
init
xavier
glorot
he
flow
layer collapse
iterative
recompute
stepwise
memory
fast
prune
weights
feedforward
layer
neural network