Big Transfer (BiT): General Visual Representation Learning (Paper Explained)

Subscribers: 284,000
Published on: 2020-05-09
Video Link: https://www.youtube.com/watch?v=k1GOF2jmX7c
Duration: 34:10
Views: 9,365
Likes: 302


One CNN to rule them all! BiT is a pre-trained ResNet that can be used as a starting point for any visual task. The paper explains what it takes to pre-train such a large model and how best to fine-tune it on downstream tasks.

Paper: https://arxiv.org/abs/1912.11370
Code & Models: TBA
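
Since the official code and models are listed as TBA, here is a minimal, hypothetical sketch of the basic recipe "start from a pre-trained ResNet and fine-tune the whole network on the downstream task". It uses a torchvision ResNet-50 pre-trained on ImageNet as a stand-in for the (much larger) BiT backbone; the names and hyperparameters are illustrative, not the paper's official release.

import torch
import torch.nn as nn
from torchvision import models

# Stand-in backbone: torchvision ResNet-50 pre-trained on ImageNet
# (BiT itself uses bigger ResNets pre-trained on much larger datasets).
backbone = models.resnet50(pretrained=True)

# Swap the classification head for one matching the downstream task,
# e.g. CIFAR-10 with 10 classes; all other weights are kept and fine-tuned.
num_classes = 10
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)

# Fine-tune the whole network with plain SGD (momentum, no weight decay),
# in the spirit of the paper's transfer recipe.
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.003, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    optimizer.zero_grad()
    loss = loss_fn(backbone(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()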

Abstract:
Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training, and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes -- from 1 example per class to 1M total examples. BiT achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on the 19 task Visual Task Adaptation Benchmark (VTAB). On small datasets, BiT attains 76.8% on ILSVRC-2012 with 10 examples per class, and 97.0% on CIFAR-10 with 10 examples per class. We conduct detailed analysis of the main components that lead to high transfer performance.
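
The "simple heuristic" mentioned in the abstract is the paper's BiT-HyperRule, which derives the whole fine-tuning schedule from the size of the downstream dataset. Below is a rough, from-memory reconstruction for illustration only; the exact thresholds, step counts and MixUp setting should be checked against the paper.

def bit_hyperrule(num_examples):
    # Approximate reconstruction of the schedule selection:
    # small (<20k examples), medium (<500k) and large tasks get
    # increasingly long fine-tuning schedules; MixUp only for the larger ones.
    if num_examples < 20_000:
        steps, use_mixup = 500, False
    elif num_examples < 500_000:
        steps, use_mixup = 10_000, True
    else:
        steps, use_mixup = 20_000, True
    # Base learning rate 0.003, decayed by 10x at 30%, 60% and 90% of training.
    decay_at = [int(steps * f) for f in (0.3, 0.6, 0.9)]
    return dict(steps=steps, base_lr=0.003, lr_decay_steps=decay_at, mixup=use_mixup)

# Example: CIFAR-10's 50k training images fall into the "medium" regime.
print(bit_hyperrule(50_000))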

Authors: Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher




Other Videos By Yannic Kilcher


2020-05-19  iMAML: Meta-Learning with Implicit Gradients (Paper Explained)
2020-05-18  [Code] PyTorch sentiment classifier from scratch with Huggingface NLP Library (Full Tutorial)
2020-05-17  Planning to Explore via Self-Supervised World Models (Paper Explained)
2020-05-16  [News] Facebook's Real-Time TTS system runs on CPUs only!
2020-05-15  Weight Standardization (Paper Explained)
2020-05-14  [Trash] Automated Inference on Criminality using Face Images
2020-05-13  Faster Neural Network Training with Data Echoing (Paper Explained)
2020-05-12  Group Normalization (Paper Explained)
2020-05-11  Concept Learning with Energy-Based Models (Paper Explained)
2020-05-10  [News] Google’s medical AI was super accurate in a lab. Real life was a different story.
2020-05-09  Big Transfer (BiT): General Visual Representation Learning (Paper Explained)
2020-05-08  Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning (Paper Explained)
2020-05-07  WHO ARE YOU? 10k Subscribers Special (w/ Channel Analytics)
2020-05-06  Reinforcement Learning with Augmented Data (Paper Explained)
2020-05-05  TAPAS: Weakly Supervised Table Parsing via Pre-training (Paper Explained)
2020-05-04  Chip Placement with Deep Reinforcement Learning (Paper Explained)
2020-05-03  I talk to the new Facebook Blender Chatbot
2020-05-02  Jukebox: A Generative Model for Music (Paper Explained)
2020-05-01  [ML Coding Tips] Separate Computation & Plotting using locals
2020-04-30  The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies (Paper Explained)
2020-04-29  Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask (Paper Explained)



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
google
brain
cnn
convolutional neural network
resnet
residual network
pretraining
finetuning
vtab
imagenet
cifar
state of the art
pretrained
computer vision