Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation (Paper Explained)

Channel: Yannic Kilcher

Subscribers: 300,000

Published on August 28, 2020 1:06:20 PM ● Video Link: https://www.youtube.com/watch?v=hv3UO3G0Ofo

Duration: 55:44

14,309 views

444

#ai #machinelearning #attention

Convolutional Neural Networks have dominated image processing for the last decade, but transformers are quickly replacing traditional models. This paper proposes a fully attentional model for images by combining learned positional embeddings with axial attention. The new model competes with CNNs on image classification and achieves state-of-the-art results on several image segmentation tasks.

OUTLINE:
0:00 - Intro & Overview
4:10 - This Paper's Contributions
6:20 - From Convolution to Self-Attention for Images
16:30 - Learned Positional Embeddings
24:20 - Propagating Positional Embeddings through Layers
27:00 - Traditional vs Position-Augmented Attention
31:10 - Axial Attention
44:25 - Replacing Convolutions in ResNet
46:10 - Experimental Results & Examples

Paper: https://arxiv.org/abs/2003.07853
Code: https://github.com/csrhddlam/axial-deeplab

My Video on BigBird: https://youtu.be/WVPE62Gk3EM
My Video on ResNet: https://youtu.be/GWt6Fu05voI
My Video on Attention: https://youtu.be/iDulhoQ2pro

Abstract:
Convolution exploits locality for efficiency at a cost of missing long range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent works prove it possible to stack self-attention layers to obtain a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computation complexity and allows performing attention within a larger or even global region. In companion, we also propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that one could stack to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves 2.8% PQ over bottom-up state-of-the-art on COCO test-dev. This previous state-of-the-art is attained by our small variant that is 3.8x parameter-efficient and 27x computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes.
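To make the factorization in the abstract concrete, here is a minimal sketch of axial attention in plain Python. It is an illustration under simplifying assumptions, not the paper's layer: queries, keys and values are the raw input features (no learned projections, no multiple heads, no score scaling), and the position-sensitive terms are left out entirely. The function names `attend_1d`, `axial_attention` and `pair_counts` are made up for this example, and the height-then-width order is a choice made for the sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_1d(seq):
    """Single-head self-attention over a 1D sequence of feature vectors.

    Assumption for this sketch: each vector serves as its own query,
    key and value, so there are no learned parameters.
    """
    dim = len(seq[0])
    out = []
    for query in seq:
        # Dot-product similarity between this position and every position.
        scores = [sum(q * k for q, k in zip(query, key)) for key in seq]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, seq))
                    for d in range(dim)])
    return out


def axial_attention(image):
    """Apply 1D attention along the height axis, then along the width axis.

    `image[h][w]` is the feature vector at row h, column w.
    """
    height, width = len(image), len(image[0])
    # Height axis: every column is treated as an independent sequence.
    cols = [attend_1d([image[h][w] for h in range(height)])
            for w in range(width)]
    after_height = [[cols[w][h] for w in range(width)]
                    for h in range(height)]
    # Width axis: every row is treated as an independent sequence.
    return [attend_1d(row) for row in after_height]


def pair_counts(height, width):
    """Query-key pairs scored by full 2D attention vs. the two axial passes."""
    positions = height * width
    full = positions * positions
    axial = positions * (height + width)
    return full, axial
```

Full 2D attention scores every position against every other position, which is (HW)^2 pairs, while the two axial passes score HW(H + W) pairs. The ratio is HW / (H + W), so on a 224 x 224 grid of positions the axial version scores 112 times fewer pairs. After both passes, information from any position can still reach any other position, because the height pass mixes within columns and the width pass then mixes across them.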

Authors: Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Other Videos By Yannic Kilcher

2020-11-02	Language Models are Open Knowledge Graphs (Paper Explained)
2020-10-26	Rethinking Attention with Performers (Paper Explained)
2020-10-17	LambdaNetworks: Modeling long-range Interactions without Attention (Paper Explained)
2020-10-11	Descending through a Crowded Valley -- Benchmarking Deep Learning Optimizers (Paper Explained)
2020-10-04	An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Paper Explained)
2020-10-03	Training more effective learned optimizers, and using them to train themselves (Paper Explained)
2020-09-18	The Hardware Lottery (Paper Explained)
2020-09-13	Assessing Game Balance with AlphaZero: Exploring Alternative Rule Sets in Chess (Paper Explained)
2020-09-07	Learning to summarize from human feedback (Paper Explained)
2020-09-02	Self-classifying MNIST Digits (Paper Explained)
2020-08-28	Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation (Paper Explained)
2020-08-26	Radioactive data: tracing through training (Paper Explained)
2020-08-23	Fast reinforcement learning with generalized policy updates (Paper Explained)
2020-08-20	What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study (Paper Explained)
2020-08-18	[Rant] REVIEWER #2: How Peer Review is FAILING in Machine Learning
2020-08-14	REALM: Retrieval-Augmented Language Model Pre-Training (Paper Explained)
2020-08-12	Meta-Learning through Hebbian Plasticity in Random Networks (Paper Explained)
2020-08-09	Hopfield Networks is All You Need (Paper Explained)
2020-08-06	I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)
2020-08-04	PCGRL: Procedural Content Generation via Reinforcement Learning (Paper Explained)
2020-08-02	Big Bird: Transformers for Longer Sequences (Paper Explained)

Tags: deep learning, machine learning, arxiv, explained, neural networks, artificial intelligence, paper, google, cnn, resnet, big bird, bigbird, attention, attention mechanism, attention for images, transformer for images, transformer, bert, convolutions, window, neighbors, axial attention, position embeddings, positional encodings, quadratic, memory, panoptic segmentation, coco, imagenet, cityscapes, softmax, routing
