Object-Centric Learning with Slot Attention (Paper Explained)

Published on: 2020-06-30
Video Link: https://www.youtube.com/watch?v=DYBmD88vpiA



Duration: 42:39


Visual scenes are often composed of sets of independent objects. Yet, current vision models make no assumptions about the nature of the pictures they look at. By imposing an objectness prior, this paper introduces a module that can recognize permutation-invariant sets of objects from pixels in both supervised and unsupervised settings. It does so via a slot attention module that combines an attention mechanism with dynamic routing.
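
To make the mechanism concrete, here is a minimal PyTorch sketch of the Slot Attention module, following the paper's Algorithm 1. The class name, dimension defaults, and hyperparameters here are illustrative assumptions, not the authors' reference implementation:

import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    def __init__(self, num_slots=7, dim=64, iters=3, hidden_dim=128, eps=1e-8):
        super().__init__()
        self.num_slots = num_slots
        self.iters = iters
        self.eps = eps
        self.scale = dim ** -0.5

        # All slots share one learned Gaussian initializer, which makes them
        # exchangeable: any slot can bind to any object.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_log_sigma = nn.Parameter(torch.zeros(1, 1, dim))

        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)

        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, dim))

        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs, num_slots=None):
        # inputs: (batch, num_inputs, dim), e.g. a flattened CNN feature map
        # with positional embeddings already added.
        b, n, d = inputs.shape
        k_slots = num_slots if num_slots is not None else self.num_slots

        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)

        # Sample the initial slots from the learned Gaussian.
        slots = self.slots_mu + self.slots_log_sigma.exp() * torch.randn(
            b, k_slots, d, device=inputs.device)

        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))

            # Softmax over the *slot* axis: slots compete for input features.
            attn = torch.einsum('bnd,bkd->bnk', k, q) * self.scale
            attn = attn.softmax(dim=-1) + self.eps
            # Normalize over inputs so each slot takes a weighted mean.
            attn = attn / attn.sum(dim=1, keepdim=True)
            updates = torch.einsum('bnk,bnd->bkd', attn, v)

            # Recurrent GRU update plus a residual MLP, as in the paper.
            slots = self.gru(updates.reshape(-1, d),
                             slots_prev.reshape(-1, d)).reshape(b, k_slots, d)
            slots = slots + self.mlp(self.norm_mlp(slots))
        return slots

The softmax over the slot axis (rather than the input axis, as in standard attention) is what creates the competition between slots: the attention weights for each input feature must sum to one across slots.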

OUTLINE:
0:00 - Intro & Overview
1:40 - Problem Formulation
4:30 - Slot Attention Architecture
13:30 - Slot Attention Algorithm
21:30 - Iterative Routing Visualization
29:15 - Experiments
36:20 - Inference Time Flexibility
38:35 - Broader Impact Statement
42:05 - Conclusion & Comments

Paper: https://arxiv.org/abs/2006.15055

My Video on Facebook's DETR: https://youtu.be/T35ba_VXkMY
My Video on Attention: https://youtu.be/iDulhoQ2pro
My Video on Capsules: https://youtu.be/nXGHJTtFYRU

Abstract:
Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features. Yet, most deep learning approaches learn distributed representations that do not capture the compositional properties of natural scenes. In this paper, we present the Slot Attention module, an architectural component that interfaces with perceptual representations such as the output of a convolutional neural network and produces a set of task-dependent abstract representations which we call slots. These slots are exchangeable and can bind to any object in the input by specializing through a competitive procedure over multiple rounds of attention. We empirically demonstrate that Slot Attention can extract object-centric representations that enable generalization to unseen compositions when trained on unsupervised object discovery and supervised property prediction tasks.
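
For context, a hypothetical usage of the sketch above with made-up shapes: since every slot is sampled from the same learned distribution, a trained module can be queried with a different number of slots at test time, which is the inference-time flexibility discussed at 36:20.

# Imagine a CNN encoder that produced a 32x32 feature map with 64 channels.
feats = torch.randn(2, 32 * 32, 64)      # (batch, H*W, dim)
module = SlotAttention(num_slots=7, dim=64)

slots = module(feats)                     # (2, 7, 64): one vector per slot
# Slots share a single learned initializer, so the slot count can be
# changed at test time without retraining.
slots11 = module(feats, num_slots=11)     # (2, 11, 64)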

Authors: Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher




Other Videos By Yannic Kilcher


2020-07-10 Gradient Origin Networks (Paper Explained w/ Live Coding)
2020-07-09 NVAE: A Deep Hierarchical Variational Autoencoder (Paper Explained)
2020-07-08 Addendum for Supermasks in Superposition: A Closer Look (Paper Explained)
2020-07-07 SupSup: Supermasks in Superposition (Paper Explained)
2020-07-06 [Live Machine Learning Research] Plain Self-Ensembles (I actually DISCOVER SOMETHING) - Part 1
2020-07-05 SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization (Paper Explained)
2020-07-04 Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention (Paper Explained)
2020-07-03 On the Measure of Intelligence by François Chollet - Part 4: The ARC Challenge (Paper Explained)
2020-07-02 BERTology Meets Biology: Interpreting Attention in Protein Language Models (Paper Explained)
2020-07-01 GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (Paper Explained)
2020-06-30 Object-Centric Learning with Slot Attention (Paper Explained)
2020-06-29 Set Distribution Networks: a Generative Model for Sets of Images (Paper Explained)
2020-06-28 Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection (Paper Explained)
2020-06-27 Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures (Paper Explained)
2020-06-26 On the Measure of Intelligence by François Chollet - Part 3: The Math (Paper Explained)
2020-06-25 Discovering Symbolic Models from Deep Learning with Inductive Biases (Paper Explained)
2020-06-24 How I Read a Paper: Facebook's DETR (Video Tutorial)
2020-06-23 RepNet: Counting Out Time - Class Agnostic Video Repetition Counting in the Wild (Paper Explained)
2020-06-22 [Drama] Yann LeCun against Twitter on Dataset Bias
2020-06-21 SIREN: Implicit Neural Representations with Periodic Activation Functions (Paper Explained)
2020-06-20 Big Self-Supervised Models are Strong Semi-Supervised Learners (Paper Explained)



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
google
ethz
vision
objects
slots
attention mechanism
gru
lstm
routing
capsules
permutation invariant
encoder
set
detr
embeddings
transformer
weight sharing
disentanglement
render
tetris
clevr
cnn
convolutional neural network
attention