RepNet: Counting Out Time - Class Agnostic Video Repetition Counting in the Wild (Paper Explained)

Channel: Yannic Kilcher

Subscribers: 301,000

Published on June 23, 2020 1:54:48 PM ● Video Link: https://www.youtube.com/watch?v=qSArFEIoSbo

Duration: 36:42

9,313 views

351

Counting repeated actions in a video is one of the easiest tasks for humans, yet it remains incredibly hard for machines. RepNet achieves state-of-the-art results by creating an information bottleneck in the form of a temporal self-similarity matrix, which relates video frames to each other in a way that forces the model to surface the information relevant for counting. Along with that, the authors produce a new dataset for evaluating counting models.
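To make the temporal self-similarity matrix concrete, here is a minimal NumPy sketch. It assumes per-frame embeddings are already available as a (num_frames, dim) array, and it assumes similarity is measured as negative squared Euclidean distance followed by a row-wise softmax. The function name, the temperature parameter, and the toy embeddings are illustrative and not taken from the official RepNet code.

```python
import numpy as np


def temporal_self_similarity(embeddings, temperature=1.0):
    """Relate every frame to every other frame.

    embeddings: array of shape (num_frames, dim), one vector per frame.
    Returns an array of shape (num_frames, num_frames) whose rows sum to 1.
    Assumption: similarity = -squared Euclidean distance, then row softmax.
    """
    emb = np.asarray(embeddings, dtype=np.float64)
    # Pairwise differences via broadcasting: (N, 1, D) - (1, N, D) -> (N, N, D)
    diff = emb[:, None, :] - emb[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    logits = -sq_dist / temperature
    # Subtract the row maximum for a numerically stable softmax
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


# Toy embeddings that repeat every 4 frames (points moving around a circle)
t = np.arange(16)
toy_embeddings = np.stack(
    [np.cos(2 * np.pi * t / 4), np.sin(2 * np.pi * t / 4)], axis=1
)
tsm = temporal_self_similarity(toy_embeddings)
print(tsm.shape)
```

In this toy case, frame 0 is as similar to frame 4 as it is to itself, so the matrix shows diagonal stripes spaced one period apart. That stripe pattern is the kind of structure a downstream period predictor can read off without knowing what the action is.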

OUTLINE:
0:00 - Intro & Overview
2:30 - Problem Statement
5:15 - Output & Loss
6:25 - Per-Frame Embeddings
11:20 - Temporal Self-Similarity Matrix
19:00 - Periodicity Predictor
25:50 - Architecture Recap
27:00 - Synthetic Dataset
30:15 - Countix Dataset
31:10 - Experiments
33:35 - Applications
35:30 - Conclusion & Comments

Paper Website: https://sites.google.com/view/repnet
Colab: https://colab.research.google.com/github/google-research/google-research/blob/master/repnet/repnet_colab.ipynb

Abstract:
We present an approach for estimating the period with which an action is repeated in a video. The crux of the approach lies in constraining the period prediction module to use temporal self-similarity as an intermediate representation bottleneck that allows generalization to unseen repetitions in videos in the wild. We train this model, called RepNet, with a synthetic dataset that is generated from a large unlabeled video collection by sampling short clips of varying lengths and repeating them with different periods and counts. This combination of synthetic data and a powerful yet constrained model allows us to predict periods in a class-agnostic fashion. Our model substantially exceeds the state-of-the-art performance on existing periodicity (PERTUBE) and repetition counting (QUVA) benchmarks. We also collect a new challenging dataset called Countix (~90 times larger than existing datasets) which captures the challenges of repetition counting in real-world videos.
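The synthetic-data idea in the abstract (sample a short clip, then repeat it with a chosen period and count) can be sketched roughly as follows. This is a simplified illustration, not the authors' pipeline: the function name, its parameters, and the nearest-frame resampling are assumptions, and anything the paper does beyond what the abstract states is omitted.

```python
import numpy as np


def make_synthetic_repetition(video, clip_len, period, count, rng):
    """Build a repeating video from an arbitrary source video.

    video: array of shape (T, H, W, C).
    clip_len: number of source frames to sample for one cycle.
    period: number of frames one cycle should last in the output.
    count: how many times the cycle is repeated.
    Returns (frames, per_frame_period). All names here are hypothetical.
    """
    num_frames = video.shape[0]
    if clip_len > num_frames:
        raise ValueError("clip_len must not exceed the video length")
    # Sample a random short clip from the source video
    start = rng.integers(0, num_frames - clip_len + 1)
    clip = video[start:start + clip_len]
    # Resample the clip to exactly `period` frames (nearest-frame sampling)
    idx = np.linspace(0, clip_len - 1, num=period).round().astype(int)
    one_cycle = clip[idx]
    # Repeat the cycle `count` times along the time axis
    frames = np.concatenate([one_cycle] * count, axis=0)
    # Every frame lies inside a repetition, so it is labeled with the period
    per_frame_period = np.full(frames.shape[0], period)
    return frames, per_frame_period


rng = np.random.default_rng(0)
source = rng.random((30, 4, 4, 3))  # stand-in for a real unlabeled video
frames, labels = make_synthetic_repetition(
    source, clip_len=10, period=5, count=3, rng=rng
)
print(frames.shape, labels[:3])
```

Because the period and count are chosen by the generator, the labels come for free, which is what lets training proceed on unlabeled video.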

Authors: Debidatta Dwibedi, Yusuf Aytar, Jonathan Tompson, Pierre Sermanet, Andrew Zisserman

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Other Videos By Yannic Kilcher

2020-07-03	On the Measure of Intelligence by François Chollet - Part 4: The ARC Challenge (Paper Explained)
2020-07-02	BERTology Meets Biology: Interpreting Attention in Protein Language Models (Paper Explained)
2020-07-01	GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (Paper Explained)
2020-06-30	Object-Centric Learning with Slot Attention (Paper Explained)
2020-06-29	Set Distribution Networks: a Generative Model for Sets of Images (Paper Explained)
2020-06-28	Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection (Paper Explained)
2020-06-27	Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures (Paper Explained)
2020-06-26	On the Measure of Intelligence by François Chollet - Part 3: The Math (Paper Explained)
2020-06-25	Discovering Symbolic Models from Deep Learning with Inductive Biases (Paper Explained)
2020-06-24	How I Read a Paper: Facebook's DETR (Video Tutorial)
2020-06-23	RepNet: Counting Out Time - Class Agnostic Video Repetition Counting in the Wild (Paper Explained)
2020-06-22	[Drama] Yann LeCun against Twitter on Dataset Bias
2020-06-21	SIREN: Implicit Neural Representations with Periodic Activation Functions (Paper Explained)
2020-06-20	Big Self-Supervised Models are Strong Semi-Supervised Learners (Paper Explained)
2020-06-19	On the Measure of Intelligence by François Chollet - Part 2: Human Priors (Paper Explained)
2020-06-18	Image GPT: Generative Pretraining from Pixels (Paper Explained)
2020-06-17	BYOL: Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (Paper Explained)
2020-06-16	TUNIT: Rethinking the Truly Unsupervised Image-to-Image Translation (Paper Explained)
2020-06-15	A bio-inspired bistable recurrent cell allows for long-lasting memory (Paper Explained)
2020-06-14	SynFlow: Pruning neural networks without any data by iteratively conserving synaptic flow
2020-06-13	Deep Differential System Stability - Learning advanced computations from examples (Paper Explained)

Tags: deep learning, machine learning, arxiv, explained, neural networks, artificial intelligence, paper, vision, counting, self-similarity, temporal, frames, video, repeating, lines, transformer, attention, cnn, convolutional neural network, repetitions, periodicity, period, repeat, actions, kinetics, countix
