Imputer: Sequence Modelling via Imputation and Dynamic Programming

Video Link: https://www.youtube.com/watch?v=AU30czb4iQA
Duration: 18:15


The Imputer is a sequence-to-sequence model that strikes a balance between fully autoregressive models, which are slow to decode, and fully non-autoregressive models, whose fast inference comes at the cost of strong conditional independence assumptions. The Imputer decodes in a constant number of steps, independent of sequence length, and is trained with a dynamic programming algorithm that marginalizes over possible alignments and generation orders.

https://arxiv.org/abs/2002.08926
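To make the constant-step decoding idea concrete, here is a minimal Python sketch (not the authors' code): the alignment canvas is split into blocks of size B, and each of B iterations imputes the single most confident masked position in every block in parallel, so decoding always takes B steps no matter how long the sequence is. The `model(x, canvas)` callable, the MASK token, and the function name are assumptions for illustration.

```python
import numpy as np

MASK = -1  # hypothetical mask-token id, for illustration only

def block_decode(model, x, block_size=8):
    """Sketch of Imputer-style block decoding.

    `model(x, canvas)` is an assumed callable returning per-position
    logits of shape (T, vocab). Each of the `block_size` iterations
    imputes exactly one token per block, so decoding always takes
    `block_size` steps regardless of the sequence length T.
    """
    T = len(x)
    canvas = np.full(T, MASK)                      # start fully masked
    blocks = [range(s, min(s + block_size, T))
              for s in range(0, T, block_size)]
    for _ in range(block_size):                    # constant step count
        logits = model(x, canvas)                  # (T, vocab)
        conf = logits.max(axis=-1)                 # per-position confidence
        for blk in blocks:                         # all blocks in parallel
            masked = [i for i in blk if canvas[i] == MASK]
            if masked:                             # fill most confident slot
                i = max(masked, key=lambda j: conf[j])
                canvas[i] = logits[i].argmax()
    return canvas                                  # alignment incl. blanks
```

In the paper the canvas is an alignment that includes blank tokens and is collapsed CTC-style into the final output; the sketch omits that collapse step for brevity.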

Abstract:
This paper presents the Imputer, a neural sequence model that generates output sequences iteratively via imputations. The Imputer is an iterative generative model, requiring only a constant number of generation steps independent of the number of input or output tokens. The Imputer can be trained to approximately marginalize over all possible alignments between the input and output sequences, and all possible generation orders. We present a tractable dynamic programming training algorithm, which yields a lower bound on the log marginal likelihood. When applied to end-to-end speech recognition, the Imputer outperforms prior non-autoregressive models and achieves competitive results to autoregressive models. On LibriSpeech test-other, the Imputer achieves 11.1 WER, outperforming CTC at 13.0 WER and seq2seq at 12.5 WER.
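The paper's dynamic program lower-bounds the marginal over both alignments and generation orders; it builds on the same CTC-style forward recursion that sums over all monotonic alignments between T frames and a target sequence. A minimal sketch of that underlying recursion, assuming per-frame log-probabilities are given, might look like this (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def log_marginal_over_alignments(log_probs, y, blank=0):
    """CTC-style forward recursion: sums, in log space, over all
    monotonic alignments between T frames and the target sequence y.

    log_probs: (T, vocab) per-frame log-probabilities (assumed given).
    Returns a log marginal likelihood log p(y | x).
    """
    ext = [blank]                         # interleave blanks:
    for tok in y:                         # [_, y1, _, y2, ..., _]
        ext += [tok, blank]
    T, S = log_probs.shape[0], len(ext)
    alpha = np.full((T, S), -1e30)        # log-space forward table
    alpha[0, 0] = log_probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = log_probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            cands = [alpha[t - 1, s]]                 # stay on same symbol
            if s > 0:
                cands.append(alpha[t - 1, s - 1])     # advance one symbol
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[t - 1, s - 2])     # skip a blank
            alpha[t, s] = np.logaddexp.reduce(cands) + log_probs[t, ext[s]]
    if S == 1:
        return alpha[T - 1, 0]
    return np.logaddexp(alpha[T - 1, S - 1], alpha[T - 1, S - 2])
```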

Authors: William Chan, Chitwan Saharia, Geoffrey Hinton, Mohammad Norouzi, Navdeep Jaitly

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Tags:
deep learning
machine learning
nlp
natural language processing
machine translation
arxiv
google
attention mechanism
attention
transformer
seq2seq
autoregressive
independence
decoding