You May Not Need Attention | TDLS Code Review

Published on: 2019-03-04 ● Video Link: https://www.youtube.com/watch?v=7aBNB-PRSnw



Category: Review
Duration: 1:14:49
Views: 825
Likes: 11


Toronto Deep Learning Series - Code Review Stream
https://tdls.a-i.science/events/2019-03-04

You May Not Need Attention

"In NMT, how far can we get without attention and without separate encoding and decoding? To answer that question, we introduce a recurrent neural translation model that does not use attention and does not have a separate encoder and decoder. Our eager translation model is low-latency, writing target tokens as soon as it reads the first source token, and uses constant memory during decoding. It performs on par with the standard attention-based model of Bahdanau et al. (2014), and better on long sentences."




Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE


2019-04-04 [FFJORD] Free-form Continuous Dynamics for Scalable Reversible Generative Models (Part 1) | AISC
2019-04-01 [DOM-Q-NET] Grounded RL on Structured Language | AISC Author Speaking
2019-03-31 5-min [machine learning] paper challenge | AISC
2019-03-28 [Variational Autoencoder] Auto-Encoding Variational Bayes | AISC Foundational
2019-03-25 [GQN] Neural Scene Representation and Rendering | AISC
2019-03-21 Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples | AISC
2019-03-18 Understanding the Origins of Bias in Word Embeddings
2019-03-14 [Original Style Transfer] A Neural Algorithm of Artistic Style | TDLS Foundational
2019-03-11 [RecSys 2018 Challenge winner] Two-stage Model for Automatic Playlist Continuation at Scale | TDLS
2019-03-07 [OpenAI GPT2] Language Models are Unsupervised Multitask Learners | TDLS Trending Paper
2019-03-04 You May Not Need Attention | TDLS Code Review
2019-02-28 [DDQN] Deep Reinforcement Learning with Double Q-learning | TDLS Foundational
2019-02-25 [AlphaGo Zero] Mastering the game of Go without human knowledge | TDLS
2019-02-21 Transformer XL | AISC Trending Papers
2019-02-19 Computational prediction of diagnosis & feature selection on mesothelioma patient records | AISC
2019-02-18 Support Vector Machine (original paper) | AISC Foundational
2019-02-11 Tensor Field Networks | AISC
2019-02-07 ACAI: Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer
2019-02-04 Code Review: Transformer - Attention Is All You Need | AISC
2019-02-04 [StyleGAN] A Style-Based Generator Architecture for GANs, part 2 (results and discussion) | TDLS
2019-02-04 [StyleGAN] A Style-Based Generator Architecture for GANs, part 1 (algorithm review) | TDLS



Tags:
machine learning
deep learning
attention
transformer
lstm