You May Not Need Attention | TDLS Code Review
Video Link: https://www.youtube.com/watch?v=7aBNB-PRSnw
Toronto Deep Learning Series - Code Review Stream
https://tdls.a-i.science/events/2019-03-04
You May Not Need Attention
"In NMT, how far can we get without attention and without separate encoding and decoding? To answer that question, we introduce a recurrent neural translation model that does not use attention and does not have a separate encoder and decoder. Our eager translation model is low-latency, writing target tokens as soon as it reads the first source token, and uses constant memory during decoding. It performs on par with the standard attention-based model of Bahdanau et al. (2014), and better on long sentences."
Tags: machine learning, deep learning, attention, transformer, lstm