[Transformer] Attention Is All You Need | AISC Foundational

Published on: 22 October 2018
Video Link: https://www.youtube.com/watch?v=S0KakHcj_rs
Duration: 54:13
Views: 31,552
Likes: 687

For slides and more information, visit https://aisc.ai.science/events/2018-10-22

Paper: https://arxiv.org/abs/1706.03762

Speaker: Joseph Palermo (Dessa)

Host: Insight
Date: Oct 22nd, 2018

Attention Is All You Need

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
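
For viewers who want the gist of the building block the talk covers, below is a minimal NumPy sketch of the paper's scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. The toy shapes and random inputs are illustrative only and are not taken from the talk or the paper.

import numpy as np

# Minimal sketch of scaled dot-product attention:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                           # query/key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                                        # weighted sum of values

# Toy self-attention: queries, keys, and values all come from the same sequence.
x = np.random.randn(5, 8)        # 5 tokens, model dimension 8 (illustrative sizes)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)                 # (5, 8)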




Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE


2018-11-30 Visualizing Data using t-SNE (algorithm) | AISC Foundational
2018-11-30 Visualizing Data using t-SNE (discussions) | AISC Foundational
2018-11-27 [BERT] Pretrained Deep Bidirectional Transformers for Language Understanding (discussions) | TDLS
2018-11-27 [BERT] Pretrained Deep Bidirectional Transformers for Language Understanding (algorithm) | TDLS
2018-11-27 Neural Image Caption Generation with Visual Attention (algorithm) | AISC
2018-11-27 Neural Image Caption Generation with Visual Attention (discussion) | AISC
2018-11-17 PGGAN | Progressive Growing of GANs for Improved Quality, Stability, and Variation (part 2) | AISC
2018-11-16 PGGAN | Progressive Growing of GANs for Improved Quality, Stability, and Variation (part 1) | AISC
2018-11-16 (Original Paper) Latent Dirichlet Allocation (discussions) | AISC Foundational
2018-11-15 (Original Paper) Latent Dirichlet Allocation (algorithm) | AISC Foundational
2018-10-31 [Transformer] Attention Is All You Need | AISC Foundational
2018-10-25 [Original attention] Neural Machine Translation by Jointly Learning to Align and Translate | AISC
2018-10-16 [StackGAN++] Realistic Image Synthesis with Stacked Generative Adversarial Networks | AISC
2018-10-11 Bayesian Deep Learning on a Quantum Computer | TDLS Author Speaking
2018-10-02 Prediction of Cardiac Arrest from Physiological Signals in the Pediatric ICU | TDLS Author Speaking
2018-09-24 Junction Tree Variational Autoencoder for Molecular Graph Generation | TDLS
2018-09-19 Reconstructing quantum states with generative models | TDLS Author Speaking
2018-09-13 All-optical machine learning using diffractive deep neural networks | TDLS
2018-09-05 Recurrent Models of Visual Attention | TDLS
2018-08-28 Eve: A Gradient Based Optimization Method with Locally and Globally Adaptive Learning Rates | TDLS
2018-08-20 TDLS: Large-Scale Unsupervised Deep Representation Learning for Brain Structure



Tags:
nlp
natural language processing
sequence to sequence models
neural attention
attention is all you need
deep learning
machine learning
ai
artificial intelligence
transformer model
self attention
attention
transformer deep learning
transformer network