XLNet: Generalized Autoregressive Pretraining for Language Understanding | AISC

Channel:

LLMs Explained - Aggregate Intellect - AI.SCIENCE

Subscribers:

22,600

Published on August 7, 2019 12:32:15 AM ● Video Link: https://www.youtube.com/watch?v=Mgck4XFR9GA

Duration: 1:45:12

2,642 views

For slides and more information on the paper, visit https://aisc.ai.science/events/2019-08-06

Discussion lead: Alec Robinson

Motivation:
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.
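
To make point (1) above more concrete, here is a minimal sketch of how a single factorization order could determine which positions each token is allowed to condition on. It is an illustration only, not XLNet's actual code: the function name `permutation_mask`, the example sentence, and the chosen order are all assumptions made for this sketch, and it leaves out the two-stream attention and the Transformer-XL recurrence that the full model uses.

```python
def permutation_mask(order):
    """Build a visibility mask for one factorization order.

    `order` lists token positions in the order they are predicted.
    mask[i][j] is True when position i may condition on position j,
    i.e. when j is predicted strictly earlier than i in `order`.
    (Hypothetical helper for illustration, not part of any library.)
    """
    n = len(order)
    rank = [0] * n
    for step, pos in enumerate(order):
        rank[pos] = step  # step at which each position is predicted
    return [[rank[j] < rank[i] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    tokens = ["New", "York", "is", "a", "city"]
    order = [2, 4, 0, 3, 1]  # one assumed order; training would sample many
    mask = permutation_mask(order)
    for pos in order:
        context = [tokens[j] for j in range(len(tokens)) if mask[pos][j]]
        print(f"predict {tokens[pos]!r} given {context}")
```

Under a different order the same token sees a different context, sometimes to its left and sometimes to its right, which is roughly how averaging over orders lets an autoregressive model learn from both directions without inserting mask tokens into the input.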

Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE

2019-09-16	Making of a conversational agent platform | AISC
2019-09-09	A Survey of Singular Learning | AISC
2019-09-04	Overview of Reinforcement Learning | AISC
2019-09-03	Ernie 2.0: A Continual Pre-Training Framework for Language Understanding | AISC
2019-08-28	Consistency by Agreement in Zero-shot Neural Machine Translation | AISC
2019-08-26	TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing | AISC
2019-08-21	Science of science: Identifying Fundamental Drivers of Science | AISC
2019-08-19	AI Product Stream Meet and Greet | AISC
2019-08-12	[Original ResNet paper] Deep Residual Learning for Image Recognition | AISC
2019-08-11	[GAT] Graph Attention Networks | AISC Foundational
2019-08-06	XLNet: Generalized Autoregressive Pretraining for Language Understanding | AISC
2019-07-31	Overview of Generative Adversarial Networks | AISC
2019-07-29	Multi-Armed Bandit Strategies for Non-Stationary Reward Distributions and Delayed Feedback Processes
2019-07-22	AISC Abstract Night
2019-07-15	The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words & Sentences From Natural Supervision
2019-07-10	TF-Encrypted: Private machine learning in tensorflow with secure computing | AISC Lunch & Learn
2019-07-08	Unsupervised Data Augmentation | AISC
2019-07-04	Mathematics of Deep Learning Overview | AISC Lunch & Learn
2019-07-02	Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling
2019-06-26	Neural Models of Text Normalization for Speech Applications | AISC Author Speaking
2019-06-24	Assessing Modeling Variability in Autonomous Vehicle Accelerated Evaluation
