XLNet: Generalized Autoregressive Pretraining for Language Understanding | AISC

Video Link: https://www.youtube.com/watch?v=Mgck4XFR9GA



Duration: 1:45:12


For slides and more information on the paper, visit https://aisc.ai.science/events/2019-08-06

Discussion lead: Alec Robinson


Motivation:
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.
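The permutation objective in point (1) can be illustrated with a toy sketch. This is not XLNet's implementation (which uses a Transformer with two-stream attention and samples one order per step); the helper names and the context-ignoring toy model below are assumptions for illustration only. The idea: for a sequence of length T, pick a factorization order z, then predict each position autoregressively given the positions that come earlier in z, and take the expectation over orders.

```python
import itertools
import math

def permutation_log_likelihood(tokens, cond_log_prob, order):
    """Log-likelihood of `tokens` under one factorization order.

    order[t] is the position predicted at step t; the model conditions
    only on tokens at positions appearing earlier in `order`.
    """
    total = 0.0
    for t, pos in enumerate(order):
        context = [tokens[p] for p in order[:t]]
        total += cond_log_prob(tokens[pos], context)
    return total

def expected_log_likelihood(tokens, cond_log_prob):
    """Average over all T! factorization orders.

    Enumerating every order is feasible only for toy T; in practice one
    order is sampled per training step as a Monte Carlo estimate.
    """
    orders = list(itertools.permutations(range(len(tokens))))
    return sum(permutation_log_likelihood(tokens, cond_log_prob, o)
               for o in orders) / len(orders)

# Hypothetical toy "model": uniform over a 4-word vocabulary,
# ignoring context entirely (a real model would use the context).
VOCAB = ["the", "cat", "sat", "down"]
uniform = lambda token, context: math.log(1.0 / len(VOCAB))

tokens = ["the", "cat", "sat"]
# Under a context-independent model every order yields the same value,
# so the expectation equals len(tokens) * log(1/|VOCAB|).
print(expected_log_likelihood(tokens, uniform))
```

Because every token position eventually appears on both sides of the "prediction boundary" across different orders, each token is trained against bidirectional context while the factorization itself stays autoregressive, so no [MASK] corruption is needed.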




Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE


2019-09-16 Making of a conversational agent platform | AISC
2019-09-09 A Survey of Singular Learning | AISC
2019-09-04 Overview of Reinforcement Learning | AISC
2019-09-03 Ernie 2.0: A Continual Pre-Training Framework for Language Understanding | AISC
2019-08-28 Consistency by Agreement in Zero-shot Neural Machine Translation | AISC
2019-08-26 TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing | AISC
2019-08-21 Science of science: Identifying Fundamental Drivers of Science | AISC
2019-08-19 AI Product Stream Meet and Greet | AISC
2019-08-12 [Original ResNet paper] Deep Residual Learning for Image Recognition | AISC
2019-08-11 [GAT] Graph Attention Networks | AISC Foundational
2019-08-06 XLNet: Generalized Autoregressive Pretraining for Language Understanding | AISC
2019-07-31 Overview of Generative Adversarial Networks | AISC
2019-07-29 Multi-Armed Bandit Strategies for Non-Stationary Reward Distributions and Delayed Feedback Processes
2019-07-22 AISC Abstract Night
2019-07-15 The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words & Sentences From Natural Supervision
2019-07-10 TF-Encrypted: Private machine learning in tensorflow with secure computing | AISC Lunch & Learn
2019-07-08 Unsupervised Data Augmentation | AISC
2019-07-04 Mathematics of Deep Learning Overview | AISC Lunch & Learn
2019-07-02 Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling
2019-06-26 Neural Models of Text Normalization for Speech Applications | AISC Author Speaking
2019-06-24 Assessing Modeling Variability in Autonomous Vehicle Accelerated Evaluation