BLEURT: Learning Robust Metrics for Text Generation (Paper Explained)

Channel:

Yannic Kilcher

Subscribers:

301,000

Published on June 7, 2020 2:11:39 PM

Video Link: https://www.youtube.com/watch?v=rl4nUngiR2k

Duration: 31:35

6,362 views

Proper evaluation of text generation models, such as machine translation systems, requires expensive and slow human assessment. As these models have improved in recent years, proxy scores like BLEU have become less and less useful. This paper proposes to learn a proxy score and demonstrates that it correlates well with human raters, even as the data distribution shifts.

OUTLINE:
0:00 - Intro & High-Level Overview
1:00 - The Problem with Evaluating Machine Translation
5:10 - Task Evaluation as a Learning Problem
10:45 - Naive Fine-Tuning BERT
13:25 - Pre-Training on Synthetic Data
16:50 - Generating the Synthetic Data
18:30 - Priming via Auxiliary Tasks
23:35 - Experiments & Distribution Shifts
27:00 - Concerns & Conclusion

Paper: https://arxiv.org/abs/2004.04696
Code: https://github.com/google-research/bleurt

Abstract:
Text generation has made significant advances in the last few years. Yet, evaluation metrics have lagged behind, as the most popular choices (e.g., BLEU and ROUGE) may correlate poorly with human judgments. We propose BLEURT, a learned evaluation metric based on BERT that can model human judgments with a few thousand possibly biased training examples. A key aspect of our approach is a novel pre-training scheme that uses millions of synthetic examples to help the model generalize. BLEURT provides state-of-the-art results on the last three years of the WMT Metrics shared task and the WebNLG Competition dataset. In contrast to a vanilla BERT-based approach, it yields superior results even when the training data is scarce and out-of-distribution.
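
To make the weakness of overlap-based scores concrete, here is a minimal Python sketch of clipped n-gram precision, one ingredient of BLEU. It is not the full BLEU metric (there is no brevity penalty and no averaging over several n-gram orders), and it is not part of the BLEURT code. The function names and example sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n=2):
    """Clipped n-gram precision of a candidate against one reference.

    Simplified illustration: whitespace tokenization, lowercasing,
    a single reference, and a single n-gram order.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference ("clipping").
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


# Hypothetical example sentences, chosen only for illustration.
reference = "the cat sat on the mat"
literal = "the cat sat on a mat"
paraphrase = "a feline was sitting on the rug"

print(ngram_precision(literal, reference))     # 3 of 5 bigrams match
print(ngram_precision(paraphrase, reference))  # 1 of 6 bigrams match
```

The paraphrase keeps the meaning but shares almost no bigrams with the reference, so it scores far lower than the near-literal copy. This is the kind of mismatch with human judgment that a learned metric is meant to reduce.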

Authors: Thibault Sellam, Dipanjan Das, Ankur P. Parikh

Links:
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher

Other Videos By Yannic Kilcher

2020-06-17	BYOL: Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (Paper Explained)
2020-06-16	TUNIT: Rethinking the Truly Unsupervised Image-to-Image Translation (Paper Explained)
2020-06-15	A bio-inspired bistable recurrent cell allows for long-lasting memory (Paper Explained)
2020-06-14	SynFlow: Pruning neural networks without any data by iteratively conserving synaptic flow
2020-06-13	Deep Differential System Stability - Learning advanced computations from examples (Paper Explained)
2020-06-12	VirTex: Learning Visual Representations from Textual Annotations (Paper Explained)
2020-06-11	Linformer: Self-Attention with Linear Complexity (Paper Explained)
2020-06-10	End-to-End Adversarial Text-to-Speech (Paper Explained)
2020-06-09	TransCoder: Unsupervised Translation of Programming Languages (Paper Explained)
2020-06-08	JOIN ME for the NeurIPS 2020 Flatland Multi-Agent RL Challenge!
2020-06-07	BLEURT: Learning Robust Metrics for Text Generation (Paper Explained)
2020-06-06	Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search (Paper Explained)
2020-06-05	CornerNet: Detecting Objects as Paired Keypoints (Paper Explained)
2020-06-04	Movement Pruning: Adaptive Sparsity by Fine-Tuning (Paper Explained)
2020-06-03	Learning To Classify Images Without Labels (Paper Explained)
2020-06-02	On the Measure of Intelligence by François Chollet - Part 1: Foundations (Paper Explained)
2020-06-01	Dynamics-Aware Unsupervised Discovery of Skills (Paper Explained)
2020-05-31	Synthesizer: Rethinking Self-Attention in Transformer Models (Paper Explained)
2020-05-30	[Code] How to use Facebook's DETR object detection algorithm in Python (Full Tutorial)
2020-05-29	GPT-3: Language Models are Few-Shot Learners (Paper Explained)
2020-05-28	DETR: End-to-End Object Detection with Transformers (Paper Explained)

Tags:

deep learning, machine learning, arxiv, explained, neural networks, artificial intelligence, paper, nlp, natural language processing, machine translation, transformer, bert, lstm, attention, wmt, wikipedia, backtranslation, bleu, rouge, ngrams, score, metric, comparison, human raters, google, google research, automatic, overlap, distribution shift
