Plug and Play Language Models: A Simple Approach to Controlled Text Generation | AISC

Video Link: https://www.youtube.com/watch?v=q3Q_LTetx9o



Duration: 2:36:38
3,018 views


For slides and more information on the paper, visit https://aisc.ai.science/events/2020-01-13

Discussion lead: Raheleh Makki
Discussion facilitator(s): Gordon Gibson, Royal Sequeira, and Salman Mohammed

Motivation:
Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data, which entails the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM's hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human-annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.
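To make the forward/backward steering idea concrete, below is a minimal PyTorch sketch of PPLM-style bag-of-words steering. It uses a toy linear "LM head" rather than a real pretrained LM, and the names (`lm_head`, `pplm_step`), the bag-of-words token ids, the step size `alpha`, and the iteration count are all illustrative assumptions, not the paper's actual implementation or hyperparameters. The point it shows is the mechanism: gradients of an attribute loss are taken with respect to the hidden activations, and the activations are nudged so the next-token distribution shifts toward the desired attribute.

```python
# Minimal sketch of PPLM-style bag-of-words steering (toy model, not the paper's code).
import torch
import torch.nn.functional as F

torch.manual_seed(0)

vocab_size, hidden_size = 100, 32

# Stand-in for the pretrained LM's output layer: hidden state -> next-token logits.
lm_head = torch.nn.Linear(hidden_size, vocab_size)

# Attribute model: a user-specified bag of words (token ids) defining the topic.
bag_of_words = torch.tensor([7, 13, 42])

def pplm_step(hidden, alpha=0.02, n_iters=3):
    """Perturb the hidden activations so the next-token distribution places
    more mass on the bag-of-words tokens (repeated forward + backward passes)."""
    delta = torch.zeros_like(hidden, requires_grad=True)
    for _ in range(n_iters):
        logits = lm_head(hidden + delta)                  # forward pass
        log_probs = F.log_softmax(logits, dim=-1)
        # Attribute loss: negative log-probability mass on the bag of words.
        loss = -log_probs[..., bag_of_words].logsumexp(-1).mean()
        loss.backward()                                   # backward pass
        with torch.no_grad():
            # Gradient descent on the attribute loss, applied to the activations.
            delta -= alpha * delta.grad / (delta.grad.norm() + 1e-12)
            delta.grad.zero_()
    return (hidden + delta).detach()

hidden = torch.randn(1, hidden_size)
steered = pplm_step(hidden)
before = F.softmax(lm_head(hidden), -1)[0, bag_of_words].sum().item()
after = F.softmax(lm_head(steered), -1)[0, bag_of_words].sum().item()
print(f"bag-of-words mass before: {before:.4f}, after: {after:.4f}")
```

In the actual PPLM setup the same kind of gradient step is applied to the LM's cached activations (the "history") at each decoding step, and the attribute model can equally be a small learned classifier instead of a bag of words.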




Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE


2020-02-20 Build and Deploy Machine Learning Models | MLOps Overview
2020-02-19 Quantifying the dynamics of failure across science, startups and security | AISC
2020-02-18 Deep Learning for Symbolic Mathematics | AISC
2020-02-12 Visualizing and measuring the geometry of BERT | AISC
2020-02-11 Attention is not not explanation + Character Eyes: Seeing Language through Character-Level Taggers |
2020-02-10 Single Headed Attention RNN: Stop Thinking With Your Head | AISC
2020-01-27 Identifying Big ML product opportunities inside Big organizations | AISC
2020-01-23 Machine Learning in Cyber Security, Overview | AISC
2020-01-22 BottleSum: Unsupervised & Self-supervised Sentence Summarization w/ Information Bottleneck Principle
2020-01-20 A Hybrid GA-PSO Method for Evolving Architecture and Short Connections of Deep Convolutional Neural
2020-01-13 Plug and Play Language Models: A Simple Approach to Controlled Text Generation | AISC
2020-01-08 Overview of Modern Anomaly and Novelty Detection | AISC
2020-01-06 Annotating Object Instances With a Polygon RNN | AISC
2019-12-11 Predicting translational progress in biomedical research | AISC
2019-12-09 AlphaStar explained: Grandmaster level in StarCraft II with multi-agent RL
2019-12-04 How Can We Be So Dense? The Benefits of Using Highly Sparse Representations | AISC
2019-12-02 [RoBERT & ToBERT] Hierarchical Transformers for Long Document Classification | AISC
2019-11-25 [OpenAI] Solving Rubik's Cube with a Robot Hand | AISC
2019-11-18 Top-K Off-Policy Correction for a REINFORCE Recommender System | AISC
2019-11-13 Overview of Unsupervised & Semi-supervised learning | AISC
2019-11-11 Building products for Continuous Delivery in Machine Learning | AISC