Beyond Accuracy: Behavioral Testing of NLP Models with CheckList | AISC

Published on: 2020-08-27
Video Link: https://www.youtube.com/watch?v=A0od6RosVSA



Duration: 41:38


Speaker(s): Marco Tulio Ribeiro
Facilitator(s): Royal Sequiera

Find the recording, slides, and more info at https://ai.science/e/check-list-beyond-accuracy-behavioral-testing-of-nlp-models-with-check-list--lva9YvNDiwob0DFAE26o

Motivation / Abstract
- The paper proposes CheckList, a novel behavioural testing methodology for NLP models
- CheckList provides tools that help you build software-engineering-style test cases at scale
- Using CheckList, the paper identifies critical failures in both commercial and state-of-the-art models
- This paper won the overall Best Paper Award at ACL'20
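
To make the idea of software-engineering-style test cases concrete, here is a minimal sketch in plain Python. It does not use the CheckList library's API. The toy model, the templates, and the helper names (`toy_sentiment_model`, `expand_template`, `failures`) are all hypothetical and exist only for illustration. The two test types shown, a Minimum Functionality Test (MFT) and an Invariance test (INV), are named in the paper; how they are coded here is an assumption.

```python
from itertools import product


def toy_sentiment_model(text):
    """Hypothetical stand-in for a real classifier: a naive keyword matcher."""
    positive = {"good", "great", "excellent"}
    negative = {"bad", "terrible", "awful"}
    words = set(text.lower().replace(".", "").split())
    if words & negative:
        return "negative"
    if words & positive:
        return "positive"
    return "neutral"


def expand_template(template, **slots):
    """Fill a template with every combination of slot values (tests at scale)."""
    keys = list(slots)
    combos = product(*(slots[k] for k in keys))
    return [template.format(**dict(zip(keys, combo))) for combo in combos]


def failures(model, cases):
    """Return (text, expected, predicted) for every case the model gets wrong."""
    results = [(text, expected, model(text)) for text, expected in cases]
    return [r for r in results if r[1] != r[2]]


# MFT: simple negation. "not good" should be classified as negative.
mft_texts = expand_template(
    "The {thing} was not {adj}.",
    thing=["food", "service"],
    adj=["good", "great"],
)
mft_cases = [(text, "negative") for text in mft_texts]
mft_failures = failures(toy_sentiment_model, mft_cases)
mft_failure_rate = len(mft_failures) / len(mft_cases)

# INV: swapping a person's name should not change the prediction.
inv_texts = expand_template(
    "{name} said the movie was great.",
    name=["Alice", "Bob"],
)
reference = toy_sentiment_model(inv_texts[0])
inv_failures = [t for t in inv_texts if toy_sentiment_model(t) != reference]

print(f"MFT negation failure rate: {mft_failure_rate:.0%}")
print(f"INV name-swap failures: {len(inv_failures)}")
```

The keyword matcher would look fine on sentences like "The food was great." yet fails every negation case, which is the kind of gap that aggregate accuracy can hide and a targeted behavioural test exposes.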




