Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods | AISC

Published on 2021-01-27 ● Video Link: https://www.youtube.com/watch?v=qCYAKmFFpbs



Duration: 45:12
1,136 views


For slides and more information on the paper, visit https://ai.science/e/fooling-lime-and-shap-adversarial-attacks-on-post-hoc-explanation-methods--zN1SdCrIujdod5FKHYgT

Speaker: Dylan Slack; Host: Ali El-Sharif

Motivation:
Because modern machine learning models are so complex, researchers have proposed a number of techniques to explain their predictions, often with the goal of increasing trust in ML models. But to what extent are these explanation methods themselves vulnerable to manipulation? In this talk, we introduce an attack that fools two popular explainability methods, LIME and SHAP, by exploiting a common assumption in both techniques: they explain a model by querying it on perturbed samples, which tend to lie off the data manifold. This allows us to create models that have arbitrary explanations according to LIME and SHAP. We demonstrate the potential significance of the attack by building classifiers that rely solely on protected attributes (e.g., race or gender) yet whose LIME and SHAP explanations do not indicate that these features are important.
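To make the mechanism concrete, below is a minimal sketch (not the authors' code) of the scaffolding idea: an out-of-distribution check routes perturbation-style queries from LIME or SHAP to an innocuous model, while realistic inputs are served by a model that depends only on the protected attribute. The names biased_model, innocuous_model, and is_off_manifold, and the simple z-score heuristic for spotting off-manifold points, are illustrative assumptions; the paper instead trains a classifier to detect perturbed queries.

    # Sketch of a "scaffolded" adversarial classifier, assuming a crude
    # off-manifold detector stands in for the paper's learned OOD classifier.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: feature 0 is a "protected attribute" in {0, 1};
    # feature 1 is a benign, continuous feature.
    X_train = np.column_stack([rng.integers(0, 2, 1000), rng.normal(0, 1, 1000)])

    def biased_model(x):
        # Discriminates purely on the protected attribute (feature 0).
        return (x[:, 0] > 0.5).astype(int)

    def innocuous_model(x):
        # Ignores the protected attribute; uses the benign feature instead.
        return (x[:, 1] > 0.0).astype(int)

    def is_off_manifold(x, X_ref=X_train, threshold=3.0):
        # Heuristic OOD check: flag points far from the training data in
        # normalized feature space. (Illustrative placeholder only.)
        mu, sigma = X_ref.mean(axis=0), X_ref.std(axis=0) + 1e-8
        z = np.abs((x - mu) / sigma)
        return z.max(axis=1) > threshold

    def scaffolded_model(x):
        # Realistic inputs get the biased prediction; perturbation-style
        # (off-manifold) queries get the innocuous one, so perturbation-based
        # explainers mostly observe the innocuous behaviour.
        preds = biased_model(x)
        ood = is_off_manifold(x)
        preds[ood] = innocuous_model(x)[ood]
        return preds

    # A LIME/SHAP-style perturbation with an extreme benign-feature value is
    # routed to the innocuous model (prediction 0), even though the biased
    # model would predict 1 from the protected attribute alone.
    perturbed = np.array([[1.0, -12.0]])
    print(scaffolded_model(perturbed))
    # On-distribution inputs are (almost always) answered by the biased model.
    print(scaffolded_model(X_train[:5]))

Because LIME and SHAP estimate feature importance from the model's answers on such perturbed queries, the explanations they return reflect the innocuous model rather than the biased one actually used on real data.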




Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE


2021-02-24  'Less Than One'-Shot Learning (author speaking)
2021-02-18  Machine Learning in Mobile Cybersecurity: An Overview
2021-02-18  Author speaking: Proper Machine Learning Explanations through LIME using OptiLIME framework | AISC
2021-02-12  Non-Euclidean Universal Approximation | AISC
2021-02-10  Explainable AI with Layer-wise Relevance Propagation (LRP)
2021-02-05  An Introduction to Quantum Computing
2021-02-04  Predicting compound activity from phenotypic profiles
2021-02-04  A Survey on the Explainability of Supervised Machine Learning
2021-01-29  Reinforcement learning in sports analytics | AISC
2021-01-28  We Can Measure XAI Explanations Better with Templates | AISC
2021-01-27  Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods | AISC
2021-01-21  Deep Learning in Healthcare and Its Practical Limitations
2021-01-15  Introduction to NVIDIA NeMo - A Toolkit for Conversational AI | AISC
2021-01-15  Explainable Classifiers Using Counterfactual Approach | AISC
2021-01-14  Machine learning meets continuous flow chemistry: Automated process optimization | AISC
2021-01-13  Screening and analysis of specific language impairment | AISC
2021-01-08  High-frequency Component Helps Explain the Generalization of Convolutional Neural Networks | AISC
2021-01-07  Locality Guided Neural Networks for Explainable AI | AISC
2021-01-06  Explaining image classifiers by removing input features using generative models | AISC
2020-12-24  An Introduction to the Quantum Tech Ecosystem | AISC
2020-12-23  Explaining by Removing: A Unified Framework for Model Explanation | AISC