Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples | AISC

Published on 2019-03-21 | Video Link: https://www.youtube.com/watch?v=DNABw31eL8E



Duration: 1:33:19
604 views


A.I. Socratic Circles (formerly TDLS)

https://aisc.a-i.science/events/2019-03-21/

Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples

Deep neural networks (DNNs) have demonstrated impressive performance on a wide array of tasks, but they are usually considered opaque because their internal structure and learned parameters are not interpretable. In this paper, we re-examine the internal representations of DNNs using adversarial images generated by an ensemble-optimization algorithm. We find that: (1) the neurons in DNNs do not truly detect semantic objects/parts, but respond to objects/parts only as recurrent discriminative patches; (2) deep visual representations are not robust distributed codes of visual concepts, because the representations of adversarial images are largely inconsistent with those of real images even though the two look visually similar. Both observations differ from previous findings. To further improve the interpretability of DNNs, we propose an adversarial training scheme with a consistent loss such that the neurons are endowed with human-interpretable concepts. The induced interpretable representations enable us to trace eventual outcomes back to influential neurons, so human users can see how the model makes its predictions, as well as when and why it makes errors.
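The training scheme described in the last two sentences can be sketched in a few lines. The snippet below is a minimal PyTorch illustration, not the paper's actual implementation: it uses a single-step FGSM attack as a stand-in for the paper's ensemble-optimization attack, and an MSE feature-matching term as the consistency penalty. All names (feature_extractor, classifier, lambda_c, epsilon) are illustrative assumptions, and input images are assumed to lie in [0, 1].

    # Minimal sketch of adversarial training with a consistency term.
    # FGSM and MSE feature matching are stand-ins, not the paper's method.
    import torch
    import torch.nn.functional as F

    def adversarial_consistency_step(feature_extractor, classifier, x, y,
                                     optimizer, epsilon=8/255, lambda_c=1.0):
        """One training step: classification loss on clean and adversarial
        inputs, plus a term tying adversarial features to clean features."""
        # Craft adversarial examples with one FGSM step (illustrative
        # substitute for the ensemble-optimization attack in the paper).
        x_adv = x.clone().detach().requires_grad_(True)
        attack_logits = classifier(feature_extractor(x_adv))
        attack_loss = F.cross_entropy(attack_logits, y)
        grad, = torch.autograd.grad(attack_loss, x_adv)
        x_adv = (x + epsilon * grad.sign()).clamp(0, 1).detach()

        # Forward passes on clean and adversarial images.
        feats_clean = feature_extractor(x)
        feats_adv = feature_extractor(x_adv)
        logits_clean = classifier(feats_clean)
        logits_adv = classifier(feats_adv)

        # Standard classification losses plus a consistency term that pushes
        # adversarial representations toward the clean ones, so that neurons
        # respond to the same concepts on both kinds of input.
        loss = (F.cross_entropy(logits_clean, y)
                + F.cross_entropy(logits_adv, y)
                + lambda_c * F.mse_loss(feats_adv, feats_clean.detach()))

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

In practice lambda_c would be tuned so that the consistency term shapes the representations without overwhelming the classification losses.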




Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE


2019-04-18 [Phoenics] A Bayesian Optimizer for Chemistry | AISC Author Speaking
2019-04-18 Why do large batch sized trainings perform poorly in SGD? - Generalization Gap Explained | AISC
2019-04-16 Structured Neural Summarization | AISC Lunch & Learn
2019-04-11 Deep InfoMax: Learning deep representations by mutual information estimation and maximization | AISC
2019-04-08 ACT: Adaptive Computation Time for Recurrent Neural Networks | AISC
2019-04-04 [FFJORD] Free-form Continuous Dynamics for Scalable Reversible Generative Models (Part 1) | AISC
2019-04-01 [DOM-Q-NET] Grounded RL on Structured Language | AISC Author Speaking
2019-03-31 5-min [machine learning] paper challenge | AISC
2019-03-28 [Variational Autoencoder] Auto-Encoding Variational Bayes | AISC Foundational
2019-03-25 [GQN] Neural Scene Representation and Rendering | AISC
2019-03-21 Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples | AISC
2019-03-18 Understanding the Origins of Bias in Word Embeddings
2019-03-14 [Original Style Transfer] A Neural Algorithm of Artistic Style | TDLS Foundational
2019-03-11 [RecSys 2018 Challenge winner] Two-stage Model for Automatic Playlist Continuation at Scale | TDLS
2019-03-07 [OpenAI GPT2] Language Models are Unsupervised Multitask Learners | TDLS Trending Paper
2019-03-04 You May Not Need Attention | TDLS Code Review
2019-02-28 [DDQN] Deep Reinforcement Learning with Double Q-learning | TDLS Foundational
2019-02-25 [AlphaGo Zero] Mastering the game of Go without human knowledge | TDLS
2019-02-21 Transformer XL | AISC Trending Papers
2019-02-19 Computational prediction of diagnosis & feature selection on mesothelioma patient records | AISC
2019-02-18 Support Vector Machine (original paper) | AISC Foundational



Tags:
deep learning
machine learning
adversarial learning
interpretable AI
explainable AI
neural nets
neural networks