Towards Understandable Neural Networks for High Level AI Tasks - Part 6

Video Link: https://www.youtube.com/watch?v=ihuPr3yaPh0



Duration: 1:37:32


Encoding discrete symbol structures as numerical vectors for neural network computation enables the similarity structure inherent in vectorial representations to yield generalizations that reflect content similarity in a structure-sensitive fashion. Two examples will be presented.

In language understanding, the mapping of arguments from syntactic roles (subject, object, etc.) to semantic roles (agent, patient, etc.) is controlled by the argument structure of verbs. Verbs differ in their argument structures but fall into a modest number of similarity classes. The similarity of verbs along combined semantic and argument-structure dimensions can be encoded vectorially in distributed representations using the tensor product representation framework presented in previous lectures in this series (and briefly reviewed in this lecture).

In a well-studied class of speech errors in the production of 'tongue-twisters', consonants are displaced from their target position to an incorrect position. It has been documented that such errors are more likely when the displacement preserves the syllable-internal position of the consonant (e.g., if a consonant's target position is syllable-initial, an error is more likely to displace it into the initial position of another syllable). Errors are also most likely when a displaced consonant replaces a consonant to which it is featurally similar. Finally, a consonant is more likely to be displaced into a particular syllable position when consonants featurally similar to the displaced consonant are more frequent in that position.

Simulations of the Gradient Symbolic Computation (GSC) networks introduced in previous lectures in this series reproduce these structure- and content-similarity effects, and the basis for this model behavior can be understood formally.
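The role-filler binding at the heart of the tensor product representation framework can be sketched in a few lines of NumPy. This is an illustrative toy, not the lecture's model: the vector dimensions, the 'dog'/'cat' fillers, and the subject/object example are invented here; the roles are chosen orthonormal so that unbinding is exact.

```python
import numpy as np

# Sketch of a tensor product representation (TPR): each symbol
# (filler) is bound to a structural role by an outer product, and a
# whole structure is the sum of its bindings.

rng = np.random.default_rng(0)

# Distributed filler vectors for two arguments (dimensions invented).
fillers = {name: rng.standard_normal(8) for name in ["dog", "cat"]}

# Orthonormal role vectors, so unbinding recovers fillers exactly.
roles = {"subj": np.array([1.0, 0.0]), "obj": np.array([0.0, 1.0])}

# Bind: encode "dog chases cat" as dog:subj + cat:obj.
tpr = (np.outer(fillers["dog"], roles["subj"])
       + np.outer(fillers["cat"], roles["obj"]))

# Unbind: contract the TPR with a role vector to recover its filler.
recovered_subj = tpr @ roles["subj"]
assert np.allclose(recovered_subj, fillers["dog"])
```

Because the representation is a sum of bindings in a shared vector space, featurally similar fillers yield similar TPRs, which is what lets the network generalize by content similarity in a structure-sensitive way.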
Two remaining potential topics for this lecture series are:
- comparison of the size of tensor product representations to the size of other schemes for encoding symbol structures in actual neural network models
- programming GSC networks to perform function application in the λ-calculus and tree adjunction (as in Tree-Adjoining Grammar), thereby demonstrating that GSC networks truly have complete symbol-processing (or 'algebraic') capabilities, which Gary Marcus and others have argued (at MSR and elsewhere) are required for neural networks (artificial or biological) to achieve genuine human intelligence.

Overview of talk series: Current AI software relies increasingly on neural networks (NNs). The universal data structure of NNs is the numerical vector of activity levels of model neurons, typically with activity distributed widely over many neurons. Can NNs in principle achieve human-like performance in higher cognitive domains - such as inference, planning, grammar - where theories in AI, cognitive science, and linguistics have long argued that abstract, structured symbolic representations are necessary? The work I will present seeks to determine whether, and precisely how, distributed vectors can be functionally isomorphic to symbol structures for computational purposes relevant to AI - at least in certain idealized limits such as unbounded network size. This work - defining and exploring Gradient Symbolic Computation (GSC) - has produced a number of purely theoretical results. Current work at MSR is exploring the use of GSC to address large-scale practical problems using NNs that can be understood because they operate under the explanatory principles of GSC.
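The tree-structured, 'algebraic' processing the abstract alludes to can likewise be sketched with recursive tensor products: a tree with subtrees l and r is encoded as l bound to a left-child role plus r bound to a right-child role, and children are recovered by contracting with a role vector. Again a hedged toy, not the lecture's construction: the names `cons`, `left`, and `right`, the leaf dimension, and the balanced-tree restriction (so sibling encodings share a shape) are my own assumptions.

```python
import numpy as np

r0 = np.array([1.0, 0.0])  # left-child role (orthonormal)
r1 = np.array([0.0, 1.0])  # right-child role

rng = np.random.default_rng(1)
A, B, C, D = (rng.standard_normal(4) for _ in range(4))  # leaf symbols

def cons(l, r):
    """Bind two equal-shape subtree encodings into a parent encoding."""
    return np.multiply.outer(l, r0) + np.multiply.outer(r, r1)

def left(t):
    """Unbind the left child: contract the last axis with r0."""
    return t @ r0

def right(t):
    """Unbind the right child: contract the last axis with r1."""
    return t @ r1

# Encode the balanced tree ((A B) (C D)); each depth adds one axis,
# so the representation's size grows with depth - the size question
# raised as a remaining topic above.
tree = cons(cons(A, B), cons(C, D))
assert np.allclose(left(left(tree)), A)
assert np.allclose(right(left(tree)), B)
```

Composing `left`/`right` with `cons` in this way is what purely vectorial function application over trees amounts to; the open question flagged above is doing it at practical scale.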

Part 1 at http://resnet/resnet/fullvideo.aspx?id=36339
Part 2 at http://resnet/resnet/fullvideo.aspx?id=36370
Part 3 at http://resnet/resnet/fullvideo.aspx?id=36371
Part 4 at http://resnet/resnet/fullvideo.aspx?id=36402
Part 5 at http://resnet/resnet/fullvideo.aspx?id=36411




Other Videos By Microsoft Research


2016-06-13 Single-shot error correction with the gauge color code
2016-06-13 Robust Spectral Inference for Joint Stochastic Matrix Factorization and Topic Modeling
2016-06-13 How Much Information Does a Human Translator Add to the Original and Multi-Source Neural Translation
2016-06-13 Opportunities and Challenges in Global Network Cameras
2016-06-13 Nature in the City: Changes in Bangalore over Time and Space
2016-06-13 Making Small Spaces Feel Large: Practical Illusions in Virtual Reality
2016-06-13 Machine Learning as Creative Tool for Designing Real-Time Expressive Interactions
2016-06-13 Recent Developments in Combinatorial Optimization
2016-06-13 Computational Limits in Statistical Inference: Hidden Cliques and Sum of Squares
2016-06-13 Coloring the Universe: An Insider's Look at Making Spectacular Images of Space
2016-06-13 Towards Understandable Neural Networks for High Level AI Tasks - Part 6
2016-06-13 The 37th UW/MS Symposium in Computational Linguistics
2016-06-13 The Linear Algebraic Structure of Word Meanings
2016-06-13 Machine Learning Algorithms Workshop
2016-06-13 Interactive and Interpretable Machine Learning Models for Human Machine Collaboration
2016-06-13 Improving Access to Clinical Data Locked in Narrative Reports: An Informatics Approach
2016-06-13 Representation Power of Neural Networks
2016-06-13 Green Security Games
2016-06-13 e-NABLE: A Global Network of Digital Humanitarians on an Infrastructure of Electronic Communications
2016-06-10 Microsoft Research New England: An introduction
2016-06-06 Python+Machine Learning tutorial - Data munging for predictive modeling with pandas and scikit-learn



Tags:
microsoft research
neural networks
computer systems and networking
artificial intelligence
machine learning