Dynamic Inference with Neural Interpreters (w/ author interview)

Subscribers: 284,000
Video Link: https://www.youtube.com/watch?v=w3knicSHx5s
Duration: 1:22:37
Views: 14,388


#deeplearning #neuralinterpreter #ai

This video includes an interview with the paper's authors!
What if we treated deep networks like modular programs? Neural Interpreters divide computation into small modules ("functions") and route data to them via a dynamic type-inference system that is learned end-to-end. The resulting model combines recurrent elements, weight sharing, attention, and more to tackle both abstract reasoning and computer vision tasks.
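
As a rough sketch of the two ideas discussed in the video (routing via type inference and modulated "ModLin" layers): this is not the authors' code, and all names, shapes, and the cosine threshold below are my own assumptions. Routing can be thought of as comparing a per-token "type" vector against per-function "signature" vectors, while ModLin layers share one weight matrix across all functions and condition it on a per-function "code" vector.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModLin(nn.Module):
    """Sketch of a modulated linear layer: one shared weight matrix whose
    computation is conditioned on a per-function 'code' vector."""
    def __init__(self, dim, code_dim):
        super().__init__()
        self.proj_code = nn.Linear(code_dim, dim)  # project function code to a modulation vector
        self.linear = nn.Linear(dim, dim)          # weights shared across all functions

    def forward(self, x, code):
        # modulate the input element-wise with the projected code, then apply the shared map
        return self.linear(x * self.proj_code(code))

class TypeRouter(nn.Module):
    """Sketch of neural 'type inference' routing: each token gets a type vector,
    each function has a signature; cosine similarity gates how strongly a token
    is routed to a function."""
    def __init__(self, dim, type_dim, num_functions, threshold=0.5):
        super().__init__()
        self.to_type = nn.Linear(dim, type_dim)    # type inference, simplified to a linear map
        self.signatures = nn.Parameter(torch.randn(num_functions, type_dim))
        self.threshold = threshold

    def forward(self, x):
        t = F.normalize(self.to_type(x), dim=-1)   # token types, unit norm
        s = F.normalize(self.signatures, dim=-1)   # function signatures, unit norm
        sim = t @ s.t()                            # cosine compatibility: (batch, tokens, functions)
        return torch.where(sim > self.threshold, sim, torch.zeros_like(sim))

# Toy usage: route 16 tokens of dim 64 to 4 hypothetical functions.
x = torch.randn(2, 16, 64)
codes = torch.randn(4, 32)                         # one learned code vector per function
router = TypeRouter(dim=64, type_dim=24, num_functions=4)
modlin = ModLin(dim=64, code_dim=32)
gate = router(x)                                   # (2, 16, 4) routing weights
out = sum(gate[..., u:u + 1] * modlin(x, codes[u]) for u in range(4))
```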

OUTLINE:
0:00 - Intro & Overview
3:00 - Model Overview
7:00 - Interpreter weights and function code
9:40 - Routing data to functions via neural type inference
14:55 - ModLin layers
18:25 - Experiments
21:35 - Interview Start
24:50 - General Model Structure
30:10 - Function code and signature
40:30 - Explaining Modulated Layers
49:50 - A closer look at weight sharing
58:30 - Experimental Results

Paper: https://arxiv.org/abs/2110.06399

Guests:
Nasim Rahaman: https://twitter.com/nasim_rahaman
Francesco Locatello: https://twitter.com/FrancescoLocat8
Waleed Gondal: https://twitter.com/Wallii_gondal

Abstract:
Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution. However, they are less capable of systematic generalization to data drawn from unseen but related distributions, a feat that is hypothesized to require compositional reasoning and reuse of knowledge. In this work, we present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules, which we call "functions". Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. The proposed architecture can flexibly compose computation along width and depth, and lends itself well to capacity extension after training. To demonstrate the versatility of Neural Interpreters, we evaluate it in two distinct settings: image classification and visual abstract reasoning on Raven Progressive Matrices. In the former, we show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner. In the latter, we find that Neural Interpreters are competitive with respect to the state-of-the-art in terms of systematic generalization.
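
The "capacity extension after training" mentioned in the abstract follows from the fact that functions are identified only by small learned code and signature vectors, while the interpreter weights are shared. A toy, hypothetical illustration of that idea (not the authors' API; the function name, shapes, and initialization are assumptions):

```python
import torch
import torch.nn as nn

def extend_functions(codes: nn.Parameter, signatures: nn.Parameter, num_new: int):
    # Append freshly initialized code/signature vectors for new functions;
    # the shared interpreter weights (e.g. ModLin matrices, attention) stay untouched.
    new_codes = 0.02 * torch.randn(num_new, codes.shape[1])
    new_sigs = 0.02 * torch.randn(num_new, signatures.shape[1])
    return (nn.Parameter(torch.cat([codes.data, new_codes], dim=0)),
            nn.Parameter(torch.cat([signatures.data, new_sigs], dim=0)))
```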

Authors: Nasim Rahaman, Muhammad Waleed Gondal, Shruti Joshi, Peter Gehler, Yoshua Bengio, Francesco Locatello, Bernhard Schölkopf

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n




Other Videos By Yannic Kilcher


2022-02-16 - AI against Censorship: Genetic Algorithms, The Geneva Project, ML in Security, and more!
2022-02-15 - HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning (w/ Author)
2022-02-10 - [ML News] DeepMind AlphaCode | OpenAI math prover | Meta battles harmful content with AI
2022-02-08 - Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents (+Author)
2022-02-07 - OpenAI Embeddings (and Controversy?!)
2022-02-06 - Unsupervised Brain Models - How does Deep Learning inform Neuroscience? (w/ Patrick Mineault)
2022-02-04 - GPT-NeoX-20B - Open-Source huge language model by EleutherAI (Interview w/ co-founder Connor Leahy)
2022-01-29 - Predicting the rules behind - Deep Symbolic Regression for Recurrent Sequences (w/ author interview)
2022-01-27 - IT ARRIVED! YouTube sent me a package. (also: Limited Time Merch Deal)
2022-01-25 - [ML News] ConvNeXt: Convolutions return | China regulates algorithms | Saliency cropping examined
2022-01-21 - Dynamic Inference with Neural Interpreters (w/ author interview)
2022-01-19 - Noether Networks: Meta-Learning Useful Conserved Quantities (w/ the authors)
2022-01-11 - This Team won the Minecraft RL BASALT Challenge! (Paper Explanation & Interview with the authors)
2022-01-05 - Full Self-Driving is HARD! Analyzing Elon Musk re: Tesla Autopilot on Lex Fridman's Podcast
2022-01-02 - Player of Games: All the games, one algorithm! (w/ author Martin Schmid)
2021-12-30 - ML News Live! (Dec 30, 2021) Anonymous user RIPS Tensorflw | AI prosecutors rising | Penny Challenge
2021-12-28 - GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
2021-12-27 - Machine Learning Holidays Live Stream
2021-12-26 - Machine Learning Holiday Live Stream
2021-12-24 - [ML News] AI learns to search the Internet | Drawings come to life | New ML journal launches
2021-12-21 - [ML News] DeepMind builds Gopher | Google builds GLaM | Suicide capsule uses AI to check access



Tags:
deep learning
machine learning
arxiv
explained
neural networks
ai
artificial intelligence
paper
neural interpreters
dynamic inference
neural programming
neural functions
recurrent networks
yoshua bengio
mila
schoelkopf
attention
modlin
modulated linear layer
weight sharing
recurrent modules
function modules
sparse neural networks
interview
first author interview
with the authors
dynamic inference with neural interpreters
deep neural interpreters