Learn from External Memory, not just Weights: Large-Scale Retrieval for Reinforcement Learning

Channel: John Tan Chong Min

Subscribers: 5,450

Published on January 31, 2023 10:45:33 AM ● Video Link: https://www.youtube.com/watch?v=NJylWdyqQ9c

Duration: 1:42:08

45,881 views

164

Modern RL architectures typically store what they have learned in the weights of the model. This form of storage is slow to learn, since it takes many backpropagation updates to set the weights correctly, and it can be unreliable, since changing some of the weights can disturb what was stored earlier. This paper by DeepMind adds an external memory to the network, queried via nearest neighbor search, which helps the agent learn faster from expert trajectories.
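To make the idea of nearest neighbor retrieval from an external memory concrete, here is a minimal sketch. It is not the paper's implementation: the function name `retrieve_neighbors`, the toy keys and values, and the brute-force exact search are all assumptions chosen for illustration. A large-scale system would likely use an approximate nearest-neighbor index instead of scanning every entry.

```python
import numpy as np

def retrieve_neighbors(memory_keys, memory_values, query, k=2):
    """Return indices and values of the k stored keys closest to the query.

    Hypothetical helper: exact brute-force search, for illustration only.
    """
    # Squared Euclidean distance from the query to every stored key.
    dists = np.sum((memory_keys - query) ** 2, axis=1)
    # Indices of the k smallest distances, nearest first.
    nearest = np.argsort(dists)[:k]
    return nearest, memory_values[nearest]

# Toy external memory: four stored state embeddings (keys), each paired
# with one scalar of stored information (values). All numbers are made up.
memory_keys = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
memory_values = np.array([10.0, 20.0, 30.0, 40.0])

# Embedding of the current observation (also made up).
query = np.array([0.9, 0.2])

idx, vals = retrieve_neighbors(memory_keys, memory_values, query, k=2)

# Augment the observation with the retrieved information before passing it
# on to the policy or value network.
augmented = np.concatenate([query, vals])
print(idx, vals, augmented)
```

The point of the design is that new knowledge can be added by appending rows to `memory_keys` and `memory_values`, with no gradient updates to any weights.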

When evaluated on 9x9 Go against the Pachi AI agent, it performs better than simply doing more Monte Carlo Tree Search, and better than the same model without nearest neighbor retrieval. This work empirically demonstrates that an external memory helps with learning and with inferring good trajectories from examples.

I was inspired by this work and sought to improve on it by making the memory learnable and goal-directed. This will be described in my future work titled "Learning, Fast and Slow".

Paper Link: https://arxiv.org/abs/2206.05314
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/tree/main/Paper_Reviews

Related videos:
Learning, Fast and Slow: https://www.youtube.com/watch?v=Hr9zW7Usb7I
A New Framework of Memory for Learning: https://www.youtube.com/watch?v=q9uMEAcB3lM
Reinforcement Learning, Fast and Slow: https://www.youtube.com/watch?v=M10f3ihj3cE

0:00 Introduction
4:57 Neural Networks vs External Memory
17:52 Memory to augment observations
21:31 Scalable Memory Retrieval
37:17 Robust Way of Leveraging Data
40:43 Memory as abstraction
43:23 Overall Model
48:38 Experiment Setup
51:53 Results
1:15:00 Neighbour Regularisation
1:20:57 Discussion
1:32:22 My own follow-up work: Learning, Fast and Slow
1:35:54 Motivation and Final words

~~~~~

AI and ML enthusiast. Likes to think about the essences behind breakthroughs of AI and explain it in a simple and relatable way. Also, I am an avid game creator.

Discord: https://discord.gg/fXCZCPYs
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin

Other Videos By John Tan Chong Min

2023-04-11	Tutorial #2: OpenAI Vector Embeddings and Pinecone for Retrieval-Augmented Generation
2023-04-04	Creating JARVIS: ChatGPT + APIs - HuggingGPT, Memory-Augmented Context, Meta GPT structures
2023-04-02	Is GPT4 capable of self-improving? Are we heading for AGI or AI doom?
2023-03-28	How Visual ChatGPT works + Toolformer/Wolfram Alpha. LLMs with Tools/APIs/Plugins is the way ahead!
2023-03-21	Tokenize any input, even continuous vectors! - Residual Vector Quantization - VALL-E (Part 2)
2023-03-07	Using Transformers to mimic anyone's voice! - VALL-E (Part 1)
2023-02-28	Learning Part-Whole Structure by Chunking - More Efficient than Deep Learning!!!
2023-02-21	High-level planning with large language models - SayCan
2023-02-13	Learning, Fast and Slow: Towards Fast and Adaptable Agents in Changing Environments
2023-02-07	Using Logic Gates as Neurons - Deep Differentiable Logic Gate Networks!
2023-01-31	Learn from External Memory, not just Weights: Large-Scale Retrieval for Reinforcement Learning
2023-01-17	How ChatGPT works - From Transformers to Reinforcement Learning with Human Feedback (RLHF)
2023-01-09	HyperTree Proof Search - Automated Theorem Proving with AlphaZero and Transformers!
2022-12-23	CodinGame Fall Challenge 2022: A First Look (managed to get to Silver!)
2022-12-21	Can ChatGPT solve CodinGame/Google Kickstart problems?
2022-12-19	Reinforcement Learning Fast and Slow: Goal-Directed and Memory Retrieval Mechanism!
2022-12-12	A New Framework of Memory for Learning (Part 1)
2022-11-14	Hippocampal Replay for Learning (Full Length with Questions)
2022-11-14	Hippocampal Replay for Learning (3 min summary)
2022-11-07	AlphaTensor: Using Reinforcement Learning for Efficient Matrix Multiplication
2022-10-27	Playing Go on TyGem and learning from AI (~ 3 kyu)
