Learning, Fast and Slow: Towards Fast and Adaptable Agents in Changing Environments
Excited to share my latest work titled "Learning, Fast and Slow".
The inspiration came from observing how humans are typically goal-directed and head straight towards a planned goal, and how we can retrieve past memories in order to perform a task consistently. By combining fast neural network prediction (System 1) with a slow, parallel memory-retrieval system (System 2) that overrides the fast system whenever a relevant memory is available, we can get efficient decision making.
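To make this concrete, here is a minimal sketch of that action-selection loop, assuming a small feed-forward policy for System 1 and a hashtable-backed memory for System 2. All names here (FastPolicyNet, select_action, the memory dict) are illustrative assumptions, not the paper's actual API:

# Minimal sketch of fast-and-slow action selection (illustrative only).
import torch
import torch.nn as nn

class FastPolicyNet(nn.Module):
    """System 1: a small feed-forward net mapping (state, goal) to action logits."""
    def __init__(self, state_dim, goal_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1))

def select_action(state, goal, fast_net, memory):
    """System 2 (memory retrieval) overrides System 1 (fast net) when it can."""
    key = (tuple(state.tolist()), tuple(goal.tolist()))
    if key in memory:                   # exact memory hit: the slow system wins
        return memory[key]
    logits = fast_net(state, goal)      # otherwise fall back to the fast net
    return int(logits.argmax())

A hashtable-backed memory is one simple way to realise the retrieval side (the video's "Neural Networks vs Hashtable for Memory" chapter weighs exactly this choice).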
I was also inspired by hippocampal replay in mice, as there seems to be both replay of previously visited states and replay of imagined future states. Hence, I posit that this replay could help in goal-directed learning, and we train the fast neural network with this in mind (a rough sketch follows below).
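Below is a rough sketch of how such replay-based training could look, assuming the replayed goal is simply a state reached later in the same episode and the taken action is used as supervision. This relabelling scheme is my reading of goal-directed replay, not the paper's exact procedure:

# Illustrative sketch of training the fast net from replayed trajectories.
import random
import torch
import torch.nn.functional as F

def replay_update(fast_net, optimizer, trajectory):
    """trajectory: list of (state, action) pairs from one episode."""
    states = [s for s, _ in trajectory]
    losses = []
    for t, (s, a) in enumerate(trajectory[:-1]):
        # "Replay": pick a state reached later in the episode as the goal,
        # and train the fast net to reproduce the action actually taken.
        g = states[random.randrange(t + 1, len(states))]
        logits = fast_net(s, g)
        losses.append(F.cross_entropy(logits.unsqueeze(0), torch.tensor([a])))
    loss = torch.stack(losses).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()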
Overall, the initial results are promising: empirical studies show that our proposed method achieves a 92% solve rate across 100 episodes in a dynamically changing grid world, significantly outperforming state-of-the-art actor-critic methods such as PPO (54%), TRPO (50%) and A2C (24%).
Happy to discuss this further and take any feedback. There is a lot more I have in mind to add to this system; more will be shared in my next framework paper, to be released shortly.
Paper: https://arxiv.org/abs/2301.13758
Code: https://github.com/tanchongmin/Learning-Fast-and-Slow
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/tree/main/Paper_Reviews
Earlier iteration of this idea: https://www.youtube.com/watch?v=M10f3ihj3cE
Related Papers:
SayCan (LLMs for Goal Setting): https://say-can.github.io/
Yann LeCun's Path Towards Autonomous Machine Intelligence (Hierarchical Planning): https://openreview.net/pdf?id=BZ5a1r-kVsf
~~~~~~~~~~~~~~~~~~~~~~~~~~
0:00 Introduction
2:03 Aim
4:37 Traditional Reinforcement Learning
6:36 Thought Experiment: How do we think?
9:52 Preliminaries: Modelling the world
29:49 Neural Networks vs Hashtable for Memory
35:45 Value-based RL is slow
40:04 Goal-Directed Exploration
45:28 Fast & Slow
56:41 Hippocampal Replay to train Fast Goal-Directed Neural Network
1:02:19 Massively Parallel Memory Retrieval Network
1:04:13 Overall Fast & Slow Procedure
1:10:13 Results (Static)
1:14:26 Results (Dynamic)
1:18:00 Results (Ablation)
1:22:42 Conclusion
1:24:10 Extensions
1:24:23 Multi-Agent Learning
1:26:08 Natural Language Processing for Goal and Subgoal Planning
1:29:16 Forgetting as Learning
1:31:37 Hierarchical Planning
1:35:14 Discussion
2:19:20 Concluding Remarks
~~~~~~~~~~~~~~~~~~~~~~~~~~
AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain them in a simple and relatable way. Also an avid game creator.
Discord: https://discord.gg/fXCZCPYs
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin