Learning, Fast and Slow: Towards Fast and Adaptable Agents in Changing Environments

Video Link: https://www.youtube.com/watch?v=Hr9zW7Usb7I



Duration: 2:20:28


Excited to share my latest work titled "Learning, Fast and Slow".

The inspiration came from observing that humans are typically goal-directed and head straight towards a planned goal, and that we can retrieve past memories in order to perform a task consistently. By combining a fast neural network prediction (System 1) with a slow, parallel memory-retrieval system (System 2) that overrides the fast prediction whenever it is able to, we can get efficient decision making.
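To make the fast/slow combination concrete, here is a minimal sketch of the override logic in Python. It is not the code from the paper or the repository: the greedy `fast_policy` merely stands in for a trained neural network, the memory is assumed to be a plain dictionary of remembered transitions, and all names (`fast_policy`, `slow_memory_action`, `choose_action`, `detour_memory`) are hypothetical.

```python
from collections import deque

def fast_policy(state, goal):
    """Stand-in for the fast network (System 1): step greedily towards the goal."""
    (r, c), (gr, gc) = state, goal
    if r != gr:
        return (1, 0) if gr > r else (-1, 0)
    if c != gc:
        return (0, 1) if gc > c else (0, -1)
    return (0, 0)

def slow_memory_action(memory, state, goal):
    """Stand-in for memory retrieval (System 2).

    memory maps state -> {action: next_state} for transitions seen before.
    Breadth-first search over those transitions; returns the first action of
    a remembered path to the goal, or None if no such path is remembered.
    """
    queue = deque([(state, None)])
    seen = {state}
    while queue:
        current, first_action = queue.popleft()
        if current == goal:
            return first_action
        for action, nxt in memory.get(current, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, action if first_action is None else first_action))
    return None

def choose_action(memory, state, goal):
    """Let the slow system override the fast one whenever it finds a path."""
    slow = slow_memory_action(memory, state, goal)
    return slow if slow is not None else fast_policy(state, goal)

# Hypothetical memory of a detour from (0, 0) to (0, 2) via the row below.
detour_memory = {
    (0, 0): {(1, 0): (1, 0)},
    (1, 0): {(0, 1): (1, 1)},
    (1, 1): {(0, 1): (1, 2)},
    (1, 2): {(-1, 0): (0, 2)},
}

action_with_memory = choose_action(detour_memory, (0, 0), (0, 2))
action_without_memory = choose_action({}, (0, 0), (0, 2))
```

With the detour in memory, the slow system takes over and the agent moves down first; with an empty memory, the agent falls back to the fast greedy step to the right.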

I was also inspired by hippocampal replay in mice, which appears to involve both replay of previously visited states and replay of imagined future states. Hence, I posit that this replay could help in goal-directed learning, and we train the fast neural network with this in mind.
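One way to picture replay of visited states as a training signal is hindsight-style goal relabelling: every state the agent actually reached later in a trajectory is treated as if it had been the goal. The sketch below is an assumption about how such training samples could be built, not the procedure from the paper, and it only covers replay of visited states (not imagined future ones). The function name `replay_goal_samples` and the trajectory format are hypothetical.

```python
def replay_goal_samples(trajectory):
    """Build (state, goal, action) training samples from one trajectory.

    trajectory is a list of (state, action, next_state) in visiting order.
    For each step, every state reached at or after that step is used as a
    goal, and the action actually taken is used as the training target.
    """
    samples = []
    for i, (state, action, _) in enumerate(trajectory):
        for _, _, later_state in trajectory[i:]:
            samples.append((state, later_state, action))
    return samples

# Hypothetical three-step walk in a grid world.
walk = [
    ((0, 0), "right", (0, 1)),
    ((0, 1), "right", (0, 2)),
    ((0, 2), "down", (1, 2)),
]
samples = replay_goal_samples(walk)
```

A goal-conditioned network could then be fitted on these samples with an ordinary supervised loss, mapping (state, goal) to the action that was taken.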

Overall, the initial results are promising: in our experiments, the proposed method achieves a 92% solve rate across 100 episodes in a dynamically changing grid world, significantly outperforming state-of-the-art actor-critic methods such as PPO (54%), TRPO (50%) and A2C (24%).

Happy to discuss more on this and take any feedback. There are many more things I have in mind to add to this system; more will be shared in my next framework paper, to be released shortly.

Paper: https://arxiv.org/abs/2301.13758
Code: https://github.com/tanchongmin/Learning-Fast-and-Slow
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/tree/main/Paper_Reviews
Earlier iteration of this idea: https://www.youtube.com/watch?v=M10f3ihj3cE

Related Papers:
SayCan (LLMs for Goal Setting): https://say-can.github.io/
Yann LeCun's Path Towards Autonomous Machine Intelligence (Hierarchical Planning): https://openreview.net/pdf?id=BZ5a1r-kVsf

~~~~~~~~~~~~~~~~~~~~~~~~~~
0:00 Introduction
2:03 Aim
4:37 Traditional Reinforcement Learning
6:36 Thought Experiment: How do we think?
9:52 Preliminaries: Modelling the world
29:49 Neural Networks vs Hashtable for Memory
35:45 Value-based RL is slow
40:04 Goal-Directed Exploration
45:28 Fast & Slow
56:41 Hippocampal Replay to train Fast Goal-Directed Neural Network
1:02:19 Massively Parallel Memory Retrieval Network
1:04:13 Overall Fast & Slow Procedure
1:10:13 Results (Static)
1:14:26 Results (Dynamic)
1:18:00 Results (Ablation)
1:22:42 Conclusion
1:24:10 Extensions
1:24:23 Multi-Agent Learning
1:26:08 Natural Language Processing for Goal and Subgoal Planning
1:29:16 Forgetting as Learning
1:31:37 Hierarchical Planning
1:35:14 Discussion
2:19:20 Concluding Remarks


~~~~~~~~~~~~~~~~~~~~~~~~~~
AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.

Discord: https://discord.gg/fXCZCPYs
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2023-04-12 GPT4: Zero-shot Classification without any examples + Fine-tune with reflection
2023-04-11 OpenAI Vector Embeddings - Talk to any book or document; Retrieval-Augmented Generation!
2023-04-11 Tutorial #2: OpenAI Vector Embeddings and Pinecone for Retrieval-Augmented Generation
2023-04-04 Creating JARVIS: ChatGPT + APIs - HuggingGPT, Memory-Augmented Context, Meta GPT structures
2023-04-02 Is GPT4 capable of self-improving? Are we heading for AGI or AI doom?
2023-03-28 How Visual ChatGPT works + Toolformer/Wolfram Alpha. LLMs with Tools/APIs/Plugins is the way ahead!
2023-03-21 Tokenize any input, even continuous vectors! - Residual Vector Quantization - VALL-E (Part 2)
2023-03-07 Using Transformers to mimic anyone's voice! - VALL-E (Part 1)
2023-02-28 Learning Part-Whole Structure by Chunking - More Efficient than Deep Learning!!!
2023-02-21 High-level planning with large language models - SayCan
2023-02-13 Learning, Fast and Slow: Towards Fast and Adaptable Agents in Changing Environments
2023-02-07 Using Logic Gates as Neurons - Deep Differentiable Logic Gate Networks!
2023-01-31 Learn from External Memory, not just Weights: Large-Scale Retrieval for Reinforcement Learning
2023-01-17 How ChatGPT works - From Transformers to Reinforcement Learning with Human Feedback (RLHF)
2023-01-09 HyperTree Proof Search - Automated Theorem Proving with AlphaZero and Transformers!
2022-12-23 CodinGame Fall Challenge 2022: A First Look (managed to get to Silver!)
2022-12-21 Can ChatGPT solve CodinGame/Google Kickstart problems?
2022-12-19 Reinforcement Learning Fast and Slow: Goal-Directed and Memory Retrieval Mechanism!
2022-12-12 A New Framework of Memory for Learning (Part 1)
2022-11-14 Hippocampal Replay for Learning (Full Length with Questions)
2022-11-14 Hippocampal Replay for Learning (3 min summary)