Learning, Fast and Slow: My Landmark Idea for fast, adaptable agents (ICDL 2023 Best Paper Finalist)

Video Link: https://www.youtube.com/watch?v=DSVFA7nmwHQ





I have been thinking about the problem of creating fast and adaptable agents for four years. Recently, I had a breakthrough: using goal-directed action prediction and giving the agent memory to solve it. This video documents the insights.

I will be presenting this at the IEEE International Conference on Development and Learning (ICDL) 2023, held on 9-11 Nov 2023 in Macau! I am very keen to keep working on this, as it is part of my 10-year plan to create fast and adaptable agents!

**Key Insights:**
- Use goal-directed action prediction, so we can do self-supervised learning on our own trajectories (given a start state and a goal state, predict the first action).
- Use memory to store (state, action, next state) tuples, and use it for world modelling and for approximating transition probabilities.
- Memory learns almost immediately, while a neural network takes time to learn. Moreover, memory can be used for lookahead planning. Hence, if memory can find a path from the start state to the goal state, we use that path rather than the neural network's prediction.
- Achieves a 91.9% solve rate in a 10x10 dynamic grid environment, compared with 61.2% for the next-best RL algorithm, Proximal Policy Optimisation (PPO).
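
As a rough illustration of the first three insights, here is a minimal Python sketch. It is not the code from the paper: the names `goal_directed_samples`, `TransitionMemory` and `choose_action` are hypothetical, the planner is a plain breadth-first search that follows the most frequently observed next state, and the slow learner is stood in for by any callable `policy(state, goal)`.

```python
from collections import Counter, defaultdict, deque


def goal_directed_samples(states, actions):
    """Relabel one trajectory into (start, goal, first action) examples.

    states has one more entry than actions: taking actions[i] in
    states[i] led to states[i + 1]. Every later state on the
    trajectory is treated as a goal that actions[i] helped to reach.
    """
    samples = []
    for i, first_action in enumerate(actions):
        for goal in states[i + 1:]:
            samples.append((states[i], goal, first_action))
    return samples


class TransitionMemory:
    """Stores (state, action, next state) tuples as visit counts."""

    def __init__(self):
        # state -> action -> Counter of observed next states
        self.counts = defaultdict(lambda: defaultdict(Counter))

    def add(self, state, action, next_state):
        self.counts[state][action][next_state] += 1

    def transition_prob(self, state, action, next_state):
        """Empirical estimate of P(next_state | state, action)."""
        seen = self.counts.get(state, {}).get(action)
        if not seen:
            return 0.0
        return seen[next_state] / sum(seen.values())

    def plan(self, start, goal, max_depth=20):
        """Breadth-first lookahead over remembered transitions.

        Returns a list of actions from start to goal, or None if
        memory holds no such path within max_depth steps.
        """
        frontier = deque([(start, [])])
        visited = {start}
        while frontier:
            state, path = frontier.popleft()
            if state == goal:
                return path
            if len(path) >= max_depth:
                continue
            for action, seen in self.counts.get(state, {}).items():
                # Assumption: follow the most frequently seen outcome.
                next_state = seen.most_common(1)[0][0]
                if next_state not in visited:
                    visited.add(next_state)
                    frontier.append((next_state, path + [action]))
        return None


def choose_action(memory, policy, state, goal):
    """Use the fast memory path if one exists, else the slow policy."""
    path = memory.plan(state, goal)
    if path:  # None (no path) or [] (already at goal) falls through
        return path[0]
    return policy(state, goal)
```

In this sketch, a single pass through a trajectory is enough for `TransitionMemory` to plan over it, whereas the examples from `goal_directed_samples` would need many gradient steps before a network predicts the first action reliably. A dynamic environment would also require forgetting or overwriting stale transitions, which is left out here.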

~~~~~~~~~~~~~~~~

https://arxiv.org/abs/2301.13758

~~~~~~~~~~~~~~~~

AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain them in a simple and relatable way. I am also an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2024-01-23 AutoGen: A Multi-Agent Framework - Overview and Improvements
2024-01-09 AppAgent: Using GPT-4V to Navigate a Smartphone!
2024-01-08 Tutorial #13: StrictJSON, my first Python Package! - Get LLMs to output into a working JSON!
2023-12-20 "Are you smarter than an LLM?" game speedrun
2023-12-08 Is Gemini better than GPT4? Self-created benchmark - Fact Retrieval/Checking, Coding, Tool Use
2023-12-04 Learning, Fast and Slow: 10 Years Plan - Memory Soup, Hier. Planning, Emotions, Knowledge Sharing
2023-12-01 Tutorial #12: Use ChatGPT and off-the-shelf RAG on Terminal/Command Prompt/Shell - SymbolicAI
2023-11-20 JARVIS-1: Multi-modal (Text + Image) Memory + Decision Making with LLMs in MineCraft!
2023-11-20 Tutorial #11: Virtual Persona from Documents, Multi-Agent Chat, Text-to-Speech to hear your Personas
2023-11-14 A Roadmap for AI: Past, Present and Future (Part 3) - Multi-Agent, Multiple Sampling and Filtering
2023-11-07 Learning, Fast and Slow: My Landmark Idea for fast, adaptable agents (ICDL 2023 Best Paper Finalist)
2023-11-06 A roadmap for AI: Past, Present and Future (Part 2): Fixed vs Flexible, Memory Soup vs Hierarchy
2023-11-03 AI & Education: Education when AI tools are smarter than us - Discussion with Kuang Wen (Part 2)
2023-11-03 AI & Education: RAG Question-Answer, Test Question Generator, Autograder by Kuang Wen! (Part 1)
2023-10-31 A Roadmap for AI: Past, Present and Future (Part 1)
2023-10-28 Tutorial #10: StrictJSON v2 (StrictText): Handle any output - quotation marks or backslash!
2023-10-24 ChatDev: Can LLM Agents really replace a software company?
2023-10-17 LLMs and Robotics: An Overview by Daniel Tan!
2023-10-17 LLM Q&A #1: Prompting vs Fine-Tuning, More vs Fewer Sources for RAG, Prompting vs LLMs as a System
2023-10-10 LLMs as a System of Multiple Expert Agents to solve the ARC Challenge (Detailed Walkthrough)
2023-09-26 Everything about LLM Agents - Chain of Thought, Reflection, Tool Use, Memory, Multi-Agent Framework