PhD Thesis Overview (Part 2): LLMs for ARC-AGI, Task-Based Memory-Infused Learning, Plan for AgentJo

Video Link: https://www.youtube.com/watch?v=jFNhoQEp1OA




Part 2 of my PhD thesis overview, with some insights about AgentJo, the system I plan to spend the next 5-10 years on. The key highlights are tagging memories with emotion / context-relevance and using those memories to adapt. Memories can also be shared between agents in multi-agent systems.
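
To make the memory-tagging idea concrete, here is a minimal Python sketch. It is not AgentJo's implementation: the `Memory` class, the `relevance` scoring formula (context overlap scaled by emotion strength), and the `recall` / `share` helpers are all hypothetical names and choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    """Hypothetical memory record; the field names are assumptions."""
    content: str
    context: frozenset  # context tags attached when the memory was stored
    emotion: float      # strength of the emotional tag, assumed in [0, 1]


def relevance(memory, current_context):
    """Assumed scoring rule: tag overlap, boosted by emotional strength."""
    overlap = len(memory.context & current_context)
    return overlap * (1.0 + memory.emotion)


def recall(store, current_context, k=1):
    """Return up to k memories with non-zero relevance, most relevant first."""
    scored = [(relevance(m, current_context), m) for m in store]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]


def share(source, target):
    """Copy memories another agent does not yet have (multi-agent sharing)."""
    for m in source:
        if m not in target:
            target.append(m)


agent_a = [
    Memory("the stove is hot", frozenset({"kitchen", "stove"}), 0.9),
    Memory("cups are in the left cupboard", frozenset({"kitchen"}), 0.1),
]
agent_b = []
share(agent_a, agent_b)
best = recall(agent_b, frozenset({"kitchen", "stove"}), k=1)
```

In this toy setup, the strongly tagged stove memory is recalled first in a kitchen / stove context, and the second agent gains it without having experienced it directly.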

The three main takeaways of the PhD are:
1. Reward-based learning is slow to learn and slow to adapt to changes in environment / reward
2. Goal-directed, memory-based learning learns very quickly and outperforms reward-based learning
3. Adding Large Language Models (LLMs) with suitable abstraction spaces to such a goal-directed, memory-based learning system can utilise pre-built knowledge and learn even faster (if the test environment is within the training dataset of the LLMs)

Abstract:
Humans excel at fast and adaptive learning, effortlessly making zero-shot associations and generalising across diverse environments with minimal experience needed. This is in stark contrast to data-hungry deep-learning algorithms. This work aims to draw inspiration from human cognitive processes to build AI systems that learn and adapt quickly.

We introduce Learning, Fast and Slow (Best Paper Finalist in IEEE ICDL 2023), a system which uses a neural network to perform goal-directed exploration (the “fast” mechanism), and additionally performs memory-based planning (the “slow” mechanism). Trained online via memory replay in a self-supervised fashion, this method achieves a 91.9% solve rate in a dynamically changing 10x10 maze, significantly better than actor-critic methods like PPO (61.2%), TRPO (26.1%) and A2C (23.9%).
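
As a rough illustration of the “slow” memory-based planning idea (not the paper's actual algorithm), the Python sketch below stores observed state transitions in a memory and searches only over what has been remembered. The function names `remember`, `forget` and `plan_from_memory`, the dictionary layout, and the use of breadth-first search are assumptions for this example. The neural-network “fast” mechanism is omitted.

```python
from collections import deque


def remember(memory, state, next_state):
    """Store an observed transition state -> next_state."""
    memory.setdefault(state, set()).add(next_state)


def forget(memory, state, next_state):
    """Drop a transition that no longer works (the maze has changed)."""
    memory.get(state, set()).discard(next_state)


def plan_from_memory(memory, start, goal):
    """Breadth-first search over remembered transitions only.

    Returns a list of states from start to goal, or None if memory
    does not yet contain a route (exploration would be needed).
    """
    frontier = deque([start])
    parent = {start: None}
    while frontier:
        state = frontier.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in memory.get(state, ()):
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)
    return None


# Toy 3x3 maze: two remembered routes from (0, 0) to (1, 1).
memory = {}
for a, b in [((0, 0), (0, 1)), ((0, 1), (1, 1)),
             ((0, 0), (1, 0)), ((1, 0), (2, 0)),
             ((2, 0), (2, 1)), ((2, 1), (1, 1))]:
    remember(memory, a, b)

short_path = plan_from_memory(memory, (0, 0), (1, 1))
forget(memory, (0, 1), (1, 1))  # a wall appears on the short route
detour = plan_from_memory(memory, (0, 0), (1, 1))
```

Because planning reads directly from memory, updating a single stored transition is enough to change the plan. This is one way to picture why a memory-based approach can adapt quickly when the maze changes.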

We also utilise a similar memory-based, goal-directed approach to create an open-source Large Language Model-based agentic framework, TaskGen. This will continue to be developed under AgentJo (https://github.com/tanchongmin/agentjo).

~~~

References for this video:

Slides: https://github.com/tanchongmin/agentjo/blob/main/resources/PhD Thesis Overview - John Tan Chong Min.pdf
Part 1: PhD Thesis Overview (Part 1): Reward ...

Memory Tagging (Artem Kirsanov): How Your Brain Chooses What to Remember
The Memory Code: The 10-Minute Solution for Healing Your Life Through Memory Engineering (Alexander Loyd) - How memories' emotions can be adapted: https://www.amazon.sg/Memory-Code-10-Minute-Solution-Engineering/dp/1538764423
The Brain's Way of Healing: Remarkable Discoveries and Recoveries from the Frontiers of Neuroplasticity (Norman Doidge) - How chronic pain memory can be changed: https://www.amazon.sg/Brains-Way-Healing-Discoveries-Neuroplasticity/dp/067002550X
Sapiens: A Brief History of Humankind (Yuval Noah Harari) - How humankind developed with writing and economy: https://www.amazon.sg/dp/0062316095?ref_=mr_referred_us_sg_sg
Mice can learn mazes without reward: https://link.springer.com/content/pdf/10.3758/BF03208022.pdf

~~~

0:00 Introduction and Recap of Part 1
7:13 Multiple Abstraction Spaces for Learning
31:45 Insight: Multiple Iterative Searches
53:32 TaskGen: LLMs + Goal-Directed, Memory-Based Learning
1:12:04 AgentJo: Human-Friendly, Fast Learning and Adaptable Agent Communities
1:40:45 Discussion
2:17:51 Conclusion + Logo Design Competition

~~~

Brick Tic Tac Toe Game (Level 2.2): https://simmer.io/@chongmin/cosmic-tic-tac-toe

Reference Papers / Video:
DropNet: https://arxiv.org/abs/2207.06646
Brick Tic Tac Toe: https://arxiv.org/abs/2207.05991
Hippocampal Replay (NeurIPS memARI workshop 2022): https://memari-workshop.github.io/papers/paper_38.pdf
Video: Hippocampal Replay for Learning (3 mi...

Learning, Fast and Slow: https://ieeexplore.ieee.org/abstract/document/10364540
https://arxiv.org/pdf/2301.13758
Video: Learning, Fast and Slow: My Landmark ...

LLMs as a system of multiple expert agents: https://ieeecai.org/2024/wp-content/pdfs/540900a793/540900a793.pdf
https://arxiv.org/pdf/2310.05146
Video: LLMs as a System of Multiple Expert A...

TaskGen: https://arxiv.org/pdf/2407.15734
Video: TaskGen - A Task-based Agentic Framew...

AgentJo: https://github.com/tanchongmin/agentjo

~~~

AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2025-02-24 Vibe Coding: How to use LLM prompts to code effectively!
2025-01-26 PhD Thesis Overview (Part 2): LLMs for ARC-AGI, Task-Based Memory-Infused Learning, Plan for AgentJo
2025-01-20 PhD Thesis Overview (Part 1): Reward is not enough; Towards Goal-Directed, Memory-based Learning
2024-12-04 AgentJo CV Generator: Generate your CV by searching for your profile on the web!
2024-11-11 Can LLMs be used in self-driving? CoMAL: Collaborative Multi-Agent LLM for Mixed Autonomy Traffic
2024-10-28 From TaskGen to AgentJo: Creating My Life Dream of Fast Learning and Adaptable Agents
2024-10-21 Tian Yu X John: Discussing Practical Gen AI Tips for Image Prompting
2024-10-08 Jiafei Duan: Uncovering the 'Right' Representations for Multimodal LLMs for Robotics
2024-09-27 TaskGen Tutorial 6: Conversation Wrapper
2024-09-26 TaskGen Tutorial 5: External Functions & CodeGen
2024-09-24 TaskGen Tutorial 4: Hierarchical Agents
2024-09-23 TaskGen Tutorial 3: Memory
2024-09-19 TaskGen Tutorial 2: Shared Variables and Global Context
2024-09-16 Beyond Strawberry: gpt-o1 - Is LLM alone sufficient for reasoning?
2024-09-11 TaskGen Tutorial 1: Agents and Equipped Functions
2024-09-11 TaskGen Tutorial 0: StrictJSON
2024-09-10 LLM-Modulo: Using Critics and Verifiers to Improve Grounding of a Plan - Explanation + Improvements
2024-09-06 TaskGen: Co-create the best open-sourced LLM Agentic Framework together!
2024-08-21 AriGraph (Part 2) - Knowledge Graph Construction and Retrieval Details
2024-08-13 alphaXiv - Share Ideas, Build Collective Understanding, Interact with ANY open sourced paper authors
2024-07-30 AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents