No more RL needed! LLMs for high-level planning: Voyager + Ghost In the Minecraft
Using LLMs to perform semantic reasoning and map high-level goals hierarchically into action spaces, we can get LLM-based agents to craft a wide variety of items in Minecraft, and fulfil arbitrary goals.
There is something neat about hierarchy, and the Transformer's adaptability in terms of zero-shot/few-shot learning, that makes it viable to accomplish tasks in complicated game environments. Moreover, adding a memory mechanism to help with action reuse can greatly improve learning.
In Ghost in the Minecraft (GITM), actions are given as a list of primitive functions with parameters, and executed in sequence. This is superior to Voyager, whose actions are generated as code and could contain syntax errors. GITM is indeed quite a remarkable paper, and their method of memory consolidation to learn only the essential actions is pretty interesting!
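To make the "policy as a list of actions" idea concrete, here is a minimal sketch in Python. It is not GITM's actual code: the primitive names (explore, mine, craft), their parameters, and the run_plan helper are all made up for illustration. The point is only that a plan is plain data (action name plus parameters) run step by step, so a bad step comes back as feedback text instead of a syntax error.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical primitives (illustrative only, not GITM's real API).
# Each one returns (success, feedback message).
def explore(target: str) -> Tuple[bool, str]:
    return True, f"found {target}"

def mine(target: str, tool: str) -> Tuple[bool, str]:
    # Toy rule standing in for the game: iron ore needs a stone pickaxe.
    if target == "iron_ore" and tool != "stone_pickaxe":
        return False, f"cannot mine {target} with {tool}"
    return True, f"mined {target}"

def craft(item: str) -> Tuple[bool, str]:
    return True, f"crafted {item}"

PRIMITIVES: Dict[str, Callable[..., Tuple[bool, str]]] = {
    "explore": explore,
    "mine": mine,
    "craft": craft,
}

# A plan step is just an action name plus keyword parameters.
Action = Tuple[str, Dict[str, str]]

def run_plan(plan: List[Action]) -> Tuple[bool, List[str]]:
    """Execute actions in order; stop at the first failure and return the log."""
    log: List[str] = []
    for name, params in plan:
        if name not in PRIMITIVES:
            log.append(f"unknown action: {name}")
            return False, log
        try:
            ok, message = PRIMITIVES[name](**params)
        except TypeError:
            # Wrong or missing parameters become feedback, not a crash.
            log.append(f"bad parameters for {name}: {sorted(params)}")
            return False, log
        log.append(message)
        if not ok:
            return False, log
    return True, log

plan: List[Action] = [
    ("explore", {"target": "iron_ore"}),
    ("mine", {"target": "iron_ore", "tool": "wooden_pickaxe"}),
    ("craft", {"item": "iron_ingot"}),
]
success, feedback = run_plan(plan)
print(success, feedback)
```

In this toy run the second step fails, so the log ends with the failure message. In an LLM agent, that message is the kind of feedback that could be passed back to the planner to revise the plan.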
Next week, I will try to apply some of these methods to solve the Abstraction and Reasoning Corpus! Stay tuned!
~~~~~~~~~~~~~
Part 1 on Voyager: https://www.youtube.com/watch?v=Y-pgbjTlYgk
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/Voyager%20GITM%20MineCraft%20Slides.pdf
Voyager: https://voyager.minedojo.org/
Voyager Code: https://github.com/MineDojo/Voyager
GITM Paper: https://arxiv.org/abs/2305.17144
GITM (GitHub): https://github.com/OpenGVLab/GITM
CLIPort (language-conditioned goal to actions): https://cliport.github.io/
~~~~~~~~~~~~~
0:00 Introduction
0:40 Recap of Voyager
6:40 Critique of Voyager
13:40 Potential Improvements of Voyager
16:45 Ghost in the Minecraft Introduction
18:21 GITM is better than RL methods
25:35 Comparison with RL methods
28:40 Overall Interface of GITM
33:38 LLM Planner Details
36:41 Subgoal Decomposition
39:28 Action Primitives from Predefined Tasks
44:36 Policy as a List of Actions
49:53 Training the Memory (a kind of Curriculum Planner?)
1:00:45 Is Feedback Necessary for Subgoal to Goal Planner?
1:03:50 How feedback from environment via action primitives helps with learning
1:16:30 Failure Memory might be needed to learn explore-exploit
1:19:35 Overall thoughts on GITM
1:22:24 Sneak Preview of my “Instructions Code Format” for Action Spaces
1:22:40 Discussion
~~~~~~~~~~~~~~
AI and ML enthusiast. Likes to think about the essences behind breakthroughs of AI and explain them in a simple and relatable way. Also, I am an avid game creator.
Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin