JARVIS-1: Multimodal (Text + Image) Memory + Decision Making with LLMs in Minecraft!
JARVIS-1 is a recent approach to using LLMs to play Minecraft. It surpasses the performance of Voyager, but is slightly behind Ghost in the Minecraft (GITM). However, it is the first of its kind to use images and text together for truly multimodal decision making!
It also has a curriculum generator that uses self-instruction guided by memory and incorporates environmental feedback.
Like Voyager, it has the mechanisms in place for self-learning. I think it could be improved by encoding and retrieving memory more efficiently, executing sub-goals sequentially, and training the controller better.
~~~~~~~~~~~~~~~~~
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/JARVIS-1.pdf
JARVIS-1 Repo (Code coming soon): https://github.com/CraftJarvis/JARVIS-1
JARVIS-1 Paper: https://arxiv.org/abs/2311.05997
MineCLIP (embedding model): https://arxiv.org/abs/2206.08853
Past videos:
Voyager: https://www.youtube.com/watch?v=Y-pgbjTlYgk
Ghost in the Minecraft: https://www.youtube.com/watch?v=_VXOczXIkks
~~~~~~~~~~~~~~~~~
0:00 Introduction + Demo
2:31 Overview
3:34 Voyager Recap
6:20 Ghost in the Minecraft
12:11 JARVIS-1
15:19 Learning, Fast and Slow
17:33 Unlocking Entire Technology Tree
18:55 Situation-aware Planning
27:41 JARVIS-1 and Memory
38:33 Observational Space
40:02 Processing Images
46:02 Sub-goal planning
56:32 Storing and retrieving the memory
1:03:20 Generating the memories
1:10:59 Self-check
1:15:00 Result Analysis
1:20:05 Discussion
~~~~~~~~~~~~~~~~~
I am an AI and ML enthusiast who likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.
Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin