CRADLE (Part 1) - AI that plays Red Dead Redemption 2. Towards General Computer Control and AGI
CRADLE - An AI that can play Red Dead Redemption 2
Following Minecraft agents such as Voyager, Ghost in the Minecraft, and JARVIS-1, we have the latest attempt to crack an AAA game, Red Dead Redemption 2, with AI.
It uses GPT-4V to interpret images of the game, augmented by tools such as VideoSubFinder to extract conversation subtitles and Grounding DINO to get bounding boxes for objects.
It is effectively trying to build multiple abstraction spaces for the image/video domain, an idea which I truly like.
This is coupled with procedural memory of skills (stored as code) and episodic memory of current and past experiences, in both long form and summarised form.
It does not do everything perfectly, but it is a great first step towards Artificial General Intelligence.
I posit that if we can tackle the image domain well, we would be more than 50% there. Currently, our image processing tools leave much to be desired.
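To make the memory idea above a little more concrete, here is a minimal Python sketch of procedural memory (skills stored as code) next to episodic memory kept in both long form and summarised form. This is not CRADLE's actual implementation: the class `AgentMemory`, its methods, and the `mount_horse` skill are all hypothetical names made up for illustration, and the "summary" is just the last few events joined together, where a real agent would presumably ask an LLM to condense the history.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentMemory:
    """Toy memory store (hypothetical, not CRADLE's real data structures)."""

    # Procedural memory: skill name -> executable code.
    skills: Dict[str, Callable[..., str]] = field(default_factory=dict)
    # Episodic memory, long form: every event as it was recorded.
    episodes: List[str] = field(default_factory=list)
    # Episodic memory, summarised form.
    summary: str = ""

    def add_skill(self, name: str, fn: Callable[..., str]) -> None:
        self.skills[name] = fn

    def record(self, event: str, keep_last: int = 3) -> None:
        self.episodes.append(event)
        # Placeholder summariser: keeps only the most recent events.
        # Assumption: a real agent would use an LLM call here instead.
        self.summary = " | ".join(self.episodes[-keep_last:])

    def run_skill(self, name: str, *args) -> str:
        # Execute a stored skill and log the outcome as an experience.
        result = self.skills[name](*args)
        self.record(f"{name}{args} -> {result}")
        return result


memory = AgentMemory()
memory.add_skill("mount_horse", lambda: "mounted")  # hypothetical skill
memory.run_skill("mount_horse")
print(memory.episodes)
print(memory.summary)
```

The point of keeping both forms is that the full log stays available for lookup while the short summary is what would fit into a prompt.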
~~~
Part 2 here: https://www.youtube.com/watch?v=hPcX4wNtFLQ
Main resources:
CRADLE github: https://github.com/BAAI-Agents/Cradle
CRADLE video: https://www.youtube.com/watch?v=Cx-D708BedY
My slides on CRADLE: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/CRADLE.pdf
Past Agentic Frameworks (my videos):
Voyager: https://www.youtube.com/watch?v=Y-pgbjTlYgk
Ghost in the Minecraft: https://www.youtube.com/watch?v=_VXOczXIkks
JARVIS-1: https://www.youtube.com/watch?v=JUAec-dAt5c
LLMs as a System to solve the ARC Challenge (mine): https://www.youtube.com/watch?v=sTvonsD5His
Referenced resources for Image Processing:
VideoSubFinder: https://sourceforge.net/projects/videosubfinder/
Grounding DINO: https://arxiv.org/abs/2303.05499
Multi-template Matching (MTM): https://pyimagesearch.com/2021/03/29/multi-template-matching-with-opencv/
~~~
0:00 Introduction
3:30 General Computer Control
14:52 Are humans really generally intelligent?
26:50 Are games good enough for AGI?
34:35 Challenges for GCC
46:44 Recap: VOYAGER
58:41 Key Improvements in CRADLE
1:04:04 Overview of CRADLE
1:23:54 Multiple Abstraction Spaces - Vision
~~~
AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. Also, I am an avid game creator.
Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin
Red Dead Redemption 2 Statistics For John Tan Chong Min
There are 613 views in 1 video for Red Dead Redemption 2. About an hour's worth of Red Dead Redemption 2 videos was uploaded to his channel, less than 0.52% of the total video content that John Tan Chong Min has uploaded to YouTube.