Can GPT4 solve the Abstraction and Reasoning Corpus (ARC) Challenge Zero-Shot?

Channel: John Tan Chong Min

Subscribers: 6,370

Published on April 17, 2023 7:54:12 AM ● Video Link: https://www.youtube.com/watch?v=vt2yG1da8Fg

Duration: 23:19

774 views

I've been impressed with the ability of next-token prediction to learn complex structures. The Abstraction and Reasoning Corpus (ARC) Challenge was created by François Chollet in 2019, and there is now the ARC2 Challenge by Lab42 ( https://lab42.global/arcathon/ ). ARC tests an agent's ability to learn from very few examples and generalize to a new input. The challenge is difficult because the number of possible answers grows exponentially with the grid size (each square can take one of 10 values), and the output grid size is not fixed and must be inferred.
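To make the size of the answer space concrete, here is a minimal sketch that counts candidate output grids. It assumes 10 possible values per square as stated above; the function names are hypothetical, and the 30x30 example assumes the usual ARC maximum grid size.

```python
def count_candidate_grids(height: int, width: int, num_values: int = 10) -> int:
    """Number of distinct grids of one fixed size, with num_values options per square."""
    return num_values ** (height * width)


def count_all_candidate_grids(max_side: int, num_values: int = 10) -> int:
    """Total over every height x width combination with sides from 1 to max_side.

    This reflects the fact that the output size is not given and must be inferred.
    """
    return sum(
        count_candidate_grids(h, w, num_values)
        for h in range(1, max_side + 1)
        for w in range(1, max_side + 1)
    )


if __name__ == "__main__":
    # A 3x3 output already has a billion candidates.
    print(count_candidate_grids(3, 3))
    # Number of decimal digits of the count for a 30x30 output (assumed maximum size).
    print(len(str(count_candidate_grids(30, 30))))
```

Even for small grids, brute-force enumeration is out of reach, which is why the task has to be solved by inferring the underlying rule.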

Since I do not have access to the multimodal version of GPT4, I used the JSON representation of the inputs and outputs and gave GPT4 some background on the ARC problems. I then asked it to generate a broad description to ground the task in a category, followed by a detailed description to get the algorithmic steps needed. (Note that I did not ask it to generate a program, as the program and its description may not match; the description is a better bet.) Next, I asked it to verify its description against the input/output samples. This step is currently not done too well and could be done better with an external code generation and execution tool. Lastly, I asked it to generate the output for the test input.
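The four-step process described above can be sketched roughly as follows. This is not the prompt used in the video or the notebook: the wording of `BACKGROUND` and of each step, and the helper names `build_prompts` and `verify_on_train`, are assumptions for illustration. The only thing taken as given is the public ARC task layout, where a task is a JSON object with `train` and `test` lists of `input`/`output` grids. `verify_on_train` illustrates what an external execution tool could do for the verification step: run a candidate rule on every training pair and compare the results.

```python
import json
from typing import Callable

Grid = list[list[int]]

# Hypothetical background text; the actual wording used in the video is not reproduced here.
BACKGROUND = (
    "You are given input/output grid pairs from the Abstraction and Reasoning Corpus. "
    "Each grid is a JSON list of rows, and each square holds an integer from 0 to 9."
)


def build_prompts(task: dict) -> list[str]:
    """Build one prompt per step: broad description, detailed description, verify, solve."""
    examples = "\n".join(
        f"Input {i}: {json.dumps(pair['input'])}\nOutput {i}: {json.dumps(pair['output'])}"
        for i, pair in enumerate(task["train"], start=1)
    )
    test_input = json.dumps(task["test"][0]["input"])
    return [
        f"{BACKGROUND}\n{examples}\nGive a broad description of the transformation.",
        "Give a detailed, step-by-step description of how each input becomes its output.",
        "Check your description against every input/output sample and correct it if needed.",
        f"Apply the description to this test input and reply with the output grid as JSON: {test_input}",
    ]


def verify_on_train(transform: Callable[[Grid], Grid], task: dict) -> bool:
    """Return True only if the candidate rule reproduces every training output exactly."""
    return all(transform(pair["input"]) == pair["output"] for pair in task["train"])


if __name__ == "__main__":
    # Toy task (not a real ARC task): mirror each row left to right.
    toy_task = {
        "train": [{"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]}],
        "test": [{"input": [[3, 0]]}],
    }
    for prompt in build_prompts(toy_task):
        print(prompt, end="\n\n")
    print(verify_on_train(lambda g: [row[::-1] for row in g], toy_task))
```

Checking a candidate rule by executing it, rather than asking the model to check itself, gives an exact pass/fail signal on the training pairs.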

Generally, it works pretty well for some small grid problems. Large grid sizes are an issue due to context token length constraints. I believe with the right inductive bias grounding based on prompting, as well as some tools given to it to better visualize the objects in the grid, GPT4 may actually be able to solve most of the ARC challenges. Attention and pattern matching are really quite powerful.
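One possible form of a tool that helps the model "see" objects in the grid is a connected-component extractor whose output is passed to the model as text. The sketch below is an assumption about what such a tool might look like, not something shown in the video: it treats 0 as background, groups same-valued squares that touch horizontally or vertically, and the name `find_objects` is hypothetical.

```python
from collections import deque

Grid = list[list[int]]


def find_objects(grid: Grid, background: int = 0) -> list[dict]:
    """Group same-valued, 4-connected, non-background squares into objects.

    Assumptions: `background` marks empty squares and diagonal contact does not connect.
    Objects are returned in row-major order of their first square.
    """
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or grid[r][c] == background:
                continue
            value = grid[r][c]
            cells = []
            queue = deque([(r, c)])
            seen[r][c] = True
            # Breadth-first flood fill over neighbours holding the same value.
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and not seen[ny][nx]
                        and grid[ny][nx] == value
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append({"value": value, "cells": sorted(cells)})
    return objects


if __name__ == "__main__":
    example = [
        [1, 1, 0],
        [0, 0, 2],
        [3, 0, 2],
    ]
    for obj in find_objects(example):
        print(obj)
```

A list of objects like this is also much shorter than the raw grid when most squares are background, which may help with the context length limits on large grids.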

~~~~~~~~~~~~

Latest thoughts on GPT4 on ARC: https://www.youtube.com/watch?v=plVRxP8hQHY
Previous (related) video on zero-shot classification: https://www.youtube.com/watch?v=C0Eug9XpcBo

Jupyter Notebook: https://github.com/tanchongmin/ARC-Challenge/blob/main/arc_challenge.ipynb
ARCathon: https://lab42.global/arcathon/
ARC Playground: https://arc-editor.lab42.global/playground

On The Measure of Intelligence: https://arxiv.org/abs/1911.01547
AlphaCode: https://arxiv.org/abs/2203.07814

~~~~~~~~~~~~

0:00 Background of ARC Challenge
0:55 GPT4 Generation Process on Public Eval Task 157 (66e6c45b.json) [Success]
5:26 Overlay Task: Public Eval Task 158 (66f2d22f.json) [Failed]
9:10 Row and Column Removal Task: Public Eval Task 162 (68b67ca3.json) [Success]
13:16 Background Swap Task: Public Eval Task 170 (6ea4a07e.json) [Failed]
17:58 Systems and Tools-Augmentation for GPT4
21:10 ARC Challenge vs Zero-Shot Classification

~~~~~~~~~~~~

AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. Also an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin

Other Videos By John Tan Chong Min

2023-06-01	Evolution ChatGPT Prompt Game - From Bacteria to.... Jellyfish???
2023-05-30	Prompt Engineering and LLMOps: Tips and Tricks
2023-05-25	Hierarchy! The future of AI: How it helps representations and why it is important.
2023-05-18	Prediction builds representations! Fixed Bias speeds up learning!
2023-05-09	Memory: How is it encoded, retrieved and how it can be used for learning systems
2023-05-02	I created a Law Court Simulator with GPT4!
2023-05-02	I created a Law Court Simulator with ChatGPT!
2023-04-25	Creating a ChatGPT Harry Potter Text-based RPG game!
2023-04-25	Learn from just Memory Storage and Retrieval: Generative Agents Interacting in Simulation!
2023-04-18	The future is neuro-symbolic: Expressiveness of ChatGPT and generalizability of symbols (SymbolicAI)
2023-04-17	Can GPT4 solve the Abstraction and Reasoning Corpus (ARC) Challenge Zero-Shot?
2023-04-12	GPT4: Zero-shot Classification without any examples + Fine-tune with reflection
2023-04-11	OpenAI Vector Embeddings - Talk to any book or document; Retrieval-Augmented Generation!
2023-04-11	Tutorial #2: OpenAI Vector Embeddings and Pinecone for Retrieval-Augmented Generation
2023-04-04	Creating JARVIS: ChatGPT + APIs - HuggingGPT, Memory-Augmented Context, Meta GPT structures
2023-04-02	Is GPT4 capable of self-improving? Are we heading for AGI or AI doom?
2023-03-28	How Visual ChatGPT works + Toolformer/Wolfram Alpha. LLMs with Tools/APIs/Plugins is the way ahead!
2023-03-21	Tokenize any input, even continuous vectors! - Residual Vector Quantization - VALL-E (Part 2)
2023-03-07	Using Transformers to mimic anyone's voice! - VALL-E (Part 1)
2023-02-28	Learning Part-Whole Structure by Chunking - More Efficient than Deep Learning!!!
2023-02-21	High-level planning with large language models - SayCan
