Can GPT4 solve the Abstraction and Reasoning Corpus (ARC) Challenge Zero-Shot?

Video Link: https://www.youtube.com/watch?v=vt2yG1da8Fg



Duration: 23:19


I've been impressed by how well next-token prediction can learn complex structures. The Abstraction and Reasoning Corpus (ARC) Challenge was created by François Chollet in 2019, and there is now the ARC2 Challenge by Lab42 ( https://lab42.global/arcathon/ ). These challenges test an agent's ability to learn from very few examples and generalize to a new input. They are difficult because the number of possible answers grows exponentially with the number of squares in the grid (each square takes one of 10 possible values), and the output grid size is not fixed and must be inferred.
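To make the size of the answer space concrete, here is a minimal sketch that counts candidate output grids. It assumes 10 values per square as stated above; the function names and the `max_side` cap are illustrative choices, not something from the video.

```python
def count_outputs(height, width, values=10):
    """Number of distinct grids of a known size: values ** (number of squares)."""
    return values ** (height * width)


def count_outputs_unknown_size(max_side, values=10):
    """Number of distinct grids when the size must also be guessed.

    Sums over every height and width from 1 to max_side (an assumed cap).
    """
    return sum(
        count_outputs(h, w, values)
        for h in range(1, max_side + 1)
        for w in range(1, max_side + 1)
    )


# A 3x3 output already has a billion candidates.
three_by_three = count_outputs(3, 3)

# Not knowing the size adds every smaller and larger shape to the count.
up_to_two = count_outputs_unknown_size(2)
```

Even a small grid rules out brute-force guessing, which is why the descriptions GPT4 produces have to narrow the search to a single rule.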

Since I do not have access to the multimodal version of GPT4, I used the JSON representation of the inputs and outputs and gave GPT4 some background on the ARC problems. I then asked it to generate a broad description to ground the task in a category, followed by a detailed description to get the algorithmic steps needed (note that I did not ask it to generate a program, as the program and its description may not match; the description is a better bet). Next, I asked it to verify its description against the input/output samples. This step is currently not done too well and could be improved with an external code generation and execution tool. Lastly, I asked it to generate the output for the test input.
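The four prompting stages above can be sketched roughly as follows. This is a hedged illustration, not the prompt used in the video (that is in the linked notebook): the wording in `STEPS`, the helper names `build_prompt` and `verify`, and the toy task are all made up for the example. The only assumption taken from the public ARC data is the task layout, a dict with `train` and `test` lists of `input`/`output` grids.

```python
import json

# Hypothetical wording for the four stages; the real prompt may differ.
STEPS = [
    "1. Give a broad description of the category of transformation.",
    "2. Give a detailed, step-by-step description of the transformation.",
    "3. Check the description against every example input/output pair.",
    "4. Apply the description to the test input and return the output grid as JSON.",
]


def build_prompt(task):
    """Render an ARC task as plain text, leaving out the test output."""
    lines = ["Each grid is a list of rows; each cell is an integer from 0 to 9.", ""]
    for i, pair in enumerate(task["train"], start=1):
        lines.append(f"Example {i} input: {json.dumps(pair['input'])}")
        lines.append(f"Example {i} output: {json.dumps(pair['output'])}")
    lines.append("")
    lines.append(f"Test input: {json.dumps(task['test'][0]['input'])}")
    lines.append("")
    lines.extend(STEPS)
    return "\n".join(lines)


def verify(transform, task):
    """External check: does a candidate rule reproduce every training output?"""
    return all(transform(pair["input"]) == pair["output"] for pair in task["train"])


# Toy task (invented for illustration): mirror each row left to right.
toy_task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
        {"input": [[0, 0], [2, 0]], "output": [[0, 0], [0, 2]]},
    ],
    "test": [{"input": [[3, 0], [0, 0]], "output": [[0, 3], [0, 0]]}],
}


def mirror_rows(grid):
    """Candidate rule to verify: reverse every row."""
    return [list(reversed(row)) for row in grid]


prompt = build_prompt(toy_task)
rule_holds = verify(mirror_rows, toy_task)
```

The `verify` helper shows what an external execution tool could add: a description that has been turned into code can be checked exactly against the samples instead of relying on the model to check itself.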

Generally, this works pretty well for some small-grid problems. Large grids are an issue because of context token length constraints. I believe that with the right inductive biases grounded through prompting, as well as some tools to help it better visualize the objects in the grid, GPT4 may actually be able to solve most of the ARC challenges. Attention and pattern matching are really quite powerful.
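One possible form such an object tool could take is a flood fill that groups same-colored squares and reports each group in words. This is only a sketch under assumptions that are not from the video: value 0 is treated as background, squares connect in 4 directions, and the names `find_objects` and `describe` are hypothetical.

```python
from collections import deque


def find_objects(grid, background=0):
    """Group same-valued, 4-connected squares into objects (assumed definition)."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or grid[r][c] == background:
                continue
            color = grid[r][c]
            cells = []
            queue = deque([(r, c)])
            seen[r][c] = True
            # Breadth-first flood fill over neighbors with the same value.
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and grid[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            rows = [y for y, _ in cells]
            cols = [x for _, x in cells]
            objects.append({
                "color": color,
                "size": len(cells),
                "top_left": (min(rows), min(cols)),
                "bottom_right": (max(rows), max(cols)),
            })
    return objects


def describe(objects):
    """Turn the object list into short text lines that could be added to a prompt."""
    return "\n".join(
        f"color {o['color']}: {o['size']} squares, "
        f"bounding box {o['top_left']} to {o['bottom_right']}"
        for o in objects
    )


grid = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 2, 2],
]
objects = find_objects(grid)
summary = describe(objects)
```

A summary like this is much shorter than the raw grid, so it may also ease the context length problem on large grids, though that is untested here.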

~~~~~~~~~~~~

Latest thoughts on GPT4 on ARC: https://www.youtube.com/watch?v=plVRxP8hQHY
Previous (related) video on zero-shot classification: https://www.youtube.com/watch?v=C0Eug9XpcBo

Jupyter Notebook: https://github.com/tanchongmin/ARC-Challenge/blob/main/arc_challenge.ipynb
ARCathon: https://lab42.global/arcathon/
ARC Playground: https://arc-editor.lab42.global/playground

On The Measure of Intelligence: https://arxiv.org/abs/1911.01547
AlphaCode: https://arxiv.org/abs/2203.07814


~~~~~~~~~~~~

0:00 Background of ARC Challenge
0:55 GPT4 Generation Process on Public Eval Task 157 (66e6c45b.json) [Success]
5:26 Overlay Task: Public Eval Task 158 (66f2d22f.json) [Failed]
9:10 Row and Column Removal Task: Public Eval Task 162 (68b67ca3.json) [Success]
13:16 Background Swap Task: Public Eval Task 170 (6ea4a07e.json) [Failed]
17:58 Systems and Tools-Augmentation for GPT4
21:10 ARC Challenge vs Zero-Shot Classification

~~~~~~~~~~~~

AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain them in a simple and relatable way. I am also an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2023-06-01 Evolution ChatGPT Prompt Game - From Bacteria to.... Jellyfish???
2023-05-30 Prompt Engineering and LLMOps: Tips and Tricks
2023-05-25 Hierarchy! The future of AI: How it helps representations and why it is important.
2023-05-18 Prediction builds representations! Fixed Bias speeds up learning!
2023-05-09 Memory: How is it encoded, retrieved and how it can be used for learning systems
2023-05-02 I created a Law Court Simulator with GPT4!
2023-05-02 I created a Law Court Simulator with ChatGPT!
2023-04-25 Creating a ChatGPT Harry Potter Text-based RPG game!
2023-04-25 Learn from just Memory Storage and Retrieval: Generative Agents Interacting in Simulation!
2023-04-18 The future is neuro-symbolic: Expressiveness of ChatGPT and generalizability of symbols (SymbolicAI)
2023-04-17 Can GPT4 solve the Abstraction and Reasoning Corpus (ARC) Challenge Zero-Shot?
2023-04-12 GPT4: Zero-shot Classification without any examples + Fine-tune with reflection
2023-04-11 OpenAI Vector Embeddings - Talk to any book or document; Retrieval-Augmented Generation!
2023-04-11 Tutorial #2: OpenAI Vector Embeddings and Pinecone for Retrieval-Augmented Generation
2023-04-04 Creating JARVIS: ChatGPT + APIs - HuggingGPT, Memory-Augmented Context, Meta GPT structures
2023-04-02 Is GPT4 capable of self-improving? Are we heading for AGI or AI doom?
2023-03-28 How Visual ChatGPT works + Toolformer/Wolfram Alpha. LLMs with Tools/APIs/Plugins is the way ahead!
2023-03-21 Tokenize any input, even continuous vectors! - Residual Vector Quantization - VALL-E (Part 2)
2023-03-07 Using Transformers to mimic anyone's voice! - VALL-E (Part 1)
2023-02-28 Learning Part-Whole Structure by Chunking - More Efficient than Deep Learning!!!
2023-02-21 High-level planning with large language models - SayCan