Michael Hodel: Reverse Engineering the Abstraction and Reasoning Corpus

Video Link: https://www.youtube.com/watch?v=J3_KjLikOU0



Duration: 1:28:17


Had a great discussion with Michael Hodel and a few others (Simon Strandgaard, Yassine and many more) about reverse engineering the ARC dataset, and possible approaches to solve ARC.

Speaker Profile:
Michael studied Computer Science in Zurich, Switzerland. He is currently working as a freelance programmer, but his main focus remains ARC. Currently, his team MindsAI with Jack Cole and Mohamed Osman is #1 on the ARC-Prize leaderboard (a 1 million dollar competition to solve ARC - https://arcprize.org/ )

Abstract:
The Abstraction and Reasoning Corpus (ARC) is a dataset intended to serve as a benchmark for general intelligence. The difficulty of ARC for machine learning approaches is largely a consequence of the great diversity of tasks as well as its few-shot nature. Almost five years after its publication, ARC remains unsolved. While many attempts have been made to solve the benchmark, what seems generally lacking are more fundamental scientific experiments. RE-ARC presents code to procedurally generate examples for the ARC training tasks and with that attempts to enable experiments addressing the latter of those two aspects, namely sample-efficient learning. RE-ARC also introduces a simple proxy metric for example difficulty, which should also allow for exploring questions about within-task generalization capabilities of systems.
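
To make the idea in the abstract concrete, here is a minimal sketch of a procedural example generator paired with a verifier and a difficulty proxy. It is not code from the RE-ARC repository: the task (mirroring a grid left to right), the function names, and the cell-count difficulty proxy are all hypothetical stand-ins chosen for illustration.

```python
import random

Grid = list[list[int]]  # rows of colour indices 0-9, as in ARC grids


def mirror_task(grid: Grid) -> Grid:
    """Hypothetical task rule: flip every row left to right."""
    return [row[::-1] for row in grid]


def generate_example(rng: random.Random, max_side: int = 10) -> dict:
    """Sample a random input grid and derive its output from the task rule."""
    height = rng.randint(1, max_side)
    width = rng.randint(1, max_side)
    grid = [[rng.randint(0, 9) for _ in range(width)] for _ in range(height)]
    return {"input": grid, "output": mirror_task(grid)}


def verify_example(example: dict) -> bool:
    """Check that the stored output matches the task rule applied to the input."""
    return mirror_task(example["input"]) == example["output"]


def difficulty_proxy(example: dict) -> int:
    """Toy proxy (an assumption, not the paper's metric): number of input cells."""
    grid = example["input"]
    return len(grid) * len(grid[0])


rng = random.Random(0)  # fixed seed so the sampled examples are reproducible
examples = [generate_example(rng) for _ in range(5)]
examples.sort(key=difficulty_proxy)  # order from "easy" to "hard" under the toy proxy
```

Because the generator can be called as often as needed, a setup like this lets one vary the number of examples a model sees for a single task, or hold out the examples with the highest proxy score to probe within-task generalization.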

Repo: https://github.com/michaelhodel/re-arc

Paper: https://arxiv.org/abs/2404.07353

~~~

0:00 Speaker Introduction
0:45 Introduction to ARC-DSL
8:26 Data Generation
12:13 How close is DSL to human priors
14:42 How to decide which DSL to keep and which to add in
19:53 Introduction to RE-ARC
23:45 Overview of RE-ARC
25:32 Task Generalisation in RE-ARC
26:00 Example Verification in RE-ARC
26:47 Example Difficulty in RE-ARC
31:48 Limitations of RE-ARC
33:37 Examples of RE-ARC
35:22 Using RE-ARC to gauge model learning
37:48 Vision for meta-learning beyond RE-ARC
39:08 Can arbitrary DSL be generated with RE-ARC?
43:17 Discussion

~~~

AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. Also, I am an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2024-09-16 Beyond Strawberry: gpt-o1 - Is LLM alone sufficient for reasoning?
2024-09-11 TaskGen Tutorial 1: Agents and Equipped Functions
2024-09-11 TaskGen Tutorial 0: StrictJSON
2024-09-10 LLM-Modulo: Using Critics and Verifiers to Improve Grounding of a Plan - Explanation + Improvements
2024-09-06 TaskGen: Co-create the best open-sourced LLM Agentic Framework together!
2024-08-21 AriGraph (Part 2) - Knowledge Graph Construction and Retrieval Details
2024-08-13 alphaXiv - Share Ideas, Build Collective Understanding, Interact with ANY open sourced paper authors
2024-07-30 AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
2024-07-23 NeoPlanner - Continually Learning Planning Agent for Large Environments guided by LLMs
2024-07-17 Intelligence = Sampling + Filtering
2024-07-12 Michael Hodel: Reverse Engineering the Abstraction and Reasoning Corpus
2024-07-02 TaskGen Conversational Class v2: JARVIS, Psychology Counsellor, Sherlock Holmes Shop Assistant
2024-06-04 CodeAct: Code As Action Space of LLM Agents - Pros and Cons
2024-05-28 TaskGen Conversation with Dynamic Memory - Math Quizbot, Escape Room Solver, Psychology Counsellor
2024-05-21 Integrate ANY Python Function, CodeGen, CrewAI tool, LangChain tool with TaskGen! - v2.3.0
2024-05-11 Empirical - Open Source LLM Evaluation UI
2024-05-07 TaskGen Ask Me Anything #1
2024-04-29 StrictJSON (LLM Output Parser) Ask Me Anything #1
2024-04-22 Tutorial #14: Write latex papers with LLMs such as Llama 3!
2024-04-16 SORA Deep Dive: Predict patches from text, images or video
2024-04-09 OpenAI CLIP Embeddings: Walkthrough + Insights