CodeAct: Code As Action Space of LLM Agents - Pros and Cons

Published on 2024-06-04 ● Video Link: https://www.youtube.com/watch?v=n5K2fjlT0FQ



Duration: 1:37:56


Using code as the action space for LLM agents improves success rate by up to 20% compared to using JSON or text for function calling!

Code allows intermediate results to be cached as state in variables, and draws on the LLM's training data for native Pythonic constructs (e.g. for loops, min, max) that can solve the problem more easily.
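
As a rough illustration (a minimal sketch; get_flight_price is a hypothetical tool exposed to the agent as a Python function, not from the paper), a single code action can loop over results, take a min, and keep the intermediate values around for later turns:

# A single CodeAct-style action: find the cheapest flight in one step.
# get_flight_price is a hypothetical tool, stubbed here so the sketch runs.
def get_flight_price(city: str) -> float:
    return {"Singapore": 150.0, "Tokyo": 420.0, "Seoul": 380.0}[city]

cities = ["Singapore", "Tokyo", "Seoul"]
prices = {}
for city in cities:                         # native Python control flow
    prices[city] = get_flight_price(city)   # intermediate results cached in a variable

cheapest = min(prices, key=prices.get)      # native min over the cached results
print(cheapest, prices[cheapest])           # observation returned to the agent

# prices stays in the interpreter state, so a later action can reuse it
# without calling the tool again.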

Code also allows multiple actions to be composed and executed together in a single step, which sidesteps some of the planning weaknesses of LLMs.
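
As another rough sketch (search_hotels, book_hotel and send_email are hypothetical tools, stubbed here so the code runs), code can bundle what would otherwise be several separate JSON tool-call turns into one action:

# Hypothetical tools, stubbed so the sketch runs end-to-end.
def search_hotels(city: str) -> list:
    return [{"name": "Hotel A", "price": 120}, {"name": "Hotel B", "price": 95}]

def book_hotel(name: str) -> str:
    return f"booking-001 for {name}"

def send_email(to: str, body: str) -> None:
    print(f"email to {to}: {body}")

# With Text/JSON actions, each tool call typically costs one LLM turn:
#   turn 1 -> {"tool": "search_hotels", "args": {"city": "Tokyo"}}
#   turn 2 -> {"tool": "book_hotel",    "args": {"name": "..."}}
#   turn 3 -> {"tool": "send_email",    "args": {...}}
# With code as the action space, the same plan is a single action:
hotels = search_hotels("Tokyo")
cheapest = min(hotels, key=lambda h: h["price"])
booking = book_hotel(cheapest["name"])
send_email("me@example.com", f"Booked {cheapest['name']}: {booking}")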

Code also allows for native error correction: the agent can iterate on the error feedback (e.g. the traceback) that the environment returns.
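
A minimal sketch of such a feedback loop (not the paper's actual implementation; llm stands in for any function that maps the chat messages to generated code):

import traceback

def run_action(code: str, env: dict) -> str:
    """Execute a code action; return 'OK', or the traceback on failure."""
    try:
        exec(code, env)                      # shared env keeps state across turns
        return "OK"
    except Exception:
        return traceback.format_exc()        # error feedback for self-correction

def codeact_loop(llm, task: str, max_turns: int = 5) -> None:
    env = {}
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        code = llm(messages)                 # the model proposes a code action
        feedback = run_action(code, env)
        if feedback == "OK":
            break
        # Feed the traceback back so the model can iterate on its own error.
        messages.append({"role": "assistant", "content": code})
        messages.append({"role": "user", "content": f"Execution error:\n{feedback}"})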

My take - Using code is a good way to bypass the currently inadequate planning abilities of LLM agents, but it is not a long-term solution. We need to figure out how to plan better, so that LLM agents can become much more capable and responsive. Code only works well for tasks similar to those in the training set, where a variant of a known code structure can easily be adapted to the new task.

My slides: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/Code%20as%20Actions.pdf
Github: https://github.com/xingyaoww/code-act
Paper: https://arxiv.org/abs/2402.01030

~~~
Related links:
TaskGen Repo: https://github.com/simbianai/taskgen
LLMs as a System of Multiple Expert Agents to solve the ARC Challenge: https://www.youtube.com/watch?v=sTvonsD5His

~~~

0:00 Introduction and TaskGen implementation
6:23 Main highlight of paper - Code Action space has higher success rate compared to Text/JSON
9:31 Other examples of Code as Actions
12:05 Recap: Reasoning and Acting (ReAct) Framework
17:00 CodeAct: ReAct with Code as Action
27:17 CodeAct Agent’s 4 steps
30:51 CodeAct Prompt
35:20 CodeAct Expert Feedback prompt
37:12 Three kinds of Action Formats
47:15 Text/JSON vs Code (Part 1)
55:30 Text/JSON vs Code (Part 2)
1:02:30 Why is code better?
1:10:48 Recap: ARC Challenge
1:14:19 For some models, CodeAct works better than Text/JSON even for atomic API calls
1:17:44 Can CodeAct self-learn?
1:24:36 Discussion
1:34:27 Additional Slides

~~~

AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin



