LLM-Modulo: Using Critics and Verifiers to Improve Grounding of a Plan - Explanation + Improvements
LLMs are poor at self-reflection / self-critique.
This is especially true when they lack the required domain knowledge, when the problem is posed at the wrong level of abstraction, or when rule-based accuracy is needed.
The LLM-Modulo architecture, which pairs the LLM with external verifiers/critics, can mitigate many of these issues after the output is generated.
The next question is: why not mitigate from the start, using agentic systems with rule-based components?
How can we build LLM agentic systems that incorporate robustness through verifiers, critics, and rule-based grounding?
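The core LLM-Modulo loop can be sketched as: the LLM proposes a plan, a bank of sound critics checks it, and any critiques are fed back into the next prompt until all critics approve or a budget runs out. A minimal sketch below, where `llm_propose` and the BlocksWorld-style critique strings are hypothetical stand-ins (a real system would call an actual LLM and use domain-specific verifiers):

```python
# Minimal sketch of the LLM-Modulo generate-test loop.
# `llm_propose` is a stub standing in for a real LLM call.
from typing import Callable, Optional

def llm_propose(task: str, critiques: list[str]) -> list[str]:
    # A real system would prompt an LLM with the task plus accumulated
    # critiques; here we fake iterative repair for illustration.
    if "missing pick-up before stack" in critiques:
        return ["pick up A", "stack A on B"]   # repaired plan
    return ["stack A on B"]                    # flawed first draft

def precondition_critic(plan: list[str]) -> Optional[str]:
    # Rule-based verifier: you cannot stack a block you never picked up.
    held = False
    for step in plan:
        if step.startswith("pick up"):
            held = True
        elif step.startswith("stack") and not held:
            return "missing pick-up before stack"  # critique for back-prompt
    return None

def llm_modulo(task: str, critics: list[Callable], max_rounds: int = 5):
    critiques: list[str] = []
    for _ in range(max_rounds):
        plan = llm_propose(task, critiques)
        issues = [c(plan) for c in critics]
        issues = [i for i in issues if i is not None]
        if not issues:
            return plan           # all critics approve: plan is grounded
        critiques.extend(issues)  # else: back-prompt the LLM with critiques
    return None                   # budget exhausted, no sound plan found

plan = llm_modulo("stack A on B", [precondition_critic])
```

The key point is that correctness comes from the critics, not from the LLM's own self-critique: the LLM is only a generator of candidates, and the rule-based verifiers provide the grounding.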
~~~
Slides: https://github.com/tanchongmin/Tensor...
Jupyter Notebook: https://github.com/tanchongmin/Tensor...
Paper: https://arxiv.org/abs/2402.01817
~~~
0:00 Introduction
3:03 One-shot and zero-shot BlocksWorld planning
7:54 ReAct only works for short trajectories
13:06 Planning
14:29 Reflection as a way of refining earlier experience
17:51 LLMs cannot self-critique well
23:12 Strawberry Example
29:10 The hard problem of planning
35:48 Simple schema for LLM Modulo
38:46 LLM Modulo Architecture
52:17 Comparison to ReAct
52:53 Critics
59:09 LLM Modulo for Travel Planning
1:03:21 My thoughts
1:06:50 My extension ideas for LLM Modulo
1:11:35 Critic/Verifiers in TaskGen
1:14:20 Discussion
1:27:27 Conclusion
~~~
AI and ML enthusiast. Likes to think about the essences behind breakthroughs in AI and explain them in a simple and relatable way. Also an avid game creator.
Discord: / discord
LinkedIn: / chong-min-tan-94652288
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: / johntanchongmin
Try out my games here: https://simmer.io/@chongmin