Beyond Strawberry: gpt-o1 - Is an LLM alone sufficient for reasoning?
gpt-o1 likely has Chain of Thought (CoT) already built into its training dataset, perhaps by using methods such as Self-Taught Reasoner (STaR) to augment the dataset with rationales, or by getting PhD students to write the rationales.
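As a rough sketch of the STaR loop from the paper linked below (my reconstruction, not OpenAI's actual pipeline; `llm` is a hypothetical text-completion callable and `extract_answer` a naive parser):

# Minimal sketch of STaR-style rationale augmentation.
def extract_answer(text: str) -> str:
    # Naive parse: treat the last non-empty line as the final answer.
    return text.strip().splitlines()[-1].strip()

def star_augment(llm, problems):
    augmented = []
    for question, answer in problems:
        # Step 1: sample a rationale; keep it only if it reaches the right answer.
        rationale = llm(f"Q: {question}\nThink step by step, then give the answer.")
        if extract_answer(rationale) == answer:
            augmented.append((question, rationale))
        else:
            # Step 2 (rationalization): reveal the answer as a hint and retry,
            # so the model gets rationales even for problems it failed on.
            hinted = llm(f"Q: {question}\nThe answer is {answer}. Explain step by step why.")
            if extract_answer(hinted) == answer:
                augmented.append((question, hinted))
    return augmented  # fine-tune on these (question, rationale) pairs and repeat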
The key takeaway is that spending more compute at inference time helps significantly in traditional problem-solving domains like math and code, and gpt-o1's approach delivers large performance gains.
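For illustration, the simplest form of inference-time compute is self-consistency style sampling: draw several chains of thought and majority-vote the answers. This is a sketch reusing `extract_answer` from above, not gpt-o1's actual (unpublished) mechanism:

from collections import Counter

def sample_and_consolidate(llm, question, n=8):
    # Draw n independent chains of thought and majority-vote the final answers.
    answers = []
    for _ in range(n):
        # Nonzero temperature so each sample takes a different reasoning path.
        rationale = llm(f"Q: {question}\nThink step by step, then give the answer.",
                        temperature=0.8)
        answers.append(extract_answer(rationale))
    best, votes = Counter(answers).most_common(1)[0]
    return best, votes / n  # consolidated answer plus a crude confidence score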
However, I feel this reasoning is better done as an agentic framework than baked into the LLM itself. Baking reasoning directly into the model will imbue certain biases: gpt-o1 will not be as versatile as gpt-4o in non-problem-solving domains, and it may overthink simple queries.
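A toy sketch of that agentic alternative, reusing the helpers above (hypothetical routing logic, not gpt-o1's design): route each query first, so only hard problems pay the cost of deliberate reasoning.

def route_and_answer(llm, query):
    # Cheap router: invoke the expensive reasoning loop only when needed,
    # so simple queries are not overthought.
    verdict = llm(f"Does this query need multi-step reasoning? Reply YES or NO.\nQuery: {query}")
    if verdict.strip().upper().startswith("YES"):
        return sample_and_consolidate(llm, query)[0]  # slow, deliberate path
    return llm(query)                                 # fast, direct path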
~~~
Slides: https://github.com/tanchongmin/Tensor...
Code: https://github.com/tanchongmin/Tensor...
Useful links:
Self-taught reasoner (STaR): https://arxiv.org/abs/2203.14465
OpenAI gpt-o1: https://openai.com/index/learning-to-...
Chain of thought: https://arxiv.org/abs/2201.11903
Zero-shot chain of thought: https://arxiv.org/abs/2205.11916
~~~
0:00 Introduction
2:37 Impressive Performance on Benchmarks
5:57 gpt-o1 likely has only a text action space
7:28 Does gpt-o1 perform well for ARC Prize?
10:21 Strawberry (GPT)
15:40 Strawberry (TaskGen)
19:53 Self-taught reasoner (STaR)
28:12 Traditional Chain of Thought
32:36 Agents Chain of Thought
37:38 Multiple Sampling and Consolidation
46:56 Diverse Sampling with Hints
50:38 gpt-o1 likely uses 2 repeated samples
51:35 LLM Modulo and augmenting external verifiers/critics
56:11 My thoughts
1:13:29 Discussion
1:29:20 Conclusion
~~~
AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.
Discord: / discord
LinkedIn: / chong-min-tan-94652288
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: / johntanchongmin
Try out my games here: https://simmer.io/@chongmin