Beyond Strawberry: gpt-o1 - Is LLM alone sufficient for reasoning?

Video Link: https://www.youtube.com/watch?v=Nwq4IlwOTV8

gpt-o1 likely has Chain of Thought (CoT) already built into its training data, perhaps by using methods such as Self-Taught Reasoner (STaR) to augment the dataset with rationales, or by getting PhD students to provide the rationales.
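For intuition, here is a minimal Python sketch of what STaR-style rationale augmentation could look like. This is not OpenAI's actual pipeline; `generate` is a hypothetical stand-in for any LLM completion call.

def generate(prompt: str) -> str:
    """Hypothetical LLM call; replace with your own model client."""
    raise NotImplementedError

def star_augment(dataset):
    """STaR-style loop: keep (question, rationale, answer) triples only when the
    generated rationale reaches the known answer; otherwise 'rationalise' by
    showing the answer as a hint and asking for a rationale that leads to it."""
    augmented = []
    for question, answer in dataset:
        rationale = generate(f"Q: {question}\nThink step by step, then give the answer.")
        if answer in rationale:  # crude correctness check, for illustration only
            augmented.append((question, rationale, answer))
        else:
            hinted = generate(
                f"Q: {question}\nThe correct answer is {answer}. "
                "Explain step by step why this answer is correct."
            )
            augmented.append((question, hinted, answer))
    return augmented  # the model is then fine-tuned on these rationale-augmented examples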

The key takeaway is that spending more compute at inference time helps significantly on traditional problem-solving domains like math and code, and gpt-o1's way of doing this yields large performance gains.
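As one concrete form of inference-time compute, below is a hedged sketch of multiple sampling and consolidation (self-consistency-style majority voting). The `generate` call and the answer parser are illustrative placeholders, not a specific API.

from collections import Counter

def generate(prompt: str) -> str:
    """Hypothetical LLM call sampled with temperature > 0; replace as needed."""
    raise NotImplementedError

def extract_final_answer(cot: str) -> str:
    """Naive parser: take the last non-empty line of the chain of thought."""
    return [line for line in cot.splitlines() if line.strip()][-1]

def solve_with_sampling(question: str, n_samples: int = 8) -> str:
    """Sample several chains of thought and consolidate by majority vote."""
    answers = [
        extract_final_answer(generate(f"Q: {question}\nLet's think step by step."))
        for _ in range(n_samples)
    ]
    return Counter(answers).most_common(1)[0][0]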

However, I feel that this reasoning would be better done in an agentic framework rather than inside the LLM itself. Baking reasoning directly into the model will imbue certain biases: gpt-o1 may not be as versatile as gpt-4o for non-problem-solving domains, and it may overthink simple queries.
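To make the agentic alternative concrete, here is a toy sketch (all names illustrative, not from any specific framework) in which a router applies slow chain of thought plus an external critique only when the query seems to need it, and answers simple queries directly to avoid overthinking.

def generate(prompt: str) -> str:
    """Hypothetical LLM call; replace with your own client."""
    raise NotImplementedError

def needs_deliberation(query: str) -> bool:
    """Toy heuristic router; in practice this could itself be an LLM judgement."""
    keywords = ("prove", "calculate", "debug", "optimise", "step by step")
    return any(k in query.lower() for k in keywords)

def answer(query: str) -> str:
    if needs_deliberation(query):
        # Slow path: chain of thought plus a critique before the final answer
        draft = generate(f"{query}\nLet's think step by step.")
        critique = generate(f"Check this reasoning for errors:\n{draft}")
        return generate(
            "Given the reasoning and the critique, give the final answer.\n"
            f"Reasoning: {draft}\nCritique: {critique}"
        )
    # Fast path: answer simple queries directly
    return generate(query)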

~~~

Slides: https://github.com/tanchongmin/Tensor...
Code: https://github.com/tanchongmin/Tensor...

Useful links:
Self-taught reasoner (STaR): https://arxiv.org/abs/2203.14465
OpenAI gpt-o1: https://openai.com/index/learning-to-...
Chain of thought: https://arxiv.org/abs/2201.11903
Zero-shot chain of thought: https://arxiv.org/abs/2205.11916

~~~

0:00 Introduction
2:37 Impressive Performance on Benchmarks
5:57 gpt-o1 likely only has text action space
7:28 Does gpt-o1 perform well for ARC Prize?
10:21 Strawberry (GPT)
15:40 Strawberry (TaskGen)
19:53 Self-taught reasoner (STaR)
28:12 Traditional Chain of Thought
32:36 Agents Chain of Thought
37:38 Multiple Sampling and Consolidation
46:56 Diverse Sampling with Hints
50:38 gpt-o1 likely uses 2 repeated samples
51:35 LLM Modulo and augmenting external verifiers/critics
56:11 My thoughts
1:13:29 Discussion
1:29:20 Conclusion

~~~

AI and ML enthusiast. Likes to think about the essences behind breakthroughs in AI and explain them in a simple and relatable way. Also an avid game creator.

Discord: / discord
LinkedIn: / chong-min-tan-94652288
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: / johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2024-12-04  AgentJo CV Generator: Generate your CV by searching for your profile on the web!
2024-11-11  Can LLMs be used in self-driving? CoMAL: Collaborative Multi-Agent LLM for Mixed Autonomy Traffic
2024-10-28  From TaskGen to AgentJo: Creating My Life Dream of Fast Learning and Adaptable Agents
2024-10-21  Tian Yu X John: Discussing Practical Gen AI Tips for Image Prompting
2024-10-08  Jiafei Duan: Uncovering the 'Right' Representations for Multimodal LLMs for Robotics
2024-09-27  TaskGen Tutorial 6: Conversation Wrapper
2024-09-26  TaskGen Tutorial 5: External Functions & CodeGen
2024-09-24  TaskGen Tutorial 4: Hierarchical Agents
2024-09-23  TaskGen Tutorial 3: Memory
2024-09-19  TaskGen Tutorial 2: Shared Variables and Global Context
2024-09-16  Beyond Strawberry: gpt-o1 - Is LLM alone sufficient for reasoning?
2024-09-11  TaskGen Tutorial 1: Agents and Equipped Functions
2024-09-11  TaskGen Tutorial 0: StrictJSON
2024-09-10  LLM-Modulo: Using Critics and Verifiers to Improve Grounding of a Plan - Explanation + Improvements
2024-09-06  TaskGen: Co-create the best open-sourced LLM Agentic Framework together!
2024-08-21  AriGraph (Part 2) - Knowledge Graph Construction and Retrieval Details
2024-08-13  alphaXiv - Share Ideas, Build Collective Understanding, Interact with ANY open sourced paper authors
2024-07-30  AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
2024-07-23  NeoPlanner - Continually Learning Planning Agent for Large Environments guided by LLMs
2024-07-17  Intelligence = Sampling + Filtering
2024-07-12  Michael Hodel: Reverse Engineering the Abstraction and Reasoning Corpus