Beyond Strawberry: gpt-o1 - Is an LLM alone sufficient for reasoning?
gpt-o1 likely has Chain of Thought (CoT) already built into its training dataset, perhaps by using methods such as Self-Taught Reasoner (STaR) to augment the dataset with rationales, or by getting PhD students to write the rationales.
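As a rough sketch of the STaR loop from the paper linked below (my reconstruction, not OpenAI's actual pipeline; `llm` is a hypothetical text-completion callable and `extract_answer` a naive parser):

# Minimal sketch of STaR-style rationale augmentation.
def extract_answer(text: str) -> str:
    # Naive parse: treat the last non-empty line as the final answer.
    return text.strip().splitlines()[-1].strip()

def star_augment(llm, problems):
    augmented = []
    for question, answer in problems:
        # Step 1: sample a rationale; keep it only if it reaches the right answer.
        rationale = llm(f"Q: {question}\nThink step by step, then give the answer.")
        if extract_answer(rationale) == answer:
            augmented.append((question, rationale))
        else:
            # Step 2 (rationalization): reveal the answer as a hint and retry,
            # so the model gets rationales even for problems it failed on.
            hinted = llm(f"Q: {question}\nThe answer is {answer}. Explain step by step why.")
            if extract_answer(hinted) == answer:
                augmented.append((question, hinted))
    return augmented  # fine-tune on these (question, rationale) pairs and repeat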
The key takeaway is that spending more compute at inference time helps significantly in traditional problem-solving domains like math and code, and gpt-o1's approach delivers large performance gains.
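For illustration, the simplest form of inference-time compute is self-consistency style sampling: draw several chains of thought and majority-vote the answers. This is a sketch reusing `extract_answer` from above, not gpt-o1's actual (unpublished) mechanism:

from collections import Counter

def sample_and_consolidate(llm, question, n=8):
    # Draw n independent chains of thought and majority-vote the final answers.
    answers = []
    for _ in range(n):
        # Nonzero temperature so each sample takes a different reasoning path.
        rationale = llm(f"Q: {question}\nThink step by step, then give the answer.",
                        temperature=0.8)
        answers.append(extract_answer(rationale))
    best, votes = Counter(answers).most_common(1)[0]
    return best, votes / n  # consolidated answer plus a crude confidence score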
However, I feel this reasoning is better done as an agentic framework than baked into the LLM itself. Baking reasoning directly into the model will imbue certain biases: gpt-o1 will not be as versatile as gpt-4o in non-problem-solving domains, and it may overthink simple queries.
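A toy sketch of that agentic alternative, reusing the helpers above (hypothetical routing logic, not gpt-o1's design): route each query first, so only hard problems pay the cost of deliberate reasoning.

def route_and_answer(llm, query):
    # Cheap router: invoke the expensive reasoning loop only when needed,
    # so simple queries are not overthought.
    verdict = llm(f"Does this query need multi-step reasoning? Reply YES or NO.\nQuery: {query}")
    if verdict.strip().upper().startswith("YES"):
        return sample_and_consolidate(llm, query)[0]  # slow, deliberate path
    return llm(query)                                 # fast, direct path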
~~~
Slides: https://github.com/tanchongmin/Tensor...
Code: https://github.com/tanchongmin/Tensor...
Useful links:
Self-taught reasoner (STaR): https://arxiv.org/abs/2203.14465
OpenAI gpt-o1: https://openai.com/index/learning-to-...
Chain of thought: https://arxiv.org/abs/2201.11903
Zero-shot chain of thought: https://arxiv.org/abs/2205.11916
~~~
0:00 Introduction
2:37 Impressive Performance on Benchmarks
5:57 gpt-o1 likely has only a text action space
7:28 Does gpt-o1 perform well for ARC Prize?
10:21 Strawberry (GPT)
15:40 Strawberry (TaskGen)
19:53 Self-taught reasoner (STaR)
28:12 Traditional Chain of Thought
32:36 Agents Chain of Thought
37:38 Multiple Sampling and Consolidation
46:56 Diverse Sampling with Hints
50:38 gpt-o1 likely uses 2 repeated samples
51:35 LLM Modulo and augmenting external verifiers/critics
56:11 My thoughts
1:13:29 Discussion
1:29:20 Conclusion
~~~
AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.
Discord: / discord
LinkedIn: / chong-min-tan-94652288
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: / johntanchongmin
Try out my games here: https://simmer.io/@chongmin