No need for symbolic programs for Math? Natural language approach to IMO

Video Link: https://www.youtube.com/watch?v=noNbNRObffI

The International Mathematical Olympiad (IMO) is an international mathematics competition that challenges participants with exceptionally difficult problems in fields like algebra, number theory and combinatorics.

Previously, LLM-based approaches conquered Math benchmarks like GSM8K and AIME, but only attained Silver-medal performance at the IMO.

Solving IMO problems requires multi-step reasoning and the creativity to think beyond the norm.

OpenAI and Gemini have both claimed Gold-level performance at IMO 2025, with Gemini's result being officially verified.

Here, let us take a look at how two researchers, Yichen Huang and Lin F. Yang, managed to attain Gold-level performance as well.

They used an LLM in a pipeline that generates solutions, verifies them and then improves them based on the verification feedback.

This is amazing, as I previously thought that a robust verifier was needed for Math.

Apparently, if the LLM is well trained on Math datasets, you can use the LLM as a verifier directly.
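
To make the generate, verify and self-improve loop more concrete, here is a minimal Python sketch. It is not the code from the paper or the linked repository. The names `Verdict` and `solve_with_self_verification`, and the `generate`, `verify` and `improve` callables, are hypothetical stand-ins for LLM calls. Requiring several consecutive verifier passes is an assumption of this sketch, shown as one possible way to tolerate an imperfect verifier.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    accepted: bool   # True if the verifier found no issues
    feedback: str    # verifier's description of bugs or gaps, if any


def solve_with_self_verification(
    problem: str,
    generate: Callable[[str], str],            # hypothetical LLM call: problem -> solution
    verify: Callable[[str, str], Verdict],     # hypothetical LLM call: (problem, solution) -> verdict
    improve: Callable[[str, str, str], str],   # hypothetical LLM call: (problem, solution, feedback) -> solution
    max_rounds: int = 5,
    passes_required: int = 2,                  # assumed safeguard against a noisy verifier
) -> Optional[str]:
    """Generate a solution, then alternate verification and revision.

    Returns the solution once it passes `passes_required` consecutive checks,
    or None if that does not happen within `max_rounds` verification rounds.
    """
    solution = generate(problem)
    consecutive_passes = 0
    for _ in range(max_rounds):
        verdict = verify(problem, solution)
        if verdict.accepted:
            consecutive_passes += 1
            if consecutive_passes >= passes_required:
                return solution
        else:
            # A failed check resets the streak and triggers a revision.
            consecutive_passes = 0
            solution = improve(problem, solution, verdict.feedback)
    return None


# Toy stand-ins so the sketch runs without any LLM API.
def toy_generate(problem: str) -> str:
    return "draft"


def toy_verify(problem: str, solution: str) -> Verdict:
    if solution.endswith("(revised)"):
        return Verdict(True, "")
    return Verdict(False, "step 2 is unjustified")


def toy_improve(problem: str, solution: str, feedback: str) -> str:
    return solution + " (revised)"


result = solve_with_self_verification(
    "toy problem", toy_generate, toy_verify, toy_improve
)
print(result)
```

In a real setup, each of the three callables would wrap a prompt to the same LLM, so the model acts as its own verifier.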

~~~

Links:
Slides: https://github.com/tanchongmin/john-youtube/blob/main/Discussion_Sessions/IMO_Gemini.pdf
Paper: https://www.alphaxiv.org/pdf/2507.15855
Code: https://github.com/lyang36/IMO25

Other References:
Gemini IMO Gold: https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
Gemini Deep Think: https://blog.google/technology/google-deepmind/google-gemini-updates-io-2025/#deep-think
AlphaGeometry: https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/
AlphaProof (Silver-level IMO performance with symbolic solver): https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/
AlphaCode: https://deepmind.google/discover/blog/competitive-programming-with-alphacode/
ARC-AGI Challenge Ryan Greenblatt's "Sample More" solution: https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt

~~~

0:00 Introduction
5:32 Domain-Specific Language Approach
10:26 From DSL to Natural Language
14:53 Deep Think
23:35 Natural Language Approach to Math
56:07 AlphaEvolve
1:00:52 Open-sourced IMO Gold AI Workflow
1:19:12 Detailed Steps
1:28:56 Key Takeaway: Verifier is not perfect - but system can still work!
1:33:09 My thoughts
1:55:36 Discussion
2:04:38 Conclusion

~~~

AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. Also, I am an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2025-09-08 DINOv3: One backbone, multiple image/video tasks
2025-08-18 R-Zero: Self-Evolving Reasoning LLM from Zero Data
2025-08-11 Reasoning without Language (Part 2) - Deep Dive into 27 mil parameter Hierarchical Reasoning Model
2025-08-04 Reasoning without Language - Deep Dive into 27 mil parameter Hierarchical Reasoning Model
2025-07-28 No need for symbolic programs for Math? Natural language approach to IMO
2025-07-21 How many instructions can LLMs follow at once?
2025-07-15 Arjo Chakravarty: Indoor Localisation with Visual Language Models (VLMs)
2025-07-14 MemOS: A Paradigm Shift to Memory as a First Class Citizen for LLMs
2025-07-07 Multimodal Query for Images: Text/Image Multimodal Query with Negative Filter and Folder Selection
2025-06-30 Universal Filter (Part 4 - Finale): Knowledge/Memory, Reflection, Communication between Individuals
2025-06-23 Universal Filter (Part 3): Learning the Filters, Universal Database, Individual Knowledge Base
2025-06-16 Universal Filter (Part 2): Time, Akashic Records, Individual Mind-based, Body-based memory
2025-06-04 Good Vibes Only with Dylan Chia: Lyria (Music), Veo3 (Video), Gamma (Slides), GitHub Copilot (Code)
2025-03-10 Memory Meets Psychology - Claude Plays Pokemon: How It works, How to improve it
2025-02-24 Vibe Coding: How to use LLM prompts to code effectively!
2025-01-26 PhD Thesis Overview (Part 2): LLMs for ARC-AGI, Task-Based Memory-Infused Learning, Plan for AgentJo
2025-01-20 PhD Thesis Overview (Part 1): Reward is not enough; Towards Goal-Directed, Memory-based Learning
2024-12-04 AgentJo CV Generator: Generate your CV by searching for your profile on the web!
2024-11-11 Can LLMs be used in self-driving? CoMAL: Collaborative Multi-Agent LLM for Mixed Autonomy Traffic
2024-10-28 From TaskGen to AgentJo: Creating My Life Dream of Fast Learning and Adaptable Agents
2024-10-21 Tian Yu X John: Discussing Practical Gen AI Tips for Image Prompting