Reasoning without Language - Deep Dive into 27 mil parameter Hierarchical Reasoning Model

Subscribers:
6,300
Video Link: https://www.youtube.com/watch?v=DvZ8jZ-laj4



10,789 views


Hierarchical Reasoning Model (HRM) is a very interesting work that shows how recurrent thinking in latent space can help convey ideas that language may find hard to express.

HRM, trained from scratch on only the official dataset (~1000 examples), with just 27M parameters and a 30x30 grid context (900 tokens), achieves 40.3% on ARC-AGI, substantially surpassing leading CoT-based models such as o3-mini-high (34.5%) and Claude 3.7 with 8K context (21.2%).
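
As a rough illustration of what a 900-token grid context means, here is a minimal sketch of serializing a 30x30 ARC-style grid into cell tokens. The function name, the pad token id, and the row-major flattening are my assumptions; the official HRM preprocessing may differ.

```python
import numpy as np

# Hypothetical helper (not from the HRM repo): serialize an ARC-style grid of
# colours 0-9 into a fixed 30x30 = 900-token sequence, padding unused cells.
def grid_to_tokens(grid: np.ndarray, size: int = 30, pad_id: int = 10) -> np.ndarray:
    padded = np.full((size, size), pad_id, dtype=np.int64)
    h, w = grid.shape
    padded[:h, :w] = grid          # place the puzzle in the top-left corner
    return padded.reshape(-1)      # flatten row-major into 900 cell tokens

tokens = grid_to_tokens(np.random.randint(0, 10, size=(5, 7)))
print(tokens.shape)  # (900,)
```

Under this view, every input or output puzzle becomes a fixed-length sequence of 900 tokens for the model to reason over.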

This latent-space thinking reminds me of MemOS's parameter/activation memory, which can convey context without requiring it to be spelled out in language in the input.

HRM updates its reasoning state (hidden vectors) at two rates: the low-level vector every timestep, and the high-level vector once every T timesteps, giving us a blueprint for reasoning across multiple timescales and levels of hierarchy, as sketched below.
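
Here is a minimal sketch of this two-timescale recurrence, assuming GRU cells as stand-ins for the paper's recurrent modules; the names f_low, f_high, T and n_cycles are illustrative, not taken from the official code.

```python
import torch
import torch.nn as nn

# Minimal two-timescale sketch (not the official HRM implementation).
class TwoTimescaleCore(nn.Module):
    def __init__(self, d_model: int = 256, T: int = 4):
        super().__init__()
        self.T = T  # high-level state updates once every T low-level steps
        self.f_low = nn.GRUCell(2 * d_model, d_model)   # sees input + high-level state
        self.f_high = nn.GRUCell(d_model, d_model)      # sees final low-level state

    def forward(self, x, z_low, z_high, n_cycles: int = 2):
        for _ in range(n_cycles):                 # each cycle = one high-level timestep
            for _ in range(self.T):               # T fast low-level updates
                z_low = self.f_low(torch.cat([x, z_high], dim=-1), z_low)
            z_high = self.f_high(z_low, z_high)   # slow high-level update every T steps
        return z_low, z_high

core = TwoTimescaleCore()
x = torch.randn(8, 256)
z_low, z_high = torch.zeros(8, 256), torch.zeros(8, 256)
z_low, z_high = core(x, z_low, z_high)
```

The key design choice is that the high-level state only changes after the low-level state has iterated for T steps, which is what produces the slow/fast hierarchy.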

Moving ahead, I think it would be even more interesting to combine this latent thinking with the language thinking of the current Chain of Thought (CoT) paradigm. Effectively, this means that instead of only passing tokens out of the transformer between subtasks, we also pass a latent-space output into the next call of the LLM, as sketched below.
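
Here is a hypothetical sketch of what such a hybrid loop could look like. TinyStepModel stands in for an LLM that emits both a token and a persistent latent vector; none of these names correspond to an existing API.

```python
import torch
import torch.nn as nn

# Hypothetical hybrid language + latent chain-of-thought loop (illustrative only).
class TinyStepModel(nn.Module):
    def __init__(self, d_latent: int = 64, vocab: int = 100):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_latent)
        self.mix = nn.GRUCell(d_latent, d_latent)    # fuse previous latent with new input
        self.to_token = nn.Linear(d_latent, vocab)

    def forward(self, token_ids, latent):
        x = self.embed(token_ids).mean(dim=1)        # crude "read" of the language input
        latent = self.mix(x, latent)                 # latent thought carried across calls
        next_token = self.to_token(latent).argmax(-1)
        return next_token, latent

model = TinyStepModel()
latent = torch.zeros(1, 64)                          # persistent latent "thought"
prompt = torch.randint(0, 100, (1, 10))
for step in range(3):                                # three reasoning subtasks
    token, latent = model(prompt, latent)            # text out + latent passed onward
    prompt = torch.cat([prompt, token.unsqueeze(0)], dim=1)
```

The latent would carry whatever is hard to verbalize across subtasks, while the growing token sequence remains the human-readable trace.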

We should also scale this up to cover more kinds of latent spaces, such as images, videos, audio and sensorimotor signals, and allow interfacing with tools, feeding the tool output back into the model.

~~~

References:
Slides: https://github.com/tanchongmin/john-youtube/blob/main/Discussion_Sessions/Hierarchical Reasoning Model.pdf
Paper: https://www.arxiv.org/pdf/2506.21734
Code: https://github.com/sapientinc/HRM/blob/main/models/hrm/hrm_act_v1.py

Other resources:
Chain of Thought: https://arxiv.org/pdf/2201.11903
Reasoning and Acting (ReAct): https://arxiv.org/pdf/2210.03629
Thinking in Latent Space: https://arxiv.org/html/2412.06769v2
MemOS (Multiple memory spaces in an overall architecture): https://arxiv.org/pdf/2505.22101
Learning, Fast and Slow (learning from any start and goal state along the trajectory of experience): https://arxiv.org/pdf/2301.13758

~~~

0:00 Introduction
3:38 Impressive results on ARC-AGI, Sudoku and Maze
12:10 Experimental Tasks
17:17 Hierarchical Model Design Insights
29:21 Neuroscience Inspiration
33:25 Clarification on pre-training for HRM
38:20 Performance for HRM could be due to data augmentation
49:30 Visualizing Intermediate Thinking Steps
1:00:05 Traditional Chain of Thought (CoT)
1:02:50 Language may be limiting
1:09:03 New paradigm for thinking
1:25:52 Traditional Transformers do not scale depth well
1:30:59 Truncated Backpropagation Through Time
1:34:04 Towards a hybrid language/non-language thinking

~~~

AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain them in a simple and relatable way. Also, I am an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2025-09-08 DINOv3: One backbone, multiple image/video tasks
2025-08-18 R-Zero: Self-Evolving Reasoning LLM from Zero Data
2025-08-11 Reasoning without Language (Part 2) - Deep Dive into 27 mil parameter Hierarchical Reasoning Model
2025-08-04 Reasoning without Language - Deep Dive into 27 mil parameter Hierarchical Reasoning Model
2025-07-28 No need for symbolic programs for Math? Natural language approach to IMO
2025-07-21 How many instructions can LLMs follow at once?
2025-07-15 Arjo Chakravarty: Indoor Localisation with Visual Language Models (VLMs)
2025-07-14 MemOS: A Paradigm Shift to Memory as a First Class Citizen for LLMs
2025-07-07 Multimodal Query for Images: Text/Image Multimodal Query with Negative Filter and Folder Selection
2025-06-30 Universal Filter (Part 4 - Finale): Knowledge/Memory, Reflection, Communication between Individuals
2025-06-23 Universal Filter (Part 3): Learning the Filters, Universal Database, Individual Knowledge Base
2025-06-16 Universal Filter (Part 2): Time, Akashic Records, Individual Mind-based, Body-based memory
2025-06-04 Good Vibes Only with Dylan Chia: Lyria (Music), Veo3 (Video), Gamma (Slides), GitHub Copilot (Code)
2025-03-10 Memory Meets Psychology - Claude Plays Pokemon: How It works, How to improve it
2025-02-24 Vibe Coding: How to use LLM prompts to code effectively!
2025-01-26 PhD Thesis Overview (Part 2): LLMs for ARC-AGI, Task-Based Memory-Infused Learning, Plan for AgentJo
2025-01-20 PhD Thesis Overview (Part 1): Reward is not enough; Towards Goal-Directed, Memory-based Learning
2024-12-04 AgentJo CV Generator: Generate your CV by searching for your profile on the web!
2024-11-11 Can LLMs be used in self-driving? CoMAL: Collaborative Multi-Agent LLM for Mixed Autonomy Traffic
2024-10-28 From TaskGen to AgentJo: Creating My Life Dream of Fast Learning and Adaptable Agents
2024-10-21 Tian Yu X John: Discussing Practical Gen AI Tips for Image Prompting