MemOS: A Paradigm Shift to Memory as a First Class Citizen for LLMs

Video Link: https://www.youtube.com/watch?v=CFih0_6tn2w

MemOS proposes a recent paradigm shift towards an LLM-native memory storage system.

LLMs have parameter-based memories in their weights and biases (parameter memory), and can also access external memory in the form of databases or knowledge graphs (plaintext memory).

What is perhaps most interesting is that LLMs can also cache Key-Value activations, saving the cost of recomputing them from the input tokens (activation memory).
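The KV-caching idea can be sketched in a few lines. This is a toy single-head example with random projection weights (all names and shapes here are illustrative, not from the MemOS or Memory3 code): at each decoding step, only the newest token's K and V are computed and appended, while earlier ones are reused from the cache.

```python
import numpy as np

def attend(q, K, V):
    # Scaled dot-product attention for a single query vector
    scores = q @ K.T / np.sqrt(K.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(0)
d = 8
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))  # toy projections

K_cache, V_cache = np.empty((0, d)), np.empty((0, d))
for step in range(4):                 # generate 4 tokens one at a time
    x = rng.standard_normal(d)        # embedding of the newest token
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Append only the NEW token's K/V; earlier rows are reused, not recomputed
    K_cache = np.vstack([K_cache, k])
    V_cache = np.vstack([V_cache, v])
    out = attend(q, K_cache, V_cache)

print(K_cache.shape)  # (4, 8): one cached K row per generated token
```

Without the cache, step t would recompute K and V for all t tokens, which is what makes activation memory worth persisting.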

What if we could store memory across all three spaces (parameter, activation, plaintext) and shuffle it between them as needed based on access frequency? This is the idea behind a Memory Cube.

Furthermore, a Memory Cube is not static: more frequently used memories get pushed towards the parameter level, while less frequently used ones get pushed towards plaintext.
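A minimal sketch of this frequency-driven migration, assuming a simple access-count policy (the class name, tiers, and thresholds below are hypothetical, not from the MemOS codebase):

```python
# Cold -> hot ordering of the three memory spaces
TIERS = ["plaintext", "activation", "parameter"]

class MemoryCube:
    """Toy memory cube that migrates between tiers by access frequency."""

    def __init__(self, content):
        self.content = content
        self.tier = 0          # start as plaintext (cheapest to write)
        self.accesses = 0

    def access(self):
        self.accesses += 1
        # Promote hot memories toward the parameter level
        if self.accesses >= 10 and self.tier < len(TIERS) - 1:
            self.tier += 1
            self.accesses = 0  # reset the counter after promotion

    def decay(self):
        # Demote memories that saw no accesses since the last sweep
        if self.accesses == 0 and self.tier > 0:
            self.tier -= 1
        self.accesses = 0

cube = MemoryCube("user prefers concise answers")
for _ in range(10):
    cube.access()
print(TIERS[cube.tier])  # "activation": promoted after 10 accesses
```

The design intuition is the same as cache hierarchies in an OS: hot data moves to the fast-but-expensive tier, cold data sinks to the cheap one.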

What if we also had memory processes that regulate access and consolidation of these Memory Cubes? In MemOS, these form the Interface Layer, Operation Layer and Infrastructure Layer, which access and modify the contents of a Memory Cube.
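Purely as an illustration of the division of labour (the function names and data here are hypothetical, not the actual MemOS API): the interface layer turns a request into a memory call, the operation layer routes it over the stored cubes, and the infrastructure layer owns the persistent store.

```python
def interface_layer(request: str) -> dict:
    # Parse a user/LLM request into a structured memory API call
    return {"op": "read", "query": request}

def operation_layer(call: dict, store: dict) -> str:
    # Schedule and route the call; here, a simple lookup over stored cubes
    return store.get(call["query"], "<no memory>")

def infrastructure_layer() -> dict:
    # Persistent storage backend holding the memory cubes' contents
    return {"favourite_topic": "memory systems"}

store = infrastructure_layer()
result = operation_layer(interface_layer("favourite_topic"), store)
print(result)  # "memory systems"
```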

Overall, a complicated paper, but one that brings forth some interesting ideas for memory consolidation and retrieval.

Memory is no longer a second-class citizen, but a core concept of the MemOS formulation.

Slides: https://github.com/tanchongmin/john-youtube/blob/main/Discussion_Sessions/MemOS.pdf
Paper: https://arxiv.org/pdf/2507.03724
Code: https://github.com/MemTensor/MemOS

~~~

Related reading:
Memory3 (precursor to MemOS paper, talking about activation memory): https://arxiv.org/html/2407.01178v1
MemGPT (using agents with memory tools): https://arxiv.org/pdf/2310.08560
LoRA (learning by parallel parameter training): https://arxiv.org/abs/2106.09685
Learning, Fast and Slow (learning with both NN and external database memory): https://arxiv.org/pdf/2301.13758

~~~

0:00 Introduction
2:37 Conventional Retrieval Augmented Generation
6:17 Implicit Memory vs Explicit Memory
15:55 Explicit Memory and how humans learn
20:24 Cost of read + write vs frequency of access
30:31 Math of KV caching for Attention Memory
38:18 Why memory as an OS?
43:38 Overview of MemOS
45:37 Good benchmarks results on LOCOMO
47:57 How does MemCube work?
55:09 How to convert between abstraction spaces?
58:49 Memory Development: From static to dynamic
1:02:09 Memory Consolidation along abstraction spaces
1:07:48 MemCube Contents
1:11:01 Processing Components across Memory Layers
1:14:24 3-layer architecture for MemOS
1:16:47 Memory Lifecycle
1:18:20 Future Plans
1:19:11 My thoughts: Curse of Memory
1:21:20 Discussion

~~~

AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain them in a simple and relatable way. Also, I am an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin




Other Videos By John Tan Chong Min


2025-09-08: DINOv3: One backbone, multiple image/video tasks
2025-08-18: R-Zero: Self-Evolving Reasoning LLM from Zero Data
2025-08-11: Reasoning without Language (Part 2) - Deep Dive into 27 mil parameter Hierarchical Reasoning Model
2025-08-04: Reasoning without Language - Deep Dive into 27 mil parameter Hierarchical Reasoning Model
2025-07-28: No need for symbolic programs for Math? Natural language approach to IMO
2025-07-21: How many instructions can LLMs follow at once?
2025-07-15: Arjo Chakravarty: Indoor Localisation with Visual Language Models (VLMs)
2025-07-14: MemOS: A Paradigm Shift to Memory as a First Class Citizen for LLMs
2025-07-07: Multimodal Query for Images: Text/Image Multimodal Query with Negative Filter and Folder Selection
2025-06-30: Universal Filter (Part 4 - Finale): Knowledge/Memory, Reflection, Communication between Individuals
2025-06-23: Universal Filter (Part 3): Learning the Filters, Universal Database, Individual Knowledge Base
2025-06-16: Universal Filter (Part 2): Time, Akashic Records, Individual Mind-based, Body-based memory
2025-06-04: Good Vibes Only with Dylan Chia: Lyria (Music), Veo3 (Video), Gamma (Slides), GitHub Copilot (Code)
2025-03-10: Memory Meets Psychology - Claude Plays Pokemon: How It works, How to improve it
2025-02-24: Vibe Coding: How to use LLM prompts to code effectively!
2025-01-26: PhD Thesis Overview (Part 2): LLMs for ARC-AGI, Task-Based Memory-Infused Learning, Plan for AgentJo
2025-01-20: PhD Thesis Overview (Part 1): Reward is not enough; Towards Goal-Directed, Memory-based Learning
2024-12-04: AgentJo CV Generator: Generate your CV by searching for your profile on the web!
2024-11-11: Can LLMs be used in self-driving? CoMAL: Collaborative Multi-Agent LLM for Mixed Autonomy Traffic
2024-10-28: From TaskGen to AgentJo: Creating My Life Dream of Fast Learning and Adaptable Agents
2024-10-21: Tian Yu X John: Discussing Practical Gen AI Tips for Image Prompting