Tokenization in DeepSeek R1

Channel: LLMs Explained - Aggregate Intellect - AI.SCIENCE
Subscribers: 22,600

Published on March 17, 2025 5:00:48 AM ● Video Link: https://www.youtube.com/watch?v=tUpErbDUQes

164 views

In this video, we dissect token boundary bias, a common pitfall in tokenizers such as Stability AI’s, where unexpected spaces or punctuation splits derail outputs. We then look at how DeepSeek fixed this in their R1 model.
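
To make the idea concrete, here is a minimal sketch of how a trailing space can shift token boundaries. It assumes a toy greedy longest-match tokenizer and a made-up five-entry vocabulary (`greedy_tokenize` and `TOY_VOCAB` are hypothetical names for illustration); it is not DeepSeek's or Stability AI's actual tokenizer, and it does not show the fix discussed in the video.

```python
def greedy_tokenize(text, vocab):
    """Split text by always taking the longest vocabulary entry that matches."""
    tokens = []
    i = 0
    max_len = max(len(entry) for entry in vocab)
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                tokens.append(piece)
                i += size
                break
        else:
            # Nothing in the vocabulary matched: emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical vocabulary: the space is usually merged into the next word.
TOY_VOCAB = {"Hello", " world", " ", "world", "!"}

full = greedy_tokenize("Hello world", TOY_VOCAB)
prompt_no_space = greedy_tokenize("Hello", TOY_VOCAB)
prompt_trailing_space = greedy_tokenize("Hello ", TOY_VOCAB)

print(full)                   # ['Hello', ' world']
print(prompt_no_space)        # ['Hello']
print(prompt_trailing_space)  # ['Hello', ' ']
```

The prompt without a trailing space is a clean prefix of the full tokenization, so the next token the model has to produce is the familiar `' world'`. The prompt with a trailing space ends in a standalone `' '` token, which puts the model in a context where the space has already been consumed and the usual continuation no longer lines up with the token boundary.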

Where else to find us:
https://www.linkedin.com/in/amirfzpr/
https://aisc.substack.com/
YouTube: @ai-science
https://lu.ma/aisc-llm-school
https://maven.com/aggregate-intellect/

Other Videos By LLMs Explained - Aggregate Intellect - AI.SCIENCE

2025-04-10	Data Stores, Prompt Repositories, and Memory Management
2025-04-10	Dynamic Prompting and Retrieval Techniques
2025-04-09	How to Fine Tune Agents
2025-04-08	What are Agents
2025-04-02	Leveraging LLMs for Causal Reasoning
2025-04-01	Examples of Causal Representation in Computer vision
2025-03-31	Relationship between Reasoning and Causality
2025-03-30	Causal Representation Learning
2025-03-18	Deduplication in DeepSeek R1
2025-03-17	What Makes DeepSeek R1 Multi-token Prediction Unique?
2025-03-16	Tokenization in DeepSeek R1
2025-03-04	ReferWell - Helping Patients Find Specialists - Multi-agent LLM Systems Bootcamp
2024-12-10	Built Multi-agent LLM Products - Bootcamp Teaser
2024-10-16	LLM Products and Entrepreneurship
2024-08-13	XAI for LLMs: looking under the hood of Large Language Models
2024-08-05	Intro to Llama-agents Framework (+ live demo)
2024-07-24	Generative AI Tools and Adoption
2024-07-02	PandasAI - From Open Source to User-centric Products
2024-06-28	Agents Embedded in the Real World
2024-06-19	Human Feedback Foundation - LLMs
2024-06-11	LLM Products vs Traditional Digital Products