Sleep Time Compute - AI That "Thinks" 24/7 (Breakthrough)
Access all top AIs for $10 on https://mammouth.ai/
Join My Newsletter for Regular AI Updates
https://forwardfuture.ai/
My Links
Subscribe: @matthew_berman
Twitter: https://twitter.com/matthewberman
Discord: https://discord.gg/xxysSXBxFW
Patreon: https://patreon.com/MatthewBerman
Instagram: https://www.instagram.com/matthewberman_ai
Threads: https://www.threads.net/@matthewberman_ai
LinkedIn: https://www.linkedin.com/company/forward-future-ai
Media/Sponsorship Inquiries
https://bit.ly/44TC45VV
0:00 Intro: AI That Thinks BEFORE You Ask?
0:13 Introducing Sleep-Time Compute
0:59 The Problem with Standard Test-Time Compute (Cost & Latency)
2:58 Stateful LLM Applications (Code, Docs, Chat)
3:33 Sleep-Time vs. Test-Time (Diagram Explained)
4:51 Why Sleep-Time is More Cost-Effective
6:00 Defining Sleep-Time Compute
6:26 Sponsor: Mammouth (Generative AI Platform)
7:18 Paper Details: How They Tested Non-Reasoning Models
9:24 Benchmarking Sleep-Time (The Juggle Example)
10:05 Models Used (GPT-4o, Claude, DeepSeek, etc.)
10:25 Results: Non-Reasoning Models (Graphs)
12:18 Results: Reasoning Models (Graphs)
13:39 Sleep-Time vs. Parallel Sampling (A Big Issue)
14:41 Scaling Sleep-Time Compute
15:45 Amortizing Cost Across Queries (Why it's Cheaper!)
16:48 Predictable Queries Benefit Most
18:04 Paper Summary & Future Directions
18:40 Outro & Newsletter