Understanding LLMs Like Physicists: Observation, Hypothesis, Experimentation, and Prediction

Video Link: https://www.youtube.com/watch?v=sl1LJ6HKcjI

A Google TechTalk, presented by Tianyu Guo, 2025-02-20
Google Algorithms Seminar

ABSTRACT: Recently, methodologies from physics have inspired new research paradigms for the scientific understanding of LLMs. In physics, knowledge often emerges through four stages: observing nature, forming hypotheses, conducting controlled experiments, and making real-world predictions. Here, I present two independent mechanisms discovered in LLMs following this methodology.
Dormant Heads: LLMs deactivate certain attention heads when they are irrelevant to the current task. A given head may serve a specific function; when faced with an unrelated prompt, it becomes dormant, concentrating nearly all of its attention on the first token (a detection sketch follows the abstract).
Random Guessing in Two-Hop Reasoning: Pretrained LLMs resort to random guessing when distractors are present in two-hop reasoning. A well-designed supervised fine-tuning dataset can resolve this issue (a sketch of such a distractor setup also follows below).
I will discuss how these mechanisms surfaced through observation, how hypotheses were formed, how we designed and analyzed controlled experiments, and how the mechanisms are validated in LLMs.
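
As a rough illustration of the first mechanism, the sketch below flags a head as "dormant" for a given prompt when it concentrates most of its attention mass on the first token (the attention-sink position). This is a minimal sketch, not the talk's actual method: the model name, the 0.9 threshold, and the averaging over query positions are all illustrative assumptions.

```python
# Minimal sketch: flag "dormant" heads by how much attention they place on
# the first token. Model, threshold, and prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM that can return attentions
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")
model.eval()

def dormant_heads(prompt: str, threshold: float = 0.9):
    """Return (layer, head) pairs whose attention mass on token 0
    exceeds `threshold`, averaged over query positions."""
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_attentions=True)
    flagged = []
    # out.attentions: one tensor per layer, shape [batch, heads, query, key]
    for layer, attn in enumerate(out.attentions):
        # attention mass each head assigns to the first (sink) token,
        # averaged over queries; skip query 0, which trivially attends to itself
        sink_mass = attn[0, :, 1:, 0].mean(dim=-1)
        for head, mass in enumerate(sink_mass):
            if mass.item() > threshold:
                flagged.append((layer, head))
    return flagged

print(dormant_heads("The capital of France is"))
```

Running this on task-relevant versus unrelated prompts would, under the talk's hypothesis, show the same head switching between active and dormant states.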
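For the second mechanism, here is a small sketch of what a two-hop question with a distractor might look like, along with a supervised fine-tuning target that names the intermediate entity explicitly. The names, relations, and target format are illustrative assumptions, not the talk's actual dataset.

```python
# Minimal sketch of a two-hop prompt with a distractor fact, plus an SFT
# target that spells out the intermediate hop. All details are illustrative.
import random

PEOPLE = ["Alice", "Bob", "Carol", "Dave", "Erin"]
RELS = ["teacher", "doctor", "neighbor"]

def make_example(rng: random.Random):
    a, b, c, d = rng.sample(PEOPLE, 4)
    r1, r2 = rng.sample(RELS, 2)
    facts = [
        f"{a}'s {r1} is {b}.",  # hop 1
        f"{b}'s {r2} is {c}.",  # hop 2 (correct chain)
        f"{d}'s {r2} is {a}.",  # distractor: same relation, wrong subject
    ]
    rng.shuffle(facts)
    question = f"Who is {a}'s {r1}'s {r2}?"
    # SFT target that resolves hop 1 before hop 2, so the model cannot
    # shortcut the chain by pattern-matching on the second relation alone.
    target = f"{a}'s {r1} is {b}, and {b}'s {r2} is {c}. Answer: {c}"
    return " ".join(facts) + " " + question, target

prompt, target = make_example(random.Random(0))
print(prompt)
print(target)
```

The distractor fact shares the second relation with the correct chain; a model that guesses rather than resolving the first hop can be detected by its chance-level accuracy on such prompts.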

ABOUT THE SPEAKER: Tianyu Guo is a third-year PhD student in the UC Berkeley Statistics Department, advised by Song Mei and Michael I. Jordan. His research focuses on the interpretability of large language models and causal inference.