Release Notes: Gemini's multimodality

Video Link: https://www.youtube.com/watch?v=K4vXvaRV0dw

Ani Baddepudi, Gemini Model Behavior Product Lead, joins host Logan Kilpatrick for a deep dive into Gemini's multimodal capabilities. Their conversation explores why Gemini was built as a natively multimodal model from day one, the future of proactive AI assistants, and how we are moving towards a world where "everything is vision." Learn about the differences between video and image understanding, token representations, higher-FPS video sampling, and more.

Chapters:
0:00 - Intro
1:12 - Why Gemini is natively multimodal
2:23 - The technology behind multimodal models
5:15 - Video understanding with Gemini 2.5
9:25 - Deciding what to build next
13:23 - Building new product experiences with multimodal AI
17:15 - The vision for proactive assistants
24:13 - Improving video usability with variable FPS and frame tokenization
27:35 - What’s next for Gemini’s multimodal development
31:47 - Deep dive on Gemini’s document understanding capabilities
37:56 - The teamwork and collaboration behind Gemini
40:56 - What’s next with model behavior


Resources:

Watch more Release Notes → https://goo.gle/4njokfg
Subscribe to Google for Developers → https://goo.gle/developers

Speakers: Logan Kilpatrick, Anirudh Baddepudi
Products Mentioned: Google AI, Gemini