How load balancing AI workloads delivers faster user response times (demo)

Channel: Google Cloud

Subscribers: 296,000

Published on April 3, 2025 9:10:14 PM ● Video Link: https://www.youtube.com/watch?v=x_2ehkP1GJk

Duration: 0:00

165 views

Join Emanuele Mazza, a Networking Product Specialist at Google Cloud, to learn how Cloud Load Balancing uses custom metrics, with queue depth as the load balancing signal, to route AI workloads so that prompts get faster responses while TPU and GPU utilization is optimized. The demo turns load balancing into a game: the attendee competes against our load balancer operating in the region. The console shows how to configure the load balancer to select the least-loaded endpoints, so requests spend less time waiting for a GPU and inference responses come back faster.
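The idea of routing on queue depth can be illustrated with a minimal Python sketch. This is not how Cloud Load Balancing is implemented or configured; the endpoint names, the `pick_least_loaded` and `dispatch` helpers, and the assumption that each endpoint reports a single integer queue depth are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def pick_least_loaded(queue_depths: Dict[str, int]) -> str:
    """Return the endpoint with the smallest reported queue depth.

    Hypothetical helper: real load balancers weigh more signals than this.
    """
    if not queue_depths:
        raise ValueError("no endpoints available")
    # min() keeps the first key it sees among ties, so equal depths
    # resolve to the endpoint that was inserted earliest.
    return min(queue_depths, key=queue_depths.__getitem__)


def dispatch(queue_depths: Dict[str, int], num_requests: int) -> List[str]:
    """Assign requests one at a time, always to the shortest queue."""
    depths = dict(queue_depths)  # copy so the caller's data is not mutated
    assignments: List[str] = []
    for _ in range(num_requests):
        target = pick_least_loaded(depths)
        depths[target] += 1  # the chosen endpoint now has one more queued request
        assignments.append(target)
    return assignments


if __name__ == "__main__":
    # Assumed queue depths reported by three hypothetical GPU-backed endpoints.
    reported = {"gpu-a": 4, "gpu-b": 1, "gpu-c": 2}
    print(dispatch(reported, 4))
```

In this sketch new requests go to the endpoints with the shortest queues first, so a busy endpoint receives nothing until the others catch up. That is the intuition behind using queue depth rather than request count as the balancing signal.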

Learn more about Cloud Load Balancing here: https://cloud.google.com/load-balancing?e=48754805

Other Videos By Google Cloud

2025-04-07	Mercari improves UX on its ecommerce marketplace with Google AI and Weights & Biases
2025-04-07	Google Cloud and MLB deliver baseball globally with Media CDN
2025-04-07	Introduction to Vertex AI Studio: Course Preview
2025-04-06	Google Cloud and MLB deliver baseball globally with Media CDN
2025-04-04	New Way Now: Virgin Media O2 puts data at the heart of UK connectivity with Google Cloud
2025-04-03	How Vodafone uses Google Gemini to create new content with AI to boost employee productivity (demo)
2025-04-03	How Google Cloud and Deutsche Telekom are making AI a seamless part of daily life (demo)
2025-04-03	DT & Google Gemini: Building Autonomous Network Agents
2025-04-03	How AI network operations detect and resolve network issues quickly (demo)
2025-04-03	How load balancing AI workloads delivers faster user response times (demo)
2025-04-03	How to use Google Cloud Customer Engagement Suite to boost customer satisfaction (demo)
2025-04-03	How autonomous network operations are enabled with Google Cloud and Ericsson (demo)
2025-04-03	How to unlock enterprise expertise for employees with agents using Google Agentspace (demo)
2025-04-03	Cómo usar Google Cloud Customer Engagement Suite para aumentar la satisfacción del cliente
2025-04-03	How to create AI agents using Gemini on Vertex AI and AI Studio (demo)
2025-04-03	How to use multi-modal gen AI to elevate field agent knowledge and productivity (demo)
2025-04-03	Protecting cybersecurity with NVIDIA and Google Cloud
2025-04-03	WealthAPI delivers next-gen financial insights in real time with Gemini and DataStax
2025-04-03	Stax AI transforms trust accounting data with Google Cloud and MongoDB
2025-04-02	Replit makes software creation easy for everyone with Google Cloud and Anthropic
